3D and 4D Computer Vision

Lecture – Winter Semester 25/26

Instructor: Dr. Vladislav Golyanik


Figures: Applications of 4D Reconstruction; 3D Hand Texture Model for Inverse Problems; 3D Shape Correspondence Estimation; Egocentric 3D Human Pose Estimation

Course Description and Learning Goals

Computer Vision is an interdisciplinary research field at the intersection of machine learning and signal processing. It studies techniques for the automatic analysis and interpretation of visual and multi-modal input data. This lecture focuses on an advanced subfield of Computer Vision: the inference of higher-dimensional structures (3D and deformable 3D) from lower-dimensional observations (2D images).

Our world is inherently non-rigid at different spatial and temporal scales. Reconstructing and modelling it in 4D from visual observations is a vibrant and challenging research field with numerous practical applications, for instance, in AR/VR/XR, computer game development, human-computer interaction and sports analytics. Frequent challenges include the ill-posedness of the underlying optimisation problems and capture settings (e.g., the monocular setting, which is of special interest in Computer Vision) as well as adverse scene conditions (e.g., partial observations, low light or high-speed motions), among many others. The goal of the lecture "3D and 4D Computer Vision" is to introduce, in a systematic and structured manner, the foundational concepts of 3D computer vision for deformable and composite scenes (4D = 3D + time) as well as results of the latest research in the field, by generalising the studied concepts from 3D to 4D cases.

The lecture will cover the fundamentals of 3D computer vision applicable across a wide range of 3D and 4D settings (multiple view geometry, triangulation, stereo vision, bundle adjustment, linear transformations, parametrisations of rotations), different types of visual sensors (RGB, event and depth cameras), 3D and 4D scene representations, deformation models and regularisers, non-rigid structure from motion (NRSfM), shape-from-template, correspondence problems, novel-view synthesis of non-rigid scenes, generative and diffusion models in 4D vision, 3D human pose estimation, egocentric 4D vision as well as video generation of composite scenes. Apart from milestone methods in the field, the lecture will discuss several recent works on 4D vision, including state-of-the-art approaches. The lecture is accompanied by theoretical exercises every three weeks.

Covered Topics

Fundamentals of 3D and 4D Computer Vision
3D and 4D Scene Representations
Correspondence Problems in 3D
Multi-View Geometry, Structure from Motion and Bundle Adjustment
Monocular and Depth-Based 4D Reconstruction
3D Volumetric Rendering of Rigid and Non-rigid Scenes
3D Human Pose Estimation
Egocentric 4D Vision from Mobile Head-Mounted Devices
3D and 4D Generative and Diffusion Models
Controllable Video Generation for Non-rigid Scenes
Event-based 3D and 4D Vision
Recent 4D Vision Research and Research Trends

Recommended Materials

Book: Multiple View Geometry in Computer Vision (by R. Hartley and A. Zisserman)
Book: Foundations of Computer Vision (by A. Torralba, P. Isola and W. Freeman)
Book: Understanding Deep Learning (by S. J. D. Prince)
Survey: Dense Monocular Non-Rigid 3D Reconstruction (by Tretschk, Kairanda et al.)
Survey: Advances in Neural Rendering (by Tewari, Thies, Mildenhall, Srinivasan et al.)
Survey: Diffusion Models for Visual Computing (by Po, Wang, Golyanik et al.)
Talk slides: How to Read Academic Papers (by V. Golyanik)

Target Group

The lecture "3D and 4D Computer Vision" targets students of Visual Computing (M.Sc.), Computer Science (M.Sc.) and Data Science and Artificial Intelligence (M.Sc.).

If you have questions, please contact us via golyanik@mpi-inf.mpg.de.

Organisation

Format:	6 CP (weekly lectures and five exercise sheets, one every three weeks)
Language:	English
Registration:	Please register for the course in CMS.
Time:	Wednesdays: 14:15 – 15:45 (lecture); Thursdays: 14:15 – 15:45 (exercise)
Location:	Wednesdays: 002 in building E1 5 (MPI-SWS); Thursdays: 024 in building E1 4 (MPI-INF)
Slides:	See the CMS course page.