KinectFusion: Real-Time Dense Surface Mapping and Tracking
IEEE International Symposium on Mixed and Augmented Reality (ISMAR) 2011, Science and Technology Proceedings (Best Paper Award)
Target
[Figure panels: Normal maps; Greyscales; Noisy data]
Outline
• Introduction
• Motivation
• Background
• System diagram
• Experiment results
• Conclusion
Introduction
• Passive camera
• Simultaneous localization and mapping (SLAM)
• Structure from motion (SFM)
– MonoSLAM [8] (ICCV 2003)
– Parallel Tracking and Mapping [17] (ISMAR 2007)
• Disparity
– Depth model [26] (2010)
• Pose of camera from Depth models [20] (ICCV 2011)
Motivation
• Active camera: Kinect sensor
• Pose estimation from depth information
• Real-time mapping
– GPU
Background – Camera sensor
• Kinect sensor
– Infra-red light
• Input information
– RGB image (1)
– Raw depth data
– Calibrated depth image (2)
[Figure: (1) RGB image; (2) calibrated depth image]
Background – Pose estimation
• Depth maps from two views
• Iterative closest points (ICP) [7]
• Point-plane metric [5]
[Figure: ICP]
Background – Pose estimation
• Projective data association algorithm [4]
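The idea behind projective data association can be sketched as follows: instead of searching for the nearest neighbour, a point is transformed into the other view and projected through the camera model, and the pixel it lands on is taken as its correspondence. This is a minimal sketch, not the presented system's implementation; the function name `projective_association`, the pose argument `T_prev_from_cur`, and the intrinsic values in `K` are assumptions made for illustration.

```python
import numpy as np

def projective_association(p_cur, T_prev_from_cur, K, width, height):
    """Return the (u, v) pixel of the previous view that p_cur projects to, or None."""
    # Move the 3D point from the current camera frame into the previous camera frame.
    p_prev = T_prev_from_cur[:3, :3] @ p_cur + T_prev_from_cur[:3, 3]
    if p_prev[2] <= 0.0:
        return None  # behind the camera: no valid projection
    # Perspective projection with the intrinsic matrix K.
    uvw = K @ p_prev
    u = int(np.floor(uvw[0] / uvw[2] + 0.5))
    v = int(np.floor(uvw[1] / uvw[2] + 0.5))
    if 0 <= u < width and 0 <= v < height:
        return u, v
    return None  # projects outside the image

# Hypothetical intrinsics for a 640 x 480 depth image.
K = np.array([[525.0, 0.0, 320.0],
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])
pixel = projective_association(np.array([0.0, 0.0, 1.0]), np.eye(4), K, 640, 480)
behind = projective_association(np.array([0.0, 0.0, -1.0]), np.eye(4), K, 640, 480)
```

A point on the optical axis lands on the principal point, and a point behind the camera has no correspondence. A complete tracker would normally also reject pairs whose distance or normal angle exceeds a threshold.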
Background – Scene Representation
• Volume of space
• Signed distance function [7]
System Diagram
Pre-defined parameter
• Pose estimation with sensor camera
• Raw depth map Rk
• Calibrated depth image Rk(u)
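where u = (u, v) is a pixel in the image domain and K is the camera calibration matrix
[Figure: raw data; calibration matrix K; raw depth map Rk; calibrated depth image Rk(u)]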
Surface Measurement
• Reduce noise
• Bilateral filter
[Figure: with bilateral filter; without bilateral filter]
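A bilateral filter smooths depth noise while keeping depth discontinuities sharp, because each neighbour is weighted by both its pixel distance and its depth difference. The sketch below is a deliberately simple CPU version under assumed parameters (`radius`, `sigma_s`, `sigma_r` are illustrative values, not taken from the slides), and it does not treat invalid depth readings specially.

```python
import numpy as np

def bilateral_filter(depth, radius=2, sigma_s=2.0, sigma_r=0.05):
    """Naive bilateral filter over a 2D depth map (metres)."""
    h, w = depth.shape
    out = np.zeros_like(depth, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            # Clamp the window to the image borders.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch = depth[y0:y1, x0:x1]
            yy, xx = np.mgrid[y0:y1, x0:x1]
            # Spatial weight: falls off with pixel distance.
            spatial = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / sigma_s ** 2)
            # Range weight: falls off with depth difference, which protects edges.
            rng = np.exp(-((patch - depth[y, x]) ** 2) / sigma_r ** 2)
            weights = spatial * rng
            out[y, x] = np.sum(weights * patch) / np.sum(weights)
    return out

# A clean depth step from 1 m to 2 m is left intact.
step = np.ones((6, 6))
step[:, 3:] = 2.0
smoothed = bilateral_filter(step)
```

The per-pixel loop is independent for every output pixel, which is why this stage maps naturally onto a GPU.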
Surface Measurement
• Vertex map
• Normal vector
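As a sketch of how the two maps can be computed: each pixel is back-projected through the inverse calibration matrix and scaled by its depth to give a vertex, and the normal is the normalised cross product of the differences to the right and lower neighbouring vertices. The function names and the small intrinsic matrix below are assumptions for illustration.

```python
import numpy as np

def vertex_map(depth, K):
    """Back-project every pixel: V(u) = D(u) * K^-1 * (u, v, 1)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (h, w, 3)
    rays = pixels @ np.linalg.inv(K).T  # row-vector form of K^-1 applied per pixel
    return depth[..., None] * rays

def normal_map(vertices):
    """Unit normals from neighbouring vertices; output is (h-1, w-1, 3)."""
    dx = vertices[:-1, 1:] - vertices[:-1, :-1]  # V(u+1, v) - V(u, v)
    dy = vertices[1:, :-1] - vertices[:-1, :-1]  # V(u, v+1) - V(u, v)
    n = np.cross(dx, dy)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.where(norm > 0, norm, 1.0)     # avoid dividing by zero

# Hypothetical intrinsics and a fronto-parallel plane at 1 m.
K = np.array([[2.0, 0.0, 2.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 1.0]])
V = vertex_map(np.ones((5, 5)), K)
N = normal_map(V)
```

For the flat plane in the example, every normal is parallel to the optical axis, which is a quick sanity check for the neighbour ordering.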
Define camera pose
Points in camera frame k are transformed into the global frame
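A camera pose can be written as a 4 x 4 rigid-body matrix built from a rotation R and a translation t; applying it to a point in homogeneous coordinates moves the point from camera frame k into the global frame. The helper names `make_pose` and `to_global` are illustrative assumptions.

```python
import numpy as np

def make_pose(R, t):
    """Build the 4 x 4 rigid-body transform [R t; 0 1]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def to_global(T_gk, p_k):
    """Map a 3D point from camera frame k into the global frame."""
    return (T_gk @ np.append(p_k, 1.0))[:3]

# Example: 90 degree rotation about z followed by a 1 m shift along x.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
T_gk = make_pose(R, np.array([1.0, 0.0, 0.0]))
p_g = to_global(T_gk, np.array([1.0, 0.0, 0.0]))
```

Normals are direction vectors, so they would be rotated by R only, without the translation.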
System Diagram
Surface Reconstruction: Operating environment
[Figure: cubic volume with side length L]
L³ voxel reconstruction
Surface Reconstruction
• Signed distance function
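Truncated signed distance function Fk(p)
[Figure: Fk(p) along the sensor ray (axis x), truncated to +v on the sensor side of the surface and to -v behind it, with the zero crossing at the surface]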
• Weighted running average
• Dynamic object motion
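The fusion step can be sketched as a per-voxel weighted running average of truncated signed distances. Capping the accumulated weight keeps the average responsive to change, which is one way to cope with moving objects. The cap `W_max = 64` and the function names are assumptions, and the clipping below is a simplification: it also keeps values far behind the surface, which a full system would typically leave unmeasured.

```python
import numpy as np

def truncate_sdf(sdf, v):
    """Clamp signed distances to the band [-v, +v]."""
    return np.clip(sdf, -v, v)

def fuse(F_prev, W_prev, F_new, W_new, W_max=64.0):
    """Weighted running average of TSDF values, with a capped weight."""
    W_sum = W_prev + W_new
    safe = np.where(W_sum > 0, W_sum, 1.0)  # avoid dividing by zero
    F = np.where(W_sum > 0, (W_prev * F_prev + W_new * F_new) / safe, F_prev)
    W = np.minimum(W_sum, W_max)            # cap so old data can be overwritten
    return F, W

# Two voxels: one already observed once, one never observed.
F, W = fuse(np.array([0.0, 1.0]), np.array([1.0, 0.0]),
            np.array([1.0, -1.0]), np.array([1.0, 1.0]))
```

With a cap in place, a voxel that has been seen many times still moves towards new measurements at a rate of at least 1 / (W_max + 1) per frame.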
System Diagram
Surface Prediction from Ray Casting
• Store
• Ray casting marches from +v to the zero crossing
Corresponding ray
Surface Prediction from Ray Casting
• Speed-up
– Ray skipping
– Truncation distance
[Figure: surface and sensor along axis x]
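Ray casting against the volume can be sketched for a single ray as follows. The ray is marched in steps slightly shorter than the truncation distance, so the band around the surface cannot be stepped over, and the surface point is found by linear interpolation between the last positive and the first non-positive sample. The analytic `plane_tsdf` stands in for a voxel grid lookup, and the step factor 0.8 and `t_max` are assumed values.

```python
import numpy as np

def raycast(tsdf, origin, direction, v, t_max=5.0):
    """March one ray through a TSDF; return the surface point or None."""
    direction = direction / np.linalg.norm(direction)
    step = 0.8 * v  # shorter than the truncation distance v
    t_prev, f_prev = 0.0, tsdf(origin)
    t = step
    while t <= t_max:
        f = tsdf(origin + t * direction)
        if f_prev > 0.0 and f <= 0.0:
            # Zero crossing from +ve to -ve: interpolate the exact depth.
            t_hit = t_prev + (t - t_prev) * f_prev / (f_prev - f)
            return origin + t_hit * direction
        if f_prev < 0.0 and f > 0.0:
            return None  # back face reached
        t_prev, f_prev = t, f
        t += step
    return None  # left the working range without hitting a surface

def plane_tsdf(p, v=0.1, z_surface=2.0):
    """Hypothetical scene: a plane at z = 2 m, truncated to [-v, +v]."""
    return float(np.clip(z_surface - p[2], -v, v))

hit = raycast(plane_tsdf, np.zeros(3), np.array([0.0, 0.0, 1.0]), v=0.1)
miss = raycast(plane_tsdf, np.zeros(3), np.array([0.0, 0.0, -1.0]), v=0.1)
```

The surface normal at the hit point could be estimated from the numerical gradient of the TSDF at that location.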
System Diagram
Sensor Pose Estimation
• Previous frame
• Current frame
• Assume small motion between frames
• Fast projective data association algorithm
– Initialized with previous frame pose
• Vertex correspondences
• Point-plane energy
• For z > 0
• Modified equation
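One iteration of point-plane alignment can be sketched as a small linear least-squares problem. Under the small-motion assumption the rotation is approximated as I + [w]x, so the point-plane residual for each correspondence becomes linear in the six unknowns (w, t). The function name is an assumption, and `np.linalg.lstsq` is used here for brevity in place of a GPU reduction followed by a dedicated 6 x 6 solve.

```python
import numpy as np

def point_plane_step(src, dst, normals):
    """One linearised point-plane update; returns a 4 x 4 incremental transform.

    Minimises sum(((R src + t - dst) . n)^2) with R ~ I + [w]x.
    """
    # Residual: (src - dst).n + w.(src x n) + t.n, so each row is [src x n, n].
    A = np.hstack([np.cross(src, normals), normals])  # (N, 6)
    b = np.einsum("ij,ij->i", dst - src, normals)     # (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    wx, wy, wz, tx, ty, tz = x
    return np.array([[1.0, -wz, wy, tx],
                     [wz, 1.0, -wx, ty],
                     [-wy, wx, 1.0, tz],
                     [0.0, 0.0, 0.0, 1.0]])

# Synthetic check: correspondences that differ by a pure translation.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
normals = rng.normal(size=(50, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
t_true = np.array([0.02, -0.01, 0.03])
T_inc = point_plane_step(src, src + t_true, normals)
```

In an iterative scheme the increment is composed with the current pose estimate, the correspondences are recomputed, and the step is repeated until the update becomes small.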
Experiment Results
• Reconstruction resolution: 256³
• Test camera pose
• Kinect camera rotates and captures 560 frames over 19 seconds on a turntable
Experiment Results
• Using every 8th frame
Experiment Results : Processing time
Timed stages: pre-processing raw data; data association; pose optimisation; raycasting the surface prediction; surface measurement integration
Demo
Conclusion
• Robust tracking of camera pose by aligning all depth points
• Parallel algorithms for both tracking and mapping
References
[8] A. J. Davison. Real-time simultaneous localization and mapping with a single camera. In Proceedings of the International Conference on Computer Vision (ICCV), 2003.
[17] G. Klein and D. W. Murray. Parallel tracking and mapping for small AR workspaces. In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), 2007.
[26] J. Stuehmer, S. Gumhold, and D. Cremers. Real-time dense geometry from a handheld camera. In Proceedings of the DAGM Symposium on Pattern Recognition, 2010.
[20] R. A. Newcombe, S. J. Lovegrove, and A. J. Davison. DTAM: Dense tracking and mapping in real-time. In Proceedings of the International Conference on Computer Vision (ICCV), 2011.
[7] B. Curless and M. Levoy. A volumetric method for building complex models from range images. In ACM Transactions on Graphics (SIGGRAPH), 1996.
[5] Y. Chen and G. Medioni. Object modeling by registration of multiple range images. Image and Vision Computing (IVC), 10(3):145–155, 1992.
[4] G. Blais and M. D. Levine. Registering multiview range data to create 3D computer objects. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 17(8):820–824, 1995.