Depth-Based Real Time Head Motion Tracking Using 3D Template Matching


Chinese title: Real-Time Head Motion Trajectory Tracking Based on Depth Images and 3D Template Matching

Student: 屠愚

Advisor: Dr. 歐陽明 (Ming Ouhyoung)


Outline

• Introduction

• System Overview

• Algorithm

• Results & System Demo

• Conclusion & Future Work


Introduction


[Figure: the six degrees of freedom of head motion: yaw, pitch, and roll rotations plus translations along the X-, Y-, and Z-axes]

Real-time reconstruction of this 6-DoF motion vector given a stream of video input.

What Is Head Motion Tracking?


Applications: animation, video games, gaze orientation, dancing.

Why Head Motion Tracking?

• Color-image-based methods (“Head Pose Estimation in Computer Vision: A Survey”, E. Murphy-Chutorian and M. M. Trivedi, PAMI 2009):

– Appearance Template Methods

– Feature Tracking Methods

– Detector Arrays

– Nonlinear Regression Methods

– Manifold Embedding Methods

– Flexible Models

– Geometric Methods

– Hybrid Methods

• Too sensitive to illumination variations!

Related Work

• SwissRanger SR4000

– Mesa Imaging (2006)

– $9000

• CamCube 2.0

– PMD Technologies (2002)

– $12000


Depth Camera

• Kinect by Microsoft

– $149.99

• Xtion Pro by Asus

– $189

– $300


Depth Camera


“Real-time performance-based facial animation”, T. Weise et al., SIGGRAPH 2011

“Real Time Head Pose Estimation with Random Regression Forests”, G. Fanelli et al., CVPR 2011

Related Work


System Overview



[Flow chart: User Acting → Depth Data Acquisition → Real-time Head Pose Estimation → Avatar Control. Head pose estimation is done either by Method 1, the least square error method (nose detection, inverse rotation, least-square fitting), or by Method 2, the iterative optimization method (sampling, iterative optimization).]

Flow Chart

Algorithm (Method 1): Least Square Error Method

Nose Detection

[Figure: nose detection on the depth map, showing the Z-axis, the minimum depth Zmin, and the band Zmin + 5]
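The figure suggests the nose tip is taken as the closest point in the depth map (minimum depth Zmin), with a small band up to Zmin + 5 around it. Under that reading, a minimal Python sketch is shown below; the depth-map layout, the handling of the 5-unit band, and the function name are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def detect_nose(depth_map, band=5):
    """Nose tip = pixel with the smallest valid depth (Zmin);
    also return the pixels whose depth lies within Zmin + band."""
    valid = depth_map > 0                              # 0 marks missing depth here
    if not valid.any():
        return None, None
    z = np.where(valid, depth_map, np.inf)
    z_min = z.min()
    nose_yx = np.unravel_index(np.argmin(z), z.shape)  # (row, col) of the nose tip
    nose_region = np.argwhere(z <= z_min + band)       # points inside the Zmin + band slab
    return nose_yx, nose_region

# Toy usage: a flat 480x640 depth map with a "nose" poking out around (240, 320).
depth = np.full((480, 640), 900, dtype=np.float32)
depth[235:245, 315:325] = 850
depth[240, 320] = 847
tip, region = detect_nose(depth)
print(tip, len(region))                                # (240, 320) and the band pixels
```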

Inverse Rotation


[Flow chart: given the previous nose position Nose_{t-1}(x, y), the previous rotation angles θ_{t-1}, ψ_{t-1}, φ_{t-1}, and the current depth map, nose tracking performs searching-window setting, inverse rotation, and z-min searching to obtain the current nose position Nose_t(x, y).]

Nose Tracking Flow Chart
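A rough sketch of the nose tracking loop in the flow chart above: undo the previous frame's rotation so the face points back toward the camera, restrict the search to a window around the previous nose position, and pick the minimum-z point. The Euler-angle convention, the window size, and the helper names are assumptions made for illustration only.

```python
import numpy as np

def euler_to_matrix(pitch, yaw, roll):
    """Rotation matrix from Euler angles in radians; the axis order used
    here (Rz * Ry * Rx) is an assumed convention, not the thesis one."""
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def track_nose(points, prev_nose_xy, prev_angles, window=40.0):
    """points: (N, 3) head points of the current frame (x, y, z).
    prev_nose_xy: (x, y) of the nose found in the previous frame.
    prev_angles: (pitch, yaw, roll) estimated for the previous frame."""
    # Inverse rotation: undo the previous pose so the nose faces the camera again.
    R_inv = euler_to_matrix(*prev_angles).T
    p = points @ R_inv.T
    # Searching window: only look near the previous nose position in the x-y plane.
    in_window = (np.abs(p[:, 0] - prev_nose_xy[0]) < window) & \
                (np.abs(p[:, 1] - prev_nose_xy[1]) < window)
    if not in_window.any():
        return None
    # z-min searching: the nose is the closest point inside the window.
    idx = np.where(in_window)[0][np.argmin(p[in_window, 2])]
    return points[idx]
```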


$Ax + By + Cz = 1$

Least Square Plane Fitting

Yaw & Pitch !
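A minimal sketch of the plane-fitting step: solve Ax + By + Cz = 1 for the face points by linear least squares and read yaw and pitch off the plane normal (A, B, C). The angle conventions in the atan2 expressions below are illustrative assumptions; the slides do not specify them.

```python
import numpy as np

def fit_plane_yaw_pitch(points):
    """points: (N, 3) face points. Fit Ax + By + Cz = 1 by linear least
    squares and derive yaw/pitch from the plane normal (A, B, C)."""
    coeffs, *_ = np.linalg.lstsq(points, np.ones(len(points)), rcond=None)
    normal = coeffs / np.linalg.norm(coeffs)
    # Assumed convention: the camera looks along +z; yaw tilts the normal in x,
    # pitch tilts it in y. The thesis may use a different convention.
    yaw = np.degrees(np.arctan2(normal[0], normal[2]))
    pitch = np.degrees(np.arctan2(normal[1], normal[2]))
    return yaw, pitch

# Toy usage: a plane tilted ~10 degrees about the y-axis, 800 units from the camera.
rng = np.random.default_rng(0)
xy = rng.uniform(-50, 50, size=(500, 2))
z = 800 + np.tan(np.radians(10)) * xy[:, 0]
pts = np.column_stack([xy[:, 0], xy[:, 1], z])
print(fit_plane_yaw_pitch(pts))        # yaw close to +/-10, pitch close to 0
```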


Least Square Ellipse Fitting

Roll !
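The slides only name least-square ellipse fitting as the source of the roll angle; below is a minimal sketch under that reading. It fits the general conic ax² + bxy + cy² + dx + ey = 1 to 2D face-boundary points by linear least squares and takes roll from the orientation of the ellipse's major axis. How the boundary points are extracted, and the exact fitting formulation in the thesis, are not shown here.

```python
import numpy as np

def ellipse_roll(points_2d):
    """points_2d: (N, 2) face-boundary points in the image plane.
    Fit a*x^2 + b*x*y + c*y^2 + d*x + e*y = 1 by least squares and
    return the orientation of the ellipse's major axis as the roll angle."""
    pts = points_2d - points_2d.mean(axis=0)   # centre the points; roll is translation-invariant
    x, y = pts[:, 0], pts[:, 1]
    design = np.column_stack([x * x, x * y, y * y, x, y])
    (a, b, c, d, e), *_ = np.linalg.lstsq(design, np.ones_like(x), rcond=None)
    theta = 0.5 * np.arctan2(b, a - c)         # conic axis orientation: tan(2*theta) = b/(a-c)

    def radius(th):                            # extent of the centred conic along direction th
        lam = a * np.cos(th) ** 2 + b * np.cos(th) * np.sin(th) + c * np.sin(th) ** 2
        return 1.0 / np.sqrt(lam)

    if radius(theta) < radius(theta + np.pi / 2):
        theta += np.pi / 2                     # keep the major (longer) axis
    return np.degrees(theta)

# Toy usage: an ellipse with semi-axes 3 and 1, rotated by 20 degrees.
t = np.linspace(0, 2 * np.pi, 200)
ex, ey = 3 * np.cos(t), 1 * np.sin(t)
ang = np.radians(20)
boundary = np.column_stack([ex * np.cos(ang) - ey * np.sin(ang),
                            ex * np.sin(ang) + ey * np.cos(ang)])
print(ellipse_roll(boundary))                  # roughly 20
```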

Dynamic weighted average filter (“Real-time performance-based facial animation”, T. Weise et al., SIGGRAPH 2011)

Flickering Issue

[Plot: normalized weight $\omega$ versus frame ($i$, $i-1$, $i-2$, $i-3$, $i-4$) for $H \cdot \max_{l \in [1,k]} \lvert t_i - t_{i-l} \rvert$ = 0.01, 0.1, 1, 10]

$w_j = e^{-j \cdot H \cdot \max_{l \in [1,k]} \lvert t_i - t_{i-l} \rvert}, \qquad t_i^{*} = \dfrac{\sum_{j=0}^{k} w_j\, t_{i-j}}{\sum_{j=0}^{k} w_j}$
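A small sketch of the dynamic weighted average above (following Weise et al. 2011): each of the last k+1 estimates t_{i-j} gets weight w_j = exp(-j·H·max_l |t_i - t_{i-l}|), so the filter averages strongly when the head is nearly still and follows the raw estimate during fast motion. Treating t as a vector and the chosen values of H and k are assumptions for the demo.

```python
import numpy as np

def dynamic_weighted_average(history, H=1.0, k=4):
    """history: recent pose estimates [t_{i-k}, ..., t_i] (newest last).
    Returns t_i* with weights w_j = exp(-j * H * max_{l in [1,k]} |t_i - t_{i-l}|)."""
    t = np.asarray(history[-(k + 1):], dtype=float)     # rows: t_{i-k} ... t_i
    t_i = t[-1]
    motion = max(np.linalg.norm(t_i - t[-1 - l]) for l in range(1, len(t)))
    j = np.arange(len(t))                                # j = 0 is the current frame i
    w = np.exp(-j * H * motion)
    # t[::-1] lines frame t_{i-j} up with weight w_j.
    return (w[:, None] * t[::-1]).sum(axis=0) / w.sum()

# Nearly static poses get smoothed hard; a fast motion passes almost unfiltered.
static = [[0.00], [0.01], [0.00], [0.02], [0.01]]
moving = [[0.0], [2.0], [4.0], [6.0], [8.0]]
print(dynamic_weighted_average(static))   # close to the average of the five values
print(dynamic_weighted_average(moving))   # close to the newest value, 8.0
```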

Algorithm (Method 2): Iterative Optimization Method


Avatar Control


Model Point Cloud


Sample Point Cloud

• Motion vector: $a = (\theta, \psi, \phi, t_x, t_y, t_z)$

• Energy: $E(a, P_S, P_M) = \sum_{p \in P_S} \min\, d(p', P_M)$, where $p'$ is the sample point $p \in P_S$ transformed by $a$

• Result: $a^{*} = \arg\min_a E(a, P_S, P_M)$


Energy Function
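A direct sketch of this energy: apply the motion vector a = (θ, ψ, φ, tx, ty, tz) to the sample point cloud P_S and sum, for every transformed point p′, its distance to the closest point of the model cloud P_M. The rotation convention and the brute-force nearest-neighbour lookup (a KD-tree would be used for real-time speed) are assumptions; the slides do not spell out how d(p′, P_M) is evaluated.

```python
import numpy as np

def rotation(theta, psi, phi):
    """Rotation built from the three angles in a = (theta, psi, phi, tx, ty, tz);
    the axis order is an assumed convention."""
    c1, s1 = np.cos(theta), np.sin(theta)
    c2, s2 = np.cos(psi), np.sin(psi)
    c3, s3 = np.cos(phi), np.sin(phi)
    Rx = np.array([[1, 0, 0], [0, c1, -s1], [0, s1, c1]])
    Ry = np.array([[c2, 0, s2], [0, 1, 0], [-s2, 0, c2]])
    Rz = np.array([[c3, -s3, 0], [s3, c3, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def energy(a, P_S, P_M):
    """E(a, P_S, P_M): sum over p in P_S of the distance from the transformed
    point p' to its nearest neighbour in P_M (brute force here; a KD-tree
    would replace the lookup at real-time rates)."""
    p_prime = P_S @ rotation(*a[:3]).T + a[3:]           # rotate, then translate
    d = np.linalg.norm(p_prime[:, None, :] - P_M[None, :, :], axis=2)
    return d.min(axis=1).sum()

# Toy usage: the energy is near zero when a recovers the true motion.
rng = np.random.default_rng(1)
P_M = rng.uniform(-50, 50, size=(300, 3))                # model point cloud
true_a = np.array([0.10, -0.05, 0.20, 5.0, -3.0, 10.0])
P_S = (P_M - true_a[3:]) @ rotation(*true_a[:3])         # sample cloud = inverse-moved model
print(energy(true_a, P_S, P_M))                          # ~0
print(energy(np.zeros(6), P_S, P_M))                     # noticeably larger
```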


• Gradient Descent Algorithm: $a_{k+1} = a_k - \alpha_k \nabla E(a_k)$

• But the energy function is non-differentiable!

Optimization

• 12 Possible Gradients:

$\nabla_1 = (1,0,0,0,0,0)^T$, $\nabla_2 = (-1,0,0,0,0,0)^T$, $\nabla_3 = (0,1,0,0,0,0)^T$, $\nabla_4 = (0,-1,0,0,0,0)^T$, $\nabla_5 = (0,0,1,0,0,0)^T$, $\nabla_6 = (0,0,-1,0,0,0)^T$, $\nabla_7 = (0,0,0,1,0,0)^T$, $\nabla_8 = (0,0,0,-1,0,0)^T$, $\nabla_9 = (0,0,0,0,1,0)^T$, $\nabla_{10} = (0,0,0,0,-1,0)^T$, $\nabla_{11} = (0,0,0,0,0,1)^T$, $\nabla_{12} = (0,0,0,0,0,-1)^T$

Optimization
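Since E is non-differentiable, the slides replace the analytic gradient with the 12 fixed unit directions above (plus or minus one in each component of a). A minimal sketch of that greedy search is below, assuming the energy() sketch from the previous block; the step size, the shrinking schedule, and the stopping rule are assumptions, as the slides do not give them.

```python
import numpy as np

# The 12 candidate "gradients": +/- unit vectors along the six components of a.
DIRECTIONS = np.vstack([np.eye(6), -np.eye(6)])

def optimize(energy_fn, a0, alpha=0.05, shrink=0.5, min_alpha=1e-3, max_iters=200):
    """Greedy descent over the 12 fixed directions: step by alpha along whichever
    direction lowers the energy most; shrink alpha when nothing helps."""
    a = np.asarray(a0, dtype=float)
    best = energy_fn(a)
    for _ in range(max_iters):
        candidates = a + alpha * DIRECTIONS               # 12 trial motion vectors
        values = np.array([energy_fn(c) for c in candidates])
        if values.min() < best:
            best = values.min()
            a = candidates[values.argmin()]
        else:
            alpha *= shrink                                # no improvement: take smaller steps
            if alpha < min_alpha:
                break
    return a, best

# Usage with the energy() sketch from the previous block:
#   a_star, e_star = optimize(lambda a: energy(a, P_S, P_M), np.zeros(6))
```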

Results of Method 1

Results of Method 2

• Our Method with the Whole Database

• Our Method with 70% of the Database (Without Crash)

• Our Method with 50% of the Database (Without Bad Initialization)


System Demo


Conclusion


• Contribution:

– Two novel algorithms.

– Not affected by varying lighting conditions.

– Real-time responses without GPU acceleration.

– Outperforms the state-of-the-art approach (G. Fanelli et al., CVPR 2011).

• Future Work:

– Tracking without relying on the nose.

– Combine with NITE skeleton tracking.

– Facial Expression Recognition / Retargeting.


Thank You Very Much
