Multiple View Geometry

in Computer Vision

Prasanna Sahoo

Department of Mathematics

University of Louisville

1

Camera Models

Lecture 8

2

In this lecture, we will show that a camera is a mapping from the 3D world $\mathbb{R}^3$ to a 2D image plane $\mathbb{R}^2$.

This mapping can be represented by a $3 \times 4$ matrix $P$.

We will examine the model for the following cameras:

• Pinhole camera

• CCD camera

• Finite projective camera

• General projective camera

3

Pinhole Camera

A pinhole camera is a box in which one of the walls

has been pierced to make a small hole through it.

Assuming that the hole is indeed just a point, exactly

one ray from each point in the scene passes through

the pinhole and hits the wall opposite to it. This

results in an inverted image of the scene.

4

Pinhole camera

The word camera has its origins in the Latin camera and the Greek kamara,

both of which refer to a room or a chamber.

5

The inversion of the image is an annoyance.

• However, it can be corrected by considering a virtual

image of the scene on a virtual plane parallel to the

imaging plane but on the opposite side of the pinhole.

6

Basic pinhole camera model

Let the center of projection be the origin of a Euclidean coordinate system. The plane $Z = f$ is called the focal plane or image plane.

7

A point in space $\mathbb{R}^3$ with coordinates $X = (x, y, z)^T$ is mapped to the point on the image plane where the line joining $X$ to the center of projection meets the image plane.

It can easily be shown (by similar triangles) that the point $(x, y, z)^T$ is mapped to the point

\[
\left( \frac{f x}{z},\ \frac{f y}{z},\ f \right)^T
\]

on the image plane.

8

Ignoring the final image coordinate, we see that the mapping

\[
(x, y, z)^T \mapsto \left( \frac{f x}{z},\ \frac{f y}{z} \right)^T
\]

describes the central projection mapping from world to image coordinates.

• This is a mapping from $\mathbb{R}^3$ to $\mathbb{R}^2$.
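The central projection above can be sketched in a few lines of NumPy. This is only an illustration: the function name `project_pinhole` and the sample values are assumptions, not part of the lecture. Points with z = 0 have no image under this mapping, so the sketch rejects them.

```python
import numpy as np

def project_pinhole(point, f):
    """Central projection (x, y, z) -> (f*x/z, f*y/z) for a camera at the origin."""
    x, y, z = point
    if z == 0:
        # The ray through the center of projection is parallel to the image plane.
        raise ValueError("points with z = 0 have no image")
    return np.array([f * x / z, f * y / z])

# Hypothetical sample values: focal length 5, scene point (2, 4, 10).
image_point = project_pinhole((2.0, 4.0, 10.0), f=5.0)
```

Here the point (2, 4, 10) with f = 5 lands at (1, 2) on the image plane.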

9

Some Terminology

• The center of projection is called the camera center

or the optical center.

• The plane Z = f is called the focal plane or image

plane.

10

• The line from the camera center perpendicular to the image plane is called the principal axis or principal ray of the camera.

• The point where the principal axis meets the image

plane is called the principal point.

• The plane through the camera center parallel to

the image plane is called the principal plane of the

camera.

11

Central Projection Mapping

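Using homogeneous coordinates, the central projection map $(x, y, z)^T \mapsto (f x/z, f y/z)^T$ can be described as

\[
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\mapsto
\begin{pmatrix} f x \\ f y \\ z \end{pmatrix}
=
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
= \operatorname{diag}(f, f, 1)\, [\, I \mid 0 \,]\, X,
\]

where $\operatorname{diag}(f, f, 1)$ is a diagonal matrix and $[\, I \mid 0 \,]$ is a matrix divided up into a $3 \times 3$ block (the identity matrix) plus a column vector made up of zeros.

12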

The equation on the last slide can be written as

\[
x = P X
\]

where $X$ denotes the world point represented by the homogeneous 4-vector $(x, y, z, 1)^T$, $x$ the image point represented by the homogeneous 3-vector $(f x, f y, z)^T$, and $P$ the $3 \times 4$ homogeneous camera projection matrix. Hence

\[
P = \operatorname{diag}(f, f, 1)\, [\, I \mid 0 \,].
\]

13
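As a small numerical check of x = PX, the following sketch builds P = diag(f, f, 1) [ I | 0 ] and recovers the inhomogeneous image point by dividing by the last coordinate. The focal length and the sample point are assumed values chosen for illustration.

```python
import numpy as np

f = 5.0  # assumed focal length

# P = diag(f, f, 1) [ I | 0 ], a 3x4 matrix.
P = np.diag([f, f, 1.0]) @ np.hstack([np.eye(3), np.zeros((3, 1))])

X = np.array([2.0, 4.0, 10.0, 1.0])  # homogeneous world point (x, y, z, 1)
x = P @ X                            # homogeneous image point (f*x, f*y, z)

# Dividing by the third coordinate gives (f*x/z, f*y/z).
image_point = x[:2] / x[2]
```

The homogeneous result is (10, 20, 10), which corresponds to the image point (1, 2), in agreement with the mapping (x, y, z)T to (fx/z, fy/z)T.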

In deriving the central projection mapping

\[
(x, y, z)^T \mapsto (f x/z,\ f y/z)^T
\]

it was assumed that the origin of coordinates in the image plane was at the principal point. However, if the origin is not at the principal point, then we have

\[
(x, y, z)^T \mapsto (f x/z + p_x,\ f y/z + p_y)^T
\]

where $(p_x, p_y)^T$ are the coordinates of the principal point.

14

Image and camera coordinate

systems

15

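Using homogeneous coordinates, the central projection mapping can be described as

\[
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\mapsto
\begin{pmatrix} f x + z p_x \\ f y + z p_y \\ z \end{pmatrix}
=
\begin{bmatrix} f & 0 & p_x & 0 \\ 0 & f & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
= K\, [\, I \mid 0 \,]\, X_{cam},
\]

where

\[
K = \begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}
\]

is a $3 \times 3$ matrix and $[\, I \mid 0 \,]$ is a matrix divided up into a $3 \times 3$ block (the identity matrix) plus a column vector made up of zeros.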

16

The equation on the last slide can be written concisely as

\[
x = K\, [\, I \mid 0 \,]\, X_{cam}.
\]

• The matrix $K$ is called the camera calibration matrix.

• The homogeneous 4-vector $(x, y, z, 1)^T$ is written as $X_{cam}$ to emphasize that the camera is assumed to be located at the origin of a Euclidean coordinate system with the principal axis of the camera pointing straight down the z-axis.

17
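The effect of the principal point offset can be checked numerically. The sketch below is illustrative only: the helper `calibration_matrix` and the values of f, px, py are assumptions. It builds K for a pinhole camera and applies x = K [ I | 0 ] Xcam.

```python
import numpy as np

def calibration_matrix(f, px, py):
    """Pinhole calibration matrix with focal length f and principal point (px, py)."""
    return np.array([[f, 0.0, px],
                     [0.0, f, py],
                     [0.0, 0.0, 1.0]])

K = calibration_matrix(f=5.0, px=3.0, py=2.0)        # assumed sample values
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])     # K [ I | 0 ]

X_cam = np.array([2.0, 4.0, 10.0, 1.0])              # point in the camera frame
x = P @ X_cam                                        # (f*x + z*px, f*y + z*py, z)
image_point = x[:2] / x[2]                           # (f*x/z + px, f*y/z + py)
```

With these values the homogeneous image point is (40, 40, 10), that is (4, 4) after division: the earlier result (1, 2) shifted by the principal point (3, 2).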

Camera Location

The camera is assumed to be located at the origin of a Euclidean coordinate

system with the principal axis of the camera pointing straight down the z-axis.

18

Camera Rotation and Translation

In general, points in R3 will be expressed in terms of

a different Euclidean coordinate frame, known as the

world coordinate frame. The two coordinate frames

are related through a rotation and a translation.

19

World and Camera Coordinate Frames

The two coordinate frames are related through a rotation and a translation.

20

If $\tilde{X}$ is an inhomogeneous 3-vector representing the coordinates of a point in the world coordinate frame, and $\tilde{X}_{cam}$ represents the same point in the camera coordinate frame, then

\[
\tilde{X}_{cam} = R\, (\tilde{X} - \tilde{C}),
\]

where $\tilde{C}$ represents the coordinates of the camera center in the world coordinate frame, and $R$ is a $3 \times 3$ rotation matrix representing the orientation of the camera coordinate frame.

21

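The equation $\tilde{X}_{cam} = R\, (\tilde{X} - \tilde{C})$ can be written in homogeneous coordinates as

\[
X_{cam}
=
\begin{bmatrix} R & -R\tilde{C} \\ 0^T & 1 \end{bmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
=
\begin{bmatrix} R & -R\tilde{C} \\ 0^T & 1 \end{bmatrix} X.
\]

Substituting this into $x = K\, [\, I \mid 0 \,]\, X_{cam}$ leads to the following concise formula

\[
x = K R\, [\, I \mid -\tilde{C} \,]\, X
\]

where $X$ is now in a world coordinate frame.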
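The change of frame and the resulting projection can be traced step by step. The following sketch uses assumed sample values throughout (a rotation by 90 degrees about the z-axis, a camera center C, and a small calibration matrix K); it compares the inhomogeneous form R(X − C) with the homogeneous 4×4 form and then projects with KR [ I | −C ].

```python
import numpy as np

# Assumed sample pose: rotation by 90 degrees about the z-axis, center C.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
C = np.array([1.0, 2.0, 3.0])
X_world = np.array([2.0, 2.0, 13.0])

# Inhomogeneous form: X_cam = R (X - C).
X_cam_direct = R @ (X_world - C)

# Homogeneous form: 4x4 matrix with blocks [R, -R C; 0, 1].
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = -R @ C
X_cam_h = T @ np.append(X_world, 1.0)

# Projection x = K R [ I | -C ] X with an assumed calibration matrix K.
K = np.array([[5.0, 0.0, 3.0],
              [0.0, 5.0, 2.0],
              [0.0, 0.0, 1.0]])
P = K @ R @ np.hstack([np.eye(3), -C[:, None]])
x = P @ np.append(X_world, 1.0)
image_point = x[:2] / x[2]
```

Both forms give the camera-frame point (0, 1, 10), and the projected image point is (3, 2.5).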

22

The mapping $X \mapsto x$ defined by the formula

\[
x = K R\, [\, I \mid -\tilde{C} \,]\, X
\]

is the general mapping given by a pinhole camera. A general pinhole camera

\[
P = K R\, [\, I \mid -\tilde{C} \,]
\]

has 9 degrees of freedom (3 DOF for $K$, namely $f$, $p_x$, $p_y$; 3 DOF for $R$; and 3 DOF for $\tilde{C}$).

23

It is often convenient not to make the camera center explicit in the world-to-image transformation. Instead, the world-to-camera transformation is represented as

\[
\tilde{X}_{cam} = R \tilde{X} + t.
\]

Hence the camera matrix becomes

\[
P = K R\, [\, I \mid -\tilde{C} \,] = K\, [\, R \mid t \,]
\]

where $t = -R\tilde{C}$.

24
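The two forms of the camera matrix can be compared numerically, and the camera center recovered from t. In this sketch the values of K, R, and C are assumed for illustration; since R is a rotation, its inverse is its transpose, so t = −RC gives C = −Rᵀt.

```python
import numpy as np

# Assumed sample values for K, R, and the camera center C.
K = np.array([[5.0, 0.0, 3.0],
              [0.0, 5.0, 2.0],
              [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
C = np.array([1.0, 2.0, 3.0])

t = -R @ C                                               # translation vector

P_center = K @ R @ np.hstack([np.eye(3), -C[:, None]])   # K R [ I | -C ]
P_rt = K @ np.hstack([R, t[:, None]])                    # K [ R | t ]

# R is orthogonal, so C can be recovered as -R^T t.
C_recovered = -R.T @ t
```

The matrices `P_center` and `P_rt` agree entry by entry, and `C_recovered` equals the original C.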

CCD Cameras

In the pinhole camera model it is assumed that the image coordinates have equal scales in both axial directions. In the case of CCD cameras, the scale factors may be unequal in the two directions.

25

[Figure: CCD Camera. Labels: image plane, 3D point (X, Y, Z), center of projection, charge-coupled device, m = (x, y, 1), p = (u, v, 1). Image coordinates: p = Km. Source: EURON Summer School on Visual Servoing, Camera Geometry, p. 9/43.]

26

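Suppose the scale factors in the directions $x$ and $y$ are $m_x$ and $m_y$, respectively. Hence the calibration matrix $K$ for the CCD camera is given by

\[
K =
\begin{bmatrix} m_x & 0 & 0 \\ 0 & m_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} f m_x & 0 & p_x m_x \\ 0 & f m_y & p_y m_y \\ 0 & 0 & 1 \end{bmatrix}.
\]

Hence a CCD camera

\[
P = K R\, [\, I \mid -\tilde{C} \,]
\]

has 10 degrees of freedom (that is, 4 + 3 + 3 = 10).

27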

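Note that the calibration matrix for the CCD camera can be written as

\[
K = \begin{bmatrix} \alpha_x & 0 & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\]

where $\alpha_x = f m_x$, $\alpha_y = f m_y$, $x_0 = p_x m_x$, and $y_0 = p_y m_y$.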
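The product that defines the CCD calibration matrix is easy to verify numerically. In the sketch below the values of f, px, py, mx, and my are assumed sample numbers; the code forms diag(mx, my, 1) times the pinhole K and reads off αx, αy, x0, y0.

```python
import numpy as np

# Assumed sample values.
f, px, py = 5.0, 2.0, 1.5      # focal length and principal point
mx, my = 100.0, 120.0          # scale factors in the x and y directions

K_pinhole = np.array([[f, 0.0, px],
                      [0.0, f, py],
                      [0.0, 0.0, 1.0]])
K_ccd = np.diag([mx, my, 1.0]) @ K_pinhole

# Entries of the CCD calibration matrix.
alpha_x, alpha_y = K_ccd[0, 0], K_ccd[1, 1]   # f*mx, f*my
x0, y0 = K_ccd[0, 2], K_ccd[1, 2]             # px*mx, py*my
```

With these values αx = 500, αy = 600, x0 = 200, and y0 = 180, so the two axes now carry different scales.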

28

Finite Projective Cameras

A camera $P$ is called a finite projective camera if the calibration matrix $K$ is of the form

\[
K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\tag{1}
\]

where $s$ is a parameter known as the skew parameter.

Hence a finite projective camera $P$ is given by

\[
P = K R\, [\, I \mid -\tilde{C} \,].
\]

29

• The $3 \times 3$ submatrix $KR$ of

\[
P = K R\, [\, I \mid -\tilde{C} \,]
\]

is non-singular.

• If $P$ is any $3 \times 4$ matrix for which the left-hand $3 \times 3$ submatrix, say $M$, is non-singular, then $M$ can be decomposed as $M = KR$, where $K$ is an upper-triangular matrix of the form (1) and $R$ is a rotation matrix.

30

Therefore, if the left-hand $3 \times 3$ submatrix $M$ of $P$ is non-singular, then the $3 \times 4$ matrix $P$ can be written as

\[
P = M\, [\, I \mid M^{-1} p_4 \,] = K R\, [\, I \mid -\tilde{C} \,]
\]

where $p_4$ is the last column of $P$, so that $\tilde{C} = -M^{-1} p_4$. Thus we have:

Result 5.1. The set of camera matrices of finite projective cameras is identical with the set of homogeneous $3 \times 4$ matrices for which the left-hand $3 \times 3$ submatrix is non-singular.
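The decomposition M = KR and the camera center can be recovered numerically from a given P. The slides do not name a method; the sketch below assumes an RQ decomposition built from NumPy's QR routine, which is one common choice. The function names `rq3` and `decompose_camera`, the singularity tolerance, and the sample camera are all assumptions made for illustration.

```python
import numpy as np

def rq3(M):
    """Factor a 3x3 matrix as M = U @ Q, U upper triangular, Q orthogonal."""
    J = np.flipud(np.eye(3))                 # reversal permutation, J @ J = I
    Q_t, R_t = np.linalg.qr((J @ M).T)       # (J M)^T = Q_t R_t
    return J @ R_t.T @ J, J @ Q_t.T          # M = (J R_t^T J)(J Q_t^T)

def decompose_camera(P):
    """Split a finite projective camera P into K, R and the camera center C."""
    M, p4 = P[:, :3], P[:, 3]
    if abs(np.linalg.det(M)) < 1e-12:        # arbitrary tolerance (assumption)
        raise ValueError("left-hand 3x3 submatrix is singular")
    K, R = rq3(M)
    D = np.diag(np.sign(np.diag(K)))         # flip signs so that diag(K) > 0
    K, R = K @ D, D @ R                      # D @ D = I, so K @ R is unchanged
    if np.linalg.det(R) < 0:                 # P is homogeneous: P and -P agree
        R = -R
    C = -np.linalg.solve(M, p4)              # from M^{-1} p4 = -C
    return K / K[2, 2], R, C

# Assumed sample camera with skew s = 2.
K0 = np.array([[500.0, 2.0, 30.0],
               [0.0, 600.0, 24.0],
               [0.0, 0.0, 1.0]])
R0 = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
C0 = np.array([1.0, 2.0, 3.0])
P = K0 @ R0 @ np.hstack([np.eye(3), -C0[:, None]])

K, R, C = decompose_camera(P)
```

For this sample camera the recovered K, R, and C match K0, R0, and C0 up to rounding. The sign fixes are needed because a QR-based factorization is unique only up to the signs of the diagonal of K.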

31

General Projective Cameras

A camera is called a general projective camera if it

can be represented by an arbitrary homogeneous 3×4

matrix of rank 3.

The rank 3 requirement is needed because if the rank

is less than 3, then the range of the matrix mapping

will be a line or a point but not the whole plane.
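The rank condition can be illustrated with two small matrices, both chosen here as assumed examples. The first has rank 3 although its left-hand 3×3 block is singular, so it is a general projective camera but not a finite one. The second has rank 2, and every image point it produces lies on a single line of the image plane.

```python
import numpy as np

# Rank 3, but the left-hand 3x3 block is singular (assumed example).
P_rank3 = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

# Rank 2: the third row is the sum of the first two (assumed example).
P_rank2 = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [1.0, 1.0, 0.0, 0.0]])

rank3 = np.linalg.matrix_rank(P_rank3)
rank2 = np.linalg.matrix_rank(P_rank2)
left_block_det = np.linalg.det(P_rank3[:, :3])

# Map 100 random homogeneous points with the rank-2 matrix.
rng = np.random.default_rng(0)
points = rng.normal(size=(4, 100))
images = P_rank2 @ points

# Every image (a, b, a + b) satisfies a + b - (a + b) = 0, i.e. it lies on
# the line with homogeneous coordinates (1, 1, -1).
line = np.array([1.0, 1.0, -1.0])
residual = np.abs(line @ images).max()
```

The residual is zero up to rounding, which shows the range of the rank-2 matrix collapsing to a line rather than covering the whole image plane.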

32

END

33