Chapter 13 Perspective Projection Geometry

13.1 Introduction
Computer vision problems often involve interpreting information on a two-dimensional (2D) image of a three-dimensional (3D) world in order to determine the placement of the 3D objects portrayed in the image.
Doing this requires an understanding of the perspective transformation that governs how 3D information is projected geometrically onto the 2D image.
image formation on the retina, according to Descartes:
scrape an ox eye and, from a darkened room, observe the inverted image of the scene
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.1=====




13.2 One-Dimensional Perspective Projection
$f$: focal length of lens
$u$: distance between object and lens center
$v$: distance between image and lens center
thin-lens equation (lens law):

\begin{displaymath}
\frac{1}{f} = \frac{1}{u} + \frac{1}{v}
\end{displaymath}

light passing through the lens center is not deflected
light parallel to the optical axis passes through the focal point
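The lens law can be solved for the image distance $v$ given $f$ and $u$; a minimal Python sketch (the function name and sample values are illustrative, not from the text):

```python
def image_distance(f, u):
    """Solve the thin-lens law 1/f = 1/u + 1/v for the image distance v."""
    # v = f*u / (u - f); valid when the object is beyond the focal length (u > f)
    return f * u / (u - f)

# an object at u = 2f focuses at v = 2f (unit magnification)
print(image_distance(50.0, 100.0))  # 100.0
```

As $u \to \infty$ the image distance approaches $f$, which is why a distant scene is focused at the focal plane.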
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.11=====
perspective projection: object point projected along a straight line through the lens center
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.3=====
image distortion: room ceiling appears bowed in the image
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.13=====




pinhole camera: infinitesimally small aperture
pinhole camera: approximated by a lens with its aperture stopped down to the smallest setting
pinhole camera: the simplest device for forming an image of a 3D scene on a 2D surface
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.2=====
as the aperture size decreases, the image becomes sharper
aperture diameters: 0.06 inch, 0.015 inch, 0.0025 inch
below a certain aperture size: diffraction, the bending of light rays around the edge of the aperture
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.10=====




lens oriented along the $y$-axis and image plane parallel to $x$-axis
perspective projection gives point $(r,s)$ coordinates $(rf/s,f)$ on image
=====Fig. 13.1=====
$f$: camera constant (not the focal length of the thin-lens equation above)
$(r,s,1)$: homogeneous coordinate system for point $(r,s)$
first linear transformation: translates $(r,s,1)$ by distance of $f$
second linear transformation: takes perspective transformation to image line

\begin{displaymath}
\left (
\begin{array}{c}
u \\
v
\end{array}\right ) = \left (
\begin{array}{ccc}
1 & 0 & 0 \\
0 & 1/f & 0
\end{array}\right ) \left (
\begin{array}{c}
r \\
s \\
1
\end{array}\right ) = \left (
\begin{array}{c}
r \\
s/f
\end{array}\right )
\end{displaymath}

1D image line coordinate: $x_I = u/v =rf/s $
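As a numerical check, one $2 \times 3$ matrix consistent with $x_I = u/v = rf/s$ maps the homogeneous point $(r,s,1)$ to $(r, s/f)$; the matrix form and sample values below are illustrative assumptions:

```python
import numpy as np

f = 2.0  # camera constant (assumed value)
# projective matrix taking (r, s, 1) to (u, v) = (r, s/f)
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0 / f, 0.0]])

r, s = 3.0, 6.0                   # object point (r, s), assumed
u, v = P @ np.array([r, s, 1.0])
x_I = u / v                       # 1D image-line coordinate
print(x_I)                        # equals r*f/s = 1.0
```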




lens: at origin and looks down $y'$-axis
image line: distance $f$ in front of lens and parallel to $x'$-axis
$x'-y'$ axes: the $x-y$ axes rotated anticlockwise by angle $\theta$
=====Fig. 13.2=====
rewriting the relationship in terms of homogeneous coordinate system

\begin{displaymath}
\left (
\begin{array}{c}
u \\
v
\end{array}\right ) = \left (
\begin{array}{ccc}
\cos \theta & \sin \theta & 0 \\
-\sin \theta / f & \cos \theta / f & 0
\end{array}\right ) \left (
\begin{array}{c}
r \\
s \\
1
\end{array} \right )
\end{displaymath}




13.3 The Perspective Projection in 3D
camera lens: along line parallel to $z$-axis
position of lens: center of perspectivity: $(x_0, y_0, z_0)$
$(u,v)$: coordinates of perspective projection of $(x,y,z)$ on image plane

\begin{displaymath}
\left (
\begin{array}{c}
x^* \\
y^* \\
t^*
\end{array}\right ) = \left (
\begin{array}{cccc}
f & 0 & 0 & -f x_0 \\
0 & f & 0 & -f y_0 \\
0 & 0 & 1 & -z_0
\end{array}\right ) \left (
\begin{array}{c}
x \\
y \\
z \\
1
\end{array} \right )
\end{displaymath}

thus

\begin{displaymath}
u = \frac{x^*}{t^*}=f\frac{x-x_0}{z-z_0} \ \ \ \ \
v = \frac{y^*}{t^*}=f\frac{y-y_0}{z-z_0}
\end{displaymath}
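The projection equations translate directly into code; a small sketch with assumed sample values:

```python
def project(point, center, f):
    """Perspective projection of a 3D point onto the image plane.

    center = (x0, y0, z0) is the center of perspectivity;
    f is the camera constant.  Returns image coordinates (u, v).
    """
    x, y, z = point
    x0, y0, z0 = center
    u = f * (x - x0) / (z - z0)
    v = f * (y - y0) / (z - z0)
    return u, v

# a point twice as far along the same ray projects to the same image point
print(project((1.0, 2.0, 4.0), (0.0, 0.0, 0.0), 2.0))  # (0.5, 1.0)
print(project((2.0, 4.0, 8.0), (0.0, 0.0, 0.0), 2.0))  # (0.5, 1.0)
```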




13.3.1 Smaller Appearance of Farther Objects
without loss of generality: take center of perspectivity to be origin
perspective projection: objects appear smaller the farther they are
=====Fig. 13.3=====
foreshortening: line segments in a plane parallel to the image plane have maximal projected size
=====Fig. 13.4=====




13.3.2 Lines to Lines
lines in the 3D world transform to lines in the image plane
parallel lines in 3D with nonzero $z$ slope: meet in a vanishing point
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.4=====




13.3.3 Perspective Projections of Convex Polyhedra Are Convex
proofs in the textbook are simple but tedious; study them as an exercise on your own




13.3.4 Vanishing Point
Perspective projections of parallel 3D lines having nonzero slope along the optic $z$-axis meet in a vanishing point on the image projection plane.
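This can be checked numerically: for a line $a + \lambda b$ with $b_3 \neq 0$ and the lens at the origin, letting $\lambda \to \infty$ in $u = f(a_1+\lambda b_1)/(a_3+\lambda b_3)$ gives the vanishing point $(f b_1/b_3,\; f b_2/b_3)$, independent of $a$. A hedged sketch (all numbers are illustrative):

```python
def vanishing_point(b, f):
    """Vanishing point of 3D lines with direction b = (b1, b2, b3), b3 != 0."""
    b1, b2, b3 = b
    return f * b1 / b3, f * b2 / b3

def project(p, f):  # lens at the origin
    x, y, z = p
    return f * x / z, f * y / z

f, b = 1.0, (1.0, 2.0, 1.0)
# distant points on two parallel lines approach the same image point
far1 = project((0.0 + 1e6 * b[0], 0.0 + 1e6 * b[1], 5.0 + 1e6 * b[2]), f)
far2 = project((3.0 + 1e6 * b[0], -2.0 + 1e6 * b[1], 9.0 + 1e6 * b[2]), f)
print(vanishing_point(b, f))  # (1.0, 2.0)
print(far1, far2)             # both approximately (1.0, 2.0)
```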




13.3.5 Vanishing Line
All lines lying in planes parallel to the slanted floor have vanishing points that lie along a vanishing line.
=====Fig. 13.5=====
=====Tour Into Picture=====
=====Garfield 17:21=====




13.3.6 3D Lines to 2D Perspective Projection Lines
There is a relationship between the parameters of a 3D line and the parameters of the perspective projection of the line.
3D line L:

\begin{displaymath}
L = \left \{ \left (
\begin{array}{c}
x \\
y \\
z
\end{array} \right ) = \left (
\begin{array}{c}
a_1 \\
a_2 \\
a_3
\end{array} \right ) + \lambda \left (
\begin{array}{c}
b_1 \\
b_2 \\
b_3
\end{array} \right ) \right \}
\end{displaymath}

perspective projection of the line $L$:

\begin{displaymath}
M = \left \{ \left (
\begin{array}{c}
u \\
v
\end{array} \right ) = \left (
\begin{array}{c}
c_1 \\
c_2
\end{array} \right ) + \eta \left (
\begin{array}{c}
d_1 \\
d_2
\end{array} \right ) \right \}
\end{displaymath}

take camera lens as origin of coordinate system

\begin{displaymath}
u = f\frac{x}{z}
\end{displaymath}


\begin{displaymath}
v=f\frac{y}{z}
\end{displaymath}

for each $\lambda$ there is an $\eta$ such that:

\begin{displaymath}
c_1+\eta d_1 = f(a_1+\lambda b_1)/(a_3+\lambda b_3)
\end{displaymath}


\begin{displaymath}
c_2+\eta d_2 = f(a_2+\lambda b_2)/(a_3+\lambda b_3)
\end{displaymath}

eliminate $\eta$:

\begin{displaymath}
c_1 d_2 - c_2 d_1 = \frac{d_2 f(a_1+\lambda b_1)-d_1 f(a_2+\lambda b_2)}{
a_3+\lambda b_3}
\end{displaymath}

for any $\lambda$:

\begin{displaymath}
(c_1 d_2-c_2 d_1)a_3-(d_2 a_1 - d_1 a_2)f+\lambda[b_3(c_1 d_2-c_2 d_1)
-(d_2 b_1 - d_1 b_2)f]=0
\end{displaymath}

this holds for every $\lambda$, so the constant term and the coefficient of $\lambda$ must each vanish; thus

\begin{displaymath}
d_2 f a_1 - d_1 f a_2 + (c_2 d_1 - c_1 d_2)a_3 = 0
\end{displaymath}


\begin{displaymath}
d_2 f b_1 - d_1 f b_2 + (c_2 d_1 - c_1 d_2)b_3 = 0
\end{displaymath}
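The two constraint equations can be verified numerically: project two points of a 3D line $a + \lambda b$, form the image line $(c, d)$ from the projections, and check that both expressions vanish (all sample values assumed):

```python
import numpy as np

f = 1.0
a = np.array([1.0, 2.0, 4.0])   # point on the 3D line (assumed)
b = np.array([0.5, -1.0, 2.0])  # direction of the 3D line (assumed)

def proj(p):
    """Perspective projection with the lens at the origin."""
    return np.array([f * p[0] / p[2], f * p[1] / p[2]])

# image line through the projections of two points on L gives (c, d)
p0, p1 = proj(a), proj(a + b)
c, d = p0, p1 - p0
c1, c2 = c
d1, d2 = d

# both derived constraints should evaluate to (numerically) zero
print(d2 * f * a[0] - d1 * f * a[1] + (c2 * d1 - c1 * d2) * a[2])
print(d2 * f * b[0] - d1 * f * b[1] + (c2 * d1 - c1 * d2) * b[2])
```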




13.4 2D to 3D Inference Using Perspective Projection
the perspective projection of an unknown 3D line provides four of the six constraints needed to determine the line
additional constraints: 3D-world-model information about points, lines




13.4.1 Inverse Perspective Projection
$(u,v)$: perspective projection of a point
$f$: image plane distance from camera lens
thus $(u,v,f)$: 3D coordinate of the point in image plane
camera lens: at the origin
line $L$: inverse perspective projection of the point $(u,v)$

\begin{displaymath}
L = \left \{ \left (
\begin{array}{c}
x \\
y \\
z
\end{array} \right ) = \lambda \left (
\begin{array}{c}
u \\
v \\
f
\end{array} \right ), \ \lambda > 0 \right \}
\end{displaymath}
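Every point on this ray projects back to the same image point; a quick check (sample values assumed):

```python
def inverse_perspective(u, v, f, lam):
    """Point at parameter lam on the ray through the lens (origin)
    and the image point (u, v, f)."""
    return (lam * u, lam * v, lam * f)

u, v, f = 0.5, 1.0, 2.0
for lam in (1.0, 3.0, 10.0):
    x, y, z = inverse_perspective(u, v, f, lam)
    print(f * x / z, f * y / z)   # always (0.5, 1.0)
```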




13.4.2 Line Segment with Known Direction Cosines and Known Length
known:

unknown:




13.4.3 Collinear Points with Known Interpoint Distances
known:

unknown:




13.4.4 $N$ Parallel Lines
known:

unknown:




13.4.5 $N$ Lines Intersecting at a Point with Known Angles
known:

unknown: $\cos \theta_{pq}= i_p i_q + j_p j_q + k_p k_q$: known angle between line pair




13.4.6 $N$ Intersecting Lines in a Known Plane
known:

unknown:

\begin{displaymath}
A i_n + B j_n + C k_n = 0
\end{displaymath}

normal vector: perpendicular to the direction cosines
=====joke=====




13.4.7 Three Lines in a Plane with One Perpendicular to the Other Two
known:

unknown: three lines in the same plane, one perpendicular to the other two; $k_1 m_1 + k_2 m_2 + k_3 m_3 = 0$ since $L_3$ is perpendicular to $L_1$ and $L_2$




13.4.8 Point with Given Distance to a Known Point
known:

unknown: $\sqrt{(\eta u - a)^2+(\eta v - b)^2+(\eta f - c)^2} = \rho$
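Squaring the distance constraint gives a quadratic in $\eta$: $(u^2+v^2+f^2)\eta^2 - 2(au+bv+cf)\eta + (a^2+b^2+c^2-\rho^2) = 0$. A hedged sketch of the solution (function name and sample values are illustrative):

```python
import math

def eta_solutions(u, v, f, a, b, c, rho):
    """Solve |eta*(u, v, f) - (a, b, c)| = rho for eta (0, 1, or 2 roots)."""
    A = u * u + v * v + f * f
    B = -2.0 * (a * u + b * v + c * f)
    C = a * a + b * b + c * c - rho * rho
    disc = B * B - 4.0 * A * C
    if disc < 0:
        return []                     # no 3D point satisfies the constraint
    root = math.sqrt(disc)
    return sorted([(-B - root) / (2 * A), (-B + root) / (2 * A)])

# known point lies on the ray at eta = 2, so rho = 0 yields exactly eta = 2
print(eta_solutions(1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 0.0))  # [2.0, 2.0]
```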




13.4.9 Point in a Known Plane
known:

unknown: solution:




13.4.10 Line in a Known Plane
known:

unknown: $Ai+Bj+Ck=0$: since the line's direction lies in the plane
$Aa+Bb+Cc+D=0$: since the line's point $(a,b,c)$ lies in the plane




13.4.11 Angle
known:

unknown: $\cos \theta = i_0 i_1 + j_0 j_1 + k_0 k_1$: since $\theta$ is the angle between 3D lines




13.4.12 Parallelogram
known:

unknown: $
\left ( \begin{array}{c}
m_1 \\
m_2 \\
m_3
\end{array} \right ),
\left ( \begin{array}{c}
n_1 \\
n_2 \\
n_3
\end{array} \right )$: direction cosines of two sides of parallelogram

\begin{displaymath}
\left ( \begin{array}{c}
A \\
B \\
C
\end{array} \right ) =
\frac{ \left ( \begin{array}{c}
m_1 \\
m_2 \\
m_3
\end{array} \right ) \times \left ( \begin{array}{c}
n_1 \\
n_2 \\
n_3
\end{array} \right ) }{ \left \Vert \left ( \begin{array}{c}
m_1 \\
m_2 \\
m_3
\end{array} \right ) \times \left ( \begin{array}{c}
n_1 \\
n_2 \\
n_3
\end{array} \right ) \right \Vert }
\end{displaymath}
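The plane normal is the normalized cross product of the two side directions; a minimal sketch (sample directions assumed):

```python
import numpy as np

def plane_normal(m, n):
    """Unit normal of the plane spanned by direction cosines m and n."""
    cross = np.cross(m, n)
    return cross / np.linalg.norm(cross)

# two sides lying in a z = const plane give the z-axis as the normal
m = np.array([1.0, 0.0, 0.0])
n = np.array([0.0, 1.0, 0.0])
print(plane_normal(m, n))  # [0. 0. 1.]
```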




13.4.13 Triangle with One Vertex Known
known:

unknown:

\begin{displaymath}
(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2 = S_{12}^2
\end{displaymath}


\begin{displaymath}
(x_3-x_1)^2+(y_3-y_1)^2+(z_3-z_1)^2 = S_{13}^2
\end{displaymath}


\begin{displaymath}
(x_2-x_3)^2+(y_2-y_3)^2+(z_2-z_3)^2 = S_{23}^2
\end{displaymath}


\begin{displaymath}
x_2 = \frac{u_2 z_2}{f}
\end{displaymath}


\begin{displaymath}
y_2 = \frac{v_2 z_2}{f}
\end{displaymath}


\begin{displaymath}
x_3 = \frac{u_3 z_3}{f}
\end{displaymath}


\begin{displaymath}
y_3 = \frac{v_3 z_3}{f}
\end{displaymath}




13.4.14 Triangle with Orientation of One Leg Known
known:

unknown:

\begin{displaymath}
(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2 = S_{12}^2
\end{displaymath}


\begin{displaymath}
(x_3-x_1)^2+(y_3-y_1)^2+(z_3-z_1)^2 = S_{13}^2
\end{displaymath}


\begin{displaymath}
(x_2-x_3)^2+(y_2-y_3)^2+(z_2-z_3)^2 = S_{23}^2
\end{displaymath}


\begin{displaymath}
\left ( \begin{array}{c} x_2 \\ y_2 \\ z_2 \end{array} \right ) =
\left ( \begin{array}{c} x_1 \\ y_1 \\ z_1 \end{array} \right ) + S_{12}
\left ( \begin{array}{c} i_1 \\ j_1 \\ k_1 \end{array} \right )
\end{displaymath}




13.4.15 Triangle: three-point spatial resection problem in photogrammetry
known:

unknown: four solutions

\begin{displaymath}
(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2 = S_{12}^2
\end{displaymath}


\begin{displaymath}
(x_3-x_1)^2+(y_3-y_1)^2+(z_3-z_1)^2 = S_{13}^2
\end{displaymath}


\begin{displaymath}
(x_2-x_3)^2+(y_2-y_3)^2+(z_2-z_3)^2 = S_{23}^2
\end{displaymath}


\begin{displaymath}
u_n = \frac{f x_n}{z_n} \ \ \ n = 1, 2, 3
\end{displaymath}


\begin{displaymath}
v_n = \frac{f y_n}{z_n} \ \ \ n = 1, 2, 3
\end{displaymath}




13.4.16 Determining the Principal Point by Using Parallel Lines
principal point: point through which the optic axis passes
principal point: so far assumed to be the origin of the image reference frame
known:

unknown:




13.5 Circles
known:

unknown: =====Oldie 33:70=====




13.6 Range from Structured Light
structured light: active visual sensing technique based upon perspective geometry
structured light: controlled light source projecting a regular pattern onto the scene
regular pattern: stripes, grid, ...
light striping and a typical arrangement
=====Ballard and Brown, Computer Vision, Fig. 2.25=====
intensity and range images
=====Ballard and Brown, Computer Vision, Fig. 2.26=====
Two light sources with cylindrical lenses produce sheets of light that intersect in a line lying on the surface of a conveyor belt.
A camera above the belt is aimed so that this line is imaged on a linear array of photosensors.
When there is no object present, all the sensor cells are brightly illuminated.
When part of an object interrupts the incident light, the corresponding region on the linear array is darkened.
The motion of the belt scans the object past the sensor, generating the second image dimension.
=====Horn, Robot Vision, Fig. 5.4=====
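A light-stripe range sensor can be sketched by combining a known light-sheet plane with the inverse perspective ray of Section 13.4.1; the function and all numbers below are illustrative assumptions, not the book's formulation:

```python
def range_from_stripe(u, v, f, plane):
    """Intersect the back-projected ray lam*(u, v, f) with the known
    light-sheet plane A*x + B*y + C*z + D = 0 (lens at the origin)."""
    A, B, C, D = plane
    denom = A * u + B * v + C * f
    lam = -D / denom          # assumes the ray is not parallel to the sheet
    return (lam * u, lam * v, lam * f)

# vertical light sheet x = 10, i.e. 1*x + 0*y + 0*z - 10 = 0
print(range_from_stripe(2.0, 1.0, 4.0, (1.0, 0.0, 0.0, -10.0)))
# (10.0, 5.0, 20.0)
```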




13.7 Cross-Ratio
cross-ratio: the cross-ratio of the perspective projections of 4 collinear points takes the same value for any center of perspectivity




13.7.1 Cross-Ratio Definitions and Invariance
four collinear points: $p+\lambda_1 b, p+\lambda_2 b, p+\lambda_3 b,
p+\lambda_4 b$
$q,r$: centers of perspectivity for two projection images
=====Fig. 13.7=====
Let $p = \left ( \begin{array}{c} p_1 \\ p_2 \end{array} \right ), \ \ \
b = \left ( \begin{array}{c} b_1 \\ b_2 \end{array} \right )$. By the perspective projection equations,

\begin{displaymath}
u_n = f \frac{p_1+\lambda_n b_1}{p_2+\lambda_n b_2} \ \ \ \ \
n = 1,2,3,4
\end{displaymath}


\begin{displaymath}
u_i-u_j=f \frac{p_1+\lambda_i b_1}{p_2+\lambda_i b_2}-
f \frac{p_1+\lambda_j b_1}{p_2+\lambda_j b_2} =
\frac{f(b_1 p_2 - b_2 p_1)(\lambda_i - \lambda_j)}{(p_2+\lambda_i b_2)(p_2+
\lambda_j b_2)}
\end{displaymath}

cross-ratio:

\begin{displaymath}
\gamma(u_1, u_2, u_3, u_4)=\frac{(u_1 - u_2)(u_3 - u_4)}{(u_1 - u_3)(u_2 - u_4)}
=\frac{(\lambda_1 - \lambda_2)(\lambda_3 - \lambda_4)}{(\lambda_1 - \lambda_3)
(\lambda_2 - \lambda_4)}
\end{displaymath}

cross-ratio: independent of reference frame, point $p$, direction cosine $b$
cross-ratio: depends only on directed distance of collinear points
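The invariance can be demonstrated numerically: the cross-ratio of the projected coordinates equals the cross-ratio of the $\lambda_n$, whatever $p$, $b$, and $f$ are (all sample values below are assumed):

```python
def cross_ratio(u1, u2, u3, u4):
    return ((u1 - u2) * (u3 - u4)) / ((u1 - u3) * (u2 - u4))

def project_1d(p, b, lam, f):
    """1D perspective projection of the collinear point p + lam*b."""
    x, y = p[0] + lam * b[0], p[1] + lam * b[1]
    return f * x / y

lams = [0.0, 1.0, 2.0, 4.0]
# two different projection setups (different p, b, f)
us = [project_1d((1.0, 3.0), (1.0, 0.5), l, 1.0) for l in lams]
ws = [project_1d((-2.0, 5.0), (0.5, 1.0), l, 3.0) for l in lams]
print(cross_ratio(*us), cross_ratio(*ws))  # both 1/3
print(cross_ratio(*lams))                  # cross-ratio of the lambdas: 1/3
```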




13.7.2 Only One Cross-Ratio
each of the $4! = 24$ cross-ratios is a function of the cross-ratio $\gamma(u_1, u_2, u_3, u_4)$

\begin{displaymath}
\gamma(u_1, u_2, u_3, u_4)=\gamma(u_2, u_1, u_4, u_3)
=\gamma(u_3, u_4, u_1, u_2)=\gamma(u_4, u_3, u_2, u_1)
\end{displaymath}


\begin{displaymath}
\gamma(u_1, u_2, u_4, u_3)=\gamma(u_2, u_1, u_3, u_4)
=\gamma(u_4, u_3, u_1, u_2)=\gamma(u_3, u_4, u_2, u_1)
\end{displaymath}


\begin{displaymath}
=\frac{-\gamma(u_1, u_2, u_3, u_4)}{1-\gamma(u_1, u_2, u_3, u_4)}
\end{displaymath}


\begin{displaymath}
\gamma(u_1, u_4, u_3, u_2)=\gamma(u_4, u_1, u_2, u_3)
=\gamma(u_2, u_3, u_4, u_1)=\gamma(u_3, u_2, u_1, u_4)
\end{displaymath}


\begin{displaymath}
=1-\gamma(u_1, u_2, u_3, u_4)
\end{displaymath}


\begin{displaymath}
\gamma(u_1, u_3, u_2, u_4)=\gamma(u_3, u_1, u_4, u_2)
=\gamma(u_4, u_2, u_3, u_1)=\gamma(u_2, u_4, u_1, u_3)
\end{displaymath}


\begin{displaymath}
=\frac{1}{\gamma(u_1, u_2, u_3, u_4)}
\end{displaymath}


\begin{displaymath}
\gamma(u_1, u_3, u_4, u_2)=\gamma(u_3, u_1, u_2, u_4)
=\gamma(u_2, u_4, u_3, u_1)=\gamma(u_4, u_2, u_1, u_3)
\end{displaymath}


\begin{displaymath}
=\frac{1}{1-\gamma(u_1, u_2, u_3, u_4)}
\end{displaymath}


\begin{displaymath}
\gamma(u_1, u_4, u_2, u_3)=\gamma(u_4, u_1, u_3, u_2)
=\gamma(u_2, u_3, u_1, u_4)=\gamma(u_3, u_2, u_4, u_1)
\end{displaymath}


\begin{displaymath}
=\frac{1-\gamma(u_1, u_2, u_3, u_4)}{-\gamma(u_1, u_2, u_3, u_4)}
\end{displaymath}




13.7.3 Cross-Ratio in Three Dimensions
The cross-ratio derived from 1D perspective projections in a 2D world can be generalized to 2D perspective projections in a 3D world.
five coplanar points
$\gamma_{23}$: cross-ratio for the line segment $p_2 p_3$ and $0 < \lambda_1 < \lambda_2 < L_1$

\begin{displaymath}
\gamma_{23}=\frac{\lambda_1 (L_1 - \lambda_2)}{\lambda_2 (L_1 - \lambda_1)}
\end{displaymath}

$\gamma_{25}$: cross-ratio for the line segment $p_2 p_5$ and $0 < \eta_1 < \eta_2 < L_2$

\begin{displaymath}
\gamma_{25}=\frac{\eta_1 (L_2 - \eta_2)}{\eta_2 (L_2 - \eta_1)}
\end{displaymath}

=====Fig. 13.8=====




13.7.4 Using Cross-Ratios
cross-ratios: to aid in establishing correspondences
=====joke=====




tentative term project problems (30% of total grade):
submit one page in English explaining the method, steps, and expected results
submit the report in a month; report progress every other week
it is all right for the project to be the same problem as your Master's thesis
objective: a working prototype with new, original, novel ideas
objective: not just a literature survey
objective: not just a straightforward implementation of existing algorithms
objective: it is all right to modify existing algorithms
1. Neural Network, Gonzalez, Sec. 9.3.3, p. 595
=====Gonzalez, Digital Image Processing, Fig. 9.14=====
2. Fuzzy Logic
3. Image Compression: JPEG, MPEG, Gonzalez, Sec. 6.6, p. 389
=====Gonzalez, Digital Image Processing, Plate X=====
4. wavelet transform
5. segmentation based on texture, Gonzalez, p. 594
=====Gonzalez, Digital Image Processing, Fig. 9.13=====
6. optical character reading: a, b, c, d, ... 0, 1, 2, .....
=====fonts=====
7. stereo vision, Nalwa, Chapter 7
=====Nalwa, A Guided Tour of Computer Vision, Fig. 7.9=====
=====Nalwa, A Guided Tour of Computer Vision, Fig. 7.10=====
8. handwriting recognition: zip code; on-line, off-line Chinese characters ...
=====zip=====
=====Chinese=====
9. histogram specification: Gonzalez, p. 182, Sec. 4.2.2
=====Gonzalez, Digital Image Processing, Fig. 4.14=====
10. Homomorphic filtering: Gonzalez, Sec. 4.4.3
=====Gonzalez, Digital Image Processing, Fig. 4.41=====
=====Gonzalez, Digital Image Processing, Fig. 4.42=====
11. real-time counting of cars and measurement of their sizes.
12. calculating the sizes of stones, cells, and cell nuclei.
=====t_pebbles.im=====
13. trademark resemblance, semi-automatic similarity classification
=====trademark.im=====
14. car plate recognition.
15. structured light 3-D reconstruction. Horn, p. 95
=====Horn, Robot Vision, Fig. 5.4=====
=====Ballard and Brown, Computer Vision, Fig. 2.25=====
=====Ballard and Brown, Computer Vision, Fig. 2.26=====
16. object classification with moments invariant to rotation,
scaling, translation, Gonzalez, p. 514, Sec. 8.3.4
=====Gonzalez, Digital Image Processing, Eq. 8.3-14=====
=====Gonzalez, Digital Image Processing, Fig. 8.24=====
17. photometric stereo, Sec. 12.3, p. 16, Eq. 12.14
18. shape from focus, defocus, Sec. 12.4.1, p. 21
19. shape from polarization, Sec. 12.5, p. 22
20. shape from shading, Horn, p. 226
=====Horn, Robot Vision, Fig. 10.18=====
=====Horn, Robot Vision, Fig. 10.19=====
21. shape from texture, Nalwa, p. 199
=====Nalwa, A Guided Tour of Computer Vision, Fig. 6.1=====
=====Nalwa, A Guided Tour of Computer Vision, Fig. 6.6=====
22. solving correspondence problem or optic flow field
=====truck.im=====
23. motion and shape parameter recovery
=====truck.im=====
24. segmentation of newspaper, documents into title, figure, caption,....
=====magazine.im=====
25. optical distortion correction, Nalwa, p. 46
=====Nalwa, A Guided Tour of Computer Vision, Fig. 2.13=====
26. line labeling of 2D line drawing of 3D objects, Sec. 17.5, p. 404
=====Fig. 17.13=====
27. Computer Tomography: 3D image reconstruction from 2D projections
=====Cho $et\ al.$, Foundations of Medical Imaging, Fig. 6.6=====
=====Wicke, Atlas of Radiologic Anatomy, Fig. 159=====
=====Wicke, Atlas of Radiologic Anatomy, Fig. 160=====
28. road surface inspection: cracks, holes, ...
29. traffic counting: number of trucks, cars, motorcycles, ...
30. fingerprint validation
31. face recognition
32. use robot to pick stones out of a bin
=====stone.im=====
33. digital morphing
=====morphing.im=====
34. model-based diagnosis
=====thorax.im=====
35. automatic identification and classification of video scenes for indexing
36. automatic break detection or video partitioning or scene change detection
37. automatic full-video search for objects of interest
e.g. find all frames which contain frogs
38. creating video or image database with functions
such as efficient querying, indexing, retrieval, browsing
39. recognition of video content invariant to viewing conditions
e.g. find all other shots of this scene
40. printed music sheet recognition and translation into MIDI format file
=====music.im=====
41. 360 degree image from several pictures (alignment, color interpolation)
42. wafer defect inspection
=====wafer.defect=====
43. wafer critical dimension measurement
=====Elliott, Integrated Circuit Fabrication Technology, Fig. 8.54=====
44. IC pin inspection: coplanarity of surface-mount devices,
collinearity of dual-in-line packages
=====IC.pin=====
45. IC mark printing inspection: smear, contrast, scratch, ...
=====IC.mark=====
46. electrical contact point inspection
=====electrical.contact=====
47. digital watermarking: viewing but no printing due to copyright
48. auto focus in digital camera
49. auto exposure in digital camera
50. auto white balance in digital camera
51. color management: sRGB: standard Red, Green, Blue
52. 640X480 ==> 1280X960 from single image
53. super resolution: 640X480 ==> 1280X960 from multiple images
54. video stabilization for digital camcorder




Project due April 9:
camera calibration, i.e., compute #pixels/mm of object displacement
calculate the field of view in angles
calculate and compare with the theoretical values
use lenses of focal length: 16mm, 25mm, 50mm
object displacements of: 1mm, 5mm, 10mm, 20mm
object distances of: 0.5m, 1m, 2m
camera parameters: 8mm$\times$ 6mm $\Rightarrow$ 512$\times$485?
Are pixels square or rectangular?
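For a pinhole model, the field of view and the pixels-per-mm scale can be estimated as below; this is a sketch under assumed parameters (8 mm sensor width, 512 pixels across it), not the graded solution:

```python
import math

def field_of_view_deg(sensor_mm, focal_mm):
    """Full angular field of view for a sensor of given width at focal length f."""
    return 2.0 * math.degrees(math.atan(sensor_mm / (2.0 * focal_mm)))

def pixels_per_mm(f_mm, distance_mm, sensor_mm, n_pixels):
    """Image pixels per mm of object displacement (thin-lens approximation).
    Magnification ~ f/(u - f); pixel pitch is sensor_mm/n_pixels."""
    magnification = f_mm / (distance_mm - f_mm)
    return magnification * n_pixels / sensor_mm

for f in (16.0, 25.0, 50.0):             # the lenses from the assignment
    print(f, field_of_view_deg(8.0, f))  # horizontal FOV for an 8 mm sensor
print(pixels_per_mm(16.0, 1000.0, 8.0, 512))  # 16 mm lens, object at 1 m
```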



2002-02-26