Computer Vision (I)

Course Number: 526 U1090
Credits: 3
Time: Tuesday 6, 7, 8 (2:10PM-5:00PM)
Classroom: New CSIE Classroom 309
Classification: Elective for junior, senior, and graduate students
Prerequisite: None
Instructor: Chiou-Shann Fuh
Office: New Computer Science and Information Engineering 327
Phone: 23625336 ext. 327, 23630231 ext. 3232 ext. 327
Office Hours: Tuesday 9AM-11AM
Objective: To learn computer and robot vision through extensive
course projects.




Textbook: R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, Vol. I, Addison-Wesley, Reading, MA, 1992.
Reference: L. G. Shapiro and G. C. Stockman, Computer Vision, Prentice-Hall, Upper Saddle River, NJ, 2001.
Reference: R. Jain, R. Kasturi, and B. G. Schunck, Machine Vision, McGraw-Hill, New York, 1995.
Reference: R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, Reading, MA, 1992.
Projects: assigned every week or every other week (30%)
Examinations: one midterm (30%) and one final (40%)




Content:
This is the first semester of a fast-paced course covering robot and computer vision. This semester covers low-level vision, with mostly no reference to the third dimension:
1. Computer Vision: Overview
2. Binary Machine Vision: Thresholding and Segmentation
3. Binary Machine Vision: Region Analysis
4. Statistical Pattern Recognition
5. Mathematical Morphology
6. Neighborhood Operators
7. Conditioning and Labeling
8. The Facet Model
9. Texture
10. Image Segmentation
11. Arc Extraction and Segmentation

Next semester covers higher-level techniques.

Chapter 1 Computer Vision: Overview

1.1 Introduction
Computer vision is the science that develops the theoretical and algorithmic basis by which useful information about the world can be automatically extracted and analyzed from an observed image, image set, or image sequence, using computations made by special-purpose or general-purpose computers.
computer vision: to emulate human vision with computers
computer vision: the dual process of computer graphics: 2D $\rightarrow$ 3D (graphics renders 3D models into 2D images; vision recovers 3D information from 2D images)




Information:

=====Oldie 34:10=====




Applications:




image: spatial representation of object, 2D or 3D scene, or another image
intensity image: optic or photographic sensors, radiant energy
range image: line-of-sight distance
image $I(r,c)$: intensity value at row $r$ and column $c$ of the matrix
pixel: picture element: has properties of position and value
gray levels: pixel values of intensity images,
0 (black) - 255 (white) for 8-bit integers
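
A minimal C sketch of this representation (an assumption for illustration, not the course library's API): an 8-bit intensity image stored as a row-major matrix, indexed as $I(r,c)$.

/* gray-level image: rows x cols matrix of 8-bit pixels (0 = black, 255 = white) */
typedef struct {
    int rows, cols;
    unsigned char *data;                    /* row-major storage */
} Image;

/* I(r,c): intensity value at row r and column c */
static unsigned char get_pixel(const Image *im, int r, int c)
{
    return im->data[r * im->cols + c];
}

static void set_pixel(Image *im, int r, int c, unsigned char v)
{
    im->data[r * im->cols + c] = v;
}

The later sketches in this chapter reuse this hypothetical Image type and its two helpers.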




factors determining the difficulty of a computer vision problem:

(a) white (gray tone 255) square on black (gray tone 0) background
(b) corners from a corner-feature extractor, N: noncorner, C: corner
=====Fig. 1.1=====




atomic image features:




Composite features: atomic features merged




1.2 Recognition Methodology

Recognition methodology must pay attention to:

  1. image formation e.g. perspective or orthographic projection
  2. conditioning
  3. labeling
  4. grouping
  5. extracting
  6. matching




1.2.1 Conditioning
Conditioning is based on a model that suggests that the observed image is composed of an informative pattern modified by uninteresting variations that typically add to or multiply the informative pattern.
e.g. noise suppression, background normalization
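
As a concrete illustration (a hedged sketch, not the textbook's algorithm), additive noise can be suppressed by a 3x3 mean filter that replaces each pixel by the average of its neighborhood, using the hypothetical Image helpers from Section 1.1:

/* 3x3 mean filter: suppress additive noise by local averaging;     */
/* border pixels are copied unchanged for simplicity                */
void mean_filter_3x3(const Image *in, Image *out)
{
    for (int r = 0; r < in->rows; r++)
        for (int c = 0; c < in->cols; c++) {
            if (r == 0 || c == 0 || r == in->rows - 1 || c == in->cols - 1) {
                set_pixel(out, r, c, get_pixel(in, r, c));
                continue;
            }
            int sum = 0;
            for (int dr = -1; dr <= 1; dr++)
                for (int dc = -1; dc <= 1; dc++)
                    sum += get_pixel(in, r + dr, c + dc);
            set_pixel(out, r, c, (unsigned char)(sum / 9));
        }
}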




1.2.2 Labeling
Labeling is based on a model that suggests that the informative pattern has structure as a spatial arrangement of events, each spatial event being a set of connected pixels.
e.g. thresholding, edge detection, corner finding
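
A minimal thresholding sketch (one possible labeling operator, written against the hypothetical Image helpers from Section 1.1; the threshold t, e.g. 128, is an assumed parameter):

/* label each pixel as object (255) or background (0) */
void threshold(const Image *in, Image *out, unsigned char t)
{
    for (int r = 0; r < in->rows; r++)
        for (int c = 0; c < in->cols; c++)
            set_pixel(out, r, c, get_pixel(in, r, c) >= t ? 255 : 0);
}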




1.2.3 Grouping
The grouping operation identifies the events by collecting together or identifying maximal connected sets of pixels participating in the same kind of event.
before grouping: pixels, after grouping: sets of pixels
e.g. segmentation, edge linking
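
One way to sketch grouping (an illustrative flood-fill connected-component labeler with 4-connectivity, not the book's algorithm; needs <stdlib.h> for malloc/free and the hypothetical Image helpers from Section 1.1):

/* labels[r*cols + c] = 0 for background, 1..n for each connected component */
int label_components(const Image *bin, int *labels)
{
    int rows = bin->rows, cols = bin->cols, n = 0;
    int *stack = malloc(sizeof(int) * rows * cols);   /* each pixel pushed at most once */
    for (int i = 0; i < rows * cols; i++) labels[i] = 0;

    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++) {
            if (get_pixel(bin, r, c) == 0 || labels[r * cols + c] != 0)
                continue;
            n++;                                      /* seed a new component */
            int top = 0;
            labels[r * cols + c] = n;
            stack[top++] = r * cols + c;
            while (top > 0) {
                int p = stack[--top], pr = p / cols, pc = p % cols;
                const int dr[4] = { -1, 1, 0, 0 }, dc[4] = { 0, 0, -1, 1 };
                for (int k = 0; k < 4; k++) {
                    int nr = pr + dr[k], nc = pc + dc[k];
                    if (nr < 0 || nc < 0 || nr >= rows || nc >= cols)
                        continue;
                    if (get_pixel(bin, nr, nc) != 0 && labels[nr * cols + nc] == 0) {
                        labels[nr * cols + nc] = n;   /* mark before pushing */
                        stack[top++] = nr * cols + nc;
                    }
                }
            }
        }
    free(stack);
    return n;                                         /* number of components found */
}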




1.2.4 Extracting
The extracting operation computes for each group of pixels a list of its properties.
example properties: centroid, area, orientation, spatial moments
e.g. region holes, arc curvature
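
For example, the area and centroid of one labeled region can be extracted in a few lines (a sketch over the label map produced by the grouping step above; the region label k is an assumed parameter):

/* area and centroid of the region whose label is k */
void region_area_centroid(const int *labels, int rows, int cols, int k,
                          int *area, double *rbar, double *cbar)
{
    long n = 0, rsum = 0, csum = 0;
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            if (labels[r * cols + c] == k) {
                n++;
                rsum += r;
                csum += c;
            }
    *area = (int)n;
    *rbar = n ? (double)rsum / n : 0.0;               /* centroid row    */
    *cbar = n ? (double)csum / n : 0.0;               /* centroid column */
}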




1.2.5 Matching
The matching operation determines the interpretation of some related set of image events, associating these events with some given three-dimensional object or two-dimensional shape.
e.g. template matching
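
A brute-force template-matching sketch (sum of squared differences over all offsets; illustrative only, using the hypothetical Image helpers from Section 1.1):

/* find the (row, column) offset where the template best matches the image */
void match_template(const Image *im, const Image *tpl, int *best_r, int *best_c)
{
    long best = -1;
    *best_r = 0;
    *best_c = 0;
    for (int r = 0; r + tpl->rows <= im->rows; r++)
        for (int c = 0; c + tpl->cols <= im->cols; c++) {
            long ssd = 0;                             /* sum of squared differences */
            for (int tr = 0; tr < tpl->rows; tr++)
                for (int tc = 0; tc < tpl->cols; tc++) {
                    int d = get_pixel(im, r + tr, c + tc) - get_pixel(tpl, tr, tc);
                    ssd += (long)d * d;
                }
            if (best < 0 || ssd < best) {
                best = ssd;
                *best_r = r;
                *best_c = c;
            }
        }
}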
=====Garfield 17:82=====

1.3 Outline of Book
This text describes those aspects of computer vision that are needed in robotics and other real-world applications such as industrial-part inspection, medical diagnosis, aerial-image interpretation, and space station maintenance.




Journals




Conferences




Bibliography

=====this file: $\sim$fuh/vcourse/haralick/chapter1.tex=====
man latex
latex chapter1
dvips -f chapter1 $>!$ t.ps
ghostview t.ps
=====$\sim$fuh/vcourse/haralick/programs/style/lena.im=====
=====pseudo.c=====
=====lena.b.im=====
Add /usr/local/vision/man to $MANPATH
Add /usr/local/vision/linux to $PATH
Add /usr/local/SUNWspro/bin to $PATH
Add /usr/local/vision/linux to $LD_LIBRARY_PATH
man hvision: image processing functions.
cc -o pseudo pseudo.c -lhvision -lm
pseudo lena.im lena.r.im lena.g.im lena.b.im
xview lena.im

(to print: change textwidth $\Rightarrow$ 7.2in, textheight $\Rightarrow$ 9.75in,
Huge $\Rightarrow$ normalsize, LARGE $\Rightarrow$ normalsize)
dvips -t landscape -f chapter1 $>!$ t.ps
lpr t.ps




copy $\sim$fuh/.tkinit to your home directory before invoking tk
tk
$>$ list command
$>$ read image : lena.im lena.im
$>$ affine transform : lena.im lena.aff 0 2 2 0 0 0 0
$>$ list image : ??
$>$ list image
$>$ view image : lena.aff
$>$ write image : lena.aff lena.aff
$>$ quit
=====lena.aff=====
=====joke=====




Project due Oct. 9, 2001:

  1. Use B_PIX to write a program to generate (see the C sketch after this list)
    1. upside-down lena.im
    2. right-side-left lena.im
    3. diagonally mirrored lena.im
  2. Use tk to
    1. rotate lena.im 45 degrees clockwise
    2. shrink lena.im in half
    3. binarize lena.im at 128 to get a binary image (hint: binarize)
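
A plain-C sketch of the three flips in part 1 (illustrative only; it uses the hypothetical Image helpers from Section 1.1 rather than the course library's B_PIX accessor, which the assignment itself requires):

/* upside-down: reverse the row order */
void flip_vertical(const Image *in, Image *out)
{
    for (int r = 0; r < in->rows; r++)
        for (int c = 0; c < in->cols; c++)
            set_pixel(out, in->rows - 1 - r, c, get_pixel(in, r, c));
}

/* right-side-left: reverse the column order */
void flip_horizontal(const Image *in, Image *out)
{
    for (int r = 0; r < in->rows; r++)
        for (int c = 0; c < in->cols; c++)
            set_pixel(out, r, in->cols - 1 - c, get_pixel(in, r, c));
}

/* diagonal mirror: transpose rows and columns (square image assumed) */
void flip_diagonal(const Image *in, Image *out)
{
    for (int r = 0; r < in->rows; r++)
        for (int c = 0; c < in->cols; c++)
            set_pixel(out, c, r, get_pixel(in, r, c));
}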



2001-09-19