Course Number: 526 U1090
Time: Tuesday 6, 7, 8 (2:10PM-5:00PM)
Classroom: New CSIE Classroom 309
Classification: Elective for junior, senior, and graduate students
Instructor: Chiou-Shann Fuh
Office: New Computer Science and Information Engineering 327
Phone: 23625336 ext. 327, 23630231 ext. 3232 ext. 327
Office Hours: Tuesday 9AM-11AM
Objective: To learn computer and robot vision through extensive
Textbook: R. M. Haralick and L. G. Shapiro, Computer
and Robot Vision, Vol. I, Addison Wesley, Reading, MA, 1992.
Reference: L. G. Shapiro and G. C. Stockman, Computer Vision,
Prentice-Hall, Upper Saddle River, NJ, 2001.
Reference: R. Jain, R. Kasturi, and B. G. Schunck, Machine Vision,
McGraw-Hill, New York, 1995.
Reference: R. C. Gonzalez and R. E. Woods, Digital
Image Processing, Addison Wesley, Reading, MA, 1992.
Projects: will be assigned every week or every other week (30%)
Examinations: one midterm (30%) and one final (40%)
This is the first semester of a fast-paced course covering robot and computer vision. This semester covers low-level vision, with almost no reference to the third dimension:
1. Computer Vision: Overview
2. Binary Machine Vision: Thresholding and Segmentation
3. Binary Machine Vision: Region Analysis
4. Statistical Pattern Recognition
5. Mathematical Morphology
6. Neighborhood Operators
7. Conditioning and Labeling
8. The Facet Model
10. Image Segmentation
11. Arc Extraction and Segmentation
Next semester covers higher-level techniques.
Computer vision is the science that develops the theoretical and algorithmic basis by which useful information about the world can be automatically extracted and analyzed from an observed image, image set, or image sequence from computations made by special-purpose or general-purpose computers.
computer vision: to emulate human vision with computers
computer vision: dual process of computer graphics: vision goes from 2D to 3D, graphics from 3D to 2D
image: spatial representation of object, 2D or 3D scene, or another image
intensity image: optic or photographic sensors, radiant energy
range image: line-of-sight distance
image: a matrix whose entry at each row and column is the intensity value there
pixel: picture element: has properties of position and value
gray levels: pixel values of intensity images,
0 (black) - 255 (white) for 8-bit integers
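The definitions above can be made concrete with a minimal Python sketch. The helper names `pixel` and `is_valid_gray_image` and the sample values are hypothetical, chosen only for illustration; they are not part of the textbook or of the hvision library.

```python
# A tiny 3x4 intensity image stored as a matrix (list of rows).
# Values are 8-bit gray levels: 0 is black, 255 is white.
image = [
    [0,  64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
]

def pixel(image, row, col):
    """Return a pixel as its two properties: position and value."""
    return (row, col), image[row][col]

def is_valid_gray_image(image, levels=256):
    """Check that rows have equal length and values lie in [0, levels)."""
    width = len(image[0])
    return all(
        len(row) == width and all(0 <= v < levels for v in row)
        for row in image
    )

position, value = pixel(image, 1, 2)
print(position, value)
```

Rows are indexed first and columns second, matching the row and column convention of the notes.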
factors determining the difficulty of a computer vision problem:
atomic image features:
composite features: atomic features merged
1.2 Recognition Methodology
Recognition methodology must pay attention to:
Conditioning is based on a model that suggests that the observed image is composed of an informative pattern modified by uninteresting variations that typically add to or multiply the informative pattern.
e.g. noise suppression, background normalization
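As one possible illustration of conditioning by noise suppression, the sketch below applies a 3x3 median filter. The median filter is only one choice among many and is not claimed to be the textbook's method; the function name `median_filter_3x3` and the rule of copying border pixels unchanged are assumptions made here.

```python
from statistics import median

def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighborhood.

    Border pixels are copied unchanged (a simplifying assumption).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [
                image[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = median(window)
    return out

# A flat background (the informative pattern) with one noisy pixel added.
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
```

The isolated outlier is the uninteresting variation; after filtering, only the flat pattern remains.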
Labeling is based on a model that suggests that the informative pattern has structure as a spatial arrangement of events, each spatial event being a set of connected pixels.
e.g. thresholding, edge detection, corner finding
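Thresholding is the simplest labeling operation: each pixel is labeled 1 or 0 according to its gray level. A minimal sketch follows; the function name `threshold` and the threshold value 128 are arbitrary assumptions for illustration.

```python
def threshold(image, t):
    """Label each pixel 1 if its gray level is at least t, else 0."""
    return [[1 if v >= t else 0 for v in row] for row in image]

gray = [
    [12, 200, 210],
    [15, 190,  20],
    [10,  18,  25],
]
binary = threshold(gray, 128)
```

Each pixel is labeled independently here; collecting labeled pixels into connected sets is left to the grouping step.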
The grouping operation identifies the events by collecting together or identifying maximal connected sets of pixels participating in the same kind of event.
before grouping: pixels, after grouping: sets of pixels
e.g. segmentation, edge linking
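Grouping labeled pixels into maximal connected sets can be sketched as connected-component labeling. The version below uses 4-connectivity and a breadth-first search; both are assumptions made for this sketch, and the function name `connected_components` is hypothetical.

```python
from collections import deque

def connected_components(binary):
    """Group 1-pixels into maximal 4-connected sets.

    Returns (labels, count): labels has 0 for background and 1..count
    for the pixels of each connected set.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and labels[ny][nx] == 0:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

binary = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = connected_components(binary)
```

Before grouping the data are individual pixels; afterwards each nonzero label names one set of pixels.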
The extracting operation computes for each group of pixels a list of its properties.
example properties: centroid, area, orientation, spatial moments
e.g. region holes, arc curvature
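A minimal sketch of the extracting step, computing two of the example properties (area and centroid) for each group in a label image. The function name `region_properties` and the dictionary layout of the result are assumptions for illustration.

```python
def region_properties(labels):
    """Compute area and centroid (row, col) for each nonzero label."""
    sums = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab == 0:
                continue  # background pixel, belongs to no group
            area, row_sum, col_sum = sums.get(lab, (0, 0, 0))
            sums[lab] = (area + 1, row_sum + r, col_sum + c)
    return {
        lab: {"area": area, "centroid": (row_sum / area, col_sum / area)}
        for lab, (area, row_sum, col_sum) in sums.items()
    }

labels = [
    [1, 1, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
props = region_properties(labels)
```

The area is the pixel count of the group and the centroid is the mean row and mean column of its pixels.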
The matching operation determines the interpretation of some related set of image events, associating these events with some given three-dimensional object or two-dimensional shape.
e.g. template matching
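Template matching can be sketched as sliding a small template over the image and keeping the position with the lowest sum of squared differences. The choice of this score is an assumption (correlation-based scores are also common), and the function name `match_template` is hypothetical.

```python
def match_template(image, template):
    """Return ((row, col), score) of the best match of template in image.

    The score is the sum of squared differences; lower is better and
    0 means an exact match. Ties keep the first position found.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
position, score = match_template(image, template)
```

The returned position is the top-left corner of the window in the image that best matches the template.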
1.3 Outline of Book
This text describes those aspects of computer vision that are needed in robotics and other real-world applications such as industrial-part inspection, medical diagnosis, aerial-image interpretation, and space station maintenance.
=====this file: fuh/vcourse/haralick/chapter1.tex=====
dvips -f chapter1 > t.ps
Add /usr/local/vision/man to $MANPATH
Add /usr/local/vision/linux to $PATH
Add /usr/local/SUNWspro/bin to $PATH
Add /usr/local/vision/linux to $LD_LIBRARY_PATH
man hvision: image processing functions.
cc -o pseudo pseudo.c -lhvision -lm
pseudo lena.im lena.r.im lena.g.im lena.b.im
(to print: change \textwidth to 7.2in, \textheight to 9.75in,
\Huge to \normalsize, \LARGE to \normalsize)
dvips -t landscape -f chapter1 > t.ps
copy fuh/.tkinit to your home directory before invoking tk
read image : lena.im lena.im
affine transform : lena.im lena.aff 0 2 2 0 0 0 0
list image : ??
view image : lena.aff
write image : lena.aff lena.aff
Project due Oct. 9, 2001: