Low Level Processing

A digitized image, at its simplest, can be a two-dimensional array indicating for each point in the image just how light or dark the picture is at this point. Each image point is referred to as a "pixel," and the brightness or image intensity at a pixel is usually represented by an integer between 0 and 255 (0 = black, 255 = white).

Before the image is processed further, it may be necessary to remove "noise" and very fine detail from the image.  Noise consists of incorrect intensity values introduced in the process of producing the digitized image.  Fine detail may be, for example, the texture on a wall or floor in a picture of a room in your house.  Removing this detail may allow the vision system to focus on the basic objects in the room without being distracted.

"Smoothing" the Image:  Replace the intensity value at a given point with an average of the intensity values of this point and its surrounding points.  The effect is to blur the image slightly.
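
The averaging step can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the neighborhood is taken to be 3x3, pixels on the border average over whichever neighbors exist, and the average is truncated to an integer.  The function name smooth is hypothetical.

```python
def smooth(image):
    """Return a new image in which each pixel is the mean of its 3x3 neighborhood.

    Assumption: border pixels average only over the neighbors that exist.
    """
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
                        count += 1
            # Integer division keeps intensities as whole numbers.
            result[r][c] = total // count
    return result


# A single bright "noise" pixel in an otherwise uniform dark patch.
noisy = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
smoothed = smooth(noisy)
```

The isolated value of 100 is pulled down toward its neighbors, while the neighbors are pulled up a little, which is the slight blurring described above.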

Edge Detection:

Find the simplest features in the image.  The goal is to try to obtain something like a line drawing of the objects in the image.  First, find points that may lie on the edges of objects in the image.  For illustration, the intensity values used here are just 0-9 (0 = black; 9 = white).
 
Although the edges of the object are sharp, the intensity values at the pixels corresponding to those edges may fall part-way between the intensity of the background and that of the main body of the object.  The edges of the object can be found by looking for places where the intensity at nearby pixels is significantly different.
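
As a rough sketch of this idea, the following Python compares each pixel in one row of an image with its right-hand neighbor and reports where the difference is large.  The row of values, the threshold of 3, and the function name edge_positions are all assumptions made for the example.

```python
def edge_positions(row, threshold):
    """Return each index i where row[i] and row[i + 1] differ by at least threshold."""
    return [
        i
        for i in range(len(row) - 1)
        if abs(row[i + 1] - row[i]) >= threshold
    ]


# One row across a light object (9) on a dark background (0).
# The values 4 and 5 are edge pixels that fall part-way between the two.
row = [0, 0, 0, 4, 9, 9, 9, 5, 0, 0]
edges = edge_positions(row, 3)
```

Here the large differences cluster around the two sides of the object, even though neither side shows up as a single clean jump from 0 to 9.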

Difference Operators:

Difference operators are small masks (2x2 or 3x3 arrays) that are placed over groups of points in the image.  The difference operation involves taking each mask value, multiplying it by the corresponding image intensity value, and summing the results.  Difference operators give an idea of how fast the intensity is changing in a small region of the image.  A fast change suggests that there is some interesting feature in the image, like the edge of an object.  Thus, what you're looking for are places where the rate of intensity change is high.
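
The multiply-and-sum operation might look like the sketch below.  The particular 2x2 mask is a hypothetical choice that responds to left-to-right changes in intensity; it is not taken from these notes, and the function names are likewise made up for the example.

```python
def apply_mask(image, mask, r, c):
    """Place the mask with its top-left corner at (r, c), multiply, and sum."""
    total = 0
    for i, mask_row in enumerate(mask):
        for j, weight in enumerate(mask_row):
            total += weight * image[r + i][c + j]
    return total


def difference_map(image, mask):
    """Apply the mask at every position where it fits entirely inside the image."""
    rows, cols = len(image), len(image[0])
    m_rows, m_cols = len(mask), len(mask[0])
    return [
        [apply_mask(image, mask, r, c) for c in range(cols - m_cols + 1)]
        for r in range(rows - m_rows + 1)
    ]


# Hypothetical 2x2 mask: right-hand column minus left-hand column.
horizontal_mask = [
    [-1, 1],
    [-1, 1],
]

# Dark region on the left, light region on the right, on a 0-9 scale.
image = [
    [0, 0, 5, 9, 9],
    [0, 0, 5, 9, 9],
    [0, 0, 5, 9, 9],
]
response = difference_map(image, horizontal_mask)
```

The response is zero where the intensity is constant and large where the mask straddles the change from dark to light.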

Edge detection is a computationally expensive operation because of the large amount of data involved.  For a 1024x1024 image, the operation must be repeated over a million times.
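
The arithmetic behind that figure can be checked directly.  The sketch assumes a 3x3 mask applied once at every pixel, ignoring the image border.

```python
width = height = 1024
mask_size = 3  # assumed 3x3 mask

# One mask placement per pixel.
positions = width * height

# Each placement multiplies every mask value by one intensity value.
multiplications = positions * mask_size * mask_size
```

Under these assumptions there are 1,048,576 mask placements and 9,437,184 multiplications for a single pass over the image.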

Line Fitting:

The output of edge detection is just a set of points believed to lie on an intensity edge.  A more useful representation aggregates these points into lines corresponding to the boundaries of the objects.  So, the "house" image could be represented as six lines corresponding to the edges of the front wall and the roof.  One way to find such lines is to start with an edge point and then move along, looking for other connected edge points.
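
The "start with an edge point and move along" step could be sketched as follows.  This only groups edge points into connected chains; it does not fit a straight line to each chain.  Treating diagonal neighbors as connected is an assumption, and the function name trace_chains is hypothetical.

```python
def trace_chains(edge_points):
    """Group (row, col) edge points into chains of connected neighbors.

    Assumption: two points are connected if they touch horizontally,
    vertically, or diagonally.
    """
    remaining = set(edge_points)
    chains = []
    while remaining:
        start = min(remaining)  # deterministic choice of starting point
        remaining.remove(start)
        chain = [start]
        stack = [start]
        while stack:
            r, c = stack.pop()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        chain.append(neighbor)
                        stack.append(neighbor)
        chains.append(chain)
    return chains


# Two separate runs of edge points: one diagonal, one vertical.
points = [(0, 0), (1, 1), (2, 2), (0, 5), (1, 5), (2, 5)]
chains = trace_chains(points)
```

Each resulting chain is a candidate for one boundary line; a later step would decide whether the points in a chain are close enough to a single straight line.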