Understanding image segmentation basics (Part 2)

Last month, we explored the basics of segmentation, the processes that extract information about the structure of objects and separate and discern various items of interest within an image. The two main types of segmentation--region-based and edge-based--were covered. However, because these simple segmentation schemes lack the sophistication to deal with intricate or subtle details of image data, more advanced techniques are often needed in real-world applications. In this article we explain some more sophisticated approaches to threshold-based region segmentation. In subsequent months, other region-based and edge-based operators will be discussed.

For most of these segmentation topics, the software techniques under investigation apply to gray-scale imagery; therefore, the resulting segmentation is based on the intensity information present in an image. However, as discussed in Part 1 (see Vision Systems Design, Sept. 1998, p. 31), the segmentation operations may also be applied to transforms of the data, which base the segmentation on alternative information such as color, texture, or relationships to other information contained in the image, such as distance. In fact, these transformations often prove a necessary first step in the application of threshold-based techniques, because regions of interest must be forced to contain uniform representations. In addition, prefiltering may be necessary to remove noise and the effects of uneven illumination (see Vision Systems Design, June 1998, p. 19).

Single threshold calculation

Simple thresholding is the process of separating pixels into background (binary "0") and foreground (binary "1") object-of-interest classes based upon their similarity in gray-level intensity (see Fig. 1). The actual threshold can be determined through ad hoc experimentation. However, when the characteristics of the data acquisition cannot be precisely controlled, predetermined threshold values are often impractical for applications that require automated segmentation. Variables that affect the threshold settings include lighting uniformity, background-to-object contrast, and interobject variation. In these cases, the threshold setting can be based on an algorithm that "adapts" to the data. The former mode is known as static thresholding, because a preset threshold value is used, whereas the latter mode is known as dynamic thresholding, because the threshold changes with the data.
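The difference between the two modes can be illustrated with a minimal Python sketch. The function names are hypothetical, the foreground is assumed to be the brighter class (values at or above the threshold), and the image mean is used only as the simplest possible data-driven threshold, not as a recommended method.

```python
def threshold_image(image, threshold):
    """Label a pixel 1 (foreground) if its gray level is >= threshold, else 0.

    Assumption: the objects of interest are brighter than the background.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in image]


def mean_threshold(image):
    """Toy dynamic threshold: the mean gray level of the whole image."""
    pixels = [v for row in image for v in row]
    return sum(pixels) / len(pixels)


image = [
    [12, 15, 200],
    [10, 180, 210],
    [11, 14, 13],
]

static_result = threshold_image(image, 128)                     # preset value
dynamic_result = threshold_image(image, mean_threshold(image))  # follows the data
```

On this small image both calls select the same three bright pixels; the dynamic version would continue to do so if the overall brightness of the scene drifted, whereas the preset value of 128 might not.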

Many proven adaptive-thresholding techniques have been devised, and the determination of their suitability to a particular application requires an analysis of the data, as well as the system operating parameters and goals. Often, the data-acquisition process can be manipulated to allow the use of these techniques. For instance, in the study of cells and tissues, stains are used to mark the objects of interest. These stains increase the contrast or color difference between items of interest and other structures in the image.

Often, statistical approaches, such as the calculation of the moment or energy of the image histogram, can be used to select the threshold value. In these methods, the threshold values are computed deterministically in such a way that the moments of an image to be thresholded are preserved in the output (binary) image.
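One widely cited formulation of this idea is Tsai's moment-preserving method; it is shown here only as an assumed example, since no specific method is named above. The symbols are introduced for this sketch: h_g is the number of pixels at gray level g, N is the total pixel count, G is the number of gray levels, z_0 and z_1 are the two representative gray levels of the binary output, and p_0 and p_1 are the fractions of pixels assigned to each.

```latex
\begin{align}
m_i &= \frac{1}{N} \sum_{g=0}^{G-1} h_g \, g^{i}, & i &= 0, 1, 2, 3 \\
p_0 z_0^{i} + p_1 z_1^{i} &= m_i, & i &= 0, 1, 2, 3
\end{align}
```

Solving the second set of equations for p_0, the threshold is then chosen as the gray level below which a fraction p_0 of the pixels falls.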

Another common approach is to set a threshold value by attempting to satisfy a constraint that a certain percentage of pixels in the image must be segmented out. The threshold selection is controlled by providing a threshold range, which specifies the acceptable range of output thresholds, and a percentile range, which specifies the percentage of the image that should be selected by the output threshold. For instance, an application might specify that only the top 20% of intensity values be considered as thresholds and that between 10% and 30% of all pixels must be selected.
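A minimal Python sketch of such a constrained search follows. The function name and argument layout are hypothetical, pixels at or above the threshold are assumed to be the selected class, and returning the highest threshold that satisfies the constraint is an arbitrary choice made for this example.

```python
def percentile_threshold(pixels, threshold_range, percentile_range):
    """Return the highest threshold in threshold_range whose selected
    fraction of pixels falls inside percentile_range, or None if no
    threshold satisfies the constraint.

    Assumption: a pixel is "selected" when its value is >= the threshold.
    """
    t_low, t_high = threshold_range
    frac_low, frac_high = percentile_range
    total = len(pixels)
    for t in range(t_high, t_low - 1, -1):
        selected = sum(1 for v in pixels if v >= t)
        if frac_low <= selected / total <= frac_high:
            return t
    return None


# 8-bit data: consider only the top 20% of intensity values (204..255)
# and require that 10% to 30% of all pixels be selected.
pixels = [10, 20, 30, 40, 50, 60, 70, 80, 210, 250]
t = percentile_threshold(pixels, (204, 255), (0.10, 0.30))
```

If no threshold in the allowed range selects an acceptable share of the image, the function returns None, which an application could treat as a signal that the scene does not contain the expected objects.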

In applications for which the objects to be detected have intensity variations within or around them, a hysteresis technique can often be used successfully. This technique works in a manner similar to a thermostat. A cooling thermostat turns on when the temperature rises above some setting (threshold value), but does not turn off until the temperature falls below this setting minus some additional degrees. When hysteresis is used in thresholding, a pixel is assigned a threshold label when it exceeds the threshold value, or when it falls below this value by no more than some set amount but is adjacent to a pixel that does exceed the threshold. Pixels with values above the threshold are often referred to as seeds (see Fig. 2).
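The hysteresis rule can be sketched in Python as a region-growing pass that starts from the seeds. The names here are hypothetical, 8-connected adjacency is assumed, and the sketch lets labels spread through chains of near-threshold pixels, which is a common implementation choice rather than something stated above.

```python
from collections import deque


def hysteresis_threshold(image, threshold, margin):
    """Label seeds (value > threshold) and grow them into neighboring
    pixels whose value is at least threshold - margin.

    Assumptions: 8-connected neighbors; growth is transitive, so a pixel
    may be reached through a chain of other near-threshold pixels.
    """
    rows, cols = len(image), len(image[0])
    low = threshold - margin
    labels = [[0] * cols for _ in range(rows)]
    queue = deque()

    # Seeds: pixels that exceed the threshold outright.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > threshold:
                labels[r][c] = 1
                queue.append((r, c))

    # Grow each labeled pixel into qualifying unlabeled neighbors.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and labels[nr][nc] == 0 and image[nr][nc] >= low):
                    labels[nr][nc] = 1
                    queue.append((nr, nc))
    return labels


image = [
    [10, 10, 10, 10, 10],
    [10, 90, 60, 55, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 60],
]
labels = hysteresis_threshold(image, threshold=80, margin=30)
```

In this example the pixels valued 60 and 55 next to the seed (90) are labeled, while the isolated 60 in the bottom-right corner is not, because no labeled pixel touches it.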

Multiple threshold calculation

More complex images may have multiple classes of objects with respective intensity values clustering around unique peaks. In this case, a multithreshold technique can segment the data, whereby threshold labels are assigned to image data values that fall into specified ranges. For any pixel in the image with a data value of "v," an output label "l" is assigned as follows, where there are "n" specified threshold values stored in the array "Thresholds":

$$
l = \begin{cases}
0 & \text{if } v < \mathrm{Thresholds}[0] \\
1 & \text{if } \mathrm{Thresholds}[0] \le v < \mathrm{Thresholds}[1] \\
\vdots & \\
n - 1 & \text{if } \mathrm{Thresholds}[n - 2] \le v < \mathrm{Thresholds}[n - 1] \\
n & \text{if } \mathrm{Thresholds}[n - 1] \le v
\end{cases}
$$
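For a sorted array of thresholds, this labeling rule amounts to counting how many thresholds are less than or equal to v. A short Python sketch, with a hypothetical function name and example threshold values chosen only for illustration:

```python
from bisect import bisect_right


def multithreshold_label(v, thresholds):
    """Return the label for value v given ascending threshold values.

    bisect_right counts the thresholds that are <= v, which matches the
    piecewise rule: 0 below the first threshold, n at or above the last.
    """
    return bisect_right(thresholds, v)


thresholds = [50, 100, 200]  # n = 3 thresholds give labels 0..3
image = [[49, 50, 99], [100, 199, 255]]
labeled = [[multithreshold_label(v, thresholds) for v in row] for row in image]
```

The thresholds must be in ascending order for the binary search to be valid.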

Algorithms exist for the automatic determination of multiple threshold values. Peak-valley analysis can be used to identify significant clusters, which can then be used to derive threshold values (see Fig. 3).

Criteria that can be used to identify significant peaks in the image histogram include

Peak height--minimum acceptable height of a histogram peak

Peak separation--minimum acceptable distance between distinct histogram peaks

Peak-valley ratio--minimum acceptable ratio between histogram peaks and adjacent valleys

Number of classes--selects a desired number of most significant peaks.
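The four criteria above can be combined in a simple peak-valley search, sketched below in Python. This is one possible arrangement under stated assumptions, not a specific published algorithm: the function and parameter names are hypothetical, peaks are taken to be local maxima of the histogram, the tallest peaks are kept first, and each threshold is placed at the lowest bin between two accepted peaks.

```python
def peak_valley_thresholds(hist, min_height, min_separation, min_ratio, num_classes):
    """Return threshold bins placed at valleys between significant peaks."""
    n = len(hist)

    # Peak height: candidate peaks are local maxima of sufficient height.
    candidates = [
        i for i in range(n)
        if hist[i] >= min_height
        and (i == 0 or hist[i] > hist[i - 1])
        and (i == n - 1 or hist[i] >= hist[i + 1])
    ]

    # Number of classes and peak separation: accept the tallest peaks
    # first, skipping any that lie too close to an accepted peak.
    candidates.sort(key=lambda i: hist[i], reverse=True)
    peaks = []
    for i in candidates:
        if len(peaks) == num_classes:
            break
        if all(abs(i - p) >= min_separation for p in peaks):
            peaks.append(i)
    peaks.sort()

    thresholds = []
    for left, right in zip(peaks, peaks[1:]):
        # The valley is the lowest bin strictly between the two peaks.
        valley = min(range(left + 1, right), key=lambda i: hist[i])
        # Peak-valley ratio: the lower peak must stand out from the valley.
        if min(hist[left], hist[right]) >= min_ratio * hist[valley]:
            thresholds.append(valley)
    return thresholds


# Toy 16-bin histogram with three clusters around bins 2, 8, and 13.
hist = [0, 2, 9, 3, 1, 0, 1, 4, 12, 5, 1, 0, 2, 8, 3, 0]
three_classes = peak_valley_thresholds(hist, 5, 4, 3, num_classes=3)
two_classes = peak_valley_thresholds(hist, 5, 4, 3, num_classes=2)
```

In practice the histogram would usually be smoothed first so that noise does not produce spurious local maxima.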

Another approach is to select thresholds so as to maximize the global average contrast of the edges that are created. The algorithm works as follows:

Histograms of total contrast and edge count per threshold value are created. A given threshold produces some number of binary objects.

A value of contrast for each pixel on the edge of every object is computed and summed for the entire image to give the total contrast. The number of edges is the number of pixels on the boundary of these objects. Dividing the total contrast by the edge count for every threshold value creates the average contrast histogram.

A threshold value is placed at the maximal point in this average contrast histogram (see Fig. 4).

The algorithm can be reiterated to create subsequent threshold values until the desired number of classes is reached. A minimal contrast value can be used to prevent oversegmentation of low-contrast areas. This technique works remarkably well, as it approximates how the human eye "segments" a scene by applying boundaries in areas where maximal contrast occurs.
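A single pass of this procedure can be sketched in Python as follows. Several details are assumptions made for the example: an edge element is taken to be any pair of horizontally or vertically adjacent pixels that a threshold separates, the contrast of that edge is the absolute gray-level difference of the pair, and a pixel belongs to the foreground when its value is at or above the threshold. The function names are hypothetical.

```python
def average_contrast_histogram(image, levels=256):
    """Average edge contrast produced by each candidate threshold."""
    total = [0] * levels   # summed contrast of edges created at threshold t
    count = [0] * levels   # number of edges created at threshold t
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            # Visit each right-hand and lower neighbor once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    lo, hi = sorted((image[r][c], image[nr][nc]))
                    # Thresholds lo < t <= hi put the pair on opposite sides.
                    for t in range(lo + 1, hi + 1):
                        total[t] += hi - lo
                        count[t] += 1
    return [total[t] / count[t] if count[t] else 0.0 for t in range(levels)]


def max_contrast_threshold(image, levels=256, min_contrast=0.0):
    """Threshold at the peak of the average contrast histogram, or None
    if the peak does not exceed min_contrast."""
    avg = average_contrast_histogram(image, levels)
    best = max(range(levels), key=lambda t: avg[t])
    return best if avg[best] > min_contrast else None


image = [
    [10, 10, 200, 200],
    [10, 12, 198, 200],
    [10, 10, 200, 200],
]
t = max_contrast_threshold(image)
```

Here every threshold from 13 to 198 cuts only the three strong edges between the dark and bright halves, so the average contrast is highest over that span and the first such value is returned. The inner loop over gray levels keeps the sketch short; it is not tuned for large images.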

Global vs. local threshold

All the segmentation approaches can be applied in a global or local manner. The term global means that all the pixels in an image are used in the calculation of the threshold value(s). In local threshold selection, a local neighborhood of pixels is used.

Local threshold selection can be implemented using a fixed or floating-window technique:

Fixed window--The image is divided into a number of small segments called windows. Each window is treated as a unique image during the segmentation. This approach is fast, but problems exist at the boundary of the windows. Therefore, an adjacent window-merging algorithm may have to be used in conjunction.

Floating window--A new neighborhood is created about each pixel. This approach is more computationally expensive than the fixed-window approach and, therefore, often uses smaller windows in the calculation.
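The two window strategies can be compared with a short Python sketch. The function names are hypothetical, and the mean of each window is used as the local threshold purely for illustration; any of the threshold-selection methods described earlier could be substituted. Window-merging at tile boundaries is not shown.

```python
def fixed_window_threshold(image, win):
    """Split the image into win x win tiles and threshold each tile
    at its own mean (assumed local threshold for this sketch)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, win):
        for c0 in range(0, cols, win):
            cells = [(r, c)
                     for r in range(r0, min(r0 + win, rows))
                     for c in range(c0, min(c0 + win, cols))]
            mean = sum(image[r][c] for r, c in cells) / len(cells)
            for r, c in cells:
                out[r][c] = 1 if image[r][c] > mean else 0
    return out


def floating_window_threshold(image, radius):
    """Threshold each pixel at the mean of the neighborhood centered
    on it; the neighborhood is clipped at the image borders."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [image[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = 1 if image[r][c] > sum(values) / len(values) else 0
    return out


# A left-to-right brightness ramp, as might result from uneven lighting.
image = [
    [10, 30, 110, 130],
    [10, 30, 110, 130],
]
fixed = fixed_window_threshold(image, win=2)
floating = floating_window_threshold(image, radius=1)
```

The fixed-window version computes one mean per tile, whereas the floating-window version recomputes the neighborhood statistics for every pixel, which is where the extra computational cost comes from.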

Adaptive approaches are useful when the objects of interest take on differing intensity ranges throughout the image. These differences can arise from natural variations or from uneven lighting conditions. Two popular adaptive thresholding techniques are locally adaptive thresholding (LAT) and extrema thresholding. The LAT operator is implemented by setting pixels in the output image to "1" when their values are outside a specified data range; this range is defined to be the local mean plus or minus some multiple of the standard deviation of the neighborhood.

For a pixel in the image with a data value of v, an output label l is assigned as follows, where local_mean and local_SD are the mean and standard deviation calculated for the local neighborhood and scale is a parameter that controls the sensitivity of the thresholding:

$$
l = \begin{cases}
1 & \text{if } \left| v - \mathrm{local\_mean} \right| > \mathrm{scale} \times \mathrm{local\_SD} \\
0 & \text{otherwise}
\end{cases}
$$

The extrema-threshold operator is similar, but it first compares the local standard deviation to a user-supplied minimum value, MinStdDevThreshold. If the standard deviation of the local neighborhood about a pixel does not exceed this value, then that pixel is set to zero in the output. If it exceeds this value, then the operator executes like the LAT above. Specifying a minimum acceptable value of the local standard deviation lets the user control the sensitivity of the operator in areas of uniform brightness.
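Both operators can be sketched in one Python function. The sketch reads the LAT range as the local mean plus or minus scale times the local standard deviation, which is an interpretation of the description above; the function and parameter names are hypothetical, a square neighborhood clipped at the image borders is assumed, and the population standard deviation is used.

```python
from statistics import fmean, pstdev


def lat_threshold(image, radius, scale, min_std_dev=None):
    """Locally adaptive thresholding; with min_std_dev set, behaves like
    the extrema-threshold operator.

    Assumptions: square (2 * radius + 1) neighborhood clipped at borders,
    population standard deviation, range = local mean +/- scale * SD.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [image[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            local_mean = fmean(values)
            local_sd = pstdev(values)
            # Extrema variant: suppress output where the neighborhood is
            # too uniform (local SD does not exceed the minimum).
            if min_std_dev is not None and local_sd <= min_std_dev:
                continue
            if abs(image[r][c] - local_mean) > scale * local_sd:
                out[r][c] = 1
    return out


# Uniform background with a single bright pixel in the center.
image = [[100] * 5 for _ in range(5)]
image[2][2] = 200

lat = lat_threshold(image, radius=1, scale=2.0)
extrema = lat_threshold(image, radius=1, scale=2.0, min_std_dev=50.0)
```

With these settings the LAT pass marks only the bright center pixel. The extrema pass marks nothing, because the local standard deviation around that pixel (about 31) stays below the chosen minimum of 50.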

Note that the neighborhoods themselves do not need to be square or even rectangular. Kernel masks can be used to define odd-shaped neighborhoods, which give preference to specific features of interest in the data. There are many variations on threshold-based approaches to region segmentation, which can sometimes produce vastly differing results.

Part 1 of "Understanding image-segmentation basics" was published in Vision Systems Design, August 1998.

PETER EGGLESTON is senior director of business development, Imaging Products Division, Amerinex Applied Imaging Inc., Northampton, MA; e-mail: [email protected].

FIGURE 1. When used as a segmentation technique, the simple thresholding process separates the pixels in an image into background (binary 0) and foreground (binary 1) classes. Commonly based on the gray-scale values of the image, it is mostly suited to segment out objects that have uniform intensity.

FIGURE 2. In imaging applications in which the objects of interest or the background possess intensity variations, the hysteresis segmentation technique can be used. Note that the darkest areas of this image belong to parts of the grill only, and therefore act as seeds--pixels that exceed the threshold value.

FIGURE 3. The settings of appropriate threshold values can be automated by directing the software to analyze the peaks and valleys in the image histogram of gray levels. Significant clusters can then be used to derive the thresholds to be applied.

FIGURE 4. Software can organize image content based on strong interobject contrasts by performing a maximal contrast thresholding technique. For example, note that in this laser range image, several boxes display areas of uniform intensity and contrast because their surfaces are positioned at a uniform distance from the camera. These conditions are detected by reiteratively thresholding the image at the gray-scale levels that produce the greatest number of edges with the best contrast. The peaks in the average contrast histogram (top) are used to set the actual threshold values (bottom). Data courtesy of Perceptron Inc.
