Open source software such as OpenCV (opencv.org) has spurred the rapid deployment of image processing and machine vision. Containing more than 2,500 image processing algorithms, the library includes many different algorithms that perform similar tasks.
"Indeed," says Jayan Eledath, Program Director in the Center for Vision Technologies at SRI International (Menlo Park, CA, USA; www.sri.com), "it seems that every computer vision researcher is eager to create their own broken algorithms rather than improving on others broken algorithms."
Prompted by this observation, Eledath has developed automated performance characterization (APC) tools under the DARPA Visual Media Reasoning program that allow researchers to characterize the performance of such algorithms automatically. In a presentation at October's Embedded Vision Summit, organized by the Embedded Vision Alliance (www.embedded-vision.com), he highlighted how the method could determine the performance of numerous face detection algorithms.
To evaluate this performance, it is necessary to determine the probability of both true and false detection. This, of course, will depend on image parameters such as color and scene attributes, as well as on the algorithm(s) used to determine whether a match has been found.
Calculating the probability of detection versus that of a false detection over a large number of images can be used to characterize the performance of such face detection algorithms. Furthermore, by clustering specific image parameters, developers can evaluate which parameter returns the highest probability rate. By doing so, face recognition algorithms, for example, could be optimized for maximum performance.
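The idea of clustering images by an attribute and comparing detection rates per cluster can be sketched in a few lines. The detector, the image records, and the "brightness" attribute below are all stand-ins for illustration; this is not SRI's actual APC tool, only a minimal model of the kind of per-cluster statistics it computes.

```python
import random

random.seed(0)

def toy_detector(image):
    # Hypothetical stand-in for a real face detector: in this contrived
    # model it detects faces more reliably, and false-alarms less often,
    # on well-lit images.
    if image["has_face"]:
        p = 0.9 if image["brightness"] == "bright" else 0.6
    else:
        p = 0.05 if image["brightness"] == "bright" else 0.15
    return random.random() < p

# Synthetic evaluation set: 500 images per (brightness, has_face) cell.
images = (
    [{"brightness": "bright", "has_face": True} for _ in range(500)]
    + [{"brightness": "dark", "has_face": True} for _ in range(500)]
    + [{"brightness": "bright", "has_face": False} for _ in range(500)]
    + [{"brightness": "dark", "has_face": False} for _ in range(500)]
)

def characterize(images, attribute):
    """Estimate P(detect) and P(false alarm) per cluster of `attribute`."""
    stats = {}
    for img in images:
        s = stats.setdefault(img[attribute],
                             {"tp": 0, "pos": 0, "fp": 0, "neg": 0})
        detected = toy_detector(img)
        if img["has_face"]:
            s["pos"] += 1
            s["tp"] += detected
        else:
            s["neg"] += 1
            s["fp"] += detected
    return {k: {"P(detect)": s["tp"] / s["pos"],
                "P(false)": s["fp"] / s["neg"]}
            for k, s in stats.items()}

for cluster, rates in characterize(images, "brightness").items():
    print(cluster, rates)
```

Running the same tally over several attributes (color, scene type, resolution) would show which parameter most strongly separates good from poor performance, which is the optimization signal the article describes.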
According to Eledath, an API is currently under development that will allow users to evaluate the software. Better still, the software will be made freely available on the OpenCV website.
Often, of course, large datasets such as Caltech 101 and Caltech 256 (http://bit.ly/ayH0Sk) are used to evaluate the performance of such algorithms. However, as Clark Dorman, Principal Architect at Next Century Corporation (Columbia, MD, USA; www.nextcentury.com) points out, such object recognition tasks are dependent on numerous factors including lighting conditions and environmental effects.
In his presentation at the Embedded Vision Summit, Dorman proposed a novel method that combines computer graphics and vision technology that may compensate for these effects. Realizing that numerous objects may appear in a scene at different angles and lighting, Next Century Corporation's Synthetic Image Generation Harness for Training and Testing (SIGHTT) project creates annotated images by combining rendered 3D objects with real backgrounds.
In his presentation, Dorman used 3D modeling tools to generate a computer graphic model of a rocket propelled grenade (RPG). After rendering, the RPG was rotated and placed in different locations on a background image of his office. To add realism to the image, the RPG was composited into the image by using edge matching techniques to compensate for the aliasing and color and brightness variations between the computer generated and real image. To make the scene more realistic, future variations of the software will add lighting effects, shadows, occlusion and haze to the composited image.
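The core compositing step can be illustrated with a feathered alpha blend: softening the binary cut-out mask before blending removes the hard, aliased edge that gives a pasted object away. The images and mask below are synthetic placeholders, and the feathering here is a deliberately simple stand-in; SIGHTT's actual edge-matching step also compensates for color and brightness differences between the rendered object and the background.

```python
import numpy as np

def feather(mask, passes=3):
    """Soften a binary mask by repeated 4-neighbour averaging."""
    m = mask.astype(np.float32)
    for _ in range(passes):
        p = np.pad(m, 1, mode="edge")
        m = (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
             + p[1:-1, :-2] + p[1:-1, 2:]) / 5.0
    return m

def composite(background, foreground, mask):
    # Linear blend weighted by the softened mask (H x W -> H x W x 1).
    alpha = feather(mask)[..., None]
    return (alpha * foreground + (1.0 - alpha) * background).astype(np.uint8)

# Toy data: grey background, bright square "object" with a binary mask.
bg = np.full((64, 64, 3), 80, dtype=np.uint8)
fg = np.full((64, 64, 3), 200, dtype=np.uint8)
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:44, 20:44] = 1

out = composite(bg, fg, mask)
```

Pixels well inside the mask keep the foreground value, pixels outside keep the background, and the boundary takes intermediate values, which is what removes the tell-tale staircase edge.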
To demonstrate the effects of the technique, Dorman applied a scale invariant feature transform (SIFT) to two composite images. In the first image, the RPG was simply placed in the image with no compensation for aliasing or color and brightness variations. In the second image, edge matching was used to make the scene more realistic.
As can be seen, the SIFT algorithm is far more effective after edge matching techniques are applied. Can such synthetic images be used as datasets to evaluate the performance of pattern recognition algorithms? While Dorman admits this is still an active area of research, the SIGHTT software, like SRI's APC tools, will be made available on the OpenCV website next year.