Neural Networks at Work
Computational methods that model neurons in the brain are being used in machine-vision and image-processing applications.
Andrew Wilson, Editor
Today’s image-processing software packages use a number of different techniques to perform object recognition and image classification. These include both supervised and unsupervised learning algorithms, such as k-nearest neighbor algorithms, support vector machines (SVMs), self-organizing maps (SOMs), and neural networks (see “New Frontiers in Imaging Software,” Vision Systems Design, June 2010).
One of the main benefits of learning algorithms such as neural networks is how little programming is required to develop image recognition or classification applications. Many such neural network programs perform this recognition by using digitized images to feed a network of simulated neurons. Depending on the attributes associated with each pixel in the image, the neurons are then automatically weighted.
In a feed-forward network, these weights are then used as the input of a smaller array of neurons and finally produce a weighted sum or number that represents the image. In this process, it is pixel attributes such as color, gray level, or combined values of groups of pixels that are used as the input to the neural network.
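As a rough illustration of the feed-forward idea described above, the following minimal Python sketch passes four gray-level pixel values through a small hidden layer to produce a single number. The layer sizes, the weights, and the logistic activation are illustrative assumptions, not details of any particular software package.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through a logistic activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def feed_forward(pixels, hidden_weights, output_weights):
    """Propagate pixel values through one hidden layer to a single output."""
    hidden = [neuron(pixels, w) for w in hidden_weights]
    return neuron(hidden, output_weights)

# Four gray-level pixels scaled to [0, 1]; all weights are made-up values.
pixels = [0.0, 0.5, 1.0, 0.25]
hidden_weights = [[0.2, -0.4, 0.6, 0.1],
                  [-0.3, 0.8, 0.5, -0.2]]
output_weights = [1.0, -1.0]

# A single number between 0 and 1 that stands in for the image.
score = feed_forward(pixels, hidden_weights, output_weights)
print(round(score, 3))
```

In a trained network the weights would be set automatically from example images rather than typed in by hand.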
However, research by Simon Thorpe and his colleagues at the Brain and Cognition Research Centre has shown that computation in the nervous system does not rely solely on the rate at which neurons emit spikes to code information; the order in which the neurons fire also carries information. To commercialize this concept, SpikeNet Technology has developed an image-processing toolkit called SNVision Studio SDK, integrating networks of spiking neurons to both recognize and classify images.
Rather than weight the simulated neurons in the network using pixel attributes, the software uses the order in which simulated neurons “fire” to code information. Because this order coding is contrast independent, the software is also highly tolerant to noise and light conditions within an image.
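The contrast independence of order coding can be seen in a small Python sketch. It assumes, for illustration only, that a simulated neuron with a stronger input fires earlier, so the firing order is simply the ranking of the pixel values; this is not SpikeNet’s implementation.

```python
def firing_order(intensities):
    """Return pixel indices sorted from strongest to weakest input.

    Assumption: a neuron with a stronger input fires earlier, so this
    list stands in for the order in which the simulated neurons fire.
    """
    return sorted(range(len(intensities)), key=lambda i: -intensities[i])

patch = [12, 200, 47, 90, 155]              # gray levels of a small patch
dimmer = [0.5 * v for v in patch]           # contrast halved
brighter = [0.8 * v + 40 for v in patch]    # gain change plus an offset

order = firing_order(patch)
print(order)                                # [1, 4, 3, 2, 0]
print(firing_order(dimmer) == order)        # True
print(firing_order(brighter) == order)      # True
```

Any change in gain or offset that preserves the ranking of the inputs leaves the firing order untouched, whereas the raw pixel values fed to a conventional network would all change.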
SpikeNet’s technology is now applied in broadcast transmission content analysis at Repucom International, mobile phone visual search at Buzzar, and by Elan Software Systems in machine-vision systems for the pharmaceutical industry. The company is also working on a large-scale automatic image data mining system for French Homeland Security and an embedded SpikeNet chipset to be integrated into applications for video surveillance and robotics.
For potential developers wishing to incorporate this software into machine-vision or image-processing systems, SpikeNet Technology has made available a free trial version of the SNVision Model Builder toolkit on the company’s web site. Images can be opened, and the software can be trained to learn specific models and then tested with various degrees of noise, blur, luminance, contrast changes, rotations, and zoom levels.
After downloading and installing the software on a Windows-based PC, images in more than 60 raster formats, including BMP, JPG, and TIFF, can be used to test the software’s functionality. Rather than use the test images supplied with the software, I imported an original 2048 × 1536-pixel, 2-Mbyte image of a rusty anchor into the program. Since this image was rather large and could not be displayed completely on the 13-in. monitor of my company-supplied 1.8-GHz Compaq nc6230, I resized the image using Adobe Photoshop to a more manageable 1024 × 768-pixel image.
After importing this resized image into SNVision Model Builder, I highlighted two regions within the image, namely the anchor ring and a nearby float. Under >Models>Learn in the software menu, the system was then trained to recognize the two objects. After training, I presented the software with an identical image and under >Processing>Recognize, the software recognized both objects in 301.9 ms (see Fig. 1).
FIGURE 1. Using SNVision Model Builder, image features within a 1024 × 768-pixel color image can be located in 301 ms by simple training. Reducing the size of the image to a VGA format decreases this processing time to less than 100 ms.
Of course, processing speed depends on image size. Using an identical image, rescaled to 640 × 480 pixels, for example, resulted in an execution time of 99 ms.
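The two timings reported here can be put side by side with a few lines of Python. This is simple arithmetic on the measurements above; with only two data points from an informal test, it should not be read as a scaling law for the software.

```python
# Recognition times reported above, keyed by image size in pixels.
timings_ms = {(1024, 768): 301.9, (640, 480): 99.0}

def ns_per_pixel(size, ms):
    """Average processing time per pixel in nanoseconds."""
    width, height = size
    return ms * 1e6 / (width * height)

pixel_ratio = (1024 * 768) / (640 * 480)     # 2.56 times as many pixels
time_ratio = 301.9 / 99.0                    # roughly 3.05 times as long
frames_per_second = 1000.0 / 99.0            # roughly 10 frames/s at VGA size

for size, ms in timings_ms.items():
    print(size, round(ns_per_pixel(size, ms), 1), "ns/pixel")
```

In this test the larger image took somewhat more time per pixel than the VGA image, and the 99-ms figure corresponds to roughly 10 images per second.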
To understand the limits of the software’s capabilities, I tested the software with the same image subjected to varying degrees of blur, rotation, and rescaling. After importing the original 1024 × 768-pixel rescaled image into Photoshop, I generated three more “original” images, one with what Adobe terms a “10-pixel blur,” one with 20° of clockwise rotation, and another that was rescaled by 10%.
Blur, rotation, rescale
Following the same image-recognition process using SNVision Model Builder, both objects of interest were found at various processing speeds. When performing image recognition on the image rotated 20° clockwise, the processing time required was 314.4 ms (see Fig. 2).
FIGURE 2. Features within 1024 × 768-pixel color images rotated by 20° with file sizes of 571 kbytes can be determined in as little as 314 ms using SpikeNet Technology’s software.
However, it must be noted that when this rotation was initially performed in Photoshop, the generated image file was more than twice the size of the original 246-kbyte image file since the generation process introduced “white space” into the image. Perhaps most impressively, features in the almost unrecognizable blurred image were found in 283.9 ms (see Fig. 3).
FIGURE 3. Blurring the same image using a 10-pixel Gaussian blur in Adobe Photoshop and subjecting the image to the same recognition process, both the anchor ring and the nearby float were recognized in 283 ms.
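The “white space” introduced by the rotation step can be estimated with a short geometric sketch. It assumes that the image editor enlarges the canvas to the smallest upright rectangle that holds the whole rotated frame; the function name is hypothetical, and compressed file sizes will not track pixel counts exactly.

```python
import math

def rotated_canvas(width, height, degrees):
    """Size of the smallest upright rectangle enclosing a rotated image."""
    theta = math.radians(degrees)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    return width * c + height * s, width * s + height * c

w, h = rotated_canvas(1024, 768, 20)
growth = (w * h) / (1024 * 768)

print(round(w), round(h))       # about 1225 x 1072 pixels
print(round(growth, 2))         # about 1.67 times the original pixel count
```

Under that assumption, a 20° rotation of a 1024 × 768-pixel image adds roughly two-thirds more canvas area, all of it blank corner regions.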
These same features could not be recognized after an additional 10% of image scaling, after blurring was increased to 20 pixels, or after rotation of more than 20°. To recognize targets under such conditions, a number of models must be created by modifying the SizeMin and SizeMax percentage settings.
To search for targets at a smaller size—due to image reduction or due to images being captured at different distances from the camera—SizeMin can be set to 50, for example. Then the software looks for a 50% smaller target.
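The effect of such size limits can be pictured with a short Python sketch that lists the target sizes a search would have to cover. The parameter names are borrowed from the settings mentioned above, but the function, the step value, and the way the percentages are applied are illustrative assumptions, not the SNVision API.

```python
def candidate_sizes(model_size, size_min=100, size_max=100, step=10):
    """List target sizes between size_min% and size_max% of the model size.

    Each entry is (percent, width_in_pixels, height_in_pixels).
    """
    width, height = model_size
    sizes = []
    for percent in range(size_min, size_max + 1, step):
        sizes.append((percent,
                      round(width * percent / 100),
                      round(height * percent / 100)))
    return sizes

# A hypothetical 120 x 80-pixel model searched from 50% up to full size.
sizes = candidate_sizes((120, 80), size_min=50, size_max=100, step=25)
print(sizes)    # [(50, 60, 40), (75, 90, 60), (100, 120, 80)]
```

A wider size range means more candidate sizes to check, which is one reason a broader search can be expected to cost more processing time.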
Implementing a recognition system that processes small video images at up to 10 frames/s requires a camera, video acquisition board, PC, and the SpikeNet software. According to Hung Do-Duy of SpikeNet, the software can also be implemented using parallel processing techniques to tackle more demanding tasks that require either larger images or images with larger numbers of target forms to be recognized.
Although the free demonstration software does not allow images, image models, or the recognition results to be saved, these functions are available in the full licensed version of the software.
Brain and Cognition Research Centre
Toulouse, France
Buzzar
Amsterdam, the Netherlands
Elan Software Systems
Repucom International
Stamford, CT, USA