Neural networks target video motion detection

By R. Winn Hardin, Contributing Editor

A symmetrical-processing security system uses wavelet-compression software to detect motion.

Recent developments in image processing, wavelet compression, and neural-network technology are giving digital surveillance systems a new lease on life. Capable of automatic target detection and alarm issuance, these PC-based systems are using off-the-shelf cameras, frame grabbers, and low-cost PCs to bring inexpensive, high-performance digital surveillance systems to a wider market.

"Surveillance systems should be capable of automatically combating criminal activity while at the same time being simple to install and upgradeable," says Mick Nailard of British Telecom. "Most of all, such systems should be designed to be transportable so that they can be moved from site to site as required," he says.

To meet such demands, Neurodynamics (Cambridge, England) has developed a dual-processor PC security and surveillance system, dubbed Witness, that runs under Windows NT. Using the company's own FPGA-based frame grabber, off-the-shelf cameras, lighting, and networking boards, the system can separate real threats from accepted movement, significantly reducing false alarms.

MULTIPURPOSE ACQUISITION

Plugged into the PC's PCI bus are FPGA-based image-acquisition boards that provide A/D conversion and image compression for up to eight cameras per board (see Fig. 1). At BT and other installations, the Witness system uses NTSC-compatible cameras from JVC (Wayne, NJ), Sony (Park Ridge, NJ), or Panasonic (Secaucus, NJ). The two acquisition boards also provide 16 opto-isolated I/O connections for camera, pan/tilt, lighting, and alarm controls.

Neurodynamics uses a neural network to determine whether images contain suspicious persons or activities that could represent a security problem.


The acquisition board communicates with servos and cameras through a proprietary software interface that contains a library of servo and camera commands. "That way, if a customer has a pan/tilt/zoom camera in an office and wants to replace a JVC camera with a Panasonic camera, you don't have to reprogram the system. You just tell the system to change cameras, and it alters the programming accordingly," says David Humphrey, a hardware specialist with Neurodynamics.

According to Humphrey, wavelet-compression algorithms developed at Cambridge University (Cambridge, England) allow one week of storage for 16 cameras at a rate of 1 frame/s on a 64-Gbyte hard drive at a compression ratio of 20:1. However, regularly scheduled acquisition and storage of images is only one way to approach archiving. The operator can configure Witness to archive only during nighttime, for example, or when the system detects suspicious activity.
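The one-week storage figure quoted above can be sanity-checked with simple arithmetic. The sketch below assumes that 1 Gbyte means 10^9 bytes and that the 1 frame/s rate applies to each camera; the function name is illustrative and is not part of the Witness software.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 s

def frame_budget_bytes(disk_bytes, cameras, frames_per_second, seconds):
    """Average number of compressed bytes available per stored frame."""
    total_frames = cameras * frames_per_second * seconds
    return disk_bytes / total_frames

# Figures quoted in the article: 64 Gbytes, 16 cameras, 1 frame/s, one week.
compressed = frame_budget_bytes(64e9, cameras=16, frames_per_second=1,
                                seconds=SECONDS_PER_WEEK)
uncompressed = compressed * 20  # undo the quoted 20:1 compression ratio

print(f"{compressed / 1e3:.1f} kbytes compressed per frame")
print(f"{uncompressed / 1e3:.0f} kbytes uncompressed per frame")
```

Under these assumptions each stored frame has a budget of roughly 6.6 kbytes after compression, which corresponds to about 132 kbytes of uncompressed image data.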

PROGRAMMING THE NET

The system also ships with a 10/100 Base-T Ethernet card from Intel (Santa Clara, CA), an ISDN or PSTN modem from US Robotics (Salt Lake City, UT), 128 Mbytes of RAM, and two 64-Gbyte hard drives. Through the Witness software, the operator can divide the 128 Gbytes of hard-drive space into multiple segments, each acting as a circular loop memory buffer. The system even sends an alarm if a particular segment of memory is not active for a specified time period, which could indicate improper video motion-detection settings in the neural net, a broken camera, or another system problem. Once NTSC images are digitized into RAM, the neural net uses several user-settable parameters to weed out false alarms while detecting motion.
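The circular-loop segments and the inactivity alarm described above can be pictured with a small sketch. The class, its method names, and the choice to treat a never-written segment as stale are assumptions made for illustration; the article does not describe how Witness implements its buffers.

```python
from collections import deque

class LoopSegment:
    """Hypothetical fixed-capacity archive segment that overwrites its oldest frames."""

    def __init__(self, capacity_frames, max_idle_seconds):
        # A deque with maxlen silently drops the oldest entry when full.
        self.frames = deque(maxlen=capacity_frames)
        self.max_idle_seconds = max_idle_seconds
        self.last_write = None

    def store(self, timestamp, frame):
        self.frames.append((timestamp, frame))
        self.last_write = timestamp

    def is_stale(self, now):
        """True if nothing was written within the allowed idle period."""
        if self.last_write is None:
            return True  # assumption: a segment that was never written counts as stale
        return now - self.last_write > self.max_idle_seconds

segment = LoopSegment(capacity_frames=3, max_idle_seconds=60)
for t in range(5):
    segment.store(t, f"frame-{t}")

stored = [frame for _, frame in segment.frames]
print(stored)                      # the two oldest frames have been overwritten
print(segment.is_stale(now=100))   # last write was at t=4, more than 60 s ago
```

A stale segment would then be the condition that raises the "segment not active" alarm.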

FIGURE 1. Two frame grabbers coupled over the PCI bus are capable of handling up to 16 cameras in the Witness security system. Off-the-shelf networking cards allow the system access to other security systems.

Just as the time of operation for each camera can be set, the operator can also set motion-detection parameters for each camera. First, the operator accesses each camera's field of view (FOV) through the Witness software interface (see Fig. 2). A mask is created that shows the regions of interest for each FOV. Witness automatically breaks up the image into 800 squares and, without any additional settings, performs a correlation function on each square from one frame to the next. If the correlation level drops below 100%, the software assumes motion and alerts the operator.
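The block-by-block comparison described above might look something like the following sketch. The article does not say which correlation measure Witness uses, so normalized cross-correlation is assumed here, as is the handling of flat blocks; the function names and the tiny 4 x 4 test frames are illustrative only.

```python
import math

def block_correlation(a, b):
    """Normalized cross-correlation of two equal-length pixel lists, in [-1, 1]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        # Assumption: a flat block correlates perfectly only with an identical block.
        return 1.0 if a == b else 0.0
    return sum(x * y for x, y in zip(da, db)) / denom

def split_blocks(frame, block_h, block_w):
    """Yield ((row, col), pixels) per block; frame size must be a multiple of the block size."""
    for r in range(0, len(frame), block_h):
        for c in range(0, len(frame[0]), block_w):
            pixels = [frame[y][x]
                      for y in range(r, r + block_h)
                      for x in range(c, c + block_w)]
            yield (r // block_h, c // block_w), pixels

def changed_blocks(prev, curr, block_h, block_w, min_correlation=1.0):
    """Return grid positions whose correlation falls below min_correlation."""
    pairs = zip(split_blocks(prev, block_h, block_w),
                split_blocks(curr, block_h, block_w))
    # A small tolerance absorbs floating-point rounding on unchanged blocks.
    return [pos for (pos, p), (_, q) in pairs
            if block_correlation(p, q) < min_correlation - 1e-9]

prev = [[10, 20, 5, 5],
        [30, 40, 5, 5],
        [1, 2, 9, 8],
        [3, 4, 7, 6]]
curr = [[10, 20, 5, 5],
        [30, 40, 5, 5],
        [1, 2, 6, 7],
        [3, 4, 8, 9]]
print(changed_blocks(prev, curr, block_h=2, block_w=2))  # only the lower-right block changed
```

With the default `min_correlation` of 1.0, any drop below perfect correlation flags the block, which mirrors the unconfigured behavior described in the text.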

To limit these alarms to suspicious activities, the operator uses the mouse and paintbrush function to exclude or include parts of the image from the motion-detection calculation. Solid blocks indicate 0% sensitivity—areas of no interest to the system. In Fig. 2, the building in the background has been masked out of the image, leaving the roadway in the foreground as the main region of interest.

FIGURE 2. From a configuration screen, system operators can define potential target sizes and set factors such as ideal target size and object location within the image. This helps the neural network determine whether security is at risk.

The second step is to set target areas for various portions of the image. Using the mouse, the operator draws a box in a given area. This gives the system a sense of perspective based on target sizes in different areas of the image. In Fig. 2, the box is drawn large enough to include cars, but not large enough to include buses.

Finally, the motion threshold and minimum target areas are set. The motion threshold determines how much change must occur between successive frames in each of the unmasked blocks for motion to be detected.

Using these parameters, the neural net first correlates each of the 800 boxes in sequential images and sees if the amount of change exceeds the motion threshold parameter. Blocks that exceed that threshold are checked to see if they are contiguous and form clusters that match the target size within a given section of the image. Then the clusters are checked to see if they meet the minimum target area.
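The thresholding and clustering steps above can be sketched as a connected-component search over the block grid. This is a simplified illustration, not the Witness algorithm: it assumes a per-block change score (for example, 1 minus the correlation), 4-connected neighbors, and a single target box for the whole image rather than different sizes per region. All names are hypothetical.

```python
def find_clusters(change, mask, motion_threshold):
    """Group 4-connected unmasked blocks whose change exceeds motion_threshold."""
    rows, cols = len(change), len(change[0])
    active = {(r, c) for r in range(rows) for c in range(cols)
              if mask[r][c] and change[r][c] > motion_threshold}
    clusters = []
    while active:
        stack = [active.pop()]
        cluster = set(stack)
        while stack:
            r, c = stack.pop()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in active:
                    active.remove(nb)
                    cluster.add(nb)
                    stack.append(nb)
        clusters.append(cluster)
    return clusters

def matches_target(cluster, min_area, max_height, max_width):
    """Check a cluster against a minimum area and a target box size, both in blocks."""
    rs = [r for r, _ in cluster]
    cs = [c for _, c in cluster]
    height = max(rs) - min(rs) + 1
    width = max(cs) - min(cs) + 1
    return len(cluster) >= min_area and height <= max_height and width <= max_width

change = [[0.0, 0.9, 0.8, 0.0, 0.0],
          [0.0, 0.7, 0.0, 0.0, 0.6],
          [0.0, 0.0, 0.0, 0.0, 0.0]]
mask = [[True] * 5 for _ in range(3)]  # True marks blocks included in detection

clusters = find_clusters(change, mask, motion_threshold=0.5)
targets = [sorted(c) for c in clusters
           if matches_target(c, min_area=2, max_height=2, max_width=2)]
print(targets)  # the isolated single block at (1, 4) is rejected as too small
```

Setting a mask entry to `False` removes that block from consideration, which is the effect of painting out a region in the configuration screen.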

Witness then performs four additional checks to reduce false alarms. If the total area of motion exceeds the maximum total area setting, as can happen when wind shakes the camera, an alarm is not sent. If the minimum pixel average intensity is too low or the maximum average change based on individual pixel intensity is too high for a given camera type, then the alarm is not issued. According to Humphrey, this function eliminates false alarms caused by noisy cameras in low-light situations or by sharp increases in pixel intensity caused by shining a light directly into the camera. Last, the operator can set the number of sequential trigger counts required to cause an alarm.
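The four checks above can be combined into a single gate, as in the following sketch. The class, field names, units, and example values are all assumptions chosen for illustration; the article does not give the actual parameter names or ranges used by Witness.

```python
from dataclasses import dataclass

@dataclass
class AlarmFilter:
    """Hypothetical per-camera settings; names and units are illustrative only."""
    max_total_area: int        # more moving blocks than this suggests the whole scene moved
    min_mean_intensity: float  # below this the image is treated as too dark and noisy
    max_mean_change: float     # above this the change is treated as a light flash
    trigger_count: int         # consecutive positive frames required for an alarm
    _streak: int = 0

    def update(self, target_found, moving_area, mean_intensity, mean_change):
        """Return True only when every check passes for trigger_count frames in a row."""
        passed = (
            target_found
            and moving_area <= self.max_total_area
            and mean_intensity >= self.min_mean_intensity
            and mean_change <= self.max_mean_change
        )
        self._streak = self._streak + 1 if passed else 0
        return self._streak >= self.trigger_count

f = AlarmFilter(max_total_area=400, min_mean_intensity=20.0,
                max_mean_change=80.0, trigger_count=2)
results = [
    f.update(True, 12, 90.0, 5.0),   # first positive frame: not enough yet
    f.update(True, 14, 90.0, 6.0),   # second in a row: alarm
    f.update(True, 700, 90.0, 6.0),  # too much of the scene moving: streak resets
    f.update(True, 12, 90.0, 5.0),   # counting starts again
]
print(results)
```

Requiring several consecutive positive frames trades a short detection delay for fewer alarms from one-frame glitches.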

To complete the 800 correlations per image pair per camera in real time, along with the subsequent decision steps based on operator-set parameters, Neurodynamics relies on the symmetric multiprocessing supported by the Windows NT operating system and on the power of dual 600-MHz Pentium III processors.

NET TAKES ACTION

Once triggered, Witness can notify up to four remote sites of the intrusion. In the event of multiple alarms, Witness transmits images along with overlaid text data about location, camera position, and up to three fields of user-set data, either over the Ethernet to a local operator or by dialing up remote stations. Images and their accompanying text data are archived automatically according to the settings chosen during system configuration.

FIGURE 3. Neurodynamics' automated number-plate reader checks patrons' number plates against police lists, notifying station attendants and police of known criminals.

Neurodynamics is currently developing an industrial rack that can accommodate up to 250 cameras for a single facility. This PCI-based multiple NT system will be based on a corporate intranet structure, but will allow for a greater number of cameras. "Today, 16 cameras seems to be the cost point for conventional surveillance systems. Any more than that, and the cost skyrockets," Humphrey says.

Also under development is a software module that specifically targets gas stations to help catch people who drive off without paying. The system would automatically read the number plate from each car as it pulls into the station. The plate number would be sent across an automated computer network that holds up-to-date information on stolen cars and cars with histories of criminal activity, such as driving off without paying for gas (see Fig. 3).
