Disparity maps give robots depth perception
Andrew Wilson, Editor, firstname.lastname@example.org
One of the most important problems facing developers of robot-based vision systems is depth perception, the ability to judge an object’s distance from a camera system. Many systems use stereo cameras to provide depth information to a host computer, and robotic systems then use this information to locate objects in 3-D space.
“Reconstruction of real-world images through stereo cameras can be divided into two steps,” says Andrew Worcester, cofounder of Focus Robotics (Hudson, NH, USA; www.focusrobotics.com). “These consist of a correspondence problem, where for every point in one image, the corresponding point in the other image must be located. Once found, the disparity of these points can be computed. Then, given the disparity map, the focal length of the two cameras and the relative position and orientation of the cameras, the x, y, and z coordinates of points in the image can be computed.”
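The second step Worcester describes can be illustrated with the standard pinhole model for a rectified stereo pair. The following is a minimal sketch, not Focus Robotics' implementation; the function name, the parameter names, and the numeric values in the usage lines are all hypothetical, and it assumes the focal length is expressed in pixels and the two cameras are separated by a known baseline.

```python
def disparity_to_xyz(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Convert one pixel of a disparity map to (x, y, z) in meters.

    Assumes a rectified pair and a pinhole model: (u, v) is the pixel in the
    left image, (cx, cy) is the principal point, focal_px is the focal length
    in pixels, and baseline_m is the distance between the two cameras.
    Returns None when the disparity is not positive (no valid match).
    """
    if disparity <= 0:
        return None
    z = focal_px * baseline_m / disparity   # depth is inversely proportional to disparity
    x = (u - cx) * z / focal_px             # offset to the right of the optical axis
    y = (v - cy) * z / focal_px             # offset below the optical axis
    return (x, y, z)


# Hypothetical calibration values, chosen only to make the arithmetic easy to follow.
point = disparity_to_xyz(u=476, v=240, disparity=20,
                         focal_px=500.0, baseline_m=0.1, cx=376.0, cy=240.0)
print(point)
```

With these assumed values, a 20-pixel disparity corresponds to a depth of 2.5 m; halving the disparity would double the depth.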
Focus Robotics introduced its nDepth Vision processor, an OEM subassembly that allows system designers to inexpensively add depth measurements to their systems. The Xilinx Spartan 3-based nDepth processor provides 752 × 480 pixels of depth information at 30 frames/s, approximately 30 times faster than conventional depth-perception software running on a 3-GHz Pentium processor (see Vision Systems Design, August 2005, p. 23).
“Originally,” says Worcester, “the processor was to have been embedded within a stereo camera. However, since our customer base demanded access to the FPGA device used, a more systems-based approach was taken” (see figure). The system consists of a separate stereo camera based on two MT9V022 CMOS micro cameras from Micron Technology (Boise, ID, USA; www.micron.com) and two PC-104-based FPGA boards hosted by a ReadyBoard800 from Ampro Computer (San Jose, CA, USA; www.ampro.com) running Fedora Core Linux from Red Hat Software (Raleigh, NC, USA; www.redhat.com).
As images are digitized from the camera, they are transferred over a single LVDS link directly to the Xilinx Spartan 3 on the first PC-104 board. Here the processor rectifies the images so that the search for a corresponding pixel is confined to a single horizontal scan line. Stereo correlation is then performed: for the 9 × 9-pixel region around each pixel in the left image, the best-matching region of equal size in the right image is found using the sum-of-absolute-differences algorithm. Because disparity, the horizontal offset between matched regions, is inversely proportional to distance, the greater the matched disparity, the closer the object is to the cameras.
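The matching step can be sketched in software for a single pixel. This is an illustrative, unoptimized version in plain Python, not the FPGA design; the function name, the search range of 16 pixels, and the synthetic test images are assumptions. It presumes the images are already rectified and that the chosen pixel lies at least half a window away from the top, bottom, and right image borders.

```python
import random


def sad_disparity(left, right, row, col, window=9, max_disparity=16):
    """Return the disparity whose window in `right` best matches left[row][col].

    `left` and `right` are rectified grayscale images stored as lists of rows.
    The search runs along the same row only, comparing window x window regions
    by their sum of absolute differences (SAD); the lowest SAD wins.
    """
    half = window // 2
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        if col - d - half < 0:
            break  # candidate window would fall off the left edge of `right`
        cost = 0
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                cost += abs(left[row + dy][col + dx] - right[row + dy][col + dx - d])
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic test pair: a random texture, with the right image equal to the
# left image shifted four pixels to the left (a known disparity of 4).
rng = random.Random(0)
height, width, shift = 20, 40, 4
left = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
right = [[left[r][c + shift] if c + shift < width else 0 for c in range(width)]
         for r in range(height)]

found = sad_disparity(left, right, row=10, col=20)
print(found)
```

Repeating this search for every pixel produces the disparity map; the nested loops make the cost of doing so in software apparent and suggest why the search is a natural candidate for parallel hardware.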
“While these depth maps provide the programmer with a visual idea of depth within the image,” says Worcester, “the data rate of 11 Mbytes/s is too large to perform any substantial analysis in real time.” To overcome this, Focus Robotics employs an optional second FPGA-based board on the PC-104 stack that can be used by developers to program additional functionality into the system. “After the disparity map is computed, map data can be pipelined to this FPGA for additional processing.”
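The quoted data rate can be checked against the resolution and frame rate given earlier. The short calculation below assumes one byte of disparity data per pixel, a figure the article does not state.

```python
WIDTH, HEIGHT = 752, 480      # disparity-map resolution cited in the article
FRAMES_PER_SECOND = 30        # frame rate cited in the article
BYTES_PER_PIXEL = 1           # assumption: one 8-bit disparity value per pixel

bytes_per_second = WIDTH * HEIGHT * FRAMES_PER_SECOND * BYTES_PER_PIXEL
mbytes_per_second = bytes_per_second / 1e6

print(bytes_per_second)               # 10828800
print(round(mbytes_per_second, 1))    # 10.8
```

Under that assumption the stream is about 10.8 Mbytes/s, consistent with the roughly 11 Mbytes/s Worcester cites.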
To perform obstacle detection and navigation, for example, companies such as Mobile Robots (Amherst, NH, USA; www.mobilerobots.com) have used this system to create a 3-D view of a mobile robot’s environment. Disparity maps are converted to 3-D coordinates that can be combined to create a rough 3-D picture of the world. “The creation of the view coordinates,” says Worcester, “can be ported to the second FPGA in the future to further offload the host PC processor for other tasks.”
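One plausible way to combine per-frame 3-D points into a rough picture of the world is to transform each point by the robot's pose and mark the cell it falls in. The sketch below is an assumption-laden illustration, not the method used by Mobile Robots or Focus Robotics: the function names, the 0.25-m cell size, the planar pose (x, y, heading), and the camera axis convention (x right, y down, z forward) are all hypothetical.

```python
import math


def camera_to_world(point_cam, robot_x, robot_y, heading_rad):
    """Map a camera-frame point (x right, y down, z forward) to world coordinates.

    The robot is assumed to move on a plane; heading 0 faces the world +X axis.
    Returns (world_x, world_y, height relative to the camera).
    """
    x, y, z = point_cam
    world_x = robot_x + z * math.cos(heading_rad) + x * math.sin(heading_rad)
    world_y = robot_y + z * math.sin(heading_rad) - x * math.cos(heading_rad)
    return (world_x, world_y, -y)


def add_to_map(occupied, points_cam, pose, cell_m=0.25):
    """Mark the grid cell containing each point as occupied and return the set."""
    for point in points_cam:
        wx, wy, wz = camera_to_world(point, *pose)
        occupied.add((math.floor(wx / cell_m),
                      math.floor(wy / cell_m),
                      math.floor(wz / cell_m)))
    return occupied


# Two hypothetical frames taken from different robot poses (x, y, heading).
world_map = set()
add_to_map(world_map, [(0.1, -0.3, 2.1)], pose=(0.0, 0.0, 0.0))
add_to_map(world_map, [(0.1, 0.3, 1.1)], pose=(2.1, 0.0, math.pi / 2))
print(sorted(world_map))
```

Storing only occupied cells keeps the map small and makes repeated observations of the same obstacle collapse into a single entry, at the cost of discarding detail finer than the cell size.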
According to Jeanne Dietsch, CEO of Mobile Robots, the nDepth system is being incorporated within the company’s Seekur Homeland Security robot, which includes laser, DGPS, stereovision, and thermal sensing units. “The nDepth system works well with our mobile robot platforms,” says Dietsch, “because of its small size, low power use, and ability to perform the computationally intensive stereo matching without overloading the CPU.”