Kinect transformed into 3-D image capture system

Aug. 29, 2011
Engineers at the University of California San Diego (UCSD; San Diego, CA, USA) have modified Microsoft's Kinect Xbox 360 peripheral to enable it to make handheld 3-D scans of objects.

Originally intended to sit atop a television and sense the movement of users playing video games, the Kinect was repurposed by Jürgen Schulze, a research scientist at UCSD, and his master's student, Daniel Tenedorio.

The scanning process itself entails moving the modified device by hand over the surface of an object. Thousands of scans are then stitched together to give a full 3-D model. Because the progress of the scan can be monitored in real time, a user can quickly fill in any holes in the 3-D model.

FIGURE 1. The modified Microsoft Kinect houses an infrared (IR) projector (far left circle), an IR sensor (middle circle), and a color camera (far right circle). The five-pronged IR sensor on top of the Kinect allows its position and orientation in space to be tracked with overhead cameras. This tracking makes it possible to seamlessly stitch thousands of scans into a stable 3-D image.

The original Microsoft Kinect system projects a pattern of infrared (IR) dots onto an object; the reflected dots are captured by the device's IR sensor and used to compute a 3-D depth map. Nearby dots are then linked together to create a triangular mesh grid of the object, and the surface of each triangle in the grid is filled in with texture and color information from the Kinect's color camera. Scans are taken 10 times per second, and data from thousands of scans are combined in real time, yielding a 3-D model of the original object or person.
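
To make this pipeline concrete, here is a minimal Python sketch of the depth-map-to-mesh step: back-project each depth pixel through a pinhole camera model, connect neighboring pixels into triangles, and attach per-vertex color. The intrinsics and function names are illustrative assumptions, not the device's calibrated parameters or the researchers' code.

```python
import numpy as np

# Illustrative pinhole intrinsics (assumed values, not Kinect calibration).
FX, FY = 580.0, 580.0   # focal lengths in pixels
CX, CY = 320.0, 240.0   # principal point

def depth_to_mesh(depth, color):
    """Back-project a depth map into 3-D points and link neighboring
    pixels into a triangular mesh, as the article describes."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    x = (u - CX) * depth / FX
    y = (v - CY) * depth / FY
    points = np.dstack([x, y, depth]).reshape(-1, 3)

    # Two triangles per 2x2 block of pixels with valid depth readings.
    idx = lambda r, c: r * w + c
    tris = []
    for r in range(h - 1):
        for c in range(w - 1):
            if np.all(depth[r:r + 2, c:c + 2] > 0):   # skip holes (no IR return)
                tris.append((idx(r, c), idx(r + 1, c), idx(r, c + 1)))
                tris.append((idx(r + 1, c), idx(r + 1, c + 1), idx(r, c + 1)))

    colors = color.reshape(-1, 3)   # per-vertex color from the RGB camera
    return points, np.array(tris), colors

# Toy usage: a flat 4 x 4 depth map one meter from the camera.
depth = np.ones((4, 4))
color = np.zeros((4, 4, 3), dtype=np.uint8)
pts, tris, cols = depth_to_mesh(depth, color)
```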

FIGURE 2. The steps for making a 3-D reconstruction of a real-life stuffed bear (far left) include: 1) projecting a pattern of IR dots onto the bear to construct a depth map (second from left); 2) connecting nearby dots with a triangular mesh grid (third from left); and 3) filling in each triangle in the grid with color and texture information from the Kinect's color camera (far right).

One challenge Schulze and Tenedorio faced was spatially aligning all the 3-D scans from the mobile device. Without a mechanism for doing so, the resulting 3-D model would be a discontinuous jumble of images. To overcome this problem, they added a five-pronged IR sensor to the top of the Kinect. Overhead video cameras track this sensor in space, tagging each captured scan with its exact position and orientation. This tracking makes it possible to seamlessly stitch together information from the scans, resulting in a stable 3-D image.
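
Registering each scan then reduces to a rigid-body transform: rotate and translate the scan's points by the pose recorded at capture time so that every scan lands in the same world frame. A minimal sketch, with an invented function name and a toy pose:

```python
import numpy as np

def register_scan(points, rotation, translation):
    """Map one scan's points from the Kinect's local frame into the shared
    world frame, using the pose tagged by the overhead tracking cameras."""
    return points @ rotation.T + translation

# Toy example: a scan captured with the device rotated 90 degrees about Z
# and offset one meter along X (values are illustrative).
local = np.array([[0.1, 0.0, 1.0],
                  [0.0, 0.2, 1.1]])
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
t = np.array([1.0, 0.0, 0.0])
world = register_scan(local, Rz, t)
```

In this simplified picture, because every scan carries its own pose, stitching amounts to concatenating the transformed point sets rather than searching for scan-to-scan alignments.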

The researchers are now working on a tracking algorithm that incorporates smartphone sensors such as an accelerometer, a gyroscope, and GPS (Global Positioning System). In combination with the existing approach for stitching scan data together, the tracking algorithm would eliminate the need to acquire position and orientation information from the overhead tracking cameras.
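
The article does not specify the fusion algorithm, but a complementary filter is one common way to combine such sensors: integrate the gyroscope for smooth short-term orientation and blend in the accelerometer's gravity-based estimate to cancel long-term drift. A minimal sketch (all names and constants are illustrative, not the UCSD team's method):

```python
def fuse_pitch(pitch, gyro_rate, accel_pitch, dt, alpha=0.98):
    """Complementary filter for one orientation angle: trust the integrated
    gyroscope in the short term, the accelerometer in the long term."""
    return alpha * (pitch + gyro_rate * dt) + (1.0 - alpha) * accel_pitch

# Example update at 100 Hz with illustrative sensor readings (radians).
pitch = 0.0
pitch = fuse_pitch(pitch, gyro_rate=0.01, accel_pitch=0.002, dt=0.01)
```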

-- Posted by Vision Systems Design
