Measuring a heartbeat by watching how skin color changes over time is impossible for the unaided human eye. Similarly, low-amplitude motion such as the rise and fall of a breathing infant's chest may be difficult to discern. By exaggerating subtle color changes or imperceptible motion in image sequences, these temporal variations can be made visible and more easily measured.
To reveal these subtle changes, researchers led by professors Frédo Durand and William Freeman at MIT CSAIL (www.csail.mit.edu) have developed a spatio-temporal algorithm that can process VGA-resolution video at up to 45 frames/sec on a standard laptop computer (http://bit.ly/PsrA2O).
To develop the algorithm, PhDs Hao-Yu Wu, Michael Rubinstein, and John Guttag of MIT CSAIL and Eugene Shih of Quanta Research Cambridge (www.qrclab.com) first spatially decomposed a series of images into different frequency bands using a Laplacian pyramid approach originally proposed by Peter Burt and Edward Adelson for compact image coding (http://bit.ly/KMCbYi). For each image frame in the sequence, a pyramid of images is generated where the bottom-most level holds the finest detail at the resolution of the original image and the higher levels hold detail at progressively coarser spatial scales (see figure). Thus, each frame is decomposed into roughly separate spatial-frequency bands.
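The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration using only NumPy, not the researchers' code: the 5-tap binomial blur, the zero-insertion upsampling, and the function names (build_laplacian_pyramid, collapse_laplacian_pyramid) are all assumptions chosen for brevity. It assumes a single-channel image whose smallest pyramid level is at least 3 pixels on each side.

```python
import numpy as np

# 5-tap binomial kernel, a common stand-in for the Gaussian used in pyramids.
KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur(img):
    """Separable 5-tap blur with reflected borders (2-D input)."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(KERNEL[k] * padded[:, k:k + w] for k in range(5))
    return sum(KERNEL[k] * rows[k:k + h, :] for k in range(5))

def downsample(img):
    """Blur, then keep every second pixel in each direction."""
    return blur(img)[::2, ::2]

def upsample(img, shape):
    """Insert zeros between pixels, then blur (x4 restores brightness)."""
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = img
    return 4.0 * blur(up)

def build_laplacian_pyramid(img, levels):
    """Return [finest band, ..., coarsest band, low-pass residual]."""
    pyramid = []
    current = np.asarray(img, dtype=np.float64)
    for _ in range(levels - 1):
        smaller = downsample(current)
        # Each band is what the coarser level cannot represent.
        pyramid.append(current - upsample(smaller, current.shape))
        current = smaller
    pyramid.append(current)
    return pyramid

def collapse_laplacian_pyramid(pyramid):
    """Invert build_laplacian_pyramid by adding the bands back in."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = band + upsample(current, band.shape)
    return current

frame = np.random.default_rng(0).random((48, 64))
bands = build_laplacian_pyramid(frame, levels=4)
restored = collapse_laplacian_pyramid(bands)
print([b.shape for b in bands], np.allclose(restored, frame))
```

Because each band stores exactly the difference that the next coarser level leaves out, collapsing the unmodified pyramid returns the input frame up to floating-point rounding.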
Using this technique, the original image can be recovered (or collapsed) with no loss of data (http://bit.ly/N92rXG). Before recovering the image, however, temporal processing is performed on each spatial band of the image sequence using a temporal bandpass filter to extract the frequency bands of interest. Selection of this bandpass filter is application-dependent. For example, to amplify the subtle motion of blood vessels, a narrow temporal passband centered on 0.88 Hz, which corresponds to a heart rate of about 53 beats per minute, can be used.
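As an illustration of the temporal step, the sketch below converts a heart rate to a frequency and applies an ideal (FFT-based) bandpass filter along the time axis. The article does not say which filter design the researchers used, so the ideal filter here is an assumption, as are the function names, the 30 frames/sec rate, and the 0.8 to 1.0 Hz passband.

```python
import numpy as np

def bpm_to_hz(bpm):
    """Convert beats per minute to cycles per second."""
    return bpm / 60.0

def ideal_bandpass(signal, fps, low_hz, high_hz):
    """Zero every temporal frequency outside [low_hz, high_hz] along axis 0."""
    signal = np.asarray(signal, dtype=np.float64)
    n = signal.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    spectrum = np.fft.rfft(signal, axis=0)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=n, axis=0)

# Synthetic pixel trace: constant brightness, a faint 0.88 Hz pulse,
# and a stronger 5 Hz flicker that should be rejected.
fps = 30.0
t = np.arange(750) / fps                      # 25 seconds of samples
pulse = 0.01 * np.sin(2 * np.pi * 0.88 * t)
trace = 0.5 + pulse + 0.05 * np.sin(2 * np.pi * 5.0 * t)

filtered = ideal_bandpass(trace, fps, low_hz=0.8, high_hz=1.0)
print(round(bpm_to_hz(53), 3), np.allclose(filtered, pulse))
```

In this synthetic case the filter keeps only the pulse component, discarding both the constant brightness and the 5 Hz flicker. A real sequence would rarely place the signal exactly on an FFT bin, so some leakage outside the passband should be expected.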
Such a bandpass filter is applied equally to every spatial level generated by the Laplacian pyramid. The extracted bandpass signals are then multiplied by an amplification factor to exaggerate the temporal variation between image frames. The magnified signal is added back to the spatially decomposed image data, and the Laplacian pyramid is collapsed to produce the output image sequence. In this way, the output sequence makes subtle color and motion changes visible.
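The amplify-and-add-back step can be sketched for a single spatial band as follows. The function name magnify_band, the parameter name alpha, and the amplification factor of 50 are hypothetical choices for this example; the temporal filter is again an assumed ideal FFT bandpass. In a full pipeline the same operation would be applied to every pyramid level before collapsing.

```python
import numpy as np

def magnify_band(band_frames, fps, low_hz, high_hz, alpha):
    """Amplify temporal variation inside [low_hz, high_hz] for one band.

    band_frames has shape (frames, height, width); alpha is the
    amplification factor (a name chosen for this sketch).
    """
    band_frames = np.asarray(band_frames, dtype=np.float64)
    n = band_frames.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    spectrum = np.fft.rfft(band_frames, axis=0)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    filtered = np.fft.irfft(spectrum, n=n, axis=0)
    # Add the magnified bandpassed signal back onto the original band.
    return band_frames + alpha * filtered

# Synthetic band: 4 x 4 pixels whose brightness ripples by 0.1% at 0.88 Hz.
fps = 30.0
t = np.arange(750) / fps
ripple = 0.001 * np.sin(2 * np.pi * 0.88 * t)
frames = 0.5 + ripple[:, None, None] * np.ones((1, 4, 4))

out = magnify_band(frames, fps, low_hz=0.8, high_hz=1.0, alpha=50.0)
print(np.ptp(frames[:, 0, 0]), np.ptp(out[:, 0, 0]))
```

Since the filtered signal is added to the original rather than replacing it, a ripple inside the passband ends up scaled by (1 + alpha) while everything outside the passband is left unchanged.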
Currently, two versions of code are available to perform this processing. The first, written in nonoptimized MATLAB code, has been used to process image sequences offline on a six-core PC. The second, written in C++, is also PC-based and has been demonstrated processing 640 x 480-pixel video frames at 45 frames/sec.
According to Freeman, the code will soon be posted on the MIT CSAIL web site at http://bit.ly/MxlYjj.
Freeman and his colleagues have demonstrated how the code can be applied in a number of embedded vision applications including monitoring the vital signs of newborns. In this application, the heart rate of a baby was measured by amplifying the color change of a series of facial images as blood flows through the baby's face.
After processing the images, green-to-red variations were observed in the image sequence that correspond to the baby's heartbeat. The accuracy of the heart rate computed using this method was then verified by comparing the results with those obtained from an ECG monitor. Using the same spatio-temporal analysis, it was also possible to amplify the motion of the baby's breathing as it lay in a crib. Both of these applications can be viewed at http://bit.ly/KkRCRh.
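One way a heart rate could be read from such a color variation is to average a color channel over the face in each frame and look for the strongest frequency in a plausible range. The sketch below does this with an FFT on a synthetic trace. It is not the researchers' measurement procedure, which the article does not detail; the function name estimate_bpm, the 40 to 240 beats-per-minute search range, and the test signal are all assumptions.

```python
import numpy as np

def estimate_bpm(trace, fps, min_bpm=40.0, max_bpm=240.0):
    """Return the dominant frequency of a 1-D trace, in beats per minute."""
    trace = np.asarray(trace, dtype=np.float64)
    trace = trace - trace.mean()              # remove constant brightness
    freqs = np.fft.rfftfreq(trace.size, d=1.0 / fps)
    power = np.abs(np.fft.rfft(trace)) ** 2
    # Search only the physiologically plausible band.
    band = (freqs >= min_bpm / 60.0) & (freqs <= max_bpm / 60.0)
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz

# Synthetic per-frame mean color value: a 2 Hz (120 beats/min) pulse
# buried in noise, sampled at 30 frames/sec for 20 seconds.
rng = np.random.default_rng(0)
fps = 30.0
t = np.arange(600) / fps
trace = 100.0 + np.sin(2 * np.pi * 2.0 * t) + 0.5 * rng.standard_normal(t.size)

print(estimate_bpm(trace, fps))
```

The frequency resolution of this estimate is the reciprocal of the recording length, so a 20-second clip resolves heart rate only to within about 3 beats per minute.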