Microsoft promotes image-processing development
Andy Wilson, Editor at Large
In 1995, Microsoft Corp. (Redmond, WA) established its Vision Technology Group to support computer vision research. Now, in an attempt to standardize the development of image-processing software, the company has introduced its Vision Software Development toolkit (VisSDK). This software supports images of any pixel type (through C++ templates) and features a device-independent image-capture interface (see Vision Systems Design, May 1998, p. 7).
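To make the two design ideas concrete, the following C++ sketch shows how an image class can be parameterized on pixel type and how capture hardware can be hidden behind an abstract interface. It is an illustration only and does not reproduce the VisSDK API; every name in it (Image, Rgb, CaptureSource, GradientSource, mean_value) is hypothetical, and the "device" is a synthetic stand-in that generates a gradient rather than reading from a camera.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image container (not the VisSDK class): the pixel type is a
// template parameter, so one class serves 8-bit, float, or RGB images.
template <typename Pixel>
class Image {
public:
    Image(std::size_t width, std::size_t height, Pixel fill = Pixel{})
        : width_(width), height_(height), data_(width * height, fill) {}

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    // Row-major pixel access; the assert catches out-of-range coordinates
    // in debug builds.
    Pixel& at(std::size_t x, std::size_t y) {
        assert(x < width_ && y < height_);
        return data_[y * width_ + x];
    }
    const Pixel& at(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return data_[y * width_ + x];
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<Pixel> data_;
};

// Example of a user-defined pixel type that the container accepts unchanged.
struct Rgb {
    std::uint8_t r, g, b;
};

// Hypothetical device-independent capture interface: application code calls
// grab() without knowing which frame grabber or camera sits behind it.
template <typename Pixel>
class CaptureSource {
public:
    virtual ~CaptureSource() = default;
    virtual Image<Pixel> grab() = 0;
};

// Synthetic stand-in for a real device: each pixel holds its column index.
class GradientSource : public CaptureSource<std::uint8_t> {
public:
    GradientSource(std::size_t width, std::size_t height)
        : width_(width), height_(height) {}

    Image<std::uint8_t> grab() override {
        Image<std::uint8_t> frame(width_, height_);
        for (std::size_t y = 0; y < height_; ++y) {
            for (std::size_t x = 0; x < width_; ++x) {
                frame.at(x, y) = static_cast<std::uint8_t>(x % 256);
            }
        }
        return frame;
    }

private:
    std::size_t width_;
    std::size_t height_;
};

// Generic algorithm written once for any scalar pixel type.
// Returns 0.0 for an empty image.
template <typename Pixel>
double mean_value(const Image<Pixel>& img) {
    const std::size_t count = img.width() * img.height();
    if (count == 0) {
        return 0.0;
    }
    double sum = 0.0;
    for (std::size_t y = 0; y < img.height(); ++y) {
        for (std::size_t x = 0; x < img.width(); ++x) {
            sum += static_cast<double>(img.at(x, y));
        }
    }
    return sum / static_cast<double>(count);
}
```

In this arrangement, a routine such as mean_value is written once and compiled for whichever pixel type the caller supplies, while a real camera driver would simply be another class derived from CaptureSource.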
Microsoft's introduction of this toolkit represents more than a standardization attempt. In the last three years, Microsoft's Vision Group has been busy in other areas of image processing. These areas include image-based modeling and rendering, intelligent video analysis, and vision-based user interfaces.
In image-based modeling, Microsoft is addressing the fundamental problems of computer vision, such as inferring the three-dimensional structure of a scene given multiple images of the scene from different, possibly unknown, vantage points. While attempting to solve these problems, its vision researchers have been successful at creating seamless image mosaics that stitch together overlapping images to form a single panoramic view of an indoor or outdoor scene. Such image mosaics offer a compact representation of the data contained in many images and allow users to view scenes from perspectives other than those from the original images.
In other areas, Microsoft's intelligent video-analysis research aims at defining and extracting fundamental components of scene information captured in video. Some topics currently under investigation include how to separate static two- and three-dimensional visual information from dynamic video and how to compactly represent information about the contained actions and behaviors. Based on the answers to these questions, Microsoft's researchers aim to develop technology for interactive video production and manipulation, including seamless integration of real and synthetic visual information, video storage and access, and video over the Internet.
Perhaps the first such products from Microsoft will be in the area of vision-based user interfaces. These will allow computer-based vision systems to recognize people and interpret what they are doing. Already, research from Microsoft's vision group has resulted in intelligent movie players that recognize when a person is facing the monitor.
Whereas some industry observers might see Microsoft's entry into the vision market as an attempt by the world's software leader to dominate another computing area, other observers stress the urgent need for standards in the image-processing field.
With the introduction of Microsoft's SDK, software developers gain a common platform on which to build compatible software. Even if Microsoft itself does muscle into some of the more profitable areas of vision technology, its endorsement is expected to bring image processing to the forefront of the efforts of more vision researchers, systems integrators, and end users.
In essence, the single comprehensive solution envisaged by Microsoft may be more useful to vision-systems developers than the multiplicity of vision solutions currently on the market. Perhaps Microsoft can do for imaging what AT&T did for the nation's telecommunications infrastructure. By applying standards, AT&T rapidly deployed long-distance services across the United States. If Microsoft can do the same for imaging, the development and deployment of vision systems might similarly accelerate.