Computer vision evolves towards ubiquity

Sept. 12, 2016

For most of its history, computer vision was a topic of academic research, gaining its first sizable commercial success in factory automation applications, where it has become an essential technology. Nevertheless, vision has remained a niche technology - one that most people do not directly use or interact with on a daily basis.

Now, however, thanks to the emergence of cost-effective processors, image sensors and other semiconductor devices, along with robust algorithms, computer vision can be incorporated into a wide range of systems - including cost-, size- and power-constrained devices. The Embedded Vision Alliance, a worldwide organization of computer vision hardware, software and service providers founded in 2011, uses the term "embedded vision" to refer to this growing use of computer vision technology in embedded systems, mobile devices, PCs and the cloud.

Key to understanding how computer vision will further evolve is the realization that it is an enabling technology, not an end in itself. As with other technologies such as speech recognition, vision will eventually become ubiquitous and, in the process, "invisible." Computer vision is following the same path as other technologies that have already become success stories: as the technology improves, new applications become possible. Some of these applications succeed, and that success encourages further industry investment in the underlying technology.

Vision algorithms have been developed and refined for decades and are widely available in both proprietary suites and open-source collections such as OpenCV. These sometimes work well, especially when used for the specific tasks for which they were originally designed. However, classical computer vision algorithms are challenged by numerous real-life factors. Many potential applications are plagued by infinitely varying inputs, which, combined with the lack of underlying theoretical models of visual perception, leads to the need for exhaustive experimentation to create robust solutions. Uncontrolled environmental conditions - lighting, orientation, motion and occlusion - translate into ambiguity, which in turn leads to complex, multi-layered algorithms.
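To make the lighting problem concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from OpenCV or any other library: the eight-pixel "scanline," the intensity values and the function names are all hypothetical. It shows a fixed threshold tuned for one lighting level failing when the same scene is captured at half the illumination, and one extra algorithm layer (a threshold derived from the data itself) recovering the object.

```python
# Hypothetical illustration: a synthetic scanline with a bright object on a
# darker background, "captured" under two different lighting levels.

def make_scanline(gain):
    """Return 8 pixel intensities (background 40, object 120) scaled by gain."""
    base = [40, 40, 120, 120, 120, 40, 40, 40]
    return [min(255, int(p * gain)) for p in base]

def fixed_threshold(pixels, threshold=100):
    """Classical fixed threshold: 1 where a pixel exceeds a preset value."""
    return [1 if p > threshold else 0 for p in pixels]

def mean_threshold(pixels):
    """One extra layer: derive the threshold from the scanline's own mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

expected = [0, 0, 1, 1, 1, 0, 0, 0]   # where the object really is

bright = make_scanline(1.0)   # the lighting the fixed threshold was tuned for
dim = make_scanline(0.5)      # the same scene at half the illumination

fixed_bright = fixed_threshold(bright)   # finds the object
fixed_dim = fixed_threshold(dim)         # misses the object entirely
adaptive_dim = mean_threshold(dim)       # finds the object again

print("fixed, bright:", fixed_bright)
print("fixed, dim:   ", fixed_dim)
print("adaptive, dim:", adaptive_dim)
```

The data-derived threshold handles this one variation, but only this one. Orientation, motion and occlusion would each call for further layers, which is how classical pipelines grow complex.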

Deep neural networks promise to assist. Originally developed for object classification tasks, their use has now expanded to include detection, segmentation and other vision functions. While an underlying, all-encompassing theory of visual perception is still lacking, deep learning provides more general solutions to a variety of computer vision problems. In the past, large numbers of computer programmers were required to code image-processing algorithms. In the emerging deep learning era, in contrast, a large amount of data is required to train algorithms to classify and identify objects. One thing that has not changed is the need for lots of runtime compute horsepower to execute vision algorithms. Now, however, we also need a lot of compute horsepower for the pre-deployment training required by deep learning algorithms.

For many future vision applications, algorithms will likely converge around various (and in some cases multiple) deep neural networks. Classic vision algorithms will not disappear, but they will likely converge on a smaller range of functions for specific tasks, and the processing architectures that run them will similarly evolve. Much industry debate centers on whether "local" or "cloud" processing will dominate. In the era of increasingly pervasive and fast network connectivity, the most common answer will be "both." Both local and cloud processors will become increasingly heterogeneous, harnessing combinations of CPUs, GPUs, DSPs, FPGAs, and specialized imaging, vision and neural network co-processors.

Thankfully, APIs such as OpenCL enable the efficient use of such heterogeneous processors. Even higher-level APIs, such as OpenVX, promise to further abstract both the processors used and the underlying algorithms. Enabled by higher levels of abstraction, the focus of vision software development will shift from implementation to integration. This shift will enable the development of a larger number of applications, helping computer vision to become both ubiquitous and "invisible." In the process, vision will create value both for technology suppliers and for the implementers who leverage the technology in their applications.

View Jeff Bier's full session, "Computer Vision 2.0: Where We Are and Where We're Going," from the recent Embedded Vision Summit here.

Jeff Bier
Founder, Embedded Vision Alliance
Co-founder and President, BDTI (Berkeley Design Technology, Inc.)

View more articles from the Embedded Vision Alliance:

• Real-life case studies provide education and encouragement on vision system design challenges and solutions - http://bit.ly/2bHP5Jt

• Deep learning for computer vision: Perspectives from algorithm, market, and processor experts - http://bit.ly/2bHPb3N

• Industry standards simplify computer vision software development - http://bit.ly/2bnWqKr
