Despite improved tools, libraries and APIs, computer vision software development remains challenging. Several presentations at the recent Embedded Vision Summit addressed various aspects of the computer vision software development process, both as it currently exists and as it may evolve in the future.
First up was Paul Kruszewski, the founder and President of a computer vision company called WRNCH, with a presentation titled "Democratizing Computer Vision Development: Lessons from the Video Game Industry." Kruszewski previously founded a company called AI.implant, which developed the world’s first real-time navigation middleware for 3D models of humans and was subsequently acquired in 2005 by Presagis. He then founded GRIP, which developed the world’s first brain authoring system for video game characters and was acquired in 2011 by Autodesk. WRNCH, quoting from the company's website, "works with the leaders in digital entertainment to stuff bleeding edge computer vision technology in leading game engines to deliver amazing AR/VR applications."
In Kruszewski's opinion, today's dominant approach to computer vision software development isn't scalable, and this represents a major bottleneck to the deployment of vision-enabled products. The video game industry faced similar challenges in the early 2000s, when it became impractical for developers to write an entire game engine from scratch. Today, in contrast, small teams of independent game developers leverage commercial game engines like Unity to build complex video games that only five years ago would have required a 100+ person team. In his talk, Kruszewski projected how computer vision software development may similarly evolve. Here's a preview:
And what's today's dominant approach to computer vision software development, according to Kruszewski? In his own words, it's "hiring a team of computer vision PhDs to hack OpenCV." OpenCV, for those of you not already familiar with it, is the Open Source Computer Vision Library, a collection of more than 2,500 software components representing both classic and emerging machine learning-based computer vision functions. And, while Kruszewski is no doubt correct that higher-level tools will eventually enable easier and more widespread vision application development, Gary Bradski, the President and CEO of the OpenCV Foundation, explained how today, OpenCV is already enabling millions of developers.
Bradski launched what's now known as OpenCV while working at Intel Research; Intel subsequently released the library to the public under an open source license. In addition to managing OpenCV's development and distribution since 1999, Bradski previously ran the vision team at Stanford University for the autonomous vehicle that won the 2005 DARPA Grand Challenge race across the desert. Bradski also co-founded the Stanford AI Robotics program and remains a consulting faculty member in Stanford's computer science department. Out of this program grew an early robotics startup, Willow Garage, where he was a senior scientist. He also founded, and served as chief scientist of, a more recent robotics startup, Industrial Perception, which was sold to Google in August 2013.
In his Embedded Vision Summit talk, entitled "The OpenCV Open Source Computer Vision Library: What’s New and What’s Coming?," Bradski began by providing an overview of OpenCV, focusing in particular on last summer's v3.0 release and the more recent v3.1 follow-on. According to Bradski, v3.0 was a major overhaul, bringing OpenCV up to modern C++ standards and incorporating expanded support for 3D vision and augmented reality. The newer v3.1 release introduces support for deep neural networks, as well as new and improved algorithms for important functions such as calibration, optical flow, image filtering, segmentation and feature detection. In addition to providing insight into how developers can utilize today's OpenCV to maximum advantage for vision research, prototyping, and product development, Bradski also offered a sneak peek into where the open-source library (and the foundation that manages it) are headed, including a new focus on smart cameras and other embedded applications, as well as on functions for processing light fields.
Here's a preview:
Next week I'll be back with more discussion on a timely computer vision topic. Until then, after watching the above videos, I encourage you to peruse more of the unique computer vision content both on the Embedded Vision Alliance website and at the organization's YouTube channel. And of course, as always, I welcome your comments.
Editor-in-Chief, Embedded Vision Alliance