Supplemental sensor technologies such as radar or LIDAR (light detection and ranging) will likely be included along with cameras in future vehicles, since computer vision doesn't do very well after dark or in poor weather situations.
Similarly, in explaining the value of vision-based perception for drones, I noted:
Consumer drones are also unlikely to be flown in heavy rain, thick fog and other challenging weather conditions [Editor's note: or after dark, for that matter].
While both statements are true in a general sense, I now realize that they may have been overly definitive. That's because earlier this week, Embedded Vision Alliance member company Movidius announced that it has partnered with thermal imaging technology provider FLIR Systems to bring computer vision capabilities to FLIR's latest thermal imaging camera core, called Boson.
Even when available visible light is insufficient to enable the identification of an object using a conventional camera, visual perception is still possible if the object emits infrared radiation, such as that coming from a device consuming electrical power, for example, or a motor running on hydrocarbon fuel...or a person or other warm-blooded animal. Infrared images even allow accurate face recognition, as study results announced by researchers at the Karlsruhe Institute of Technology last summer suggest.
Here's another example of conventional thinking turned upside down: When using vision algorithms to identify and track an object, one might expect that the higher the resolution of the source image containing the object, the better. And generally speaking, that’s correct. But not always. Past a certain point, added detail can confuse a computer vision algorithm. In addition, the higher the resolution of each frame, and the higher the frame rate, the greater the cost and power consumption required for image sensors, processors, memory, and mass storage. Toss in the incremental memory and processing demands of the increasingly popular deep learning algorithm options, and you've likely got a challenging design problem on your hands.
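To put rough numbers on that cost argument: the raw data rate coming off an image sensor scales with width, height, bytes per pixel, and frame rate. Here's a minimal sketch of the arithmetic; the function name and the example resolutions are my own illustrative choices, not figures from any particular product.

```python
def raw_data_rate(width, height, fps, bytes_per_pixel=3):
    """Uncompressed video data rate in bytes per second.

    Illustrative helper (hypothetical name): every pixel of every frame
    must be captured, moved, and processed, so the rate is a plain product.
    """
    return width * height * bytes_per_pixel * fps


if __name__ == "__main__":
    vga = raw_data_rate(640, 480, fps=30)        # 640x480, 24-bit color
    full_hd = raw_data_rate(1920, 1080, fps=30)  # 1920x1080, 24-bit color
    print(f"VGA:     {vga / 1e6:.1f} MB/s")
    print(f"Full HD: {full_hd / 1e6:.1f} MB/s")
    print(f"Ratio:   {full_hd / vga:.2f}x")
```

Under these assumptions, the jump from VGA to full HD at the same frame rate multiplies the raw data that must be stored and processed by 6.75, before any deep learning workload is added on top.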
Enter Song Han, a graduate student at Stanford University. As Han's recent presentation at an Embedded Vision Alliance meeting explained, his research has shown that it's possible to efficiently implement a convolutional neural network for object classification with high accuracy via the combination of a network model compression pipeline and an inference engine to accelerate this compressed model. The result is efficient enough to run exclusively from the on-chip RAM in SoCs, versus requiring larger external memory devices and arrays (along with associated battery-draining external-bus bandwidth). Other researchers are coming to similarly encouraging conclusions with respect to reducing required source image sizes, as recent technology journal coverage highlights. And in time, with further research and development attention, memory and processing requirements will inevitably continue to diminish.
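To give a flavor of what a model compression pipeline can involve, here's a minimal sketch. It assumes two commonly used steps, magnitude pruning (zeroing the smallest weights) followed by weight sharing (clustering the surviving weights so each is stored as a small index into a codebook). The stages of Han's pipeline aren't detailed here, so treat this as an illustration of the general idea and not as his implementation; all function names are hypothetical.

```python
import random


def prune_by_magnitude(weights, keep_fraction):
    """Zero out all but (roughly) the largest-magnitude fraction of weights.

    Ties at the threshold are all kept, so slightly more than the
    requested fraction can survive.
    """
    n_keep = max(1, int(len(weights) * keep_fraction))
    threshold = sorted((abs(w) for w in weights), reverse=True)[n_keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def nearest(codebook, w):
    """Index of the codebook entry closest to weight w."""
    return min(range(len(codebook)), key=lambda k: abs(w - codebook[k]))


def share_weights(weights, n_clusters, iterations=20):
    """Cluster nonzero weights with 1-D k-means.

    Returns (codebook, assignment); assignment holds one codebook index
    per weight, with -1 marking a pruned (zero) weight.
    """
    if n_clusters < 2:
        raise ValueError("n_clusters must be at least 2")
    nonzero = [w for w in weights if w != 0.0]
    lo, hi = min(nonzero), max(nonzero)
    # Spread the initial centroids evenly across the weight range.
    codebook = [lo + (hi - lo) * k / (n_clusters - 1) for k in range(n_clusters)]
    for _ in range(iterations):
        labels = [nearest(codebook, w) for w in nonzero]
        for k in range(n_clusters):
            members = [w for w, label in zip(nonzero, labels) if label == k]
            if members:  # leave an empty cluster's centroid unchanged
                codebook[k] = sum(members) / len(members)
    assignment = [-1 if w == 0.0 else nearest(codebook, w) for w in weights]
    return codebook, assignment


def reconstruct(codebook, assignment):
    """Rebuild approximate weights from the codebook and index list."""
    return [0.0 if i < 0 else codebook[i] for i in assignment]


if __name__ == "__main__":
    rng = random.Random(0)
    layer = [rng.gauss(0.0, 1.0) for _ in range(100)]  # stand-in for one layer
    pruned = prune_by_magnitude(layer, keep_fraction=0.25)
    codebook, assignment = share_weights(pruned, n_clusters=4)
    approx = reconstruct(codebook, assignment)
    print("surviving weights:", sum(1 for w in approx if w != 0.0))
    print("codebook:", [round(c, 3) for c in codebook])
```

The memory saving comes from storing only a short index per surviving weight plus a tiny codebook, instead of a full-precision value for every weight. A real pipeline would also retrain the network after each step to recover accuracy, which this sketch omits.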
If these two examples of recent advancements in deployable computer vision have whetted your appetite for more, I encourage you to attend the Embedded Vision Summit, which takes place in less than two weeks in Santa Clara, California. The Summit, an educational forum for product creators interested in incorporating visual intelligence into electronic systems and software, offers two days (and three tracks) of presentations full of vision technology, application and market insights on May 2 and 3, in conjunction with a Technology Showcase that gives you an opportunity to see demos and directly interact with the people developing these advances. And full-day technical workshops on May 4 further stimulate your imagination and enrich your expertise. Register without delay, as space is limited and seats are filling up!
Editor-in-Chief, Embedded Vision Alliance