Convolutional neural networks (CNNs) and other deep learning techniques are rapidly becoming key enabling technologies for applications requiring object recognition and other computer vision capabilities. I first discussed the topic of deep learning in a March column; a follow-up article showcased the Embedded Vision Summit keynote "Large-Scale Deep Learning for Building Intelligent Computer Systems" from Google Senior Fellow Jeff Dean. I'd like to devote this week's column to additional deep learning insights you can gain from three other Embedded Vision Summit talks.
The first presentation, "How Deep Learning Is Enabling Computer Vision Markets," was delivered by Bruce Daley, Principal Analyst at market research firm Tractica. According to Daley, deep learning algorithms are a key tool for automating and accelerating the analysis of large data sets generated by a proliferation of sensors (including image sensors) in connected devices, and will find a home in a multitude of business applications. Daley's presentation highlights the market factors, technology issues, use cases, and industry ecosystem that are developing around the use of deep learning for computer vision. Based on Tractica's ongoing research, the presentation provides strategic insights as well as detailed quantification of the various market opportunities. Here's a preview:
The next talk, "TensorFlow: Enabling Mobile and Embedded Machine Intelligence," came from Pete Warden, research engineer at Google. Warden was formerly Chief Technology Officer at Jetpac, which was acquired by Google in 2014 for its deep learning technology optimized to run on mobile and embedded devices. In his presentation, Warden discusses how Google uses the TensorFlow software framework to deploy deep learning in products tailored for these devices, such as Google Translate, Google Photos, and OK Google. He then discusses why the company open-sourced the framework and talks about TensorFlow's focus on supporting low-power applications using approaches such as specialized hardware and small data types. Here's a preview:
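To make the "small data types" idea Warden mentions more concrete: one widely used approach for low-power deployment is to quantize a network's 32-bit floating-point weights down to 8-bit integers, cutting memory and bandwidth by roughly 4x at a modest accuracy cost. The sketch below is a minimal, framework-free illustration of symmetric linear quantization in plain NumPy; it is not TensorFlow's actual implementation, and the function names are my own.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric linear quantization of float32 weights to int8.

    Maps the largest-magnitude weight to 127 and scales everything
    else proportionally. Returns the int8 values plus the scale
    factor needed to approximately recover the originals.
    """
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

# Example: each value shrinks from 4 bytes (float32) to 1 byte (int8).
w = np.array([[0.5, -1.2], [0.03, 0.9]], dtype=np.float32)
q, scale = quantize_int8(w)
w_approx = dequantize(q, scale)
```

The rounding error per weight is bounded by half the scale factor, which is why quantization works well for inference even though training usually still uses floating point.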
Finally, there's "The Road Ahead for Neural Networks: Five Likely Surprises," delivered by Dr. Chris Rowen, Chief Technology Officer for the IP Group at Cadence Design Systems. Rowen is also an IEEE Fellow and was a co-founder of MIPS as well as the founder of Tensilica. His talk examines the fundamental capabilities and limitations of neural network computing, especially for real-time and embedded systems, and paints a picture of how this technology will likely evolve over the next decade. In particular, Rowen explores the impacts of neural networks on industry and forecasts a series of developments, including unexpected changes in business models for data and training, in distribution of neural networks between the cloud and edge devices, in new types of hardware, and in novel software. Here's a preview:
I also encourage you to take a look at a recent interview with Dr. Rowen on deep learning and other technology topics. And for more in-depth understanding, consider attending the hands-on tutorial "Deep Learning for Vision Using CNNs and Caffe," taking place September 22, 2016 in Cambridge, Massachusetts. This full-day tutorial, presented by the primary Caffe developers from the Berkeley Vision and Learning Center, focuses on convolutional neural networks for vision and the Caffe deep learning framework. It takes participants from an introduction to the theory behind convolutional neural networks to their actual implementation, and includes hands-on labs using the Caffe open source framework. A discounted registration fee of $600 is available only until July 8; for more tutorial details, and to register, please visit the event page.
Editor-in-Chief, Embedded Vision Alliance