Artificial Intelligence poised for growth in embedded vision and image processing applications
Developers increasingly apply deep learning and neural networks in ADAS, biometric and gesture-recognition applications.
Artificial intelligence (AI) has become a very popular expression in recent years. The last century was rich in innovation, especially in mathematics, which raised hopes of building intelligent machines, but technical and conceptual gaps stalled the various initiatives. Today, however, thanks to tremendous progress over the past twenty years in computational capacity, data accumulation and mathematical tools, all at reduced cost, we can actually see the birth of AI.
To make a machine capable of understanding the world around it, technology has taken inspiration from biology. Some 80% of the information that enables humans to locate themselves in, and interact with, their environment passes through their eyes.
Figure 1: Yole’s Embedded Image and Vision Processing report highlights revenue growth for IP and silicon companies and provides forecasts for common vision processors.
Much of the research in AI has therefore focused on the ability to analyze images from vision systems. The other main inspiration from biology is the mathematical structure that allows the machine to analyze these images: the artificial neural network, a much-simplified structural imitation of the human brain.
There is a multitude of neural networks, differing mainly in the topology of the connections between neurons, the aggregation function used, the threshold (activation) function, and the method used to train them, most commonly backpropagation. Networks whose layers apply convolution filters are called convolutional neural networks (CNNs) and are the usual choice for image analysis. Networks with many layers belong to the field of artificial intelligence called ‘deep learning,’ whose workflow breaks down into two parts: training and inference.
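The aggregation and threshold functions mentioned above can be illustrated with a single artificial neuron. The following is a minimal sketch in Python and is not taken from the Yole report; the function names, the weights and the bias are hypothetical values chosen by hand for illustration.

```python
import math


def weighted_sum(inputs, weights, bias):
    """Aggregation function: combine all inputs into a single value."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def step(value):
    """Threshold function: the neuron fires (1) when the aggregate is >= 0."""
    return 1 if value >= 0 else 0


def sigmoid(value):
    """A smooth alternative to the hard threshold, often used in deep learning."""
    return 1.0 / (1.0 + math.exp(-value))


def neuron(inputs, weights, bias, activation=step):
    """One artificial neuron: aggregate the inputs, then apply the threshold."""
    return activation(weighted_sum(inputs, weights, bias))


# Hypothetical weights and bias, chosen by hand (no training involved here).
print(neuron([1.0, 0.0], [0.6, 0.6], -1.0))  # aggregate is negative, prints 0
print(neuron([1.0, 1.0], [0.6, 0.6], -1.0))  # aggregate is positive, prints 1
```

A network is obtained by connecting many such neurons; the topology of those connections is what distinguishes one family of networks from another.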
Most neural networks are trained with an algorithm, supervised or unsupervised, that varies widely with the goal to be achieved. The algorithm modifies the synaptic weights according to a data set presented at the input of the network. The goal of this training is to enable the neural network to “learn” from examples.
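As an illustration of how a training algorithm modifies weights from examples, here is a minimal sketch using the classic perceptron learning rule on a single neuron. It is a deliberately simple stand-in, not the backpropagation used for deep networks and not a method described in the Yole report; the data set (the logical AND function), the learning rate and the number of epochs are assumptions chosen for illustration.

```python
def predict(weights, bias, inputs):
    """Inference: weighted sum of the inputs followed by a hard threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0


def train(samples, epochs=10, learning_rate=1.0):
    """Supervised training: nudge the weights whenever the output is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Move each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Illustrative training set: inputs paired with the expected output (logical AND).
AND_SAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train(AND_SAMPLES)
outputs = [predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES]
print(outputs)  # [0, 0, 0, 1]
```

Once `train` has returned, only `predict` and the learned weights are needed to use the network, which mirrors the split between training in a datacenter and inference on the device.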
If the training is properly performed, the network will provide outputs very close to the expected outputs for the training data set and, ideally, for new inputs of the same kind. An inference engine is a software algorithm that simulates deductive reasoning; in deep learning, it is the trained neural network applied to new data. This software is often embedded in the device.
Deep learning has been very successful in many segments over the past ten years. Image-based technologies include facial recognition, iris and gesture monitoring, object and free space detection, and more recently, behavioral recognition.
In terms of the market, the most spectacular progress can be seen in the automotive sector, where these technologies are used in ADAS (advanced driver assistance systems) for the detection of obstacles and the recognition of signs, traffic lights, cars, pedestrians and other objects. The images come from a bank of cameras arranged on and around the car, while the training is performed in datacenters on dedicated computing machines, and the inference algorithm is embedded either in an ECU (electronic control unit) in the case of semi-autonomous cars or in a complete computer in the case of robotic or fully autonomous cars (Figure 1).
Biometrics is another major segment where deep learning is widely used. Its algorithms are used to authenticate individuals; the latest Apple phone, the iPhone X, is a notable example thanks to its 3D facial recognition [1]. In surveillance and homeland security, facial recognition is used at border controls and in the production of identity papers, using specialized cameras.
Iris recognition based on deep learning is also increasingly used for authentication, with growing interest in bringing it to mobile devices. Behavioral recognition can be added to this segment; it is still in the research and development phase, but preliminary results are encouraging. Deep learning is also integrated in gesture recognition, though mainly in the entertainment segment: on-board computers in the car, gaming, commercial drone controls, etc.
The major players in each of these areas are well known: Google, Amazon, Facebook and Apple. Their investment in the AI field has been consistent over the last ten years. At Yole, we still expect a 50% CAGR (compound annual growth rate) until 2025, with revenues mainly focused on technologies using deep learning. Indeed, Yole expects almost 50% of the US $50 billion expected in 2025 to come from technologies using deep learning algorithms [2].
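For readers less familiar with compound annual growth rates, the arithmetic behind such forecasts can be sketched in a few lines of Python. Only the US $50 billion total and the "almost 50%" share come from the text above; the function names and the three-year example are hypothetical and are not figures from the Yole report.

```python
def project(start_value, annual_rate, years):
    """Value reached after compounding start_value at annual_rate each year."""
    return start_value * (1.0 + annual_rate) ** years


def cagr(start_value, end_value, years):
    """Compound annual growth rate linking start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the text: about US $50 billion in 2025,
# almost half of it from technologies using deep learning.
total_2025 = 50.0           # US $ billion
deep_learning_share = 0.5   # "almost 50%", rounded up for this sketch
print(total_2025 * deep_learning_share)  # 25.0

# Hypothetical example: a segment worth 1.0 that grows 50% a year for 3 years.
print(project(1.0, 0.5, 3))            # 3.375
print(round(cagr(1.0, 3.375, 3), 6))   # 0.5
```

At a 50% CAGR a market more than triples in three years, which gives a sense of how steep the expected growth is.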
AI development also cannot be dissociated from specialized hardware development. It is interesting to note that designers and builders of vision processors also provide a software layer via an embedded operating system and/or an SDK (software development kit).
This makes it much easier to implement software solutions and allows the hardware to be used to the best of its capabilities, but it also requires platform-specific development skills with tools such as embedOS from ARM (Cambridge, UK; www.arm.com), Jetson from NVIDIA (Santa Clara, CA, USA; www.nvidia.com), XSDK from Xilinx Inc. (San Jose, CA, USA; www.xilinx.com), and CDNN toolkit from CEVA (Mountain View, CA, USA; www.ceva-dsp.com).
Companies developing AI for an embedded system must take this imposed software layer into account when developing their solutions and design them to be compatible with different types of hardware.
As its momentum continues, AI for vision systems promises a bright tomorrow, both at the hardware level, with the arrival on the market of dedicated processors, and at the software level, with increasingly powerful algorithms that achieve very high precision in the recognition of objects, faces and gestures. The markets to follow are, first, the automotive market, with all the ADAS technologies providing a direct route to autonomy; second, the mobile market, with security systems for authenticating the individual (unlocking, payment); and then biometrics and its applications in the industrial sectors, surveillance, security and, to a lesser extent, the smart building and smart home segments. Investments, acquisitions and partnerships are numerous, and they promise to be substantial in the coming years, enough to expect rapid growth in revenue.
1. System Plus Consulting, a partner of Yole Développement, offers several reverse engineering and costing analyses of iPhone X components: ams ALS & Color Sensor; Infrared Dot Projector; STMicroelectronics NIR Camera; Broadcom AFEM-8072 Mid / High Band Front-End Module; Flood Illuminator & Proximity Sensor Module; Bosch 6-Axis MEMS IMU; Apple A11 Application Processor.
2. Source: Embedded Image and Vision Processing report, Yole Développement, 2017.
Dr. Yohann Tschudi is Software & Market Analyst, and a member of the MEMS & Sensors business unit at Yole Développement (Yole; Lyon-Villeurbanne, France; www.yole.fr).