Fundamental applications of deep learning networks

The following is part two of a two-part series of guest blogs from Johanna Pingel, Product Marketing Manager, MathWorks


View part one here.

Deep neural networks are essential for image classification, and they are increasingly used for other tasks as well. Deep networks provide the accuracy and processing speed to let you perform complex analyses of large data sets without having to be a deep learning domain expert. Traditional techniques in signal processing and image processing are still relevant, yet they can be augmented or replaced by a deep neural network.

Regardless of whether the input data is text, time series, or images, deep learning networks like convolutional neural networks (CNNs) and long short-term memory (LSTM) networks make it possible to process data in new ways. This article takes a closer look at tasks traditionally handled with domain-specific techniques, and discusses how tools, including MATLAB, combined with deep learning networks can increase efficiency and accuracy.

Techniques for image processing

Traditional image processing techniques are often used to enhance images, for example by reducing noise or blur introduced by a camera or sensor. In signal processing, similar techniques reduce unwanted background noise or variability in the signal that could confuse a downstream algorithm.

The question then becomes: can we use a deep learning method to clean and enhance our data and get better results than with traditional methods? The following example provides a high-level overview of deep learning techniques for image enhancement.

Image enhancement example

Image denoising and deblurring have been around for many years, with classic techniques relying on wavelets and filters to remove noise and blur. A deep learning model is instead trained to remove noise by being fed thousands of noisy images and thousands of clean images. The model “learns” to identify and remove the noise in future images, provided the noise in new images is consistent with the noise in the training images.
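To make the comparison concrete, here is a minimal sketch in Python (standard library only) rather than the MATLAB workflow itself. It builds a hypothetical clean image, adds Gaussian noise to form the kind of noisy/clean pair a denoising network could be trained on, and applies a classical 3x3 mean filter as a baseline. The image, the noise level, and all function names are assumptions made for illustration; this is not the pretrained network used in the example that follows.

```python
import math
import random

def add_gaussian_noise(image, sigma, rng):
    """Return a copy of `image` with zero-mean Gaussian noise (std `sigma`) added."""
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]

def mean_filter_3x3(image):
    """Classical baseline: average each pixel with its 3x3 neighborhood.
    Neighbors that fall outside the image are ignored."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            total, count = 0.0, 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        total += image[rr][cc]
                        count += 1
            row.append(total / count)
        out.append(row)
    return out

def mse(a, b):
    """Mean squared error between two images of equal size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB (higher means closer to the reference)."""
    error = mse(reference, estimate)
    return float("inf") if error == 0 else 10.0 * math.log10(peak ** 2 / error)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical clean image: a smooth horizontal gradient with values in [0, 1].
clean = [[c / 31 for c in range(32)] for _ in range(32)]
noisy = add_gaussian_noise(clean, sigma=0.1, rng=rng)

# A (noisy, clean) pair like this is the kind of sample a denoiser could learn from.
training_pair = (noisy, clean)

filtered = mean_filter_3x3(noisy)
print(f"PSNR of noisy image:    {psnr(clean, noisy):.1f} dB")
print(f"PSNR of filtered image: {psnr(clean, filtered):.1f} dB")
```

On a smooth image like this one the mean filter helps, but on images with fine detail it also blurs edges, which is one reason a learned denoiser can be attractive.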

The following example explores how a pretrained image denoising network can be applied to a set of images containing Gaussian noise.

Figure 1

Figure 1. This figure shows the denoising workflow using a pretrained denoising network. The following example walks through this workflow to remove Gaussian noise from an image. © 1984-2019 The MathWorks, Inc.

Figure 2

Figure 3. Original non-noisy image (left) and denoised image (right). © 1984-2019 The MathWorks, Inc.

Let’s focus on a few details.

Denoising has left a few artifacts, as shown by the two views in Figure 3. This result might be acceptable, or the image might need further processing, depending on the application.

A pretrained network such as DnCNN can only remove the type of noise on which it has been trained. With tools like MATLAB and Deep Learning Toolbox, engineers and scientists have more flexibility in training and can create fully custom denoising neural networks.

This approach also lets you incorporate other image enhancement techniques like deblurring or color adjustment.

Techniques for signal processing

For signal processing, engineers and scientists use the techniques of deep learning and machine learning in applications such as signal denoising, speech classification, and voice identification. They use deep learning architectures in two popular ways:

  • CNN architecture. Since the input to a CNN is traditionally an image, engineers working with signal data can convert their signals into “images” using transformations such as wavelets and spectrograms. Each image is then fed to the CNN as input, and the network is trained on many samples of this data.
  • LSTM architecture. A long short-term memory (LSTM) network can take signal data directly as input. The key factor is that the network retains information from earlier in the sequence and uses it when making later predictions. This technique is highly relevant for speech and audio. A simplified speech example demonstrates the idea. Fill in the blank: “I was born in France, I speak ____.” An algorithm must be able to remember previous words to complete the sentence. (For a refresher on deep learning basics, see the first article in this series: Fundamentals of deep neural networks.)
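The memory behavior described for LSTMs can be illustrated with a single-unit LSTM cell written in plain Python. This is only a sketch: the weights below are hand-picked (hypothetical, not trained) so that the cell stores a value when it sees a nonzero input and then holds on to it, much as a model must remember “France” until it reaches the blank. The parameter names and the `run_sequence` helper are assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell with scalar input and state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # cell state: keep old memory, add new information
    h = o * math.tanh(c)     # hidden state: what the cell exposes at this step
    return h, c

def run_sequence(xs, p):
    """Feed a whole sequence through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, p)
    return h

# Hand-picked, hypothetical weights (a real network would learn these):
# - forget gate stays near 1, so the cell state decays very slowly
# - input gate opens only when the input is nonzero
PARAMS = {
    "wf": 0.0, "uf": 0.0, "bf": 5.0,
    "wi": 10.0, "ui": 0.0, "bi": -5.0,
    "wo": 0.0, "uo": 0.0, "bo": 5.0,
    "wg": 5.0, "ug": 0.0, "bg": 0.0,
}

# The informative input appears only at the first step.
h_seen = run_sequence([1.0, 0.0, 0.0, 0.0], PARAMS)
h_blank = run_sequence([0.0, 0.0, 0.0, 0.0], PARAMS)
print(f"final state after seeing the cue: {h_seen:.3f}")
print(f"final state without the cue:      {h_blank:.3f}")
```

The two sequences are identical for the last three steps, yet the final states differ, because the cell state carries the first input forward.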

LSTM and CNN techniques have allowed those involved with signal processing to take advantage of deep learning in new ways, and they offer exciting results in accuracy and speed.

Signal classification example

The example below classifies speech audio files into corresponding classes of words, a process that resembles image classification for a specific set of signals. Here, a spectrogram is used as a 2D representation of each 1D audio signal (Figure 4). The 2D representation can then be used as input to a CNN, just as a “real image” would be.

Figure 4

Figure 4. Original audio signals (top) and corresponding spectrograms (bottom). © 1984-2019 The MathWorks, Inc.

In this example, we wish to identify simple commands, such as “on,” “off,” “up,” “down,” “yes,” and “no” – an application similar to a simplified speech assistant device.

The spectrogram function is one way of converting an audio file into its time-localized frequency content. Since speech is a specialized type of audio, it’s best to use speech-specific functions that target the frequency ranges where speech is most relevant.
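As a rough illustration of what a spectrogram computes, the sketch below implements a plain short-time Fourier transform in Python using only the standard library. It is not MATLAB's spectrogram function and not a speech-specific transform; the sample rate, frame length, hop size, and test tones are all assumptions chosen so the result is easy to read.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, assumed for this illustration

def hann(n_samples):
    """Hann window, used to taper each frame before the transform."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]

def spectrogram(signal, frame_len, hop):
    """Return a list of frames; each frame is a list of DFT magnitudes
    for bins 0 .. frame_len // 2 (time along rows, frequency along columns)."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

def peak_bin(mags):
    """Index of the strongest frequency bin in one frame."""
    return max(range(len(mags)), key=mags.__getitem__)

def tone(freq_hz, n_samples):
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

# Hypothetical 1D "audio": a 1000 Hz tone followed by a 2000 Hz tone.
signal = tone(1000, 256) + tone(2000, 256)
spec = spectrogram(signal, frame_len=64, hop=32)

hz_per_bin = SAMPLE_RATE / 64
print("frames x bins:", len(spec), "x", len(spec[0]))
print("dominant frequency, first frame:", peak_bin(spec[0]) * hz_per_bin, "Hz")
print("dominant frequency, last frame: ", peak_bin(spec[-1]) * hz_per_bin, "Hz")
```

The resulting 2D grid of magnitudes is what would be treated as an image. A practical implementation would use an FFT and, for speech, typically a perceptually motivated frequency scale.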

As in an image classification problem, training data should be distributed evenly between the classes of words. To reduce false positives, we add a category for words likely to be confused with the intended categories. For example, words like “mom” and “dawn” can be included because they sound similar to the word “on,” which we want to recognize. The CNN doesn’t need to know what these extra words are, only that they aren’t the words it should recognize. The network then needs to be defined and trained on this data.

Once the model is trained, it can classify the spectrogram image of a new audio clip into the appropriate category (Figure 5).
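The labeling idea above can be sketched in a few lines of Python. The file names and the tiny data set are hypothetical, as is the “unknown” label name; the point is only to show how confusable words are folded into one extra category and how class balance can be checked before training.

```python
from collections import Counter

# The commands the classifier should recognize (from the example above).
COMMANDS = {"on", "off", "up", "down", "yes", "no"}

def to_training_label(word):
    """Map a spoken word to its class; anything else becomes 'unknown'."""
    return word if word in COMMANDS else "unknown"

def balance_ratio(counts):
    """Largest class size divided by smallest; 1.0 means perfectly balanced."""
    return max(counts.values()) / min(counts.values())

# Hypothetical recordings: (file name, spoken word).
recordings = [
    ("clip_001.wav", "on"), ("clip_002.wav", "off"),
    ("clip_003.wav", "up"), ("clip_004.wav", "down"),
    ("clip_005.wav", "yes"), ("clip_006.wav", "no"),
    ("clip_007.wav", "mom"), ("clip_008.wav", "dawn"),
    ("clip_009.wav", "on"), ("clip_010.wav", "yes"),
]

labels = [to_training_label(word) for _, word in recordings]
counts = Counter(labels)

for label, n in sorted(counts.items()):
    print(f"{label:8s} {n}")
print("balance ratio:", balance_ratio(counts))
```

A ratio far above 1.0 would suggest collecting more examples for the smaller classes, or subsampling the larger ones, before training.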

Figure 5

Figure 5. Classification result for the word “yes.” © 1984-2019 The MathWorks, Inc.

The examples above show two ways to apply deep learning networks beyond image classification: image denoising and signal classification. Of course, there are many more new and exciting ways to apply deep learning to your application. To learn more, email me at deep-learning@mathworks.com.

 
