How to Deploy Deep Learning Neural Networks in Machine Vision

There are numerous deep-learning methods available for developing machine vision systems or applications. Using commercial software tools, even non-experts can use deep learning.
April 30, 2025

What you will learn:

  • About the different methods of deep learning, such as anomaly detection or image segmentation, that are suitable for automation and machine vision applications.
  • The situations in which it makes sense to use deep learning in machine vision.
  • How to get started with deploying deep learning in machine vision applications.

In discussions of artificial intelligence or deep learning, you’ll often hear terms such as neural networks, black box, and labeling. These concepts are often difficult for the layperson to understand. People also end up believing that they need solid programming skills to truly master the technology and use it sensibly. Unfortunately, this impression obscures the potential that the technology offers for machine vision, and thus for automating production. Deep learning is not reserved for computer scientists or programmers.

Let's Start at the Beginning: What is Deep Learning?

As a subset of machine learning, deep learning is based on multi-layered neural networks that are capable of realistically emulating complex structures and processes of the human brain and making independent decisions. During a comprehensive training process, deep learning models learn to identify certain patterns and relationships by analyzing data.

So much for the theoretical side. But why is the technology so successful in the area of machine vision? It’s because machine vision produces an extremely large amount of image data, which forms the perfect basis for the effective training of neural networks. That's the technical side.

At the same time, users also benefit from the technology. The recognition rates that deep learning can deliver reach new levels of quality. This also allows entirely new applications to be automated based on machine vision. Deep learning is a development that gives new impetus to machine vision as a whole.

The number of people who find the use of deep learning worthwhile is therefore growing steadily. Many companies, both large and small, are considering the idea of introducing artificial intelligence or deep learning. Frequently, however, they have certain reservations that keep them from taking this step. But using the technology is not as complicated as they might think. There are also tools that make it easier to work with deep learning.

The Right Deep Learning Method for Each Application

When it comes to implementation, the most important question is this: What exactly do you want to automate? The range of deep learning methods available to integrators, plant operators, and machine manufacturers (in short, to everyone who deals with this question) is growing all the time.

Anomaly detection

Anomaly detection allows defects to be recognized very quickly and easily, which makes fault inspection in the quality management process even more efficient. One particular advantage is that the technology requires much less training data than conventional deep learning methods. Indeed, 20 to 100 images are all you need for a complete training session. What's more, images of good parts alone are sufficient for anomaly detection, enabling you to generate a training dataset much faster. An anomaly detection model trained on good images is then able to detect structural deviations from the training images, i.e., anomalies. This allows you to detect faults whose appearance was not known in advance.
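The principle of training on good images only can be illustrated with a deliberately simple statistical stand-in. The sketch below is not a neural network and not any vendor's method: it assumes tiny grayscale images given as nested lists, learns a per-pixel mean and spread from good samples, and scores a new image by its largest deviation. The function names and the sample data are hypothetical.

```python
from statistics import mean, pstdev

def fit_good_model(good_images):
    """Learn per-pixel mean and spread from defect-free images only."""
    height, width = len(good_images[0]), len(good_images[0][0])
    model = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [img[y][x] for img in good_images]
            # Floor the spread so near-constant pixels do not blow up the score.
            row.append((mean(values), max(pstdev(values), 1.0)))
        model.append(row)
    return model

def anomaly_score(model, image):
    """Return the largest per-pixel deviation, in units of the learned spread."""
    return max(
        abs(image[y][x] - mu) / sigma
        for y, row in enumerate(model)
        for x, (mu, sigma) in enumerate(row)
    )

# Three made-up 2x2 "good" images with slight natural variation.
good = [
    [[100, 101], [99, 100]],
    [[101, 100], [100, 99]],
    [[99, 100], [100, 101]],
]
model = fit_good_model(good)
score_good = anomaly_score(model, [[100, 100], [100, 100]])
score_defect = anomaly_score(model, [[100, 100], [100, 160]])  # bright spot
```

No defective image was needed to build the model, yet the image with the bright spot receives a far higher score than the good one.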

Global Context Anomaly Detection

Global context anomaly detection goes one step further. It can recognize entirely new anomaly variants, such as missing, deformed, or incorrectly arranged components. As a result, fault detection is no longer limited to structural defects but also covers logical anomalies. This paves the way for entirely new possibilities, such as the inspection of printed circuit boards in semiconductor manufacturing or the verification of printing.

Classification

Classification uses image data to assign objects to a specific category or class, such as a good part or a bad part. This makes it possible to determine a class with a certain degree of probability for each individual image.
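In many classifiers, the probability per class comes from a softmax step applied to the network's raw output scores. The following is a minimal sketch under that assumption; the scores and class names are made up for illustration and do not come from a real model.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, class_names):
    """Return the most probable class together with its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical raw scores for one image, one score per class.
label, confidence = classify([2.0, 0.5], ["good part", "bad part"])
```

Returning the probability alongside the class lets a downstream step treat uncertain results differently, for example by routing them to manual inspection.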

Object detection

Object detection, a deep-learning-based technology, localizes objects and determines their class. The process is able to recognize multiple instances of different object classes, including their positions in the image.

Segmentation

There are two types of segmentation based on deep learning: semantic segmentation and instance segmentation.

Semantic segmentation localizes trained objects, structures, and faults with pixel precision. During this process, a certain class is assigned to each pixel in the image. By teaching the model based on training data, a specific class can be predicted with a high degree of probability for each pixel in a new image. This approach makes it possible to perform inspection tasks that were previously impossible or only feasible with considerable programming effort.
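The per-pixel class assignment can be sketched as follows, assuming a model has already produced one probability per class for every pixel. The probability values, the class names, and both helper functions are hypothetical.

```python
def segment(prob_maps, class_names):
    """Assign each pixel the class with the highest predicted probability.

    prob_maps[y][x] is a list holding one probability per class.
    """
    return [
        [class_names[max(range(len(p)), key=p.__getitem__)] for p in row]
        for row in prob_maps
    ]

def pixels_per_class(label_map):
    """Count how many pixels were assigned to each class."""
    counts = {}
    for row in label_map:
        for name in row:
            counts[name] = counts.get(name, 0) + 1
    return counts

classes = ["background", "scratch"]
# Made-up model output for a 2x3 image.
probs = [
    [[0.90, 0.10], [0.80, 0.20], [0.30, 0.70]],
    [[0.95, 0.05], [0.40, 0.60], [0.20, 0.80]],
]
label_map = segment(probs, classes)
counts = pixels_per_class(label_map)
```

The pixel count per class is one simple way to turn the label map into an inspection result, such as the area of a detected fault.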

Instance segmentation combines the benefits of semantic segmentation with those of object detection. This type of segmentation assigns objects to different classes in a pixel-precise manner while keeping individual objects separate from one another. The technology is particularly helpful in applications where objects are very close together, touch each other, or overlap. Typical applications include gripping randomly arranged objects from bins ("random bin picking") and identifying and measuring naturally grown structures.

Edge extraction

This technology is a relatively new and unique method for robustly extracting edges with the aid of deep learning. It reliably extracts only the desired edge from a large number of edges visible in an image. It’s also able to robustly recognize edges in low-contrast and noisy situations, which permits the extraction of edges that cannot be identified with conventional edge recognition filters. The technology is generally used in combination with rule-based machine vision approaches.

Deep OCR

OCR (optical character recognition) can be used to identify and classify text. When based on deep learning algorithms, the technology is also known as deep OCR. It can deliver robust results, even under challenging conditions, such as when identifying slanted text, distorted letters or characters printed on or etched into reflective surfaces, or highly textured colored backgrounds. With deep OCR, characters are grouped automatically, enabling words to be identified. This increases recognition performance, for example by avoiding misinterpretations of characters with a similar appearance.

Deep Counting

With deep counting, you can localize and count a large number of objects very quickly. The technology is not only guided by the shape of the parts, but it also incorporates other distinctive features, such as color, pattern, or texture, using the deep learning approach. One particular benefit is that deep counting achieves very robust results even when objects are made from highly reflective and amorphous material. It can also be used to reliably record a vast quantity of objects that touch each other or partially overlap. The technology is therefore an excellent choice for counting a wide range of products in the food and beverage industry as well as for the correct and complete packaging of small items such as nuts or bolts.

Where Does Deep Learning Make the Most Sense?

Deep learning opens up entirely new fields of application and makes machine vision accessible to more people, including those who are not very familiar with machine vision or who do not wish to program algorithms themselves. AI systems can generally be set up with the user's own image files. The advantage is that, thanks to the training of the neural networks, AI often delivers more robust results than classic algorithms. For example, traditional matching works very well when all objects look exactly the same. But AI really shines when the data has a lot of variation, as can happen when changes occur naturally, such as in fruit and vegetables. In such cases, it’s difficult to clearly define the classic features ahead of time: When is a surface good, and when is it not? Another use case applies to manufacturers who have very high quality standards.

Some companies have virtually no production errors and therefore also no error images that could be used to set up a rule-based system. It's possible for a defect to occur only once in every ten thousand objects, and companies don't know beforehand exactly what it will look like. AI-based anomaly detection can help. The technology doesn't have to know what a bad part looks like ahead of time, since it’s trained only on the basis of good parts. Such applications would not have been possible in the past with rule-based programming.

However, the ideal way to achieve the perfect machine vision application is to combine deep learning algorithms with rule-based machine vision techniques. One such application would look something like this: Companies use AI for pre-classification in order to identify a region of interest, within which a highly precise measurement can be taken using traditional methods. This makes the application as a whole faster and the results more accurate.
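The division of labor in such a hybrid application can be sketched as follows. The deep learning step is only assumed here: its output is represented by a hard-coded region of interest, and the rule-based step is a simple threshold measurement. The function name, the ROI format, and the image values are all hypothetical.

```python
def measure_width(image, roi, threshold=128):
    """Rule-based step: longest horizontal run of bright pixels inside the ROI.

    roi is (top, left, bottom, right) with exclusive bottom and right,
    as it might be reported by a deep learning model.
    """
    top, left, bottom, right = roi
    widest = 0
    for y in range(top, bottom):
        run = 0
        for x in range(left, right):
            # Extend the current run on a bright pixel, otherwise reset it.
            run = run + 1 if image[y][x] >= threshold else 0
            widest = max(widest, run)
    return widest

image = [
    [10, 10, 10, 10, 10, 10],
    [10, 10, 200, 210, 205, 10],
    [10, 10, 198, 202, 10, 10],
    [250, 10, 10, 10, 10, 10],  # bright clutter outside the ROI
]
roi_from_model = (1, 1, 3, 6)  # assumed output of the AI pre-classification
width_px = measure_width(image, roi_from_model)
```

Because the measurement only looks inside the region of interest, the bright clutter in the last row cannot disturb it, and fewer pixels have to be processed.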

How to Get Started with Using Deep Learning in Machine Vision

To run an application, you first need to have a classic machine vision setup consisting of a camera, appropriate illumination, and suitable computer hardware, such as an industrial PC equipped with a high-performance CPU or (even better) a GPU. But at the heart of any machine vision setup lies powerful machine vision software available from a variety of companies including MVTec.

Optimal Image Data Preparation for Training

To use deep learning applications, you first have to label the training images. The goal of labeling is to note the desired output of the AI model in the image. Such information can be the image class or the object's position within the image. Software that provides an intuitive user interface makes labeling very easy even for beginners and can be used without any programming skills. When taking the next step (preparing the data), keep in mind that the image data must be available in an optimally prepared form.
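What a label records can be sketched as a small data structure. The layout below is purely illustrative and does not correspond to the file format of any particular labeling tool; all class, field, and file names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class ObjectLabel:
    class_name: str
    box: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

@dataclass
class ImageLabel:
    file_name: str
    image_class: Optional[str] = None  # desired output for classification
    objects: List[ObjectLabel] = field(default_factory=list)  # object positions

def class_histogram(labels):
    """Count labeled images per class as a quick check before training."""
    counts = {}
    for item in labels:
        if item.image_class is not None:
            counts[item.image_class] = counts.get(item.image_class, 0) + 1
    return counts

dataset = [
    ImageLabel("part_001.png", image_class="good"),
    ImageLabel("part_002.png", image_class="good"),
    ImageLabel("part_003.png", image_class="bad",
               objects=[ObjectLabel("scratch", (12, 30, 48, 41))]),
]
histogram = class_histogram(dataset)
```

In practice, a labeling tool with a graphical interface creates and stores this kind of information, so users do not have to write it by hand.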

A particularly practical aspect is that good images are all that is needed to train certain deep learning technologies, so-called "unsupervised" methods such as anomaly detection. These images are easy to obtain. Moreover, only 20 to 100 good images are required, depending on the condition of the object to be inspected. The training process itself takes place at the press of a button.

A Glimpse into the Deep Learning Black Box

One criticism of deep learning is the lack of transparency in the decision-making processes. While the latest developments, described below, can’t completely illuminate what goes on inside this black box, they do provide certain insights into the inner workings of the neural networks. There are tools that use a heat map to highlight the image areas relevant for decision-making. This is a way to track or influence the behavior of the deep learning algorithms.

Thanks to Out-of-Distribution Detection (OOD) technology, you can identify unforeseen behavior caused by incorrect classifications during operation and take the appropriate measures. When using a deep learning classifier, the system generally assigns unknown objects to a learned class. This can lead to problems, for example, when dealing with error types or foreign bodies that have never been encountered before. The OOD feature alerts the user when an object is classified that was not included in the training data. This could be a bottle with a green label, for example, if the system has only been trained on bottles with red or yellow labels. In this case, the message "Out of Distribution" is displayed along with an OOD score indicating how much the object deviates from the trained classes.
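The bottle example can be mimicked with a toy nearest-prototype classifier. This is only an illustration of the idea, not how any commercial OOD feature is implemented: it assumes each bottle is reduced to the mean color of its label, and it uses the distance to the closest learned class as the OOD score. The prototypes, the threshold, and the function name are hypothetical.

```python
import math

# Assumed class prototypes: mean label color (R, G, B) seen during training.
CENTROIDS = {"red label": (200, 30, 30), "yellow label": (220, 210, 40)}

def classify_with_ood(feature, centroids=CENTROIDS, ood_threshold=80.0):
    """Return (class, OOD score, is_out_of_distribution)."""
    # A plain classifier always picks the closest learned class ...
    name, score = min(
        ((n, math.dist(feature, c)) for n, c in centroids.items()),
        key=lambda item: item[1],
    )
    # ... so the distance itself serves as a simple OOD score here.
    return name, score, score > ood_threshold

red = classify_with_ood((205, 35, 28))    # close to a trained class
green = classify_with_ood((40, 180, 50))  # never seen during training
```

Without the score, the green label would silently be reported as one of the trained classes; with it, the result can be flagged for review.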

It's also possible to influence the deep learning results with the help of the threshold value. For example, if the threshold value for anomaly detection is set very high, you get only OK results. If you set a lower threshold value, the system delivers correspondingly fewer OK results and thus fewer "false negatives," that is, defective parts that pass as OK, at the cost of more good parts being flagged. This allows you to flexibly and individually adjust how sensitively the model responds to irregularities.
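The effect of the threshold can be made concrete with a small sweep over made-up anomaly scores. The sketch assumes that a part is reported as OK when its score lies below the threshold; the scores, the ground truth, and the function name are hypothetical.

```python
def evaluate(samples, threshold):
    """Count outcomes for one threshold.

    samples: list of (anomaly_score, is_defective) pairs.
    A part is reported OK when its score is below the threshold.
    """
    ok = sum(1 for score, _ in samples if score < threshold)
    missed = sum(1 for score, bad in samples if bad and score < threshold)
    false_alarms = sum(
        1 for score, bad in samples if not bad and score >= threshold
    )
    return {"ok": ok, "missed_defects": missed, "false_alarms": false_alarms}

# Three good parts and three defective parts with made-up scores.
samples = [(0.05, False), (0.10, False), (0.30, False),
           (0.25, True), (0.60, True), (0.90, True)]
results = {t: evaluate(samples, t) for t in (1.0, 0.5, 0.2)}
```

With these numbers, a threshold of 1.0 reports all six parts as OK and misses all three defects, 0.5 misses one, and 0.2 misses none but flags one good part. Where to set the value depends on which of the two error types is more costly in the application.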

Starting with Deep Learning: Best with Machine Vision

If companies want to take advantage of the many benefits of deep learning, they need well-thought-out strategies for the targeted implementation and long-term use of the technology because, like all AI methodologies, the topic is associated with a certain degree of complexity. Machine vision is proving to be a key technology in this context, one in which proven deep learning methods can be used efficiently and profitably.

About the Author

Ulf Schulmeyer

Ulf Schulmeyer is the product manager for MERLIC at MVTec Software GmbH (Munich, Germany). 

An engineer, Schulmeyer has held positions such as product manager and sales manager at Data Becker GmbH, a former publisher of books and software, and FRANZIS GmbH (Haar, Germany), a software company. He joined MVTec (Munich, Germany) in 2023 as product manager for MERLIC, machine vision software for beginners or non-experts that includes deep learning capabilities. His task is to further sharpen the profile of the no-code software.

 
