How to Build an AI-Enabled Industrial Vision System

This article explores the architecture of AI-powered industrial inspection systems, emphasizing the importance of edge computing, dedicated hardware accelerators, and open standards to achieve real-time defect detection with low latency and enhanced data security.
Jan. 28, 2026
7 min read

Key Highlights

  • AI inspection systems leverage edge computing to process data locally, reducing latency and improving data security by minimizing network traffic.
  • The architecture includes high-resolution sensors, synchronized lighting, and deterministic communication protocols to ensure precise image capture and reliable data transfer.
  • Open computer-on-module standards like COM-HPC facilitate system scalability, easy upgrades, and integration into existing industrial setups.
  • AI accelerators such as Axelera's Metis AIPU enable efficient, high-speed inference, supporting inspection rates of more than 60 parts per second with low latency.
  • Effective thermal management and power optimization are essential for maintaining system reliability and performance in demanding industrial environments.

Industrial inspection systems are increasingly dependent on AI, partly due to the ongoing shortage of skilled labor, and partly because of growing innovation pressure in industrial production. Inspection systems equipped with modern machine vision (AI vision) technologies can detect defects in real time, automatically reject damaged products, and forward data to analytical systems for process optimization.

But real-time analysis requires substantial computing power to handle and process massive amounts of data on local, on-site devices. This approach, known as edge AI, offers advantages over cloud-based solutions because data processing takes place directly at the machine, reducing latency. At the same time, data security is improved because sensitive information never leaves the local network, and network traffic is minimized because only relevant events are transmitted.

To provide the necessary computing performance, companies are deploying local edge computers that integrate dedicated AI hardware accelerators, enhancing inference performance and enabling the development of next-generation visual inspection systems.


Reference Architecture with Sensors, Cameras, and Edge Computer

To understand why an AI-based optical inspection system requires such high computational performance, the system’s overall architecture must be considered. A reference architecture for an AI vision system consists of sensors, actuators, the edge computing platform, and connected peripheral components.

Sensors and industrial cameras using GigE Vision or USB interfaces capture high-resolution images, with the choice of interface depending on the required bandwidth and cable length. The optical setup typically includes fixed-focus lenses, complemented by polarized lighting and a strobe controller synchronized with trigger signals to ensure optimal illumination during image capture. The triggering itself is initiated by signals from the programmable logic controller (PLC) on the production line, which detects the presence of an object. An encoder ensures that the exposure is triggered at precisely the right moment.

The captured image data is then transmitted directly to the edge computer via GigE Vision or USB. For signal transmission between the camera, lighting, and control systems, deterministic I/O interfaces are used, typically opto-isolated general-purpose input/output signals (GPIOs) for trigger and synchronization signals. Communication with the PLC takes place via industrial protocols such as Ethernet TSN, Modbus/TCP, or OPC UA, ensuring reliable, time-critical, and standardized integration into the production environment.
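For illustration, a minimal sketch of reporting an inspection result to a PLC-hosted OPC UA server is shown below, using the open-source asyncua Python library. The endpoint URL and node identifiers are placeholders, and a production system would keep the session open and match the PLC's data types rather than reconnecting per part.

```python
import asyncio

from asyncua import Client  # open-source OPC UA client library


async def report_result(part_id: int, ok: bool) -> None:
    """Write one inspection verdict to the PLC's OPC UA server."""
    # Endpoint and node IDs are placeholders; use the addresses defined in
    # the PLC's OPC UA information model. Strict servers may also require
    # an explicit ua.Variant type instead of plain Python values.
    async with Client(url="opc.tcp://192.168.0.10:4840") as client:
        await client.get_node("ns=2;s=Inspection.PartId").write_value(part_id)
        await client.get_node("ns=2;s=Inspection.OK").write_value(ok)


if __name__ == "__main__":
    asyncio.run(report_result(part_id=1234, ok=True))
```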

The edge computing platform receives camera data through the supported interfaces, losslessly buffers it, and routes it into a vision processing pipeline running on high-performance, scalable hardware with an integrated AI accelerator. Such accelerators are available in a variety of form factors, such as M.2 cards that connect to the host system via PCI Express.

On the software side, an operating system, such as Yocto, Ubuntu, or Windows IoT, along with appropriate drivers, camera software development kits (SDKs), or GStreamer, ensures reliable data acquisition. AI inference runtime binaries are integrated via the respective accelerator SDKs, orchestrating the models for image preprocessing, detection, classification, and post-processing.

At the application layer, the inspection logic correlates multi-camera results, logs performance metrics, and deterministically reports to the PLC whether a part is “OK” or “NOK,” or whether a fault condition exists. Such a condition triggers either a controlled stop or a predefined reaction of the production line.
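This correlation step can be sketched in a few lines of code. The example below is a hypothetical illustration rather than a description of any specific product: it assumes each camera pipeline reports a defect count and a confidence value, and it treats a missing or late result as a fault condition.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    OK = "OK"
    NOK = "NOK"
    FAULT = "FAULT"  # e.g., a camera failed to deliver a result in time


@dataclass
class CameraResult:
    camera_id: str
    defect_count: int
    confidence: float  # confidence of the defect classification, 0..1


def correlate(results: list[Optional[CameraResult]],
              min_confidence: float = 0.5) -> Verdict:
    """Combine per-camera results into one part-level verdict."""
    # A missing result is treated as a fault so the PLC can trigger a
    # controlled stop instead of passing an uninspected part.
    if any(r is None for r in results):
        return Verdict.FAULT
    # The part is NOK if any camera reports a sufficiently confident defect.
    if any(r.defect_count > 0 and r.confidence >= min_confidence
           for r in results):
        return Verdict.NOK
    return Verdict.OK
```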

The PLC receives OK/NOK and fault states via the selected communication protocol and transmits corresponding commands to the actuators. Feedback sensors, such as light barriers, monitor the ejection process of defective products, while the system issues alarms through a smart human-machine interface (HMI), signal lights, or SCADA systems. Fail-safe modes, PTP time synchronization, and delay tables ensure robust operation and enable controlled intervention in the event of communication errors or recurring anomalies.


Computer-on-Modules Ensure High Availability and Scalability

The compute module, which contains the processing core and connects to surrounding peripheral components, is the heart of the edge platform. Consequently, there are clear benefits to designing industrial vision systems around open computer-on-module (COM) standards such as COM-HPC or COM Express.

COM modules based on open standards offer high availability, integrate readily into existing systems, and scale across a wide range of performance classes. Thanks to long-term support across multiple vendors, future processor upgrades are streamlined: only the COM itself is replaced, while the application-specific carrier board onto which the COM is plugged remains in the system. This approach reduces development effort, design costs, and downtime, while accelerating time-to-market.

To meet the requirements of modern AI-based vision solutions, the carrier board must be designed to provide sufficient PCIe lanes from the COM module to accommodate, for example, an M.2 AI accelerator card. Additionally, enough Gigabit Ethernet interfaces should be provided to enable both camera communication and a deterministic TSN connection to the PLC. Isolated GPIOs for strobe triggers and 24 V-tolerant digital inputs for PLC signals must also be integrated.


Using COMs in Combination with an AI Accelerator

The SOM-COM-HPC-A-RPL from SECO is one example of a computer-on-module platform that meets these requirements—particularly in manufacturing environments: 

  • Designed in the high-performance COM-HPC form factor
  • Based on a 13th Gen Intel Core Processor from Intel’s long-term availability IoT roadmap
  • Supports up to 64 GB of DDR5 RAM
  • Offers multiple PCIe Gen3/Gen4 lanes and 2.5 GbE interfaces
  • Available in industrial temperature variants (-40 to +85°C) 

Through an M.2 interface implemented on a companion carrier board, developers can integrate an AI accelerator like the Axelera AI M.2 AI Inference Acceleration card, which incorporates the company’s Metis artificial intelligence processing unit (AIPU). It delivers efficient performance per watt for dedicated vision AI hardware acceleration and is operated via Axelera’s Voyager SDK, which offers native YOLOv8 support.

With this ecosystem, training datasets, and documentation, developers can begin application development. As a fallback, the Intel OpenVINO framework, an open-source AI toolkit, can be executed on the host CPU, with official notebooks (interactive executable tutorials) covering YOLOv8 model conversion for deployment on the Metis AIPU.
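A minimal sketch of that CPU fallback path is shown below, assuming a YOLOv8 model already exported to OpenVINO's IR format; the file path and the 640 x 640 input size are assumptions typical of such exports, and decoding of the raw detections (confidence filtering, non-maximum suppression) follows afterward.

```python
import numpy as np
import openvino as ov

# Load a YOLOv8 model previously exported to OpenVINO IR format.
# The path and the 640x640 input resolution are assumptions for this sketch.
core = ov.Core()
model = core.read_model("yolov8n_openvino_model/yolov8n.xml")
compiled = core.compile_model(model, device_name="CPU")

# Dummy input standing in for a preprocessed camera frame:
# NCHW float32, normalized to the 0..1 range.
frame = np.random.rand(1, 3, 640, 640).astype(np.float32)

result = compiled([frame])[compiled.output(0)]
print("raw detection tensor:", result.shape)  # post-processing (NMS) follows
```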


Comparing the AI Processing Pipeline of the Inspection System

To analyze and compare the expected AI vision system performance, it is essential to assess the efficiency of all components—from image acquisition to inference on the edge platform and the system’s response control. This ensures reliable quality decisions within the strict timing constraints of industrial production cycles.

In a system like this, total latency typically ranges between 7 and 15 milliseconds (ms), enabling inspection rates of over 60 parts per second: at 60 parts per second the available cycle time is roughly 16.7 ms, so even the upper end of the latency budget fits within a single cycle. By employing batch processing or multi-stream inference, processing efficiency can be further increased for accurate and timely decisions even at higher throughput rates.

The processing pipeline of an AI vision system is designed to maintain consistently low latency while achieving high detection accuracy. To start, frames are captured via GStreamer v4l2src or the camera SDK and transferred into system RAM within approximately 2 to 4 ms.
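As a rough illustration of such a capture stage, the following Python sketch builds a v4l2src-to-appsink pipeline with GStreamer's bindings; the device path, resolution, and pixel format are placeholders for whatever the actual camera and driver expose.

```python
import gi

gi.require_version("Gst", "1.0")
from gi.repository import GLib, Gst

Gst.init(None)

# v4l2src feeding an appsink so frames land in system RAM for the vision
# pipeline. Device, resolution, and format are placeholders.
pipeline = Gst.parse_launch(
    "v4l2src device=/dev/video0 ! videoconvert ! "
    "video/x-raw,format=BGR,width=1920,height=1080 ! "
    "appsink name=sink emit-signals=true max-buffers=2 drop=true"
)
sink = pipeline.get_by_name("sink")


def on_new_sample(appsink):
    # Called from the streaming thread for every captured frame.
    sample = appsink.emit("pull-sample")
    buffer = sample.get_buffer()
    ok, info = buffer.map(Gst.MapFlags.READ)
    if ok:
        frame_bytes = bytes(info.data)  # hand off to preprocessing/inference
        buffer.unmap(info)
    return Gst.FlowReturn.OK


sink.connect("new-sample", on_new_sample)
pipeline.set_state(Gst.State.PLAYING)
GLib.MainLoop().run()  # keep the process alive; a real app adds error handling
```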

Next, preprocessing—including resizing and normalization—is performed either on the CPU or via Voyager or GStreamer. The resulting tensors are passed to the AIPU (e.g., the Metis AIPU) in approximately 0.5 to 1 ms. Inference is executed using the Voyager SDK with YOLOv8n/s and INT8 quantization, achieving 98% to 99.5% accuracy on representative datasets. The results are then transferred back to the host RAM within 2 ms to 6 ms.
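The preprocessing step itself can be kept simple. A minimal OpenCV/NumPy version is sketched below, assuming a 640 x 640 detector input (typical for YOLOv8 exports) and plain resizing rather than the aspect-ratio-preserving letterboxing a production pipeline would normally use.

```python
import cv2
import numpy as np


def preprocess(frame_bgr: np.ndarray, size: int = 640) -> np.ndarray:
    """Resize and normalize a BGR frame into an NCHW float32 tensor."""
    # Resize to the detector's input resolution (640x640 assumed here).
    resized = cv2.resize(frame_bgr, (size, size), interpolation=cv2.INTER_LINEAR)
    # BGR -> RGB, scale to 0..1, and reorder HWC -> CHW.
    rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    chw = np.transpose(rgb, (2, 0, 1))
    # Add the batch dimension expected by the inference runtime.
    return np.expand_dims(chw, axis=0)
```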

Post-processing occurs on the CPU using thresholding and rule-based logic (1 ms), before the final decision is transmitted to the PLC via OPC UA or digital outputs (1–3 ms). The data transfers between the CPU and AIPU occur in the microsecond range and are thus negligible in terms of total latency, yet they are explicitly accounted for to ensure deterministic timing behavior.

To maintain latency performance over the long term, system thermal management must be carefully designed. The CPU-plus-AIPU combination typically draws around 55 to 60 W, with an additional 15 to 20 W per camera and 10 to 25 W for each camera’s lighting rig, assuming commonly used industrial-grade LED ring lights. For a single-camera setup this adds up to roughly 100 W, and the total grows with each additional camera and light.

A passive heatsink with prominent cooling fins, combined with directed airflow, is therefore essential to provide adequate cooling and ensure reliable operation.

Conclusion

Designing an edge AI vision system for industrial in-line inspection presents developers with software, hardware, and integration challenges. The combination of precise image capture, deterministic PLC communication, and low-latency AI inference demands coordinated drivers, robust data paths, and reliable thermal design.

The platform requirements are clear: high computational performance within tight energy and space constraints, modular scalability, and adherence to open, long-term standards. Systems based on the SECO SOM-COM-HPC-A-RPL module and Axelera AI M.2 accelerator card demonstrate how high-performance, energy-efficient solutions can be realized for next-generation industrial inspection applications.

About the Author

Rodney Feldman

Rodney Feldman is vice president of Products, Innovation, and Marketing at SECO USA.
