Deep Learning Algorithm Identifies 3D Images in Real Time

Dec. 18, 2023
The researchers say their 3D method, based on YOLOv3 (You Only Look Once), is designed for autonomous vehicles.

Researchers modified an open-source algorithm for 2D image detection, allowing it to detect objects in both 2D and 3D images.

They say their method is ideal for helping autonomous vehicles navigate their surroundings. During testing, it delivered precise results quickly and outperformed other deep learning architectures.

"By improving detection capabilities, this system could propel autonomous vehicles into the mainstream," says Gwanggil Jeon, a professor in the Department of Embedded Systems Engineering at Incheon National University (Incheon, South Korea), and one of the authors of the new study, published in November 2023 (https://bit.ly/3t5XpMM) in IEEE Transactions on Intelligent Transportation Systems.  

To promote safety, autonomous vehicles need to detect objects accurately in real time even in challenging conditions, such as inclement weather or crowded and chaotic urban landscapes. 

That is why autonomous vehicles typically use a variety of sensors to navigate their surroundings, including LiDAR, radar, and RGB cameras. By combining sensors, they can compensate for the shortcomings of each technology. For example, LiDAR is less effective in inclement weather, and radar cannot detect small objects.

The 3D Imaging Method

In the current study, the researchers built their 3D method on YOLOv3, the third version of YOLO (You Only Look Once), which was originally developed by Joseph Redmon and others at the University of Washington in 2015. YOLO is a one-stage image detector, meaning that it predicts bounding boxes and class labels in a single pass over the image. It has a convolutional neural network architecture and is used broadly in research and industry.

The 3D method created by the researchers at Incheon National University uses both point clouds and RGB images as inputs and then outputs bounding box coordinates with confidence scores and class labels, such as car or truck. Using end-to-end learning to train the algorithm, the researchers eliminated the need for manually crafted features. This is why they refer to their convolutional neural-network model as a “smart IoT-enabled deep learning based end-to-end 3D object detection system.”
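
To make that output concrete, here is a minimal Python sketch of how one detection (box coordinates, confidence score, class label) might be represented and filtered by score. The article does not describe the model's actual output format, so the `Detection3D` fields, the `keep_confident` helper, and the 0.5 threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection3D:
    """One detected object (hypothetical layout, not the paper's format)."""
    center: Tuple[float, float, float]  # box center (x, y, z) in meters
    size: Tuple[float, float, float]    # box length, width, height in meters
    yaw: float                          # heading around the vertical axis, radians
    confidence: float                   # score between 0 and 1
    label: str                          # class label, e.g. "car" or "truck"


def keep_confident(detections: List[Detection3D],
                   threshold: float = 0.5) -> List[Detection3D]:
    """Return detections scoring at least `threshold`, highest score first."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


if __name__ == "__main__":
    # Made-up example values, not results from the study.
    raw = [
        Detection3D((10.0, 2.0, 0.8), (4.5, 1.9, 1.6), 0.0, 0.91, "car"),
        Detection3D((25.0, -3.0, 1.5), (8.0, 2.5, 3.2), 0.1, 0.42, "truck"),
    ]
    for det in keep_confident(raw):
        print(det.label, det.confidence)
```

A downstream planner would typically consume only the detections that survive such a threshold; the right cutoff depends on the application.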

They trained their model using a data set from Lyft (San Francisco, CA, USA) in which image information was captured by 20 autonomous vehicles traveling on a route in Palo Alto, California, USA, for four months. The data comprises about 170,000 urban and suburban scenes.

The vehicles in the Lyft data are equipped with LiDAR sensors on the bumper and roof and six 360° cameras, which are located on the roof. The cameras are synchronized with the LiDAR sensors.

The researchers’ model consists of a feature learning network that transforms point cloud data into features and a convolutional neural network based on YOLOv3.
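
The article does not detail how the feature learning network works. As a rough illustration of the general idea, the Python sketch below turns an unordered point cloud into a regular grid that a convolutional network could process. It bins points into ground-plane cells and records two hand-picked statistics per cell. This is a simplification for illustration only: in the researchers' model the features are learned end to end, and the `grid_features` name and 0.5 m cell size are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in meters
Cell = Tuple[int, int]              # (column, row) index of a ground-plane cell


def grid_features(points: Iterable[Point],
                  cell_size: float = 0.5) -> Dict[Cell, Tuple[int, float]]:
    """Bin points into square ground-plane cells.

    Returns {cell: (point_count, max_height)}. These two statistics
    stand in for learned features; they are not the paper's method.
    """
    counts: Dict[Cell, int] = defaultdict(int)
    max_z: Dict[Cell, float] = {}
    for x, y, z in points:
        # Floor division keeps negative coordinates in the correct cell.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
        max_z[cell] = max(z, max_z.get(cell, z))
    return {cell: (counts[cell], max_z[cell]) for cell in counts}


if __name__ == "__main__":
    # Made-up points, not data from the Lyft set.
    cloud = [(0.1, 0.1, 0.2), (0.4, 0.3, 1.1), (1.2, 0.1, 0.5)]
    print(grid_features(cloud))
```

Once the points sit on a regular grid, the result resembles an image, which is the kind of input a YOLO-style convolutional network expects.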

They trained and tested the model on an NVIDIA (Santa Clara, CA, USA) Tesla P100 GPU (24 GB). They built the model using the Python library TensorFlow.

Using the Lyft data, they created comprehensive simulations to assess the accuracy of their model in detecting moving vehicles on roads. They first tested YOLOv3 on 2D images and then modified the model and tested it on 3D images. The overall accuracy of the model during the tests was 96% for 2D and 97% for 3D object detection.
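
The article does not say how the accuracy figures were calculated. One common way to score a detector, sketched below in Python, is to count a ground-truth box as found when a predicted box overlaps it by at least some intersection-over-union (IoU) value. The 2D axis-aligned boxes, the greedy matching, the 0.5 cutoff, and the function names are assumptions for illustration, not the study's evaluation protocol.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fraction_matched(truth: List[Box], predicted: List[Box],
                     min_iou: float = 0.5) -> float:
    """Share of ground-truth boxes matched by a distinct prediction."""
    unused = list(predicted)
    hits = 0
    for t in truth:
        # Greedy choice: take the unused prediction that overlaps most.
        best = max(unused, key=lambda p: iou(t, p), default=None)
        if best is not None and iou(t, best) >= min_iou:
            hits += 1
            unused.remove(best)
    return hits / len(truth) if truth else 0.0


if __name__ == "__main__":
    # Made-up boxes, not results from the study.
    truth = [(0.0, 0.0, 2.0, 2.0), (5.0, 5.0, 7.0, 7.0)]
    predicted = [(0.0, 0.0, 2.0, 1.0), (10.0, 10.0, 12.0, 12.0)]
    print(fraction_matched(truth, predicted))
```

The same idea extends to 3D boxes by comparing overlapping volumes instead of areas.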

They also compared their results with other deep learning architectures used for object detection. “Experimental results demonstrate that the YOLOv3 model achieves high accuracy and outperforms the state-of-the-art architectures in terms of effectiveness,” the researchers wrote.

About the Author

Linda Wilson | Editor in Chief

Linda Wilson joined the team at Vision Systems Design in 2022. She has more than 25 years of experience in B2B publishing and has written for numerous publications, including Modern Healthcare, InformationWeek, Computerworld, Health Data Management, and many others. Before joining VSD, she was the senior editor at Medical Laboratory Observer, a sister publication to VSD.         
