MIT Researchers Build Robot That Detects Hidden Items in a Pile

Aug. 16, 2022
The robotic system uses a combination of radio frequency signals, computer vision, and MIT-developed algorithms to calculate the minimum number of actions necessary to find and remove the target item.

Researchers at the Massachusetts Institute of Technology (MIT) have developed a robotic system that can detect objects buried in a pile, even if they don’t have RFID tags.

The system, known as FuseBot, can recover a target item that is not tagged, as long as at least some items in the pile have RFID tags. It combines radio frequency signals, computer vision, and MIT-developed algorithms to calculate the minimum number of actions necessary to find and remove the target item.

The researchers reported on their work, “FuseBot: RF-Visual Mechanical Search,” in Robotics: Science and Systems.

“If you're only relying on cameras, you can only see the surface of the pile,” explains Tara Boroushaki, lead author on the article and a research assistant and doctoral student at the Signal Kinetics group in the Media Lab at MIT (Cambridge, MA, USA). “Essentially what we're trying to do is to give robots the ability to see beyond the line of sight,” she says.

Boroushaki notes that the ability of robots to detect untagged items buried in a pile would improve the cost efficiency of many logistics processes, such as processing product returns or picking and packing orders. The capability also would be useful in factory settings. For example, it would allow a robot to locate a tool that a human put away in the wrong place, she says.

FuseBot comprises a UR5e robotic arm from Universal Robots (Odense, Denmark) with a 2F-85 gripper from Robotiq (Quebec City, Canada).

The Search and Retrieval Process

To find an occluded item, the robot first surveys the pile and creates a model using a 3D RealSense Depth D415 camera from Intel (Santa Clara, CA, USA), which is mounted on FuseBot’s wrist.

To create the 3D model, the robot uses a voxel grid to represent the pile. It then uses depth information and the camera’s position to determine whether each voxel in the grid is occupied, empty, or occluded, the researchers explain in the research paper.
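The three-way voxel labeling can be illustrated with a minimal sketch. It is not the researchers' code, and it rests on a strong simplifying assumption: the camera looks straight down, so the depth image reduces to one surface height per grid column. The names Voxel, classify_column, and classify_grid are hypothetical, as are the units (millimeters) and grid sizes.

```python
import math
from enum import Enum


class Voxel(Enum):
    EMPTY = "empty"        # free space the camera saw through
    OCCUPIED = "occupied"  # the visible surface of the pile
    OCCLUDED = "occluded"  # hidden below the surface; contents unknown


def classify_column(surface_height, n_levels, voxel_size):
    """Label one vertical column of voxels (index 0 sits on the table).

    Assumption: an overhead, straight-down view, so everything above the
    measured surface is empty and everything below it is occluded.
    """
    if surface_height <= 0:
        # The camera saw the bare table in this column.
        return [Voxel.EMPTY] * n_levels
    # Index of the voxel that contains the visible surface, clamped to the grid.
    top = min(math.ceil(surface_height / voxel_size) - 1, n_levels - 1)
    column = []
    for level in range(n_levels):
        if level < top:
            column.append(Voxel.OCCLUDED)
        elif level == top:
            column.append(Voxel.OCCUPIED)
        else:
            column.append(Voxel.EMPTY)
    return column


def classify_grid(height_map, n_levels, voxel_size):
    """Label every column of a 2D height map; returns grid[row][col][level]."""
    return [
        [classify_column(h, n_levels, voxel_size) for h in row]
        for row in height_map
    ]


# Hypothetical 2 x 2 height map in millimeters, with 50 mm voxels.
pile = classify_grid([[0, 100], [120, 30]], n_levels=4, voxel_size=50)
for row in pile:
    for column in row:
        print([v.value for v in column])
```

A wrist-mounted camera views the pile from changing angles, so a real system would trace rays from the camera position through the grid rather than assume vertical columns. The sketch only shows how a single depth reading splits a column into empty, occupied, and occluded voxels.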

The robot also has two antennas on its wrist. While the camera surveys the pile, the antennas transmit and receive wireless signals to gather information, such as size and location, on every RFID-tagged item in the pile. Because radio waves can pass through most solid surfaces, the robot can gather information about the environment deep inside the pile.

The two antennas are WA5VJB Log Periodic PCB antennas from Kent Electronics (Sugar Land, TX, USA). They are connected to two BladeRF 2.0 Micro software radios from Nuand (San Francisco, CA, USA) through a ZAPD-21-S+ splitter (0.5-2.0 GHz) from Mini-Circuits (Brooklyn, NY, USA).

Software to control the robot’s operations and communicate between components of the system is housed on a custom-built PC located in the lab.

By merging the information from the RFID and camera systems, FuseBot determines the areas within the pile for which it has no information. Using that result, along with knowledge of the target item’s size and shape, FuseBot creates a heat map showing the high-probability and low-probability locations for the target item.
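One way to picture the fusion step is the toy sketch below. It is an assumption-laden illustration, not the method from the paper: the pile is flattened to a 2D grid, the target is assumed to have a square footprint, and a cell scores higher the more ways the target could fit over it without touching visible space or space already explained by an RFID-tagged item. The function name heat_map and its parameters are hypothetical.

```python
def heat_map(occluded, tagged, target_size):
    """Return a grid of probabilities for where an untagged target may lie.

    occluded[r][c] is True where the camera cannot see (hypothetical input).
    tagged[r][c] is True where an RFID-tagged item is known to sit.
    target_size is the side length, in cells, of the assumed square footprint.
    """
    rows, cols = len(occluded), len(occluded[0])
    # Cells that are hidden from the camera and not explained by a tag.
    unknown = [
        [occluded[r][c] and not tagged[r][c] for c in range(cols)]
        for r in range(rows)
    ]
    score = [[0.0] * cols for _ in range(rows)]
    # Try every placement of the footprint; count the ones that fit entirely
    # inside unknown space, and credit each cell the placement covers.
    for r in range(rows - target_size + 1):
        for c in range(cols - target_size + 1):
            window = [
                (r + dr, c + dc)
                for dr in range(target_size)
                for dc in range(target_size)
            ]
            if all(unknown[i][j] for i, j in window):
                for i, j in window:
                    score[i][j] += 1.0
    total = sum(sum(row) for row in score)
    if total == 0:
        return score  # no hidden region is large enough for the target
    return [[value / total for value in row] for row in score]


# Hypothetical 3 x 3 pile: the top two rows are hidden, and a tagged item
# accounts for the right-hand column.
occluded = [[True, True, True], [True, True, True], [False, False, False]]
tagged = [[False, False, True], [False, False, True], [False, False, False]]
for row in heat_map(occluded, tagged, target_size=2):
    print(row)
```

The design choice worth noting is that tagged items lower the probability of the space they occupy, which is how tags on other items can help locate an item that has no tag of its own.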

In the next step, the robot system uses MIT-developed algorithms to determine the fewest actions required to find the object in the pile. The entire process (imaging the pile, creating a heat map, and removing an object from the pile) repeats every time FuseBot removes an item, until the target item is found.
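The repeat-until-found cycle can be sketched as a simple loop. The version below uses a naive greedy rule (always remove the visible item with the highest score) as a stand-in; it is not the MIT planning algorithm, which the article says minimizes the number of actions. The function greedy_search, the hidden_under mapping, and the score callback are all hypothetical.

```python
def greedy_search(hidden_under, visible, target, score):
    """Remove items one at a time until the target is exposed, then pick it.

    hidden_under maps an item to the items lying directly beneath it.
    visible is the set of items currently on the surface of the pile.
    score(item, visible) estimates how likely the target is under an item;
    it is called again after every removal, mirroring the re-imaging step.
    Returns the ordered list of removed items, ending with the target.
    """
    visible = set(visible)
    removed = []
    while target not in visible:
        if not visible:
            raise LookupError("target is not in the pile")
        # Re-evaluate every visible item against the updated pile.
        pick = max(sorted(visible), key=lambda item: score(item, visible))
        visible.remove(pick)
        visible.update(hidden_under.get(pick, []))  # reveal what lay beneath
        removed.append(pick)
    removed.append(target)  # final action: retrieve the target itself
    return removed


# Hypothetical pile: the target sits under a bag, and a sock sits under a box.
pile = {"box": ["sock"], "bag": ["target"]}
guess = {"bag": 0.7, "box": 0.3}
actions = greedy_search(pile, {"box", "bag"}, "target",
                        lambda item, visible: guess.get(item, 0.0))
print(actions)
```

With a good score function the loop reaches the target in few removals; with a poor one it wastes actions on the wrong items, which is why the quality of the heat map matters.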

Measuring Performance

To measure FuseBot’s performance, the MIT researchers tested it against another vision system, called X-Ray, which estimates 2D occupancy distributions based exclusively on visual information.

In comparison testing, the researchers found that FuseBot required fewer actions to find the target item than X-Ray did. FuseBot also was more successful in finding the item, achieving a success rate of 95%, compared with 84% for X-Ray. And it was faster: The median search-and-retrieval time for FuseBot was 62 seconds, compared with 142 seconds for X-Ray.

About the Author

Linda Wilson | Editor in Chief

Linda Wilson joined the team at Vision Systems Design in 2022. She has more than 25 years of experience in B2B publishing and has written for numerous publications, including Modern Healthcare, InformationWeek, Computerworld, Health Data Management, and many others. Before joining VSD, she was the senior editor at Medical Laboratory Observer, a sister publication to VSD.         
