The robot arm is a uArm Swift Pro from UFACTORY, controlled by a Raspberry Pi using the pyuarm Python library. Once initialized, the robot extends its arm and captures an image of the canvas below using UFACTORY's Vision Camera Kit, which is based on the open-source OpenMV Cam M7 embedded camera. That camera pairs an STM32F765VI ARM Cortex-M7 processor from STMicroelectronics, running at 216 MHz, with OmniVision Technologies' 640 x 480 OV7725 CMOS image sensor, which reaches a speed of 60 fps.
The captured images are processed with OpenCV to find possible Waldo faces in the photo. Each face is then sent for analysis to Google's Cloud AutoML machine learning service, which compares it against the trained Waldo model. Available since January, Cloud AutoML is a suite of machine learning products that, according to Google, enables developers with limited machine learning expertise to train high-quality models specific to their business needs by leveraging the company's transfer learning and Neural Architecture Search technology.
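The two-stage handoff described above can be sketched in Python. The helper names below (`detect_faces`, `classify_waldo`) are hypothetical stand-ins, not Redpepper's actual code: a real version would use something like OpenCV's `cv2.CascadeClassifier.detectMultiScale` for the first stage and the Google Cloud AutoML prediction client for the second.

```python
# Sketch of the two-stage pipeline: an OpenCV-style detector proposes
# face regions, then each region is scored against the Waldo model.
# Both helpers are hypothetical stubs so the sketch runs without
# a camera or an AutoML account.

def detect_faces(image):
    """Stand-in for OpenCV face detection; returns (x, y, w, h) boxes.
    A real version would call cv2.CascadeClassifier.detectMultiScale."""
    # Hard-coded boxes stand in for real detections.
    return [(120, 80, 32, 32), (300, 200, 30, 30)]

def classify_waldo(face_box):
    """Stand-in for the AutoML prediction request; returns a
    confidence score in [0, 1] for the Waldo label."""
    return 0.97 if face_box == (120, 80, 32, 32) else 0.40

def score_candidates(image):
    """Run every detected face through the Waldo classifier."""
    return [(box, classify_waldo(box)) for box in detect_faces(image)]

candidates = score_candidates(image=None)
print(candidates)
```

The point of the split is that the cheap, local OpenCV pass narrows hundreds of page details down to a handful of face crops, so only those few crops incur a round trip to the cloud classifier.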
If a match with a confidence of 95% (0.95) or higher is found, the robot arm is instructed to extend to the coordinates of the matching face and point at it with an attached silicone hand.
If there are multiple Waldos in one photo, according to Redpepper, the arm will point to each one in turn.
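Under those rules, the pointing step reduces to a threshold filter plus a mapping from image pixels to arm workspace coordinates. The sketch below is illustrative: the 0.95 cutoff comes from the article, while the calibration constants in `pixel_to_arm` and the input format are assumptions, and a real system would follow each target with a move command from the arm's Python SDK rather than collecting coordinates in a list.

```python
THRESHOLD = 0.95  # confidence cutoff reported for There's Waldo

def pixel_to_arm(box):
    """Map a face box's pixel centre to arm workspace coordinates.
    The scale and offset constants are hypothetical, not calibrated."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    return (cx * 0.5 + 100, cy * 0.5 + 50)

def waldo_targets(candidates):
    """Return arm coordinates for every match at or above the cutoff,
    so multiple Waldos in one photo each get pointed at."""
    return [pixel_to_arm(box)
            for box, score in candidates
            if score >= THRESHOLD]

matches = waldo_targets([((120, 80, 32, 32), 0.97),
                         ((300, 200, 30, 30), 0.40)])
# each entry is an (x, y) target the arm would move to and point at
print(matches)
```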
"While only a prototype, the fastest There’s Waldo has pointed out a match has been 4.45 seconds, which is better than most 5-year-olds," the company said in its YouTube video.