The camera captures an image, and OpenCV is then used to find the possible Waldo faces in the photo. The faces are sent to Google’s AutoML machine learning service, which compares each one against the trained Waldo model. Available since January, Google’s Cloud AutoML is a suite of machine learning products that, according to the company, enables developers with limited machine learning expertise to train high-quality models specific to their business needs by leveraging Google’s transfer learning and Neural Architecture Search technology.
If a match with a confidence of 95% (0.95) or higher is found, the robot arm is instructed to extend to the coordinates of the matching face and point at it using an attached silicone hand.
If there are multiple Waldos in one photo, according to Redpepper, it will point to each one.
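The selection step described above can be sketched roughly as follows. This is a minimal illustration, not Redpepper's code: the function name `find_waldo_targets`, the `score_face` callback (a stand-in for the call to the trained model), and the box format are all assumptions made for the example. Only the 0.95 threshold and the "point at every match" behavior come from the description.

```python
from typing import Callable, List, Tuple

# Hypothetical box format: (x, y, width, height) in image pixels.
Box = Tuple[int, int, int, int]

# The 95% cut-off mentioned in the article.
CONFIDENCE_THRESHOLD = 0.95


def find_waldo_targets(
    face_boxes: List[Box],
    score_face: Callable[[Box], float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> List[Tuple[int, int]]:
    """Return the centre point of every face whose score meets the threshold.

    `score_face` is a placeholder for whatever returns the model's
    confidence (0.0 to 1.0) that a given face is Waldo.
    """
    targets = []
    for (x, y, w, h) in face_boxes:
        confidence = score_face((x, y, w, h))
        if confidence >= threshold:
            # Point at the middle of the matching face.
            targets.append((x + w // 2, y + h // 2))
    return targets


# Usage with made-up boxes and scores (not real detections).
boxes = [(10, 20, 40, 40), (200, 80, 30, 30), (400, 300, 50, 50)]
fake_scores = {boxes[0]: 0.12, boxes[1]: 0.97, boxes[2]: 0.95}
targets = find_waldo_targets(boxes, lambda box: fake_scores[box])
print(targets)  # [(215, 95), (425, 325)]
```

Returning a list rather than a single best match mirrors the behavior of pointing at each Waldo when several appear in one photo. Converting these pixel coordinates into arm movements is a separate step that is not covered here.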
"While only a prototype, the fastest There’s Waldo has pointed out a match has been 4.45 seconds, which is better than most 5-year-olds," said the company on its YouTube video.