January 2019 snapshots: 3D vision, vision-guided robots, deep learning at the edge
In the January 2019 snapshots, learn about a 3D vision system used to monitor the behavior of pigs, a vision-guided robot that assembled an IKEA chair, a new device from Intel that enables deep learning at the edge, and intelligent vision technology for video conferencing.
3D vision system monitors the behavior of pigs
Researchers in Scotland have developed a 3D vision system that automatically detects the tail position of pigs and alerts farmers when an outbreak of tail biting is imminent, so they can attempt to prevent it from happening.
Tail biting in growing pigs starts without warning. Outbreaks result in pain and sickness for bitten pigs and economic losses for farmers, especially when infection through tail wounds results in the meat becoming spoiled, according to Scotland’s Rural College (SRUC; Aberdeen, UK; www.sruc.ac.uk), which collaborated on the project with Scottish farm technology company Innovent Technology Ltd (Turriff, UK; www.itlscotland.co.uk), pig supply chain partners, and the Agricultural Engineering Precision Innovation Centre (Agri-EPI; Edinburgh, UK; www.agri-epicentre.com).
Furthermore, tail docking of piglets is partly effective at reducing tail biting in later life but is seen as undesirable mutilation, and its routine use is banned in the European Union, per Council Directive 2008/120/EC. Numerous risk factors exist when it comes to tail biting in growing pigs and outbreaks can start without warning or obvious cause, making it difficult to manage on farms. Recent research shows that pigs’ behavior changes before a damaging tail biting outbreak starts.
Specifically, pigs seem to lower their tails right before a potential incident occurs. To alert farmers to a potential biting incident, the project team developed an automated system to detect tail position. Each pig pen had an ifm (Essen, Germany; www.ifm.com) O3D301 3D Time of Flight camera located above the feeder, pointing vertically down and oriented to cover around one-third of the pen area. Ethernet cables fed the acquired image data from each 3D camera to a fit-PC4 industrial fanless PC from CompuLab (Yokneam Illit, Israel; www.compulab.com), which was connected to a broadband internet connection to enable data download.
Each pen was also equipped with two CCTV bullet cameras from Gamut (Bristol, UK; www.gamutcctv.co.uk) mounted in the ceiling: one covering the entire pen and one covering the feeding area where the 3D camera was positioned.
These bullet cameras, along with the 3D camera, recorded 24 hours a day, with video data stored on the hard drive of a PC-based CCTV system from GeoVision (Taipei, Taiwan; www.geovision.com.tw). 3D and 2D video images were watched simultaneously to validate the 3D data “by eye.”
Proprietary algorithms developed by Innovent Technology Ltd. were used to locate pigs and orient them, according to the academic paper, Automatic early warning of tail biting in pigs: 3D cameras can detect lowered tail posture before an outbreak (http://bit.ly/VSD-AEW). For each pig that was present under the camera and standing up, an algorithm was used that locates the tail and measures its angle relative to the body on a scale of 0 to 90°, where 0 is a tail hanging down or tucked against the body so it does not stand out from the curve of the back/rump, and 90 is a tail standing up at 90°.
Validation of the camera setup and algorithm found an accuracy of 73.9% at detecting low vs. not low tails (sensitivity 88.4%, specificity 66.8%). In the tests, 23 groups of 29 pigs per group were reared with non-docked tails under typical commercial conditions over 8 batches. Fifteen groups had tail biting outbreaks. When this occurred, enhancements were added to the pen and biters and/or victims were removed and treated.
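The three validation figures are linked by a simple identity: if accuracy is the plain fraction of correct classifications, then accuracy = sensitivity × p + specificity × (1 − p), where p is the share of validation samples that truly showed a low tail. The short Python sketch below solves that identity for p. It assumes the reported accuracy is unweighted, and because it works from rounded percentages the result is only an estimate, not a figure taken from the paper.

```python
def implied_positive_share(accuracy, sensitivity, specificity):
    """Solve accuracy = sens*p + spec*(1 - p) for p.

    p is the share of samples whose true label is "low tail".
    Assumes accuracy is the unweighted fraction of correct calls.
    """
    if sensitivity == specificity:
        raise ValueError("p cannot be recovered when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# Figures reported in the article, expressed as fractions.
p = implied_positive_share(accuracy=0.739, sensitivity=0.884, specificity=0.668)
print(f"Implied share of low-tail samples: {p:.1%}")  # 32.9%
```

Under that assumption, roughly one-third of the validated samples would have been low tails, which helps explain why the overall accuracy sits closer to the specificity than to the sensitivity.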
3D data from outbreak groups showed the proportion of low tail detections increased pre-outbreak and declined post-outbreak.
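As a rough illustration of the measure described here, the Python sketch below computes the daily proportion of low-tail detections from per-pig angle readings on the 0 to 90° scale. The 30° cut-off, the function name, and the sample readings are all hypothetical: the article does not give the threshold Innovent's proprietary algorithm uses, and the numbers are made up purely to exercise the code.

```python
from collections import defaultdict

LOW_TAIL_MAX_DEG = 30.0  # hypothetical cut-off; the article does not state one


def low_tail_proportion_by_day(detections, threshold=LOW_TAIL_MAX_DEG):
    """Return {day: fraction of detections with angle below threshold}.

    detections: iterable of (day, angle_deg) pairs, angle on the 0-90 scale.
    """
    totals = defaultdict(int)
    lows = defaultdict(int)
    for day, angle in detections:
        if not 0.0 <= angle <= 90.0:
            raise ValueError(f"angle {angle} is outside the 0-90 degree scale")
        totals[day] += 1
        if angle < threshold:
            lows[day] += 1
    return {day: lows[day] / totals[day] for day in sorted(totals)}


# Made-up readings for two days, for illustration only (not project data).
sample = [
    (1, 80.0), (1, 65.0), (1, 10.0), (1, 70.0),
    (2, 20.0), (2, 5.0), (2, 75.0), (2, 15.0),
]
proportions = low_tail_proportion_by_day(sample)
print(proportions)  # {1: 0.25, 2: 0.75}
```

A rising value from one day to the next is the kind of trend the researchers report seeing ahead of an outbreak.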
“This research has achieved everything we hoped for. We can automatically measure tail posture, and we’ve proved it can act as an early warning of tail biting,” said lead author Dr. Rick D’Eath from SRUC.
This work will be further developed in a new three-year project called TailTech (http://bit.ly/VSD-TTS) which—as a result of £676,000 in funding from Innovate UK—will collect data from more diverse pig farms and develop and test a prototype early warning system.
Vision-guided robot autonomously assembles IKEA chair
Researchers from Nanyang Technological University, Singapore (NTU Singapore; Jurong West, Singapore; http://www.ntu.edu.sg/Pages/home.aspx) have developed a robot that uses 3D vision along with robotic arms and grippers to autonomously assemble a chair from IKEA.
To assemble a STEFAN chair kit from IKEA (Älmhult, Sweden; www.ikea.com), Assistant Professor Pham Quang Cuong and his team used two six-axis DENSO (Kariya, Japan; www.denso.com) VS-060 robot arms, two Gamma six-axis force sensors from ATI Industrial Automation (Apex, NC, USA; www.ati-ia.com), and two 2-Finger 85 parallel grippers from Robotiq (Lévis, QC, Canada; https://robotiq.com). The team designed the robot to mimic how a human assembles furniture, with the “arms” capable of six-axis motion and equipped with parallel grippers to pick up objects. Force sensors mounted on the wrists determine how strongly the “fingers” are gripping and how powerfully they push objects into each other.
For the robot’s “eyes,” an Ensenso N35-804-16-BL 3D camera from IDS Imaging Development Systems (Obersulm, Germany; www.ids-imaging.com) was used. This camera features two 1.3 MPixel monochrome global shutter CMOS sensors, blue (465 nm) lights, a GigE interface, and Power over Ethernet. The projected texture stereo vision camera can achieve a frame rate of 10 fps in 3D, or 30 fps in binning mode. Additionally, the camera features a “FlexView projector,” which operates a piezoelectric actuator and doubles the effective resolution of the 3D point cloud for more exact contours and more robust 3D data.
The team coded algorithms using three different open-source libraries to help the robot put the chair together. OpenRAVE (www.openrave.org) was used for collision-free motion planning, the Point Cloud Library (PCL) for 3D computer vision, and the Robot Operating System (ROS; www.ros.org) for integration. The robot began the assembly process by taking 3D images of the parts laid out on the floor to generate a map of the estimated positions of the different parts. Using the algorithms, the robot then planned a two-handed motion that is “fast and collision-free,” with a motion pathway that had to be integrated with visual and tactile perception, grasping, and execution. To ensure that the robot arms could grip the pieces tightly and perform tasks such as inserting wooden plugs, the amount of force exerted had to be regulated. Force sensors mounted on the wrists helped to determine this amount of force, allowing the robot to precisely and consistently detect holes by sliding the wooden plug along the surfaces of the work pieces and to perform tight insertions, according to the team.
It took the robot 11 minutes and 21 seconds to plan the motion pathways and 3 seconds to locate the parts. After this, it took 8 minutes and 55 seconds to assemble the chair.
“Through considerable engineering effort, we developed algorithms that will enable the robot to take the necessary steps to assemble the chair on its own,” said Cuong. “We are looking to integrate more AI into this approach to make the robot more autonomous, so it can learn the different steps of assembling a chair through human demonstration or by reading the instruction manual, or even from an image of the assembled product.”
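Adding up the three reported stages gives the end-to-end time. The brief Python sketch below does the arithmetic with the standard library's timedelta, assuming the stages ran one after another with no gaps, which the article does not state explicitly.

```python
from datetime import timedelta

# Stage durations as reported in the article.
planning = timedelta(minutes=11, seconds=21)
locating = timedelta(seconds=3)
assembly = timedelta(minutes=8, seconds=55)

# Assumes the stages are strictly sequential with no idle time between them.
total = planning + locating + assembly
planning_share = planning / total  # dividing two timedeltas yields a float

print(total)  # 0:20:19
print(f"Planning share of total time: {planning_share:.0%}")
```

On these figures, motion planning rather than physical assembly accounts for more than half of the total time.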
The team is now working with companies to apply this form of robotic manipulation to a range of industries, according to a press release.
Neural Compute Stick 2 from Intel brings faster deep learning development to the edge
Intel (Santa Clara, CA, USA; www.intel.com) has announced the release of its Neural Compute Stick 2 (Intel NCS2), a USB 3.0-based deep learning inference kit and self-contained artificial intelligence accelerator that delivers dedicated deep neural network processing capabilities to a range of host devices at the edge.
Intel’s NCS2—which the company says offers performance boosts over the previous Neural Compute Stick model—is based on the Intel Movidius Myriad X vision processing unit, which features 16 SHAVE vector processing cores and a dedicated hardware accelerator for deep neural network inferences.
The Intel NCS2 is also supported by the Intel Distribution of OpenVINO toolkit, which is based on convolutional neural networks (CNN) and extends workloads across Intel hardware to maximize performance. The toolkit, according to Intel, enables CNN-based deep learning inference on the edge and supports heterogeneous execution across computer vision accelerators including CPU, GPU, FPGA, and the Intel Movidius Neural Compute Stick using a common application programming interface (API).
Enabling deep neural network testing, tuning, and prototyping, the Intel NCS2 was designed to bring developers from “prototyping into production” by leveraging a range of Intel vision accelerator form factors in real-world applications.
“The first-generation Intel Neural Compute Stick sparked an entire community of AI developers into action with a form factor and price that didn’t exist before,” said Naveen Rao, Intel corporate vice president and general manager of the AI Products Group.
He added, “We’re excited to see what the community creates next with the strong enhancement to compute power enabled with the new Intel Neural Compute Stick 2.”
With a PC and the new Intel NCS2, developers can have artificial intelligence and computer vision applications up and running in minutes, according to Intel. The stick plugs into the computer via USB 3.0 and requires no additional hardware. Supported deep learning frameworks include TensorFlow and Caffe, and compatible operating systems include Ubuntu 16.04.3 LTS (64 bit), CentOS 7.4 (64 bit), and Windows 10 (64 bit).
Intel’s new NCS2 device builds on its previous generation of neural compute stick, the Intel Movidius Neural Compute Stick. This unit—which featured the Intel Movidius Myriad 2 vision processing unit—was designed to reduce barriers to developing, tuning, and deploying artificial intelligence applications by delivering dedicated high-performance deep-neural network processing in a small form factor.
Through software and hardware tools, suggested Intel at the time of the product’s release, the Neural Compute Stick brings machine intelligence and artificial intelligence out of the data centers and into end-user devices. View additional details on the device: http://bit.ly/VSD-NCS2.
Uber deploys more than 850 intelligent video communications-enabled conference rooms globally
Altia Systems (Cupertino, CA, USA; www.panacast.com), Zoom Video Communications (San Jose, CA, USA; https://zoom.us) and Uber (San Francisco, CA, USA; www.uber.com) have announced the successful deployment of more than 850 Zoom Rooms, which utilize intelligent vision communications technology, in Uber offices worldwide.
The systems are being used to connect more than 18,000 Uber employees in hundreds of offices around the world. The conference rooms use Altia Systems’ PanaCast 2 camera together with Zoom’s cloud-based enterprise video communications software for conference rooms. PanaCast 2 is a panoramic camera system with three separate image sensors and an adjustable field of view through a USB video class (UVC) PTZ (pan, tilt, zoom) command set. The camera has USB 2.0 and USB 3.0 interfaces and features a 180° wide by 54° tall field of view.
To capture images from all three imagers and provide panoramic video, the PanaCast features patent-pending Dynamic Stitching technology, which analyzes the regions where adjacent images overlap. Once the geometric correction algorithm has finished correcting sharp angles, the Dynamic Stitching algorithm creates an energy cost function of the entire overlap region and comes up with stitching paths of least energy, which typically lie in the background. Processing is done on the on-board PanaCast video processor, and video is rendered in a mathematically correct cylindrical projection with ultra-low latency of less than approximately 5 ms from photons to USB data, according to the company.
Computation is done in real time on a frame-by-frame basis to create the panoramic video. Each imager supplies a video frame of 1600 x 1200 pixels, and joining the frames from the three imagers creates a 3840 x 1080 image. Each imager in the PanaCast 2 camera is a 3 MPixel CMOS image sensor that can reach up to 30 fps in YUV422 and MJPEG video formats.
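Altia's Dynamic Stitching algorithm is patent-pending and its details are not given here, but the general idea of a least-energy path through an overlap region can be sketched with the classic dynamic-programming seam search used in seam carving. In the Python sketch below, the energy grid is a stand-in: one plausible choice, assumed for illustration, is the per-pixel difference between the two overlapping images, so that low values mark places where a cut would be least visible. The function name and sample grid are hypothetical.

```python
def least_energy_seam(energy):
    """Return one column index per row for the cheapest top-to-bottom path.

    energy: list of equal-length rows of non-negative costs.
    The path may move at most one column left or right between rows.
    """
    rows, cols = len(energy), len(energy[0])

    # Forward pass: cost[r][c] = cheapest total cost of any path ending at (r, c).
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        cost.append([
            energy[r][c] + min(prev[max(0, c - 1):min(cols, c + 2)])
            for c in range(cols)
        ])

    # Backward pass: start at the cheapest end point and walk up the grid.
    c = min(range(cols), key=cost[-1].__getitem__)
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(0, c - 1), min(cols, c + 2)
        c = min(range(lo, hi), key=cost[r - 1].__getitem__)
        seam.append(c)
    seam.reverse()
    return seam


# Illustrative 4 x 4 overlap region; low values are "safe" places to cut.
overlap_energy = [
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
    [9, 1, 9, 9],
]
seam = least_energy_seam(overlap_energy)
print(seam)  # [1, 2, 2, 1]
```

Because the search runs in time proportional to the number of pixels in the overlap, an approach of this kind is cheap enough to repeat on every frame, although whether the PanaCast processor works this way is not stated.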
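Some quick arithmetic on these figures shows how much image data the stitching step absorbs and how much bandwidth the panorama would need if sent uncompressed. The Python sketch below assumes 8-bit YUV 4:2:2 at 16 bits per pixel and the full 3840 x 1080 frame at 30 fps; the article does not say which resolutions and formats the camera offers over each USB interface, so the comparison is indicative only.

```python
SENSORS = 3
SENSOR_W, SENSOR_H = 1600, 1200   # per-imager frame size from the article
PANO_W, PANO_H = 3840, 1080       # stitched panorama size from the article
FPS = 30
BITS_PER_PIXEL_YUV422 = 16        # 8-bit YUV 4:2:2 averages 16 bits per pixel

# Columns and rows that overlap, correction, and cropping must absorb.
input_columns = SENSORS * SENSOR_W
columns_absorbed = input_columns - PANO_W
rows_absorbed = SENSOR_H - PANO_H

# Raw bit rate of the panorama if streamed uncompressed at full frame rate.
uncompressed_bps = PANO_W * PANO_H * BITS_PER_PIXEL_YUV422 * FPS

print(f"{input_columns} input columns, {columns_absorbed} absorbed by stitching")
print(f"{rows_absorbed} rows absorbed per frame")
print(f"Uncompressed YUV422: {uncompressed_bps / 1e9:.2f} Gbit/s")  # 1.99 Gbit/s
```

At about 2 Gbit/s, an uncompressed full-resolution stream exceeds the 480 Mbit/s signaling rate of USB 2.0 but fits within the 5 Gbit/s of USB 3.0, which suggests why a compressed format such as MJPEG is offered alongside YUV422.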
Altia Systems worked closely with Ravi Sharma, the head of collaboration and AV services at Uber, and Uber’s technical team to deploy innovative equipment and software as a part of Uber’s AV 2.0 initiative. This initiative, according to the company, aims to improve quality across all audio and video components, boost productivity through seamless collaboration, and streamline remote support while remaining simple and cost-effective.
“When people walk into a conference room, they don’t want to struggle with the technology - they want it to just work,” said Ravi. “PanaCast provides that while significantly enhancing employee productivity by streamlining collaboration with its intelligent features. That’s why we’ve deployed hundreds of PanaCast devices across Uber’s conference rooms globally,” he said. “You have five chairs per huddle room. If two are not in the camera’s field of view, those two chairs are not usable and you’ve lost those seats at the table – literally.”
PanaCast’s 180° field of view allows Uber’s teams to utilize 100% of any conference or huddle room, leaving no space wasted, or team members out of sight.