Outdoor vision system drives efficiencies in bulk transport

Coupling patented fiducial markers with a low-cost vision system makes the loading of bulk materials more cost-effective.

Dr. Nicholas Dowson, Dr. Thomas Landgrebe, and Andre le Vieux

Bulk handlers in the rail industry must manage the loading and transport of trains that comprise upwards of 40 wagons and can be hundreds of meters long. Efficiency is a primary driver in the bulk handling industry, yet a large source of inefficiency is the loading of bulk materials of varying density, such as grain, iron ore and coal: wagons may not be overloaded, and under-loading is wasteful.

The process for achieving optimal loading typically involves a degree of human judgment or coarse adjustments to the loading equipment, and loading variance remains a significant challenge. To date, rail weighbridges have been effective in some situations, but their high cost and the impracticality of installing them at loading stations have limited their use.

To provide direct operator feedback by measuring the vertical displacement of wagons during loading, Cooperative Vision Systems (Brisbane, Queensland, Australia; www.covisvds.com) has developed a vision system in which a track-side camera monitors the side of rail-cars with sub-millimeter precision as they are loaded. Feasibility studies and pilot trials for the system were undertaken in collaboration with bulk grain handler GrainCorp (Sydney, Australia; www.graincorp.com).

The system exploits the linear relationship between loaded weight and downward displacement on the four spring suspensions of the rail-cars, allowing for per-suspension modeling. A printed visual strip consisting of an array of patented Long Range Visual (LRV) fiducial markers and an associated vision system are used for this purpose. LRV tags were key to this application, since uncontrolled lighting conditions and rail-car motion made the alternatives either less robust or dependent on prohibitively expensive and complex sensing hardware.
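The linear relationship described above can be illustrated with a short Python sketch. It fits a straight line to calibration pairs of displacement and known load for a single suspension, then uses the line to infer load from a new displacement reading. The function names and the calibration figures are hypothetical and chosen only for illustration; the article does not describe how the actual calibration is carried out.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def estimate_load_tonnes(displacement_mm, slope, intercept):
    """Infer load from a measured downward displacement using the fitted line."""
    return slope * displacement_mm + intercept


# Hypothetical calibration data for one suspension (not from the article):
# downward displacement in mm against known load in tonnes.
calib_displacement_mm = [0.0, 10.0, 20.0, 30.0, 40.0]
calib_load_tonnes = [0.0, 15.0, 30.0, 45.0, 60.0]

slope, intercept = fit_line(calib_displacement_mm, calib_load_tonnes)
estimate = estimate_load_tonnes(25.0, slope, intercept)
print(f"{slope:.2f} t/mm, estimated load at 25 mm: {estimate:.1f} t")
```

In a per-suspension scheme, one such line would be fitted for each of the four suspensions, which lets differences in spring stiffness between them be absorbed by the individual slopes.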

Using an automated system during loading allows loading data to be archived for individual wagons and makes wagons that require maintenance easier to identify. With additional calibration, weight can be inferred from displacement, enabling the wagon to be filled optimally with the aid of a mobile user interface. Operators using this wearable device can display the identity, type and loading level of the wagon currently being viewed by the system (Figure 1). The current vertical and horizontal displacements of the wagon are updated in real time during loading, a process in which the wagons (typically 14m long) are moved to several positions to ensure an even load distribution. An indication is displayed as the maximum displacement is approached. The system also records loading profiles (i.e. the displacement characteristic for a wagon as it is loaded) and displays historical profiles to further assist operators.

Figure 1: Operators using a wearable device can display the identity, type and loading level of the wagon currently being viewed by the system.

System design

To detect the fiducials on the side of wagons, a camera with a 1m wide field of view is mounted rigidly at wagon height at a loading point, typically within 2m of the track. The system uses an outdoor 5MPixel P1357-E IP CMOS camera from Axis Communications (Lund, Sweden; www.axis.com) fitted with a wide-angle lens whose focal length typically ranges from 2.5 to 6.0mm. Such low-cost IP cameras are sufficient since latencies of approximately 1-2s are acceptable. Compressed streaming video (Motion JPEG or H.264) is then transmitted via Ethernet to a co-located small form factor PC running Linux. A moderate degree of compression can be tolerated by the decoder.

An SL-3918 3486 lumen LED floodlight (Techlight series) from Jaycar Electronics (Sydney, Australia; www.jaycar.com.au) is active night and day, providing a minimum light intensity of approximately 1000 lumens across the entire field of view so that low train speeds of up to 5 or 6 kph can be handled with the camera's shutter speed set at 1/500s. The camera, PC and illuminator share the same power supply. For communications, a 3G modem from Intel connects to the local cellular network. An Android smartphone connects with the PC wirelessly, with a preinstalled application providing real-time updates to the operator. The local NUVO-3000 PC from Neousys Technology (New Taipei City, Taiwan; www.neousys-tech.com) logs the wagon displacement data and uploads the data to a remote Linux server at predefined intervals. The use of a centralized server enables multiple sites at distant locations to be managed. Such scalability is critical where transport networks consist of multiple sites distributed over wide areas, often hundreds of kilometers apart (Figure 2).
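To put the shutter speed and train speed quoted above into perspective, the following Python sketch estimates how far a wagon travels during one exposure and how many pixels that corresponds to. The 1m field of view, 1/500s exposure and 6 kph speed come from the article; the horizontal resolution of 2592 pixels is an assumption (a common width for 5 megapixel sensors) and is not stated in the text.

```python
def motion_blur_mm(speed_kph, exposure_s):
    """Distance travelled by the wagon during a single exposure, in mm."""
    speed_mm_per_s = speed_kph * 1_000_000 / 3600
    return speed_mm_per_s * exposure_s


def mm_per_pixel(fov_width_mm, image_width_px):
    """Approximate footprint of one pixel at the wagon side, in mm."""
    return fov_width_mm / image_width_px


ASSUMED_IMAGE_WIDTH_PX = 2592  # assumption, not given in the article

blur_mm = motion_blur_mm(speed_kph=6.0, exposure_s=1 / 500)
pixel_mm = mm_per_pixel(fov_width_mm=1000.0, image_width_px=ASSUMED_IMAGE_WIDTH_PX)
blur_px = blur_mm / pixel_mm

print(f"blur: {blur_mm:.2f} mm, pixel footprint: {pixel_mm:.3f} mm, blur: {blur_px:.1f} px")
```

Under these assumptions the wagon moves a little over 3mm, or several pixels, during each exposure, which suggests why a marker that tolerates motion blur matters in this setting.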

Figure 2: Cooperative Vision Systems' bulk handling monitoring system captures images of a wagon fitted with an LRV strip, processes these images on a local embedded PC and transfers the vertical and horizontal displacement of the wagon to a wireless mobile interface. A camera connected to a PC computes the tag 3D position and identity, stores the data and makes the data available via a web application. The mobile device is used as a real-time user interface to obtain feedback on the vehicle's current position and identity.

Long range visual tags

Mounted along the length of each wagon (on both sides) is a narrow strip of fiducial markers, or tags. The system detects these tags and uses their locations to compute the vertical and horizontal location of the wagon relative to the camera.

The sites where the bulk material is loaded are both indoor and outdoor at various angles to the sun or artificial lighting, and loading can occur at any time of day or night. Even with multiple lamps to illuminate the scene, lighting conditions vary substantially. Highlights and shadows are frequently present, as are obscuring factors such as intermittent dust and dirt on the tags.

The substantial variations in lighting conditions make the detection and localization of markers challenging. Hence the key innovation of the system is the design of the fiducial marker and of the software used to detect it. Each strip of markers consists of a grid of long range visual (LRV) tags, a patented fiducial developed by Covis that is composed of several colored segments within an octagon (Figure 3). The tags are specifically designed for use in uncontrolled outdoor environments such as loading sites.

Figure 3: The structured patterns of color changes of long range visual tags (LRVTs) between sectors are detected and used to locate the tag's center.

In addition to coping with extreme dynamic range and partial occlusion, LRV tags remain detectable under substantial changes in scale and lighting, and tolerate blur caused either by motion or by being positioned far from the camera's focal plane. To achieve such robustness, the structured patterns of color changes between sectors are detected and used to locate the tag's center.

Complementing the tag design is the software used to detect and decode the patterns. The tag detector can rapidly locate and identify multiple tags simultaneously in megapixel video streams, even on limited hardware. The reliability and speed of the camera, combined with its long range capabilities, allow 3D localization of objects in uncontrolled environments. This circumvents the need for complex camera setups or specialized hardware used in other systems, and eliminates the errors introduced by ambiguous fiducial identities that plague simpler setups.

Together, the grid of individual tags on the strip enables an identity to be associated with each wagon and provides a 3D reference describing the vehicle's rigid pose relative to the camera as the strip moves through the camera's field of view. The individual identity associated with each tag provides substantial redundancy, allowing the accuracy of the pose estimate to be maintained in the presence of partial occlusion and shadow. Accuracy is augmented by the fact that individual LRV tags can be located with sub-pixel accuracy.
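The redundancy described above can be illustrated with a simplified Python sketch. It fits a 2D similarity transform (scale, in-plane rotation and translation) between the known tag layout and the detected tag centers by least squares, using only the tags that were actually identified. This is a deliberately reduced stand-in for the 3D rigid pose estimate the system computes, and the tag layout, IDs and function name are hypothetical.

```python
import cmath

# Hypothetical strip layout: tag ID -> position on the strip in mm, stored as
# a complex number x + 1j*y. The real LRV strip geometry is not given here.
MODEL_MM = {0: 0 + 0j, 1: 100 + 0j, 2: 200 + 0j,
            3: 0 + 50j, 4: 100 + 50j, 5: 200 + 50j}


def fit_similarity(model_mm, observed_px):
    """Least-squares fit of observed ~ a * model + b over the shared tag IDs.

    abs(a) is the scale in px/mm, cmath.phase(a) the in-plane rotation and
    b the image position of the strip origin.
    """
    ids = sorted(set(model_mm) & set(observed_px))
    if len(ids) < 2:
        raise ValueError("need at least two identified tags")
    p = [model_mm[i] for i in ids]
    q = [observed_px[i] for i in ids]
    p_mean = sum(p) / len(p)
    q_mean = sum(q) / len(q)
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b


# Synthetic check: tags 2 and 4 are treated as occluded and left out.
a_true = cmath.rect(2.5, 0.01)   # 2.5 px/mm with a 0.01 rad tilt
b_true = 400 + 300j
observed = {i: a_true * MODEL_MM[i] + b_true for i in (0, 1, 3, 5)}

a, b = fit_similarity(MODEL_MM, observed)
scale_px_per_mm = abs(a)
tilt_rad = cmath.phase(a)
```

Because every tag carries its own identity, any subset of two or more visible tags can be matched to the layout without ambiguity, so the fit degrades gracefully when some tags are hidden by shadow or dirt.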

The geometric relationships on the tag strips are known, allowing lens and camera calibration to occur on the fly. This involves estimating both internal parameters (e.g. focal length, radial lens distortion) and external parameters (camera pose) from the detected positions of multiple LRV tags and their known spatial arrangement. Rapid and simple calibration means the system can be adapted to a range of situations, and the system is tolerant of small camera bumps or movements. The use of multiple elements also enables sub-pixel accuracies to be obtained. Figure 4 shows an example of the detected tags being used to estimate the vertical displacement of a train wagon during loading.
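As a rough illustration of how known tag geometry turns pixel measurements into physical displacement, the Python sketch below derives a mm-per-pixel scale from the spacing of neighboring tags and then averages the vertical shift of the tags seen in both an empty and a loaded frame. It ignores lens distortion and perspective, which the real calibration accounts for, and the tag pitch, tag IDs and pixel coordinates are invented for the example.

```python
import math
from statistics import mean

TAG_PITCH_MM = 100.0  # hypothetical spacing between consecutive tag IDs


def scale_mm_per_px(tags_px, pitch_mm=TAG_PITCH_MM):
    """Average mm-per-pixel scale from consecutive visible tags in one row.

    tags_px maps tag ID -> (x, y) pixel position of the tag center.
    """
    ids = sorted(tags_px)
    scales = []
    for i, j in zip(ids, ids[1:]):
        (x0, y0), (x1, y1) = tags_px[i], tags_px[j]
        scales.append(pitch_mm * (j - i) / math.hypot(x1 - x0, y1 - y0))
    return mean(scales)


def vertical_displacement_mm(empty_px, loaded_px):
    """Mean downward shift of tags seen in both frames, converted to mm.

    Image y grows downward, so a positive result means the wagon has sunk.
    """
    common = sorted(set(empty_px) & set(loaded_px))
    if not common:
        raise ValueError("no tag is visible in both frames")
    shift_px = mean(loaded_px[i][1] - empty_px[i][1] for i in common)
    return shift_px * scale_mm_per_px(empty_px)


# Invented detections: tag ID -> (x, y) in pixels.
empty = {3: (200.0, 500.0), 4: (450.0, 500.0), 5: (700.0, 500.0)}
loaded = {3: (200.0, 550.2), 4: (450.0, 549.8), 6: (950.0, 550.0)}

displacement_mm = vertical_displacement_mm(empty, loaded)
print(f"vertical displacement: {displacement_mm:.2f} mm")
```

Averaging over several tags is one simple way in which multiple elements can reduce the effect of localization noise on any single tag.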

Figure 4: LRV tags on a grid are located, identified (blue and green annotations) and used to estimate the pose of the grid (magenta and red annotations) and to measure the vertical displacement (black vertical line and annotation) of a railcar during loading.

Infield operation

Figure 5 shows the system operating at a fully functional installation at a commercially active grain loading site, together with the displacement the wagon undergoes during loading. As shown, the tag strip is mounted horizontally on the wagon mid-line. The limited size of the individual tags, and of the tag strip, relative to the camera field of view is clearly visible. The identity of each tag within the field of view (yellow box) is obtained, and the known relative locations of the identified tags are fitted to their detected positions to compute the overall rigid pose of the strip. The same approach is used during calibration, but with several pose estimates used to characterize the camera's lens distortion and focal length. Calibration data enables more accurate pose estimates to be obtained but is not mandatory.

Figure 5: Tag strips attached to the mid-line of each wagon consist of a grid of LRV tags. Individual tag locations are combined to estimate the 3D pose of the strip, so that the displacement during a load can be measured with sub-pixel accuracy.

The system has been installed at multiple commercially active sites, where it has demonstrated the ability to monitor rail-car loading in real time using vision system hardware. Providing immediate feedback during loading has demonstrably reduced loading variations, which has a direct impact on the bulk handling industry's cost-effectiveness. The commercial use of the visual displacement system was recognized with an award for innovation at the 2014 Australian Bulk Handling Awards. LRV tags are a cost-effective and versatile solution for many applications and look set to expand the domain of accurate vision-based measurement and identification into other industries such as asset management and color calibration.

Dr. Nicholas Dowson, Image Processing Engineer; Dr. Thomas Landgrebe, Director and Image Processing Engineer; and Andre le Vieux, Director and Web Architect, Cooperative Vision Systems (Brisbane, Queensland, Australia; www.covisvds.com).