PCI-BASED SPEED IMAGE DATA TRANSFER
By Rick Nelson, Contributing Editor
Pentium-class PCs make cost-effective platforms for performing imaging tasks, particularly with the advent of the peripheral component interconnect (PCI) bus. With a 132-Mbyte/s theoretical and 70- to 80-Mbyte/s continuous bandwidth, the PCI bus speeds image data transfer from frame grabbers to system memory.
The PCI bus offers many advantages over the industry standard architecture (ISA) bus. With a 32-bit data path at 33 MHz, the PCI bus provides a tenfold improvement over the 16-bit, 8-MHz ISA bus. Unlike the ISA bus, the PCI bus is processor-independent and allows peripherals to act as bus masters, directly accessing system memory without host-processor intervention (see Fig. 1).
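The peak figures quoted above follow from bus width and clock rate. The sketch below is only a back-of-the-envelope check: the function name is illustrative, and it assumes one transfer per clock, which flatters the ISA bus because ISA transfers generally take more than one clock.

```python
def peak_bandwidth_mbytes(bus_width_bits, clock_mhz):
    """Peak rate in Mbyte/s, assuming one transfer per clock (an idealization)."""
    return bus_width_bits / 8 * clock_mhz

pci_peak = peak_bandwidth_mbytes(32, 33)  # 132.0 Mbyte/s, the theoretical PCI figure
isa_peak = peak_bandwidth_mbytes(16, 8)   # 16.0 Mbyte/s under the same idealization

print(f"PCI peak: {pci_peak:.0f} Mbyte/s, ISA peak: {isa_peak:.0f} Mbyte/s")
print(f"Ratio: {pci_peak / isa_peak:.2f}x")
```

Under this idealization the ratio comes out a little above eight; the multi-clock nature of real ISA transfers widens the gap toward the tenfold figure.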
This bus-master capability is ideal for image-processing applications in which data are captured to host memory. Video at 30 frames/s produces 10- to 40-Mbyte/s data streams, which the PCI bus readily handles.
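The 10- to 40-Mbyte/s range can be reproduced with simple arithmetic. The sketch below assumes a 640 x 480 frame and pixel depths of 1 byte (gray scale) and 4 bytes (32-bit color); neither the frame size nor the depths are stated in the article, so treat them as illustrative.

```python
def video_rate_mbytes(width, height, bytes_per_pixel, frames_per_s=30):
    """Sustained data rate of an uncompressed video stream in Mbyte/s."""
    return width * height * bytes_per_pixel * frames_per_s / 1e6

gray_rate = video_rate_mbytes(640, 480, 1)   # about 9.2 Mbyte/s
color_rate = video_rate_mbytes(640, 480, 4)  # about 36.9 Mbyte/s

# Compare with the low end of the 70- to 80-Mbyte/s continuous PCI figure.
PCI_SUSTAINED_MBYTES = 70
for name, rate in (("gray", gray_rate), ("color", color_rate)):
    print(f"{name}: {rate:.1f} Mbyte/s, fits on PCI: {rate < PCI_SUSTAINED_MBYTES}")
```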
Because the ISA bus cannot handle real-time video data, ISA frame grabbers must include on-board memory, which serves as an expensive repository for data that may be destined for system memory. Transferring data directly to system memory over the PCI bus eliminates this memory redundancy.
While the PCI bus enables data transfer to system memory, the Display Control Interface for Windows 3.x and DirectDraw interface for Windows 95 software standards support sending images directly to VGA graphics-card memory. Consequently, you can view acquired video signals in real time while the host simultaneously processes them (see "Understanding frame-grabber specifications," p. 45).
"Until now, image-processing and machine-vision applications were relegated to expensive platforms such as the VMEbus, specialized processing environments, or workstations. High-performance image processing was not available in PC environments at reasonable cost," says Michael Cyros, product manager at Datacube (Danvers, MA).
With PCI-bus-based hosts, frame grabbers can transfer digitized image data directly to system memory. Data Translation (Marlboro, MA) takes this approach with its analog- and digital-input frame grabbers, which rely on the host for image storage, processing, and display. For example, the DT3155-PM for the Power Macintosh acquires and digitizes four analog video inputs and transfers the pixel data to the host over the PCI bus, performing only limited hardware-based scaling and cropping. The frame grabber relies on the host to store image data and execute image-processing algorithms.
BitFlow (Woburn, MA) takes a similar approach with its Road Runner, emphasizing the PCI board as a multichannel digital-camera interface.
"Road Runner's sustained transfer rate exceeds the camera data rate, so on-board frame-buffer memory is unnecessary, and host memory can be used as the frame buffer. Data become available to the host as they arrive from the camera, eliminating the 20- to 200-ms delay that frame-grabber buffer memory can impose," says marketing vice president Jeffrey Wilson.
Those advocating limited or no on-board processing and memory also suggest leaving display output tasks to the host. "On-board display hardware is unnecessary. An image-acquisition product that includes a VGA chip binds you to a proprietary architecture that cannot improve with time," says Wilson of BitFlow.
The issue of on-board memory and processing is far from settled, though. Frame-grabber vendors providing on-board VGA chips contend that their display technology keeps pace with advances in their boards' acquisition and processing capabilities. One such firm is Coreco (St. Laurent, Que., Canada), whose Oculus TCI-VGA board provides frame-grabber functions while adding a VGA function that replaces the host VGA card to drive NTSC, PAL, and RGB monitors in dual-screen mode (see Fig. 2).
For real-time image processing, many frame grabbers use look-up tables (LUTs) and custom application-specific integrated circuits (ASICs). LUTs substitute new values for the pixel values in the original input data. They perform functions such as gamma correction (compensating for camera gray-scale nonlinearity), improving contrast, and de-emphasizing portions of an image that contain no relevant information.
LUTs can also provide false color, generating color displays from gray-scale inputs and assigning different colors for gray-scale ranges. This is useful in contour mapping, because different colors are easier to distinguish than gray-scale levels. "You could perform such functions in software, but RAM LUTs are more cost-effective," says Gail Marshall, Image Nation's chairman and chief technical officer.
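A software model of the two LUT uses described above might look like the following sketch. It assumes 8-bit pixels; the function names, the gamma value, and the four-color palette are illustrative choices, not taken from any vendor's board.

```python
def build_gamma_lut(gamma, levels=256):
    """Table mapping each input level to its gamma-corrected output level."""
    top = levels - 1
    return [round(top * (v / top) ** (1.0 / gamma)) for v in range(levels)]

def build_false_color_lut(levels=256):
    """Table assigning one RGB color to each of four equal gray-scale ranges."""
    palette = [(0, 0, 255), (0, 255, 0), (255, 255, 0), (255, 0, 0)]  # assumed colors
    band = levels // len(palette)
    return [palette[min(v // band, len(palette) - 1)] for v in range(levels)]

def apply_lut(pixels, lut):
    """Replace every pixel with its table entry: one memory read per pixel."""
    return [lut[p] for p in pixels]

gamma_lut = build_gamma_lut(2.2)        # 2.2 is a commonly used value, assumed here
color_lut = build_false_color_lut()
scan_line = [0, 40, 100, 180, 255]      # made-up gray-scale samples
corrected = apply_lut(scan_line, gamma_lut)
colored = apply_lut(scan_line, color_lut)
```

Because the table is computed once and then only indexed, the per-pixel cost is a single lookup regardless of how complicated the mapping function is, which is why a RAM LUT suits real-time hardware.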
Digital signal processors
Proponents of on-board digital signal processing argue that host PCs simply cannot perform image-processing tasks efficiently. Because of this, many frame grabbers feature on-board digital signal processors from Motorola (Austin, TX) and Texas Instruments (Dallas, TX), along with digital-signal-processor libraries of image-processing software. In its TMS320C80-based Genesis frame grabber, for example, Matrox (Dorval, Que., Canada) offers a library of C80-optimized C routines for machine vision, medical imaging, and image analysis.
Frame grabbers with on-board processing capability allow imaging tasks to be distributed across multiple processors. The Genesis main board includes one processor, and more processing nodes can be added. Image processing can be partitioned across multiple nodes. If such segmentation is not suitable, successive frames can be assigned to separate nodes.
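The two distribution strategies mentioned above, partitioning one image across nodes or assigning successive frames to separate nodes, can be sketched as follows. The function names and the row-band partitioning scheme are assumptions for illustration; the article does not describe how the Genesis software actually divides work.

```python
def split_rows(height, nodes):
    """Partition image rows into contiguous (start, stop) bands, one per node."""
    base, extra = divmod(height, nodes)
    bands, start = [], 0
    for n in range(nodes):
        rows = base + (1 if n < extra else 0)  # spread leftover rows over the first nodes
        bands.append((start, start + rows))
        start += rows
    return bands

def node_for_frame(frame_index, nodes):
    """Round-robin assignment of whole frames to nodes."""
    return frame_index % nodes

print(split_rows(480, 4))                       # [(0, 120), (120, 240), (240, 360), (360, 480)]
print([node_for_frame(i, 3) for i in range(5)])  # [0, 1, 2, 0, 1]
```

Row bands suit operations that work on local neighborhoods; round-robin frame assignment avoids splitting an image at the cost of extra latency per frame.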
The Genesis includes ASICs to offload data-management tasks from the 'C80 DSP. In addition, the system offers a choice of input and output modules for data acquisition and display control (see Fig. 3).
On-board processing can allow a single host to control multiple cameras, whose combined data streams would swamp even the PCI bus. The Eltec Electronik (Mainz, Germany) PCImagine boards have allowed Windows-based PCs to control up to four cameras, each of which outputs 10 Mpixels/s. PCImagine uses the Imagine DSP chip from Arcobel Graphics (Hertogenbosch, Netherlands) to provide the necessary processing power. Arcobel's Imagine is optimized for graphics and imaging applications and provides the advantages of both general-purpose processors and very long instruction word (VLIW) architectures. Conventional processors are built around functional groups such as ALUs, barrel shifters, and multiplier/accumulators. Normally, only one group is active per operation; the others remain idle. A model in which these groups are not only separate but totally independent, so that they can operate in parallel, results in multiple instructions per cycle.
Imagine is based on this model, allowing independent units to be concatenated to operate on streams of incoming data. It includes eight independent 32-bit buses that can pipeline operations. The buses can be dynamically resized, carrying four 8-bit, two 16-bit, or one 32-bit results per clock cycle. An instruction-word field controls each unit, leading to the Imagine's VLIW architecture.
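The idea of one 32-bit path carrying four 8-bit or two 16-bit results can be illustrated in software. The sketch below models lane packing in general terms only; the function names are hypothetical, and nothing here represents the Imagine's actual instruction set or bus logic.

```python
def pack_lanes(values, lane_bits):
    """Pack small integers into one word, lowest lane first."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word

def unpack_lanes(word, lane_bits, word_bits=32):
    """Split a word back into its lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(word_bits // lane_bits)]

def lanewise_add(a, b, lane_bits, word_bits=32):
    """Add two packed words lane by lane; each lane wraps around independently."""
    mask = (1 << lane_bits) - 1
    sums = [(x + y) & mask
            for x, y in zip(unpack_lanes(a, lane_bits, word_bits),
                            unpack_lanes(b, lane_bits, word_bits))]
    return pack_lanes(sums, lane_bits)

a = pack_lanes([250, 1, 2, 3], 8)   # four 8-bit pixels in one 32-bit word
b = pack_lanes([10, 1, 1, 1], 8)
print(unpack_lanes(lanewise_add(a, b, 8), 8))  # [4, 2, 3, 4]; lane 0 wraps past 255
```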
Figure 1. Industry standard architecture (ISA) frame grabbers typically must store and process data on-board, because the ISA's 16-bit, 8-MHz limit is inadequate for real-time video-data transfer (a). The 33-MHz, 32-bit PCI bus eliminates the bottleneck, allowing direct transfer of data to host memory, all without intervention of the host CPU (b).
Figure 2. An acquisition-only model and two versions that provide VGA capability make up Coreco's Oculus TCI frame grabber family. The TCI-VGA version can drive NTSC, PAL, and RGB monitors, replacing a host's VGA card.
Understanding frame-grabber specifications
To produce acceptable images, multimedia frame grabbers automatically adjust images to improve image quality. For industrial applications, an accurate, digitized version of the original signal is required, and automatic adjustments are not beneficial. Gray-scale images typically contain all the critical features of an image without burdening the host with the larger bandwidth, processing, and memory demands of color. Consequently, gray-scale frame grabbers are often the best choice for industrial applications.
Digitizing accuracy affects the sharpness of a digitized image, which in turn determines whether accurate, repeatable measurements of image features can be made. Most vendors publish pixel-jitter and gray-scale-noise specifications that can be used to compare products.
Pixel jitter measures how precisely a frame grabber samples gray-scale intensities along each horizontal scan line. To digitize an NTSC signal at 640 pixels per scan line requires an 80-ns sampling interval. Pixel jitter indicates how far sample intervals may deviate from the ideal. Precision frame grabbers should limit jitter to ۭ ns because low jitter is important in detecting fine edges of images.
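The 80-ns figure can be checked by dividing the visible part of a scan line by the pixel count. The sketch below assumes an active line time of roughly 52 microseconds for NTSC; that value and the function names are assumptions for illustration, not figures from the sidebar.

```python
def sampling_interval_ns(active_line_us, pixels_per_line):
    """Time between successive pixel samples along one scan line, in ns."""
    return active_line_us * 1000.0 / pixels_per_line

def sample_clock_mhz(interval_ns):
    """Pixel clock frequency corresponding to a given sampling interval."""
    return 1000.0 / interval_ns

ACTIVE_LINE_US = 52.0  # assumed approximate NTSC active line time
interval = sampling_interval_ns(ACTIVE_LINE_US, 640)
print(f"Interval: {interval:.2f} ns, pixel clock: {sample_clock_mhz(interval):.1f} MHz")
```

The result lands close to 80 ns, so a timing error of only a few nanoseconds is already a noticeable fraction of one pixel.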
Frame grabbers should accurately and repeatably assign gray-scale levels to corresponding input-voltage levels. If the voltage range 44 -444 mV corresponds to a gray-scale level of 147, then applying a constant voltage in this range should result in the same gray-scale value. Because of random noise, a roughly bell-shaped distribution of gray-scale values centered on the correct value will be produced. Gray-scale noise is usually specified as the standard deviation of this distribution. Precision frame grabbers should have a gray-scale noise specification of 0.7 gray-scale units or less--or 0.7 least significant bits.
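Gray-scale noise as defined above can be estimated by digitizing a constant input repeatedly and taking the standard deviation of the results. In the sketch below the readings are invented for illustration, and the use of the population standard deviation is an assumption about how a vendor might compute the figure.

```python
import statistics

def gray_scale_noise(readings):
    """Standard deviation of repeated readings of a constant input, in gray-scale units."""
    return statistics.pstdev(readings)

# Hypothetical readings for a constant voltage whose correct level is 147.
readings = [147, 147, 146, 148, 147, 147, 148, 146, 147, 147]
noise = gray_scale_noise(readings)
NOISE_LIMIT = 0.7  # the precision threshold quoted in the sidebar

print(f"Mean level: {statistics.mean(readings)}")
print(f"Noise: {noise:.3f} gray-scale units, within limit: {noise <= NOISE_LIMIT}")
```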
Differences in the incoming video signal can complicate image digitization. Poor lighting and cable losses yield signals that use only a narrow band of the full input range. Input devices such as VCRs often provide irregular sync signals.
Frame grabbers that provide gain and offset adjustments can compensate for limited input-signal ranges, ensuring the use of the full range of gray-scale levels. Compensating for irregular synchronization signals is more complex and involves design trade-offs. Frame grabbers may use phase-locked loops (PLLs) to minimize pixel jitter, but PLLs cannot instantly resynchronize--the better a PLL is at limiting jitter, the worse it is for resynchronizing. For industrial applications, frame grabbers with a crystal-controlled digital clock that can resynchronize very quickly should be used. Once resynchronized, the frame grabber will remain stable in generating pixel intervals.
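The effect of gain and offset adjustment can be modeled as a linear rescaling that stretches a narrow input range across all gray-scale levels. The sketch below is a software model only, since a real board applies gain and offset to the analog signal before digitizing; the 200- to 500-mV range, the 8-bit depth, and the function names are illustrative assumptions.

```python
def gain_and_offset(signal_lo_mv, signal_hi_mv, levels=256):
    """Gain (levels per mV) and offset (levels) that map the signal range to full scale."""
    gain = (levels - 1) / (signal_hi_mv - signal_lo_mv)
    offset = -signal_lo_mv * gain
    return gain, offset

def digitize(mv, gain, offset, levels=256):
    """Apply gain and offset, then clip to the valid gray-scale range."""
    level = round(mv * gain + offset)
    return max(0, min(levels - 1, level))

# Hypothetical weak signal occupying only 200 to 500 mV of the input range.
gain, offset = gain_and_offset(200.0, 500.0)
print([digitize(mv, gain, offset) for mv in (150.0, 200.0, 500.0, 600.0)])  # [0, 0, 255, 255]
```

Inputs outside the chosen range clip to 0 or 255, so the range should be set from the actual signal extremes.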
-- Gail Marshall
Beaverton, OR 97075
Figure 3. ASICs offload data-management tasks from the TMS320C80 processor on Matrox's Genesis board. The Genesis accepts optional frame-grabber and display modules.