Wavelets compress images at high resolution

MARCH 29--As electronic storage, retrieval, and distribution of documents become faster and cheaper, documents and papers are becoming increasingly digital. In the last decade, existing documents have usually been retyped and converted to HTML or Adobe's PDF format. A simpler alternative would be to scan the original page and compress the image in JPEG or GIF format using a standard compression algorithm.

Unfortunately, these files tend to be quite large if the text is to remain readable. To overcome this, researchers at the Centre for Wavelets, Approximation and Information Processing (CWAIP) at the Department of Mathematics, National University of Singapore (www.cwaip.nus.edu.sg/demo/wdic.htm) have developed an approach called WDIC for compressing document images that makes it possible to transfer a high-quality page at a very high compression ratio.

The main idea of the document image-compression technique is to partition the page into four parts, from which the original image can be reconstructed, and to encode each part separately: the character images, the picture images, the line images, and the background image. Character images are encoded with a novel algorithm that combines extent-based morphological matching and clustering with wavelet compression. Picture images are encoded with a wavelet-based compression algorithm suited to grayscale images. Lines are encoded with a one-dimensional wavelet-based compression algorithm. The background image is also encoded with a wavelet-based compression algorithm.
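To make the idea of one-dimensional wavelet compression concrete, here is a minimal sketch in Python. It is not the CWAIP algorithm, whose details are not given here. It assumes the simplest possible wavelet (the Haar wavelet) and a keep-the-largest-coefficients rule, and the function names haar_forward, haar_inverse, and compress are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """Full multi-level Haar transform of a list whose length is a power of two."""
    coeffs = [float(x) for x in signal]
    n = len(coeffs)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) * _S for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) * _S for i in range(half)]
        coeffs[:n] = averages + details  # coarse part first, details after
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) * _S)
            merged.append((a - d) * _S)
        out[:2 * n] = merged
        n *= 2
    return out


def compress(signal, keep):
    """Illustrative lossy step: zero all but the `keep` largest-magnitude coefficients."""
    coeffs = haar_forward(signal)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


# A gray-level profile across a sharp edge, as might occur on a ruled line.
profile = [10, 10, 10, 10, 200, 200, 200, 200]
sparse = compress(profile, keep=2)
restored = haar_inverse(sparse)
```

In this example the eight-sample profile is described by just two nonzero coefficients, and haar_inverse recovers it to within floating-point error. A practical codec would additionally quantize and entropy-code the surviving coefficients.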

With WDIC, a typical A4-size document page in 8-bit grayscale at 100 dpi can be compressed to between 5 and 30 kbytes. Compared with the original 100-dpi raw image, WDIC achieves a compression ratio of 40 to 200 times. For a typical document page at 300 dpi, WDIC achieves a compression ratio of 100 to 400 times.
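The figures quoted above can be sanity-checked with a little arithmetic. The sketch below assumes the standard A4 dimensions of 210 x 297 mm and an uncompressed raw image with one byte per pixel. The function names raw_size_bytes and compressed_size_range are hypothetical.

```python
A4_MM = (210.0, 297.0)  # standard A4 paper size in millimetres
MM_PER_INCH = 25.4


def raw_size_bytes(dpi, bits_per_pixel=8):
    """Size of an uncompressed A4 scan at the given resolution."""
    width_px = round(A4_MM[0] / MM_PER_INCH * dpi)
    height_px = round(A4_MM[1] / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


def compressed_size_range(dpi, ratio_low, ratio_high):
    """Smallest and largest compressed size (bytes) implied by a ratio range."""
    raw = raw_size_bytes(dpi)
    return raw / ratio_high, raw / ratio_low


raw_100 = raw_size_bytes(100)
small_100, large_100 = compressed_size_range(100, 40, 200)
raw_300 = raw_size_bytes(300)
small_300, large_300 = compressed_size_range(300, 100, 400)
```

At 100 dpi the raw page is 827 x 1169 pixels, or 966,763 bytes, so ratios of 40 to 200 imply files of roughly 4.8 to 24 kbytes, broadly in line with the quoted range. At 300 dpi the raw page is about 8.7 Mbytes, so ratios of 100 to 400 imply roughly 22 to 87 kbytes.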

WDIC is a progressive codec. It provides progressive decoding not only of the background, but also of character images, picture images, and line images. Users can interactively choose the compression ratios to obtain a satisfactory image quality. For text-only pages without pictures, WDIC can also automatically select the highest compression ratio that still maintains high quality.

A comparison of WDIC with JPEG and SPIHT-like wavelet compression is shown. One A4-size document page at 100 dpi is compressed, and only the upper-left part of the document is displayed. Interested parties can download the software from www.cwaip.nus.edu.sg/demo/wdicdemo.htm.