DjVuLibre: What does it do?

What's in DjVuLibre

DjVuLibre contains:

Here is a non-exhaustive list of the commands included with DjVuLibre:

  • c44: a wavelet-based continuous-tone image encoder (à la JPEG-2000).
  • cjb2: single-page encoder for bitonal images (black and white scans).
  • cpaldjvu: encoder for palettized images (à la GIF, but better).
  • bzz: a general-purpose data compressor (à la bzip2).
  • djvused: a powerful command interpreter for manipulating DjVu documents.
  • ddjvu: converts DjVu documents to PBM/PGM/PPM images.
  • djvudump: displays the structure of a DjVu file.
  • djvuextract: extracts chunks from a DjVu file.
  • djvumake: assembles chunks into a DjVu file.
  • djvutxt: extracts the "hidden text" from a previously OCRed DjVu document.
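
As a rough illustration of how a few of these commands fit together, the sketch below builds command lines for cjb2, ddjvu, and djvutxt from Python. The helper names and file names are hypothetical, chosen only for this example; the options shown (-dpi for cjb2, -format and -page for ddjvu) are assumed to match the installed DjVuLibre version, so check each tool's man page before relying on them.

```python
import shutil
import subprocess


def encode_bitonal_cmd(src_pbm, dst_djvu, dpi=300):
    """Build a cjb2 command line that encodes a bitonal PBM scan."""
    return ["cjb2", "-dpi", str(dpi), src_pbm, dst_djvu]


def render_page_cmd(src_djvu, dst_ppm, page=1):
    """Build a ddjvu command line that renders one page to a PPM image."""
    return ["ddjvu", "-format=ppm", f"-page={page}", src_djvu, dst_ppm]


def extract_text_cmd(src_djvu, dst_txt):
    """Build a djvutxt command line that dumps the hidden text layer."""
    return ["djvutxt", src_djvu, dst_txt]


def run_if_available(cmd):
    """Run cmd only if its executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names; this only prints the commands, it runs nothing.
    for cmd in (
        encode_bitonal_cmd("scan.pbm", "scan.djvu"),
        render_page_cmd("scan.djvu", "page1.ppm"),
        extract_text_cmd("scan.djvu", "scan.txt"),
    ):
        print(" ".join(cmd))
```

Passing one of the returned lists to run_if_available executes the tool only when it is installed, which keeps the sketch harmless on machines without DjVuLibre.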

What's not in DjVuLibre

DjVu is a bit like MPEG in its asymmetry between the decoders and the encoders. Decoders and simple/experimental encoders are open sourced and included in DjVuLibre, but the best encoders (as of today) are owned by LizardTech Inc and kept proprietary. The smarts in the encoder can make a big difference in terms of file size and image quality. Building smart or specialized commercial DjVu encoders (and applications around them) is what companies like LizardTech do for a living.

LizardTech is in fact building its business around selling high-performance encoders, OCR, indexing tools, server software, OEM software development kits, customized systems, specialized viewers, and support. LizardTech builds its high-performance commercial compressors around four pieces of technology:

  • a fast and high-performance multipage bitonal document encoder (acquired from AT&T)
  • a foreground/background layer segmenter for scanned color documents (acquired from AT&T)
  • a direct converter from PS/PDF to DjVu (licensed from AT&T)
  • an OCR engine (licensed from a 3rd party).

Although it is conceivable that adequate open source replacements for these will eventually become available, the AT&T/LizardTech technologies listed above are not included in DjVuLibre. This means that, at the moment, certain types of documents compressed with LizardTech's commercial compressors or with the on-line conversion services (such as Any2DjVu) will end up smaller (and in some cases higher-quality) than the ones compressed with the DjVuLibre encoders.

Yann LeCun, December 2001.

