BZZ
Section: DjVuLibre-3.5 (1)
Updated: 10/11/2001
NAME
bzz - DjVu general purpose compression utility.
SYNOPSIS
Encoding:
bzz -e[blocksize] inputfile outputfile
Decoding:
bzz -d inputfile outputfile
DESCRIPTION
The first form of the command line (option -e) compresses the data from file
inputfile and writes the compressed data into outputfile.
The second form of the command line (option -d) decompresses file inputfile
and writes the output to outputfile.
OPTIONS
- -d
Decoding mode.
- -e[blocksize]
Encoding mode. The optional argument blocksize specifies the size, in
kilobytes, of the input file blocks processed by the Burrows-Wheeler
transform. The default block size is 2048 KB. The maximal block size is
4096 KB. Specifying a larger block size usually produces higher compression
ratios and increases the memory requirements of both the encoder and decoder.
Specifying a block size larger than the input file brings no further benefit.
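The option syntax above can be sketched as a small Python helper that only
builds argument lists and does not run anything. The function names and the
lower bound of 1 KB are assumptions made for illustration; the 2048 KB default
and 4096 KB maximum are the values stated on this page.

```python
import math

DEFAULT_BLOCK_KB = 2048  # default block size stated on this page
MAX_BLOCK_KB = 4096      # maximal block size stated on this page


def suggest_block_kb(input_size_bytes):
    """Hypothetical helper: a block size (in KB) no larger than needed.

    A block larger than the input file brings no benefit, so the file size
    is rounded up to whole kilobytes and capped at the documented maximum.
    """
    needed = max(1, math.ceil(input_size_bytes / 1024))
    return min(needed, MAX_BLOCK_KB)


def bzz_encode_args(inputfile, outputfile, block_kb=None):
    """Build the argument list for: bzz -e[blocksize] inputfile outputfile."""
    if block_kb is None:
        # No explicit size: bzz falls back to its default block size.
        return ["bzz", "-e", inputfile, outputfile]
    # Assumption: sizes below 1 KB are rejected here; this page gives no minimum.
    if not 1 <= block_kb <= MAX_BLOCK_KB:
        raise ValueError("block size must be between 1 and %d KB" % MAX_BLOCK_KB)
    # The block size is appended directly to -e, with no space in between.
    return ["bzz", "-e%d" % block_kb, inputfile, outputfile]


def bzz_decode_args(inputfile, outputfile):
    """Build the argument list for: bzz -d inputfile outputfile."""
    return ["bzz", "-d", inputfile, outputfile]
```

If bzz is installed, the resulting list can be passed to
subprocess.run(args, check=True).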
ALGORITHMS
The Burrows-Wheeler transform is performed using a combination of the
Karp-Miller-Rosenberg and the Bentley-Sedgewick algorithms. This is comparable
to (Sadakane, DCC 98) with a slightly more flexible ranking scheme. Symbols
are then ordered according to a running estimate of their occurrence
frequencies. The symbol ranks are then coded using a simple fixed tree and
the ZP binary adaptive coder (Bottou, DCC 98).
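As a rough illustration of the pipeline described above, the following Python
sketch computes a Burrows-Wheeler transform by naively sorting all rotations of
a block, inverts it, and then ranks symbols with plain move-to-front. This is
not the method used by bzz: the naive sort stands in for the
Karp-Miller-Rosenberg and Bentley-Sedgewick combination, move-to-front stands
in for the frequency-based symbol ordering, and the entropy coding stage is
omitted entirely. All function names are hypothetical.

```python
def bwt(block):
    """Naive Burrows-Wheeler transform of a bytes object.

    Returns (last_column, primary_index), where primary_index is the row of
    the sorted rotation table that holds the original block.
    """
    n = len(block)
    if n == 0:
        return b"", 0
    # Sort rotation start positions by the rotation they denote.
    order = sorted(range(n), key=lambda i: block[i:] + block[:i])
    # The last symbol of the rotation starting at i is block[i - 1].
    last = bytes(block[(i - 1) % n] for i in order)
    return last, order.index(0)


def inverse_bwt(last, primary_index):
    """Rebuild the original block from the last column and primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by symbol maps each row of the first column
    # to the row whose last symbol is that same occurrence.
    nxt = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary_index
    for _ in range(n):
        row = nxt[row]
        out.append(last[row])
    return bytes(out)


def mtf_ranks(data):
    """Plain move-to-front ranking (a simpler stand-in for bzz's scheme)."""
    table = list(range(256))
    ranks = []
    for symbol in data:
        rank = table.index(symbol)
        ranks.append(rank)
        # Move the symbol just seen to the front of the table.
        table.pop(rank)
        table.insert(0, symbol)
    return ranks
```

The transform groups identical symbols into runs, so the ranking stage emits
many small values (zero for an immediate repeat), which is what makes the
final adaptive coding stage effective.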
The Burrows-Wheeler transform is also used in the well-known compressor bzip2.
The originality of bzz is the use of the ZP adaptive coder. The adaptation
noise can cost up to 5 percent in file size, but this penalty is usually
offset by the benefits of adaptation.
PERFORMANCE
The following table shows comparative results (in bits per character) on the
Canterbury Corpus (http://corpus.canterbury.ac.nz). The very good bzz
performance on the spreadsheet file excl puts the weighted average ahead of
much more sophisticated compressors such as fsmx.
Compression performance

|          | text | fax  | csrc | excl | sprc | tech | poem | html | lisp | man  | play | Weighted | Average |
| compress | 3.27 | 0.97 | 3.56 | 2.41 | 4.21 | 3.06 | 3.38 | 3.68 | 3.90 | 4.43 | 3.51 | 2.55     | 3.31    |
| gzip -9  | 2.85 | 0.82 | 2.24 | 1.63 | 2.67 | 2.71 | 3.23 | 2.59 | 2.65 | 3.31 | 3.12 | 2.08     | 2.53    |
| bzip2 -9 | 2.27 | 0.78 | 2.18 | 1.01 | 2.70 | 2.02 | 2.42 | 2.48 | 2.79 | 3.33 | 2.53 | 1.54     | 2.23    |
| ppmd     | 2.31 | 0.99 | 2.11 | 1.08 | 2.68 | 2.19 | 2.48 | 2.38 | 2.43 | 3.00 | 2.53 | 1.65     | 2.20    |
| fsmx     | 2.10 | 0.79 | 1.89 | 1.48 | 2.52 | 1.84 | 2.21 | 2.24 | 2.29 | 2.91 | 2.35 | 1.63     | 2.06    |
| bzz      | 2.25 | 0.76 | 2.13 | 0.78 | 2.67 | 2.00 | 2.40 | 2.52 | 2.60 | 3.19 | 2.52 | 1.44     | 2.16    |
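For readers who want to reproduce such figures, here is a minimal Python
sketch of the usual bits-per-character measure (eight times the compressed
size divided by the original size, both in bytes). It also computes the plain
mean of the eleven per-file bzz figures, which appears consistent with the
Average column; the Weighted column presumably weights files by size, which
this page does not list. The names below are illustrative only.

```python
def bits_per_character(original_bytes, compressed_bytes):
    """Compressed bits spent per input byte (character)."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 8.0 * compressed_bytes / original_bytes


def unweighted_mean(values):
    """Plain arithmetic mean of the per-file figures."""
    values = list(values)
    return sum(values) / len(values)


# Per-file bzz figures copied from the table above.
BZZ_ROW = {
    "text": 2.25, "fax": 0.76, "csrc": 2.13, "excl": 0.78,
    "sprc": 2.67, "tech": 2.00, "poem": 2.40, "html": 2.52,
    "lisp": 2.60, "man": 3.19, "play": 2.52,
}

bzz_mean = unweighted_mean(BZZ_ROW.values())
```

Because the table entries are already rounded, the recomputed mean can differ
from the published Average in the last digit.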
Note that DjVu contributors have several entries in this table. Program
compress was written some time ago by Joe Orost. Program ppmd is an
improvement of the PPM-C method invented by Paul Howard.
CREDITS
Program bzz was written by Léon Bottou <leonb@users.sourceforge.net> and was
then improved by Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers
<docbill@sourceforge.net> and many others.
SEE ALSO
djvu(1), compress(1), gzip(1), bzip2(1)