Comparison of Advanced Compression for Bitonal Images
Now includes Acrobat 6 and PARC Silx PDFConverter!
This study extends the example tables for JBIG2-compressed PDF published on the Adobe and CVision websites, and adds a study using technical papers and a study using two large multipage book documents.
Use these comparisons to see for yourself the relative merits of advanced compression technologies for DjVu and PDF, as applied to bitonal scanned images, and to compare the four JBIG2-PDF compression engines available today. These four engines are:
Acrobat Capture 3.0 Agent Pack
Acrobat 6.0
cVision PDFCompressor
PARC Silx PDFConverter
Use the displayed sizes to compare the compression effectiveness. Click the size hyperlinks in the tables below to open files to compare actual performance.
Adobe Study - Extended
A comparison by JRA, Sept. 19, 2002, updated Oct. 22, 2002
...a DjVu and CVision JBIG2-PDF extension of...
| Document | PDF Image with G4 Compression | PDF Image with JBIG2 Compression (Capture) | PDF Image with JBIG2 Compression (CVision) | DjVu Image using JB2 Compression | Searchable Image with Group 4 Compression | Searchable Image with JBIG2 Compression (Capture) | Searchable Image with JBIG2 Compression (CVision) | Searchable Image DjVu using JB2 Compression |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Contract | | | | | | | | |
| Annual Report | | | | | | | | |
| Technical Report | | | | | | | | |
| Patent | | | | | | | | |
| Total | 5.84M | 1.11M | 921k | 897k | 5.13M | 1.61M | 1.055M | 1.12M |
| Average | 100% | 19% | 16% | 15% | 100% | 31% | 18% | 22% |
Comparison Notes:
All Adobe and CVision PDF files have been copied to the PlanetDjVu server, so you can make a good comparison of the relative performance and speed of PDF and DjVu files.
Remember, file size is not the only factor in web performance; the design of the file format and of the viewer plug-in matters too. I notice a short lag in the decoding of JBIG2-compressed PDF pages from Capture, while there is barely any lag at all for PDF pages from CVision or for DjVu. Check for yourself!
DjVu files were produced (and OCRed) at the Any2DjVu conversion server.
CVision JBIG2-PDF files were produced by CVision using the latest Build 19 of their compressor.
CVision has introduced some clever text compression in addition to JBIG2 compression, which is why their OCRed JBIG2-PDF is actually smaller than DjVu!
cVision Study - Extended
A comparison by JRA, May 14, 2002, updated Oct. 4, 2002
...a DjVu and Acrobat Capture JBIG2-PDF extension of...
| Category | Document | TIFF Group IV size (bytes) | PDF size (bytes) | JBIG2 size (bytes) | JBIG2-PDF size (bytes), Capture | JBIG2-PDF size (bytes), cVision | DjVu size (bytes) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Reports | Starr report page 39 | | | 8,589 | | | |
| Patents | US Patent #US06122289 | | | 155,982 | | | |
| Books | Page1 | | | 19,624 | | | |
| Legal | Legal Page 1 | | | 11,335 | | | |
| Financial | Financial Contract 1 | | | 150,757 | | | |
| Fax | CCITT image 4 | | | 11,792 | | | |
| Totals | Size | 2,560,520 | 2,684,641 | 358,079 | 443,010 | 392,156 | 317,448 |
| Totals | Percent | 100% | 105% | 14% | 17.3% | 15.3% | 12.4% |
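The Percent row can be reproduced from the Size row by dividing each total by the TIFF Group IV total. The following is a minimal sketch of that arithmetic; the byte counts are copied from the Totals row, while the function name `percent_of_baseline` and the dictionary labels are chosen here for illustration only.

```python
# Totals from the Size row of the cVision study table, in bytes.
TOTALS_BYTES = {
    "TIFF Group IV": 2_560_520,
    "PDF": 2_684_641,
    "JBIG2": 358_079,
    "JBIG2-PDF (Capture)": 443_010,
    "JBIG2-PDF (cVision)": 392_156,
    "DjVu": 317_448,
}


def percent_of_baseline(sizes, baseline="TIFF Group IV"):
    """Express every size as a percentage of the baseline size (1 decimal)."""
    base = sizes[baseline]
    return {name: round(100.0 * size / base, 1) for name, size in sizes.items()}


if __name__ == "__main__":
    for name, pct in percent_of_baseline(TOTALS_BYTES).items():
        print(f"{name:22s} {pct:6.1f}%")
```

Rounded to one decimal, this gives 104.8% for PDF, 14.0% for raw JBIG2, 17.3% and 15.3% for the two JBIG2-PDF engines, and 12.4% for DjVu, which agrees with the Percent row at the precision shown there.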
Technical Paper Comparison from JRA
This is a set of 59 technical papers presented at a symposium and originally published on CD-ROM. The scanned text is clean and only one or two fonts are used. This study demonstrates the compression levels that can be achieved when clean, laser-printed scientific and technical papers are scanned and converted to searchable-image PDF.
Comparison Notes:
To view CIF files, you need the CVista Viewer Plug-in, available from here.
To view DjVu files, you need the DjVu Web Browser Plugin, available from here.
To view JBIG2-PDF files, you need Acrobat Reader 5.0 or above.
To compare the formats, test display performance and weigh the results against the compression levels achieved. Performance is best measured with files that have not yet been opened and are therefore not in the browser's cache, so that they are opened directly from the URL address. You have plenty of unopened file links to test with below.
We observe that JBIG2-PDF files from CVision, PARC, and Acrobat 6 display faster than JBIG2-PDF files from Acrobat Capture, due to faster decoding in the Acrobat Reader plug-in. DjVu files display twice as fast as any PDF alternative, because both the file format and the plug-in are designed specifically for display speed on the web.
JBIG2-PDF files produced by cVision and PARC Silx engines are smaller than those produced by Adobe Acrobat Capture 3.0 Agent Pack and Adobe Acrobat 6.0.
The "Arithmetic" compression mode of the PARC Silx engine produces smaller files than the default "Huffman" compression mode.
All compression in this comparison is "lossy". cVision calls it "Perceptually Lossless", which is perhaps a better term than "lossy". It means that small dissimilarities are allowed between characters that are considered to be the same for compression purposes. We did not find any character recognition errors from using "lossy" in any of the comparison files. The other compression setting is "lossless", which does not permit any differences at all, however tiny, in characters considered to be the same for compression purposes. This results in a much larger file. We believe that "lossy" can be used in almost all cases, since it is indeed "perceptually lossless".
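The difference between the "lossy" and "lossless" settings described in the notes above can be illustrated with a toy symbol dictionary. This is only a sketch of the general idea of matching similar character bitmaps. It is not the algorithm used by any of the engines compared here; the greedy matching, the pixel-difference threshold `max_diff`, and the tiny 3x5 glyphs are all assumptions made for illustration.

```python
from typing import List, Tuple

Glyph = Tuple[str, ...]  # each string is one row of '0'/'1' pixels


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two same-sized glyph bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def build_dictionary(glyphs: List[Glyph], max_diff: int) -> Tuple[List[Glyph], List[int]]:
    """Greedily map each glyph to the first stored symbol within max_diff pixels.

    max_diff == 0 models "lossless" matching (only identical bitmaps are merged);
    max_diff > 0 models "lossy" matching (small dissimilarities are tolerated).
    """
    symbols: List[Glyph] = []
    refs: List[int] = []
    for g in glyphs:
        for idx, s in enumerate(symbols):
            same_shape = len(s) == len(g) and len(s[0]) == len(g[0])
            if same_shape and hamming(s, g) <= max_diff:
                refs.append(idx)  # reuse an existing symbol
                break
        else:
            symbols.append(g)  # no close match: store a new symbol
            refs.append(len(symbols) - 1)
    return symbols, refs


# Hypothetical scanned glyphs: two clean "E"s, one "E" with a stray pixel, one "L".
E_CLEAN = ("111", "100", "110", "100", "111")
E_NOISY = ("111", "100", "111", "100", "111")
L_CLEAN = ("100", "100", "100", "100", "111")
page = [E_CLEAN, L_CLEAN, E_NOISY, E_CLEAN]

lossless_symbols, lossless_refs = build_dictionary(page, max_diff=0)
lossy_symbols, lossy_refs = build_dictionary(page, max_diff=1)
print(len(lossless_symbols), lossless_refs)
print(len(lossy_symbols), lossy_refs)
```

With these four glyphs, the lossless setting stores three symbols because the noisy "E" cannot be merged with the clean one, while the lossy setting stores only two. Fewer stored symbols is the reason lossy files are smaller, and a threshold that is too loose would start merging different characters, which is the risk the notes above report checking for.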
| Filename | TIFF-G4 300 dpi | PDF 300 dpi | PDF 240 dpi | JBIG2-PDF 300 dpi Capture | JBIG2-PDF 300 dpi CVision | JBIG2-PDF 300 dpi PARC Silx Huffman | JBIG2-PDF 300 dpi PARC Silx Arith. | JBIG2-PDF 300 dpi Acrobat 6 | CIF 300 dpi | DjVu 300 dpi |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001 | | | | | | | | | | |
| 002 | | | | | | | | | | |
| 003 | | | | | | | | | | |
| 004 | | | | | | - | - | | | |
| 005 | | | | | | | | | | |
| 006 | | | | | | | | | | |
| 007 | | | | | | | | | | |
| 008 | | | | | | - | - | | | |
| 009 | | | | | | | | | | |
| 010 | | | | | | - | - | | | |
| 011 | | | | | | - | - | | | |
| 012 | | | | | | - | - | | | |
| 013 | | | | | | | | | | |
| 014 | | | | | | | | | | |
| 015 | | | | | | | | | | |
| 016 | | | | | | | | | | |
| 017 | | | | | | | | | | |
| 018 | | | | | | | | | | |
| 019 | | | | | | | | | | |
| 020 | | | | | | | | | | |
| 021 | | | | | | | | | | |
| 022 | | | | | | | | | | |
| 023 | | | | | | - | - | | | |
| 024 | | | | | | - | - | | | |
| 025 | | | | | | | | | | |
| 026 | | | | | | - | - | | | |
| 027 | | | | | | | | | | |
| 028 | | | | | | | | | | |
| 029 | | | | | | | | | | |
| 030 | | | | | | - | - | | | |
| 031 | | | | | | | | | | |