scanning notes (was RE: VAX 11/780 /750 Diagnostics - new project)

From: Eric Smith
Date: Sun Jan 12 20:40:00 2003

Matt asks:
> Do you have any tips for scanning books? Such as resolution, color
> settings, etc...

300 to 400 DPI. B&W only, unless the originals make use of color.
Save B&W images as TIFF Class F Group 4, except for pages that
have photographs. Those should be saved as JPEG (either color or
grey scale). Text and line art should *never* be saved as JPEG,
because that makes it blurry [*]. I recommend saving any "mixed" pages
(containing text/line art AND photographs) as both TIFF Class F G4 and
JPEG.

Many programs only offer a vague "TIFF" option for saved files, without
telling you what kind of TIFF they will produce. That's like asking
a car salesman what kind of engine a car has, and being told "internal
combustion". It's true, but it hasn't told you anything useful. You
can use the "tiffinfo" program from the libtiff package to find out
what kind of compression a TIFF file really has.
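
If tiffinfo isn't handy, the same question can be answered by reading the
Compression tag (tag 259) out of the file's first image directory. The
following is only a rough sketch in Python, not anything taken from libtiff;
the function name and the table of names are made up for illustration, and it
only looks at the first page of a classic (non-BigTIFF) file.

```python
import struct

# Common values of the TIFF Compression tag (259).
COMPRESSION_NAMES = {
    1: "uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 (T.4)",
    4: "CCITT Group 4 (T.6)",
    5: "LZW",
    6: "JPEG (old-style)",
    7: "JPEG",
    8: "Deflate",
    32773: "PackBits",
}

def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a TIFF.

    Hypothetical helper for illustration; `data` is the raw file contents.
    """
    if data[:2] == b"II":
        endian = "<"          # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"          # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i          # each IFD entry is 12 bytes
        tag, _type, _n = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == 259:
            # A single SHORT is stored left-justified in the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # the tag is optional; its default is 1 (uncompressed)
```

To use it, read the file with `open(path, "rb").read()`, pass the bytes in,
and look the result up in `COMPRESSION_NAMES`. A properly saved G4 scan
should come back as 4.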

Group 3 lossless compression is an acceptable alternative to Group 4,
but doesn't achieve quite as good a compression ratio.

More recently, JBIG and JBIG2 standards for lossless compression have
appeared. They offer even higher compression ratios than G4. However,
JBIG and JBIG2 are currently encumbered by patents, so not much software
uses them yet. I recommend avoiding them.

If you plan to convert the scanned documents to PDF files, it is useful
to note that G3 and G4 compression are natively supported by the PDF
standard and all compliant viewer software.


[*] JPEG compression is designed specifically for lossy compression
of continuous tone images. The compression is achieved by throwing
away high frequency components of the image. Text and line art have
sharp edges with a lot of high-frequency components, so JPEG compression
causes blurring. Although you can adjust the degree of lossiness of
most JPEG compressors, if you turn it down enough to not cause noticeable
blurring, you also don't get nearly as good a compression ratio as
lossless Group 4 encoding.
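
The effect is easy to see on a toy example. The sketch below is a
simplification and not how a real JPEG codec is written: real JPEG works on
8x8 blocks in two dimensions and quantizes the coefficients rather than
simply zeroing them. Here a single row of 8 pixels across a sharp black/white
edge is run through a discrete cosine transform, the higher-frequency
coefficients are thrown away, and the row is transformed back. The function
names are made up for illustration.

```python
import math

def dct(x):
    """Unnormalized DCT-II of a list of samples."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(c)
    return [c[0] / n + 2.0 / n * sum(c[k] * math.cos(math.pi / n * (i + 0.5) * k)
                                     for k in range(1, n))
            for i in range(n)]

def low_pass(x, keep):
    """Zero every DCT coefficient at index >= keep, then transform back."""
    c = dct(x)
    return idct([v if k < keep else 0.0 for k, v in enumerate(c)])

edge = [0, 0, 0, 0, 255, 255, 255, 255]       # one row across a sharp edge
print([round(v) for v in low_pass(edge, 8)])  # all coefficients kept
print([round(v) for v in low_pass(edge, 3)])  # high frequencies discarded
```

With every coefficient kept, the row comes back unchanged. With only the
lowest three kept, the pixels on either side of the edge land on intermediate
grey values instead of pure black and white, which is the blurring described
above.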