manuals in pdf (resolution, compression)

From: Alexander Schreiber <als_at_thangorodrim.de>
Date: Sun Jun 27 07:23:55 2004

On Sat, Jun 26, 2004 at 11:27:16AM -0400, Tom Owad wrote:
> I have a lot of manuals I want to scan and am trying to decide upon the
> best format. I'd like some opinions on the following scans of a 128-page
> Franklin AceWriter manual.
>
> On the low end is a pdf of bitmap images. It's hideous, but only 3.5 MB.
>
> The high end is a 40 MB pdf of jpeg images. This one's easy on the eyes,
> but is an awfully large download and I'm wondering if it might not print
> as nicely as the bitmap.

JPEG is absolutely evil for anything outside its design area, which is
basically compressing photos and photo-like images. Using JPEG for
black-and-white is a bad idea (yes, there is a special monochrome
subformat of JPEG, designed to deal gracefully with monochrome images,
but almost nobody uses it where it would be appropriate).

> In the middle is a pdf of compressed jpegs at 15 MB. This looks good to
> my eyes, I just wonder about using compressed jpegs for archiving...
>
> <www.applefritter.com/temp/acewriter_lo.pdf> (3.5 MB)
> <www.applefritter.com/temp/acewriter_med.pdf> (15 MB)
> <www.applefritter.com/temp/acewriter_hi.pdf> (40 MB)
> (Disregard the incorrect ordering of the pages.)
>
> Thoughts? Which of the three would you most want to download?

I scan everything for archival at 600 dpi monochrome, save it as
TIFF G4 compressed images, then convert those into PDF (using tumble, a
great tool) and DjVu as well, and also tar up the TIFFs. This leaves me
with 3 files for each (multipage) document:
 - .tar - the archived raw .tiff files,
 - .pdf - the .tiff converted to .pdf (basically a PDF wrapper around a
   collection of Fax G4 compressed TIFF images),
 - .djvu - the .tiff converted to .djvu
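
A minimal sketch of the tarring step only, assuming Python's standard
tarfile module. The helper name tar_tiffs and the directory layout are
hypothetical and are not the setup described above; the PDF and DjVu
conversions are left to the external tools.

```python
import tarfile
from pathlib import Path


def tar_tiffs(scan_dir, tar_path):
    """Pack every .tif/.tiff file in scan_dir into an uncompressed tar.

    Hypothetical helper for illustration only.
    Returns the member names in the order they were added.
    """
    scan_dir = Path(scan_dir)
    # Sort by file name so pages land in the archive in a stable order.
    pages = sorted(
        p for p in scan_dir.iterdir()
        if p.suffix.lower() in (".tif", ".tiff")
    )
    # Plain "w" mode (no gzip): G4 TIFFs are already compressed,
    # so a second compression pass would likely gain little.
    with tarfile.open(tar_path, "w") as tar:
        for page in pages:
            tar.add(page, arcname=page.name)
    return [p.name for p in pages]
```

Called as tar_tiffs("scans/some_document", "some_document.tar"), with
both paths being placeholders.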

An example: a scan from a German computer magazine, 4 pages of A4
paper, full page scans:
 - djvu 430962 bytes
 - pdf 1505417 bytes
 - tar 1566720 bytes
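
To put those three numbers in proportion, here is a small sketch that
expresses each file as a fraction of the tar of raw TIFFs. The byte
counts are the ones listed above; the function name relative_sizes is
made up for illustration.

```python
# Byte counts taken from the 4-page example above.
sizes = {"djvu": 430962, "pdf": 1505417, "tar": 1566720}


def relative_sizes(sizes, baseline="tar"):
    """Return each size as a fraction of the baseline entry."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


ratios = relative_sizes(sizes)
for name, ratio in ratios.items():
    print(f"{name:5s} {sizes[name]:>8d} bytes  {ratio:6.1%} of tar")
```

The DjVu file comes out at a bit over a quarter of the tar, while the
PDF is only a few percent smaller than the tar, as expected for a
wrapper around the same G4 data.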

The reason for the (rather high) 600 dpi is simple: printing those back
on the laser printer results in very high quality copies - far better
than what most copy machines are capable of. It also results in a
sufficiently high image quality that you can later feed the .tiff to
OCR software (which is another reason for keeping the original TIFF
files).

Dropping the resolution by half to 300 dpi would cut the file size down
to roughly 25% of the above, since a page then has a quarter of the
pixels. But anything lower than 300 dpi might just be too lossy.
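
The 25% figure follows from the pixel count, which scales with the
square of the resolution. A rough sketch of that arithmetic for an
uncompressed 1-bit A4 page is below; the A4 dimensions in inches are
rounded, the function name is hypothetical, and G4-compressed sizes
depend on page content, so they will only approximately follow the same
ratio.

```python
A4_INCHES = (8.27, 11.69)  # A4 is 210 x 297 mm, rounded to inches


def raw_bitonal_bytes(dpi, size_inches=A4_INCHES):
    """Uncompressed size in bytes of a 1-bit-per-pixel page scan."""
    width_px = round(size_inches[0] * dpi)
    height_px = round(size_inches[1] * dpi)
    # 8 pixels per byte, each row padded up to a whole byte.
    row_bytes = (width_px + 7) // 8
    return row_bytes * height_px


full = raw_bitonal_bytes(600)
half = raw_bitonal_bytes(300)
print(f"600 dpi: {full} bytes, 300 dpi: {half} bytes, "
      f"ratio {half / full:.2f}")
```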

Regards,
     Alex.
-- 
"Opportunity is missed by most people because it is dressed in overalls and
 looks like work."                                      -- Thomas A. Edison
