manuals in pdf (resolution, compression)

From: Paul Williams <paul_at_frixxon.co.uk>
Date: Sun Jun 27 02:45:03 2004

Tom Owad wrote:
>
> Here's the first three pages, a 152 KB pdf file:
> <www.applefritter.com/temp/sample.pdf>. Any reservations about the
> quality of the file, or anything else, before I start doing a whole
> bunch more just like this?

These scans are still the original 150dpi grayscale scans,
down-converted to 1bpp, aren't they? I haven't tried OCRing these pages,
but the bold text especially looks as if it would upset an OCR program.

The best technique is to scan at the *final* colour depth and
resolution, so I would suggest rescanning these pages at 400dpi, 1bpp. I
used to scan at 600dpi, but for text pages it doesn't produce noticeably
better results than 400dpi, and the files are significantly larger.
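
To put rough numbers on that size difference, here is a minimal sketch of
the uncompressed raster sizes. The 8.5 x 11 inch page and the function
name are assumptions for illustration only. Group 4 compressed sizes
depend on the page content and will not scale in exactly the same way.

```python
def raw_scan_bytes(dpi, width_in=8.5, height_in=11.0, bits_per_pixel=1):
    """Uncompressed size in bytes of one scanned page.

    Page dimensions default to an assumed 8.5 x 11 inch sheet.
    Each pixel row is padded up to a whole number of bytes.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Compare the two bilevel resolutions discussed above.
for dpi in (400, 600):
    print(dpi, "dpi, 1bpp:", raw_scan_bytes(dpi), "bytes")
```

The pixel count grows with the square of the resolution, so a 600dpi
scan carries about 2.25 times the raw data of a 400dpi scan.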

Al has already mentioned Eric Smith's tumble
<http://tumble.brouhaha.com>, which does a great job of concatenating a
bunch of TIFFs and JPEGs into one PDF file, converting the TIFFs to the
best Group 4 compression on the way. It will even allow you to add
bookmarks (with some limitations).

I used to use tumble, but now I am using the Perl PDF::API2 module,
which allows me to take a more hybrid (read: time-consuming!) approach
to each page. I now scan text pages at 400dpi, 1bpp and then rescan any
pages with photographs at 150dpi or 200dpi grayscale, saved as JPEG. I
then crop the JPEG to just the photograph and put that over the top of
the TIFF page (having removed the TIFF version of the photograph). That
reduces the overall page size, because the TIFF compression doesn't
have to cope with the inevitable dithering of the photograph.
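
The placement arithmetic for the overlay can be sketched as follows.
This is a Python sketch rather than the PDF::API2 code itself, and the
function name and example crop box are hypothetical. It assumes the
grayscale rescan covers the same page area as the TIFF, and it uses the
default PDF convention of 72 points per inch with the origin at the
bottom-left corner of the page.

```python
def crop_to_pdf_rect(crop_px, scan_dpi, page_height_in):
    """Map a crop box in scan pixels to a rectangle in PDF points.

    crop_px is (left, top, right, bottom) in pixels of the grayscale
    rescan, measured from the top-left corner of the page.
    Returns (x, y, width, height) in points, with y measured from the
    bottom edge of the page.
    """
    left, top, right, bottom = crop_px

    def to_points(pixels):
        # 72 points per inch, scan_dpi pixels per inch
        return pixels * 72.0 / scan_dpi

    x = to_points(left)
    width = to_points(right - left)
    height = to_points(bottom - top)
    # Flip the vertical axis: scans count down from the top,
    # PDF counts up from the bottom.
    y = page_height_in * 72.0 - to_points(bottom)
    return (x, y, width, height)


# Hypothetical photograph cropped from a 150dpi rescan of an 11 inch page
print(crop_to_pdf_rect((150, 300, 750, 900), scan_dpi=150, page_height_in=11.0))
```

Because the result is in points, the same rectangle lines up with the
400dpi TIFF page even though the two images have different resolutions.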

You can see an example of this approach in the ADM-3A Maintenance Manual
at <http://vt100.net/lsi/adm3a-mm/adm3a-mm.pdf> (124pp, 7.8 MiB). This
manual has a bookmarked Table of Contents, photographs and text on the
same page, and a colour schematic of the double-sided PCB at the back.

-- 
Paul
