Formats for Scanned Documentation (was: Disk Drive Documents)

From: Scott Ware <ware_at_xtal.pharm.nwu.edu>
Date: Tue Jun 8 12:45:06 1999

On Tue, 8 Jun 1999, Tony Duell wrote:

> I see... So 'Portable' document format is only portable to a few systems.
> Great...
>
> Incidentally, does this mean that the output of the Acrobat Reader is
> Level 2 Postscript (if it includes some kind of compressed image that
> needs a filter to decode it, it appears that that's the case)? In which
> case I couldn't print it anyway :-(

Unfortunately, that appears to be the case, even when the "Level 1 only"
option is set when creating the PostScript file from the PDF in Acrobat.
I only tried this using the Linux version of Acrobat Reader, so it could
be a platform- or version-specific problem.

After looking back at the output, the Level 2 features that are included
in the (supposedly) Level 1 PostScript seem to be causing the problems
with Ghostview that I had attributed to G4 TIFF. That's what I get for
trusting a check box...
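
A crude way to check a generated file for Level 2 constructs, rather than
trusting the check box, is to scan it for operators and filter names that
were introduced in PostScript Level 2. The Python sketch below is only a
heuristic: the token list is a short, non-exhaustive selection, the
function name is hypothetical, and text inside strings or embedded data can
produce false hits.

```python
import re

# Operators and filter names introduced in PostScript Level 2.
# This is a small illustrative selection, not a complete list.
LEVEL2_TOKENS = {
    "filter", "setpagedevice", "currentpagedevice", "setcolorspace",
    "/CCITTFaxDecode", "/LZWDecode", "/DCTDecode", "/RunLengthDecode",
    "/ASCII85Decode",
}

def find_level2_tokens(ps_text):
    """Return a sorted list of Level 2 tokens found in PostScript source.

    Heuristic only: comments are stripped at the first '%' on each line,
    so a '%' inside a string or in embedded data is treated as a comment.
    """
    found = set()
    for line in ps_text.splitlines():
        code = line.split("%", 1)[0]
        # Names are runs of letters and digits, optionally led by '/'.
        for token in re.findall(r"/?[A-Za-z0-9]+", code):
            if token in LEVEL2_TOKENS:
                found.add(token)
    return sorted(found)

sample = (
    "%!PS-Adobe-3.0\n"
    "/DeviceGray setcolorspace\n"
    "currentfile /CCITTFaxDecode filter\n"
)
print(find_level2_tokens(sample))
```

An empty result does not prove that a file is pure Level 1; it only means
none of the listed tokens turned up.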

> > CCITT G4 TIFF is itself a reasonable format for storing scanned
> > documentation.
>
> Is the format documented anywhere? How hard would it be to turn them into
> Portable BitMap files? (Yes, I know such files are _massive_, but then
> they're not designed for transmission, or even permanent storage).

Easy! The ImageMagick 'convert' utility can output to many formats,
including portable bitmaps.

% convert 8ech1-3.tif +adjoin pbm:page%03d.pbm

should do exactly what you want. The 2.2 MB TIFF file (containing 64
one-bit images of 1496 x 2391 pixels each) yields about 28 MB of portable
bitmap files. Tarring and gzipping these files results in a 3.3 MB .tar.gz
file, which isn't too bad. In this case, G4 TIFF might not have been worth
the trouble.
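
The 28 MB figure can be checked by hand. The sketch below assumes the
files are written as raw (binary) portable bitmaps, where each row is
padded to a whole number of bytes, and it ignores the few bytes of header
per file; the function name is made up for illustration.

```python
def raw_pbm_data_bytes(width, height):
    """Pixel-data size of a raw PBM: rows are padded to whole bytes."""
    row_bytes = (width + 7) // 8
    return row_bytes * height

per_page = raw_pbm_data_bytes(1496, 2391)
total = per_page * 64

print(per_page)  # 447117
print(total)     # 28615488
```

That is roughly 28.6 million bytes for the 64 pages, in line with the
figure above.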

G4 TIFF is covered in the TIFF version 6 documentation, and the
compression technique is described in more detail in CCITT Recommendation
T.6, which is mentioned in the TIFF documentation. G4 TIFF support is also
included in libtiff.
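
As a small illustration of how the TIFF 6.0 layout identifies G4 data, the
Python sketch below reads the Compression field (tag 259) from the first
image directory of a TIFF file; a value of 4 means CCITT Group 4. It uses
only the standard library rather than libtiff, looks at the first page
only, and the function name is hypothetical.

```python
import struct

COMPRESSION_TAG = 259  # "Compression" field in TIFF 6.0
CCITT_GROUP4 = 4       # Compression value for CCITT T.6 (Group 4)

def first_ifd_compression(data):
    """Return the Compression value from the first IFD, or None if absent."""
    # The first two bytes select little-endian (II) or big-endian (MM).
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file: bad byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file: bad magic number")
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        # Each directory entry is 12 bytes: tag, type, count, value/offset.
        start = ifd_offset + 2 + 12 * i
        tag, field_type, n = struct.unpack(order + "HHI", data[start:start + 8])
        if tag == COMPRESSION_TAG:
            # A single SHORT sits left-justified in the 4-byte value field.
            (value,) = struct.unpack(order + "H", data[start + 8:start + 10])
            return value
    return None

# Minimal hand-built little-endian TIFF with one directory entry.
tiny = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHIHH", COMPRESSION_TAG, 3, 1, CCITT_GROUP4, 0)
    + struct.pack("<I", 0)
)
print(first_ifd_compression(tiny) == CCITT_GROUP4)  # True
```

For a real file, the bytes would come from something like
open("8ech1-3.tif", "rb").read().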

In any case, I'm thankful for the efforts of list members who have put
documentation online, regardless of the format. Having the documents in
an open, documented format would be great, but a little bit of format
fiddling still beats "does anyone have documentation for foo?" and hours
in front of a photocopier.

--
Scott Ware                       ware_at_xtal.pharm.nwu.edu