Archiving docs

From: Shawn T. Rutledge <ecloud_at_bigfoot.com>
Date: Wed Apr 1 00:27:43 1998

> >in JPEG format. I'm trying to save as much information as I can, so

JPEG is a bad idea for anything other than photos. It works best with subtle
color variations, and tends to make a mess of sharp details.

> Actually, 300 is the max without interpolation. Stuff that's truly B&W
> should be scanned in B&W mode. The problem is that grayscale picks up
> variations in print strength, smudges, etc. B&W says "This dot is more
> than x dark, it's black. This dot is less than x dark, it's white."

That's good advice; however, I think in theory the results are equivalent if
you first scan in grayscale, then use Photoshop to convert to B&W with the
default threshold. But Photoshop also lets you adjust the threshold when
converting to B&W, rather than leaving it up to the scanner, so you can tweak
it to better compensate for the effects of old yellowed paper.
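
To make the thresholding idea concrete, here is a rough sketch in plain Python
(not Photoshop's actual algorithm). The function name, the cutoff of 128, and
the sample pixel values are all made up for illustration; the point is only
that a grayscale scan lets you pick the cutoff after the fact.

```python
def threshold(gray, cutoff=128):
    """Map each 0-255 grayscale pixel to 0 (black) or 1 (white).

    The default cutoff of 128 is just the midpoint of the range,
    assumed here for illustration.
    """
    return [[1 if px >= cutoff else 0 for px in row] for row in gray]

# A hypothetical 2x4 scan: dark ink (~40), yellowed paper (~150),
# and one smudge (~110) that is darker than the paper but lighter than ink.
scan = [
    [40, 150, 150, 110],
    [150, 40, 150, 150],
]

default = threshold(scan)             # the smudge at 110 falls below 128: black
tweaked = threshold(scan, cutoff=90)  # lower cutoff: the smudge turns white

print(default)
print(tweaked)
```

Lowering the cutoff is the kind of tweak that helps with darkened paper: the
ink stays black while the smudge and the paper both end up white.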

B&W images also compress astonishingly well with GIF, which is lossless and
thus preserves sharp details well. The good compression is due to the large
amount of white space in typical schematics and drawings.
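
A toy illustration of why white space helps: GIF compresses with LZW, which
replaces repeated pixel sequences with dictionary codes. The sketch below is a
bare-bones LZW that only counts how many codes it would emit; it leaves out the
clear codes, variable code widths, and table size limit that a real GIF encoder
has. The 64 x 64 test images and all the names are invented for the comparison.

```python
import random

def lzw_code_count(data):
    """Return how many codes a plain LZW encoder would emit for `data` (bytes)."""
    # Start with one table entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    current = b""
    count = 0
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match while the sequence is already known.
            current = candidate
        else:
            # Emit the code for the longest known match, learn the new sequence.
            count += 1
            table[candidate] = len(table)
            current = bytes([byte])
    if current:
        count += 1  # flush the final match
    return count

WIDTH, HEIGHT = 64, 64
WHITE, BLACK = 1, 0

# A made-up "drawing": a white page with two black horizontal lines.
rows = [[WHITE] * WIDTH for _ in range(HEIGHT)]
for y in (10, 40):
    rows[y] = [BLACK] * WIDTH
drawing = bytes(px for row in rows for px in row)

# The same number of pixels, but random black/white speckle.
random.seed(0)
noise = bytes(random.choice((WHITE, BLACK)) for _ in range(WIDTH * HEIGHT))

drawing_codes = lzw_code_count(drawing)
noise_codes = lzw_code_count(noise)
print("drawing:", drawing_codes, "codes; noise:", noise_codes, "codes")
```

Long runs of identical pixels let each new code cover a longer stretch than the
one before, so the clean drawing needs far fewer codes than the speckled image
of the same size.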

My total output of scanned manuals has so far been two :-); it really is
time-consuming when you try to touch up the pictures to look really great and
fix all the OCR errors and stuff. The manuals I did (which aren't
computer-related) are at
http://www.goodnet.com/~ecloud/hamradio.html. I spent way too much time on
the pictures, but as an example, at the top of
http://www.goodnet.com/~ecloud/plboard/ss32.html is a schematic which is 539 x
269 pixels (a nearly ideal width for a typical browser window) and only 5K in
size! That's what GIF can do for you if the drawings are clean enough.

In the end, unless you have a lot of time to spend, you might be best off
getting Acrobat and using its scan-to-PDF feature (which I believe comes
with the standard payware product now). Supposedly, it tries to OCR and
spellcheck the text, but if the OCR algorithm is sufficiently unsure about
the identity of a word, it just leaves the scanned bitmap in there instead.
You can manually fix those boo-boos later, but in the meantime you have
something that looks really close to the original document. I haven't tried
it; that's just what I have heard. If it's as good as advertised, then PDF
has indeed found its niche (I don't believe in it for many other purposes;
HTML is more flexible and much better for use on the Internet whenever you
have a digital source to begin with).

-- 
  _______                 KB7PWD _at_ KC7Y.AZ.US.NOAM   ecloud_at_bigfoot.com
 (_  | |_)  Shawn T. Rutledge            http://www.bigfoot.com/~ecloud
 __) | | \_____________________________________________________________
Received on Wed Apr 01 1998 - 00:27:43 BST
