OCR'ing old manuals

From: Jules Richardson <julesrichardsonuk_at_yahoo.co.uk>
Date: Sun Sep 14 06:50:00 2003

--- Eric Smith <eric_at_brouhaha.com> wrote:
> Note that you should NEVER save scans of text and line art in a lossy
> form such as JPEG.

Absolutely - I tend to save everything in TIFF format, at as high a
resolution as seems practical (occasionally I'll use Paint Shop Pro's native
format if I need layer support). I'm not a big fan of JPEG images...

I'm hoping to get away with scanning things at 300dpi (in this case it's all
printed documentation with a few diagrams, rather than colour images). Not only
will that save space, but I can also use my older scanner (which I believe
won't do more than 300dpi, but is at least SCSI and so should transfer data to
the host a little quicker).

I'm a little wary of saving things in mono, as somebody else suggested - I'm
sure that could have a negative effect on the OCR process at a later date.
Greyscale (8 bit) should be fine though, I expect.
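
As a rough sketch of the space trade-off between resolution and bit depth, the
arithmetic can be worked through in a few lines of Python. The A4 page size and
the uncompressed storage are assumptions made for illustration (neither is
stated above), and raw_scan_bytes is a hypothetical helper name; real TIFF
files with lossless compression will usually come out smaller.

```python
import math

# Assumption: A4 page, 8.27 x 11.69 inches; no page size is given in the post.
A4_INCHES = (8.27, 11.69)


def raw_scan_bytes(dpi, bits_per_pixel, page_inches=A4_INCHES):
    """Return the uncompressed image size in bytes for one scanned page."""
    width_px = round(page_inches[0] * dpi)
    height_px = round(page_inches[1] * dpi)
    # Each pixel row is padded up to a whole number of bytes.
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


if __name__ == "__main__":
    for dpi in (300, 600):
        for label, bpp in (("mono", 1), ("greyscale", 8)):
            size_mb = raw_scan_bytes(dpi, bpp) / 1_000_000
            print(f"{dpi} dpi {label:9s}: {size_mb:6.1f} MB per page")
```

Under those assumptions an 8-bit greyscale page is roughly 8.7 MB at 300dpi and
four times that at 600dpi, while a mono page is about an eighth of the
greyscale size at the same resolution.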

> I've written a program to take B&W TIFF files and color or B&W JPEG
> files and produce a PDF file:
> http://tumble.brouhaha.com/

I'll have a look at that - might come in handy. :-) Image
processing/manipulation I do find pretty interesting...



Backward conditioning: putting saliva in a dog's mouth in an attempt to make a bell ring.

Received on Sun Sep 14 2003 - 06:50:00 BST
