20,046 page doc archive still available

From: Paul Koning <pkoning_at_equallogic.com>
Date: Thu Sep 9 09:54:56 2004

>>>>> "ed" == ed sharpe <esharpe_at_uswest.net> writes:

 ed> Paul, who made the engine for the plugin? IRIS, or someone else?

I have no idea. I just grabbed it when I saw it and put it to work.

The biggest issue is that cleanup is a real pain. Acrobat lets you
"edit" the text, but only line by line. Yuck. And the OCR does a
fair job of picking up changes in font, but it does get it wrong some
of the time, so you can end up with a fairly ugly mix of Courier and
Times. If the goal is to make searchable text and a smaller file,
that isn't a big deal. If the goal is to make a clean document,
that's different.

I did two projects: a 400 or so page A-10 flight manual with goal #1
(searchable text, smaller file), and the Ethernet standard, 90 or so
pages, with goal #2 (a clean document). That second one was quite a
lot of work. It's arguable whether it was worth the trouble.
Unfortunately, I did not find Al Kossow's archive until after I was
finished...


 ----- Original Message ----- From: "Paul Koning" <pkoning_at_equallogic.com> To:
>> I have had good success with Adobe's OCR plugin for Acrobat --
>> free for the download with a 50 page at a time limit. (It will do
>> bigger docs, in 50 page pieces.) It worked well enough to produce
>> useful output from a manual full of pictures (a flight manual).

Received on Thu Sep 09 2004 - 09:54:56 BST