Manuals to PDF advice sought

From: Gunther Schadow
Date: Wed Jun 13 21:19:57 2001

Jeff Hellige wrote:

> >Re: OCR -- Trouble I've had is (and this is just pickiness, if the
> >actual info's all you care about then it's no prob) you invariably
> >lose the font and other aspects of the original appearance of the
> >document, which is a bummer. I converted a PDF of Sun Remarketing's
> >Lisa DIY guide into HTML with images because I wanted search engines
> >to be able to index the content.
> The ability to search the PDF would be nice, but I think the
> amount of work required to do the OCR and then do all the formatting
> and such would outweigh that benefit, though the OCR'd PDFs tend to
> be smaller as well. I'd prefer to keep the original layout, fonts
> and all, though.

One consolation: if you have the right software, you can produce a
PDF with two layers, a searchable OCRed text layer and a viewable pixel
(image) layer. You view the pixels and search the OCRed text, so the
original layout and fonts stay intact. I have been told it works nicely.
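
For anyone curious how the two layers can sit on one page, here is a
rough sketch in Python, not the method of any particular product. It
writes a one-page PDF by hand: the scanned pixels go in as an image, and
the OCR words are drawn on top in PDF text rendering mode 3 (neither
filled nor stroked, so invisible but still searchable). Everything named
here is an assumption for illustration: the function name
build_sandwich_pdf, the (text, x, y, size) word list standing in for real
OCR output, the one-pixel-per-point page size, and the Helvetica font.

```python
import zlib


def build_sandwich_pdf(pixels, width, height, words):
    """Return the bytes of a one-page PDF: visible scan plus invisible text.

    pixels: width*height bytes, 8-bit grayscale, top row first
    words:  list of (text, x, y, font_size) in PDF points, origin bottom-left
            (hypothetical stand-in for whatever an OCR engine reports)
    """
    # Assumption: one pixel maps to one point, so the page is width x height.
    page_w, page_h = width, height

    # Layer 1: paint the image scaled to the full page.
    # Layer 2: text in render mode 3 ("3 Tr"), i.e. invisible.
    ops = [f"q {page_w} 0 0 {page_h} 0 0 cm /Im0 Do Q", "BT 3 Tr"]
    for text, x, y, size in words:
        # Escape the characters that are special inside a PDF string.
        esc = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        ops.append(f"/F1 {size} Tf 1 0 0 1 {x} {y} Tm ({esc}) Tj")
    ops.append("ET")
    # Assumption: the OCR text fits in Latin-1; other text needs a real font setup.
    content = "\n".join(ops).encode("latin-1")

    image = zlib.compress(pixels)  # stored with /FlateDecode

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {page_w} {page_h}] "
         "/Contents 4 0 R /Resources << /XObject << /Im0 5 0 R >> "
         "/Font << /F1 6 0 R >> >> >>").encode(),
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        (f"<< /Type /XObject /Subtype /Image /Width {width} /Height {height} "
         "/ColorSpace /DeviceGray /BitsPerComponent 8 /Filter /FlateDecode "
         f"/Length {len(image)} >>\nstream\n").encode() + image + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    # Serialize the objects and record each byte offset for the xref table.
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"          # each xref entry is 20 bytes
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Calling build_sandwich_pdf(scan_bytes, w, h, [("Lisa DIY guide", 20, 50, 12)])
and writing the result to a file should give a page that looks like the
scan, while a viewer's text search lands on the hidden words. The
positions and sizes would have to come from the OCR engine for the
search highlight to line up with the printed text.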


Gunther Schadow, M.D., Ph.D.          
Medical Information Scientist      Regenstrief Institute for Health Care
Adjunct Assistant Professor        Indiana University School of Medicine
Received on Wed Jun 13 2001 - 21:19:57 BST
