20,046 page doc archive still available

From: Jay West <jwest_at_classiccmp.org>
Date: Fri Sep 3 21:21:10 2004

John, I also would be happy to host, or at least mirror, the archive you
speak of on the classiccmp server. I know there's plenty of space.


Jay West

----- Original Message -----
From: "John Foust" <jfoust_at_threedee.com>
To: <cctalk_at_classiccmp.org>
Sent: Friday, September 03, 2004 8:01 PM
Subject: 20,046 page doc archive still available

> Back in March 2001 I posted about a cache of 20,046 pages
> of scanned docs I received from someone on the net.
> See the TOC below, followed by his explanation of how
> he did it.
>
> It consumed several CD-Rs, compressed. I now have a DVD burner
> as well, so I'd be glad to make copies on new or old media.
>
> (It is actually all available on a hidden web page that
> I disclose if someone sends me a pointed email, but I'd
> hate to stress my little T-1.)
>
> Anyone care to upgrade it to OCR'd PDF or whatever would be
> considered a next-best method of preservation and search-ability?
> I know it's possible with a handful of Linux, but my to-do list
> is already too long.
>
> - John
>
> I've made contact with a guy who's scanned 20,046 pages of the
> docs listed below, at 300 to 400 DPI. He first told me about the
> UCSD p-System docs he'd scanned. Below the list is his description
> of the process he followed.
>
> I'm planning to get a copy of what he has and burn it to CD-R.
>
> Does anyone else have an interest in these docs, or have any
> ideas about distribution without massive copyright violation?
>
> - John
>
> 6502
> MOS 6502 datasheet
> 6502 Assembly Language Subroutines (Leventhal)
> AMD 29000 Memory Design Handbook
> Am29027 Arithmetic Accelerator
> Am29C327 Floating Point Processor
> Data General
> C Language Reference Manual
> GATE User's Manual
> AOS/VS Internals Manual
> AOS/VS Programmer's Manual, volume 1
> AOS/VS System Calls Dictionary
> CEO User's Manual
> Eclipse 32-bit Principles of Operation
> Eclipse 32-bit System Functional Characteristics
> Fortran-77 Environment Manual
> Fortran-77 Reference Manual
> Fairchild
> Clipper User's Manual
> RISC System Programmer's Guide
> R3000 Assembly Language Programmer's Guide
> R3000 Hardware User Manuals
> R3000 Language Programmer's Guide
> High-speed CMOS databook
> Motorola
> 68000 Family Reference
> 68020 User's Manual
> 68851 User's Manual
> 88100 User's Manual
> 88200 User's Manual
> Linear Interface Integrated Circuits
> 53C90A/B Advanced SCSI Controller (2 different manuals)
> 53C94/5/6 databook
> 53CF94/96-2 Fast SCSI Controller
> Disk Array Controller Firmware
> Disk Array Controller Hardware
> Disk Array Controller Software
> Floppy Disk Controller (SCSI-to-FD)
> National Semiconductor
> NS32532 Datasheet
> Series 32000 Programmer's Reference Manual
> DP8490 Enhanced Asynchronous SCSI Interface
> NS32CG16 Programmer's Reference Supplement
> Graphics Handbook
> Series 32000 Databook
> DRAM Management databook
> Embedded Controller Databook
> Ohio Scientific
> C4P User's Manual (2 different manuals)
> 65V Programmer's manual
> Schematics for:
> 502 CPU board
> 505 CPU board
> 527 24K memory board
> 540 Video board
> 542 Polled Keyboard
> Pinnacle Systems
> 2 User's manuals for their 68k machine (My P-system machine)
> P-system manuals IV.12
> Operating System Reference
> Program Development Reference
> Application Development Guide
> Fortran 77 Reference
> Assembler Reference
> Weitek
> WTL4167 Floating-Point Coprocessor datasheet
> Most of these are from about 1988 to 1992, with the exception of the OSI
> documentation, of course, which is from 1979.
> ---
> > What sort of process did you follow? What sort of devices?
> As far as the process, I scanned a manual in and checked to make sure
> all the pages were there. If they weren't, I'd scan the pages that
> didn't make it, and go through all the pages again. I'll admit this is a
> little anal, but better safe than sorry. (When you're using a lot of
> shell scripts, you never know if you accidentally deleted a page with an
> "mv" command.) When all the pages were there, I'd go through the manual
> one more time to check for general quality (no folded corners, no torn
> pages, etc.) If all was good, the manual would be moved to the directory
> that would be the root directory of my CD-ROM. That's pretty much it.
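
The "make sure all the pages were there" step described above could be automated along these lines. This is only a sketch, not the scanner's actual script: the file naming scheme (`page-0001.tif`) and the function names `scanned_pages` and `missing_pages` are assumptions made for illustration, and the expected page count has to be supplied by hand from the printed manual.

```python
import re

# Assumed (hypothetical) naming scheme: one file per page, e.g. "page-0001.tif".
PAGE_RE = re.compile(r"page-(\d+)\.tif$", re.IGNORECASE)


def scanned_pages(names):
    """Return the set of page numbers found in an iterable of file names."""
    pages = set()
    for name in names:
        match = PAGE_RE.search(name)
        if match:
            pages.add(int(match.group(1)))
    return pages


def missing_pages(names, expected_count):
    """Return the sorted page numbers in 1..expected_count that have no file."""
    have = scanned_pages(names)
    return [n for n in range(1, expected_count + 1) if n not in have]
```

Feeding it the output of `os.listdir()` for one manual's directory, plus the manual's page count, gives the list of pages to rescan. An empty list means the manual is complete and can go on to the visual quality pass.
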
> The big manuals of more than 1000 pages really sucked, because I'd
> generally have to make 3 or more passes to get those completely correct.
> If I was going to do it again, I'd probably break the larger manuals
> into smaller chunks to avoid this problem.
> One thing that made the whole process a lot easier was the netpbm
> utilities. I wrote a script to convert the manuals from ~2500x3300 TIFs
> to ~500x600 GIFs. My machine takes about 2 seconds to process a 300-400
> DPI TIF, but only a fraction of a second for a 75 DPI GIF. I'd run my
> script, then do something else for a while. When it was done, I could
> flip through the GIFs with GQview and inspect about 2-4 pages per
> second. That saved a lot of time.
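
The original conversion script is not included in the post, so the following is only a guess at its general shape. It builds a netpbm pipeline that shrinks one full-resolution TIFF page into a small GIF for quick review. The function name `preview_command`, the directory layout and the 500x600 bounding box are assumptions. The tool names (`tifftopnm`, `pnmscale`, `ppmquant`, `ppmtogif`) are the classic netpbm programs; newer netpbm releases favor `pamscale`, `pnmquant` and `pamtogif` instead.

```python
import shlex
from pathlib import PurePosixPath


def preview_command(tif_path, out_dir, box=(500, 600)):
    """Build a shell pipeline that turns one TIFF page into a small GIF.

    The image is scaled to fit inside `box` (width, height) with its aspect
    ratio preserved, then reduced to at most 256 colors, the GIF limit.
    """
    src = PurePosixPath(tif_path)
    dst = PurePosixPath(out_dir) / (src.stem + ".gif")
    width, height = box
    return (
        f"tifftopnm {shlex.quote(str(src))}"
        f" | pnmscale -xysize {width} {height}"
        f" | ppmquant 256"
        f" | ppmtogif > {shlex.quote(str(dst))}"
    )
```

Each returned string can be passed to `subprocess.run(cmd, shell=True, check=True)` in a loop over a manual's pages, which matches the "run it, then do something else for a while" workflow described above.
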
> I assume that, by "devices", you mean what type of scanners I used. I
> started with an HP 6350cse (with ADF) that I bought for this very
> purpose. However, having never owned a scanner before, I was a little
> disappointed with how slow the "fast" scanners are. Fortunately, imaging
> is an integral part of the software my company sells and, as luck would
> have it, we were demoing a new scanner from Fujitsu. This thing
> literally does 60 pages/min at 300 dpi - *both* sides. It's about half
> that fast at 400 dpi, which I had to use for the IC databooks to get the
> fine print. Needless to say, I did most of my scanning on that.
> By the way, to date, I've processed 20046 pages. I'm kinda burned out,
> though, so it'll be a while before I do any more.
Received on Fri Sep 03 2004 - 21:21:10 BST
