Let's develop an open-source media archive standard

From: Antonio Carlini <a.carlini_at_ntlworld.com>
Date: Thu Aug 12 15:45:47 2004

> Part of the problem here is that if the file containing
> the archive had any bit rot, most systems designed to
> read that media would simply fail to read the information.
> In many cases, it might not even be possible to read the
> CRC after the error occurred.

My reason for mandating a CRC would be that otherwise
you have no way of knowing that an error has occurred.
An unreliable archive format is a terrible thing to
foist on future generations: they should be able to
know whether the data has been correctly preserved
or not. Without a CRC (or similar mechanism) you have
no guarantee of that. One of the advantages of storing
zip (or rar or whatever) archives on CD is that if
you can copy it off and unzip it, you can reasonably
expect the end result to be an accurate reproduction of
the original.
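
To make the detection point concrete, here is a minimal
sketch in Python of a record that carries its own CRC. The
function names (wrap, verify) and the layout (a 4-byte
big-endian CRC-32 appended to the payload) are assumptions
made for illustration only, not part of any proposed format.

```python
import zlib


def wrap(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 to the payload (assumed layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def verify(record: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the payload."""
    if len(record) < 4:
        return False
    payload = record[:-4]
    stored = int.from_bytes(record[-4:], "big")
    return zlib.crc32(payload) == stored


record = wrap(b"disk image contents")

# Flip a single bit to simulate bit rot in the stored copy.
damaged = bytearray(record)
damaged[3] ^= 0x01
damaged = bytes(damaged)
```

Here verify(record) succeeds and verify(damaged) fails, so
the reader at least knows the copy is not a faithful one,
even though the CRC alone cannot say which bit changed.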

> For error correction, one must also realize that one
> can only correct a single burst of errors that is
> smaller than a size specified by the correction method.

Error correction is just an added bonus and may or may
not be worth the additional effort. Perhaps it could be
offered as optional additional redundancy (OpenVMS BACKUP
will write a parity block of data for every N blocks of
real data; PAR implements "RAID for newsgroups").
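
As a rough illustration of the "one parity block per N data
blocks" idea, here is a sketch in Python using plain XOR
parity. This is not the actual on-media format of OpenVMS
BACKUP or PAR; the function names and the fixed-size blocks
are assumptions for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def add_parity(blocks):
    """Return the N data blocks followed by one XOR parity block."""
    return list(blocks) + [xor_blocks(blocks)]


def recover(group, missing_index):
    """Rebuild the single block at missing_index from the other blocks."""
    survivors = [b for i, b in enumerate(group) if i != missing_index]
    return xor_blocks(survivors)


data = [b"AAAA", b"BBBB", b"CCCC"]
group = add_parity(data)

# Pretend block 1 was unreadable and rebuild it from the rest.
rebuilt = recover(group, 1)
```

Note that this only works if you already know which block
is bad (from a read error or a failed per-block check), and
only one block per group can be rebuilt this way.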

None of these schemes prevents you from fiddling with
the data and using your additional expert knowledge
to "fix up" a broken data stream (as long as
you are willing to go the extra mile and have the
knowledge to do so, as in the case you relate).

However, if you read the archive and you get a
bunch of bits out and no error, how do you even
know to look? (Unless, again as in your case, the
initial data encoding is such that certain "impossible"
conditions might occur and give you a clue.)

That's why an error detection mechanism of some
sort is essential.


Antonio Carlini arcarlini_at_iee.org
Received on Thu Aug 12 2004 - 15:45:47 BST
