Let's develop an open-source media archive standard

From: Hans Franke <Hans.Franke_at_siemens.com>
Date: Thu Aug 12 10:22:52 2004

On 12 Aug 2004 10:33, Cini, Richard wrote:
> How would one actually go about re-generating an original media from
> the metafile? Do we contemplate connecting some future computer's I/O port
> to a 34-pin ribbon cable connected to a 1980's vintage floppy drive? At some
> point in this process we're going to have to make some detailed assumptions
> on how the metadata will be used 50 or 100 years from now.

> Also, the metafile not only has to include information about the
> "user" data areas of the disk but also the system areas (the stuff written
> to the media by the controller -- address marks, gaps, sync bits, etc.).

> This would require us to not only compile the general media format
> data but also data on the controller used to generate the media (chip specs,
> data gleaned from examining the "format" programs used, etc.).

> The reason why I ask is that somehow we're going to have to test the
> archival/restoration process to see that it works. It's like making tape
> backups but never testing them with a restore.

> This might be obvious, but I've been accused of stating that before
> :-)


Well, Rich, that's part of the tool chain to be used.

As I see it, the data format is nothing more than the
lingua franca that connects the various tools. After all, a
format won't do any work; programs do.

In my little definition there are 3 kinds of programs:

- Readers
A program that reads the data from the original media
and generates an XML representation according to the spec
(to be stored somewhere, transferred, whatever, eventually
in a modern environment)

- (Meta-)Handlers
A program that archives, catalogues, manages or does
whatever with the XML representation.

- Writers
A program that gets the XML representation as input
and writes it out to the original media.

Typically, readers and writers would be small programs
running in the original environment. Also specific to
readers and writers is that these programs must have an
understanding of the media format they operate on.

A program that writes the data gathered from the XML
representation into a file format understood by an
emulator would also fit the writer category.
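
To make the reader/writer split concrete, here is a minimal Python
sketch of a round trip. Nothing in it comes from an agreed spec: the
element names (media, sector), the attributes (format, index), the
hex encoding of the payload and the 256-byte sector size are all
assumptions made up for illustration. The "writer" here is the
emulator-image kind, producing a flat byte image instead of driving
real hardware.

```python
import xml.etree.ElementTree as ET

SECTOR_SIZE = 256  # assumed sector size, for illustration only


def read_media(raw: bytes, media_format: str) -> str:
    """Reader: wrap raw sector data in a hypothetical XML representation."""
    root = ET.Element("media", {"format": media_format})
    for offset in range(0, len(raw), SECTOR_SIZE):
        sector = ET.SubElement(
            root, "sector", {"index": str(offset // SECTOR_SIZE)}
        )
        # Payload is stored as a hex string (an assumption, not a spec).
        sector.text = raw[offset:offset + SECTOR_SIZE].hex()
    return ET.tostring(root, encoding="unicode")


def write_image(xml_text: str) -> bytes:
    """Writer: turn the XML representation back into a flat image."""
    root = ET.fromstring(xml_text)
    # Sort by index so the output does not depend on element order.
    sectors = sorted(root.findall("sector"), key=lambda s: int(s.get("index")))
    return b"".join(bytes.fromhex(s.text or "") for s in sectors)
```

Feeding the output of read_media() into write_image() should give
back the original bytes, which is also the kind of restore test
Rich is asking for.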

Programs that manipulate the XML representation and/or
the data within (handlers) come in a wide variety and
include tools that

- convert between various data formats (e.g. IntelHex <->
  Motorola format)
- extract data from the representation for use
  with other tools - e.g. extract a ROM image to
  plain IntelHex so it can be downloaded to a PROMer.
  Or maybe a tool that pulls one medium out of a configuration
- restructure the XML representation - e.g. if the
  media is split up into several files or made up of
  linked sections, and a certain (simple) writer is only able
  to handle a single file with a linear representation,
  such a tool would pull all the external references
  into one file, and all data directly into the blocks
  where they occur.
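
As an illustration of the "extract a ROM image to plain IntelHex"
handler, here is a small Python sketch. The Intel HEX record layout
(byte count, 16-bit address, record type, data, two's-complement
checksum) is the standard one; the XML side is purely hypothetical:
a rom element with a base attribute and a hex-string payload is
an assumption, not part of any spec.

```python
import xml.etree.ElementTree as ET


def intel_hex_record(address: int, record_type: int, data: bytes) -> str:
    """Build one Intel HEX record: ':' + count, address, type, data, checksum."""
    body = bytes(
        [len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]
    ) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def rom_to_intel_hex(xml_text: str, bytes_per_record: int = 16) -> str:
    """Handler: extract the first (hypothetical) <rom> element as Intel HEX.

    Only 16-bit addresses are handled; images beyond 0xFFFF would need
    extended address records, which this sketch does not emit.
    """
    root = ET.fromstring(xml_text)
    rom = next(root.iter("rom"), None)
    if rom is None:
        raise ValueError("no <rom> element found")
    base = int(rom.get("base", "0"), 0)  # accepts "0x0030" or "48"
    data = bytes.fromhex("".join((rom.text or "").split()))
    lines = []
    for offset in range(0, len(data), bytes_per_record):
        chunk = data[offset:offset + bytes_per_record]
        lines.append(intel_hex_record(base + offset, 0x00, chunk))
    lines.append(intel_hex_record(0, 0x01, b""))  # end-of-file record
    return "\n".join(lines) + "\n"
```

The point is that such a tool needs no knowledge of the machine the
ROM came from; it only has to understand the XML representation.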

For the record, I have no problem if each of these
programs becomes a separate class and we end up with
a hundred class names :). In fact, emulators that directly
read the XML representation are already a hard case to
fit in - personally I see them as Handlers, since they don't
_write_ the XML representation, but rather interpret and
handle it.

The important part about XML is this tool chain idea.
Let's take just the (P)ROM example. Except for the reader,
which is specific to a certain computer model, all other
tools can be used to produce a new PROM for any machine,
no matter what PROM is to be made.

Usually, when I give a lecture about such things, someone comes
up with the quite clever idea that this can be done with
any format as long as it's the one used by all tools, and
he's right. There is in principle no difference between some
cleverly made-up XML tags which may have meaning to a geek,
and, let's say, a TIFF-like structure where we go
and calculate all Friday the 13ths relative to Jan 1st 1900
and use that number as the ID.

Well, except for the slightly fuzzy advantage that we don't
have to fiddle around with all the meta structure, since
that's already done in the XML definition, and that it's open
for extensions. And most important: with a proprietary standard,
the chances that we can use some third-party tool not specially
made to cover our format are ZERO, while with XML they
are greater than zero. That's the difference... even if
it's only a little command-line tool for XPath expressions,
which already works as a query tool for our archive. A simple
line such as

xpt file:*/*@FORMAT[T6250]

lists all files in the current directory that have
a FORMAT tag of T6250 (a tape format) somewhere.
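
The same kind of query can be sketched with nothing but Python's
standard library, which is the "third-party tool not made for our
format" argument in practice. The sketch below is not the xpt tool;
it is an assumed stand-in that accepts FORMAT either as an attribute
or as a child element, since the spec for that is not fixed here.
The function names are made up for illustration.

```python
import glob
import xml.etree.ElementTree as ET


def has_format(xml_text: str, wanted: str) -> bool:
    """Return True if any element carries FORMAT == wanted,
    either as an attribute or as a <FORMAT> element's text."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.get("FORMAT") == wanted:
            return True
        if element.tag == "FORMAT" and (element.text or "").strip() == wanted:
            return True
    return False


def find_files(pattern: str, wanted: str) -> list:
    """List files matching the glob pattern whose XML mentions the format."""
    matches = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, encoding="utf-8") as handle:
                if has_format(handle.read(), wanted):
                    matches.append(path)
        except (ET.ParseError, OSError, UnicodeDecodeError):
            continue  # skip unreadable or non-XML files
    return matches
```

Calling find_files("*.xml", "T6250") would then play the role of the
one-liner above for the current directory.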

Really, I love XML
(after I got rid of some of the unnecessary free-floating
bells and whistles :)

VCF Europa 6.0 on 30 April and 1 May 2005 in Munich
Received on Thu Aug 12 2004 - 10:22:52 BST
