Let's develop an open-source media archive standard

From: Jules Richardson <julesrichardsonuk_at_yahoo.co.uk>
Date: Thu Aug 12 05:32:44 2004

On Thu, 2004-08-12 at 00:41, Vintage Computer Festival wrote:
> On Wed, 11 Aug 2004, Sean 'Captain Napalm' Conner wrote:
>
> > It was thus said that the Great Vintage Computer Festival once stated:
> > >
> > > > XML is a more "current" technology but I was trying to keep with the
> > > > platform neutrality by sticking to text-only and not assuming the use of any
> > > > other technology like XML.
> > >
> > > XML is platform neutral because it's basically ASCII, right?
> >
> > Nope. XML files can be represented in multiple character sets, possibly
> > including (but certainly not limited to):
> >
> <snip!>
> >
> > Best decide this now.
>
> Ok, I choose US-ASCII. This will be up for debate I'm sure, but surely
> US-ASCII is the most widely deployed character set in the world currently?

*If* we've decided that it's sensible to use XML for this over some
other mechanism, then does it matter? I thought that to be compliant with the
XML spec, the XML document should say what version of the spec and what
character encoding it uses?

In other words, who cares what charset is used - people can use whatever
charset makes sense for them. It just needs to be spelled out that it's
mandatory to say which charset is in use for the archive to be valid.
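
A minimal sketch of that rule in Python, assuming a hypothetical helper
named declared_encoding() and an archive that starts with an XML
declaration. XML itself allows the encoding declaration to be omitted
(the parser then assumes UTF-8 or UTF-16), so this check is stricter
than the XML spec and only illustrates the "must state the charset"
requirement suggested here. It also only handles ASCII-compatible
encodings with no byte order mark; UTF-16 input would need extra work.

```python
import re

# Matches an XML declaration at the very start of the document, e.g.
#   <?xml version="1.0" encoding="ISO-8859-1"?>
# The encoding part is optional in the pattern so that a declaration
# without one can be detected and rejected.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*(["\'])(?P<version>[0-9.]+)\1'
    rb'(?:\s+encoding\s*=\s*(["\'])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)\3)?'
)


def declared_encoding(raw: bytes):
    """Return the encoding named in the XML declaration, or None."""
    match = _XML_DECL.match(raw)
    if match is None or match.group("encoding") is None:
        return None
    return match.group("encoding").decode("ascii")


def states_its_charset(raw: bytes) -> bool:
    """Hypothetical archive rule: the charset must be spelled out."""
    return declared_encoding(raw) is not None
```

With this, an archive beginning <?xml version="1.0" encoding="Shift_JIS"?>
passes, while one that starts with <?xml version="1.0"?> or has no
declaration at all is rejected.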

Someone in Japan, say, may well want to fill in data fields in the
archive (such as description) using their native language. We shouldn't
stop them from doing this and force them to use a single-byte character
set such as ASCII.
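
To make the limitation concrete, here is a small Python sketch. The
description string and the can_encode() helper are made up for
illustration; the codec names are the ones Python's standard library
provides.

```python
# Hypothetical description field, meaning "Japanese description".
description = "日本語の説明"


def can_encode(text: str, charset: str) -> bool:
    """Report whether every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# ASCII has no code points for these characters, while encodings
# designed for Japanese text (or Unicode) can represent them.
for charset in ("ascii", "shift_jis", "euc_jp", "utf-8"):
    print(charset, can_encode(description, charset))
```

Only the "ascii" line reports False, which is why mandating US-ASCII
would push such an author into writing the field in English.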

I'd rather future generations stumble across an archive in Japanese and
have to translate it at that point than have someone in Japan who isn't
too good at English be forced to fill in data in English and end up with
ambiguous information.


cheers

Jules
Received on Thu Aug 12 2004 - 05:32:44 BST
