Files as files

From: Pete Turnbull <pete_at_dunnington.u-net.com>
Date: Wed Jun 10 21:49:34 1998

On Jun 10, 14:56, Max Eskin wrote:
> How do UNIX files work? Is there a header of some sort?

Not really. Certainly not consistently across all file types. Command
scripts often have a comment at the top, and some versions of Unix (e.g.
Irix) embed a "tag" number into executables so they can distinguish
individual programs/versions quickly, but other than that, file type
determination is done by looking at various parts of a file and comparing
what's found ("magic numbers") to a database (the "magic" file).

So, for example, my system "knows" a certain file is a command script
because the first 256 characters are all ASCII (which means it's probably a
text file of some sort) and the file permissions are set such that it's
executable (not merely readable).
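As a rough sketch of that check in Python (the function name and the
default sample size are just illustrative assumptions, and a real system
does rather more than this):

```python
import os
import stat

def looks_like_script(path, sample_size=256):
    """Guess whether 'path' is a command script.

    Heuristic only: the first 'sample_size' bytes must all be 7-bit ASCII
    and at least one execute permission bit must be set.
    """
    mode = os.stat(path).st_mode
    if not stat.S_ISREG(mode):
        return False  # directories, devices etc. are not scripts

    # Executable by owner, group or others (not merely readable).
    executable = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))

    with open(path, "rb") as f:
        sample = f.read(sample_size)

    # Every byte below 0x80 means the sample is plain ASCII.
    all_ascii = len(sample) > 0 and all(b < 0x80 for b in sample)
    return executable and all_ascii
```

Note that this says nothing about *which* shell or interpreter the script
is for; it only separates "probably executable text" from everything else.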

It also knows that a certain file is an ELF-format executable for a 32-bit
little-endian MIPS processor with a version 1 architecture (i.e. it will run
on *old* MIPS CPUs as well as newer ones) because the bytes at offsets 1-3
are "ELF" (after a 0x7f byte at offset 0), at offset 4 there's a binary "1"
(which in this context means 32-bit, not 64-bit), at offset 5 there's
another "1" (little-endian), at offset 16 "2" means "executable", and at
offset 18 "8" means "MIPS" (not Sparc, 80x86, 68000, etc).
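To make those offsets concrete, here is a minimal Python sketch that
decodes the same header fields. The function name and the lookup tables
are my own illustration and list only a few values; the numeric codes
follow the ELF specification, where MIPS is machine code 8 and the fields
at offsets 16 and 18 are 16-bit values stored in the file's own byte order.

```python
import struct

ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
ELF_TYPE = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}
ELF_MACHINE = {2: "SPARC", 3: "80386", 4: "68000", 8: "MIPS"}

def describe_elf(header):
    """Decode the first 20 bytes of an ELF file.

    Returns a (class, byte order, type, machine) tuple of strings,
    or None if the bytes do not look like an ELF header.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    ei_class, ei_data = header[4], header[5]
    if ei_data not in ELF_DATA:
        return None  # cannot decode multi-byte fields without a byte order

    # Offsets 16 and 18 hold two unsigned 16-bit fields: type and machine.
    order = "<" if ei_data == 1 else ">"
    e_type, e_machine = struct.unpack_from(order + "HH", header, 16)

    return (
        ELF_CLASS.get(ei_class, "unknown class"),
        ELF_DATA[ei_data],
        ELF_TYPE.get(e_type, "unknown type"),
        ELF_MACHINE.get(e_machine, "unknown machine"),
    )
```

Feeding it the first 20 bytes of a file (opened in binary mode) is enough;
the "version 1 architecture" detail lives in a later flags field that this
sketch does not look at.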

Some magic numbers are much simpler to decode, of course: a file that
begins with "GIF89a" is a GIF file, surprise, surprise. And the more you
dig, the more detail you can work out.
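The simple cases amount to a table lookup. A toy version in Python might
look like the following; the table layout, its entries and the function
name are made up for illustration and are not the format of the real
"magic" file.

```python
# Each entry: (offset into the file, bytes expected there, description).
MAGIC = [
    (0, b"GIF89a", "GIF image, version 89a"),
    (0, b"GIF87a", "GIF image, version 87a"),
    (0, b"\x7fELF", "ELF object file"),
    (0, b"#!", "script with an interpreter line"),
]

def identify(data):
    """Return a description for the first table entry that matches 'data'."""
    for offset, pattern, description in MAGIC:
        if data[offset:offset + len(pattern)] == pattern:
            return description
    return "unknown"
```

The real database goes further by attaching follow-up tests to each
match, which is how the extra detail gets worked out.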

-- 
Pete						Peter Turnbull
						Dept. of Computer Science
						University of York
Received on Wed Jun 10 1998 - 21:49:34 BST
