In all the recent discussion (elsewhere) about multi-homing, it has
become clear to me that's there's another characterization of a graph which
people who do routing need to think about. It's one which is, as far as I
know, new; I've looked through "Distances in Graphs", by Harary, and didn't
see it. (If I've reinvented something, please send me a reference! :-)
I'm telling this mailing list since the people here seem more likely
to appreciate graph theory arcana...
That characterization is what I call "spacing" (alternate name glady
accepted!), and it's a measure of how widely "spaced" the connectivity of a
graph is; i.e. how far apart the connections of a node are to the fabric of
the graph. It's important because I conjecture that routing overhead is more
expensive in a graph with high spacing. More on all this below. First some
formal definitions.
"Spacing" is in some sense similar to the "degree" of a graph. For
those of you who've not delved into graph theory, the degree of a node is the
number of links to it. The degree of a graph is an average (for "non-regular"
graphs, which have nodes of differing degree) of the degrees of all the
constituent nodes, although this is a rough measure: more accurate would be a
histogram of the degrees of the individual nodes. (A graph theoery term called
"degree sequence" is essentially this histogram.)
In similar fashion, the "spacing" of a node says something about how
widely spaced its connections to the rest of the graph are, and one can define
the spacing of the graph as a whole as either an average, or a histogram, etc,
etc. The spacing of a single node (well, actually, of a single arc, to begin
with) is defined as follows.
Since the representation of a graph can be fairly plastic, one can't
simply say "oh, that arc goes a long way", since one can redraw the graph to
bring the node on the other end closer. However, if one considers the
*shortest alternate path* between the nodes at either end of the arc, that is
an exact, and useful (as we shall see) definition. One simply tries to find
the shortest path betwen those two nodes which is *not* the one-hop path of
that arc.
One can thus speak of the spacing of one arc of a node; the spacing of
the node as a whole (for nodes which have arcs of differing spacing) can be
either an average, or, probably more usefully (because of it's effect on
routing overhead cost, to be discussed), the maximum of the spacing of the
individual arcs of the node. A graph as a whole would have a spacing sequence,
or, more usefully to us, a spacing average.
So, why is this important? Routing overhead is higher in graphs with
high average spacing is higher, for a given measure of path optimality
(remember that hierarchical routing results in paths which are, while not
optimal, close to optimal, with a tradefoff between how close to optimal the
paths are and how much routing overhead there is).
This is easy to see: if I have regular NEWS graph (i.e. one of
constant degree 4), the scope over which any given node has to be visible, to
produce a given measure of optimality in the path, is quite small. If, on the
other hand, one node has an extra arc with high spacing, to produce equally
optimal paths to all destinations, information about destinations near one end
of the arc will have to flow out to nodes near the other end of the arc, a
routing load above and beyond the original.
--> While adding the highly-spaced arc has made paths between many pairs
--> of nodes shorter, it has done so at the cost of increased routing
--> overhead.
To look an another example, consider a multi-homed network. If it
is singly homed, it can be given an address which is part of the area to
which it is attached, and the extra routing load is minimal. If, on the
other hand, it has two links, then it must be visible over a far larger
scope (one that extends around both of the areas to which it is connected),
with consequent high routing overhead costs.
This is easy to see. For the multi-homing to be useful if either links
fails, it has to be visible as a separate destination. (If this is not the
case, but rather it is numbered as part of one area or the other, then if the
linke from it that area fails, traffic to it will continue to go to that area
even though that area no longer has a link to it.)
Note that it does not have to be separately visible over a global
scope, though; a smaller scope will do, one enclosing both locations to which
it is connected to the rest of the graph. Exactly how much bigger the scope
has to be is controlled by how optimal one wishes the routes to be; if the
scope is minimal, traffic for that destination may head to the wrong part of
the enclosing abstraction before arriving at the scope border, from whence
it does take a more optimal path to the destination.
It is important to realize, however, that multiply-homed sites
represent just one class of thing which produce high-average-spacing networks;
there are others.
The real networks we tend to build will probably have high average
spacing. The reason is that in a large network with low average spacing, most
node pairs will be separated by long paths (although, as previously discussed
with Ohta-san, exactly how average path length is reduced as highly-spaced
links are introduced is hard to predict).
This is probably undesirable, but maybe not. Even if the switches are
not cut-through, the store-and-forward delay added by each additional switch
adds a minimum extra (packet_length/transmission_speed) delay (plus queueing
delays), but as transmission speeds go up, these are pretty minimal. E.g.
with 1 Kbyte packets, a 1 Gbi/sec links, that's 8 usec per switch, hardly a
killer when we have SOL delays, coast-to-cost, in the 20 msec range; S&F
delays in 1000 switches will only up that by 50%.
Anyway, I was going to say more about high average spacing networks,
and their effect on routign overhead, but I have run out of brain cells
completely at this hour. More later.
Noel