More Thoughts on Multihoming
			July 6, 2006


Note: This note is mostly not directly about multihoming itself, but on the
routing issues which are the key thing "behind" multihoming.

[In this note I use the newly-made-up jargon 'topo-name' to describe the
'names' (in a general CS sense) that the path-selection system uses. Most
people call these 'addresses', but I prefer to use a different term because
the term 'address' brings along too much mental baggage in too many people.]


Corollaries to Multihoming Classification Observation #3:

MCO #3 said:

  There are 3 different multihoming scenarios which are worth distinguishing
  when one considers mechanisms:

  - A site (i.e. collection of hosts, networks and interconnecting routers)
	which has multiple links to the rest of the world
  - A host (without embedded router) which is connected to multiple networks
  - A host (without embedded router) which has multiple names (i.e. addresses)
	and/or interfaces to a single network

  These differ enough in fundamentals to possibly need different engineering
  approaches to handle them.

I'd like to start this note by exploring what this implies for multihomed
hosts (both types), in a little more detail.

First, let's focus in on that "(without embedded router)". I put that in
there because if a host *does* want to be a router, then if it has (e.g.) two
interfaces, to two separate networks (so its interfaces have topo-names A.p.1
and B.q.2), then it can simply participate in the routing so that packets
"for" interface B.q.2 can be sent to A.p.1, and the rest of the network
doesn't need to know that A.p.1 and B.q.2 are the same machine. Yes, there is
a massive amount of glossing over here about whether the overhead in the
routing to do this is practical, but I'm examining this issue at an
architectural level; from that level, it's clear that host multihoming
(either kind) must involve some other mechanism than "having the routing do
it".

Site multihoming is a totally different kettle of fish, because there
"having the routing do it" *is* a possibility. The rest of this note will
focus on site multihoming, because that's the kind of multihoming that
currently seems to be the most popular. (It's worth keeping in the back of
one's mind, though, that some solutions to site multihoming, such as the use
of multiple topo-names, also support host multihoming.)


Site Multihoming Observation #1:

Site multihoming is, fundamentally, a path-selection (routing) issue. That
is, there are multiple paths through the physical connectivity fabric, and we
wish to make more than one of them available for use.

This does not sound hard. Why is it so difficult? The issue turns out to be
simply one of efficiency - i.e. how high a cost (or overhead) do we want the
path selection to have/incur. If any infinite cost were acceptable, there
would be no problem.


Path-Selection Observation #1:

The mathematical tools used to handle the underlying model (which is a graph)
have something to say about this issue - especially when it comes to graphs
which are both i) very large, and ii) richly connected - which is to say, in
more formal terms, that the average and maximal path lengths through the
graph grow slowly as the graph grows.

Since the pioneering work of Kleinrock and Kamoun, it has been understood
that path-selection in very large graphs (*whether richly-connected or not*)
is a tradeoff between two different costs (or overheads):

- on the one hand, the overhead of the path-selection mechanisms (routing)
themselves (data sets, updates, computation, etc);
- and on the other hand the non-optimality of paths (i.e. traffic takes more
hops than the shortest possible path).

We might visualize this as a knob on our path-selction mechanism: turned one
way, paths get closer to the optimum, and the routing overhead rises; turned
the other way, paths become less optimal, but the routing becomes less
expensive.

Path-Selection Observation #1a:

If the knob is turned too far toward the "low-overhead" end, the path
non-optimality may become infinite; i.e. the routing will fail to discover
any valid path, even though one exists.

Path-Selection Observation #1b:

One part of K+K's analysis assumed that very large graphs would not be very
richly connected, and proceeded to show that reasonable paths would still be
available for a class of inexpensive path-selction mechanisms (all those
using hierarchical topo-names).

However, the network these days *is* fairly richly connected, and so we have
been pitched back into the "cost/path-optimality" tussle - and in particular,
we risk running afould of PSO #1a - i.e. failing to find a path even when
one physically exists.


Path-Selection Observation #2:

We can distinguish between two fundamentally different ways of distributing
the information needed to do path-selection, between 'push' and 'pull' models.

In the 'push' model, which is the one most contemporary routing systems use,
iformation is distributing (usually widely) prior to need - whether this is
topology-connectivity information (as in OSPF, IS-IS, etc) or
destination-table information (as in RIP, BGP, etc).

In the 'pull' model, information is obtained only when it is though it will
be needed. There are not really any contemporary data networks which operate
in this mode, but road networks do - when one is preparing to make an
extensive visit to a distant city, one often obtains a road map for that city
prior to arriving.

The latter model is interesting, in the context of site multihoming in a
large, richly-connected graph, because it offers a way of limiting the
overhead of operating the path selection. Obviously, if information is only
distributed to those who need it, instead of everyone, the overhead will
be reduced.

There have been a number of proposals to utilize 'pull' methods to support
multihoming.

One of the earliest ones I can recall was from Dave Clark; he once (I don't
recall exactly when, but I think it was at the time of the IAB Retreat of
Routing, which later led to the creation of ROAD - but it could have been
earlier) wrote a draft (I'm not sure if it ever made it all the way to
completion, alas) of an idea to have "stub" source routes, from major
high-level topological entities, to the destination, and include them along
with the response to a name lookup. (This is similar to what one often sees
on "directions" sheets for events/attractions, where it will say something
like "if coming from city X, follow this set of steps; if travelling on
motorway/autostrada/interstate/autobahn Y, follow this other set".) The user
would select whichever made the most sense (e.g. if they knew their site was
directly connected to large ISP X, and one of the possible stub source routes
was from X), and use it.

At about the same time (I no longer recall if it was inspired by his idea, or
not), in the map-distribution system Nimrod, we had roughly the same idea -
when getting ready to select a path from A.p.1 to B.q.2, one might ask for
detailed maps of B and B.q - in particular, to see if there were any direct
connections from B.q that didn't show up in the "default" map.

Path-Selection Observation #2a:

It is important to realize that approaches such as Shim6, which provide
multiple addresses to hosts inside multihomed sites, *are*, in a way, 'pull'
systems.

The information which is provided - the multiple addresses - are routing
information, of a sort. They are information which something trying to find a
path can use to select one of several possible paths (which is the essence of
site multihoming, to make use of more than one potential path - see SMO #1).
And obviously the method of distribution is a 'pull' system.


Additional Path-Selection Observation #3:

In controlling the overhead of operating the path-selection, limiting scopes
for the distribution of information are a key tool. We saw simple examples of
this back with subnets, when routing information for subnets of a network was
not distributed outside the network.

This represents the simplest model, where the 'abstraction naming boundary'
(i.e. the boundary - i.e. line enclosing a portion of the graph - around that
portion of the graph which can be referred to as if it were a single node) is
the same as the 'abstraction action boundary' (i.e. the boundary outside
which detailed information about the connectivity inside the ANB is not
distributed). (To put it in less abstract terms, if an entity A is composed
of A.1 ... A.n, the boundary of A is the ANB of A, and the boundary outside
which routing information about A.1 ... A.n is no longer visible are the AAB
of A.)

Some simple results about path-selection are possible with these concepts in
hand. For example, if the ANB and the AAB of an entity are congruent (i.e.
identical), then if the entity is partitioned, connectivity to things inside
it cannot be fully maintained.

The relevance of these concepts to site multihoming is to the
"cost/path-optimality" tussle referred to in PSO #1b; viable site multihoming
in a large, richly-connected graph may require control of the overhead,
and intelligent settings on ANB's and AAB's *may* provide a tool.

If, for instance, a system which generally provides/uses only *two* kinds of
AAB - either ones congruent with the ANB, or global - will have a harder time
controlling routing overhead than one which provides a richer range of AAB's
- and therefore more opportunities to control the overhead. However, without
a detailed analysis based on actual data, it is impossible to say whether the
reduction in overhead would be sufficient, or whether some more aggressive
approach to controlling overhead (e.g. use of 'pull' methods) would be
necessary.