Internet Draft I. Castineyra Nimrod J. N. Chiappa January 1995 M. Steenstrup ?draft-ietf-nimrod-arch-00.txt? Expires July 1995 The Nimrod Routing Architecture Status of this Memo This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet- Drafts Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim). Abstract We present a scalable internetwork routing architecture, called Nimrod. The Nimrod architecture is designed to accommodate a dynamic internetwork of arbitrary size with heterogeneous service requirements and restrictions and to admit incremental deployment throughout an internetwork. The key to Nimrod's scalability is its ability to represent and manipulate routing-related information at multiple levels of abstraction. Internet Draft Nimrod Architecture July 1994 Contents 1 Introduction 1 2 Overview 1 2.1 Constraints of the Internetworking Environment . . . . . . . . . . 2 2.2 The Basic Routing Functions . . . . . . . . . . . . . . . . . . . . 3 2.3 Scalability Features . . . . . . . . . . . . . . . . . . . . . . . 5 2.3.1 Clustering and Abstraction. . . . . . . . . . . . . . . . . . . 5 2.3.2 Restricting Information Distribution. . . . . . . . . . . . . . 6 2.3.3 Local Selection of Feasible Routes. . . . . . . . . . . . . . . 6 2.3.4 Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.3.5 Limiting Forwarding Information . . . . . . . . . . . . . . . . 7 3 Architectural Overview 7 3.1 Endpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.2 Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.2.1 Connectivity Specifications . . . . . . . . . . . . . . . . . . 9 3.3 Nodes and Arcs . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.4 BTEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.5 Locators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.6 Node Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.6.1 Arcs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.6.2 Internal Maps . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.6.3 Transit Connectivity. . . . . . . . . . . . . . . . . . . . . . 11 3.6.4 Inbound Connectivity. . . . . . . . . . . . . . . . . . . . . . 12 3.6.5 Outbound Connectivity . . . . . . . . . . . . . . . . . . . . . 12 i Internet Draft Nimrod Architecture July 1994 4 Physical Realization 12 4.1 Contiguity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 4.2 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 4.3 Multiple Locator Assignment . . . . . . . . . . . . . . . . . . . . 14 5 Forwarding 20 5.1 Policy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 5.2 Trust . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 5.3 Connectivity Specification (CSC) Mode . . . . . . . . . . . . . . . 22 5.4 Flow Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . 24 5.5 Datagram Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 6 Connectivity Specification Sequence Mode 26 7 Renumbering 26 8 Security Considerations 27 9 Authors' Addresses 27 ii Internet Draft Nimrod Architecture July 1994 1 Introduction Nimrod is a scalable routing architecture designed to accommodate a continually expanding and diversifying internetwork. First suggested by Chiappa in [1], the Nimrod architecture has undergone revision and refinement through the efforts of the Nimrod working group of the IETF. In this document, we present a detailed description of this architecture. The goals of Nimrod are as follows: 1. To support a dynamic internetwork of arbitrary size by providing mechanisms to control the amount of routing information that must be known throughout an internetwork. 2. To provide service-specific routing in the presence of multiple constraints imposed by service providers and users. 3. To admit incremental deployment throughout an internetwork. We have designed the Nimrod architecture to meet these goals. The key features of this architecture include: 1. Representation of internetwork connectivity and services in the form of maps at multiple levels of abstraction. 2. User-controlled route generation and selection based on maps and traffic service requirements. 3. User-directed packet forwarding along established paths. Nimrod is a general routing architecture that can be applied to routing both within a single routing domain and among multiple routing domains. As a general internetwork routing architecture designed to deal with increased internetwork size and diversity, Nimrod is equally applicable to both the TCP/IP and OSI environments. 2 Overview Before describing the Nimrod architecture in detail, we provide an overview. We begin with the internetworking requirements, followed by the routing functions, and concluding with Nimrod's scaling characteristics. 1 Internet Draft Nimrod Architecture July 1994 2.1 Constraints of the Internetworking Environment Internetworks are growing and evolving systems, in terms of number, diversity, and interconnectivity of service providers and users, and therefore require a routing architecture that can accommodate internetwork growth and evolution. A complicated mix of factors such as technological advances, political alliances, and service supply and demand economics will determine how an internetwork will change over time. However, correctly predicting all of these factors and all of their effects on an internetwork may not be possible. Thus, the flexibility of an internetwork routing architecture is its key to handling unanticipated requirements. In developing the Nimrod architecture, we first assembled a list of internetwork environmental constraints which have implications for routing. This list, enumerated below, includes observations about the present Internet; it also includes predictions about internetworks five to ten years in the future. 1. The Internet will grow to include O(10^9) networks. 2. The number of internetwork users may be unbounded. 3. The capacity of internetwork resources is steadily increasing but so is the demand for these resources. 4. Routers and hosts have finite processing capacity and finite memory, and networks have finite transmission capacity. 5. 
Internetworks comprise different types of communications media -- including wireline and wireless, terrestrial and satellite, shared multiaccess and point-to-point -- with different service characteristics in terms of throughput, delay, error and loss distributions, and privacy. 6. Internetwork elements -- networks, routers, hosts, and processes -- may be mobile. 7. Service providers will specify offered services and restrictions on access to those services. Restrictions may be in terms of when a service is available, how much the service costs, which users may subscribe to the service and for what purposes, and how the user must shape its traffic in order to receive a service guarantee. 8. Users will specify traffic service requirements which may vary widely among sessions. These specifications may be in terms of requested qualities of service, the amounts they are willing to pay for these services, the times at which they want these services, and the providers they wish to use. 2 Internet Draft Nimrod Architecture July 1994 9. A user traffic session may include m sources and n destinations, where m, n > or = 1. 10. Service providers and users have a synergistic relationship. That is, as users develop more applications with special service requirements, service providers will respond with the services to meet these demands. Moreover, as service providers deliver more services, users will develop more applications that take advantage of these services. 11. Support for varied and special services will require more processing, memory, and transmission bandwidth on the part of both the service providers offering these services and the users requesting these services. Hence, many routing-related activities will likely be performed not by routers and hosts but rather by independent devices acting on their behalf to process, store, and distribute routing information. 12. Users requiring specialized services (e.g., high guaranteed throughput) will usually be willing to pay more for these services and to incur some delay in obtaining them. 13. Service providers are reluctant to introduce complicated protocols into their networks, because they are more difficult to manage. 14. Vendors are reluctant to implement complicated protocols in their products, because they take longer to develop. Collectively, these constraints imply that a successful internetwork routing architecture must support special features, such as service-specific routing and component mobility in a large and changing internetwork, using simple procedures that consume a minimal amount of internetwork resources. We believe that the Nimrod architecture meets these goals, and we justify this claim in the remainder of this document. 2.2 The Basic Routing Functions The basic routing functions provided by Nimrod are those provided by any routing system, namely: 1. Collecting, assembling, and distributing the information necessary for route generation and selection. 2. Generating and selecting routes based on this information. 3. Establishing in routers information necessary for forwarding packets along the selected routes. 3 Internet Draft Nimrod Architecture July 1994 4. Forwarding packets along the selected routes. The Nimrod approach to providing this routing functionality includes map distribution according to the "link-state" paradigm, localization of route generation and selection at traffic sources and destinations, and specification of packet forwarding through path establishment by the sources and destinations. 
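BEGIN COMMENT

As an illustration only: Nimrod does not define a programming interface, and the names below are invented for this sketch, but the four basic routing functions can be pictured as the skeleton of an interface that the devices performing routing collectively provide.

   # Illustration only; all names are hypothetical.  Nimrod specifies
   # protocols and distributed databases, not an API.

   class RoutingAgent:
       def distribute(self, map_fragment, recipients):
           """Function 1: collect, assemble, and distribute routing
           information (here, part of a map) to selected recipients."""
           raise NotImplementedError

       def generate_route(self, maps, service_requirements):
           """Function 2: generate and select a route from locally
           held maps and the traffic's service requirements."""
           raise NotImplementedError

       def establish_path(self, route):
           """Function 3: install, in routers along the selected
           route, the state needed to forward packets (path set-up)."""
           raise NotImplementedError

       def forward(self, packet):
           """Function 4: forward packets along the established
           path."""
           raise NotImplementedError

END COMMENT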
Link-state map distribution permits each service provider to have control over the services it offers, through both distributing restrictions in and restricting distribution of its routing information. Restricting distribution of routing information serves to reduce the amount of routing information maintained throughout an internetwork and to keep certain routing information private. However, it also leads to inconsistent routing information databases throughout an internetwork, as not all such databases will be complete or identical. We expect routing information database inconsistencies to occur often in a large internetwork, regardless of whether privacy is an issue. The reason is that we expect some devices to be incapable of maintaining the complete set of routing information for the internetwork. These devices will select only some of the distributed routing information for storage in their databases.

Route generation and selection, based on maps and traffic service requirements, may be completely controlled by the users or, more likely, by devices acting on their behalf, and does not require global coordination among routers. Thus, these devices may generate routes specific to the users' needs, and only those users pay the cost of generating those routes. Locally-controlled route generation allows incremental deployment of and experimentation with new route generation algorithms, as these algorithms need not be the same at each location in an internetwork.

Packet forwarding according to paths may be completely controlled by the users or by devices acting on their behalf. These paths may be specified in as much detail as the maps permit. Such packet forwarding provides freedom from forwarding loops, even when routers in a path have inconsistent routing information. The reason is that the forwarding path is a route computed by a single device and based on routing information maintained at a single device.

We note that the Nimrod architecture and Inter-Domain Policy Routing (IDPR) [2] share link-state routing information distribution, localized route generation, and path-oriented message forwarding. In developing the Nimrod architecture, we have drawn upon experience gained in developing and experimenting with IDPR.

2.3 Scalability Features

Nimrod must provide service-specific routing in arbitrarily large internetworks and hence must employ mechanisms that help to contain the amount of internetwork resources consumed by the routing functions. We provide a brief synopsis of such mechanisms below, noting that arbitrary use of these mechanisms does not guarantee a scalable routing architecture. Instead, these mechanisms must be used wisely, in order to enable a routing architecture to scale with internetwork growth.

2.3.1 Clustering and Abstraction

The Nimrod architecture is capable of representing an internetwork as clusters of entities at multiple levels of abstraction. Clustering reduces the number of entities visible to routing. Abstraction reduces the amount of information required to characterize an entity visible to routing.

Clustering begins by aggregating internetwork elements such as hosts, routers, and networks according to some predetermined criteria. These elements may be clustered according to relationships among them, such as "managed by the same authority", or so as to satisfy some objective function, such as "minimize the expected amount of forwarding information stored at each router".
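BEGIN COMMENT

As an illustration only, clustering by one such relation can be pictured as follows; the element records, field names, and authority names are invented for this sketch.

   # Illustration only: group internetwork elements by the authority
   # that manages them.  Each resulting group could become one cluster
   # (eventually one node) in a Nimrod map.

   from collections import defaultdict

   elements = [
       {"name": "router-1", "authority": "alpha.example"},
       {"name": "net-1",    "authority": "alpha.example"},
       {"name": "host-7",   "authority": "beta.example"},
   ]

   clusters = defaultdict(list)
   for element in elements:
       clusters[element["authority"]].append(element["name"])

   # clusters == {"alpha.example": ["router-1", "net-1"],
   #              "beta.example":  ["host-7"]}

END COMMENT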
Nimrod does not mandate a particular cluster formation algorithm. New clusters may be formed by clustering together existing clusters. Repeated clustering of entities produces a hierarchy of clusters with a unique universal cluster that contains all others. The same clustering algorithm need not be applied at each level in the hierarchy. All elements within a cluster must satisfy at least one relation, namely connectivity. That is, if all elements within a cluster are operational, then any two of them must be connected by at least one route that lies entirely within that cluster. This condition prohibits the formation of certain types of separated clusters, such as the following. Suppose that a company has two branches located at opposite ends of a country and that these two branches must communicate over a public network not owned by the company. Then the two branches cannot be members of the same cluster, unless that cluster also includes the public network connecting them. Once the clusters are formed, their connectivity and service information is abstracted to reduce the representation of cluster characteristics. Example abstraction procedures include elimination of services provided by a small fraction of the elements in the cluster or expression of services in terms of average values. Nimrod does not mandate a particular abstraction algorithm. The same abstraction algorithm need not be applied to each cluster, and multiple abstraction algorithms may be applied to a single cluster. 5 Internet Draft Nimrod Architecture July 1994 A particular combination of clustering and abstraction algorithms applied to an internetwork results in an organization related to but distinct from the physical organization of the component hosts, routers, and networks. When a clustering is superimposed over the physical internetwork elements, the cluster boundaries may not necessarily coincide with host, router, or network boundaries. Nimrod performs its routing functions with respect to the hierarchy of entities resulting from clustering and abstraction, not with respect to the physical realization of the internetwork. In fact, Nimrod need not even be aware of the physical elements of an internetwork. 2.3.2 Restricting Information Distribution The Nimrod architecture supports restricted distribution of routing information, both to reduce resource consumption associated with such distribution and to permit information hiding. Each cluster determines the portions of its routing information to distribute and the set of entities to which to distribute this information. Moreover, recipients of routing information are selective in which information they retain. Some examples are as follows. Each cluster might automatically advertise its routing information to its siblings (i.e., those clusters with a common parent cluster). In response to requests, a cluster might advertise information about specific portions of the cluster or information that applies only to specific users. A cluster might only retain routing information from clusters that provide universal access to their services. 2.3.3 Local Selection of Feasible Routes Generating routes that satisfy multiple constraints is usually an NP-complete problem and hence a computationally intensive procedure. With Nimrod, only those entities that require routes with special constraints need assume the computational load associated with generation and selection of such routes. 
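BEGIN COMMENT

As an illustration only, one locally chosen strategy is to prune the map to the entities whose attributes satisfy the caller's constraints and to search what remains. The map representation and names below are invented for this sketch; the point is that the cost of such a search is borne only by the entity that requested the constrained routes.

   # Illustration only: a brute-force generator of loop-free routes
   # whose transit nodes all satisfy a caller-supplied constraint.

   from collections import deque

   def feasible_routes(adjacency, attributes, src, dst, satisfies):
       queue = deque([[src]])
       while queue:
           path = queue.popleft()
           node = path[-1]
           if node == dst:
               yield path
               continue
           for nxt in adjacency.get(node, ()):
               if nxt in path:
                   continue            # keep candidate routes loop-free
               if nxt != dst and not satisfies(attributes[nxt]):
                   continue            # prune infeasible transit nodes
               queue.append(path + [nxt])

END COMMENT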
Moreover, the Nimrod architecture allows individual entities to choose their own route generation and selection algorithms and hence the amount of resources to devote to these functions. 2.3.4 Caching The Nimrod architecture encourages caching of acquired routing information in order to reduce the amount of resources consumed and delay incurred in obtaining the information in the future. The set of routes generated as a by-product of generating a particular route is an example of routing information that is amenable to caching; future requests for any of these routes may be satisfied directly from the route cache. However, as with any caching scheme, the cached information may become stale and its use may result in poor quality routes. Hence, the routing information's expected 6 Internet Draft Nimrod Architecture July 1994 duration of usefulness must be considered when determining whether to cache the information and for how long. 2.3.5 Limiting Forwarding Information The Nimrod architecture supports two separate approaches for containing the amount of forwarding information that must be maintained per router. The first approach is to multiplex, over a single path (or tree, for multicast), multiple traffic flows with similar service requirements. The second approach is to install and retain forwarding information only for active traffic flows. With Nimrod, the service providers and users share responsibility for the amount of forwarding information in an internetwork. Users have control over the establishment of paths, and service providers have control over the maintenance of paths. This approach is different from that of the current Internet, where forwarding information is established in routers independent of demand for this information. 3 Architectural Overview Nimrod is a hierarchical, map-based routing architecture that has been designed to support a wide range of user requirements and to scale to very large dynamic internets. Given a traffic stream's description and requirements (both quality of service requirements and usage-restriction requirements), Nimrod's main function is to manage in a scalable fashion how much information about the internetwork is required to choose a route for that stream. In other words, to manage the trade-off between amount of information about the internetwork and the quality of the computed route. Nimrod is implemented as a set of protocols and distributed databases. The following sections describe the basic architectural concepts used in Nimrod. The protocols and databases are specified in other documents. 3.1 Endpoints The basic entity in Nimrod is the endpoint. An endpoint represents a user of the internetwork layer: for example, a transport connection. Each endpoint has at least one endpoint identifier (EID). Any given EID corresponds to a single endpoint. EIDs are globally unique, relatively short "computer-friendly" bit strings---for example, small multiples of 64 bits. EIDs have no topological significance whatsoever. For ease of management, EIDs might be organized hierarchically, but this is not required. 7 Internet Draft Nimrod Architecture July 1994 BEGIN COMMENT In practice, EIDs will probably have a second form, which we can call the endpoint label (EL). ELs are ASCII strings of unlimited length, structured to be used as keys in a distributed database (much like DNS names). Information about an endpoint---for example, how to reach it---can be obtained by querying this distributed database using the endpoint's label as key. 
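As a purely hypothetical illustration of such a lookup (the database contents, record layout, and names below are invented):

   # Hypothetical illustration: resolving an endpoint label (EL) in a
   # DNS-like distributed database to an EID and current locators.

   endpoint_db = {
       "mail-server.example": {
           "eid": 0x0000123456789ABC,        # stable, topology-free EID
           "locators": ["a:b:h1", "c:d:h2"], # current points of attachment
       },
   }

   def resolve(label):
       record = endpoint_db[label]
       return record["eid"], record["locators"]

   eid, locators = resolve("mail-server.example")
   # The EID identifies the endpoint itself; the locators say where it
   # is currently attached and may change over time.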
END COMMENT 3.2 Maps The basic data structure used for routing is the map. A map expresses the available connectivity between different points of an internetwork. Different maps can represent the same region of a physical network at different levels of detail. A map is a graph composed of nodes and arcs. Properties of nodes are contained in attributes associated with them. Arcs have no attributes. Arcs appear in maps as attributes of nodes. Nimrod defines languages to specify attributes and to describe maps. Maps are used by routers to generate routes. In general, it is not required that different routers have consistent maps. In this document we speak only of routers. By "router" we mean a physical device that implements functions related to routing: for example, forwarding, route calculation, path set-up. A given device need not be capable of doing all of these to be called a router. Later---for example, in protocol specification documents---it might be convenient to explicitly split these functionalities. We might then speak of forwarding engines, path set-up agents, route servers, etc. BEGIN COMMENT Nimrod has been designed so that there will be no routing loops even when the routing databases of different routers are not consistent. A consistency requirement would not permit representing the same region of the internetwork at different levels of detail. Also, a routing-database consistency requirement would be hard to guarantee in the very large internets Nimrod is designed to support. END COMMENT 8 Internet Draft Nimrod Architecture July 1994 3.2.1 Connectivity Specifications By connectivity between two points we mean the available services and the restrictions on their use. Connectivity specifications are among the attributes associated with nodes. The following are informal examples of connectivity specifications: o "Between these two points, there exists best-effort service with no restrictions." o "Between these two points, guaranteed 10 ms delay can be arranged for traffic streams whose data rate is below 1 Mbyte/sec and that have low (specified) burstiness." o "Between these two points, best-effort service is offered, as long as the traffic originates in and is destined to research organizations." BEGIN COMMENT Connectivity specifications can be defined not only between two points, but also between sets of points. Nimrod includes a language to define connectivity specifications. END COMMENT 3.3 Nodes and Arcs A node represents a region of the physical network. The region of the network represented by a node can be as large or as small as desired: a node can represent a continent or a process running inside a host. Moreover, as explained in section 4, a region of the network can simultaneously be represented by more than one node. Arcs are unidirectional. An arc has two distinguishable ends: a head and a tail. (Arcs are often visualized and drawn as arrows; the head of the arc corresponds to the head of the arrow.) The head and tail of an arc are each connected to a node. The presence of an arc between two nodes specifies that traffic can flow between those two points in the direction indicated by the arc (from tail to head). Between two given nodes, there can be only one arc in each direction. Arcs always connect different nodes. 9 Internet Draft Nimrod Architecture July 1994 3.4 BTEs The distinguishable components of a map are called basic topological entities (BTEs): nodes and connectivity specifications. 
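BEGIN COMMENT

As an illustration only, the map structure described in sections 3.2 through 3.4 might be held in a router roughly as follows, keyed by the locators introduced in the next section. The class and field names are invented for this sketch; Nimrod defines languages for describing maps and attributes, not an internal representation.

   # Illustration only; all names are hypothetical.

   class Node:
       def __init__(self, locator):
           self.locator = locator
           # Arcs are unidirectional and appear as attributes of their
           # tail node; each entry names an adjacent destination.  A
           # set gives at most one arc per direction between two nodes.
           self.arcs = set()
           # Connectivity specifications and other node attributes.
           self.connectivity_specs = []
           self.internal_maps = []

   class Map:
       def __init__(self):
           self.nodes = {}                   # locator -> Node

       def add_arc(self, tail_locator, head_locator):
           # Arcs always connect different nodes.
           assert tail_locator != head_locator
           self.nodes[tail_locator].arcs.add(head_locator)

END COMMENT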
3.5 Locators

A locator is a string of binary digits that identifies a basic topological entity (BTE) in a map. Different BTEs necessarily have different locators. A given BTE is assigned only one locator. Locators identify BTEs and specify *where* a BTE is in the network. Locators do *not* specify a path to the BTE.

In this document locators are written as ASCII strings that include colons to indicate node structure: for example, a:b:c. This does not mean that the representation of locators in packets or in databases will necessarily have something equivalent to the colons.

A given physical element of the network might implement more than one BTE---for example, a router that is part of two different nodes. Though this physical element might therefore be associated with more than one locator, the BTEs that this physical element implements each have only one locator.

A node is said to own those locators that have as a prefix the locator of the node. In a node that has an internal map, the locators of all BTEs in this internal map are prefixed by the locator of the original node. Specifically, the locators of nodes appearing in the internal map of a node are prefixed by the locator of that node. Arcs do not have locators. The locators of all connectivity specifications associated with a node are also prefixed by the node's locator.

All routing map information is expressed in terms of locators, and routing selections are based on locators. EIDs are *not* used in making routing decisions---see section 5.

3.6 Node Attributes

The following are node attributes defined by Nimrod.

3.6.1 Arcs

Arcs appear in maps as attributes of their tail node---the "from" node. Every arc associated with a node identifies a neighboring node to which the original node can send data.

Given a node, an "adjacent source" is a node that can send data to the original node; that is, a node that has an arc whose head is the original node. Similarly, an "adjacent destination" is a node to which the original node can send data; that is, a node at the head of one of the original node's arcs.

3.6.2 Internal Maps

As part of its attributes, a node can have internal maps. A router can obtain a node's internal maps---or any other of the node's attributes, for that matter---by requesting that information from a representative of that node. (A router associated with that node can be such a representative.) A node's representative can in principle reply with different internal maps to different requests---for example, because of security concerns. This implies that different routers in the network might have different internal maps for the same node.

Given a map, a router can obtain a more detailed map by substituting one of the map's nodes by one of that node's internal maps. This process can be continued recursively. (Presumably, a router would expand nodes in the region of the map of its current interest.)

Nimrod defines standard internal maps that are intended to be used for specific purposes. A node's "detailed map" gives more information about the region of the network represented by the original node. Typically, it is closer to the physical realization of the network than the original node. The nodes of this map can themselves have detailed maps.
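BEGIN COMMENT

Continuing the illustrative sketch of section 3.2 (the representation and names remain invented), the locator-prefix rule of section 3.5 and the recursive refinement through internal maps described above can be pictured as follows.

   # Illustration only; builds on the hypothetical Node/Map sketch of
   # section 3.2.

   def owns(node_locator, locator):
       # A node owns the locators that have the node's locator as a
       # prefix (":" separates levels in the ASCII notation used here).
       return locator == node_locator or locator.startswith(node_locator + ":")

   def most_detailed_node(node, locator):
       # Descend through whatever internal maps are held locally until
       # no finer node owning the locator is known.
       for internal_map in node.internal_maps:
           for inner in internal_map.nodes.values():
               if owns(inner.locator, locator):
                   return most_detailed_node(inner, locator)
       return node

END COMMENT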
3.6.3 Transit Connectivity

For a given node, this attribute specifies the services available between adjacent sources and adjacent destinations. This attribute is requested and used when a router intends to route traffic *through* a node.

Conceptually, the transit connectivity attribute is a matrix that is indexed by a pair of locators: the locator of an adjacent source and the locator of an adjacent destination. The entry indexed by such a pair contains the connectivity specifications of the services available across the given node for traffic entering from the adjacent source going to the adjacent destination. The actual format of this attribute need not be a matrix. This document does not specify the format for this attribute, nor for the next two attributes.

3.6.4 Inbound Connectivity

For a given node, this attribute represents connectivity from adjacent sources to points within the node. This attribute is requested and used when a router intends to route traffic to a point within the node but does not have, and either cannot or does not want to obtain, a detailed map of the node.

The inbound connectivity attribute identifies what connectivity specifications are available between pairs of locators. The first element of the pair is the locator of an adjacent source; the second is a locator owned by the node.

3.6.5 Outbound Connectivity

For a given node, this attribute represents connectivity from points within the node to adjacent destinations. This attribute identifies what connectivity specifications are available between pairs of locators. The first element of the pair is a locator owned by the node; the second is the locator of an adjacent destination.

4 Physical Realization

A network is modeled as being composed of physical elements: routers, hosts, and communication links. The links can be either point-to-point---e.g., T1 links---or multi-point---e.g., ethernets, X.25 networks, IP-only networks, etc.

The physical representation of a network can have associated with it one or more Nimrod maps. A Nimrod map is a function not only of the physical network, but also of the configured clustering of elements (locator assignment) and of the configured connectivity.

Nimrod has no pre-defined "lowest level": for example, it is possible to define and advertise a map that is physically realized inside a CPU. In this map, a node could represent, for example, a process or a group of processes. The user of this map need not necessarily know or care. ("It is turtles all the way down!", in [3] page 63.)

4.1 Contiguity

Locators sharing a prefix must be assigned to a contiguous region of a map. That is, two elements (BTEs) of a map that have been assigned locators sharing a prefix should be connected to each other with elements that themselves have been assigned locators with that prefix. The main consequence of this requirement is that "you cannot take your locator with you."

As an example of this (see figure 1), consider two providers x.net and y.net (these designations are *not* locators but DNS names) which appear in a Nimrod map as two nodes with locators A and B. Assume that corporation z.com (also not a locator) was originally connected to x.net. Locators corresponding to elements in z.com are, in this example, A-prefixed. Corporation z.com decides to change providers---severing its physical connection to x.net.
The connectivity requirement described in this section implies that, after the provider change has taken place, elements in z.com will have been, in this example, assigned B-prefixed locators and that it is not possible for them to receive data destined to A-prefixed locators through y.net. The contiguity requirement simplifies routing information exchange: if it were permitted for z.com to receive A-prefixed locators through y.net, it would be necessary that a map that contains node B include information about the existence of a group of A-prefixed locators inside node B. Similarly, a map including node A would have to include information that the set of A-prefixed locators asigned to z.com is not to be found within A. The more situations like this happen, the more the hierarchical nature of Nimrod is subverted to "flat routing." The contiguity requirement can also be expressed as "EIDs are stable; locators are ephemeral." The contiguity requirement rules out some approaches to implementing mobility in Nimrod. For example, a mobile host cannot advertise its "home" locator from its new location. For more on mobility see [4]. 4.2 An Example Figure 2 shows a physical network. Hosts are drawn as squares, routers as diamonds, and communication links as lines. The network shown has the following components: five ethernets ---EA through EE; five routers---RA through RE; and four hosts---HA through HD. Routers RA, RB, and RC interconnect the backbone ethernets---EB, EC and ED. Router RD connects backbone EC to a network consisting of ethernet EA and hosts HA and HB. Router RE interconnects backbone ED to a network consisting of ethernet EE and hosts HC and HD. The assigned locators appear in lower case beside the corresponding physical entity. Figure 3 shows a Nimrod map for that network. The nodes of the map are represented as squares. Lines connecting nodes represent two arcs in opposite directions. Different regions of the network are represented at different detail. Backbone b1 is represented as a single node. The region of the network with locators prefixed by "a" is represented as a single node. The region of the network with locators prefixed by "c" is represented in full detail. 13 Internet Draft Nimrod Architecture July 1994 +-------+ +-------+ | | | | | x.net | | y.net | | | | | | | | | +-------+ +-------+ / / / / / / / +-------+ | | | z.com | | | | | +-------+ Figure 1: Connectivity after switching providers 4.3 Multiple Locator Assignment Physical elements can form part of, or implement, more than one BTE. In this sense it can be said that they can be assigned more than one locator. Consider figure 4, which shows a physical network. This network is composed of routers (RA, RB, RC, and RD), hosts (HA, HB, and HC), and communication links. Routers RA, RB, and RC are connected with point-to-point links. The two horizontal lines in the bottom of the figure represent ethernets. The figure also shows the locators assigned to hosts and routers. In figure 4, RA and RB have each been assigned one locator (a:t:r1 and b:t:r1, respectively). RC has been assigned locators a:y:r1 and b:d:r1; one of these two locators shares a prefix with RA's locator, the other shares a prefix with RB's locator. Hosts HA and HB have each been assigned three locators. Host HC has been assigned one locator. Depending on what communication paths have been set up between points, different Nimrod maps result. A possible Nimrod map for this network is given in figure 5. 
Nodes and arcs represent the *configured* clustering and connectivity of the network. Notice that even though a:y and b:d are defined on the same hardware, the map shows no connection between them: this connection has not been configured. A packet given to node `a' addressed to a locator prefixed 14 Internet Draft Nimrod Architecture July 1994 a:h1 +--+ a:h2 +--+ |HA| |HB| | | | | +--+ +--+ a:e1 | | --------------------- EA | /\ /\ /RB\ b1:r1 /RD\ b2:r1 /\ /\ \ / / \/ \ \/ EB b1:t:e1 / \ | EC ------------------------ -------------------------- b2:e1 / \ / \ /\ \ /RA\ b1:r2 \/\ \ / /RC\ b2:t:r2 \/ \ / \ \/ \ / ED ----------------------------------- b3:t:e1 | | | /\ /RE\ b3:t:r1 \ / EE \/ ----------------------------- c:e1 | | +--+ +--+ |HC| c:h1 |HD| c:h2 | | | | +--+ +--+ Figure 2: Example Physical Network 15 Internet Draft Nimrod Architecture July 1994 +-----+ +-----+ +----------+ | | | | | |--------------| b2:t| --------------| a | | | | | | | | b1 | +-----+ +-----+ | | | | | | | | | +----------+ | \ | \ | \ | \ | \ +--------+ \ | | ------- | b3:t:e1| | | +--------+ | | | | +-------+ | | |b3:t:r1| | | +-------+ | +-----+ +-----+ +-----+ | | | | | | | c:h1|-------| c:e1|-----| c:h2| | | | | | | +-----+ +-----+ +-----+ Figure 3: Nimrod Map 16 Internet Draft Nimrod Architecture July 1994 a:t:r1 b:t:r1 +--+ +--+ |RA|------------|RB| +--+ +--+ \ / \ / \ / \ / \ / \ / \ / \ +--+ |RC| a:y:r1 +--+ b:d:r1 | --------------------------- a:y:h1 +--+ +--+ +--+ a:y:h2 b:d:h2 |HA| |RD| c:r1 |HB| b:d:h1 c:h1 +--+ +--+ +--+ c:h2 | | -------------------- +--+ |HC| c:h3 +--+ Figure 4: Multiple Locators 17 Internet Draft Nimrod Architecture July 1994 a b c +-------------+ +-------------+ +---------------+ | | | | | | | a:t | | b:t | | | | +--+ | | +--+ | | | | | |--------------|--| | | | | | +--+ | | +--+ | | | | | | | | | | | | +--+ | | +--+ | | | | + + | | + + | | | | +--+ a:y | | +--+ b:d | | | | | | | | | +-------------+ +-------------+ +---------------+ Figure 5: Nimrod Map with "b:d" would have to travel from node a to node b via the arc joining them before being directed towards its destination. Similarly, the map shows no connection between the c node and the other two top level nodes. If desired, these connections could be established, which would necessitate setting up the exchange of routing information. Figure 6 shows the map when these connections have been established. In the strict sense, Nimrod nodes do not overlap: they are distinct entities. But, as we have seen in the previous example, a physical element can be given more than one locator, and, in that sense, participate in implementing more than one node. That is, two different nodes might be defined on the same hardware. In this sense, Nimrod nodes can be said to overlap. But to notice this overlap one would have to know the physical-to-map correspondence. It is not possible to know when two nodes share physical assets by looking only at a Nimrod map. BEGIN COMMENT A contiguous region of the network that is not Nimrod-aware can be represented as node with associated transit connectivity specifications that describe the connectivity offered by this region of the network. An example of this is an IP-only network that is connected to the Nimrod internetwork via Nimrod routers. Nimrod-aware hosts connected to this network are represented as nodes connected to this node. Nimrod packets destined for Nimrod hosts, or for Nimrod routers "on the other side of the network," could, for example, be encapsulated inside IP packets. 
IP-only hosts connected to this network can be reached from other IP-only clouds by, for example, encapsulating IP packets inside 18 Internet Draft Nimrod Architecture July 1994 +--------+ +--------+ | | | | | a:t:r1 |-----------------------------------------------| b:t:r1 | | | | | +--------+ +--------+ | | | | | /-----------------------------------------\ | | | | | | | | | | +--------+ +--------+ +--------+ | | | | | | | | | | | a:y:h1 --------| c:h1 |--------------------| b:d:h1 | | | | | | | | | | | +--------+ +--------+ +--------+ | | | | | | | | | +--------+ | | +------+ +------+ | +--------+ | | | | | | | | | | | | a:y:r1 | | | | c:r1 |--| c:h3 | | | b:d:r1 | | | | | | | | | | | | +--------+ | | +------+ +------+ | +--------+ | | | | | | | | | +--------+ +--------+ +--------+ | | | | | | | | | | | a:y:h2 |-------- c:h2 |--------------------| b:d:h2 | | | | | | | | | | | +--------+ +--------+ +--------+ | | | | | | | | | | | | | | \-----------------------------------------/ | \-------------------------------------------------------------/ Figure 6: Nimrod Map II 19 Internet Draft Nimrod Architecture July 1994 packets of the format being used by Nimrod. Nimrod routers connecting the IP network to the Nimrod internetwork would "de-capsulate" packets destined to IP-only hosts. IP-only hosts could, for example, be given locators prefixed by the locator of a Nimrod router that knows how to get packets to them, this way putting them "inside" the associated Nimrod router. Other treatments are possible: for example, they could be given locators prefixed with the locator of the node that represents the IP network. In the first case, "within" a router, only that router needs to know how to forward packets to IP hosts; however, this makes this router a single point of failure. In the second case, all Nimrod routers connected to this node need to know how to forward IP packets to IP-only hosts. To simplify packet forwarding, the locator for an IP-only host might include the IP address of the host. END COMMENT 5 Forwarding Nimrod does not specify a packet format. It is possible to use Nimrod with different formats, conceivably simultaneously, in the same network. This section specifies Nimrod's requirements on the packet-forwarding mechanism. Nimrod supports four forwarding modes: 1. Connectivity Specification Chain (CSC) mode: in this mode, packets carry a list of connectivity specification locators. The packet is required to go through the nodes that own the connectivity specifications using the services specified. The nodes associated with the listed connectivity specifications should define a continuous path in the map. A more detailed description of the requirements of this mode is given in section 5.3. 2. Connectivity Specifications Sequence (CSS) mode: in this mode, packets carry a list of connectivity specification locators. The packet is supposed to go sequentially through the nodes that own each one of the listed connectivity specifications in the order they were specified. The nodes need not be neighbours. This mode can be seen as a generalization of the CSC mode. Notice that CSCs are said to be a *chains* of locators, CSSs are *sequence* of locators. This difference emphasizes the contiguity requirement in CSCs. A detailed description of this mode is in section 6. 3. Flow mode: in this mode, the packet header includes a path-id that indexes state that has been previously set up in routers along the path. 
Packet forwarding when flow state has been established is relatively simple: follow the instructions in the routers' state. 20 Internet Draft Nimrod Architecture July 1994 Nimrod includes a mechanism for setting up this state. A more detailed description of this mode can be found in section 5.4. 4. Datagram mode: in this mode, every packet header carries source and destination locators. This mode can be seen as a special case of the CSS mode. Forwarding is done following procedures as indicated in section 5.5. BEGIN COMMENT The obvious parallels are between CSC mode and IPV4's strict source route and between CSS mode and IPV4's loose source route. END COMMENT In all of these modes, the packet header also carries locators and EIDs for the source and destinations. In normal operation, forwarding does not take the EIDs into account, only the receiver does. EIDs are carried for demultiplexing at the receiver, and to detect certain error conditions. For example, if the EID is unknown at the receiver, the locator and EID of the source included in the packet could be used to generate an error message to return to the source (as usual, this error message itself should probably not be allowed to be the cause of other error messages). Forwarding can also use the source locator and EID to respond to error conditions, for example, to indicate to the source that the state for a path-id cannot be found. Packets can be visualized as moving between nodes in a map. A packet's header indicates, implicitly or explicitly, a destination locator. In a packet that uses the datagram, CSC, or CSS forwarding mode, the destination locator is explicitly indicated in the header. In a packet that uses the flow forwarding mode, the destination locator is implied by the path-id and the distributed state in the network (it might also be included explicitly). Given a map, a packet moves to the node in this map to which the associated destination locator belongs. If the destination node has a "detailed" internal map, the destination locator must belong to one of the nodes in this internal map (otherwise it is an error). The packet goes to this node (and so on, recursively). 5.1 Policy CSC and CSS mode packets implement policy by specifying the connectivity specifications associated with those nodes that the packet should traverse. Strictly speaking, there is no policy information included in the packet header. That is, in principle, it is not possible to determine what criteria were used to select the route by looking at the header. The packet header only contains the results of the route generation process. 21 Internet Draft Nimrod Architecture July 1994 Similarly, in a flow mode packet, policy is implicit in the chosen route. A datagram-mode packet can indicate a limited form of policy routing by the choice of destination and source locators. For this choice to exist, the source or destination endpoints must have several locators associated with them. This type of policy routing is capable of, for example, choosing providers. 5.2 Trust A node that does not divulge its internal map can work internally any way its administrators decide, as long as the node satisfies its external characterization as given in its Nimrod map advertisements. Therefore, the advertised Nimrod map should be consistent with a node's actual capabilities. For example, consider the network shown in figure 7 which shows a physical network and the advertised Nimrod map. 
The physical network consists of hosts and a router connected together by an ethernet. This node can be sub-divided into component nodes by assigning locators as shown in the figure and advertising the map shown. The map seems to imply that it is possible to send packets to node a:x without these being observable by node a:y; however, this is actually not enforceable. In general, it is reasonable to ask how much trust should be put in the maps obtained by a router. Even when a node is "trustworthy," and the information received from the node has been authenticated, there is always the possibility of an honest mistake. These are difficult issues that are not unique to Nimrod. Many research and standards groups are addressing them. We plan to incorporate the output of these groups into Nimrod as they become available. 5.3 Connectivity Specification (CSC) Mode Routing for a CSC packet is specified by a list of locators carried in the packet header. The locators correspond to connectivity specifications that make the specified path, in the order that they appear along the path. These connectivity specifications are attributes of nodes. Note that the route indicated by a CSC packet is specifed in terms of connectivity specifications rather than physical entities: a locator in the CSC header could correspond to a type of service between two points of the network without specifying the physical path. Given two connectivity specification locators that appear consecutively in the header of a CSC mode packet, there should exist an arc going from the node corresponding to the first connectivity specification to the node corresponding to the second connectivity specification. The first connectivity specification referenced in a CSC mode header should be an outbound connectivity specification; similarly, the last connectivity 22 Internet Draft Nimrod Architecture July 1994 +--+ |RA| a:r1 +--+ | | | | ------------------------------- +--+ +--+ |Ha| a:x:h1 |Ha| a:y:h2 +--+ +--+ Physical Network a | +----------------|-------------------- | | | | +----+ | | |a:r1| | | a:x +----+ a:y | | +------+ / \ +-------+ | | | | / \| | | | | | | | | | | | | | | | +------+ +-------+ | | | + -----------------------------------+ Advertised Nimrod Map Figure 7: Example of Misleading Map 23 Internet Draft Nimrod Architecture July 1994 specification referenced in a CSC mode header should be an indicated connectivity specification; the rest should be transit connectivity specifications. 5.4 Flow Mode The header of a flow mode packet includes a path-id field. This field identifies state that has been established in intermediate routers. This header might also contain locators and EIDs for the source and destination. Nimrod includes protocols to set up and modify flow-related state in intermediate routers. These protocols not only identify the requested route, but also describe the resources requested by the flow---e.g., bandwidth, delay, etc. The result of a set-up attempt might be either confirmation of the set-up or notification of its failure. The source-specified routes in flow mode are specified in terms of CSCs. 5.5 Datagram Mode A realistic routing architecture must include an optimization for datagram traffic, by which we mean user transactions which consist of single packets, such as a lookup in a remote translation database. Either of the two previous modes contains unacceptable overhead if much of the network traffic consists of such datagram transactions. 
A mechanism is needed which is approximately as efficient as the existing IPV4 "hop-by-hop" mechanism. Nimrod has such a mechanism. The scheme can be characterized by the way it divides the state in a datagram network, between routers and the actual packets. Most packets currently contain only a small amount of state associated with the forwarding process ("forwarding state")---the hop count. Nimrod proposes that enlarging the amount of forwarding state in packets can produce a system with useful properties. It was partially inspired by the efficient source routing mechanism in SIP ( [5]), and the locator pointer mechanism in PIP ( [6]). Nimrod datagram mode uses pre-set flow-mode state to support a strictly non-looping path, but without a source-route. In the datagram mode, the packet contains, in addition to a locally usable path-id field: o the source and destination locators, and o a pointer into the locators. The pointer starts out at the lowest level of the source locator, and moves "up" that locator, then to the destination locator, and then "down". In 24 Internet Draft Nimrod Architecture July 1994 addition to these extra fields in the packet, all routers have to contain a minimal set of "pre-setup" flows, or be prepared to set these flows on demand, to certain routers which are at critical places in the abstraction hierarchy. The "pre-setup" flows do not actually have to be set up in advance, but can be created on demand. There is a minimum set of flows which do have to be *able* to be set up for the system to operate, however. It is purely a local decision, which, if any, of those flows to set up before there is an actual traffic requirement for them. As an efficiency move, when a datagram requires that a flow actually be set up to handle it, the data packet could be sent along with the flow setup request, avoiding the round-trip delay. We call these flows "datagram mode flows", or "DMFs", realizing that none of them need be created until actually needed. The actual operation of the mechanism is fairly simple. While going up the source locator, each "active" router (i.e., one that actually makes a decision about where to send the packet, as opposed to handling it as part of a flow) selects a DMF which will take the packet to the "next higher" level object in the source locator, advances the pointer, and sends the packet off along that DMF. When it gets to the end of that DMF, the process repeats, until the packet reaches a router which is at the least common intersection of the two locators. (e.g., for A:P:Q:R and A:X:Y:Z, this would be when the packet reaches A). The process then inverts, with each active router selecting a DMF which takes the packet to the next lower object in the destination locator. So, A would select a flow to A:X, and once it got to A:X, A:X would select a flow to A:X:Y, etc. It can easily be seen that the process guarantees that the resulting path is loop-free. Each flow selected must necessarily get the packet closer to its destination (since each flow selection results in the pointer being monotonically advanced through the locator), and the flows themselves are guaranteed not to loop when their paths are selected, prior to being set up. 
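BEGIN COMMENT

As an illustration of the walk just described, the sequence of objects the pointer designates can be computed as follows for a packet from A:P:Q:R to A:X:Y:Z. The colon notation is the ASCII convention of section 3.5; in operation the walk is of course performed hop by hop by the active routers, not computed in advance.

   # Illustration only: the "up the source locator, then down the
   # destination locator" walk of datagram mode.

   def pointer_walk(src, dst):
       s, d = src.split(":"), dst.split(":")
       # Find the least common intersection of the two locators.
       i = 0
       while i < min(len(s), len(d)) and s[i] == d[i]:
           i += 1
       up = [":".join(s[:k]) for k in range(len(s) - 1, i - 1, -1)]
       down = [":".join(d[:k]) for k in range(i + 1, len(d) + 1)]
       # The pointer only ever advances, so the walk cannot revisit an
       # object; this is the loop-freedom argument given above.
       return up + down

   print(pointer_walk("A:P:Q:R", "A:X:Y:Z"))
   # ['A:P:Q', 'A:P', 'A', 'A:X', 'A:X:Y', 'A:X:Y:Z']

END COMMENT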
If the system keeps more than the minimal set of DMFs (which is just up to one border router in internal routers, and down to each object one level down for each border router), and keeps the table sorted for efficient lookups (e.g., in much the same way as the current routing table for hop-by-hop datagrams), routing can be more efficient. For example, using the case above (a packet from A:P:Q:R to A:X:Y:Z), if A:P:Q is actually a neighbour to A:X:Y, and maintains a flow directly from A:P:Q to A:X:Y, then when the packet reaches A:P:Q, instead of going the rest of the way up and down, the pointer can be set into the destination locator at A:X:Y, and the packet sent there directly.

Traffic monitoring and analysis (again, using purely local algorithms) can result in a database being created over time, which shows which DMFs above and beyond the minimal set are worth keeping around. This traffic monitoring would also show which flows from the required minimal set of DMFs would be useful to set up in advance of actual traffic which needed them. Again, however, all these sets can be changed in a local, incremental way, without disturbing the operation of the system as a whole.

These new forwarding state fields would not be covered by an end-end authentication system, any more than the existing hop count field (which is also forwarding state) would be. This would prevent problems caused by the fact that the contents of these fields change as the packet traverses the network.

The forwarding of these packets can be quite efficient (possibly more so than even standard hop-by-hop). In the non-active routers, the packet is associated with a flow in a way that makes possible hardware processing without any software involvement at all. In active routers, the process of looking up the next DMF would be about as expensive as the current routing table lookup, and the main difference would be that the result of that lookup would have to be stored in the packet, not a great expense.

6 Connectivity Specification Sequence Mode

The connectivity specification sequence mode specifies a route by a list of connectivity specification locators. There are no contiguity restrictions on consecutive locators.

BEGIN COMMENT

The CSS and CSC modes can be seen as combinations of the datagram and flow modes. Therefore, in a sense, the basic forwarding modes of Nimrod are just these last two.

END COMMENT

7 Renumbering

This section presents an example of how to renumber a Nimrod network. Figure 8 shows a network halfway through the process of being renumbered. The figure shows the physical network and the associated locators. The network is formed by router RA, which is connected to three ethernets. The figure shows five hosts, "HA" to "HE". To the right of each host two locators are shown. The first locator shown corresponds to the old numbering; the second, to the new numbering. Renumbering has consisted of adding a new level of hierarchy---to simplify the work of RA, say.

Because it is possible for a network element to have more than one locator, the two sets of locators can be active at the same time. Initially, only the first set of locators is active. This means that Router RA knows to which ethernet a packet should be directed given the locator in the header. (Given a packet destined to one of the hosts, the router would pick one of the three interfaces based on the "host part" of the locator---i.e., "h1" in locator a:h1.)
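BEGIN COMMENT

As an illustration only (the forwarding entries and interface names below are invented, not taken from figure 8), a router carrying both the old and the new locators during renumbering might select an outgoing interface roughly as follows.

   # Illustration only: both numberings installed at once; entries for
   # de-activated locators are simply removed.

   forwarding = {
       "a:h1":   "ethernet-1",   # old locator of a host
       "a:a:h1": "ethernet-1",   # new locator of the same host
       "a:h3":   "ethernet-2",
       "a:c:h1": "ethernet-2",
   }

   def select_interface(destination_locator):
       matches = [loc for loc in forwarding
                  if destination_locator == loc
                  or destination_locator.startswith(loc + ":")]
       if not matches:
           # Unknown or de-activated locator: an error message would
           # be returned to the source (see below).
           raise LookupError(destination_locator)
       return forwarding[max(matches, key=len)]

END COMMENT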
When the second set of locators is introduced, for a time, Router RA would forward based on the two sets of locators---because the first set of locators might still be cached by some sources. Eventually, RA would de-activate the original set of locators. Presumably, RA would be prepared to forward messages to the new set of locators before the DNS (or its equivalent) is instructed to use them. If a packet containing an old locator is given to RA after the locator has been de-activated, an error message would be generated. There exists the possibility that the old locators might be re-assigned. If a packet is received by the wrong endpoint, this situation can be detected by looking at the destination EID which is included in the packet header. The renumbering scheme described above implies that it should be possible to update the DNS (or its equivalent) securely and, relatively, dynamically. However, because renumbering will, most likely, be infrequent and carefully planned, we expect that the load on this updating mechanism should be manageable. A second implication of this renumbering scheme is a requirement for a secure and simple way to update hosts' and routers' locators. 8 Security Considerations Security Considerations are not addressed in this document. 9 Authors' Addresses Isidro Castineyra BBN Systems and Technologies 10 Moulton Street Cambridge, MA 02138 Phone: (617) 873-6233 Email: isidro@bbn.com Martha Steenstrup BBN Systems and Technologies 10 Moulton Street Cambridge, MA 02138 Phone: (617) 873-3192 Email: msteenst@bbn.com Noel Chiappa 27 Internet Draft Nimrod Architecture July 1994 +--+ | | a:r1 |RA| +--+ | /|\ / | \ / | \ / | \ / | \ / | \ / | \ / | \ / | \ / | \ / | \ / | \ ----------------- --------- --------------------------- +--+ +--+ +--+ +--+ +--+ | | a:h5 | | a:h1 | | a:h2 | | a:h4 | | A:h3 |HD| a:a:h1 |HA| a:a:h2 |HB| a:a:h1 |HC| a:c:h1 |HE| A:c:h3 +--+ +--+ +--+ +--+ +--+ Figure 8: Renumbering a Network 28 Internet Draft Nimrod Architecture July 1994 Email: jnc@ginger.lcs.mit.edu References [1] J. N. Chiappa, "A New IP Routing and Addressing Architecture," IETF Internet Draft, 1991. [2] M. Steenstrup, "Inter-Domain Policy Routing Protocol Specification: version 1," RFC 1479, June 1993. [3] R. Wright, Three Scientists and their Gods Looking for Meaning in an Age of Information. New York: Times Book, first ed., 1988. [4] S. Ramanathan, "Mobility Support for Nimrod: Requirements and Approaches.," Working Draft, June 1994. [5] S. Deering, "SIP: Simple Internet Protocol," IEEE Network, vol. 7, May 1993. [6] P. Francis, "A Near-Term Architecture for Deploying Pip," IEEE Network, vol. 7, May 1993. 29