Nimrod Data Packet Forwarding

1 Introduction

Nimrod forwarding agents are responsible for establishing forwarding state information and for forwarding packets according to this state information and according to forwarding directives carried along in the packets.

The forwarding state information maintained by forwarding agents is derived from the routes selected by forwarding agents and is installed using a "path management" protocol. Forwarding agents select routes according to the traffic service requirements supplied by endpoint representatives acting on behalf of the hosts and according to the available maps.

Nimrod routes are expressed as a sequence of locators of nodes and corresponding connectivity specifications. Each route must at least contain information about the source and destination nodes. Different portions of the same route may be expressed at different granularities of nodes with respect to the node clustering hierarchy.

Forwarding agents establish forwarding information according to the routes selected. The forwarding state established in forwarding agents along the specified route is called a path, and it may ultimately connect one or more source endpoints and one or more destination endpoints. Multiple traffic sessions may use the same path, thus helping to contain the amount of internetwork resources consumed in managing paths. Moreover, multiple paths may be established based on the same route.

2 Forwarding Modes

Nimrod supports two packet forwarding modes: flow and datagram.

Flow mode requires the establishment of forwarding state specific to the traffic session, in forwarding agents along the routes selected for that session. With flow mode, each session is assigned one or more paths, derived from the selected routes. The minimum forwarding state required for flow-mode forwarding includes the path "label", service guarantees (if any), and the path's previous- and next-hop forwarding agents (and path labels).
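
As an illustration only, the minimum flow-mode state listed above could be held in a table indexed by path label. The following Python sketch is not part of the Nimrod specification: the names Hop, PathState, and ForwardingDatabase, the use of integers for labels, and the use of strings for agent identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Hop:
    """A neighbouring forwarding agent and the path label used on that hop."""
    agent: str  # hypothetical agent identifier
    label: int  # path label used toward that agent


@dataclass
class PathState:
    """Minimum per-path state: label, service guarantees, previous and next hop."""
    label: int                      # label identifying the path at this agent
    prev_hop: Optional[Hop]         # None at the first agent on the path
    next_hop: Optional[Hop]         # None at the last agent on the path
    service: Optional[dict] = None  # service guarantees, if any


class ForwardingDatabase:
    """Per-agent table of path state, indexed by path label."""

    def __init__(self) -> None:
        self._paths: Dict[int, PathState] = {}

    def install(self, state: PathState) -> None:
        # A label may identify only one path at a given agent.
        if state.label in self._paths:
            raise ValueError(f"path label {state.label} already in use")
        self._paths[state.label] = state

    def remove(self, label: int) -> None:
        # Removing an unknown label is treated as a no-op in this sketch.
        self._paths.pop(label, None)

    def next_hop(self, label: int) -> Optional[Hop]:
        # Forwarding decision for a flow-mode packet: look up by label only.
        state = self._paths.get(label)
        return state.next_hop if state else None
```

A lookup by label alone is what makes flow-mode forwarding fast once the state is installed; the cost is the memory held by each entry.
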
Flow mode provides fast and consistent packet forwarding according to path labels, once the session-specific state is established in the forwarding agents. However, this state consumes memory in forwarding agents, and the protocol for its installation imposes some delay before data packets can be forwarded successfully to their destinations. Flow mode should be used when session-specific state is necessary to meet traffic service requirements or when the cost of forwarding state installation and maintenance can be amortized over many packets. Hence, we recommend flow mode for traffic sessions requiring guaranteed service or consisting of several packets.

Datagram mode does not require the establishment of any forwarding state specific to the session. In datagram mode, data packets carry a description of the selected route, which guides the packet forwarding decisions at forwarding agents along the route. For each node in the specified route, a forwarding agent at the entry to that node makes an independent decision for forwarding traffic toward the next node on the route, and hence the session source and destination relinquish some control over packet forwarding.

Datagram mode, however, does provide robust forwarding, in the sense that the intermediate forwarding agents can base their packet forwarding decisions on the current state of their portion of the internetwork. Datagram mode should be used when the state of the internetwork is unpredictable or when the cost of forwarding state installation and maintenance is unacceptable. Hence, we recommend datagram mode for traffic sessions traversing highly mobile or unreliable portions of an internetwork or consisting of few packets.

3 Path Management Protocol

Forwarding agents use a path management protocol to install and remove forwarding state information from their forwarding databases. Each forwarding agent maintains forwarding information for those paths that originate, terminate, or pass through it.
Paths may be set up from source to destination or from destination to source. Each path has an initiator and a target. We expect that most paths will be set up from the source endpoint to the destination endpoint. Hence, the initiator usually begins the path setup procedure on behalf of the source endpoint, and the target usually accepts or rejects a path on behalf of the destination endpoint.

Forwarding agents try to form new paths by piecing together existing paths rather than by setting up new paths, provided the existing paths meet the traffic service requirements. (See the Nimrod functionality document for an example of this procedure.) Hence, Nimrod paths are inherently multilevel. This method provides the lowest-cost packet forwarding in terms of the amount of route generation and forwarding state installation required. In a busy internetwork, there are likely to be many existing paths, and hence we expect this mechanism to be much less expensive than individually setting up and maintaining the paths required for every traffic session.

Paths are identified by path labels, which are unique along the path but not necessarily globally unique throughout the internetwork. By not requiring global uniqueness of path labels, paths can have relatively short labels, reducing both the overhead of carrying them in packets and the overhead of looking up forwarding information indexed by these labels. We currently believe that we can assign relatively short path labels that are not globally unique in a way that results in minimal collisions of path labels from different paths, by "spreading" the path labels. (We will discuss this further in a separate note.)

3.1 Protocol Packets

The path management protocol uses three packets: setup, accept, and teardown. These packets are described below; explicit formats will be provided in a future version of this memo. All such packets travel along the path to which they refer.
Path management protocol packets may be used to collect and return performance monitoring information for a path (e.g., path delay and throughput) as well as to set up and tear down a path.

The setup packet is generated by the path initiator and is used to establish forwarding state in forwarding agents. It contains the label for the path, the route, the endpoint identifiers, an indication of whether the path is source- or destination-initiated, any service requirements, and any monitored information for the path.

The accept packet is generated by the path target and is used to indicate successful path establishment from initiator to target. It contains the label for the accepted path and any monitored information for the path. Accept packets travel backwards along paths.

A path may be used for data transport before an accept packet is generated by the target or received by the initiator. Note, however, that a source acting as initiator may wish to wait for an accept packet from the target before sending data on a path. For example, if the source pays for all packets sent, whether or not they are successfully received at the destination, it may want to make sure that the path is successfully established before sending data to the destination.

The teardown packet is generated by any forwarding agent on the path and is used to remove forwarding state. It contains the label for the path to tear down, the reason for the teardown and associated information, and any monitored information for the path. Teardown packets travel in both directions along paths and may result from any of the following:

- loss of a lower-level path which is a component of the specified path;
- a timeout (paths have a maximum lifetime to ensure that forwarding state for broken paths is eventually removed);
- a change in service requirements for the traffic session;
- a change in connectivity specifications for a node on the route;
- preemption in favor of another path.
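
Since explicit packet formats are deferred to a future version of this memo, the following Python sketch only records the contents listed above as plain data structures. It is not a wire format. All class and field names, the field types, and the direction strings returned by the hypothetical helper direction() are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Tuple


class TeardownReason(Enum):
    """The five teardown causes listed in the text (names are hypothetical)."""
    LOWER_LEVEL_PATH_LOST = auto()
    TIMEOUT = auto()
    SERVICE_REQUIREMENTS_CHANGED = auto()
    CONNECTIVITY_SPEC_CHANGED = auto()
    PREEMPTED = auto()


@dataclass
class Setup:
    label: int
    route: List[Tuple[str, str]]  # (node locator, connectivity specification)
    source_eid: str               # source endpoint identifier
    dest_eid: str                 # destination endpoint identifier
    source_initiated: bool        # False means destination-initiated
    service_requirements: Dict[str, float] = field(default_factory=dict)
    monitored: Dict[str, float] = field(default_factory=dict)


@dataclass
class Accept:
    label: int
    monitored: Dict[str, float] = field(default_factory=dict)


@dataclass
class Teardown:
    label: int
    reason: TeardownReason
    info: str = ""                # information associated with the reason
    monitored: Dict[str, float] = field(default_factory=dict)


def direction(packet: object) -> str:
    """Direction of travel along the path for each packet type."""
    if isinstance(packet, Setup):
        return "initiator-to-target"
    if isinstance(packet, Accept):
        return "target-to-initiator"  # accept packets travel backwards
    if isinstance(packet, Teardown):
        return "both"                 # any agent on the path may generate one
    raise TypeError("not a path management packet")
```
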
All path management protocol packets (and in fact all Nimrod packets except data packets) are covered by integrity and authentication checks and are sent using a reliable transaction protocol (with positive and negative acknowledgements) between successive forwarding agents along the path. This helps to reduce the amount of network resources consumed by retransmissions in lossy environments and also helps to identify the causes (e.g., incompatible protocol versions) of unsuccessful packet transmissions.

3.2 Protocol Finite-State Machines

There are two finite-state machines for the path management protocol, one applicable to the initiator and one applicable to any other forwarding agent in the path.

3.2.1 Initiator

The state machine for the initiating forwarding agent has four states: idle, check, ready, and done. State transitions are listed below:

idle -> check: This transition occurs when the initiator begins to set up a path.

check -> ready: This transition occurs after the initiator has successfully completed all of the consistency and resource availability checks for the path. These checks are described in section 3.2.3. At this point, the initiator installs the forwarding information for the path in its forwarding database. From the perspective of the initiator, the path may be used to carry data traffic.

check -> idle: This transition occurs if the initiator fails to complete the consistency and resource availability checks for the path.

ready -> done: This transition occurs when the initiator receives an accept packet from the target.

ready -> idle, done -> idle: These transitions occur when the initiator receives or generates a teardown packet for the path. At this point, the initiator removes the forwarding information for the path from its forwarding database.

3.2.2 Intermediate and Target

The state machine for the intermediate and target forwarding agents has three states: idle, check, and ready.
State transitions are listed below:

idle -> check: This transition occurs when the forwarding agent receives a setup packet.

check -> ready: This transition occurs after the forwarding agent has successfully completed all of the consistency and resource availability checks for the path. At this point, the forwarding agent installs the forwarding information for the path in its forwarding database. From the perspective of the forwarding agent, the path may be used to carry data traffic.

check -> idle: This transition occurs if the forwarding agent fails to complete the consistency and resource availability checks for the path.

ready -> idle: This transition occurs when the forwarding agent receives or generates a teardown packet for the path. At this point, the forwarding agent removes the forwarding information for the path from its forwarding database.

3.2.3 Check State Actions

When in the check state, a forwarding agent must perform a series of tests to determine whether to install forwarding information for the path. These tests are partitioned into two sets, those related to consistency and those related to resource availability.

Consistency checks include verification of the following:

- The setup packet is not out of date and is not a duplicate. This check is not performed by the initiator.
- The path label carried in the setup packet is not already in use at the forwarding agent.
- The forwarding agent acts on behalf of the current node in the route carried in the setup packet.
- The connectivity specification associated with the node and carried in the setup packet is a valid connectivity specification for this node.
- The node's service restrictions do not preclude carrying traffic along the specified route.
- The route carried in the setup packet meets the target endpoint's service requirements. This check is specific to the target. When the target receives a setup packet, it passes the packet to the endpoint representative for the corresponding endpoint.
This endpoint representative performs the appropriate check and returns the result to the target. If the check is successful, the target accepts the path. Otherwise, the target tears down the path and takes one of the following actions:

1. If the target is at the destination endpoint, the teardown packet contains the destination endpoint's service requirements. The initiator is responsible for obtaining a feasible route that accounts for these service requirements.
2. If the target is at the source endpoint, the teardown packet indicates that the source will generate a feasible route.

If the setup packet successfully passes all of these consistency checks, the forwarding agent performs a set of resource availability checks including verification of the following:

- The forwarding database can accommodate state for a new path.
- There exists a feasible path to the next node on the specified route. The forwarding agent may have to request a route and set up this path or there may be such a path already established.

If such a path Q has already been established, the forwarding agent associates the current path P with Q so that traffic entering the forwarding agent along the path labelled P will be forwarded along the path labelled Q.

If no such path Q yet exists, the forwarding agent requests a route from a route agent. This route must go from the forwarding agent to the next node in path P's route, and it must satisfy the service requirements for path P (carried in P's setup packet). If the forwarding agent obtains a feasible route, it proceeds to set up a path for that route and determines whether the necessary resources can be reserved along path Q. Only when the forwarding agent receives an accept packet for path Q does it make the corresponding path label associations between P and Q in its forwarding database.
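
The two state machines of section 3.2 and the ordering of the check-state tests can be summarized as transition tables. The Python sketch below is one possible reading, not a normative definition: the event names ("begin_setup", "setup", "checks_passed", "checks_failed", "accept", "teardown") are hypothetical, and the choice to ignore events that have no listed transition is an assumption, since the text does not say how such events are handled.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, Tuple


class State(Enum):
    IDLE = "idle"
    CHECK = "check"
    READY = "ready"
    DONE = "done"  # used by the initiator only


Table = Dict[Tuple[State, str], State]

# Section 3.2.1: initiator.
INITIATOR: Table = {
    (State.IDLE, "begin_setup"): State.CHECK,
    (State.CHECK, "checks_passed"): State.READY,  # install forwarding state
    (State.CHECK, "checks_failed"): State.IDLE,
    (State.READY, "accept"): State.DONE,
    (State.READY, "teardown"): State.IDLE,        # remove forwarding state
    (State.DONE, "teardown"): State.IDLE,         # remove forwarding state
}

# Section 3.2.2: intermediate and target forwarding agents.
OTHER: Table = {
    (State.IDLE, "setup"): State.CHECK,
    (State.CHECK, "checks_passed"): State.READY,  # install forwarding state
    (State.CHECK, "checks_failed"): State.IDLE,
    (State.READY, "teardown"): State.IDLE,        # remove forwarding state
}


def step(table: Table, state: State, event: str) -> State:
    """Return the next state; unlisted events leave the state unchanged
    (an assumption of this sketch)."""
    return table.get((state, event), state)


def run_checks(consistency: Iterable[Callable[[], bool]],
               resources: Iterable[Callable[[], bool]]) -> str:
    """Section 3.2.3: resource availability tests run only if every
    consistency test has passed."""
    if not all(test() for test in consistency):
        return "checks_failed"
    if not all(test() for test in resources):
        return "checks_failed"
    return "checks_passed"
```

Each test is passed in as a callable so that an agent can omit tests that do not apply to it, such as the duplicate check at the initiator or the service requirement check at agents other than the target.
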
Note that the forwarding agent also makes a path label association for the reverse direction of the path so that path management protocol packets can travel in either direction along a path. Specifically, the forwarding agent makes an association between the path R, the previous hop along P that terminates at the forwarding agent, and path P.

4 Forwarding Information in Nimrod Packets

Datagram packets and setup packets for flow mode carry almost exactly the same information and are processed by forwarding agents in similar ways. The principal difference in packet processing for datagram and setup packets is that no forwarding state is established specific to datagrams. Most of the consistency and resource availability checks previously described for setup packets are also performed for datagram packets.

Shared contents of setup and datagram packets include:

- Source and destination endpoint identifiers.
- The route in terms of the locators for nodes and their relevant connectivity specifications.
- Service requirements (e.g., bandwidth guarantees, delay bounds) from the perspective of the initiator of the packet. Forwarding agents use this information in deciding how to forward the packet toward the next node on the route.
- User data (setup messages might not always contain user data).

Most paths are composed of paths which in turn are composed of paths and so on. Hence, a flow mode data packet must contain the path labels of all the component paths at all levels, but only one path label is used for forwarding at a given time. These path labels are stacked in the packet and manipulated (pushed and popped) by the forwarding agents handling the packet.

4.1 IPv6 Optimizations

For initial implementations of Nimrod, we plan to encapsulate Nimrod packets within IP packets, independent of the version of IP currently in use. Hence, there is no information that must be added to IP packets (other than the identifier of the enclosed protocol - Nimrod).
All Nimrod-specific information will be carried in Nimrod packets encapsulated within IP.

For an IPv6 internetwork, one may opt to define Nimrod-specific options that will allow some Nimrod information to migrate up to the IP header. The proposed options include hop-by-hop options, a route option, and end-to-end options. Suggested formats for these options were provided at the December IETF and will be depicted in a future version of this document.

4.1.1 Hop-by-Hop Options

The hop-by-hop options are those that must be modified at most forwarding agents and include the stack of path labels (for flow-mode data packets) and monitoring information (which may be carried in setup packets or flow-mode data packets). The monitoring option would enable multiple path performance quantities, such as delay and throughput, to be monitored and updated at each hop along the path. All monitoring information would be expressed in terms of type, length, and value. Collected monitoring information would be returned to the other end of the path in an end-to-end performance option described below.

4.1.2 Route Option

The Nimrod route specified in terms of locators of component nodes and connectivity specifications would be included in a route option.

4.1.3 End-to-End Options

Source and destination endpoint identifiers would be carried in an end-to-end option. Path service requirements and monitored performance information would be carried in separate end-to-end options. Each service requirement and each performance parameter would be specified in terms of type, length, and value.
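
The type, length, and value representation mentioned for monitoring information, service requirements, and performance parameters can be illustrated with a small encoder and decoder. This memo does not define field widths or type codes, so the Python sketch below assumes a one-byte type and a one-byte length, and the type codes DELAY_MS and THROUGHPUT_KBPS are hypothetical.

```python
import struct
from typing import List, Tuple

# Hypothetical type codes; the memo does not assign any.
DELAY_MS = 1
THROUGHPUT_KBPS = 2


def encode_tlv(items: List[Tuple[int, bytes]]) -> bytes:
    """Encode (type, value) pairs as type, length, value triples.
    Assumes a one-byte type and a one-byte length."""
    out = bytearray()
    for type_code, value in items:
        if not 0 <= type_code <= 255:
            raise ValueError("type does not fit in one byte")
        if len(value) > 255:
            raise ValueError("value too long for a one-byte length")
        out += struct.pack("!BB", type_code, len(value))
        out += value
    return bytes(out)


def decode_tlv(data: bytes) -> List[Tuple[int, bytes]]:
    """Decode a byte string produced by encode_tlv."""
    items: List[Tuple[int, bytes]] = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated type/length header")
        type_code, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        if offset + length > len(data):
            raise ValueError("truncated value")
        items.append((type_code, data[offset:offset + length]))
        offset += length
    return items
```

Because each entry carries its own length, a forwarding agent can skip over parameter types it does not recognize and still update the ones it does.
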