Internet Draft                                          J. Noel Chiappa
Expires: yyy xx, 2001                                      zzz xx, 2001

Stack Names in TCP

Abstract

A companion document [Chiappa] defines, and proposes explicit recognition of, a new fundamental internetworking object, the "stack". It further proposes the adoption of a particular explicit naming space, the "stack name", for these objects. This document provides protocol-level detail on use of these new names in TCP [Postel81B] to provide identification of the entities at each end of a TCP connection, and shows how use of them may be deployed in an interoperable fashion.

Notes

This document also includes the description of two needed pieces of "infrastructure" support: the ability to have better support for end-end options in IP [Postel81A], and the ability to have a TCP header with longer TCP options. Were this proposal actually accepted for use, these sections ought to be split out as small, separate RFC's, one for each topic. For the moment, it is all lumped together here to make it convenient to review.

Acknowledgments

TBA.

1. Introduction

To review the background briefly, the fundamental class of new objects we propose to add are transport (and above) level entities we call 'stacks'. A stack can contain an entire host networking software complex from the transport layer (possibly including a number of different transport layers) up, as well as clients thereof. For example, it might contain a TCP, several open connections, and a number of different applications using these connections. However, it may also contain as little as a single TCP connection.

More formally, one can think of a stack as the entity involved in end-end communication; i.e. the fundamental unit of identity for end-end communication. Alternatively, a stack could be a security compartment/entity. Architecturally, fate-sharing is just one reason to divide a collection of state involved in end-end communication into two entities, and name each distinctly.
Security is another; there may be more, and the stack mechanism provides a general way to name and interact with these entities.

To name these new objects, we introduce a new namespace, "stack names" (SN's). The exact syntax, semantics, etc. chosen for the SN are not relevant to the mechanisms here - SN's are treated as a simple byte-string for the purposes of the protocol descriptions below.

2. How SN's are Used in TCP

The general concept for the use of SN's in TCP (a TCP using stack names is referred to as 'TCP-SN') is that they replace IPv4 addresses in identifying the entity at the other end of the TCP connection. This removes the current binding of the TCP state with the IP address(es) and replaces it with a more appropriate identity form.

In general, SN's are *not* sent in each TCP header. Instead, stack names replace IP addresses in the pseudo-header used to calculate the TCP checksum. So, although they are not explicitly included in each packet, they are implicitly bound to each TCP packet through the checksum. Normal TCP data packets are exactly the same size, with the exact same fields, as at present.

The outline of the scheme is simple. SN's (contained in a new TCP option) are exchanged in the initial ICP packets (which use the old pseudo-header, so that unmodified TCP's can parse them), and used thereafter in the pseudo-header, in place of the address.

(It would be possible to define a variant of TCP in which the ICP packet was also checksummed with the new-style pseudo-header, with the advantage that such packets could pass through a NAT box without needing to have the TCP checksum modified. However, this would entail the overhead of a timeout and a retransmission for cases where the TCP on the other end is unmodified.)
One representative actual implementation (4.4-Lite BSD) was examined to verify that the proposed changes are reasonable to implement and neither extensive nor expensive; they would in fact be easy to do in this particular code base.

Sections below cover the new pseudo-header, connection initiation and the new option, input packet processing, and a number of other detailed topics.

3.1 The New Pseudo-Header

An SN-based pseudo-header, similar to the current pseudo-header (PH), is defined as follows:

   Source SN (m bytes)
   Destination SN (n bytes)
   Padding (0-q bytes)
   Protocol (1 byte)
   TCP length (2 bytes)

The length of padding depends on the checksum algorithm (CA) chosen; addition of this improved end-end naming in TCP might usefully be done concurrently with the introduction of an improved checksum algorithm, since it does modify the way in which the TCP checksum is computed. Note that folding the SN change in with an improved checksum is likely to improve the chances of this change being fielded. For further discussion of the issues involved in new checksums, see [Partridge].

One thing to note is that algorithms which produce checksum results longer than 16 bits need not use the "Alternate Checksum Data Option", as described in [Partridge], with consequent processing overhead due to the presence of a TCP option in every packet. Rather, the longer checksum could be "folded" into 16 bits, with a probable minimal loss of error-detecting capability, and the result stored in the existing TCP checksum field.

The TCP length is taken from the header of the incoming segment; all the other fields are stored permanently.
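As an illustration only, the following Python sketch builds the per-connection part of the SN pseudo-header and checksums a segment against it. It assumes the existing 16-bit ones-complement TCP checksum rather than any improved algorithm, so the padding rule (align the stored fields to a 2-byte boundary) is an assumption, and the function names are hypothetical.

```python
import struct

TCP_PROTOCOL = 6  # IP protocol number for TCP


def ones_complement_sum(data: bytes, start: int = 0) -> int:
    """16-bit ones-complement sum of data, continuing from a partial sum.

    Assumes whatever `start` already covers had an even length.
    """
    if len(data) % 2:
        data += b"\x00"  # pad a trailing odd byte with zero
    total = start
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def sn_pseudo_header_prefix(src_sn: bytes, dst_sn: bytes) -> bytes:
    """Source SN, destination SN, padding and protocol: the stored fields."""
    body = src_sn + dst_sn
    pad = (-(len(body) + 1)) % 2  # make SNs + padding + protocol even in length
    return body + b"\x00" * pad + bytes([TCP_PROTOCOL])


def tcp_sn_checksum(prefix_sum: int, segment: bytes) -> int:
    """Checksum a segment whose checksum field is zero, given the stored sum."""
    total = ones_complement_sum(struct.pack("!H", len(segment)), prefix_sum)
    total = ones_complement_sum(segment, total)
    return ~total & 0xFFFF
```

Here `ones_complement_sum(sn_pseudo_header_prefix(...))` would be computed once when the connection opens, and only the TCP length and the segment itself are summed per packet.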
(A clever implementor will design their checksum routine in such a way that one can start it with partial results pre-loaded, and they will then arrange their connection opening code to save the result of the checksum, run to the point where it has covered all the fields in the pseudo-header up through the protocol; that way, there is zero overhead to the per-packet checksumming from the longer pseudo-header.)

3.2 TCP Connection Initiation with Stack Names

A TCP implementing the use of SN's which wants to use them must send an option, the "TCP Stack Name Option" (TCPSN), in its initial SYN packet. The option will contain the SN of the initiator of the TCP connection, and also indicates that that entity supports use of TCP-SN, and wants to use it. The option might look as follows:

   Option number (1 byte)
   Option length (x bytes)
   Source SN length (1 byte)
   Source SN (m bytes)
   Destination SN length (1 byte)
   Destination SN (n bytes)

The size of the option length field would normally be 1 byte, unless we adopt an extended TCP option format (see section 4.2, below).

There is one issue with the contents of the option, and that is whether or not one should include the SN of the listener that this connection is trying to connect to. The problem is that that SN is not necessarily known in all cases. For instances where one is connecting to a large service, one may be passed off to one of a number of actual servers which implement the service, each a separate stack, with its own SN.

So, a TCPSN option with a Destination SN length of 0 indicates that the source stack does not know the SN of the stack it is trying to open a connection with. (Alternatively, two different options could be specified, one for carrying the initiator's SN, and one for carrying the listener's SN - if the listener's SN is not known, only the option for the initiator's SN would be included, etc.)
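The single-option layout above could be encoded and parsed roughly as in the following Python sketch. It assumes the 1-byte option length form, with the length counting the whole option as in other TCP options; the option number 253 is a hypothetical placeholder, since no value is assigned here, and the function names are likewise illustrative.

```python
from typing import Optional, Tuple

TCPSN_KIND = 253  # hypothetical placeholder; no option number is assigned


def encode_tcpsn(src_sn: bytes, dst_sn: bytes = b"") -> bytes:
    """Build a TCPSN option; an empty dst_sn means the listener's SN is unknown."""
    total = 2 + 1 + len(src_sn) + 1 + len(dst_sn)
    if total > 255:
        raise ValueError("SNs too long for a 1-byte option length")
    return (bytes([TCPSN_KIND, total, len(src_sn)]) + src_sn
            + bytes([len(dst_sn)]) + dst_sn)


def decode_tcpsn(option: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return (source SN, destination SN or None if its length is 0)."""
    if len(option) < 4 or option[0] != TCPSN_KIND or option[1] != len(option):
        raise ValueError("not a well-formed TCPSN option")
    src_len = option[2]
    dst_len_at = 3 + src_len  # index of the Destination SN length byte
    if dst_len_at >= len(option):
        raise ValueError("source SN overruns the option")
    dst_len = option[dst_len_at]
    if dst_len_at + 1 + dst_len != len(option):
        raise ValueError("destination SN length does not match option length")
    src_sn = option[3:dst_len_at]
    dst_sn = option[dst_len_at + 1:]
    return src_sn, (dst_sn or None)
```

An initiator that does not know the listener's SN would call `encode_tcpsn(my_sn)`, and the receiver would see `None` for the destination.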
If the listener supports use of SN's in TCP, and wants to use them, it must respond with a SYN-ACK packet including a TCPSN option, with the fields filled in appropriately (i.e. the contents will normally be reversed, with the Source SN filled in if it was previously null). The TCPSN option must not be sent in any segment that does not have the SYN bit set.

There is no reason to provide support for the case where the listener implements SN's and wishes to use them, and the initiator does not - the listener cannot request use of an SN if the initiator does not implement them, or does not want to use them.

An implementation of TCP which does not support the TCPSN option should silently ignore it (as required by RFC 1122 [Braden]). Ignoring the option will force any TCP attempting to use the new PH/CA to use the standard TCP PH/CA, thus ensuring interoperability.

[What do we do if the Destination SN in the initial SYN packet does not match what the listener thinks its SN is? One likely possibility is that it is an error condition, and an RST should be sent and the connection terminated. Either that, or send an ICMP message - one argument is that this may not be clean, since the SN is a TCP option, but we can return ICMP errors now for "no such port". And should we allow the "Destination SN" in the returned packet to be NULL, or should we force it to be there for sanity checking? And what's the response if it does not match?]

[One can use all sorts of IPSec stuff with this, all looked up in a single DNS transaction, along with the A-record for the address, of course.]

3.3 Which PH/CA to Use

It should be obvious that the alternative pseudo-header and checksum algorithm (PH/CA) must not be used in the first SYN segment, since until the initiator knows that the listener understands TCP with SN's, any use of the new PH/CA will generate what appear to be bad checksums.
Any segment with the SYN bit set must always use the standard TCP checksum algorithm, so that the SYN segment will always be understood by the receiving TCP.

Furthermore, the switch to the new PH/CA must be "synchronized", so that each end uses the appropriate PH/CA for the packet it has in hand. This is fairly simple to achieve. From the initiator's end, since it cannot send any further packets until after it has received the SYN-ACK packet, and that packet must contain the acknowledging option, then provided that the recipient indicates that it is prepared to use TCP with SN's, any packets sent by the initiator after the initial packet must use the new PH/CA.

From the listener's end, the SYN-ACK packet must also be sent with the basic TCP PH/CA, since the initiator may not know the listener's SN - which it would need to create the pseudo-header needed to verify the incoming packet, which is going to tell it the listener's SN. Again, any packets sent after that are sent with the new PH/CA.

Finally, because RST segments may also be received or sent without complete state information, any segment with the RST bit set must use the standard TCP checksum.

3.4 TCP Header Input Processing

Another area which needs to be covered is how a TCP decides which PH/CA to use for incoming TCP packets, and the changes needed to normal input packet processing to achieve this. Normally, the checksum is checked before any other TCP processing is done - but without knowing which checksum to use, how does one check the checksum? (Interestingly, this topic is not discussed in [Partridge], although the solutions are fairly obvious.)

One method is to store a state bit indicating whether a particular connection is using the new PH/CA in the TCB, and then re-arrange input processing somewhat. In the new sequence, it is first temporarily assumed that the source and destination port, etc. are correct, and they will be used to try and find a "candidate" TCB - before the checksum is checked.
Either they are correct, and the correct TCB is found; or they are not correct, and either i) no matching TCB is found (in which case the packet can be silently discarded, or, if it passes the basic PH/CA test, a RST can be returned), or ii) the wrong TCB is found, and used temporarily, until the packet fails the checksum (at which point it will be silently discarded). The packet is then checksummed using the method indicated in the TCB, and if it passes, normal input processing resumes.

The second method is to allocate a flag bit ("NPC") in the TCP header to mean "new PH/CA in use"; use of this flag bit would (notionally) not be allowed unless "Stack Name" options had previously been exchanged, so there is no upwards compatibility issue. For packets with this bit set, one again needs to find the supposedly correct TCB (because one doesn't know what the actual data is in the PH, since the SN is not carried in the packet), and then checksum the packet with the correct PH/CA.

The only functional difference at the receiver of the packet is that for packets without this bit set, input processing is then exactly identical to the existing TCP spec. This would normally not be enough of an advantage to make it worth using a flag bit. However, being able to operate TCP-SN through NAT boxes (as detailed below) may make this option preferable.

Note that in either case, the extra overhead (not counting the overhead of a newer, and presumably more computationally expensive, checksum) amounts to a test and branch per packet - and even this slight increase in cost might actually be cancelled out, in many implementations, by the technique of storing the partial PH checksum. Thus, use of SN's with TCP produces no increase in overhead in terms of space, and very little in terms of extra processing overhead.
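The first method (a state bit in the TCB), combined with the rule from section 3.3 that SYN and RST segments always use the standard checksum, might be sketched in Python as follows. This is a simplified illustration: the `TCB` class, the connection key and the `checksum_ok` callback are hypothetical, and handing a SYN to a listening socket is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

SYN, RST = 0x02, 0x04  # TCP flag bits as defined in RFC 793

ConnKey = Tuple[str, int, str, int]  # (src addr, src port, dst addr, dst port)


@dataclass
class TCB:
    uses_sn: bool = False  # state bit: new PH/CA negotiated on this connection


def select_ph_ca(flags: int, tcb: Optional[TCB]) -> str:
    """Return "standard" or "sn" for one segment."""
    if flags & (SYN | RST):
        return "standard"  # SYN and RST segments never use the new PH/CA
    if tcb is not None and tcb.uses_sn:
        return "sn"
    return "standard"


def process_input(key: ConnKey, flags: int, tcbs: Dict[ConnKey, TCB],
                  checksum_ok: Callable[[str, Optional[TCB]], bool]) -> str:
    """Find a candidate TCB first, then verify with the PH/CA it indicates."""
    tcb = tcbs.get(key)  # candidate only: ports are not yet known to be intact
    method = select_ph_ca(flags, tcb)
    if not checksum_ok(method, tcb):
        return "discard"  # silently dropped, whichever TCB was tried
    if tcb is None:
        return "no-tcb"  # caller may pass it to a listener or return a RST
    return "deliver"  # normal input processing resumes
```

The per-packet cost over existing input processing is the single test in `select_ph_ca`.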
3.5 TCP-SN in the Presence of NAT Boxes

While it may appear that TCP-SN is well suited to operation through NAT boxes (since other than the checksum in the two initial ICP packets, nothing in any of the other packets depends on the IP address, so that they may pass through NAT boxes unmodified), there is one issue that needs to be handled. That is, how does a NAT box know *which* TCP packets it can pass through unmodified, and which need to have their checksums updated?

One possible option is to have the NAT box watch the ICP, and note which connections transit to TCP-SN; packets on those connections can be left unmodified thereafter. While this may appear to have performance impacts (since TCP packets must be examined for SYN packets) and state storage requirements (since a state bit must be retained for all TCP connections which are using TCP-SN), it is not clear that it has any actual impact on real NAT boxes, which seem to incur many of these costs already anyway, for other reasons.

The other alternative, as alluded to above, is to use a flag bit in the TCP header to indicate which PH/CA is in use; if the NPC bit is set, indicating the packet's checksum is not dependent on the IP address, no adjustment of the TCP checksum needs to be done by the NAT box.

In either case, no major difficulty is presented by the use of TCP-SN in the presence of NAT boxes.

3.6 TCP-SN Using End-end Security in the Presence of NAT Boxes

While TCP-SN operates easily in the presence of NAT boxes, TCP using both end-end security and TCP-SN, in the presence of NAT boxes, is more problematic. Of course, at the moment, TCP with end-end security does not function at all in the presence of NAT boxes, so a method to allow it to work, even if convoluted, is an improvement on the existing situation.

The fundamental problems are twofold.
First, in TCP-SN as specified above, the ICP packets are sent with the old PH/CA - and after passing through a NAT box, this will need to be updated - which will cause the end-end security to fail, since the packet has been modified. Second, NAT boxes inevitably have to modify one or more of the addresses - and this will generally cause the end-end security to fail.

An easy fix is available to the first problem, which is to send the ICP packets using the new checksum, and have the NAT boxes not modify the checksum. This is easy to do if the NPC flag is available. Alternatively, in the packet analysis based approach, NAT boxes must ignore all the packets on a connection using security, not just everything after the ICP. In the case where the initiator does not yet know the SN of the listener, it can temporarily use a pseudo-header with a null listener SN.

Note that there is no actual problem with an unmodified host receiving an ICP packet checksummed with the new PH/CA (which it will not be able to understand); since ICP packets are only sent with the new checksum if end-end security is in use, and any TCP connection with end-end security cannot pass through a NAT box anyway, nothing is lost.

The second problem is not really a TCP-SN problem at all, but rather a problem with IP Security.

3.7 TCP Input Demultiplexing Complexities

Depending on exactly what use TCP-SN is put to, an issue (which was somewhat glossed over above) can arise with finding the correct TCB for an incoming packet. Briefly, the TCB is found by matching the source and destination port and IP address. Logically, once we start using SN's to identify the TCP at the far end, we should switch to using the source and destination SN and ports to do the demultiplexing. However, since the SN's are not (normally) carried in packets, a number of more complex cases can arise from this.

Note that in the "normal" case (i.e.
two simple hosts conversing), the existing demultiplexing algorithm will work fine, since the IPv4 address can be used as a reasonable stand-in for the SN for demultiplexing purposes.

The first non-trivial case is with mobile and/or multi-homed hosts. If a host wishes to change the address at which a given connection is terminated, while this is possible, a number of issues have to be resolved. For one, while the binding between connection and address can be changed, that change must be properly synchronized between the two ends, otherwise packets will not be associated with the proper TCP. For another, there are potential security problems unless the binding between connection and IPv4 address (or, alternatively, between SN and IPv4 address) is secured, probably by some cryptographic mechanism.

A host which is multi-homed and is experiencing flaky network connectivity might actually wish to have multiple concurrent IPv4 addresses in use. This could easily be handled, at least as far as finding the right TCB for incoming packets (which IPv4 address to use on outgoing packets is a different problem, somebody else's :-) with a "two-level" TCB structure (or, alternatively, depending on the implementation, a slight change to the hash structure by which TCB's are found). In short, multiple TCB index entries (one for each IPv4 address used by the other end) all point to the same TCB.

The second, and more painful, case involves mobile stacks, which can potentially cause clashes in the port/address space. An example will make this clear: suppose a connection (from stack S1) with local port Pl and distant port Pd at host address Ad moves from a host at address Al1 to a host at address Al2. However, the host at Al2 *already* has a stack (S2) with a TCP connection with local port Pl, and distant port Pd at Ad! Now we have two distinct connections with local port Pl at address Al2, and distant port Pd at Ad.
Of course, they are to two different stacks at Al2 - but how to differentiate the incoming packets?

In more abstract terms, the problem is that a host with multiple stacks resident on it does not have a single TCP port space, but rather multiple ones, one for each stack. If there are no clashes, one can, for incoming packet demultiplexing, have a merged version for that purpose. However, if there is a clash, what to do?

One technically feasible solution is to try both pseudo-headers, and see which (if either) generates the correct checksum; only one should. With clever implementation, this is not necessarily as prohibitively expensive as it might be; a partial checksum for the data in the packet can be computed once only, and suitably combined with the pre-computed partial checksum for each pseudo-header in turn, until one matches. (This might be more complex if the new checksum is one in which the partial checksum of the later data depends on the length of the [effectively prepended] pseudo-header; one approach to this is to pad the PH to an even multiple of the checksum cycle length with zero padding.)

3.7.1 The Stack Selector (SSEL) Input Demultiplexing Option

Alternatively, one could add an IP option to allow demultiplexing on the SN. This is *not* done as a TCP option because other transport protocols, which do demultiplexing in a similar way, might see similar port clashes, and all can use a common mechanism.

Several ideas present themselves for the detailed mechanism. In one, if we wish to minimize the size of the SN-carrying option, and the cost of processing it on input, we can define a short, fixed-length local stack name, a 'stack selector' (SSEL); i.e. one of a scope *local* to the destination address only. When a stack moves to a new address, if there are any port clashes, it allocates an SSEL, and communicates that to the far end, for them to use in incoming packets.
The option would look like:

   Option number (1 byte)
   Option length (1 byte)
   Destination SSEL (2 bytes)

A length of 16 bits is chosen for the SSEL since this allows the complete option to be exactly one long-word in length; 16 bits allows up to 2^16 operating stacks per interface (or host, depending on how the SSEL namespace is managed on a multi-homed host), which seems adequate for all but the most bizarre purposes.

Now, when the packets arrive, the SSEL is examined to find the correct stack to hand the packet to; the packet is then demultiplexed on the port, as before.

One issue here is that IP options are expensive. See the section below ("IPv4 End-End Payload Protocol") for the solution to this.

3.7.2 The Stack Name Input Demultiplexing Option

Another possible mechanism is to actually carry the destination SN in the packet, in an IP option. This is expensive, but might be useful for control operations or fault analysis or other situations where robustness is paramount, and absolute efficiency is less of an issue. (In this case, one might as well carry the source SN, too, for similar reasons.) To allow the possibility of either combination, two separate options are defined, one for each; these options would look like:

   Option number (1 byte) [two opcodes allocated, one for source and one for destination]
   Option length (1 byte)
   Stack Name (n bytes)

Restricted IP option space is likely to be an issue here too, as with the TCPSN option; the same IPv4 mechanism (the End-End Payload protocol) can handle this too.

3.7.3 The Burden of Stack Mobility Overhead

In closing this section, one should note that this port-clash problem, and the mechanism needed to solve it, is only present if one has mobile stacks. If a host does not support mobile stacks, it need not implement any of these mechanisms.
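The checksum-based fallback from section 3.7, in which each candidate pseudo-header is tried against a once-computed partial checksum of the packet, might look like the following minimal Python sketch. It assumes the existing 16-bit ones-complement checksum and pseudo-header prefixes already padded to an even length; the function names and the dictionary of per-stack partial sums are hypothetical.

```python
import struct
from typing import Dict, Optional


def oc_add(a: int, b: int) -> int:
    """Ones-complement addition of two 16-bit partial sums."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def oc_sum(data: bytes) -> int:
    """16-bit ones-complement sum of a byte string."""
    if len(data) % 2:
        data += b"\x00"  # pad a trailing odd byte with zero
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total = oc_add(total, word)
    return total


def find_stack(length_and_segment: bytes,
               prefix_sums: Dict[str, int]) -> Optional[str]:
    """Return the one stack whose pseudo-header makes the checksum verify.

    length_and_segment is the 2-byte TCP length followed by the received
    segment; prefix_sums maps each clashing stack to the stored partial sum
    of its pseudo-header fields up through the protocol byte.
    """
    data_sum = oc_sum(length_and_segment)  # computed once per packet
    matches = [name for name, prefix in prefix_sums.items()
               if oc_add(prefix, data_sum) == 0xFFFF]  # 0xFFFF means valid
    return matches[0] if len(matches) == 1 else None
```

A result of `None` covers both a corrupted packet and the unlikely case where more than one pseudo-header verifies.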
As far as the port clash part of it goes, a non-mobile host might simply decline to provide support for use of SSEL's at the other end, in which case a mobile stack at the far end can fall back to the "find a checksum that works" mechanism.

Of course, some of these mechanisms require support at the far end - and not just for port clash mechanisms, but for the general mobility issues. However, in no case will the overhead of actually using those mechanisms, and in particular the overhead of the extra data in the packets, be incurred unless one end is mobile. Thus, just as with IPv6 Mobility [Perkins], the overhead needed to support mobility is only incurred by those who wish to make use of it.

4.1 IPv4 End-End Payload Protocol

Strictly speaking, this is a bit far afield from SN's, but like TCP Header Extensions (below) it's a needed piece of infrastructure for SN's (although somewhat less desperately so), and it seemed best to describe it here.

Basically, the problem is that IP-level end-end options in IPv4 [Postel81A] impose an unacceptable performance impact, since in most high-performance routers, all IPv4 packets with options are processed by the "slow path". However, IPv6 [Deering] contains an elegant mechanism one can use to avoid this: the "payload" mechanism.

We can define a similar End-End Payload protocol for IPv4, one that can i) carry normal user data of any protocol, ii) carry any number of end-end options without having a performance impact, iii) provide room for more options than the IP header can currently support (if desired), and iv) provide for more protocols (again, if desired).

The header syntax for the End-End Payload protocol would be:

   End-end Payload header length (2 bytes)
   Payload protocol number (2 bytes)
   End-end Option0 type (x bytes)
   End-end Option0 length (y bytes)
   End-end Option0 data (n0 bytes)
   ...
   End-end OptionM type (x bytes)
   End-end OptionM length (y bytes)
   End-end OptionM data (nM bytes)
   Padding

Note that the "payload protocol" is two bytes, not one, with the first 256 being the same as the existing IP protocol numbers. This will allow less-common protocols (e.g. routing protocols) to be given protocol numbers from this "extended" space; they will incur the overhead of having to be carried inside a Payload protocol header, but that is an acceptable tradeoff.

Again, options stored in this area could either be in the canonical IP option format, or an "extended" option format (i.e. normally x and y are both 1, but this could be changed here).

4.1.1 Upwardly compatible introduction of the IPv4 End-End Payload Protocol

How can one be certain the host at the far end implements the End-End Payload? One can't, but if the host supports the TCPSN option, it must also support the IPv4 E-EP protocol, so it's safe to design TCPSN to require support for the E-EP protocol.

4.2 TCP Header Extension Option

One problem with carrying the SN's in a TCP option is the limited amount of room available for options in the TCP header. The maximum length of the TCP header is 15 long-words, or 60 bytes; since the basic TCP header is 20 bytes, this leaves a maximum of 40 bytes for options. This is unlikely to be enough room to hold many longer SN's (especially if some headers hold both the initiator and listener SN's).

The solution here is to provide a general purpose extension to TCP allowing longer options. (This is preferred because the number of TCP options, particularly those found in every packet, is increasing all the time, and many, such as SACK, are already cramped by the limited size of the TCP option area.)

Several different techniques are possible to do this.
All would define a new "Use Extended Header" option (TCPXH, a zero-length option with no data), which is passed from the initiator in the SYN, and returned by the listener in the SYN-ACK, indicating both sides understand the header extension mechanism.

If the listener supports the extended header, and wants to allow its use, it must respond with a SYN-ACK packet including a TCPXH option. The TCPXH option must not be sent in any segment that does not have the SYN bit set.

Since presumably this would be implemented at the same time as the TCPSN option, there would be no problem with needing to deal with TCP's which implemented TCPSN, but not the TCPXH.

As to the actual method for indicating that an extended header is present, indicating the length thereof, and storing the extended header, there are several possible approaches.

One possible alternative for flagging TCP segments with an extended header is the use of a separate option, the "TCP Extended Header Present" option (TCPXHP); the presence of this in any TCP segment would indicate the presence of an extended header. The other possible alternative, and the preferred one, involves the use of yet another flag bit, "Extended Header" (XH), use of which would (notionally) not be allowed unless "Use Extended Header" options had previously been exchanged.

Three basic approaches to the storage of the length of the extended header, and the storage of the actual extended header itself, are possible.

In the first, if the TCPXHP option is used, it could directly contain the length of the extended header, i.e. the TCPXHP option would look like:

   Option number (1 byte)
   Option length (1 byte)
   Extended Header Length (2 bytes)

and the actual data of the extended header would be stored after the normal TCP segment header.
Options stored in this area could either be in the canonical TCP option format (1 byte of option number, and 1 byte of length), or an "extended" option format (more than 1 byte of either) could be used if this was deemed desirable.

In the second, which uses the XH flag bit, we note that the URGENT field is little used at the moment. It would be possible to reuse the URGENT field, in packets where the XH bit was set (i.e. no packet could have both the XH bit and the URG bit set), to indicate that the URGENT field held the length of an "extended header".

Two slightly different interpretations are possible for where and how the extended options are stored. In the former, the true length of the TCP header, in bytes, is in the URGENT field (i.e. the 4-bit TCP header length field is to be ignored), and the options field in the TCP header could thus be far longer than 40 bytes. In the latter, there is a new "extended options" area immediately after the normal TCP header, the length of which is given by the reinterpreted URGENT field. In the latter case, options stored in this area could again either be in the canonical TCP option format, or an "extended" option format.

The disadvantage of re-using the URGENT field is that other proposed changes to TCP (e.g. the TCP Message Boundary Option) also propose to reuse this field. One could simply specify that no TCP segment could use *both* the XH and the MBO (since the MBO re-interprets the URG bit, it would fall under the provision that no segment could have both XH and URG set); this is not a problem for TCPSN, which would only be in the first segment anyway. It might be a problem for other potential clients of XH, such as SACK, though.

Alternatively, one could redefine MBO, so that it uses an option in those segments in which it wishes to indicate a message boundary.
(This would be the sensible alternative if one wished to allow both MBO and XH in the same TCP segment, since XH is useful for any number of different options, not just TCPSN.)

The third potential approach for storing extended options, again used with the XH flag bit, is to define an optional "extended header length" field (probably 2 bytes long) which, when XH is set, will be placed after the TCP header (as defined by the 4-bit TCP header length field). The actual options will follow immediately after. Again, options stored in this area could either be in the canonical TCP option format, or an "extended" option format.

4.2.1 TCP Header Extension and TCPSN

There is one small problem with the above - how can one use the extended header to send the TCPSN, which must be in the SYN packet, without knowing whether the listener implements TCPXH - which one can only find out from the SYN-ACK?

The tasteless kludge solution is to send a packet with all three: TCPXH and TCPSN options, *and* with the XH bit set (assuming this method of doing the extended header is used - see below for analysis of the use of the TCPXHP option). One of four things will then happen (since [Postel81B] does not specify the behaviour of a TCP if a segment is received with any of the 'Reserved' bits non-zero):

   i) you get back a RST (in which case you know that the listener does not implement TCPXH, and thus not TCPSN either - at this point you can resend the SYN without the TCPXH and the TCPSN), or

   ii) you get back a SYN-ACK with a TCPXH and a TCPSN (in which case all is well, proceed), or

   iii) you get back nothing (which can mean that the SYN packet was lost, or, more likely, that the listener TCP did not like the packet and discarded it silently), or

   iv) you get back a SYN-ACK with no TCPXH and no TCPSN (very bad).
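The four outcomes above, together with the recovery steps described in the paragraphs that follow, can be summarized as a small decision function. The Python sketch below is an interpretation rather than a specification: the `Action` names are hypothetical, the limit of three probes stands in for the "small fixed number" of retransmissions, case i is read as falling back to a plain SYN (consistent with the recovery for case iv), and a SYN-ACK carrying only one of the two options is treated like case iv.

```python
from enum import Enum
from typing import Optional, Set


class Action(Enum):
    PROCEED_WITH_TCP_SN = "proceed"               # case ii
    RESEND_PLAIN_SYN = "resend plain SYN"         # case i
    RETRANSMIT_PROBE = "retransmit probe"         # case iii, tries remain
    FALL_BACK_TO_PLAIN_TCP = "fall back"          # case iii, tries exhausted
    RESET_THEN_RESEND_PLAIN_SYN = "RST, resend"   # case iv


def next_action(reply: Optional[str], options: Set[str],
                probes_sent: int, max_probes: int = 3) -> Action:
    """Decide what the initiator does after sending the probing SYN.

    reply is "RST", "SYN-ACK", or None when nothing came back in time;
    options holds the names of the TCP options seen in a SYN-ACK.
    """
    if reply is None:
        if probes_sent < max_probes:
            return Action.RETRANSMIT_PROBE
        return Action.FALL_BACK_TO_PLAIN_TCP
    if reply == "RST":
        return Action.RESEND_PLAIN_SYN
    if reply == "SYN-ACK":
        if {"TCPXH", "TCPSN"} <= options:
            return Action.PROCEED_WITH_TCP_SN
        # The listener may be holding the extended options as user data.
        return Action.RESET_THEN_RESEND_PLAIN_SYN
    raise ValueError(f"unexpected reply: {reply!r}")
```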
For the case where you get no reply, the solution is fairly simple - retransmit the SYN up to a small fixed number of times (to handle the case where it was actually dropped by the network), and then back off to normal TCP. This will on very rare occasions produce a case where a connection is initiated using normal TCP, when both sides are actually capable of doing TCP-SN; this is judged acceptable (and in any case will happen with roughly the same frequency as a report from TCP that the destination host is down, when it is in fact up - and for exactly the same reason - a statistical run of lost packets).

The last case, when you get back a SYN-ACK without a TCPXH and a TCPSN, is ugly, because it means that the listener TCP doesn't implement TCPXH but tried to process the packet anyway, so the listening TCP will presumably already have held the extended options as what it thought was user data, intending to pass them on to the client when the connection moved to ESTABLISHED. (This assumes that the listening TCP accepts DATA in the SYN packet - not all do.) The recovery at this point is to send an RST, which should cause the listener (which is in SYN-RECEIVED state) to close the connection, discard the "initial" data, and erase the TCB; thereby allowing an immediate retransmission of the SYN packet without the TCPXH and TCPSN options (which the listener doesn't implement anyway).

If the TCPXHP option is used, you should always get the fourth case behaviour (since no flag bit will be set), and the effects will be the same (the listener is buffering the "data"), and it can be handled in the same way (send an RST).

References

[Braden] Robert T. Braden, "Requirements for Internet Hosts - Communication Layers", RFC 1122, University of Southern California, Information Sciences Institute, Marina Del Rey, Calif., October 1989.

[Chiappa] J. Noel Chiappa, "Stacks and Stack Names: A Proposed Enhancement to the Internet Architecture", Internet Draft, In Preparation.

[Deering] S.
Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, University of Southern California, Information Sciences Institute, Marina Del Rey, Calif., December 1998.

[Partridge] Craig Partridge, J. Zweig, "TCP Alternate Checksum Options", RFC 1146, University of Southern California, Information Sciences Institute, Marina Del Rey, Calif., March 1990.

[Perkins] Charles Perkins, David B. Johnson, "Mobility Support in IPv6", Internet Draft, April 2000.

[Postel81A] Jon Postel, "Internet Protocol", RFC 791, University of Southern California, Information Sciences Institute, Marina Del Rey, Calif., September 1981.

[Postel81B] Jon Postel, "Transmission Control Protocol", RFC 793, University of Southern California, Information Sciences Institute, Marina Del Rey, Calif., September 1981.