Internet Draft						J. Noel Chiappa
Expires:  yyy xx, 2001					zzz xx, 2001


			    Stack Names in TCP


Abstract

	A companion document [Chiappa] defines, and proposes explicit
recognition of, a new fundamental internetworking object, the "stack". It
further proposes the adoption of a particular explicit naming space, the
"stack name", for these objects.
	This document provides protocol-level detail on use of these new
names in TCP [Postel81B] to provide identification of the entities at each
end of a TCP connection, and shows how use of them may be deployed in an
interoperable fashion.


Notes

	This document also includes the description of two needed pieces of
"infrastructure" support: the ability to have better support for end-end
options in IP [Postel81A], and the ability to have a TCP header with longer
TCP options.
	Were this proposal actually accepted for use, these sections ought to
be split out as small, separate RFC's, one for each topic. For the moment,
it's all being lumped here to make it convenient to review.


Acknowledgments

	TBA.

1. Introduction

	To review the background briefly, the fundamental class of new
objects we propose to add are transport (and above) level entities we call
'stacks'.
	A stack can contain an entire host networking software complex from
the transport layer (possibly including a number of different transport
layers) up, as well as clients thereof. For example, it might contain a TCP,
several open connections, and a number of different applications using these
connections. However, it may also contain as little as a single TCP
connection.
	More formally, one can think of a stack as the entity involved in
end-end communication; i.e. the fundamental unit of identity for end-end
communication. Alternatively, a stack could be a security compartment/entity.
	Architecturally, fate-sharing is just one reason to divide a
collection of state involved in end-end communication into two entities, and
name each distinctly. Security is another; there may be more, and
the stack mechanism provides a general way to name and interact with these
entities.

	To name these new objects, we introduce a new namespace, "stack
names" (SN's). The exact syntax, semantics, etc chosen for the SN are not
relevant to the mechanisms here - SN's are treated as a simple byte-string
for the purposes of the protocol descriptions below.


2. How SN's are Used in TCP

	The general concept for the use of SN's in TCP (a TCP using
stacknames is referred to as 'TCP-SN') is that they replace IPv4 addresses in
identifying the entity at the other end of the TCP connection. This removes
the current binding of the TCP state with the IP address(es) and replaces it
with a more appropriate identity form.
	In general, SN's are *not* sent in each TCP header. Instead, stack
names replace IP addresses in the pseudo-header used to calculate the TCP
checksum. So, although they are not explicitly included in each packet, they
are included implicitly in each packet, by being in the pseudo-header used to
calculate the TCP checksum. Thus, SN's are implicitly bound to each TCP
packet. Normal TCP data packets are exactly the same size, with the exact
same fields, as at present.

	The outline of the scheme is simple. SN's (contained in a new TCP
option) are exchanged in the initial ICP packets (which use the old pseudo-
header, so that unmodified TCP's can parse them), and used thereafter in the
pseudo-header, in place of the address.
	(It would be possible to define a variant of TCP in which the ICP
packet was also checksummed with the new-style pseudo-header, with the
advantage that such packets could pass through a NAT box without needing to
have the TCP checksum modified. However, this would entail the overhead of a
timeout and a retransmission for cases where the TCP on the other end is
unmodified.)

	One representative actual implementation (4.4-Lite BSD) was examined
to verify that the proposed changes are reasonable to implement and neither
extensive nor expensive; they would in fact be easy to do in this particular
code base.
	Sections below cover the new pseudo-header, connection initiation and
the new option, input packet processing, and a number of other detailed
topics.

	<Query: Should RST packets include a SN option? Probably yes, they
	had better include stackname information (somehow), since RST
	processing requires TCB demux.>


3.1 The New Pseudo-Header

	A SN-based pseudo-header, similar to the current pseudo-header (PH),
is defined, as follows:

	Source SN (m bytes)
	Destination SN (n bytes)
	Padding (0-q bytes)
	Protocol (1 byte)
	TCP length (2 bytes)

The length of padding depends on the checksum algorithm (CA) chosen; addition
of this improved end-end naming in TCP might usefully be done concurrently
with the introduction of an improved checksum algorithm, since it does modify
the way in which the TCP checksum is computed. Note that folding the SN
change in with an improved checksum is likely to improve the chances of this
change being fielded. For further discussion of the issues involved in
new checksums, see [Partridge].
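
	As a rough illustration of the layout above, the following sketch
builds the SN-based pseudo-header and checksums a segment with it. It assumes
the existing 16-bit ones-complement checksum is retained, and it assumes a
padding rule (pad to a whole number of 16-bit words) which this document does
not fix; the function names are illustrative only.

```python
import struct

TCP_PROTOCOL = 6  # IP protocol number for TCP


def ones_complement_sum(data: bytes, start: int = 0) -> int:
    """16-bit ones-complement sum of data, continuing from a partial sum."""
    if len(data) % 2:
        data = data + b"\x00"  # pad odd-length input with a zero byte
    total = start
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def sn_pseudo_header(src_sn: bytes, dst_sn: bytes, tcp_length: int) -> bytes:
    """Source SN, Destination SN, Padding, Protocol (1), TCP length (2)."""
    names = src_sn + dst_sn
    # Assumed padding rule: Protocol and TCP length add 3 bytes, so the
    # names plus padding must have odd length for the whole pseudo-header
    # to be a whole number of 16-bit words.
    padding = b"\x00" * ((len(names) + 1) % 2)
    return names + padding + struct.pack("!BH", TCP_PROTOCOL, tcp_length)


def tcp_sn_checksum(src_sn: bytes, dst_sn: bytes, segment: bytes) -> int:
    """Checksum of a segment whose checksum field is currently zero."""
    ph = sn_pseudo_header(src_sn, dst_sn, len(segment))
    return ~ones_complement_sum(segment, ones_complement_sum(ph)) & 0xFFFF
```

A receiver holding the same two SN's would rebuild the pseudo-header and
check that the sum over pseudo-header and segment comes to 0xFFFF.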
	One thing to note is that algorithms which produce checksum results
longer than 16 bits need not use the "Alternate Checksum Data Option", as
described in [Partridge], with consequent processing overhead due to the
presence of a TCP option in every packet. Rather, the longer checksum could
be "folded" into 16 bits, with a probable minimal loss of error-detecting
capability, and the result stored in the existing TCP checksum field.
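
	One way such folding might be done is sketched here. Both the
choice of XOR as the folding operation and the use of CRC-32 as the stand-in
"longer checksum" are assumptions made purely for illustration; this
document does not select either.

```python
import zlib


def fold_to_16(checksum: int, width: int = 32) -> int:
    """XOR the 16-bit slices of a wider checksum into one 16-bit value."""
    folded = 0
    for shift in range(0, width, 16):
        folded ^= (checksum >> shift) & 0xFFFF
    return folded


def folded_crc32(data: bytes) -> int:
    """A 32-bit CRC folded so that it fits the existing checksum field."""
    return fold_to_16(zlib.crc32(data))
```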

	The TCP length is taken from the header of the incoming segment; all
the other fields are stored permanently.
	(A clever implementor will design their checksum routine in such a
way that one can start it with partial results pre-loaded, and they will then
arrange their connection opening code to save the result of the checksum, run
to the point where it has covered all the fields in the pseudo-header up
through the protocol; that way, there is zero overhead to the per-packet
checksumming from the longer pseudo-header.)
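
	The pre-loading trick can be sketched as follows, again assuming
the ones-complement checksum and an assumed padding rule which aligns the
TCP length on a 16-bit boundary. The class and attribute names are
hypothetical.

```python
import struct


def ones_complement_sum(data: bytes, start: int = 0) -> int:
    """16-bit ones-complement sum of data, continuing from a partial sum."""
    if len(data) % 2:
        data = data + b"\x00"
    total = start
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


class SnChecksummer:
    """Per-connection checksum state, built once when the connection opens."""

    def __init__(self, src_sn: bytes, dst_sn: bytes, protocol: int = 6):
        names = src_sn + dst_sn
        # Assumed padding rule: names + padding + protocol byte fill whole
        # 16-bit words, so the 2-byte TCP length that follows is aligned.
        padding = b"\x00" * ((len(names) + 1) % 2)
        self.fixed_part = names + padding + bytes([protocol])
        # Saved result of the checksum, run up through the protocol field.
        self.partial = ones_complement_sum(self.fixed_part)

    def checksum(self, segment: bytes) -> int:
        """Per-packet work: only the TCP length and the segment are summed."""
        length = struct.pack("!H", len(segment))
        total = ones_complement_sum(length, self.partial)
        return ~ones_complement_sum(segment, total) & 0xFFFF
```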


3.2 TCP Connection Initiation with Stack Names

	A TCP implementing the use of SN's which wants to use them must send
an option, the "TCP Stack Name Option" (TCPSN), in its initial SYN packet.
The option will contain the SN of the initiator of the TCP connection, and
also indicates that that entity supports use of TCP-SN, and wants to use it.
The option might look as follows:

	Option number (1 byte)
	Option length (x bytes)
	Source SN length (1 byte)
	Source SN (m bytes)
	Destination SN length (1 byte)
	Destination SN (n bytes)

The size of the option length field would normally be 1 byte, unless we adopt
an extended TCP option format (see section 4.2, below).

	There is one issue with the contents of the option, and that is
whether or not one should include the SN of the listener that this connection
is trying to connect to. The problem is that that SN is not necessarily known
in all cases. For instances where one is connecting up to a large service,
one may be passed off to one of a number of actual servers which implement
the service, each a separate stack, with its own SN. So, a TCPSN option with
a Destination SN length of 0 indicates that the source stack does not know
the SN of the stack it is trying to open a connection with.
	(Alternatively, two different options could be specified, one for
carrying the initiator's SN, and one for carrying the listener's SN - if
the listener's SN is not known, only the option for the initiator's SN
would be included, etc.)
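
	A minimal sketch of encoding and decoding the single-option form,
with a 1-byte option length, might look like this. The option number used
is a placeholder, not an assigned value, and the function names are
hypothetical; the assumption that the option length counts the whole option
follows the usual TCP option convention.

```python
from typing import Tuple

TCPSN_KIND = 253  # placeholder option number, not an assigned value


def encode_tcpsn(src_sn: bytes, dst_sn: bytes = b"") -> bytes:
    """Build a TCPSN option; an empty dst_sn means 'listener SN unknown'."""
    body = bytes([len(src_sn)]) + src_sn + bytes([len(dst_sn)]) + dst_sn
    length = 2 + len(body)  # option number and length bytes are counted
    if length > 255:
        raise ValueError("SN's too long for a 1-byte option length")
    return bytes([TCPSN_KIND, length]) + body


def decode_tcpsn(option: bytes) -> Tuple[bytes, bytes]:
    """Return (source SN, destination SN) from a TCPSN option."""
    if len(option) < 4 or option[0] != TCPSN_KIND or option[1] != len(option):
        raise ValueError("not a well-formed TCPSN option")
    src_len = option[2]
    dst_len_at = 3 + src_len
    if dst_len_at >= len(option):
        raise ValueError("source SN overruns the option")
    dst_len = option[dst_len_at]
    if dst_len_at + 1 + dst_len != len(option):
        raise ValueError("SN lengths do not match the option length")
    return option[3:dst_len_at], option[dst_len_at + 1:]


def reply_tcpsn(syn_option: bytes, listener_sn: bytes) -> bytes:
    """Option for the SYN-ACK: the SN's reversed, listener SN filled in."""
    initiator_sn, _ = decode_tcpsn(syn_option)
    return encode_tcpsn(listener_sn, initiator_sn)
```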

	If the listener supports use of SN's in TCP, and wants to use them,
it must respond with a SYN-ACK packet including a TCPSN option, with the
fields filled in appropriately (i.e. the contents will normally be reversed,
with the Source SN filled in if it was previously null). The TCPSN option
must not be sent in any segment that does not have the SYN bit set.
	There is no reason to provide support for the case where the listener
implements SN's and wishes to use them, and the initiator does not - the
listener cannot request use of an SN if the initiator does not implement
them, or does not want to use them.
	An implementation of TCP which does not support the TCPSN option should
silently ignore it (as required by RFC 1122 [Braden]). Ignoring the option
will force any TCP attempting to use the new PH/CA to use the standard TCP
PH/CA, thus ensuring interoperability.

	[What do we do if the Destination SN in the initial SYN packet does
not match what the listener thinks its SN is? One likely possibility is that
it is an error condition, and an RST should be sent and the connection
terminated? Either that, or send an ICMP message - one argument is that this
may not be clean, since the SN is a TCP option, but we can return ICMP
errors now for "no such port".
And should we allow the "Destination SN" in the returned packet to be NULL,
or should we force it to be there for sanity checking? And what's the
response if it does not match?]

	[One can use all sorts of IPSec stuff with this, all looked up in a
single DNS transaction, along with the A-record for the address, of course.]


3.3 Which PH/CA to Use

	It should be obvious that the alternative pseudo-header and checksum
algorithm (PH/CA) must not be used in the first SYN segment, since until the
initiator knows that the listener understands TCP with SN's, any use of the
new PH/CA will generate what appear to be bad checksums. Any segment with the
SYN bit set must always use the standard TCP checksum algorithm, so that the
SYN segment will always be understood by the receiving TCP.

	Furthermore, the switch to the new PH/CA must be "synchronized", so
that each end uses the appropriate PH/CA for the packet it has in hand. This
is fairly simple to achieve.
	From the initiator's end, since it cannot send any further packets
until after it has received the SYN-ACK packet, and that packet must contain
the acknowledging option, then provided that the recipient indicates that it
is prepared to use TCP with SN's, any packets sent by the initiator after the
initial packet must use the new PH/CA.
	From the listener's end, the SYN-ACK packet must also be sent with
the basic TCP PH/CA, since the initiator may not know the listener's SN -
which it would need to create the pseudo-header needed to verify the incoming
packet, which is going to tell it the listener's SN.

	<Now that I think about it, this isn't quite true. One could assume
the packet is OK, extract the LSN, do the PH, calculate the checksum, and
toss the packet if bad. So it could be either; the advantage of using the old
one is that it makes a simpler test for NAT boxes as to whether to update the
TCP checksum.>

	Again, any packets sent after that are sent with the new PH/CA.
Finally, because RST segments may also be received or sent without complete
state information, any segment with the RST bit set must use the standard TCP
checksum.
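
	The sender-side rule of this section reduces to a small decision,
sketched below. The flag values are those of the existing TCP header; the
enumeration and function names are hypothetical.

```python
from enum import Enum

SYN = 0x02  # TCP header flag bits
RST = 0x04


class PhCa(Enum):
    STANDARD = "standard"
    NEW = "new"


def ph_ca_for_outgoing(flags: int, sn_negotiated: bool) -> PhCa:
    """Pick the pseudo-header/checksum algorithm for a segment to be sent."""
    if flags & (SYN | RST):
        # SYN, SYN-ACK and RST segments always use the standard PH/CA.
        return PhCa.STANDARD
    return PhCa.NEW if sn_negotiated else PhCa.STANDARD
```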


3.4 TCP Header Input Processing

	Another area which needs to be covered is how a TCP decides which
PH/CA to use for incoming TCP packets, and the changes needed to normal input
packet processing to achieve this. Normally, the checksum is checked before
any other TCP processing is done - but without knowing which checksum to use,
how does one check the checksum? (Interestingly, this topic is not discussed
in [Partridge], although the solutions are fairly obvious.)

	One method is to store a state bit indicating whether a particular
connection is using the new PH/CA in the TCB, and then re-arrange input
processing somewhat. In the new sequence, it is first temporarily assumed
that the source and destination port, etc are correct, and they will be used
to try and find a "candidate" TCB - before the checksum is checked.
	Either they are correct, and the correct TCB is found; or they are
not correct, and either i) no matching TCB is found (in which case the packet
can be silently discarded, or, if it passes the basic PH/CA test, a RST can
be returned), or ii) the wrong TCB is found, and used temporarily, until the
packet fails the checksum (at which point it will be silently discarded).
	The packet is then checksummed using the method indicated in the TCB,
and if it passes, normal input processing resumes.
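
	The re-arranged input sequence of this first method might be
sketched as follows. The two verification routines are passed in as
hypothetical callables standing for the standard and new PH/CA checks; the
TCB here carries only the one state bit discussed above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

ConnKey = Tuple[str, int, str, int]  # (src addr, src port, dst addr, dst port)


@dataclass
class Tcb:
    uses_new_ph_ca: bool  # the state bit stored at connection open


def process_segment(
    key: ConnKey,
    segment: bytes,
    tcbs: Dict[ConnKey, Tcb],
    verify_standard: Callable[[bytes], bool],
    verify_new: Callable[[Tcb, bytes], bool],
) -> str:
    """Find a candidate TCB first, then checksum as that TCB indicates."""
    tcb = tcbs.get(key)  # header fields are provisionally trusted here
    if tcb is None:
        # No candidate: an RST only if the basic PH/CA test passes.
        return "send-rst" if verify_standard(segment) else "discard"
    if tcb.uses_new_ph_ca:
        ok = verify_new(tcb, segment)
    else:
        ok = verify_standard(segment)
    return "process" if ok else "discard"
```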

	The second method is to allocate a flag bit ("NPC") in the TCP header
to mean "new PH/CA in use"; use of this flag bit would (notionally) not be
allowed unless "Stack Name" options had previously been exchanged, so there
is no upwards compatibility issue. For packets with this bit set, one again
needs to find the supposedly correct TCB (because one doesn't know what the
actual data is in the PH, since the SN is not carried in the packet), and
then checksums the packet with the correct PH/CA.
	The only functional difference at the receiver of the packet is that
for packets without this bit set, input processing is then exactly identical
to the existing TCP spec. This would normally not be enough of an advantage to
make it worth using a flag bit. However, being able to operate TCP-SN through
NAT boxes (as detailed below) may make this option preferable.

	Note that in either case, the extra overhead (not counting the
overhead of a newer, and presumably more computationally expensive, checksum)
amounts to a test and branch per packet - and even this slight increase in
cost might actually be cancelled out, in many implementations, by the
technique of storing the partial PH checksum.
	Thus, use of SN's with TCP produces no increase in overhead in terms
of space, and very little in terms of extra processing overhead.


3.5 TCP-SN in the Presence of NAT Boxes

	While it may appear that TCP-SN is well suited to operation through
NAT boxes (since other than the checksum in the two initial ICP packets,
nothing in any of the other packets depends on the IP address, so that they
may pass through NAT boxes unmodified), there is one issue that needs to be
handled: how does a NAT box know *which* TCP packets it can pass
through unmodified, and which need to have their checksums updated?

	One possible option is to have the NAT box watch the ICP, and note
which connections transit to TCP-SN; packets on those connections can be left
unmodified thereafter. While this may appear to have performance impacts
(since TCP packets must be examined for SYN packets) and state storage
requirements (since a state bit must be retained for all TCP connections
which are using TCP-SN), it is not clear that it has any actual impact on real NAT
boxes, which seem to incur many of these costs already anyway, for other
reasons.
	The other alternative, as alluded to above, is to use a flag bit in
the TCP header to indicate which PH/CA is in use; if the NPC bit is set,
indicating the packet's checksum is not dependent on the IP address, no
adjustment of the TCP checksum needs to be done by the NAT box.
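
	Both alternatives can be sketched from the NAT box's point of
view. The connection key is assumed to be some direction-neutral value the
NAT box already keeps per connection; all names here are hypothetical.

```python
from typing import Hashable, Set

SYN = 0x02  # TCP header flag bits
RST = 0x04


def update_needed_by_flag(npc_bit_set: bool) -> bool:
    """Flag-bit approach: NPC set means the checksum ignores IP addresses."""
    return not npc_bit_set


class NatSnTracker:
    """Packet-watching approach: remember which connections moved to TCP-SN."""

    def __init__(self) -> None:
        self._offered: Set[Hashable] = set()
        self._using_sn: Set[Hashable] = set()

    def observe_icp(self, conn: Hashable, is_syn_ack: bool, has_tcpsn: bool) -> None:
        """Call for each SYN or SYN-ACK seen on a connection."""
        if not is_syn_ack:
            # Initial SYN: forget old state, note whether TCP-SN is offered.
            self._using_sn.discard(conn)
            if has_tcpsn:
                self._offered.add(conn)
            else:
                self._offered.discard(conn)
        elif has_tcpsn and conn in self._offered:
            self._using_sn.add(conn)  # both ends agreed to use TCP-SN

    def update_needed(self, conn: Hashable, flags: int) -> bool:
        """True if the NAT box must adjust this packet's TCP checksum."""
        if flags & (SYN | RST):
            return True  # these always use the address-based PH/CA
        return conn not in self._using_sn
```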

	In either case, no major difficulty is presented by the use of
TCP-SN in the presence of NAT boxes.


3.6 TCP-SN Using End-end Security in the Presence of NAT Boxes

	While TCP-SN operates easily in the presence of NAT boxes, TCP using
both end-end security, and TCP-SN, in the presence of NAT boxes, is more
problematic. Of course, at the moment, TCP with end-end security does not
function at all in the presence of NAT boxes, so a method to allow it to
work, even if convoluted, is an improvement on the existing situation.
	The fundamental problems are twofold. First, in TCP-SN as specified
above, the ICP packets are sent with the old PH/CA - and after passing
through a NAT box, this will need to be updated - which will cause the
end-end security to fail, since the packet has been modified. Second, NAT
boxes inevitably have to modify one or more of the addresses - and this will
generally cause the end-end security to fail.

	An easy fix is available to the first problem, which is to send the
ICP packets using the new checksum, and have the NAT boxes not modify the
checksum. This is easy to do if the NPC flag is available. Alternatively, in
the packet analysis based approach, NAT boxes must ignore all the packets on
a connection using security, not just everything after the ICP.
	In the case where the initiator does not yet know the SN of the
listener, it can temporarily use a pseudo-header with a null listener SN.
	Note that there is no actual problem with an unmodified host
receiving an ICP packet checksummed with the new PH/CA (which it will not be
able to understand); since ICP packets are only sent with the new checksum if
end-end security is in use, and any TCP connection with end-end security
cannot pass through a NAT box anyway, nothing is lost.

	The second problem is not really a TCP-SN problem at all, but rather
a problem with IP Security.
	<Ran, can you fill in the rest of this.>


3.7 TCP Input Demultiplexing Complexities

<This section is being written last, and is going to be a bit hand-wavy
in some of the details, 'cause I'm burning out.>

	Depending on exactly what use the TCP/SN is put to, an issue (which
was somewhat glossed over above) can arise with finding the correct TCB for
an incoming packet.
	Briefly, the TCB is found by matching the source and destination port
and IP address. Logically, once we start using SN's to identify the TCP at
the far end, we should switch to using the source and destination SN and
ports to do the demultiplexing. However, since the SN's are not (normally)
carried in packets, a number of more complex cases can arise from this.

	Note that in the "normal" case (i.e. two simple hosts conversing),
the existing demultiplexing algorithm will work fine, since the IPv4 address
can be used as a reasonable standin for the SN for demultiplexing purposes.

	The first non-trivial case is with mobile and/or multi-homed hosts.
If a host wishes to change the address at which a given connection is
terminated, while this is possible, a number of issues have to be resolved.
For one, while the binding between connection and address can be changed,
that change must be properly synchronized between the two ends, otherwise
packets will not be associated with the proper TCP. <Mechanism to do this
seems like it will be fairly straightforward, left as an exercise for now.>
	For another, there are potential security problems unless the binding
between connection and IPv4 address (or, alternatively, between SN and IPv4
address) is secured, probably by some cryptographic mechanism. <Again,
mechanism to do this seems like it will be fairly straightforward, left as an
exercise for now; see [Perkins] for a worked example.>
	A host which is multi-homed and is experiencing flaky network
connectivity might actually wish to have multiple concurrent IPv4 addresses
in use. This could easily be handled, at least as far as finding the right
TCB for incoming packets (which IPv4 address to use on outgoing packets is a
different problem, somebody else's :-) with a "two-level" TCB structure (or,
alternatively, depending on the implementation, a slight change to the hash
structure by which TCB's are found). In short, multiple TCB index entries
(one for each IPv4 address used by the other end) all point to the same TCB.
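
	The "two-level" structure might be sketched as a simple index in
which several keys refer to one TCB object. The key layout and the names
are assumptions for illustration; a real implementation would fit this into
its existing hash structure.

```python
from typing import Dict, Optional, Tuple

IndexKey = Tuple[str, int, int]  # (remote address, remote port, local port)


class Tcb:
    def __init__(self, remote_sn: bytes) -> None:
        self.remote_sn = remote_sn


class TcbIndex:
    """Index entries, one per remote IPv4 address, all naming the same TCB."""

    def __init__(self) -> None:
        self._entries: Dict[IndexKey, Tcb] = {}

    def add_remote_address(self, tcb: Tcb, addr: str, rport: int, lport: int) -> None:
        self._entries[(addr, rport, lport)] = tcb

    def remove_remote_address(self, addr: str, rport: int, lport: int) -> None:
        self._entries.pop((addr, rport, lport), None)

    def lookup(self, addr: str, rport: int, lport: int) -> Optional[Tcb]:
        return self._entries.get((addr, rport, lport))
```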

	The second, and more painful case, involves mobile stacks, which can
potentially cause clashes in the port/address space. An example will make
this clear: suppose a connection (from stack S1) with local port Pl and
distant port Pd at host address Ad moves from host at address Al1 to a host
at address Al2. However, the host at Al2 *already* has a stack (S2) with TCP
connection with local port Pl, and distant port Pd at Ad! Now we have two
distinct connections with local port Pl at address Al2, and distant port Pd
at Ad. Of course, they are to two different stacks at Al2 - but how to
differentiate the incoming packets?
	In more abstract terms, the problem is that a host with multiple
stacks resident on it does not have a single TCP port space, but rather
multiple ones, one for each stack. If there are no clashes, one can, for
incoming packet demultiplexing, have a merged version for that purpose.
However, if there is a clash, what to do?

	One technically feasible solution is to try both pseudo-headers, and
see which (if either) generates the correct checksum; only one should. With
clever implementation, this is not necessarily as prohibitively expensive as
it might be; a partial checksum for the data in the packet can be computed
once only, and suitably combined with the pre-computed partial checksum for
each pseudo-header in turn, until one matches. (This might be more complex if
the new checksum is one in which the partial checksum of the later data
depends on the length of the [effectively prepended] pseudo-header; one
approach to this is to pad the PH to an even multiple of the checksum cycle
length with zero padding.)
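
	A sketch of this "find a checksum that works" approach follows,
assuming the ones-complement checksum (for which partial sums combine
freely) and an assumed rule that the pseudo-header is padded to whole 16-bit
words. The function names and the use of a dictionary keyed by a local
stack label are hypothetical.

```python
import struct
from typing import Dict, Optional


def fold16(total: int) -> int:
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_complement_sum(data: bytes, start: int = 0) -> int:
    if len(data) % 2:
        data = data + b"\x00"
    return fold16(start + sum(w for (w,) in struct.iter_unpack("!H", data)))


def ph_partial_sum(src_sn: bytes, dst_sn: bytes, protocol: int = 6) -> int:
    """Pre-computed sum over SN's, padding and protocol for one stack."""
    names = src_sn + dst_sn
    padding = b"\x00" * ((len(names) + 1) % 2)  # assumed padding rule
    return ones_complement_sum(names + padding + bytes([protocol]))


def checksum_with(ph_sum: int, segment: bytes) -> int:
    """Sender side: checksum of a segment whose checksum field is zero."""
    data = struct.pack("!H", len(segment)) + segment
    return ~ones_complement_sum(data, ph_sum) & 0xFFFF


def find_stack(segment: bytes, candidates: Dict[str, int]) -> Optional[str]:
    """Receiver side: sum the packet once, then try each pseudo-header."""
    data_sum = ones_complement_sum(struct.pack("!H", len(segment)) + segment)
    matches = [name for name, ph_sum in candidates.items()
               if fold16(ph_sum + data_sum) == 0xFFFF]
    return matches[0] if len(matches) == 1 else None
```

Note that with a 16-bit checksum there is a small chance that more than one
candidate matches; this sketch then returns no match rather than guessing.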

3.7.1 The Stack Selector (SSEL) Input Demultiplexing Option

	Alternatively, one could add an IP option to allow demultiplexing on
the SN. This is *not* done as a TCP option because other transport protocols,
which do demultiplexing in a similar way, might see similar port clashes, and
all can use a common mechanism. Several ideas present themselves for the
detailed mechanism.
	In one, if we wish to minimize the size of the SN-carrying option,
and the cost of processing them on input, we can define a short, fixed-length
local stack name, a 'stack selector' (SSEL); i.e. of a scope *local* to the
destination address only. When a stack moves to a new address, if there are
any port clashes, it allocates an SSEL, and communicates that to the far end,
for them to use in incoming packets. <Again, the synchronization and security
issues seem like they will be fairly straightforward, and they are left as
an exercise for now.> The option would look like:

	Option number (1 byte)
	Option length (1 byte)
	Destination SSEL (2 bytes)

A length of 16 bits is chosen for the SSEL since this allows the complete
option to be exactly one long-word in length; 16 bits allows up to 2^16
operating stacks per interface (or host, depending on how the SSEL namespace
is managed on a multi-homed host), which seems adequate for all but the most
bizarre purposes.
	Now, when the packets arrive, the SSEL is examined to find the
correct stack to hand the packet to; the packet is then demultiplexed on the
port, as before. One issue here is that IP options are expensive. See the
section below ("IPv4 End-End Payload Protocol") for the solution to this.
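
	The SSEL option and the two-step demultiplexing can be sketched as
below. The option number is a placeholder, not an assigned value, and the
per-stack port table is a hypothetical stand-in for whatever structure an
implementation uses.

```python
import struct
from typing import Dict, Optional

SSEL_OPTION = 158  # placeholder option number, not an assigned value


def encode_ssel(ssel: int) -> bytes:
    """Option number, option length (4), 16-bit destination SSEL."""
    return struct.pack("!BBH", SSEL_OPTION, 4, ssel)


def decode_ssel(option: bytes) -> int:
    number, length, ssel = struct.unpack("!BBH", option)
    if number != SSEL_OPTION or length != 4:
        raise ValueError("not an SSEL option")
    return ssel


def demux(option: bytes, dst_port: int,
          stacks: Dict[int, Dict[int, str]]) -> Optional[str]:
    """First select the stack by SSEL, then demultiplex on the port."""
    ports = stacks.get(decode_ssel(option))
    return None if ports is None else ports.get(dst_port)
```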

3.7.2 The Stack Name Input Demultiplexing Option

	Another possible mechanism is to actually carry the destination SN in
the packet, in an IP option. This is expensive, but might be useful for
control operations or fault analysis or other situations where robustness is
paramount, and absolute efficiency is less of an issue. (In this case, one
might as well carry the source SN, too, for similar reasons.) To allow the
possibility of either combination, two separate options are defined, one for
each; these options would look like:

	Option number (1 byte)	[Two opcodes allocated, one for source, and
	Option length (1 byte)		one for destination.]
	Stack Name (n bytes)

Restricted IP option space is likely to be an issue here too, as with the
TCPSN option; the same IPv4 mechanism can handle this too.

<Question: If we allow carrying the SN's in an IP option, maybe the TCPSN
option should somehow work with this, rather than carrying them there as
well?>

3.7.3 The Burden of Stack Mobility Overhead

	In closing this section, one should note that this port-clash
problem, and the mechanism needed to solve it, is only present if one has
mobile stacks. If a host does not support mobile stacks, it need not
implement any of these mechanisms.
	As far as the port clash part of it goes, a non-mobile host might
simply decline to provide support for use of SSEL's at the other end, in
which case a mobile stack at the far end can fall back to the "find a
checksum that works" mechanism.
	Of course, some of these mechanisms require support at the far end -
and not just for port clash mechanisms, but for the general mobility issues.
However, in no case will the overhead of actually using those mechanisms, and
in particular the overhead of the extra data in the packets, be incurred unless
one end is mobile. Thus, just as with IPv6 Mobility [Perkins], the overhead
needed to support mobility is only incurred by those who wish to make use of
it.


4.1 IPv4 End-End Payload Protocol

	Strictly speaking, this is a bit far afield from SN's, but like TCP
Header Extensions (below) it's a needed piece of infrastructure for SN's
(although somewhat less desperately so), and it seemed best to describe it
here.
	Basically, the problem is that IP-level end-end options in IPv4
[Postel81A] impose an unacceptable performance impact, since in most
high-performance routers, IPv4 packets with options are all processed by
the "slow path". However, IPv6 [Deering] contains an elegant mechanism one
can use to avoid this; the "payload" mechanism.

	We can define a similar End-End Payload protocol for IPv4, one that
can i) carry normal user data of any protocol, ii) carry any number of
end-end options without having a performance impact, iii) provide room for
more options than the IP header can currently support (if desired), and iv)
provide for more protocols (again, if desired). The header syntax for the
End-End Payload protocol would be:

	End-end Payload header length (2 bytes)
	Payload protocol number (2 bytes)
	End-end Option0 type (x bytes)
	End-end Option0 length (y bytes)
	End-end Option0 data (n0 bytes)
	...
	End-end OptionM type (x bytes)
	End-end OptionM length (y bytes)
	End-end OptionM data (nM bytes)
	Padding

Note that the "payload protocol" is two bytes, not one, with the first 256
being the same as the existing IP protocol numbers. This will allow
less-common protocols (e.g. routing protocols) to be given protocol numbers
from this "extended" space; they will incur the overhead of having to
be carried inside a Payload protocol header, but that is an acceptable
tradeoff.
	Again, options stored in this area could either be in the canonical
IP option format, or an "extended" option format (i.e. normally x and y are
both 1, but this could be changed here).
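
	A sketch of building and parsing such a header, for the canonical
case where x and y are both 1, is given here. Several details are
assumptions not fixed by this document: the header length counts the whole
header in bytes, padding is zero bytes up to a 4-byte boundary, option type
0 is reserved to mark padding, and the option length counts the type and
length bytes as in canonical IP options.

```python
import struct
from typing import List, Tuple

Option = Tuple[int, bytes]  # (option type, option data)


def build_eep_header(protocol: int, options: List[Option]) -> bytes:
    body = b""
    for opt_type, data in options:
        if opt_type == 0:
            raise ValueError("type 0 is assumed reserved for padding")
        body += bytes([opt_type, 2 + len(data)]) + data
    unpadded = 4 + len(body)
    padded = (unpadded + 3) & ~3  # assumed: pad to a 4-byte boundary
    return struct.pack("!HH", padded, protocol) + body + bytes(padded - unpadded)


def parse_eep_header(packet: bytes) -> Tuple[int, List[Option], bytes]:
    """Return (payload protocol, options, payload data)."""
    header_len, protocol = struct.unpack_from("!HH", packet)
    if header_len < 4 or header_len > len(packet):
        raise ValueError("bad End-End Payload header length")
    options: List[Option] = []
    pos = 4
    while pos < header_len and packet[pos] != 0:
        if pos + 2 > header_len:
            raise ValueError("truncated option")
        opt_type, opt_len = packet[pos], packet[pos + 1]
        if opt_len < 2 or pos + opt_len > header_len:
            raise ValueError("bad option length")
        options.append((opt_type, packet[pos + 2:pos + opt_len]))
        pos += opt_len
    return protocol, options, packet[header_len:]
```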

4.1.1 Upwardly compatible introduction of the IPv4 End-End Payload Protocol

	How can one be certain the host at the far end implements the End-End
Payload? In general one can't, but TCPSN can be specified so that any host
which supports the TCPSN option must also support the IPv4 E-EP protocol; a
far end which returns a TCPSN option is then known to implement E-EP as well.


4.2 TCP Header Extension Option

	One problem with carrying the SN's in a TCP option is the limited
amount of room available for options in the TCP header. The maximum length of
the TCP header is 15 long-words, or 60 bytes; since the basic TCP header is
20 bytes, this leaves a maximum of 40 bytes for options. This is unlikely to
be enough room to hold many longer SN's (especially if some headers hold
both the initiator and listener SN's).
	The solution here is to provide a general purpose extension to TCP
allowing longer options. (This is preferred because the number of TCP
options, particularly those found in every packet, is increasing all the
time, and many, such as SACK, are already cramped by the limited size of the
TCP option area.) Several different techniques are possible to do this.

	All would define a new "Use Extended Header" option (TCPXH, a
zero-length option with no data), which is passed from the initiator in the
SYN, and returned by the listener in the SYN-ACK, indicating both sides
understand the header extension mechanism.
	If the listener supports use of XH, and wants to allow use of them,
it must respond with a SYN-ACK packet including a TCPXH option. The TCPXH
option must not be sent in any segment that does not have the SYN bit set.
	Since presumably this would be implemented at the same time as the
TCPSN option, there would be no problem with needing to deal with TCP's which
implemented TCPSN, but not the TCPXH.

	As to the actual method for indicating that an extended header is
present, indicating the length thereof, and storing the extended header,
there are several possible approaches.
	One possible alternative for flagging TCP segments with an extended
header is the use of a separate option, the "TCP Extended Header Present"
option (TCPXHP); the presence of this in any TCP segment would indicate the
presence of an extended header. The other possible alternative, and the
preferred one, involves the use of yet another flag bit, "Extended Header"
(XH), use of which would (notionally) not be allowed unless "Use Extended
Header" options had previously been exchanged.

	Three basic approaches to the storage of the length of the extended
header, and the storage of the actual extended header itself, are possible.
	In the first, if the TCPXHP option is used, it could directly contain
the length of the extended header, i.e. the TCPXHP option would look like:

	Option number (1 byte)
	Option length (1 byte)
	Extended Header Length (2 bytes)

and the actual data of the extended header would be stored after the normal
TCP segment header.
	Options stored in this area could either be in the canonical TCP
option format (1 byte of option number, and 1 byte of length), or an
"extended" option format (more than 1 byte of either) could be used if this
was deemed desirable.

	In the second, which uses the XH flag bit, we note that the URGENT
field is little used at the moment. It would be possible to reuse the URGENT
field, in packets where the XH bit was set (i.e. no packet could have both
the XH bit and the URG bit set) to indicate that the URGENT field held the
length of an "extended header". Two slightly different interpretations are
possible for where and how the extended options are stored.
	In the former, the true length of the TCP header, in bytes, is in the
URGENT field (i.e. the 4-bit TCP header length field is to be ignored), and
the options field in the TCP header could thus be far longer than 40 bytes.
In the latter, there is a new "extended options" area immediately after the
normal TCP header, the length of which is given by the reinterpreted URGENT
field. In the latter case, options stored in this area could again either be
in the canonical TCP option format, or an "extended" option format.
	The disadvantage of re-using the URGENT field is that other proposed
changes to TCP (e.g. the TCP Message Boundary Option) also propose to reuse
this field. One could simply specify that no TCP segment could use *both* the
XH and the MBO (since the MBO re-interprets the URG bit, it would fall under
the provision that no segment could have both XH and URG set); this is not a
problem for TCPSN, which would only be in the first segment anyway. It might
be a problem for other potential clients of XH, such as SACK, though.
Alternatively, one could redefine MBO, so that it uses an option in those
segments in which it wishes to indicate a message boundary. (This would be
the sensible alternative if one wished to allow both MBO and XH in the
same TCP segment, since XH is useful for any number of different options,
not just TCPSN.)

	The third potential approach for storing extended options, again used
with the XH flag bit, is to define an optional "extended header length" field
(probably 2 bytes long) which, when XH is set, will be placed after the TCP
header (as defined by the 4-bit TCP header length field). The actual options
will follow immediately after. Again, options stored in this area could
either be in the canonical TCP option format, or an "extended" option format.
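
	For the third approach, splitting an incoming segment might be
sketched as follows. The position chosen for the XH bit (one of the
reserved bits) is purely hypothetical, as is the assumption that the 2-byte
extended header length counts only the option bytes which follow it.

```python
import struct
from typing import Tuple

XH_FLAG = 0x0800  # hypothetical: a reserved bit in header bytes 12-13


def split_segment(segment: bytes) -> Tuple[bytes, bytes, bytes]:
    """Return (normal options, extended options, user data)."""
    if len(segment) < 20:
        raise ValueError("segment shorter than the basic TCP header")
    (offset_flags,) = struct.unpack_from("!H", segment, 12)
    header_len = (offset_flags >> 12) * 4  # the 4-bit TCP header length
    if header_len < 20 or header_len > len(segment):
        raise ValueError("bad TCP header length")
    extended = b""
    data_start = header_len
    if offset_flags & XH_FLAG:
        if header_len + 2 > len(segment):
            raise ValueError("missing extended header length")
        (ext_len,) = struct.unpack_from("!H", segment, header_len)
        data_start = header_len + 2 + ext_len
        if data_start > len(segment):
            raise ValueError("extended header overruns the segment")
        extended = segment[header_len + 2:data_start]
    return segment[20:header_len], extended, segment[data_start:]
```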

4.2.1 TCP Header Extension and TCPSN

	There is one small problem with the above - how can one use the
extended header to send the TCPSN, which must be in the SYN packet, without
knowing whether the listener implements TCPXH - which one can only find out
from the SYN-ACK? The tasteless kludge solution is to send a packet with all
three: TCPXH and TCPSN options, *and* with the XH bit set (assuming this
method of doing the extended header is used - see below for analysis of the
use of the TCPXHP option).
	One of four things will then happen (since [Postel81B] does not
specify the behaviour of a TCP if a segment is received with any of the
'Reserved' bits non-zero): i) you get back a RST (in which case you know that
the listener does not implement TCPXH, and thus not TCPSN as well - at this
point you can resend the SYN without the TCPXH and the TCPSN), or ii) you get
back a SYN-ACK with a TCPXH and a TCPSN (in which case all is well, proceed),
or iii) you get back nothing (which can mean that the SYN packet was lost,
or, more likely, that the listener TCP did not like the packet and discarded
it silently), or iv) you get back a SYN-ACK with no TCPXH and no TCPSN (very
bad).

	For the case where you get no reply, the solution is fairly simple -
retransmit the SYN up to a small fixed number of times (to handle the case
where it was actually dropped by the network), and then back off to normal
TCP. This will on very rare occasions produce a case where a connection is
initiated using normal TCP, when both sides are actually capable of doing
TCP-SN; this is judged acceptable (and in any case will happen on roughly the
same frequency as a report from TCP that the destination host is down, when
it is in fact up - and for exactly the same reason - a statistical run of
lost packets).

	The last case, when you get back a SYN-ACK without a TCPXH and a
TCPSN, is ugly, because it means that the listener TCP doesn't implement
TCPXH but tried to process the packet anyway, so the listening TCP will
presumably already have held the extended options as what it thought was user
data, intending to pass them onto the client when the connection moved to
ESTABLISHED. (This assumes that the listening TCP accepts DATA in the
SYN packet - not all do.)
	The recovery at this point is to send an RST, which should cause the
listener (which is in SYN-RECEIVED state) to close the connection, discard the
"initial" data, and erase the TCB; thereby allowing an immediate
retransmission of the SYN packet without the TCPXH and TCPSN options (which
the listener doesn't implement anyway).

	If the TCPXHP option is used, you should always get the fourth case
behaviour (since no flag bit will be set), and the effects will be the same
(the listener is buffering the "data"), and it can be handled in the same way
(send an RST).
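
	The initiator's handling of the four outcomes can be summarized
in a small decision routine, sketched here. It assumes that after an RST
the SYN is resent as a plain SYN, as in the recovery described for the
fourth case; the retry limit of 3 is an arbitrary illustrative value, and
all names are hypothetical.

```python
from enum import Enum, auto


class Reply(Enum):
    RST = auto()                      # case i
    SYN_ACK_WITH_OPTIONS = auto()     # case ii: TCPXH and TCPSN returned
    NOTHING = auto()                  # case iii: timeout
    SYN_ACK_WITHOUT_OPTIONS = auto()  # case iv


class Action(Enum):
    PROCEED_WITH_TCP_SN = auto()
    RESEND_EXTENDED_SYN = auto()
    RESEND_PLAIN_SYN = auto()
    SEND_RST_THEN_PLAIN_SYN = auto()


def next_action(reply: Reply, attempts: int, max_attempts: int = 3) -> Action:
    """Decide what the initiator does after sending an extended SYN.

    attempts is the number of extended SYN's sent so far.
    """
    if reply is Reply.SYN_ACK_WITH_OPTIONS:
        return Action.PROCEED_WITH_TCP_SN
    if reply is Reply.RST:
        return Action.RESEND_PLAIN_SYN
    if reply is Reply.NOTHING:
        if attempts < max_attempts:
            return Action.RESEND_EXTENDED_SYN  # may just have been lost
        return Action.RESEND_PLAIN_SYN         # back off to normal TCP
    # The listener is presumably buffering the options as data.
    return Action.SEND_RST_THEN_PLAIN_SYN
```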


References

[Braden]	Robert T. Braden, "Requirements for Internet Hosts -
Communication Layers", RFC 1122, University of Southern California,
Information Sciences Institute, Marina Del Rey, Calif., October 1989.

[Chiappa]	J. Noel Chiappa, "Stacks and Stack Names: A Proposed
Enhancement to the Internet Architecture", Internet Draft, In Preparation.

[Deering]	S. Deering, R. Hinden, "Internet Protocol, Version 6 (IPv6)
Specification", RFC 2460, University of Southern California, Information
Sciences Institute, Marina Del Rey, Calif., December 1998.

[Partridge]	Craig Partridge, J. Zweig, "TCP Alternate Checksum Options",
RFC 1146, University of Southern California, Information Sciences Institute,
Marina Del Rey, Calif., March 1990.

[Perkins]	Charles Perkins, David B. Johnson, "Mobility Support in
IPv6", Internet Draft <draft-ietf-mobileip-ipv6-12.txt>, April 2000.

[Postel81A]	Jon Postel, "Internet Protocol", RFC 791,
University of Southern California, Information Sciences Institute,
Marina Del Rey, Calif., September 1981.

[Postel81B]	Jon Postel, "Transmission Control Protocol", RFC 793,
University of Southern California, Information Sciences Institute,
Marina Del Rey, Calif., September 1981.