Network Working Group M. Allman, Editor
Request for Comments: 2760 NASA Glenn Research Center/BBN Technologies
Category: Informational S. Dawkins
Nortel
D. Glover
J. Griner
D. Tran
NASA Glenn Research Center
T. Henderson
University of California at Berkeley
J. Heidemann
J. Touch
University of Southern California/ISI
H. Kruse
S. Ostermann
Ohio University
K. Scott
The MITRE Corporation
J. Semke
Pittsburgh Supercomputing Center
February 2000
Ongoing TCP Research Related to Satellites
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2000). All Rights Reserved.
Abstract
This document outlines possible TCP enhancements that may allow TCP
to better utilize the available bandwidth provided by networks
containing satellite links. The algorithms and mechanisms outlined
have not been judged to be mature enough to be recommended by the
IETF. The goal of this document is to educate researchers as to the
current work and progress being done in TCP research related to
satellite networks.
Table of Contents
1 Introduction
2 Satellite Architectures
2.1 Asymmetric Satellite Networks
2.2 Satellite Link as Last Hop
2.3 Hybrid Satellite Networks
2.4 Point-to-Point Satellite Networks
2.5 Multiple Satellite Hops
3 Mitigations
3.1 TCP For Transactions
3.2 Slow Start
3.2.1 Larger Initial Window
3.2.2 Byte Counting
3.2.3 Delayed ACKs After Slow Start
3.2.4 Terminating Slow Start
3.3 Loss Recovery
3.3.1 Non-SACK Based Mechanisms
3.3.2 SACK Based Mechanisms
3.3.3 Explicit Congestion Notification
3.3.4 Detecting Corruption Loss
3.4 Congestion Avoidance
3.5 Multiple Data Connections
3.6 Pacing TCP Segments
3.7 TCP Header Compression
3.8 Sharing TCP State Among Similar Connections
3.9 ACK Congestion Control
3.10 ACK Filtering
4 Conclusions
5 Security Considerations
6 Acknowledgments
7 References
8 Authors' Addresses
9 Full Copyright Statement
1 Introduction
This document outlines mechanisms that may help the Transmission
Control Protocol (TCP) [Pos81] better utilize the bandwidth provided
by long-delay satellite environments. These mechanisms may also help
in other environments or for other protocols. The proposals outlined
in this document are currently being studied throughout the research
community. Therefore, these mechanisms are not mature enough to be
recommended for wide-spread use by the IETF. However, some of these
mechanisms may be safely used today. It is hoped that this document
will stimulate further study into the described mechanisms. If, at
some point, the mechanisms discussed in this memo prove to be safe
and appropriate to be recommended for general use, the appropriate
IETF documents will be written.
It should be noted that non-TCP mechanisms that help performance over
satellite links do exist (e.g., application-level changes, queueing
disciplines, etc.). However, outlining these non-TCP mitigations is
beyond the scope of this document and therefore is left as future
work. Additionally, there are a number of mitigations to TCP's
performance problems that involve very active intervention by
gateways along the end-to-end path from the sender to the receiver.
Documenting the pros and cons of such solutions is also left as
future work.
2 Satellite Architectures
Specific characteristics of satellite links and the impact these
characteristics have on TCP are presented in RFC2488 [AGS99]. This
section discusses several possible topologies where satellite links
may be integrated into the global Internet. Each mitigation outlined
in section 3 includes a discussion of which environments the
mechanism is expected to benefit.
2.1 Asymmetric Satellite Networks
Some satellite networks exhibit a bandwidth asymmetry, a larger data
rate in one direction than the reverse direction, because of limits
on the transmission power and the antenna size at one end of the
link. Meanwhile, some other satellite systems are unidirectional and
use a non-satellite return path (such as a dialup modem link). The
nature of most TCP traffic is asymmetric with data flowing in one
direction and acknowledgments in the opposite direction. However, the
term asymmetric in this document refers to different physical
capacities in the forward and return links. Asymmetry has been shown
to be a problem for TCP [BPK97,BPK98].
2.2 Satellite Link as Last Hop
Satellite links that provide service directly to end users, as
opposed to satellite links located in the middle of a network, may
allow for specialized design of protocols used over the last hop.
Some satellite providers use the satellite link as a shared high
speed downlink to users with a lower speed, non-shared terrestrial
link that is used as a return link for requests and acknowledgments.
Many times this creates an asymmetric network, as discussed above.
2.3 Hybrid Satellite Networks
In the more general case, satellite links may be located at any point
in the network topology. In this case, the satellite link acts as
just another link between two gateways. In this environment, a given
connection may be sent over terrestrial links (including terrestrial
wireless), as well as satellite links. On the other hand, a
connection could also travel over only the terrestrial network or
only over the satellite portion of the network.
2.4 Point-to-Point Satellite Networks
In point-to-point satellite networks, the only hop in the network is
over the satellite link. This pure satellite environment exhibits
only the problems associated with the satellite links, as outlined in
[AGS99]. Since this is a private network, some mitigations that are
not appropriate for shared networks can be considered.
2.5 Multiple Satellite Hops
In some situations, network traffic may traverse multiple satellite
hops between the source and the destination. Such an environment
aggravates the satellite characteristics described in [AGS99].
3 Mitigations
The following sections will discuss various techniques for mitigating
the problems TCP faces in the satellite environment. Each of the
following sections will be organized as follows: First, each
mitigation will be briefly outlined. Next, research work involving
the mechanism in question will be briefly discussed. Next the
implementation issues of the mechanism will be presented (including
whether or not the particular mechanism presents any dangers to
shared networks). Then a discussion of the mechanism's potential
with regard to the topologies outlined above is given. Finally, the
relationships and possible interactions with other TCP mechanisms are
outlined. The reader is expected to be familiar with the TCP
terminology used in [AGS99].
3.1 TCP For Transactions
3.1.1 Mitigation Description
TCP uses a three-way handshake to set up a connection between two
hosts [Pos81]. This connection setup requires 1-1.5 round-trip times
(RTTs), depending upon whether the data sender started the connection
actively or passively. This startup time can be eliminated by using
TCP extensions for transactions (T/TCP) [Bra94]. After the first
connection between a pair of hosts is established, T/TCP is able to
bypass the three-way handshake, allowing the data sender to begin
transmitting data in the first segment sent (along with the SYN).
This is especially helpful for short request/response traffic, as it
saves a potentially long setup phase when no useful data is being
transmitted.
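The saving can be illustrated with simple arithmetic. The sketch
below is not taken from [Bra94]; it assumes a single request/response
exchange that costs one RTT after setup, ignores transmission and
processing time, and uses an assumed GEO round-trip time of roughly
0.56 seconds. The function name and parameters are illustrative only.

```python
def transaction_time(rtt, setup_rtts):
    """Seconds for one request/response: connection setup plus one RTT."""
    return (setup_rtts + 1.0) * rtt


GEO_RTT = 0.56  # seconds; assumed illustrative value, not a measurement

# Standard TCP with an active open by the data sender (1 RTT of setup).
standard_tcp = transaction_time(GEO_RTT, setup_rtts=1.0)

# T/TCP after the first connection: data rides along with the SYN.
t_tcp = transaction_time(GEO_RTT, setup_rtts=0.0)
```

Under these assumptions the short transaction completes in about half
the time when the handshake is bypassed.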
3.1.2 Research
T/TCP is outlined and analyzed in [Bra92,Bra94].
3.1.3 Implementation Issues
T/TCP requires changes in the TCP stacks of both the data sender and
the data receiver. While T/TCP is safe to implement in shared
networks from a congestion control perspective, several security
implications of sending data in the first data segment have been
identified [ddKI99].
3.1.4 Topology Considerations
It is expected that T/TCP will be equally beneficial in all
environments outlined in section 2.
3.1.5 Possible Interaction and Relationships with Other Research
T/TCP allows data transfer to start more rapidly, much like using a
larger initial congestion window (see section 3.2.1), delayed ACKs
after slow start (section 3.2.3) or byte counting (section 3.2.2).
3.2 Slow Start
The slow start algorithm is used to gradually increase the size of
TCP's congestion window (cwnd) [Jac88,Ste97,APS99]. The algorithm is
an important safe-guard against transmitting an inappropriate amount
of data into the network when the connection starts up. However,
slow start can also waste available network capacity, especially in
long-delay networks [All97a,Hay97]. Slow start is particularly
inefficient for transfers that are short compared to the
delay*bandwidth product of the network (e.g., WWW transfers).
Delayed ACKs are another source of wasted capacity during the slow
start phase. RFC1122 [Bra89] suggests data receivers refrain from
ACKing every incoming data segment. However, every second full-sized
segment should be ACKed. If a second full-sized segment does not
arrive within a given timeout, an ACK must be generated (this timeout
cannot exceed 500 ms). Since the data sender increases the size of
cwnd based on the number of arriving ACKs, reducing the number of
ACKs slows the cwnd growth rate. In addition, when TCP starts
sending, it sends 1 segment. When using delayed ACKs a second
segment must arrive before an ACK is sent. Therefore, the receiver
is always forced to wait for the delayed ACK timer to expire before
ACKing the first segment, which also increases the transfer time.
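The effect of delayed ACKs on window growth can be illustrated with a
small idealized model. The sketch below is not drawn from the cited
work. It assumes a loss-free path, a sender that transmits a full
cwnd of segments every RTT, standard ACK counting (cwnd grows by one
segment per ACK), and a receiver that ACKs every "ack_every" segments,
with any leftover segment ACKed when the delayed ACK timer expires.
The time spent waiting for that timer is not modeled.

```python
import math


def rtts_to_reach(target_segments, ack_every=1, initial_cwnd=1):
    """Count RTTs of slow start needed for cwnd to reach the target."""
    cwnd = initial_cwnd
    rtts = 0
    while cwnd < target_segments:
        # One ACK per ack_every segments; a trailing partial group is
        # still ACKed (by the delayed ACK timer), hence the ceiling.
        acks = math.ceil(cwnd / ack_every)
        cwnd += acks  # ACK counting: one segment of growth per ACK
        rtts += 1
    return rtts


every_segment = rtts_to_reach(64, ack_every=1)
delayed_acks = rtts_to_reach(64, ack_every=2)
```

In this model the delayed ACK receiver needs noticeably more round
trips to open the window to 64 segments, and each additional round
trip is costly on a long-delay path.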
Several proposals have suggested ways to make slow start less time
consuming. These proposals are briefly outlined below, along with
references to the related research work.
3.2.1 Larger Initial Window
3.2.1.1 Mitigation Description
One method that will reduce the amount of time required by slow start
(and therefore, the amount of wasted capacity) is to increase the
initial value of cwnd. An experimental TCP extension outlined in
[AFP98] allows the initial size of cwnd to be increased from 1
segment to that given in equation (1).
min (4*MSS, max (2*MSS, 4380 bytes)) (1)
By increasing the initial value of cwnd, more packets are sent during
the first RTT of data transmission, which will trigger more ACKs,
allowing the congestion window to open more rapidly. In addition, by
sending at least 2 segments initially, the first segment does not
need to wait for the delayed ACK timer to expire as is the case when
the initial size of cwnd is 1 segment (as discussed above).
Therefore, the value of cwnd given in equation 1 saves up to 3 RTTs
and a delayed ACK timeout when compared to an initial cwnd of 1
segment.
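Equation (1) can be evaluated directly. The following sketch simply
computes the formula for a few MSS values; the function name and the
sample MSS values are chosen for illustration.

```python
def initial_window(mss):
    """Initial cwnd in bytes per equation (1)."""
    return min(4 * mss, max(2 * mss, 4380))


for mss in (536, 1460, 4312):
    iw = initial_window(mss)
    print(mss, iw, iw // mss)  # MSS, initial cwnd in bytes, full segments
```

Small segments yield an initial window of 4 segments, 1460-byte
segments yield 3, and large segments yield 2.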
Also, we note that RFC2581 [APS99], a standards-track document,
allows a TCP to use an initial cwnd of up to 2 segments. This change
is highly recommended for satellite networks.
3.2.1.2 Research
Several researchers have studied the use of a larger initial window
in various environments. [Nic97] and [KAGT98] show a reduction in
WWW page transfer time over hybrid fiber coax (HFC) and satellite
links respectively. Furthermore, it has been shown that using an
initial cwnd of 4 segments does not negatively impact overall
performance over dialup modem links with a small number of buffers
[SP98]. [AHO98] shows an improvement in transfer time for 16 KB
files across the Internet and dialup modem links when using a larger
initial value for cwnd. However, a slight increase in dropped
segments was also shown. Finally, [PN98] shows improved transfer
time for WWW traffic in simulations with competing traffic, in
addition to a small increase in the drop rate.
3.2.1.3 Implementation Issues
The use of a larger initial cwnd value requires changes to the
sender's TCP stack. Using an initial congestion window of 2 segments
is allowed by RFC2581 [APS99]. Using an initial congestion window
of 3 or 4 segments is not expected to present any danger of
congestion collapse [AFP98], but it may degrade performance in some
networks.
3.2.1.4 Topology Considerations
It is expected that the use of a large initial window would be
equally beneficial to all network architectures outlined in section
2.
3.2.1.5 Possible Interaction and Relationships with Other Research
Using a fixed larger initial congestion window decreases the impact
of a long RTT on transfer time (especially for short transfers) at
the cost of bursting data into a network with unknown conditions. A
mechanism that mitigates bursts may make the use of a larger initial
congestion window more appropriate (e.g., limiting the size of line-
rate bursts [FF96] or pacing the segments in a burst [VH97a]).
Also, using delayed ACKs only after slow start (as outlined in
section 3.2.3) offers an alternative way to immediately ACK the first
segment of a transfer and open the congestion window more rapidly.
Finally, using some form of TCP state sharing among a number of
connections (as discussed in 3.8) may provide an alternative to using
a fixed larger initial window.
3.2.2 Byte Counting
3.2.2.1 Mitigation Description
As discussed above, the wide-spread use of delayed ACKs increases the
time needed by a TCP sender to increase the size of the congestion
window during slow start. This is especially harmful to flows
traversing long-delay GEO satellite links. One mechanism that has
been suggested to mitigate the problems caused by delayed ACKs is the
use of "byte counting", rather than standard ACK counting
[All97a,All98]. Using standard ACK counting, the congestion window
is increased by 1 segment for each ACK received during slow start.
However, using byte counting the congestion window increase is based
on the number of previously unacknowledged bytes covered by each
incoming ACK, rather than on the number of ACKs received. This makes
the increase relative to the amount of data transmitted, rather than
being dependent on the ACK interval used by the receiver.
Two forms of byte counting are studied in [All98]. The first is
unlimited byte counting (UBC). This mechanism simply uses the number
of previously unacknowledged bytes to increase the congestion window
each time an ACK arrives. The second form is limited byte counting
(LBC). LBC limits the amount of cwnd increase to 2 segments. This
limit throttles the size of the burst of data sent in response to a
"stretch ACK" [Pax97]. Stretch ACKs are acknowledgments that cover
more than 2 segments of previously unacknowledged data. Stretch ACKs
can occur by design [Joh95] (although this is not standard), due to
implementation bugs [All97b,PADHV99] or due to ACK loss. [All98]
shows that LBC prevents large line-rate bursts when compared to UBC,
and therefore offers fewer dropped segments and better performance.
In addition, UBC causes large bursts during slow start based loss
recovery due to the large cumulative ACKs that can arrive during loss
recovery. The behavior of UBC during loss recovery can cause large
decreases in performance and [All98] strongly recommends UBC not be
deployed without further study into mitigating the large bursts.
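The three increase rules described above can be compared side by
side. The sketch below is an illustration only, not the code used in
[All98]; the function name and the "mode" strings are assumptions
made for this example, and only the slow start increase is shown.

```python
def cwnd_increase(bytes_acked, mss, mode):
    """Bytes added to cwnd for one ACK during slow start."""
    if mode == "ack":
        return mss                        # ACK counting: 1 segment per ACK
    if mode == "ubc":
        return bytes_acked                # unlimited byte counting
    if mode == "lbc":
        return min(bytes_acked, 2 * mss)  # limited to 2 segments
    raise ValueError("unknown mode: " + mode)


MSS = 1460
delayed_ack = 2 * MSS    # a delayed ACK covering two segments
stretch_ack = 10 * MSS   # a stretch ACK covering ten segments

for mode in ("ack", "ubc", "lbc"):
    print(mode, cwnd_increase(delayed_ack, MSS, mode),
          cwnd_increase(stretch_ack, MSS, mode))
```

For an ordinary delayed ACK, UBC and LBC behave identically. The
difference appears only for stretch ACKs, where LBC caps the increase
and therefore the size of the resulting burst.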
Note: The standards track RFC2581 [APS99] allows a TCP to use byte
counting to increase cwnd during congestion avoidance, but not
during slow start.
3.2.2.2 Research
Using byte counting, as opposed to standard ACK counting, has been
shown to reduce the amount of time needed to increase the value of
cwnd to an appropriate size in satellite networks [All97a]. In
addition, [All98] presents a simulation comparison of byte counting
and the standard cwnd increase algorithm in uncongested networks and
networks with competing traffic. This study found that the limited
form of byte counting outlined above can improve performance, while
also increasing the drop rate slightly.
[BPK97,BPK98] also investigated unlimited byte counting in
conjunction with various ACK filtering algorithms (discussed in
section 3.10) in asymmetric networks.
3.2.2.3 Implementation Issues
Changing from ACK counting to byte counting requires changes to the
data sender's TCP stack. Byte counting violates the algorithm for
increasing the congestion window outlined in RFC2581 [APS99] (by
making congestion window growth more aggressive during slow start)
and therefore should not be used in shared networks.
3.2.2.4 Topology Considerations
It has been suggested by some (and roundly criticized by others) that
byte counting will allow TCP to provide uniform cwnd increase,
regardless of the ACKing behavior of the receiver. In addition, byte
counting also mitigates the slowed window growth caused by
receivers that generate stretch ACKs because of the capacity of the
return link, as discussed in [BPK97,BPK98]. Therefore, this change
is expected to be especially beneficial to asymmetric networks.
3.2.2.5 Possible Interaction and Relationships with Other Research
Unlimited byte counting should not be used without a method to
mitigate the potentially large line-rate bursts the algorithm can
cause. Also, LBC may send bursts that are too large for the given
network conditions. In this case, LBC may also benefit from some
algorithm that would lessen the impact of line-rate bursts of
segments. Also note that using delayed ACKs only after slow start
(as outlined in section 3.2.3) negates the limited byte counting
algorithm because each ACK covers only one segment during slow start.
Therefore, both ACK counting and byte counting yield the same
increase in the congestion window at this point (in the first RTT).
3.2.3 Delayed ACKs After Slow Start
3.2.3.1 Mitigation Description
As discussed above, TCP senders use the number of incoming ACKs to
increase the congestion window during slow start. And, since delayed
ACKs reduce the number of ACKs returned by the receiver by roughly
half, the rate of growth of the congestion window is reduced. One
proposed solution to this problem is to use delayed ACKs only after
the slow start (DAASS) phase. This provides more ACKs while TCP is
aggressively increasing the congestion window and fewer ACKs while TCP
is in steady state, which conserves network resources.
3.2.3.2 Research
[All98] shows that in simulation, using delayed ACKs after slow start
(DAASS) improves transfer time when compared to a receiver that
always generates delayed ACKs. However, DAASS also slightly
increases the loss rate due to the increased rate of cwnd growth.
3.2.3.3 Implementation Issues
The major problem with DAASS is in the implementation. The receiver
has to somehow know when the sender is using the slow start
algorithm. The receiver could implement a heuristic that attempts to
watch the change in the amount of data being received and change the
ACKing behavior accordingly. Or, the sender could send a message (a
flipped bit in the TCP header, perhaps) indicating that it was using
slow start. The implementation of DAASS is, therefore, an open
issue.
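For illustration only, the receiver's decision might look like the
following sketch if the sender could somehow signal its state. The
"sender_in_slow_start" input is hypothetical: no such indication
exists in TCP, and how the receiver would learn it (a heuristic or a
header bit) is precisely the open issue. The other two inputs follow
the delayed ACK rules summarized in section 3.2.

```python
def should_ack_now(sender_in_slow_start, unacked_full_segments,
                   delayed_ack_timer_expired):
    """Decide whether a DAASS receiver sends an ACK immediately."""
    if sender_in_slow_start:
        # Hypothetical signal: ACK every segment during slow start.
        return True
    # Otherwise fall back to ordinary delayed ACK behavior.
    return unacked_full_segments >= 2 or delayed_ack_timer_expired
```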
Using DAASS does not violate the TCP congestion control specification
[APS99]. However, the standards (RFC2581 [APS99]) currently
recommend using delayed acknowledgments and DAASS goes (partially)
against this recommendation.
3.2.3.4 Topology Considerations
DAASS should work equally well in all scenarios presented in section
2. However, in asymmetric networks it may aggravate ACK congestion
in the return link, due to the increased number of ACKs (see sections
3.9 and 3.10 for a more detailed discussion of ACK congestion).
3.2.3.5 Possible Interaction and Relationships with Other Research
DAASS has several possible interactions with other proposals made in
the research community. DAASS can aggravate congestion on the path
between the data receiver and the data sender due to the increased
number of returning acknowledgments. This can have an especially
adverse effect on asymmetric networks that are prone to experiencing
ACK congestion. As outlined in sections 3.9 and 3.10, several
mitigations have been proposed to reduce the number of ACKs that are
passed over a low-bandwidth return link. Using DAASS will increase
the number of ACKs sent by the receiver. The interaction between
DAASS and the methods for reducing the number of ACKs is an open
research question. Also, as noted in section 3.2.1.5 above, DAASS
provides some of the same benefits as using a larger initial
congestion window and therefore it may not be desirable to use both
mechanisms together. However, this remains an open question.
Finally, DAASS and limited byte counting are both used to increase
the rate at which the congestion window is opened. The DAASS
algorithm substantially reduces the impact limited byte counting has
on the rate of congestion window increase.
3.2.4 Terminating Slow Start
3.2.4.1 Mitigation Description
The initial slow start phase is used by TCP to determine an
appropriate congestion window size for the given network conditions
[Jac88]. Slow start is terminated when TCP detects congestion, or
when the size of cwnd reaches the size of the receiver's advertised
window. Slow start is also terminated if cwnd grows beyond a certain
size. The threshold at which TCP ends slow start and begins using
the congestion avoidance algorithm is called "ssthresh" [Jac88]. In
most implementations, the initial value for ssthresh is the
receiver's advertised window. During slow start, TCP roughly doubles
the size of cwnd every RTT and therefore can overwhelm the network
with at most twice as many segments as the network can handle. By
setting ssthresh to a value less than the receiver's advertised
window initially, the sender may avoid overwhelming the network with
twice the appropriate number of segments. Hoe [Hoe96] proposes using
the packet-pair algorithm [Kes91] and the measured RTT to determine a
more appropriate value for ssthresh. The algorithm observes the
spacing between the first few returning ACKs to determine the
bandwidth of the bottleneck link. Together with the measured RTT,
the delay*bandwidth product is determined and ssthresh is set to this
value. When TCP's cwnd reaches this reduced ssthresh, slow start is
terminated and transmission continues using congestion avoidance,
which is a more conservative algorithm for increasing the size of the
congestion window.
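The arithmetic behind such an estimate can be sketched as follows.
This is not Hoe's exact procedure. It assumes each of the first few
ACKs acknowledges one segment of known size, takes the mean spacing
between ACK arrivals as the bottleneck's per-segment service time,
and multiplies the resulting bandwidth by the measured RTT. The
function and parameter names are illustrative.

```python
from statistics import mean


def estimate_ssthresh(ack_times, segment_bytes, rtt):
    """Estimate ssthresh in bytes as bandwidth times RTT.

    ack_times: arrival times (seconds) of the first few ACKs.
    """
    gaps = [later - earlier
            for earlier, later in zip(ack_times, ack_times[1:])]
    if not gaps or min(gaps) <= 0:
        raise ValueError("need at least two strictly increasing ACK times")
    bandwidth = segment_bytes / mean(gaps)  # bytes per second
    return bandwidth * rtt                  # delay*bandwidth product


# ACKs 10 ms apart for 1000-byte segments suggest about 100000 bytes/s;
# with a 0.5 s RTT the estimate is about 50000 bytes.
ssthresh = estimate_ssthresh([0.0, 0.01, 0.02], 1000, 0.5)
```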
3.2.4.2 Research
It has been shown that estimating ssthresh can improve performance
and decrease packet loss in simulations [Hoe96]. However, obtaining
an accurate estimate of the available bandwidth in a dynamic network
is very challenging, especially when attempting to do so on the
sending side of the TCP connection [AP99]. Therefore, before this
mechanism is widely deployed, bandwidth estimation must be studied in
more detail.
3.2.4.3 Implementation Issues
As outlined in [Hoe96], estimating ssthresh requires changes to the
data sender's TCP stack. As suggested in [AP99], bandwidth estimates
may be more accurate when taken by the TCP receiver, and therefore
both sender and receiver changes would be required. Estimating
ssthresh is safe to implement in production networks from a
congestion control perspective, as it can only make TCP more
conservative than outlined in RFC2581 [APS99] (assuming the TCP
implementation is using an initial ssthresh of infinity as allowed by
[APS99]).
3.2.4.4 Topology Considerations
It is expected that this mechanism will work equally well in all
symmetric topologies outlined in section 2. However, asymmetric
links pose a special problem, as the rate of the returning ACKs may
not be the bottleneck bandwidth in the forward direction. This can
lead to the sender setting ssthresh too low. Premature termination
of slow start can hurt performance, as congestion avoidance opens
cwnd more conservatively. Receiver-based bandwidth estimators do not
suffer from this problem.
3.2.4.5 Possible Interaction and Relationships with Other Research
Terminating slow start at the right time is useful to avoid multiple
dropped segments. However, using a selective acknowledgment-based
loss recovery scheme (as outlined in section 3.3.2) can drastically
improve TCP's ability to quickly recover from multiple lost segments.
Therefore, it may not be as important to terminate slow start before
a large loss event occurs. [AP99] shows that using delayed
acknowledgments [Bra89] reduces the effectiveness of sender-side
bandwidth estimation. Therefore, using delayed ACKs only after slow
start (as outlined in section 3.2.3) may make bandwidth estimation
more feasible.
3.3 Loss Recovery
3.3.1 Non-SACK Based Mechanisms
3.3.1.1 Mitigation Description
Several similar algorithms have been developed and studied that
improve TCP's ability to recover from multiple lost segments in a
window of data without relying on the (often long) retransmission
timeout. These sender-side algorithms, known as NewReno TCP, do not
depend on the availability of selective acknowledgments (SACKs)
[MMFR96].
These algorithms generally work by updating the fast recovery
algorithm to use information provided by "partial ACKs" to trigger
retransmissions. A partial ACK covers some new data, but not all
data outstanding when a particular loss event starts. For instance,
consider the case when segment N is retransmitted using the fast
retransmit algorithm and segment M is the last segment sent when
segment N is resent. If segment N is the only segment lost, the ACK
elicited by the retransmission of segment N would be for segment M.
If, however, segment N+1 was also lost, the ACK elicited by the
retransmission of segment N will be N+1. This can be taken as an
indication that segment N+1 was lost and used to trigger a
retransmission.
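The partial ACK test in the example above can be sketched in a few
lines. This is a simplification working in whole segment numbers
rather than byte sequence numbers, and it does not reproduce any
particular NewReno variant. Here "highest_acked" is the last segment
covered by the cumulative ACK and "recover" is segment M, the last
segment sent when the fast retransmit occurred; both names are chosen
for this illustration.

```python
def on_ack_in_recovery(highest_acked, recover):
    """Classify an ACK that arrives during fast recovery.

    Returns ("full", None) when everything outstanding at the start
    of the loss event is acknowledged, otherwise ("partial", seg)
    where seg is the next segment to retransmit.
    """
    if highest_acked >= recover:
        return ("full", None)
    # New data was acknowledged, but not all of it: the segment just
    # past the acknowledged data is presumed lost.
    return ("partial", highest_acked + 1)


# Segment N=5 is retransmitted while M=9 is the last segment sent.
only_n_lost = on_ack_in_recovery(9, recover=9)
n_plus_1_lost = on_ack_in_recovery(5, recover=9)
```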
3.3.1.2 Research
Hoe [Hoe95,Hoe96] introduced the idea of using partial ACKs to
trigger retransmissions and showed that doing so could improve
performance. [FF96] shows that in some cases using partial ACKs to
trigger retransmissions reduces the time required to recover from
multiple lost segments. However, [FF96] also shows that in some
cases (many lost segments) relying on the RTO timer can improve
performance over simply using partial ACKs to trigger all
retransmissions. [HK99] shows that using partial ACKs to trigger
retransmissions, in conjunction with SACK, improves performance when
compared to TCP using fast retransmit/fast recovery in a satellite
environment. Finally, [FH99] describes several slightly different
variants of NewReno.
3.3.1.3 Implementation Issues
Implementing these fast recovery enhancements requires changes to the
sender-side TCP stack. These changes can safely be implemented in
production networks and are allowed by RFC2581 [APS99].
3.3.1.4 Topology Considerations
It is expected that these changes will work well in all environments
outlined in section 2.
3.3.1.5 Possible Interaction and Relationships with Other Research
See section 3.3.2.2.5.
3.3.2 SACK Based Mechanisms
3.3.2.1 Fast Recovery with SACK
3.3.2.1.1 Mitigation Description
Fall and Floyd [FF96] describe a conservative extension to the fast
recovery algorithm that takes into account information provided by
selective acknowledgments (SACKs) [MMFR96] sent by the receiver. The
algorithm starts after fast retransmit triggers the resending of a
segment. As with fast retransmit, the algorithm cuts cwnd in half
when a loss is detected. The algorithm keeps a variable called
"pipe", which is an estimate of the number of outstanding segments in
the network. The pipe variable is decremented by 1 segment for each
duplicate ACK that arrives with new SACK information. The pipe
variable is incremented by 1 for each new or retransmitted segment
sent. A segment may be sent when the value of pipe is less than cwnd
(this segment is either a retransmission per the SACK information or
a new segment if the SACK information indicates that no more
retransmits are needed).
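The bookkeeping described above can be sketched as follows. This is
a segment-granularity illustration, not the implementation from
[FF96]: the class and method names are assumptions, the initial value
of pipe is supplied by the caller, and the choice of which segment to
send (retransmission or new data) is left out.

```python
class PipeRecovery:
    """Tracks cwnd and pipe, in segments, during SACK-based recovery."""

    def __init__(self, cwnd_at_loss, outstanding):
        # cwnd is cut in half when the loss is detected.
        self.cwnd = max(cwnd_at_loss // 2, 1)
        # pipe estimates the number of segments still in the network.
        self.pipe = outstanding

    def on_dup_ack_with_new_sack(self):
        self.pipe -= 1  # one segment has left the network

    def can_send(self):
        return self.pipe < self.cwnd

    def on_send(self):
        self.pipe += 1  # new or retransmitted segment enters the network


state = PipeRecovery(cwnd_at_loss=8, outstanding=8)
sent = 0
for _ in range(6):  # six duplicate ACKs carrying new SACK information
    state.on_dup_ack_with_new_sack()
    if state.can_send():
        state.on_send()
        sent += 1
```

In this run nothing is sent until pipe has drained below the halved
cwnd; after that, each arriving ACK releases one segment.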
This algorithm generally allows TCP to recover from multiple segment
losses in a window of data within one RTT of loss detection. Like
the forward acknowledgment (FACK) algorithm described below, the SACK
information allows the pipe algorithm to decouple the choice of when
to send a segment from the choice of what segment to send.
[APS99] allows the use of this algorithm, as it is consistent with
the spirit of the fast recovery algorithm.
3.3.2.1.2 Research
[FF96] shows that the above described SACK algorithm performs better
than several non-SACK based recovery algorithms when 1--4 segments
are lost from a window of data. [AHKO97] shows that the algorithm
improves performance over satellite links. Hayes [Hay97] shows that
in certain circumstances, the SACK algorithm can hurt performance by
generating a large line-rate burst of data at the end of loss
recovery, which causes further loss.
3.3.2.1.3 Implementation Issues
This algorithm is implemented in the sender's TCP stack. However, it
relies on SACK information generated by the receiver. This algorithm
is safe for shared networks and is allowed by RFC2581 [APS99].
3.3.2.1.4 Topology Considerations
It is expected that the pipe algorithm will work equally well in all
scenarios presented in section 2.
3.3.2.1.5 Possible Interaction and Relationships with Other Research
See section 3.3.2.2.5.
3.3.2.2 Forward Acknowledgments
3.3.2.2.1 Mitigation Description
The Forward Acknowledgment (FACK) algorithm [MM96a,MM96b] was
developed to improve TCP congestion control during loss recovery.
FACK uses TCP SACK options to glean additional information about the
congestion state, adding more precise control to the injection of
data into the network during recovery. FACK decouples the congestion
control algorithms from the data recovery algorithms to provide a
simple and direct way to use SACK to improve congestion control. Due
to the separation of these two algorithms, new data may be sent
during recovery to sustain TCP's self-clock when there is no further
data to retransmit.
The most recent version of FACK is Rate-Halving [MM96b], in which one
packet is sent for every two ACKs received during recovery.
Transmitting a segment for every-other ACK has the result of reducing
the congestion window in one round trip to half of the number of
packets that were successfully handled by the network (so when cwnd
is too large by more than a factor of two it still gets reduced to
half of what the network can sustain). Another important aspect of
FACK with Rate-Halving is that it sustains the ACK self-clock during
recovery because transmitting a packet for every-other ACK does not
require half a cwnd of data to drain from the network before
transmitting, as required by the fast recovery algorithm
[Ste97,APS99].
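The core of the every-other-ACK rule can be sketched very simply.
The following is an illustration of the counting only, not the
implementation from [MM96b]; the class name is an assumption and the
adjustment of cwnd itself is not shown.

```python
class RateHalvingSender:
    """Permits one transmission for every two ACKs during recovery."""

    def __init__(self):
        self.acks_seen = 0

    def on_ack(self):
        """Return True when this ACK allows a segment to be sent."""
        self.acks_seen += 1
        return self.acks_seen % 2 == 0


sender = RateHalvingSender()
# Ten ACKs arrive during one round trip of recovery.
segments_sent = sum(1 for _ in range(10) if sender.on_ack())
```

Because transmissions are spread across the arriving ACKs, the sender
keeps transmitting throughout the round trip instead of stalling
until half a window has drained.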
In addition, the FACK with Rate-Halving implementation provides
Thresholded Retransmission to each lost segment. "Tcprexmtthresh" is
the number of duplicate ACKs required by TCP to trigger a fast
retransmit and enter recovery. FACK applies thresholded
retransmission to all segments by waiting until tcprexmtthresh SACK
blocks indicate that a given segment is missing before resending the
segment. This allows reasonable behavior on links that reorder
segments. As described above, FACK sends a segment for every second
ACK received during recovery. New segments are transmitted except
when tcprexmtthresh SACK blocks have been observed for a dropped
segment, at which point the dropped segment is retransmitted.
[APS99] allows the use of this algorithm, as it is consistent with
the spirit of the fast recovery algorithm.
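The choice between new data and a retransmission described above can
be sketched as follows. This is an illustrative simplification, not
the FACK code: the function name, the data structures, and the
threshold value of 3 (the usual duplicate ACK threshold) are
assumptions made for the example.

```python
TCPREXMTTHRESH = 3  # assumed value; the usual duplicate ACK threshold


def choose_segment(sack_counts, retransmitted, next_new_seq):
    """Pick what to send on a transmit opportunity during recovery.

    sack_counts maps a missing segment number to the number of SACK
    blocks that have indicated the segment is missing.
    retransmitted is the set of segments already resent.
    Returns ("rexmit", seq) or ("new", next_new_seq).
    """
    for seq in sorted(sack_counts):
        if seq in retransmitted:
            continue
        # Resend a hole only once enough SACK blocks confirm the loss,
        # which tolerates a modest amount of reordering.
        if sack_counts[seq] >= TCPREXMTTHRESH:
            return ("rexmit", seq)
    # No hole has reached the threshold, so send new data instead.
    return ("new", next_new_seq)
```

With sack_counts of {5: 3, 8: 1}, segment 5 is retransmitted first;
segment 8 is not resent until more SACK blocks report it missing, and
new data is sent in the meantime.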
3.3.2.2.2 Research
The original FACK algorithm is outlined in [MM96a]. The algorithm
was later enhanced to include Rate-Halving [MM96b]. The real-world
performance of FACK with Rate-Halving was shown to be much closer to
the theoretical maximum for TCP than either TCP Reno or the SACK-
based extensions to fast recovery outlined in section 3.3.2.1
[MSMO97].
3.3.2.2.3 Implementation Issues
In order to use FACK, the sender's TCP stack must be modified. In
addition, the receiver must be able to generate SACK options to
obtain the full benefit of using FACK. The FACK algorithm is safe
for shared networks and is allowed by RFC2581 [APS99].
3.3.2.2.4 Topology Considerations
FACK is expected to improve performance in all environments outlined
in section 2. Since it is better able to sustain its self-clock than
TCP Reno, it may be considerably more attractive over long delay
paths.
3.3.2.2.5 Possible Interaction and Relationships with Other Research
Both SACK based loss recovery algorithms described above (the fast
recovery enhancement and the FACK algorithm) are similar in that they
attempt to effectively repair multiple lost segments from a window of
data. Which of the SACK-based loss recovery algorithms to use is
still an open research question. In addition, these algorithms are
similar to the non-SACK NewReno algorithm described in section 3.3.1,
in that they attempt to recover from multiple lost segments without
reverting to using the retransmission timer. As has been shown, the
above SACK-based algorithms are more robust than the NewReno
algorithm. However, the SACK-based algorithms require a cooperating
TCP receiver, which the NewReno algorithm does not. A reasonable TCP
implementation might include both a SACK-based and a NewReno-based
loss recovery algorithm such that the sender can use the most
appropriate loss recovery algorithm based on whether or not the
receiver supports SACKs. Finally, both SACK-based and non-SACK-based
versions of fast recovery have been shown, in some cases, to transmit
a large burst of data upon leaving loss recovery [Hay97].
Therefore, the algorithms may benefit from some burst suppression
algorithm.
3.3.3 Explicit Congestion Notification
3.3.3.1 Mitigation Description
Explicit congestion notification (ECN) allows routers to inform TCP
senders about imminent congestion without dropping segments. Two
major forms of ECN have been studied. A router employing backward
ECN (BECN) transmits messages directly to the data originator
informing it of congestion. IP routers can accomplish this with an
ICMP Source Quench message. The arrival of a BECN signal may or may
not mean that a TCP data segment has been dropped, but it is a clear
indication that the TCP sender should reduce its sending rate (i.e.,
the value of cwnd). The second major form of congestion notification
is forward ECN (FECN). FECN routers mark data segments with a
special tag when congestion is imminent, but forward the data
segment. The data receiver then echoes the congestion information
back to the sender in the ACK packet. A description of a FECN
mechanism for TCP/IP is given in [RF99].
As described in [RF99], senders transmit segments with an "ECN-
Capable Transport" bit set in the IP header of each packet. If a
router employing an active queueing strategy, such as Random Early
Detection (RED) [FJ93,BCC+98], would otherwise drop this segment, an
"Congestion Experienced" bit in the IP header is set instead. Upon
reception, the information is echoed back to TCP senders using a bit
in the TCP header. The TCP sender adjusts the congestion window just
as it would if a segment was dropped.
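The marking and echo steps just described can be sketched as below.
This is a simplified model for illustration only: packets are plain
dictionaries, the field names ("ect", "ce", "ecn_echo") are
hypothetical stand-ins for the header bits, segments are numbered
rather than byte-sequenced, and the RED decision is passed in as a
boolean rather than computed.

```python
def router_handle(packet, red_would_drop):
    """Return the packet to forward, or None if it is dropped."""
    if not red_would_drop:
        return packet
    if packet["ect"]:
        # ECN-capable transport: mark the packet instead of dropping it.
        marked = dict(packet)
        marked["ce"] = True
        return marked
    # Not ECN-capable: the queue management drop happens as usual.
    return None


def receiver_ack(packet):
    """Build an ACK that echoes the congestion indication to the sender."""
    return {"ack": packet["seq"] + 1, "ecn_echo": packet["ce"]}
```

A sender that sees ecn_echo set in an ACK would then reduce its
congestion window exactly as it would after detecting a dropped
segment.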
The implementation of ECN as specified in [RF99] requires the
deployment of active queue management mechanisms in the affected
routers. This allows the routers to signal congestion by sending TCP
a small number of "congestion signals" (segment drops or ECN
messages), rather than discarding a large number of segments, as can
happen when TCP overwhelms a drop-tail router queue.
Since satellite networks generally have higher bit-error rates than
terrestrial networks, determining whether a segment was lost due to
congestion or corruption may allow TCP to achieve better performance
in high BER environments than currently possible (due to TCP's
assumption that all loss is due to congestion). While not a solution
to this problem, adding an ECN mechanism to TCP may be a part of a
mechanism that will help achieve this goal. See section 3.3.4 for a
more detailed discussion of differentiating between corruption and
congestion based losses.
3.3.3.2 Research
[Flo94] shows that ECN is effective in reducing the segment loss rate
which yields better performance especially for short and interactive
TCP connections. Furthermore, [Flo94] also shows that ECN avoids
some unnecessary and costly TCP retransmission timeouts. Finally,
[Flo94] also considers some of the advantages and disadvantages of
various forms of explicit congestion notification.
3.3.3.3 Implementation Issues
Deployment of ECN requires changes to the TCP implementation on both
sender and receiver. Additionally, deployment of ECN requires
deployment of some active queue management infrastructure in routers.
RED is assumed in most ECN discussions, because RED is already
identifying segments to drop, even before its buffer space is
exhausted. ECN simply allows the delivery of "marked" segments while
still notifying the end nodes that congestion is occurring along the
path. ECN is safe (from a congestion control perspective) for shared
networks, as it maintains the same TCP congestion control principles
as are used when congestion is detected via segment drops.
3.3.3.4 Topology Considerations
It is expected that none of the environments outlined in section 2
will present a bias towards or against ECN traffic.
3.3.3.5 Possible Interaction and Relationships with Other Research
Note that some form of active queueing is necessary to use ECN (e.g.,
RED queueing).
3.3.4 Detecting Corruption Loss
Differentiating between congestion (loss of segments due to router
buffer overflow or imminent buffer overflow) and corruption (loss of
segments due to damaged bits) is a difficult problem for TCP. This
differentiation is particularly important because the action that TCP
should take in the two cases is entirely different. In the case of
corruption, TCP should merely retransmit the damaged segment as soon
as its loss is detected; there is no need for TCP to adjust its
congestion window. On the other hand, as has been widely discussed
above, when the TCP sender detects congestion, it should immediately
reduce its congestion window to avoid making the congestion worse.
TCP's defined behavior, as motivated by [Jac88,Jac90] and defined in
[Bra89,Ste97,APS99], is to assume that all loss is due to congestion
and to trigger the congestion control algorithms, as defined in
[Ste97,APS99]. The loss may be detected using the fast retransmit
algorithm, or in the worst case is detected by the expiration of
TCP's retransmission timer.
TCP's assumption that loss is due to congestion rather than
corruption is a conservative mechanism that prevents congestion
collapse [Jac88,FF98]. Over satellite networks, however, as in many
wireless environments, loss due to corruption is more common than on
terrestrial networks. One common partial solution to this problem is
to add Forward Error Correction (FEC) to the data that is sent over
the satellite/wireless link. A more complete discussion of the
benefits of FEC can be found in [AGS99]. However, given that FEC
does not always work or cannot be universally applied, other
mechanisms have been studied to attempt to make TCP able to
differentiate between congestion-based and corruption-based loss.
TCP segments that have been corrupted are most often dropped by
intervening routers when link-level checksum mechanisms detect that
an incoming frame has errors. Occasionally, a TCP segment containing
an error may survive without detection until it arrives at the TCP
receiving host, at which point it will almost always either fail the
IP header checksum or the TCP checksum and be discarded as in the
link-level error case. Unfortunately, in either of these cases, it is
not generally safe for the node detecting the corruption to return
information about the corrupt packet to the TCP sender because the
sending address itself might have been corrupted.
3.3.4.1 Mitigation Description
Because the probability of link errors on a satellite link is
relatively greater than on a hardwired link, it is particularly
important that the TCP sender retransmit these lost segments without
reducing its congestion window. Because corrupt segments do not
indicate congestion, there is no need for the TCP sender to enter a
congestion avoidance phase, which may waste available bandwidth.
Simulations performed in [SF98] show a performance improvement when
TCP can properly differentiate between corruption and
congestion on wireless links.
Perhaps the greatest research challenge in detecting corruption is
getting TCP (a transport-layer protocol) to receive appropriate
information from either the network layer (IP) or the link layer.
Much of the work done to date has involved link-layer mechanisms that
retransmit damaged segments. The challenge seems to be to get these
mechanisms to make repairs in such a way that TCP understands what
happened and can respond appropriately.
3.3.4.2 Research
Research into corruption detection to date has focused primarily on
making the link level detect errors and then perform link-level
retransmissions. This work is summarized in [BKVP97,BPSK96]. One of
the problems with this promising technique is that it causes an
effective reordering of the segments from the TCP receiver's point of
view. As a simple example, if segments A B C D are sent across a
noisy link and segment B is corrupted, segments C and D may have
already crossed the link before B can be retransmitted at the link
level, causing them to arrive at the TCP receiver in the order A C D
B. This segment reordering would cause the TCP receiver to generate
duplicate ACKs upon the arrival of segments C and D. If the
reordering was bad enough, these duplicate ACKs would trigger the
fast retransmit algorithm in the TCP sender. Research presented in
[MV98] proposes the idea of suppressing
or delaying the duplicate ACKs in the reverse direction to counteract
this behavior. Alternatively, proposals that make TCP more robust in
the face of re-ordered segment arrivals [Flo99] may reduce the side
effects of the re-ordering caused by link-layer retransmissions.
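The A C D B example can be worked through with a small model of a
cumulative-ACK receiver. This is a sketch under simplifying
assumptions: segments are numbered 1, 2, 3, ... rather than by byte
sequence, every arrival generates an ACK immediately, and both
function names are hypothetical.

```python
def cumulative_acks(arrivals):
    """Return the ACK (next expected segment) sent after each arrival."""
    received = set()
    next_expected = 1
    acks = []
    for seg in arrivals:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def max_duplicate_run(acks):
    """Largest number of consecutive duplicate ACKs for one ACK value."""
    best = run = 0
    for prev, cur in zip(acks, acks[1:]):
        run = run + 1 if cur == prev else 0
        best = max(best, run)
    return best
```

With A, B, C, D numbered 1 through 4, the arrival order [1, 3, 4, 2]
produces the ACKs [2, 2, 2, 5], i.e., two duplicate ACKs. If one more
segment had crossed the link ahead of the repaired one ([1, 3, 4, 5,
2]), there would be three duplicate ACKs, which is enough to trigger
fast retransmit with the usual threshold of three.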
A higher-level approach, outlined in [DMT96], uses a new
"corruption experienced" ICMP error message generated by routers that
detect corruption. These messages are sent in the forward direction,
toward the packet's destination, rather than in the reverse direction
as is done with ICMP Source Quench messages. Sending the error
messages in the forward direction allows this feedback to work over
asymmetric paths. As noted above, generating an error message in
response to a damaged packet is problematic because the source and
destination addresses may not be valid. The mechanism outlined in
[DMT96] gets around this problem by having the routers maintain a
small cache of recent packet destinations; when the router
experiences an error rate above some threshold, it sends an ICMP
corruption-experienced message to all of the destinations in its
cache. Each TCP receiver then must return this information to its
respective TCP sender (through a TCP option). Upon receiving an ACK
with this "corruption-experienced" option, the TCP sender assumes
that packet loss is due to corruption rather than congestion for two
round trip times (RTT) or until it receives additional link state
information (such as "link down", source quench, or additional
"corruption experienced" messages). Note that in shared networks,
ignoring segment loss for 2 RTTs may aggravate congestion by making
TCP unresponsive.
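The sender-side behavior described above might be modeled roughly as
follows. This is a sketch, not the mechanism from [DMT96]: the class
and method names are hypothetical, the congestion response is
simplified to halving cwnd (in segments), and early termination of
the two-RTT period by additional link state information is omitted.

```python
class Sender:
    """Toy sender that treats loss as corruption for two RTTs after
    an ACK carrying the corruption-experienced option."""

    def __init__(self, cwnd, rtt):
        self.cwnd = cwnd              # congestion window, in segments
        self.rtt = rtt                # round trip time, in seconds
        self.corruption_until = 0.0   # end of the corruption period

    def on_corruption_experienced(self, now):
        # Assume corruption is the cause of loss for the next two RTTs.
        self.corruption_until = now + 2 * self.rtt

    def on_loss(self, now):
        if now < self.corruption_until:
            # Retransmit only; leave the congestion window alone.
            return self.cwnd
        # Otherwise assume congestion (simplified response: halve cwnd).
        self.cwnd = max(self.cwnd // 2, 1)
        return self.cwnd
```

The branch that leaves cwnd untouched is precisely where the risk
noted above arises: a loss caused by congestion during the two-RTT
period goes unanswered.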
3.3.4.3 Implementation Issues
All of the techniques discussed above require changes to at least the
TCP sending and receiving stacks, as well as intermediate routers.
Due to the concerns over possibly ignoring congestion signals (i.e.,
segment drops), the above algorithm is not recommended for use in
shared networks.
3.3.4.4 Topology Considerations
It is expected that corruption detection, in general, would be
beneficial in all environments outlined in section 2. It would be
particularly beneficial in the satellite/wireless environment over
which these errors may be more prevalent.
3.3.4.5 Possible Interaction and Relationships with Other Research
SACK-based loss recovery algorithms (as described in 3.3.2) may
reduce the impact of corrupted segments on mostly clean links because
recovery will be able to happen more rapidly (and without relying on
the retransmission timer). Note that while SACK-based loss recovery
helps, throughput will still suffer in the face of non-congestion
related packet loss.
3.4 Congestion Avoidance
3.4.1 Mitigation Description
During congestion avoidance, in the absence of loss, the TCP sender
adds approximately one segment to its congestion window during each
RTT [Jac88,Ste97,APS99]. Several researchers have observed that this
policy leads to unfair sharing of bandwidth when multiple connections
with different RTTs traverse the same bottleneck link, with the long
RTT connections obtaining only a small fraction of their fair share
of the bandwidth.
One effective solution to this problem is to deploy fair queueing and
TCP-friendly buffer management in network routers [Sut98]. However,
in the absence of help from the network, other researchers have
investigated changes to the congestion avoidance policy at the TCP
sender, as described in [Flo91,HK98].
3.4.2 Research
The "Constant-Rate" increase policy has been studied in [Flo91,HK98].
It attempts to equalize the rate at which TCP senders increase their
sending rate during congestion avoidance. Both [Flo91] and [HK98]
illustrate cases in which the "Constant-Rate" policy largely corrects
the bias against long RTT connections, although [HK98] presents some
evidence that such a policy may be difficult to incrementally deploy
in an operational network. The proper selection of a constant (for
the constant rate of increase) is an open issue.
The "Increase-by-K" policy can be selectively used by long RTT
connections in a heterogeneous environment. This policy simply
changes the slope of the linear increase, with connections over a
given RTT threshold adding "K" segments to the congestion window
every RTT, instead of one. [HK98] presents evidence that this
policy, when used with small values of "K", may be successful in
reducing the unfairness while keeping the link utilization high, when
a small number of connections share a bottleneck link. The selection
of the constant "K," the RTT threshold to invoke this policy, and
performance under a large number of flows are all open issues.
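The "Increase-by-K" policy can be illustrated with a small sketch.
The function name and parameters are hypothetical, and the values of
"K" and the RTT threshold in the example are arbitrary, since their
selection is an open issue as noted above.

```python
def cwnd_after_rtts(cwnd, rtt, rounds, k=1, rtt_threshold=None):
    """Congestion window (in segments) after `rounds` loss-free RTTs
    of congestion avoidance.

    Connections whose RTT exceeds rtt_threshold add k segments per
    RTT; all other connections add one segment per RTT.
    """
    long_rtt = rtt_threshold is not None and rtt > rtt_threshold
    step = k if long_rtt else 1
    return cwnd + step * rounds
```

For example, starting from 10 segments, a connection with a 0.56
second RTT and K = 4 (threshold 0.2 seconds) reaches 30 segments
after 5 round trips, while a connection below the threshold reaches
15 segments over the same number of round trips.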
3.4.3 Implementation Issues
Implementation of either the "Constant-Rate" or "Increase-by-K"
policies requires a change to the congestion avoidance mechanism at
the TCP sender. In the case of "Constant-Rate," such a change must
be implemented globally. Additionally, the TCP sender must have a
reasonably accurate estimate of the RTT of the connection. The
algorithms outlined above violate the congestion avoidance algorithm
as outlined in RFC2581 [APS99] and therefore should not be
implemented in shared networks at this time.
3.4.4 Topology Considerations
These solutions are applicable to all satellite networks that are
integrated with a terrestrial network, in which satellite connections
may be competing with terrestrial connections for the same bottleneck
link.
3.4.5 Possible Interaction and Relationships with Other Research
As shown in [PADHV99], increasing the congestion window by multiple
segments per RTT can cause TCP to drop multiple segments and force a
retransmission timeout in some versions of TCP. Therefore, the above
changes to the congestion avoidance algorithm may need to be
accompanied by a SACK-based loss recovery algorithm that can quickly
repair multiple dropped segments.
3.5 Multiple Data Connections
3.5.1 Mitigation Description
One method that has been used to overcome TCP's inefficiencies in the
satellite environment is to use multiple TCP flows to transfer a
given file. The use of N TCP connections makes the sender N times
more aggressive and therefore can improve throughput in some
situations. Using N TCP connections can impact the transfer
and the network in a number of ways, which are listed below.
1. The transfer is able to start transmission using an effective
congestion window of N segments, rather than a single segment as
one TCP flow uses. This allows the transfer to more quickly
increase the effective cwnd size to an appropriate size for the
given network. However, in some circumstances an initial window
of N segments is inappropriate for the network conditions. In
this case, a transfer utilizing more than one connection may
aggravate congestion.
2. During the congestion avoidance phase, the transfer increases the
effective cwnd by N segments per RTT, rather than the one segment
per RTT increase that a single TCP connection provides. Again,
this can aid the transfer by more rapidly increasing the effective
cwnd to an appropriate point. However, this rate of increase can
also be too aggressive for the network conditions. In this case,
the use of multiple data connections can aggravate congestion in
the network.
3. Using multiple connections can provide a very large overall
congestion window. This can be an advantage for TCP
implementations that do not support the TCP window scaling
extension [JBB92]. However, the aggregate cwnd size across all N
connections is equivalent to using a TCP implementation that
supports large windows.
4. The overall cwnd decrease in the face of dropped segments is
reduced when using N parallel connections. A single TCP
connection reduces the effective size of cwnd to half when a
single segment loss is detected. When utilizing N connections
each using a window of W bytes, a single drop reduces the window
to:
(N * W) - (W / 2)
Clearly this is a less dramatic reduction in the effective cwnd size
than when using a single TCP connection. And, the amount by which
the cwnd is decreased is further reduced by increasing N.
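The effect of N on the window reduction can be checked numerically.
The sketch below simply evaluates the expression above; the function
names are hypothetical, and the model assumes that all N connections
have the same window W and that only one of them sees the drop.

```python
def aggregate_after_single_drop(n, w):
    """Aggregate window after one of n connections halves its window."""
    return (n * w) - (w / 2)


def fraction_retained(n, w):
    """Share of the aggregate window that survives a single drop."""
    return aggregate_after_single_drop(n, w) / (n * w)
```

With W = 100, a single connection keeps half of its window (50),
whereas four connections keep 350 of 400, or 87.5 percent. In
general the retained fraction is 1 - 1/(2N).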
The use of multiple data connections can increase the ability of
non-SACK TCP implementations to quickly recover from multiple dropped
segments without resorting to a timeout, assuming the dropped
segments cross connections.
The use of multiple parallel connections makes TCP overly aggressive
for many environments and can contribute to congestive collapse in
shared networks [FF99]. The advantages provided by using multiple
TCP connections are now largely provided by TCP extensions (larger
windows, SACKs, etc.). Therefore, the use of a single TCP connection
is more "network friendly" than using multiple parallel connections.
However, using multiple parallel TCP connections may provide
performance improvement in private networks.
3.5.2 Research
Research on the use of multiple parallel TCP connections shows
improved performance [IL92,Hah94,AOK95,AKO96]. In addition, research
has shown that multiple TCP connections can outperform a single
modern TCP connection (with large windows and SACK) [AHKO97].
However, these studies did not consider the impact of using multiple
TCP connections on competing traffic. [FF99] argues that using
multiple simultaneous connections to transfer a given file may lead
to congestive collapse in shared networks.
3.5.3 Implementation Issues
To utilize multiple parallel TCP connections a client application and
the corresponding server must be customized. As outlined in [FF99]
using multiple parallel TCP connections is not safe (from a
congestion control perspective) in shared networks and should not be
used.
3.5.4 Topology Considerations
As stated above, [FF99] outlines that the use of multiple parallel
connections in a shared network, such as the Internet, may lead to
congestive collapse. However, the use of multiple connections may be
safe and beneficial in private networks. The specific topology being
used will dictate the number of parallel connections required. Some
work has been done to determine the appropriate number of connections
on the fly [AKO96], but such a mechanism is far from complete.
3.5.5 Possible Interaction and Relationships with Other Research
Using multiple concurrent TCP connections enables use of a large
congestion window, much like the TCP window scaling option [JBB92].
In addition, a larger initial congestion window is achieved, similar
to using [AFP98] or TCB sharing (see section 3.8).
3.6 Pacing TCP Segments
3.6.1 Mitigation Description
Slow-start takes several round trips to fully open the TCP congestion
window over routes with high bandwidth-delay products. For short TCP
connections (such as WWW traffic with HTTP/1.0), the slow-start
overhead can preclude effective use of the high-bandwidth satellite
links. When senders implement slow-start restart after a TCP
connection goes idle (suggested by Jacobson and Karels [JK92]),
performance is reduced in long-lived (but bursty) connections (such
as HTTP/1.1, which uses persistent TCP connections to transfer
multiple WWW page elements) [Hei97a].
Rate-based pacing (RBP) is a technique, used in the absence of
incoming ACKs, where the data sender temporarily paces TCP segments
at a given rate to restart the ACK clock. Upon receipt of the first
ACK, pacing is discontinued and normal TCP ACK clocking resumes. The
pacing rate may either be known from recent traffic estimates (when
restarting an idle connection or from recent prior connections), or
may be known through external means (perhaps in a point-to-point or
point-to-multipoint satellite network where available bandwidth can
be assumed to be large).
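A minimal sketch of the pacing step is given below. It is not taken
from any RBP implementation: the function name and parameters are
hypothetical, the pacing rate is assumed to be already known, and the
arrival time of the first ACK is supplied as an input rather than
observed.

```python
def paced_send_times(num_segments, rate_bps, segment_bytes, first_ack_at):
    """Return the send times (seconds from restart) of paced segments.

    Segments are spaced evenly at the given rate. Pacing stops at the
    first ACK, after which normal ACK clocking takes over.
    """
    interval = segment_bytes * 8 / rate_bps  # seconds between segments
    times = []
    for i in range(num_segments):
        t = i * interval
        if t >= first_ack_at:
            break  # the ACK clock has restarted; stop pacing
        times.append(t)
    return times
```

For instance, with 1250-byte segments paced at 1 Mbit/s the spacing
is 10 ms, so if the first ACK arrives 35 ms after the restart, four
segments are paced out before ACK clocking resumes.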
In addition, pacing data during the first RTT of a transfer may allow
TCP to make effective use of high bandwidth-delay links even for
short transfers. However, in order to pace segments during the first
RTT a TCP will have to be using a non-standard initial congestion
window and a new mechanism to pace outgoing segments rather than send
them back-to-back. Determining an appropriate size for the initial
cwnd is an open research question. Pacing can also be used to reduce
bursts in general (due to buggy TCPs or byte counting, see section
3.2.2 for a discussion on byte counting).
3.6.2 Research
Simulation studies of rate-based pacing for WWW-like traffic have
shown reductions in router congestion and drop rates [VH97a]. In
this environment, RBP substantially improves performance compared to
slow-start-after-idle for intermittent senders, and it slightly
improves performance over burst-full-cwnd-after-idle (because of
drops) [VH98]. More recently, pacing has been suggested to eliminate
burstiness in networks with ACK filtering [BPK97].
3.6.3 Implementation Issues
RBP requires only sender-side changes to TCP. Prototype
implementations of RBP are available [VH97b]. RBP requires an
additional sender timer for pacing. The overhead of timer-driven
data transfer is often considered too high for practical use.
Preliminary experiments suggest that in RBP this overhead is minimal
because RBP only requires this timer for one RTT of transmission
[VH98]. RBP is expected to make TCP more conservative in sending
bursts of data after an idle period in hosts that do not revert to
slow start after an idle period. On the other hand, RBP makes TCP
more aggressive if the sender uses the slow start algorithm to start
the ACK clock after a long idle period.
3.6.4 Topology Considerations
RBP could be used to restart idle TCP connections for all topologies
in Section 2. Use at the beginning of new connections would be
restricted to topologies where available bandwidth can be estimated
out-of-band.
3.6.5 Possible Interaction and Relationships with Other Research
Pacing segments may benefit from sharing state amongst various flows
between two hosts, due to the time required to determine the needed
information. Additionally, pacing segments, rather than sending
back-to-back segments, may make estimating the available bandwidth
(as outlined in section 3.2.4) more difficult.
3.7 TCP Header Compression
The TCP and IP header information needed to reliably deliver packets
to a remote site across the Internet can add significant overhead,
especially for interactive applications. Telnet packets, for
example, typically carry only a few bytes of data per packet, and
standard IPv4/TCP headers add at least 40 bytes to this; IPv6/TCP
headers add at least 60 bytes. Much of this information remains
relatively constant over the course of a session and so can be
replaced by a short session identifier.
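The overhead figures above can be put in proportion with a short
calculation. The function name is hypothetical, and the one-byte
payload used in the example is an assumed value standing in for "a
few bytes of data".

```python
def header_overhead(payload_bytes, header_bytes):
    """Fraction of each packet that is header rather than payload."""
    return header_bytes / (header_bytes + payload_bytes)
```

For a packet carrying a single byte of data, a 40 byte IPv4/TCP
header accounts for about 97.6 percent of the packet, and a 60 byte
IPv6/TCP header for about 98.4 percent.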
3.7.1 Mitigation Description
Many fields in the TCP and IP headers either remain constant during
the course of a session, change very infrequently, or can be inferred
from other sources. For example, the source and destination
addresses, as well as the IP version, protocol, and port fields
generally do not change during a session. Packet length can be
deduced from the length field of the underlying link layer protocol
provided that the link layer packet is not padded. Packet sequence
numbers in a forward data stream generally change with every packet,
but increase in a predictable manner.
The TCP/IP header compression methods described in
[DNP99,DENP97,Jac90] reduce the overhead of TCP sessions by replacing
the data in the TCP and IP headers that remains constant, changes
slowly, or changes in a predictable manner with a short "connection
number". Using this method, the sender first sends a full TCP/IP
header, including in it a connection number that the sender will use
to reference the connection. The receiver stores the full header and
uses it as a template, filling in some fields from the limited
information contained in later, compressed headers. This compression
can reduce the size of an IPv4/TCP header from 40 to as few as 3 to
5 bytes (3 bytes for some common cases, 5 bytes in general).
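The template idea can be illustrated with a toy decompressor. This
is not the encoding defined in [Jac90] or [DNP99]: the class, the
dictionary-based headers, and the use of a single sequence number
delta as the only compressed field are assumptions made to keep the
example short.

```python
class Decompressor:
    """Toy receiver side of template-based header compression."""

    def __init__(self):
        self.templates = {}  # connection number -> last full header

    def full(self, conn, header):
        # A full header establishes (or refreshes) the template.
        self.templates[conn] = dict(header)
        return dict(header)

    def compressed(self, conn, seq_delta):
        # Constant fields come from the stored template; only the
        # change in sequence number travels in the compressed header.
        header = self.templates[conn]
        header["seq"] += seq_delta
        return dict(header)
```

After one full header for connection number 1, each later packet
needs to carry only the connection number and the delta; addresses
and ports are filled in from the stored template.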
Compression and decompression generally happen below the IP layer, at
the end-points of a given physical link (such as at two routers
connected by a serial line). The hosts on either side of the
physical link must maintain some state about the TCP connections that
are using the link.
The decompressor must pass complete, uncompressed packets to the IP
layer. Thus header compression is transparent to routing, for
example, since an incoming packet with compressed headers is expanded
before being passed to the IP layer.
A variety of methods can be used by the compressor/decompressor to
negotiate the use of header compression. For example, the PPP serial
line protocol allows for an option exchange, during which time the
compressor/decompressor agree on whether or not to use header
compression. For older SLIP implementations, [Jac90] describes a
mechanism that uses the first bit in the IP packet as a flag.
The