Network Working Group
INTERNET DRAFT                                 S. Jagannath and N. Yin
Expires in six months                                Bay Networks Inc.
                                                           August 1997
End-to-End Traffic Management Issues in IP/ATM Internetworks
<draft-jagan-e2e-traf-mgmt-00.txt>
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''
To learn the current status of any Internet-Draft, please check
the ``1id-abstracts.txt'' listing contained in the Internet-Drafts
Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net
(Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
Rim).
Abstract
This document addresses the end-to-end traffic management issues in
IP/ATM internetworks. In the internetwork environment, the ATM
control mechanisms (e.g., Available Bit Rate (ABR) and UBR with Early
Packet Discard (EPD)) are applicable to the ATM subnetwork, while the
TCP flow control extends from end to end. We investigated the end-to-end
performance in terms of TCP throughput and file transfer delay in
cases using ABR and UBR in the ATM subnetwork. In this document, we
also discuss the issue of trade-off between the buffer requirements
at the ATM edge device (e.g., Ethernet-ATM switch, ATM router
interface) versus ATM switches inside the ATM network.
Our simulation results show that in certain scenarios (e.g., with
limited edge device buffer memory) UBR with EPD may perform
comparably to ABR or even outperform ABR. We show that it is not
sufficient to have a lossless ATM subnetwork from the end-to-end
performance point of view. The results illustrate the necessity for
an edge device congestion handling mechanism that can couple the ABR
and TCP feedback control loops. We present an algorithm that makes
use of the ABR feedback information and edge device congestion state
to make packet dropping decisions at the edge of the ATM network.
Using the algorithm at the edge device, the end-to-end throughput and
delay performance is improved when ABR is used in the ATM subnetwork
and the edge device has only small buffers.
1. Introduction
In an IP/ATM internetwork environment, the legacy networks (e.g.,
Ethernet) are typically interconnected by an ATM subnetwork (e.g.,
ATM backbone). Supporting end-to-end Quality of Service (QoS) and
traffic management in such internetworks becomes an important
requirement for multimedia applications. In this document, we focus
on the end-to-end traffic management issues in IP/ATM internetworks.
We investigated the end-to-end performance in terms of TCP throughput
and file transfer delay in cases using ABR and UBR in the ATM
subnetwork. We also discuss the trade-off between the buffer
requirement at the ATM edge device (e.g.,
Ethernet-ATM switch, ATM router interface) versus ATM switches inside
the ATM network.
The ABR rate-based flow control mechanism has been specified by the
ATM Forum [1]. There have been significant investigations of ABR flow
control and TCP over ATM [8-11]. Most of this work has focused on ATM
switch buffer requirements and throughput fairness under different
algorithms (e.g., simple binary Explicit Forward Congestion Indication
(EFCI) vs. Explicit Rate control; FIFO vs. per-VC queuing). It is
generally believed that ABR can effectively
control the congestion within ATM networks.
There has been little work on end-to-end traffic management in the
internetworking environment, where networks are not end-to-end ATM.
In such cases ABR flow control may simply push the congestion to the
edge of the ATM network. Even if the ATM network is kept free of
congestion by using ABR flow control, the end-to-end performance
(e.g., the time to transfer a file) perceived by the application
(typically hosted in the legacy network) may not necessarily be
better. Furthermore, one may argue that the reduction in buffer
requirement in the ATM switch by using ABR flow control may be at the
expense of an increase in buffer requirement at the edge device
(e.g., ATM router interface, legacy LAN-to-ATM switch).
Since most of today's data applications use the TCP flow control
protocol, one may question the benefits of ABR flow control as a
subnetwork technology, arguing that UBR is equally effective and much
less complex than ABR [12]. Figures 1 and 2 illustrate the two cases
under consideration.
Source --> IP -> Router --> ATM Net --> Router -> IP --> Destination
Cloud UBR Cloud
----------------------->----------------------
| |
--------------- TCP Control Loop --<----------
Figure 1: UBR based ATM Subnetwork.
In these cases, the source and destination are interconnected through
an IP/ATM/IP internetwork. In Figure 1, the connection uses the UBR
service and in Figure 2, the connection uses ABR. In the UBR case,
when congestion occurs in the ATM network, cells are dropped (e.g.,
Early Packet Discard [15] may be utilized), resulting in a reduction
of the TCP congestion window. In the ABR case, when congestion is
detected in the ATM network, ABR rate control becomes effective and
forces the ATM edge device to reduce its transmission rate into the
ATM network. If congestion persists, the buffer in the edge device
will reach its capacity and start to drop packets, resulting in the
reduction of the TCP congestion window.
From the performance point of view, the latter involves two control
loops: ABR and TCP. The ABR inner control loop may result in a
longer feedback delay for the TCP control. Furthermore, there are
two feedback control protocols, and one may argue that the
interactions or interference between the two may actually degrade the
TCP performance [12].
Source --> IP -> Router --> ATM Net --> Router -> IP --> Destination
Cloud ABR Cloud
----->---------
| |
-- ABR Loop -<-
-------------------------->-------------------
| |
--------------- TCP Control Loop --<----------
Figure 2. ABR Based ATM Subnetwork
From the implementation perspective, there are trade-offs between the
memory requirements at the ATM switch and the edge device for UBR and
ABR. In
addition, we need to consider the costs of implementation of ABR and
UBR at ATM switches and edge devices.
From this discussion, we believe that the end-to-end traffic
management of an entire internetwork environment should be
considered. In this document, we study the interaction and
performance of ABR and UBR from the point of view of end-to-end
traffic management. Our simulation results show that in certain
scenarios (e.g., with limited edge device buffer memory) UBR with EPD
may perform comparably to ABR or even outperform ABR. We show that it
is not sufficient to have a lossless ATM subnetwork from the
end-to-end performance point of view. The results illustrate the
necessity for an edge device congestion handling mechanism that can
couple the two feedback control loops (ABR and TCP). We present an
algorithm that makes use of the ABR feedback information and the edge
device congestion state to make packet dropping decisions at the edge
of the ATM network. Using the algorithm at the edge device, the
end-to-end throughput and delay are improved when ABR is used as the
ATM subnetwork technology and the edge device has only small buffers.
2. Buffer Requirements
In the context of feedback congestion control, buffers are used to
absorb the transient traffic conditions and steady state rate
oscillations due to the feedback delay. The buffers help to avoid
losses in the network and to maintain link utilization when the
aggregate traffic rate falls below the link bandwidth. When there is
congestion,
the switch conveys the information back to the source. The source
reduces its rate according to the feedback information. The switch
sees a drop in its input rates after a round trip delay. If the
buffer size is large enough to store the extra data during this
delay, it is possible to avoid losses. When the congestion abates,
the switch sends information back to the source allowing it to
increase its rate. There is again a delay before the switch sees an
increase in its input rates. Meanwhile the data is drained from the
buffer at a higher rate compared with its input rate. The switch
continues to see a full link utilization as long as there is data in
the buffer. Since the maximum drain rate is the bandwidth of the link
and the minimum feedback delay is the round-trip propagation delay,
it is generally believed that a buffer of one bandwidth-delay product
(link bandwidth times round-trip propagation delay) is required to
achieve good throughput.
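As a rough illustration of this sizing rule (a back-of-the-envelope
sketch only, using the OC-3 line rate and the 6 ms ATM round-trip
propagation delay assumed later in this document), the bandwidth-delay
product amounts to a few thousand cells:

   # Rough bandwidth-delay product for an OC-3 ATM link, assuming the
   # full 155.52 Mb/s line rate and a 6 ms round-trip propagation
   # delay (2 * D2 in the simulation model of Section 3).
   LINK_RATE_BPS = 155.52e6       # OC-3 line rate in bit/s
   RTT_S = 0.006                  # round-trip propagation delay
   CELL_BITS = 53 * 8             # one ATM cell, header included

   bdp_bits = LINK_RATE_BPS * RTT_S
   bdp_cells = bdp_bits / CELL_BITS
   print("BDP ~ %.0f cells (~%.0f KB)"
         % (bdp_cells, bdp_bits / 8 / 1024))
   # -> roughly 2200 cells, i.e. about 114 KB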
The above argument assumes that the source can reduce its rate when
it receives feedback information from the network and that the source
will be able to send data at a higher rate when congestion abates in
the network. In the internetworking environment, the edge device is
the ATM end-system. It controls its transmission rate according to
the ABR feedback information, whereas TCP sources send data packets
according to the TCP window control protocol.
TCP uses implicit negative feedback information (i.e., packet loss
and timeout). When there is congestion in the network, TCP may
continue to transmit a full window's worth of data, causing packet
loss if the buffer size is not greater than the window size. When
multiple TCP streams share the same buffer, it is impractical to size
the buffer to be greater than the sum of the window sizes of all
the streams. When there is packet loss, the TCP window size is
reduced multiplicatively and it goes through a recovery phase. In
this phase, TCP is constrained to transmit at a lower rate even if
the network is free of congestion, causing a decrease in the link
utilization.
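The TCP behavior described above can be sketched roughly as follows.
This is a simplified model of a Reno-style congestion window, not the
modified TCP Reno code used in the simulations; window sizes are in
packets, and timer and selective-acknowledgment details are omitted.

   class RenoLikeWindow:
       """Very simplified Reno-style congestion window (in packets)."""

       def __init__(self):
           self.cwnd = 1.0          # congestion window
           self.ssthresh = 64.0     # arbitrary initial threshold

       def on_rtt_without_loss(self):
           if self.cwnd < self.ssthresh:
               self.cwnd *= 2       # slow start: double per RTT
           else:
               self.cwnd += 1       # congestion avoidance: +1 per RTT

       def on_fast_retransmit(self):
           # Loss detected via duplicate acks: halve the window.
           self.ssthresh = max(self.cwnd / 2, 2)
           self.cwnd = self.ssthresh

       def on_timeout(self):
           # Loss detected via retransmission timeout: go back to one
           # packet and recover through slow start.
           self.ssthresh = max(self.cwnd / 2, 2)
           self.cwnd = 1.0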
When ABR is used, it is possible to avoid losses within the ATM
subnetwork. However, since the host uses TCP, the congestion is
merely shifted from the ATM subnetwork to the IP/ATM interface. There
may be packet loss in the edge device and loss of throughput because
of that. Moreover, there is a possibility that there will be negative
interaction between the two feedback loops. For example, when the TCP
window is large, the available rate may be reduced and when the TCP
window is small due to packet losses, the available rate may be high.
When edge buffers are limited, this kind of negative interaction can
cause a severe degradation in throughput. When UBR is used, packet
loss due to EPD or cell discard at the ATM switches triggers the
reduction of the TCP window. Hence, in the UBR case, there is only a
single TCP feedback loop and no additional buffer requirement at the
edge device.
In general, using UBR may need more memory at the ATM switch, whereas
using ABR needs more memory at the edge device. The buffer required
for zero cell loss, at the ATM switch for UBR or at the edge device
for ABR, is on the order of the maximum TCP window size times the
number of TCP sessions. The edge device may be a low-cost LAN-ATM
switch with limited buffer memory. Furthermore, the cost of the edge
device is shared by a smaller number of TCP sessions than that of the
ATM switch, making it more cost-sensitive than the backbone switch.
In reality, the buffers at both the edge device and the ATM switch
are limited and smaller than required for zero cell loss.
Hence, it is our goal to maximize the TCP throughput for the limited
buffers at both the edge device and the ATM switches. In our
simulation experiments, we assume limited buffers at both the edge
device and the ATM switches.
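For the configuration simulated below (10 TCP connections and a
655360-byte maximum window), the zero-loss sizing argument above
translates into a buffer budget far beyond the 500 to 10000 cell
buffers considered in this document. A rough calculation (a sketch
only, ignoring protocol overhead):

   MAX_WINDOW_BYTES = 655360     # maximum TCP window (Section 3.1)
   NUM_TCP_SESSIONS = 10         # connections through the bottleneck
   CELL_PAYLOAD_BYTES = 48       # AAL5 payload bytes per ATM cell

   zero_loss_bytes = MAX_WINDOW_BYTES * NUM_TCP_SESSIONS
   zero_loss_cells = zero_loss_bytes / CELL_PAYLOAD_BYTES
   print("~%.1f MB, ~%d cells"
         % (zero_loss_bytes / 1e6, zero_loss_cells))
   # -> about 6.6 MB, i.e. well over 130,000 cells, compared with the
   #    500 to 10000 cell buffers assumed in the simulations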
3. Internetwork Model
In the end-to-end model the whole network can be considered to be a
black box with TCP sources and destinations in the periphery. The
only properties of the black box that the TCP modules are sensitive
to are the packet loss and the round trip delay through the black
box. The round trip delay determines how fast the TCP window can
open up to utilize available bandwidth in the network, and also how
fast it can react to impending congestion in the network. The packet
loss triggers the TCP congestion avoidance and recovery scheme. The
feedback control loop of the ABR service may cause an increase in the
over-all round trip delay as it constricts the rate available to each
TCP stream. It is generally believed that ABR is able to reduce the
congestion level within the ATM network; however, it does so at the
expense of increasing the congestion level at the edge of the ATM
network. From an end-to-end performance point of view, it is not
clear if this is indeed beneficial.
In our simulations we use a modified version of TCP Reno. This
implementation of TCP uses fast retransmission and recovery along
with a modification to deal with multiple packet losses within the
same window [4]. The simulation experiments were performed using a
TCP/IP and IP over ATM [5] protocol stack, as shown in Figure 3. The
TCP model conforms to RFCs 793 [6] and 1122 [7]. In addition, the
window scaling option is used and the TCP timer granularity is set to
0.1 s. The options considered for ATM are UBR, UBR with Early Packet
Discard (EPD), ABR with EFCI, and ABR with Explicit Rate (ER) [1].
At a high level, the model consists of two IP clouds connected by an
ATM cloud, as shown in Figures 1 and 2. More specifically, the model
consists of two ATM switches connected back to back with a single
OC-3 (155 Mb/s) link. The IP/ATM edge devices are connected to the
ATM switches via OC-3 access links. Ten such devices are attached to
each switch, and therefore ten bi-directional TCP connections go
through the bottleneck link between the ATM switches. One such
connection is shown in the layered model in Figure 3.
_________ _________
| App. | | App. |
|---------| |---------|
| App. I/F| | App. I/F|
|---------| |---------|
| TCP | | TCP |
|---------| _________ _________ |---------|
| | | IP | | IP | | |
| | | over | | over | | |
| | | ATM | | ATM | | |
| | |---------| |---------| | |
| IP | | IP |AAL5| |AAL5 | IP| | IP |
| | | |----| _________ |-----| | | |
| | | | ATM| | ATM | | ATM | | | |
|---------| |----|----| |---------| |-----|---| |---------|
| PL | | PL | PL | | PL | PL | | PL | PL| | PL |
--------- --------- --------- --------- --------
| | | | | | | |
--------- ------ ------- ---------
-------->----------
| |
----- ABR Loop -<--
------------>--------------->--------------------
| |
-----<---------TCP Control Loop -<---------------
Figure 3. Simulation model
The propagation delay (D1 + D2 + D3) and the transmission delay
corresponding to the finite transmission rate in the IP cloud are
simulated in the TCP/IP stack. In the model described below, the
delay associated with each of the clouds is 3 ms (D1 = D2 = D3 =
3 ms). The performance metrics considered are the goodput at the TCP
layer (R) and the delay (D) for a successful transfer of a 100 KB
file, where D is the duration between the transmission of the first
byte and the receipt of an ack for the last byte. The goodput at the
TCP layer is defined as the rate at which data is successfully
delivered to the application layer from the TCP layer. Data that is
retransmitted is counted only once, when it is delivered to the
application layer. Packets that arrive out of sequence have to wait
until all the preceding data is received before they can be delivered
to the application layer.
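The two metrics can be summarized as follows (a sketch of the
accounting only, with hypothetical helper names rather than the
simulator's actual instrumentation):

   def goodput_mbps(unique_bytes_delivered, elapsed_seconds):
       """R: rate at which data is delivered, in order, to the
       application layer; retransmitted data is counted only once."""
       return unique_bytes_delivered * 8 / elapsed_seconds / 1e6

   def file_transfer_delay(t_first_byte_sent, t_last_ack_received):
       """D: time from sending the first byte of the 100 KB file to
       receiving the ack for its last byte."""
       return t_last_ack_received - t_first_byte_sent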
Explicit Rate Switch Scheme: The switch keeps a flag for every VC
which indicates whether the VC is active. The switch maintains a
timer for each VC. Every time a cell comes in on a particular VC, the
flag is turned on (if it is off) and the corresponding timer is
started (if the flag is already on, the timer is restarted). The flag
is only turned off when the timer expires. Thus, the VC is considered
active when the flag is on and inactive when the flag is off. The
explicit rate timer value (ERT) is a parameter that can be adjusted.
For our simulation we set the timer value to 1 ms; this corresponds
to a little more than the transmission time of a single TCP packet
(at line rate). The explicit rate for each VC is simply the total
link bandwidth divided equally among all active VCs. Thus, if a VC is
inactive for a duration longer than the timer value, its bandwidth is
reallocated to the other active VCs.
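A minimal sketch of this per-VC bookkeeping is shown below. It tracks
activity with per-VC timestamps instead of explicit flags and timers,
and it leaves out RM-cell processing; it is an illustration of the
scheme described above, not the simulated switch code.

   LINK_BANDWIDTH = 155.52e6    # bit/s, OC-3 bottleneck link
   ERT = 0.001                  # explicit rate timer value, 1 ms

   class ExplicitRateSwitch:
       """Tracks which VCs are active and shares the link equally."""

       def __init__(self):
           self.last_cell_time = {}   # VC id -> last cell arrival time

       def on_cell(self, vc, now):
           # Equivalent to turning the flag on and (re)starting the
           # per-VC timer.
           self.last_cell_time[vc] = now

       def active_vcs(self, now):
           # A VC is active if a cell arrived within the last ERT
           # seconds (i.e., its timer has not expired).
           return [vc for vc, t in self.last_cell_time.items()
                   if now - t < ERT]

       def explicit_rate(self, now):
           # Total link bandwidth divided equally among active VCs.
           n = len(self.active_vcs(now))
           return LINK_BANDWIDTH / n if n else LINK_BANDWIDTH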
3.1 Simulation Results
In this section we provide some initial results that help us
understand the issues discussed in the previous sections. The
simulation parameters are the following:
For TCP with Fast Retransmission and Recovery:
   Timer Granularity = 100 ms
   MSS = 9180 bytes
   Round Trip Prop. Delay = 18 ms = 2*(D1 + D2 + D3)
   Max. RTO = 500 ms
   Initial RTO = 100 ms
   Max. Window Size = 655360 bytes (> Bw * RTT)
For ABR:
   Round Trip Prop. Delay = 6 ms = 2*(D2)
   Buffer sizes = variable
   EFCI Thresholds = configurable parameter
   Nrm = 32
   RDF = RIF = 1/32
   All link rates = 155 Mb/s
   Explicit Rate Timer (ERT) = 1 ms
The application is modeled to have an infinite amount of data.
The performance metrics shown are the goodput at the TCP layer (R)
and the delay (D) for a successful transfer of a 100 KB file within a
contiguous data stream, where D is the duration between the
transmission of the first byte and the receipt of an ack for the last
byte.
Note: All results are with a TCP timer granularity of 100 ms.
Table 1: EB = 500, SB = 1000, TCP with FRR
Edge-buffer size = 500 cells,
Switch-buffer size = 1000 cells,
ABR-EFCI threshold = 250,
UBR-EPD threshold = 700, Exp. Rate Timer (ERT) = 1ms.
____________________________________________________________
| UBR UBR + EPD ABR-EFCI ABR-ER |
|------------------------------------------------------------|
| Delay (D) 520 ms 521 ms 460 ms 625 ms |
| Goodput(R) 81.06 Mbps 87.12 Mbps 77.77 Mbps 58.95 Mbps |
------------------------------------------------------------
In the case of UBR and UBR+EPD, the switch buffer is fully utilized
but the maximum occupancy of the edge buffers is about the size of
one TCP packet. Thus, cells are lost only in the switch buffer in the
case of UBR. ABR with EFCI reduces the cell loss in the switch but
results in increased cell loss in the edge buffer. ABR with ER
eliminates cell loss completely from the switch buffer but results in
more cell loss in the edge device. Thus, in the case shown, ER
results in the lowest throughput and the longest file transfer delay.
For the sake of comparison, in Table 2 (below) we show the results
for TCP without fast retransmission and recovery, which were obtained
earlier.
Table 2: EB = 500, SB = 1000, No Fast Retransmit and Recovery
Edge-buffer size = 500 cells,
Switch-buffer size = 1000 cells,
ABR-EFCI threshold = 250, UBR-EPD threshold = 700.
________________________________________________
| UBR UBR+EPD ABR-EFCI |
|------------------------------------------------|
| Delay (D) 201 ms 139 ms 226 ms |
| Goodput(R) 63.94 Mbps 86.39 Mbps 54.16 Mbps |
------------------------------------------------
The set of results (Table 1) with the enhanced TCP shows an overall
improvement in throughput, but an increase in the file transfer
delay. In the presence of fast retransmit and recovery, a lost
packet is detected with the receipt of duplicate acknowledgments, and
TCP, instead of reducing its window size to one packet, merely
reduces it to half its current size. This increases the total amount
of data that is transmitted by TCP. In the absence of FRR, the TCP
sources are silent for long periods of time while TCP recovers from a
packet loss through timeout. The increase in throughput in the case
of TCP with FRR can be mainly attributed to the larger amount of data
transmitted by the TCP source. However, that also increases the
number of packets that are dropped, causing an increase in the
average file transfer delay. (The file transfer delay is the duration
between the transmission of the first byte of the file and the
receipt of the acknowledgment for the last byte of the file.)
In both sets of results we see that UBR with Early Packet Discard
results in better performance. While the ER scheme (Table 1) results
in zero cell loss within the ATM network, this does not translate
into better end-to-end performance for the TCP connections. The
limited buffer sizes at the edge of the network result in poor
effective throughput, due to both increased retransmissions and the
poor interaction between the TCP window and the ABR allowed cell rate
(ACR).
All the earlier results assume limited buffers in the edge as well as
the switch. We also performed simulations with larger buffers at the
edge device and in the switch. When the edge device buffer size is
increased, the end-to-end throughput is improved, though the delay
might still be high, as shown in the following tables.
Table 3: EB = 5000, SB = 1000 with FRR
Edge-buffer size = 5000 cells,
Switch-buffer size = 1000 cells,
ABR-EFCI threshold = 250, ER Timer (ERT) = 1ms
___________________________________________
| ABR with EFCI ABR with ER |
|-------------------------------------------|
| Delay (D) 355 ms 233 ms |
| Goodput (R) 89.8 Mbps 112.6 Mbps |
-------------------------------------------
Table 4: EB = 10000, SB = 1000 with FRR
Edge-buffer size = 10000 cells,
Switch-buffer size = 1000 cells,
ABR-EFCI threshold = 250, ER Timer (ERT) = 1ms
___________________________________________
| ABR with EFCI ABR with ER |
|-------------------------------------------|
| Delay (D) 355 ms 256 ms |
| Goodput(R) 89.8 Mbps 119.6 Mbps |
-------------------------------------------
The maximum TCP window size is approximately 13600 cells (655360
bytes). With larger buffers, the edge buffer is large enough to hold
most of the TCP window. However, we still get low throughput for ABR
with EFCI because of cell loss within the switch. ABR with ER, on the
other hand, pushes the congestion into the edge device, which can
handle it with larger buffers. Cell loss occurs only when the edge
buffer overflows, which is a very infrequent event because of the
size of the buffer. Since the TCP window dynamics are actually
triggered by loss of packets, there is less interaction here between
the TCP flow control and the ATM flow control mechanism, in contrast
to the earlier results (Table 1). The only factor that influences the
TCP flow control, in the absence of lost packets, is the total
round-trip time, which in this case slowly increases as the buffers
fill up, thereby increasing the end-to-end delay.
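To give a feel for the size of that effect (a rough estimate only,
assuming the 10000-cell edge buffer of Table 4 drains at the full
OC-3 line rate; at a lower ACR the added delay is proportionally
larger):

   EDGE_BUFFER_CELLS = 10000        # edge buffer size from Table 4
   CELL_BITS = 53 * 8
   LINK_RATE_BPS = 155.52e6         # best case: draining at line rate

   added_delay_s = EDGE_BUFFER_CELLS * CELL_BITS / LINK_RATE_BPS
   print("~%.0f ms of added queueing delay" % (added_delay_s * 1e3))
   # -> roughly 27 ms on top of the 18 ms round-trip propagation delay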
From the results it is evident that it is not sufficient to keep the
ATM subnetwork free of congestion to provide good end-to-end
throughput and delay performance. Our approach to solving this
problem is to couple the two flow control loops. The information
received from the network, along with the local queue sizes, is used
to detect congestion early and speed up the TCP slow-down and
recovery processes. In the next section we present an edge device
congestion handling mechanism that results in significant improvement
in the end-to-end performance with ABR-ER and small buffers at the
edge device.
4. Edge Device Congestion Handling Mechanism
The Packet Discard Algorithm, which couples ABR flow control with TCP
window flow control, has the objectives of relieving congestion as
well as conveying congestion information to the TCP source as soon as
possible. In addition, the algorithm needs to take into consideration
how the TCP source can recover from the lost packet without a
significant loss of throughput. Figure 4 gives a state diagram for
the algorithm, followed by pseudocode and an explanation of the
algorithm. Currently we are working on an enhancement of this
algorithm that is more sensitive to the explicit rate feedback and
can lead to further improvement in the performance.
4.1 Adaptive Discard Algorithm
                          _________________
                   ---<--|     Phase 0     |--<----
                   |      -----------------       |
  (ACR < f*LD_ACR  |                              |  Queue < LT
  and Queue >= LT) v                              ^  and
         or        |                              |  P_CTR = 0
     Queue = MQS   |                              |
                   |      _________________       |
                   --->--|     Phase 1     |-->----
                          -----------------

               Figure 4. Algorithm State Diagram
Variables:
   ACR    = Allowed Cell Rate (the current cell transmission rate).
   ER     = Explicit Rate, the maximum cell transmission rate allowed
            by the network, carried in received RM cells; it is the
            ceiling for ACR.
   CI     = Congestion Indication received in RM cells from the
            network.
   LD_ACR = ACR when a packet was last dropped, or when ACR is
            increased. LD_ACR is always greater than or equal to ACR.
   QL     = Queue Length, the number of packets in the VC queue.
   LT     = Low Queue Threshold.
   MQS    = Maximum Queue Size.
   P_CTR  = Stores the number of packets in the VC queue when a
            packet is dropped; it is then decremented for every
            packet that is removed from the VC queue.
   Congestion Phase = one bit of information on the phase of the
            congestion (0 or 1).
   f      = Factor that determines the significance of a rate
            reduction, with 0 < f < 1.

Pseudocode:

   if (Congestion Phase = 0){
       if ((ACR < f*LD_ACR and QL >= LT) or (QL >= MQS)){
           Drop packet from the front of the queue;
           Congestion Phase = 1;
           LD_ACR = ACR;
           P_CTR = QL;
       }
       if (ACR > LD_ACR)
           LD_ACR = ACR;
   }
   if (Congestion Phase = 1){
       if (QL >= MQS){
           Drop packet from the front of the queue;
           LD_ACR = ACR;
           P_CTR = QL;
       }
       if (QL < LT and P_CTR = 0)
           Congestion Phase = 0;
   }
   Decrement P_CTR when a packet is serviced;
   P_CTR = Max(P_CTR, 0);
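For readers who prefer an executable form, the pseudocode can be
transcribed roughly as follows. This is a sketch only: it assumes the
per-VC queue is a Python collections.deque so that the front drop is
a popleft(), and it leaves open when the check is invoked (e.g., on
each packet arrival or on each ACR update).

   class AdaptiveDiscard:
       """Per-VC adaptive discard state (see the pseudocode above)."""

       def __init__(self, low_threshold, max_queue_size, f=0.5):
           self.LT = low_threshold     # low queue threshold, packets
           self.MQS = max_queue_size   # maximum queue size, packets
           self.f = f                  # rate-reduction factor, 0<f<1
           self.phase = 0              # congestion phase (0 or 1)
           self.ld_acr = 0.0           # ACR at last drop or increase
           self.p_ctr = 0

       def check(self, queue, acr):
           """Run the discard decision for one VC.  'queue' is the
           per-VC packet queue (e.g., a collections.deque), 'acr' the
           current allowed cell rate."""
           ql = len(queue)
           if self.phase == 0:
               if (acr < self.f * self.ld_acr and ql >= self.LT) \
                       or ql >= self.MQS:
                   queue.popleft()        # drop from the front
                   self.phase = 1
                   self.ld_acr = acr
                   self.p_ctr = ql
               if acr > self.ld_acr:
                   self.ld_acr = acr
           else:                          # Phase 1
               if ql >= self.MQS:
                   queue.popleft()        # drop on buffer overflow
                   self.ld_acr = acr
                   self.p_ctr = ql
               if ql < self.LT and self.p_ctr == 0:
                   self.phase = 0

       def on_packet_serviced(self):
           # Decrement P_CTR (never below zero) as packets leave the
           # VC queue.
           self.p_ctr = max(self.p_ctr - 1, 0)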
Description: On receipt of RM cells with feedback information, the
ACR value is updated for the particular VC. When the ACR is reduced
drastically, we would like to convey this information to the TCP
source as soon as possible. The above algorithm uses this feedback
information and the current queue length to decide whether a packet
should be dropped from the VC. This coupling is a critical factor in
resolving the issues discussed in earlier sections. Every time a
packet is dropped, the algorithm updates the LD_ACR (Last Drop ACR)
value and uses it as the reference rate against which it compares new
ACR values. LD_ACR is increased and made equal to the ACR whenever
the ACR is larger than its current value. Thus, LD_ACR is always
greater than or equal to the ACR. The algorithm consists of two
phases. The criterion used to trigger a packet drop is different in
the two phases.
Phase 0: The network is assumed not to be congested in this phase.
This is reflected both by an ACR that is changing slowly and by a
queue length less than the maximum queue size (MQS). Two possible
scenarios can cause a packet drop. If the ACR is constant or slowly
changing, eventually the TCP window, and hence the input rate to the
queue, will become large enough to cause the queue length to reach
the MQS level. Then it is necessary to drop a packet to trigger a
reduction in the TCP window. In the second scenario, a packet is
dropped when there is a drastic reduction in the ER available to the
VC and the queue length is greater than the low queue threshold (LT).
The threshold should be set to at least a few packets to ensure the
transmission of duplicate acks, and should not be set too close to
the size of the queue, to preserve the congestion avoidance function.
A drastic reduction in the ACR value signifies congestion in the
network, and the algorithm sends an early warning to the TCP source
by dropping a packet.
Front Drop: In both of the above cases the packet is dropped from the
front of the queue. This results in early triggering of the
congestion control mechanism in TCP, as has also been observed by
others [16]. Additionally, TCP window flow control has the property
that the start of the sliding window aligns itself with the dropped
packet and stays there until that packet is successfully
retransmitted. This reduces the amount of data that the TCP source
can pump into a congested network. In implementations of TCP with the
fast retransmit and recovery options, the recovery from the lost
packet is faster, since at least one buffer's worth of data is
transmitted after the lost packet (which in turn generates the
duplicate acknowledgments required for fast retransmit and recovery).
Transition to Phase 1: When TCP detects a lost packet, depending on
the implementation, it either reduces its window size to one packet
or reduces the window size to half its current size. When multiple
packets are lost within the same TCP window, different TCP
implementations recover differently. The first packet that is dropped
causes the reduction in the TCP window and hence in its average rate.
Multiple packet losses within the same TCP window cause a degradation
of throughput and are not desirable, irrespective of the TCP
implementation. After the first packet is dropped, the algorithm
makes a transition to Phase 1 and does not drop any packets to signal
rate reductions in this phase.
Phase 1: In this phase the algorithm does not drop packets to convey
a reduction of the ACR. Packets are dropped only when the queue
reaches the MQS value. The packets are dropped from the front for the
same reasons as described in Phase 0. When a packet is dropped, the
algorithm records the number of packets in the queue in the variable
P_CTR. The TCP window size is at least as large as P_CTR when a
packet is dropped. Thus, the algorithm tries not to drop any more
packets due to rate reduction until P_CTR packets have been serviced.
Transition to Phase 0: If the ACR stays at the value that caused the
transition to Phase 1, the queue length will decrease after one
round-trip time and the algorithm can return to Phase 0. If the ACR
decreases further, the algorithm eventually drops another packet when
the queue length reaches the MQS, but does not transition back to
Phase 0. The transition to Phase 0 takes place when at least P_CTR
packets have been serviced and the queue length recedes below the LT
threshold.
4.2 Simulation Results
The initial set of results served to identify the problem and to
motivate the edge device congestion handling mechanism. In this
section we discuss some of the simulation results with the use of the
algorithm described above. In order to get a better understanding of
the behavior of the algorithm, these simulation experiments were
performed with a smaller TCP packet size of 1500 bytes. A 9180-byte
TCP packet is approximately equivalent to 192 ATM cells, which would
be very close to the threshold values of the algorithm for the small
buffer range that we are considering and would unduly influence the
results. The rest of the simulation parameters are the same as used
above, with all results using TCP with fast retransmit and recovery.
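The 192-cell figure follows from the AAL5 segmentation (a quick
check, assuming 8 bytes of LLC/SNAP encapsulation per RFC 1577 and an
8-byte AAL5 trailer; the result is 192 cells even if the overhead is
ignored):

   import math

   PACKET_BYTES = 9180      # TCP packet size used in Section 3.1
   OVERHEAD_BYTES = 8 + 8   # assumed LLC/SNAP header + AAL5 trailer
   CELL_PAYLOAD_BYTES = 48

   cells = math.ceil((PACKET_BYTES + OVERHEAD_BYTES)
                     / CELL_PAYLOAD_BYTES)
   print(cells)             # -> 192 cells, comparable to the 250 and
                            #    700 cell thresholds used above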
The results are shown for UBR and UBR with Early Packet Discard
(EPD). Along with ABR, we show results for ABR with a simple front
drop strategy in the edge device and for ABR with the Adaptive
Discard Algorithm. These are termed ABR-ER, ABR-ER with FD, and
ABR-ER with ADA in the tables below. From the results it is evident
that the end-to-end performance improves as we use intelligent packet
discard algorithms in the edge device. There is significant
improvement due to the early detection of congestion based on the
network feedback information received via ABR.
Table 5: EB = 500, SB = 1000,
TCP with Fast Retransmit and Recovery
____________________________________________
| UBR UBR with EPD |
|--------------------------------------------|
| Delay (D) 370 ms 230 ms |
| Goodput(R) 63.4 Mbps 79.1 Mbps |
--------------------------------------------
Table 6: EB = 500, SB = 1000,
TCP with Fast Retransmit and Recovery
__________________________________________________________
| ABR-ER ABR-ER with FD ABR-ER with ADA |
|----------------------------------------------------------|
| Delay (D) 715 ms 237 ms 244 ms |
| Goodput(R) 36.15 Mbps 67.9 Mbps 87.68 Mbps |
----------------------------------------------------------
Table 7: EB = 1000, SB = 1000,
TCP with Fast Retransmit and Recovery
__________________________________________________________
| ABR-ER ABR-ER with FD ABR-ER with ADA |
|----------------------------------------------------------|
| Delay (D) 310 ms 145 ms 108 ms |
| Goodput(R) 66.7 Mbps 107.7 Mbps 112.86 Mbps |
----------------------------------------------------------
The use of the algorithm leads to an increase in the TCP throughput
and a decrease in the end-to-end delays seen by the end host. The
algorithm has two effects on the TCP streams: it detects congestion
early and drops a packet as soon as it does so, and it tries to avoid
dropping additional packets. The first packet that is dropped
achieves a reduction in the TCP window size and hence in the input
rate; after that, the algorithm tries not to drop any more packets.
This keeps the TCP streams active with a higher sustained average
rate. In order to improve the ABR throughput and delay performance,
it seems important to control the TCP window dynamics and reduce the
possibility of negative interaction between the TCP flow control loop
and the explicit rate feedback loop.
5. Summary
In this document we discuss the necessity for considering end-to-end
traffic management in IP/ATM internetworks. In particular we consider
the impact of the interactions between the TCP and the ABR flow
control loops in providing end-to-end quality of service to the user.
We also discuss the trade-off in the memory requirements between the
switch and edge device based on the type of flow control employed
within the ATM subnetwork. We show that it is important to consider
the behavior of an end-to-end protocol such as TCP when selecting
traffic management mechanisms in the IP/ATM internetworking
environment.
The ABR rate control pushes the congestion out of the ATM subnetwork
and into the edge devices. We show that this can have a detrimental
effect on the end-to-end performance when the buffers at the edge
device are limited. In such cases, UBR with Early Packet Discard can
achieve better performance than ABR.
We present an algorithm for handling the congestion in the edge
device which improves the end-to-end throughput and delay
significantly when ABR is used. The algorithm couples the two flow
control loops by using the information received via the explicit rate
scheme to intelligently discard packets when there is congestion in
the edge device. In addition, the algorithm is sensitive to the TCP
flow control and recovery mechanism, to make sure that the throughput
does not suffer due to the closing of the TCP window. As presented in
this document, the algorithm is limited to cases where different TCP
streams are queued separately. As future work, the algorithm will be
extended to include shared-memory cases. More work needs to be done
to include studies with heterogeneous sources with different
round-trip delays.
References
[1] ATM Forum TM SWG "Traffic Management Specification V. 4.0"
ATM Forum, April 1996.
[2] V. Jacobson, "Modified TCP Congestion Avoidance Algorithm,"
end2end-interest mailing list, April 30, 1990.
ftp://ftp.isi.edu/end2end/end2end-interest-1990.mail.
[3] K. Fall and S. Floyd, "Comparisons of Tahoe, Reno and Sack TCP",
available from http://www-nrg.ee.lbl.gov/nrg-abstracts.html#KF95.
[4] J. C. Hoe, "Improving the Start-up behavior of Congestion Control Scheme for TCP",
SIGCOMM 96.
[5] RFC 1577, "Classical IP and ARP over ATM", December 1993.
[6] RFC 793, "Transmission Control Protocol", September 1981.
[7] RFC 1122, "Requirements for Internet Hosts - Communication Layers",
October 1989.
[8] C. Fang and A. Lin, "A Simulation Study of ABR Robustness with
Binary-Mode Switches", ATMF-95-132, Oct. 1995
[9] H. Chen, and J. Brandt, "Performance Evaluation of EFCI Switch
with Per VC Queuing", ATMF 96-0048, Feb. 1996
[10] H. Li, K. Sui, and H. Tzeng, "TCP over ATM with ABR vs. UBR+EPD"
ATMF 95-0718, June 1995
[11] R. Jain, S. Kalyanaraman, R. Goyal and S. Fahmy, "Buffer Requirements for
TCP over ABR", ATMF 96-0517, April 1996.
[12] P. Newman, "Data over ATM: Standardization of Flow Control"
SuperCon'96, Santa Clara, Jan 1996.
[13] N. Yin and S. Jagannath, "End-to-End Traffic Management in IP/ATM
Internetworks", ATM Forum 96-1406, October 1996.
[14] S. Jagannath and N. Yin, "End-to-End TCP Performance in IP/ATM Internetworks",
ATM Forum 96-1711, December 1996.
[15] A. Romanow and S. Floyd, "Dynamics of TCP Traffic over ATM
Networks", IEEE Journal on Selected Areas in Communications, vol. 13,
no. 4, pp. 633-641, May 1995.
[16] T. V. Lakshman et al., "The Drop from Front Strategy in TCP and TCP over ATM",
IEEE Infocom, 1996.
Authors' Address
S. Jagannath and N. Yin
Bay Networks Inc.
Phone: (508) 670-8888; Fax: 670-8153
Email: jagan_pop@baynetworks.com, nyin@baynetworks.com
The ION working group can be contacted through the chairs:
Andy Malis George Swallow