Internet Engineering Task Force                        TE Working Group
INTERNET-DRAFT                                             January 2000
draft-ietf-te-framework-00.txt                        Expires July 2000

Daniel O. Awduche, UUNET (MCI Worldcom)
Angela Chiu, AT&T
Anwar Elwalid, Lucent Technologies
Indra Widjaja, Fujitsu Network Communications
Xipeng Xiao, Global Crossing

A Framework for Internet Traffic Engineering

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To view the list of Internet-Draft Shadow Directories, see http://www.ietf.org/shadow.html.

Awduche/Chiu/Elwalid/Widjaja/Xiao                               [Page 1]

Abstract

This memo describes a framework for Internet traffic engineering. The framework is intended to promote better understanding of the issues surrounding traffic engineering in IP networks, and to provide a common basis for the development of traffic engineering capabilities in the Internet. The framework explores the principles, architectures, and methodologies for performance evaluation and performance optimization of operational IP networks. The optimization goals of traffic engineering seek to enhance the performance of IP traffic while utilizing network resources economically, efficiently, and reliably. The framework includes a set of generic requirements, recommendations, and options for Internet traffic engineering. The framework can serve as a guide to implementors of online and offline Internet traffic engineering mechanisms, tools, and support systems.
The framework can also help service providers in devising traffic engineering solutions for their networks.

Table of Contents

1.0 Introduction
   1.1 What is Internet Traffic Engineering?
   1.2 Scope
   1.3 Terminology
2.0 Background
   2.1 Context of Internet Traffic Engineering
   2.2 Network Context
   2.3 Problem Context
      2.3.1 Congestion and its Ramifications
   2.4 Solution Context
      2.4.1 Combating the Congestion Problem
   2.5 Implementation and Operational Context
3.0 Traffic Engineering Process Model
   3.1 Components of the Traffic Engineering Process Model
   3.2 Measurement
   3.3 Modeling and Analysis
   3.4 Optimization
4.0 Historical Review and Recent Developments
   4.1 Traffic Engineering in Classical Telephone Networks
   4.2 Evolution of Traffic Engineering in Packet Networks
      4.2.1 Adaptive Routing in ARPANET
      4.2.2 SNA Subarea
      4.2.3 Dynamic Routing in the Internet
      4.2.4 TOS Routing
      4.2.5 Equal Cost Multipath
   4.3 Overlay Model
   4.4 Constraint-Based Routing
   4.5 Overview of IETF Projects Related to Traffic Engineering
      4.5.1 Integrated Services
      4.5.2 RSVP
      4.5.3 Differentiated Services
      4.5.4 MPLS
      4.5.5 IP Performance Metrics
      4.5.6 Flow Measurement
      4.5.7 Endpoint Congestion Management
   4.6 Overview of ITU Activities Related to Traffic Engineering
5.0 Taxonomy of Traffic Engineering Systems
   5.1 Time-Dependent Versus State-Dependent
   5.2 Offline Versus Online
   5.3 Centralized Versus Distributed
   5.4 Local Versus Global
   5.5 Prescriptive Versus Descriptive
   5.6 Open-Loop Versus Closed-Loop
6.0 Requirements for Internet Traffic Engineering
   6.1 Generic Requirements
   6.2 Routing Requirements
   6.3 Measurement Requirements
   6.4 Traffic Mapping Requirements
   6.5 Network Survivability
      6.5.1 Survivability in MPLS Based Networks
   6.6 Content Distribution (Webserver) Requirements
   6.7 Offline Traffic Engineering Support Systems
7.0 Traffic Engineering in Multiclass Environments
   7.1 Traffic Engineering in Diffserv
       Environments
8.0 Traffic Engineering in Multicast Environments
9.0 Inter-Domain Considerations
10.0 Conclusion
11.0 Security Considerations
12.0 Acknowledgements
13.0 References
14.0 Authors' Addresses

1.0 Introduction

This memo describes a framework for Internet traffic engineering. The intent is to articulate the general issues, principles, and requirements for Internet traffic engineering, and, where appropriate, to provide recommendations, guidelines, and options for the development of online and offline Internet traffic engineering capabilities and support systems.

The framework can assist vendors of networking hardware and software in developing mechanisms and support systems for the Internet environment that support the traffic engineering function. The framework can also help service providers in devising and implementing traffic engineering solutions for their networks.

The framework provides a terminology for describing and understanding common Internet traffic engineering concepts. The framework also provides a taxonomy of known traffic engineering styles. In this context, a traffic engineering style abstracts important aspects from a traffic engineering methodology and expresses them in terms of a number of fundamental patterns. Traffic engineering styles can be viewed in different ways depending upon the specific context in which they are used and the specific purpose which they serve. The combination of patterns, styles, and views results in a natural taxonomy of traffic engineering systems.

Although Internet traffic engineering is most effective when applied end-to-end, the initial focus of this framework document is intra-domain traffic engineering (that is, traffic engineering within a given autonomous system).
However, because a preponderance of Internet traffic tends to be inter-domain (that is, it originates in one autonomous system and terminates in another), this document also provides an overview of some of the aspects that pertain to inter-domain traffic engineering. This draft is preliminary and will be reviewed and revised over time.

1.1 What is Internet Traffic Engineering?

Internet traffic engineering is defined as that aspect of Internet network engineering that deals with the performance evaluation and performance optimization of operational IP networks. Traffic engineering encompasses the application of technology and scientific principles to the measurement, characterization, modeling, and control of Internet traffic [AWD1, AWD2].

A major objective of Internet traffic engineering is to enhance the performance of an operational network, at both the traffic and resource levels. This is accomplished by addressing traffic oriented performance requirements while utilizing network resources efficiently, reliably, and economically. Traffic oriented performance measures include delay, delay variation, packet loss, and goodput.

A significant, but subtle, practical advantage of applying traffic engineering concepts to operational networks is that it helps to identify and structure goals and priorities in terms of enhancing the quality of service delivered to end-users of network services, and in terms of measuring and analyzing the achievement of these goals.

The optimization aspects of traffic engineering can be achieved through routing control, traffic management, and resource management functions. Network resources of interest include link bandwidth, buffer space, and computational resources. The optimization objectives of Internet traffic engineering should be viewed as a continual and iterative process of network performance improvement, rather than as a one-time goal.
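As an illustration, the traffic oriented performance measures named above (delay, delay variation, packet loss, and goodput) can be computed from per-packet send/receive records. The record format and the numbers below are hypothetical, and "delay variation" is taken here simply as the span of one-way delays; per-packet jitter definitions are also in common use:

```python
# Hypothetical per-packet records for one traffic stream:
# (send_time_s, recv_time_s or None if lost, payload_bytes).
records = [
    (0.00, 0.050, 1500),
    (0.01, 0.065, 1500),
    (0.02, None,  1500),   # lost packet
    (0.03, 0.070, 1500),
]

delivered = [(s, r, b) for (s, r, b) in records if r is not None]
delays = [r - s for (s, r, _) in delivered]

delay_avg = sum(delays) / len(delays)
# Delay variation taken as the spread of one-way delays.
delay_var = max(delays) - min(delays)
loss_rate = 1 - len(delivered) / len(records)
# Goodput: delivered payload bits over the active interval of the stream.
interval = max(r for (_, r, _) in delivered) - min(s for (s, _, _) in records)
goodput_bps = 8 * sum(b for (_, _, b) in delivered) / interval

print(loss_rate)   # 0.25
```

In an operational network these records would come from measurement infrastructure rather than literals, but the same aggregation applies.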
The optimization objectives of Internet traffic engineering may change over time as new requirements are imposed, as new technologies emerge, or as new insights are brought to bear on the underlying problems. Moreover, different networks may have different optimization objectives, depending upon their business models, capabilities, and operating constraints. Regardless of the specific optimization goals that prevail in any particular environment, for practical purposes the optimization aspects of traffic engineering are ultimately concerned with network control. Consequently, the optimization aspects of traffic engineering can be viewed from a control perspective.

The control dimension of Internet traffic engineering can be proactive or reactive. In the latter case, the control system responds to events that have already occurred in the network. In the former, the system takes preventive action to obviate predicted unfavorable network states.

The control dimension of Internet traffic engineering responds at multiple levels of temporal resolution to network events. The capacity planning functions respond at a very coarse temporal level. The routing control functions operate at intermediate levels of temporal resolution. Finally, the packet level processing functions (e.g., rate shaping, queue management, and scheduling) operate at very fine levels of temporal resolution, responding to the real-time statistical characteristics of traffic.

The subsystems of Internet traffic engineering control include routing control, traffic control, and resource control (including control of service policies at network elements). Inputs into the control system include network state variables, policy variables, and decision variables.
Traffic engineering constructs must be sufficiently specific and well defined to address known requirements, but at the same time must be flexible and extensible to accommodate unforeseen future demands. A major challenge in Internet traffic engineering is the realization of automated control capabilities that adapt quickly to significant changes in network state while maintaining stability.

1.2 Scope

The scope of this document is intra-domain traffic engineering; that is, traffic engineering within a given autonomous system in the Internet. The framework will discuss concepts pertaining to intra-domain traffic control, including such issues as routing control, micro and macro resource allocation, and the control coordination problems that arise consequently. This document will describe and characterize techniques already in use or in advanced development for Internet TE, indicate how they fit together, and identify scenarios in which they are useful.

Although the emphasis is on intra-domain traffic engineering, Section 9.0 provides an overview of the high level considerations pertaining to inter-domain traffic engineering. Inter-domain Internet traffic engineering is crucial to the performance enhancement of the global Internet infrastructure.

Whenever possible, relevant requirements from existing IETF documents and other sources will be incorporated by reference.

1.3 Terminology

- Baseline analysis: A study conducted to serve as a baseline for comparison to the actual behavior of the network.

- Busy hour: A one-hour period within a specified interval of time (typically 24 hours) in which the traffic load in a network or subnetwork is greatest.

- Congestion: A state of a network resource in which the traffic incident on the resource exceeds its output capacity over an interval of time.
- Congestion avoidance: An approach to congestion management that attempts to obviate the occurrence of congestion.

- Congestion control: An approach to congestion management that attempts to remedy congestion problems that have already occurred.

- Constraint-based routing: A class of routing protocols that take specified traffic attributes, network constraints, and policy constraints into account in making routing decisions. Constraint-based routing is applicable to traffic aggregates as well as flows. It is a generalization of QoS routing.

- Demand side congestion management: A congestion management scheme that addresses congestion problems by regulating or conditioning offered load.

- Effective bandwidth: The minimum amount of bandwidth that can be assigned to a flow or traffic aggregate in order to deliver 'acceptable service quality' to the flow or traffic aggregate.

- Egress traffic: Traffic exiting a network or network element.

- Ingress traffic: Traffic entering a network or network element.

- Interdomain traffic: Traffic that originates in one autonomous system and terminates in another.

- Loss network: A network that does not provide adequate buffering for traffic, so that traffic entering a busy resource within the network will be dropped rather than queued.

- Network survivability: The capability of promptly recovering from a network failure and maintaining the required QoS for existing services.

- Offline traffic engineering: A traffic engineering system that exists outside of the network.

- Online traffic engineering: A traffic engineering system that exists within the network, typically implemented on or as adjuncts to operational network elements.

- Performance measures: Metrics that provide quantitative or qualitative measures of the performance of systems or subsystems of interest.
- Performance management: [WORK IN PROGRESS] A systematic approach to improving effectiveness in the accomplishment of specific networking goals related to performance improvement.

- Provisioning: The process of assigning or configuring network resources to meet certain requests.

- QoS routing: A class of routing systems that selects paths to be used by a flow based on the QoS requirements of the flow.

- Service Level Agreement: A contract between a provider and a customer that guarantees specific levels of performance and reliability at a certain cost.

- Stability: An operational state in which a network does not oscillate in a disruptive and continuous manner from one mode to another mode.

- Supply side congestion management: A congestion management scheme that provisions additional network resources to address existing and/or anticipated congestion problems.

- Transit traffic: Traffic whose origin and destination are both outside of the network under consideration.

- Traffic characteristic: A description of the temporal behavior of a given traffic flow or traffic aggregate.

- Traffic flow: [WORK IN PROGRESS] A stream of packets between two end-points that can be characterized in a certain way. A micro-flow has a more specific definition: a stream of packets with a bounded inter-arrival time and with the same source address, destination address, and port numbers.

- Traffic intensity: A measure of traffic loading with respect to a resource capacity over a specified period of time. In classical telephony systems, traffic intensity is measured in Erlangs.

- Traffic matrix: A representation of the traffic demand between a set of origin and destination abstract nodes. An abstract node can consist of one or more network elements.
- Traffic monitoring: The process of observing traffic flows at a certain point in a network and collecting the flow information for further analysis and action.

- Traffic trunk: An aggregation of traffic flows belonging to the same class that are forwarded through a common path. A traffic trunk may be characterized by an ingress and egress node, and by a set of attributes which determine its behavioral characteristics and requirements from the network.

2.0 Background

The Internet is evolving rapidly into a very critical communications infrastructure, supporting significant economic, educational, and social activities. Consequently, optimizing the performance of large scale IP networks, especially public Internet backbones, has become an important but challenging problem. Network performance requirements are multidimensional, complex, and sometimes contradictory, making the traffic engineering problem all the more challenging.

The network must convey IP packets from ingress nodes to egress nodes efficiently, expeditiously, reliably, and economically. Furthermore, in a multiclass service environment (e.g., Diffserv capable networks), the resource sharing parameters of the network must be appropriately determined and configured according to prevailing policies and service models to resolve resource contention issues arising from mutual interference between different packets traversing the network. Especially in multiclass environments, consideration must be given to resolving competition for network resources between traffic streams belonging to the same service class (intra-class contention resolution) and between traffic streams belonging to different classes (inter-class contention resolution).
2.1 Context of Internet Traffic Engineering

The context of Internet traffic engineering includes:

-1- a network context -- network characteristics, network structure, policies, constraints, quality attributes, optimization criteria
-2- a problem context -- problem identification and representation
-3- a solution context -- analysis, evaluation of alternatives
-4- an implementation context

2.2 Network Context

At the most basic level of abstraction, an IP network can be represented as: (1) a constrained system consisting of a set of interconnected resources which provide transport services for IP traffic, (2) a demand system representing the offered load to be transported through the network, and (3) a response system consisting of network processes, protocols, and related mechanisms which facilitate the movement of traffic through the network.

The network elements and resources may have specific characteristics which restrict the way in which they handle the demand. Additionally, network resources may be equipped with traffic control mechanisms which support regulation of the way in which they handle the demand. Traffic control mechanisms may also be used to control various packet processing activities within a resource, to arbitrate contention for access to the resource by different packets, and to regulate traffic behavior through the resource. A configuration management system may allow the settings of the traffic control mechanisms to be manipulated to control or constrain the way in which a network element responds to internal and external stimuli.

The details of how the network provides transport services for packets are specified in the policies of the network administrators and are installed through network configuration management systems.
Generally, the types of services provided by the network depend upon the characteristics of the network resources, the prevailing policies, and the ability of the network administrators to translate policies into network configurations.

There are two significant characteristics of contemporary Internet networks: (1) they provide real-time services and (2) their operating environment is very dynamic. The dynamic characteristics of IP networks can be attributed to fluctuations in demand, to the interaction of various network protocols and processes, and to transient and persistent impairments that occur within the system.

As packets are conveyed through the network, they contend for the use of network resources. If the arrival rate of packets exceeds the output capacity of a network resource over an interval of time, the resource is said to be congested, and some of the arriving packets may be dropped as a result. Congestion also increases transit delays and delay variation, and reduces the predictability of network service delivery.

A basic economic premise for packet switched networks in general, and the Internet in particular, is the efficient sharing of network resources by multiple traffic streams. One of the fundamental challenges in operating a network, especially large scale public IP networks, is the need to increase the efficiency of resource utilization while minimizing the possibility of congestion.

In practice, a particular set of packets may have specific delivery requirements which may be specified explicitly or implicitly.
Two of the most important traffic delivery requirements are (1) capacity constraints, which can be expressed as peak rates, mean rates, burst sizes, or as some notion of effective bandwidth, and (2) QoS constraints, which can be expressed in terms of packet loss and timing restrictions for the delivery of each packet and of consecutive packets belonging to the same traffic stream. Packets may also be grouped into classes, in such a way that each class may have a common set of behavioral characteristics and a common set of delivery requirements.

2.3 Problem Context

There are a number of fundamental problems associated with the operation of a network described by the simple model of the previous subsection. The present subsection reviews the problem context with regard to the traffic engineering function.

One problem concerns how to identify, abstract, represent, and measure the features of the network that are relevant to traffic engineering. Another problem concerns how to measure and estimate relevant network state parameters. Effective traffic engineering relies on a good estimate of the offered traffic load as well as a network-wide view of the underlying topology, which is a must for offline planning.

Still another problem concerns how to characterize the state of the network and how to evaluate its performance under a variety of scenarios. There are two aspects to the performance analysis problem. One aspect relates to the evaluation of the system level performance of the network. The other aspect relates to the evaluation of the resource level performance, which restricts attention to the performance evaluation of individual network resources. In this memo, we shall refer to the system level characteristics of the network as the "macro-states" and the resource level characteristics as the "micro-states."
Likewise, we shall refer to the traffic engineering schemes that deal with network performance optimization at the system level as macro-TE, and to the schemes that optimize at the individual resource level as micro-TE. In general, depending upon the particular performance measures of interest, the system level performance can be derived from the resource level performance results using appropriate rules of composition.

Yet another fundamental problem concerns how to optimize the performance of the network. Performance optimization may entail some degree of resource management control, routing control, and/or capacity augmentation.

2.3.1 Congestion and its Ramifications

Congestion is one of the most significant problems in an operational context. A network element is said to be congested if it experiences sustained overload over an interval of time. Almost invariably, congestion results in degradation of service quality to end users. Congestion control policies can include: (1) restricting access to the congested resource, (2) regulating demand dynamically so that the overload situation disappears, (3) re-allocating network resources by redistributing traffic over the infrastructure, and (4) expanding or augmenting network capacity. In this memo, the emphasis is mainly on congestion issues that can be addressed within the scope of the network, rather than on congestion management systems that depend on the responsiveness of end-systems.

2.4 Solution Context

The solution context for Internet traffic engineering involves analysis, evaluation of alternatives, and choice between alternative courses of action. Generally, the solution context is predicated on making reasonable inferences about the current or future state of the network, and then making an appropriate choice between alternatives. More specifically, the solution context demands good estimates of traffic workload, characterization of network state, and a set of control actions.
Control actions may involve manipulation of parameters associated with the routing function, control over tactical capacity acquisition, and control over the traffic management functions. The following is a subset of the instruments used:

(1) A collection of online and offline tools and mechanisms for measurement, characterization, modeling, and control of Internet traffic.

(2) A set of policies, objectives, and requirements (which may be context dependent) for network performance evaluation and for performance optimization and control.

(3) A set of constraints on the operating environment, the network protocols, and the traffic engineering system itself.

(4) A configuration management system, which may include a configuration control subsystem and a configuration auditing subsystem.

Traffic estimates can be derived from customer subscriptions, traffic projections, or actual measurements. In order to obtain a measured traffic matrix at various levels of detail, the basic measurement may be done at the flow level or on small traffic aggregates at the edge, where traffic enters and leaves the network [FGLR]. A flow consists of a set of packets that match in all the main IP and TCP/UDP header fields, such as source and destination IP addresses, protocol, port numbers, and TOS bits, and that arrive close in time. Each measurement record includes information about the traffic end points, the IP and TCP/UDP header fields, the number of packets and bytes in the flow, and the start and finish time of the flow.

In order to conduct performance studies and planning on current or future networks, a routing module is needed to determine the path(s) chosen by the routing protocols for each traffic demand, and the load imparted on each link as the traffic flows through the network.
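The core of such a routing module can be sketched in a few lines: given a topology annotated with IGP metrics and a traffic matrix of origin-destination demands, compute a shortest path per demand and accumulate the load on each link. This is a minimal sketch only; the topology, node names, demand values, and the single-path assumption (no ECMP splitting, no multi-homing or LSP multiplexing) are illustrative, not taken from the draft:

```python
import heapq
from collections import defaultdict

def shortest_path(graph, src, dst):
    """Dijkstra over IGP link metrics; returns one shortest path as a
    node sequence (ECMP splitting is ignored in this sketch)."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    seen = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, metric in graph[u]:
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def link_loads(graph, traffic_matrix):
    """Impose each origin-destination demand on the links of its shortest
    path and accumulate the load per directed link."""
    loads = defaultdict(float)
    for (src, dst), demand in traffic_matrix.items():
        path = shortest_path(graph, src, dst)
        for u, v in zip(path, path[1:]):
            loads[(u, v)] += demand
    return dict(loads)

# Hypothetical four-node topology: graph[u] = [(neighbor, IGP metric), ...]
graph = {
    "A": [("B", 1), ("C", 2)],
    "B": [("A", 1), ("D", 1)],
    "C": [("A", 2), ("D", 1)],
    "D": [("B", 1), ("C", 1)],
}
# Demands (e.g., aggregated from flow-level measurement records), in Mb/s.
tm = {("A", "D"): 30.0, ("C", "B"): 10.0}
print(link_loads(graph, tm))
```

A production routing module would additionally model ECMP splitting and the layer-2/LSP multiplexing described below, but the load-accumulation step is the same.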
The routing module needs to capture the selection of shortest paths to/from multi-homed customers and peers, the splitting of traffic across multiple shortest-path routes, and the multiplexing of layer-3 links over layer-2 trunks, as well as any traffic trunks that are carried on LSPs. A topology model can be obtained by extracting information from router configuration files and forwarding tables, or from a topology server that monitors the network state.

Routing can be controlled at various levels of abstraction, e.g., by manipulating BGP attributes, manipulating IGP metrics, or manipulating the traffic engineering parameters for path oriented technologies such as MPLS. Within the context of MPLS, the path of an explicit LSP can be computed and established in various ways: (1) manually, (2) automatically online, using constraint-based routing processes implemented on label switching routers, or (3) offline, using a constraint-based traffic engineering support system.

2.4.1 Combating the Congestion Problem

Minimizing congestion is a significant aspect of traffic engineering. This subsection gives an overview of the general approaches that have been used to combat congestion problems. Congestion management policies can be categorized based on the following criteria (see [YaRe95] for a more detailed taxonomy of congestion control schemes): (1) response time scale, which can be characterized as long, medium, or short; (2) reactive versus preventive, which relates to congestion control and congestion avoidance; and (3) supply side versus demand side congestion mitigation schemes.

1. Response time scale

- Long (weeks to months): capacity planning works over a relatively long time scale to build up network capacity based on some forecast of traffic demand and distribution.
Since router and link provisioning take time and are in general expensive, these upgrades are carried out on a weeks-to-months time scale.

- Medium (hours to days): several control policies fall into this category: (1) adjusting IGP and/or BGP parameters to route traffic away from or towards certain segments of the network; (2) setting up and/or adjusting some Explicitly-Routed Label Switched Paths (ER-LSPs) to route some traffic trunks away from the shortest paths that are causing congestion; (3) adjusting the logical topology to match the traffic distribution using some lower layer technologies, e.g., MPLS LSPs and ATM PVCs. All these schemes rely on a network management system to monitor changes in traffic distribution and feed them into an offline or online traffic engineering tool that triggers certain actions on the network, often on an hours-to-days time scale. The tool can be either centralized or distributed. A centralized tool can potentially produce more optimal solutions, but it is in general more costly and may not be as robust as a distributed one if the information used by the centralized tool does not reflect the actual state of the network.

- Short (packet level to several round trip times): routers can perform active buffer management to control congestion and/or signal congestion to end systems so that they slow down. One of the most popular schemes is Random Early Detection (RED) [FlJa93]. The main goal of RED is to provide congestion avoidance by controlling the average queue size. During congestion (before the queue is filled), arriving packets are chosen to be "marked" according to a probabilistic algorithm and the level of the average queue size. A router that is not Explicit Congestion Notification (ECN) [Floy94] capable can simply drop the "marked" packets as an indication of congestion to the end systems; otherwise, the router can set the ECN field in the packet header.
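The marking decision at the heart of RED can be sketched as follows. This is a simplified sketch: the exponentially weighted average queue size drives a linear marking probability between two thresholds, while refinements in [FlJa93] (the count-based spacing of marks and the idle-time adjustment of the average) are omitted, and the parameter values are illustrative:

```python
import random

class RedQueue:
    """Sketch of RED's marking decision (queue service is not simulated).
    Parameter names follow the RED literature; values are illustrative."""

    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002):
        self.min_th = min_th     # below this average queue size: never mark
        self.max_th = max_th     # at or above this average: always mark
        self.max_p = max_p       # marking probability as avg approaches max_th
        self.weight = weight     # EWMA gain for the average queue size
        self.avg = 0.0

    def on_arrival(self, queue_len):
        """Update the average queue size on a packet arrival and decide
        whether to mark the packet (set ECN) or, on a non-ECN-capable
        router, drop it."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        if self.avg < self.min_th:
            return False
        if self.avg >= self.max_th:
            return True
        # Linear ramp of the marking probability between the two thresholds.
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < p

red = RedQueue()
# With an empty queue the average stays below min_th, so no packet is marked.
assert red.on_arrival(0) is False
```

Because the decision is driven by the smoothed average rather than the instantaneous queue, transient bursts pass unmarked while persistent queue buildup is signaled early.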
Variations of RED, e.g., RED with In and Out (RIO) and Weighted RED, were developed for packets in different classes and with different dropping precedence levels [RFC-2597]. Besides providing congestion avoidance, compared with traditional Tail-Drop (TD) (i.e., dropping arriving packets only when the queue is full), RED can also avoid global synchronization and improve fairness among different TCP connections.

However, RED by itself cannot prevent congestion and unfairness caused by unresponsive sources, e.g., UDP connections, or by misbehaving greedy connections. Other schemes were proposed to improve performance and fairness in the presence of unresponsive connections. They often need to be used in conjunction with per-connection (i.e., per-flow) fair queueing or accounting, which is not available in many vendors' products, especially for core routers. Two such schemes are Longest Queue Drop (LQD) and Dynamic Soft Partitioning with Random Drop (RND) [SLDC98]. Given that per-connection queueing is used with fair queueing, both schemes push out the front packet of some connection whose current occupancy exceeds its allocation. They differ in how the connection for which the push-out is done is chosen: LQD chooses the connection with the largest difference between its queue length and its allocation, while RND selects a connection randomly amongst the connections whose queue length exceeds their allocation. RND reduces the amount of bursty loss, which may cause severe performance degradation in some TCP implementations. Dropping packets from the front can trigger TCP's fast recovery feature sooner and hence increase throughput [LNO96].

2. Reactive versus preventive

- Reactive (recovery): reactive policies are those that react to existing congestion for improvement.
All the policies described for the long and medium time scales above can be categorized as reactive if they are based on monitoring existing congestion and deciding on actions to ease or remove the congestion points.

- Preventive (predictive/avoidance): preventive policies are those that take actions based on estimates of potential future congestion, before congestion actually occurs. The policies described for the long and medium time scales need not be based on monitoring existing congestion. Instead, they can take into account forecasts of future traffic demand and distribution, and decide on actions to prevent potential congestion. The schemes described for the short time scale, e.g., RED and its variations, ECN, LQD, and RND, are also used for congestion avoidance, since dropping or marking packets as an early congestion notification before queues actually overflow triggers the corresponding TCP sources to slow down.

3. Supply side versus demand side

- Supply side: supply side policies are those that seek to increase the effective capacity available to traffic by keeping the network relatively balanced. For example, capacity planning aims to provide a physical topology that matches the estimated traffic volume and distribution based on forecasting, given a fixed capacity buildup budget. However, if the actual traffic distribution does not match the one used by capacity planning due to forecasting errors, then adjusting the logical topology over the fixed physical topology using lower layer technologies (e.g., MPLS and ATM), or adjusting IGP and/or BGP parameters to redistribute traffic, can further improve load balancing in the network.

- Demand side: demand side policies are those that seek to control the offered traffic.
For example, the short time scale policies described above, e.g., RED and its variations, ECN, LQD, and RND, send early congestion notifications to sources, triggering the corresponding TCP sources to slow down.

In summary, depending on the conditions of the network, network operators can use congestion policies from these various categories in combination to achieve maximum control.

2.5 Implementation and Operational Context

[WORK IN PROGRESS]

3.0 Traffic Engineering Process Model(s)

This section describes a process model that captures the high level aspects of Internet traffic engineering in an operational context. The process model will be described in terms of a sequence of actions that a traffic engineer or a traffic engineering control system goes through in order to optimize the performance of an operational network (see also [AWD1, AWD2]).

The first phase of the process model is to define the relevant control policies that govern the operation of the network. These policies may depend on the prevailing business model, the network cost structure, the operating constraints, and a number of optimization criteria.

The second phase of the process model is a feedback process which involves acquiring measurement data from the operational network. If empirical data is not readily available from the network, then synthetic workloads may be used instead, which reflect the workload or expected workload of the network.

The third phase of the process model is concerned with the analysis of network state and the characterization of traffic workload. Performance analysis may be proactive or reactive. Proactive performance analysis identifies potential problems that do not yet exist, but that may manifest at some point in the future.
Reactive performance analysis identifies existing problems, determines their causes, and if necessary evaluates alternative approaches to remedy them. Various quantitative and qualitative techniques may be used in the analysis process, including modeling and simulation. The analysis phase of the process model may involve the following actions: (1) investigate the concentration and distribution of traffic across the network or relevant subsets of the network, (2) identify existing or potential bottlenecks, and (3) identify network pathologies such as single points of failure. Network pathologies may result from a number of factors, such as an inferior network architecture, configuration problems, and others. A traffic matrix may be constructed as part of the analysis process. Network analysis may be descriptive or prescriptive.

The fourth phase of the process model is concerned with the performance optimization of the network. The performance optimization phase generally involves a decision process which selects and implements a particular set of actions from among alternatives. Optimization actions may include the use of appropriate techniques to control the distribution of traffic across the network. Optimization actions may also involve increasing link capacity, deploying additional hardware such as routers and switches, or adjusting parameters associated with routing, such as IGP metrics, in a systematic way. Network performance optimization may also involve starting a network planning process to improve the network architecture in order to accommodate current and future growth.

3.1 Components of the Traffic Engineering Process Model

[WORK IN PROGRESS]

3.2 Measurement

[WORK IN PROGRESS]

Measurement is fundamental to the traffic engineering function.
3.3 Modeling and Analysis

Modeling and analysis are an important aspect of Internet traffic engineering. Modeling involves the characterization of traffic and the construction of an appropriate network model that succinctly captures the performance measures of interest. Accurate source traffic models are needed for modeling and analysis. A major research topic in Internet traffic engineering is the development of traffic source models that are consistent with empirical data obtained from operational networks, and that are tractable and amenable to analysis. The topic of source models for IP traffic is a research topic and therefore outside the scope of this document; nonetheless, its importance cannot be over-emphasized.

A network model is an abstraction of the operational network which captures its important features, such as its operational characteristics and its link and nodal attributes. A network model should also facilitate analysis and simulation, for the purposes of predicting performance in various operational environments and guiding network expansion.

A network simulation tool is also useful for a TE system. A network simulator can be used to visualize network conditions. A network simulation tool can also show congestion spots and provide hints towards possible solutions. The simulator can further be used to verify that a planned solution is indeed effective and without undesired side effects. For network planning, the simulator may reveal single points of failure, the facilities that need capacity most, and the unused links. Proper actions can then be taken based on the simulation results.

3.4 Optimization

Network optimization is an iterative process of improving the architecture and performance of a network. Each iteration consists of a real-time optimization sub-process and a network planning sub-process.
The difference between real-time optimization and network planning lies largely in the relative time scale and the granularity of actions. The real-time optimization sub-process controls traffic distribution in an existing network infrastructure so as to avoid or relieve congestion, to assure the delivery of QoS, and to optimize resource utilization. Real-time optimization is needed because, no matter how well a network is designed, uncontrollable incidents such as fiber cuts or shifts in traffic demand can cause congestion or other problems in a network. Real-time optimization must solve such problems on a small time scale, such as minutes or hours. Examples of real-time optimization include IGP/BGP metric tuning and using MPLS Explicitly-Routed Label Switched Paths (ER-LSPs) to change the paths of some traffic trunks [XIAO].

The network planning sub-process decides and changes the topology and capacity of a network in a systematic way. When there is a problem in the network, real-time optimization must provide an immediate fix. Because of the time constraint, the solution may not be optimal. Network planning may then be needed to correct such sub-optimality. Network planning is also needed because network traffic grows and traffic distribution changes over time. The topology and capacity of the network therefore need to be changed accordingly. Examples of network planning include the systematic change of IGP metrics, deploying new routers/switches, and adding/removing links in the network.

Network planning and real-time optimization are mutually complementary -- a well-planned network makes real-time optimization easier, while the optimization process provides statistics and other feedback to facilitate future planning.

4.0 Historical Review and Recent Developments

In this section, we present various traffic engineering approaches that have been proposed and implemented in telecommunication and computer networks.
The discussion is not meant to be exhaustive; it is primarily intended to illuminate various different perspectives.

4.1 Traffic Engineering in Classical Telephone Networks

It is useful to begin with a review of traffic engineering in telephone networks, which often relates to the means by which user traffic is steered from the source to the destination. The early telephone network relied on static hierarchical routing, whereby routing patterns remained fixed independent of the state of the network or the time of day. The hierarchy was intended to accommodate overflow traffic, improve network reliability via alternate routes, and prevent call looping through strict hierarchy rules. The network was typically over-provisioned, since a given fixed route had to be dimensioned so that it could carry user traffic during a busy hour of any busy day.

Hierarchical routing in the telephony network was found to be too rigid with the advent of digital switches and stored program control, which were able to manage more complicated traffic engineering rules. Dynamic routing was introduced to alleviate the routing inflexibility of static hierarchical routing so that the network would operate more efficiently, resulting in significant economic gains [HuSS87]. Dynamic routing typically reduces the overall loss probability by 10 to 20 percent as compared to static hierarchical routing. Dynamic routing can also improve network resilience by recalculating routes on a per-call basis and periodically updating routes.

There are two types of dynamic routing in the telephone network: time-dependent routing and state-dependent routing. In time-dependent routing, the regular variations in traffic loads due to time of day and season are exploited in pre-planned routing tables.
In state-dependent routing, routing tables are updated online in accordance with the current state of the network (e.g., traffic demand, utilization, etc.).

Dynamic non-hierarchical routing (DNHR) is an example of dynamic routing that was introduced in the AT&T toll network in the 1980's to respond to time-dependent information such as regular load variations as a function of time. Time-dependent information in terms of load may be divided into three time scales: hourly, weekly, and yearly. Correspondingly, three algorithms are defined to pre-plan the routing tables. The network design algorithm operates over a year-long interval, while the demand servicing algorithm operates on a weekly basis to fine tune link sizes and routing tables to correct yearly forecast errors. At the smallest time scale, the routing algorithm is used to make limited adjustments based on daily traffic variations. Network design and demand servicing are computed using offline calculations. Typically, the calculations require extensive searches over possible routes. On the other hand, routing may need online calculations to handle crankback. DNHR adopts a "two-link" approach whereby a path can consist of at most two links. The routing algorithm presents an ordered list of route choices between an originating switch and a terminating switch. If a call overflows, a via switch (a tandem exchange between the originating switch and the terminating switch) sends a crankback signal to the originating switch, which then selects the next route, and so on, until no alternative routes are available, at which point the call is blocked.

4.2 Evolution of Traffic Engineering in Packet Networks

[WORK IN PROGRESS]

This section reviews related work aimed at improving the performance of data networks. Indeed, optimization of the performance of data networks started in the early days of the ARPANET.
Other commercial networks, such as SNA, also recognized the importance of providing SLAs. Besides performance optimization, another objective of Internet traffic engineering is to improve the reliability of data forwarding. Some believe in providing reliability in packet networks that is comparable to that of traditional telephone networks.

4.2.1 Adaptive Routing in ARPANET

The early ARPANET recognized the importance of adaptive routing, where routing decisions were based on the current state of the network [McQl]. Each packet was forwarded to its destination along the path for which the total estimated transit time was the smallest. Each node maintained a table of network delays, where the delay is the estimated delay that a packet would experience along the path toward its destination. The minimum delay table was periodically transmitted by a node to its neighbors. The shortest path, in terms of hop count, was also propagated to give the connectivity information. A drawback of this approach is that dynamic link metrics tend to create "traffic magnets," whereby congestion is shifted from one location of the network to another, essentially creating oscillation.

4.2.2 SNA Subarea

[WORK IN PROGRESS]

4.2.2 Dynamic Routing in the Internet

Today, the Internet adopts dynamic routing algorithms with distributed control to determine the paths that packets should take to their respective destinations. The path taken by a packet follows a shortest path, where the path cost is defined as the sum of the link metrics. In principle, a link metric can be based on static or dynamic quantities. In the static case, the link metric may be assigned in inverse proportion to the link capacity. In the dynamic case, the link metric may be a function of some congestion measure, such as delay or packet loss.
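The shortest-path computation just described, with path cost defined as the sum of the link metrics, can be sketched with a textbook Dijkstra calculation, as used by link-state IGPs. This is an illustration only; the topology, node names, and metric values below are hypothetical.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra's algorithm: the cost of a path is the sum of its
    link metrics; returns (path, cost). Assumes dst is reachable."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    visited = set()
    while pq:
        d, node = heapq.heappop(pq)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            break
        for nbr, metric in graph.get(node, {}).items():
            nd = d + metric
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(pq, (nd, nbr))
    # Walk the predecessor chain back from dst to reconstruct the path.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path)), dist[dst]

# Hypothetical topology; static metrics loosely set in inverse
# proportion to link capacity, as discussed in the text.
topo = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
```

With these metrics, traffic from A to D follows A-B-C-D (cost 3) rather than the shorter-hop A-C-D (cost 6), illustrating how metric assignment, not hop count, determines the forwarding path.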
It was recognized early that static link metric assignment is inadequate, as it can easily lead to unfavorable scenarios in which some links are congested while others are lightly loaded. One reason for the inadequacy is that link metric assignment is often done without considering the traffic matrix of the network. Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to one or more reasons such as:

- Some resources are not turned up where they were originally planned.
- Forecasting errors in traffic volume and/or traffic distribution.
- Dynamics in the traffic matrix due to the temporal nature of traffic patterns, BGP policy changes from peers, etc.

4.2.3 ToS Routing

In ToS-based routing, different routes to the same destination may be selected depending on the Type-of-Service (ToS) field of an IP packet [RFC-1349]. The ToS classes may be classified as low delay and high throughput. Each link is associated with multiple link costs, with each link cost corresponding to a particular ToS. A separate shortest path algorithm is run to give a shortest path tree for each ToS. Since the shortest path algorithm has to be run for each ToS, the computation cost of this approach may be prohibitively expensive. ToS-based routing has also become outdated, as the ToS field has been replaced by the DS field. Another, more technical, issue is that it is difficult to engineer traffic with ToS-based routing, as each class still relies on shortest path routing.

4.2.4 Equal Cost Multipath

Equal Cost MultiPath (ECMP) is another technique that tries to address a deficiency in Shortest Path First (SPF) routing [RFC-2178]. In SPF, when two or more paths to a given destination have the same cost, the algorithm chooses one of them.
In ECMP, the algorithm distributes the traffic equally among the multiple paths having the same cost. Traffic distribution across the equal-cost paths is usually done in one of two ways: 1) packet-based, in a round-robin fashion, or 2) flow-based, using hashing on source and destination IP addresses. Approach 1) can easily cause out-of-order packets, while approach 2) depends on the number and distribution of flows. Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and heterogeneous (i.e., hashing may not be uniform), but it is generally effective in a core network where the number of flows is very large. Because link costs are static, ECMP distributes the traffic equally among the equal-cost paths independent of their congestion status. As a result, given two equal-cost paths, it is possible that one of the paths ends up being more congested than the other. Another drawback of ECMP is that load sharing is not done across multiple paths having non-identical, albeit similar, costs.

4.3 Overlay Model

In the overlay model, a virtual-circuit network such as ATM or frame relay provides virtual-circuit connectivity among routers that are located at the edges. In this model, two routers see only a direct link between them, independent of the physical route taken by the virtual circuit connecting them. Thus, the overlay model essentially decouples the logical topology that routers see from the physical topology that the ATM or frame relay networks manage. The overlay model enables the network operator to perform traffic engineering by re-configuring the virtual circuits so that a virtual circuit on a congested physical path can be re-routed to a less congested one. The overlay model requires the management of two separate networks (e.g., IP and ATM), which results in increased operational cost.
In the full-meshed overlay model, each router peers with every other router in the network. This imposes a scalability limit on IGP peering relationships.

4.4 Constraint-Based Routing

Constraint-based routing pertains to a class of routing systems that compute routes through a network satisfying a set of requirements, subject to a set of constraints imposed by both the network and administrative policies. Constraints may include bandwidth, delay, and policy instruments such as resource class attributes [RFC-]. The concept of constraint-based routing in IP networks was first defined in [.] within the context of MPLS traffic engineering requirements.

4.5 Overview of IETF Projects Related to Traffic Engineering

This subsection reviews a number of IETF activities that are pertinent to Internet traffic engineering.

4.5.1 Integrated Services

The IETF has developed the integrated services model, which requires resources such as bandwidth and buffers to be reserved a priori for a given traffic flow to ensure that the requested quality of service is satisfied. The integrated services model requires components beyond those used in the best-effort model, such as packet classifiers, packet schedulers, and admission control. A packet classifier is used to identify flows that are to receive a certain level of service. A packet scheduler handles the service of different packet flows to ensure that QoS commitments are met. Admission control is used to determine whether a router has the necessary resources to accept a new flow. Two services have been defined: guaranteed service [RFC-2212] and controlled-load service [RFC-2211]. The guaranteed service can be used for applications that require real-time delivery. For this type of application, data that is delivered to the application after a certain time is generally considered worthless.
Thus, guaranteed service has been designed to provide a firm bound on the end-to-end packet delay for a flow. The controlled-load service can be used for adaptive applications that can tolerate some delay but are sensitive to traffic overload conditions. Applications of this type typically perform satisfactorily when the network is lightly loaded but degrade significantly when the network is heavily loaded. Thus, controlled-load service has been designed to provide approximately the same service as best-effort service in a lightly loaded network, regardless of actual network conditions. Controlled-load service is described qualitatively in that no target values for delay or loss are specified.

4.5.2 RSVP

RSVP was originally invented as a signaling protocol for applications to reserve resources [RFC-2205]. The sender sends a PATH message to the receiver, specifying the characteristics of the traffic. Every intermediate router along the path forwards the PATH message to the next hop determined by the routing protocol. Upon receiving a PATH message, the receiver responds with a RESV message to request resources for the flow. Every intermediate router along the path can reject or accept the request of the RESV message. If the request is rejected, the router sends an error message to the receiver, and the signaling process terminates. If the request is accepted, link bandwidth and buffer space are allocated for the flow, and the related flow state information is installed in the router. Recently, RSVP has been modified and extended in several ways to reserve resources for aggregates of flows, to set up MPLS explicit routes, etc.

4.5.3 Differentiated Services

The essence of Differentiated Services (Diffserv) is to divide traffic into different classes and treat them differently, especially during times when there is a shortage of resources such as link bandwidth and buffer space [RFC-2475].
Diffserv defines the Differentiated Services field (DS field, formerly known as the TOS octet) and uses it to indicate the forwarding treatment a packet should receive [RFC-2474]. Diffserv also standardizes a number of Per-Hop Behavior (PHB) groups. Using different classification, policing, shaping, and scheduling rules, several classes of service can be provided.

In order for a customer to receive Differentiated Services from its Internet Service Provider (ISP), it must have a Service Level Agreement (SLA) with its ISP. Customers can mark the DS fields of their packets to indicate the desired service, or have them marked by the ISP edge routers based on packet classification. An SLA may explicitly or implicitly specify a Traffic Conditioning Agreement (TCA), which defines classifier rules as well as metering, marking, discarding, and shaping rules. At the ingress of the ISP network, packets are classified, policed, and possibly shaped or discarded. When a packet crosses domain boundaries, its DS field may be re-marked, as determined by the SLA between the two domains.

In Differentiated Services, there are only a limited number of service classes indicated by the DS field. Since resources are allocated on a per-class basis, the amount of state information is proportional to the number of classes rather than the number of application flows.

4.5.4 MPLS

MPLS is an advanced forwarding scheme. It extends routing with respect to packet forwarding and path control [RoVC]. Each MPLS packet has a header. In a non-ATM/FR environment, the header contains a 20-bit label, a 3-bit experimental field (formerly known as the Class-of-Service or CoS field), a 1-bit label stack indicator, and an 8-bit TTL field. In an ATM (FR) environment, the header contains only a label encoded in the VCI/VPI (DLCI) field.
An MPLS-capable router, termed a Label Switching Router (LSR), examines the label, and possibly the experimental field, when forwarding a packet. At the ingress LSRs of an MPLS-capable domain, IP packets are classified and routed based on a combination of the information carried in the IP headers of the packets and the local routing information maintained by the LSRs. An MPLS header is then inserted into each packet. Within an MPLS-capable domain, an LSR uses the label as an index into its forwarding table. The packet is processed as specified by the forwarding table entry. The incoming label is replaced by the outgoing label, and the packet is switched to the next LSR. This label-switching process is very similar to ATM's VCI/VPI processing. Before a packet leaves an MPLS domain, its MPLS header is removed. The paths between the ingress LSRs and the egress LSRs are called Label Switched Paths (LSPs). MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs. MPLS enables traffic engineering through the use of ER-LSPs.

4.5.5 IP Performance Metrics

The IPPM WG has been developing a set of standard metrics that can be applied to the quality, performance, and reliability of Internet services by network operators, end users, or independent testing groups [RFC2330], so that users and service providers have an accurate common understanding of the performance and reliability of the Internet component 'clouds' that they use or provide. Examples of performance metrics are one-way packet loss [RFC2680], one-way delay [RFC2679], and a measure of connectivity between two hosts [RFC2678]. Other second-order measures of packet loss and delay are also being considered.
Performance metrics are useful for specifying Service Level Agreements (SLAs), which are sets of service level objectives negotiated between users and service providers, where each objective is a combination of one or more performance metrics to which constraints are applied.

4.5.6 Flow Measurement

A flow measurement system enables a network's traffic flows to be measured and analyzed. A traffic flow is defined as a stream of packets between two end points with a given level of granularity. The RTFM WG has produced an architecture that defines a method to specify traffic flows, and a number of components (meters, meter readers, managers) to measure them [RFC-2722]. A meter observes packets passing through a measurement point, classifies them into certain groups, and accumulates certain data, such as the number of packets and bytes for each group. A meter reader gathers usage data from various meters so that the data can be made available for analysis. A manager is responsible for configuring and controlling meters and meter readers.

4.5.7 Endpoint Congestion Management

The work on endpoint congestion management is intended to catalog a set of congestion control mechanisms that transport protocols can use, and to develop a unified congestion control mechanism across a subset of an endpoint's active unicast connections, called a congestion group. A congestion manager continuously monitors the state of the path for each congestion group under its control, and uses that information to instruct a scheduler how to partition bandwidth among the connections of that congestion group.

4.6 Overview of ITU Activities Related to Traffic Engineering

This section provides an overview of prior work within the ITU-T pertaining to traffic engineering in traditional telecommunications networks.
ITU-T Recommendations E.600 [itu-e600], E.701 [itu-e701], and E.801 [itu-e801] address traffic engineering issues in traditional telecommunications networks. Recommendation E.600 provides a vocabulary for describing traffic engineering concepts, while E.701 defines reference connections, Grade of Service (GoS), and traffic parameters for ISDN. Recommendation E.701 uses the concept of a reference connection to identify representative cases of different types of connections without describing the specifics of their actual realizations by different physical means. As defined in Recommendation E.600, "a connection is an association of resources providing means for communication between two or more devices in, or attached to, a telecommunication network." Also, E.600 defines "a resource as any set of physically or conceptually identifiable entities within a telecommunication network, the use of which can be unambiguously determined" [itu-e600]. There can be different types of connections, as the number and types of resources in a connection may vary.

Typically, different network segments are involved in the path of a connection. For example, a connection may be local, national, or international. The purpose of reference connections is to clarify and specify traffic performance issues at the various interfaces between different network domains. Each domain may consist of one or more service provider networks.

Reference connections provide a basis for defining grade of service (GoS) parameters related to traffic engineering within the ITU-T framework. As defined in E.600, "GoS refers to a number of traffic engineering variables which are used to provide a measure of the adequacy of a group of resources under specified conditions." These GoS variables may be probability of loss, dial tone delay, etc. They are essential for network internal design and operation, as well as for component performance specification.

GoS is different from quality of service (QoS).
QoS is the performance perceivable by a user of a telecommunication service, and expresses the user's degree of satisfaction with the service. Thus, GoS is a set of network-oriented measures which characterize the adequacy of a group of resources under specified conditions, while QoS parameters focus on performance aspects which are observable at the service access points and network interfaces, rather than on their causes within the network. For a network to be effective in serving its users, the values of both the GoS and QoS parameters must be related, with GoS parameters typically making a major contribution to the QoS.

To assist the network provider in the goal of improving the efficiency and effectiveness of the network, E.600 stipulates that a set of GoS parameters must be selected and defined on an end-to-end basis for each major service category provided by a network. Based on a selected set of reference connections, suitable target values are then assigned to the selected GoS parameters under normal and high load conditions. These end-to-end GoS target values are then apportioned to the individual resource components of the reference connections for dimensioning purposes.

5.0 Taxonomy of Traffic Engineering Systems

A taxonomy of traffic engineering systems can be constructed based on the classification system shown below:

- Time-dependent vs. state-dependent
- Offline vs. online
- Centralized vs. distributed
- Local vs. global information
- Prescriptive vs. descriptive
- Open-loop vs. closed-loop

In the following subsections, these classifications are described in greater detail.

5.1 Time-Dependent Versus State-Dependent

TE algorithms can be classified into two basic types: time-dependent and state-dependent. In this framework, all TE schemes are considered to be dynamic. Static TE implies that no traffic engineering algorithm is being applied.
In time-dependent TE, historical information based on seasonal variations in traffic is used to pre-program routing plans. Additionally, customer subscription or traffic projection may be used. Pre-programmed routing plans typically change on a relatively long time scale (e.g., diurnal). Time-dependent algorithms make no attempt to adapt to random variations in traffic or changing network conditions. An example of a time-dependent algorithm is the global centralized optimizer described in [MR99], where the inputs to the system are a traffic matrix and multiclass QoS requirements.

State-dependent (or adaptive) TE adapts the routing plans for packets based on the current state of the network. The current state of the network provides additional information on random variations in actual traffic (i.e., perturbations from regular variations) that could not be predicted using historical information. An example of state-dependent TE that operates on a relatively long time scale is constraint-based routing; examples that operate on a relatively short time scale are the load-balancing algorithms described in [OMP] and [MATE].

The state of the network can be based on various parameters such as utilization, packet delay, packet loss, etc. These parameters can in turn be obtained in several ways. For example, each router may distribute these parameters to other routers, either periodically or by means of some kind of trigger. An alternative approach is for a router that wants to perform adaptive TE to send probe packets along a path to gather the state of that path. Because of the dynamic nature of network conditions, expeditious and accurate gathering of state information is typically critical to adaptive TE. State-dependent algorithms may be applied to increase network efficiency and resilience.
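As an illustration of a simple state-dependent decision, the following sketch (hypothetical, and not part of this framework; the function names, link names, and utilization figures are illustrative assumptions) selects among candidate paths using measured link utilizations, minimizing the bottleneck utilization:

```python
# Hypothetical sketch of a state-dependent TE decision: among candidate
# paths, choose the one whose most heavily utilized link is least loaded
# ("minimize the maximum utilization"). Link names and measured
# utilizations are illustrative only.

def max_utilization(path, link_util):
    """Bottleneck utilization of a path, given measured link utilizations."""
    return max(link_util[link] for link in path)

def select_path(candidate_paths, link_util):
    """Pick the candidate path with the lowest bottleneck utilization."""
    return min(candidate_paths, key=lambda p: max_utilization(p, link_util))

# Example: two candidate paths between the same ingress/egress pair.
link_util = {"A-B": 0.9, "B-D": 0.4, "A-C": 0.5, "C-D": 0.6}
paths = [("A-B", "B-D"), ("A-C", "C-D")]
print(select_path(paths, link_util))  # selects the path via C (bottleneck 0.6)
```

A time-dependent scheme would instead look up a pre-programmed plan for the current time interval, ignoring the measured utilizations.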
While time-dependent algorithms are more suitable for predictable traffic variations, state-dependent algorithms are more suitable for adapting to random fluctuations in traffic.

5.2 Offline Versus Online

Traffic engineering requires the computation of routing plans. The computation itself may be performed offline or online. When the routing plans do not have to be known in real time, the computation can be done offline. For example, routing plans computed from forecast information may be computed offline. Offline computation is also typically used to perform extensive searches over multi-dimensional solution spaces. Online computation is required when the routing plans must adapt to changing network conditions, as in state-dependent algorithms. Unlike offline computation, which can be computationally demanding, online computation is geared toward simple calculations that fine-tune the allocation of resources, for example for load balancing.

5.3 Centralized Versus Distributed

With centralized control, a central authority determines routing plans on behalf of each router. The central authority collects network-state information from all routers and returns routing information to the routers periodically. The routing update cycle is a critical parameter that directly impacts the performance of the network being controlled. Centralized control requires high processing power and high-bandwidth control channels.

With distributed control, route selection is determined by each router autonomously based on the state of the network. The network state may be obtained by the router using some probing method, or distributed by other routers on a periodic basis.

5.4 Local Versus Global

TE algorithms may require local or global network-state information.
Note that the scope of the network-state information does not necessarily determine the scope of the optimization. In other words, it is possible for a TE algorithm to perform global optimization based on local state information. Similarly, a TE algorithm may arrive at a local optimum even if it relies on global state information.

Global information pertains to the state of the entire domain being traffic engineered. Examples include the traffic matrix and the loading information on each link. Global state information is typically required with centralized control. In some cases, distributed TE systems may also need global information. Local information pertains to the state of a portion of the domain. Examples include the bandwidth and packet loss rate of a particular path. Local state information may be sufficient for distributed TE systems.

5.5 Prescriptive Versus Descriptive

Prescriptive traffic engineering evaluates alternatives and recommends a course of action. Descriptive traffic engineering characterizes the state of the network and assesses the impact of various policies without recommending any particular course of action.

5.6 Open-Loop Versus Closed-Loop

In open-loop control, the control action does not use feedback information from the current network state. The control action may, however, use its own local information for accounting purposes. In closed-loop control, the control action utilizes feedback information from the network state. The feedback information may be in the form of historical information or current measurements.

6.0 Requirements for Internet Traffic Engineering

This section describes the requirements and recommendations for traffic engineering in the Internet.
[Note: 1) Minimizing the maximum utilization is probably most important in more-meshed topologies. 2) 45 ms or less reroute times around facility and equipment failures.]

6.1 Generic Requirements

Usability: As a general principle, ISPs prefer a TE system that can readily be deployed in their existing networks. The system SHOULD also be easy to operate and maintain.

Scalability: ISP networks are growing fast with respect to both network size and traffic volume. A TE system SHOULD remain functional as the number of routers and links in the network increases. Moreover, traffic growth SHOULD NOT cause problems for the TE system. These requirements imply that a TE system SHOULD have a scalable architecture, SHOULD NOT have high CPU and memory utilization, and SHOULD NOT consume excessive link bandwidth to collect and distribute statistics and to exert control.

Stability: Network stability is critical for ISPs. Stability is generally guaranteed in networks with static provisioning. A TE system is intended to improve network efficiency and reduce congestion by introducing dynamic control over the network. However, a poorly designed TE system may cause instability in the network, which in turn makes performance unpredictable. If a TE system adapts to prevailing network conditions because of changes in traffic patterns or topology, the system MUST guarantee convergence to a steady state.

Flexibility: A TE system SHOULD provide sufficient configurable functions so that an ISP can tailor a particular configuration to a particular environment. Both online and offline TE subsystems SHOULD be supported so that an ISP can enable or disable each subsystem independently. Per-CoS TE SHOULD also be supported, although it may be disabled in a best-effort-only environment.
Accountability: A TE system SHOULD be able to collect and analyze statistics to show how well the network is functioning. End-to-end traffic matrices, link utilization, latency, packet loss, etc. can be used as indications of the condition of the network.

Simplicity: [WORK IN PROGRESS] With progress in high-speed routing/switching and DWDM technologies, the cost per unit of bandwidth is dropping quickly. Over-provisioning of networks has become more affordable, and thus more common. This fact argues for simple TE systems that are suboptimal rather than complex TE systems that are optimal. Simplicity and effectiveness are essential for the success of any TE system. Moreover, simplicity is essential for the usability and scalability of a TE system and for the stability of the network.

Tractability: A TE system SHOULD have the ability to monitor the status of its state and operation. Examples of such status include the routes on the shortest paths, existing and planned LSPs, etc.

Congestion management: It is generally desirable to minimize the maximum resource utilization per service in an operational network.

Survivability: In certain network contexts, it may be desirable to have restoration and/or protection schemes that provide 45 ms or less reroute times around facility and equipment failures.

Extendibility: [WORK IN PROGRESS]

Monitoribility: [WORK IN PROGRESS]

6.1.1 Stability Considerations

Stability is a very important consideration in traffic engineering systems that respond to the state of the network. State-dependent dynamic traffic engineering methodologies typically mandate a tradeoff between responsiveness and stability. It is strongly recommended that when tradeoffs are warranted between performance and stability, the tradeoff should be made in favor of stability (especially in public IP backbone networks).
6.2 Routing Requirements

In general, in order to avoid congestion and increase link efficiency, constraint-based path selection SHOULD be supported. Moreover, if the network supports multiple classes of service, the TE system SHOULD be able to select different paths for different classes of traffic from the same source to the same destination.

The IGP is an important component of a TE system. In order for a TE system to be effective, a link state IGP is highly desirable. To perform constraint-based path selection, routers MUST have topology information for the network (or area/level), and a link state protocol provides such information to all routers. In addition, the IGP SHOULD be able to carry link attributes such as reservable bandwidth and affinity. However, the IGP SHOULD NOT flood information too frequently or consume too much link bandwidth in propagating such information. The IGP in the TE system MUST be stable. It SHOULD converge quickly in the case of a network topology change, and SHOULD NOT introduce traffic oscillation. The IGP SHOULD also be economical with respect to CPU and memory utilization.

6.3 Measurement Requirements

Statistics collection (from traffic measurement, customer subscriptions, or traffic projections) and analysis are essential for traffic engineering purposes. Therefore, a TE system MUST provide mechanisms for collecting statistics and, optionally, analyzing them. These mechanisms MUST NOT affect the accuracy of the statistics collected, and MUST scale to large ISP networks.

Traffic statistics may be classified based on time scales. At the long-term time scale, they SHOULD reflect seasonality (e.g., hourly, daily, weekly, etc.). For a network supporting multiple classes of service, the traffic statistics SHOULD reflect class of service.
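As a concrete illustration of statistics keyed by time interval and class of service, long-term measurements might be organized as a traffic matrix along the following lines. This is a hypothetical sketch, not mandated by this framework; the class names, node names, and units are illustrative assumptions:

```python
# Hypothetical sketch: long-term traffic statistics organized as a
# traffic matrix indexed by time interval (season), class of service,
# and an (ingress, egress) abstract-node pair. All names are illustrative.
from collections import defaultdict

class TrafficMatrix:
    def __init__(self):
        # (interval, cos, src, dst) -> offered load in Mb/s
        self._demand = defaultdict(float)

    def record(self, interval, cos, src, dst, mbps):
        """Accumulate measured demand for one matrix element."""
        self._demand[(interval, cos, src, dst)] += mbps

    def demand(self, interval, cos, src, dst):
        """Total recorded demand for one matrix element (0.0 if none)."""
        return self._demand[(interval, cos, src, dst)]

tm = TrafficMatrix()
tm.record("weekday-busy-hour", "EF", "POP-NYC", "POP-SFO", 120.0)
tm.record("weekday-busy-hour", "EF", "POP-NYC", "POP-SFO", 30.0)
print(tm.demand("weekday-busy-hour", "EF", "POP-NYC", "POP-SFO"))  # 150.0
```

Each key corresponds to one element of a per-interval, per-class traffic matrix; an abstract node might represent a POP or a VPN site.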
Analysis MAY provide busy hour traffic statistics, traffic growth patterns, hot spots, spatial imbalances, etc. Traffic statistics SHOULD be represented as traffic matrices, which may be indexed by season (time interval) and service class. Each element of a traffic matrix is indexed by a pair of abstract nodes. As an example, an abstract node may represent a VPN site.

At the short-term time scale, traffic statistics SHOULD provide reliable estimates of the current network state. In particular, traffic statistics SHOULD reflect link or path congestion state. Examples of congestion measures include packet delay, packet loss, and utilization. Examples of mechanisms for obtaining such measures are probing and IGP flooding.

6.4 Traffic Mapping Requirements

Traffic mapping pertains to the assignment of the traffic to be engineered onto the network topology, so as to meet certain requirements and optimize resource usage. Traffic mapping can be performed by time-dependent or state-dependent mechanisms, as described in Section 5.1. A TE system SHOULD support both time-dependent and state-dependent mechanisms.

For the time-dependent mechanism:
- a TE system SHOULD maintain traffic matrices.
- a TE system SHOULD have an algorithm that generates a mapping plan for each traffic trunk.
- a TE system SHOULD be able to control the path from any source to any destination, e.g., with explicit routes.
- a TE system SHOULD be able to set up multiple paths to forward traffic from any source to any destination, and distribute the traffic among them.
- a TE system SHOULD provide a graceful migration from one mapping plan to another as the traffic matrix changes, to minimize service disruption.

For the state-dependent mechanism:
- a TE system SHOULD be able to gather and maintain link state information, for example, by using enhanced OSPF or IS-IS.
- for a given demand request, QoS requirements, and other constraints, a TE system SHOULD be able to compute and set up a path, for example, by using constraint-based routing.
- a TE system SHOULD be able to perform load balancing among multiple paths. Load balancing SHOULD NOT compromise the stability of the network.

In general, a TE system SHOULD be able to change IGP link metrics to induce traffic mapping modifications.

Experience with a traffic engineering solution may aid future network planning and growth, such as adding routers and increasing link capacities. This process is also called network planning. The time scale of network planning is weeks or months. Based on the records maintained by a TE system, recommendations may be provided in the form of future requirements.

6.5 Network Survivability

Network survivability refers to the capability of promptly recovering from a network failure while maintaining the required QoS for existing services. It has become an issue of great concern to the Internet community with the increasing demand to carry mission-critical data, real-time voice and video, and other high priority traffic over the Internet. As network technologies have advanced, failure protection and restoration capabilities have become available across multiple layers. Starting from the bottom of the layered stack, the optical layer is now capable of providing dynamic ring and mesh restoration functionality. The SONET/SDH layer provides restoration capability in most networks today with its Automatic Protection Switching (APS), self-healing ring, and mesh architectures. Similar functionality can be provided by the ATM layer (with somewhat slower restoration times). At the connectionless IP layer, rerouting is used to maintain connectivity when the routing protocol computation converges after a link or node failure, which often takes seconds to minutes.
In order to support real-time applications, path-oriented MPLS has been introduced to enhance the survivability of IP networks, potentially in a more cost-effective manner, with per-class protection/restoration capability. Recently, a common suite of MPLS protocols has been proposed, under the name Multiprotocol Lambda Switching, as signaling protocols for performing dynamic mesh restoration at the optical layer in the emerging IP over WDM architecture.

The various technologies at different layers provide failure protection and restoration capabilities at different granularities, in terms of both time scale (from 50 ms to minutes) and capacity (from the packet level to the wavelength level). They can be class-of-service aware or unaware, and each has its own pros and cons, including different spare capacity requirements. On the other hand, service outage impact varies tremendously with the time scale of the outage for different services and applications. It ranges from the smallest time scale (in ms) with minor performance hits, to the medium time scale (in seconds) with possible call drops and session time-outs, to the large time scale (in minutes to hours) with potential network congestion and social/business impacts. Therefore, combining the different restoration capabilities across layers in a coordinated manner, so that the required network survivability is maintained for the services supported, becomes a challenging task. Here we outline a set of general requirements.

- Protection/restoration capabilities from different layers should be combined to provide cost-effective network survivability at the level the services require.
- Spare capacity at an upper layer is often regarded as working traffic at a lower layer. Placing protection/restoration functions in many layers may increase redundancy and robustness, but it should not result in significant waste of network resources.
- Alarm triggering and escalation from the lowest physical layer to the higher layers should be performed in a coordinated way to avoid functionality collision and duplication. A temporal order of restoration triggering at the different layers can be imposed for effective coordination.
- Failure notification across the network should be timely and reliable.
- Monitoring capabilities for bit error rate and restoration status at the different layers should be provided.

6.5.1 Survivability in MPLS Based Networks

MPLS is an important emerging technology for enhancing IP in both features and services. Due to its path-oriented nature, MPLS can provide potentially faster and more predictable failure protection and restoration. Here, we provide an outline of its basic features and requirements in terms of failure protection and restoration (which may not include some special proprietary mechanisms). Other draft documents that address similar issues include [ACJ99], [MSOH99], and [Shew99].

Protection type:
- Link Protection: The path of the protection LSP is disjoint from that of its working LSP only at a particular link on the working LSP. Traffic on the working LSP is switched over to the protection path at the upstream LSR that connects to the failed link. This method is potentially the fastest, and can be effective in situations where certain path components are much more unreliable than others.
- Node Protection: The path of the protection LSP is disjoint from that of its working LSP at a particular node and at the links associated with that node on the working LSP. Traffic on the working LSP is switched over to the protection LSP at the upstream LSR that directly connects to the failed node.
- Path Protection: The path of the protection LSP is completely disjoint from that of its working LSP.
The advantage of this method is that the protection LSP protects the working LSP from all possible link and router failures along the path of the working LSP, except for failures of the ingress and egress LSRs. In addition, since the path selection is end-to-end, it can potentially be more efficient in resource usage than link or node protection. However, it is in general slower than link/node protection, since it takes longer for the failure notification message to reach the ingress LSR and trigger the reroute.
- Partial Protection: The path of the protection LSP is partially disjoint from its working LSP.

Protection option: Protection options can be described in general using the notation m:n, where m protection LSPs are used to protect n working LSPs. Here are some common ones, plus 1+1 protection:
- 1:1: one working LSP is protected/restored by one protection LSP;
- n:1: one working LSP is protected/restored by n protection LSPs with a configurable load splitting ratio;
- 1:n: one protection LSP is used to protect/restore n working LSPs;
- 1+1: traffic is sent on both the working LSP and the protection LSP, and the egress LSR selects one of the two copies. This option may not be common due to its inefficiency in resource usage.

Resilience Attributes:
- Basic attribute: reroute using the IGP or protection LSP(s) when a segment of the path fails, or no rerouting at all.
- Extended attributes:
1. Protection LSP establishment attribute: the protection LSP is i) pre-established, or ii) established on demand after receiving failure notification. A pre-established protection LSP enables faster restoration, while an established-on-demand protection LSP can potentially find a better path and use resources more efficiently.
2. Constraint attribute under failure conditions: the protection LSP requires certain constraint(s) to be satisfied, which can be the same as or less stringent than those under normal conditions (e.g., the bandwidth requirement); alternatively, a 0-bandwidth requirement may be used under any failure condition.
3. Protection LSP resource reservation attribute: resource allocation for a pre-established protection LSP is i) pre-reserved, or ii) reserved at setup after receiving failure notification.

Failure Notification: Failure notification SHOULD be reliable and fast, i.e., at least of the same order as IGP notification (which is performed through LSA flooding), if not faster.

6.6 Content Distribution (Webserver) Requirements

The Internet is dominated by client-server interactions, especially Web traffic. The location of major information servers has a significant impact on the traffic patterns within the Internet, and on the perception of service quality by end users. Scheduling systems that allocate servers to clients in replicated, geographically dispersed information distribution systems may require performance parameters from the network to make effective decisions.

A number of dynamic load balancing techniques have been devised to improve the performance of replicated Web servers. The impact of these techniques on ISPs is that traffic becomes more dynamic, because Web servers can be picked dynamically based on the locations of the clients and the relative performance of different networks or different parts of a network. We call this process Traffic Directing (TD). A TE system should not be too reactive to traffic shifts caused by TD; otherwise, traffic oscillation may be introduced by the interaction of TD and TE. A simple approach to dealing with the traffic shifts introduced by TD is to over-provision network capacity.

6.7 Offline Traffic Engineering Support Systems

If optimal link efficiency is desired, an offline traffic engineering support system may be used to compute the paths for the traffic trunks.
By taking all the trunk requirements, link attributes, and network topology information into consideration, an offline TE support system may be able to find better trunk placements than an online TE system, in which every router in the network separately finds paths for the trunks originating from it, based on its own information. An offline TE support system computes paths for trunks periodically, e.g., daily; the path information for the trunks is then downloaded into the routers. An online TE system is still needed so that routers can adapt to changes promptly.

7.0 Traffic Engineering in Multiclass Environments

The increasing demand to support voice, video, and mission-critical data in the Internet is calling for IP networks to differentiate traffic according to its application needs. Large numbers of flows are aggregated into a few classes based on their common performance requirements in terms of packet loss ratio, delay, and jitter. In this section, we describe features and requirements that are unique to a multiclass environment. Here, we concentrate on the most widely accepted multiclass architecture, Differentiated Services (Diffserv).

7.1 Traffic Engineering in Diffserv Environments

As Diffserv emerges as the IP backbone architecture for providing classes of service to different applications, traffic engineering in Diffserv environments is becoming increasingly important. Diffserv provides classes of service (CoS) by concatenating per-hop behaviors (PHBs) along the routing path, together with service provisioning and edge functionality including traffic classification, marking, policing, and shaping. A PHB is the forwarding behavior a packet receives at a DS node (i.e., a Diffserv-compliant node), realized by means of buffer management and packet scheduling mechanisms.
In addition to implementing proper buffer management and packet scheduling mechanisms to provide the corresponding PHBs, it may be desirable to limit the performance impact of higher priority traffic on lower priority traffic by controlling the relative percentage of higher priority traffic and/or by increasing resource capacities. A traffic trunk of a given class can be mapped onto an LSP to perform per-class traffic engineering for performance and scalability enhancements. Here, we describe requirements that are specific to CoS traffic trunks, in addition to those described for general traffic in the previous section.

- Besides the preemption attributes in the general case, an LSR SHOULD provide configurable maximum reservable bandwidth and/or buffer for each class (i.e., Ordered Aggregate in Diffserv) or class-type. A class-type is a set of classes that satisfy the following two conditions: 1) classes in the same class-type have a common aggregate maximum and/or minimum bandwidth requirement to guarantee the required performance level; 2) there is no maximum or minimum bandwidth requirement to be enforced at the level of an individual class in the class-type. One can still implement some "priority" policies for classes in the same class-type in terms of accessing the class-type bandwidth. An example of a class-type is a real-time class-type that includes both EF-based and AF1-based Ordered Aggregates. One can assign higher preemption priority to EF-based traffic trunks than to AF1-based ones, vice versa, or the same priority.
- An LSR SHOULD provide configurable minimum available bandwidth and/or buffer for each class or class-type.
- In order to perform constraint-based routing for per-class LSPs, IGPs (e.g., IS-IS and OSPF) SHOULD provide extensions to propagate per-class or per-class-type information in addition to the per-link information.
- For real-time traffic trunks with end-to-end network delay requirements, the path selection algorithm in constraint-based routing SHOULD be able to take multiple constraints into account. Candidate constraints include: 1) a bandwidth requirement, which provides low loss and a low queueing delay/jitter bound (at a high percentile); 2) an end-to-end delay requirement, when necessary (i.e., when not all paths considered by the algorithm may satisfy the end-to-end delay requirement). In practice, this can be translated into an end-to-end propagation delay requirement, which equals (maximum end-to-end network delay - end-to-end queueing delay). Calculating the end-to-end queueing delay analytically is in general difficult, first due to the lack of adequate traffic models, and second because analytical results are currently tractable only for simple traffic models such as Poisson processes and Markov-modulated processes (both remain active research areas). In practice, one can approximate the end-to-end queueing delay with (per-hop queueing delay bound * number of hops), where the per-hop queueing delay bound is obtained from the corresponding PHB characteristics.
- When an LSR dynamically adjusts resource allocation based on per-class LSP resource requests, the weight adjustment granularity of the queueing/scheduling algorithms SHOULD NOT compromise the delay/jitter properties of certain classes.
- An LSR SHOULD provide a configurable maximum allocation multiplier on a per-class basis.
- Measurement-based admission control MAY be used for better resource usage, especially for those classes without stringent loss or delay/jitter requirements. For example, an LSR MAY dynamically adjust the maximum allocation multiplier (i.e., the oversubscription/undersubscription ratio) for certain classes based on their resource utilization.

8.0 Traffic Engineering in Multicast Environments

For further study.
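The multi-constraint path selection discussed in Section 7.1, using the approximation (per-hop queueing delay bound * number of hops) for the queueing component of the delay budget, can be sketched as follows. This is a hypothetical illustration, not part of the framework; the topology, bandwidth figures, and delay bounds are illustrative assumptions:

```python
# Hypothetical sketch of two-constraint path selection: prune links with
# insufficient reservable bandwidth, find the minimum-propagation-delay
# path on the pruned topology, then check the end-to-end delay budget
# using (per-hop queueing delay bound * number of hops) plus propagation
# delay. Topology and numbers are illustrative only.
import heapq

def cspf(links, src, dst, bw_mbps, delay_budget_ms, per_hop_q_ms):
    # links: {(u, v): (reservable_bw_mbps, prop_delay_ms)}
    adj = {}
    for (u, v), (bw, prop) in links.items():
        if bw >= bw_mbps:                      # bandwidth pruning
            adj.setdefault(u, []).append((v, prop))
    # Dijkstra on propagation delay over the pruned topology.
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, prop in adj.get(u, []):
            nd = d + prop
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None                            # no path satisfies bandwidth
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    path.reverse()
    hops = len(path) - 1
    total = dist[dst] + per_hop_q_ms * hops    # propagation + queueing estimate
    return path if total <= delay_budget_ms else None

links = {("A", "B"): (200, 10), ("B", "D"): (200, 10),
         ("A", "C"): (500, 15), ("C", "D"): (500, 15)}
print(cspf(links, "A", "D", 300, 40, 2))       # ['A', 'C', 'D']
```

Here the 300 Mb/s request prunes the 200 Mb/s links, and the surviving path A-C-D meets the 40 ms budget (30 ms propagation + 2 hops * 2 ms estimated queueing).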
9.0 Inter-Domain Considerations

Inter-domain traffic engineering is concerned with performance optimization for traffic that originates in one administrative domain and terminates in a different one. Traffic exchange between autonomous systems occurs through exterior gateway protocols. Currently, BGP-4 [bgp4] is the de facto EGP standard. Traditionally, in the public Internet, BGP-based policies are used to control import and export policies for inter-domain traffic. BGP policies are also used, in conjunction with IGP routing (e.g., hot potato routing, local preference, MED), to determine exit and entrance points to peers, to reduce intra-domain traffic, and to load balance across multiple peering points. MPLS TE-tunnels add another degree of flexibility by assigning a TE metric different from the one defined by the IGP metric (e.g., through either an absolute metric or a relative metric). For example, routes that connect to a certain edge router can be made to prefer a certain exit point by connecting the router to the peering point via a TE-tunnel with a smaller metric than the IGP cost to all other peering points. A similar scheme can be applied to prefer a certain entrance point by setting the MED to the IGP cost as modified by the tunnel metric. As in intra-domain TE, one key element here is obtaining the traffic matrix for inter-domain traffic.

Any of these load balancing/splitting policy changes in one domain can affect the traffic distribution inside a peering partner's domain. This, in turn, will affect the partner's intra-domain TE due to the change in its traffic matrix. Therefore, it is critical for peering partners to negotiate and coordinate prior to any significant policy change to ensure successful and stable TE across multiple domains.
Although this can be a challenge in practice, since different ISPs may have different business focuses and traffic characteristics (e.g., web hosting versus intra-domain traffic), there is still common ground where it is beneficial for both peering partners to coordinate on coherent inter-domain TE.

Inter-domain TE is inherently a much more difficult problem than intra-domain TE, since BGP is currently not capable of propagating topology and link state information outside one domain. How to extend BGP and use MPLS to select an optimal path across multiple domains that satisfies certain requirements under both normal and failure conditions remains a challenging task.

10.0 Conclusion

This document described a framework for traffic engineering in the Internet.

11.0 Security Considerations

This document does not introduce new security issues.

12.0 Acknowledgements

The authors would like to thank Jim Boyle for inputs on the requirements section, and Francois Le Faucheur for his inputs on class-types. The subsection describing an "Overview of ITU Activities Related to Traffic Engineering" was adapted from a contribution by Waisum Lai.

13.0 References

[ACJ99] L. Anderson, B. Cain, and B. Jamoussi, "Requirement Framework for Fast Re-route with MPLS," Work in Progress, October 1999.

[ASH1] J. Ash, M. Girish, E. Gray, B. Jamoussi, and G. Wright, "Applicability Statement for CR-LDP," Work in Progress, 1999.

[AWD1-99] D. Awduche, J. Malcolm, J. Agogbua, M. O'Dell, and J. McManus, "Requirements for Traffic Engineering over MPLS," RFC 2702, September 1999.

[AWD2-99] D. Awduche, "MPLS and Traffic Engineering in IP Networks," IEEE Communications Magazine, December 1999.

[AWD3] D. Awduche, L. Berger, D. Gan, T. Li, G. Swallow, and V. Srinivasan, "Extensions to RSVP for LSP Tunnels," Work in Progress, 1999.

[AWD4] D. Awduche, A. Hannan, X.
Xiao, "Applicability Statement for Extensions to RSVP for LSP-Tunnels,"
Work in Progress, 1999.

[AWD5] D. Awduche et al., "An Approach to Optimal Peering Between
Autonomous Systems in the Internet," International Conference on
Computer Communications and Networks (ICCCN'98), October 1998.

[CAL] R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow, and
A. Viswanathan, "A Framework for Multiprotocol Label Switching," Work
in Progress, 1999.

[FGLR] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J.
Rexford, "NetScope: Traffic Engineering for IP Networks," to appear in
IEEE Network Magazine, 2000.

[FlJa93] S. Floyd and V. Jacobson, "Random Early Detection Gateways
for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol.
1, No. 4, August 1993, pp. 387-413.

[Floy94] S. Floyd, "TCP and Explicit Congestion Notification," ACM
Computer Communication Review, Vol. 24, No. 5, October 1994, pp.
10-23.

[HuSS87] B.R. Hurley, C.J.R. Seidl, and W.F. Sewell, "A Survey of
Dynamic Routing Methods for Circuit-Switched Traffic," IEEE
Communications Magazine, September 1987.

[itu-e600] ITU-T Recommendation E.600, "Terms and Definitions of
Traffic Engineering," March 1993.

[itu-e701] ITU-T Recommendation E.701, "Reference Connections for
Traffic Engineering," October 1993.

[JAM] B. Jamoussi, "Constraint-Based LSP Setup using LDP," Work in
Progress, 1999.

[Li-IGP] T. Li, G. Swallow, and D. Awduche, "IGP Requirements for
Traffic Engineering with MPLS," Work in Progress, 1999.

[LNO96] T. Lakshman, A. Neidhardt, and T. Ott, "The Drop from Front
Strategy in TCP over ATM and its Interworking with other Control
Features," Proc. INFOCOM'96, pp. 1242-1250.

[MATE] I. Widjaja and A. Elwalid, "MATE: MPLS Adaptive Traffic
Engineering," Work in Progress, 1999.

[McQl] J.M. McQuillan, I. Richer, and E.C. Rosen, "The New Routing
Algorithm for the ARPANET," IEEE Trans. on Communications, vol.
28, no. 5, pp. 711-719, May 1980.

[MR99] D. Mitra and K.G. Ramakrishnan, "A Case Study of Multiservice,
Multipriority Traffic Engineering Design for Data Networks," Proc.
Globecom'99, December 1999.

[MSOH99] S. Makam, V. Sharma, K. Owens, and C. Huang,
"Protection/Restoration of MPLS Networks,"
draft-makam-mpls-protection-00.txt, October 1999.

[OMP] C. Villamizar, "MPLS Optimized OMP," Work in Progress, 1999.

[RFC-1349] P. Almquist, "Type of Service in the Internet Protocol
Suite," RFC 1349, July 1992.

[RFC-1458] R. Braudes and S. Zabele, "Requirements for Multicast
Protocols," RFC 1458, May 1993.

[RFC-1771] Y. Rekhter and T. Li, "A Border Gateway Protocol 4
(BGP-4)," RFC 1771, March 1995.

[RFC-1812] F. Baker (Editor), "Requirements for IP Version 4 Routers,"
RFC 1812, June 1995.

[RFC-1997] R. Chandra, P. Traina, and T. Li, "BGP Communities
Attribute," RFC 1997, August 1996.

[RFC-1998] E. Chen and T. Bates, "An Application of the BGP Community
Attribute in Multi-home Routing," RFC 1998, August 1996.

[RFC-2178] J. Moy, "OSPF Version 2," RFC 2178, July 1997.

[RFC-2205] R. Braden et al., "Resource Reservation Protocol (RSVP) -
Version 1 Functional Specification," RFC 2205, September 1997.

[RFC-2211] J. Wroclawski, "Specification of the Controlled-Load
Network Element Service," RFC 2211, September 1997.

[RFC-2212] S. Shenker, C. Partridge, and R. Guerin, "Specification of
Guaranteed Quality of Service," RFC 2212, September 1997.

[RFC-2215] S. Shenker and J. Wroclawski, "General Characterization
Parameters for Integrated Service Network Elements," RFC 2215,
September 1997.

[RFC-2216] S. Shenker and J. Wroclawski, "Network Element Service
Specification Template," RFC 2216, September 1997.

[RFC-2330] V. Paxson et al., "Framework for IP Performance Metrics,"
RFC 2330, May 1998.

[RFC-2475] S. Blake et al., "An Architecture for Differentiated
Services," RFC 2475, December 1998.

[RFC-2597] J.
Heinanen, F. Baker, W. Weiss, and J. Wroclawski, "Assured Forwarding
PHB Group," RFC 2597, June 1999.

[RFC-2678] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
Connectivity," RFC 2678, September 1999.

[RFC-2679] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay
Metric for IPPM," RFC 2679, September 1999.

[RFC-2680] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Packet
Loss Metric for IPPM," RFC 2680, September 1999.

[RFC-2722] N. Brownlee, C. Mills, and G. Ruth, "Traffic Flow
Measurement: Architecture," RFC 2722, October 1999.

[RoVC] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol Label
Switching Architecture," Work in Progress, 1999.

[Shew99] S. Shew, "Fast Restoration of MPLS Label Switched Paths,"
draft-shew-lsp-restoration-00.txt, October 1999.

[SLDC98] B. Suter, T. Lakshman, D. Stiliadis, and A. Choudhury,
"Design Considerations for Supporting TCP with Per-flow Queueing,"
Proc. INFOCOM'98, 1998, pp. 299-306.

[XIAO] X. Xiao, A. Hannan, B. Bailey, and L. Ni, "Traffic Engineering
with MPLS in the Internet," IEEE Network Magazine, March 2000.

[YaRe95] C. Yang and A. Reddy, "A Taxonomy for Congestion Control
Algorithms in Packet Switching Networks," IEEE Network Magazine, 1995,
pp. 34-45.

14.0 Authors' Addresses

Daniel O. Awduche
UUNET (MCI Worldcom)
22001 Loudoun County Parkway
Ashburn, VA 20147
Phone: 703-886-5277
Email: awduche@uu.net

Angela Chiu
AT&T Labs
Room C4-3A22
200 Laurel Ave.
Middletown, NJ 07748
Phone: (732) 420-2290
Email: alchiu@att.com

Anwar Elwalid
Lucent Technologies
Murray Hill, NJ 07974, USA
Phone: 908 582-7589
Email: anwar@lucent.com

Indra Widjaja
Fujitsu Network Communications
Two Blue Hill Plaza
Pearl River, NY 10965, USA
Phone: 914-731-2244
Email: indra.widjaja@fnc.fujitsu.com

Xipeng Xiao
Global Crossing
141 Caspian Court
Sunnyvale, CA 94089
Phone: +1 408-543-4801
Email: xipeng@globalcenter.net