Internet Draft

Internet Engineering Task Force
INTERNET-DRAFT
TE Working Group
January 2000

Daniel O. Awduche, UUNET (MCI Worldcom)
Angela Chiu, AT&T
Anwar Elwalid, Lucent Technologies
Indra Widjaja, Fujitsu Network Communications
Xipeng Xiao, Global Crossing

A Framework for Internet Traffic Engineering

draft-ietf-tewg-framework-00.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To view the list of Internet-Draft Shadow Directories, see http://www.ietf.org/shadow.html.

Awduche/Chiu/Elwalid/Widjaja/Xiao [Page 1]
Internet Draft draft-ietf-tewg-framework-00.txt Expires July 2000

Abstract

This memo describes a framework for Traffic Engineering (TE) in the Internet. The framework is intended to promote better understanding of the issues surrounding traffic engineering in IP networks, and to provide a common basis for the development of traffic engineering capabilities for the Internet. The framework explores the principles, architectures, and methodologies for performance evaluation and performance optimization of operational IP networks. The optimization goals of traffic engineering seek to enhance the performance of IP traffic while utilizing network resources economically, efficiently, and reliably. The framework includes a set of generic requirements, recommendations, and options for Internet traffic engineering.
The framework can serve as a guide to implementors of online and off-line Internet traffic engineering mechanisms, tools, and support systems. The framework can also help service providers in devising traffic engineering solutions for their networks.

Table of Contents

1.0 Introduction
   1.1 What is Internet Traffic Engineering?
   1.2 Scope
   1.3 Terminology
2.0 Background
   2.1 Context of Internet Traffic Engineering
   2.2 Network Context
   2.3 Problem Context
      2.3.1 Congestion and its Ramifications
   2.4 Solution Context
      2.4.1 Combating the Congestion Problem
   2.5 Implementation and Operational Context
3.0 Traffic Engineering Process Model
   3.1 Components of the Traffic Engineering Process Model
   3.2 Measurement
   3.3 Modeling and Analysis
   3.4 Optimization
4.0 Historical Review and Recent Developments
   4.1 Traffic Engineering in Classical Telephone Networks
   4.2 Evolution of Traffic Engineering in Packet Networks
      4.2.1 Adaptive Routing in ARPANET
      4.2.2 Dynamic Routing in the Internet
      4.2.3 ToS Routing
      4.2.4 Equal Cost MultiPath
   4.3 Overlay Model
   4.4 Constraint-Based Routing
   4.5 Overview of Recent IETF Projects Related to Traffic Engineering
      4.5.1 Integrated Services
      4.5.2 RSVP
      4.5.3 Differentiated Services
      4.5.4 MPLS
      4.5.5 IP Performance Metrics
      4.5.6 Flow Measurement
      4.5.7 Endpoint Congestion Management
   4.6 Overview of ITU Activities Related to Traffic Engineering
5.0 Taxonomy of Traffic Engineering Systems
   5.1 Time-Dependent Versus State-Dependent
   5.2 Offline Versus Online
   5.3 Centralized Versus Distributed
   5.4 Local Versus Global
   5.5 Prescriptive Versus Descriptive
   5.6 Open-Loop Versus Closed-Loop
6.0 Requirements for Internet Traffic Engineering
   6.1 Generic Requirements
   6.2 Routing Requirements
   6.3 Traffic Mapping Requirements
   6.4 Measurement Requirements
   6.5 Network Survivability
      6.5.1 Survivability in MPLS Based Networks
   6.6 Content Distribution (Webserver) Requirements
   6.7 Off-line Traffic Engineering Support Systems
   6.8 Traffic Engineering in Diffserv Environments
7.0 Multicast Considerations
8.0 Inter-Domain Considerations
9.0 Conclusion
10.0 Security Considerations
11.0 Acknowledgments
12.0 References
13.0 Authors' Addresses

1.0 Introduction

This memo describes a framework for Internet traffic engineering. The intent is to articulate the general issues, principles and requirements for Internet traffic engineering; and where appropriate to provide recommendations, guidelines, and options for the development of online and off-line Internet traffic engineering capabilities and support systems.

The framework can assist vendors of networking hardware and software in developing mechanisms and support systems for the Internet environment that support the traffic engineering function. The framework can also help service providers in devising and implementing traffic engineering solutions for their networks.

The framework provides a terminology for describing and understanding common Internet traffic engineering concepts. The framework also provides a taxonomy of known traffic engineering styles. In this context, a traffic engineering style abstracts important aspects from a traffic engineering methodology. Traffic engineering styles can be viewed in different ways depending upon the specific context in which they are used and the specific purpose which they serve. The combination of styles and views results in a natural taxonomy of traffic engineering systems.

Although Internet traffic engineering is most effective when applied end-to-end, the initial focus of this framework document is intra-domain traffic engineering (that is, traffic engineering within a given autonomous system).
However, because a preponderance of Internet traffic tends to be inter-domain (that is, it originates in one autonomous system and terminates in another), this document provides an overview of some of the aspects that pertain to inter-domain traffic engineering.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

This draft is preliminary and will be reviewed and revised over time.

1.1 What is Internet Traffic Engineering?

Internet traffic engineering is defined as that aspect of Internet network engineering that deals with the issue of performance evaluation and performance optimization of operational IP networks. Traffic Engineering encompasses the application of technology and scientific principles to the measurement, characterization, modeling, and control of Internet traffic [AWD1, AWD2].

A major objective of Internet traffic engineering is to enhance the performance of an operational network at both the traffic and resource levels. This is accomplished by addressing traffic oriented performance requirements, while utilizing network resources efficiently, reliably, and economically. Traffic oriented performance measures include delay, delay variation, packet loss, and goodput.

It is worthwhile to emphasize that an important objective of Internet traffic engineering is to facilitate reliable network operations [AWD1]. Reliable network operations can be facilitated by providing mechanisms that enhance network integrity and by embracing policies that emphasize network survivability, so that the vulnerability of the network to service outages arising from errors, faults, and failures that occur within the infrastructure can be minimized.

It is also important to be cognizant of the fact that ultimately, what really matters is the performance of the network as seen by end users of network services.
This crucial aspect should be kept in view when developing traffic engineering mechanisms and policies. The characteristics that are visible to end users are the emergent properties of the network. Emergent properties are the characteristics of the network viewed as a whole.

A significant, but subtle, practical advantage of applying traffic engineering concepts to operational networks is that it helps to identify and structure goals and priorities in terms of enhancing the quality of service delivered to end-users of network services, and in terms of measuring and analyzing the achievement of these goals.

The optimization aspects of traffic engineering can be achieved through capacity management and traffic management. As used in this document, capacity management includes capacity planning, routing control, and resource management. Network resources of particular interest include link bandwidth, buffer space, and computational resources. Likewise, as used in this document, traffic management includes traffic conditioning, scheduling, and other functions that regulate traffic flow through the network or that arbitrate access to network resources between different packets.

The optimization objectives of Internet traffic engineering should be viewed as a continual and iterative process of network performance improvement, rather than as a one time goal. Traffic engineering also demands continual development of new technologies and new methodologies for network performance enhancement.

The optimization objectives of Internet traffic engineering may change over time as new requirements are imposed, or as new technologies emerge, or as new insights are brought to bear on the underlying problems. Moreover, different networks may have different optimization objectives, depending upon their business models, capabilities, and operating constraints.
Regardless of the specific optimization goals that prevail in any particular environment, for practical purposes, the optimization aspects of traffic engineering are ultimately concerned with network control. Thus, the optimization aspects of traffic engineering can be viewed from a control perspective.

The control dimension of Internet traffic engineering can be pro-active and/or reactive. In the reactive case, the control system responds to events that have already transpired in the network. In the pro-active case, the control system takes preventive action to obviate predicted unfavorable future network states, or takes perfective action to induce a more desirable state in the future.

The control dimension of Internet traffic engineering responds at multiple levels of temporal resolution to network events. Some aspects of capacity management such as capacity planning functions respond at a very coarse temporal level, ranging from days to possibly years. The routing control functions operate at intermediate levels of temporal resolution, ranging from milliseconds to days. Finally, the packet level processing functions (e.g. rate shaping, queue management, and scheduling) operate at very fine levels of temporal resolution, responding to the real-time statistical characteristics of traffic, ranging from picoseconds to milliseconds.

The subsystems of Internet traffic engineering control include: capacity augmentation, routing control, traffic control, and resource control (including control of service policies at network elements). Inputs into the control system include network state variables, policy variables, and decision variables.
For practical purposes, traffic engineering concepts and mechanisms must be sufficiently specific and well defined to address known requirements, but at the same time must be flexible and extensible to accommodate unforeseen future demands.

A major challenge in Internet traffic engineering is the realization of automated control capabilities that adapt quickly and at reasonable cost to significant changes in network state, while maintaining stability.

1.2 Scope

The scope of this document is intra-domain traffic engineering; that is, traffic engineering within a given autonomous system in the Internet. The framework will discuss concepts pertaining to intra-domain traffic control, including such issues as routing control, micro and macro resource allocation, and the control coordination problems that arise consequently.

This document will describe and characterize techniques already in use or in advanced development for Internet traffic engineering, indicate how they fit together, and identify scenarios in which they are useful.

Although the emphasis is on intra-domain traffic engineering, Section 8.0 provides an overview of the high level considerations pertaining to inter-domain traffic engineering. Inter-domain Internet traffic engineering is crucial to the performance enhancement of the global Internet infrastructure.

Whenever possible, relevant requirements from existing IETF documents and other sources will be incorporated by reference.

1.3 Terminology

This subsection provides terminology which is useful for Internet traffic engineering. The definitions presented apply to this framework document. These terms may have other meanings elsewhere.

- Baseline analysis:
      A study conducted to serve as a baseline for comparison to the actual behavior of the network.

- Busy hour:
      A one hour period within a specified interval of time (typically 24 hours) in which the traffic load in a network or subnetwork is greatest.

- Congestion:
      A state of a network resource in which the traffic incident on the resource exceeds its output capacity over an interval of time.

- Congestion avoidance:
      An approach to congestion management that attempts to obviate the occurrence of congestion.

- Congestion control:
      An approach to congestion management that attempts to remedy congestion problems that have already occurred.

- Constraint-based routing:
      A class of routing protocols that take specified traffic attributes, network constraints, and policy constraints into account in making routing decisions. Constraint-based routing is applicable to traffic aggregates as well as flows. It is a generalization of QoS routing.

- Demand side congestion management:
      A congestion management scheme that addresses congestion problems by regulating or conditioning offered load.

- Effective bandwidth:
      The minimum amount of bandwidth that can be assigned to a flow or traffic aggregate in order to deliver 'acceptable service quality' to the flow or traffic aggregate.

- Egress traffic:
      Traffic exiting a network or network element.

- Ingress traffic:
      Traffic entering a network or network element.

- Inter-domain traffic:
      Traffic that originates in one autonomous system and terminates in another.

- Loss network:
      A network that does not provide adequate buffering for traffic, so that traffic entering a busy resource within the network will be dropped rather than queued.

- Network Survivability:
      The capability to provide a prescribed level of QoS for existing services after a given number of failures occur within the network.

- Off-line traffic engineering:
      A traffic engineering system that exists outside of the network.

- Online traffic engineering:
      A traffic engineering system that exists within the network, typically implemented on or as adjuncts to operational network elements.

- Performance measures:
      Metrics that provide quantitative or qualitative measures of the performance of systems or subsystems of interest.

- Performance management:
      A systematic approach to improving effectiveness in the accomplishment of specific networking goals related to performance improvement.

- Provisioning:
      The process of assigning or configuring network resources to meet certain requests.

- QoS routing:
      A class of routing systems that selects paths to be used by a flow based on the QoS requirements of the flow.

- Service Level Agreement:
      A contract between a provider and a customer that guarantees specific levels of performance and reliability at a certain cost.

- Stability:
      An operational state in which a network does not oscillate in a disruptive manner from one mode to another mode.

- Supply side congestion management:
      A congestion management scheme that provisions additional network resources to address existing and/or anticipated congestion problems.

- Transit traffic:
      Traffic whose origin and destination are both outside of the network under consideration.

- Traffic characteristic:
      A description of the temporal behavior or a description of the attributes of a given traffic flow or traffic aggregate.

- Traffic engineering system:
      A collection of objects, mechanisms, and protocols that are used conjunctively to accomplish traffic engineering objectives.

- Traffic flow:
      A stream of packets between two end-points that can be characterized in a certain way. A micro-flow has a more specific definition: A micro-flow is a stream of packets with a bounded inter-arrival time and with the same source and destination addresses, source and destination ports, and protocol ID.

- Traffic intensity:
      A measure of traffic loading with respect to a resource capacity over a specified period of time. In classical telephony systems, traffic intensity is measured in units of Erlang.

- Traffic matrix:
      A representation of the traffic demand between a set of origin and destination abstract nodes. An abstract node can consist of one or more network elements.

- Traffic monitoring:
      The process of observing traffic characteristics at a given point in a network and collecting the traffic information for analysis and further action.

- Traffic trunk:
      An aggregation of traffic flows belonging to the same class which are forwarded through a common path. A traffic trunk may be characterized by an ingress and egress node, and a set of attributes which determine its behavioral characteristics and requirements from the network.

2.0 Background

The Internet has quickly evolved into a critical communications infrastructure, supporting significant economic, educational, and social activities. At the same time, the delivery of Internet communications services has become a very competitive endeavor. Consequently, optimizing the performance of large scale IP networks, especially public Internet backbones, has become an important problem. Network performance requirements are multidimensional, complex, and sometimes contradictory, thereby making the traffic engineering problem very challenging.

The network must convey IP packets from ingress nodes to egress nodes efficiently, expeditiously, reliably, and economically. Furthermore, in a multiclass service environment (e.g.
Diffserv capable networks), the resource sharing parameters of the network must be appropriately determined and configured according to prevailing policies and service models to resolve resource contention issues arising from mutual interference between packets traversing the network. Moreover, in multi-class environments, consideration must be given to resolving competition for network resources between traffic streams belonging to the same service class (intra-class contention resolution) and between traffic streams belonging to different classes (inter-class contention resolution).

2.1 Context of Internet Traffic Engineering

The context of Internet traffic engineering pertains to the scenarios in which the problems that traffic engineering attempts to solve manifest. A traffic engineering methodology establishes appropriate rules to solve traffic performance problems that occur in a specific context. The context of Internet traffic engineering includes:

(1) A network context which defines the situations in which the TE problems occur. The network context encompasses network structure, network policies, network characteristics, network constraints, network quality attributes, network optimization criteria, etc.

(2) A problem context which defines the general and concrete issues that TE addresses. The problem context encompasses identification, abstraction of relevant features, representation, formulation, requirements and desirable features of solutions, etc.

(3) A solution context which suggests how to solve the TE problems. The solution context encompasses analysis, evaluation of alternatives, prescription, and resolution.

(4) An implementation and operational context where the solutions are methodologically instantiated. The implementation and operational context encompasses planning, organization, and execution.

In the following subsections, we elaborate on the context of Internet traffic engineering.

2.2 Network Context

IP networks range in size from small clusters of routers situated within a given location, to thousands of interconnected routers and switches distributed all over the world.

At the most basic level of abstraction, an IP network can be represented as: (1) a constrained system consisting of a set of interconnected resources which provide transport services for IP traffic, (2) a demand system representing the offered load to be transported through the network, and (3) a response system consisting of network processes, protocols, and related mechanisms which facilitate the movement of traffic through the network [see also AWD2].

The network elements and resources may have specific characteristics which restrict the way in which they handle the demand. Additionally, network resources may be equipped with traffic control mechanisms which allow the way in which they handle the demand to be regulated. Traffic control mechanisms may also be used to control various packet processing activities within the resource, or to arbitrate contention for access to the resource by different packets, or to regulate traffic behavior through the resource. A configuration management and provisioning system may allow the settings of the traffic control mechanisms to be manipulated by external or internal entities in order to constrain or to exercise control over the way in which the network element responds to internal and external stimuli.

The details of how the network provides transport services for packets are specified in the policies of the network administrators and are installed through network configuration management and provisioning systems.
Generally, the types of services provided by the network also depend upon the technology and characteristics of the network elements, the prevailing policies, as well as the ability of the network administrators to translate policies into network configurations.

There are three significant characteristics of contemporary Internet networks: (1) they provide real-time services, (2) they have become mission critical, and (3) their operating environments are very dynamic. The dynamic characteristics of IP networks can be attributed in part to fluctuations in demand, to the interaction between various network protocols and processes, to the rapid evolution of the infrastructure which demands constant insertion of new technologies and new network elements, and to transient and persistent impairments which occur within the system.

The most significant function performed by an IP network is the routing of packets from source nodes to destination nodes. Not surprisingly, one of the most significant functions performed by Internet traffic engineering is the control and optimization of the routing function, so as to steer packets in the most effective way through the network.

As packets are conveyed through the network, they contend for the use of network resources. If the arrival rate of packets exceeds the output capacity of a network resource over an interval of time, the resource is said to be congested, and some of the arriving packets may be dropped as a result. Congestion also increases transit delays and delay variation, and reduces the predictability of network service delivery. Thus, congestion is a highly undesirable phenomenon. Combating congestion at reasonable cost is a major objective of Internet traffic engineering.

A basic economic premise for packet switched networks in general and the Internet in particular is the efficient sharing of network resources by multiple traffic streams.
One of the fundamental challenges in operating a network, especially large scale public IP networks, is the need to increase the efficiency of resource utilization while minimizing the possibility of congestion.

Increasingly, the Internet will have to function in the presence of different classes of traffic, especially with the advent of differentiated services. In practice, a particular set of packets may have specific delivery requirements which may be specified explicitly or implicitly. Two of the most important traffic delivery requirements are (1) capacity constraints which can be expressed statistically as peak rates, mean rates, burst sizes, or as some deterministic notion of effective bandwidth, and (2) QoS constraints which can be expressed in terms of integrity constraints (e.g. packet loss) and temporal constraints, e.g. timing restrictions for the delivery of each packet and timing restrictions for the delivery of consecutive packets belonging to the same traffic stream. Packets may also be grouped into classes, in such a way that each class may have a common set of behavioral characteristics and/or a common set of delivery requirements.

2.3 Problem Context

There are a number of fundamental problems associated with the operation of a network described by the simple model of the previous subsection. The present subsection reviews the problem context with regard to the traffic engineering function.

One problem concerns how to identify, abstract, represent, and measure the features of the network which are relevant for traffic engineering.

One particularly important class of problems concerns how to explicitly formulate the problems that TE attempts to solve, how to identify the requirements on the solution space, how to specify the desirable features of good solutions, and how to measure and characterize the effectiveness of the solutions.
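
As an illustration of what an explicit formulation can look like, the following is a sketch of one common textbook formulation (minimizing the maximum link utilization as a multi-commodity flow problem). It is not taken from this memo, and all symbols are assumptions introduced for the example: G = (V, E) is the directed network graph, c_e is the capacity of link e, demand k has volume d_k from node s_k to node t_k, f_e^k is the amount of demand k carried on link e, and u is the maximum link utilization.

```latex
\begin{aligned}
\text{minimize}\quad & u \\
\text{subject to}\quad
& \sum_{e \in \mathrm{out}(v)} f_e^k \;-\; \sum_{e \in \mathrm{in}(v)} f_e^k
  = \begin{cases}
      d_k  & \text{if } v = s_k,\\
      -d_k & \text{if } v = t_k,\\
      0    & \text{otherwise,}
    \end{cases}
  && \forall k,\ \forall v \in V, \\
& \sum_{k} f_e^k \;\le\; u \, c_e && \forall e \in E, \\
& f_e^k \;\ge\; 0 && \forall k,\ \forall e \in E.
\end{aligned}
```

In this sketch, the first constraint expresses flow conservation for each demand, and the second bounds the load on every link by u times its capacity. A solution with u below 1 leaves no link loaded beyond its capacity. Other objectives (for example, cost or delay based) are equally possible; the choice depends on the optimization goals of the network in question.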

Another problem concerns how to measure and estimate relevant network state parameters. Effective traffic engineering relies on a good estimate of the offered traffic load as well as a view of the underlying topology and associated resource constraints. A network-wide view of the topology is also a must for off-line planning.

Still another problem concerns how to characterize the state of the network and how to evaluate its performance under a variety of scenarios. There are two aspects to the performance analysis problem. One aspect relates to the evaluation of the system level performance of the network. The other aspect relates to the evaluation of the resource level performance, which restricts attention to the performance evaluation of individual network resources. In this memo, we shall refer to the system level characteristics of the network as the "macro-states" and the resource level characteristics as the "micro-states." Likewise, we shall refer to the traffic engineering schemes that deal with network performance optimization at the systems level as macro-TE and the schemes that optimize at the individual resource level as micro-TE. In general, depending upon the particular performance measures of interest, the system level performance can be derived from the resource level performance results using appropriate rules of composition.

Yet another fundamental problem concerns how to optimize the performance of the network. Performance optimization may entail some degree of resource management control, routing control, and/or capacity augmentation.

As noted previously, congestion is an undesirable phenomenon in operational networks. Therefore, we devote the next subsection to the issue of congestion and its ramifications within the problem context of Internet traffic engineering.
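
The rules of composition mentioned in this subsection can be illustrated with a minimal sketch. The memo does not specify any particular rules; the three below (additive delay, multiplicative loss under an independence assumption, and bottleneck bandwidth) are commonly used ones, and the function names are hypothetical.

```python
from math import prod


def path_delay(link_delays_ms):
    """Additive rule: end-to-end delay is the sum of the per-link delays."""
    return sum(link_delays_ms)


def path_loss(link_loss_probs):
    """Multiplicative rule: a packet is delivered only if it survives every
    link. Assumes losses on different links are independent."""
    return 1.0 - prod(1.0 - p for p in link_loss_probs)


def path_bottleneck(link_available_mbps):
    """Min rule: the bandwidth available on a path is that of its
    tightest link."""
    return min(link_available_mbps)
```

For example, a path over two links with loss probabilities 0.01 and 0.02 has an end-to-end loss probability of 1 - (0.99 x 0.98) = 0.0298 under the independence assumption. Which rule applies depends on the performance measure of interest, as the text above notes.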

2.3.1 Congestion and its Ramifications

Congestion is one of the most significant problems in an operational IP context. A network element is said to be congested if it experiences sustained overload over an interval of time. Almost invariably, congestion results in degradation of service quality to end users. Congestion control schemes can include demand side policies and supply side policies. Demand side policies may restrict access to congested resources and/or dynamically regulate the demand to alleviate the overload situation. Supply side policies may reallocate network resources by redistributing traffic over the infrastructure and/or expand or augment network capacity.

In this memo, the emphasis is mainly on congestion management schemes that fall within the scope of the network, rather than congestion management systems that depend on sensitivity and adaptivity from end-systems. That is, the focus of this memo with respect to congestion management is on those solutions that can be provided by control entities operating on the network or by the actions of network administrators.

2.4 Solution Context

The solution context for Internet traffic engineering involves analysis, evaluation of alternatives, and choice between alternative courses of action. Generally the solution context is predicated on making reasonable inferences about the current or future state of the network and possibly making appropriate decisions that may involve a preference between alternative sets of action. More specifically, the solution context demands good estimates of traffic workload, characterization of network state, and possibly a set of control actions. Control actions may involve manipulation of parameters associated with the routing function, control over tactical capacity acquisition, and control over the traffic management functions.

The following is a subset of the instruments that may be applicable to the solution context of Internet TE.

(1) A set of policies, objectives, and requirements (which may be context dependent) for network performance evaluation and performance optimization.

(2) A collection of online and possibly off-line tools and mechanisms for measurement, characterization, modeling, and control of Internet traffic and control over the placement and allocation of network resources, as well as control over the mapping or distribution of traffic onto the infrastructure.

(3) A set of constraints on the operating environment, the network protocols, and the traffic engineering system itself.

(4) A set of administrative control parameters which may be manipulated through a Configuration Management (CM) system. The CM system itself may include a configuration control subsystem, a configuration repository, a configuration accounting subsystem, and a configuration auditing subsystem.

Derivation of traffic characteristics through measurement and/or estimation is very useful within the realm of the solution space for traffic engineering. Traffic estimates can be derived from customer subscription information, from traffic projections, from traffic models, or from actual empirical measurements. In order to measure and derive traffic matrices at various levels of detail, the measurement may be performed at the flow level or at the traffic aggregate level. Measurements at the flow level or on small traffic aggregates may be performed at edge nodes, where traffic enters and leaves the network [FGLR].

In order to conduct performance studies and planning on existing or future networks, a routing analysis may be performed to determine the path(s) which the routing protocols will choose for each traffic demand, and the utilization of network resources as traffic is routed through the network.
The routing analysis needs to capture the selection of paths through the network, the assignment of traffic across multiple feasible routes, and the multiplexing of IP traffic over traffic trunks (if such constructs exist) and over the underlying network infrastructure. A topology model for the network may be extracted from network architecture documents, or from network designs, or from information contained in router configuration files, or from routing databases, or from routing tables. Topology information may also be derived from servers that monitor network state and from servers that perform provisioning functions.

Routing in operational IP networks can be administratively controlled at various levels of abstraction, e.g., by manipulating BGP attributes; manipulating IGP metrics; or manipulating traffic engineering parameters, resource parameters, and policy constraints for path oriented technologies such as MPLS. Within the context of MPLS, the path of an explicit LSP can be computed and established in various ways, e.g., (1) manually, (2) automatically online using constraint-based routing processes implemented on label switching routers, or (3) automatically off-line using constraint-based traffic engineering support systems.

2.4.1 Combating the Congestion Problem

Minimizing congestion is a significant aspect of traffic engineering. This subsection gives an overview of the general approaches that have been used or proposed to combat congestion problems.

Congestion management policies can be categorized based upon the following criteria (see [YaRe95] for a more detailed taxonomy of congestion control schemes): (1) response time scale, which can be characterized as long, medium, or short; (2) reactive versus preventive, which relates to congestion control and congestion avoidance; and (3) supply side versus demand side congestion management schemes.
These aspects are elaborated upon in the following paragraphs.

(1) Response time scale

- Long (weeks to months): Capacity planning works over a relatively long time scale to expand network capacity based on estimates or forecasts of future traffic demand and traffic distribution. Since router and link provisioning takes time and is in general expensive, these upgrades are typically carried out on a time scale of weeks to months, or even years.

- Medium (minutes to days): Several control policies fall within the medium time scale category. Examples include: 1) adjusting IGP and/or BGP parameters to route traffic away from or towards certain segments of the network; 2) setting up and/or adjusting some Explicitly-Routed Label Switched Paths (ER-LSPs) to route some traffic trunks away from possibly congested resources or towards possibly more favorable routes; 3) reconfiguring the logical topology of the network to make it correlate more closely with the traffic distribution using, for example, some underlying path-oriented technology such as MPLS LSPs, ATM PVCs, or optical channel trails (see e.g. [AWD6]). Many of these adaptive medium time scale response schemes rely on a measurement system that monitors changes in traffic distribution, traffic shifts, and network resource utilization, and subsequently provides feedback to the online and/or off-line traffic engineering mechanisms and tools, which employ this feedback information to trigger certain control actions within the network. The traffic engineering mechanisms and tools can be implemented in a distributed or centralized fashion, and may have a hierarchical or flat structure. The comparative merits and demerits of distributed and centralized control structures for networks are well known. A centralized scheme may have global visibility into the network state and may produce potentially more optimal solutions.
However, centralized schemes are prone to single points of failure and may not scale as well as distributed schemes. Moreover, the information utilized by a centralized scheme may be stale and may not reflect the actual state of the network. It is not an objective of this memo to make a recommendation between distributed and centralized schemes. This is a choice that network administrators must make based on their specific needs.

- Short (picoseconds to minutes): This category includes packet level processing functions and events on the order of several round trip times. It includes router mechanisms such as passive or active buffer management, which are used to control congestion and/or signal congestion to end systems so that they can slow down. One of the most popular active management schemes, especially for TCP traffic, is Random Early Detection (RED) [FlJa93], which supports congestion avoidance by controlling the average queue size. During congestion (but before the queue is filled), the RED scheme chooses arriving packets to be "marked" according to a probabilistic algorithm which takes into account the average queue size. For a router that does not utilize explicit congestion notification (ECN) (see e.g., [Floy94]), the marked packets can simply be dropped to signal the inception of congestion to end systems; otherwise, if the router supports ECN, then it can set the ECN field in the packet header. Several variations of RED have been proposed for use in multiclass environments with different drop precedence levels [RFC-2597], e.g., RED with In and Out (RIO) and Weighted RED. It is generally agreed that RED provides congestion avoidance performance which is not worse than traditional Tail-Drop (TD) (i.e., dropping arriving packets only when the queue is full).
Importantly, however, RED reduces the possibility of global synchronization and improves fairness among different TCP sessions. However, RED by itself cannot prevent congestion and unfairness caused by unresponsive sources, e.g., UDP connections, or by some misbehaving greedy connections. Other schemes have been proposed to improve the performance and fairness in the presence of unresponsive connections. Some of these schemes have been proposed as theoretical frameworks and are not typically available in existing products. Two such schemes are Longest Queue Drop (LQD) and Dynamic Soft Partitioning with Random Drop (RND) [SLDC98].

(2) Reactive versus preventive

- Reactive (recovery): Reactive policies are those that react to existing congestion in order to relieve it. All the policies described in the long and medium time scales above can be categorized as being reactive, especially if the policies are based on monitoring and identifying existing congestion problems and initiating relevant actions to ease the situation.

- Preventive (predictive/avoidance): Preventive policies are those that take proactive action to prevent congestion based on estimates or predictions of potential congestion problems in the future. Some of the policies described in the long and medium time scales fall under this category. They do not necessarily respond immediately to existing congestion problems. Instead, they may take into account forecasts of future traffic demand and distribution, and may take or prescribe actions in order to prevent potential congestion problems in the future. The schemes described in the short time scale, e.g., RED and its variations, ECN, LQD, and RND, are also used for congestion avoidance, since dropping or marking packets as an early congestion notification before queues actually overflow would trigger corresponding TCP sources to slow down.
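As a rough illustration of the early marking behavior referred to above, the following Python sketch computes a RED-style marking decision from an exponentially weighted average of the queue length. It is a simplified rendering of the scheme in [FlJa93]: the inter-mark packet count correction and the idle-period adjustment of the original algorithm are omitted, and the function names, thresholds, and averaging weight are illustrative assumptions rather than recommended values.

```python
import random


def update_average(avg, queue_len, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def mark_probability(avg, min_th=5.0, max_th=15.0, max_p=0.1):
    """Simplified RED marking probability for a given average queue size."""
    if avg < min_th:
        return 0.0  # queue is short: no congestion signal
    if avg >= max_th:
        return 1.0  # sustained overload: signal on every arrival
    # Linear ramp from 0 to max_p between the two thresholds.
    return max_p * (avg - min_th) / (max_th - min_th)


def on_arrival(avg, queue_len, ecn_capable, rng):
    """Return (new_avg, action); action is 'enqueue', 'mark', or 'drop'."""
    avg = update_average(avg, queue_len)
    if rng.random() < mark_probability(avg):
        # ECN-capable routers set the ECN field; others drop the packet.
        return avg, ("mark" if ecn_capable else "drop")
    return avg, "enqueue"


rng = random.Random(1)  # fixed seed so that runs are repeatable
avg, action = on_arrival(avg=20.0, queue_len=20, ecn_capable=True, rng=rng)
print(action)  # the average is above max_th, so the packet is marked
```

Because the decision depends on the average rather than the instantaneous queue length, short bursts pass unmarked while sustained build-up is signalled before the buffer overflows.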
(3) Supply side versus demand side

- Supply side: Supply side policies are those that seek to increase the effective capacity available to traffic in order to control or obviate congestion. One way to accomplish this is to minimize congestion by having a relatively balanced network. For example, capacity planning should aim to provide a physical topology and associated link bandwidths that match estimated traffic workload and traffic distribution based on forecasting, subject to budgetary and other constraints. However, if actual traffic distribution does not match the topology derived from capacity planning (due, for example, to forecasting errors or facility constraints), then the traffic can be mapped onto the existing topology using routing control mechanisms, or by modifying the logical topology using path oriented technologies (e.g., MPLS, ATM, optical channel trails), or by using some other load redistribution mechanisms.

- Demand side: Demand side policies are those that seek to control or regulate the offered traffic. For example, some of the short time scale mechanisms described earlier (such as RED and its variations, ECN, LQD, and RND) as well as policing and rate shaping mechanisms attempt to regulate the offered load in various ways. Tariffs may also be applied as a demand side instrument. However, to date, tariffs have not been used as a means of demand side congestion management within the Internet.

In summary, a variety of mechanisms can be brought to bear to address congestion problems in IP networks. These mechanisms may operate at multiple time-scales.

2.5 Implementation and Operational Context

The operational context of Internet traffic engineering is characterized by constant change which occurs at multiple levels of abstraction. The implementation context demands effective planning, organization, and execution.
The planning aspects may involve determining prior sets of actions in order to achieve desired objectives. Organizing involves arranging and assigning responsibility to the various components of the traffic engineering system and coordinating their activities in order to accomplish the desired TE objectives. Execution involves measuring and applying corrective or perfective actions to attain and maintain desired TE goals.

3.0 Traffic Engineering Process Model

This section describes a process model that captures the high level practical aspects of Internet traffic engineering in an operational context. The process model is described in terms of a sequence of actions that a traffic engineer, or more generally a traffic engineering system, goes through in order to optimize the performance of an operational network (see also [AWD1, AWD2]). Although the details regarding how traffic engineering is carried out may differ from network to network, the process model described here represents broad activities which are common to most traffic engineering methodologies. This process model may be enacted explicitly or implicitly, by an automaton and/or by a human.

The first phase of the TE process model is to define relevant control policies that govern the operation of the network. These policies may depend on the prevailing business model, the network cost structure, the operating constraints, and one or more optimization criteria, as well as other factors.

The second phase of the process model is a feedback process which involves acquiring measurement data from the operational network. If empirical data is not readily available from the network, then synthetic workloads may be used instead, which reflect either the prevailing or the expected workload of the network.
Synthetic workloads may be derived by estimation or by extrapolation using prior empirical data, or by using mathematical models of traffic characteristics, or by using some other means.

The third phase of the process model is to analyze the network state and to characterize traffic workload. In general, performance analysis may be proactive and/or reactive. Proactive performance analysis identifies potential problems that do not yet exist, but that may manifest at some point in the future. Reactive performance analysis, on the other hand, identifies existing problems, determines their cause through a process of diagnosis, and if necessary evaluates alternative approaches to remedy the problem. A number of quantitative and qualitative techniques may be used in the analysis process, including modeling based analysis and simulation. The analysis phase of the process model may involve the following actions: (1) investigate the concentration and distribution of traffic across the network or relevant subsets of the network, (2) identify the characteristics of the offered traffic workload, (3) identify existing or potential bottlenecks, and (4) identify network pathologies such as ineffective placement of links, single points of failure, etc. Network pathologies may result from a number of factors such as inferior network architecture, inferior network design, configuration problems, and others. A traffic matrix may be constructed as part of the analysis process. Network analysis may also be descriptive or prescriptive.

The fourth phase of the TE process model is concerned with the performance optimization of the network. The performance optimization phase generally involves a decision process which selects and implements a particular set of actions from a choice between alternatives. Optimization actions may include use of appropriate techniques to control the offered traffic, or to control the distribution of traffic across the network.
Optimization actions may also involve increasing link capacity, deploying additional hardware such as routers and switches, adjusting parameters associated with routing such as IGP metrics and BGP attributes in a systematic way, and adjusting traffic management parameters. Network performance optimization may also involve starting a network planning process to improve the network architecture, network design, network capacity, network technology, and the configuration of network elements in order to accommodate current and future growth.

3.1 Components of the Traffic Engineering Process Model

As evidenced by the discussion in the previous subsection, the key components of the TE process model include a measurement subsystem, a modeling and analysis subsystem, and an optimization subsystem. The following subsections elaborate on these components as they apply to the TE process model.

3.2 Measurement

Measurement is crucial to the traffic engineering function. The operational state of a network can only be conclusively determined through measurement. Measurement is also critical to the optimization function because it provides feedback data which is used by TE control subsystems to adaptively optimize network performance in response to events and stimuli that originate within and outside the network. Measurement is also needed to ascertain the quality of network services and to evaluate the effectiveness of TE policies. Experience suggests that measurement is most effective when it is applied systematically.

In developing a measurement system to support the TE function in IP networks, the following questions should be considered very carefully: Why is measurement needed in this particular context? What parameters are to be measured? How should the measurement be accomplished? Where should the measurement be performed? When should the measurement be performed?
How frequently should the monitored variables be measured? What level of measurement accuracy and reliability is desirable? What level of measurement accuracy and reliability is realistically attainable? To what extent can the measurement system permissibly interfere with the monitored network components and variables? What is the acceptable cost of measurement? To a large degree, the answers to the above questions will determine the measurement tools and measurement methodologies that are appropriate in any given TE context.

It is also worthwhile to point out that there is a distinction between measurement and evaluation. Measurement provides raw data concerning state parameters and variables of monitored elements. Evaluation utilizes the raw data to make inferences regarding the monitored system.

Measurement in support of the TE function can occur at different levels of abstraction. For example, measurement can be used to derive packet level characteristics, flow level characteristics, user or customer level characteristics, traffic aggregate characteristics, component level characteristics, network wide characteristics, etc.

3.3 Modeling and Analysis

Modeling and analysis are important aspects of Internet traffic engineering. Modeling involves constructing an abstract or physical representation which depicts relevant traffic and network attributes and characteristics.

Accurate source models for traffic are particularly useful for analysis. A major research topic in Internet traffic engineering is the development of traffic source models that are consistent with empirical data obtained from operational networks. Such models should also be tractable and amenable to analysis. The topic of source models for IP traffic is a research topic and is therefore outside the scope of this document; nonetheless, its importance cannot be over-emphasized.
A network model is an abstract representation of the network which captures relevant network features, attributes, and characteristics, such as link and nodal attributes and constraints. A network model may facilitate analysis and/or simulation which can be used to predict network performance under various conditions, and also to guide network expansion plans. Network simulation tools are extremely useful for traffic engineering. A good network simulator can be used to mimic and visualize network characteristics in various ways under various conditions. For example, a network simulator might be used to depict congested resources and hot spots, and to provide hints regarding possible solutions to network performance problems. A good simulator may also be used to validate the effectiveness of planned solutions to network issues without the need to tamper with the operational network, or to commence an expensive network upgrade which may not achieve the desired objectives. Furthermore, during the process of network planning, a network simulator may reveal pathologies such as single points of failure which may require additional redundancy, and potential bottlenecks and hot spots which may require additional capacity. Routing simulators are especially useful. A routing simulator may identify planned links which may not actually be used to route traffic by the existing routing protocols. Simulators can also be used to conduct scenario based and perturbation based analysis, as well as sensitivity studies. Simulation results can be used to initiate appropriate actions in various ways. For example, an important application of network simulation tools is to investigate and identify how best to evolve and grow the network in order to accommodate projected future demands. 
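The kind of routing analysis a simple network model supports can be sketched in a few lines of Python. The example below routes a traffic matrix over single shortest paths computed from static link metrics, then reports links that carry no traffic and links whose load exceeds capacity. The four-node topology, metrics, capacities, and demands are hypothetical, and the sketch ignores ECMP, failures, and queueing; it is meant only to show the shape of such a tool, not a complete simulator.

```python
import heapq


def shortest_path(metrics, src, dst):
    """Dijkstra over directed links {(u, v): metric}; returns the links on the path."""
    adj = {}
    for (u, v), metric in metrics.items():
        adj.setdefault(u, []).append((v, metric))
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, metric in adj.get(u, []):
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return list(reversed(path))


def link_utilization(metrics, capacity, demands):
    """Route each demand on its shortest path and return load/capacity per link."""
    load = {link: 0.0 for link in metrics}
    for (src, dst), volume in demands.items():
        path = shortest_path(metrics, src, dst)
        if path is None:
            continue  # unroutable demand; a real tool would report it
        for link in path:
            load[link] += volume
    return {link: load[link] / capacity[link] for link in metrics}


# Hypothetical topology: two routes from A to D with different metrics.
metrics = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 2, ("C", "D"): 2}
capacity = {link: 100.0 for link in metrics}          # Mb/s
demands = {("A", "D"): 120.0, ("A", "B"): 30.0}       # Mb/s

util = link_utilization(metrics, capacity, demands)
unused = sorted(link for link, u in util.items() if u == 0.0)
overloaded = sorted(link for link, u in util.items() if u > 1.0)
print("unused:", unused)
print("overloaded:", overloaded)
```

In this toy case the route through C is never selected, so its links sit idle while the route through B is overloaded, which is the kind of finding a routing simulator is expected to surface before links are provisioned.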
3.4 Optimization

Network performance optimization involves resolving network issues into concepts that enable a solution, identifying a solution, and implementing the solution. Network performance optimization can be corrective or perfective. In corrective optimization, the goal is to remedy a problem that has occurred or that is incipient. In perfective optimization, the goal is to improve network performance even when explicit problems do not exist and are not anticipated.

Network performance optimization is a continual process, as noted previously. Performance optimization iterations may consist of real-time optimization sub-processes and non-real-time network planning sub-processes. The difference between real-time optimization and network planning is largely in the relative time-scale at which they operate and in the granularity of actions. One of the objectives of a real-time optimization sub-process is to control the mapping and distribution of traffic over the existing network infrastructure to avoid and/or relieve congestion, to assure satisfactory service delivery, and to optimize resource utilization. Real-time optimization is needed because, no matter how well a network is designed, random incidents such as fiber cuts or shifts in traffic demand will occur. When they occur, they can cause congestion and other problems to manifest in an operational network. Real-time optimization must solve such problems in small to medium time-scales ranging from microseconds to minutes or hours. Examples of real-time optimization include queue management, IGP/BGP metric tuning, and using technologies such as MPLS explicit LSPs to change the paths of some traffic trunks [XIAO].

One of the functions of the network planning sub-process is to initiate actions to evolve the architecture, technology, topology, and capacity of a network in a systematic way.
When there is a problem in the network, real-time optimization should provide an immediate fix. Because of the need to respond promptly, the real-time solution may not be the best possible solution. Network planning may subsequently be needed to refine the solution and improve the situation. Network planning is also needed to expand the network to support traffic growth and changes in traffic distribution over time. As noted previously, the outcome of network planning might be a change in the topology and/or capacity of the network.

It can be seen that network planning and real-time performance optimization are mutually complementary activities. A well-planned and designed network makes real-time optimization easier, while a systematic approach to real-time network performance optimization allows network planning to focus on long term issues rather than tactical considerations. Systematic real-time network performance optimization also provides valuable inputs and insights towards network planning.

Stability is a major consideration in real-time network performance optimization. This aspect will be reiterated throughout this memo.

4.0 Historical Review and Recent Developments

This section presents a brief review of various traffic engineering approaches that have been proposed and implemented in telecommunications and computer networks. The discussion is not meant to be exhaustive, but is primarily intended to illuminate pre-existing perspectives and prior art concerning traffic engineering in the Internet as well as in legacy telecommunications networks.

4.1 Traffic Engineering in Classical Telephone Networks

It is useful to begin with a review of traffic engineering in telephone networks, which often relates to the means by which user traffic is steered from the source to the destination. This subsection presents a brief overview of this topic.
The book by G. Ash [ASH2] contains a detailed description of the various routing strategies that have been applied in telephone networks.

The early telephone network relied on static hierarchical routing, whereby routing patterns remained fixed independent of the state of the network or time of day. The hierarchy was intended to accommodate overflow traffic, improve network reliability via alternate routes, and prevent call looping by using strict hierarchical rules. The network was typically over-provisioned since a given fixed route had to be dimensioned so that it could carry user traffic during a busy hour of any busy day. Hierarchical routing in the telephony network was found to be too rigid with the advent of digital switches and stored program control, which were able to manage more complicated traffic engineering rules.

Dynamic routing was introduced to alleviate the routing inflexibility in the static hierarchical routing so that the network would operate more efficiently, thereby resulting in significant economic gains [HuSS87]. Dynamic routing typically reduces the overall loss probability by 10 to 20 percent as compared to static hierarchical routing. Dynamic routing can also improve network resilience by recalculating routes on a per-call basis and periodically updating routes.

There are three main types of dynamic routing in the telephone network: time-dependent routing, state-dependent routing (SDR), and event dependent routing (EDR).

In time-dependent routing, regular variations in traffic loads due to time of day and seasonality are exploited in pre-planned routing tables. In state-dependent routing, routing tables are updated online according to the current state of the network (e.g., traffic demand, utilization, etc.).
In event dependent routing, routing changes are triggered by events, such as call setups encountering congested or blocked links, whereupon new paths are searched out using learning models. EDR methods are real-time adaptive but, unlike SDR, do not require global state information. Examples of EDR schemes include the dynamic alternate routing (DAR) from BT, the state-and-time dependent routing (STR) from NTT, and the success-to-the-top (STT) routing from AT&T.

Dynamic non-hierarchical routing (DNHR) is an example of dynamic routing that was introduced in the AT&T toll network in the 1980's to respond to time-dependent information such as regular load variations as a function of time. Time-dependent information in terms of load may be divided into three time scales: hourly, weekly, and yearly. Correspondingly, three algorithms are defined to pre-plan the routing tables. The network design algorithm operates over a year-long interval, while the demand servicing algorithm operates on a weekly basis to fine tune link sizes and routing tables to correct forecast errors on the yearly basis. At the smallest time scale, the routing algorithm is used to make limited adjustments based on daily traffic variations. Network design and demand servicing are computed using off-line calculations. Typically, the calculations require extensive search on possible routes. On the other hand, routing may need online calculations to handle crankback. DNHR adopts a "two-link" approach whereby a path can consist of two links at most. The routing algorithm presents an ordered list of route choices between an originating switch and a terminating switch.
If a call overflows, a via switch (a tandem exchange between the originating switch and the terminating switch) would send a crankback signal to the originating switch, which would then select the next route, and so on, until no alternative routes are available, in which case the call is blocked.

4.2 Evolution of Traffic Engineering in Packet Networks

This subsection reviews related prior work that was intended to improve the performance of data networks. Indeed, optimization of the performance of data networks started in the early days of the ARPANET. Other early commercial networks such as SNA also recognized the importance of performance optimization and service differentiation.

In terms of traffic management, the Internet has been a best effort service environment until recently. In particular, very limited traffic management capabilities existed in IP networks to provide differentiated queue management and scheduling services to packets belonging to different classes.

In the following subsections, we review the evolution of practical implementations of traffic engineering mechanisms in IP networks and their predecessors.

4.2.1 Adaptive Routing in ARPANET

The early ARPANET recognized the importance of adaptive routing where routing decisions were based on the current state of the network [McQ80]. In the early minimum delay routing approaches, each packet was forwarded to its destination along the path for which the total estimated transit time was the smallest. Each node maintained a table of network delays, which represented the estimated delay that a packet could expect to experience along a given path toward its destination. The minimum delay table was periodically transmitted by a node to its neighbors. The shortest path in terms of hop count was also propagated to give the connectivity information.
A drawback of this approach is that dynamic link metrics tend to create "traffic magnets" whereby congestion is shifted from one location of a network to another, essentially creating oscillation and instability.

4.2.2 Dynamic Routing in the Internet

The Internet, which evolved from the ARPANET, adopted dynamic routing algorithms with distributed control to determine the paths that packets should take en route to their destinations. The routing algorithms themselves are adaptations of shortest path algorithms where costs are based on link metrics. In principle, the link metric can be based on static or dynamic quantities. In the static case, the link metric may be assigned administratively according to some local criteria. In the dynamic case, the link metric may be a function of some congestion measure such as delay or packet loss.

It was recognized early that static link metric assignment was inadequate because it can easily lead to unfavorable scenarios whereby some links become congested while some others remain lightly loaded. One of the many reasons for the inadequacy of static link metrics is that link metric assignment was often done without considering the traffic matrix in the network. Moreover, the routing protocols did not take traffic attributes and capacity constraints into account in making routing decisions. The practical implication is that traffic concentration is localized in subsets of the network infrastructure, potentially causing congestion. Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to a number of reasons, such as:

- Some resources might not be deployed in the most optimal locations from a routing perspective.

- Forecasting errors in traffic volume and/or traffic distribution.

- Dynamics in traffic matrix due to the temporal nature of traffic patterns, BGP policy change from peers, etc.
The inadequacy of the legacy Internet interior gateway routing system is one of the factors motivating the interest in path oriented technologies with explicit routing and constraint-based routing capability, such as MPLS.

4.2.3 ToS Routing

In ToS-based routing, different routes to the same destination may be selected depending on the Type-of-Service (ToS) field of an IP packet [RFC-1349]. ToS classes include, for example, low delay and high throughput. Each link is associated with multiple link costs, where each link cost is used to compute routes for a particular ToS. A separate shortest path tree is computed for each ToS. Since the shortest path algorithm has to be run for each ToS, the computation may be quite expensive with this approach. Classical ToS-based routing has become outdated as the IP header field has been replaced by a Diffserv field. A more serious technical issue with classical ToS-based routing is that it is difficult to perform effective traffic engineering because each class still relies exclusively on shortest path routing.

4.2.4 Equal Cost MultiPath

Equal Cost MultiPath (ECMP) is another technique that attempts to address the deficiency in Shortest Path First (SPF) interior gateway routing systems [RFC-2178]. In an SPF algorithm, if two or more paths to a given destination have the same cost, the algorithm will choose one of them. In ECMP, the algorithm is modified slightly so that if two or more equal shortest cost paths exist between two nodes, the traffic between the nodes is distributed among the multiple equal-cost paths. Traffic distribution across the equal-cost paths is usually done in one of two ways: 1) packet-based in a round-robin fashion, or 2) flow-based using hashing on source and destination IP addresses.
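The two distribution methods can be contrasted with a short Python sketch. The next-hop labels and addresses are hypothetical, and SHA-256 is used here only because it is available in the standard library and gives repeatable results; forwarding hardware typically uses much cheaper hash functions, and nothing below is taken from a particular router implementation.

```python
import hashlib
import itertools


def flow_based_next_hop(src_ip, dst_ip, next_hops):
    """Approach 2): hash the address pair so a flow always uses the same path."""
    key = f"{src_ip}|{dst_ip}".encode("ascii")
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]


def packet_based_next_hops(next_hops):
    """Approach 1): cycle through the equal-cost next hops packet by packet."""
    return itertools.cycle(next_hops)


# Hypothetical equal-cost next hops toward one destination.
hops = ["via-R2", "via-R3"]

first = flow_based_next_hop("10.0.0.1", "192.0.2.7", hops)
again = flow_based_next_hop("10.0.0.1", "192.0.2.7", hops)
print(first == again)  # the same flow is pinned to one path

rr = packet_based_next_hops(hops)
print([next(rr) for _ in range(4)])  # consecutive packets alternate paths
```

The hash keeps all packets of a flow on one path, which preserves packet order; the round-robin iterator spreads load evenly per packet but sends consecutive packets of the same flow over different paths.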
Approach 1) can easily cause out-of-order packets, while approach 2) is dependent on the number and distribution of flows. Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and heterogeneous (i.e., hashing may not be uniform), but is generally effective in core public networks where the number of flows is very large.

Because link costs are static and bandwidth constraints are not taken into account, ECMP attempts to distribute the traffic as equally as possible among the equal-cost paths, independent of the congestion status of each path. As a result, given two equal-cost paths, it is possible that one of the paths will be more congested than the other. Another drawback of ECMP is that load sharing cannot be done across multiple paths which have non-identical costs.

4.3 Overlay Model

In the overlay model, a virtual-circuit network, such as ATM or frame relay, provides virtual-circuit connectivity between routers that are located at the edges of the virtual-circuit network. In this mode, two routers that are connected through a virtual circuit see a direct adjacency between themselves, independent of the physical route taken by the virtual circuit through the ATM or frame relay network. Thus, the overlay model essentially decouples the logical topology that routers see from the physical topology that the ATM or frame relay network manages. The overlay model enables the network operator to perform traffic engineering by re-configuring the virtual circuits so that a virtual circuit on a congested physical path can be re-routed to a less congested one.

The overlay model requires the management of two separate networks (e.g., IP and ATM), which results in increased operational complexity and cost. In the fully-meshed overlay model, each router would peer with every other router in the network.
Some of the issues with the overlay model are discussed in [AWD2].

4.4 Constraint-Based Routing

Constraint-based routing pertains to a class of routing systems that compute routes through a network subject to satisfaction of a set of constraints and requirements. The constraints may be imposed by the network and/or by administrative policies. Constraints may include bandwidth, delay, and policy instruments such as resource class attributes.

The concept of constraint-based routing in IP networks was first defined in [AWD1] within the context of MPLS traffic engineering requirements. Unlike QoS routing, which generally deals with routing traffic flows in order to satisfy prescribed QoS requirements, constraint-based routing is applicable to traffic aggregates as well as flows and may also take policy restrictions into account.

4.5 Overview of Recent IETF Projects Related to Traffic Engineering

This subsection reviews a number of recent IETF activities that are pertinent to Internet traffic engineering.

4.5.1 Integrated Services

The IETF developed the integrated services model, which requires resources, such as bandwidth and buffers, to be reserved a priori for a given traffic flow to ensure that the quality of service requested by the traffic flow is satisfied. The integrated services model requires additional components beyond those used in the best-effort model, such as packet classifiers, packet schedulers, and admission control. A packet classifier is used to identify flows that are to receive a certain level of service. A packet scheduler handles the service of different packet flows to ensure that QoS commitments are met. Admission control is used to determine whether a router has the necessary resources to accept a new flow.

Two services have been defined: guaranteed service [RFC-2212] and controlled-load service [RFC-2211].
The guaranteed service can be used for applications that require real-time delivery. For this type of application, data that is delivered to the application after a certain time is generally considered worthless. Thus, guaranteed service has been designed to provide a firm bound on the end-to-end packet delay for a flow.

The controlled-load service can be used for adaptive applications that can tolerate some delay but that are sensitive to traffic overload conditions. Applications of this type typically function satisfactorily when the network is lightly loaded but degrade significantly when the network is heavily loaded. Thus, controlled-load service has been designed to provide approximately the same service as best-effort service in a lightly loaded network, regardless of actual network conditions. Controlled-load service is described qualitatively in that no target values of delay or loss are specified.

The main issue with the integrated services model has been scalability, especially in large public IP networks which may potentially have millions of concurrent micro-flows.

4.5.2 RSVP

RSVP, a soft state signaling protocol, was originally invented as a signaling protocol for applications to reserve network resources [RFC-2205]. Under RSVP, the sender sends a PATH message to the receiver, specifying the characteristics of the traffic. Every intermediate router along the path forwards the PATH message to the next hop determined by the routing protocol. Upon receiving a PATH message, the receiver responds with a RESV message to request resources for the flow. The RESV message travels toward the source in the opposite direction along the path that the PATH message traversed. Every intermediate router along the path can reject or accept the request of the RESV message. If the request is rejected, the router will send an error message to the receiver, and the signaling process will terminate.
If the request is accepted, link bandwidth and buffer space are allocated for the flow and the related flow state information will be installed in the router.

One of the issues with the original RSVP specification was scalability: because reservations were required for micro-flows, the amount of state maintained in the network increases linearly with the number of micro-flows.

Recently, however, RSVP has been modified and extended in several ways to overcome the scaling problems and to enable it to become a versatile signaling protocol for IP networks. For example, RSVP has been extended to reserve resources for aggregations of flows, to set up MPLS explicit label switched paths, and to perform other signaling functions within the Internet.

4.5.3 Differentiated Services

The essence of the Differentiated Services (Diffserv) effort within the IETF is to allow traffic to be categorized and divided into classes, and subsequently to allow each class to be treated differently, especially during times when there is a shortage of resources such as link bandwidth and buffer space [RFC-2475].

Diffserv defines the Differentiated Services field (DS field, formerly known as the TOS octet) and uses it to indicate the forwarding treatment a packet should receive [RFC-2474]. Diffserv also standardizes a number of Per-Hop Behavior (PHB) groups. Using different classification, policing, shaping, and scheduling rules, several classes of service can be defined.

In order for a customer to receive Differentiated Services from its Internet Service Provider (ISP), it may be necessary for the customer to have a Service Level Agreement (SLA) with the ISP. An SLA may explicitly or implicitly specify a Traffic Conditioning Agreement (TCA) which defines classifier rules as well as metering, marking, discarding, and shaping rules.
At the ingress to a Diffserv network, packets are classified, policed, and possibly shaped. When a packet traverses the boundary between different Diffserv domains, the DS field of the packet may be re-marked according to existing agreements between the domains.

In Differentiated Services, only a limited number of service classes can be indicated by the DS field. The main advantage of the Diffserv approach is scalability: since resources are allocated on a per-class basis, the amount of state information is proportional to the number of classes rather than to the number of application flows.

It should be evident from the above discussion that the Diffserv model essentially deals with traffic management issues on a per-hop basis. Thus, the Diffserv control model consists of a collection of micro-TE control mechanisms. Other traffic engineering capabilities such as capacity management, including routing control, are also required in Diffserv networks in order to deliver acceptable service quality.

4.5.4 MPLS

MPLS is an advanced forwarding scheme which also includes extensions to conventional IP control plane protocols. MPLS extends the Internet routing model, and enhances packet forwarding and path control [RoVC].

Each MPLS packet has a fixed-length label affixed to the header. In a non-ATM/FR environment, the header contains a 20-bit label, a 3-bit Experimental field (formerly known as the Class-of-Service or CoS field), a 1-bit label stack indicator, and an 8-bit TTL field. In an ATM (FR) environment, the header contains only a label encoded in the VCI/VPI (DLCI) field. An MPLS-capable router, termed a Label Switching Router (LSR), examines the label and possibly the Experimental field in forwarding a packet.
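The non-ATM/FR header described above fits in a single 32-bit word. The following sketch packs and unpacks such a word, assuming the fields appear in the order listed (label, Experimental, label stack indicator, TTL) with the label in the most significant bits and network byte order on the wire. The function names pack_shim and unpack_shim are hypothetical.

```python
import struct


def pack_shim(label, exp, bottom, ttl):
    """Encode one MPLS header entry as 4 bytes.

    Assumed layout, most significant bit first:
    20-bit label | 3-bit Experimental | 1-bit stack indicator | 8-bit TTL.
    """
    if not (0 <= label < 2**20 and 0 <= exp < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (int(bool(bottom)) << 8) | ttl
    return struct.pack("!I", word)  # "!I" = big-endian unsigned 32-bit


def unpack_shim(data):
    """Decode the first 4 bytes of data into the header fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


if __name__ == "__main__":
    raw = pack_shim(label=16, exp=5, bottom=True, ttl=64)
    print(raw.hex(), unpack_shim(raw))
```

An LSR that forwards on the label alone only needs the upper 20 bits of this word. One that also honors the Experimental field would read the next 3 bits as well.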
At the ingress LSRs of an MPLS-capable domain, IP packets are classified into forwarding equivalence classes (FECs) and routed based on a variety of factors, including, for example, a combination of the information carried in the IP header of the packets and the local routing information maintained by the LSRs. An MPLS header is then prepended to each packet according to its forwarding equivalence class. Within an MPLS-capable domain, an LSR will use the label prepended to packets as the index into a local next hop label forwarding entry (NHLFE). The packet is then processed as specified in the NHLFE. The incoming label may be replaced by an outgoing label and the packet may be switched to the next LSR. This label-switching process is very similar to the label (VCI/VPI) swapping process in ATM networks. Before a packet leaves an MPLS domain, its MPLS header is removed. The path that a FEC traverses between an ingress LSR and an egress LSR is called a Label Switched Path (LSP). The path of an explicit LSP is defined at the originating (ingress) node of the LSP. MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs.

MPLS is a very powerful technology for Internet traffic engineering because it supports explicit LSPs, which allow constraint-based routing to be implemented efficiently in IP networks.

4.5.5 IP Performance Metrics

The IPPM WG has been developing a set of standard metrics that can be applied to the quality, performance, and reliability of Internet services by network operators, end users, or independent testing groups [RFC2330], so that users and service providers have an accurate common understanding of the performance and reliability of the Internet component 'clouds' that they use or provide. Examples of performance metrics include one-way packet loss [RFC2680], one-way delay [RFC2679], and connectivity measures between two nodes [RFC2678]. Other metrics include second-order measures of packet loss and delay.
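As a rough illustration of how such metrics might be summarized from a set of test packets, the sketch below derives a loss ratio and a median one-way delay from send and receive timestamps. It is not the formal metric definition from the cited documents. The function name one_way_summary and the sample data are hypothetical, and the sender and receiver clocks are assumed to be synchronized.

```python
from statistics import median


def one_way_summary(sent, received):
    """Summarize one-way loss and delay for a set of test packets.

    sent:     dict mapping sequence number -> send timestamp (seconds)
    received: dict mapping sequence number -> receive timestamp (seconds)
    A packet missing from `received` is counted as lost.
    """
    delays = [received[seq] - t for seq, t in sent.items() if seq in received]
    lost = len(sent) - len(delays)
    return {
        "loss_ratio": lost / len(sent) if sent else 0.0,
        "median_delay": median(delays) if delays else None,
    }


if __name__ == "__main__":
    sent = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0}
    received = {1: 0.25, 2: 1.5, 4: 3.75}  # packet 3 never arrived
    print(one_way_summary(sent, received))
```

In practice a measurement system would also have to decide how long to wait before declaring a packet lost, and account for clock error between the two measurement points.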
Performance metrics are useful for specifying Service Level Agreements (SLAs), which are sets of service level objectives negotiated between users and service providers, where each objective is a combination of one or more performance metrics subject to constraints.

4.5.6 Flow Measurement

A flow measurement system enables network traffic flows to be measured and analyzed at the flow level for a variety of purposes. The RTFM WG has produced an architecture document that defines a method to specify traffic flows, and a number of components (meters, meter readers, and managers) to measure the traffic flows [RFC-2722]. A meter observes packets passing through a measurement point, classifies them into certain groups, accumulates certain usage data such as the number of packets and bytes for each group, and stores the usage data in a flow table. For this purpose, a group may represent a user application, a host, a network, a group of networks, any combination of the above, etc. A meter reader gathers usage data from various meters so that it can be made available for analysis. A manager is responsible for configuring and controlling meters and meter readers. The instructions received by a meter from a manager include flow specifications, meter control parameters, and sampling techniques. The instructions received by a meter reader from a manager include the address of the meter whose data is to be collected, the frequency of data collection, and the types of flows to be collected.

4.5.7 Endpoint Congestion Management

The work in endpoint congestion management is intended to catalog a set of congestion control mechanisms that transport protocols can use, and to develop a unified congestion control mechanism across a subset of an endpoint's active unicast connections, called a congestion group.
A congestion manager continuously monitors the state of the path for each congestion group under its control, and uses that information to instruct a scheduler on how to partition bandwidth among the connections of that congestion group.

4.6 Overview of ITU Activities Related to Traffic Engineering

This section provides an overview of prior work within the ITU-T pertaining to traffic engineering in traditional telecommunications networks.

ITU-T Recommendations E.600 [itu-e600], E.701 [itu-e701], and E.801 [itu-e801] address traffic engineering issues in traditional telecommunications networks. Recommendation E.600 provides a vocabulary for describing traffic engineering concepts, while E.701 defines reference connections, Grade of Service (GoS), and traffic parameters for ISDN. Recommendation E.701 uses the concept of a reference connection to identify representative cases of different types of connections without describing the specifics of their actual realizations by different physical means. As defined in Recommendation E.600, "a connection is an association of resources providing means for communication between two or more devices in, or attached to, a telecommunication network." Also, E.600 defines "a resource as any set of physically or conceptually identifiable entities within a telecommunication network, the use of which can be unambiguously determined" [itu-e600]. There can be different types of connections, as the number and types of resources in a connection may vary.

Typically, different network segments are involved in the path of a connection. For example, a connection may be local, national, or international. The purpose of reference connections is to clarify and specify traffic performance issues at various interfaces between different network domains. Each domain may consist of one or more service provider networks.
Reference connections provide a basis to define grade of service (GoS) parameters related to traffic engineering within the ITU-T framework. As defined in E.600, "GoS refers to a number of traffic engineering variables which are used to provide a measure of the adequacy of a group of resources under specified conditions." These GoS variables may be probability of loss, dial tone delay, etc. They are essential for network internal design and operation, as well as component performance specification.

In the ITU framework, GoS is different from quality of service (QoS). QoS is the performance perceivable by a user of a telecommunication service and expresses the user's degree of satisfaction with the service. GoS is a set of network oriented measures which characterize the adequacy of a group of resources under specified conditions. QoS parameters, on the other hand, focus on performance aspects which are observable at the service access points and network interfaces, rather than on their causes within the network. For a network to be effective in serving its users, the values of both GoS and QoS parameters must be related, with GoS parameters typically making a major contribution to the QoS.

To assist the network provider in the goal of improving the efficiency and effectiveness of the network, E.600 stipulates that a set of GoS parameters must be selected and defined on an end-to-end basis for each major service category provided by a network. Based on a selected set of reference connections, suitable target values are then assigned to the selected GoS parameters, under normal and high load conditions. These end-to-end GoS target values are then apportioned to individual resource components of the reference connections for dimensioning purposes.

5.0 Taxonomy of Traffic Engineering Systems

This section presents a short taxonomy of traffic engineering systems.
A taxonomy of traffic engineering systems can be constructed based on traffic engineering styles and traffic engineering views. Such a classification system is shown below:

- Time-dependent vs State-dependent
- Offline vs Online
- Centralized vs Distributed
- Local vs Global Information
- Prescriptive vs Descriptive
- Open Loop vs Closed Loop

In the following subsections, these classification systems are described in greater detail.

5.1 Time-Dependent Versus State-Dependent

TE methodologies can be classified into two basic types: time-dependent or state-dependent. In this framework, all TE schemes are considered to be dynamic. Static TE implies that no traffic engineering methodology or algorithm is being applied.

In time-dependent TE, historical information based on seasonal variations in traffic is used to pre-program routing plans. Additionally, customer subscription or traffic projection may be used. Pre-programmed routing plans typically change on a relatively long time scale (e.g., diurnal). Time-dependent algorithms make no attempt to adapt to random variations in traffic or changing network conditions. An example of a time-dependent algorithm is a global centralized optimizer where the input to the system is a traffic matrix and multiclass QoS requirements, as described in [MR99].

State-dependent or adaptive TE adapts the routing plans for packets based on the current state of the network. The current state of the network gives additional information on variations in actual traffic (i.e., perturbations from regular variations) that could not be predicted by using historical information. An example of state-dependent TE that operates on a relatively long time scale is constraint-based routing, and an example that operates on a relatively short time scale is the load-balancing algorithm described in [OMP] and [MATE].
The state of the network can be based on various parameters such as utilization, packet delay, packet loss, etc. These parameters in turn can be obtained in several ways. For example, each router may flood these parameters to other routers, either periodically or in response to some kind of trigger. An alternative approach is for a router that wants to perform adaptive TE to send probe packets along a path to gather the state of that path. Yet another approach is to have a management system gather MIB information from the interfaces. Because of the dynamic nature of network conditions, expeditious and accurate gathering of state information is typically critical to adaptive TE. State-dependent algorithms may be applied to increase network efficiency and resilience. While time-dependent algorithms are more suitable for predictable traffic variations, state-dependent algorithms are more suitable for adapting to the prevailing state of the network.

5.2 Offline Versus Online

Traffic engineering requires the computation of routing plans. The computation itself may be done offline or online. In scenarios where the routing plans do not need to be executed in real time, the computation can be done offline. As an example, routing plans computed from forecast information may be computed offline. Typically, offline computation is also used to perform extensive searches on a multi-dimensional space.

Online computation is required when the routing plans need to adapt to changing network conditions, as in state-dependent algorithms. Unlike offline computation, which can be computationally demanding, online computation is geared toward simple calculations that fine-tune the allocation of resources, as in load balancing.
5.3 Centralized Versus Distributed

With centralized control, there is a central authority which determines routing plans on behalf of each router. The central authority collects the network-state information from all routers, and returns the routing information to the routers periodically. The routing update cycle is a critical parameter which directly impacts the performance of the network being controlled. Centralized control may need high processing power and high bandwidth control channels.

With distributed control, route selection is determined by each router autonomously based on the state of the network. The network state may be obtained by the router using some probing method, or distributed by other routers on a periodic basis.

5.4 Local Versus Global

TE algorithms may require local or global network-state information. It is to be noted that the scope of the network-state information does not necessarily refer to the scope of the optimization. In other words, it is possible for a TE algorithm to perform global optimization based on local state information. Similarly, a TE algorithm may arrive at a locally optimal solution even if it relies on global state information.

Global information pertains to the state of the entire domain that is being traffic engineered. Examples include the traffic matrix, or loading information on each link. Global state information is typically required with centralized control. In some cases, distributed-controlled TE systems may also need global information.

Local information pertains to the state of a portion of the domain. Examples include the bandwidth and packet loss rate of a particular path. Local state information may be sufficient for distributed-controlled TE systems.

5.5 Prescriptive Versus Descriptive

Prescriptive traffic engineering evaluates alternatives and recommends a course of action.
Prescriptive traffic engineering can be further categorized as either corrective or perfective. Corrective TE prescribes a course of action to address an existing or predicted anomaly. Perfective TE prescribes a course of action to evolve and improve network performance even when no anomalies are evident.

Descriptive traffic engineering characterizes the state of the network and assesses the impact of various policies without recommending any particular course of action.

5.6 Open-Loop Versus Closed-Loop

Open-loop control is where the control action does not use any feedback information from the current network state. The control action may, however, use its own local information for accounting purposes.

Closed-loop control is where the control action utilizes feedback information from the network state. The feedback information may be in the form of historical information or current measurements.

6.0 Requirements for Internet Traffic Engineering

This section describes some high-level requirements and recommendations for traffic engineering in the Internet. Because this is a framework document, these requirements are presented in very general terms. Additional documents to follow may elaborate on specific aspects of these requirements in greater detail.

[NOTE: THIS SECTION IS AN INITIAL VERSION OF THE HIGH LEVEL TE REQUIREMENTS. IT WILL BE REVISED OVER TIME TO EXTEND AND REFINE IT.]

6.1 Generic Requirements

Usability: In general, it is desirable to have a TE system that can be readily deployed in an existing network. It is also desirable to have a TE system that is easy to operate and maintain.

Automation: Whenever feasible, a TE system should automate the traffic engineering functions to minimize operator intervention in the control of operational networks.
Scalability: Contemporary public networks are growing very fast with respect to network size and traffic volume. Therefore, a TE system SHOULD be scalable to remain applicable as the network evolves. In particular, a TE system SHOULD remain functional as the network expands with regard to the number of routers and links, and with respect to the traffic volume. A TE system SHOULD have a scalable architecture, SHOULD NOT adversely impair other functions and processes in a network element, and SHOULD NOT consume excessive network resources when collecting and distributing state information or when exerting control.

Stability: Stability is a very important consideration in traffic engineering systems that respond to changes in the state of the network. State-dependent traffic engineering methodologies typically mandate a tradeoff between responsiveness and stability. It is strongly RECOMMENDED that, when tradeoffs are warranted between responsiveness and stability, the tradeoff be made in favor of stability (especially in public IP backbone networks).

Flexibility: A TE system SHOULD be flexible to allow for changes in optimization policy. In particular, a TE system SHOULD provide sufficient configuration options so that a network administrator can tailor the TE system to a particular environment. It may also be desirable to have both online and offline TE subsystems which can be independently enabled and disabled. In multiclass networks, TE systems SHOULD also have options that support class-based performance optimization.

Observability: As part of the TE system, mechanisms SHOULD exist to collect statistics from the network and to analyze them to determine how well the network is functioning. Derived statistics such as traffic matrices, link utilization, latency, packet loss, and other performance measures of interest which are derived from network measurements can be used as indicators of prevailing network conditions.
Other examples of status information which should be observed include existing functional routes and, in the context of MPLS, existing LSP routes.

Simplicity: Generally, a TE system should be as simple as possible consistent with the intended applications. More importantly, the TE system should be relatively easy to use (i.e., have clean, convenient, and intuitive user interfaces). Simplicity in the user interface does not necessarily imply that the TE system will use naive algorithms. Even when complex algorithms and internal structures are used, such complexities should be hidden as much as possible from the network administrator through the user interface.

Congestion management: A TE system SHOULD map the traffic onto the network to minimize congestion. If the total traffic load cannot be accommodated, then a TE system may rely on short time scale congestion control mechanisms to mitigate congestion. A TE system SHOULD be compatible with and complement existing congestion control mechanisms. It is generally desirable to minimize the maximum resource utilization per service in an operational network. Trunk reservation techniques may also be useful in some situations.

Survivability: It is critical for an operational network to recover promptly from network failures and to maintain the required QoS for existing services. Survivability generally mandates introducing redundancy into the architecture, design, and operation of networks. There is a tradeoff between the level of survivability that can be attained and the cost required to attain it.
The time required to restore a network service from a failure depends on several factors, including the particular context in which the failure occurred, the architecture and design of the network, the characteristics of the network elements and network protocols, the applications and services that were impacted by the failure, etc. The extent and impact of service disruptions due to a network failure or outage can vary depending on the length of the outage, the part of the network where the failure occurred, the type and criticality of the network resources that were impaired by the failure, and the types of services that were impacted by the failure (e.g., voice quality degradation may be tolerable for an inexpensive VoIP service, but not for a toll-quality VoIP service).

Survivability can be addressed at the device level by developing network elements that are more reliable, and at the network level by incorporating redundancy into the architecture, design, and operation of networks. It is recommended that a philosophy of robustness and survivability be adopted in the architecture, design, and operation of IP networks (especially public IP networks) and network elements. At the same time, because different contexts may demand different levels of survivability, the mechanisms developed to support network survivability should be flexible so that they can be tailored to different needs.

6.2 Routing Requirements

[NOTE: THIS SECTION IS STILL WORK IN PROGRESS]

Routing control is one of the most significant aspects of Internet traffic engineering. Traditional IGPs which are based on shortest path algorithms have limited control capabilities for traffic engineering. These limitations include:

1. The well known issues with shortest path protocols. Since IGPs always use the shortest paths to forward traffic, load sharing cannot be done among paths of different costs.
Using shortest paths to forward traffic conserves network resources, but it may cause the following problems: 1) if traffic from a source to a destination exceeds the capacity of the shortest path, the shortest path will become congested while a longer path between these two nodes is under-utilized; 2) the shortest paths from different sources can overlap at some links. If the total traffic from different sources exceeds the capacity of any of these links, congestion will occur. Such problems occur because traffic demand changes over time but network topology cannot be changed as rapidly, causing the network architecture to become suboptimal over time.

2. Equal-Cost Multi-Path (ECMP) supports sharing of traffic among equal cost paths between two nodes. However, ECMP attempts to divide the traffic as equally as possible among the equal cost shortest paths. Generally, ECMP does not support configurable load splitting ratios among equal cost paths. The result is that, in the aggregate, one of the paths may carry significantly more traffic than the other paths because it may also carry traffic from other sources.

3. Modifying IGP metrics to control traffic distribution tends to have network-wide effects. Consequently, undesirable and unanticipated traffic shifts can be triggered as a result.

Because of these limitations, new capabilities are needed to control the routing function in IP networks. Some of these capabilities are described below.

Constraint-based routing is highly desirable in IP networks, especially public IP backbones with complex topologies [AWD1]. Constraint-based routing computes routes that fulfil some requirements subject to constraints. Constraints may include bandwidth, hop count, delay, and administrative policy instruments such as resource class attributes [AWD1, RFC-2386].
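One common way to realize such a computation, shown here only as a sketch, is to prune every link that violates a constraint and then run an ordinary shortest path computation on what remains. The function name constrained_route, the link attribute names, and the sample topology are hypothetical. Only a bandwidth constraint and a resource class exclusion are modeled.

```python
import heapq


def constrained_route(links, src, dst, min_bw, excluded_classes=frozenset()):
    """Return the least-metric path from src to dst that meets the constraints.

    links maps a directed link (u, v) to a dict with the keys "metric",
    "reservable_bw", and "resource_class" (hypothetical attribute names).
    Returns a list of nodes, or None if no feasible path exists.
    """
    # Step 1: prune links that cannot satisfy the constraints.
    adj = {}
    for (u, v), attrs in links.items():
        if attrs["reservable_bw"] < min_bw:
            continue
        if attrs["resource_class"] in excluded_classes:
            continue
        adj.setdefault(u, []).append((v, attrs["metric"]))

    # Step 2: Dijkstra on the pruned topology.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            break
        for nbr, metric in adj.get(node, []):
            nd = d + metric
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical topology: a short low-bandwidth path and a longer high-bandwidth one.
LINKS = {
    ("A", "B"): {"metric": 1, "reservable_bw": 10, "resource_class": "gold"},
    ("B", "D"): {"metric": 1, "reservable_bw": 10, "resource_class": "gold"},
    ("A", "C"): {"metric": 2, "reservable_bw": 100, "resource_class": "silver"},
    ("C", "D"): {"metric": 2, "reservable_bw": 100, "resource_class": "silver"},
}

if __name__ == "__main__":
    print(constrained_route(LINKS, "A", "D", min_bw=5))
    print(constrained_route(LINKS, "A", "D", min_bw=50))
```

With a 5-unit requirement the route is the shortest path through B. With a 50-unit requirement the links through B are pruned and the route goes through C, even though that path has a higher metric.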
This makes it possible to select routes that satisfy a given set of requirements subject to network and administrative policy constraints. Routes computed through constraint-based routing are not necessarily the shortest paths. Constraint-based routing works best with path-oriented technologies that support explicit routing, such as MPLS.

Constraint-based routing can also be used as a means to redistribute traffic onto the infrastructure, even for best effort traffic. For example, if the bandwidth constraints of the paths and the reservable bandwidth of the links are set properly, the congestion caused by uneven traffic distribution as described above can be avoided. The performance and resource efficiency of the network are thus improved.

In order to compute routes subject to constraints, a number of enhancements are needed to conventional link state IGPs such as OSPF and IS-IS. The basic extensions required are outlined in [Li-IGP]. Specializations of these requirements to OSPF were described in [KATZ] and to IS-IS in [SMIT]. Essentially, these enhancements require the propagation of additional information in link state advertisements. Specifically, in addition to normal link-state information, an enhanced IGP is required to propagate the topology state information needed for constraint-based routing. This additional topology state information includes link attributes such as: 1) reservable bandwidth, and 2) the link resource class attribute, which is an administratively specified property of the link. The resource class attribute concept was defined in [AWD1]. The additional topology state information is carried in new TLVs or sub-TLVs in IS-IS, or in the Opaque LSA in OSPF [SMIT, KATZ].

An enhanced link-state IGP may flood information more frequently than a normal IGP.
This is because even without changes in topology, changes in reservable bandwidth or link affinity can trigger the enhanced IGP to initiate flooding. In order to avoid consuming excessive link bandwidth and computational resources, a tradeoff is typically required between the timeliness of the information flooded and the flooding frequency.

In a TE system, it is also desirable for the routing subsystem to make the load splitting ratio among multiple paths (with equal cost or different cost) configurable. This capability gives network administrators more flexibility in controlling traffic distribution, and can be very useful for avoiding or relieving congestion in some situations. Examples can be found in [XIAO].

Another desirable feature of the routing system is the capability to control the routes of subsets of traffic without affecting the routes of other traffic, provided that sufficient resources exist for this purpose. This capability allows more refined control over the distribution of traffic across the network. For example, the capability to move traffic from a source to a destination away from its original path to another path, without affecting the paths of other traffic, allows traffic to be moved from resource-poor network segments to resource-rich segments. Path-oriented technologies such as MPLS support this capability naturally.

If the network supports multiple classes of service, the routing subsystem SHOULD have the capability to select different paths for different classes of traffic.

6.3 Traffic Mapping Requirements

Traffic mapping pertains to the assignment of the traffic to the network topology to meet certain requirements and optimize resource usage. Traffic mapping can be performed by time-dependent or state-dependent mechanisms, as described in Section 5.1. A TE system SHOULD support both time-dependent and state-dependent mechanisms.

For the time-dependent mechanism:

- a TE system SHOULD maintain traffic matrices.
- a TE system SHOULD have an algorithm that generates a mapping plan for each traffic trunk.

- in certain environments (e.g., MPLS) a TE system SHOULD be able to control the path from any source to any destination, e.g., with explicit routing.

- a TE system SHOULD be able to set up multiple paths to forward traffic from any source to any destination, and distribute the traffic among them based on a configurable traffic split.

- a TE system SHOULD provide a graceful migration from one mapping plan to another as the traffic matrix changes, to minimize service disruption.

For the state-dependent mechanism:

- a TE system SHOULD be able to gather and maintain link state information, for example, by using enhanced OSPF or IS-IS.

- for a given demand request, QoS requirements, and other constraints, a TE system SHOULD be able to compute and set up a path, for example, by using constraint-based routing.

- a TE system SHOULD be able to perform load balancing among multiple paths. Load balancing SHOULD NOT compromise the stability of the network.

In general, a TE system SHOULD support modification of IGP link metrics to induce changes in the traffic mapping patterns.

6.4 Measurement Requirements

The importance of measurement in traffic engineering has been stated previously. In order to support the traffic engineering function, mechanisms SHOULD be provided to measure and collect statistics from the network. Additional capabilities may be provided to help in the analysis of the statistics. The actions of these mechanisms SHOULD NOT adversely affect the accuracy and integrity of the statistics collected. The mechanisms for statistical data acquisition SHOULD also be able to scale as the network evolves.

Traffic statistics may be classified according to time scales, which may be long-term or short-term. Long-term traffic statistics are very useful for traffic engineering.
Long-term time scale traffic statistics MAY capture or reflect seasonality in network workload (e.g., hourly, daily, and weekly variations in traffic profiles). For a network that supports multiple classes of service, aspects of the monitored traffic statistics MAY also reflect class of service characteristics. Analysis of the long-term traffic statistics MAY yield secondary statistics such as busy hour characteristics, traffic growth patterns, persistent congestion and hot-spot problems within the network, and imbalances in link utilization caused by routing anomalies.

There SHOULD also be a mechanism for constructing traffic matrices for both long-term and short-term traffic statistics. In multiservice IP networks, the traffic matrices MAY also be constructed for different service classes. Each element of a traffic matrix represents a statistic of traffic flow between a pair of abstract nodes. An abstract node may represent a router, a collection of routers, or a site in a VPN.

At the short-term time scale, traffic statistics SHOULD provide reasonable and reliable indicators of the current state of the network. In particular, some traffic statistics SHOULD reflect link utilization, and link and path congestion status. Examples of congestion indicators include excessive packet delay, packet loss, and high resource utilization. Examples of mechanisms for distributing such information include SNMP, probing techniques, FTP, and IGP link state advertisements.

6.5 Network Survivability

Network survivability refers to the capability of the network to maintain service continuity in the presence of failures within the network. This can be accomplished by promptly recovering from network failures and maintaining the required QoS for existing services after recovery.
Survivability has become an issue of great concern to the Internet community with the increasing demands to carry mission critical traffic, real-time traffic, and other high priority traffic over the Internet.

As network technologies continue to improve, failure protection and restoration capabilities have become available from multiple layers. At the bottom of the layered stack, optical networks are now capable of providing dynamic ring and mesh restoration functionality as well as traditional protection functionality. For instance, the SONET/SDH layer provides survivability capability with Automatic Protection Switching (APS), as well as self-healing ring and mesh architectures. Similar functionality is provided by layer 2 technologies such as ATM (generally with slower mean restoration times). At the IP layer, rerouting is used to restore service continuity following link and node outages. Rerouting at the IP layer occurs after a period of routing convergence, which may require seconds to minutes to complete. In order to support advanced survivability requirements, path-oriented technologies such as MPLS can be used to enhance the survivability of IP networks in a potentially cost effective manner. The advantages of path-oriented technologies such as MPLS for IP restoration become even more evident when class based protection and restoration capabilities are required.

Recently, a common suite of control plane protocols has been proposed for both MPLS and optical transport networks under the name Multiprotocol Lambda Switching [AWD6]. This new paradigm of Multiprotocol Lambda Switching will support even more sophisticated mesh restoration capabilities at the optical layer for the emerging IP over WDM network architectures.
Another important aspect regarding multi-layer survivability is that various technologies at different layers provide protection and restoration capabilities at different temporal granularities (i.e., in terms of time scales) and at different bandwidth granularities (from packet level to wavelength level). Protection and restoration capabilities can also be aware or unaware of different service classes.

As noted previously, the impact of service outages varies significantly for different service classes depending on the effective duration of the outage. The duration of an outage can vary from milliseconds (with minor service impact), to seconds (with possible call drops for IP telephony and session time-outs), to minutes and hours (with potentially considerable social and business impact).

Generally, it is a challenging task to coordinate different protection and restoration capabilities across multiple layers in a cohesive manner so as to ensure that network survivability is maintained at reasonable cost. Protection and restoration coordination across layers may not always be feasible because, for example, networks at different layers might belong to different administrative domains.

In the following paragraphs, some of the general requirements for protection and restoration coordination are highlighted.

- Protection/restoration capabilities from different layers SHOULD be coordinated whenever feasible and appropriate in order to provide network survivability in a flexible and cost effective manner. One way to achieve the coordination is to minimize function duplication across layers. Escalation of alarms and other fault indicators from lower layers to higher layers may also be performed in a coordinated manner. A temporal order of restoration trigger timing at different layers is another way to coordinate multi-layer protection/restoration.
- Spare capacity at higher layers is often regarded as working traffic at lower layers. Placing protection/restoration functions in many layers may increase redundancy and robustness, but it SHOULD NOT result in significant and avoidable inefficiencies in network resource utilization.

- It is generally desirable to have a protection/restoration scheme that is bandwidth efficient.

- Failure notification throughout the network SHOULD be timely and reliable.

- Alarms and other fault monitoring and reporting capabilities SHOULD be provided at appropriate layers.

6.5.1 Survivability in MPLS Based Networks

MPLS is an important emerging technology that enhances IP networks in terms of features and services. Because MPLS is path-oriented, it can potentially provide faster and more predictable protection and restoration capabilities than conventional IP systems. This subsection provides an outline of some of the basic features and requirements of MPLS networks regarding protection and restoration. A number of Internet-Drafts also discuss protection and restoration issues in MPLS networks (see, e.g., [ACJ99], [MSOH99], and [Shew99]).

Protection types for MPLS networks can be categorized into link protection, node protection, path protection, and segment protection, as discussed below.

- Link Protection: The goal of link protection is to protect an LSP from a given link failure. Under link protection, the path of the protect or backup LSP (also called the secondary LSP) is disjoint from the path of the working or operational LSP at the particular link over which protection is required. When the protected link fails, traffic on the working LSP is switched over to the protect LSP at the head-end of the failed link. This is a local repair method which can be potentially fast.
It might be more appropriate in situations where some network elements along a given path are less reliable than others.

- Node Protection: The goal of LSP node protection is to protect an LSP from a given node failure. Under node protection, the path of the protect LSP is disjoint from the path of the working LSP at the particular node that is to be protected. The secondary path is also disjoint from the primary path at all links associated with the node to be protected. When the node fails, traffic on the working LSP is switched over to the protect LSP at the upstream LSR that directly connects to the failed node.

- Path Protection: The goal of LSP path protection is to protect an LSP from failure at any point along its routed path. Under path protection, the path of the protect LSP is completely disjoint from the path of the working LSP. The advantage of path protection is that the protect LSP protects the working LSP from all possible link and node failures along the path, except for failures that might occur at the ingress and egress LSRs. Additionally, since the path selection is end-to-end, path protection might be more efficient in terms of resource usage than link or node protection. However, in general, path protection may be slower than link and node protection.

- Segment Protection: In some cases, an MPLS domain may be partitioned into multiple protection domains whereby a failure in a protection domain is rectified within that domain. In cases where an LSP traverses multiple protection domains, a protection mechanism within a domain only needs to protect the segment of the LSP that lies within the domain. Segment protection will generally be faster than path protection because recovery generally occurs closer to the fault.

Protection option:

Another issue to consider is the concept of protection options. It can be described in general using the notation m:n protection, where m is the number of protect LSPs used to protect n working LSPs.
In the following, some feasible protection options are described.

- 1:1: one working LSP is protected/restored by one protect LSP;

- n:1: one working LSP is protected/restored by n protect LSPs, perhaps with a configurable load splitting ratio. In situations where more than one protect LSP is used, it may be desirable to share the traffic across the protect LSPs when the working LSP fails, so as to satisfy the bandwidth requirement of the traffic trunk associated with the working LSP, especially when it may not be feasible to find one path that can satisfy the bandwidth requirement of the primary LSP;

- 1:n: one protection LSP is used to protect/restore n working LSPs;

- 1+1: traffic is sent concurrently on both the working LSP and the protect LSP. In this case, the egress LSR selects one of the two LSPs based on some local traffic integrity decision process. This option would probably not be used pervasively in IP networks due to its inefficiency in terms of resource utilization.

Resilience Attributes:

- Basic attribute: reroute using IGP or protection LSP(s) when a segment of the working path fails, or no rerouting at all.

- Extended attributes:

1. Protection LSP establishment attribute: the protection LSP is i) pre-established, or ii) established-on-demand after receiving failure notification. A pre-established protection LSP can be faster, while an established-on-demand one can potentially find a more optimal path with more efficient resource usage.

2. Constraint attribute under failure condition: the protection LSP requires certain constraint(s) to be satisfied, which can be the same as or less than the ones under normal conditions (e.g., the bandwidth requirement), or it may use a 0-bandwidth requirement under any failure condition.

3.
Protection LSP resource reservation attribute: the resources of a pre-established protection LSP are either i) pre-reserved, or ii) reserved-on-demand after receiving failure notification.

A pre-established and pre-reserved protection LSP can guarantee that the QoS of existing services is maintained upon failure, whereas a pre-established and reserved-on-demand one, or an established-on-demand one, may not be able to. It is also the fastest of the three: traffic can be switched onto the protection LSP as soon as the ingress LSR receives the failure notification message, without any delay for resource availability checking or protection LSP establishment. However, a pre-established protection LSP may not be able to adapt to changes in the network that occur after its establishment, even if such a change makes a better path available. In addition, the bandwidth reserved on the protection LSP is subtracted from the available bandwidth pool on all associated links, and is therefore not available for admitting new LSPs in the future. On the other hand, this differs from SONET protection in that the reserved bandwidth does not sit idle; it can be used by any traffic present on those links.

Comparing a pre-established, reserved-on-demand protection LSP with an established-on-demand one, the former is potentially faster since it only needs to check whether the requested bandwidth is available on the pre-established path, without waiting for the path to be set up. However, if the requested bandwidth is not available on the pre-established path, an established-on-demand protection LSP may be used as a second option.

Failure Notification:

Failure notification SHOULD be reliable and fast enough, i.e., at least on the same order as IGP notification (which occurs through LSA flooding), if not faster.
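As an illustration of the path protection type and the constraint attribute under failure condition described in this subsection, the following is a minimal sketch in Python and not a recommended algorithm. It first selects a working path over links with sufficient reservable bandwidth, then selects a protect path that avoids every link and transit node of the working path, possibly with a reduced bandwidth constraint. The function names, the link table layout, and the sample topology are hypothetical and introduced only for this example. The two-step greedy selection is a simplification: it can fail to find a disjoint pair in topologies where a joint computation would succeed.

```python
import heapq


def constrained_shortest_path(links, src, dst, min_bw,
                              banned_nodes=(), banned_links=()):
    """Return the least-cost path from src to dst using only links whose
    reservable bandwidth is at least min_bw, or None if no path exists.

    links maps an undirected node pair (a, b) to (cost, reservable_bw).
    This layout is an assumption of this sketch.
    """
    banned_nodes = set(banned_nodes)
    banned_links = {frozenset(pair) for pair in banned_links}

    # Prune links that violate the constraints, then build adjacency lists.
    adjacency = {}
    for (a, b), (cost, bw) in links.items():
        if bw < min_bw or frozenset((a, b)) in banned_links:
            continue
        if a in banned_nodes or b in banned_nodes:
            continue
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    # Plain Dijkstra on the pruned topology.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adjacency.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def path_protection(links, src, dst, working_bw, protect_bw):
    """Pick a working path, then a protect path that shares no link and
    no transit node with it (ingress and egress are necessarily shared)."""
    working = constrained_shortest_path(links, src, dst, working_bw)
    if working is None:
        return None, None
    protect = constrained_shortest_path(
        links, src, dst, protect_bw,
        banned_nodes=working[1:-1],
        banned_links=list(zip(working, working[1:])),
    )
    return working, protect


# Hypothetical topology: (node, node) -> (IGP cost, reservable bandwidth).
links = {
    ("A", "B"): (1, 100),
    ("B", "D"): (1, 100),
    ("A", "C"): (2, 50),
    ("C", "D"): (2, 50),
    ("B", "C"): (1, 100),
}

# The protect LSP is given a smaller bandwidth constraint than the
# working LSP, as permitted by the constraint attribute.
working, protect = path_protection(links, "A", "D",
                                   working_bw=80, protect_bw=40)
print(working, protect)
```

In this sample topology, requesting the same bandwidth for the protect LSP as for the working LSP (80 units) yields no disjoint protect path, which illustrates why a reduced constraint under failure conditions can be useful.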
6.6 Content Distribution (Webserver) Requirements

The Internet is dominated by client-server interactions, especially Web traffic. The location of major information servers has a significant impact on the traffic patterns within the Internet, and on the perception of service quality by end users.

A number of dynamic load balancing techniques have been devised to improve the performance of replicated Web servers. The impact of these techniques is that the traffic becomes more dynamic in the Internet, because Web servers can be dynamically picked based on the locations of the clients, and the relative performance of different networks or different parts of a network. This process can be called Traffic Directing (TD). It is similar to Traffic Engineering but operates at the application layer.

Scheduling systems in TD that allocate servers to clients in replicated, geographically dispersed information distribution systems may require performance parameters of the network to make effective decisions. It is desirable that the TE system provide such information. The exact parameters needed are to be defined.

When there is congestion in the network, the TD and TE systems SHOULD act in a coordinated manner. This topic is for further study.

Because TD can introduce more traffic dynamics into a network, network planning SHOULD take this into consideration. It can be desirable to reserve a certain amount of extra capacity on the links to accommodate this additional traffic fluctuation.

6.7 Offline Traffic Engineering Support Systems

If optimal link efficiency is desired, an offline and centralized traffic engineering support system MAY be provided as an integral part of an overall TE system. An offline and centralized traffic engineering support system can be used to compute the paths for the traffic trunks.
By taking all the trunk requirements, link attributes, and network topology information into consideration, an offline TE support system can typically find a better trunk placement than an online TE system, where every router in the network finds paths originating from it in a distributed manner based on its own information. An offline TE support system may compute paths for trunks periodically, e.g., daily, for the purpose of re-optimization. The computed paths can then be downloaded into the routers. An online TE support system is still needed, so that routers can adapt to changes promptly.

6.8 Traffic Engineering in Diffserv Environments

[NOTE: THIS SECTION IS WORK IN PROGRESS AND WILL BE UPDATED IN THE NEXT VERSION OF DRAFT]

Traffic engineering will be very important in Diffserv environments. This section describes the traffic engineering features and requirements that are specifically pertinent to Differentiated Services (Diffserv) capable IP networks.

7.0 Multicast Considerations

For further study.

8.0 Inter-Domain Considerations

Inter-domain traffic engineering is concerned with the performance optimization for traffic that originates in one administrative domain and terminates in a different one.

Traffic exchange between autonomous systems occurs through exterior gateway protocols. Currently, BGP-4 [RFC-1771] is the de facto EGP standard. Traditionally, in the public Internet, BGP based policies are used to control import and export policies for inter-domain traffic. BGP policies are also used to determine exit and entrance points to and from peer networks.

Inter-domain TE is inherently more difficult than intra-domain TE. The reasons for this are both technical and administrative. Technically, the current version of BGP does not propagate topology and link state information across domain boundaries.
Administratively, there are differences in operating costs and network capacities between domains, and what may be considered a good solution in one domain may not necessarily be a good solution in another domain. Moreover, it would generally be considered imprudent for one domain to permit another domain to influence the routing and control of traffic in its network.

When Diffserv becomes widely deployed, inter-domain TE will become even more important, but also more challenging to address.

MPLS TE-tunnels (explicit LSPs) add a degree of flexibility in terms of selection of exit points for inter-domain routing. The concept of relative and absolute metrics was defined in [SHEN]. If the BGP attributes are defined such that the BGP decision process depends on IGP metrics to select exit points for inter-domain traffic, then some inter-domain traffic destined to a given peer network can be made to prefer a given exit point by establishing a TE-tunnel from the router making the selection to the peering point, and assigning the TE-tunnel a metric which is smaller than the IGP cost to all other peering points. If a peer accepts and processes MEDs, then a similar MPLS TE-tunnel based scheme can be applied to cause certain entrance points to be preferred, by setting the MED to the IGP cost as modified by the tunnel metric.

Similar to intra-domain TE, inter-domain TE is best accomplished when a traffic matrix for inter-domain traffic can be derived.

Generally, redistribution of inter-domain traffic requires coordination between peering partners. Any export policy in one domain that results in load redistribution across peer points can significantly affect the traffic distribution inside the domain of the peering partner. This, in turn, will affect the intra-domain TE due to changes in the intra-domain traffic matrix.
Therefore, it is critical for peering partners to negotiate and coordinate with each other before attempting any policy changes that may result in significant shifts in inter-domain traffic. In practice, this coordination can be quite challenging for technical and non-technical reasons.

It is a matter of speculation as to whether MPLS, or similar technologies, can be extended to allow selection of constrained paths across domain boundaries.

9.0 Conclusion

This document described a framework for traffic engineering in the Internet. It presented an overview of some of the basic issues surrounding traffic engineering in IP networks. The context of TE was described, and a TE process model and a taxonomy of TE styles were presented. A brief historical review of pertinent developments related to traffic engineering was provided. Finally, the document specified a set of generic requirements, recommendations, and options for Internet traffic engineering.

10.0 Security Considerations

This document does not introduce new security issues.

11.0 Acknowledgments

The authors would like to thank Jim Boyle for inputs on the requirements section, Francois Le Faucheur for inputs on class-type, and Gerald Ash for inputs on routing in telephone networks. The subsection describing an "Overview of ITU Activities Related to Traffic Engineering" was adapted from a contribution by Waisum Lai.

12.0 References

[ACJ99] L. Anderson, B. Cain, and B. Jamoussi, "Requirement Framework for Fast Re-route with MPLS", Work in Progress, October 1999.

[ASH1] J. Ash, M. Girish, E. Gray, B. Jamoussi, G. Wright, "Applicability Statement for CR-LDP", Work in Progress, 1999.

[ASH2] J. Ash, Dynamic Routing in Telecommunications Networks, McGraw Hill, 1998.

[AWD1] D. Awduche, J. Malcolm, J. Agogbua, M. O'Dell, J. McManus, "Requirements for Traffic Engineering over MPLS", RFC 2702, September 1999.

[AWD2] D.
Awduche, "MPLS and Traffic Engineering in IP Networks", IEEE Communications Magazine, December 1999.

[AWD3] D. Awduche, L. Berger, D. Gan, T. Li, G. Swallow, and V. Srinivasan, "Extensions to RSVP for LSP Tunnels", Work in Progress, 1999.

[AWD4] D. Awduche, A. Hannan, X. Xiao, "Applicability Statement for Extensions to RSVP for LSP-Tunnels", Work in Progress, 1999.

[AWD5] D. Awduche et al., "An Approach to Optimal Peering Between Autonomous Systems in the Internet", International Conference on Computer Communications and Networks (ICCCN'98), October 1998.

[AWD6] D. Awduche, Y. Rekhter, J. Drake, R. Coltun, "Multiprotocol Lambda Switching: Combining MPLS Traffic Engineering Control with Optical Crossconnects", Work in Progress, 1999.

[CAL] R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow, A. Viswanathan, "A Framework for Multiprotocol Label Switching", Work in Progress, 1999.

[FGLR] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J. Rexford, "NetScope: Traffic Engineering for IP Networks", to appear in IEEE Network Magazine, 2000.

[FlJa93] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking, Vol. 1, No. 4, August 1993, p. 387-413.

[Floy94] S. Floyd, "TCP and Explicit Congestion Notification", ACM Computer Communication Review, V. 24, No. 5, October 1994, p. 10-23.

[HuSS87] B.R. Hurley, C.J.R. Seidl and W.F. Sewel, "A Survey of Dynamic Routing Methods for Circuit-Switched Traffic", IEEE Communication Magazine, Sep 1987.

[itu-e600] ITU-T Recommendation E.600, "Terms and Definitions of Traffic Engineering", March 1993.

[itu-e701] ITU-T Recommendation E.701, "Reference Connections for Traffic Engineering", October 1993.

[JAM] B. Jamoussi, "Constraint-Based LSP Setup using LDP", Work in Progress, 1999.

[Li-IGP] T. Li, G. Swallow, and D.
Awduche, "IGP Requirements for Traffic Engineering with MPLS", Work in Progress, 1999.

[LNO96] T. Lakshman, A. Neidhardt, and T. Ott, "The Drop from Front Strategy in TCP over ATM and its Interworking with other Control Features", Proc. INFOCOM'96, p. 1242-1250.

[MATE] I. Widjaja and A. Elwalid, "MATE: MPLS Adaptive Traffic Engineering", Work in Progress, 1999.

[McQ80] J.M. McQuillan, I. Richer, and E.C. Rosen, "The New Routing Algorithm for the ARPANET", IEEE Trans. on Communications, vol. 28, no. 5, pp. 711-719, May 1980.

[MR99] D. Mitra and K.G. Ramakrishnan, "A Case Study of Multiservice, Multipriority Traffic Engineering Design for Data Networks", Proc. Globecom'99, Dec 1999.

[MSOH99] S. Makam, V. Sharma, K. Owens, C. Huang, "Protection/Restoration of MPLS Networks", Work in Progress, October 1999.

[OMP] C. Villamizar, "MPLS Optimized OMP", Work in Progress, 1999.

[RFC-1349] P. Almquist, "Type of Service in the Internet Protocol Suite", RFC 1349, Jul 1992.

[RFC-1458] R. Braudes, S. Zabele, "Requirements for Multicast Protocols", RFC 1458, May 1993.

[RFC-1771] Y. Rekhter and T. Li, "A Border Gateway Protocol 4 (BGP-4)", RFC 1771, March 1995.

[RFC-1812] F. Baker (Editor), "Requirements for IP Version 4 Routers", RFC 1812, June 1995.

[RFC-1997] R. Chandra, P. Traina, and T. Li, "BGP Community Attributes", RFC 1997, August 1996.

[RFC-1998] E. Chen and T. Bates, "An Application of the BGP Community Attribute in Multi-home Routing", RFC 1998, August 1996.

[RFC-2178] J. Moy, "OSPF Version 2", RFC 2178, July 1997.

[RFC-2205] R. Braden, et al., "Resource Reservation Protocol (RSVP) - Version 1 Functional Specification", RFC 2205, September 1997.

[RFC-2211] J. Wroclawski, "Specification of the Controlled-Load Network Element Service", RFC 2211, Sep 1997.

[RFC-2212] S. Shenker, C. Partridge, R.
Guerin, "Specification of Guaranteed Quality of Service", RFC 2212, September 1997.

[RFC-2215] S. Shenker, and J. Wroclawski, "General Characterization Parameters for Integrated Service Network Elements", RFC 2215, September 1997.

[RFC-2216] S. Shenker, and J. Wroclawski, "Network Element Service Specification Template", RFC 2216, September 1997.

[RFC-2330] V. Paxson et al., "Framework for IP Performance Metrics", RFC 2330, May 1998.

[RFC-2386] E. Crawley, R. Nair, B. Rajagopalan, and H. Sandick, "A Framework for QoS-based Routing in the Internet", RFC 2386, Aug. 1998.

[RFC-2475] S. Blake et al., "An Architecture for Differentiated Services", RFC 2475, Dec 1998.

[RFC-2597] J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski, "Assured Forwarding PHB Group", RFC 2597, June 1999.

[RFC-2678] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring Connectivity", RFC 2678, Sep 1999.

[RFC-2679] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay Metric for IPPM", RFC 2679, Sep 1999.

[RFC-2680] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Packet Loss Metric for IPPM", RFC 2680, Sep 1999.

[RFC-2722] N. Brownlee, C. Mills, and G. Ruth, "Traffic Flow Measurement: Architecture", RFC 2722, Oct 1999.

[RoVC] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label Switching Architecture", Work in Progress, 1999.

[Shew99] S. Shew, "Fast Restoration of MPLS Label Switched Paths", draft-shew-lsp-restoration-00.txt, October 1999.

[SLDC98] B. Suter, T. Lakshman, D. Stiliadis, and A. Choudhury, "Design Considerations for Supporting TCP with Per-flow Queueing", Proc. INFOCOM'99, 1998, p. 299-306.

[XIAO] X. Xiao, A. Hannan, B. Bailey, L. Ni, "Traffic Engineering with MPLS in the Internet", IEEE Network Magazine, March 2000.

[YaRe95] C. Yang and A. Reddy, "A Taxonomy for Congestion Control Algorithms in Packet Switching Networks", IEEE Network Magazine, 1995, p. 34-45.
[SMIT] H. Smit and T. Li, "IS-IS extensions for Traffic Engineering", Internet Draft, Work in Progress, 1999.

[KATZ] D. Katz, D. Yeung, "Traffic Engineering Extensions to OSPF", Internet Draft, Work in Progress, 1999.

[SHEN] N. Shen and H. Smit, "Calculating IGP routes over Traffic Engineering tunnels", Internet Draft, Work in Progress, 1999.

13.0 Authors' Addresses

Daniel O. Awduche
UUNET (MCI Worldcom)
22001 Loudoun County Parkway
Ashburn, VA 20147
Phone: 703-886-5277
Email: awduche@uu.net

Angela Chiu
AT&T Labs
Room C4-3A22
200 Laurel Ave.
Middletown, NJ 07748
Phone: (732) 420-2290
Email: alchiu@att.com

Anwar Elwalid
Lucent Technologies
Murray Hill, NJ 07974, USA
Phone: 908 582-7589
Email: anwar@lucent.com

Indra Widjaja
Fujitsu Network Communications
Two Blue Hill Plaza
Pearl River, NY 10965, USA
Phone: 914-731-2244
Email: indra.widjaja@fnc.fujitsu.com

Xipeng Xiao
Global Crossing
141 Caspian Court
Sunnyvale, CA 94089
Email: xipeng@globalcenter.net
Voice: +1 408-543-4801