Internet-Draft                                              P. Vaananen
Expiration Date: September, 1998               Nokia Telecommunications
                                                           R. Ravikanth
                                                  Nokia Research Center
                                                            March, 1998


           Framework for Traffic Management in MPLS Networks

               <draft-vaananen-mpls-tm-framework-00.txt>


STATUS OF THIS MEMO

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress".

To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

ABSTRACT

It has been recognised that the success of MPLS depends on its ability to better support multiservice traffic integration with some level of service guarantees, which is not feasible with the current packet forwarding paradigm based only on destination prefixes. Efficient support for these services throughout the network is expected to be possible using the label based forwarding paradigm. However, the service categories and the mechanisms that enable them are not well addressed in the current proposals to the MPLS working group; the effort has mostly concentrated on the handling of best effort traffic and on the associated scalability and routing issues. The goal of this document is to define a framework for traffic management in MPLS networks.
We discuss the set of mechanisms that have been proposed for implementing services more advanced than pure best-effort packet forwarding, and the impact of those mechanisms on MPLS network environments and MPLS protocol implementation. The document describes the mechanisms and their application with the intent of approaching, using MPLS, the level of traffic management capability that is currently available in hybrid router/ATM or frame relay networks. The approach taken is that no modifications are required in the end station protocol or application software in the first phase of deployment, although such modifications might be allowed later if deemed necessary. This document concentrates on the issues from the public network operator's point of view, although most of the discussion applies equally to local network environments.

P.Vaananen, R. Ravikanth                                        [Page 1]
Internet-Draft      Framework for TM in MPLS Networks         March 1998

Concepts and mechanisms described in this document are based on previous work on the subject in various working groups of the IETF and in other standardisation bodies. We have attempted to use applicable concepts and terminology from previous work as much as possible. This document concentrates on the MPLS specific issues; a number of related mechanisms and concepts are only briefly presented for the sake of completeness, and other related work is referenced where applicable. The reader is advised to consult the referenced material for more information on these areas.

ACKNOWLEDGEMENTS

The ideas presented in this document are based on information collected from a number of sources.

--- Individuals to be added ---

1. TABLE OF CONTENTS

STATUS OF THIS MEMO
ABSTRACT
ACKNOWLEDGEMENTS
1.      TABLE OF CONTENTS
2.      INTRODUCTION
3.      SERVICE CATEGORIES
3.1     BEST EFFORT SERVICES
3.1.1   Enhanced best effort service
3.1.2   Enhanced best effort service with bandwidth allocation
3.1.3   Enhanced best effort services in MPLS environments
3.2     DIFFERENTIATED SERVICES
3.2.1   Differentiated service
3.2.2   Differentiated services with bandwidth allocations
3.2.3   Differentiated services in MPLS environments
3.3     GUARANTEED SERVICES
3.3.1   Services
3.3.2   Guaranteed services in MPLS environments
4.      TRAFFIC MANAGEMENT REQUIREMENTS
4.1     SERVICE CATEGORY SUPPORT
4.2     ADMISSION CONTROL, MONITORING AND SECURITY
4.3     CONGESTION MANAGEMENT
4.4     SCALABILITY REQUIREMENTS
4.5     ROBUSTNESS AND RELIABILITY
4.6     TOPOLOGY SUPPORT
4.7     TOPOLOGICAL SCOPE
4.8     COMPATIBILITY
4.9     EXTENSIBILITY
5.      CONTROL PLANE MECHANISMS FOR TRAFFIC MANAGEMENT FUNCTIONS
5.1     TRIGGERS
5.1.1   Configuration events
5.1.2   Signaling events
5.1.3   Topology changes
5.1.4   Traffic pattern changes
5.2     POLICY AND ADMISSION CONTROL
5.2.1   Routing policy
5.2.2   Classification policy
5.2.3   Admission policy
5.2.4   Admission control
5.3     PATH SELECTION
5.4     ACCOUNTING
5.5     USER AUTHENTICATION
6.      DATA PLANE MECHANISMS FOR TRAFFIC MANAGEMENT FUNCTIONS
6.1     LABEL FORWARDING PARADIGM
6.2     CLASSIFICATION
6.2.1   What is classification and where it should be done
6.2.2   Flow Classification
6.2.3   Packet Classification
6.2.4   Classification results for differentiated services
6.2.5   Classification results for guaranteed services
6.2.6   Problems with non end-system classifications
6.2.6.1 Classification in presence of IPSEC
6.2.6.2 Classification in presence of dynamic address assignment
6.2.6.3 Classification in presence of dynamic port numbers
6.2.7   Classification state maintenance
6.3     POLICING
6.4     MAPPING
6.4.1   Direct mapping
6.4.2   Indirect mapping
6.5     AGGREGATION, MERGING AND DEAGGREGATION
6.5.1   Aggregation
6.5.2   Merging
6.5.3   Aggregation and merging of traffic with service guarantees
6.5.4   Deaggregation
6.6     QUEUING AND CONGESTION MANAGEMENT
6.6.1   Queue management
6.6.2   Queuing principles
6.6.3   Congestion control
6.6.3.1 Passive congestion control schemes
6.6.3.2 Active congestion control schemes
6.6.4   Packet scheduling
6.7     TRAFFIC SHAPING
6.8     LOAD SHARING
7.      LABEL SWITCHED PATH GRANULARITIES AND AGGREGATION
8.      LABEL SWITCHED PATH TOPOLOGIES AND ASSOCIATED TM PROCEDURES
8.1     POINT-TO-POINT
8.2     POINT-TO-MULTIPOINT
8.3     MULTIPOINT-TO-POINT
8.4     MULTIPOINT-TO-MULTIPOINT
8.5     MULTILEVEL PATHS
9.      NETWORK FUNCTIONAL PARTITIONING
9.1     NETWORK MODELS
9.2     NETWORK ELEMENT CATEGORIES
9.2.1   Hosts
9.2.1.1 Enhanced best effort services
9.2.1.2 Differentiated services
9.2.1.3 Guaranteed services
9.2.1.4 Participation in MPLS
9.2.2   MPLS edge nodes
9.2.2.1 Best effort services to customer
9.2.2.2 Differentiated services to customer
9.2.2.3 Guaranteed services to customer
9.2.2.4 MPLS to customer
9.2.3   MPLS core node
9.3     INTERFACE CATEGORIES
9.3.1   Interface to non-MPLS networks
9.3.2   Interface inside MPLS network domains
9.3.3   Interface between MPLS network domains
10.     LSP MAPPINGS TO EXISTING LINK LAYER TECHNOLOGIES
11.     GENERAL REQUIREMENTS FOR LABEL ENCAPSULATIONS
11.1    DIFFERENTIATED SERVICES SUPPORT
11.2    CONGESTION MANAGEMENT SUPPORT
11.2.1  Congestion indicator bit
11.2.2  Examine me bit
11.3    SUPPORT FOR MULTILEVEL LABEL SWITCHED PATHS
12.     GENERAL REQUIREMENTS FOR DISTRIBUTION OF LABELS AND TM ATTRs
12.1    SETUP REQUEST
12.2    SETUP MODIFICATION
12.3    SETUP ACKNOWLEDGE
12.4    SETUP REJECT
12.5    DISCUSSION OF SIGNALING PROTOCOLS
12.5.1  General
12.5.2  LDP
12.5.3  RSVP
13.     REFERENCES
14.     SECURITY CONSIDERATIONS
15.     AUTHOR'S ADDRESSES

2. INTRODUCTION

The ability of the network to support service level guarantees and traffic engineering is becoming very important. This area has been, and will remain, a subject addressed in various working groups of the IETF (e.g. INTSERV, RSVP, ISSLL, RAP, DIFFSERV, IPPM, QOSR), the IRTF (E2E), the ATM Forum (TM), the Frame Relay Forum, the ITU-T, and various other organisations and user consortiums. We build on the ideas and previous work of these groups, and try to assemble a coherent set of capabilities around the label based packet forwarding technology discussed in the MPLS working group of the IETF, as described in the MPLS framework document [Callon97] and the MPLS architecture document [Rosen97a]. The approach taken in this document is to look at the available pieces and try to fit them onto the MPLS framework in a scalable fashion. This document presents a requirements and implementation framework, in the context of MPLS, for the services and capabilities that need to be built.
Possible mechanisms and deployment scenarios for achieving these advanced services are also described.

The document takes an evolutionary rather than a revolutionary approach. We do not propose to change everything at once (and do not believe it is possible), as previous attempts to do so have quite consistently failed. The focus is on two questions: what should be done to improve the quality of the network service perceived by the end user, and how to maximise the usage of network resources while doing so in a scalable and controlled manner. We feel it is especially important that deployment of the technologies presented can be started on a small scale, and without changes to the host communication and application protocols. At the same time, this framework attempts to be flexible enough to accommodate such changes when the technology matures and incremental deployment is determined to be feasible and necessary.

We hope to evolve the technologies and protocols of MPLS towards supporting the capabilities outlined in this document, but realise that much more detailed discussion, research and specification work needs to be done before the complete set of "wishes" can be accomplished.

3. SERVICE CATEGORIES

The advanced services requiring the use of traffic management mechanisms can be broadly divided into three categories on the basis of (i) the level of assurance of the service guarantees that can be achieved and (ii) the granularity of the guarantees (simple to complex) that are provided. This division is made here to support the discussion of the related traffic management issues. The characteristics of the different service categories are briefly described in chapters 3.1 to 3.3.

3.1 Best effort services

3.1.1 Enhanced best effort service

The service remains similar to the current best effort service, but with a higher service quality perceived by the end-user, regardless of the applications used. Enhanced best effort service can be realised without specific signalling protocols inside the network. This service differs from "plain old best effort" in its use of advanced congestion control mechanisms, whose purpose is to provide more controlled and fairer behaviour during periods of congestion. Passive congestion control mechanisms based on packet drop policies, such as random early detection [Floyd93], [Braden97], can be used. In addition to passive congestion control mechanisms, active congestion control mechanisms based on congestion feedback and transport protocol interactions have also been suggested [Ramakr97], [Packeteer97], [Jagan97].

This service can be implemented in any router that supports the appropriate traffic management mechanisms. The use of the label based forwarding paradigm adds traffic engineering capabilities for the network operator, such as better ways to control path selection for the traffic.

3.1.2 Enhanced best effort service with bandwidth allocation

The enhanced best effort service augmented with a bandwidth allocation capability allows an operator to optimise network capacity usage and to manage bandwidth usage by allocating it to individual users, networks, or any aggregated community as desired. These services generally require a specific signalling protocol for communicating the related traffic management attributes through the network.

3.1.3 Enhanced best effort services in MPLS environments

Basic enhanced best effort service does not generally require per-flow state to be maintained in the network elements; the goal is to support fair usage of resources inside the network.

MPLS enables congestion indication to be carried over the LSP so that the LSP endpoints can react to congestion. In addition, the congestion indication can be monitored at the LSP endpoints, and information that congestion exceeds some predetermined threshold can be used, e.g., to initiate re-evaluation of the LSP path selection.

In environments where bandwidth allocations are used, any required traffic management attributes are generally applied to aggregated streams. The label based forwarding paradigm adds easy-to-implement capabilities for allocating bandwidth to aggregated best effort traffic streams, and provides ways to communicate these allocations through the network.

Generally, enhanced best effort approaches rely on the interactions of the network with end-to-end protocols (e.g. intelligent drop policies) to reduce the load at times of congestion. Common practice at the moment is to use FIFO type queuing. Together with the bandwidth allocation capabilities, path selection mechanisms such as explicit label switched paths provide efficient capabilities for network traffic engineering.

3.2 Differentiated services

3.2.1 Differentiated service

Differentiated services are currently being specified in the IETF DIFFSERV working group. The work is in an early phase, and there are several different proposed approaches. Differentiated services, as proposed, allow traffic to be classified into a finite number of priority and/or delay classes. Traffic classified into a higher priority and/or delay class receives some form of preferential treatment over traffic classified into a lower class.
Differentiated service does not attempt to give explicit end-to-end guarantees over the network. Instead, in congested network elements, traffic in a higher priority class has a higher probability of getting through or, in the case of delay priority, is scheduled for transmission before traffic that is not delay sensitive.

Differentiated service packet classification can be performed in the hosts, in CPE routers, or in the border routers of the operator network. The information required to perform the actual differentiation in the network elements will be carried in the TOS field of IPv4 packets, referred to as the DS-byte in the differentiated service operational model document [Nichols98]. Thus, as the information required by the buffer management and scheduling algorithms is carried inside the packet, differentiated services do not necessarily require signalling protocols to control the mechanisms that select different treatment for individual packets.

Differentiated services can be implemented in any router that supports the appropriate traffic management mechanisms.

3.2.2 Differentiated services with bandwidth allocations

In addition to the basic functionality provided by differentiated services, a bandwidth allocation capability allows the network operator to allocate the desired bandwidth to the switched paths carrying the differentiated services over the network domain. Depending on how the differentiated service allocations are implemented, the operator can either control the bandwidth share given to each priority class separately, or allocate bandwidth to the differentiated service class paths as a whole and implement differentiation on the basis of the capability of the resulting virtual path.
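
As an illustration of the packet-carried classification described in chapter 3.2.1, the following minimal sketch reads the TOS octet (the DS-byte) from an IPv4 header and maps it to a local service class. The only fact relied upon is that the TOS field is the second octet of the IPv4 header; the class names and the DS-byte values in the table are hypothetical placeholders, since this document does not define any code points.

```python
# Hypothetical mapping from DS-byte value to a local service class.
# These values and names are placeholders for illustration only.
DS_BYTE_TO_CLASS = {
    0x00: "best-effort",
    0x20: "low-delay",
    0x40: "high-priority",
}

DEFAULT_CLASS = "best-effort"


def ds_byte(ipv4_header: bytes) -> int:
    """Return the TOS octet (second octet) of an IPv4 header."""
    if len(ipv4_header) < 20:
        raise ValueError("an IPv4 header is at least 20 octets long")
    if ipv4_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return ipv4_header[1]


def classify(ipv4_header: bytes) -> str:
    """Select a service class from the packet alone, with no per-flow state."""
    return DS_BYTE_TO_CLASS.get(ds_byte(ipv4_header), DEFAULT_CLASS)


if __name__ == "__main__":
    # Minimal 20-octet header: version 4, header length 5 words, TOS 0x20.
    header = bytes([0x45, 0x20]) + bytes(18)
    print(classify(header))
```

Because the class is derived from the packet itself, a node using such a table needs no signalling to select the treatment of an individual packet; signalling would only be needed to agree on the table contents or on bandwidth allocations.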

3.2.3 Differentiated services in MPLS environments

Generally no per-flow state is maintained in the network elements; the goal is to support a small, fixed number of service categories. The per-stream attributes distributed using the label distribution mechanisms can include the differentiated service category associated with the LSP.

One or more queues with a simple service policy are used. Where multiple queues are used to support delay prioritisation, the scheduling mechanism ensures that the low delay classes are served first. Weighted scheduling mechanisms may be used instead of strict priority scheduling to ensure that the lower classes cannot suffer from starvation.

The support of differentiated services in MPLS environments requires either signalling support for associating the desired category with the label, or that each packet carries the information on the desired service category.

MPLS allows bandwidth to be allocated to differentiated services in conjunction with other services in a controlled manner. This allows the operator to divide the available bandwidth between the differentiated service category and other categories, on a per-LSP basis depending on the implementation.

3.3 Guaranteed services

These services provide hard guarantees that are explicitly specified for different granularities and for topological scopes ranging from network boundary to network boundary up to end-to-end. Guarantees can be given for different kinds of parameters, such as bandwidth and/or delay, depending on the service class and the capabilities of the network elements on the path. Guaranteed services may be based on contractual guarantees or on user-network signalling, such as RSVP. A signalling protocol to communicate the service parameter information is required inside the network.

In the IETF, guaranteed services have been specified by the INTSERV working group.
The integrated services framework is described in [RFC1633]. INTSERV has currently defined two services: controlled load [RFC2211] and guaranteed service [RFC2212]. These services should be supported in MPLS environments. The service parameter mappings to different link layers specified in the ISSLL working group should be applicable to MPLS, augmented with the label encapsulation procedures specified in the MPLS WG.

3.3.1 Services

Two different guaranteed services have been specified in the INTSERV effort of the IETF so far:

- Controlled load service [RFC2211]
- Guaranteed Quality of Service [RFC2212]

Other guaranteed service categories that may be applicable to certain MPLS environments have been specified by other standardisation bodies, such as the Frame Relay Forum and the ATM Forum [ATMF96]. The service categories specified by bodies other than the IETF are not presently discussed in this document, as we attempt to build on the present state of the work of the IETF. The service categories from the other standardisation bodies may become important in the future, and their use in the MPLS context, as well as mappings between IETF services and external categories, may be specified as part of the MPLS effort or other IETF efforts, such as ISSLL.

3.3.2 Guaranteed services in MPLS environments

Per-LSP or per-flow state needs to be maintained in the edge MPLS nodes, depending on the topological scope of the guarantees: for end-to-end guarantees, flow state is required, and internally, per-LSP state for aggregated guarantees needs to be maintained. Aggregated state information is needed in the core network elements. The implementation of guaranteed services requires the use of advanced queuing mechanisms in the network elements. Signalling support for communicating changes in the individual or aggregated state information associated with the LSP will be required.

For scalability, the aggregation of guarantees to form guaranteed aggregated label switched paths is desirable.
For the implementation of end-to-end reservations, information on the parameters of the aggregated entities is required at the de-aggregation points of the network. This can be realised in MPLS by using multilevel LSPs, which requires signalling of the individual constituents of the aggregated flows from the aggregation point to the de-aggregation point.

The current methods for QoS on IP seem to have scalability issues when the number of connections requesting such services grows. Thus an issue that is not MPLS specific is that of making QoS scalable through a combination of aggregation and provisioning. Such aggregation techniques may place some requirements on MPLS, to the extent that labels may have to be associated with specific kinds of parameters that pertain to the aggregation. The label assignment and distribution mechanisms should therefore provide ways of distributing such attributes.

MPLS benefits the implementation of guaranteed services, as the association of traffic with LSPs can be made in the border nodes of the network, and the intermediate nodes need only use the label information to retrieve the attributes they require to provide the desired guarantee for the associated LSP. Using labels to retrieve the state information provides a great benefit compared to the model in which each node on the path is required to keep state for each guaranteed flow, and to find the flow by matching a filter against each packet in order to retrieve the traffic parameters of the flow.

4. TRAFFIC MANAGEMENT REQUIREMENTS

The requirements presented in this chapter are a superset of those expressed in numerous sources. Some of the sources on which these requirements are based are [Smith97], [Bradner97], and [RFC1633].
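
The contrast drawn at the end of chapter 3.3.2, between retrieving per-LSP attributes by label and finding a flow by matching a filter against every packet, can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any MPLS implementation: the type names, the five filter fields and the parameter fields (bandwidth, delay) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TrafficParams:
    """Hypothetical traffic parameters attached to a flow or an LSP."""
    bandwidth_kbps: int
    max_delay_ms: int


@dataclass(frozen=True)
class FlowFilter:
    """Hypothetical per-flow filter; the field choice is illustrative."""
    src: str
    dst: str
    proto: int
    src_port: int
    dst_port: int


def lookup_by_filter(
    filters: List[Tuple[FlowFilter, TrafficParams]],
    packet_fields: FlowFilter,
) -> Optional[TrafficParams]:
    """Per-flow model: scan the installed filters for one matching the packet.

    Every node on the path holds one entry per guaranteed flow.
    """
    for flow_filter, params in filters:
        if flow_filter == packet_fields:
            return params
    return None


def lookup_by_label(
    label_table: Dict[int, TrafficParams],
    label: int,
) -> Optional[TrafficParams]:
    """Label model: the label indexes the per-LSP attributes directly.

    An intermediate node holds one entry per LSP, however many flows
    the border node has mapped onto that LSP.
    """
    return label_table.get(label)


if __name__ == "__main__":
    params = TrafficParams(bandwidth_kbps=2000, max_delay_ms=10)
    flow = FlowFilter("10.0.0.1", "10.0.1.1", 17, 5004, 5004)
    print(lookup_by_filter([(flow, params)], flow))
    print(lookup_by_label({17: params}, 17))
```

In this sketch the filter match is needed only where the flow is first associated with an LSP, which corresponds to the border node in the text above; downstream nodes would use the label lookup alone.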

4.1 Service category support

- Support for services described in the previous chapter

MPLS shall support the implementation of the services described in the previous chapter in such a way that the desired set of services can be implemented in the same node and on the same link. The implementation of all services should not be mandatory, but should be considered a differentiator between products. However, the MPLS standardisation effort should describe the set of mechanisms needed to support all of the above services, to ensure interoperable implementations of these services.

- Support for controlled link sharing

Network operators shall be able to allocate maximum shares of link bandwidth to different service categories in such a way that a minimum amount of bandwidth is guaranteed for each service class. This allows the operator to guarantee that the lower priority services cannot suffer from starvation because the higher priority services use all the available bandwidth. These settings shall be enforced by policy and admission control together with the policing, queue management and scheduling mechanisms. In the absence of traffic in the higher priority service classes, the bandwidth should be available for use by the lower priority traffic.

4.2 Admission control, monitoring and security

- Support for authentication

Authentication of the users and/or equipment needs to be performed at domain borders to determine that the service user is who they claim to be. Authentication is required to support admission and accounting.

- Support for admission policies and control

The operator shall be able to apply admission policies at the operational network boundary, to enforce the service agreements between the users and/or other operator network domains.

- Support for accounting

When enhanced service levels are used, the incentive for the network operator to provide such services is to obtain more revenue from the consumers of such services. Accounting is required to keep track of the services used, and to be able to provide usage sensitive pricing policies for enhanced level services.

- Service management

When enhanced services are provided to the end-user, inside the operator's network, or between domains, it is important for both the operator and the end-user to be able to monitor that the performance of the provided services fulfils their specifications. The required measurement and management features shall be implemented in network elements and management systems to support these requirements.

4.3 Congestion management

- Congestion control

Congestion control is important even for best effort services, but becomes more complicated when different levels of service are supported over the same interfaces. The characteristics of the mechanisms, and guidelines for the use of these congestion control mechanisms in multi-service environments, shall be specified.

4.4 Scalability requirements

- Minimisation of the label space requirements

The label space may become a limitation on the applicability of the label switching scheme unless attention is given to constraining the label space in the architecture design phase. An increased label space makes the management of the label space more difficult, involves more state keeping in network elements, and implies higher dynamics of change in the label assignments or attributes. Adding advanced services to pure best-effort delivery will inevitably increase the label space requirements, and an attempt should be made in the specification phase to minimise the overhead. Aggregation and merging are examples of mechanisms that help to contain the label space.

- Minimisation of the state in the network elements

Flow specific state shall be maintained only in the network elements that are required to handle individual flows, such as edge network elements. The design goal is that core network elements are not required to maintain flow specific state information. This enhances the applicability of MPLS to large networks and high-speed backbone links.

- Support of different granularities of control, from single flows to highly aggregated streams

It is important that multiple control levels are supported, depending on the level in the network where the services are provided. The general guideline is that the amount of state information that must be maintained decreases from the network edge towards the core of the network.

- Minimisation of the signalling requirements

The state maintenance associated with the control of the path traffic management attributes implies the use of a signalling mechanism to convey this information. It is important that the signalling traffic required by the traffic management support be minimised.

4.5 Robustness and reliability

- Soft state protocol

The protocol(s) resulting from the MPLS work should use the soft state approach as much as possible, i.e. the state associated with the LSPs should "expire" if not periodically refreshed. Hard state should only be associated with administratively configured LSPs (explicit routes, policies, etc.). Care should be taken that the overhead of the state refreshes required to maintain the soft state components does not grow excessive, e.g. due to a requirement to refresh the state associated with each LSP individually.

- Security considerations

The basic idea of supporting any kind of service level differentiation opens up possibilities for users to try to gain access to more valuable services without paying the appropriate compensation. In addition, new kinds of denial of service attacks may become possible. Security considerations have to be taken into account when designing the architecture and protocols for the traffic management aspects.

- Reliability

The service level agreed upon with the customer has to be monitored, and means for alerting the network operator to failures, as well as mechanisms for (possibly automatic) reconfiguration of the switched path arrangement inside the network to quickly remedy the failure, have to be considered.

4.6 Topology support

- Support for point-to-point topology

Point-to-point is conceivably the simplest of the topologies that need to be supported. The basic topology between network elements is a point-to-point path, which can have its own associated parameters. More complex topologies can be supported by merging ingress paths into single egress paths with different characteristics (aggregation). It shall be possible to support point-to-point LSPs with associated resource allocations and priorities.

- Support for point-to-multipoint topology (multicast)

Point-to-multipoint topology is useful for supporting multicast data delivery. Point-to-multipoint topology support shall include means for managing the joins and withdrawals of leaves in a way that affects only the associated part of the multicast distribution tree. It shall also be possible to support heterogeneous receivers in the multicast groups.

- Support for multipoint-to-point topology

Multipoint-to-point topologies are attractive for scalability reasons. A single destination based tree can be constructed for traffic that can be treated similarly.
It shall be possible to support different traffic reservations in different parts of the tree, with higher resource allocations towards the egress points of the multipoint-to-point delivery tree (each merge point adds its traffic volume to the tree).

4.7 Topological scope

- Support for different topological scopes (inside domain, between domains, end-to-end)

MPLS shall consider the different requirements and scalability aspects imposed by the different topological scopes, and the functional partitioning inside an MPLS domain and between the MPLS domain and other MPLS or non-MPLS domains.

4.8 Compatibility

- Support of current applications without modifications

There have been numerous proposals in the past to provide enhanced services that involve the modification of the end-user application software. Examples of such proposals are end-to-end ATM deployment, use of RSVP by the end-user applications to request service guarantees, and use of the applications to classify their traffic onto differentiated service categories. While such end-to-end guarantees may become important later, it shall be possible to initially implement service contracts without modifications to applications and end-to-end protocols. This can be accomplished by classifying the traffic at the network edges instead of on an end-to-end basis, and providing the required transmission capacity (e.g. a dedicated switched Ethernet port) to the end-user's computer system. An additional advantage is the centralised nature of the management of these services.

- Interoperability

MPLS should consider the interworking and interoperability of the MPLS based network with the currently available networking technologies, and also describe the advanced service mappings between the other networking technologies and MPLS where applicable.
- Support for different link layer technologies

Mapping of the label switched paths to different link layer technologies shall be specified taking into account the traffic management capabilities provided by the underlying link layer technology, and the desired properties of the supported service set. Candidates for the link layers suitable for carrying labelled traffic in public network environments include ATM, Frame Relay and MPLS over SONET.

4.9 Extensibility

- Extensibility

The traffic management framework and the associated architecture and protocols shall be extensible to support new attributes for new services without changes to the initial concepts and mechanisms.

- Mechanism independence

The traffic management mechanisms shall be loosely specified, by specifying the characteristics of the mechanisms required to support the different parts of the traffic management functionality. Mechanisms like queue management and scheduling are local to the network element, and thus do not need to be strictly standardised. Suggestions of applicable mechanisms should be given, but vendors should have the freedom to implement whatever mechanisms they feel appropriate for achieving the desired functionality. Additionally, this allows for improvements in the individual mechanisms via active research in the area. Thus it is important to standardise the semantics of the information carried in the signalling protocol (LDP) or associated with individual packets, as applicable.

5. CONTROL PLANE MECHANISMS FOR TRAFFIC MANAGEMENT FUNCTIONS

This chapter describes the mechanisms required in the various parts of the network to control the data plane traffic management functions described in the next chapter. These mechanisms include the policy and signalling aspects required to set up and to maintain the LSPs.
Note that the location of these mechanisms in the network is not discussed in this chapter; a discussion of the location of the mechanisms in different network environments is given in chapter 9.

5.1 Triggers

Triggers are events that cause changes in the LSP configuration. These changes may be LSP establishment, reconfiguration, deletion or attribute modification. Depending on its type, a trigger requires going through either the full or a partial LSP establishment process.

Triggers typically result from events related to changes of some information relevant to LSP set-up, such as:

- configuration event
- signalling event
- topology change
- traffic pattern change

The scope of the change initiated by a trigger can be either local (i.e. inside the network element), regional (i.e. affecting the configuration of the peer MPLS nodes) or global (i.e. affecting all network elements that compose the LSP). The frequency of the regional and global changes should be minimised. As finer granularity of control of the LSP attributes is required (e.g. explicit reservations), this becomes increasingly hard to achieve.

Properties associated with the different kinds of triggers are discussed in sections 5.1.1 to 5.1.4.

5.1.1 Configuration events

A configuration event can affect either policy or LSP configuration parameters. Policy changes affect the admission or classification policy being used in the node. An LSP parameter change affects the attributes associated with a statically configured label switched path.

The policy related changes can either force the re-evaluation of the current classifications or be taken into use gradually, as new paths are used. Although immediate re-evaluation would be desirable, it may have negative effects on the performance and the handling of the current traffic.
Parameter changes may require the communication of the change to the peer LSRs that compose the LSP (signalling initiated by the 'root' node of the LSP), or be configured onto each LSR along the path individually (management initiated change). In either case, these changes should take effect immediately.

5.1.2 Signalling events

A signalling event is an externally received trigger that explicitly affects the way LSPs are set up and, depending on the signalling event type, may result in setting up, tearing down or modifying the attributes of LSPs.

It can be foreseen that different kinds of signalling protocols will need to be supported, depending on the interface the event is received from. There will likely be different signalling mechanisms used for users, inside a network domain and between domains (e.g. RSVP and LDP).

5.1.3 Topology changes

Topology changes are events that are associated with changes in the network topology, and may potentially require the reconfiguration of a large number of LSPs. Topology changes are brought to the attention of the label distribution subsystem by the routing protocols and by the monitoring of the status of the established LSPs.

5.1.4 Traffic pattern changes

A traffic pattern change is an event triggered by user activity that is observed by the network element and results in a change of the characteristics of the traffic received over an interface. Examples include the appearance of a new traffic flow, or the timeout of an existing flow. These changes may affect how the LSPs are set up or the attributes of the LSPs. Traffic pattern related changes should be kept as local as possible.

5.2 Policy and admission control

Policies and admission control form a set of processes that directly or indirectly control the set-up of the label switched paths through the network element.
5.2.1 Routing policy

For new LDP requests, the routing policies applied on the internetwork are the first policy that controls the potential routes the LDP paths can take through the network. Routing policies are thus not directly involved in the topological control of the LDP establishments, but they control the establishment of the information (the routing information base) that LDP uses to determine the available routes.

It should be noted that the current routing protocols use topology and metric information to select the "best" route among multiple options, and do not generally know anything about the path characteristics or the services supported on the paths available in the routing database.

5.2.2 Classification policy

Classification is based on two categories of information: the information in the headers of the received packets, and the control and policy information provided by the configuration (management plane), routing and signalling protocols.

IPv4 header information usable in the classification process:

- Destination address
- Source address
- IP protocol field
- TOS

TCP/UDP header information usable in the classification process:

- Source port number
- Destination port number

Additional header fields may be part of the classification, if desired.

The classifier makes a decision on the basis of the preconfigured classification policy information, which specifies the kind of treatment the packets belonging to a flow should receive. Note that the classification policy alone does not guarantee that the desired behaviour will be achieved; this is further refined by the admission policy, admission control and policing functions.

For IP packets, the classification process can generally be accomplished by applying a filter template of the form {DA prefix, SA prefix, PRO, TOS, SPN, DPN} to each individual packet.
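As an illustration only, the filter template could be sketched as below. The class names, the use of None for a field that matches anything, the "treatment" string and the rule of counting specified fields to pick the most specific filter are all assumptions of this sketch, not part of this framework.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    da: str   # destination address
    sa: str   # source address
    pro: int  # IP protocol field
    tos: int  # TOS
    spn: int  # source port number
    dpn: int  # destination port number


@dataclass(frozen=True)
class Filter:
    # Hypothetical encoding of {DA prefix, SA prefix, PRO, TOS, SPN, DPN};
    # a field left as None matches any value.
    da: Optional[str] = None
    sa: Optional[str] = None
    pro: Optional[int] = None
    tos: Optional[int] = None
    spn: Optional[int] = None
    dpn: Optional[int] = None
    treatment: str = "best-effort"

    def matches(self, p: Packet) -> bool:
        if self.da is not None and ip_address(p.da) not in ip_network(self.da):
            return False
        if self.sa is not None and ip_address(p.sa) not in ip_network(self.sa):
            return False
        pairs = ((self.pro, p.pro), (self.tos, p.tos),
                 (self.spn, p.spn), (self.dpn, p.dpn))
        return all(want is None or want == got for want, got in pairs)

    def specificity(self) -> int:
        # Assumed measure: the number of fields that are actually specified.
        fields = (self.da, self.sa, self.pro, self.tos, self.spn, self.dpn)
        return sum(f is not None for f in fields)


def classify(p: Packet, filters: List[Filter], default: str = "best-effort") -> str:
    """Return the treatment of the most specific matching filter."""
    hits = [f for f in filters if f.matches(p)]
    if not hits:
        return default
    return max(hits, key=Filter.specificity).treatment
```

For example, Filter(pro=6, dpn=80, treatment="web") selects TCP traffic to destination port 80 regardless of addresses, while a second filter that also sets a destination prefix would take precedence for packets it matches.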
Any of the fields can be a wildcard, so for example all traffic destined to a web server (TCP, destination port 80) would be specified using the filter {*,*,6,*,*,80}. In some cases several different filters may match the same packet; the result of the match for the most specific filter should be used in such cases.

In addition to packet header information, local information may be added to the classification process. One example of such local information is the interface the packet was received from.

The classification policy determines how the individual flow should be treated, including attributes such as the reservation type and granularity, differentiated service class, etc. On the basis of the filtering result, the packet may be associated with an LSP or a flow identifier.

5.2.3 Admission policy

Admission policy is the process of determining whether a new request for LDP set-up or attribute modification with some set of reservations is administratively acceptable. The policy is administratively configured, and is associated with an entity of a given granularity, such as an individual user, user community, or peer AS.

The type and the granularity of the information that will be taken into account by the admission policy depend on the interface type, local policy and trigger type (e.g. signalling versus configuration event). When reservation requests of coarse granularity are considered (e.g. an individual LDP set-up on a public network interface supporting a large corporation), the admission policies are typically applied against the parameters associated with the aggregate set of all reservations currently associated with the community, the reservation parameters, and the administratively configured maximum resources allocated to that community.

5.2.4 Admission control

Admission control is the process that is used to determine the resource availability to support a new request or the modification of attributes associated with existing label switched paths. Admission control is invoked as a final step, after it has been determined that a route to the destination is available, and the permission to process the request has been granted by the admission policy.

Admission control gets more complex as the granularity of the reservations increases: it is not invoked at all for best-effort traffic, and is most complex for guaranteed traffic.

5.3 Path selection

The primary mechanism to control path establishments and deletions in MPLS networks is the routing protocol. In addition, paths through the network can be established using explicit routing. Static LSPs can be configured through the management interface.

For the MPLS network elements to be able to automatically locate alternate paths with sufficient resources available, routing protocols are needed that are able to take into account additional path attributes instead of just the topological connectivity and preconfigured metrics of the available paths. A draft framework for QoS routing has been developed in the QOSR working group of the IETF [Crawley98]. However, routing protocols with suitable metrics for use in environments with fine-granularity service guarantees inside or between the domains still need to be developed.

5.4 Accounting

An accounting mechanism is required by the service operators to be able to bill users in accordance with the services used. If accounting mechanisms are not in place, there is no incentive for the users to use anything but the best offered service classes.

MPLS accounting mechanisms shall be able to collect usage data with the desired granularity (single user to peer operator), together with the traffic management attributes associated with the LSP, and transfer this data to the operator's billing system.
Protocols used for transferring accounting data to billing systems, and billing procedures, are outside the scope of the MPLS work. Suitable protocols may include e.g. RADIUS and TACACS+.

5.5 User authentication

At the moment, these services need to be implemented on the basis of interface, protocol and network address information. However, as users are mobile (even within a corporate network), and because of the increasing use of dynamic address allocation mechanisms such as DHCP and NAT, the ultimate goal should be to base the service policies on user information. One possible implementation may be based on the use of directory services, such as LDAP, to store the user profile information, but the approach needs to be standardised to be usable on a large scale.

6. DATA PLANE MECHANISMS FOR TRAFFIC MANAGEMENT FUNCTIONS

This chapter describes the mechanisms required in the various parts of the network to support the transport of traffic with the given service parameters through the network. These mechanisms include all the mechanisms that are involved in the per packet decisions performed in the intermediate network nodes. The parameters for controlling these mechanisms are determined by the control plane mechanisms described in the previous chapter.

Note that the location of these mechanisms in the network is not discussed in this chapter; a discussion of the location of the mechanisms in different network environments is given in chapter 9.

6.1 Label forwarding paradigm

In best-effort label based forwarding, MPLS nodes use a simple exact match lookup to determine the egress link where the packet should be sent.
When services that require support for service level differentiation are implemented, the MPLS node uses the same exact match label lookup to determine not only where the packet should be sent, but also the additional state information associated with the label, related to the queuing and scheduling of the packet.

6.2 Classification

6.2.1 What is classification and where it should be done

The purpose of the classification process is to determine the queuing / scheduling treatment that the packets should get as they traverse the network. The result of the classification determines the following attributes:

- the service class the packet should be carried on,
- for differentiated services, the drop priority and / or delay priority for the packet,
- for guaranteed services, the parameters determining the desired service guarantees.

Packets may be classified as belonging to different service categories in various places of the end-to-end path traversed. Likely places where the packet classification may occur are:

- Operator's domain ingress router
- CPE router
- Host

When the host performs the classification, it may base the classification decisions either on the protocol used (part of the host protocol stack), or on the attributes communicated from the application. Guaranteed service parameters will likely be based on the parameters communicated by applications.

When the classification is performed by the routers (either CPE router or operator's border router), the classification decisions have to be based on the protocol information carried in the packet.

Initial deployment is likely to be based on classification in the routers, as there is no support for performing the classification in the host protocol stacks. When the classification is performed in routers, modifications to host protocols and applications are not required.
Additionally, it is easier to set up administrative classification policies when the classification is performed in routers.

Stand-alone and integrated equipment for performing classification to control the traffic is available at the moment, but there are no standard ways to manage it, nor standard ways in which the classification results are used to control the data stream. One common characteristic of current solutions is that they are usually decoupled from the other network equipment.

Depending on the place where the classification is performed, the procedures performed on subsequent nodes vary.

6.2.2 Flow Classification

Flow classification is the process of associating a label with individual traffic flows. This process needs to consider the classification policy to be able to associate the label with the flow. Depending on the aggregation environment, the label may be associated with a single flow or, if flow aggregation is supported and a suitable label already exists, flows may be aggregated into a stream on the existing label.

The purpose of the flow classification process is to reduce the processing load associated with deciding which label to associate with arriving packets. If the full classification can be performed for each packet without a performance penalty and a suitable label exists, flow classification is not required.

Flow classification needs to be performed at least once for each new flow. Flow classification is performed on the edge MPLS nodes, where the packets from a non-MPLS network domain enter the MPLS network domain. This process can also produce a simple key, such as an entry in a hash table, to be subsequently used by the packet classifier for faster determination of the label that needs to be associated with individual packets.
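A minimal sketch of such a hash table key is given below, assuming that a flow is identified by the 5-tuple (source address, destination address, protocol, source port, destination port) and that the full classification is available as a function returning a label. The names FlowCache and classify_flow are hypothetical and only serve the illustration.

```python
from typing import Callable, Dict, Tuple

# Assumed flow key: (source address, destination address, protocol,
# source port, destination port).
FlowKey = Tuple[str, str, int, int, int]


class FlowCache:
    """Run the full flow classification once per flow, then reuse the label."""

    def __init__(self, classify_flow: Callable[[FlowKey], int]) -> None:
        self._classify_flow = classify_flow   # full (slow) classification
        self._labels: Dict[FlowKey, int] = {}  # hash table: flow key -> label
        self.full_classifications = 0

    def label_for(self, key: FlowKey) -> int:
        label = self._labels.get(key)
        if label is None:
            # First packet of a new flow: classify and remember the result.
            label = self._classify_flow(key)
            self._labels[key] = label
            self.full_classifications += 1
        return label
```

Subsequent packets of the same flow then cost a single dictionary lookup, which corresponds to the faster per packet determination of the label described above.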
As more fine-grained control becomes necessary, flow classification becomes mandatory, because providing fine-grained guarantees involves setting up a new LSP or modifying the parameters of an existing LSP.

In some cases, if it is determined that a suitable label for carrying the flow does not exist, a new LSP needs to be set up or the attributes of an existing LSP need to be changed. The applications that are allowed to do this should be subject to careful consideration, as it is preferable to have the LSPs set up beforehand; otherwise the LDP modifications done on a per flow basis consume too many resources and become a performance / scalability bottleneck. However, this is useful for some applications whose characteristics are known beforehand to require a relatively long lasting flow with service level requirements, such as videoconferencing.

Classification mechanisms that require the edge routers to maintain per-flow state information are susceptible to denial of service attacks by malicious users. One can foresee an attack based on sending packets with various destination address / port combinations in rapid sequence, causing per-flow state to be established for each packet. This can lead to exceeding the per-flow state maintenance and flow establishment handling capacities of the routers performing the classification. There is no easy cure against such an attack, except administratively limiting the amount of per-flow state that is associated with the interface. Together with source address validation, this can at least provide information on where the attack originated from.

Note that flow switching as discussed here is nothing new; it has been used in routers and firewalls for a long time. For more information on flow measurements and classification, see [Claffy95], [RFC2063], [RFC1954], [Cisco97].
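The administrative limit on per-flow state mentioned above could be sketched as follows. The class name, the interface identifiers and the choice to simply refuse new flow state once the limit is reached are assumptions of this sketch; what a node does with a refused flow (for example, carrying it on a default path) is left open here.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Set


class FlowStateLimiter:
    """Cap the number of per-flow state entries held for each interface."""

    def __init__(self, max_flows_per_interface: int) -> None:
        self.max_flows = max_flows_per_interface
        self._flows: DefaultDict[str, Set[Hashable]] = defaultdict(set)

    def admit(self, interface: str, flow_key: Hashable) -> bool:
        flows = self._flows[interface]
        if flow_key in flows:
            return True                 # state already exists for this flow
        if len(flows) >= self.max_flows:
            return False                # limit reached: create no new state
        flows.add(flow_key)
        return True

    def release(self, interface: str, flow_key: Hashable) -> None:
        # Called when the flow state is deleted (e.g. after a time-out).
        self._flows[interface].discard(flow_key)
```

Because the limit is kept per interface, a flood of new flows arriving on one interface cannot exhaust the state available to flows arriving on the others.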
6.2.3 Packet Classification

Packet classification performs the mapping of individual packets onto the desired LSPs. The packet classification process essentially assigns each arriving non-labelled packet to a suitable label switched path, which has to be available before the packet classifier can perform its function.

Prior to the packet classification, the LSP has to have been set up using either the flow classification process or other mechanisms, such as setting up the LSP on the basis of information provided by management, topology (e.g. routing protocol) or signalling protocol.

As discussed in the previous section, the flow classification process may help packet classification by producing keys that increase the packet classifier performance.

6.2.4 Classification results for differentiated services

For differentiated services, classification determines the differentiated service attributes, such as the drop precedence bit values and delay precedence bit values. In cases where these attributes differ from those carried in the received IP packet header, the received header bits may be overwritten or, depending on the implementation of the diffserv support in MPLS, left alone.

If the differentiated services attributes are allocated on a per LSP basis, then the attributes are associated with the label switched path, and the result of the classification process should be the label for that path.

6.2.5 Classification results for guaranteed services

For guaranteed services, the label for the LSP that has the associated reservation attributes may be the result of the classification process.

Alternatively, in fine-grained flow based systems, the result of the classification process may be a flow identifier, which can be used to determine the individual traffic characteristics of the flow. In this case, the flows are mapped to an aggregated LSP by a mapping function following the classification function.
6.2.6 Problems with non end-system classifications

There are some known problems in performing the classification in intermediate network elements, which are discussed below. Whether these present a problem, and if so, the extent of the problem, depends on the environment in which the classification function is performed, and needs to be addressed on a case-by-case basis.

6.2.6.1 Classification in presence of IPSEC

When the transport protocol headers are encrypted, as described in the IPSEC document "IP Encapsulating Security Payload (ESP)" [RFC1827], the transport layer (UDP/TCP) header information, such as port numbers, cannot be used as parameters for determining which flow the packet belongs to.

This implies that the classification has to be performed before the encryption is applied, in the customer device (typically a host, router or firewall) that performs the encryption process. Also, as the per flow information is not available in the public network, it is possible to run MPLS all the way to the subscriber and use the label to identify an IPSEC encrypted flow encapsulated onto one label. This way, it would be possible for the operator to enforce the requested parameters on a per encrypted flow basis.

It is also possible to achieve this using RSVP signalling to the user, using the IPSEC extensions specified in [RFC2207], which basically use the SPI instead of the destination port number to identify the flow.

6.2.6.2 Classification in presence of dynamic address assignment

The increasing use of dynamic assignment of IP addresses makes it hard to determine the end-system the packets originated from. Dynamic address assignments are common in environments that employ DHCP or NATs.

If the end-system address is an important part of the classification policy, then means to communicate the mappings between addresses and physical systems to the classifier need to be arranged.
One possible way to achieve this in DHCP environments might be to have DHCP/DNS mapping in use, and to resolve IP addresses on the basis of DNS bindings.

In environments where the classification is based more on the protocol information carried in the packets, dynamic address assignment is not a problem. This is due to the fact that the dynamically assigned addresses are expected to be the same for the duration of the session, and the flow classifier can still use these addresses for identifying individual sessions.

6.2.6.3 Classification in presence of dynamic port numbers

Some applications assign the port numbers they use dynamically, and it is very difficult or even impossible to make the correct classification on the basis of such assignments. For such environments, it appears that the easiest way to achieve the correct classification is to let the host determine the desired classification.

6.2.7 Classification state maintenance

The classification state maintenance process is related to the deletion of the per flow state and associated LSP bindings that are no longer required. Examples of events that lead to the removal of classification state are flow time-out, the end of the individual flow recognised by other means (e.g. TCP FIN), or a signalling event that signifies the end of the reservation request.

Classification state maintenance activities ensure that unused flow state information is deleted at appropriate intervals to free up the resources in network elements.

Classification state maintenance activity shall be mostly local to the MPLS node. Only when the reservations are made on an individual flow basis does this affect the LSP bindings between peer MPLS nodes.
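A possible shape for this local maintenance activity is sketched below, assuming an idle time-out per flow and an explicit end-of-flow notification (for instance on observing a TCP FIN or a signalling event). The class and method names are hypothetical, and time is passed in explicitly so that the behaviour does not depend on a particular clock.

```python
from typing import Dict, Hashable, List


class FlowStateTable:
    """Per-flow classification state that is removed when no longer needed."""

    def __init__(self, idle_timeout: float) -> None:
        self.idle_timeout = idle_timeout
        self._last_seen: Dict[Hashable, float] = {}

    def touch(self, flow: Hashable, now: float) -> None:
        # Record that a packet of this flow was seen at time `now`.
        self._last_seen[flow] = now

    def end(self, flow: Hashable) -> None:
        # Explicit end of flow, e.g. TCP FIN observed or signalling event.
        self._last_seen.pop(flow, None)

    def expire(self, now: float) -> List[Hashable]:
        # Periodic sweep: delete and return the flows idle for too long.
        stale = [f for f, t in self._last_seen.items()
                 if now - t >= self.idle_timeout]
        for f in stale:
            del self._last_seen[f]
        return stale
```

The list returned by expire() is where a node would hook in any follow-up action, such as releasing a label binding when the reservation was made for that individual flow.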
If the reservation type for the flow was guaranteed reservation, and the flow was aggregated on the LSP with other guaranteed flows, the state maintenance activity triggers the modification of the reservation attributes of the LSP the flow was mapped onto, but does not result in teardown of the LSP.

6.3 Policing

In environments where the packet classification is performed by the end-user's router or the user's computer, it is important for the network operator to be able to enforce the traffic contract, so that users cannot exceed their contractual limits for the advanced services. This is done using a mechanism called traffic policing, which monitors the user's traffic. The policing function can, depending on the service used, either drop packets, or move the packets to a lower priority or best effort delivery class.

An alternative to using policing is to allow users to send whatever they want, to meter the usage of the different services, and to bill the user based on what enters the public network. However, one likely alternative is to use a combination of these mechanisms, so that the user can send up to some maximum value specified by the traffic contract per class / service, and gets billed on the basis of a combination of basic fee and usage.

In cases where the classification is performed by the operator, the traffic contract can be enforced as part of the classification process.

Policing actions can be taken at several granularity levels. Policing can be done for individual flows when per-flow reservations are in effect. The operator likely wants to police on the basis of the aggregated traffic contract on the customer's interface, and on MPLS network boundaries policing can be based on the individual LSP parameters.

6.4 Mapping

On the basis of the flow identification performed by the classifier, the mapping process maps the packets to the appropriate label switched path.
This process is configured taking into account the traffic class, the attributes associated with the flow and the topology information.

The mapping function is responsible for achieving the aggregation. Depending on the traffic class, two styles of mapping can exist: direct and indirect mapping.

6.4.1 Direct mapping

Direct mapping can be used when the reservation does not have explicit guarantees, like bandwidth, associated with it. Traffic classes suitable for direct mapping are best effort and differentiated services without bandwidth allocations. In direct mapping, the association is done directly from the packet classifier outcome to the desired LSP.

6.4.2 Indirect mapping

Indirect mapping needs to be used when the reservation does have explicit guarantees, like bandwidth, associated with it, and the aggregation of such reservations is desired. The need for indirect mapping arises from the requirement to maintain per reservation state so that the individual reservation and its associated resources can be removed from the aggregate LSP.

The reservation state deletion shall commence immediately after the end of the reservation is detected, either through timeout, by observing transport header bits, or as a result of a signalling event. The associated parameter changes in the LSP configuration may be made less frequently, especially when the frequency of the individual reservation establishments and deletions associated with a given aggregated LSP is high and the reservations are relatively homogeneous. This reduces the signalling load between the MPLS nodes along the LSP.

6.5 Aggregation, merging and deaggregation

6.5.1 Aggregation

Aggregation means that multiple flows that are treated similarly in the network are associated with the same label. Depending on the supported service type, the effort to support aggregation ranges from straightforward to very complicated.
General guidelines for aggregation to meet the scalability requirements suggest that all flows that can be aggregated onto the same label should be aggregated.

Aggregation is performed at the first place where the packet classification is performed, and involves associating the different packets that belong to the same forwarding equivalence class with the same label.

Aggregation conserves label space, as the labels do not have to be associated with the individual traffic flows.

Figure 6.5.1. Aggregation

Consider the node depicted in figure 6.5.1. Traffic arrives from non-MPLS network interfaces (not labelled) and is mapped onto LSPs. Because of the aggregation, the number of outgoing LSPs is reduced.

6.5.2 Merging

Merging is also a form of traffic aggregation, but is performed on label switched paths instead of on individual packets. In a merge capable node, packets coming from multiple ingress LSPs belonging to the same forwarding equivalence class are sent out on a single label switched path.

The merging process helps to conserve the label space, and also reduces the amount of connection state that needs to be maintained in the intermediate network elements.

Figure 6.5.2. Merging

6.5.3 Aggregation and merging of traffic with service guarantees

Aggregation of traffic with service guarantees is not a problem in itself; the problem is to come up with the associated service parameters for the aggregated path in such a way that the minimum amount of resources is reserved and the guarantees of the individual reservations are maintained through the aggregated path.

Aggregation of traffic with just bandwidth guarantees is relatively straightforward; the attributes of the resulting aggregated label switched paths can be computed on the basis of the guarantees given for the individual paths or flows that are aggregated.
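For the bandwidth-only case, one simple computation is to add up the individual guarantees. The sketch below also accepts an optional oversubscription factor; the function name, the units and the interpretation of the factor as a divisor of the summed bandwidth are assumptions made for illustration.

```python
from typing import Iterable


def aggregate_bandwidth(guarantees_bps: Iterable[float],
                        oversubscription: float = 1.0) -> float:
    """Bandwidth to reserve for an aggregated LSP, in bits per second.

    With oversubscription == 1.0 this is the plain sum of the individual
    guarantees; a larger factor reserves proportionally less (assumed
    interpretation of an oversubscription factor).
    """
    if oversubscription < 1.0:
        raise ValueError("oversubscription factor must be at least 1.0")
    return sum(guarantees_bps) / oversubscription
```

Three flows guaranteed 2, 3 and 5 Mbit/s would thus lead to a 10 Mbit/s reservation on the aggregated path, or 5 Mbit/s with a factor of 2. Only the plain sum preserves each individual guarantee under all load conditions.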
The computation of the aggregate path parameters can be based simply on the sum of the attributes of the flows or paths that the aggregate is composed of, or it can take into account additional factors such as an oversubscription factor.

When explicit guarantees for both delay and bandwidth are given, aggregation becomes much harder, especially if the delay requirements are tight. Several aggregation strategies for traffic both with and without delay guarantees are considered in references [Schwantag97], [Guerin97], [Berson97], [Rampal97], and [Li98].

6.5.4 Deaggregation

Deaggregation is the opposite of aggregation and merging, in the sense that it terminates the label switched path and performs a layer three lookup for the individual packets to determine their next destination. Deaggregation can associate the packets either with a new label switched path or with an interface to a non-MPLS network. Note that the service class information associated with the labeled packets is not lost in deaggregation, because the attributes of the LSP the packet arrived on are available at the deaggregation point.

If the LSPs are constructed through the MPLS domain from a set of domain ingress interfaces to a single domain egress interface, and packets not associated with this egress interface are not merged or aggregated onto the same LSP, the deaggregation process is not needed. In such cases, if the interface leads to a non-MPLS domain, the MPLS header is simply removed.

Figure 6.5.4. Deaggregation

6.6 Queuing and congestion management

6.6.1 Queue management

Queue management mechanisms manage the available queue space, and also determine the appropriate handling of an arriving packet on the basis of the label switched path the packet is received on and the status of the desired queue.
Queue management is closely related to congestion control, as congestion can be loosely defined as a condition where a queuing point in the network element has exceeded or is about to exceed its allocated queue space, forcing packets to be dropped instead of queued for the resource.

Packet handling decisions include which queue the packet should be placed on, and also whether the packet should be admitted onto that queue, moved to a lower priority queue, or dropped. Note that moving individual packets between different queues is not necessarily a good course of action unless all packets of the same flow are put on the same queue. This is because moving individual packets of a flow to a lower priority queue is likely to cause packet re-ordering.

Since the queuing mechanisms vary on the basis of the supported services and are local to the network element, they need not be subject to standardisation efforts.

6.6.2 Queuing principles

Various queuing principles can be used to support the required traffic classes. The properties of some possible principles are discussed below in order of increasing complexity. All of these queuing principles can be implemented for both cell and packet switching fabrics.

- Single FIFO queue

All traffic is queued onto a single queue. Packets are queued together with their associated labels. Packets are admitted to the queue on the basis of a combination of parameters, such as packet class, queue occupancy level, LSP reservation parameters and measured throughput per LSP. Packets are scheduled for transmission in the order in which they arrived. A property of this queuing scheme is that the delay cannot be minimised for the packets that require it.

- Multiple FIFO queues

Traffic is queued in multiple queues (a minimum of two) on the basis of delay priority.
Packets are scheduled in priority order, possibly along with a guarantee of a minimum service rate specified on a per-queue basis. Packet admission onto the queues is as before.

- Shared queuing on per label basis

Traffic is queued in different logical queues on the basis of the arriving label. Packet admission to the queues is based on the occupancy level of each logical queue and possibly on the overall queue space. This requires complex queue space management algorithms as well as advanced scheduling mechanisms. It is functionally equivalent to per-VC queuing in ATM switches.

It is unclear whether per-label queuing has enough benefits over multiple FIFO queues with admission control to warrant the extra implementation complexity.

6.6.3 Congestion control

6.6.3.1 Passive congestion control schemes

Passive congestion control schemes are based on dropping packets when they arrive at the congestion point. Passive schemes rely on the end-to-end protocols to detect that packet loss has occurred and to retransmit the dropped traffic at a reduced rate. Most of the Internet at the moment relies exclusively on passive congestion control schemes. TCP congestion control algorithms have been designed to act exclusively on the basis of packet loss information.

Over time, numerous algorithms for more intelligent drop policies have been developed; examples include RED [Floyd93], W-RED, and CBQ. These algorithms attempt to increase the fairness of the usage of the congested resource, to provide preferential treatment (typically a higher likelihood of being accepted onto the queue) for some portion of the flows, or to increase the end-to-end throughput in congestion conditions.
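To make the idea of an "intelligent" drop policy concrete, the following is a simplified sketch in the spirit of RED [Floyd93]: an exponentially weighted average of the queue length is maintained, and the drop probability rises linearly between two thresholds. It deliberately omits parts of the published algorithm (for example the correction based on the number of packets since the last drop and the idle-time handling), and the class name, parameter names and default values are assumptions made for this example.

```python
import random


class SimpleRed:
    """Simplified RED-style drop decision (illustrative, not the full algorithm)."""

    def __init__(self, min_th, max_th, max_p=0.1, weight=0.002, rng=None):
        self.min_th = min_th    # below this average, never drop
        self.max_th = max_th    # at or above this average, always drop
        self.max_p = max_p      # drop probability just below max_th
        self.weight = weight    # weight of the newest sample in the average
        self.avg = 0.0
        self.rng = rng or random.Random()

    def update_average(self, queue_len):
        # Exponentially weighted moving average of the instantaneous queue length.
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len):
        # Called on packet arrival with the current queue length.
        self.update_average(queue_len)
        return self.rng.random() < self.drop_probability()
```

A weighted variant in the style of W-RED could be approximated by keeping one such object, with different thresholds, per traffic class or per LSP; that extension is not shown here.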
6.6.3.2 Active congestion control schemes

While passive congestion control algorithms certainly work, one of their characteristics is that they waste network resources, as the traffic is first transmitted to the congestion point, where it is dropped, and then retransmitted later. Dropped packets thus introduce extra overhead in the portion of the network before the congestion point.

To avoid these disadvantages, there have been proposals to make congestion control more active. The goal of the active congestion control approaches is to reduce or eliminate packet loss due to congestion, or to push the drop point towards the point originating the traffic.

By active congestion control, we mean that the network informs the traffic sources of congestion situations more directly and, more importantly, even before the congestion actually occurs. These mechanisms are based on explicit monitoring and notification of the congestion state along the path the traffic is traversing. The notification can be either direct, using explicit semantics to tell the end-station to slow down, or indirect, using the congestion information to influence the congestion management mechanisms of the transport protocols to control the rate of the sender.

The direct mechanisms have been attempted in real networks, but with little success so far, because of the lack of support in the end-station transport protocols. It has been shown that these schemes work reasonably well when implemented end-to-end.

Examples of the direct congestion control mechanisms include the frame relay congestion notification mechanism [I370], the ATM binary and explicit feedback mechanisms [ATMF96], and the proposal for the inclusion of explicit congestion notification for IPv4 and IPv6 [Ramakr97].
The natural place to carry the congestion notification information in MPLS networks would be as part of the label encapsulation header (when MPLS is mapped to Frame Relay and ATM environments, the existing mechanisms for carrying congestion information could be used). However, as the huge installed base of existing applications is built on top of TCP and UDP, a more attractive way is to provide direct feedback inside the network and indirect feedback at the network interworking point, taking advantage of the characteristics of the current transport protocols. Examples of schemes that could be used to achieve indirect control are [Packeteer97] and [Jagan97].

The advantage of having direct control inside the network is that when the transport mechanisms evolve to be better able to take advantage of this functionality, the direct control can be extended to the end-stations.

6.6.4 Packet scheduling

Scheduling algorithms determine the order in which traffic waiting in the queues is scheduled for transmission. Scheduling decisions are based on queue-specific information, e.g. queue priority, weight and state.

The need for complex scheduling mechanisms depends on the capabilities provided in the network element, such as shaping, multiple service class queues, and complex queuing policy. In FIFO based queuing systems scheduling is trivial (transmit when you have the opportunity).

6.7 Traffic shaping

Traffic shaping is the process of modifying the traffic characteristics to conform to a desired traffic profile. Shaping can be used in various parts of the network to make sure that the resulting traffic conforms to the traffic contract, and thus has a better chance of not being discarded by the policing or congestion control mechanisms in the network.

Traffic characteristics tend to get modified by the network, as multiple traffic streams interact and traffic passes through buffering and scheduling algorithms.
The process of shaping inside the network to make traffic conform better to its original profile is called reshaping.

Examples of possible shaping points are the end-station, the MPLS edge node, or the MPLS core node. Shaping can be applied at any granularity that has defined traffic characteristics, from an application flow to an aggregated label switched path. Shaping may be achieved as part of the scheduling functionality.

6.8 Load sharing

Load sharing can be implemented with MPLS routers by selecting paths on the basis of the load on the available links, and splitting the aggregated streams that are associated with different LSPs across the available links.

Load sharing is especially important because of the emergence of Dense Wavelength Division Multiplexing (DWDM) systems, which essentially divide the same fiber into up to tens of different channels going to the same destination node. Efficient load sharing allows tight integration of the routed traffic and the transmission capabilities. Some of the issues related to the integration of optical networks and the Internet are discussed in [Touch97].

MPLS based load sharing has an advantage over conventional router based load sharing, because it can also take into account where the packets originated, unlike typical conventional routers. Without knowledge of where the traffic came from, the receiving node cannot easily guarantee that the packets are sent in the same order as they were sent by the previous node. Packet reordering causes performance degradation with TCP and some other transport protocols.

The concept of individual flows at the network ingress and/or egress points also makes it possible to implement load sharing, for example towards web server farms, in such a way that the packets of the same session are always directed to the same server.

7. LABEL SWITCHED PATH GRANULARITIES AND AGGREGATION

A subset of the flow granularities defined in section 2.2.2 of the MPLS Framework document [Callon97] appears below, with a discussion of their applicability in the context of the traffic management mechanisms discussed in this document.

- PQ (Port Quadruples)

Same IP source address prefix, destination address prefix, TTL, IP protocol and TCP/UDP source/destination ports. This defines a single communication session between two hosts, and is generally referred to as a "flow".

While recognising the existence of individual flows can be important at the network boundaries and in hosts, per-flow state should not be required in the core network elements, as it quickly leads to an unmanageable amount of state information to be maintained on high-speed backbone links. This is the reason aggregation is needed.

- PQT (Port Quadruples with TOS)

Same IP source address prefix, destination address prefix, TTL, IP protocol and TCP/UDP source/destination ports, and same IP header TOS field (including Precedence and TOS bits). This augments the definition of the flow to take into account the TOS byte of the IPv4 packet.

It is in principle possible for current applications to use different TOS values for different packets, although the practice is unlikely to yield predictable results, as the TOS byte is not widely supported as part of the forwarding process in current routers. The differentiated services working group will define standard semantics for this byte, but if a single session uses different values, this is likely to lead to packet re-ordering problems in the network.

For the coarser granularity paths, the aggregation rules should take into account the topological scope and the traffic types. MPLS nodes should attempt to aggregate the same type of traffic onto the same LSP.
It should be noted that supporting managed paths and different services is going to increase label space consumption, but aggregation should be used to minimise this increase. See chapter 8.5, "Multilevel paths", for a discussion of how the use of multilevel paths can help in the aggregation of traffic with explicit guarantees.

8. LABEL SWITCHED PATH TOPOLOGIES AND ASSOCIATED TM PROCEDURES

Services are implemented by assigning attributes to label switched paths. A path is composed of point-to-point segments between adjacent MPLS nodes. In complex topologies (that is, other than point-to-point), each individual segment may have different values for its attributes, depending on the location of the segment along the path and the topology of the entire path. This is also true when flows with resource allocations are aggregated into a stream that is associated with the same LSP.

The properties of the different LSP topologies and the related traffic management issues are discussed in the following chapters.

8.1 Point-to-point

The point-to-point LSP is the simplest of the label switched path topologies, and it is the basic building block of all LSPs. In this document, point-to-point LSPs have their own labels and attributes, and both the label and its associated attributes have local significance between the MPLS network elements. These local LSPs are called segments in this document.

In the simplest case, where the end-to-end LSP with its attributes is built by concatenating a set of these segments, all segments have the same attributes, while the label has only local significance between neighbouring MPLS nodes.

More complex topologies can be constructed by concatenating the segments and using traffic merge (mpt-pt) and copy (pt-mpt) operations in the network elements to achieve the desired topological LSP constructs.
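The local significance of labels on concatenated segments can be illustrated with a small sketch. Each node holds its own table from incoming label to outgoing label and next hop, so the same label value may be reused on different segments of one end-to-end LSP. The node names, label values and table layout below are hypothetical and chosen only for this example.

```python
# Per-node label tables: incoming label -> (outgoing label, next node).
# All node names and label values are made up for illustration.
TABLES = {
    "A": {None: (17, "B")},   # ingress: an unlabeled packet gets label 17
    "B": {17: (42, "C")},     # label swap on the segment B -> C
    "C": {42: (17, "D")},     # 17 is reused: labels are only locally significant
    "D": {17: (None, None)},  # egress: the label is removed
}


def forward(first_node):
    """Follow a packet along the concatenated segments and record each hop."""
    node, label, trace = first_node, None, []
    while node is not None:
        out_label, next_node = TABLES[node][label]
        trace.append((node, label, out_label))
        node, label = next_node, out_label
    return trace


trace = forward("A")
```

Each entry of the trace is (node, incoming label, outgoing label). The attributes of the path, which this sketch does not model, would be stored alongside each table entry.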
8.2 Point-to-multipoint

Point-to-multipoint topologies can be constructed using the packet copy function at the ingress point-to-point LSP segment on the MPLS network element. The incoming packets are duplicated for each outgoing label switched path.

Point-to-multipoint topologies are important for supporting multicast packet delivery.

8.3 Multipoint-to-point

Multipoint-to-point topologies are important for scalability reasons.

Multipoint-to-point topologies can be constructed using the packet merge function at the MPLS network element. The incoming packets from multiple ingress label switched paths are merged onto the same outgoing label switched path.

In addition to aggregating the traffic destined for a single destination, supporting the aggregation of traffic with explicit guarantees requires aggregating the traffic parameters to obtain the attributes of each LSP segment composing the multipoint-to-point tree. Note that this can result in different segments having different attributes as traffic is merged onto the shared multipoint-to-point tree.

8.4 Multipoint-to-multipoint

Multipoint-to-multipoint topologies cannot be directly constructed using the same labels, but they can be constructed using the desired combination of point-to-point, multipoint-to-point and point-to-multipoint LSPs. The exact decomposition into simpler topologies depends on the desired connectivity in the multipoint-to-multipoint topology. The traffic management requirements of such a composite topology can be treated as those of the simpler topologies used.

For example, full mesh connectivity between a set of endpoints can be achieved using multipoint-to-point LSPs, with each endpoint acting as the receiver of a separate multipoint-to-point tree.
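The observation in section 8.3 that the segments of a multipoint-to-point tree end up with different attributes can be illustrated with a small calculation. The sketch below assumes the simplest aggregation rule, a plain sum of bandwidth reservations, and a hypothetical tree rooted at node D; neither the rule nor the topology is mandated by this document.

```python
def segment_bandwidths(parent, ingress_bw):
    """Bandwidth needed on each segment of a multipoint-to-point tree.

    parent:     node -> next hop towards the root (the root maps to None)
    ingress_bw: node -> bandwidth entering the tree at that node
    Assumes the aggregate is a plain sum of the merged reservations.
    """
    load = {}
    for node, bw in ingress_bw.items():
        hop = node
        # Walk towards the root, adding this reservation to every segment used.
        while parent[hop] is not None:
            segment = (hop, parent[hop])
            load[segment] = load.get(segment, 0) + bw
            hop = parent[hop]
    return load


# Hypothetical tree: A and B merge at C, which forwards towards the root D.
parent = {"A": "C", "B": "C", "C": "D", "D": None}
ingress = {"A": 2, "B": 3, "C": 1}
loads = segment_bandwidths(parent, ingress)
```

Here the segment from C to D carries the sum of all three reservations, while the segments from A and B carry only their own. In a full mesh built as described above, one such tree would exist per receiving endpoint.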
8.5 Multilevel paths

Multilevel paths can be constructed using multiple labels on a stack, or alternatively by partitioning the label space to represent different levels (like VPI/VCI in ATM networks). The operations associated with label stacks are described in the MPLS framework document [Callon97], and a label stack encoding proposal is described in [Rosen97b].

The routing and scheduling decisions for packets carried on a multilevel label switched path are made on the basis of the top level label. The multilevel LSP is terminated at the deaggregation point, where the top level label is removed (referred to as label pop in [Callon97]). The second level label is then available for use as the basis for routing and scheduling mechanisms.

Multilevel paths are useful when several paths with similar but different service guarantees are aggregated onto the same path. At the deaggregation point, the path characteristics of the individual aggregated paths that the higher level path is composed of can be determined on the basis of the second level label.

Figure 8.5. Multilevel path example

Consider the simple MPLS network composed of four nodes A-D depicted in Figure 8.5. There are two traffic sources with reservations entering node A from non-MPLS domains. These two sources are aggregated and leave node A on LSPx.

At node B, an additional LSP (LSPy) that is destined towards the same node is merged onto the same LSP, and the combination leaves node B as LSPz. The original labels are pushed onto the label stack, and traffic leaves node B with the top level label LSPz.

At node C, no traffic is merged onto or removed from LSPz; the LSP label is simply replaced and traffic leaves node C with the new label LSPz'.

The traffic arrives at node D, which deaggregates the traffic into its constituent LSPs, denoted as LSPy' and LSPx'.
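The label operations in this example can be traced with a short sketch. The label stack is modelled as a list with the top level label first, and the push, swap and pop helpers are illustrative stand-ins for the label stack operations described in [Callon97], not an encoding of any particular proposal.

```python
def push(stack, label):
    """Add a new top level label."""
    return [label] + stack


def swap(stack, label):
    """Replace the top level label."""
    return [label] + stack[1:]


def pop(stack):
    """Remove the top level label, exposing the next level."""
    return stack[1:]


def traverse(inner_label):
    """Trace one packet from node B to node D of the example network."""
    steps = []
    stack = [inner_label]                 # arrives at B on LSPx or LSPy
    stack = push(stack, "z")              # node B: aggregate onto LSPz
    steps.append(("B", list(stack)))
    stack = swap(stack, "z'")             # node C: top label replaced
    steps.append(("C", list(stack)))
    stack = pop(stack)                    # node D: deaggregation point
    stack = swap(stack, stack[0] + "'")   # forwarded on LSPx' or LSPy'
    steps.append(("D", list(stack)))
    return steps


steps_x = traverse("x")
steps_y = traverse("y")
```

Node C only ever inspects the top level label, which is why it needs no knowledge of the second level labels x and y.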
Now consider that all of the traffic entering and leaving the network has reservations. The capacity of LSPx is thus a function of RES1_in and RES2_in. The capacity of the aggregated LSPz is a function of LSPx and LSPy, of which at least LSPx is itself an aggregate. As node C does not modify the aggregate in any way, it does not need to know the parameters of the individual components that the aggregate LSPz is composed of.

Node D, which acts as the deaggregation point for LSPy' and LSPx', needs to know the traffic attributes of both the original LSPy and LSPx, but it does not need to know anything about the parameters of RES1_in and RES2_in.

Compared to the model where each path requires an individual LSP through the network, the use of aggregation and multilevel paths can save a significant amount of state information and signalling overhead in the network.

The use of multilevel labels enables the deaggregation point to still distinguish between the different sources received on the aggregated LSP and to treat the traffic according to the original reservations. For this to be possible, there needs to be a signalling mechanism between the aggregation point and the deaggregation point to communicate the traffic attributes of the second level labels that are deaggregated.

Note that this does not mean that the deaggregation point needs to know the attributes of all the individual LSPs that are aggregated; a deaggregated LSP may still be an aggregate on another level. Also, if there is a large number of aggregated flows on a single LSP, and there is a deaggregation point that needs to split the traffic into a number of aggregated egress LSPs, the deaggregation point only needs to know which of the second level flows should be associated with which egress aggregate LSP, and the total aggregate value of each egress aggregated LSP.

Large benefits can be achieved at the backbone level by aggregating all the traffic with reservations with similar characteristics onto the same LSP. The backbone nodes need only know the reservation parameters of the aggregated traffic, not the parameters of the individual second level LSPs that compose the aggregate. A signalling protocol needs to be run between the sending and receiving domains to be able to sort out the individual LSPs at the receiving end, but the backbone does not need to participate in this signalling other than by carrying the signalling messages.

The attributes of the aggregated LSP can be modified on the basis of changes in the constituents of the aggregate, but up to a single message per change is required to achieve this. Additionally, if this results in rapid changes to the aggregate attributes, these can be dampened, e.g. by setting a threshold for the minimum change to the aggregate attributes that needs to occur before a change to the aggregate parameters is signalled.

9. NETWORK FUNCTIONAL PARTITIONING

For the purposes of this document, we divide the network elements into four categories: hosts, CPE routers, operator border MPLS nodes and core MPLS nodes. Note that this is just a simple model to facilitate the discussion in this document; there is no reason why the roles of these network elements cannot be combined.

Edge MPLS nodes are the nodes that connect an MPLS aware network domain to a non-MPLS aware domain. An example of such an element would be a border router connecting users attached via Ethernet to the MPLS aware core network domain. Both CPE routers and domain border nodes are discussed as MPLS edge nodes, as their characteristics can be quite similar, depending on the protocols used and on how far the MPLS domain extends.

Domain border MPLS nodes are special cases of the edge MPLS node that connect two MPLS aware domains together.

Core MPLS nodes are the MPLS nodes in the core of the network that are connected only to other MPLS nodes: to edge MPLS nodes and/or to other core MPLS nodes.
9.1 Network models

Figure 9.1-1. Public MPLS network domain interface

Figure 9.1-1 depicts the interface between the MPLS network operator and the operator's subscriber network. The subscriber is connected to the MPLS border node, which, depending on the environment, can support different service categories and run different protocols towards the subscriber's domain. The partitioning of functionality between the CPE router and the operator border router in different situations is discussed in section 9.2.2.

9.2 Network element categories

This chapter defines the roles of the different MPLS nodes in the network, and identifies some basic functionality that these nodes need to perform to be able to support traffic management. For the purposes of this discussion, functionality is divided between hosts, edge MPLS nodes and core MPLS nodes.

The basic assumption is that instead of using the label information just to make a forwarding decision, MPLS nodes capable of supporting differentiated services will also use the label information as part of the scheduling decision.

9.2.1 Hosts

Hosts are initially likely to be just as they are at the moment, i.e. not supporting anything more than best effort applications. In the future, hosts may participate in diffserv packet classification or support a signalling mechanism, such as RSVP, to request explicit service guarantees. It is also possible that at some point hosts will participate in the label distribution protocol.

All of the above functions for hosts, except the best effort communication capabilities, shall remain optional. For the different service categories, the functions that hosts can implement in the future are detailed in chapters 9.2.1.1 to 9.2.1.4.
9.2.1.1 Enhanced best effort services

To take advantage of the enhanced best effort service provided by the network, modifications to the current host TCP/UDP protocols are not necessarily required.

If explicit congestion indication information is provided by the network, modifications to the host transport protocol stack allow the host to react to the congestion feedback information received from the network.

9.2.1.2 Differentiated services

To take advantage of the differentiated services provided by the network, modifications to the current host TCP/UDP protocols are not necessarily required.

A host may optionally participate in the differentiated services process by performing packet classification for the traffic originating from the host. This is not necessary, however, as the flow / packet classification into differentiated service classes can also be performed on the router.

Hosts that actively participate in differentiated services processing have to support the following mechanisms:

- Classification policy (Mandatory)
- Packet Classification (Mandatory)
- Classification state maintenance (Mandatory)

Hosts that actively participate in differentiated services processing may additionally support some of the following mechanisms:

- Flow Classification (Optional)
- Traffic shaping (Optional)
- Scheduling (Optional)

9.2.1.3 Guaranteed services

To take advantage of the guaranteed services provided by the network, modifications to the current host TCP/UDP protocols are not necessarily required.

A host may optionally participate in the guaranteed services environment by running the signalling protocol to request explicit guarantees from the network.
This is not required, as the flow / packet classification process run on the router can also make the appropriate requests to the network on the basis of the header information of the packets received from the host.

Hosts that actively participate in guaranteed services processing have to support the following mechanisms:

- Signalling protocol to request the service (Mandatory)

Hosts that actively participate in guaranteed services processing may additionally support some of the following mechanisms:

- Traffic shaping (Optional)
- Scheduling (Optional)
- Flow Classification (Optional)

9.2.1.4 Participation in MPLS

A host may wish to participate in the MPLS domain by running the LDP protocol to request and terminate paths through the network, possibly with some attributes associated with the requested paths.

An additional advantage of host participation may be that high-performance hosts can use the flow labeled LSPs to cache state information inside the host protocol stack, increasing performance by speeding up or bypassing some of the multilayer protocol stack processing. The unwanted effects of multilayer multiplexing are discussed in [Tennenh89].

Because hosts have limited information about the overall network topology and the aggregation strategies used by the network, hosts should only participate by originating and terminating LSPs of fine granularity. Aggregation and deaggregation functions should thus be left to the network.
Hosts that actively participate in MPLS have to support the following mechanisms, depending on the services used:

- LDP processing (Mandatory)
- Classification policy (Mandatory)
- Packet Classification (Mandatory)
- Classification state maintenance (Mandatory)

Hosts that actively participate in MPLS may additionally support some of the following mechanisms:

- Traffic shaping (Optional)
- Active congestion control (Optional)
- Scheduling (Optional)
- Flow Classification (Optional)

In addition, hosts may choose to participate in an Intserv environment that is also MPLS capable, and use RSVP to carry labels with the reservations.

Note that there are important security considerations, discussed in more detail in section 9.2.2.4, that generally make it infeasible for untrusted hosts to participate directly in the operator's LDP domain in any way. However, for operator owned "trusted" servers, such as web hosting facilities, host participation may have some performance advantages.

9.2.2 MPLS edge nodes

In this context we include both the CPE router and the operator's MPLS domain border node in the discussion as edge nodes, as the traffic management functionality is divided in some way between these two nodes, and the mechanisms described in sections 5 and 6 of this document apply to both.

An MPLS domain edge node contains interfaces to non-MPLS networks as well as to the MPLS network domain. There are different scenarios that determine how the functionality needs to be divided between the public operator's MPLS border node and the CPE node.

Figure 9.2.2. Implementation framework for MPLS edge node TM functionality, ingress

The functionality and the implementation framework of the MPLS domain edge node are depicted in Figure 9.2.2.
As a summary of the functionality that needs to be performed at the ingress point of the MPLS domain, the following list applies:

Mandatory functions for the operator border router:

- Admission policy
- Admission control
- Direct mapping
- Indirect mapping
- Either of two: flow policing or LSP policing
- Aggregation
- Deaggregation
- Queue management
- Queuing
- Scheduling
- Label distribution

Mandatory functions in either the CPE equipment or the operator's border router:

- Classification policy
- Packet classification
- Classification state maintenance

The remaining functions, which are optional, may be performed in hosts, the CPE router or the operator MPLS border router, or not implemented at all:

- Flow classification
- Flow policing
- Merging
- Congestion marking
- Shaping

An MPLS network ingress point, as viewed from the MPLS domain's side, has to classify the traffic according to the desired service categories and allocate the traffic to the LSPs. This association between the packets at the domain ingress point and the label switched path with its path attributes determines how the packet will be treated in all subsequent network elements along the LSP associated with the label. In addition, the ingress MPLS node has to enforce the traffic contract between the subscriber and the public MPLS domain operator and participate in the label distribution process. More detailed descriptions of the functions listed above are given in sections 5 and 6 of this document.

Note that in the direction from the operator's MPLS domain towards the customer domain, the following functions are not mandatory:

- Flow classifier
- Packet classifier
- Classification policy
- Indirect mapping
- Direct mapping
- Flow policing

The partitioning of the edge functionality depends on the services offered to the customer, and on who is responsible for performing the traffic classification.
The services that can be offered to the customer by the public MPLS domain operator are:

- Best effort services
- Differentiated services
- Guaranteed services
- MPLS

The network boundary between the user's and the operator's network can support any number of the above services. Depending on the implementation model, support for some of these services may require signalling between the MPLS domain and the subscriber interface. The different cases are described in more detail in the following sections, from the standpoint of the operator border node's functionality.

9.2.2.1 Best effort services to customer

If best effort service is provided to the customer, the edge node simply maps the traffic onto a suitable LSP, according to the procedures defined for best effort traffic in [Callon97] and [Rosen97a]. If there are service guarantees (e.g. bandwidth) for some portion of the user's traffic (e.g. for all traffic destined to network x), these can be honoured by applying a suitable filter to the traffic and assigning it to the designated LSP.

9.2.2.2 Differentiated services to customer

In the differentiated services model, packets need to be marked on the basis of some policy, and they receive different treatment on the basis of the values carried in the DS byte (encapsulated in the TOS field of the IPv4 packet). This marking can be performed by the customer equipment, such as the CPE router or the customer's hosts. If the marking is performed by the subscriber, the operator's border router needs to police the traffic according to the service contract between the operator and the customer. The operator may also need to measure the traffic for accounting purposes, depending on the contract. Alternatively, the operator can perform the marking in the access nodes, on the basis of a policy agreed with the customer.
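
As an illustration of the policing step for subscriber-marked traffic, the following is a minimal sketch in Python. It assumes a single-rate token bucket per DS byte value and assumes that out-of-profile packets are re-marked to best effort. Neither choice is mandated by this document; dropping the excess is an equally valid policy. The names TokenBucket, police and contract are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Single-rate token bucket (assumed policer model)."""
    rate: float          # contracted rate, bytes per second
    burst: float         # bucket depth, bytes
    tokens: float = 0.0  # current fill level, bytes
    last: float = 0.0    # time of the previous update, seconds

    def __post_init__(self):
        self.tokens = self.burst  # the bucket starts full

    def conforms(self, size, now):
        # Refill for the elapsed time, never beyond the bucket depth.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


def police(ds_byte, size, now, contract, best_effort=0x00):
    """Return the DS byte the packet carries after policing.

    contract maps a DS byte value to the TokenBucket agreed for it.
    Markings outside the contract, and traffic in excess of the
    profile, are re-marked to best effort (assumed policy).
    """
    bucket = contract.get(ds_byte)
    if bucket is not None and bucket.conforms(size, now):
        return ds_byte
    return best_effort
```

With a contract of 1000 bytes/s and a 1500 byte burst for one marking, the first 1000 byte packet keeps its marking, an immediate second one is re-marked, and a third one arriving half a second later conforms again.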
9.2.2.3 Guaranteed services to customer

For the security reasons stated in the next section, basing guaranteed services towards the customer on MPLS labelling is not advisable. If guaranteed services are supported, the signalling protocol, such as RSVP, needs to be terminated on the operator's border node, and the filter that achieves the classification needs to be applied to each packet. If signalling-based guaranteed service is used towards the public network, the network operator may assign the resulting traffic to its own LSP, or aggregate it onto an LSP with suitable service guarantees towards the public network. Note that the operator's border router does not necessarily have to perform the aggregation, as a suitable LSP towards the destination may well not be available. Alternatively, if signalling is not used, the operator can simply apply a set of pre-specified filters according to a policy agreed between the customer and the operator.

9.2.2.4 MPLS to customer

The operator can run MPLS towards the customer premises, but some important considerations need to be taken into account in such environments.

Since the customer is a non-trusted entity from the operator's standpoint, and MPLS allows the establishment of switched paths towards the destination, the operator has no way to control which LSP the subscriber's traffic enters. This opens the possibility of denial of service attacks and other kinds of malicious use that could otherwise be prevented by ingress filtering on the operator's ingress node. Once the traffic has entered the LSP and merged with other traffic, it is impossible to determine where it originated, assuming that bogus source addresses are used.
The only way to prevent this would be to terminate the LSPs originating from the customer premises on the operator's border node, but in that case there is no reason to run MPLS to the customer for this type of traffic at all.

Additionally, as the customer has neither knowledge of the operator's traffic aggregation policies nor access to the routing information, the customer will not be able to perform traffic aggregation. In practice this would mean that the MPLS sessions between operator and subscriber would have to be based on individual flows, with the operator responsible for appropriate aggregation.

One environment where MPLS to the customer premises makes sense is when MPLS is used to create VPNs for the customer. The customer could then assign the traffic destined for an LSP that is part of a VPN to the appropriate VPN. Even in these environments, it would make sense to use ordinary routing for other traffic. This assumes that the VPN LSP endpoint(s) trust the sending entity to some extent, as the traffic would be carried quite transparently through the operator's network.

In any case, all traffic entering the operator's network that is destined to the public network should have its source address validated before it is encapsulated onto any label switched path.

In summary, MPLS to the customer's premises does not make much sense in typical environments.

9.2.3 MPLS core node

Figure 9.2.3 Implementation framework for MPLS core node TM functionality

MPLS core nodes are high capacity switching elements that contain only MPLS interfaces. Core nodes need to forward packets at high speed and differentiate the queuing treatment on the basis of the label with which the packets are received. These nodes also participate in routing and label distribution protocols, and have to support admission control for traffic that has reservation requests.
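
The label-only treatment in a core node can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a description of any particular implementation. The table layout, the two queue classes and the strict-priority scheduler are all assumptions, and the names LabelEntry and CoreNode are hypothetical. The document leaves the choice of scheduler open.

```python
from collections import deque
from typing import NamedTuple


class LabelEntry(NamedTuple):
    out_port: int
    out_label: int
    queue_class: int  # 0 is served first (assumed convention)


class CoreNode:
    def __init__(self, num_classes=2):
        self.num_classes = num_classes
        self.table = {}   # (in_port, in_label) -> LabelEntry
        self.queues = {}  # out_port -> one deque per queue class

    def install(self, in_port, in_label, entry):
        """Bind an incoming label; done by label distribution."""
        self.table[(in_port, in_label)] = entry
        self.queues.setdefault(
            entry.out_port,
            [deque() for _ in range(self.num_classes)])

    def receive(self, in_port, in_label, payload):
        """Forward on the label alone: no filters, no policy lookup."""
        entry = self.table.get((in_port, in_label))
        if entry is None:
            return False  # unknown label, packet is dropped
        queue = self.queues[entry.out_port][entry.queue_class]
        queue.append((entry.out_label, payload))  # label swap
        return True

    def transmit(self, out_port):
        """Strict priority: serve the first non-empty queue."""
        for queue in self.queues.get(out_port, []):
            if queue:
                return queue.popleft()
        return None
```

A packet that arrives later on a label bound to queue class 0 is transmitted ahead of an earlier packet bound to class 1, and the node never inspects anything other than the incoming port and label.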
The important point is that the state information governing the treatment of arriving packets can be determined on the basis of the label alone; there is no need to know or reapply the admission policies or traffic filters.

The following is a list of the traffic management functions typically performed by a core node.

Mandatory functions:

- Admission policy
- Admission control
- Aggregation
- Queue management
- Passive congestion control
- Queuing
- Scheduling
- Label distribution

Optional mechanisms:

- Deaggregation
- Congestion marking
- LSP policing

The above mechanisms are described in more detail in sections 5 and 6 of this document.

9.3 Interface categories

9.3.1 Interface to non-MPLS networks

This interface is the point where the MPLS domain connects to the existing network infrastructure, and the first point in the ingress direction where packet labelling is performed. On the egress side of the interface, labels are removed and packets are encapsulated according to the corresponding data link layer encapsulation.

9.3.2 Interface inside MPLS network domains

This interface is the interconnection point between the different MPLS network elements inside the domain. It is characterised by the fact that packets are received and transmitted labelled, and that the forwarding and scheduling decisions are made on the basis of the label associated with the received packet.

9.3.3 Interface between MPLS network domains

This interface is the interconnection point between two operationally different MPLS network domains. Such an interface applies the policies related to the admission of labelled path set-ups through the operator's network, and meters the usage, especially for advanced service categories, so that inter-operator settlement agreements can be monitored or created. The policing functions in this interface are applied at the LSP level.

The deaggregation of the arriving traffic, which is aggregated onto incoming LSPs, to determine the appropriate LSPs inside the domain can be done either immediately at this interface point or somewhere else in the network. Deaggregation is generally required, as it is advantageous for the external domain's operator to aggregate the traffic as much as possible, and also because the internal topology (and the corresponding LDP paths) is not known to the external domain.

10. LSP MAPPINGS TO EXISTING LINK LAYER TECHNOLOGIES

The discussion of the mappings of traffic with different service guarantees to specific data link layers, and of which of the requirements outlined in chapter 4 can be achieved, goes here. Concentrate on ATM and Frame Relay environments, and on what is missing from the current best effort mapping proposals.

--- This section to be added later ---

11. GENERAL REQUIREMENTS FOR LABEL ENCAPSULATIONS

11.1 Differentiated services support

Proposals for "differentiated services" require some priority bits to be carried in the packets, to provide additional information that helps the intermediate routers select appropriate queuing and scheduling actions. These mechanisms generally rely on the use of the IPv4 TOS field. At first sight, it appears that the scheduling action should be determined from both the label and these differentiated service bits. There are, however, several reasons that make it preferable to determine all associated parameters strictly on the basis of the information contained in the label:

1.) Straightforward mapping to hardware implementation

Since it is expected that MPLS nodes may be based on high-capacity hardware implementations of the forwarding process, it is expected that the lookup result can be mapped directly onto the hardware implementation of the particular product.
Since the internal implementations of the supporting mechanisms are not subject to standardisation, it is possible that even if some header bits are used to indicate e.g. priority, some possibly complex mapping still needs to be performed to derive the information that controls hardware-based scheduling decisions. When the information is distributed with the LDP, the network element can perform the necessary internal mappings in advance, and then use the hardware lookup table to determine the associated parameters that control the scheduling hardware.

2.) Support for fine grained service guarantees

To support fine grained service guarantees, such as INTSERV controlled load or guaranteed service, it is impractical to carry the required amount of state information in every packet. Also, because implementations vary, the information cannot be subject to standardisation. In addition, the reasons given in 1.) also apply here.

3.) Requires MPLS node to look only at a fixed portion of the header

Even if the information for providing differentiated services could be carried in the packet, the system becomes specific to the header formats of the given protocol, such as IPv4. When the protocol changes, the position of the information inside the header changes as well. This implies that the hardware either has to identify the protocol on the fly to determine where to look for the information, or has to be statically configured for it, which means that simultaneous support of multiple protocols is not feasible. When the information is retrieved on the basis of the label alone, these problems do not exist. It would be possible, for example, to provide exactly the same services for IPv4 and IPv6 without any problems, using the same label based forwarding entity.

4.) Legacy HW support

Since much of the effort is currently concentrated on how label switching should be supported in legacy hardware with as few modifications as possible, it makes more sense to base the mapping on the label. For example, in current ATM environments there is no support for the differentiated services concept as it is being discussed, but some quite straightforward mappings can be realised using the currently defined ATM service categories.

5.) Single, standard forwarding paradigm

If the lookup is kept strictly label based, the same kinds of services can be provided for completely different applications and protocols using the same network elements. It also means that new services can be introduced by developing extensions to the LDP and implementing the appropriate improvements in the network elements, while keeping the same basic concept intact.

Note that using different labels for different service class encodings increases the required label space, but in environments that support only best effort or guaranteed traffic, these bits can be used by different LSPs.

11.2 Congestion management support

11.2.1 Congestion indicator bit

For the purposes of congestion management, it is desirable to have one bit of the label indicate that the LSP is experiencing congestion. If the label encapsulation header is protected by a checksum that covers the label, it is desirable either that this bit be excluded from the checksum calculation, so that the hardware can modify the bit directly, or that a checksum modification mechanism be specified that allows easy recalculation of the checksum when the bit is modified.

In frame relay environments, the FECN and BECN bits shall be used as the congestion notification bits. In ATM environments, the CI bit of the header shall be used for congestion notification.
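
To illustrate the second option, easy recalculation of the checksum when the congestion bit is modified, the following Python sketch updates a checksum incrementally instead of recomputing it over the whole header. It rests on assumptions that this document does not make: the header is treated as a sequence of 16-bit words, the checksum is the 16-bit one's complement Internet checksum, and the congestion indicator occupies a hypothetical bit position CI_BIT in the first word. The update rule is the incremental form HC' = ~(~HC + ~m + m') published in RFC 1624.

```python
CI_BIT = 0x0001  # hypothetical position of the congestion indicator


def ones_complement_sum(words):
    """16-bit one's complement sum with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words):
    """Full checksum over all header words."""
    return ~ones_complement_sum(words) & 0xFFFF


def set_ci(label_word, old_checksum):
    """Set the CI bit and patch the checksum incrementally.

    Only the changed word and the old checksum are needed, so the
    rest of the header does not have to be read again.
    """
    new_word = label_word | CI_BIT
    new_checksum = ~ones_complement_sum([
        ~old_checksum & 0xFFFF,  # ~HC
        ~label_word & 0xFFFF,    # ~m  (old value of the word)
        new_word,                # m'  (new value of the word)
    ]) & 0xFFFF
    return new_word, new_checksum
```

For the two-word header [0x1234, 0x00FF], the full checksum is 0xECCC. Setting the bit gives the word 0x1235 and the patched checksum 0xECCB, which equals the checksum recomputed over the modified header.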
When multilevel labelling is used, the value of the CI bit shall be copied to the CI bit of the second level label at the deaggregation point, where the top level label is removed.

11.2.2 Examine me bit

To support more advanced traffic management mechanisms, it may be useful to have one bit of the label indicate that the packet carries information that intermediate network elements need to either copy or modify. The advantage of encoding this bit in the label, rather than using a dedicated LSP between nodes, is that the associated operations can be performed on a per-LSP basis, possibly in hardware. The requirement for the support of this bit needs further study. It may be advisable to reserve one bit for this (or another) purpose from the beginning, even if its use has not been defined. This cannot easily be supported in standard ATM switching hardware, but ATM provides similar mechanisms at the cell level with OAM and RM cells.

11.3 Support for multilevel label switched paths

Multilevel label support is essential for scaleable support of label switched paths with explicit guarantees. The mechanisms for supporting it shall be included in the label encapsulation protocol. Two levels of labels are generally sufficient for traffic management purposes, and in ATM environments this may be realised by using VPI/VCI partitioning to support the first and second level label encodings.

12. GENERAL REQUIREMENTS FOR DISTRIBUTION OF LABELS AND TM ATTRIBUTES

To realise the basic set of TM functionality, the following functions shall be available in the protocol used to distribute labels and the associated traffic management attributes.

12.1 Setup request

This function is used to request the set-up of a label switched path of the desired topological scope, granularity and attributes.
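
As a purely illustrative model of the set-up functions of sections 12.1 to 12.4, the following Python sketch shows a set-up request carrying the bandwidth, discard priority class and delay priority class attributes, and a single link answering with either an acknowledge or a reject that carries a reason and the available bandwidth. The message and field names are hypothetical and do not represent LDP or RSVP objects. The admission rule (sum of reserved bandwidth must not exceed link capacity) is an assumption. A modification is modelled as a repeated request for an existing LSP.

```python
from dataclasses import dataclass
from enum import Enum


class RejectReason(Enum):
    NO_SERVICE_SUPPORT = "no support for service"
    NO_RESOURCES = "no resources"


@dataclass(frozen=True)
class SetupRequest:
    lsp_id: int
    bandwidth_bps: int
    discard_priority: int  # 1..Ndisc
    delay_priority: int    # 1..Ndel


@dataclass(frozen=True)
class SetupAck:
    lsp_id: int


@dataclass(frozen=True)
class SetupReject:
    lsp_id: int
    reason: RejectReason
    available_bps: int  # information returned with the rejection


class Link:
    def __init__(self, capacity_bps, n_disc, n_del):
        self.capacity_bps = capacity_bps
        self.n_disc = n_disc
        self.n_del = n_del
        self.reserved = {}  # lsp_id -> reserved bandwidth, bits/s

    def handle(self, req):
        """Process a set-up request or a set-up modification."""
        # Bandwidth free for this LSP, not counting its own
        # current reservation (relevant for modifications).
        used = sum(bw for lsp, bw in self.reserved.items()
                   if lsp != req.lsp_id)
        free = self.capacity_bps - used
        if not (1 <= req.discard_priority <= self.n_disc
                and 1 <= req.delay_priority <= self.n_del):
            return SetupReject(req.lsp_id,
                               RejectReason.NO_SERVICE_SUPPORT, free)
        if req.bandwidth_bps > free:
            return SetupReject(req.lsp_id,
                               RejectReason.NO_RESOURCES, free)
        self.reserved[req.lsp_id] = req.bandwidth_bps
        return SetupAck(req.lsp_id)
```

On a 10 Mbit/s link, a first request for 6 Mbit/s is acknowledged, a second request for 6 Mbit/s is rejected with "no resources" and 4 Mbit/s reported as available, and a modification of the first LSP to 9 Mbit/s is acknowledged.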
The traffic management related attributes need to be specified and available in the LSP set-up request. Some of the traffic management related attributes that shall be available for the set-up request function are:

- Bandwidth (bits/s)
- Discard priority class (1-Ndisc) (as specified by differentiated services WG)
- Delay priority class (1-Ndel) (as specified by differentiated services WG)

It shall be possible to add other attributes that may be required, depending on the desired services and/or signalling protocols that are to be used in the MPLS context.

12.2 Setup modification

The LSP set-up modification function is used to modify the attributes of LSPs that have already been set up. A modification can be, e.g., an increase or a reduction of the bandwidth of the associated label switched path. The same attributes as for the set-up request shall be available for the set-up modification function.

12.3 Setup Acknowledge

An LSP set-up acknowledge is received when the LSP with the desired attributes has been set up. A set-up acknowledge can result from either the set-up request or the set-up modification function.

12.4 Setup reject

An LSP set-up is rejected when the LSP with the desired attributes cannot be supported by the network. A set-up reject can result from either the set-up request or the set-up modification function. The set-up reject shall communicate the reason why the request or modification was rejected. Traffic management related parameters that shall be available for return in error conditions are:

- Reason for rejection: no support for service, no LDP available, no resources
- Information, such as: available bandwidth, highest available priority, etc.

12.5 Discussion of signaling protocols

12.5.1 General

There have been proposals to use RSVP for implementing all services that require more than best effort traffic category support in the MPLS environment.
There is also another proposal, which implements a limited set of services supporting a limited set of traffic management functionality, mainly suitable for the network operator's traffic engineering needs, within the LDP itself and therefore does not require the use of RSVP. The approaches have similar characteristics, and it is quite possible to achieve the desired functionality either way. Neither method is obviously better than the other, and it is unclear whether these proposals are complementary or competitive.

In addition, operator driven network traffic engineering does not have equally strict requirements for the dynamics and the granularity of control. It might be feasible to implement the attributes required for traffic engineered LSPs directly in the LDP, without mandating the use of RSVP in networks that do not support differentiated services.

To determine the suitability of the signalling mechanisms for TM support, the proposals should be evaluated against the traffic management related requirements and against their applicability to different topologies, topological scopes and reservation models.

12.5.2 LDP

The Label Distribution Protocol (LDP) has been proposed for communicating the bindings between routes and LSPs between the MPLS nodes. The current LDP proposal [Andersson97] does not include objects to carry any traffic management related attributes for the LSPs, except for a placeholder for the Class-of-service (COS) object. The COS object semantics have not been specified in the current version of that document, and they will likely be based on the work done in the IETF DIFFSERV working group.

It has been proposed to extend the current LDP proposal to include the basic set of traffic management related attributes as part of the LDP.
The reasoning behind this is that in environments that do not otherwise use RSVP and do not need all the features it provides, the additional complexity that comes with RSVP may be too expensive to implement, and the use of a single protocol (LDP) should be sufficient.

12.5.3 RSVP

RSVP [RFC2205] was originally developed for communicating the reservation parameters of unicast traffic, and the reservation parameters of heterogeneous receivers for multicast traffic. RSVP has scalability issues in large scale deployment, which are discussed in [RFC2208] and [Schwantag97].

MPLS can be used to address some of the scalability problems of RSVP by using RSVP signalling at the edges of the network and, inside the network, either RSVP tunnels or a mapping of the reservations to some other signalling protocol that carries the reservation information inside the operator's network and possibly across domain boundaries. MPLS networks have the ability, in the core network elements, to make forwarding decisions using a simple label based lookup instead of applying a flow specific filter to each packet, as required by a conventional RSVP implementation. Also, because the reservations are aggregated, the core routers can forward the traffic without keeping track of per-flow state.

RSVP has also been proposed as the signalling protocol to be used in the MPLS context for communicating the reservation attributes of the label switched paths inside the network [Li97]. This operation model can be coupled to user-to-network RSVP signalling, or can operate independently inside the network, between the network elements. There are also proposals for setting up explicit paths with reservations inside the MPLS domain using extended RSVP, and for assigning labels to such paths [Gan97], [Guerin97], [Davie97a] and [Davie97b].

13. REFERENCES

[Andersson97] "LDP Specification", L. Andersson, P. Doolan, N. Feldman, A. Fredette, work in progress, draft-mplsdt-ldp-spec-00.txt, November 1997

[ATMF96] "Traffic Management Specification, Version 4.0", ATM Forum, April 1996

[Berson97] "Aggregation of Integrated Services State", S. Berson, S. Vincent, work in progress, draft-berson-classy-approach-01.ps, November 1997

[Braden97] "Recommendations on Queue Management and Congestion Avoidance in the Internet", B. Braden, D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski and L. Zhang, work in progress, draft-irtf-e2e-queue-mgt-00.ps, March 1997

[Bradner97] "Internet Protocol Quality of Service Problem Statement", S. Bradner, work in progress, draft-bradner-qos-problem-00.txt, September 1997

[Callon97] "A Framework for Multiprotocol Label Switching", R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow, and A. Viswanathan, work in progress, draft-ietf-mpls-framework-02.txt, November 19, 1997

[Claffy95] "A parameterizable methodology for Internet traffic flow profiling", K.C. Claffy, H-W. Braun, G. C. Polyzos, IEEE Journal on Selected Areas in Communications, vol. 13, no. 8, pp. 1481-1494, October 1995

[Cisco97] "Netflow", White Paper, Cisco Systems, 1997

[Crawley98] "A Framework for QoS-based Routing in the Internet", E. Crawley, R. Nair, B. Rajagopalan, H. Sandick, work in progress, draft-ietf-qosr-framework-03.txt, March 2, 1998

[Davie97a] "Use of Label Switching With RSVP", B. Davie, Y. Rekhter, E. Rosen, A. Viswanathan, V. Srinivasan, work in progress, draft-davie-mpls-rsvp-01.txt, November 1997

[Davie97b] "Explicit Route Support in MPLS", B. Davie, T. Li, E. Rosen, Y. Rekhter, work in progress, draft-davie-mpls-explicit-routes-00.txt, November 1997

[Ferguson98] "Simple Differential Services: IP TOS and Precedence, Delay Indication, and Drop Preference", P. Ferguson, work in progress, draft-ferguson-delay-drop-01.txt, March 10, 1998

[Floyd93] "Random Early Detection gateways for Congestion Avoidance", S. Floyd, V. Jacobson, IEEE/ACM Transactions on Networking, volume 1, number 4, August 1993, pages 397-413

[Fredette97] "Stream Aggregation", A. Fredette, C. White, L. Andersson, P. Doolan, work in progress, November 1997

[Gan97] "Setting up Reservations on Explicit Paths using RSVP", D.-H. Gan, R. Guerin, S. Kamat, T. Li, E. Rosen, work in progress, draft-guerin-expl-path-rsvp-01.txt, 21 November 1997

[Guerin97] "Aggregating RSVP-based QoS Requests", R. Guerin, S. Blake, S. Herzog, work in progress, draft-guerin-aggreg-rsvp-00.txt, 21 November 1997

[I370] "Congestion Management for the ISDN Frame Relaying Bearer Service", Recommendation I.370, ITU-T, 1991

[Jagan97] "End-to-End Traffic Management in IP/ATM Internetworks", S. Jagannath, N. Yin, work in progress, draft-jagan-e2e-traf-mgmt-00.txt, August 1997

[Li98] "Provider Architecture for Differentiated Services and Traffic Engineering (PASTE)", T. Li, Y. Rekhter, work in progress, draft-li-paste-00.txt, January 1998

[Nichols98] "Differentiated Services Operational Model and Definitions", K. Nichols, S. Blake, work in progress, draft-nichols-dsopdef-00.txt, February 1998

[Packeteer97] "Controlling TCP/IP bandwidth", TCP/IP Bandwidth Management Series, Vol 1 Number 1, The Packeteer Technical Journal, 1997

[Ramakr97] "A Proposal to add Explicit Congestion Notification (ECN) to IPv6 and to TCP", K. K. Ramakrishnan, S. Floyd, work in progress, draft-kksjf-ecn-00.txt, November 1997

[Rampal97] "Flow Grouping For Reducing Reservation Requirements for Guaranteed Delay Service", S. Rampal, R. Guerin, work in progress, draft-rampal-flow-delay-service-01.txt, July 15, 1997

[RFC1633] "Integrated Services in the Internet Architecture: an Overview", R. Braden, D. Clark, S. Shenker, RFC-1633, June 1994

[RFC1827] "IP Encapsulating Security Payload (ESP)", R. Atkinson, RFC-1827, August 1995

[RFC1954] "Transmission of Flow Labelled IPv4 on ATM Data Links", P. Newman, W. L. Edwards, R. Hinden, E. Hoffman, F. Ching Liaw, T. Lyon, G. Minshall, RFC-1954, May 1996

[RFC2063] "Traffic Flow Measurement: Architecture", N. Brownlee, C. Mills, G. Ruth, RFC-2063, January 1997

[RFC2205] "Resource Reservation Protocol (RSVP) - Version 1 Functional Specification", R. Braden, L. Zhang, S. Berson, S. Herzog, S. Jamin, RFC-2205, September 1997

[RFC2208] "Resource ReSerVation Protocol (RSVP) Version 1 Applicability Statement: Some Guidelines on Deployment", A. Mankin, F. Baker, B. Braden, S. Bradner, M. O'Dell, A. Romanow, A. Weinrib, L. Zhang, RFC-2208, September 1997

[RFC2211] "Specification of the Controlled-Load Network Element Service", J. Wroclawski, RFC-2211, September 1997

[RFC2212] "Specification of the Guaranteed-Quality of Service", S. Shenker, C. Partridge, R. Guerin, RFC-2212, September 1997

[Rosen97a] "A proposed architecture for MPLS", E. Rosen, A. Viswanathan and R. Callon, work in progress, draft-ietf-mpls-arch-00.txt, July 1997

[Rosen97b] "Label Switching: Label Stack Encodings", E.C. Rosen, Y. Rekhter, D. Tappan, D. Farinacci, G. Fedorkow, T. Li, A. Conta, work in progress, draft-rosen-tag-stack-03.txt, July 1997

[Schwantag97] "An Analysis of the Applicability of RSVP", Ursula Schwantag, Diploma Thesis, Universitat Karlsruhe, July 15, 1997

[Smith97] "Research Challenges for the Next Generation Internet", J.E. Smith, F. W. Weingarten, Computing Research Association, May 12-14, 1997

[Tennenh89] "Layered Multiplexing Considered Harmful", D. Tennenhouse, Protocols for High-Speed Networks, Rudin and Williamson (Editors), North Holland, Amsterdam, 1989

[Touch97] "Bridging the Gap Between Optical Networks and the Internet: Summary of a Mini-Workshop", DRAFT, Oct. 1-2, 1997, Arlington, VA, Joe Touch, Ken Young, Joe Berthold

14. SECURITY CONSIDERATIONS

As support for different levels of service, together with different pricing structures, comes into effect, the mechanisms for monitoring service usage, enforcing the service contract between parties, authorisation and billing will become important. It is essential to develop the associated protocols in such a way that the various forms of service abuse, such as theft of service, are not easily possible.

Since this document is not a protocol specification, the specifics of the implementation alternatives are not discussed here.

15. AUTHOR'S ADDRESSES

Pasi Vaananen
Nokia Telecommunications, Inc.
3 Burlington Woods Drive, Suite 250
Burlington, MA 01803
USA
Phone: (781) 238-4981
Fax: (781) 238-4949
Email: pasi.vaananen@ntc.nokia.com

Rayadurgam Ravikanth
Nokia Research Center
3 Burlington Woods Drive, Suite 260
Burlington, MA 01803
USA
Phone: (781) 238-4905
Fax: (781) 238-4949
Email: ravikanth.rayadurgam@research.nokia.com