Internet Draft

MPLS Working Group                                        Zheng Wang
Internet Draft                                    Grenville Armitage
Expiration: Dec 1997                  Bell Labs, Lucent Technologies
                                                           July 1997


            Scalability Issues in Label Switching over ATM


Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a
"working draft" or "work in progress."

Please check the 1id-abstracts.txt listing contained in the
internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net,
nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the
current status of any Internet Draft.

1. Introduction

The scalability of label switching over ATM is one of the fundamental
issues in MPLS that have not been fully understood. Whether or not
one should assume stream merging in ATM is a major design decision
that has many implications for MPLS protocols and ATM hardware
design. The issues are also common to all proposals for setting up
labels [1,2,3,4,5].

In this document, we present an analysis of the scalability of label
switching over ATM, and examine some possible solutions. The document
is intended to do two things:

- Facilitate discussions in the MPLS WG that lead to realistic
  assessments of the label space issues.

- Result in additional text for the Framework document that captures
  the refined assessments.

Wang & Armitage          Expiration: Dec 1997                [Page 1]

Internet Draft             Scalability Issues               July 1997

2. Consequences of Conventional VPI/VCI Use

In the absence of non-standard ATM switch hardware, the need to avoid
interleaving cells from different AAL5 PDUs on a single VCC makes it
necessary to use a different label for each source/destination pair.
The number of labels required is therefore O(N**2) for N endpoints
(sources and destinations) in a cloud.

We now look at the worst-case label requirement, namely the maximum
number of labels required on a single link in one direction. To set
up switched paths based on destination-based routing tables for a
network of N endpoints, the worst-case label requirement is as
follows:

   (N**2)/4        if N is an even number
   (N**2 - 1)/4    if N is an odd number

It should be emphasized that this is the worst-case scenario, which
may never happen in real networks. However, the worst-case analysis
gives us a very conservative estimate of the scalability of label
switching over ATM.

The worst-case scenario occurs only when the following two extreme
conditions are met:

1) The network is divided into two parts with N/2 endpoints each (or
   with (N+1)/2 and (N-1)/2 endpoints if N is an odd number), and the
   two parts are connected by a single link.

2) Each endpoint in one part is simultaneously communicating with all
   endpoints in the other part.

Note that it is the link between the two parts that hits the
worst-case label requirement.

For switched paths set up between all N endpoints based on
destination-based routing, setting (N**2)/4 equal to the 2**M
available labels gives an upper limit of 2**(0.5*M + 1) endpoints,
where M is the length of the label in bits. For simplicity, we assume
here that both N and M are even numbers. If we use the 28-bit VPI/VCI
space in ATM for labels, the upper limit is 32K endpoints and the
maximum number of flows on a single link is 256M.

These results have a number of implications for the way we deal with
the scalability issues, which we discuss in the next few sections.
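
The worst-case arithmetic above can be checked with a short sketch.
This is only an illustration under the stated assumptions of this
section (a two-way split joined by a single link, and an even label
length M); the function names worst_case_labels and max_endpoints are
hypothetical and not part of any MPLS specification.

```python
def worst_case_labels(n: int) -> int:
    """Worst-case labels on a single link in one direction for n endpoints.

    The product of the two partition sizes equals (n**2)/4 for even n
    and (n**2 - 1)/4 for odd n, matching the formulas in the text.
    """
    return (n // 2) * ((n + 1) // 2)


def max_endpoints(label_bits: int) -> int:
    """Upper limit on endpoints, 2**(0.5*M + 1), for an M-bit label.

    Assumes label_bits (M) is even, as the text does.
    """
    if label_bits % 2 != 0:
        raise ValueError("this sketch assumes an even label length")
    return 2 ** (label_bits // 2 + 1)


if __name__ == "__main__":
    m = 28  # combined VPI/VCI space used as a single label
    n = max_endpoints(m)
    print(n)                     # 32768 endpoints (32K)
    print(worst_case_labels(n))  # 268435456 labels (256M = 2**28)
```

With M = 28 the sketch reproduces the 32K endpoint limit and the 256M
flows on the bottleneck link quoted above.
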

3. Cloud Size

Given that each endpoint represents an Edge LSR of an MPLS domain (an
edge router of the overlying routing domain), 32K endpoints would
seem to be a fairly large figure for the majority of current
networks. Furthermore, the worst-case scenario occurs only when a
network can be divided into two parts with a single link between
them. In most real network topologies, there are usually multiple
connections between any two parts of a network, so the upper limit
can be several times bigger than 32K.

On the other hand, the results assume that we only have best-effort
destination-based forwarding. Other types of traffic, such as
multicast and RSVP/explicit routes, will also consume label space.
However, it is difficult to quantify the level of such traffic in the
future Internet, and it is likely that the associated switched paths
will be established on an 'as needed' basis.

If we wanted to pre-establish switched paths for a few different
classes of traffic, such as low delay, high throughput, high
reliability, etc., the worst-case upper limit is then reduced by K*N,
where K is the maximum number of classes and N is the number of
endpoints. This will reduce the scalability significantly. The
implication is that for traffic other than best-effort,
on-demand/on-request label setup is a more scalable approach, as the
likelihood of all the flows for all possible classes being active on
a single link is very small.

Note that the theoretical limit imposed by the size of the VPI/VCI
bit space actually overstates the case by ignoring the practical
limits imposed by the ATM NICs of Ingress and Egress LSRs. Typical
NICs can support on the order of a few thousand simultaneous SAR
instances. An Ingress LSR with a NIC that supports 4k SAR instances
can have at most 4k labeled paths originating from it and terminating
on it.
Any MPLS domain built with Edge LSRs supporting Y SAR instances will
have substantially fewer than Y Edge LSRs. This has a consequential
impact on the number of labels demanded through the core LSRs of the
MPLS domain.

4. Setup On-Demand/On-Request

Instead of pre-establishing switched paths among all endpoints, one
can set up switched paths on-demand or on-request. Such setup is
useful for the following reasons:

1) For many traffic types, such as multicast and QoS/explicit routes,
   pre-establishment of switched paths is not possible.

2) On-demand/on-request setup can exploit the locality of the traffic
   flows and thus improve scalability.

With on-demand/on-request setup, the theoretical scalability issue
becomes the probability of having 256M flows simultaneously active on
a single link. Even on backbone links, and given that Ingress and
Egress LSRs can source and sink only a few thousand labeled paths
each, this number of independent and non-aggregatable flows is
arguably unlikely. Even if this becomes a problem in the future,
intra-LSR solutions are possible (e.g. the virtual label space, which
is discussed in the next section).

5. Virtual Label Space

It is conceivable that an unusual topology could result in the
worst-case label consumption predicted above (e.g. some hot spots in
a backbone network connecting two large networks by a single link).
However, since the worst-case label consumption is localized, it is
arguably preferable to find a localized solution (rather than
something that would affect all switches in an MPLS domain). One
simple solution is to use the virtual label space.
At such hot spots, we can have multiple parallel physical links
instead of a single physical link. For example, if we have L smaller
physical links distributed across L ports between two LSRs, the total
usable label space (on the link, and in the port cards of the LSRs)
is expanded L times relative to what a single link could support.

6. VP Merge

VP merge allows multiple VPIs to be merged and uses different VCIs to
distinguish flows or packets within the merged VP. Each egress router
can thus be represented by a single VPI, and packets from different
ingress routers going to the same egress router simply use different
VCIs at the merging point.

With VP merge, the total number of labels available is unchanged
compared to simply using the whole VPI/VCI space as a single label.
However, since VPIs are set up for forwarding and VCIs are allocated
"as needed" to resolve cell interleaving, VP merge does improve
scalability by exploiting the locality of the flows. In this sense,
it is similar to the On-Demand/On-Request setup discussed in
Section 4. The difference is that in VP merge, VPI space is
pre-allocated while VCI space is allocated "as needed". This feature
does seem to be a good trade-off between setting up all label
switched paths in advance and allocating on a purely "as needed"
basis. VP merge also reduces the number of labels that have to be
managed by the switches. The downside of VP merge is that it requires
collision detection and resolution when allocating VCIs. Another
problem is that the VPI space is limited to 4096 values.

7. VC Merge

VC merge can reduce the worst-case label requirement to N, where N is
the number of endpoints. However, VC merge requires modifications to
current ATM cell switching.
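
As a rough illustration of the reduction stated above, the following
sketch compares the worst-case label count on a single link without
merging (the formula from Section 2) against the N labels needed with
VC merge. The function names are hypothetical, and the figures are
only the worst-case bounds used in this document, not measurements.

```python
def labels_without_merge(n: int) -> int:
    """Worst-case labels on one link, one direction, with no merging.

    Equals (n**2)/4 for even n and (n**2 - 1)/4 for odd n.
    """
    return (n // 2) * ((n + 1) // 2)


def labels_with_vc_merge(n: int) -> int:
    """Worst-case labels with VC merge: one per endpoint."""
    return n


def reduction_factor(n: int) -> float:
    """How many times fewer labels VC merge needs in the worst case."""
    return labels_without_merge(n) / labels_with_vc_merge(n)


if __name__ == "__main__":
    for n in (100, 1000, 32768):
        print(n, labels_without_merge(n), labels_with_vc_merge(n))
```

For the 32K-endpoint cloud of Section 2, the worst-case requirement
drops from 256M labels to 32K labels.
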
In VC merge, a switch has to wait for the last cell of a packet to
arrive before it can start to forward the cells. In effect, the
switch operates in a frame-forwarding mode.

VC merge may introduce extra buffering, depending on whether cells
from packets going to different destinations are interleaved. With
FIFO queuing, no such interleaving takes place, so a VC-merged
network has the same performance as a frame-based network. If we
assume per-flow round-robin, cells from packets to different
destinations may interleave, and at the next switch the cells have to
be sorted out in the reassembly buffer. At the cell level, the switch
now operates in a non-work-conserving mode, which introduces extra
delay and buffering, particularly when the utilization is low.

8. Security Issues

Security issues are not discussed here.

9. Conclusion

Based on the above analysis, our conclusion is that the combined
VPI/VCI space in ATM should be able to support networks of sufficient
size, and that even if label space is exhausted at some hot spots,
simple solutions exist to extend the label space at such points.

10. References

[1] Y. Rekhter, B. Davie, D. Katz, E. Rosen, G. Swallow, "Cisco
    Systems' Tag Switching Architecture Overview", RFC 2105, Feb 1997.

[2] A. Viswanathan, N. Feldman, R. Boivie, R. Woundy, "Aggregate
    Route-Based IP Switching", Internet Draft, Mar 1997.

[3] Y. Katsube, K. Nagami, H. Esaki, "Cell Switch Router - Basic
    Concept and Migration Scenario", Networld+Interop'96 Engineer
    Conference, July 1996.

[4] Peter Newman, Tom Lyon, Greg Minshall, "Flow Labelled IP: A
    Connectionless Approach to ATM", IEEE Infocom, March 1996.

[5] Arup Acharya, Rajiv Dighe, Furquan Ansari, "IPSOFACTO: IP
    Switching Over Fast ATM Cell Transport", Internet Draft,
    July 1997.

[6] Indra Widjaja, Anwar Elwalid, "Performance issues in VC-Merged
    Capable Switches for IP over ATM Networks", pre-print, 1997.

Authors' Address:

Zheng Wang
Bell Labs
Lucent Technologies
101 Crawfords Corner Road
Holmdel, NJ 07733
Email: zhwang@bell-labs.com

Grenville Armitage
Bell Labs
Lucent Technologies
101 Crawfords Corner Road
Holmdel, NJ 07733
Email: gja@lucent.com