INTERNET-DRAFT  Double Identification Label Swapping (DILS)  July 97

Multiprotocol Label Switching (MPLS) Working Group
Internet Draft
Expires January 98

Gilad Goren (Tadiran LTD)
Ilias Iliadis
Patrick Droz (IBM Corp.)


           Double Identification Label Swapping (DILS)
                   for Merged ATM Connections

                  <draft-droz-dils-arch-00.txt>


Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   To learn the current status of any Internet-Draft, please check the
   "1id-abstract.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

Abstract

   This draft describes Double Identification Label Swapping (DILS) for
   merged multipoint-to-point connections as described in the ARIS
   proposal [ARIS]. DILS is a method that allows the receiver on a
   merged ATM AAL5/UBR connection to recognize the cells associated
   with a frame, so that the frame may be properly reassembled. The
   method uses two labels per cell in order to fully identify the cells
   of a frame. Furthermore, it allows the unique identification of the
   sending entity. It also avoids the memory requirements and latency
   associated with performing frame reassembly at intermediate
   switches.

1. Introduction

   In the case of multipoint-to-point connections, cells associated
   with frames stemming from different ingress nodes may be interleaved
   on links, as described in the MPLS framework document [MPLS] and, in
   particular, in the ARIS document [ARIS]. A solution to the cell
   interleaving problem is crucial in order to avoid additional delays
   and jitter in the context of flow merging on multipoint-to-point
   connections. The Double Identification Label Swapping (DILS)
   mechanism addresses this issue.

   The straightforward approach to frame switching over cell switched
   networks requires n*(n-1) virtual connections in order to connect
   all the end points in a full-mesh fashion. Every source-destination
   pair of end points needs its own virtual connection to identify the
   flow of cells. Assuming that a source sends cells frame by frame,
   and also marks the frame boundaries, the destination reassembles
   frames by demultiplexing the arriving cells according to their
   virtual connection.

   Virtual connections are identified by labels. The mechanism to
   establish and enable virtual connections in a distributed
   environment is known as label swapping. Labels have two duties in
   frame switching over cell switched networks. The first task is
   routing. A switch at every internal node of the network routes cells
   along a virtual connection according to the label they carry. The
   second task of the labels is to identify the sources of the cells,
   in order to enable cell interleaving inside the network and frame
   reassembly at the end points.

   These two tasks are independent and can therefore be separated.
   Actually, one needs only O(n) different labels (called destination
   labels) to route the cells, i.e. to identify the destination of a
   cell, and another O(n) labels (called source labels) to identify the
   source of a cell.
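   To make the scaling argument concrete, the following sketch compares
   the two label budgets for a few network sizes. It assumes exactly one
   destination label and one source label per end point (2n in total),
   which is only one way of meeting the O(n) bound stated above. The
   function names are illustrative and are not part of DILS.

```python
def full_mesh_connections(n: int) -> int:
    """Virtual connections needed to fully mesh n end points: n*(n-1)."""
    return n * (n - 1)


def dils_labels(n: int) -> int:
    """Labels under the ASSUMED model of one destination label plus
    one source label per end point (2n in total)."""
    return 2 * n


# Print both budgets side by side for a few network sizes.
for n in (5, 100, 4096):
    print(n, full_mesh_connections(n), dils_labels(n))
```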
   Together there are O(n x n) (source, destination) label pairs, but
   `non-egress' switches need only one label (the destination label)
   for routing, so they have to handle only O(n) different labels. Only
   the egress switches of the network interpret the other label (the
   source label), so they also need to handle only O(n) labels.

2. Label swapping

   Without loss of generality we can assume that every egress end point
   of the network is also an ingress end point. The ARIS mechanism is
   used to establish `n' L2 routing trees attached to the `n' egress
   end points of the network. Since an egress end point is also an
   ingress end point, a single label can be used to identify this
   ingress/egress end point. Consequently, we have a total of O(n)
   labels at the egress switches.

   When a cell is sent from an ingress end point A to an egress end
   point B, it carries two labels, the destination-label and the
   source-label. In the context of ATM, the labels are carried in the
   VPI/VCI fields of the ATM cell header. These labels are swapped as
   the cells travel through the switches.

   The entries of the label swapping table have the following
   structure:

      (upstream-label, downstream-label, output-port)

   The destination-label is swapped according to a table lookup in the
   upstream column. The source-label is swapped according to the
   incoming port and a table lookup in the downstream column.

3. Label establishment

   This is a two-phase protocol in which the second phase is only
   necessary in the case of asymmetric paths between two end points.

   The first phase starts with the following procedure, which is
   initiated for each end point. The egress integrated switch router
   (ISR) associated with an end point, say A, generates an
   establishment message (A,s1), where 'A' is an egress ID and 's1' is
   a label, and sends it to its neighbor switches (ISRs).
   Only the neighbors belonging to the 'L3 A'_egress tree will confirm
   the reception. Let us consider one of these neighbor switches, and
   let us assume that it received the establishment message (A,s1) at
   port x. This switch subsequently generates an establishment message
   (A,s2) and sends it to its own neighbors. Once again, only the
   neighbors belonging to the A_egress tree will confirm the reception.
   At the ports where reception was confirmed, the following entry is
   created: (s2 s1 x). This procedure is repeated until the
   establishment messages reach all the ingress switches associated
   with the other end points of the egress tree. By the end of this
   first phase, all the destination-labels associated with end point A
   have been established along its egress tree.

   Note also that, according to the ARIS protocol, the establishment
   messages carry the information about the path they traverse while
   they propagate towards the ingress switches. The first phase is
   followed by a similar reverse procedure that uses this information.
   The purpose of this second phase is to establish the source-labels
   that are also needed along the path.

   Consider another end point B. When an establishment message
   associated with the end point A finally reaches the egress switch
   associated with B, the second phase begins. This egress switch
   generates a reverse establishment message (rB,f1) and sends it to
   the previous switch. This message has a different structure from the
   original forward one, in that it carries the exact end-to-end path
   information. The relation between the reverse labels, such as f1,
   and the labels used by the forward establishment messages associated
   with the B_egress tree is discussed below. Let us assume that this
   message arrives at port y. The switch sends another reverse
   establishment message (rB,f2) to the appropriate port, according to
   the path information carried inside the received message, and
   creates an entry (f2 f1 y) at port y.
   This procedure is repeated at intermediate switches until the
   reverse establishment message reaches the egress switch associated
   with the end point A.

   Note that multiple establishment messages associated with any given
   end point may be generated at a port of a switch: a forward one
   associated with the egress tree of this end point, and reverse ones
   associated with the egress trees of the other end points. In this
   case, the label of these messages should be the same. This means
   that the label of the second and of any subsequent establishment
   message should be the same as the label sent with the first
   establishment message. Owing to different link delays, the first
   establishment message may be either a forward or a reverse one.

   Note also that, in the case of symmetric networks, the label space
   is reduced because the forward and reverse establishment messages
   associated with any given flow use the same path and, therefore, the
   same labels.

4. Example (Symmetric case)

       A ----\ +---+
              >| 1 |
       B ----/ +---+
              / u
             /
       +---+ x
       | 2 |
     z +---+ y
    /         \           /
   /           \         /
                a +---+ c
                  | 3 |
                b +---+ d
               /         \
              /           \
       +---+ v             w +---+ /---- D
C ---->| 4 |                 | 5 |<
       +---+                 +---+ \---- E

      A - E: egress identifiers
      1 - 5: ISRs
      u:     physical port of ISR-1
      x - z: physical ports of ISR-2
      a - d: physical ports of ISR-3
      v:     physical port of ISR-4
      w:     physical port of ISR-5

   To simplify the example, we assume here that each ISR uses one VC
   label per end point, i.e. the same VC label is sent to each of the
   upstream neighbors. Every ISR can choose labels independently of the
   other ISRs.

   At the beginning of the establishment phase, the egresses (1,4,5)
   send establishment messages (egress ID, label) to their neighbors:

      1: (A,1) (B,2)
      4: (C,10)
      5: (D,1) (E,10)

   When an establishment message coming from an egress end point
   arrives at an ISR on a given port, this ISR marks this port, then
   generates its own establishment message for this particular egress
   end point and sends it out to the appropriate ports (it does not
   send it to a downstream node or to the marked port). If the ISR is
   the egress switch associated with this end point, it discards the
   message.

   Next, ISR-2 forwards the following establishment messages to port y:

      2: (A,10) (B,11)

   Then, ISR-3 sends the following establishment messages:

      3: (A,10) (B,21) to ports b, c, d, and
         (C,12) to ports a, c, d, and
         (D,13) (E,14) to ports a, b, c.

   ISR-2 forwards the following messages:

      2: (C,1) (D,2) (E,3) to ports x and z.

   At that stage, the label swapping tables of ISR-2 and ISR-3 have the
   following structure: (in label, out label, out port).

   ISR-3:
      10  10  a  (-> A)
      21  11  a  (-> B)
      12  10  b  (-> C)
      13   1  d  (-> D)
      14  10  d  (-> E)

   ISR-2:
      10   1  x  (-> A)
      11   2  x  (-> B)
       1  12  y  (-> C)
       2  13  y  (-> D)
       3  14  y  (-> E)

   At the egress/ingress ISRs the virtual circuits go directly to the
   IP layer. The swapping tables of the egress ISRs look as follows (a
   hyphen stands for a VC that starts/ends at the ISR):

   ISR-1:
       1   -  -  (-> A)
       2   -  -  (-> B)
       -   1  u  (-> C)
       -   2  u  (-> D)
       -   3  u  (-> E)

   ISR-4:
       -  10  v  (-> A)
       -  21  v  (-> B)
      10   -  -  (-> C)
       -  13  v  (-> D)
       -  14  v  (-> E)

   ISR-5:
       -  10  w  (-> A)
       -  21  w  (-> B)
       -  12  w  (-> C)
       1   -  -  (-> D)
      10   -  -  (-> E)

   When the establishment messages initiated by A and B reach the
   egress switches 4 and 5, the reverse establishment message procedure
   is initiated. Messages (rC,10), (rD,1) and (rE,10) arrive at ISR-3.
   Then ISR-3 forwards the messages (rC,12), (rD,13) and (rE,14) on the
   port at which the forward establishment messages originally arrived,
   namely port a.
   Finally, ISR-2 forwards the messages (rC,1), (rD,2) and (rE,3) on
   port x. Similarly, when the establishment messages initiated by C, D
   and E reach ISR-1, the reverse establishment messages (rA,1) and
   (rB,2) are sent on port u. Then ISR-2 sends the messages (rA,10) and
   (rB,11) on port y. The swapping tables are expanded as follows:

   ISR-3:
      10  10  a  (<- A)
      21  11  a  (<- B)
      12  10  b  (<- C)
      13   1  d  (<- D)
      14  10  d  (<- E)

   and

   ISR-2:
      10   1  x  (<- A)
      11   2  x  (<- B)
       1  12  y  (<- C)
       2  13  y  (<- D)
       3  14  y  (<- E)

   Note that, due to symmetry, the entries of the swapping tables
   remain the same. The only difference is their port location. For
   example, the difference between entry <10 10 a (<- A)> and entry
   <10 10 a (-> A)> is that the former is created on port a, whereas
   the latter is created on ports b, c and d.

   Now, when a cell is sent from A to D it goes through three hops. The
   label fields have the following structure as it goes from ISR-1 to
   ISR-2, ISR-3 and ISR-5 (dst label, src label):

      ISR-1 to ISR-2: (2, 1)
      ISR-2 to ISR-3: (13, 10)
      ISR-3 to ISR-5: (1, 10)

   If at the same time another cell travels from C to D it goes through
   two hops:

      ISR-4 to ISR-3: (13, 10)
      ISR-3 to ISR-5: (1, 12)

   Notice that although both cells have the same labels when they
   arrive at ISR-3, they carry different labels as they leave ISR-3.

5. Example (asymmetric case)

          A ----\ +---+
                 >| 1 |
          B ----/ +---+ f
                / u     \
              /           \
       +---+ x             g +---+
       | 2 |                 | 6 |
     z +---+ y             h +---+ m
    /         \           /         \
   /           \         /           \
                a +---+ c
                  | 3 |
                b +---+ d
               /         \
              /           \
       +---+ v             w +---+ /---- D
C ---->| 4 |                 | 5 |<
       +---+                 +---+ \---- E

      A - E: egress identifiers
      1 - 6: ISRs
      u,f:   physical ports of ISR-1
      x - z: physical ports of ISR-2
      a - d: physical ports of ISR-3
      v:     physical port of ISR-4
      w:     physical port of ISR-5
      g,h,m: physical ports of ISR-6

   To simplify the example, we assume here that each ISR uses one VC
   label per end point, i.e. the same VC label is sent to each of the
   upstream neighbors. Every ISR can choose labels independently of the
   other ISRs.

   At the beginning of the establishment phase, the egresses (1,4,5)
   send establishment messages (egress ID, label) to their neighbors:

      1: (A,1) (B,2)
      4: (C,10)
      5: (D,1) (E,10)

   When an establishment message coming from an egress end point
   arrives at an ISR on a given port, this ISR marks this port, then
   generates its own establishment message for this particular egress
   end point and sends it out to the appropriate ports (it does not
   send it to a downstream node or to the marked port). If the ISR is
   the egress switch associated with this end point, it discards the
   message. Intermediate switches mark the ports to which they send
   establishment messages.

   Suppose, for example, that the flows originated by A and B reach
   ISR-3 through ISR-6, and the flows destined to A and B are routed
   via ISR-2.

   Next, ISR-2 forwards the following establishment messages to port y:

      2: (A,10) (B,11)

   Then, ISR-3 sends the following establishment messages:

      3: (A,10) (B,21) to ports b, c, d, and
         (C,12) to ports a, c, d, and
         (D,13) (E,14) to ports a, b, c.

   ISR-6 forwards the following messages:

      6: (C,1) (D,2) (E,3) to ports g and m.

   At that stage, the label swapping tables of ISR-2, ISR-6 and ISR-3
   have the following structure: (in label, out label, out port).

   ISR-3:
      10  10  a  (-> A)
      21  11  a  (-> B)
      12  10  b  (-> C)
      13   1  d  (-> D)
      14  10  d  (-> E)

   ISR-2:
      10   1  x  (-> A)
      11   2  x  (-> B)

   ISR-6:
       1  12  h  (-> C)
       2  13  h  (-> D)
       3  14  h  (-> E)

   At the egress/ingress ISRs the virtual circuits go directly to the
   IP layer. The swapping tables of the egress ISRs look as follows (a
   hyphen stands for a VC that starts/ends at the ISR):

   ISR-1:
       1   -  -  (-> A)
       2   -  -  (-> B)
       -   1  f  (-> C)
       -   2  f  (-> D)
       -   3  f  (-> E)

   ISR-4:
       -  10  v  (-> A)
       -  21  v  (-> B)
      10   -  -  (-> C)
       -  13  v  (-> D)
       -  14  v  (-> E)

   ISR-5:
       -  10  w  (-> A)
       -  21  w  (-> B)
       -  12  w  (-> C)
       1   -  -  (-> D)
      10   -  -  (-> E)

   When the establishment messages initiated by A and B reach the
   egress switches 4 and 5, the reverse establishment message procedure
   is initiated. Messages (rC,10), (rD,1) and (rE,10) arrive at ISR-3.
   Then ISR-3 forwards the messages (rC,12), (rD,13) and (rE,14) on the
   port at which the forward establishment messages originally arrived,
   namely port a. Finally, ISR-2 forwards the messages (rC,41), (rD,42)
   and (rE,43) on port x. Similarly, when the establishment messages
   initiated by C, D and E reach ISR-1, the reverse establishment
   messages (rA,1) and (rB,2) are sent on port f. Then ISR-6 sends the
   messages (rA,10) and (rB,52) on port h. The swapping tables are
   expanded as follows:

   ISR-3:
      10  10  c  (<- A)
      21  52  c  (<- B)
      12  10  b  (<-> C)
      13   1  d  (<-> D)
      14  10  d  (<-> E)

   ISR-2:
      41  12  y  (<- C)
      42  13  y  (<- D)
      43  14  y  (<- E)

   ISR-6:
      10   1  g  (<- A)
      52   2  g  (<- B)

   and

   ISR-1:
       -  41  u  (<- C)
       -  42  u  (<- D)
       -  43  u  (<- E)

   Now, when a cell is sent from A to D it goes through three hops. The
   label fields have the following structure as it goes from ISR-1 to
   ISR-6, ISR-3 and ISR-5 (dst label, src label):

      ISR-1 to ISR-6: (2, 1)
      ISR-6 to ISR-3: (13, 10)
      ISR-3 to ISR-5: (1, 10)

   If at the same time another cell travels from C to D it goes through
   two hops:

      ISR-4 to ISR-3: (13, 10)
      ISR-3 to ISR-5: (1, 12)

   Notice that although both cells have the same labels when they
   arrive at ISR-3, they carry different labels as they leave ISR-3.

6. Implementation

   The source-label is not a standard part of an ATM cell.
   In order to use it one may choose one of the following options:

   1) use the 24 bits of the VPI/VCI as two fields of 12 bits each,

   2) use the VPI for the destination label and the VCI field for the
      source label,

   3) use two bytes of the 48-byte payload to carry the source label.

   The amount of distributed information needed to perform the swapping
   operation depends strongly on the underlying hardware and software
   architecture.

   Option 1) can be implemented on most ATM switching hardware without
   any hardware changes. One sets up the swapping table to swap the VPI
   and VCI concurrently (a VCC with non-zero VPI). Only the border
   nodes have to interpret the source and destination labels, which
   cross the border between the VPI and VCI fields. Twelve bits per
   label should be enough, as we do not expect more than 4096 boundary
   ISRs in one network.

   An interesting version of Option 2) arises when, per egress tree,
   the same label can be used on every branch of the tree. In such a
   situation the source label does not need to be swapped at all. The
   swapping table can therefore be set up as a pure VPC where only the
   VPI has to be swapped. Such an implementation requires a special
   mapping from every egress identifier to a globally unique label.
   This and some other mappings are discussed in the ARIS proposal.
   More generally, DILS fits in well with ARIS.

   Option 3) may be required in the case of very large networks where
   one wants to have 16 bits per label.

   An important implementation question is where the swapping
   information is stored. In general one uses swapping tables per port
   as well as some global information in the control point. This means
   that normally the information to swap the destination label is found
   in the input port, while the information for the source label is
   found on the outgoing port. So one can do a swap operation on the
   incoming as well as on the outgoing port.
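   As a rough illustration of Option 1) and of the two lookups involved
   in a swap, the sketch below packs a (destination, source) label pair
   into a 24-bit VPI/VCI value and replays the hops of the symmetric
   example of Section 4. The placement of the destination label in the
   upper 12 bits, the 8-bit VPI / 16-bit VCI split, and all function
   names are assumptions made for illustration only. The tables are the
   ISR-2 and ISR-3 tables of Section 4, where forward and reverse
   entries coincide.

```python
LABEL_BITS = 12
LABEL_MASK = (1 << LABEL_BITS) - 1


def pack_labels(dst, src):
    """Option 1: two 12-bit labels in 24 bits, returned as (VPI, VCI).
    ASSUMPTION: dst occupies the upper 12 bits, VPI is 8 bits wide."""
    if not (0 <= dst <= LABEL_MASK and 0 <= src <= LABEL_MASK):
        raise ValueError("labels must fit in 12 bits")
    word = (dst << LABEL_BITS) | src
    return word >> 16, word & 0xFFFF


def unpack_labels(vpi, vci):
    """Inverse of pack_labels, as a border node would apply it."""
    word = (vpi << 16) | vci
    return word >> LABEL_BITS, word & LABEL_MASK


# (upstream label, downstream label, port), copied from Section 4.
TABLES = {
    "ISR-2": [(10, 1, "x"), (11, 2, "x"), (1, 12, "y"),
              (2, 13, "y"), (3, 14, "y")],
    "ISR-3": [(10, 10, "a"), (21, 11, "a"), (12, 10, "b"),
              (13, 1, "d"), (14, 10, "d")],
}


def dils_swap(isr, in_port, dst, src):
    """Swap both labels of a cell arriving at `isr` on `in_port`."""
    table = TABLES[isr]
    # Destination label: lookup in the upstream column.
    new_dst, out_port = next(
        (down, port) for up, down, port in table if up == dst)
    # Source label: lookup in the downstream column, restricted to
    # the entries created on the incoming port.
    new_src = next(
        up for up, down, port in table
        if port == in_port and down == src)
    return out_port, new_dst, new_src


# A -> D: the cell enters ISR-2 on port x, then ISR-3 on port a.
hop1 = dils_swap("ISR-2", "x", 2, 1)
hop2 = dils_swap("ISR-3", "a", hop1[1], hop1[2])
# C -> D: the cell enters ISR-3 on port b carrying the same labels.
hop3 = dils_swap("ISR-3", "b", 13, 10)
print(hop1, hop2, hop3)
```

   With these tables the A -> D cell leaves ISR-3 carrying (1, 10) and
   the C -> D cell carrying (1, 12), matching the label pairs listed in
   Section 4.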
   It is preferable, however, to encode the full information
   transparently in the table of the incoming port.

   If a node manages to allocate the same label on all of its upstream
   links, then a more compact representation of the swapping
   information can be achieved in the control point. This is because
   the swapping information has a global nature.

7. RM Cells

   The existence of the source address information per cell allows an
   egress to send RM cells back to the ingress in order to support ABR
   flow control. This information can also be used for billing
   purposes.

8. Security Considerations

   Security considerations are not addressed in this memo.

9. Summary

   This draft presents a method that uses both forward and backward
   swapping information to uniquely identify source/destination pairs
   in order to solve the cell interleaving problem that arises in the
   context of flow merging. By combining the forward and backward
   information, a globally unique sender identification can be avoided,
   thus leading to a fully distributed architecture.

   Combined with ARIS, the proposed method provides a strong solution
   for IP flow merging. DILS and ARIS complement each other by
   providing the following properties. The required label space is on
   the order of n, i.e. O(n), compared with the n-squared VCCs of the
   full-mesh solution. The combined scheme preserves the native ATM
   traffic characteristics because it avoids pseudo reassembly. In
   addition, no globally unique sender identification is required.

10. References

   [ARIS]  "ARIS: Aggregate Route-Based IP Switching", Arun
           Viswanathan, Nancy Feldman, Rick Boivie, Rich Woundy,
           <draft-viswanathan-aris-overview-00.txt>, March 1997.

           "ARIS Specification", Nancy Feldman, Arun Viswanathan,
           <draft-feldman-aris-spec-00.txt>, March 1997.

   [MPLS]  "A Framework For Multiprotocol Label Switching", R. Callon
           et al., Internet Draft, May 1997.

11. Authors

   Gilad Goren
   Tadiran Telecomm LTD
   16 Martin Gehl St.
   P.O. Box 500
   Petah-Tikva 49104
   ISRAEL
   giladg@telecomm.tadiran.co.il

   Ilias Iliadis
   IBM Research Division
   Zurich Research Laboratory
   Saumerstrasse 4
   8803 Ruschlikon
   Switzerland
   ili@zurich.ibm.com

   Patrick Droz
   IBM Research Division
   Zurich Research Laboratory
   Saumerstrasse 4
   8803 Ruschlikon
   Switzerland
   dro@zurich.ibm.com