Internet Draft Network Working Group Wang Yahong Internet-Draft Datang Telecom Expiration Date: June 2000 Dec 1999 A probe method in MPLS loop avoidance <draft-wang-mpls-loop-probe-00.txt> Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. To view the list Internet-Draft Shadow Directories, see http://www.ietf.org/shadow.html. Abstract This paper introduces a very simple mechanism, based on "probe" message, which can be used for loop avoidance in the case that a LSP is changed. This probe mechanism is based on the idea that in the situation of one LSR has a new next hop and changes the LSP to new next hop, if this change leads to a loop, the only reason for the loop formation is that the LSR changes its next hop. So that we use a very simple length-fixed probe message, no state is needed to keep in the tandem LSRs, and the message primitives are very limited and simple. Also this method is flexible and robust. Wang Yahong Expires June 2000 [Page 1] Internet Draft draft-wang-mpls-loop-probe-00.txt June 1999 1. Why we need a simple loop avoidance mechanism It is clear that some mechanisms should be provided in MPLS to avoid the negative effect of LSP Loop formation[1-3], especially for the LSRs which can not process the TTL. There are several solutions to loop problem in MPLS,such as loop prevention,loop detection,loop mitigation. So a loop avoidance mechanism should be simple and effective, can it have a better competitive ability. Also in MPLS, the loop is not frequently formed, it is a little probability event, the time is urgent especially in case of one LSP is changed, so if a loop avoidance mechanism is too complex, the price is also very high. The current two mechanisms, diffusion(path-vector) algorithm and colored thread algorithm[3-4], are complex, diffusion algorithm need a length-variant path vector message, while "colored thread" is too complex in algorithm self(a complex state machine), it also needs the LSRs involved to keep some states, store some thread messages, and do five primitive threads action. Complex means less stable, more difficult to understand and implement, and may delay the time to detect a loop. So we need a simple and clear loop avoidance method, in this paper, we propose a very simple loopavoidance method in the case that a LSP is changed(not the case a LSP is built up),called "probe" method. As the draft is in it's preliminary form, so some items are not explained clearly,the explanations and background information can be found in[1][2][3]. 2. Basic idea The "probe" method is based on the watching result in the process of a loop formation in the case that a LSR changes its next hop and so changes a LSP. In the situation of one LSR has a new next hop and changes the LSP to new next hop, if this change leads to a loop, the only reason for the loop formation is the LSR changes its next hop. This result is clear and straight forward. In normal situation, before one LSR changes its next hop, there is no loop. So if the LSR changes the next hop and leads to a loop formation, there is only one action, the LSR changes its next hop. And there is at least one action for LSP changes from a loopless state to a loop state. So that the only reason for the loop formation is the LSR changes its next hop. We can get some further reductions from this result. If one LSR have a new next hop and change the LSP for certain FEC[1] to new next hop, there only two situation, create a loop or not. And in the situation of a loop formation the LSR must be in the loop. So if the LSR sends a Wang Yahong Expires June 2000 [Page 2] Internet Draft draft-wang-mpls-loop-probe-00.txt June 1999 message(probe) downstream, the message can only reach the egeress LSR (loopless) or return to itself(loop), no other situation (if the message has enough time to live). 3. The Probe method The probe method is used for loop avoidance in the case that a LSP is changed because one or more LSR changes its next hop. The basic mechanism of "probe" method is that when one LSR, referred as "Source LSR", has a new next hop and changes the LSP to new next hop, the Source LSR sends a probe message download with its LSR id(such as the IP address), so it can judge if there is loop for this next hop change based on the probe message. There are two alternatives for the detailed action. 3.1 Optimistic method To a LSP, once a LSR in the LSP has a new next hop to egress LSR in the LSP, so it changes it's next hop to the new LSR, now it can not decide if this change leads to a loop. So it sends a Probe Message(carries the LSR id and the flag to indicate it is a probe message) downstream to the new next hop LSR. To the Probe Message, we called the LSR who originally sends it is "Source LSR", the egress LSR for the LSP is also called as Egress LSR, the other LSR in the LSP is Tandem LSR. Once the Source LSR sends the Probe, it start a Timer(Max hops * time spent in each hop) .If Source LSR receive the Probe Message before the Timer is time out, it can decide the new hop leads to a loop, if no, it think the new LSP is loop free. The duration of the Timer is determined, during the time, it can do nothing, or it can send the data along the new LSP,but cache the data sent before the timer is time out, if loop is detected, it can send these data to a new alternative LSP. The Tandem LSR receive the Probe Message, it just send it downstream, do nothing else, when the probe message reaches the Egress LSR, it just is discard. 3.2 Pessimistic method To a LSP, once a LSR in the LSP has a new next hop to egress LSR in the LSP, so it changes it's next hop to the new LSR, now it can not decide if this change leads to a loop. So it sends a Probe Message(carries the LSR id and the flag to indicate it is a probe message) downstream to the new next hop LSR. Once the Source LSR sends the Probe, it start a Timer(Max hops * time spent in each hop) . So during the time-out period, it waits for the ACK message from the Egress LSR, if the timer is triggered but the Wang Yahong Expires June 2000 [Page 2] Internet Draft draft-wang-mpls-loop-probe-00.txt June 1999 Source LSR have not receive the ACK from the Egress LSR, so it can decide there is a loop. Else if the Source LSR receives the ACK Message from the Egress LSR before the timer is time out, it can judge there is no loop. The Tandem LSR receives the Probe Message, it just sends it downstream, do nothing else, when it receives the ACK message from the Egress LSR, it just sends it upstream, do nothing else. When the probe message reaches the Egress LSR,the Egress LSR responses to send back a probe acknowledgement message, for short as ACK. The duration of the Timer is determined, during the time, it can do nothing, or it can send the data along the new LSP, but cache the data sent before the timer is time out, if loop is detected, it can send these data to a new alternative LSP. Though the two methods are very similar, we give them separate names. Because for the Optimistic method, we suppose that the initial state is loopless, and we send a probe message to affirm there is a loop. For the Pessimistic method, we suppose that the initial state is loop, and we send a probe message to affirm there is a loopless. Because the two method are based on a timer(for Pessimistic method, maybe we can use a timer-less method, but if the time is too long to receive a ACK message, the L3 transient loop maybe disappear[5]). So they both can be affected by the bottleneck node(LSR) that where processing load is terribly high, then a Probe message will be delayed too much. So that for Optimistic method in some cases, loop formation will be omitted, for Pessimistic method, in some cases, some loopless situation maybe mistakenly judged to be loop. we can combine the two methods to get a better result. And we can adopt some means, such as sends a probe message several copies at a time, or use a local acknowledgement method to accelerate the process of probe message. We need a simulation and quantitative analysis and comparison to decide the details and do some selection. We use an example from [4] to explain this probe methode, in the figure below, thers is a break between LSR R5 and R6, so that R5 change its next hop to R1, so that there is loop R5-->R1--->R3--->R5. When R5 change the next hop from R6 to R1, it sends a Probe meassage downstream to R1,the probe passes form R1---> R3, then back to R5, The Source LSR can detect this loop in a very short time. R1------>R3------>R5------>R6------->R8 ^ ^ ^ | | | R2 R4 R7 Wang Yahong Expires June 2000 [Page 3] Internet Draft draft-wang-mpls-loop-probe-00.txt June 1999 4. Discussion 4.1 How and when a MPLS loop can be formed? Though the reasons for a loop formation are not too clear by now, but there are at least three explicit reasons: Transient L3 route loop(no TTL ability in some LSRs), route mis-configuration, a path psss through a MPLS domain and other non-MPLS domain. Loop maybe formed by two actions. One is that the LSP is bulit up, whether in a ordered or independent, or in downstream or downstream-on-demand[2]. the other is that one (or more )LSR changes the next hop for the same FEC and so changes the LSP. 4.2 Loop prevention and Loop detection For Probe Method, if during the timer-out period, the Source LSR does not send any data on the new LSP, it is loop prevention, if the Source LSR does send data on the new LSP, it is loop detection. 4.3 TTL in the probe message Generally, to avoid a Probe Message forms a loop, or is sent endless, a TTL field is needed. If we only use the Optimistic method, in the situation of only one Source LSR, TTL is no needed, because the Probe Message is sent downstream, so one result is that it reach the egress LSR(no loop), the second result is it return the Source LSR, no other results. But in the case of two or more Source LSR in one LSP, the case is complex,further study is need.And in the pessimistic method, the ACK message may need a TTL to avoid the Tandem LSR to keep some states. 4.4 Send downstream Because the LSP is lable-merged or VC-merged in some LSRs, so we think the best probe method is to send the probe downstream, not upstream. So that the process is simple and the number of LSRs involved is less. 4.5 Value of the Loop Timer It is very difficult for determine the time-out value of timer, though the value can be equal to Max hops *time spent in each hop(Maximum or average ). A feasible value need to determined by some simulation and affected by current network size and related protocol. 4.6 Probe method used in the LSP setup case Because in the case of LSP setup, because the LSR that initiate the LSP may not in the Loop if a loop is formed. But I think the idea behind the probe method may do some help. And in the case of LSP setup, time is not too urgent, so we can use a simple but maybe stupid method, such as the Wang Yahong Expires June 2000 [Page 4] Internet Draft draft-wang-mpls-loop-probe-00.txt June 1999 Confirm Information form an Egress LSR to Ingress LSR to determine there is no loop in this LSP. It is clear the probe method is very simple and also in its preliminary stage. Some questions should be for further discussion and simulation. 5. Conclusion The probe method is very simple method for loop avoidance in MPLS, its probe message only carries the LSR id and the flag to indicate it is a probe message, message length is fixed and short. The tandem LSRs just forward the probe message, the action is very simple, the tandem LSRs may need not know the message is for Loop probe. No state is needed to keep and no received message is needed to stored except the Souce LSR (Pessimistic method) who sends this probe and the Egress LSR(Optimistic method).Also it is very flexible and robust. 6. Author's Address Wang Yahong Beijing R&D Center Datang Telcom 40 Rd. XueYuan, 100083 Beijing, P.R.China. wangyh@catt.ac.cn 7. References [1] R. Callon, et al, " A Framework for Multiprotocol Label Switching", September 1999, <draft-ietf-mpls-framework-05.txt> [2] Eric C. Rosen, et al, " Multiprotocol Label Switching Architecture", August 1999, draft-ietf-mpls-arch-06.txt [3] Y. Obha, et al, "MPLS Loop Prevention Mechanism", October 1999, <draft-ietf-mpls-loop-prevention-02.txt> [4] Y. Obha, "Issues on loop prevention in MPLS Networks", pp64-68, IEEE Com Mag, Dec,1999 [5] S.Ayandeh, Y. Fan, " MPLS Routing Dynamics"(Expired) Wang Yahong Expires June 2000 [Page 5]