
Network Working Group                                     Dino Farinacci
Internet Draft                                             Yakov Rekhter
Expires: April, 1999                                       cisco Systems
                                                           November 1998


           Multicast Label Binding and Distribution using PIM
                <draft-farinacci-multicast-tagsw-01.txt>


Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To learn the current status of any Internet-Draft, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).


Abstract

   This document describes a method for advertising labels for multicast
   flows.  It strives to use downstream label assignment to be
   consistent with unicast label distribution. This proposal is media-
   type independent. Therefore, it works for multi-access/multicast
   capable LANs, point-to-point links, and NBMA networks.


1.0 Overview

   We propose to use PIM and combine the (*,G) and (S,G) join state with
   label assignment and distribution. Labels and multicast routes will
   be sent together in one message.


1.1 Goals

   i. We are motivated to have the upstream Label Switch Router (LSR)



Farinacci & Rekhter   Multicast Tagging using PIM               [Page 1]

   use one label for multicast data delivery on a network so we can make
   use of data-link multicast delivery where available.

   ii. We are motivated to use downstream label assignment to achieve:

      o Simplicity and consistency with unicast label assignment.

      o A per-interface Label Information Base (LIB) that guarantees
      unique label assignments on any interface.

      o Consistent algorithms for label assignment and distribution
      among different media types.

      o Advertisement of routing table state together with its
      associated label binding in a single control message, thus
      reducing race conditions.

      o No label reallocation or reassignment when there are RPF
      changes (i.e., when the multicast distribution tree takes a
      different shape).

      o Improved utilization of the label space by randomizing label
      assignment among all downstream routers joining for a group.

   iii. We want the design to work with both dense-mode and sparse-mode
   operation.


2.0 Proposal

   An LSR that supports multicast sends PIM Join messages on behalf of
   hosts that join groups. It sends Join messages to upstream
   neighboring LSRs toward the RP for the shared tree (*,G) or toward a
   source for a source tree (S,G). If the LSR creates the state for the
   group, it assigns a label for the respective (*,G) or (S,G) state
   and includes that label in the Join message associated with the
   multicast routing table entry. It also creates an entry in its LIB
   with the label as the incoming label component.

   The upstream LSR, when it receives the Join, will cache the new
   multicast routing table state along with the label. An entry is
   created in the LIB and the label is used as the outgoing component.
   This label will be used by the upstream LSR to forward multicast data
   packets.
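
   The exchange described above can be sketched in Python as follows.
   This is an illustrative model only, not part of the protocol: the
   Lsr class, its field names, and the starting label value of 16 are
   assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# A multicast route key: (source, group); source "*" stands for (*,G).
Route = Tuple[str, str]


@dataclass
class Lsr:
    """Hypothetical LSR holding the LIB for a single interface."""
    name: str
    next_label: int = 16  # assumed first usable label value
    incoming: Dict[Route, int] = field(default_factory=dict)
    outgoing: Dict[Route, int] = field(default_factory=dict)

    def build_join(self, route: Route) -> Tuple[Route, int]:
        """Downstream side: assign a label when the state is created
        and carry it in the Join for that route."""
        if route not in self.incoming:
            self.incoming[route] = self.next_label
            self.next_label += 1
        return route, self.incoming[route]

    def receive_join(self, join: Tuple[Route, int]) -> None:
        """Upstream side: cache the route and record the advertised
        label as the outgoing label component."""
        route, label = join
        self.outgoing[route] = label


downstream = Lsr("downstream")
upstream = Lsr("upstream")
join = downstream.build_join(("*", "224.1.1.1"))
upstream.receive_join(join)
```

   In this sketch the same label value ends up as the incoming label on
   the downstream LSR and as the outgoing label on the upstream LSR,
   which is the essence of downstream assignment.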

   Since PIM Join messages are multicast on a LAN, other downstream
   LSRs that are interested in the group will hear the message and can
   cache the binding of multicast routing table state and label state
   together. Since the upstream LSR is going to forward data packets
   using the advertised label, they must be ready to accept data
   packets carrying that label.

   The first downstream LSR that joins for a group is the label
   assigner (called the Label Allocation Server in other forums) on a
   LAN for a multicast route. All other downstream LSRs that send PIM
   Join messages will use the same label that the assigner selected.
   An LSR that sends a PIM Join message with a label of 0 indicates
   that it does not know the label for the associated multicast routing
   table entry. When this occurs, the assigner can trigger a PIM Join
   message to make the label known.
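
   A minimal Python sketch of this LAN behavior follows. The LanLsr
   class, its method names, the assign_if_unknown flag, and the label
   values are hypothetical; only the meaning of label 0 is taken from
   the text above.

```python
from typing import Dict, Optional, Set, Tuple

UNKNOWN_LABEL = 0  # a Join carrying label 0 means "label not known"

Route = Tuple[str, str]   # (source or "*", group)
Join = Tuple[Route, int]  # (route, label)


class LanLsr:
    """Hypothetical downstream LSR attached to a LAN."""

    def __init__(self, name: str, next_label: int) -> None:
        self.name = name
        self.next_label = next_label        # assumed private label range
        self.cache: Dict[Route, int] = {}   # route -> label used on the LAN
        self.owned: Set[Route] = set()      # routes this LSR assigned

    def make_join(self, route: Route, assign_if_unknown: bool = True) -> Join:
        """Reuse a known label; otherwise assign one or send label 0."""
        if route not in self.cache:
            if not assign_if_unknown:
                return route, UNKNOWN_LABEL
            self.cache[route] = self.next_label
            self.next_label += 1
            self.owned.add(route)
        return route, self.cache[route]

    def hear_join(self, join: Join) -> Optional[Join]:
        """Process a Join heard on the LAN. Returns a triggered Join
        when this LSR is the assigner and the sender lacked the label."""
        route, label = join
        if label == UNKNOWN_LABEL:
            if route in self.owned:
                return route, self.cache[route]
            return None
        self.cache.setdefault(route, label)
        return None


route = ("*", "224.1.1.1")
a, b, c = LanLsr("A", 100), LanLsr("B", 200), LanLsr("C", 300)
first = a.make_join(route)        # A joins first and becomes the assigner
b.hear_join(first)
second = b.make_join(route)       # B reuses A's label
query = c.make_join(route, assign_if_unknown=False)  # C sends label 0
triggered = a.hear_join(query)    # A answers with a triggered Join
c.hear_join(triggered)
```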

   This algorithm works on point-to-point links because there is only
   one downstream LSR on the link, and it always becomes the label
   assigner.

   On NBMA networks, all PIM routers are known to each other through
   pseudo-broadcast mechanisms provided by the data-link layer. However,
   PIM Join messages are unicast to the upstream LSR. Therefore, other
   downstream LSRs will not hear the label assigner's advertisement. To
   overcome this issue, each downstream LSR becomes a label assigner on
   NBMA networks. Since the upstream LSR is going to pseudo-broadcast
   the data anyway, it can supply the appropriate label on the copy of
   each packet that goes to each respective downstream LSR.
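
   The NBMA case can be sketched in Python as a per-neighbor outgoing
   label table on the upstream LSR. The NbmaUpstream class and its
   method names are assumptions made for this example, as are the
   addresses and label values.

```python
from typing import Dict, List, Tuple

Route = Tuple[str, str]  # (source or "*", group)


class NbmaUpstream:
    """Hypothetical upstream LSR on an NBMA network. Each downstream
    LSR is its own label assigner, so labels are kept per neighbor."""

    def __init__(self) -> None:
        self.outgoing: Dict[Route, Dict[str, int]] = {}

    def receive_join(self, neighbor: str, route: Route, label: int) -> None:
        """Record the label that this neighbor assigned for the route."""
        self.outgoing.setdefault(route, {})[neighbor] = label

    def replicate(self, route: Route, payload: bytes) -> List[Tuple[str, int, bytes]]:
        """Pseudo-broadcast: one copy per downstream neighbor, each
        carrying the label that neighbor advertised."""
        per_neighbor = self.outgoing.get(route, {})
        return [(nbr, label, payload) for nbr, label in sorted(per_neighbor.items())]


route = ("10.1.1.1", "224.1.1.1")
up = NbmaUpstream()
up.receive_join("10.0.0.2", route, 17)
up.receive_join("10.0.0.3", route, 42)
copies = up.replicate(route, b"data")
```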


2.1 Corner cases

   Multiple downstream LSRs cannot assign the same label value for any
   multicast route because they partition the label space into
   non-overlapping ranges according to [4]. When an LSR is enabled on
   an interface, it obtains a unique label range for the LAN.

   When the label assigner leaves the group, the label that it assigned
   remains active. The downstream LSR with the next-highest IP address
   becomes the owner of that label and may change it if it sees fit,
   but it is not required to do so. All downstream LSRs can continue
   to use the existing assignment in their Join messages.

   If two systems join for the first time (neither has state) at the
   same time and each chooses a different label value, the upstream LSR
   will use the label from the downstream LSR with the highest IP
   address. The lower-addressed LSR will also hear the higher-addressed
   LSR's Join and will use its label.
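
   The tie-break above can be sketched in Python as follows. The
   function name winning_label and the example addresses and labels
   are hypothetical. Addresses are compared numerically rather than as
   strings, since a string comparison would rank 10.0.0.9 above
   10.0.0.10.

```python
import ipaddress
from typing import Iterable, Tuple


def winning_label(joins: Iterable[Tuple[str, int]]) -> int:
    """Given (sender address, label) pairs from simultaneous first
    Joins, return the label of the highest-addressed sender."""
    _sender, label = max(joins, key=lambda j: ipaddress.ip_address(j[0]))
    return label


# Two LSRs join at the same time with different labels.
chosen = winning_label([("10.0.0.9", 100), ("10.0.0.10", 200)])
```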

   If the label assigner crashes, the downstream LSR with the highest
   IP address assigns a new label to each multicast route whose label
   was assigned by the crashed LSR and triggers a Join message so that
   all other LSRs on the LAN use the new label.

   When a LAN partitions due to a layer-2 switch failure, the same
   logic applies as when an LSR stops joining for a group. When the
   partition heals, there may be an RPF neighbor change in one of the
   partitions. When there is an RPF neighbor change and the downstream
   routers trigger joins to their new RPF neighbor with a different
   label assignment than the other partition is using, one of two
   resolutions occurs:

      1) The LSR which is the allocator in the partition of the new RPF
      neighbor will trigger a join if it has a higher IP address than
      the allocator in the other partition. The downstream routers in
      the other partition use the new label assignment immediately.

      2) If the LSR which is the allocator in the partition of the new
      RPF neighbor has a lower IP address, all downstream routers and
      the new RPF neighbor will switch to the label assigned by the
      allocator in the other partition.

   If an RPF change occurs (the topology changed so the upstream LSR is
   different), the PIM protocol spec indicates that a PIM Join may be
   triggered to get on the new distribution tree as soon as possible. In
   this case, if the label assigner becomes the upstream LSR, then the
   new highest IP addressed downstream LSR may become the label
   assigner. It may change the label if it sees fit. Otherwise, the same
   label is used.


3.0 Coexistence of Label-Capable and Label-Incapable multicast routers

   An upstream router will know whether all routers on a subnet are
   LSRs. If there are any label-incapable routers, the upstream router
   will not label encapsulate multicast data packets. The PIM Hello
   message, which is sent by every multicast-capable router, indicates
   whether the router is label capable.

   If the upstream router detects any non-PIM neighbors on the subnet,
   it will assume that they are label incapable and will not label
   encapsulate multicast data packets.

   As an optimization, if the upstream router knows that all downstream
   routers interested in the group are LSRs, it may label encapsulate
   multicast data packets even though there are other label-incapable
   routers on the subnet.
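
   The encapsulation decision for a subnet can be sketched in Python
   as follows. The Neighbor record, its field names, and the optimize
   flag are assumptions made for the example. The sketch also assumes
   that the upstream router knows, for every neighbor, whether it has
   joined the group.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Neighbor:
    """Hypothetical view of one router on the subnet."""
    address: str
    is_pim: bool           # the router sends PIM Hellos
    label_capable: bool    # its Hello indicates label capability
    joined: bool = False   # it has joined the group in question


def is_lsr(n: Neighbor) -> bool:
    # A non-PIM neighbor cannot signal label capability in a Hello.
    return n.is_pim and n.label_capable


def should_label_encapsulate(neighbors: Iterable[Neighbor],
                             optimize: bool = False) -> bool:
    """Decide whether multicast data is label encapsulated on a subnet."""
    nbrs = list(neighbors)
    if optimize:
        # Optimization: only routers interested in the group matter.
        interested = [n for n in nbrs if n.joined]
        return bool(interested) and all(is_lsr(n) for n in interested)
    # Base rule: every router on the subnet must be an LSR.
    return all(is_lsr(n) for n in nbrs)


subnet = [
    Neighbor("10.0.0.2", is_pim=True, label_capable=True, joined=True),
    Neighbor("10.0.0.3", is_pim=True, label_capable=False),
    Neighbor("10.0.0.4", is_pim=False, label_capable=False),
]
base = should_label_encapsulate(subnet)
optimized = should_label_encapsulate(subnet, optimize=True)
```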

   Related to the above cases, if there is a group member on a LAN
   co-located with a multicast LSR, only a single packet will be
   forwarded. It is the responsibility of the upstream router to
   decapsulate the labeled packet and forward it on the LAN as an IP
   packet so the member can receive it. The downstream routers may
   forward the IP packet or label encapsulate it.


4.0 Label Conflict Resolution

   The use of different data-link layer code points (e.g., Ethertypes
   or PPP protocol types) for unicast and multicast label switching
   makes it possible to distinguish labels associated with unicast
   routes from labels associated with multicast routes. Therefore,
   labels for unicast routes can be assigned completely independently
   of labels for multicast routes without creating any risk of
   ambiguity. For example, the same label value could be allocated for
   a unicast route and for a multicast route.
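
   The following Python sketch illustrates the idea by keying the
   lookup on the pair (code point, label). This memo does not name
   specific code points; the Ethertype values used here are the ones
   assigned for MPLS unicast and multicast and are an assumption of
   the example, as are the table contents.

```python
from typing import Dict, Tuple

ETHERTYPE_LABEL_UNICAST = 0x8847    # assumed: MPLS unicast Ethertype
ETHERTYPE_LABEL_MULTICAST = 0x8848  # assumed: MPLS multicast Ethertype

# Hypothetical table: the same label value 17 appears twice without
# ambiguity because the data-link code point is part of the key.
lib: Dict[Tuple[int, int], str] = {
    (ETHERTYPE_LABEL_UNICAST, 17): "unicast route 192.0.2.0/24",
    (ETHERTYPE_LABEL_MULTICAST, 17): "multicast route (*, 224.1.1.1)",
}


def lookup(ethertype: int, label: int) -> str:
    """Resolve a received label in the context of its code point."""
    return lib[(ethertype, label)]


unicast_entry = lookup(ETHERTYPE_LABEL_UNICAST, 17)
multicast_entry = lookup(ETHERTYPE_LABEL_MULTICAST, 17)
```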


5.0 Modifications to PIMv2

   PIMv2 has a packet format for each address type it may support when
   encoding both multicast and unicast addresses. We will define a new
   address type called "Label Address" for unicast address encoding. The
   label will accompany the source address in the Encoded Source Address
   format as specified in [2].  The label value will be in a 32-bit
   quantity following the source address. So, for example, an IPv4 Label
   Address format would look like:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Rsrvd   |S|W|R|   Mask Len    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Source Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Label                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Current Multicast Route Timer                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Label
      A 20-bit value assigned by the LSR sending the Join/Prune message.
      The 20-bit value is placed in the low-order 20 bits of this
      32-bit field.

   Current Multicast Route Timer
      The sender of a Join/Prune message inserts the time remaining
      before expiration of the multicast route table entry described by
      the Source Address (either the (S,G) or (*,G) entry). This is
      needed so that all routers on a common multi-access subnet can
      time out the entry at close to the same time, without recreating
      the state in one another when the source goes inactive.

   Refer to [2] for other field descriptions not specified here.
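
   As an illustration, the fields drawn in the diagram above could be
   packed and parsed in Python as follows. This sketch covers only the
   fields shown in the diagram, in the order shown. The LabelAddress
   type and the function names are hypothetical, and the unit of the
   timer value (seconds) is an assumption, since the text does not
   state it.

```python
import socket
import struct
from typing import NamedTuple

LABEL_MASK = 0xFFFFF  # the label occupies the low-order 20 bits

# Flags octet (5 reserved bits, then S, W, R), mask length, IPv4 source
# address, 32-bit label word, 32-bit timer word, all in network order.
_FORMAT = "!BB4sII"


class LabelAddress(NamedTuple):
    s_bit: bool
    w_bit: bool
    r_bit: bool
    mask_len: int
    source: str
    label: int
    timer: int  # assumed to be in seconds


def pack_label_address(a: LabelAddress) -> bytes:
    if not 0 <= a.label <= LABEL_MASK:
        raise ValueError("label must fit in 20 bits")
    flags = (int(a.s_bit) << 2) | (int(a.w_bit) << 1) | int(a.r_bit)
    return struct.pack(_FORMAT, flags, a.mask_len,
                       socket.inet_aton(a.source), a.label, a.timer)


def unpack_label_address(data: bytes) -> LabelAddress:
    flags, mask_len, src, label_word, timer = struct.unpack(_FORMAT, data)
    return LabelAddress(bool(flags & 0x04), bool(flags & 0x02),
                        bool(flags & 0x01), mask_len,
                        socket.inet_ntoa(src),
                        label_word & LABEL_MASK,  # ignore the upper 12 bits
                        timer)


entry = LabelAddress(True, True, True, 32, "10.1.1.1", 0x12345, 180)
wire = pack_label_address(entry)
decoded = unpack_label_address(wire)
```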


6.0 Label Distribution for dense-mode groups

   In dense-mode PIM, there is no downstream Join message traveling
   upstream to perform the binding of multicast routes with labels.
   However, since we don't want a separate algorithm for dense-mode
   groups, we extend this basic design for dense-mode PIM.

   When a downstream LSR creates (S,G) state from the receipt of 1)
   data, or 2) Join/Prune or Graft messages, it will start a periodic
   timer to send Join messages with label assignment information
   present. The messages look no different and are treated on receipt no
   differently than in the sparse-mode case.

   The periodic Join message will be multicast on the LAN with an
   upstream target address of 0.0.0.0. All multicast LSRs on the LAN
   must know the group operates in dense-mode. This is accomplished
   using standard PIM mechanisms.
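
   A small Python sketch of how the upstream target could be chosen
   when building a Join follows. The Join record and the build_join
   function are hypothetical names used only for this example.

```python
from typing import NamedTuple, Tuple

DENSE_MODE_UPSTREAM_TARGET = "0.0.0.0"


class Join(NamedTuple):
    upstream_target: str
    route: Tuple[str, str]  # (source, group)
    label: int


def build_join(route: Tuple[str, str], label: int,
               dense_mode: bool, rpf_neighbor: str) -> Join:
    """Dense-mode groups use an upstream target of 0.0.0.0; otherwise
    the Join is addressed to the RPF neighbor, as in sparse mode."""
    target = DENSE_MODE_UPSTREAM_TARGET if dense_mode else rpf_neighbor
    return Join(target, route, label)


dense_join = build_join(("10.1.1.1", "224.1.1.1"), 17, True, "10.0.0.1")
sparse_join = build_join(("10.1.1.1", "224.2.2.2"), 18, False, "10.0.0.1")
```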


7.0 Security Considerations

   Security considerations are not discussed in this memo.


8.0 Acknowledgments

   The authors would like to thank Fred Baker and Eric Rosen from cisco
   Systems for their insightful comments on this draft.


9.0 Authors' Addresses

   Dino Farinacci
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   Email: dino@cisco.com

   Yakov Rekhter
   Cisco Systems, Inc.
   170 Tasman Drive
   San Jose, CA, 95134
   Email: yakov@cisco.com


10.0 References

   [1] Multiprotocol Label Switching Architecture, draft-ietf-mpls-
   arch-02.txt, Rosen, Viswanathan, Callon, July, 1998.

   [2] Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol
   Specification, RFC 2362, Estrin, Farinacci, Helmy, Thaler, Deering,
   Handley, Jacobson, Liu, Sharma, Wei, June, 1998

   [3] LDP Specification, <draft-ietf-mpls-ldp-01.txt>, Andersson,
   Doolan, Feldman, Fredette, Thomas, August, 1998

   [4] Partitioning Label Space among Multicast Routers on a Common
   Subnet, , Farinacci,
   October, 1998

   [5] "MPLS Label Stack Encoding", draft-ietf-mpls-label-encaps-03.txt,
   Rosen, Rekhter, Tappan, Fedorkow, Li, Conta, September, 1998



























