Internet Draft PIM-SM Working Group Heinrich Hummel Internet Draft Siemens AG Expiration Date: January 2001 June 2000 Source Only Multicast draft-hummel-pim-so-00.txt Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. Abstract This draft addresses several aspects of multicast services like IP- based TV but also like n-party conferences. It breaks with traditional concepts: An IP-based TV-broadcast's delivery tree shall be identified by the source address S and a Multicast-ID provided by S. The same shall apply to each from n well correlated point-to(n-1)point delivery trees of a small n-party conference. Consequently, a TV-host may send out more than one program and any user-host may participate in more than one conference. Mapping to a globally unique multicast group address G is not needed, would be complex, and does not work. QoS/Policy-Routing, the degree of state (Layer-3, Layer-3 plus MPLS- label, or no state), and several more aspects need to be respected HOMOGENEOUSLY all over the entire delivery tree. This cannot be provided by reverse path setup while depending on each receiver's guesses. Therefore, the tree (i.e. an initial tree as well as additional branches at a later time) must be established in downstream direction. No RP- and no BSR-routers are required. Hummel June 2000 [Page 1] Source Only Multicast Exp. Jan. 2001 1. Introduction, Requirements This draft addresses several aspects of multicast services. It is a "PIM-SO"-draft because the applications in mind are a)IP-based TV (single-source to eventually millions of leaves) and b) small n-party conferences however being implemented by n well correlated 1-point- to-(n-1)point unidirectional delivery trees. The presented proposals do not aim for minimizing the changes to PIM-SM /MSDP nor do they retain from questioning holy icons like class D addressing or reverse path setup. Instead they try to meet the real objectives of the services: Each home-page may eventually be its own TV-station and may eventually find millions of watchers. There may be tens of millions of small conferences and adhoc-conferences simultaneously. Any multicast service instance needs to be properly identified. Hereby, limitation and artificial complexity due to inappropriate,superfluous class-D addressing must be avoided. The following requirements a) - g) cannot be met by reverse path setup: a) QoS-sensitive routing Because of its voice/video nature, QoS-sensitive routing is more important for IP-based TV than for any unicast IP service ever. No one enjoys a cracking voice nor a flickering picture. b) Delivery tree using a minimal number of links Because of its broadband nature, delivery trees which use a minimal number of links are desirable. c) Policy-sensitive routing There may be "contents"-provider (like Time Warner) and service provider (like AOL) which share the common interest as to establish a Policy-sensitively routed delivery tree. d) VPN-sensitive routing Likewise, as VPN-support becomes a big issue (as well as a big business opportunity), delivery trees might be favored which intrude a VPN at only one single link. e) Degree of state Different degrees of state in the net have to be taken into consideration: Layer-3-state (as known from PIM-SM's delivery trees), Layer-3-state plus MPLS-label,or no state in any transit node ! Hummel June 2000 [Page 2] Source Only Multicast Exp. Jan. 2001 f) Option: Source notification for approval There may be multicast service instances which require that a join request is sent up to the source S for being approved. A join request (which is like the ATM Forum's LIJ SETUP REQUEST) may hit the delivery tree at any node and learn by a single bit of the state whether or not it needs to be forwarded up to source S for being approved. g) Option: Use of proxy root nodes The amount of work load (e.g. for the root node) may become an issue in case of TV-broadcasts with millions of receivers. It can be shifted to proxy root nodes especially if the source S does not care about approvement: A skeleton delivery tree may be established whose leaves are proxy root nodes. These proxy root nodes may be determined by configuration (in anticipation of all join requests) or based on a few "early" join requests. Each proxy root node may head a particular subtree of the entire delivery tree. Join requests are either intercepted by such proxy root nodes or are re-directed to them. All the aspects/requirements from a) to g) should respectively must be applied to the entire tree consistently/homogeneously. None of them can be taken care of in case the delivery tree is established reversely as is done by PIM-Join-messages - not at all and not homogeneously. Therefore, the delivery tree must be established in downstream direction - e.g. as is done by ATM Forum's Network LIJ concept. There,see [LIJ], at first a LEAF SETUP REQUEST message is sent upstream towards the source,which may or may not hit the delivery tree on its way. In case it does hit the delivery tree, it is supposed to follow the parent links of the delivery tree until it reaches either a proxy root node or the root node or source S. In response, an ADD PARTY/SETUP message is sent downstream. Hereby, the (proxy)root node will compute and inserts an explicit route information. Also: A proxy root node which receives a (PNNI- hierarchical) explicit route information (=DTLs), may compute and insert a "refill" of the received DTLs. (The fact that ATM Forum did not even grant baseline-text status to one of its best protocols ever is another story). However, in [LIJ] the entire delivery tree is built up by adding one branch at a time. Going beyond, the root node may compute and establish an initial delivery tree, which meets all the mentioned requirements, while honoring all "early" join requests in one stroke. Analogously, a proxy root node may compute and establish an initial subtree. Hummel June 2000 [Page 3] Source Only Multicast Exp. Jan. 2001 The root node (as well as the proxy root node) may honor "late" join requests by computing and adding additional branches to the tree (subtree), which also meet all the mentioned requirements. It will be shown how to compute and establish any such tree/branch. Finally, no RP-routers and no BSR-routers are required. This eliminates a lot of complexity and drawbacks (see also negative statements in [1]). 2 Multicast Service Identification 2.1 Identifying a TV-Multicast Service A TV-multicast service may be identified by the source address S of some "TV-"host and by a multicast-identifier (MC-ID), which identifies one of several packet streams originated by this TV-host. Hereby, it is irrelevant whether a such identified packet stream is endless (call it TV-channel or TV-program), or starts and ends at well announced times (call it a TV-broadcast). In the ATM Forum such a multicast-identifier was called "LIJ Call Identifier". This MC-ID may be a 4 bytes sized identifier, carrying numbers freely designed and assigned according to the TV-sender's internal management. However history teaches us that the IETF should care for some reserved numbers. A potential receiver may get this MC-ID e.g. from a Weekly TV-Guide or from the web-page of the TV-station. The information pair (S, MC-ID) completely identifies a particular multicast service instance. There is no need for and there is no sense in mapping this pair of information to a world-wide unique multicast group address G. No MADCAP is required. No shortage of Gs needs to be feared. We do not loose much by getting rid of G: In the net a state is now identified by 8 bytes (for S, MC-ID) rather than by 4 bytes (for G). At the edge the dubious because surjective mapping "G --> MAC-address M" needs to be replaced by a mapping "(S,MC-ID) --> M". IGMP must be modified. Receiver R cannot send a IGMP-Join message anymore which contains G and M. Instead, receiver R must send a new IGMP-Join-message which contains (S,MC_ID). The designated router DR(R) must assign an available MAC-address M and tell R, by means of a brand new message, that M corresponds to (S,MC-ID). This message may be unicasted back to R or broadcasted to the entire LAN. Hummel June 2000 [Page 4] Source Only Multicast Exp. Jan. 2001 Towards the internet however, DR (R) uses the information pair (S,MC-ID) as to identify the particular multicast service instance. 2.2 Identifying a small n-party conference Similarily, with respect to millions of small n-party voice/video/data conferences and adhoc-conferences: No globally unique multicast group address G is needed either. Anyone who has an IPv4 address may participate in an n-party conference at any time. Again: there are not sufficient many Gs. And there is no need either to employ a worldwide management that hands out available G values. Instead, the n-party conference can be conceived as a set of n well correlated point-to-(n-1)-point multicast service instances (in general; i.e.this shall not exclude cases where several parties reside on the same LAN). The n-party conference shall comprise n delivery trees, e.g. the tree with source S=party Pi, an MC-ID provided by Pi, and receiver parties P1,...,Pi-1, Pi+1,...Pn. Note (and compare it with the intended solutions of the new SGM/XCAST-WG): a): Any party may participate in several conferences simultaneously by using different MC-IDs. b): Each delivery tree is well identified by (S,MC-ID). This becomes important in case the delivery tree is real, i.e. does consist of (S,MC-ID)-specifically allocated states in the net (layer-3 state, or even layer-3 state + MPLS-label), and if it is impacted by any failing node or link. A multicast service which uses a pure logical delivery tree, i.e. a tree without allocated states in the net, is still a potential option. But not the only one! 3 Computing an initial delivery tree Prior to the starting time of the data transmission, the source S (=TV-host) may receive and collect all the join requests (which are like the ATM Forum's LIJ SETUP REQUEST messages) of all those receiver nodes (or even receivers if the service cares for individual approvals !) to which an initial delivery tree shall be established before data transmission starts. At the starting time source S sends a Tree-Establish message to DR(S) which contains the QoS-constraints, the requested degree-of-state, the option parameters, the list of DR(R)s and evtl. a list of proxy root nodes. Based on these information DR(S) computes a delivery tree while disregarding network links respectively network routes that do not meet the constraints and by doing the following: Hummel June 2000 [Page 5] Source Only Multicast Exp. Jan. 2001 3.1 Network topology is known (e.g. due to a link-state protocol): The absolute minimal tree is that tree whose number of links is minimal. We may approximate such a tree by computing a tree which branches off closest to the leaves as follows: The DR(S) runs a Dijkstra and computes a tree, whose n leaves nodes are all the DR(R)s and all the proxy root nodes and whose root node is DR(S) itself. We enlist all n leaf nodes in a sequence which starts with the most remote leaf node and ends with the nearest leaf node (in terms of number of hops). From the computed tree we keep only the linear node sequence between DR(S) and the most remote leaf node as a starter. Looping from the second most remote leaf node to the nearest leaf node we repetitively run Dijkstras: Each time we take the currently indexed leaf node i as the Dijkstra source node and iterate until the first shortest path (of least hops) to any node x of the so far built tree is computed. We add this shortest path (from x to i) to the tree. As a result we get a fairly minimal delivery tree. Example: Assume, the very first Dijkstra computation had resulted in the following shortest path tree (it takes 4 links to each DR(R) ) : The physical lines are drawn using these symbols: /,\,-,I The tree is added to the physical lines by doubling the symbols, i.e. by: //,\\, =, II =========o============o============o=============o DR(R1) // I I I I DR(S)o==========o============o============o=============o DR(R2) \\ I I I I =========O============O============O=============o DR(R3) Note that this shortest-path tree uses 12 links. Though R1,R2 and R3 are all equally remote,they may be enlisted according to their remoteness just in this sequence (R1,R2,R3). The delivery tree after the loop of Dijkstra computations will look like this: =========o============o============o=============o DR(R1) // I I I II DR(S)o ---------o------------o------------o-------------o DR(R2) \ I I I II ---------o------------o------------o-------------o DR(R3) Note that this minimal tree only uses 6 links ! Hummel June 2000 [Page 6] Source Only Multicast Exp. Jan. 2001 3.2 Collection of routes is provided (due to distance-vector prot.) Because we do not know the full meshes of the network topology, a "shortest path" delivery tree (i.e. a tree with a shortest path between its root and each leaf node) is the best we can accomplish. With regard to the example above: We must be happy with the tree which uses 12 links. We determine the shortest path delivery tree by "putting all routes to the DR(R)s on top of each other" and observe how they "diverge". 4 Establishing the initial delivery tree The computed tree must be mapped to a respective,tree-structured explicit routing information. See [TREE]: There, it is described how the root node may encode it (it is called TREE ROUTE TLV) and how any passed node should decode it, so that it can find out what information is relevant for itself and which part of the received TREE ROUTE TLV must be forwarded along which child link (oif). According to [TREE], the TREE ROUTE TLV consists of - Explicit Route TLVs (ER-TLVs) each of which contains a non- branching sequence of ER-HOP-TLVs (defined in [CR-LDP]); - "("-TLVs and ")"-TLVs for shaping the tree structure; - Node-Info-TLVs, which properly positioned, will carry target information to individual nodes and make them alert; - Nodes-Info-Begin-TLVs and Nodes-Info-End-TLVs, which properly positioned, will carrying target information to well confined clusters of nodes within the delivery tree and make them alert. Indeed, the TREE ROUTE TLV processing will guide a Tree-Establish message along the computed tree route to all the leaf nodes. It will make any leaf node aware to be a leaf in this tree, no matter whether it is a pure leaf node or a leaf+transit node. Each passed router is enabled to allocate the appropriate state according to the signalled state-degree: i.e. either the Layer-3- state, or the Layer-3-state plus MPLS-label-state ! If no state shall be allocated at any downstream router, no Tree- Establish message is required. Instead, the root node must forward each data packet together with the TREE ROUTE TLV which guides it to the leaf nodes. Hereby, the size of the TREE ROUTE TLV may be considered as a burden. If so, and if QoS-routing is not important than the TREE ROUTE TLV may be compressed by the following: Remove Hummel June 2000 [Page 7] Source Only Multicast Exp. Jan. 2001 from each ER-TLV all its ER-HOP-TLVs except the last one, but set the loose-Bit in the remaining ER-HOP-TLVs! According to [CR-LDP] the data packets will be forwarded based on the transit routers' own best knowledge towards the next loose-hop address. 5 Adding a single branch to the tree After the initial tree is established the root node may learn that a new DR(R) wants to join. It may compute an additional branch while considering all the objectives as it has done for computing the initial tree. In analogy to ATM Forum's ADD PARTY/SETUP, the root node may send out a message for adding the new branch. This message may be guided by a TREE ROUTE TLV whose first ER-TLV will send the message to the node X at which the new branch shall branch-off. This ER-TLV may be followed by a Node-Info-TLV which tells node X to extend its state by an new oif and to set a parameter in the message so that from there on, guided by a second ER-TLV, each passed node down to the new DR(R) is instructed to allocate a state for this multicast service. 6 Deleting a branch / subtree A DR (R) may notice that there are no local receivers anymore. Therefore, it may send upstream a message that a) prunes the respective branch properly and which, eventually, is forwarded further upstream either to some proxy root node or to the root node as to give that node some clue how the remaining tree would look like. Remember, the (proxy) root node in a network which runs a link-state-protocol may want to know about the (remaining) tree structure so that it can compute a minimal extension to the tree when a new branch shall be added. Also, source S may want to remove a particular receiver R from the delivery tree, which eventually results in removing a branch from the delivery tree. Finally, any network link may fail such that an entire subtree is separated from the delivery tree. In all these cases the necessary action can be worked out in detail by emulating [LIJ]. 7 Re-optimization of delivery trees See above: If minimal delivery trees are tried to be accomplished (if support by a link-state routing protocol can be assumed) it may occur that the tree "deteriorates" after some time because receivers join Hummel June 2000 [Page 8] Source Only Multicast Exp. Jan. 2001 and quit in arbitrary disadvantageous sequence. Therefore it may become appropriate from time to time to doublecheck on this and to start a process as to re-optimize the shape of the delivery tree. 8 N-party conference using correlated delivery trees As mentioned above in 2.2 small n-party conferences can be realized b n point-to-(n-1)point delivery trees, each of which is identified by source S (who is equal to receiver party Pi) and by receiver parties P1,...,Pi-1, Pi+1,....,Pn. The n delivery trees need to be correlated. Each party must know all the other parties in the conference. Whenever a new party contacts any single party which is already in the conference, they both may mutually agree that the new party shall join the conference. The contacted party shall supply to the newcomer a list of all members which are already participating in the conference as well as to all these (old) members the address of the newcomer. The newcomer may establish one full delivery tree, whereas each old party may add one new branch to the delivery tree which is rooted by himself. Whenever a conference party wants to quit the conference, it will tear-down "its own" delivery tree. Hereby, each remaining party gets to know about the quitting party and may remove the respective branch to the quitting party from its own delivery tree. 9 Conclusion At a moment where two new multicast work items are started ("PIM Source Only" and "Small Group Multicast/XCAST") I like to ask the working group members to seriously consider all the aspects brought up by this draft, and by doing so, find consistent and generalized solutions for big and small multicast services alike, with and without MPLS, which work no matter how many instances there will be. 10 References [1] "IP-Multicast: Past, Present and Future" by Christoph Diot & Radia Perlman, tutorial at the IEEE Infocomm 2000 in Tel Aviv. [TREE] Hummel and Loke "Explicit Tree Routing", draft-hummel-mpls- explicit-tree-01.txt, June 1999 [LIJ] ATM Forum Specification ltd-cs-ra-pnni-lij-01.02 "PNNI Leaf Hummel June 2000 [Page 9] Source Only Multicast Exp. Jan. 2001 Initiated Join (LIJ) v1.2 living list, Feb. 1999 [CR-LDP] "Constraint-Based LSP Setup using LDP", draft-ietf-mpls-cr- ldp-03.txt, Bilel Jamoussi, Editor 11 Author's Address Heinrich Hummel Siemens AG Hofmannstrasse 51 81379 Munich, Germany Tel: +49 89 722 32057 Email: heinrich.hummel@icn.siemens.de Full Copyright Statement "Copyright (C) The Internet Society (March 2000). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implmentation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. Hummel June 2000 [Page 10]