Internet Draft
Yasuhiro Katsube
Ken-ichi Nagami
Hiroshi Esaki
(Toshiba R&D Center)
March 1st, 1995

Router Architecture Extensions for ATM : Overview
<draft-katsube-router-atm-overview-00.txt>

Status of this memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress".

To learn the current status of any Internet-Draft, please check the "1id-abstract.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Abstract

This memo describes a new internetworking architecture which makes better use of the properties of ATM. IP datagrams are transferred along a hop-by-hop path via routers, similar to the Classical IP Model [RFC1577], but datagram assembly/disassembly and IP header processing are not necessarily carried out at individual routers in the proposed architecture. The concept of a "Cell Switch Router (CSR)" is introduced as a new kind of internetworking equipment, which has ATM cell switching capabilities in addition to conventional IP datagram forwarding. A CSR can concatenate one incoming VC and another outgoing VC in order to provide a certain communication with ATM level connectivity even when its endpoints do not share a common IP address prefix. The proposed architecture can provide applications with the desired QOS and as much bandwidth as current ATM switch networks provide, while retaining the current router-based internetworking concept.
In addition, it is interoperable with IP level resource reservation protocols such as RSVP.

Katsube, et al.          Expires Sept. 1st, 1995          [Page 1]

1. Introduction

The Internet is growing both in its size and its traffic volume. In addition, recent applications often require guaranteed bandwidth and QOS rather than best effort. Such changes make the current hop-by-hop datagram forwarding paradigm inadequate and accelerate investigations into new internetworking architectures.

Roughly two distinct approaches can be seen as possible solutions: the use of ATM to convey IP datagrams, and the revision of IP to support the flow concept and resource reservation. Neither approach seems to take the other into account, although integration or interworking of them may be necessary to provide end hosts with high throughput and QOS guaranteed internetworking services over any datalink platform, not only ATM.

The new internetworking architecture proposed in this draft is based on the "Cell Switch Router", which has the following properties.

- It makes the best use of ATM's properties while retaining the current router-based internetworking and routing architecture.

- It takes into account interoperability with future IP that supports the flow concept and resource reservation.

Section 2 of this draft explains the background and motivations of our proposal. Section 3 gives an overview of the proposed internetworking architecture and its several remarkable features. Section 4 discusses various issues which are closely related to the design of the detailed architecture.

2. Backgrounds and Motivations

It is considered that the current hop-by-hop best effort datagram forwarding paradigm will not be adequate to support a future large scale Internet which accommodates a huge amount of traffic with a certain desired quality.
Two major schools of investigation can be seen in the IETF whose main purpose is to improve the ability of the Internet with regard to its throughput and QOS. One is to utilize ATM technology as much as possible, and the other is to introduce the concepts of resource reservation and flow into IP.

1) Utilization of ATM

Although the basic properties of ATM (necessity of connection setup, necessity of traffic contract, etc.) are not necessarily suited to conventional IP datagram transmission, its excellent throughput and delay characteristics lead us to investigate the realization of IP datagram transmission over ATM.

A typical internetworking architecture specified by the IETF IPoverATM WG is the "Classical IP Model" [RFC1577]. This model allows direct ATM connectivity only between nodes that share the same IP address prefix. IP datagrams should traverse routers whenever they go beyond IP subnet boundaries, even though their source and destination are accommodated in the same ATM cloud. Although an ATMARP is introduced which is based not on legacy datalink broadcast but on centralized ATMARP servers, this model does not require drastic changes to the legacy internetworking architectures with regard to the IP datagram forwarding process. This model still has problems of limited throughput and large latency due to IP header processing at every router. These problems will become more critical when multimedia applications that require much larger bandwidth and smaller latency become dominant in the near future.

Another internetworking model is currently under discussion in the IETF ROLC WG [KAT94] and the ATM Forum Multiprotocol-over-ATM SWG. The model, which we call the "NHRP (Next Hop Resolution Protocol) Model" here, aims at resolving the throughput and latency problems in the Classical IP Model and making the best use of the ability of ATM.
ATM connections can be directly established from an ingress point to an egress point of an ATM cloud even when they do not share the same IP address prefix. To enable this, a Next Hop Server [KAT94] (or Route Server) is introduced, which finds the egress point of the ATM cloud nearest to the given destination and resolves its ATM address. A query/response protocol between the server(s) and clients, and possibly between servers, will be specified. After the ATM address of a desired egress point is resolved, the client establishes a direct ATM connection to that point through ATM signaling procedures [ATM3.1]. The IP datagram forwarding function and the routing protocol processing function, both of which are provided by conventional routers, are distributed to the ATM cloud and the server(s) respectively.

Once a direct ATM connection has been set up through this procedure, IP datagrams do not have to experience hop-by-hop IP processing but can be transmitted over the direct ATM connection. Therefore, high throughput and low latency communications become possible even if they go beyond IP subnet boundaries. In this model, ATM is utilized not only as a datalink function but also as a replacement for the current hop-by-hop IP forwarding function.

However, it should be noted that the provision of such direct ATM connections does not mean the disappearance of legacy routers which interconnect distinct ATM-based IP subnets. For example, the hop-by-hop IP datagram forwarding function would still be required in the following cases:

- When you want to transmit IP datagrams before a direct ATM connection from an ingress point to an egress point of the ATM cloud is established

- When you neither require a certain QOS nor transmit a large amount of IP datagrams for some communication [REK95]

- When the direct ATM connection is not allowed for security or policy reasons

2) IP level resource reservation and flow support

Apart from the investigation of specific datalink technologies such as ATM, resource reservation technologies for desired IP level flows have been studied and are still under discussion. Typical examples are STII [STII] and RSVP [RSVP].

STII is regarded as a connection oriented IP which requires a connection setup process from a sender to a receiver (or receivers) before transmitting datagrams. STII-capable routers along the path of the requested connection reserve their resources for datagram forwarding according to its flow spec.

RSVP itself is not a connection oriented technology, since datagrams can be transmitted regardless of the result of the resource reservation process. After a resource reservation process from a receiver to a sender (or senders) is successfully completed, RSVP-capable routers along the path of the flow reserve their resources for datagram forwarding according to its flow spec.

Neither STII nor RSVP restricts the underlying datalink networks, since their primary purpose is to let routers provide each IP flow with the desired forwarding quality (by controlling their datagram scheduling rules). Since various datalink networks will coexist with the ATM datalink in the future, these IP level resource reservation technologies would be necessary in order to provide end-to-end IP flows with the desired bandwidth and QOS.

Taking this background into consideration, we should be aware of several issues which motivate our proposal.
- The ATM specific internetworking architecture proposed as the NHRP model [KAT94] does not take into account interoperability with IP level resource reservation or connection setup protocols. In particular, emulating RSVP in the NHRP-based ATM cloud seems to require much effort, since RSVP is a soft-state, receiver-oriented protocol.

- Although STII or RSVP-based routers will provide each IP flow with a desired bandwidth and QOS, they have some inherent throughput limitations due to their processor-based IP forwarding mechanism, compared with the switching mechanism of ATM.

The main objective of our proposal is to resolve the above issues. The proposed internetworking architecture makes the best use of the properties of ATM by extending legacy routers, which can also handle future IP features such as flow support and resource reservation.

3. Internetworking Architecture Based On Cell Switch Router

3.1 Overview

The Cell Switch Router (CSR) is a key network element of the proposed internetworking architecture. The CSR provides a cell switching function in addition to conventional IP datagram forwarding. Communications with high throughput and small latency, which are native properties of ATM, become possible by using this cell switching function even when the communications pass through IP subnetwork boundaries.

In an ATM Internet composed of CSRs, VPI/VCI-based cell switching which bypasses datagram assembly/disassembly and IP header processing is possible at every CSR for communications that warrant it (e.g., communications which require a certain amount of bandwidth and QOS), while hop-by-hop datagram forwarding based on the IP header is also possible at every CSR for other conventional communications.
By using such cell-level switching capabilities, the CSR is able to concatenate incoming and outgoing VPI/VCIs, although the concatenation in this case is controlled from outside the ATM cloud, unlike in conventional ATM switch nodes. By carrying out such VPI/VCI concatenations at multiple CSRs consecutively, a native ATM pipe composed of multiple ATM connections, each of which connects adjacent CSRs (or a CSR and a host/router), can be provided. We call such an ATM pipe an "ATM Bypass-pipe" to differentiate it from an "ATM VCC (VC connection)" provided by a single ATM datalink cloud.

Example network configurations based on CSRs are shown in figure 1. An ATM datalink network may be a large cloud which accommodates multiple IP subnets X, Y and Z. Alternatively, several distinct ATM datalinks may each accommodate a single IP subnet (X, Y and Z respectively). The latter configuration is referred to as the "Conventional Model" in [OHTA94] [ESAKI94] and is the more straightforward one for discussing the CSR, but the CSR is applicable to the former configuration as well.

Two different kinds of ATM VCs are defined between adjacent CSRs or between a CSR and ATM-attached hosts/routers.

1) Default-VC

This is a general purpose VC used by any communications which select the conventional hop-by-hop IP processed route. All incoming cells received from this VC are assembled into IP datagrams and handled based on their IP headers. VCs set up in the Classical IP Model are classified into this category.

2) Dedicated-VC

This VC is concatenated with other Dedicated-VCs to constitute an ATM Bypass-pipe for certain communications. The number of Dedicated-VCs necessary between two adjacent nodes is the same as the number of Bypass-pipes which pass through these nodes.

Ingress/egress nodes of the Bypass-pipe can be either CSRs or ATM-attached routers/hosts which understand a Bypass-pipe control protocol.
(We call these "Bypass-capable nodes".) On the other hand, intermediate nodes of the Bypass-pipe should be CSRs, since they need to have cell switching capabilities as well as to understand the Bypass-pipe control protocol.

The route for a Bypass-pipe is determined when it is set up, based on the IP routing table in each CSR. In figure 1, IP datagrams from source host or router X.1 to destination host or router Z.1 are transferred over the route X.1 -> CSR1 -> CSR2 -> Z.1 regardless of whether the communication is on a hop-by-hop basis or a Bypass-pipe basis. Routes for the individual Dedicated-VCs which constitute the Bypass-pipe X.1 --> Z.1 (X.1 -> CSR1, CSR1 -> CSR2, CSR2 -> Z.1) would be determined based on an ATM routing protocol [IISP][PNNI], and would be independent of IP level routing.

An example of an IP datagram transmission mechanism is as follows.

o The host/router X.1 checks an identifier of each IP datagram, which may be "destination IP address (prefix)", "source/destination IP address (prefix) pair", "destination IP address and port ID", "source IP address and Flow label (in IPv6)", and so on. Based on one of those identifiers, it determines over which VC the datagram should be transmitted.

o The CSR checks the VPI/VCI value of each incoming cell. When the mapping from the incoming interface/VPI/VCI to an outgoing interface/VPI/VCI is found in an ATM routing table, the cell is directly forwarded to the specified interface through the ATM switch module. When the mapping is not found in the ATM routing table (or the table shows an IP module as the output interface), the cell is assembled into an IP datagram, which is then forwarded to an appropriate outgoing interface/VPI/VCI.
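
The per-cell decision at a CSR described above can be sketched as follows. This is only an illustration under stated assumptions, not part of the proposed specification: the names CellSwitchRouter, IP_MODULE, add_bypass and handle_cell, and the (interface, VPI, VCI) tuple layout, are hypothetical, and datagram reassembly is reduced to a returned label.

```python
from typing import Dict, Tuple, Union

# (interface, VPI, VCI) identifies one VC as seen by the CSR.
VcId = Tuple[str, int, int]

# Hypothetical marker: the table may name the IP module as the output.
IP_MODULE = "ip-module"


class CellSwitchRouter:
    """Toy model of the per-cell forwarding decision in a CSR."""

    def __init__(self) -> None:
        # ATM routing table: incoming VC -> outgoing VC or IP_MODULE.
        self.atm_table: Dict[VcId, Union[VcId, str]] = {}

    def add_bypass(self, incoming: VcId, outgoing: VcId) -> None:
        """Concatenate an incoming Dedicated-VC with an outgoing one."""
        self.atm_table[incoming] = outgoing

    def handle_cell(self, incoming: VcId) -> Tuple[str, Union[VcId, str]]:
        """Return ("switch", outgoing VC) or ("reassemble", IP_MODULE)."""
        out = self.atm_table.get(incoming, IP_MODULE)
        if out == IP_MODULE:
            # No mapping (e.g. a Default-VC): assemble the cells into an IP
            # datagram and let conventional IP forwarding pick the output.
            return ("reassemble", IP_MODULE)
        # Mapping found: forward at cell level, no IP header processing.
        return ("switch", out)


csr1 = CellSwitchRouter()
csr1.add_bypass(("atm0", 0, 100), ("atm1", 0, 200))
print(csr1.handle_cell(("atm0", 0, 100)))  # Dedicated-VC: switched
print(csr1.handle_cell(("atm0", 0, 32)))   # unknown VC: handed to IP
```

A single table lookup decides between the two paths, which is why cells on a Dedicated-VC never incur datagram assembly or IP header processing at the CSR.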

        IP subnet X                IP subnet Y               IP subnet Z
  <--------------------->     <----------------->     <--------------------->

+-------+   Default   +-------+    Default    +-------+   Default   +-------+
|       |     -VC     | CSR 1 |      -VC      | CSR 2 |     -VC     |       |
| Host  +=============+       +===============+       +=============+ Host  |
|  X.1  +-------------+++++++++---------------+++++++++-------------+  Z.1  |
|       +-------------+++++++++---------------+++++++++-------------+       |
|       +-------------+++++++++---------------+++++++++-------------+       |
|       |  Dedicated  |       |   Dedicated   |       |  Dedicated  |       |
+-------+    -VCs     +-------+     -VCs      +-------+    -VCs     +-------+
            <--------------------------------------------------->
                                 Bypass-pipe

Figure 1 Internetworking Architecture based on CSR

3.2 Features

The main feature of the proposed CSR-based internetworking architecture is the same as that of the NHRP-based architecture, in the sense that they both provide direct ATM level connectivity beyond IP subnet boundaries. There are, however, several remarkable differences between the CSR-based architecture and the NHRP-based architecture, as follows.

1) Relationship between IP routing and ATM routing

In the NHRP model, an egress point of the ATM network is first determined in the next hop resolution phase based on IP level routing information. Then the actual route for an ATM-VC to the obtained egress point is determined in the ATM connection setup phase based on ATM level routing information. Both kinds of routing information would be calculated according to factors such as network topology and available bandwidth for the large ATM cloud. The ATM routing will be based on IISP [IISP] or PNNI phase 1 [PNNI], while the IP routing will be based on conventional protocols such as OSPF/BGP. We need to manage two different routing protocols over the large ATM cloud until Integrated-PNNI [IPNNI], which takes both ATM level metrics and IP level metrics into account, is phased in.

In the CSR model, IP level routing determines an egress point of the ATM cloud and also determines the inter-subnet level path to that point, which shows which CSRs it should pass through. ATM level routing determines the intra-subnet level path for ATM-VCs (both Dedicated-VCs and Default-VCs) only between adjacent nodes (CSRs or ATM-attached hosts/routers). Since the roles of routing are hierarchically subdivided into the IP level (router level) and the ATM level (ATM SW level), ATM routing does not have to manage the whole ATM cloud but only individual IP subnets, independently of each other. This will decrease the amount of information that the ATM routing protocol has to handle.

2) Dynamic routing and redundancy support

A CSR-based network can dynamically change routes for Bypass-pipes when the related IP level routing information changes. Ingress points of these Bypass-pipes (ATM-attached sender hosts or routers) do not have to be aware of such dynamic route changes, since the CSRs affected by the IP routing changes can follow them and change the routes of the related Bypass-pipes by themselves.

The same applies when an error or outage happens in any ATM node, link or router on the route of a Bypass-pipe. CSRs that have noticed such an error or outage would change the routes of the related Bypass-pipes by themselves.

3) Support of hard-state and soft-state control

Although direct ATM-VCs in the NHRP model are controlled by ATM signaling, which is a hard-state protocol, Bypass-pipes provided by a CSR-based network are controlled by dedicated protocols which can be both hard-state and soft-state.

One motivation for supporting soft-state in Bypass-pipe control is interworking with the RSVP protocol. A soft-state protocol will also be much better suited to the realization of the dynamic routing described in 2).

4) Support of sender-initiated and receiver-initiated control

Although setup of direct ATM-VCs in the NHRP model is sender-initiated only, setup of Bypass-pipes in the CSR model can be either sender-initiated or receiver-initiated.
Detailed procedures for the receiver-initiated protocol will be designed to handle RSVP messages.

4. Discussions for Designing Detailed Architecture

Several issues that need further investigation in order to design the detailed architecture are discussed in this section.

4.1 Network Reference Model

To help in understanding the discussions in this section, the following network reference model is assumed.

Source hosts S1 and S2 and destination hosts D1 and D2 are attached to Ethernet, while S3 and D3 are attached to ATM. Routers R1 and R5 are attached to Ethernet only, while R2, R3 and R4 are attached to ATM. The ATM datalinks for subnet #3 and subnet #4 can either be physically separate datalinks or be the same datalink. Bypass-pipes can be set up [S3 or R2]-->R3-->[D3 or R4]. This means that S3, D3, R2, R3 and R4 need to speak the Bypass-pipe control protocol described later, and that R3 needs to be a CSR. We use the term "Bypass-capable nodes" for hosts/routers which can speak the Bypass-pipe control protocol but are not necessarily CSRs.

As shown in this reference model, a Bypass-pipe can be set up from host to host (S3-->R3-->D3), router to host (R2-->R3-->D3), host to router (S3-->R3-->R4), and router to router (R2-->R3-->R4).

  Ether    Ether        ATM          ATM         Ether    Ether
    |        |        +-----+      +-----+        |        |
    |        |        |     |      |     |        |        |
S1--|   S2---|   S3---|     |      |     |---D3   |---D2   |--D1
    |        |        |     |      |     |        |        |
    |---R1---|---R2---|     |--R3--|     |---R4---|---R5---|
    |        |        |     |      |     |        |        |
    |        |        +-----+      +-----+        |        |
 subnet   subnet      subnet       subnet      subnet   subnet
   #1       #2          #3           #4          #5       #6

Figure 2 Network Reference Model

4.2 Ways of Providing Dedicated-VCs

There are roughly three alternatives regarding the way of providing Dedicated-VCs in individual IP subnets as components of a Bypass-pipe.

a) On-demand SVC setup

Dedicated-VCs are set up in individual IP subnets through the ATM signaling procedure each time you want to set up a Bypass-pipe. Each Dedicated-VC is released when the corresponding Bypass-pipe is released.

b) Picking up one from a bunch of (semi-)PVCs

Several Dedicated-VCs are set up beforehand between CSR and CSR, or between a CSR and other ATM-attached nodes (hosts/routers), in each IP subnet. An unused VC is picked up as a Dedicated-VC from these PVCs in each IP subnet when a Bypass-pipe is set up. A sort of "Unused VC list" will be managed by the peer nodes which share these PVCs.

c) Picking up one VCI in PVP/SVP

PVPs or SVPs are set up between CSR and CSR, or between a CSR and other ATM-attached nodes (hosts/routers), in each IP subnet. PVPs would be set up as a kind of router/host initialization procedure, while SVPs would be set up through ATM signaling when the first VC (either Default- or Dedicated-) setup request is initiated by either of the peer nodes. Then, an unused VCI value is picked up as a Dedicated-VC in the PVP/SVP in each IP subnet when a Bypass-pipe is set up. A sort of "Unused VC list" will be managed by the peer nodes which share the PVP/SVP. The SVP can be released through ATM signaling when no VCI value is in the active state. That may require some revision to RFC1577.

The best choice will be a) with regard to efficient network resource usage. However, you may have to go through three steps in this case: ATMARP (in each IP subnet), SVC setup (in each IP subnet), and Bypass-pipe setup. Whether a) is a practical choice or not will depend on whether you can tolerate the longer Bypass-pipe setup time due to the three-step procedure mentioned above, or whether you can send datagrams over Default-VCs in a hop-by-hop manner while waiting for Bypass-pipe setup.

In the case of b) or c), the Bypass-pipe setup time will be improved since the SVC setup step can be skipped. In b), each node (CSR or ATM-attached host/router) should specify some traffic descriptors even for unused VCs, and the ATM datalink should reserve the desired resources (such as VCI value and bandwidth) for them. In addition, the ATM datalink may have to carry out UPC functions for those unused VCs. In c), on the other hand, the traffic descriptors which should be specified by each node for the ATM datalink are not each VC's but the VP's only. Resource reservations for individual VCs will be carried out not as a function of the ATM datalink but of each CSR or ATM-attached host/router, if necessary. The only function which needs to be provided by the ATM datalink is control of the VPs' bandwidth, such as UPC and dynamic bandwidth negotiation, if necessary.

As a result of the above observations, we will implement c) as a preliminary protocol specification, but may add a) in a future version if it is desirable.

4.3 Channels for Bypass-pipe Control Message Transfer

There are several alternatives regarding the protocol for managing (setting up and releasing) a Bypass-pipe. This subsection explains these alternatives and discusses their properties from various viewpoints.

Among the alternatives for providing Dedicated-VCs described in 4.2, "picking up an unused VCI in PVP/SVP" is assumed. When we use "on-demand SVC setup", Dedicated-VC setup in each subnet through ATM signaling (and possibly ATMARP before that) will be added to the following procedures. The alternative described in iii), however, investigates the possibility of managing Bypass-pipes by only extending ATM signaling messages, which would remove the need for additional control messages.

Three alternatives are discussed: the Inband message protocol, the Outband message protocol, and ATM signaling extension.

i) Inband Control Message

When setting up a Bypass-pipe, control messages are transmitted over an unused Dedicated-VC which will eventually be used as a component of the Bypass-pipe.
These messages are handled at each CSR and forwarded over an unused Dedicated-VC along the selected route (based on the IP routing table) for the requested Bypass-pipe.

Unlike the outband message protocol described in ii), each message does not have to indicate the VCI value used as a Dedicated-VC, since the message itself is carried by that VC. That leads to the possibility that existing messages such as STII, RSVP, or NHRP messages themselves can be utilized as Bypass-pipe control messages with no modification. However, there are the following shortcomings:

- Bypass-pipe control messages transmitted after a Bypass-pipe has been set up (e.g., Bypass-pipe release) cannot be identified at intermediate CSRs, since those messages are forwarded at the cell level there.

- Bypass-incapable routers (routers which do not understand the Bypass-pipe control protocol) would not receive and process cells transmitted over an "unknown VCI". That means that Bypass-pipe management messages will be discarded by Bypass-incapable routers.

We can find solutions for the former issue. That is, intermediate CSRs can identify Bypass-pipe control messages by marking cell headers, e.g., with the PTI bits set to indicate an F5 OAM cell. It would be difficult, however, to find solutions for the latter issue.

As a result of these observations, the Inband message protocol will not be adequate for Bypass-pipe control.

ii) Outband Control Message

When both setting up and releasing a Bypass-pipe, control messages are transmitted over VCs which are different from the Dedicated-VCs used as components of the Bypass-pipe. Although each message has to indicate the VCI value used as a Dedicated-VC, which means that STII or RSVP messages received from conventional routers/hosts cannot be utilized as Bypass-pipe control messages as they are, the shortcomings of the inband case do not exist.

Three alternatives are possible regarding how to convey Bypass-pipe control messages hop-by-hop over ATM datalink networks.

1) Define a VC for Bypass-pipe control messages only.

2) Use the Default-VC and discriminate Bypass-pipe control messages from user datagrams by an LLC/SNAP value in RFC1483 encapsulation.

3) Use the Default-VC and discriminate Bypass-pipe control messages from user datagrams by a protocol field value in the IP header.

When we take into account interoperability with Bypass-incapable routers, 1) will not be a good choice. Whether we select 2) or 3) depends on whether we should consider multiprotocol operation rather than IP only. We select 3) as a preliminary protocol specification.

iii) Use of ATM Signaling Message

Supposing that ATM signaling messages can convey the IP addresses (and possibly port IDs) of source and destination, it may be possible to use ATM signaling messages as Bypass-pipe control messages as well. In that case, an ATM Call Setup message indicates the setup of a Dedicated-VC to the ATM address of a desired next-hop IP node, and also indicates the setup of a Bypass-pipe to the IP address (and possibly port ID) of a target destination node. Information elements for the Dedicated-VC setup (ATM address of a next-hop node, bandwidth, QOS, etc.) are handled at ATM nodes, while information elements for the Bypass-pipe setup (source and destination IP addresses, and possibly their port IDs, etc.) are transparently transferred to the next-hop IP node. The next-hop IP node accepts the Dedicated-VC setup and handles these IP level information elements. Then it transmits an ATM signaling message to the ATM network in order to forward the Bypass-pipe setup request to the next-hop IP node as well as to request Dedicated-VC setup to that node.

Examples of the Bypass-pipe setup procedure are shown in figure 3.

            ATM                   ATM
        +----------+          +----------+
        |          |          |          |
   S3---|  ATM-SW  |          |  ATM-SW  |---D3
        |    |X|   |          |    |X|   |
---R2---|          |----R3----|          |---R4---
        +----------+          +----------+
           subnet                subnet
             #3                    #4

       Setup     Setup       Setup     Setup
      (S3-R3)   (S3-R3)     (R3-D3)   (R3-D3)
(Trg)|------>|-|------>|---|------>|-|------>|
     |<------|-|-------|---|<------|-|-------|
        Conn      Conn        Conn      Conn

       Setup     Setup       Setup     Setup
      (S3-R3)   (S3-R3)     (R3-D3)   (R3-D3)
(Trg)|------>|-|------>|-+-|------>|-|------>|
     |<------|-|-------|-+ |<------|-|-------|
        Conn      Conn        Conn      Conn

(Trg) : Trigger event for Bypass-pipe setup

Figure 3 Use of ATM Signaling Messages

The difference between the two examples is whether a CONNECT message means confirmation of Bypass-pipe setup or confirmation of Dedicated-VC setup.

The problems of this method are:

- Information elements which specify IP level (and port level) information need to be defined, e.g., B-HLI or B-UUI, as an ATM signaling standard.

- It would be difficult to support soft-state operation, since ATM signaling is naturally a hard-state protocol.

As a result of the above observations, we will select ii) as a preliminary protocol specification, while the possibility of iii) is left for further study.

4.4 Triggers for Bypass-pipe Setup/Release

This subsection discusses several possible events which trigger a node to set up or release a Bypass-pipe.

4.4.1 Triggers for Bypass-pipe Setup

As of now, the following two cases should be taken into account as triggers for Bypass-pipe setup.

1) Request by the IP layer or upper layers of an end host

2) Decision by a Bypass-capable router based on the measurement of IP datagrams transmitted

Case 1) is further classified according to whether the request is initiated by a sender-host or a receiver-host, and whether it is initiated by a Bypass-capable host or a Bypass-incapable host. Therefore we should take the following four cases into account.
(See figure 2.)

1-1) Request by Bypass-capable sender-host (S3)

1-2) Request by Bypass-incapable sender-host (S1, S2)

1-3) Request by Bypass-capable receiver-host (D3)

1-4) Request by Bypass-incapable receiver-host (D1, D2)

For simplicity, we treat a Bypass-(in)capable node as equivalent to an ATM-(non)attached node in figure 2, but there may be cases in which an ATM-attached node does not support the Bypass-pipe control protocol.

In the case of 1-1) or 1-3), the sender-host (S3) or receiver-host (D3) itself executes the Bypass-pipe setup protocol. The IP layer or upper layers in S3 or D3 request Bypass-pipe setup from their own Bypass management entity directly, or via another QOS/Resource management entity such as STII (in the case of a sender-host) or RSVP (in the case of a receiver-host). The resulting Bypass-pipe will finally be S3-->R3-->D3 in this case.

In the case of 1-2) or 1-4), neither the sender-host (S1, S2) nor the receiver-host (D1, D2) has a Bypass management entity. The IP layer or upper layers in S1/S2 or D1/D2 request resource reservation from their own QOS/Resource management entity such as STII or RSVP. The host then transmits IP level resource reservation messages over whatever datalink network is in use (Ethernet in figure 2). A Bypass-capable router R2 or R4 which has received those messages either translates those IP level resource reservation messages into Bypass-pipe control messages or encapsulates those messages in Bypass-pipe control messages. In either case, the Bypass-pipe management entity in R2 or R4 initiates Bypass-pipe setup procedures triggered by IP level resource management messages from S1/S2 or D1/D2. The resulting Bypass-pipe will finally be R2-->R3-->R4 in this case.

In the case of 2), a Bypass-capable router R2 initiates Bypass-pipe setup procedures by its own decision, based on the measurement of IP datagrams transmitted toward a certain destination host or network.
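
One way a Bypass-capable router could make the decision in case 2) is sketched below. The draft does not define how traffic is measured, so everything here is an assumption made for illustration: the class name TrafficTrigger, the per-destination byte counter, the fixed threshold and the measurement interval are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class TrafficTrigger:
    """Hypothetical per-destination traffic meter for case 2)."""

    def __init__(self, threshold_bytes: int) -> None:
        # Assumed policy: request a Bypass-pipe once this many bytes have
        # been forwarded toward one destination within an interval.
        self.threshold_bytes = threshold_bytes
        self.counters: Dict[str, int] = defaultdict(int)
        self.requested: Set[str] = set()

    def observe(self, destination: str, length: int) -> bool:
        """Count one forwarded datagram; True means "start setup now"."""
        self.counters[destination] += length
        if destination in self.requested:
            return False  # setup was already triggered for this destination
        if self.counters[destination] >= self.threshold_bytes:
            self.requested.add(destination)
            return True
        return False

    def end_interval(self) -> None:
        """Reset the byte counters at the end of a measurement interval."""
        self.counters.clear()


meter = TrafficTrigger(threshold_bytes=3000)
for dest, size in [("net-a", 1500), ("net-b", 1500), ("net-a", 1500)]:
    if meter.observe(dest, size):
        print("initiate Bypass-pipe setup toward", dest)
```

The trigger fires at most once per destination, so repeated heavy traffic does not cause repeated setup attempts; how the decision is later revoked is outside this sketch.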
Unlike case 1), the purpose of setting up a Bypass-pipe in case 2) is to reduce the IP processing burden at intermediate CSRs rather than to provide end hosts or applications with the desired bandwidth/QOS. For example, when R2 detects a large amount of datagrams bound for IP subnet #6, it may initiate Bypass-pipe setup with the destination set to subnet #6. The resulting Bypass-pipe will be R2-->R3-->R4 in this case. Whether use of this Bypass-pipe is limited to communication destined for subnet #6 or is open to communication destined for other networks (e.g., subnet #5) depends on whether information about the Bypass-pipe is advertised as routing information, and requires further study.

4.4.2 Triggers for Bypass-pipe Release

The same triggers as for Bypass-pipe setup, 1) and 2), apply to Bypass-pipe release. However, 1) varies depending on whether the IP level resource reservation protocol running in the Bypass-incapable hosts is hard-state (STII) or soft-state (RSVP). Bypass-pipe release can be initiated at R2 or R4 by the reception of an explicit IP level resource release message in both cases, and, in the soft-state case, by detecting that no resource keep message has arrived for a predetermined timeout period. Detailed discussion will be given later.

It should be noted that the triggers for setup/release of a Bypass-pipe are not limited to the examples given in this subsection, although we assume them in specifying the version 1 protocol. The actual Bypass-pipe control protocol, therefore, should not depend on what the trigger is.

4.5 Bypass-pipe Control Protocol

It is desirable that a Bypass-pipe can be set up in response to at least the triggers described in 4.4. That is, a Bypass-capable node (host or router) should initiate Bypass-pipe setup when:

- It (router) has received IP level resource reservation messages from its upstream (e.g., STII) or downstream (e.g., RSVP) node.
  [1-2) or 1-4) in 4.4]

- It (host) has received IP level resource reservation primitives from its own IP level resource reservation entity, such as STII or RSVP. [1-1) or 1-3) in 4.4]

- It (router) has decided, based on the result of IP level traffic measurement, to set up a Bypass-pipe for communication to a certain host or network. [2) in 4.4]

Taking these into account, the protocol should be designed to support

- both hard-state (STII) and soft-state (RSVP) control
- both sender-initiated (STII) and receiver-initiated (RSVP) control

Since the currently available IP level protocols are sender-initiated hard-state (STII) and receiver-initiated soft-state (RSVP), we examine brief examples of those two combinations as a preliminary protocol specification.

4.5.1 Sender-Initiated Hard-State Control

Examples of the sender-initiated hard-state control protocol are shown in figure 4. A Bypass-capable node (R2 or S3) that has been triggered by an upstream node (S1 or S2) or by an internal entity (S3) transmits a Bypass-pipe setup message (BP-Setup) toward the target destination node, which may or may not be Bypass-capable. BP-Setup includes at least the target host or network address, the identifier of the Bypass-pipe being set up, the identifier of the Dedicated-VC being used as a component of the Bypass-pipe, and the desired bandwidth.

A node that has received BP-Setup from the previous node obtains the next-hop node from the IP routing table, decides whether the requested bandwidth is available toward that node, and, when it is, picks an available Dedicated-VC to that node. It then transmits BP-Setup to the next-hop node over a Default-VC. This procedure is repeated until either the BP-Setup reaches the target node or some intermediate node recognizes that it cannot extend the Bypass-pipe any further (e.g., because of a policy restriction or a bandwidth shortage).
That node (the target node, or the intermediate node at which the Bypass-pipe ends) sends a Bypass-pipe setup ack message (BP-SetupAck) over the reverse route toward the requesting node. The Bypass-pipe setup procedure completes when the BP-SetupAck has returned to the requesting node.

                ATM                ATM
            +---------+        +---------+
            |         |        |         |
       S3---|         |        |         |---D3
            |         |        |         |
    ---R2---|         |---R3---|         |---R4---
            |         |        |         |
            +---------+        +---------+
              subnet             subnet
                #3                 #4

[Setup]     BP-Setup           BP-Setup
   (Trg)|-------------->|--|-------------->|
        |<--------------|--|<--------------|
           BP-SetupAck        BP-SetupAck

[Release]  BP-Release         BP-Release
   (Trg)|-------------->|  |-------------->|
        |<--------------|  |<--------------|
            BP-RelAck          BP-RelAck

   (Trg): Trigger event for Bypass-pipe setup

          Figure 4  Sender-Initiated Hard-State Control

With regard to Bypass-pipe release, a BP-Release message is issued by any node on the route of the Bypass-pipe, either on its own trigger or on an indirect trigger (e.g., the reception of an STII Disconnect from a Bypass-incapable host that uses the Bypass-pipe).

4.5.2 Receiver-Initiated Soft-State Control

Examples of the receiver-initiated soft-state control protocol are shown in figure 5. In soft-state operation, the sender node periodically transmits Path messages (BP-Path) over the Default-VC along the route specified by the IP routing table. These BP-Path messages are necessary so that the resource reservation messages (BP-Resv) transmitted by the receiver node can be routed in the reverse direction toward the sender node.

A Bypass-capable node that has been triggered by a downstream node or by an internal entity transmits a resource reservation message (BP-Resv) toward the sender node along the reverse route given by the BP-Path. The contents of the BP-Resv message will be the same as those of the RSVP Resv message, with the addition of Bypass-pipe specific information such as the identifier of the Dedicated-VC being used as a component of the Bypass-pipe.
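
The role of BP-Path in steering BP-Resv along the reverse route can be illustrated with a small Python sketch. It assumes, purely for illustration, that each node records the previous hop per flow when a BP-Path passes through it; the names Node, path_state, forward_bp_path, bp_resv_hops, and the flow label "flow-1" are hypothetical and do not come from this memo, which leaves message formats and state layout open.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A Bypass-capable node holding hypothetical per-flow Path state."""
    name: str
    # flow label -> name of the previous-hop (upstream) node
    path_state: dict = field(default_factory=dict)


def forward_bp_path(route, flow):
    """Carry a BP-Path along `route` (sender side first).

    Each node after the first records the node it received the
    BP-Path from as its previous hop for this flow.
    """
    for upstream, node in zip(route, route[1:]):
        node.path_state[flow] = upstream.name


def bp_resv_hops(nodes, receiver, flow):
    """List the nodes a BP-Resv visits, starting at the receiver side.

    The message follows the recorded previous hops until it reaches
    a node that holds no Path state for the flow (the sender side).
    """
    hops = [receiver]
    while flow in nodes[hops[-1]].path_state:
        hops.append(nodes[hops[-1]].path_state[flow])
    return hops


r2, r3, r4 = Node("R2"), Node("R3"), Node("R4")
nodes = {n.name: n for n in (r2, r3, r4)}

forward_bp_path([r2, r3, r4], flow="flow-1")    # BP-Path: R2 -> R3 -> R4
resv_hops = bp_resv_hops(nodes, "R4", "flow-1")  # BP-Resv: R4 -> R3 -> R2
```

Because the reverse route is derived only from state left behind by BP-Path, a BP-Resv cannot be routed toward the sender until at least one BP-Path has traversed the nodes.
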
A node that has received BP-Resv from the downstream node obtains the previous-hop node from the Path state established by the BP-Path messages, decides whether the requested bandwidth to the previous node is available, and picks an available Dedicated-VC to that node. It then transmits the BP-Resv to that node over a Default-VC. This procedure is repeated all the way to the sender node. If the BP-Resv message cannot be forwarded at an intermediate node, that node either becomes the ingress point of the Bypass-pipe or sends back an error message.

After a Bypass-pipe has been set up, the sender node continues to send BP-Path messages and the receiver node continues to send BP-Resv messages periodically. When a node along the Bypass-pipe route has not received a BP-Resv message for a predetermined timeout period, it decides that the Bypass-pipe can be released. In addition to this timeout-based release, the Bypass-pipe can be released by an explicit Teardown message.

When an IP routing change affects nodes related to a Bypass-pipe, BP-Path messages are automatically transferred along the new route. BP-Resv messages from the receiver node are then sent back along the new route, which leads to removal of the Bypass-pipe from the old route and creation of a Bypass-pipe over the new route.

                ATM                ATM
            +---------+        +---------+
            |         |        |         |
       S3---|         |        |         |---D3
            |         |        |         |
    ---R2---|         |---R3---|         |---R4---
            |         |        |         |
            +---------+        +---------+
              subnet             subnet
                #3                 #4

[Setup]      BP-Path            BP-Path
        |-------------->|--|-------------->|
        |<--------------|--|<--------------|(Trg)
        |    BP-Resv    |  |    BP-Resv    |
        |               |  |               |
       %%% Communications over Bypass-pipe %%%
[Keep]  |               |  |               |
        |    BP-Path    |  |    BP-Path    |
        |-------------->|--|-------------->|
        |<--------------|--|<--------------|
        |    BP-Resv    |  |    BP-Resv    |

   (Trg): Trigger event for Bypass-pipe setup

          Figure 5  Receiver-Initiated Soft-State Control

5. Security Considerations

Security issues are not discussed in this memo.

6. Open Issues

o Detailed protocol sequences and proposed message formats. The protocol should be interoperable with conventional IP level resource management protocols such as STII and RSVP.

o Mapping of IP level filter specs to ATM cell level control mechanisms, especially in multicast cases.

o Do we support every combination of hard-state/soft-state and sender-initiated/receiver-initiated control?

o Support of the source route option on the CSR-based network.

7. References

[ATM3.1] The ATM Forum, "ATM User-Network Interface Specification, v.3.1", Sept. 1994.

[ESAKI94] H. Esaki, et al., "Connection Oriented and Connectionless IP Forwarding over ATM Networks", IETF Internet Draft (work in progress), draft-esaki-co-cl-ip-forw-atm-01.txt, Oct. 1994.

[IISP] The ATM Forum, "Interim Inter-switch Signaling Protocol (IISP) Protocol Specification, Version 1.0", Dec. 1994.

[IPNNI] R. Callon, "Integrated PNNI for Multi-Protocol Routing", The ATM Forum Contribution No. 94-0789, Sept. 1994.

[KAT94] D. Katz and D. Piscitello, "NBMA Next Hop Resolution Protocol (NHRP)", IETF Internet Draft (work in progress), draft-ietf-rolc-nhrp-03.txt, Nov. 1994.

[OHTA94] M. Ohta, et al., "Conventional IP over ATM", IETF Internet Draft (work in progress), draft-ohta-ip--over-atm-01.txt, July 1994.

[PNNI] The ATM Forum, "P-NNI Draft Specification R5", Jan. 1995.

[REK95] Y. Rekhter and D. Kandlur, "IP Architecture Extensions over ATM", IETF Internet Draft (work in progress), draft-rekhter-ip-atm-architecture.txt, Jan. 1995.

[RFC1483] J. Heinanen, "Multiprotocol Encapsulation over ATM Adaptation Layer 5", IETF RFC 1483, July 1993.

[RFC1577] M. Laubach, "Classical IP and ARP over ATM", IETF RFC 1577, Oct. 1993.

[RSVP] L. Zhang, et al., "Resource ReSerVation Protocol (RSVP), Version 1 Functional Specification", IETF Internet Draft (work in progress), draft-ietf-rsvp-spec-04.ps, Nov. 1994.

[STII] L. Delgrossi and L. Berger, "Internet STream Protocol Version 2 (STII)", Internet Draft (work in progress), draft-ietf-st2-spec-02.ps, Feb. 1995.

8. Authors' Address

Yasuhiro Katsube
R&D Center, Toshiba
1 Komukai Toshiba-cho, Saiwai-ku,
Kawasaki 210 Japan
Phone : +81-44-549-2238
Email : katsube@isl.rdc.toshiba.co.jp

Ken-ichi Nagami
R&D Center, Toshiba
1 Komukai Toshiba-cho, Saiwai-ku,
Kawasaki 210 Japan
Phone : +81-44-549-2238
Email : nagami@isl.rdc.toshiba.co.jp

Hiroshi Esaki
R&D Center, Toshiba
801 Schapiro Research Building,
c/o CTR, Columbia Univ.
530 West, 120th St., New York, NY 10027
Phone : 212-854-2365
Email : hiroshi@ctr.columbia.edu