Internet Draft
Draft		                                      Fred Baker
	                                              Carol Iturralde   
                                                      Francois	Le Faucheur
                                                      Bruce Davie          
	                                              Cisco Systems


	           RSVP	    Reservation Aggregation	  September 1999


		Aggregation of RSVP for	IPv4 and IPv6 Reservations
			draft-ietf-issll-rsvp-aggr-00.txt




	  This document	is an Internet-Draft and is in full conformance
	  with all provisions of Section 10 of RFC 2026. Internet Drafts
	  are working documents	of the Internet	Engineering Task Force
	  (IETF), its Areas, and its Working Groups.  Note that	other
	  groups may also distribute working documents as Internet
	  Drafts.

	  Internet Drafts are valid for	a maximum of six months	and may
	  be updated, replaced,	or obsoleted by	other documents	at any
	  time.	It is inappropriate to use Internet Drafts as reference
	  material or to cite them other than as a "work in progress".
	  Comments should be made to the authors and the rsvp@isi.edu
	  list.

          The list of current Internet-Drafts can be accessed at
          http://www.ietf.org/ietf/1id-abstracts.txt

          The list of Internet-Draft Shadow Directories can be accessed at
          http://www.ietf.org/shadow.html.

	  Copyright (C)	The Internet Society (1999).  All Rights Reserved

	  Abstract

	  A key	problem	in the design of RSVP version 1	is, as noted in
	  its applicability statement, that it lacks facilities	for
	  aggregation of individual reserved sessions into a common
	  class. The use of such aggregation is	required for
	  scalability.

	  This document	describes the use of a single RSVP reservation
	  to aggregate other RSVP reservations across a	transit	routing
	  region, in a manner conceptually similar to the use of Virtual
	  Paths	in an ATM network. It proposes a way to	dynamically
	  create the aggregate reservation, classify the traffic for
	  which	the aggregate reservation applies, determine how much
	  bandwidth is needed to achieve the requirement, and recover
	  the bandwidth	when the sub-reservations are no longer
	  required. It also contains recommendations concerning
	  algorithms and policies for predictive reservations.












	  Baker	et al.	      Expiration: March	2000		[Page 1]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  1.  Introduction

	  A key	problem	in the design of RSVP version 1	[RSVP] is, as
	  noted	in its applicability statement,	that it	lacks facilities
	  for aggregation of individual	reserved sessions into a common
	  class. The use of such aggregation is	recommended in [CSZ],
	  and required for scalability.

	  The problem of aggregation may be addressed in a variety of
	  ways.	For example, it	may sometimes be sufficient simply to
	  mark reserved	traffic	with a suitable	DSCP (e.g. EF),	thus
	  enabling aggregation of scheduling and classification	state.
	  It may also be desirable to install one or more aggregate
	  reservations from ingress to egress of an "aggregation region"
	  (defined below) where	each aggregate reservation carries
	  similarly marked packets from	a large	number of flows. This is
	  to provide high levels of assurance that the end-to-end
	  requirements of reserved flows will be met, while at the same
	  time enabling	reservation state to be	aggregated.

	  Throughout, we will talk about "Aggregator" and
	  "Deaggregator", referring to the routers at the ingress and
	  egress edges of an aggregation region. Exactly how a router
	  determines whether it	should perform the role	of aggregator or
	  deaggregator is described below.

	  We will refer	to the individual reserved sessions (the
	  sessions we are attempting to	aggregate) as "end-to-end"
	  reservations ("E2E" for short), and to their respective
	  Path/Resv messages as	E2E Path/Resv messages.	We refer to the
	  the larger reservation (that which represents	many E2E
	  reservations)	as an "aggregate" reservation, and its
	  respective Path/Resv messages	as "aggregate Path/Resv
	  messages".



	  1.1.	Problem	Statement: Aggregation Of E2E Reservations

	  The problem of many small reservations has been extensively
	  discussed, and may be	summarized in the observation that each
	  reservation requires a non-trivial amount of message exchange,
	  computation, and memory resources in each router along the
	  way. It would	be nice	to reduce this to a more manageable
	  level	where the load is heaviest and aggregation is possible.

	  Aggregation, however,	brings its own challenges. In





	  Baker	et al.	      Expiration: March	2000		[Page 2]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  particular, it reduces the level of isolation	between
	  individual flows, implying that one flow may suffer delay from
	  the bursts of	another. Synchronization of bursts from
	  different flows may occur. However, there is evidence	[CSZ] to
	  suggest that aggregation of flows has	no negative effect on
	  the mean delay of the	flows, and actually leads to a reduction
	  of delay in the "tail" of the	delay distribution (e.g. 99%
	  percentile delay) for	the flows. These benefits of aggregation
	  to some extent offset	the loss of strict isolation.


	  1.2.	Proposed Solution

	  The solution we propose involves the aggregation of several
	  E2E reservations that	cross an "aggregation region" and share
	  common ingress and egress routers into one larger reservation
	  from ingress to egress. We define an "aggregation region" as a
	  contiguous set of systems capable of performing RSVP
	  aggregation (as defined following) along any possible	route
	  through this contiguous set.

	  Communication	interfaces fall	into two categories with respect
	  to an	aggregation region; they are "exterior"	to an
	  aggregation region, or they are "interior" to	it. Routers that
	  have at least	one interface in the region fall into one of
	  three	categories with	respect	to a given RSVP	session; they
	  aggregate, they deaggregate, or they are between an aggregator
	  and a	deaggregator.

	  Aggregation depends on being able to hide E2E	RSVP messages
	  from RSVP-capable routers inside the aggregation region. To
	  achieve this end, the	IP Protocol Number in the E2E
	  reservation's	Path, PathTear,	and ResvConf messages is changed
	  from RSVP (46) to RSVP-E2E-IGNORE (a new value, to be
	  assigned) upon entering the aggregation region, and restored
	  to RSVP at the deaggregator point. These messages are	ignored
	  (no state is stored and the message is forwarded as a	normal
	  IP datagram) by each router within the aggregation region
	  whenever they	are forwarded to an interior interface.	Since
	  the deaggregating router perceives the previous RSVP hop on
	  such messages	to be the aggregating router, Resv and other
	  messages do not require this modification; they are unicast
	  from RSVP hop	to RSVP	hop anyway.

	  The token buckets (SENDER_TSPECs and FLOWSPECS) of E2E
	  reservations are summed into the corresponding information
	  elements in aggregate	Path and Resv messages.	Aggregate Path





	  Baker	et al.	      Expiration: March	2000		[Page 3]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  messages are sent from the aggregator	to the deaggregator(s)
	  using	RSVP's normal IP Protocol Number. Aggregate Resv
	  messages are sent back from the deaggregator to the
	  aggregator, thus establishing	an aggregate reservation on
	  behalf of the	set of E2E flows that use this aggregator and
	  deaggregator.	There may be several such aggregate reservations
	  between the same two routers,	representing different classes
	  of traffic; the aggregate reservation	is therefore for the
	  traffic marked with a	particular DSCP.


	  1.3.	Definitions

	  We define an "aggregation region" as a set of	RSVP-capable
	  routers for which E2E	RSVP messages arriving on an exterior
	  interface of one router in the set would traverse one	or more
	  interior interfaces (of this and possibly of other routers in
	  the set) before finally traversing an	exterior interface.

	  Such an E2E RSVP message is said to have crossed the
	  aggregation region.

	  We define the	"aggregating" router for this E2E flow as the
	  first	router that processes the E2E Path message as it enters
	  the aggregation region (i.e.,	the one	which forwards the
	  message from an exterior interface to	an interior interface).

	  We define the	"deaggregating"	router for this	E2E flow as the
	  last router to process the E2E Path as it leaves the
	  aggregation region (i.e., the	one which forwards the message
	  from an interior interface to	an exterior interface).

	  We define an "interior" router for this E2E flow as any router
	  in the aggregation region which receives this	message	on an
	  interior interface and forwards it to	another	interior
	  interface.  Interior routers perform neither aggregation nor
	  deaggregation	for this flow.

	  Note that by these definitions a single router with a	mix of
	  interior and exterior	interfaces may have the	capability to
	  act as an aggregator on some E2E flows, a deaggregator on
	  other	E2E flows, and an interior router on yet other flows.










	  Baker	et al.	      Expiration: March	2000		[Page 4]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  1.4.	Detailed Aspects of Proposed Solution

	  A number of issues jump to mind in considering this model.


	  1.4.1.  Traffic Classification Within	The Aggregation	Region

	  One of the reasons that RSVP Version 1 did not identify a way
	  to aggregate sessions	was that there was not a clear way to
	  classify the aggregate. With the development of the
	  Differentiated Services architecture,	this is	at least
	  partially resolved; traffic of a particular class can	be
	  marked with a	given DSCP and so classified. We presume this
	  model.

	  We presume that on each link en route, a queue, WDM color, or
	  similar management component is set aside for	all aggregated
	  traffic of the same class, and that sufficient bandwidth is
	  made available to carry the traffic that has been assigned to
	  it. This bandwidth may be adjusted based on the total	amount
	  of aggregated	reservation traffic assigned to	the same class.

	  There	are numerous options for exactly which Diff-serv PHBs
	  might	be used	for different classes of traffic as it crosses
	  the aggregation region. This is the "service mapping"	problem
	  described in [ISDS], and is applicable to situations broader
	  than those described in this document.  Arguments can	be made
	  for using either EF or one or	more AF	PHBs for aggregated
	  traffic.

	  Independent of which PHB is used, care needs to be take in an
	  environment where provisioned	Diff-Serv and aggregated RSVP
	  are used in the same network,	to ensure that the total offered
	  load for a single PHB	does not exceed	the link capacity
	  allocated to that PHB. One solution to this is to reserve one
	  of the four AF classes strictly for the aggregated reservation
	  traffic while	using other AF classes for provisioned Diff-
	  Serv.

	  Inside the aggregation region, some RSVP reservation state is
	  maintained per aggregate reservation,	while a	single
	  classification and scheduling	state (e.g., a DSCP used for
	  classifying traffic) is maintined per	aggregate reservation
	  class	(rather	than per aggregate reservation).  For example,
	  if Guaranteed	Service	is represented by the EF DSCP throughout
	  the aggregation region, there	may be a reservation for each
	  aggregator/deaggregator pair in each router, but only	the EF





	  Baker	et al.	      Expiration: March	2000		[Page 5]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  DSCP need be inspected at each interior interface, and only a
	  single queue is used for all EF traffic.


	  1.4.2.  Deaggregator Determination

	  The first question is	"How do	we know	which aggregate
	  reservation a	particular E2E flow should aggregate into?" To
	  know that, we	must know three	things:	its aggregating	router,
	  its deaggregating router, and	(assuming DSCPs	are used to
	  differentiate	among various reservations between the same two
	  routers), the	relevant DSCP.

	  Determination	of the aggregator is trivial: we know that an
	  E2E flow has arrived at an aggregator	when its Path message
	  arrives at a router on an exterior interface and must	be
	  forwarded on an interior interface.

	  Determining the DSCP is equally easy,	or at least it is in
	  concept. The DSCP is chosen for an aggregate reservation based
	  on some policy, which	may take into account such factors as
	  the intserv service class requested for the flow. (Some
	  details in the exact point at	which the DSCP can be determined
	  are discussed	below.)

	  Determination	of the deaggregator is more involved. If an SPF
	  routing protocol, such as OSPF or IS-IS, is in use, and if it
	  has been extended to advertise information on	Deaggregation
	  roles, it can	tell us	the set	of routers from	which the
	  deaggregator will be chosen. In principle, if	the aggregator
	  and deaggregator are in the same area, then the identity of
	  the deaggregator could be determined from the	link state
	  database. However, this approach would not work in multi-area
	  environments or for distance vector protocols.

	  One method for Deaggregator determination is manual
	  configuration. With this method the network operator would
	  configure the	Aggregator and the Deaggregator	with the
	  necessary information.

	  Another method allows	automatic Deaggregator determination and
	  corresponding	Aggregator notification. When the E2E RSVP Path
	  message transits from	an interior interface to an exterior
	  interface, the deaggregating router must advise the
	  aggregating router of	the correlation	between	itself and the
	  flow.	This has the nice attribute of not being specific to the
	  routing protocol. It also has	the property of	automatically





	  Baker	et al.	      Expiration: March	2000		[Page 6]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  adjusting to route changes. For instance, if because of a
	  topology change, another Deaggregator	is now on the shortest
	  path,	this method will automatically identify	the new
	  Deaggregator and swap	to it.


	  1.4.3.  Size of Aggregate Reservations

	  A range of options exist for determining the size of the
	  aggregate reservation, presenting a tradeoff between
	  simplicity and scalability. Simplistically, the size of the
	  aggregate reservation	needs to be greater than or equal to the
	  sum of the bandwidth of the E2E reservations it aggregates,
	  and its burst	capacity must be greater than or equal to the
	  sum of their burst capacities. However, if followed
	  religiously, this leads us to	change the bandwidth of	the
	  aggregate reservation	each time an underlying	E2E reservation
	  changes, which loses one of the key benefits of aggregation,
	  the reduction	of message processing cost in the aggregation
	  region.

	  We assume, therefore,	that there is some policy, not defined
	  in this specification	(although sample policies are suggested
	  which	have the necessary characteristics). This policy
	  maintains the	amount of bandwidth required on	a given
	  aggregate reservation	by taking account of the sum of	the
	  bandwidths of	its underlying E2E reservations, while
	  endeavoring to change	it infrequently. This may require some
	  level	of trend analysis. If there is a significant probability
	  that in the next interval of time the	current	aggregate
	  reservation will be exhausted, the router must predict the
	  necessary bandwidth and request it. If the router has	a
	  significant amount of	bandwidth reserved but has very	little
	  probability of using it, the policy may be to	predict	the
	  amount of bandwidth required and release the excess.

	  This policy is likely	to benefit from	introduction of	some
	  hysteresis (i.e. ensure that the trigger condition for
	  aggregate reservation	size increase is sufficiently different
	  from the trigger condition for aggregate reservation size
	  decrease) to avoid oscillation in stable conditions.

	  Clearly, the definition and operation	of such	policies are as
	  much business	issues as they are technical, and are out of the
	  scope	of this	document.







	  Baker	et al.	      Expiration: March	2000		[Page 7]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  1.4.4.  Intra-domain Routes

	  RSVP directly	handles	route changes, in that reservations
	  follow the routes that their data follow. This follows from
	  the property that Path messages contain the same IP source and
	  destination address as the data flow for which a reservation
	  is to	be established.	However, since we are now making
	  aggregate reservations by sending a Path message from	an
	  aggregating to a deaggregating router, the reserved (E2E) data
	  packets no longer carry the same IP addresses	as the relevant
	  (aggregate) Path message. The	issue becomes one of making sure
	  that data packets for	reserved flows follow the same path as
	  the Path message that	established Path state for the aggregate
	  reservation. Several approaches are viable.

	  First, the data may be tunneled from aggregator to
	  deaggregator,	using technologies such	as IP-in-IP tunnels, GRE
	  tunnels, MPLS	label-switched paths, and so on. These each have
	  particular advantages, especially MPLS, which	allows traffic
	  engineering. They each also have some	cost in	link overhead
	  and configuration complexity.

	  If data is not tunneled, then	we are depending a
	  characteristic of IP best metric routing , which is that if
	  the route from A to Z	includes the path from H to L, and the
	  best metric route was	chosen all along the way, then the best
	  metric route was chosen from H to L. Therefore, an aggregate
	  path message which crosses a given aggregator	and deaggregator
	  will of necessity use	the best path between them.

	  If this is a single path, the	problem	is solved. If it is a
	  multi-path route, and	the paths are of equal cost, then we are
	  forced to determine, perhaps by measurement, what proportion
	  of the traffic for a given E2E reservation is	passing	along
	  each of the paths, and assure	ourselves of sufficient
	  bandwidth for	the present use. A simple, though inelegant, way
	  of doing this	is to reserve the total	capacity of the
	  aggregate route down each path.

	  For this reason, we believe it is advantageous to use	one of
	  the above-mentioned tunneling	mechanisms in cases where
	  multiple equal-cost paths may	exist.










	  Baker	et al.	      Expiration: March	2000		[Page 8]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  1.4.5.  Inter-domain Routes

	  The case of inter-domain routes differs somewhat from	the
	  intra-domain case just described. Specifically, best-path
	  considerations do not	apply, as routing is by	a combination of
	  routing policy and shortest AS path rather than simple best
	  metric.

	  In the case of inter-domain routes, data traffic belonging to
	  different E2E	sessions (but the same aggregate session) may
	  not enter an aggregation region via the same aggregator
	  interface, and/or may	not leave via the same deaggregator
	  interface. It	is possible that we could identify this
	  occurrence in	some central system which sees the reservation
	  information for both of the apparent sessions, but it	is not
	  clear	that we	could determine	a priori how much traffic went
	  one way or the other apart from measurement.

	  We simply note that this problem can occur and needs to be
	  allowed for in the implementation. We	recommend that each such
	  e2e reservation be summed into its appropriate aggregate
	  reservation, even though this	involves over-reservation.


	  1.4.6.  Reservations for Multicast Sessions

	  Aggregating reservations for multicast sessions is
	  significantly	more complex than for unicast sessions.	The
	  first	challenge is to	construct a multicast tree for
	  distribution of the aggregate	Path messages which follows the
	  same path as will be followed	by the data packets for	which
	  the aggregate	reservation is to be made. This	is complicated
	  by the fact that the path taken by a data packet may depend on
	  many factors such as its source address, the choice of shared
	  trees	or source-specific trees, and the location of a
	  rendezvous point for the tree.

	  Once the problem of distributing aggregate Path messages is
	  solved, there	are considerable problems in determining the
	  correct amount of resources to reserve at each link along the
	  multicast tree. Because of the amount	of heterogeneity that
	  may exist in an aggregate multicast reservation, it appears
	  that it would	be necessary to	retain information about
	  individual E2E reservations within the aggregation region to
	  allocate resources correctly.	Thus, we may end up with a
	  complex set of procedures for	forming	aggregate reservations
	  that do not actually reduce the amount of stored state





	  Baker	et al.	      Expiration: March	2000		[Page 9]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  significantly	for multicast sessions.	[BERSON] describes
	  possible ways	to reduce this state by	using measurement-based
	  admission control.

	  As noted above, there	are several aspects to RSVP state, and
	  our approach for unicast aggregates all forms	of state:
	  classification, scheduling, and reservation state. One
	  possible approach to multicast is to focus only on aggregation
	  of classification and	scheduling state, which	are arguably the
	  most important because of their impact on the	fast path. That
	  approach is the one described	in the current draft.


	  1.4.7.  Multi-level Aggregation

	  Ideally, an aggregation scheme should	be able	to accommodate
	  recursive aggregation, with aggregate	reservations being
	  themselves aggregated. Multi-level aggregation can be
	  accomplished using the procedures described here and a simple
	  extension to the protocol number swapping process.

	  We can consider E2E RSVP reservations	to be at aggregation
	  level	0. When	we aggregate these reservations, we produce
	  reservations at aggregation level 1. In general, level n
	  reservations may be aggregated to form reservations at level
	  n+1.

	  When an aggregating router receives an E2E Path, it swaps the
	  protocol number from RSVP to RSVP-E2E-IGNORE.	In addition, it
	  should write the aggregation level (1, in this case) in the 2
	  byte field that is present (and currently unused) in the
	  router alert option. In general, a router which aggregates
	  reservations at level	n to create reservations at level n+1
	  will write the number	n+1 in the router alert	field. A router
	  which	deaggregates level n+1 reservations will examine all
	  messages with	IP protocol number RSVP-E2E-IGNORE but will
	  process the message and swap the protocol number back	to RSVP
	  only in the case where the router alert field	carries	the
	  number n+1. For any other value, the message is forwarded
	  unchanged. Interior routers ignore all messages with IP
	  protocol number RSVP-E2E-IGNORE. Note	that only a few	bits of
	  the 2	byte field in the option would be needed, given	the
	  likely number	of levels of aggregation.









	  Baker	et al.	      Expiration: March	2000	       [Page 10]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  1.4.8.  Reliability Issues

	  There	are a variety of issues	that arise in the context of
	  aggregation that would benefit from some form	of explicit
	  acknowledgment mechanism for RSVP messages. For example,  it
	  is possible to configure a set of routers such that an E2E
	  Path of protocol type	RSVP-E2E-IGNORE	would be effectively
	  "black-holed", if it never reached a router which was
	  appropriately	configured to act as a deaggregator. It	could
	  then travel all the way to its destination where it would
	  probably be ignored due to its non-standard protocol number.
	  This situation is not	easy to	detect.	The aggregator can be
	  sure this problem has	not occurred if	an aggregate PathErr
	  message is received from the deaggregator (as	described in
	  detail below). It can	also be	sure there is no problem if an
	  E2E Resv is received.	However, the fact that neither of these
	  events has happened may only mean that no receiver wishes to
	  reserve resources for	this session, or that an RSVP message
	  loss occurred, or it may mean	that the Path was black-holed.
	  However, if a	neighbor-to-neighbor acknowledgment mechanism
	  existed, the aggregator would	expect to receive an
	  acknowledgment of the	E2E Path from the deaggregator,	and
	  would	interpret the lack of a	response as an indication that a
	  problem of configuration existed. It could then refrain from
	  aggregating this particular session. We note that such a
	  reliability mechanism	has been proposed for RSVP in [REFRESH]
	  and propose that it be used here.

























	  Baker	et al.	      Expiration: March	2000	       [Page 11]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  2.  Elements of Procedure

	  To implement aggregation, we define a	number of elements of
	  procedure.


	  2.1.	Receipt	of E2E Path Message By Aggregating Router

	  The very first event is the arrival of the E2E Path message at
	  an exterior interface	of an aggregator.  Standard RSVP
	  procedures [RSVP] are	followed for this, including onto what
	  set of interfaces the	message	should be forwarded. These
	  interfaces comprise zero or more exterior interfaces and zero
	  or more interior interfaces. (If the number of interior
	  interfaces is	zero, the router is not	acting as an aggregator
	  for this E2E flow.)

	  Service on exterior interfaces is handled as defined in
	  [RSVP].

	  Service on interior interfaces is complicated	by the fact that
	  the message needs to be included in some aggregate
	  reservation, but at this point it is not known which one,
	  because the deaggregator is not known. Therefore, the	E2E Path
	  message is forwarded on the interior interface(s) using the IP
	  Protocol number RSVP-E2E-IGNORE, but in every	other respect
	  identically to the way it would be sent by an	RSVP router that
	  was not performing aggregation.



	  2.2.	Handling Of E2E	Path Message By	Interior Routers

	  At this point, the e2e Path message traverses	zero or	more
	  interior routers.  Interior routers receive the e2e Path
	  message on an	interior interface and forward it on another
	  interior interface. The Router Alert IP Option alerts	interior
	  routers to check internally, but they	find that the IP
	  Protocol is RSVP-E2E-IGNORE and the next hop interface is
	  interior. As such, they simply forward it as a normal	IP
	  datagram.


	  2.3.	Receipt	of E2E Path Message By Deaggregating Router

	  The E2E Path message finally arrives at a deaggregating
	  router, which	receives it on an interior interface and





	  Baker	et al.	      Expiration: March	2000	       [Page 12]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  forwards it on an exterior interface.	Again, the Router Alert
	  IP Option alerts it to intercept the message,	but this time
	  the IP Protocol is RSVP-E2E-IGNORE and the next hop interface
	  is an	exterior interface.

	  At this point, the deaggregating router associates the flow
	  with an aggregate reservation. This selection	is done	on the
	  basis	of policy, and may take	into account not only the
	  aggregating router (whose IP Address may be found in the RSVP
	  Hop Object) but other	information about the flow. If no such
	  aggregate reservation	exists and the router is so configured,
	  it may generate a PathErr with code NEW-AGGREGATE-NEEDED back
	  to the aggregating router. This should not result in any
	  reservation being taken down,	but may	result in the
	  aggregating router initiating	the necessary aggregate	Path
	  message, as described	in the following section.

	  The deaggregating router changes the e2e Path	message's IP
	  Protocol from	RSVP-E2E-IGNORE	to IP Protocol RSVP, updates the
	  ADSPEC of the	e2e Path using information accumulated by the
	  aggregate Path ADSPEC	(if an aggregate Path has been
	  received), and the E2E Path message is forwarded towards its
	  intended destination.	To enable correct updating of the
	  ADSPEC, a deaggregating router may wait for the arrival of an
	  aggregate Path before	forwarding the E2E Path.



	  2.4.	Initiation of New Aggregate Path Message By Aggregating
	  Router

	  The aggregating router is responsible	to take	account	of the
	  SENDER_TSPEC information from	individual E2E Path messages in
	  constructing the SENDER_TSPEC	of the aggregate Path message it
	  sends	to its deaggregating router. The aggregating router may
	  know that an E2E session is associated with a	given
	  deaggregator when one	of two events occurs: it receives a
	  PathErr message with the error code NEW-AGGREGATE-NEEDED from
	  the deaggregator, or it receives an E2E Resv message from the
	  deaggregator.	In the latter case, the	Resv contains a	DCLASS
	  object [DCLASS] indicating which DSCP	the deaggregator
	  believes that	the E2E	flow belongs in. In the	former case, the
	  aggregator must make its own determination of	a suitable DSCP
	  based	on the information in the E2E Path message(s) being
	  aggregated and using locally available policy	information.
	  The identity of the deaggregator itself is found in either the
	  ERROR	SPECIFICATION of the PathErr message or	the RSVP HOP





	  Baker	et al.	      Expiration: March	2000	       [Page 13]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  object of the	E2E Resv.

	  On receipt of	either message,	if no corresponding aggregate
	  path state exists from the aggregator	to the deaggregator for
	  a session with the appropriate DSCP, and the aggregator is
	  configured to	do so, the aggregator should generate an
	  aggregate Path message for the aggregate reservation.	The
	  destination address of the aggregate Path message is the
	  address of the deaggregating router, and the message is sent
	  with IP protocol number RSVP.


	  2.5.	Handling of E2E	Resv Message by	Deaggregating Router

	  Having sent the E2E Path message on toward the destination,
	  the deaggregator must	now expect to receive an E2E Resv for
	  the session. On receipt, its responsibility is to ensure that
	  there	is sufficient bandwidth	reserved within	the aggregation
	  region to support the	new E2E	reservation, and if there is,
	  then to forward the E2E Resv to the aggregating router.

	  If there is insufficient bandwidth reserved, it should follow
	  the normal RSVP procedures [RSVP] for	a reservation being
	  placed with insufficient bandwidth to	support	the reservation.
	  It may also immediately attempt to increase the aggregate
	  reservation that is supplying	bandwidth by increasing	the size
	  of the flowspec that it includes in the aggregate Resv that it
	  sends	upstream. However, this	may not	be sufficient to
	  increase the size of the aggregate reservation, because RSVP
	  routers take the minimum of the Sender TSpec and Receiver
	  TSpec	when installing	a reservation, and thus	the installed
	  aggregate reservation	may be limited by the size of the sender
	  TSpec. The likelihood	of this	situation can be reduced by a
	  sufficiently large choice of TSpec by	the aggregator.

	  When sufficient bandwidth is available, it may simply	send the
	  E2E Resv message with	IP Protocol RSVP to the	aggregating
	  router. This message should, in addition to other data,
	  contain the DCLASS object to indicate	which DSCP the
	  deaggregating	router expects the aggregator to use. The choice
	  of DSCP may be made based on a combination of	information in
	  the received E2E Resv	and local policy. An example policy
	  might	dictate	a certain DSCP for Guaranteed Service and
	  another DSCP for Controlled Load. The	de-aggregator will also
	  add the token	bucket from the	FLOWSPEC object	into its
	  internal understanding of how	much of	that reservation is in
	  use.





	  Baker	et al.	      Expiration: March	2000	       [Page 14]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  2.6.	Initiation of New Aggregate Resv Message By
	  Deaggregating	Router

	  Upon receiving an E2E	Resv message on	an exterior interface,
	  and having determined	the appropriate	DSCP for the session,
	  the deaggregator looks for corresponding path	state for a
	  session with the chosen DSCP.	 If aggregate Path state exists,
	  but no aggregate Resv	state exists, the deaggregator creates
	  an aggregate Resv and	sets its initial request to a value not
	  smaller than the requirement of the E2E reservation it is
	  supporting.

	  If no	aggregate Path state exists for	the appropriate	DSCP,
	  this may be because the aggregator has not yet responded to
	  the arrival of the E2E Resv sent in the preceding step. To
	  avoid	deadlock while waiting for a response, it would	be
	  desirable to use the acknowledgment mechanisms described in
	  [REFRESH].

	  Once the deaggregator	has established	the aggregate Path
	  state, then it sends an aggregate Resv message toward	the
	  aggregator (i.e., to the previous hop), using	the AGGREGATED-
	  RSVP session and filter specifications. Since	the DSCP is in
	  the SESSION object, the DCLASS is unnecessary. The message
	  should be reliably delivered using the mechanisms in [REFRESH]
	  or, alternatively, the CONFIRM object	may be used, to	assure
	  that the aggregate Resv does indeed arrive and is granted.
	  This enables the deaggregator	to determine that the requested
	  bandwidth is available to allocate to	the E2E	flows it
	  supports.


	  2.7.	Handling of Aggregate Resv Message by Interior Routers

	  The aggregate	Resv message is	handled	in essentially the same
	  way as defined in [RSVP]. The	Session	object contains	the
	  address of the deaggregating router (or the group address for
	  the session in the case of multicast)	and the	DSCP that has
	  been chosen for the session. The Filterspec object identifies
	  the aggregating router. These	routers	perform	admission
	  control and resource allocation as usual and send the
	  aggregate Resv on towards the	aggregator.










	  Baker	et al.	      Expiration: March	2000	       [Page 15]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  2.8.	Handling of E2E	Resv Message by	Aggregating Router

	  The E2E Resv message is the final confirmation to the
	  aggregating router that a proportion of a given aggregate's
	  bandwidth has	been reserved. At this point, it should	ensure
	  that the E2E reservation is associated with an appropriate
	  aggregate, that the aggregator and deaggregator expectations
	  synchronize, and that	all things are in place. In particular,
	  it needs to ensure that the DCLASS carried in	the E2E	Resv
	  matches the DSCP for an aggregate session to that
	  deaggregator;	if not,	it needs to create a new aggregate Path
	  for the appropriate DSCP and send it to the deaggregator. It
	  should also ensure that the SENDER_TSPEC from	the E2E	Path
	  message has been accumulated into the	appropriate aggregate
	  Path message.	Under normal circumstances, this is the	only way
	  it will be informed of this association. It should now forward
	  the E2E Resv to its previous hop, following normal RSVP
	  processing rules [RSVP].


	  2.9.	Removal	of E2E Reservation

	  E2E reservations are removed in the usual way	via PathTear,
	  ResvTear, timeout, or	as the result of an error condition.
	  When they are	removed, their FLOWSPEC	information must also be
	  removed from the allocated portion of	the aggregate
	  reservation. This same bandwidth may be re-used for other
	  traffic in the near future. When E2E Path messages are
	  removed, their SENDER_TSPEC information must also be removed
	  from the aggregate Path.


	  2.10.	 Removal of Aggregate Reservation

	  Should an aggregate reservation go away (presumably due to a
	  configuration	change,	route change, or policy	event),	the E2E
	  reservations it supports are no longer active. They must be
	  treated accordingly.


	  2.11.	 Handling of Data On Reserved E2E Flow by Aggregating
	  Router

	  Prior	to establishment that a	given E2E flow is part of a
	  given	aggregate, the flow's data should be treated as	traffic
	  without a reservation	by whatever policies prevail for such.
	  Generally, this will mean being given	the same forwarding





	  Baker	et al.	      Expiration: March	2000	       [Page 16]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  behavior as non-essential traffic. However, upon establishing
	  that the flow	belongs	to a given aggregate, the aggregating
	  router is responsible	to mark	any related traffic with the
	  correct DSCP and forward it in the manner appropriate	to
	  traffic on that reservation. This may	imply forwarding it to a
	  given	IP next	hop, or	piping it down a given link layer
	  circuit, tunnel, or MPLS label switched path.

	  The aggregator is responsible	for performing per-reservation
	  policing on the E2E flows that it is aggregating. The
	  aggregator performs metering of traffic belonging to each
	  reservation to assess	compliance to the token	bucket for the
	  corresponding	E2E reservation. Packets which are assessed in
	  compliance are forwarded as mentioned	above. Packets which are
	  assessed out of compliance must be either dropped or marked to
	  a different DSCP. The	detailed policing behavior is an aspect
	  of the service mapping described in [ISDS].


	  2.12.	 Procedures for	Multicast Sessions

	  Because of the difficulties of aggregating multicast sessions
	  described above, we focus on the aggregation of scheduling and
	  classification state in the multicast	case. The main
	  difference between the multicast and unicast cases is	that
	  rather than sending an aggregate Path	message	to the unicast
	  address of a single deaggregating router, in the multicast
	  case we send the "aggregate" Path message to the same	group
	  address as the E2E session. This ensures that	the aggregate
	  Path message follows the same	route as the E2E Path. This
	  difference between unicast and multicast is reflected	in the
	  Session objects defined below. A consequence of this approach
	  is that we continue to have reservation state	per multicast
	  session inside the aggregation region.

	  A further challenge arises in	multicast sessions with
	  heterogeneous	receivers. Consider an interior	router which
	  must forward packets for a multicast session on two
	  interfaces, but has only received a reservation request on one
	  of those interfaces. It receives packets marked with the DSCP
	  chosen for the aggregate reservation.	When sending them out
	  the interface	which has no installed reservation, it has the
	  following options:

	       a) remark those packets to best effort before sending
	       them out	the interface;






	  Baker	et al.	      Expiration: March	2000	       [Page 17]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	       b) send the packets out the interface with the DSCP
	       chosen for the aggregate	reservation.

	  The first approach suffers from the drawback that it requires
	  MF classification at an interior router in order to recognize
	  the flows whose packets must be demoted. The second approach
	  requires over-reservation of resources on the	interface on
	  which	no reservation was received. In	the absence of such
	  over-reservation, the	packets	sent with the "wrong" DSCP would
	  be able to degrade the service experienced by	packets	using
	  that DSCP legitimately.

	  To make MF classification acceptable in an interior router, it
	  may be possible to treat the case of heterogenous flows as an
	  exception. That is, an interior router only needs to be able
	  to recognize those individual	microflows that	have
	  heterogeneous	resource needs on the outbound interfaces of
	  this router.


	  3.  Protocol Elements

	  3.1.	IP Protocol RSVP-E2E-IGNORE

	  This specification presumes the assignment of	a protocol type
	  RSVP-E2E-IGNORE, whose number	is at this point TBD. This is
	  used only on messages	which require a	router alert (Path,
	  PathErr, and ResvConf), and signifies	that the message must be
	  treated one way when copied to an interior interface,	and
	  another way when copied to an	exterior interface.


	  3.2.	Path Error Code

	  A PathErr code NEW-AGGREGATE-NEEDED is presumed. This	value
	  does not signify that	a fatal	error has occurred, but	that an
	  action is required of	the aggregating	router to avoid	an error
	  condition in the near	future.


	  3.3.	SESSION	Object

	  The SESSION object contains two values: the IP Address of the
	  aggregate session destination, and the DSCP that it will use
	  on the E2E data the reservation contains. For	unicast
	  sessions, the	session	destination address is the address of
	  the deaggregating router. For	multicast sessions, the	session





	  Baker	et al.	      Expiration: March	2000	       [Page 18]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  destination is the multicast address of the E2E session (or
	  sessions) being aggregated.  The inclusion of	the DSCP in the
	  session allows for multiple sessions toward the same address
	  to be	distinguished by their DSCP and	queued separately. It
	  also provides	the means for aggregating scheduling and
	  classification state.	In the case where a session uses a pair
	  of PHBs (e.g.	AF11 and AF12),	the DSCP used should represent
	  the numerically smallest PHB (e.g. AF11). This follows the
	  same naming convention described in [BRIM].

	  Session types	are defined for	IPv4 and IPv6 addresses.

		o    IP4 SESSION object: Class = SESSION,
		     C-Type = RSVP-AGGREGATE-IP4

		     +-------------+-------------+-------------+-------------+
		     |		    IPv4 Session Address (4 bytes)	     |
		     +-------------+-------------+-------------+-------------+
		     | /////////// |	Flags	 |  /////////  |     DSCP    |
		     +-------------+-------------+-------------+-------------+


		o    IP6 SESSION object: Class = SESSION,
		     C-Type = RSVP-AGGREGATE-IP6

		     +-------------+-------------+-------------+-------------+
		     |							     |
		     +							     +
		     |							     |
		     +		    IPv6 Session Address (16 bytes)	     +
		     |							     |
		     +							     +
		     |							     |
		     +-------------+-------------+-------------+-------------+
		     | /////////// |	Flags	 |  /////////  |     DSCP    |
		     +-------------+-------------+-------------+-------------+

	  3.4.	SENDER_TEMPLATE	Object

	  The SENDER_TEMPLATE  object identifies the aggregating router
	  for the aggregate reservation.











	  Baker	et al.	      Expiration: March	2000	       [Page 19]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


		o    IP4 SENDER_TEMPLATE object: Class = SENDER_TEMPLATE,
		     C-Type = RSVP-AGGREGATE-IP4

		     +-------------+-------------+-------------+-------------+
		     |		      IPv4 Aggregator Address (4 bytes)	     |
		     +-------------+-------------+-------------+-------------+



		o    IP6 SENDER_TEMPLATE object: Class = SENDER_TEMPLATE,
		     C-Type = RSVP-AGGREGATE-IP6

		     +-------------+-------------+-------------+-------------+
		     |							     |
		     +							     +
		     |							     |
		     +		 IPv6 Aggregator Address (16 bytes)	     +
		     |							     |
		     +							     +
		     |							     |
		     +-------------+-------------+-------------+-------------+


	  3.5.	FILTER_SPEC Object

	  The FILTER_SPEC object identifies the	aggregating router for
	  the aggregate	reservation, and is syntactically identical to
	  the SENDER_TEMPLATE object.
























	  Baker	et al.	      Expiration: March	2000	       [Page 20]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  4.  Policies and Algorithms For Predictive Management	Of
	  Blocks Of Bandwidth

	  The exact policies used in determining how much bandwidth
	  should be allocated to an aggregate reservation at any given
	  time are beyond the scope of this document, and may be
	  proprietary to the service provider in question. However, here
	  we explore some of the issues	and suggest approaches.

	  In short, the	ideal condition	is that	the aggregate
	  reservation always has enough	resources to allocate to any E2E
	  reservation that requires its	support, and never takes too
	  much.	Simply stated, but more	difficult to achieve. Factors
	  that come into account include significant times in the
	  diurnal cycle: one may find that a large number of people
	  start	placing	calls at 8:00 AM, even though the hour from 7:00
	  to 8:00 is dead calm.	They also include recent history: if
	  more people have been	placing	calls recently than have been
	  finishing them, a prediction of the necessary	bandwidth a few
	  moments hence	may call for more bandwidth than is currently
	  allocated. Likewise, at the end of a busy period, we may find
	  that the trend calls for declining reservation amounts.

	  We recommend a policy	something along	this line. At any given
	  time,	one should expect that the amount of bandwidth required
	  for the aggregate reservation	is the larger of the following:

	  (a)  a requirement known a priori, such as from history of the
	       diurnal cycle at	a particular week day and time of day,
	       and

	  (b)  the trend line over recent history, with	90 or 99%
	       statistical confidence.

	  We further expect that changes to that aggregate reservation
	  would	be made	no more	often than every few minutes, and
	  ideally perhaps on larger granularity	such as	fifteen	minute
	  intervals or hourly. The finer the granularity, the greater
	  the level of signaling required, while the coarser the
	  granularity, the greater the chance for error, and the need to
	  recover from that error.

	  In general, we expect	that the aggregate reservation will not
	  ever add up to exactly the sum of the	reservations it
	  supports, but	rather will be an integer multiple of some block
	  reservation size, which exceeds that value.






	  Baker	et al.	      Expiration: March	2000	       [Page 21]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  5.  Security Considerations

	  Numerous security issues pertain to this document; for
	  example, the loss of an aggregate reservation	to an aggressor
	  causes many calls to operate unreserved, and the reservation
	  of a great excess of bandwidth may result in a denial	of
	  service. However, these issues are not confined to this
	  extension: RSVP itself has them. We believe that the security
	  mechanisms in	RSVP address these issues as well.


	  6.  IANA Considerations

	  Beyond allocating an IP Protocol, a PathErr code, and	an RSVP
	  Addressing object "type", there are no IANA issues in	this
	  document. We do not define an	object that will itself	require
	  assignment by	IANA.


	  7.  Acknowledgments

	  The authors acknowledge that published documents and
	  discussion with several people, notably John Wroclawski, Steve
	  Berson, and Andreas Terzis materially	contributed to this
	  draft. The design derives directly from an internet draft by
	  Roch Guerin [GUERIN] and from	Steve Berson's drafts on the
	  subject. It is also influenced by the	design in the diff-edge
	  draft	by Bernet et al	[BERNET] and by	the RSVP tunnels draft
	  [TERZIS].























	  Baker	et al.	      Expiration: March	2000	       [Page 22]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  8.  References

	  [CSZ]
	       Clark, D., S. Shenker, and L. Zhang, "Supporting	Real-
	       Time Applications in an Integrated Services Packet
	       Network:	Architecture and Mechanism," in	Proc.
	       SIGCOMM'92, September 1992.

	  [IP] RFC 791,	"Internet Protocol". J.	Postel.	Sep-01-1981.

	  [HOSTREQ]
	       RFC 1122, "Requirements for Internet hosts -
	       communication layers".  R.T. Braden. Oct-01-1989.

	  [FRAMEWORK]
	       Nichols,	"Differentiated	Services Operational Model and
	       Definitions", 02/11/1998, draft-nichols-dsopdef-00.txt

	  [PRINCIPLES]
	       RFC 1958, "Architectural	Principles of the Internet". B.
	       Carpenter.  June	1996.

	  [ASSURED]
	       Clark and Wroclawski, "An Approach to Service Allocation
	       in the Internet", 08/04/1997, draft-clark-diff-svc-
	       alloc-00.txt

	  [BROKER]
	       Nichols and Zhang, "A Two-bit Differentiated Services
	       Architecture for	the Internet", 12/23/1997, draft-
	       nichols-diff-svc-arch-01.txt

	  [BERSON]
	       Berson and Vincent. "Aggregation	of Internet Integrated
	       Services	State".	draft-berson-rsvp-aggregation-00.txt,
	       August 1998

	  [BRIM]
	       Brim and	Carpenter. "Per	Hop Behavior Identification
	       Codes". draft-brim-diffserv-phbid-00.txt, April 1999.

	  [ISDS]
	       Bernet et al. "Integrated Services Operation Over
	       Diffserv	Networks". draft-ietf-issll-diffserv-rsvp-
	       03.txt, Sept. 1999.







	  Baker	et al.	      Expiration: March	2000	       [Page 23]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	  [GUERIN]
	       Guerin, R., Blake, S. and Herzog, S.,"Aggregating RSVP
	       based QoS Requests", Internet Draft, draft-guerin-
	       aggreg-rsvp-00.txt, November 1997.

	  [RSVP]
	       Braden, R., Zhang, L., Berson, S., Herzog, S. and Jamin,
	       S., "Resource Reservation Protocol (RSVP) Version 1
	       Functional Specification", RFC 2205, September 1997.

	  [BERNET]
	       Bernet, Y.,  Durham, D.,	and F. Reichmeyer, "Requirements
	       of Diff-serv Boundary Routers", Internet	Draft, draft-
	       bernet-diffedge-01.txt, November, 1998.

	  [REFRESH]
	       Berger, L., Gan,	D., and	G. Swallow, "RSVP Refresh
	       Reduction Extensions", Internet Draft, draft-berger-
	       rsvp-refresh-reduct-02.txt, May 1999.

	  [TERZIS]
	       Terzis, A., Krawczyk, J., Wroclawski, J., and L.	Zhang,
	       "RSVP Operation Over IP Tunnels", Internet Draft, draft-
	       ietf-rsvp-tunnel-04.txt,	May 1999.

	  [DCLASS]
	       Bernet, Y., "Usage and Format of	the DCLASS Object With
	       RSVP Signaling",	Internet Draft,	draft-bernet-dclass-
	       01.txt, June 1999.


	  9.  Authors' Addresses

	       Fred Baker
	       Cisco Systems
	       519 Lado	Drive
	       Santa Barbara, California 93111
	       Phone: (408) 526-4257
	       Email: fred@cisco.com

	       Carol Iturralde
	       Cisco Systems
	       250 Apollo Drive
	       Chelmsford MA,01824 USA
	       Phone: 978-244-8532
	       Email: cei@cisco.com






	  Baker	et al.	      Expiration: March	2000	       [Page 24]




	  Draft		   RSVP	Reservation Aggregation	  September 1999


	       Francois	Le Faucheur
	       Cisco Systems
	       291, rue	Albert Caquot
	       06560 Valbonne, France
	       Phone: +33.1.6918 6266
	       Email: flefauch@cisco.com

	       Bruce Davie
	       Cisco Systems
	       250 Apollo Drive
	       Chelmsford MA,01824 USA
	       Phone: 978-244-8921
	       Email: bdavie@cisco.com


	  10.  Full Copyright Statement


	  Copyright (C)	The Internet Society (1999).  All Rights
	  Reserved.

	  This document	and translations of it may be copied and
	  furnished to others, and derivative works that comment on or
	  otherwise explain it or assist in its	implmentation may be
	  prepared, copied, published and distributed, in whole	or in
	  part,	without	restriction of any kind, provided that the above
	  copyright notice and this paragraph are included on all such
	  copies and derivative	works.	However, this document itself
	  may not be modified in any way, such as by removing the
	  copyright notice or references to the	Internet Society or
	  other	Internet organizations,	except as needed for the purpose
	  of developing	Internet standards in which case the procedures
	  for copyrights defined in the	Internet Standards process must
	  be followed, or as required to translate it into languages
	  other	than English.

	  The limited permissions granted above	are perpetual and will
	  not be revoked by the	Internet Society or its	successors or
	  assigns.

	  This document	and the	information contained herein is	provided
	  on an	"AS IS"	basis and THE INTERNET SOCIETY AND THE INTERNET
	  ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
	  IMPLIED, INCLUDING BUT NOT LIMITED TO	ANY WARRANTY THAT THE
	  USE OF THE INFORMATION HEREIN	WILL NOT INFRINGE ANY RIGHTS OR
	  ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
	  PARTICULAR PURPOSE."





	  Baker	et al.	      Expiration: March	2000	       [Page 25]