
TE Working Group                                 Heinrich Hummel
Internet Draft                                        Siemens AG
Expiration Date: October 2000                         April 2000

                Orchestrally Conducted Traffic (OCT)

                      draft-hummel-te-oct-02.txt


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   This draft proposes to engineer traffic in a way that may be called
   Orchestrally Conducted Traffic (OCT): Any traffic stream (given by
   its source and destination nodes and by a priority class) may use
   several differently routed LSPs.  Each traffic stream ingress
   periodically reports the currently measured traffic load (as a
   bitrate) to a common TE Conductor (TEC), which evaluates these
   reports in order to forecast what traffic load change might occur
   in the immediate time ahead.  Based on this forecast, the TEC
   computes well-synchronized likelihood values that determine which
   route/LSP to take.  These values are chosen such that every traffic
   stream is served as well as possible, as often as possible, while
   equally ranked traffic streams are treated fairly, higher ranked
   streams are prioritized, and the network throughput is maximized.
   The TEC sends the likelihood values to the respective ingress node,
   which distributes the received packets of the traffic stream to the
   pertaining LSPs accordingly.







1. Introduction and summary

   This draft proposes to engineer traffic in a way that may be called
   Orchestrally Conducted Traffic (OCT): Any traffic stream (given by
   its source and destination nodes and by a priority class) may use
   several differently routed Label Switched Paths (LSPs).  Each
   traffic stream ingress periodically reports the currently measured
   traffic load (a bitrate value) to a common Traffic Engineering
   Conductor (TEC), which evaluates these reports in order to forecast
   the immediately upcoming traffic load.  Based on this forecast, the
   TEC computes well-synchronized likelihood values that determine
   which route/LSP to take.  These values are chosen such that every
   traffic stream is served as well as possible, as often as possible,
   while equally ranked traffic streams are treated fairly, higher
   ranked streams are prioritized, and the network throughput is
   maximized.  The TEC sends the likelihood values to the respective
   ingress node, which distributes the received packets of the traffic
   stream to the pertaining LSPs accordingly.


2. OCT concept in detail

 2.1 Goals and Basics

   The OCT concept is based on a periodic information exchange between
   the ingress nodes of all traffic streams and a central Traffic
   Engineering Conductor (TEC).  For now, it is assumed that
   communication channels (e.g. special LSPs) are established to carry
   this information exchange.

   It is also assumed that one or several differently routed LSPs per
   traffic stream are established for carrying the user data.  The OCT
   concept answers the question: "with which likelihood should which
   LSP be taken so that the entire traffic is balanced best onto the
   given network?"
   Best balancing includes many aspects, some of which have already
   been mentioned: Higher ranked traffic streams shall be prioritized
   over lower ranked traffic streams.  Equally ranked traffic streams
   shall be serviced equally fairly.  The throughput should be
   maximized.
   Implicitly, however, "best balancing" has even more properties:
   avoiding congestion as far as possible, and resolving a congestion
   without creating a new congestion at just another location.
   Maximizing the throughput means: Among equally ranked traffic
   streams, nevertheless give preference to that portion of a traffic
   stream that can be serviced by a shorter LSP (using fewer hops).
   In some cases such preferential handling is beneficial to all
   traffic streams (e.g. so that all traffic streams can be completely
   serviced rather than some being more and some less deteriorated).
   In other cases, however, the traffic streams that use shorter LSPs
   are indeed better serviced than the others.  Conclusion: Maximizing
   the throughput is more important than equal treatment.

   Let us partition the whole traffic into traffic streams T(i,e,p),
   each of which is identified by an information triplet {i-ngress
   node, e-gress node, p-riority class}.  In other words, a traffic
   stream is the sum of all IP micro-flows which enter the MPLS
   network domain at the same node (the ingress node), which will
   leave the MPLS network domain at the same node (the egress node),
   and which share the same priority class, based on their DiffServ
   parameters.  For each traffic stream T(i,e,p) there shall be
   n(i,e,p) pre-established label switched paths, i.e. LSP-k,
   k=1,...,n(i,e,p), for transmitting its packets.  LSP-1 shall be the
   most preferable one, LSP-n(i,e,p) the least preferable one.

   An ingress node i knows all traffic streams T(i,e,p) it originates.
   For each of them it knows all respective n(i,e,p) LSPs, including
   all their link sequences, and it continuously measures the current
   total traffic load as well as the individual current traffic load
   share per LSP.  Periodically, the ingress node i reports the total
   traffic load (as a bitrate) as well as the LSP-specific traffic
   load shares to the TEC - with respect to all traffic streams it
   originates.
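
   As an illustration only, the following Python sketch shows one
   possible shape for such a periodic report; the class name and its
   fields are assumptions made for this example and are not defined by
   this draft.

      from dataclasses import dataclass
      from typing import List

      @dataclass
      class TrafficStreamReport:
          """Periodic report from an ingress node to the TEC."""
          ingress: str                # i: ingress node identifier
          egress: str                 # e: egress node identifier
          priority: int               # p: priority class (from DiffServ)
          total_load_bps: float       # measured total traffic load (bit/s)
          lsp_loads_bps: List[float]  # per-LSP load shares; index k-1 -> LSP-k
          timestamp: float            # time of day the report refers to

      # An ingress node would send one such report per traffic stream
      # it originates, once per report-timer interval.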

   Besides these reports, the TEC shall know the network topology,
   including the maximum bandwidth BWj of each network link j.
   Furthermore, it shall know the link sequences of all LSPs for all
   traffic streams in the network.

   The sending of the reports to the TEC by all ingress nodes may be
   slightly jittered, but neglecting the jitter, the TEC shall receive
   all reports with respect to the same time of day.  It is advisable
   that any report also contain either the exact time of day it has
   been sent or the time of day the report is meant for.

 2.2 Forecasting traffic load

   The TEC may apply different mathematical/heuristic methods,
   algorithms, concepts, and theories in order to compute/forecast the
   total traffic load TTL(i,e,p) for each traffic stream T(i,e,p).

   For example, two such methods are shown in the following:

   Method 1:
   Make a linear extrapolation based on the most recent traffic load
   value at time t-1 ("t minus one") in the past and the present
   traffic load value at time t0 ("t zero"), in order to determine the
   future traffic load at time t1 ("t one").  If the extrapolated
   total traffic load TTL(i,e,p)(t1) exceeds the highest total traffic
   load ever reported for traffic stream T(i,e,p), then replace the
   extrapolated value by that highest value.
   If the total traffic load TTL(i,e,p)(t1) turns out negative, then
   replace this value by the value 0 + epsilon.
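
   As a concrete illustration, the following is a minimal Python
   sketch of Method 1 as described above; the function name and its
   arguments are assumptions made for this example.

      def forecast_method_1(load_t_minus_1, load_t0, highest_ever,
                            epsilon=1e-6):
          """Linear extrapolation of the total traffic load to time t1.

          load_t_minus_1 : load reported one interval ago (bit/s)
          load_t0        : load reported now (bit/s)
          highest_ever   : highest total load ever reported for this
                           traffic stream (bit/s)
          """
          # Linear extrapolation: value(t1) = value(t0)
          #                                   + (value(t0) - value(t-1)).
          forecast = 2 * load_t0 - load_t_minus_1
          # Never forecast more than the highest load ever reported.
          if forecast > highest_ever:
              forecast = highest_ever
          # A negative extrapolation is replaced by 0 + epsilon.
          if forecast < 0:
              forecast = epsilon
          return forecast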



   Method 2 (very promising if, upstream of ingress node i*, there is
   a web page of general interest):
   Build the average traffic load AVTL(i*,e,p)(t0) out of all
   currently measured traffic loads TTL(i*,e,p)(t0), and forecast, for
   all these traffic streams, TTL(i*,e,p)(t1) = AVTL(i*,e,p)(t0).
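
   One possible reading of Method 2, again as a hedged Python sketch:
   averaging over all streams that share the ingress node i* and the
   priority class p (one stream per egress node e) is an
   interpretation of the text above, not a normative rule.

      def forecast_method_2(loads_t0_by_egress):
          """Forecast every stream T(i*,e,p) at t1 by the average of
          the currently measured loads of all streams sharing ingress
          i* and priority p (dict: egress node e -> load in bit/s)."""
          average = (sum(loads_t0_by_egress.values())
                     / len(loads_t0_by_egress))
          return {egress: average for egress in loads_t0_by_egress}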


   Certainly there are further methods for forecasting the total
   traffic load for time t1.


   Learning from the success:
   Note that one report-timer interval later the prognosticated
   traffic load can be verified, i.e. it can be seen how well it
   turned out.  We can keep statistics on which method is more often
   successful than the other methods, and we may apply the more
   successful method more often.  Thus we may learn from the success.
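
   One conceivable way to "learn from the success" is to keep a
   running error score per forecasting method and to prefer the method
   with the smallest score; the bookkeeping sketched below is an
   assumption for illustration, not something this draft prescribes.

      class ForecastScoreboard:
          """Tracks the accumulated forecast error of each method."""

          def __init__(self, method_names):
              self.errors = {name: 0.0 for name in method_names}

          def record(self, method_name, forecast, measured):
              # Accumulate the absolute error observed one report-timer
              # interval later.
              self.errors[method_name] += abs(forecast - measured)

          def best_method(self):
              # Prefer the method with the smallest accumulated error.
              return min(self.errors, key=self.errors.get)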


 2.3 Prosaic description of how to compute likelihood values

   Imagine the TEC as a person sitting in front of a casino table.  On
   the table, however, the topology graph of his network is drawn.  On
   each ingress node i the TEC has placed, with respect to traffic
   stream T(i,e,p), a pile of uniformly sized (traffic load) chips:
   the height of the pile amounts to the absolute value of the
   forecasted total traffic load TTL(i,e,p)(t1).  Let us call it the
   total traffic load pile.  (Possibly, at the bottom of any such
   pile, there may be a chip whose size is smaller than the uniformly
   sized chips.)

   On any link j of the network there is also a pile of (bandwidth)
   chips whose height amounts to the maximum bandwidth BWj of that
   link.  Let us call it the link-j bandwidth pile.  (Possibly, at the
   bottom of any link-j bandwidth pile, there may be a chip whose size
   is smaller than the uniformly sized chips.)

   The goal is to distribute the chips of each total traffic load pile
   by piling up respective LSP-specific traffic load piles.  Possibly,
   a remainder of the total traffic load pile cannot be distributed in
   the way which is shown next.  It will reflect the proportion of the
   traffic load which cannot be transmitted.

   The TEC processes one chip after the other, each time (at least in
   general) from a different total traffic load pile:

   The TEC tries to move the chip to the k-th LSP's pile
   (k=1,2,...,n(i,e,p)).  At the beginning, k=1.

   Whenever a traffic load chip is moved to the k-th LSP-specific
   pile, a (same sized) bandwidth chip is removed from each link-j
   bandwidth pile of all those links j which this k-th LSP traverses.
   If this is not feasible because at least one of these links lacks
   such a bandwidth chip, the same is tried with the (k+1)-st LSP.  If
   it is not feasible with any of the n(i,e,p) LSPs, then there will
   be a remainder total traffic load pile whose (normed) height will
   indicate the likelihood by which a packet from traffic stream
   T(i,e,p) shall not be transmitted.

   The sequence in which the traffic load chips of all total traffic
   load piles are handled as described above is very important in
   order to achieve fairness among equally ranked traffic streams, to
   prioritize/discriminate higher/lower ranked traffic streams, and to
   get the best throughput:

   Higher ranked traffic streams will be preferred over lower ranked
   traffic streams by processing ALL their traffic load chips first.

   Fairness among equally ranked traffic streams will be achieved by
   processing their traffic load chips alternatingly.  However, also
   make the following exception: Among equally ranked traffic streams,
   process first those traffic load chips which can be moved to
   "shorter" LSPs, i.e. to LSPs with fewer hops.

   When all the traffic load chips have been processed, the heights of
   the resulting LSP-specific piles and of the remainder pile need to
   be normed, i.e. divided by the sum of all these heights, in order
   to get likelihood values.

   The TEC shall send back to each ingress node i these likelihood
   values with respect to all traffic streams T(i,e,p) which are
   originated at that node i.
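
   To make the chip procedure above more tangible, the following is a
   simplified Python sketch of one possible implementation.  It
   assumes a fixed chip size and simple round-robin alternation within
   a priority class; the "shorter LSP first" exception and equally
   ranked LSP subsets (Section 3.6) are omitted for brevity, and all
   names and the data layout are illustrative assumptions.

      def compute_likelihoods(streams, link_bw, chip_size):
          """streams: list of dicts with keys
               'id'       : the (i, e, p) triplet,
               'priority' : smaller value = higher rank,
               'forecast' : forecasted load TTL(i,e,p)(t1) in bit/s,
               'lsps'     : ordered list of LSPs, each a list of link
                            ids (LSP-1 first, i.e. most preferable).
             link_bw: dict link id -> maximum bandwidth BWj in bit/s.
             Returns, per stream, likelihoods for LSP-1..LSP-n plus one
             final value for the non-transmittable remainder."""
          state = {}
          for s in streams:
              n_chips = int(round(s['forecast'] / chip_size))
              state[s['id']] = {'left': n_chips,
                                'placed': [0] * len(s['lsps']),
                                'remainder': 0}

          def place_one_chip(stream):
              st = state[stream['id']]
              for k, lsp in enumerate(stream['lsps']):
                  if all(link_bw[j] >= chip_size for j in lsp):
                      for j in lsp:
                          link_bw[j] -= chip_size  # remove bandwidth chips
                      st['placed'][k] += 1
                      return
              st['remainder'] += 1                 # no LSP could take it

          # Higher ranked streams first; within a rank, alternate.
          for prio in sorted({s['priority'] for s in streams}):
              group = [s for s in streams if s['priority'] == prio]
              while any(state[s['id']]['left'] > 0 for s in group):
                  for s in group:
                      if state[s['id']]['left'] > 0:
                          state[s['id']]['left'] -= 1
                          place_one_chip(s)

          # Norm the pile heights into likelihood values.
          result = {}
          for s in streams:
              st = state[s['id']]
              total = sum(st['placed']) + st['remainder']
              if total == 0:
                  shares = [0.0] * (len(s['lsps']) + 1)
              else:
                  shares = [c / total for c in st['placed']]
                  shares.append(st['remainder'] / total)
              result[s['id']] = shares
          return result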

 2.4 Distributing the traffic load onto the LSPs

   An ingress node must distribute the received traffic load onto the
   respective LSPs according to the likelihood values.  Ingress node
   i, which receives a new IP packet, may identify the traffic stream
   to which this packet belongs, i.e. it may identify the packet's
   egress node based on its IP destination address and may identify
   its priority class based on its DiffServ parameters.

   (a):
   In accordance with the actual likelihood values it will select an
   appropriate LSP for transmitting this packet, or else will dishonor
   its forwarding request:  The ingress node may build a hash with
   respect to the IP source addresses of the IP packets of T(i,e,p).
   The simplest hash is one that partitions the entire IP unicast
   address space into n(i,e,p)+1 intervals.  The sizes of these
   intervals shall be proportional to the likelihood values.  The IP
   source address of a received IP packet of a particular traffic
   stream is examined in order to determine which interval it belongs
   to.  If it is from the k-th interval, then the packet is forwarded
   along the k-th LSP, k=1,...,n(i,e,p).  If it is from the
   (n(i,e,p)+1)-st interval, then it is not forwarded at all.

   This very simple hash mechanism suits best if the ingress node is
   fairly in the "center of the Internet", i.e. receives IP packets
   containing any existing IP unicast source address.
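
   A minimal Python sketch of this interval mechanism follows; mapping
   a 32-bit IPv4 source address onto intervals proportional to the
   likelihood values is an illustrative choice, not mandated by this
   draft.

      import ipaddress

      def select_lsp_by_hash(src_ip, likelihoods):
          """likelihoods: n(i,e,p)+1 values summing to 1; the last one
          is the likelihood of not forwarding the packet at all.
          Returns an LSP index 0..n-1, or None if the packet is not to
          be forwarded."""
          # Position of the source address within the IPv4 space.
          pos = int(ipaddress.IPv4Address(src_ip)) / 2**32
          upper = 0.0
          for k, share in enumerate(likelihoods):
              upper += share
              if pos < upper:
                  return k if k < len(likelihoods) - 1 else None
          return None   # numerical edge case: treat as "do not forward"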

   (b):
   An ingress node i (e.g. of an access network) may know, by means of
   network management configuration or by learning while operating,
   the appropriate candidate IP source address range which may be
   partitioned in order to form the right hash intervals.

   (c):
   If an ingress node receives IP packets from a fairly limited number
   of sources, e.g. given by fewer than 100 different IP source
   addresses, it may be feasible to queue up information pertaining to
   all currently active IP micro-flows: A queue element may consist of
   an IP source address and the LSP to be selected.  If the first
   packet of a new IP micro-flow is received, i.e. if no queued
   information can be found that matches the received packet's IP
   source address, then a new queue element must be allocated whose
   LSP information may be determined as follows: Generate a random
   number which hits/selects one column of a statistical step function
   which is built from the likelihood values.  The k-th column selects
   the k-th LSP.  That LSP shall be used and also entered in the new
   queue element.
   Any succeeding packet of a particular IP micro-flow may be sent
   along the same LSP.  Advantage: all packets of the same IP
   micro-flow will arrive in proper sequence at the egress node.  A
   mechanism needs to be added to determine the end of the lifetime of
   an IP micro-flow, i.e. the point in time when a queue element is
   deleted: Maintain an information item in the queue element which is
   refreshed/incremented by the transmission request of each packet of
   the IP micro-flow.  Employ a periodic timer.  Whenever it expires,
   check all queue elements for all traffic streams originated in this
   ingress node whether they are aged out.  If so, delete them.
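
   A rough Python sketch of this per-micro-flow bookkeeping is given
   below; the idle threshold and the use of random.choices to pick an
   LSP column in proportion to the likelihood values are assumptions
   made for the example.

      import random
      import time

      class MicroFlowTable:
          """One entry per active IP micro-flow: source address ->
          chosen LSP index (or None for "do not forward")."""

          def __init__(self, likelihoods, max_idle=60.0):
              # likelihoods: n(i,e,p)+1 values; the last one means
              # "do not forward".
              self.likelihoods = likelihoods
              self.max_idle = max_idle
              self.flows = {}   # src_ip -> (lsp_index or None, last_seen)

          def lookup(self, src_ip):
              entry = self.flows.get(src_ip)
              if entry is None:
                  # First packet of a new micro-flow: draw a column at
                  # random, weighted by the likelihood values.
                  k = random.choices(range(len(self.likelihoods)),
                                     weights=self.likelihoods)[0]
                  lsp = k if k < len(self.likelihoods) - 1 else None
                  entry = (lsp, time.time())
              else:
                  entry = (entry[0], time.time())   # refresh lifetime
              self.flows[src_ip] = entry
              return entry[0]

          def age_out(self):
              # Called from the periodic timer: delete aged-out flows.
              now = time.time()
              self.flows = {ip: e for ip, e in self.flows.items()
                            if now - e[1] <= self.max_idle}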

3. Further Aspects

 3.1 Other traffic load input:

   The TEC may also receive information about SCHEDULED traffic from
   senders other than the ingress nodes.  Such senders may be a policy
   agent or a Network Management station/agent.

 3.2 Adding/Removing another LSP per traffic stream

  3.2.1 Adding an LSP:
   In the preceding it was assumed that there are some LSPs per
   traffic stream, without saying when and why an(other) LSP may be
   added or removed.  An obvious reason for establishing another LSP
   is given when the TEC determines a considerable likelihood for NOT
   forwarding some packets of the traffic stream along any of the
   existing LSPs.

   Indeed, the TEC may compute the route for another LSP, taking into
   account the bandwidth currently available in the network due to the
   present traffic loads, and may send to the right ingress node the
   computed route for the new LSP - as well as the likelihood by which
   to use it.

  3.2.2 Removing an LSP:
   Over a longer period of time, an ingress node, just like the TEC,
   may notice that an LSP is hardly used.  It may then be removed.
   TEC and ingress node must inform each other.

 3.3 Delaying instead of dishonoring packet transmission

   Despite best traffic balancing, and even despite Section 3.2, there
   may be IP packets whose transmission request might be dishonored.
   Alternatively, their transmission may only be delayed.  Further
   study is required on how to do this and what the consequences are
   with respect to the entire "traffic balancing result".

 3.4 On-demand LSPs:

   Another variation would be to establish LSPs only on demand.  For a
   traffic stream, the ingress node may have, instead of a set of
   existing LSPs, a set of routes to choose from.  The procedure
   described above for determining the right LSP for transmitting an
   IP packet may be applied analogously, however in order to determine
   the explicit route to use when the on-demand LSP is to be
   established.





 3.5 Long term results

   Over a longer period of time, while and although the total traffic
   is orchestrally conducted, the TEC may find out which links are
   permanently under-utilized or over-utilized.  It may conclude which
   links should be stocked up by which amount of bandwidth, or it may
   conclude which links may be removed or used differently by being
   re-configured.

   Determining to which extent a link is permanently under-utilized:
   Simplest method: determine the smallest available bandwidth that
   link ever had.

   Determining to which extent the bandwidth of a link should be
   stocked up:  After the TEC has computed and sent out the likelihood
   values for all traffic streams, it may compute likelihood values
   for a fictitious "should be" network as follows:  In one additional
   cycle the TEC may process each remainder TTL(i,e,p) pile as if it
   were one big chip, moving this "big chip" onto its most preferred
   LSP-1.  Hereby subtract from each affected link-j bandwidth pile
   the size of this "big chip", and do not mind if the height of any
   link-j bandwidth pile becomes a negative number.  Indeed, after
   this cycle any resulting negative link-j bandwidth pile indicates
   the absolute amount of bandwidth by which link j should be stocked
   up.

   Of course, average values should be built over a longer period of
   time in order to determine the permanent over-/under-utilization of
   any link in the network.
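
   Staying with the chip picture, a small Python sketch of this
   additional cycle might look as follows; it builds on the
   illustrative data layout of the earlier likelihood sketch and is an
   assumption, not a normative algorithm.

      def bandwidth_to_stock_up(streams, link_bw, remainders, chip_size):
          """remainders: stream id -> number of remainder chips left
          over by the likelihood computation.  link_bw: link id ->
          bandwidth left after that computation (bit/s).  Returns
          link id -> amount of bandwidth by which the link should be
          stocked up."""
          balance = dict(link_bw)         # work on a copy
          for s in streams:
              big_chip = remainders.get(s['id'], 0) * chip_size
              if big_chip <= 0:
                  continue
              for j in s['lsps'][0]:      # move the big chip onto LSP-1
                  balance[j] -= big_chip  # negative values are allowed
          # A negative balance on link j is exactly the missing amount.
          return {j: -b for j, b in balance.items() if b < 0}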



 3.6 Equally ranked LSPs

   So far it has been assumed that per traffic stream T(i,e,p) there
   are n(i,e,p) LSPs and that they are in ranking order: LSP-1 > LSP-2
   > ... > LSP-n(i,e,p).  LSP-1 is the most favorable LSP and
   LSP-n(i,e,p) is the least favorable LSP, e.g. in the sense that the
   more favorable LSP consists of fewer hops than the less favorable
   LSP.  However, among the n(i,e,p) LSPs there may be subsets of LSPs
   which are equally favorable and should be treated as equally
   ranked.

   Accordingly, when another traffic load chip shall be distributed to
   any LSP-specific traffic load pile, we should at first try any LSP
   from the best ranked subset of LSPs.  That LSP should be selected
   such that an even distribution among the LSPs of this subset is
   warranted (which is accomplished by proper alternation).

   Proceed analogously with the next ranked subset of LSPs, in case no
   LSP of the better ranked subset can accommodate this chip.

 3.7 Central TEC versus de-centralized solutions

   This OCT concept operates with a central TEC.  There may also be
   alternative concepts which operate in a de-centralized fashion.  In
   case the TE-WG is interested in standardizing a protocol according
   to this OCT concept, I would like to point out one important
   advantage of the centralized solution, i.e. one associated
   standardization objective:

   The ingress nodes have to be provided with some basic initial
   software - just like the TEC.  Later on, the TEC may get smarter
   and smarter by receiving newer software versions.  At the same
   time, however, the ingress nodes can keep their old initial
   software.

 3.8 Standardization - required / not required

   It is possible to deploy the OCT concept in a completely
   proprietary fashion, i.e. without standardizing any new messages,
   TLVs, codepoints or well-known TCP port numbers.

   In the simplest implementation, special p2p LSPs are established
   from each ingress node to the TEC and vice versa for exchanging all
   relevant information in a proprietary syntax.  Alternatively, the
   TEC may send out the likelihood values down a spanning tree, which
   is provided by a tree-structured source routing information element
   called TREE ROUTE TLV, see [3], whereby each node i* becomes aware
   which part of the received information is to be forwarded along
   which child link and which part is targeted at itself - namely all
   likelihood values for all traffic streams T(i*,e,p).

   It is up to the TE-WG (and maybe others) whether or not to
   standardize all required protocols and formats with respect to the
   communication between any ingress node and the TEC, as well as
   between any Network Management and/or Policy Agent and the TEC.

4.  Intellectual Property Considerations

   Siemens AG may seek patent or other intellectual property
   protection for some or all of the technologies disclosed in this
   document.  If any standards arising from this document are or
   become protected by one or more patents assigned to Siemens AG,
   Siemens AG intends to disclose those patents and license them under
   openly specified and non-discriminatory terms.

5.  References





   [1] "Traffic  Engineering  Concept",
   draft-awduche-mpls-traffic-eng-00.txt, Awduche

   [2] "Provider Architecture for Differentiated  Services
   and  Traffic Engineering (PASTE)", 1/1998, draft-li-paste-00.txt,
   Li, Rekhter

   [3] "Explicit Tree Routing",draft-hummel-mpls-tree-route-01.txt,
   Hummel, Loke

6.  Author's Address

      Heinrich Hummel
      Siemens AG
      Hofmannstrasse 51
      81379 Munich, Germany
      Tel: +49 89 722 32057
      Email: heinrich.hummel@icn.siemens.de



   Full Copyright Statement

   "Copyright (C) The Internet Society (March 2000). All Rights
   Reserved. This document and translations of it may be copied and
   furnished to others, and derivative works that comment on or
   otherwise explain it or assist in its implmentation may be prepared,
   copied, published and distributed, in whole or in part, without
   restriction of any kind, provided that the above copyright notice and
   this paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such as
   by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the procedures
   for copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.










