Internet Draft
Internet Engineering Task Force 		   Hirokazu Ishimatsu 
Internet-Draft					   Yoshihiro Hayata
					           Susumu Yoneda
   						   Japan Telecom Co., LTD.

Expiration Date: May 2001                	   Ramesh Bhandari
                                                   George Newsome
					           Eve Varma
                                                   Lucent Technologies
                                                            
                                                   November, 2000


  
  Carrier Needs Regarding Survivability and Maintenance for
                  Switched Optical Networks
                              
            draft-hayata-ipo-carrier-needs-00.txt
                              

Status of this Memo

This document is an Internet-Draft and is in full
conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be
accessed at http://www.ietf.org/shadow.html.


1.  Abstract

As discussed in [1], the need for survivable optical networks is 
critical, and introducing capabilities that further enhance network 
survivability continues to be an essential objective.   This is 
particularly important for operators with stringent requirements for 
network resilience and service survivability.  However, disruption of 
service can result not only from faults, but also from scheduled 
maintenance procedures. This draft introduces some additional 
considerations and carrier needs related to failure recovery and 
scheduled maintenance work in switched optical networks. These are of 
critical importance for serving business customers who require super 
high quality service assurance and pay correspondingly high tariffs to 
guarantee this level of QoS.


2.  Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

3.  Introduction

The explosion of data services is increasingly imposing challenging 
network infrastructure requirements at the same time that wavelength 
services are emerging in the marketplace. Next generation optical 
networking solutions must enable scalable, flexible, and reliable 
networks as well as increased responsiveness to client network needs.  
An optical layer service framework has been discussed in [2] in terms 
of the service considerations that matter to inter-city network 
operators.  As described there, some key objectives include service 
functionality, a workable business model, and evolvability in a 
heterogeneous network environment. 

Key service functionality cited in [2] includes rapid provisioning 
and restoration.  Automated provisioning of optical layer resources in 
support of scheduled and demand-based customer/client needs offers 
opportunities for supporting new services as well as handling routine 
maintenance activities in a non-service-disrupting manner (e.g., 
scheduled or predictable maintenance-related churn).

Assuring support for a workable business model that can adapt to change, 
e.g., arbitrage, is important. It has become clear that there is a range 
of reasonable business models that might be utilized in an operator's 
network, depending upon the scope and objectives of the enterprise.  As 
discussed in [4], such models might be used in various ways, and for 
various purposes, even by different organizations within the same 
network operator domain.

Evolvability is an important consideration because service providers 
need a smooth network evolution path to address the problems inherent 
in simultaneously supporting an existing network while deploying a new 
multi-service infrastructure.  It is also necessary to enable emergent 
service providers to tailor their networks for their targeted market 
and service offerings; however, emergent providers must deal with an 
embedded base as soon as initial deployment of resources has occurred.

Within the remainder of this draft, we will focus upon service 
functionality and business model objectives in relation to service 
survivability and maintenance considerations for highly reliable 
services such as the super high quality services discussed below.

4.  Switched Optical Services

The basic requirement of a switched optical service is that a channel is 
established via an appropriate signaling mechanism before data can be 
transferred, and that this establishment is achieved in the following 
manner:
- a real-time client specifies its traffic characteristics and its 
  end-to-end performance requirements to the server
- the most suitable route for a channel that meets these requirements is 
  determined
- the end-to-end parameters are translated into local parameters at each 
  NE, and resources are reserved via signaling.

The service abstraction defines a contractual relationship between 
client and server. Hence, once the connection is established, the server 
guarantees that, in the absence of a failure, it will meet its 
contractual obligations. This contract is agreed before data transfer. 
For the server to honor the contract, several actions have to be taken 
in case of a failure. This draft addresses those actions in Sec. 4.2.

4.1 Super High Quality Services Characteristics

Super high quality services (also known as private line services) 
offered by a carrier currently have the following characteristics:

- The exact physical and logical location of a private line user's path 
  in the network (i.e., the optical fiber cable, fiber, optical 
  channel, SDH logical path, port of transmission equipment/router, 
  etc.) is uniquely identifiable and known to the network operator.
- When a logical path or port is switched to an alternate route (i.e., 
  a back-up path) due to an unexpected event, after the event or 
  failure is repaired, the carrier switches traffic back from the 
  alternate path to the original path.
- For scheduled maintenance, the carrier always asks customers with 
  super high quality services that may be affected by the maintenance 
  work when they would prefer the work to be carried out.  The carrier 
  then carries out the scheduled maintenance work according to the 
  customer's preferred date and time, as it is essential that important 
  customers not be adversely impacted in any way by scheduled 
  maintenance work.
- The carrier provides for guaranteed service survivability in the 
  event of failures.  It does so by providing alternate paths for 
  carrying services, with the service and alternate paths being 
  physically and topologically diverse.


4.2 Service Survivability Considerations

As discussed in [3], there is a range of failures that can occur within 
a network, and high reliability applications will require a variety of 
failures to be taken into account.  Examples that have been considered 
include office outages, failures arising from diverse circuits 
traversing shared protection facilities such as rings, and natural 
disasters.  It is essential to prepare fully for natural disasters such 
as earthquakes, volcanoes and typhoons.  Further, super high quality 
services are extremely sensitive to service interruptions.  Thus, it is 
important that the service and alternate paths do not share any Shared 
Risk Link Group (SRLG) [3] and do not pass through the same "region of 
failure".  Additionally, in order to assure an optimized survivable 
network architecture, it is desirable that traffic can be switched back 
from the alternate path to the original service path once the failure 
is repaired (note that not all carriers may choose to revert).  The 
following grades of service may be defined, each with actions to be 
taken in the event of a failure:

- Standard service, which is provided from a given source to a given 
  destination over a path computed in accordance with normal network 
  capacity constraints; when the customer loses connection on account 
  of a fault, the customer may request the same connection which the 
  network will then try to establish on a newly computed path.

- Medium High Quality Service which, at the customer's request, 
  provides a connection over a path that avoids a certain set of cities 
  or regions, which are prone to damage due to natural disasters such 
  as earthquakes, volcanoes, typhoons, etc. These "regions of failure" 
  may each be ascribed a "radius of failure" determined from a study of 
  the past history of the spatial extent and severity of damage in 
  those regions; in the event of a failure of this service, the 
  customer may request reestablishment of a connection, which the 
  network will attempt to provide over a new path. 

- High Quality Service, which is provided with a physically disjoint 
  back-up path in case of failure of the primary path; there are no 
  requirements on city avoidance, etc.; as a result, the back-up 
  guarantees continuity of service only in the event of link or 
  equipment failure.

- Super High Quality Service, which is provided with a physically 
  disjoint back-up path, constrained to have no "region of failure" in 
  common with the original path. This type of service may be requested 
  by large business customers who want continuity of service at all 
  times. In fact, since the downtime of the primary path may be long in 
  major catastrophes such as earthquakes, floods, etc., a carrier may 
  offer to provide a second back-up for the back-up path to which the 
  guaranteed services were switched upon failure of the primary path.

The above four types of services may be summarized in the table below:

    Service Type    Physically disjoint    Avoid a Region of
                    protection path        Failure

    Standard        No                     No
    Medium High     No                     Yes
    High            Yes                    No
    Super High      Yes                    Yes
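
As an illustration only, the four grades and their two constraints 
could be held in a small lookup structure such as the following Python 
sketch.  The names ServiceGrade, GradeConstraints and CONSTRAINTS are 
hypothetical and are not taken from any referenced specification.

```python
from dataclasses import dataclass
from enum import Enum


class ServiceGrade(Enum):
    """The four service grades of Sec. 4.2 (identifiers are assumptions)."""
    STANDARD = "standard"
    MEDIUM_HIGH = "medium_high"
    HIGH = "high"
    SUPER_HIGH = "super_high"


@dataclass(frozen=True)
class GradeConstraints:
    disjoint_backup: bool        # physically disjoint protection path
    avoid_failure_regions: bool  # path constrained by regions of failure


# Direct transcription of the summary table.
CONSTRAINTS = {
    ServiceGrade.STANDARD: GradeConstraints(False, False),
    ServiceGrade.MEDIUM_HIGH: GradeConstraints(False, True),
    ServiceGrade.HIGH: GradeConstraints(True, False),
    ServiceGrade.SUPER_HIGH: GradeConstraints(True, True),
}
```

A path computation function could then branch on 
CONSTRAINTS[grade] rather than on the grade name itself.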

If the constraints for the above high quality services can only be met 
partially (e.g., 100% physical diversity cannot be provided because it 
does not exist for the particular source-destination pair), then the 
customer, instead of being refused the desired service, may be offered 
service with a correspondingly reduced level of service protection. For 
example, if the percentage of fiber overlap between the primary and 
secondary routes is x, then the customer may be offered the service with 
the service continuity guarantee reduced by x%, and thus also at a 
correspondingly reduced cost. Furthermore, where the customer does not 
want to pay the full cost of the above high quality services even though 
such service is available, service may still be provided, but with 
correspondingly reduced quality guarantees within the class of service 
under consideration.
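
The overlap-based reduction can be made concrete with a short Python 
sketch.  This draft does not define how the overlap x is measured; the 
sketch assumes it is the share of the primary route's fiber length that 
is also used by the secondary route, and that the guarantee is scaled 
linearly.  Both function names are hypothetical.

```python
def overlap_percent(primary_spans, backup_spans):
    """Percentage of primary-route fiber length shared with the backup.

    Each argument maps a fiber span identifier to its length in km.
    Measuring overlap relative to the primary route is an assumption.
    """
    total = sum(primary_spans.values())
    if total <= 0:
        raise ValueError("primary route must have positive length")
    shared = sum(length for span, length in primary_spans.items()
                 if span in backup_spans)
    return 100.0 * shared / total


def reduced_guarantee(full_guarantee, x):
    """Scale a guarantee figure down by x percent (assumed linear)."""
    return full_guarantee * (1.0 - x / 100.0)


# Illustrative routes: span "s3" (20 km) is common to both.
primary = {"s1": 60, "s2": 20, "s3": 20}
backup = {"s3": 20, "s4": 90}
x = overlap_percent(primary, backup)
```

A carrier could apply the same factor to the tariff, in line with the 
reduced cost described above.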

4.3 Data Bases and Algorithms

Because natural disasters such as earthquakes, typhoons, etc. can damage 
a large area in one instance, it is important to ascertain the regions 
within the service provider's network prone to damage by such 
calamities. Normally, such areas have a history of damage, and it should 
be possible to construct a data base on the location, intensity of 
disaster, its frequency, and the size of the area affected; the area 
affected may be expressed as a "radius of failure". It may also be 
possible to use the information on the intensity of disaster and the 
frequency of occurrence to assign probabilities of failure to the 
offered services. For path computation, the following data bases are 
needed:

- Nodes, links, and their fiber span content, or alternatively, nodes, 
  fiber spans and the links riding the individual spans (also called 
  Shared Risk Link Groups, or SRLGs); clearly, if a link or node is not 
  in service, it is not included in path computation.

- Regions of failure, corresponding radii of failure and locations 
  within the service provider's network; these should be taken into 
  account before computing paths for the medium high and super high 
  quality services.
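
A minimal sketch of the second data base follows, assuming that each 
region of failure is recorded as a center point plus a radius of 
failure in kilometers, and that node locations are known as latitude 
and longitude.  The record layout, the names and the use of great-circle 
distance are assumptions made for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class FailureRegion:
    name: str
    lat: float        # center latitude, degrees
    lon: float        # center longitude, degrees
    radius_km: float  # "radius of failure"


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def regions_hit(lat, lon, regions):
    """Names of all regions of failure that contain the given point."""
    return {r.name for r in regions
            if distance_km(lat, lon, r.lat, r.lon) <= r.radius_km}
```

This checks points only.  A link whose end nodes lie outside a region 
may still cross it, so a real data base would also need the fiber route 
geometry of each span.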

For highly reliable services such as the super high quality services, 
physically disjoint paths are required in real-life networks, which 
involve span-sharing links (SRLGs). Ref. [5] describes algorithms for 
such real-life networks. The algorithms emphasize optimality to save 
network costs. Depending upon the span-sharing topologies of a given 
network, these optimal algorithms can be very fast, and thus suitable 
for running in a real-time environment. For networks with very 
complicated span-sharing topologies, exact algorithms do exist [5], but 
they are slow for large networks, since the problem becomes NP-complete. 
In such situations, fast heuristics may be developed [5] (see also [2] 
for a discussion on diversity).
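
To illustrate what a fast heuristic might look like, the Python sketch 
below uses a simple two-step approach: compute a shortest primary path, 
remove every link that shares a fiber span with it, and compute a 
shortest path on what remains.  This is not one of the algorithms of 
[5]; it is neither optimal nor complete, and it can fail to find a pair 
even when one exists.  The data layout and function names are 
assumptions.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over undirected links {link_id: (node_a, node_b, cost)}.

    Returns the path as a list of link ids, or None if unreachable.
    """
    adj = {}
    for lid, (a, b, cost) in links.items():
        adj.setdefault(a, []).append((b, cost, lid))
        adj.setdefault(b, []).append((a, cost, lid))
    dist = {src: 0}
    prev = {}  # node -> (previous node, link id used to reach node)
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost, lid in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = (u, lid)
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        node, lid = prev[node]
        path.append(lid)
    return path[::-1]


def srlg_disjoint_pair(links, srlgs, src, dst):
    """Two-step heuristic; srlgs maps link_id -> set of fiber span ids."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None
    used = set().union(*(srlgs.get(lid, set()) for lid in primary))
    # Keep only links that share no span with the primary path.
    remaining = {lid: v for lid, v in links.items()
                 if lid not in primary and not (srlgs.get(lid, set()) & used)}
    backup = shortest_path(remaining, src, dst)
    return (primary, backup) if backup is not None else None
```

Node disjointness and regions of failure are not enforced here; both 
would be additional pruning steps before the second path computation.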

4.4  Business Model Considerations

As described in [4], there are several business models that may be 
applicable for network operators: an ISP owning all Layer 1 
infrastructure and only delivering IP-based services, an ISP owning or 
leasing Layer 1 infrastructure and only delivering IP-based services, a 
retailer or wholesaler for multi-services, and a carrier's carrier or 
bandwidth broker.  A carrier owns the Layer 1 infrastructure and sells 
multiple service types to customers, which may include other operator 
networks.  This bandwidth brokering, or reseller, role takes on a new 
meaning in the context of service resilience.  For many years, in Japan, 
operators have collaborated to handle traffic in the event of natural 
disasters, so that bandwidth can be borrowed from each other.  Thus, if 
an operator doesn't have the capacity, it can borrow capacity from 
another network.  Accommodating the unexpected is a key factor in this 
case. Indeed, it seems to be a common pattern in industry that 
businesses that provide service and operate their own infrastructure 
tend to separate into two businesses. This makes it likely that even 
though infrastructure may be wholly owned today, it may well not be 
tomorrow. It is therefore important to take account of fully separated 
business models (cases 3 and 4 of [4]) even if they do not seem to 
represent the majority of today's businesses.

5.  Implications for switched optical networks

Considering the discussion in Subsections 4.1 - 4.4, switched optical 
networks must minimally:
- Support the various grades of high quality services described in 
  Sec. 4.2, including the Super High Quality Service characterized in 
  Sec. 4.1.
- Support survivability considerations related to diverse routing, 
  tailored to the unique characteristics of Japan's geography and 
  routing of fibers.
- Enable "bandwidth borrowing on demand" from other carriers as well as 
  support for multiple service types.

Examples of necessary functionality are provided in more detail below, 
as well as some related connection setup operations.

5.1 Functions

Referring to Section 4, we can see that the following functions need to 
be supported:
- Ability for the network operator to manually set the date and time at 
  which a path switching function should take place, and have the 
  switch occur automatically.  (The guarantee that the switch occurs as 
  scheduled is closely linked to resource allocation policies; see 
  T1X1.5/2000-194 for further discussion on scheduled connections.)
- Ability to specify switching to a physically/topologically disjoint 
  path from the service path.
- Ability to maintain and update the data bases in a timely manner so 
  that a connection request is supported with the most current 
  knowledge of the network.
- Ability for the operator to support a survivability policy that 
  allows switch-back to the original service path.
- Ability to support an operator policy to prioritize service requests 
  so that, in the event of a fault, customers with super high quality 
  services have first priority in being switched to disjoint paths.
- Ability to enable key customers to request constraints on the 
  connection path (e.g., avoid City X because an earthquake has just 
  occurred, or simply because the city is very much prone to damage 
  from natural disasters such as earthquakes, volcanoes and typhoons). 
  This involves the ability to express geographic constraints, as 
  opposed to just physical (equipment) or topological constraints.
- Ability to prevent new customers from being added to a particular 
  link for a certain amount of time (e.g., because of a failure, 
  natural disaster, scheduled maintenance).  This requires the ability 
  to mark particular resources as out of service.
- Ability for the operator to query the service management function to 
  establish the exact location and characteristics of service paths for 
  key customers.
- Ability for the operator to view information regarding which 
  customer/user is associated with which service path(s).
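
As one example, the prioritization function above might be realized as 
a simple ordering of affected connections by service grade.  The sketch 
below is in Python.  Only the first-priority position of super high 
quality services comes from this draft; the relative order of the other 
grades and all names are assumptions.

```python
# Lower number = restored first. Only the top position of "super_high"
# is stated in this draft; the rest of the ordering is assumed.
RESTORATION_PRIORITY = {
    "super_high": 0,
    "high": 1,
    "medium_high": 2,
    "standard": 3,
}


def restoration_order(affected):
    """Order (connection_id, grade) pairs for switching after a fault.

    sorted() is stable, so connections of the same grade keep the order
    in which they were reported.
    """
    return sorted(affected, key=lambda conn: RESTORATION_PRIORITY[conn[1]])


affected = [
    ("c1", "standard"),
    ("c2", "super_high"),
    ("c3", "high"),
    ("c4", "super_high"),
]
ordered = restoration_order(affected)
```

An operator policy could replace the fixed table with per-customer 
priorities without changing the ordering function.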

5.2  Connection Setup Operation

Referring to [4], some relevant connection setup parameters include:

1) Scheduled service - ability to request the connection to be made at 
some specified time in the future (see T1X1.5/2000-194 for further 
discussions). 
2) Scheduled duration - ability to specify a duration for the 
connection. 
3) Resilience - ability to request resilience against server layer 
faults, and specify a particular degree of risk (see Sec. 4.2).
4) Connection constraints - ability to specify the constraints as in the 
three levels of high quality service described in Sec. 4.2.
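
The four parameters could be carried in a connection request record 
along the lines of the following Python sketch.  It is not the 
interface defined in [4]; the field names, the grade identifiers and 
the source and destination fields are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRADES = {"standard", "medium_high", "high", "super_high"}


@dataclass
class ConnectionRequest:
    source: str
    destination: str
    start_time: Optional[datetime] = None   # 1) None means "set up now"
    duration: Optional[timedelta] = None    # 2) None means open-ended
    grade: str = "standard"                 # 3) requested resilience
    avoid_regions: frozenset = frozenset()  # 4) regions of failure to avoid

    def __post_init__(self):
        if self.grade not in GRADES:
            raise ValueError(f"unknown service grade: {self.grade}")
        if self.duration is not None and self.duration <= timedelta(0):
            raise ValueError("duration must be positive")

    def end_time(self):
        """Scheduled release time, or None if it cannot be derived."""
        if self.start_time is None or self.duration is None:
            return None
        return self.start_time + self.duration
```

A scheduled request then carries a start time and a duration, while an 
on-demand request leaves both unset.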

6.  References

[1] J. Luciani, B. Rajagopalan, D. Awduche, B. Cain, B. Jamoussi, "IP 
over Optical Networks - A Framework", March 2000.
[2] John Strand, "Optical Layer Services Framework", T1X1.5/2000-142.
[3] Monica Lazer, John Strand, "Some Routing Constraints", 
T1X1.5/2000-143.
[4] George Newsome, "ASON - Requirements at the Client API", 
T1X1.5/2000-158.
[5] Ramesh Bhandari, "Survivable Networks - Algorithms for Diverse 
Routing", Kluwer Academic Publishers, 1999.

7. Authors' Contact Information

Hirokazu Ishimatsu
Japan Telecom
hirokazu@japan-telecom.co.jp

Yoshihiro Hayata
Japan Telecom
hayata@japan-telecom.co.jp

Susumu Yoneda
Japan Telecom
yone@japan-telecom.co.jp

Ramesh Bhandari
Lucent Technologies
bhandari1@lucent.com

George Newsome
Lucent Technologies
gnewsome@lucent.com

Eve Varma
Lucent Technologies
evarma@lucent.com

