Internet Draft
Internet Engineering Task Force 		   Ramesh Bhandari
Internet-Draft					   Siva Sankaranarayanan
					           Eve Varma
   						   Lucent Technologies

Expiration Date: May 2001                	   
                                                            
                                              November, 2000


  
  High Level Requirements for Optical Shared Mesh Restoration
                              
            draft-bhandari-optical-restoration-00.txt
                              

Status of this Memo

This document is an Internet-Draft and is in full
conformance with all provisions of Section 10 of  RFC2026.

Internet-Drafts are working documents of the Internet
Engineering Task Force (IETF), its areas, and its working
groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of
six months and may be updated, replaced, or obsoleted by
other documents at any time. It is inappropriate to use
Internet-Drafts as reference material or to cite them other
than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be
accessed at http://www.ietf.org/shadow.html.


1. Abstract

In this draft, we provide high-level requirements for optical shared
mesh restoration within the optical transport network.


2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as
described in RFC-2119.
 

3. Introduction

Because of the enormous volume of traffic that optical networks are
expected to carry, resulting from the continued explosive growth of
data-oriented applications, optical network survivability has become an
issue of paramount importance.  At the same time, there is a continuing
drive to maximize efficiency and minimize costs in large networks.
Very fast restoration mechanisms such as 1+1 schemes (with restoration
times on the order of tens of milliseconds) exist, but given the
network resources they consume, alternative options are essential.
With the availability of large optical cross-connects, shared mesh
network restoration at the optical layer is a versatile approach that
should be considered. Simulations [1] have shown that shared mesh
networks require much less additional capacity than rings. Although
shared mesh restoration consumes fewer network resources, the trade-off
has been service restoration time. However, mesh-based restoration is
not inherently "slow"; if appropriate architectural requirements are
established in a timely manner, it should be possible to enable fast
restoration times (e.g., restoration times comparable to those provided
by SONET ring-based infrastructures). In this draft, we provide
architectural requirements that enable fast and efficient optical mesh
restoration.

4. Optical Mesh Network Architecture

This draft focuses on next-generation optical networks based on ITU-T
Recommendations G.872 [2] and G.709 [3]. Optical mesh networks consist
of optical cross-connects (OXCs) interconnected by DWDM links.
Associated with these OXCs are controllers that facilitate
communications among them. (Note that these controllers may be internal
or external to the controlled OXCs, and a one-to-one relationship is
not assumed.) An optical channel (OCh) connection through the optical
transport network (OTN) is established along a route having capacity
(wavelength availability) between its designated ingress and egress
points. The OCh connection between the source and the destination OXCs
comprises a series of OXCs interconnected by OCh link connections,
and a signaling mechanism is used to configure the OXCs appropriately
during OCh network connection establishment.

Note that an optical channel transparently carries a variety of client
signals (e.g., IP, SONET/SDH, ATM, GbE), and provides OAM capabilities
such as tandem connection monitoring (TCM) and end-to-end signal
integrity checking. Thus, an optical channel traversing a series of
optical subnetworks can be monitored at various points along the route,
typically at subnetwork boundaries, as well as at the OCh termination
points (end-points). When an OCh network connection breaks down because
of one or more failed OCh link connections or a failed OXC node, the
affected traffic needs to be restored using an alternate route. There
are two ways in which this restoration may be performed:

1) reroute around the point of failure, e.g., a failed link 
connection 

2) reroute from the tandem connection monitoring (TCM) or OCh 
termination points. 

The first method requires fault localization before restoration
actions can be initiated; i.e., it is necessary to pinpoint the
precise location of the fault along the OCh network connection so
that rerouting can be performed around it.  Relatively quick fault
isolation might be provided by digitally monitoring OCh overhead at
every optical NE (ONE); however, this builds a dependency upon digital
processing throughout the entire OTN. It introduces the additional cost
of proliferating optical-electrical-optical (OEO) conversion throughout
the network solely for maintenance reasons (vs. impairment mitigation),
not to mention additional digital monitoring equipment to determine
performance degradation. Alternatively, controlling the expense by
sharing the monitoring equipment over many optical channels leads to an
unacceptably large fault detection time [4]. More significantly, this
method inhibits evolution towards transparent optical networks.
Further, fault localization in transparent optical networks may be
complicated by the non-linear interactions typical of such networks.
This can result in time-consuming correlations to identify the root
cause of signal impairments.

The second method of restoration involves rerouting from the OCh
terminations or the TCM points, and therefore does not require fault
isolation to occur before initiation of restoration actions. It is
expected to be fast because it utilizes the ability to accurately
detect loss of signal at the TCM and OCh termination points, from which
signaling may subsequently be initiated to restore the traffic on an
alternate path. Since the exact location of the fault along the primary
path is unknown, the alternate path has to be "physically-disjoint" from
the primary path. We further note that this approach is conducive to
evolution towards increasingly large transparent (all-optical, no
OEO) subnetworks in two ways: it avoids embedding dependencies on
digital processing within the OTN, and it is tailored to the needs of
all-optical networks. In what follows, we assume restoration is
path-based, i.e., it takes place from the TCM or OCh termination points,
and that the alternate path is physically disjoint. Note that, if the
primary path traverses multiple subnetworks or operator domains, then
because of monitoring at the edge of each domain, restoration may be
performed within that domain. This would also avoid the need for
signaling inter-working between multiple domains.
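
As an illustration of the path-based assumption above, the following
Python sketch finds an alternate path by pruning the primary path's
intermediate nodes and links from the topology and searching again. The
five-node mesh, the node names, and the helper functions shortest_path
and alternate_path are hypothetical and are not part of this draft.
Span disjointness is not modeled here. This two-step search is only a
simple heuristic, not the diverse routing algorithms of [5], and in
some topologies it can fail to find a disjoint pair even when one
exists.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first search; returns a fewest-hop node list or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        # Sorted iteration keeps the result deterministic.
        for nxt in sorted(adj.get(node, ())):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def alternate_path(adj, primary):
    """Prune the primary's intermediate nodes and links, then search again."""
    src, dst = primary[0], primary[-1]
    banned_nodes = set(primary[1:-1])
    banned_links = {frozenset(pair) for pair in zip(primary, primary[1:])}
    pruned = {}
    for node, neighbours in adj.items():
        if node in banned_nodes:
            continue
        pruned[node] = {
            n for n in neighbours
            if n not in banned_nodes
            and frozenset((node, n)) not in banned_links
        }
    return shortest_path(pruned, src, dst)


# Hypothetical five-OXC mesh, given as an undirected adjacency map.
mesh = {
    "A": {"B", "D"},
    "B": {"A", "C", "E"},
    "C": {"B", "E"},
    "D": {"A", "E"},
    "E": {"B", "C", "D"},
}
primary = shortest_path(mesh, "A", "C")
backup = alternate_path(mesh, primary)
```

In this example the primary path is A-B-C and the alternate path is
A-D-E-C, which shares neither an intermediate node nor a link with the
primary.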

Clearly, to effect restoration on these alternate disjoint paths, spare
capacity must be reserved on each link of the path. For the network to
be efficient, this spare capacity must be shared for restoration of
other working paths as well. For fast and scalable optical network
restoration, it is also desirable to maintain the network state in a
distributed manner. Below we point out some high-level requirements for
restoration at the optical layer.
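
The benefit of sharing spare capacity can be illustrated with a small
Python sketch. It assumes, for illustration only, that failures are
single-link failures and that each demand occupies one wavelength; the
function spare_capacity, the link labels, and the three demands are
hypothetical. With dedicated protection, a backup link needs one spare
wavelength per demand routed over it. With sharing, it needs only the
largest number of wavelengths that any single failure reroutes onto it.

```python
from collections import defaultdict


def spare_capacity(demands):
    """Return (dedicated, shared) spare wavelengths per backup link.

    Each demand is a pair (primary_links, backup_links) and is assumed
    to occupy exactly one wavelength.
    """
    dedicated = defaultdict(int)
    # load[backup_link][failed_link] = wavelengths moved onto backup_link
    load = defaultdict(lambda: defaultdict(int))
    for primary_links, backup_links in demands:
        for b in backup_links:
            dedicated[b] += 1
            for f in primary_links:
                load[b][f] += 1
    # Under single-link failures only the worst case must be reserved.
    shared = {b: max(per_failure.values()) for b, per_failure in load.items()}
    return dict(dedicated), shared


# Hypothetical demands: (links of primary path, links of alternate path).
demands = [
    (("A-B",), ("A-C", "C-B")),
    (("A-B",), ("A-C", "C-B")),
    (("C-D", "D-B"), ("C-B",)),
]
dedicated, shared = spare_capacity(demands)
```

Here link C-B would need three spare wavelengths if each demand had
dedicated backup capacity, but only two when the capacity is shared,
because no single link failure affects all three demands.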


5. Requirements for Fast Optical Mesh Restoration

Any optical mesh restoration scheme must:

- Be independent of OCh client (e.g., IP, ATM, SDH/SONET, GbE).

- Avoid dependency of restoration action initiation on
non-time-critical functions. Therefore, it should not require fault
localization to occur before initiating restoration actions.

  -> Restoration must be triggered from the TCM or OCh termination  
     points.

  -> The alternate path must be physically disjoint; by physically 
     disjoint, we mean not only node and link disjoint, but also 
     span-disjoint.

- Scale to handle catastrophic failures such as fiber cable cuts.

  -> Appropriate mechanisms must be utilized that can restore the
     (expected) large amount of affected traffic rapidly, and in a
     cost-effective manner; for example, in a core network
     application domain encompassing up to a few hundred nodes per
     subnetwork and thousands of point-to-point demands.

- Utilize a robust and efficient signaling mechanism.

  -> The signaling network must remain functional after a failure in 
     the transport and/or signaling network infrastructure.
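
The requirement that restoration be triggered from the TCM or OCh
termination points, without waiting for fault localization, can be
sketched in Python as follows. The class TerminationPoint, its
attributes, and the paths used are hypothetical; a real controller
would also signal the OXCs along the alternate path, which is omitted
here.

```python
class TerminationPoint:
    """Hypothetical OCh or TCM termination point (names are illustrative)."""

    def __init__(self, primary, alternate):
        self.paths = {"primary": primary, "alternate": alternate}
        self.active = "primary"
        self.deferred = []  # non-time-critical work, e.g. fault localization

    def on_loss_of_signal(self):
        """React to loss of signal detected locally at this point."""
        if self.active == "primary":
            # Switch first: the fault location is unknown and is not
            # needed, because the alternate path is physically disjoint.
            self.active = "alternate"
            self.deferred.append("localize fault on primary path")
        return self.paths[self.active]


tp = TerminationPoint(primary=["A", "B", "C"],
                      alternate=["A", "D", "E", "C"])
restored_route = tp.on_loss_of_signal()
```

Fault localization appears only as deferred work, so it never delays
the switch to the alternate path.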

Clearly, for restoration to be carried out effectively, it is necessary
for the connection controllers to have information on the network
topology (such as link state and wavelength availability) as well as on
physical aspects of the transport network such as fiber spans and
span-sharing links. Appropriate algorithms are needed to determine
physically disjoint paths for restoration (see, e.g., [5]), since
restoration must take place from the TCM or OCh termination points. To
ensure that paths are actually physically disjoint (i.e., node, link,
and span disjoint), span-sharing link topologies or Shared Risk Link
Groups (SRLGs) [5-6] of the actual physical fiber network must be
understood. For special high quality services [7], another key
consideration involves regions of failure, specified by the
corresponding radii of failure.  This is because, for such services,
diverse routes should not pass through a region where there is the risk
of both the primary and alternate paths failing simultaneously due to
catastrophic disasters such as earthquakes or floods.
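
A check for physical disjointness in the sense used here (node, link,
and span disjoint) might look like the following Python sketch. The
function names, the node names, and the mapping srlg_of from a link to
the set of spans or risk groups it traverses are hypothetical; how that
mapping is obtained is outside the scope of the sketch.

```python
def link(a, b):
    """Undirected link identifier."""
    return frozenset((a, b))


def links_of(path):
    return {link(a, b) for a, b in zip(path, path[1:])}


def risks_of(path, srlg_of):
    """Union of the risk groups (spans) used by every link of the path."""
    risks = set()
    for l in links_of(path):
        risks |= srlg_of.get(l, set())
    return risks


def physically_disjoint(primary, alternate, srlg_of):
    """True if the paths share no intermediate node, link, or risk group."""
    if set(primary[1:-1]) & set(alternate[1:-1]):
        return False
    if links_of(primary) & links_of(alternate):
        return False
    return not (risks_of(primary, srlg_of) & risks_of(alternate, srlg_of))


# Hypothetical data: links A-B and A-D leave node A in the same span.
srlg_of = {
    link("A", "B"): {"span-1"},
    link("B", "C"): {"span-2"},
    link("A", "D"): {"span-1"},
    link("D", "E"): {"span-3"},
    link("E", "C"): {"span-4"},
}
shares_span = physically_disjoint(
    ["A", "B", "C"], ["A", "D", "E", "C"], srlg_of)

srlg_of[link("A", "D")] = {"span-5"}
separate_spans = physically_disjoint(
    ["A", "B", "C"], ["A", "D", "E", "C"], srlg_of)
```

The two paths are node and link disjoint in both cases, but only the
second passes the check, since in the first a single cut of span-1
would take down both paths.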

Appropriate mechanisms (see, e.g., [5]) and algorithms may need to be
constructed to expedite the restoration process and to make the
restorable mesh network cost effective by sharing spare capacity.
Approaches to gathering information on network topology are currently
under consideration within various fora (e.g., via the use of
appropriate extensions to OSPF (see, e.g., [8])).


6. References


[1] S. Baroni et al., Proc. Conference on Optical Fiber Communications,
Paper TuK2, March 2000.
[2] Agreed revisions to Version 2 of G.872 per October 1999 Q19/13
Meeting, provided to T1X1.5 for information,
ftp://ftp.t1.org/pub/t1x1/2000x15/0x150500.pdf
[3] Draft ITU-T Recommendation G.709, Oct. 2000 version submitted for
approval at the Feb. 2001 SG 15 meeting, provided to T1X1.5 for
information, ftp://ftp.t1.org/pub/t1x1/x1.5/0x152460.doc
[4] G. Newsome, "Maintenance Philosophy for the OTN", T1X1.5/99-108R1
[5] R. Bhandari, "Survivable Networks: Algorithms for Diverse Routing",
Kluwer Academic Publishers (1999)
[6] S. Chaudhuri et al., "Control of Lightpaths in an Optical Network",
Internet Draft <draft-chaudhuri-ip-olxc-control-00.txt>, February 2000
[7] H. Ishimatsu et al., "Carrier Needs Regarding Survivability and
Maintenance for Switched Optical Networks", submitted in this meeting.
[8] G. Wang et al., "Extensions to OSPF/IS-IS for Optical Networking",
Internet Draft, March 2000

7. Authors' Contact Information

Ramesh Bhandari
Lucent Technologies
bhandari1@lucent.com

Sivakumar Sankaranarayanan
Lucent Technologies
ssnarayanan@lucent.com

Eve Varma
Lucent Technologies
evarma@lucent.com


		