Internet Draft
Network Working Group Kireeti Kompella
Internet Draft Manoj Leelanivas
Expiration Date: May 2001 Quaizar Vohra
Juniper Networks
Javier Achirica Ronald Bonica
Telefonica Data WorldCom
Chris Liljenstolpe Eduard Metz
Cable & Wireless KPN Dutch Telecom
Chandramouli Sargor
Vijay Srinivasan
CoSine Communications
MPLS-based Layer 2 VPNs
draft-kompella-mpls-l2vpn-02.txt
1. Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as ``work in progress.''
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Kompella et al. [Page 1]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
2. Abstract
Virtual Private Networks (VPNs) based on Frame Relay or ATM circuits
have been around a long time. While these VPNs work well, the costs
of maintaining separate networks for Internet traffic and VPNs and
the administrative burden of provisioning these VPNs have led Service
Providers to look for alternative solutions. In this document, we
present a VPN solution where from the customer's point of view, the
VPN is based on Layer 2 circuits, but the Service Provider maintains
and manages a single MPLS-based network for IP, MPLS IP VPNs, and
Layer 2 VPNs.
3. Introduction
The first corporate networks were based on dedicated leased lines
interconnecting the various offices of the corporation. Such
networks offered connectivity and little else: they didn't scale
well, they were expensive for the service providers (and hence for
their customers), and provisioning them was a slow and arduous task.
The first Virtual Private Networks (VPNs) were based on Layer 2
circuits: X.25, Frame Relay and ATM (see [VPN]). Layer 2 VPNs were
easier to provision, and virtual circuits allowed the service
provider to share a common infrastructure for all the VPNs. These
features were passed on to the customers in terms of cost savings.
However, while Layer 2 VPNs were a significant step forward from
dedicated lines, they still had their drawbacks. First, they tied
the service provider VPN infrastructure to a single medium (e.g.,
ATM). This became even more of a burden if the Internet
infrastructure was to share the same physical links. Second, the
Internet infrastructure and the VPN infrastructure, even if they
shared the same physical network, needed separate administration and
maintenance. Third, while provisioning was much easier than for
dedicated lines, it was still complex. This was especially evident
in the effort to add a site to an existing VPN.
This document offers a solution that preserves the advantages of a
Layer 2 VPN while allowing the Service Provider to maintain and
manage a single (MPLS-based) network for IP, MPLS IP VPNs ([IPVPN])
and Layer 2 VPNs, and reducing the provisioning problem
significantly. In particular, adding a site to an existing VPN in
most cases requires configuring just the Provider Edge router
connected to the new site.
The rest of this section discusses the relative merits of MPLS-based
Layer 2 and Layer 3 VPNs. Section 4 describes the operation of an
MPLS-based Layer 2 VPN. Sections 5 and 6 offer two alternative means
Kompella et al. [Page 2]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
of signalling Layer 2 VPNs, one using LDP and the other using BGP.
3.1. Terminology
We assume that the reader is familiar with Multi-Protocol Label
Switching (MPLS [MPLS]), the Label Distribution Protocol (LDP [LDP])
and the Border Gateway Protocol version 4 (BGP [BGP]).
The terminology we use follows. A "customer" is a customer of a
Service Provider seeking to interconnect the various "sites"
(independently connected networks) through the Service Provider's
network, while maintaining privacy of communication and address
space. The device in a customer site that connects to a Service
Provider router is termed the CE (customer edge device); this device
may be a router or a switch. The Service Provider router to which a
CE connects is termed a PE. A router in the Service Provider's
network which doesn't connect directly to any CE is termed P. These
definitions follow those given in [IPVPN].
3.2. Advantages of Layer 2 VPNs
We define a Layer 2 VPN as one where a Service Provider provides a
layer 2 network to the customer. As far as the customer is
concerned, they have (say) Frame Relay circuits connecting the
various sites; each CE is configured with a DLCI with which to talk
to other CEs. Within the Service Provider's network, though, the
layer 2 packets are transported within MPLS Label-Switched Paths
(LSPs).
The Service Provider does not participate in the customer's layer 3
network, in particular, in the routing, resulting in several
advantages to the SP as a whole and to PE routers in particular.
3.2.1. Separation of Administrative Responsibilities
In a Layer 2 VPN, the Service Provider is responsible for Layer 2
connectivity; the customer is responsible for Layer 3 connectivity,
which includes routing. If the customer says that host x in site A
cannot reach host y in site B, the Service Provider need only
demonstrate that site A is connected to site B. The details of how
routes for host y reach host x are the customer's responsibility.
Another very important factor is that once a PE provides Layer 2
connectivity to its connected CE, its job is done. A misbehaving CE
can at worst flap its interface. On the other hand, a misbehaving CE
Kompella et al. [Page 3]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
in a Layer 3 VPN can flap its routes, leading to instability of the
PE router or even the entire SP network. This means that the Service
Provider must aggressively damp route flaps from a CE; this is common
enough with external BGP peers, but in the case of VPNs, the scale of
the problem is much larger; also, the CE-PE routing protocol may not
be BGP, and thus not have BGP's flap damping control.
3.2.2. Migrating from Traditional Layer 2 VPNs
Since "traditional" Layer 2 VPNs (i.e., real Frame Relay circuits
connecting sites) are indistinguishable from MPLS-based VPNs from the
customer's point-of-view, migrating from one to the other raises few
issues. With Layer 3 VPNs, special care has to be taken that routes
within the traditional VPN are not preferred over the Layer 3 VPN
routes (the so-called "backdoor routing" problem, whose solution
requires protocol changes that are somewhat ad hoc).
3.2.3. Privacy of Routing
In a Layer 2 VPN, the privacy of customer routing is a natural
fallout of the fact that the Service Provider does not participate in
routing. The SP routers need not do anything special to keep
customer routes separate from other customers or from the Internet;
there is no need for per-VPN routing tables, and the additional
complexity this imposes on PE routers.
3.2.4. Layer 3 Independence
Since the Service Provider simply provides Layer 2 connectivity, the
customer can run any Layer 3 protocols they choose. If the SP were
participating in customer routing, it would be vital that the
customer and SP both use the same layer 3 protocol(s) and routing
protocols.
3.2.5. Multicast Routing
Supporting IP multicast over MPLS-based Layer 3 VPN is as yet
undocumented.
In the Layer 2 VPN case, the CE routers run native multicast routing
directly. The SP backbone just provides pipes to connect the CE
routers; whether the CE routers run IP unicast or IP multicast or
some other network protocol is irrelevant to the SP routers.
Kompella et al. [Page 4]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
3.2.6. PE Scaling
In the Layer 2 VPN scheme described below, each PE transmits a single
small chunk of information about every CE that the PE is connected to
to every other PE. That means that each PE need only maintain a
single chunk of information from each CE in each VPN, and keep a
single "route" to every site in every VPN. This means that both the
Forwarding Information Base and the Routing Information Base scale
well with the number of sites and number of VPNs. Furthermore, the
scaling properties are independent of the customer: the only germane
quantity is the total number of VPN sites.
This is to be contrasted with Layer 3 VPNs, where each CE in a VPN
may have an arbitrary number of routes that need to be carried by the
SP. This leads to two issues. First, both the information stored at
each PE and the number of routes installed by the PE for a CE in a
VPN can be (in principle) unbounded, which means in practice that a
PE must restrict itself to installing routes associated with the VPNs
that it is currently a member of. Second, a CE can send a large
number of routes to its PE, which means that the PE must protect
itself against such a condition. Thus, the SP must enforce limits on
the number of prefixes accepted from a CE; this in turn requires the
PE router to offer such control.
The scaling issues of Layer 3 VPNs come into sharp focus at a BGP
route reflector (RR). An RR cannot keep all the advertised routes in
every VPN since the number of routes will be too large. The
following solutions/extensions are needed to address this issue:
1) RRs could be partitioned so that each RR services a subset of
VPNs so that no single RR has to carry all the routes. This
method has the disadvantage that a PE changing its VPN
membership could force a change in the RR configuration, and
would require carefully constructing RR topologies.
2) An RR could use a preconfigured list of Route-Targets for its
inbound route filtering. The RR may also need to install
Outbound Route Filters [BGP-ORF] which contain the above list
of Route-Targets on each of its peers so that they do not send
unnecessary VPN routes. This method also requires significant
extensions along with the fact that multiple RRs are needed to
service different sets of VPNs.
Kompella et al. [Page 5]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
3.2.7. Ease of Configuration
Configuring traditional Layer 2 VPNs was a burden primarily because
of the O(n*n) nature of the task. If there are n CEs in a Frame
Relay VPN, say full-mesh connected, n*(n-1)/2 DLCI PVCs must be
provisioned across the SP network. At each CE, (n-1) DLCIs must be
configured to reach each of the other CEs. Furthermore, when a new
CE is added, n new DLCI PVCs must be provisioned; also, each existing
CE must be updated with a new DLCI to reach the new CE.
In our proposal, the provisioning of "PVCs" across the SP network is
handled by signalling protocols (LDP, RSVP-TE), reducing a large part
of the provisioning burden. Furthermore, we assume that DLCIs at the
CE edge are relatively cheap; and labels in the SP network are cheap.
This allows the SP to "over-provision" VPNs: for example, allocate 50
CEs to a VPN when only 20 are needed. With this over-provisioning,
adding a new CE to a VPN requires configuring just the new CE and its
associated PE; existing CEs and their PEs need not be re-configured.
3.3. Advantages of Layer 3 VPNs
Layer 3 VPNs ([IPVPN] in particular) offer a good solution when the
customer traffic is wholly IP, customer routing is reasonably simple,
and the customer sites connect to the SP with a variety of Layer 2
technologies.
3.3.1. Layer 2 Independence
One major restriction in a Layer 2 VPN is that the Layer 2 medium
with which the various sites of a single VPN connect to the SP must
be uniform. On the other hand, the various sites of a Layer 3 VPN
can connect to the SP with any supported media; for example, some
sites may connect with Frame Relay circuits, and others with
Ethernet.
A corollary to this is that the number of sites that can be in a
Layer 2 VPN is determined by the number of Layer 2 circuits that the
Layer 2 technology provides. For example, if the Layer 2 technology
is Frame Relay with 2-octet DLCIs, a CE can connect to at most about
a thousand other CEs in a VPN.
Kompella et al. [Page 6]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
3.3.2. SP Routing as Added Value
Another problem with Layer 2 VPNs is that the CE router in a VPN must
be able to deal with having N routing peers, where N is the number of
sites in the VPN. This can be alleviated by manipulating the
topology of the VPN. For example, a hub-and-spoke VPN architecture
means that only one CE router (the hub) needs to deal with N
neighbors. However, in a Layer 3 VPN, a CE router need only deal
with one neighbor, the PE router. Thus, the SP can offer Layer 3
VPNs as a value-added service to its customers.
Moreover, with layer 2 VPNs it is up to a customer to build and
operate the whole network. With Layer 3 VPNs, a customer is just
responsible for building and operating routing within each site,
which is likely to be much simpler than building and operating
routing for the whole VPN. That, in turn, makes Layer 3 VPNs more
suitable for customers who don't have sufficient routing expertise,
again allowing the SP to provide added value.
3.3.3. Class-of-Service
Class-of-Service issues have been addressed for Layer 3 VPNs. Since
the PE router has visibility into the network layer (IP), the PE
router can take on the tasks of CoS classification and routing.
Class-of-Service issues for Layer 2 VPNs will be addressed in a
future revision.
4. Operation of a Layer 2 VPN
The following simple example of a customer with 4 sites connected to
3 PE routers in a Service Provider network will hopefully illustrate
the various aspects of the operation of a Layer 2 VPN. For
simplicity, we assume that a full-mesh topology is desired.
In what follows, Frame Relay serves as the Layer 2 medium, and each
CE has multiple DLCIs to its PE, each to connect to another CE in the
VPN. If the Layer 2 medium were ATM, then each CE would have
multiple VPI/VCIs to connect to other CEs. For PPP and Cisco HDLC,
each CE would have multiple physical interfaces to connect to other
CEs.
Kompella et al. [Page 7]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
4.1. Network Topology
Consider a Service Provider network with edge routers PE0, PE1, and
PE2. Assume that PE0 and PE1 are IGP neighbors, and PE2 is more than
one hop away from PE0.
Suppose that a customer C has 4 sites S0, S1, S2 and S3 that C wants
to connect via the Service Provider's network using Frame Relay.
Site S0 has CE0 and CE1 both connected to PE0. Site S1 has CE2
connected to PE0. Site S2 has CE3 connected to PE1 and CE4 connected
to PE2. Site S3 has CE5 connected to PE2. (See the Figure 1 below.)
Suppose further that C wants to "over-provision" each current site,
in expectation that the number of sites will grow to at least 10 in
the near future. However, CE4 is only provisioned with 9 DLCIs.
Suppose finally that CE0 and CE2 have DLCIs 100 through 109 free; CE1
and CE3 have DLCIs 200 through 209 free; CE4 has DLCIs 107, 209, 265,
301, 414, 555, 654, 777 and 888 free; and CE5 has DLCIs 417-426.
4.2. Configuration
The following sub-sections detail the configuration that is needed to
provision the above VPN. For the purpose of exposition, we assume
that the customer will connect to the SP with Frame Relay circuits,
and that the customer's IGP of choice is OSPF.
While we focus primarily on the configuration that an SP has to do,
we touch upon the configuration requirements of CEs as well. The
main point of contact in CE-PE configuration is that both must agree
on the DLCIs that will be used on the interface connecting them.
If the PE-CE connection is Frame Relay, it is recommended to run LMI
between the PE and CE with the PE as DCE and the CE as DTE. For the
case of ATM VCs, OAM cells may be used; for PPP and Cisco HDLC,
keepalives may be used.
4.2.1. CE Configuration
Each CE that belongs to a VPN is given a "CE ID". CE IDs must be
unique in the context of a VPN. We assume that the CE ID for CE-k is
k. Each CE is also configured with a maximum number of CEs that it
can connect to; this is the CE's "range".
Each CE is configured to communicate with its corresponding PE with
the set of DLCIs given above; for example, CE0 is configured with
Kompella et al. [Page 8]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
Figure 1: Example Network Topology
S0 S3
.............. ..............
. . . .
. +-----+ . . .
. | CE0 |-----------+ . +-----+ .
. +-----+ . | . | CE5 | .
. . | . +--+--+ .
. +-----+ . | . | .
. | CE1 |-------+ | .......|......
. +-----+ . | | /
. . | | /
.............. | | /
| | SP Network /
.....|...|.............................../.....
. | | / .
. +-+---+-+ +-------+ / .
. | PE0 |-------| P |-- | .
. +-+---+-+ +-------+ \ | .
. / \ \ +---+---+ .
. | -----+ --| PE2 | .
. | | +---+---+ .
. | +---+---+ / .
. | | PE1 | / .
. | +---+---+ / .
. | \ / .
...|.............|.............../.............
| | /
| | /
| | /
S1 | | S2 /
.............. | ........|........../......
. . | . | | .
. +-----+ . | . +--+--+ +--+--+ .
. | CE2 |-----+ . | CE3 | | CE4 | .
. +-----+ . . +-----+ +-----+ .
. . . .
.............. ..........................
DLCIs 100 through 109. OSPF is configured to run over each DLCI.
Each CE also "knows" which DLCI connects it to each other CE. A
simple algorithm is to use the CE ID of the other CE as an index into
the DLCI list this CE has (with zero-based indexing, i.e., 0 is the
first index). For example, CE0 is connected to CE3 through its
Kompella et al. [Page 9]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
fourth DLCI, 103; CE4 is connected to CE2 by the third DLCI in its
list, namely 265. This is the methodology used in the examples
below; the actual methodology used to pick the DLCI to be used is a
local matter; the key factor is that CE-k may communicate with CE-m
using a different DLCI from the DLCI that CE-m uses to communicate to
CE-k, i.e., the SP network effectively acts as a giant Frame Relay
switch. This is very important, as it decouples the DLCIs used at
each CE site, making for much simpler provisioning.
4.2.2. PE Configuration
Each PE is configured with the VPNs in which it participates. Each
VPN has an VPN ID that is unique within the SP network. For each
VPN, the PE has a list of CEs that are members of that VPN. For each
CE, the PE knows the CE ID, which DLCIs to expect from the CE, and
the CE's range.
4.2.3. Adding a New Site
The first step in adding a new site to a VPN is to pick a new CE ID.
If all current members of the VPN are over-provisioned, i.e., their
range includes the new CE ID, adding the new site is a purely local
task. Otherwise, the sites whose range doesn't include the new CE ID
and wish to communicate directly with the new CE must have their
ranges increased to incorporate the new CE ID.
The next step is ensuring that the new site has the required
connectivity (see below). This may require tweaking the connectivity
mechanism; however, in several common cases, the only configuration
needed is local to the PE to which the CE is attached.
The rest of the configuration is a local matter between the new CE
and the PE to which it is attached.
It bears repeating that the key to making additions easy is over-
provisioning. However, what is being over-provisioned is the number
of DLCIs/VCIs that connect the CE to the PE. This is a local matter,
and generally is not an issue.
4.3. PE Information Exchange
When a PE is configured with all the needed information for a CE, it
first of all chooses a contiguous set of labels with n labels, where
n is the CE's range. Call the smallest label in this set the label-
base. The PE then advertises (for this CE): its Router ID, the VPN
Kompella et al. [Page 10]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
ID, the CE ID, the CE's range, and the label-base. This is the basic
Layer 2 VPN advertisement. This same advertisement is sent to all
other PEs. Note that PEs that may not be part of the VPN can receive
and keep this information, in case at some future point, a CE
connected to the PE joins the VPN.
If the PE-CE connection goes down, or the CE configuration is
removed, the above advertisement is withdrawn.
4.3.1. PE Advertisement Processing
When a PE receives a Layer 2 VPN advertisement, it checks if the VPN
ID matches any VPN that it is a member of. If not, the PE just
stores the advertisement for future use.
Otherwise, suppose the advertisement is from PE A for VPN X, CE m
with range Rm and label base Lm. For each CE that the receiving PE B
is connected to that is a member of VPN X, PE B does the following.
0) Look up the configuration information associated with the CE.
If the encapsulation type for VPN X in the advertisement does
not match the configured encapsulation type for VPN X, stop.
1) Say the configured CE ID is k, the range is Rk, and the DLCI
list is Dk[]. Also, get the label base PE B allocated for
this CE, say Lk.
2) Check if k = m. If so, issue an error: "CE ID k has been
allocated to two CEs in VPN X (check CE at PE A)". Stop.
3) Check if k >= Rm, or m >= Rk. If so, issue a warning: "Cannot
communicate with CE m (PE A) of VPN X: outside range". Stop.
4) Look in the appropriate table to see which label will get to
PE A. This is the "outer" label, Z.
5) The DLCI that CE-k will use to talk to CE-m is Dk[m]. The
"inner" label for sending packets to CE-m is (Lm + k). The
"inner" label on which to expect packets from CE-m is (Lk + m).
6) Install a "route" such that packets from CE-k with DLCI Dk[m]
will be sent with outer label Z, inner label (Lm + k). Also,
install a route such that packets received with label (Lk + m)
will be mapped to DLCI Dk[m] and be sent to CE-k.
7) Activate DLCI Dk[m] to the CE. This can be done using LMI.
If an advertisement is withdrawn, the appropriate DLCI must be de-
activated, and the corresponding routes must be removed from the
forwarding table.
Kompella et al. [Page 11]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
4.3.2. Example of PE Advertisment Processing
Consider the example network of Figure 1. Let the VPN connecting S0,
S1, S2 and S3 has a VPN id of 1. Suppose PE2 receives an
advertisement from PE0 for VPN 1, CE ID 0 with CE range R0 = 10 and
label base L0 = 1000. Since PE2 is connected to CE4 which is also in
VPN 1, PE2 does the following:
0) Look up the configuration information associated with CE4.
The advertised encapsulation type matches the configured
encapsulation type (both are Frame Relay), so proceed.
1) CE4's range R4 is 9, its DLCI list D4[] is [ 107, 209, 265,
301, 414, 555, 654, 777, 888], and its label base L4 is 4000.
2) CE0 and CE4 have ids 0 and 4 respectively, so step 2 of 4.3.1
is skipped.
3) Since CE4's id is less than R0, and CE0's id is less than R4,
step 3 of 4.3.1 is skipped.
4) Look in the appropriate table on PE2 to see which label will
get to PE0. Let the label be 10001.
5) The DLCI that CE4 will use to talk to CE0 is D4[0], i.e., 107.
The inner label for sending packets to CE0 is (L0 + 4), i.e
1004. The inner label on which to expect packets from CE0 is
(L4 + 0), i.e., 4000.
6) Install a "route" such that packets from CE4 with DLCI 107
will be sent with outer label 10001, inner label 1004. Also,
install a route such that packets received with label 4000 will
be mapped to DLCI 107 and be sent to CE4.
7) Activate DLCI 107 to CE4.
Since CE5 is also attached to PE2, PE2 needs to do processing similar
to the above for CE5.
Similarly, when PE0 receives an advertisment from PE2 for VPN1, CE4,
with range R4 = 9, and label base L4 = 4000. PE0 processes the
advertisment for CE0 (and CE1, which is also in VPN 1).
0) Look up the configuration information associated with CE0.
The advertised encapsulation type matches the configured
encapsulation type (both are Frame Relay), so proceed.
1) CE0's range, R0, is 9, its DLCI list D0[] is [100 - 109],
and its label base L0 is 1000.
2) CE0 and CE4 have ids 0 and 4 respectively, so step 2 of 4.3.1
is skipped.
3) Since CE4's id is less than R0, and CE0's id is less than R4,
step 3 of 4.3.1 is skipped.
4) Let the outer label to reach PE2 be 9999.
5) The DLCI which CE0 will use to talk to CE4 is D0[4], i.e., 104.
The inner label for sending packets to CE4 is (L4 + 0), i.e.
Kompella et al. [Page 12]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
4000. The inner label on which to expect packets from CE4 is
(L0 + 4), i.e., 1004.
6) Install a "route" such that packets from CE0 with DLCI 104
will be sent with outer label 9999, inner label 4000. Also,
install a route that packets received with label 1004 will be
mapped to DLCI 104 and be sent to CE0.
7) Activate DLCI 104 to CE0.
Note that the inner label of 4000, computed by PE0, for sending
packets from CE0 to CE4 is the same as what PE2 computed as the
incoming label for receiving packets originated at CE0 and destined
to CE4. Similarly, the inner label of 1004, computed by PE0, for
receiving packets from CE4 to CE0 is same as what PE2 computed as the
outgoing label for sending packets originated at CE4 and destined to
CE0.
4.3.3. Generalizing the VPN Topology
In the above, we assumed for simplicity that the VPN was a full mesh.
To allow for more general VPN topologies when using LDP for
signalling, we introduce the notion of node colors, and the "spoke"
attribute; together, these constitute a node's "connectivity". A
node (CE) in a VPN can be colored with one or more colors.
Furthermore, a node may be a hub or a spoke. Two nodes are connected
iff they share a color in common, and they are not both spokes.
To incorporate connectivity into the processing of advertisements,
add step 3' to the above:
3') If CE k and CE m are not connected, stop.
This notion of connectivity does not allow arbitrary topologies to be
built; however, it is a compromise of generality and efficiency.
A more general mechanism based on BGP extended communities can also
be used; naturally, this mechanism can only be used when signalling
VPNs with BGP. See below for details.
Kompella et al. [Page 13]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
5. Packet Transport
When a packet arrives at a PE from a CE in a Layer 2 VPN, the layer 2
address of the packet identifies to which other CE the packet is
destined. The procedure outlined above installs a route that maps
the layer 2 address to an outer label (which identifies the PE to
which the destination CE is attached) and an inner label (which
identifies the destination CE). If the destination PE is one hop
away from the source PE, and Penultimate Hop Popping (PHP) is used,
there is no outer label. If the destination PE is the same as the
source PE, no labels are needed.
The packet may then be modified (depending on the layer 2
encapsulation) and then sent to the destination PE with the
appropriate number of labels.
If the destination PE is the same as the source, the packet "arrives"
with no labels. Otherwise, the packet arrives with one label (if PHP
is used) or two labels, in which case the outer label is discarded;
the remaining (inner) label is used to determine which CE is the
destination CE. The packet is restored to a fully-formed layer 2
packet, and then sent to the CE.
The MTU on the Layer 2 access links MUST be chosen such that the size
of the L2 frames plus the L2VPN header does not exceed the MTU of the
MPLS network. Layer 2 frames that exceed the MPLS MTU after
encapsulation MUST be dropped.
5.1. Layer 2 Frame Format
For each VPN encapsulation type (see section 5.1.3), we describe
below the format of the frame as it is transported in the MPLS LSP.
Figure 2: Format of a Layer 2 Packet Carried in MPLS
+---------------------------------------------------+
| MPLS | Outer | Inner | Sequence | Modified Layer |
| Encap | Label | Label | Number | 2 Frame |
+---------------------------------------------------+
The "Outer Label" is used to transport the packet to the PE that is
attached to the destination CE.
The "Inner Label" is used by the destination PE to distinguish which
CE to send the packet to, and what layer 2 address to use (if
Kompella et al. [Page 14]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
applicable). The Inner Label may also carry "non-address"
information in its experimental bits. The label itself is 20 bits;
the stack bit is as defined in [ENCAP]. The experimental bits are
named N (Notification), C (Control) and L (Loss) as in the following
figure. Note that the inner label is only used for forwarding from
the destination PE to the destination CE, not within the MPLS
network.
Figure 3: Format of the Inner Label
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Label (20 bits) |N C L|S| (Unused) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The "Sequence Number" is an optional two octet unsigned number that
wraps back to zero that is used to ensure in-sequence delivery of L2
frames. The sequence number field is only included if its use is
indicated via VPN signalling. A Layer 2 'connection' between two
specific CEs is characterized within the MPLS network by the PEs to
which the CEs are attached and a specific Inner Label in each
direction. For each such Layer 2 connection, the sequence number
field is set to zero for the first packet transmitted and incremented
by 1 for each subsequent packet sent on the same Layer 2 connection.
When an out-of-sequence packet arrives at the receiver, it MAY be
buffered for future delivery or discarded.
The modification to the Layer 2 frame depends on the Layer 2 type.
The following sections describe the modification for each protocol
type, and other per-protocol information.
5.2. Frame Relay
A Frame Relay frame has the following format:
For transport over an MPLS LSP, the octets are removed. The
rest of the frame is transported as is.
At the destination PE, a new DLCI is added, and the fully-formed
Frame Relay frame sent to the CE.
A DLCI contains "non-address" bits, namely, Forward and Backward
Explicit Congestion Notification (FECN and BECN), the
Kompella et al. [Page 15]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
Command/Response (C/R) bit and the Discard Eligible (DE) bit. The
ingress LSR MAY set the experimental bits as follows: copy BECN to
the N bit; copy the C/R bit to the C bit; and copy DE to the L bit.
Otherwise, the ingress SHOULD set the experimental bits to 0. The
egress LSR MAY in turn copy the N bit to BECN of the outgoing DLCI,
the C bit to C/R and the L bit to DE.
Note that this is orthogonal to preferential treatment of the layer 2
frame in the MPLS network. If there are two LSPs (L-LSPs) to the
destination PE, one for normal traffic and another for out-of-spec
traffic, the ingress LSR MAY choose which LSP to use (i.e., which
outer label) based on the DE bit. If there is one LSP (E-LSP), but
an experimental bit is used to denote out-of-spec traffic, the
ingress LSR MAY set this experimental bit based on the DE bit.
5.3. ATM AAL/5
For ATM AAL/5 VPNs, the AAL/5 PDU is transported without indication
of the VPI/VCI. At the receiving PE, the AAL/5 PDU is fragmented, a
cell header with the correct VPI/VCI added to each cell, and the
cells sent to the CE.
If any of the cells that constitute the AAL/5 PDU have the CLP bit
set, the ingress LSR MAY set the L bit. If the L bit is set in the
inner label at the destination PE, this PE MAY set the CLP bit in
each cell when fragmenting the AAL/5 PDU.
Again, the ingress PE may give preferential treatment to the ATM PDU
based on whether any cell had the CLP bit set or all cells had their
CLP bits clear.
5.4. ATM Cells
For ATM Cell VPNs, ATM cells (including the 5 octet header) are
transported. At the receiving PE, the cells are sent to the CE.
The experimental bits of the inner label SHOULD be set to zero at the
ingress and ignored by the destination PE.
5.5. PPP, Cisco HDLC, Ethernet
For PPP, Cisco HDLC and unswitched Ethernet VLANs VPNs, the Layer 2
frame is transported whole, without any modification. The Layer 2
frame does not include HLDC flags or Ethernet preamble, nor CRCs; we
assume that bit/byte stuffing has been undone. At the receiving PE,
Kompella et al. [Page 16]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
the frame is sent to the CE.
The experimental bits of the inner label SHOULD be set to zero at the
ingress and ignored by the destination PE.
6. Signalling MPLS-Based Layer 2 VPNs
There are two alternative means of signalling the MPLS-based Layer 2
VPNs described in this document: using LDP ([LDP]) or using BGP
version 4 ([BGP]).
In LDP, VPN CE information and its associated label base are carried
in a Label Mapping message, distributed in the downstream unsolicited
mode described in [LDP]. A new FEC element is defined below to carry
all the information corresponding to a VPN CE, except from the label
base. The label base is carried in the Label TLV following the FEC
TLV. If a FEC element in a FEC TLV encodes Layer 2 VPN information,
it MUST be the only FEC element in the FEC TLV.
The Layer 2 VPN FEC element is depicted in Figure 4 below.
Figure 4: L2 VPN FEC Element
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Encaps. Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Control Flags | Reserved (Must Be Zero) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VPN ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CE ID | CE Range |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CE Connectivity |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sub-TLVs |
. ... .
. ... .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
In BGP, the Multiprotocol Extensions [BGP-MP] are used to carry
L2-VPN signalling information. [BGP-MP] defines the format of two
BGP attributes (MP_REACH_NLRI and MP_UNREACH_NLRI) that can be used
to announce and withdraw the announcement of reachability
Kompella et al. [Page 17]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
information. We introduce a new address family identifier (AFI) for
L2-VPN [to be assigned by IANA], a new subsequent address family
identifier (SAFI) [to be assigned by IANA], and also a new NLRI
format for carrying the individual L2-VPN CE information. This NLRI
will be carried in the above-mentioned BGP attributes. This NLRI
MUST be accompanied by one or more extended communities. The
extended community type is "Layer 2 VPN" (to be assigned by IANA);
and the format is :, where is 4 octets
in length, and is two octets. All extended
communities accompanying one or more Layer 2 VPN NLRIs MUST have the
same .
PEs receiving VPN information may filter advertisements based on the
extended communities, thus controlling CE-to-CE connectivity.
The format of the Layer 2 VPN NLRI is as shown in Figure 5 below.
Figure 5: BGP NLRI for L2 VPN Information
+------------------------------------+
| Length (2 octets) |
+------------------------------------+
| Encaps Type (1 octet) |
+------------------------------------+
| Control Flags (1 octet) |
+------------------------------------+
| Label base (3 octets) |
+------------------------------------+
| Reserved (Must Be Zero) (1 octet) |
+------------------------------------+
| CE ID (2 octets) |
+------------------------------------+
| CE Range (2 octets) |
+------------------------------------+
| Variable TLVs (0 to n octets) |
| ... |
+------------------------------------+
6.1. Signalled Information
Kompella et al. [Page 18]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
6.1.1. Type (LDP only)
The Type is L2-VPN (to be decided by IETF Consensus Action).
6.1.2. Length
In LDP, the Length is the entire length of the L2 VPN FEC element,
including the fixed header and all the sub-TLVs.
In BGP, the Length field indicates the length in octets of the L2-VPN
address prefix.
6.1.3. Encapsulation Type
Identifies the layer 2 encapsulation, e.g., ATM, Frame Relay etc.
The following encapsulation types are defined:
Value Encapsulation
0 Reserved
1 ATM PDUs (AAL/5)
2 ATM Cells
3 Frame Relay
4 PPP
5 Cisco-HDLC
6 Ethernet VLAN (unswitched)
7 MPLS
6.1.4. Control Flags
This is a bit vector, defined as in the following Figure.
Figure 6: Control Flags Bit Vector
0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
| Reserved |S|
+-+-+-+-+-+-+-+-+
The following bit is defined; the rest MUST be set to zero.
Name Bit Meaning
S 0 Sequenced delivery of frames is required
Kompella et al. [Page 19]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
6.1.5. Label base (BGP only)
The label-base which is to be used for determining the inner label
for forwarding packets to the CE identified by CE ID. (Note: LDP
carries the label-base in the Label TLV following the FEC TLV.)
6.1.6. VPN ID (LDP only)
A 32 bit number which uniquely identifies a VPN in a provider's
domain.
6.1.7. CE ID
A 16 bit number which uniquely identifies a CE in a VPN.
6.1.8. CE Range
A 16 bit number which describes the range of CE IDs to which the
advertised CE is willing to connect. In particular, a PE receiving
an L2 VPN TLV MUST NOT use a label greater than or equal to
+
when sending traffic for this VPN to the advertising PE.
6.1.9. CE Connectivity (LDP only)
A 32-bit number encoding connectivity. If the leftmost bit is 1, the
CE is a spoke. The remaining 31 bits encode the CE colors (bit i = 1
means the CE has color i).
6.1.10. Sub-TLVs
New sub-TLVs can be introduced as needed.
In LDP, the TLV encoding mechanism described in [LDP] must be used.
In BGP, TLVs (type takes 1 octet) can be added to extend the
information carried in the L2 VPN address prefix.
A TLV (type = 1) will be used for carrying VLAN IDs if the
encapsulation is VLAN.
Kompella et al. [Page 20]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
6.2. BGP L2 VPN capability
The BGP Multiprotocol capability extension [BGP-CAP] is used to
indicate that the BGP speaker wants to negotiate L2 VPN capability
with its peers. The capability code is 1, the capability length is
4, and the AFI and SAFI values will be set to the L2 VPN AFI and L2
VPN SAFI (discussed in section 5) respectively.
6.3. Advantages of Using BGP
PE routers in an SP network typically run BGP v4. This means that
SPs are familiar with using BGP, and have already configured BGP on
their PEs, so configuring and using BGP to signal Layer 2 VPNs is not
much of an additional burden to the SP operators. This is especially
the case when the protocol of choice for signalling MPLS LSPs across
the SP network is RSVP (perhaps for its Traffic Engineering
properties); in this case, the SP may find using LDP to signal Layer
2 VPN information undesirable.
Another advantage of using BGP is that with BPG it is easier to build
inter-provider VPNs. Mechanisms for this will be described in a
future version.
7. Acknowledgments
The authors would like to thank Dennis Ferguson, Der-Hwa Gan, Dave
Katz, Nischal Sheth, John Stewart, and Paul Traina for the
enlightening discussions that helped shape the ideas presented here,
and Ross Callon for his valuable comments.
The idea of using extended communities for more general connectivity
of a Layer 2 VPN was a contribution by Yakov Rekhter, who also gave
many useful comments on the text; many thanks to him.
Kompella et al. [Page 21]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
8. Security Considerations
The security aspects of this solution will be discussed at a later
time.
9. IANA Considerations
(To be filled in in a later revision.)
10. References
[BGP] Rekhter, Y., and Li, T., "A Border Gateway Protocol 4 (BGP-4)",
RFC 1771, March 1995.
[BGP-CAP] Chandra, R., and Scudder, J., "Capabilities Advertisement
with BGP-4", RFC 2842, May 2000.
[BGP-MP] Bates, T., Rekhter, Y., Chandra, R., and Katz, D.,
"Multiprotocol Extensions for BGP-4", RFC 2858, June 2000
[BGP-ORF] Chen, E., and Rekhter, Y., "Cooperative Route Filtering
Capability for BGP-4", March 2000 (work in progress).
[BGP-RFSH] Chen, E., "Route Refresh Capability for BGP-4", draft-
ietf-idr-bgp-route-refresh-01.txt, March 2000, (work in progress).
[ENCAP] Rosen, E., Rekhter, Y., Tappan, D., Fedorkow, G., Farinacci,
D., Li, T., and Conta, A., "MPLS Label Stack Encoding", draft-ietf-
mpls-label-encaps-08.txt (work in progress)
[IPVPN] Rosen, E., and Rekhter, Y., "BGP/MPLS VPNs", RFC 2547, March
1999.
[LDP] Andersson, L., Doolan, P., Feldman, N., Fredette, A., and
Thomas, B., "LDP Specification", draft-ietf-mpls-ldp-11.txt, August
2000 (work in progress).
[MPLS] Callon, R., Doolan, P., Feldman, N., Fredette, A., Swallow,
G., and Viswanathan, A., "A Framework for Multiprotocol Label
Switching", draft-ietf-mpls-framework-05.txt, September 1999 (work in
progress).
[VPN] Kosiur, Dave, "Building and Managing Virtual Private Networks",
Wiley Computer Publishing, 1998.
Kompella et al. [Page 22]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
11. Intellectual Property Considerations
Juniper Networks may seek patent or other intellectual property
protection for some of all of the technologies disclosed in this
document. If any standards arising from this document are or become
protected by one or more patents assigned to Juniper Networks,
Juniper intends to disclose those patents and license them on
reasonable and non-discriminatory terms.
CoSine Communications may seek patent or other intellectual property
protection for some of all of the technologies disclosed in this
document. If any standards arising from this document are or become
protected by one or more patents assigned to CoSine Communications,
CoSine intends to disclose those patents and license them on
reasonable and non-discriminatory terms.
12. Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Kompella et al. [Page 23]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
13. Author Information
Kireeti Kompella
Juniper Networks
1194 N. Mathilda Ave
Sunnyvale, CA 94089
kireeti@juniper.net
Manoj Leelanivas
Juniper Networks
1194 N. Mathilda Ave
Sunnyvale, CA 94089
manoj@juniper.net
Quaizar Vohra
Juniper Networks
1194 N. Mathilda Ave
Sunnyvale, CA 94089
qv@juniper.net
Javier Achirica
Telefonica Data
javier.achirica@telefonica-data.com
Ronald P. Bonica
WorldCom
22001 Loudoun County Pkwy
Ashburn, Virginia, 20147
rbonica@mci.net
Chris Liljenstolpe
Cable & Wireless
11700 Plaza America Drive
Reston, VA 20190
chris@cw.net
Eduard Metz
KPN Royal Dutch Telecom
St. Paulusstraat 4
2264 XZ Leidschendam
The Netherlands
e.t.metz@kpn.com
Chandramouli Sargor
CoSine Communications
1200 Bridge Parkway
Redwood City, CA 94065
Kompella et al. [Page 24]
Internet Draft draft-kompella-mpls-l2vpn-01.txt November 2000
csargor@cosinecom.com
Vijay Srinivasan
CoSine Communications
1200 Bridge Parkway
Redwood City, CA 94065
vijay@cosinecom.com
Kompella et al. [Page 25]