U.S. patent application number 14/672058 was filed with the patent office on 2015-03-27 and published on 2015-07-23 as publication number 20150207724 for dynamic control channel establishment for software-defined networks having centralized control. The applicant listed for this patent is Juniper Networks, Inc. Invention is credited to Jayabharat Boddu, Abhijit K. Choudhury, James M. Murphy, and Pradeep Sindhu.
Application Number: 14/672058
Publication Number: 20150207724
Family ID: 53545802
Published: 2015-07-23

United States Patent Application 20150207724
Kind Code: A1
Choudhury; Abhijit K.; et al.
July 23, 2015

DYNAMIC CONTROL CHANNEL ESTABLISHMENT FOR SOFTWARE-DEFINED NETWORKS HAVING CENTRALIZED CONTROL
Abstract
Dynamic control channel establishment for an access network is
described in which a centralized controller provides seamless
end-to-end service from a core-facing edge of a network to access
nodes. For example, a method includes receiving, by the centralized
controller, a discover message originating from a network node,
which includes an intermediate node list that specifies a plurality
of network nodes the discover message traversed from the network
node to an edge node, determining, based on the plurality of nodes
specified by the discover message, a path from the edge node to the
network node, allocating each of a plurality of Multi-protocol
Label Switching (MPLS) labels to a respective outgoing interface of
each of the plurality of network nodes, and outputting one or more
control messages for configuring the network node, wherein the
control messages are encapsulated within a label stack comprising
the allocated plurality of labels.
Inventors: Choudhury; Abhijit K.; (Cupertino, CA); Murphy; James M.; (Alameda, CA); Sindhu; Pradeep; (Los Altos Hills, CA); Boddu; Jayabharat; (Los Altos, CA)

Applicant:
Name: Juniper Networks, Inc.
City: Sunnyvale
State: CA
Country: US
Family ID: 53545802
Appl. No.: 14/672058
Filed: March 27, 2015
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
14231350 (parent of 14672058) | Mar 31, 2014 |
13842453 (parent of 14231350) | Mar 15, 2013 | 8693374
61738955 (provisional) | Dec 18, 2012 |
Current U.S. Class: 370/255
Current CPC Class: H04L 45/122 (20130101); H04L 45/42 (20130101); H04L 45/26 (20130101); H04L 45/507 (20130101); H04L 45/021 (20130101); H04L 45/02 (20130101); H04L 47/12 (20130101); H04L 41/12 (20130101); H04L 45/50 (20130101); H04L 45/026 (20130101); H04L 45/68 (20130101); H04L 41/0654 (20130101)
International Class: H04L 12/751 (20060101); H04L 12/721 (20060101); H04L 12/755 (20060101); H04L 12/733 (20060101); H04L 12/723 (20060101)
Claims
1. A method comprising: sending, by a network node, a plurality of
hello messages to neighboring network nodes within a network,
wherein each of the plurality of hello messages is sent on a
different respective network link coupled to the network node and
includes an indicator specifying a respective distance as a number
of network hops from the network node to a centralized controller
that manages the network; receiving, by the network node, a
plurality of hello reply messages from respective neighboring
network nodes within the network in response to the plurality of
hello messages, wherein each of the plurality of hello reply
messages is received on a different respective network link coupled
to the network node and includes a respective indicator specifying
a respective distance as a number of network hops from the
respective neighboring network node sending the hello reply
messages to a centralized controller that manages the network;
determining, by the network node and based at least in part on the
respective distance specified by one or more of the plurality of
hello reply messages received from the neighboring network nodes,
an active one of the network links coupled to the network node to
one of the neighboring network nodes having a shortest distance to
the centralized controller; forwarding, by the network node, a
discover message on the active link to the neighboring network node
having the shortest distance to the centralized controller, wherein
the discover message includes a neighbor node list specifying a set
of neighboring network nodes from which hello reply packets were
received and an intermediate node list that will specify a set of
network nodes the discover message will traverse; and after
receiving a discover reply message sent by the centralized
controller in response to the centralized controller receiving the
discover message, sending, by the network node, a control message
to the centralized controller encapsulated with a Multi-protocol
Label Switching (MPLS) label that indicates the control message is
to be automatically forwarded by a receiving one of the network
nodes along a shortest path toward the centralized controller.
2. The method of claim 1, further comprising: receiving, by the
network node, a second discover message from a neighboring one of
the network nodes; checking, by the network node, whether an
intermediate node list specified by the second discover message
includes the network node; in response to determining that the
intermediate node list includes the network node, discarding the
second discover message; and in response to determining that the
intermediate node list does not include the network node: updating,
by the network node, the intermediate node list to include the
network node and an ingress port and egress port of the network
node; and forwarding, by the network node, an updated second
discover message on the active link to the neighboring network node
having the shortest distance to the centralized controller.
3. The method of claim 2, wherein the intermediate node list
identifies a Media Access Control (MAC) address of each of the
plurality of network nodes, and corresponding ingress and egress
port pairs through which the discover message traversed from the
network node to an edge node, inclusive of the network node and the
edge node.
4. The method of claim 1, further comprising: by the network node,
periodically broadcasting a hello message as a link-local broadcast
message on all ports of the network node; by the network node,
receiving respective hello reply messages from the neighboring
network nodes in response to the broadcast hello messages; in
response to receiving the respective hello reply messages: setting
as active a link on which the respective hello reply messages are
received from the neighboring network nodes; and updating a table
of shortest distances to the centralized controller based on at
least one member selected from a group consisting of (1) the number
of network hops specified by the hello messages and (2) the number
of network hops specified by the respective hello reply messages;
and by the network node, adding the neighboring network nodes from
which hello reply messages are received to the neighbor node
list.
5. The method of claim 1, wherein the discover message comprises a
first discover message that specifies a first generation number,
the method further comprising: sending, by the network node, a
keepalive message to the centralized controller; and in response to
receiving no keepalive reply message from the centralized
controller within a time period after sending the keepalive
message, forwarding a second discover message on the active link,
wherein the second discover message specifies a second generation
number having a value greater than a value of the first generation
number.
6. The method of claim 1, further comprising: receiving, by the
network node, a control message sent by the centralized controller
and destined for a neighboring network node, wherein the control
message is encapsulated within a label stack comprising a plurality
of MPLS labels allocated by the centralized controller; by the
network node, removing an outer label of the label stack; and
forwarding the control message with a modified label stack to a
next hop selected based on the outer label.
7. The method of claim 1, wherein sending the control message to
the centralized controller comprises sending an endpoint indication
message that indicates an endpoint status change, wherein the
endpoint indication message specifies a type of an endpoint, an
address of the endpoint, and a status of the endpoint indicating
whether the endpoint is up or down.
8. The method of claim 7, wherein sending the control message to
the centralized controller comprises sending a direct switch
response message to acknowledge receipt of a direct switch request
message from the centralized controller for mapping traffic from
the endpoint to a pseudo wire.
9. The method of claim 1, wherein sending the control message to
the centralized controller comprises sending a pseudo wire response
message to acknowledge receipt of a pseudo wire request message
from the centralized controller for creating a pseudo wire on the
network node.
10. A network node comprising: one or more processors; one or more
physical interfaces configured to send a plurality of hello
messages to neighboring network nodes within a network, wherein
each of the plurality of hello messages is sent on a different
respective network link coupled to the network node and includes an
indicator specifying a respective distance as a number of network
hops from the network node to a centralized controller that manages
the network, wherein the one or more physical interfaces receive a
plurality of hello reply messages from respective neighboring
network nodes within the network in response to the plurality of
hello messages, wherein each of the plurality of hello reply
messages is received on a different respective network link coupled
to the network node and includes a respective indicator specifying
a respective distance as a number of network hops from the
respective neighboring network node sending the hello reply
messages to a centralized controller that manages the network; and
a protocol module executing on the one or more processors, wherein
the protocol module is configured to determine, based at least in
part on the respective distance specified by one or more of the
plurality of hello reply messages received from the neighboring
network nodes, an active one of the network links coupled to the
network node to one of the neighboring network nodes having a
shortest distance to the centralized controller, wherein the
protocol module is configured to forward a discover message on the
active link to the neighboring network node having the shortest
distance to the centralized controller, wherein the discover
message includes a neighbor node list specifying a set of
neighboring network nodes from which hello reply packets were
received and an intermediate node list that will specify a set of
network nodes the discover message will traverse; and wherein the
protocol module is configured to, after receiving a discover reply
message sent by the centralized controller in response to the
centralized controller receiving the discover message, send a
control message to the centralized controller encapsulated with an
MPLS label that indicates the control message is to be
automatically forwarded by a receiving one of the network nodes
along a shortest path toward the centralized controller.
11. The network node of claim 10, wherein the one or more physical
interfaces are configured to receive a second discover message from
a neighboring one of the network nodes, wherein the protocol module
is configured to check whether an intermediate node list specified
by the second discover message includes the network node; wherein
the protocol module is configured to, in response to determining
that the intermediate node list includes the network node, discard
the second discover message; and wherein the protocol module is
configured to, in response to determining that the intermediate
node list does not include the network node: update the
intermediate node list to include the network node and an ingress
port and egress port of the network node; and forward an updated
second discover message on the active link to the neighboring
network node having the shortest distance to the centralized
controller.
12. The network node of claim 10, wherein the discover message
comprises a first discover message that specifies a first
generation number, wherein the protocol module is configured to
send a keepalive message to the centralized controller, and wherein
the protocol module is configured to, in response to receiving no
keepalive reply message from the centralized controller within a
time period after sending the keepalive message, forward a
second discover message on the active link, wherein the second
discover message specifies a second generation number having a
value greater than a value of the first generation number.
13. The network node of claim 10, further comprising a forwarding
plane, wherein the one or more physical interfaces are configured
to receive a control message sent by the centralized controller and
destined for a neighboring network node, wherein the control
message is encapsulated within a label stack comprising a plurality
of MPLS labels allocated by the centralized controller; wherein the
forwarding plane removes an outer label of the label stack, and
forwards the control message with a modified label stack to a next
hop selected based on the outer label.
14. The network node of claim 10, wherein the control message
comprises an endpoint indication message that indicates an endpoint
status change, wherein the endpoint indication message specifies a
type of an endpoint, an address of the endpoint, and a status of
the endpoint indicating whether the endpoint is up or down.
15. The network node of claim 14, wherein the control message
comprises a direct switch response message to acknowledge receipt
of a direct switch request message from the centralized controller
for mapping traffic from the endpoint to a pseudo wire.
16. The network node of claim 10, wherein the control message
comprises a pseudo wire response message to acknowledge receipt of
a pseudo wire request message from the centralized controller for
creating a pseudo wire on the network node.
17. A method comprising: receiving, by a centralized controller, a
discover message originating from a network node, wherein the
discover message includes an intermediate node list that specifies
a plurality of network nodes the discover message traversed from
the network node to an edge node; determining, by the centralized
controller and based on the plurality of nodes specified by the
discover message, a path from the edge node to the network node;
allocating, by the centralized controller, each of a plurality of
Multi-protocol Label Switching (MPLS) labels to a respective
outgoing interface of each of the plurality of network nodes; and
outputting, by the centralized controller, one or more control
messages for configuring the network node, wherein the control
messages are encapsulated within a label stack comprising the
allocated plurality of labels.
18. The method of claim 17, wherein the intermediate node list
identifies a Media Access Control (MAC) address of each of the
plurality of network nodes, and corresponding ingress and egress
port pairs through which the discover message traversed from the
network node to an edge node, inclusive of the network node and the
edge node.
19. The method of claim 18, wherein determining the path from the
edge node to the network node comprises reversing an order of nodes
and respective ingress and egress port pairs as set forth in the
intermediate node list of the discover message.
20. The method of claim 17, wherein receiving the discover message
comprises receiving the discover message via a User Datagram Protocol (UDP) connection from the edge node, and wherein
outputting comprises outputting the control message via the UDP
connection to the edge node.
21. The method of claim 17, wherein the discover message further
specifies a neighbor node list learned by the network node, the
method further comprising: by the centralized controller, updating
stored network topology information based on the neighbor node
list.
22. The method of claim 21, wherein the discover message specifies
a generation number, the method further comprising: comparing, by
the centralized controller, the generation number specified by the
discover message to a current generation number received from the
network node, wherein updating the stored network topology
information comprises updating the stored network topology
information if the generation number specified by the discover
message is greater than or equal to the current generation number;
and in response to determining that the generation number specified
by the discover message is less than the current generation number,
discarding the discover message.
23. A centralized controller comprising: one or more physical
interfaces configured to receive a discover message originating
from a network node, wherein the discover message includes an
intermediate node list that specifies a plurality of network nodes
the discover message traversed from the network node to an edge
node; a path computation module configured to determine, based on
the plurality of nodes specified by the discover message, a path
from the edge node to the network node; and a path provisioning
module configured to allocate each of a plurality of Multi-protocol
Label Switching (MPLS) labels to a respective outgoing interface of
each of the plurality of network nodes, wherein the one or more
physical interfaces are configured to output one or more control
messages for configuring the network node, wherein the control
messages are encapsulated within a label stack comprising the
allocated plurality of labels, and wherein the one or more physical
interfaces are configured to receive one or more control messages
from the network node.
24. The centralized controller of claim 23, wherein the
intermediate node list identifies a Media Access Control (MAC)
address of each of the plurality of network nodes, and
corresponding ingress and egress port pairs through which the
discover message traversed from the network node to an edge node,
inclusive of the network node and the edge node.
25. The centralized controller of claim 24, wherein the path
computation module is configured to reverse an order of nodes and
respective ingress and egress port pairs as set forth in the
intermediate node list of the discover message.
26. The centralized controller of claim 23, wherein the one or more
physical interfaces are configured to receive the discover message
via a User Datagram Protocol (UDP) connection from the edge
node, and output the control message via the UDP connection to the
edge node.
27. The centralized controller of claim 23, wherein the discover
message further specifies a neighbor node list learned by the
network node, the centralized controller further comprising: a
topology module configured to update stored network topology
information based on the neighbor node list.
28. The centralized controller of claim 27, wherein the discover
message specifies a generation number, wherein the topology module
is configured to compare the generation number specified by the
discover message to a current generation number received from the
network node, and update stored network topology information if the
generation number specified by the discover message is greater than
or equal to the current generation number; and wherein the topology
module is configured to, in response to determining that the
generation number specified by the discover message is less than
the current generation number, discard the discover message.
Description
[0001] This application is a continuation-in-part of U.S.
application Ser. No. 14/231,350, filed Mar. 31, 2014, which is a
continuation of U.S. application Ser. No. 13/842,453, filed Mar.
15, 2013, now U.S. Pat. No. 8,693,374, which claims the benefit of
U.S. Provisional Application No. 61/738,955, filed Dec. 18, 2012,
the entire contents of each of which being incorporated herein by
reference.
TECHNICAL FIELD
[0002] The disclosure relates to packet-based computer
networks.
BACKGROUND
[0003] A wide variety of computing devices connect to service
provider networks to access resources and services provided by
packet-based data networks, such as the Internet, enterprise
intranets, content providers and virtual private networks (VPNs).
For example, many fixed computing devices utilize fixed
communication links, such as optical, digital subscriber line, or
cable-based connections, of service provider networks to access the
packet-based services. In addition, a vast number of mobile
computing devices, such as cellular or mobile smart phones and
feature phones, tablet computers, and laptop computers, utilize
mobile connections, such as cellular radio access networks of the
service provider networks, to access the packet-based services.
[0004] Each service provider network typically provides an extensive access network infrastructure through which subscribers reach the offered packet-based data services. The access network typically
includes a vast collection of access nodes, aggregation nodes and
high-speed edge routers interconnected by communication links.
These access devices typically execute various protocols and
exchange signaling messages to anchor and manage subscriber
sessions and communication flows associated with the subscribers.
For example, the access devices typically provide complex and
varied mechanisms for authenticating subscribers, identifying
subscriber traffic, applying subscriber policies to manage
subscriber traffic on a per-subscriber basis, applying various
services to the traffic and generally forwarding the traffic within
the service provider network.
[0005] As such, access networks represent a fundamental challenge
for service providers and often require the service providers to
make difficult tradeoffs over a wide range of user densities. For
example, in some environments, user densities may exceed several
hundred thousand users per square kilometer. In other environments,
user densities may be as sparse as one or two users per square
kilometer. Due to this diversity of requirements, access networks
typically make use of a host of heterogeneous communication
equipment and technologies.
SUMMARY
[0006] In general, techniques are described for dynamic control
channel establishment for an access/aggregation network in which a
centralized controller provides seamless end-to-end service from a
core-facing edge of a service provider network through aggregation
and access infrastructure out to access nodes located proximate to
endpoints such as subscriber devices. The controller operates to
provide a central configuration point for configuring access nodes
and aggregation nodes of an access/aggregation network of the
service provider to provide transport services to transport traffic
between access nodes and edge routers on opposite borders of the
aggregation network. A control channel between an access node and
the controller is dynamically established in accordance with the
techniques of this disclosure, and then the access node and the
controller can exchange various control messages using the
dynamically established control channel for subscriber management
and network service integration.
[0007] The architectures described herein provide centralized
control over the various network nodes within the network (e.g.,
access nodes and aggregation nodes), and support a separation of
control plane and data plane, with the network nodes supporting
full data-plane functionality but only a limited control plane with
no persistent configuration. The more complex control functions are
centralized at one or more controllers, which in turn configure the
limited control planes of the network nodes. This enables lower
total cost of ownership as the network nodes themselves can be
simpler and less expensive, while the controller allows a single
touch-point in the network that allows better control and
management.
[0008] In one example aspect, a method includes sending, by a
network node, a plurality of hello messages to neighboring network
nodes within a network, wherein each of the plurality of hello
messages is sent on a different respective network link coupled to
the network node and includes an indicator specifying a respective
distance as a number of network hops from the network node to a
centralized controller that manages the network, receiving, by the
network node, a plurality of hello reply messages from respective
neighboring network nodes within the network in response to the
plurality of hello messages, wherein each of the plurality of hello
reply messages is received on a different respective network link
coupled to the network node and includes a respective indicator
specifying a respective distance as a number of network hops from
the respective neighboring network node sending the hello reply
messages to a centralized controller that manages the network, and
determining, by the network node and based at least in part on the
respective distance specified by one or more of the plurality of
hello reply messages received from the neighboring network nodes,
an active one of the network links coupled to the network node to
one of the neighboring network nodes having a shortest distance to
the centralized controller. The method also includes forwarding, by
the network node, a discover message on the active link to the
neighboring network node having the shortest distance to the
centralized controller, wherein the discover message includes a
neighbor node list specifying a set of neighboring network nodes
from which hello reply packets were received and an intermediate
node list that will specify a set of network nodes the discover
message will traverse; and after receiving a discover reply message
sent by the centralized controller in response to the centralized
controller receiving the discover message, sending, by the network
node, a control message to the centralized controller encapsulated
with a Multi-protocol Label Switching (MPLS) label that indicates
the control message is to be automatically forwarded by a receiving
one of the network nodes along a shortest path toward the
centralized controller.
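As a concrete illustration of this first aspect, the following Python sketch shows how a node might choose its active link from hello replies. The message fields and link indices are assumptions for illustration; they are not the MPLS-OCC wire format.

```python
# Hedged sketch of active-link selection: a node compares the hop-count
# distances advertised in hello replies and activates the link toward the
# neighbor closest to the centralized controller.
from dataclasses import dataclass

@dataclass
class HelloReply:
    link_id: int             # local index of the link the reply arrived on
    hops_to_controller: int  # distance advertised by the neighboring node

def select_active_link(replies):
    """Return the link_id of the neighbor with the shortest distance to
    the controller, or None if no hello replies were received."""
    if not replies:
        return None
    return min(replies, key=lambda r: r.hops_to_controller).link_id

# Example: the neighbor on link 2 advertises one hop to the controller,
# so link 2 becomes the active link for the discover message.
replies = [HelloReply(link_id=1, hops_to_controller=3),
           HelloReply(link_id=2, hops_to_controller=1)]
assert select_active_link(replies) == 2
```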
[0009] In another example aspect, a network node includes one or
more processors; one or more physical interfaces configured to send
a plurality of hello messages to neighboring network nodes within a
network, wherein each of the plurality of hello messages is sent on
a different respective network link coupled to the network node and
includes an indicator specifying a respective distance as a number
of network hops from the network node to a centralized controller
that manages the network, wherein the one or more physical
interfaces receive a plurality of hello reply messages from
respective neighboring network nodes within the network in response
to the plurality of hello messages, wherein each of the plurality
of hello reply messages is received on a different respective
network link coupled to the network node and includes a respective
indicator specifying a respective distance as a number of network
hops from the respective neighboring network node sending the hello
reply messages to a centralized controller that manages the
network. The network node also includes a protocol module executing
on the one or more processors, wherein the protocol module is
configured to determine, based at least in part on the respective
distance specified by one or more of the plurality of hello reply
messages received from the neighboring network nodes, an active one
of the network links coupled to the network node to one of the
neighboring network nodes having a shortest distance to the
centralized controller, wherein the protocol module is configured
to forward a discover message on the active link to the neighboring
network node having the shortest distance to the centralized
controller, wherein the discover message includes a neighbor node
list specifying a set of neighboring network nodes from which hello
reply packets were received and an intermediate node list that will
specify a set of network nodes the discover message will traverse;
and wherein the protocol module is configured to, after receiving a
discover reply message sent by the centralized controller in
response to the centralized controller receiving the discover
message, send a control message to the centralized controller
encapsulated with an MPLS label that indicates the control message
is to be automatically forwarded by a receiving one of the network
nodes along a shortest path toward the centralized controller.
[0010] In a further example aspect, a method includes receiving, by
a centralized controller, a discover message originating from a
network node, wherein the discover message includes an intermediate
node list that specifies a plurality of network nodes the discover
message traversed from the network node to an edge node;
determining, by the centralized controller and based on the
plurality of nodes specified by the discover message, a path from
the edge node to the network node; allocating, by the centralized
controller, each of a plurality of Multi-protocol Label Switching
(MPLS) labels to a respective outgoing interface of each of the
plurality of network nodes; and outputting, by the centralized
controller, one or more control messages for configuring the
network node, wherein the control messages are encapsulated within
a label stack comprising the allocated plurality of labels.
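A rough sketch of this controller-side aspect follows: the recorded intermediate-node list is reversed into an edge-to-node path, and one label is bound to each hop. The label range, the (MAC, port) tuple encoding, and the simplified port handling are illustrative assumptions, not the patent's implementation.

```python
# Hedged sketch: derive the edge-to-node path from the intermediate node
# list carried by the discover message, then allocate one MPLS label per
# node's outgoing interface and collect them into a label stack.

def compute_path(intermediate_node_list):
    # The discover message records hops from the network node toward the
    # edge node; the controller needs the reverse (edge-to-node) order.
    return list(reversed(intermediate_node_list))

def allocate_label_stack(path, first_label=100000):
    stack, label = [], first_label
    for node_mac, out_port in path:
        stack.append({"label": label, "node": node_mac, "port": out_port})
        label += 1
    return stack

# Hops modeled as (node MAC, port) pairs; the real message records
# ingress and egress port pairs per node.
recorded = [("00:00:00:00:00:01", 3),
            ("00:00:00:00:00:02", 1),
            ("00:00:00:00:00:03", 2)]
stack = allocate_label_stack(compute_path(recorded))
# A control message is then encapsulated with these labels, outermost
# first, so each node pops one label and forwards out the bound port.
```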
[0011] In another example aspect, a centralized controller includes
one or more physical interfaces configured to receive a discover
message originating from a network node, wherein the discover
message includes an intermediate node list that specifies a
plurality of network nodes the discover message traversed from the
network node to an edge node; a path computation module configured
to determine, based on the plurality of nodes specified by the
discover message, a path from the edge node to the network node;
and a path provisioning module configured to allocate each of a
plurality of Multi-protocol Label Switching (MPLS) labels to a
respective outgoing interface of each of the plurality of network
nodes, wherein the one or more physical interfaces are configured
to output one or more control messages for configuring the network
node, wherein the control messages are encapsulated within a label
stack comprising the allocated plurality of labels, and wherein the
one or more physical interfaces are configured to receive one or
more control messages from the network node.
[0012] In a further example aspect, a method includes by a
centralized controller, dynamically establishing a control channel
between the centralized controller and an access node in a
software-defined network having a plurality of network nodes
managed by the centralized controller; receiving, by the
centralized controller, a services indication message from a
network node of the plurality of network nodes, wherein the
services indication message indicates one or more network services
provided by the network node in a software-defined network having a
plurality of network nodes managed by the centralized controller;
establishing, by a centralized controller, a transport label
switched path (LSP) between the access node and the network node to
transport network packets between the access node and the network
node; receiving, by the centralized controller, an endpoint
indication message from the access node via the control channel,
wherein the endpoint indication message indicates that an endpoint
has joined the network at the access node; responsive to
determining that a pseudo wire is needed between the access node
and the network node to provide to the endpoint a network service
of the one or more network services, outputting, by the centralized
controller, a pseudo wire request message via the control channel
to install forwarding state on the access node for creating the
pseudo wire between the access node and the network node; and
outputting, by the centralized controller, a direct switch message
via the control channel to configure the access node to map traffic
received from the endpoint to the pseudo wire.
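The message sequence in this aspect can be summarized in a small, self-contained sketch. The class, its send() log, and the payload fields are hypothetical stand-ins for the control-channel messages named above.

```python
# Hedged sketch of the provisioning flow: on an endpoint indication, the
# controller issues a pseudo wire request (if no PW exists yet between
# the access node and the service node) and then a direct switch request
# mapping the endpoint's traffic onto the pseudo wire.
class ControllerSketch:
    def __init__(self):
        self.sent = []             # log of (node, message_type, payload)
        self.pseudo_wires = set()  # established (access_node, service_node) PWs

    def send(self, node, msg_type, payload):
        self.sent.append((node, msg_type, payload))

    def on_endpoint_indication(self, access_node, service_node, endpoint):
        pw = (access_node, service_node)
        if pw not in self.pseudo_wires:
            # Pseudo wire request: install forwarding state for the PW
            # over the previously established transport LSP.
            self.send(access_node, "PW_REQUEST", {"peer": service_node})
            self.pseudo_wires.add(pw)
        # Direct switch request: map the endpoint's traffic onto the PW.
        self.send(access_node, "DIRECT_SWITCH_REQUEST",
                  {"endpoint": endpoint, "pw": pw})

controller = ControllerSketch()
controller.on_endpoint_indication("AX36", "EN30", "EP38")
assert [m[1] for m in controller.sent] == ["PW_REQUEST", "DIRECT_SWITCH_REQUEST"]
```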
[0013] In another example aspect, a centralized controller is
configured to dynamically establish a control channel between the
centralized controller and an access node in a software-defined
network having a plurality of network nodes managed by the
centralized controller. The centralized controller includes one or
more physical interfaces configured to receive a services
indication message from a network node of the plurality of network
nodes, wherein the services indication message indicates one or
more network services provided by the network node; and a path
provisioning module configured to establish a transport label
switched path (LSP) between the access node and the network node to
transport network packets between the access node and the network
node, wherein the one or more physical interfaces are configured to
receive an endpoint indication message from the access node via the
control channel, wherein the endpoint indication message indicates
that an endpoint has joined the network at the access node.
The path provisioning module is configured to, responsive to
determining that a pseudo wire is needed between the access node
and the network node to provide to the endpoint a network service
of the one or more network services, output a pseudo wire request
message via the control channel to install forwarding state on the
access node for creating the pseudo wire between the access node
and the network node, and wherein the path provisioning module is
configured to output a direct switch message via the control
channel to configure the access node to map traffic received from
the endpoint to the pseudo wire.
[0014] In a further example, a method includes dynamically
establishing a control channel between a centralized controller and
an access node in a software-defined network having a plurality of
network nodes managed by the centralized controller; establishing a
transport label switched path (LSP) between the access node and a
network node of the plurality of network nodes to transport network
packets between the access node and the network node; sending, by
the access node and to the centralized controller via the control
channel, an endpoint indication message that indicates that an
endpoint has joined the network at the access node; receiving,
by the access node, a pseudo wire request message from the
centralized controller via the control channel to install
forwarding state for creating a pseudo wire to the access node for
providing one or more network services to the endpoint; and
receiving, by the access node, a direct switch message from the
centralized controller via the control channel to configure the
access node to map traffic received from the endpoint to the pseudo
wire.
[0015] In a further example, an access node includes one or more
processors; a protocol module executing on the one or more
processors, wherein the protocol module is configured to
dynamically establish a control channel between the access node and
a centralized controller in a software-defined network having a
plurality of network nodes managed by the centralized controller,
wherein the protocol module is configured to establish a transport
label switched path (LSP) between the access node and a network
node of the plurality of network nodes to transport network packets
between the access node and the network node, wherein the protocol
module is configured to send, to the centralized controller via the
control channel, an endpoint indication message that indicates that
an endpoint has joined the network at the access node, wherein
the protocol module is configured to receive a pseudo wire request
message from the centralized controller via the control channel to
install forwarding state for creating a pseudo wire to the access
node for providing one or more network services to the endpoint,
and wherein the protocol module is configured to receive a direct
switch message from the centralized controller via the control
channel to configure the access node to map traffic received from
the endpoint to the pseudo wire.
[0016] The techniques of this disclosure may provide one or more
advantages. For example, the techniques of this disclosure can
allow for reduced total cost of ownership (TCO) of service provider
networks, with the ability to increase capacity at reasonable cost
and being able to manage the networks and deploy services
efficiently. The availability of cost-effective Ethernet solutions
has helped the migration towards converged packet networks in the
access and aggregation space, and MPLS has become the transport of
choice for packetized traffic. However, this disclosure describes
an architecture for access/aggregation networks that is based on
packet switching, supports wired and mobile users, scales easily to
support thousands of infrastructure nodes, such as routers and
switches, as the service provider network grows, and makes the
control and management of the large networks easier.
[0017] The new architectures and techniques described herein for
access/aggregation networks may facilitate seamless plug-and-play
insertion of nodes within the service provider networks, requiring
little or no manual configuration. The nodes can join the network,
discover their neighbors and be able to download configurations.
Among other advantages, this helps reduce the overall operational
expenses of the network.
[0018] The centralized control plane (or controller) becomes the
site for centralized intelligence in the network and supports
programmability. This provides a foundation for software-defined
networking (SDN) in the access/aggregation networks with a high
level northbound application programming interface (API).
Applications can be written on this platform to utilize the
information that is gathered from the network and made available at
the controller.
[0019] The controller also allows the centralized configuration and
management of Network Services (L3VPN, EVPN, VPLS, and Internet)
which are bound by location or identity. The architecture is highly
reliable by design, so that it can provide the level of service
availability that service providers expect.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] FIG. 1 is a block diagram illustrating an example network
system in accordance with techniques described herein.
[0021] FIG. 2 is a block diagram illustrating a system including a
collection of Access Nodes and Aggregation Nodes to be discovered
by a controller according to the techniques of this disclosure.
[0022] FIG. 3 is a block diagram illustrating an example controller
in accordance with the techniques of this disclosure.
[0023] FIG. 4 is a block diagram illustrating an example
implementation of a path computation element of the controller of
FIG. 3.
[0024] FIG. 5 is a block diagram illustrating an example network
device in accordance with the techniques of this disclosure.
[0025] FIG. 6 is a block diagram illustrating an example OCC
Control Packet Structure according to the techniques of this
disclosure.
[0026] FIG. 7 is a block diagram illustrating an example OCC
Message Header in further detail.
[0027] FIG. 8 is a block diagram illustrating an example Base
Packet Structure for SRT packets.
[0028] FIG. 9 is a block diagram illustrating an example Node
Indication Header Structure.
[0029] FIG. 10 is a block diagram illustrating an example Node
Configuration Header Structure.
[0030] FIG. 11 is a block diagram illustrating an example TLV
Structure.
[0031] FIG. 12 is a block diagram illustrating an example Vendor
Specific TLV Structure.
[0032] FIG. 13 is a block diagram illustrating an example Hello
Message Structure.
[0033] FIG. 14 is a block diagram illustrating an example Hello
Reply Message Structure.
[0034] FIG. 15 is a block diagram illustrating an example Discover
Message Structure.
[0035] FIG. 16 is a block diagram illustrating an example Neighbor
Node List Element Structure.
[0036] FIG. 17 is a block diagram illustrating an example
Intermediate Node List Element Structure.
[0037] FIG. 18 is a block diagram illustrating an example Discover
Reply Message Structure.
[0038] FIG. 19 is a block diagram illustrating an example SRT Down
Message Structure.
[0039] FIG. 20 is a block diagram illustrating an example Port
Attributes Indication Message Structure.
[0040] FIG. 21 is a block diagram illustrating an example Shared
Resource Group Structure.
[0041] FIG. 22 is a block diagram illustrating an example Port
Attributes Confirmation Message Structure.
[0042] FIG. 23 is a block diagram illustrating an example
Capabilities Indication Message Structure.
[0043] FIG. 24 is a block diagram illustrating an example Services
Indication Message Structure.
[0044] FIG. 25 is a block diagram illustrating an example Endpoint
Indication Structure.
[0045] FIG. 26 is a block diagram illustrating an example MPLS FIB
Request Message Structure.
[0046] FIG. 27 is a block diagram illustrating an example MPLS FIB
Response Message Structure.
[0047] FIG. 28 is a block diagram illustrating an example Policer
Request Message Structure.
[0048] FIG. 29 is a block diagram illustrating an example Per CoS
Entry Element Structure.
[0049] FIG. 30 is a block diagram illustrating an example Policer
Response Message Structure.
[0050] FIG. 31 is a block diagram illustrating an example CoS
Scheduler Request Message Structure.
[0051] FIG. 32 is a block diagram illustrating an example Per CoS
Scheduler Entry Structure.
[0052] FIG. 33 is a block diagram illustrating an example CoS
Scheduler Response Message Structure.
[0053] FIG. 34 is a block diagram illustrating an example Filter
Request Message Structure.
[0054] FIG. 35 is a block diagram illustrating an example Filter
Rule Structure.
[0055] FIG. 36 is a block diagram illustrating an example Filter
Response Message Structure.
[0056] FIG. 37 is a block diagram illustrating an example Pseudo
Wire Request Message Structure.
[0057] FIG. 38 is a block diagram illustrating an example Pseudo
Wire Response Message Structure.
[0058] FIG. 39 is a block diagram illustrating an example Direct
Switch Request Message Structure.
[0059] FIG. 40 is a block diagram illustrating an example Direct
Switch Response Message Structure.
[0060] FIG. 41 is a block diagram illustrating an example MAC FIB
Request Message Structure.
[0061] FIG. 42 is a block diagram illustrating an example Next Hop
Port Descriptor.
[0062] FIG. 43 is a block diagram illustrating an example Next Hop
Pseudo Wire (PW) Descriptor.
[0063] FIG. 44 is a block diagram illustrating an example MAC FIB
Response Message Structure.
[0064] FIG. 45 is a flowchart illustrating example operation of
network devices in accordance with the techniques of this
disclosure.
[0065] FIG. 46 is a flowchart illustrating example operation of
network devices in accordance with the techniques of this
disclosure.
[0066] FIG. 47 is a block diagram illustrating an example network
system 900 consistent with the Direct Integration Model, according
to one or more aspects of the techniques of this disclosure.
[0067] FIG. 48 is a block diagram illustrating an example network
system 910 consistent with the Edge Node Layer 2 Model, according
to one or more aspects of the techniques of this disclosure.
[0068] FIG. 49 is a block diagram illustrating an example network
920 that includes a primary edge node (EN-P) and a secondary edge
node (EN-S).
[0069] FIG. 50 is a block diagram illustrating an example network
system that shows a forwarding model for a Virtual Private LAN
Switching (VPLS), single connect, port-based session.
[0070] FIG. 51 is a block diagram illustrating an example network
system that shows a forwarding model for a VPLS, dual connect,
port-based session.
[0071] FIG. 52 is a block diagram illustrating an example network
that shows a forwarding model for VPLS, single/dual connect,
MAC-based session.
[0072] FIG. 53 is a block diagram illustrating an example network
system that shows a layer two (L2) subnet arrangement.
[0073] FIG. 54 is a block diagram illustrating an example network
system that shows an L3 virtual private network (VPN)
arrangement.
[0074] FIG. 55 is a block diagram illustrating an example network
system that shows a forwarding model for local switching.
[0075] FIG. 56 is a block diagram illustrating an example network
system that shows per subscriber (endpoint) packet policy and
next-hop chaining at the Access node for Uplink.
[0076] FIG. 57 is a block diagram illustrating an example network
system that shows next-hop chaining at an access node for
downlink.
[0077] FIG. 58 is a block diagram illustrating an example network
system that shows next Policy and Next-Hop Chaining at the Edge
Node for Downlink.
[0078] FIG. 59 is a block diagram illustrating an example system
that shows Next-Hop Chaining at the Edge Node for Uplink.
DETAILED DESCRIPTION
[0079] FIG. 1 is a block diagram illustrating an example network
system 10 in accordance with techniques described herein. As shown
in the example of FIG. 1, network system 10 includes a service
provider network 20 coupled to a public network 22. Service
provider network 20 operates as a private network that provides
packet-based network services to subscriber devices 18A, 18B
(herein, "subscriber devices 18"). Subscriber devices 18A may be,
for example, personal computers, laptop computers or other types of
computing devices associated with subscribers. Subscriber devices 18
may comprise, for example, mobile telephones, laptop or desktop
computers having, e.g., a 3G wireless card, wireless-capable
netbooks, video game devices, pagers, smart phones, personal data
assistants (PDAs) or the like. Each of subscriber devices 18 may
run a variety of software applications, such as word processing and
other office support software, web browsing software, software to
support voice calls, video games, videoconferencing, and email,
among others.
[0080] In the example of FIG. 1, service provider network 20
includes a pair of centralized redundant controllers 35A-35B
("controllers 35") that provide complete control-plane
functionality for access/aggregation network 24. As described
herein, controllers 35 provide seamless end-to-end service from a
core-facing edge of a service provider network through aggregation
and access infrastructure out to access nodes located proximate the
subscriber devices 18.
[0081] Access/aggregation network 24 provides transport
services for network traffic associated with subscribers 18.
Access/aggregation network 24 typically includes one or more
aggregation nodes ("AG") 19, such as internal routers and switches
that provide transport services between access nodes ("AXs") 28, 36
and edge nodes 30. After authentication and establishment of
network access through AXs 28, 36, any one of subscriber devices 18
may begin exchanging data packets with public network 22 with such
packets traversing AXs 28, 36 and AGs 19. Although not shown,
access/aggregation network 24 may include other devices to provide security
services, load balancing, billing, deep-packet inspection (DPI),
and other services for mobile traffic traversing access/aggregation
network 24.
[0082] As described herein, controller 35A operates to provide a
central configuration point for configuring AGs 19 of
access/aggregation network 24 to provide transport services to
transport traffic between AXs 28, 36 and edge nodes 30. Controllers
35 provide a redundant controller system with a control plane that
is constantly monitoring the state of links between nodes in
service provider network 20. Controller 35A serves as the master
controller and synchronizes state actively with the backup
controller 35B, and then in case of failure of the master
controller, the backup controller 35B takes over right away without
loss of any information.
[0083] The data plane of nodes in access/aggregation network 24
uses standard Multi-Protocol Label Switching (MPLS) for forwarding
subscriber traffic, and essentially transforms the entire
access/aggregation network 24 into an MPLS switching fabric. The
use of standard MPLS avoids imposing additional hardware
requirements on the nodes, since most hardware today already
supports full-featured MPLS. In this architecture, the provisioning
of labels for MPLS forwarding is not done by signaling protocols
such as Label Distribution Protocol (LDP) or RSVP. Rather, the
forwarding tables are populated directly by the controller 35A, as
described in further detail below. The use of MPLS makes the
network resilient because of standard MPLS protection features.
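For illustration only, controller-populated forwarding state might be modeled as a plain label map on each node, installed by control messages rather than learned through LDP or RSVP signaling; the entry layout and action names below are assumptions.

```python
# Hedged sketch of a node's MPLS forwarding table whose entries are
# written directly by the controller instead of a label distribution
# protocol.
mpls_fib = {}

def install_fib_entry(in_label, action, out_label=None, out_port=None):
    # action is one of "SWAP", "POP", or "PUSH" in this sketch.
    mpls_fib[in_label] = {"action": action,
                          "out_label": out_label,
                          "out_port": out_port}

# The controller instructs an aggregation node to swap label 100001 for
# 100002 and forward out port 3 along a transport LSP.
install_fib_entry(100001, "SWAP", out_label=100002, out_port=3)
```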
[0084] The architecture features separation of control and data
planes, extracting the complex control functions from the nodes in
the access/aggregation network and centralizing them in controllers
35. Each AX 28, 36 and AG 19 may provide minimal control plane
functionality in addition to the normal full-featured data plane.
Consequently, the access and aggregation nodes are simple and
inexpensive by design, with a minimal control plane and basic MPLS
forwarding support with QoS. The nodes support plug-and-play
deployment, control channel establishment to controller 35A and
participate in auto-discovery of topology. Beyond that, all higher
level control functionality is performed on controller 35A, which
configures all the functions on the nodes. The simplicity of
processing required in the nodes and the use of standard forwarding
mechanisms is expected to reduce the cost of hardware required in
the nodes. Also, since the software running on the node has very
few features, software upgrades may be rarely needed. This, coupled
with centralized management and trouble-shooting, may reduce the
overall total cost of ownership.
[0085] Controller 35A manages configuration and operation of all
the nodes in service provider network 20. In this manner, service
provider network 20 can be considered a software-defined network.
Controller 35A sets up transport paths and dynamically adjusts them
according to node policy, subscriber policy, available capacity and
traffic load in the network. Controller 35A also uses dynamic
control algorithms to effect real-time traffic engineering and
Quality of Service (QoS) provisioning. In addition, controller 35A
automatically sets up Network Services for subscribers. Controller
35A also provides a single touch point into the network for
subscriber policy, provisioning and management, as well as for
applications to interact with the network via north-bound
Application Programming Interface (API).
[0086] As further described below, controllers 35 each include a
path computation module (PCM) that handles topology computation and
path provisioning for the whole of access/aggregation network 24.
That is, when controller 35A is the master controller, the PCM of
controller 35A processes topology information for
access/aggregation network 24, performs path computation and
selection in real-time based on a variety of factors, including
current load conditions of subscriber traffic, and provisions the
LSPs and pseudo wires within the access/aggregation network 24.
[0087] As described, each of AXs 28, 36, AGs 19, edge nodes 30
(generally referred to herein as "network nodes") and controllers
35 executes a control protocol, described herein as the
Multi-Protocol Label Switching-Open Centralized Control (MPLS-OCC)
Protocol, to allow the nodes to be as simple as possible with
minimal control functionality, while allowing the controllers 35 to
perform the complex control functions. Network nodes do not need to
run a routing protocol. As further described below, the MPLS-OCC
protocol allows network nodes to discover their neighbors and
report these neighbors to controller 35A. Controller 35A computes
the topology of the network based on information reported by
network nodes by messages in accordance with the MPLS-OCC protocol.
Given this topology, controller 35A may then compute paths through
the network and install forwarding table entries in the network
nodes to support packet switching between any two nodes in the
network. Controllers 35 are assumed to be Internet Protocol
(IP)-reachable from edge nodes 30 and therefore communicate with
edge nodes 30 via a User Datagram Protocol (UDP) connection.
Controllers 35 may be deployed in redundant pairs with
active/standby semantics or in clusters, for example.
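A minimal sketch of this topology computation, assuming neighbor reports arrive as simple adjacency lists, is shown below. Breadth-first search stands in for the controller's actual path selection, since links in this sketch are weighted only by hop count.

```python
# Hedged sketch: build an adjacency map from reported neighbor lists,
# then compute a hop-count shortest path between any two nodes.
from collections import deque

def build_topology(neighbor_reports):
    """neighbor_reports: {node: [neighbor, ...]} as carried in discover
    messages; links are point to point, so adjacency is symmetric."""
    topo = {}
    for node, neighbors in neighbor_reports.items():
        for nbr in neighbors:
            topo.setdefault(node, set()).add(nbr)
            topo.setdefault(nbr, set()).add(node)
    return topo

def shortest_path(topo, src, dst):
    parent, frontier = {src: None}, deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:                 # reconstruct the path back to src
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in topo.get(node, ()):
            if nbr not in parent:
                parent[nbr] = node
                frontier.append(nbr)
    return None

topo = build_topology({"AX36": ["AG19a"], "AG19a": ["AG19b"],
                       "AG19b": ["EN30"]})
assert shortest_path(topo, "EN30", "AX36") == ["EN30", "AG19b", "AG19a", "AX36"]
```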
[0088] Controller 35A may use the MPLS-OCC protocol to provision
paths with per Class of Service (CoS) policers to maintain QoS and
fair network usage. The MPLS-OCC protocol also supports the
provisioning of schedulers on the ports carrying the paths based on
the bandwidth, scheduling discipline and buffer requirements per
CoS. Once controller 35A has provisioned paths, controller 35A may
use the MPLS-OCC protocol to connect endpoints to network services
provided at the edge router. Controller 35A may use the MPLS-OCC
protocol to provision Pseudo Wires (PWs) over the paths to connect
endpoints with network services. Traffic entering PWs may also be
subjected to per CoS policing and general packet filter actions.
The MPLS-OCC protocol does not rely on the data plane to be
established before the topology can be discovered. As described
herein, access nodes 28, 36 and controller 35A use the MPLS-OCC
protocol to automatically establish a control channel between
controller 35A and the access nodes independent of a data
channel.
[0089] Access nodes (AXs) 28, 36 and edge routers (ERs) 30 operate
at the borders of access/aggregation network 24 and, responsive to
network configuration information received from controller 35A, may
apply network services, such as authorization, policy provisioning
and network connectivity, to network traffic associated with
subscribers 18 in communication with access nodes 28, 36. In the
example of FIG. 1, for ease of explanation, service provider
network 20 is shown having two access nodes 28, 36, although the
service provider network may typically service thousands or tens of
thousands of access nodes.
[0090] Aggregation nodes 19 are nodes which aggregate several
access nodes 28, 36. AGs 19 may, for example, operate as label
switched routers (LSRs) that forward traffic along transport label
switched paths (LSPs) defined within access/aggregation network 24.
AGs 19 and AXs 28, 36 have reduced control planes that do not
execute a Multi-protocol Label Switching (MPLS) protocol for
allocation and distribution of labels for the LSPs (e.g., no LDP or
RSVP-TE protocol executing on the control planes). As one example,
AXs 28, 36 and AGs 19 each execute a control-plane protocol, such
as the MPLS-OCC protocol, to receive MPLS forwarding information
directly from controller 35A, without requiring conventional MPLS
signaling using a label distribution protocol such as LDP or
RSVP.
[0091] Access/aggregation network 24 may also include additional
network nodes that are not shown in FIG. 1. In general, the network
nodes can discover neighboring network nodes and report those
neighbors to controller 35A using a discovery mechanism. Network
nodes are interconnected via point to point links. All network
nodes are assumed to have at least one Ethernet MAC address that is
used as a globally unique identifier. Network nodes have OCC links
that are indexed locally.
[0092] Access nodes (also called "AX") may be considered a special type of network node that provides access functions to endpoints.
An Access Node is a node that provides Ethernet services to an
Endpoint (EP). The Access Node may map the port through which an
Endpoint is connected to a Pseudo-Wire (PW) that carries the
Endpoint's traffic to/from the Network Service located at the Edge
Node. An AX may also locally switch traffic directly between two
ports or directly between itself and another AX. The incoming
traffic from the Endpoint may be subjected to uplink per packet
policy and CoS-based policing at the AX. The AX is a label edge router (LER)
and applies per CoS policing to traffic entering an LSP. The AX is
configuration-less at boot time and acquires its configuration from
controller 35A. The AX uses MPLS-OCC to discover its neighbors and
set up a control channel to controller 35A.
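A toy model of the AX forwarding decision described in this paragraph follows; the table names and the tuple-based return value are illustrative assumptions.

```python
# Hedged sketch of Access Node uplink forwarding: traffic from an
# endpoint-facing port is either switched locally between two ports or
# mapped onto the pseudo wire that carries it to the network service.
port_to_pw = {}    # endpoint-facing port -> pseudo wire identifier
local_switch = {}  # port -> port, for traffic switched at the AX itself

def forward_uplink(in_port, frame):
    if in_port in local_switch:
        return ("port", local_switch[in_port], frame)   # stays on the AX
    if in_port in port_to_pw:
        # Uplink per-packet policy and CoS policing would be applied here
        # before the frame enters the PW toward the edge node.
        return ("pw", port_to_pw[in_port], frame)
    return ("drop", None, frame)

port_to_pw[5] = "pw-to-EN30"
assert forward_uplink(5, b"payload")[0] == "pw"
```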
[0093] In this example, service provider network includes an access
node (AX) 36 and endpoint (EP) 38 that provide subscriber devices
18A with access to access/aggregation network 24. In some examples,
AX 36 may comprise a router that maintains routing information
between subscriber devices 18A and access/aggregation network 24.
AX 36, for example, typically includes Broadband Remote Access
Server (BRAS) functionality to aggregate output from one or more
EPs 38 into a higher-speed uplink to access/aggregation network
24.
[0094] Edge nodes 30 may be considered a special type of network node that has a connection to controller 35A. All packets from
controller 35A to any node in service provider network 20 are sent
via this connection. Edge Nodes 30 map a pseudo wire to Network
Services. Edge Nodes 30 may also apply downlink per packet policy
and CoS based policing to the traffic admitted to the PW. Network
services may be configured and managed on Edge Nodes 30 via
existing mechanisms. Edge Nodes 30 are assumed to be configured and
connected directly to some management network where controllers 35
reside, and are configured with the IP Address of controllers 35.
Edge Nodes 30 may provide an anchor point of active sessions for
subscriber devices 18. In this sense, Edge Nodes 30 may maintain
session data and operate as a termination point for communication
sessions established with subscriber devices 18 that are currently
accessing packet-based services of public network 22 via
access/aggregation network 24. Examples of a high-end mobile
gateway device that manages subscriber sessions for mobile devices
are described in U.S. Pat. No. 8,635,326, entitled "MOBILE GATEWAY HAVING REDUCED FORWARDING STATE FOR ANCHORING MOBILE SUBSCRIBERS,"
the entire content of which is incorporated herein by
reference.
[0095] An endpoint is any device that receives Ethernet services from network 20. An endpoint may be defined by a physical port
location on an AX or by a Media Access Control (MAC) address. In
the example of FIG. 1, EP 38 may communicate with AX 36 over a
physical interface supporting various protocols. EP 38 may, for
example, comprise a switch, a router, a gateway, or another
terminal that operates as a demarcation point between customer
equipment, such as subscriber devices 18B, and service provider
equipment. In one example, EP 38 may comprise a digital subscriber
line access multiplexer (DSLAM) or other switching device. Each of
subscriber devices 18A may utilize a Point-to-Point Protocol (PPP),
such as PPP over ATM or PPP over Ethernet (PPPoE), to communicate
with EP 38. For example, using PPP, one of subscriber devices 18
may request access to access/aggregation network 24 and provide
login information, such as a username and password, for authentication by a policy server (not shown). Other embodiments may use access lines other than DSL, such as cable, Ethernet over T1, T3, or other access links.
[0096] As shown in FIG. 1, service provider network 20 may include
an access node (AX) 28 and EP 32 that provide subscriber devices
18B with access to access/aggregation network 24 via radio signals.
For example, EP 32 may be connected to one or more wireless radios
or base stations (not shown) to wirelessly exchange packetized data
with subscriber devices 18B. EP 32 may comprise a switch, a router,
a gateway, or another terminal that aggregates the packetized data
received from the wireless radios to AX 28. The packetized data may
then be communicated through access/aggregation network 24 of the
service provider by way of AGs 19 and edge routers (ERs) 30, and
ultimately to public network 22.
[0097] Access/aggregation network 24 provides session management,
mobility management, and transport services to support access, by
subscriber devices 18B, to public network 22. In some examples,
access/aggregation network 24 may include an optical access
network. For example, AX 36 may comprise an optical line terminal
(OLT) connected to one or more EPs or optical network units (ONUs)
via optical fiber cables. In this case, AX 36 may convert
electrical signals from access/aggregation network 24 to optical
signals using an optical emitter, i.e., a laser, and a modulator.
AX 36 then transmits the modulated optical signals over one or more
optical fiber cables to the CPEs, which act as termination points
of the optical access network. As one example, EP 38 converts
modulated optical signals received from AX 36 to electrical signals
for transmission to subscriber devices 18A over copper cables. As
one example, EP 38 may comprise a switch located in a neighborhood
or an office or apartment complex capable of providing access to a
plurality of subscriber devices 18A. In other examples, such as
fiber-to-the-home (FTTH), EP 38 may comprise a gateway located
directly at a single-family premise or at an individual business
capable of providing access to the one or more subscriber devices
18A at the premise. In the case of a radio access network, the EPs
may be connected to wireless radios or base stations and convert
the modulated optical signals to electrical signals for
transmission to subscriber devices 18B via wireless signals.
[0098] As described herein, access/aggregation network 24 may
provide a comprehensive solution to limitations of current access
networks. In one example, AXs 28, 36 provide optical interfaces
that are each capable of optically communicating with a plurality
of different endpoints through a common optical interface. Access
node 36 may, for example, communicate with EPs 38 through a passive
optical network using wave division multiplexing. Further, EPs 32,
38 may be low-cost, optical emitter-free EPs that incorporate a
specialized optical interface that utilizes reflective optics for
upstream communications. In this way, multiple EPs 38 are able to
achieve bi-directional communication with access router 36 through
a single optical interface of the access router even though the EPs
are optical emitter (e.g., laser) free. In some examples,
access/aggregation network 24 may further utilize optical splitters
(not shown) for the optical communications associated with each of
the different wavelengths provided by the optical interfaces of
access nodes 28, 36.
[0099] In some examples, the optical interfaces of access nodes 28,
36 provide an execution environment for a plurality of schedulers,
one for each port of the comb filter coupled to the optical
interface, i.e., one for each wavelength. Each scheduler
dynamically services data transmission requests for the set of EPs
32, 38 communicating at the given wavelength, i.e., the set of EPs
coupled to a common port of the comb filter by an optical splitter,
thereby allowing the access network to dynamically schedule data
transmissions so as to utilize otherwise unused communication
bandwidth. Further example details of an optical access network
that uses wave division multiplexing and dynamic scheduling in
conjunction with emitter-free EPs can be found in U.S. Pat. No.
8,687,976, entitled "OPTICAL ACCESS NETWORK HAVING EMITTER-FREE
CUSTOMER PREMISE EQUIPMENT AND ADAPTIVE COMMUNICATION SCHEDULING,"
issued Apr. 1, 2014, the entire contents of which are incorporated
herein by reference.
[0100] The techniques described herein may provide certain
advantages. For example, the techniques may allow a service
provider to achieve a reduction in total operating cost through use
of centralized controllers 35 in conjunction with high-speed
aggregation nodes 19 that are easy to manage and have no persistent
configuration. Moreover, the techniques may be utilized within aggregation networks to unify disparate edge networks into a single service delivery platform for business, residential and mobile applications. Further, the techniques provide an aggregation network architected to scale easily as the number of subscriber devices 18 grows.
[0101] FIG. 2 is a block diagram illustrating the basic
architecture of a system including a collection of Aggregation
Nodes "AG" 52A-52E (hereinafter "Aggregation Nodes 52") and access
nodes 62A-62B (hereinafter, access nodes 62) to be discovered by a
controller 54A or 54B (hereinafter, "controllers 54") using the
MPLS-OCC Protocol, according to the techniques of this disclosure.
MPLS-OCC is designed to allow the topology of a collection of network nodes connected via point-to-point links to be discovered by a controller.
[0102] Controllers 54 represent the OCC Controller entity and may,
for example, represent controllers 35 of FIG. 1. In this example,
one of the controller's functions is to receive neighbor reports
from network nodes and from these reports to compute topology and
path information. In one example, controller 54 may be IP-reachable
from the edge nodes 56A-56B (hereinafter, "Edge Nodes 56") and
therefore may communicate to the Edge Nodes 56 via a UDP
connection. In the example of FIG. 2, controllers 54A and 54B are deployed in redundant pairs with active/standby semantics. Other
examples may include a single controller 54 without a redundant
pair, or may include a set of three or more controllers 54
operating to provide centralized control.
[0103] Aggregation Nodes 52 can provide transport channels (e.g., PWs over LSPs) between Edge Nodes 56 and access nodes 62. Edge nodes 56 map network services
to the PWs. Access nodes 62 map network services via the PWs to End
Points (EP) 64A-64C (hereinafter, "end points 64"). For example,
end points 64 may include network devices such as routers, base
stations, 802.11 access points, IP hosts, and other network
devices.
[0104] Example operation of one implementation of the MPLS-OCC
protocol for use in establishing an SRT control channel is as
follows. Network nodes (including AGs 52 and access nodes 62)
discover their neighbors by sending Hello messages on all of their
OCC links. The Hello message specifies the distance to the active
controller 54A from the sending node, e.g., expressed in number of
hops. However, if the path to the controller 54A is via the link
over which the Hello is sent, the distance specified is infinite.
The distance is set to 1 plus the lowest distance the node received
from other nodes. The edge node 56 always sets the distance to
0.
[0105] When a Hello Reply is received, a neighbor is discovered.
Once a neighbor is discovered on a link, the network node declares
the link as active and adds the link to the neighbor set for that
node. The Hello Reply also carries the distance to the controller
54A from the neighbor. Once the network node receives all the Hello
Reply messages from the neighbors, it knows the shortest distance
from itself to the controller 54A.
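By way of illustration only, the distance rules above may be sketched as follows (Python, with hypothetical names such as INFINITE and advertised_distance; the actual MPLS-OCC encoding is not shown):

    INFINITE = 0xFFFF  # hypothetical sentinel encoding "infinite" distance

    def advertised_distance(own_distance, out_link, controller_link):
        # Per the rule above: if the sender's own path to the controller
        # runs over the link the Hello is sent on, advertise infinity so
        # the neighbor never routes back through this node.
        if out_link == controller_link:
            return INFINITE
        return own_distance

    def distance_after_hellos(is_edge_node, heard_distances):
        # Edge nodes always advertise 0; any other node is one hop
        # farther than its closest neighbor (heard_distances collects
        # the distances carried in the Hello Reply messages).
        if is_edge_node:
            return 0
        return 1 + min(heard_distances)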
[0106] A network node (for example, access node 62A) sends a
Discover message to controller 54A, and the Discover message
specifies the neighbor set. The Discover message contains a
generation number, the neighbor list specifying the neighbor set,
and an intermediate node list that is initially empty. The network node sends the Discover message out its active link with the shortest distance to controller 54A (e.g., to AG 52C). The
receiving node of the Discover message first checks to see if the
receiving node is on the intermediate node list. If the receiving
node is on the list this implies that the packet has visited the
node before, and the packet is dropped. If the receiving node is
not on the list, the receiving node adds itself to the list along
with its ingress and egress port indices, and then forwards the
packet toward controller 54A along its shortest path link.
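A minimal sketch, in Python with a hypothetical dict-based message representation, of the Discover forwarding rule described above:

    def handle_discover(node_id, ingress_port, egress_port, discover):
        # `discover` is a hypothetical dict carrying the generation
        # number, neighbor list, and intermediate node list described
        # above; the real MPLS-OCC wire encoding is not shown.
        if any(e['node'] == node_id for e in discover['intermediate_nodes']):
            return None  # already on the list: the packet has looped, drop it
        # Record this node with its ingress and egress port indices,
        # then let the caller forward the packet on its shortest-path
        # link toward the controller.
        discover['intermediate_nodes'].append(
            {'node': node_id, 'ingress': ingress_port, 'egress': egress_port})
        return discover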
[0107] This process continues until an edge node receives the Discover message (e.g., edge node 56A). Edge node 56A transmits
all Discover messages directly to controller 54A via a UDP control
channel, such as one of UDP control channels 58A-58D.
[0108] When controller 54A receives the Discover message,
controller 54A compares the generation number of the Discover
message against the current generation number received for the
initiating node. If the generation number is newer, controller 54A
updates the neighbor list and the path to the node. Controller 54A
computes the path to the node by reversing the path the Discover
message took as recorded in the intermediate node list. This path
is referred to as a Source Routed Tunnel (SRT), or "control
channel" to the network node, and is used for the duration of this
generation number for all OCC communications with the network node.
Note that controller 54A may alternatively choose to compute an SRT
based on the available topology information, in which case
controller 54A need not use the same path as in the intermediate
node list found in the Discover message.
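The controller-side handling described above may be sketched as follows (Python; the state layout, field names, and the exact port swap when reversing the route are plausible assumptions, not the actual implementation):

    def process_discover(known, discover):
        # `known` maps originating node -> its latest generation,
        # neighbor list, and SRT; all names here are hypothetical.
        origin = discover['origin']
        entry = known.get(origin)
        if entry is not None and discover['generation'] <= entry['generation']:
            return None  # not newer: keep the existing neighbor list and SRT
        # Reverse the recorded route to obtain the edge-to-node path.
        # One plausible reading: the port a hop received the Discover on
        # becomes that hop's egress port in the reverse direction.
        srt = [{'node': e['node'], 'egress': e['ingress'], 'ingress': e['egress']}
               for e in reversed(discover['intermediate_nodes'])]
        known[origin] = {'generation': discover['generation'],
                         'neighbors': discover['neighbors'],
                         'srt': srt}
        return srt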
[0109] The SRT control channel to a network node is specified by an
MPLS label stack where the label value implicitly corresponds to
the egress port index. When controller 54A sends control channel
messages on the SRT control channel to a network node, controller
54A sends the messages having the MPLS label stack. Controller 54A
allocates the labels in the MPLS label stack. At each hop along the
SRT control channel to a network node, the outermost label is
popped from the stack. At the penultimate hop, the outermost SRT
MPLS label is popped. The top of the stack then includes a service
label that identifies the packet as an OCC control channel packet
at the final destination. This label is popped exposing a raw
Ethernet frame which is then processed by the node's networking
stack.
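A sketch of label stack construction for an SRT control message (Python; the CONTROL_SERVICE value is a hypothetical placeholder, the Port Index+18 mapping is described later with respect to FIG. 8, and whether the edge node contributes an entry is not shown here):

    PORT_LABEL_BASE = 18   # per the label mapping given later (FIG. 8)
    CONTROL_SERVICE = 16   # hypothetical value for the OCC service label

    def srt_label_stack(srt):
        # One label per hop, each implicitly naming the egress port to
        # take at that hop; the innermost (service) label identifies the
        # packet as an OCC control packet at the final destination.
        stack = [PORT_LABEL_BASE + hop['egress'] for hop in srt]
        stack.append(CONTROL_SERVICE)
        return stack  # outermost label first; each hop pops one label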
[0110] The controller 54A responds to the first Discover message of
a given generation number by issuing a Discover Reply message via
the newly discovered SRT to the node. When a node receives a
Discover Reply of a matching generation number, the controller 54A and the node are in sync with respect to the node's neighbor list and the SRT used to send additional OCC control messages.
[0111] The SRT from a node to the controller is specified as a
single MPLS label meaning "To Controller". The "To controller"
label is a Multi-protocol Label Switching (MPLS) label that
indicates the message is to be automatically output by a receiving
node toward the centralized controller. The "To controller" label
is understood by all nodes in the network, and any packet received
by the node with TO_CONTROLLER specified as its outer label is
automatically switched by the receiving node to the least cost port
to the controller, as indicated by Hello messages previously
received by the node. In some examples, the "To controller" label
is manually configured on the nodes. In some examples, the nodes
receive the "To controller" label as configuration from the
centralized controller. The label remains unchanged as the packet
is transmitted to the next node on the path to the edge node. When
a packet arrives at the edge node with the TO_CONTROLLER label, the
LSP label is popped, exposing the CONTROL_SERVICE label, and the
packet is then transmitted via the UDP control channel to
controller 54A.
[0112] The node sends Keepalive packets to the controller 54A to
ensure the state of the SRT. The controller 54A responds with a
Keepalive Reply. If no Keepalive Reply occurs, the node restarts
the discovery process by sending Hello messages, and generates a
new Discover message with a new generation number to force the
acceptance at the controller 54A of a new SRT.
[0113] The SRT control channel may now be used to program the
forwarding plane of the node via other control messages. Such
forwarding plane programming may include, for example, the
specification of LSPs through the MPLS-OCC nodes, the provisioning
of policers and CoS schedulers, the admittance of endpoints at the
access nodes, and the provisioning of packet filters and Pseudo
Wires.
[0114] FIG. 3 is a block diagram illustrating an example controller
200 in accordance with one or more aspects of the techniques of
this disclosure. Controller 200 may include a server or network
controller, for example, and may represent an example instance of
any of controllers 35 of FIG. 1 or controllers 54 of FIG. 2.
[0115] Controller 200 includes a control unit 202 coupled to a
network interface 220 to exchange packets with other network
devices by inbound link 222 and outbound link 224. Control unit 202
may include one or more processors (not shown in FIG. 3) that
execute software instructions, such as those used to define a
software or computer program, stored to a computer-readable storage
medium (not shown in FIG. 3), such as non-transitory
computer-readable mediums including a storage device (e.g., a disk
drive, or an optical drive) or a memory (such as Flash memory or
random access memory (RAM)) or any other type of volatile or
non-volatile memory, that stores instructions to cause the one or
more processors to perform the techniques described herein.
Alternatively or additionally, control unit 202 may comprise
dedicated hardware, such as one or more integrated circuits, one or
more Application Specific Integrated Circuits (ASICs), one or more
Application Specific Special Processors (ASSPs), one or more Field
Programmable Gate Arrays (FPGAs), or any combination of one or more
of the foregoing examples of dedicated hardware, for performing the
techniques described herein.
[0116] Control unit 202 provides an operating environment for
network services applications 204, access authorization
provisioning module 208, path computation element 212, and edge
authorization provisioning module 210. In one example, these
modules may be implemented as one or more processes executing on
one or more virtual machines of one or more servers. That is, while
generally illustrated and described as executing on a single
controller 200, aspects of these modules may be delegated to other
computing devices.
[0117] Network services applications 204 represent one or more
processes that provide services to clients of a service provider
network that includes controller 200 to manage connectivity in the
aggregation domain (alternatively referred to as the "path
computation domain") according to techniques of this disclosure.
Network services applications 204 may provide, for instance, Voice-over-IP (VoIP), Video-on-Demand (VOD), bulk transport, walled/open garden, IP Mobility Subsystem (IMS) and other mobility services, and Internet services to clients of the service provider network. Network services applications 204 require services provided by path computation element 212, such as
node management, session management, and policy enforcement. Each
of network services applications 204 may include client interface
206 by which one or more client applications request services.
Client interface 206 may represent a command line interface (CLI)
or graphical user interface (GUI), for instance. Client interface 206 may
also, or alternatively, provide an application programming
interface (API) such as a web service to client applications.
[0118] Network services applications 204 issue path requests to
path computation element 212 to request paths in a path computation
domain controlled by controller 200. In general, a path request
includes a required bandwidth or other constraint and two endpoints
representing an access node and an edge node that communicate over
the path computation domain managed by controller 200. Path
requests may further specify time/date during which paths must be
operational and CoS parameters (for instance, bandwidth required
per class for certain paths).
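One plausible shape for such a path request is sketched below (Python; the field names are illustrative only and are not the actual northbound API):

    from dataclasses import dataclass, field

    @dataclass
    class PathRequest:
        # Hypothetical path request mirroring the fields described above.
        access_node: str        # one endpoint of the requested path
        edge_node: str          # the other endpoint
        bandwidth_bps: int      # required bandwidth or other constraint
        per_cos_bps: dict = field(default_factory=dict)  # CoS parameters
        active_from: str = ''   # optional time/date the path must be up
        active_until: str = ''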
[0119] Path computation element 212 accepts path requests from
network services applications 204 to establish paths between the
endpoints over the path computation domain. Paths may be requested
for different times and dates and with disparate bandwidth
requirements. Path computation element 212 reconciles path requests from network services applications 204 to multiplex
requested paths onto the path computation domain based on requested
path parameters and anticipated network resource availability.
[0120] To intelligently compute and establish paths through the
path computation domain, path computation element 212 includes
topology module 216 to receive topology information describing
available resources of the path computation domain, including
access, aggregation, and edge nodes, interfaces thereof, and
interconnecting communication links.
[0121] Path computation module 214 of path computation element 212
computes requested paths through the path computation domain. In
general, paths are unidirectional. Upon computing paths, path
computation module 214 schedules the paths for provisioning by path
provisioning module 218. A computed path includes path information
usable by path provisioning module 218 to establish the path in the
network. Provisioning a path may require path validation prior to
committing the path to provide for packet transport.
[0122] FIG. 4 is a block diagram illustrating an example
implementation of a path computation element 212 of controller 200
of FIG. 3. In this example, path computation element 212 includes
northbound and southbound interfaces in the form of northbound
application programming interface (API) 230 and southbound API
(232). Northbound API 230 includes methods and/or accessible data
structures by which network services applications 204 may configure
and request path computation and query established paths within the
path computation domain. Southbound API 232 includes methods and/or
accessible data structures by which path computation element 212
receives topology information for the path computation domain and
establishes paths by accessing and programming data planes of
aggregation nodes and/or access nodes within the path computation
domain.
[0123] Path computation module 214 includes data structures to
store path information for computing and establishing requested
paths. These data structures include constraints 234, path
requirements 236, operational configuration 238, and path export
240. Network services applications 204 may invoke northbound API
230 to install/query data from these data structures. Constraints
234 represent a data structure that describes external constraints
upon path computation. Constraints 234 allow network services
applications 204 to, e.g., modify link attributes before path
computation module 214 computes a set of paths. For example, Radio Frequency (RF) modules (not shown) may edit links to indicate that resources are shared between a group and that resources must be allocated accordingly. Network services applications 204 may modify attributes of a link to affect resulting traffic engineering computations in accordance with MPLS-OCC. In such instances, link
attributes may override attributes received from topology
indication module 250 and remain in effect for the duration of the
node/attendant port in the topology. A link edit message to
constraints 234 may include a link descriptor specifying a node
identifier and port index, together with link attributes specifying
a bandwidth, expected time to transmit, shared link group, and fate
shared group, for instance. The link edit message may be sent by
the PCE.
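A sketch of one possible encoding of such a link edit message (Python; the field names mirror the attributes listed above but are otherwise hypothetical):

    from dataclasses import dataclass

    @dataclass
    class LinkEditMessage:
        # Hypothetical link edit message posted to constraints 234.
        node_id: str                       # link descriptor: node identifier...
        port_index: int                    # ...plus local port index
        bandwidth_bps: int                 # overriding link attributes
        expected_time_to_transmit_us: int
        shared_link_group: int
        fate_shared_group: int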
[0124] Operational configuration 238 represents a data structure
that provides configuration information to path computation module 214 to configure the path computation algorithm with respect to,
for example, class of service (CoS) descriptors and detour
behaviors. In some examples, operational configuration 238 may
receive operational configuration information in accordance with
MPLS-OCC. In some examples, an operational configuration message
specifies CoS value, queue depth, queue depth priority, scheduling
discipline, over provisioning factors, detour type, path failure
mode, and detour path failure mode, for instance. In some examples,
a single CoS profile may be used for the entire path computation
domain.
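One plausible per-CoS representation of this operational configuration is sketched below (Python; field names are illustrative and follow the message contents listed above):

    from dataclasses import dataclass

    @dataclass
    class CosProfile:
        # Hypothetical operational configuration entry for one CoS.
        cos_value: int
        queue_depth_ms: int
        queue_depth_priority: str     # e.g. 'High', 'Medium', 'Low'
        scheduling_discipline: str    # e.g. 'Strict' or 'DWRR'
        over_provisioning_factor: float
        detour_type: str              # None/Best-effort/CoS-only/Strict-TE
        path_failure_mode: str        # PPR, Ignore, or Fail
        detour_path_failure_mode: str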
[0125] Network Discovery is the process by which controller 200 learns the capabilities of network nodes and their neighbors (and therefore the topology of the network), creates control channels by which controller 200 can configure the discovered nodes, and learns of the Network Services available at the Edge Nodes in the network.
[0126] Topology module 216 of controller 200 performs topology
discovery according to the MPLS-OCC protocol, by exchange of
MPLS-OCC messages. Additional details of various MPLS-OCC messages
are provided below. A node uses MPLS-OCC Hello messages to discover
local neighbors. The node then reports the neighbors to controller
200 by sending MPLS-OCC Discover messages towards controller 200.
The Discover messages, as they work their way toward the edge node,
record the route taken to the edge node in the Intermediate Node
List (INL) of the Discover messages. This recorded route is used by
controller 200 to construct a Source Routed Tunnel (SRT) comprising
ingress and egress interface pairs that specify the path from an
edge node to the discovered node. Once the SRT is created,
controller 200 and the node use the SRT for subsequent control
message communication.
[0127] A node's capabilities are described via the MPLS-OCC
Capabilities Indication message. Controller 200 may use the
capabilities indicated in a Capabilities Indication message to make
decisions about how the node is used. Capabilities may indicate resource limitations, so that controller 200 will not select nodes whose resources are exhausted for certain services. Such resources include Policers, Filter Rules, and output buffer space. Other
capabilities include CoS scheduling capability.
[0128] Controller 200 discovers Network Services via receiving the
MPLS-OCC Services Indication message from a network node such as an
edge node. A Network Service is defined by a name associated with a
Bridge Domain (or VLAN). It is assumed that Edge Nodes reporting
the same Network Service represent redundancy for that Network
Service.
[0129] As a result of Network Discovery, controller 200 has the following information: a control channel between itself and each discovered node; a view of the topology of the entire network; an understanding of the capabilities of each node; and knowledge of where Network Services are located, from which it can make decisions about how to deploy resilient services.
[0130] Path computation module 214 computes paths across the
discovered topology. Paths may be computed at the request of other
system applications for two primary purposes: to establish an IP connection for a node for the purposes of Node Management, and to establish an Ethernet Service at some Endpoint. Generally speaking, path computation module 214 computes paths between AXs and ENs to plumb PWs that map Endpoints to Network Services. In some
examples, however, path computation module 214 may compute paths
directly between AXs to support local switching.
[0131] Paths are requested with per-CoS bandwidth allocations. Each
CoS may also have specific Path Computation attributes such as an
Over-Subscription factor and Detour path requirements. Real-time
path computation is possible. In some examples, path requests may
be continuously modified to account for new sessions that are
utilizing the Path or to support Auto-Bandwidth functions. Given
that highly available access is of primary importance, the PCE
includes configuration mechanisms whereby the network may run in a
degraded state in the face of failure. A degraded state is defined as the condition where all provisioned Paths are allocated proportionally less bandwidth than they requested, and where, for some Paths for which protection is requested, either none is provided or the protection provided is not backed by allocated resources. Such behavior may contrast with offline path computation implementations that may fail a path request for a given topology state. Such implementations are more concerned with TE and protection of a more static nature where, given a failure in the network, detours become operational, but a re-computation across the new topology may not be performed.
[0132] In some examples, CoS values are described as follows:
[0133] Queue Depth: Queue Depth represents the amount of time a
packet can sit in a queue before it becomes stale. For TCP traffic,
this time is generally the round trip time of the TCP session (150
msec). For VoIP this time is generally 10 to 50 msec. Different
nodes may have different buffer capacities. It may not be possible
to guarantee a specific time allotment per queue. Nodes should
therefore be able to size queues according to the available buffer
space and the service class for the queue.
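The relationship between queue depth in time and queue size in bytes can be illustrated as follows (Python; a simple sizing rule consistent with the description above, not a mandated formula):

    def queue_size_bytes(class_bandwidth_bps, queue_depth_ms):
        # The queue drains at the class bandwidth, so depth-in-time
        # times drain rate bounds how long a packet can sit queued.
        return int(class_bandwidth_bps / 8 * queue_depth_ms / 1000)

    # Example: a 150 msec TCP class at 10 Mbps needs about 187,500 bytes.
    assert queue_size_bytes(10_000_000, 150) == 187_500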
[0134] Queue Depth Priority: When a class of service is active over
some interface, the interface queues are sized to buffer at the
indicated depth based on the bandwidth for the class. If there is
insufficient buffer space, queue size is reduced according to queue
depth priority. Lower priority classes are reduced before higher
priority classes.
[0135] Scheduling Discipline: Scheduling Discipline determines how
the queue is scheduled with respect to other queues.
Deficit-weighted round-robin (DWRR) may be used, together with
Strict scheduling for voice traffic. Controller 200 configures the
schedulers on all node interfaces according to the bandwidth and
scheduling class for each CoS active on the interface.
[0136] Over-Provisioning Factor: When a path is routed through the network/path computation domain, the path receives allocated bandwidth from each link over which it is routed. For some classes of service, it is appropriate to over-provision the network.
allows the policers at the edge and access to admit more traffic
into the network than the network may actually be able to handle.
This might be appropriate in cases where the traffic is best
effort, for example. By over-provisioning certain classes of
traffic, the network operator may realize better network
utilization while still providing required QoS for other classes
that are not over-provisioned.
[0137] Detour Type: Specifies the traffic engineering requirements
for computed detours. Due to resource restrictions, users may elect
to configure detours that have fewer constraints than the primary
paths. Detour paths may, for instance, take on one of the following
values: None, Best-effort, CoS-only, Strict-TE. The None value
specifies do not compute detours. The Best-effort value specifies
compute detours but ignore TE bandwidth and CoS requirements. CoS
is dropped from the packet header and therefore the detour traffic
gets best-effort CoS. The CoS-only value specifies preserve CoS but
do not traffic engineer the detour. Under these conditions, traffic
competes equally with other primary path traffic for available resources; therefore, interface congestion may occur when the
detour is active. The Strict-TE value specifies preserve CoS and
traffic engineering for the detour.
[0138] Path Failure Mode: Defines the per-CoS behavior to take when
the primary path computation fails due to resource constraints. The
Proportional Path Reduction (PPR), Ignore, and Fail options are
available. The PPR option specifies all paths traversing the
congested links are reduced proportionally until all paths can be
accommodated over the points of congestion. The Ignore option
specifies raise an alert message but otherwise allow the network to
operate in this oversubscribed manner. The Fail option specifies
fail to compute the remainder of the paths and do not admit traffic
for failed paths into the network.
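The PPR option can be illustrated with a short sketch (Python; a uniform scale factor is one plausible reading of "reduced proportionally"):

    def proportional_path_reduction(requested_bps, link_capacity_bps):
        # If the requests exceed capacity, scale every path on the
        # congested link by the same factor so all can be accommodated.
        total = sum(requested_bps)
        if total <= link_capacity_bps:
            return list(requested_bps)
        scale = link_capacity_bps / total
        return [bw * scale for bw in requested_bps]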
[0139] Detour Path Failure Mode: Defines the behavior of the system
when detour paths cannot be computed due to resource constraints.
This attribute may only be applicable when Detour Type is
Strict-TE.
[0140] MPLS-OCC messages used by controller 200 for FIB Programming are described in further detail below. FIB Programming includes
two major areas. The first is LSP plumbing. When Path computation module 214 computes a Path, that Path is essentially a collection of links connecting the ingress and egress nodes in the Path. Once a Path is computed, path provisioning module 218 provisions an LSP representing the Path across all the nodes in the Path. Such provisioning is performed via the MPLS-OCC
MPLS FIB Request messages. As described below, MPLS FIB Request
messages specify the path ID, ingress label, egress label and
egress port for a given path at a given node. Note that the ingress
node is a special case and does not include the ingress label. The
labels for the detour paths may also be specified using the same
message.
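One plausible per-node representation of an MPLS FIB Request is sketched below (Python; field names are illustrative, and the wire encoding is not shown):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class MplsFibRequest:
        # Hypothetical per-node contents of an MPLS FIB Request.
        path_id: int
        ingress_label: Optional[int]  # None at the ingress node (special case)
        egress_label: int
        egress_port: int
        detour_egress_label: Optional[int] = None  # detour labels may be
        detour_egress_port: Optional[int] = None   # carried in the same message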
[0141] At the ingress node, a per-CoS policer is also specified to
police the traffic entering the LSP according to the provisioned
Path's bandwidth requirements. Such policing allows the network to
meet the QoS requirements for various classes of service. The
MPLS-OCC Policer Request message is used to configure the policer
on the LSP ingress node.
[0142] For each interface over which the Path traverses, CoS
scheduler configure module 256 computes the CoS scheduling
parameters based on the set of paths that traverse the interface.
Whenever a Path is updated, CoS scheduler configure module 256 may
modify such CoS scheduling parameters. Since Paths may be
continuously recomputed or single network events may result in many
paths being re-plumbed, CoS scheduler configure module 256 may time
delay the CoS Scheduling operation to avoid thrashing and
overloading the control channel. The MPLS-OCC CoS Scheduler Request
message is used to configure the CoS Scheduler for some port.
[0143] The second primary area of FIB programming concerns the
admittance of Endpoints into the network. This involves the
establishment of a Pseudo-Wire (PW) at the AX and EN nodes and
mapping Endpoint traffic to and from this PW. Additionally,
per-Endpoint policy and CoS based policing may be applied at either
end of the PW. Controller 200 receives a MPLS-OCC Endpoint
Indication message sent by a network node to indicate the presence
of a new Endpoint or change of status associated with an endpoint,
and controller 200 sends a Pseudo Wire Request message to configure
the PW to support the Endpoint.
[0144] Node Management concerns the configuration and management of an MPLS-OCC node. When a node is discovered by controller 200 using MPLS-OCC, in some examples the following operations are performed: (1) The node should be connected to a management IP network. MPLS-OCC supports a limited set of node management functions. More general or higher level functions should be supported over an IP interface rather than the MPLS-OCC control channel. The reasoning is as follows: the MPLS-OCC control channel is intended for low-bandwidth functions and to provide resiliency. The control channel uses a Source Routed Tunnel (SRT) for the basic control communication between the controller and the node. Messages from the controller to the node use the MPLS label stack that describes the source routed path to the node. Messages from the node to the controller use the TO_CONTROLLER label to traverse the path discovered via the Hello messages. Also, a plethora of functionality and protocols is available for node management once the node has an IP address, including TCP, SNMP, Telnet, and the like. (2) The node's image should be checked for compatibility with controller 200 and, if found incompatible or otherwise requiring an update, the node's image should be updated. Image management is performed over the IP interface using Secure Copy Protocol (SCP). (3) A stats collection channel should be created between the node and the controller. Stats are collected via standard SNMP MIBs. Where no MIB exists, stats may be collected via other mechanisms. (4) Depending on the type of the node, additional configuration may be required. This could include port or radio configuration.
[0145] Since third party nodes are expected to be integrated into
the larger system, such nodes may have their own management
systems. These systems may manage the node via the IP interface
described above. The mechanism by which nodes and management
systems discover each other is node specific. However, controller
200 provides northbound API 230 to third parties, which allows the
third parties to discover the existence of nodes, their type and IP
address. Node managers or node manager extenders (3rd party managers) may also use APIs to: (1) specify that a given link of a node is a member of a shared resource group, thereby providing information to the PCE that bandwidth allocation from any link affects all links in the group; (2) discover the actual per-CoS bandwidth allocations for all links in a shared resource group such that per-link schedulers can be configured appropriately on each link; and (3) specify that certain links are members of a fate shared group. (Note, this could be a node management function or a function of some other application. However, the point here is that this information is not available to MPLS-OCC and therefore must be discovered through external mechanisms.) Regardless, the PCE uses this information to compute detours for paths that do not use links from the same fate sharing group as their primaries.
[0146] Before a node can be allocated an IP address, the node must establish the data plane between itself and the EN to which the management network is connected. The establishment of the data plane for a node is almost the same as the establishment of the data plane for an Endpoint, with the difference being that the data plane for the node is terminated at the node's control processor rather than at one of the node's Endpoints. Therefore, the same subscriber management functions are used in both cases.
[0147] Subscriber Management is the process by which subscribers
(or Endpoints) are admitted into the network. For each subscriber,
the controller derives an authorization record, e.g. based on
policy configured on the controller. The network may include a
Policy Manager entity associated with the Network Management
platform that configures the policy on the controller with
attributes that a user/subscriber can have, such as security level, policer configuration, and SLA level. The authorization
record includes the Network Service to which the subscriber is to
be admitted, the policy to be applied to the subscriber's traffic
and the per-CoS bandwidth allocations for the subscriber. The
per-CoS bandwidth allocations can specify a minimum and maximum.
Such a range can be used to control the auto-bandwidth function of
controller 200.
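One plausible shape for such an authorization record is sketched below (Python; field names are illustrative):

    from dataclasses import dataclass, field

    @dataclass
    class AuthorizationRecord:
        # Hypothetical authorization record derived per subscriber.
        network_service: str   # Network Service to admit the subscriber to
        policy: str            # policy applied to the subscriber's traffic
        # per-CoS (min_bps, max_bps); the range steers auto-bandwidth
        per_cos_bps: dict = field(default_factory=dict)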
[0148] Note that multiple PWs may run over a given pair of Paths
and that multiple subscribers may be handled by a single PW. In
this case, the BW allocation for a given path is the aggregate for
all the subscribers carried by the Path. Endpoints may be defined
by <node, port> or <node, port, MAC>. In the first
case, all the traffic from the <node, port> is subjected to
the authorization record. In the second case, all the traffic from
the MAC is subjected to the authorization record. Authorization
records may be derived from policy configured on the Controller.
Effectively, an Endpoint maps to a service profile that defines the
authorization record. If the Endpoint is defined by <node,
port> then there exists a mapping for each distinct <node,
port> to a service profile (or a wild-card configuration). If
the Endpoint is defined by <node, port, MAC> then the MAC
address may be authorized via dot1X and a service profile is
identified from the authorization record returned from RADIUS.
[0149] A subscriber's authorization record may include a minimum
and maximum bandwidth allocation per CoS for uplink and downlink
traffic. The minimum may be 0, indicating that no resources are
allocated for that subscriber's traffic at that CoS. The minimum
bandwidth is used to adjust the bandwidth allocations for the LSPs
carrying the subscriber's traffic. Therefore, a Path's bandwidth
allocation is the sum of the minimum bandwidth specification for
all subscribers whose traffic traverses that Path.
[0150] The maximum bandwidth allocation is the maximum bandwidth
that the subscriber may use. Policer configure module 254 may use
this value to define the per-subscriber policer on ingress to the PW carrying the subscriber's traffic. The maximum can be used either to protect the LSP bandwidth allocation, in which case minimum==maximum, or to cap subscribers at some level. Service
providers may, in some examples, choose to cap bandwidth for the
purposes of selling higher levels of service. In other examples
there may also be no maximum specified and hence no per-subscriber
policing. If a subscriber's bandwidth exceeds the specified
maximum, the traffic may be either dropped or reclassified as
discard eligible. Alternatively, the discard-eligible packet could
be moved to some other class of service; as such it would suffer
from potential re-ordering issues with respect to
non-discard-eligible packets.
[0151] Thus far, what has been described is a mechanism to build a
DiffServ TE network where CoS allocations and Path computations are
a function of subscriber policy alone. However, the system must
also have the ability to support auto-bandwidth functionality.
Auto-bandwidth is the ability to automatically size paths according
to their current levels of load. With such a capability,
instantaneous load can be distributed across the network in real
time. Specifically, the algorithm could operate as follows:
1. Controller 200 initially plumbs paths with bandwidth set to the sum of all subscriber minimum bandwidths.
2. Over time, controller 200 monitors the actual bandwidth utilized on the Path, including dropped or reclassified packet counts.
3. If there are drops on the Path, controller 200 increases the bandwidth allocated for that path by some fraction.
4. For paths that are utilizing less than their current allocation, controller 200 reduces the current allocation to some fraction of their current utilization.
5. Go to step 2.
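One per-CoS iteration of steps 2 through 4 above might be sketched as follows (Python; grow_fraction and headroom stand in for the unspecified "some fraction" values):

    def auto_bandwidth_step(allocated_bps, drops, measured_bps,
                            grow_fraction=0.25, headroom=1.1):
        if drops > 0:
            # Step 3: drops on the Path -> increase the allocation.
            return allocated_bps * (1 + grow_fraction)
        if measured_bps * headroom < allocated_bps:
            # Step 4: under-utilized -> shrink toward current utilization.
            return measured_bps * headroom
        return allocated_bps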
[0152] Note that all of these operations are done per CoS. Given
this behavior and the PCE functionality, a set of use case
scenarios can be realized. Each of these use cases is analyzed in
detail in the following sections.
[0153] Voice generally fits into the DiffServ Expedited Forwarding
(EF) forwarding class. Voice is highly sensitive to drops, latency
and jitter. At the same time voice is typically very low bandwidth,
requiring on the order of perhaps 64 Kbps per voice session. Since
this architecture does not propose any interworking with voice
signaling protocols, the bandwidth allocation is static per
subscriber. Given the low bandwidth requirements, this is probably
reasonable in practice. To ensure adequate bandwidth, the voice
class is not over-provisioned. However, if a customer had a good
understanding of their voice duty cycle per subscriber, an
over-provisioning factor could be used.
[0154] Since voice has stringent real-time performance
requirements, operators would likely provide voice with paths with
detours that utilize strict TE. It is assumed that operators
deploying voice in their network are only expecting a small
percentage (5 to 10%) of the traffic to be voice traffic. So it is
unlikely that Paths will fail due to an inability to allocate
resources for voice. Therefore the Path Failure Mode could be
either Fail or Proportional Path Reduction (PPR) with PPR being
preferred as paths are still plumbed.
[0155] While auto-bandwidth could be used with voice, it is not
likely to have a significant impact on network utilization as voice
bandwidth is typically small. Voice traffic may be identified via
DiffServ marking or 5 tuple packet classification. It is generally
assumed that Endpoints will have the ability to mark their voice
streams.
[0156] Video is typically batch streamed but it could be streamed
in real time. It is somewhat sensitive to latency, delay and
jitter, but not to the extent of conversational voice. It is
bandwidth hungry and less elastic than typical Internet traffic.
Therefore, video more appropriately fits into a DiffServ Assured
Forwarding (AF) CoS. AF essentially gives better scheduling and
queuing resources versus best-effort classes. Today, delivery of
quality video is seen as a major differentiator by many service
providers, so the use of auto-bandwidth to garner the required
resources to deliver the video will likely be an attractive feature
to many service providers.
[0157] Assuming that video makes use of auto-bandwidth to deliver
the service, over-provisioning the class may not make sense since
network allocations are a function of the current usage levels.
However, if auto-bandwidth is not used then the network should be
over-provisioned based on the per subscriber duty cycle for
video.
[0158] Since video traffic is likely to be high bandwidth, use of
CoS-only for detours is recommended. Such detours will be used to
maintain existing streams, possibly at a reduced level of service,
while the network is being repaired. Video traffic may be identified via DiffServ marking or 5-tuple packet classification.
[0159] Internet Data with service level agreements (SLAs) typically
falls under the DiffServ AF Class. Typically a customer is given a
minimum BW allocation and allowed to burst to some maximum. The
traffic outside of the minimum is typically marked as discard
eligible and is delivered so long as there is no congestion in the
network. Since there is an SLA involved, it is assumed that the
minimum bandwidth is always allocated independent of actual load.
Also, since operators may want to offer more expensive plans with
higher maximum bandwidth, the bandwidth is capped. Therefore, the
extent to which auto-bandwidth can operate is more restrictive than
the video class. This also implies that over-provisioning may play
a bigger role. In effect, it might be the case that
over-provisioning and auto-bandwidth operate as two sides of the
same coin--a technique to maximize network utilization.
Over-provisioning is fast acting but uncontrolled and
auto-bandwidth is slow acting but offers some degree of control.
Detours are likely to be CoS only, but extreme SLAs may demand
strict TE. Traffic identification is simpler in this case as it
will depend on Endpoint or Network Service association. Some
traffic from subscribers with SLAs may get mapped to voice or video
classes and will therefore be subject to similar classification
issues.
[0160] Internet Data Best Effort is the class of traffic that fits
into all the space not occupied by the other classes of traffic. It
is most resilient to drops, latency and jitter and is very elastic.
It can be over-provisioned and is a good candidate for
auto-bandwidth. Detours should be used just for the sake of service
continuity. Since all traffic is still best effort the effect of
traffic on the detour path should be negligible. By default, all
traffic not otherwise classified falls into the best effort
class.
[0161] The following table summarizes the service classes and their
configuration for PCE parameters, auto-bandwidth and general
network CoS parameters. Example service classes are defined in
TABLE 1.
TABLE 1

Service Class    Queue     Queue Depth  Scheduling  Over-Provisioning  Detour       Path Failure  Auto BW
                 Depth     Priority     Discipline  Factor             Type         Mode
Voice            20 msec   High         Strict      1                  Strict-TE    PPR           no
Video            150 msec  Medium       DWRR        2                  CoS-only     PPR           (0, max-video)
Internet Data    150 msec  Medium       DWRR        5                  CoS-only     PPR           (SLA min, SLA max)
  with SLA
Internet Data    100 msec  Low          DWRR        20                 Best-effort  PPR           (0, EP BW)
  Best Effort
[0162] FIG. 5 is a block diagram illustrating an example network
device 300 in accordance with the techniques of this disclosure.
Network device 300 may, for example, represent any of aggregation
nodes 19 or access nodes 28, 36 of FIG. 1, or aggregation nodes 52
or access nodes 62 of FIG. 2. For example, network device 300 may
be an access node that operates at the borders of the network and,
responsive to receiving provisioning messages from the controller,
applies network services including policy provisioning, policing
and network connectivity to the network packets received from the
subscriber devices. Network device 300 may reside at a border of an
aggregation network, and operate as an endpoint for pseudo wires to
map subscriber traffic into and out of the pseudo wires, for
example.
[0163] In the example of FIG. 5, network device 300 includes a
control unit 302 that comprises data plane 301 and control plane
303. Data plane 301 includes forwarding component 304. In addition,
network device 300 includes a set of interface cards (IFCs)
320A-320N (collectively, "IFCs 320") for communicating packets via
inbound links 322A-322N (collectively, "inbound links 322") and
outbound links 324A-324N (collectively, "outbound links 324").
Network device 300 may also include a switch fabric (not shown)
that couples IFCs 320 and forwarding component 304.
[0164] Network device 300 executes an MPLS-OCC module 306 that
operates in accordance with a control protocol as described herein, referred to as the MPLS-OCC protocol. For example, network
device 300 may send or receive any of the messages described
herein.
[0165] In some examples, MPLS-OCC module 306 outputs a Hello message on each interface and/or link. Each of the Hello messages
includes an identifier that is unique to network device 300 (e.g.,
an aggregation node or access node) that sent the Hello message and
the interface on which the Hello message was sent. The Hello
messages may also indicate a distance from the sending node to the
controller (e.g., in number of hops). In accordance with the
MPLS-OCC protocol, network device 300 also outputs a Hello Reply
message on each interface on which a Hello message was received.
The Hello Reply messages may also indicate a distance from the
sending node to the controller (e.g., in number of hops). MPLS-OCC
module 306 maintains a neighbor node list 310 that identifies
neighboring nodes from which network device 300 received Hello
messages and the interfaces on which the Hello messages were
received.
[0166] Responsive to receiving Hello Reply messages on a link,
network device 300 declares the link as an active link and adds the
neighboring node to the neighbor node list 310. MPLS-OCC module 306
determines, based on the received Hello and Hello Reply messages,
which active link has the shortest distance to the controller.
[0167] Network device 300 may output, on the active link having the
shortest distance to the controller, a Discover message that
specifies the neighbor node list identifying neighboring nodes and
interfaces on which neighboring access nodes and aggregation nodes
are reachable from network device 300. The Discover message also
includes an intermediate node list that indicates layer two
addresses and ingress/egress ports that the Discover message has
traversed so far from an originating node.
[0168] In addition, upon receiving a Discover message from a
neighboring node and determining that the Discover message does not
include a layer two address for network device 300, MPLS-OCC module
306 updates the intermediate node list of the Discover message to
add its own layer two address and ingress/egress port information.
MPLS-OCC module 306 may also store such information to intermediate
node list 312 ("IM node list"). After updating the Discover
message, MPLS-OCC module 306 forwards the updated Discover message
on the active link having the shortest distance to the controller.
Devices along the path from network device 300 to the controller
similarly forward the Discover message along the path to the
controller, updating the intermediate node list along the way. The
controller, upon receiving the Discover message, establishes a
Source Routed Tunnel (SRT) control channel with network device 300,
based on the intermediate node list specified by the Discover
message. Network device 300 executes the MPLS-OCC module 306
without executing an Interior Gateway Protocol (IGP) within a
control plane 303 of network device 300.
[0169] Network device 300 may receive from the controller a
Discover Reply message indicating that the controller has acknowledged receipt of the Discover message. The Discover Reply
is sent via the SRT indicating that there is an MPLS label stack on
the packet from the controller that corresponds to the source
routed egress interface list used to route the packet. After
receiving the Discover Reply message, network device 300 may
periodically send Keepalive messages to the controller to maintain
liveness of the SRT, and receive Keepalive reply messages in
response. Responsive to determining that no Keepalive Reply is
received from centralized controller network device within a time
period, network device 300 may generate a new Discover message with
a new generation number to force acceptance at the controller of a
new SRT control channel.
[0170] The centralized controller computes topology information for
the network and computes the forwarding information for the data
channels (e.g., pseudo wires) in accordance with discovery messages
that are received from nodes in the network. Network device 300 may
receive, from the controller and via the respective SRT control
channels, a message that specifies the forwarding information
computed by the centralized controller for configuring forwarding
component 304 of network device 300 to forward the network packets.
In some examples, the pre-computed forwarding information comprises
directed FIB state including one or more MPLS labels for network
device 300 to use for sending packets on an LSP. In some examples,
the directed FIB state includes policers to police ingress traffic
for the LSP according to the computed bandwidth. Based on the
forwarding information, the centralized controller may also compute
one or more backup LSPs for the network, and output one or more
messages to network device 300 to communicate and install, within
network device 300, forwarding information for the backup LSPs.
[0171] Network device 300 may store the received forwarding
information for the LSPs and the backup LSPs to L-FIB 316 and/or
FIB 314, for example. Based on forwarding information base (FIB)
314 and labeled FIB (L-FIB) 316, forwarding component 304 forwards
packets received from inbound links 322 to outbound links 324 that
correspond to next hops associated with destinations of the
packets. In response to a network event, forwarding component 304
may reroute at least a portion of the network packets along the
backup LSP. The network event may be, for example, a link or node
failure. The controller may also compute detour LSPs to handle fast
reroute for any interior node failure.
[0172] In some examples, network device 300 may send a port
attributes indication message to describe its port attributes to
the controller, such as maximum bandwidth or port type, and the
centralized controller computes the forwarding information for the
LSPs based at least in part on quality of service (QoS) metrics
requested for the LSPs and the port attributes received in the port
attributes indication message.
[0173] In this manner, network device 300 has a reduced control
plane 303 that does not execute a conventional Multi-protocol Label
Switching (MPLS) protocol (e.g., LDP or RSVP) for allocation and
distribution of labels for the LSPs and does not execute a routing
protocol such as an interior gateway protocol (IGP). Instead,
network device 300 executes the MPLS-OCC module 306 to receive MPLS
forwarding information directly from a central controller (e.g.,
controller 35A of FIG. 1), without requiring conventional MPLS
signaling using a label distribution protocol such as LDP or RSVP.
The centralized controller network device provides a centralized,
cloud-based control plane to configure the plurality of aggregation
nodes and access nodes to effectively operate as an MPLS switching
fabric to provide transport LSPs and pseudo wires between the edge
nodes and the access nodes for transport of subscriber traffic. In
various examples, the messages exchanged between network device 300
and the centralized controller may conform to any of the message
formats described herein.
[0174] In one embodiment, forwarding component 304 may comprise one
or more dedicated processors, hardware, and/or computer-readable
media storing instructions to perform the techniques described
herein. The architecture of network device 300 illustrated in FIG.
5 is shown for example purposes only. In other embodiments, network
device 300 may be configured in a variety of ways. In one
embodiment, for example, control unit 302 and its corresponding
functionality may be distributed within IFCs 320.
[0175] Control unit 302 may be implemented solely in software, or
hardware, or may be implemented as a combination of software,
hardware, or firmware. For example, control unit 302 may include
one or more processors which execute software instructions. In that
case, the various software modules of control unit 302 may comprise
executable instructions stored on a computer-readable medium, such
as computer memory or hard disk.
[0176] Various example Control Packet Formats will now be
described. With reference to FIG. 1, these control packets may be
exchanged, for example, between controller 35A and access nodes
28, 26, or between controller 35A and aggregation nodes 19.
[0177] In one example embodiment, the control packets have the
structure illustrated in FIG. 6. FIG. 6 is a block diagram
illustrating an example OCC Control Packet structure 400 according
to the techniques of this disclosure. In the example of FIG. 6, OCC
control packet structure 400 includes an Ethernet Header 402, an
OCC message header 404, and an OCC message payload 406.
[0178] The Ethernet Header 402 is a standard Ethernet II header.
The Ethernet header 402 is used so that the OCC Control plane can
be run natively over standard Ethernet interfaces. If other
physical or logical interfaces are used, the only requirement
placed on those interfaces is that they can transport an Ethernet
frame. Generally, the Source Address is the MAC address of the
sending node and the Destination Address is the MAC address of the
receiving node or all Fs in the case of Hello and Discover
messages. The Ether type may be, for example, 0xA000.
[0179] The OCC Message Header 404 includes the message type. The
OCC Message Payload 406 is the payload for the specified message
type.
[0180] FIG. 7 is a block diagram illustrating an example OCC
Message Header 404 in further detail. The OCC Message Header is
used to identify the OCC message type. The OCC Message Header may
have the following structure. The OCC Message Header includes a
Vers field that specifies the version number of the protocol. This
document defines protocol version 1. The OCC Message Header
includes a Rsrvd field. This field is reserved. In some examples,
this is set to zero on transmission and ignored on receipt. A
Message Type field specifies the OCC message type. OCC message
types are described below. A Message Length field specifies the
length of the Message Payload.
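By way of non-normative illustration, the following Python sketch serializes and parses this header, assuming (since the figure's field widths are not reproduced here) a one-octet Vers, a one-octet Rsrvd, a two-octet Message Type, and a two-octet Message Length, all in network byte order:

    import struct

    OCC_HDR = struct.Struct("!BBHH")  # Vers, Rsrvd, Message Type, Message Length

    def pack_occ_header(msg_type: int, payload: bytes, version: int = 1) -> bytes:
        # Rsrvd is set to zero on transmission and ignored on receipt.
        return OCC_HDR.pack(version, 0, msg_type, len(payload)) + payload

    def unpack_occ_header(frame: bytes):
        vers, _rsrvd, msg_type, length = OCC_HDR.unpack_from(frame)
        payload = frame[OCC_HDR.size:OCC_HDR.size + length]
        return vers, msg_type, payload

    hdr = pack_occ_header(1, b"")  # e.g., message type 1: Hello
    assert unpack_occ_header(hdr) == (1, 1, b"")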
[0181] FIG. 8 is a block diagram illustrating an example base
packet structure 410 for SRT packets.
[0182] The Outer Ethernet Header 412 is a standard Ethernet II
header, which is used to support the Ethernet encapsulation of the
MPLS label stack 414. The Source and Destination Addresses are the
source and destination MAC addresses of the nodes at either end of
the link.
[0183] The MPLS Label Stack 414 is of two varieties depending on
whether the packet is from the controller or to the controller.
When the packet is from the controller, the label stack includes
labels that correspond to the egress interfaces for the nodes
receiving the packet. Such labels were discovered when the Discover
message was sent from the node to the controller. For packets from
the node to the controller, there is only one element on the MPLS
label stack 414. This MPLS header encodes the special label which
means to send the packet to the controller. By virtue of the Hello
packets discussed earlier, each node knows the least cost next-hop
to the controller and installs this as the next-hop for the
TO_CONTROLLER label value. The value of TO_CONTROLLER is 17.
[0184] A label value of (Port Index+18) maps to egress port Port
Index. The implicit label operation is Pop. All labels in the range
of 18 through 255+18 are implicitly allocated to support source
routing.
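In other words, the controller can derive a source-route label stack directly from the egress port of each hop. A short sketch in Python of this label arithmetic, with TO_CONTROLLER and the Port Index base taken from the values above:

    TO_CONTROLLER = 17    # special label: forward toward the controller
    PORT_LABEL_BASE = 18  # label (Port Index + 18) maps to egress port Port Index

    def port_to_label(port_index: int) -> int:
        return PORT_LABEL_BASE + port_index

    def label_to_port(label: int) -> int:
        assert PORT_LABEL_BASE <= label <= PORT_LABEL_BASE + 255
        return label - PORT_LABEL_BASE

    def srt_label_stack(egress_ports: list) -> list:
        """Build the controller-to-node label stack: one label per hop,
        each implicitly popped as the packet exits that hop."""
        return [port_to_label(p) for p in egress_ports]

    assert srt_label_stack([0, 5, 2]) == [18, 23, 20]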
[0185] Packet structure 410 also includes the OCC Control Packet
400, as described with respect to FIG. 6.
[0186] MPLS-OCC Control messages are organized into three message
sub types. Types having common elements are collected into a sub
type. Topology and Control Channel messages are used to establish
the control channel and describe the topology. There is no common
element set associated with these messages. These messages are all
sent directly across links to immediate neighbors except for
Discover Reply, Keepalive and Keepalive Reply which are sent via
the SRT. Topology and Control Channel messages can include the
following types:
1: Hello
2: Hello Reply
3: Discover
4: Discover Reply
5: Keepalive
6: Keepalive Reply
7: SRT Down
[0187] FIG. 9 is a block diagram illustrating an example Node
Indication Header Structure 520. Node Indication messages are used
by nodes to indicate their state to the controller, and get a
response back. Node Indication messages are sent via the SRT
channel. Each message payload is preceded by a common element
header with the following structure:
[0188] The Node Indication Header 520 includes a Sequence Number
field that specifies the sequence number of the Indication or
Confirmation. Each message refers to an object and each object has
a version represented by the sequence number. Since MPLS-OCC
control is run over an unreliable datagram network, the sequence
number ensures consistent state between the controller and the
node. Receivers must ignore messages with sequence numbers less
than their version of the object's current sequence number. The
sequence number is also used to correlate the Indication with the
Confirmation. Each message has a key element or elements that
identify the object associated with the message. The sequence
number has no meaning across message types or within messages of
the same type but having different key element values. Sequence
Numbers may use Serial Number Arithmetic, such as described in R.
Elz, "Serial Number Arithmetic," Network Working Group RFC 1982,
August 1996.
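For illustration, a minimal Python sketch of this comparison rule, assuming 32-bit sequence numbers (the field width is an assumption here, not restated by the message definition):

    SERIAL_BITS = 32
    HALF = 1 << (SERIAL_BITS - 1)
    MOD = 1 << SERIAL_BITS

    def serial_lt(a: int, b: int) -> bool:
        """RFC 1982 'less than' for serial numbers a and b."""
        return (a < b and (b - a) < HALF) or (a > b and (a - b) > HALF)

    def should_ignore(msg_seq: int, current_seq: int) -> bool:
        # Receivers must ignore messages whose sequence number is less
        # than their version of the object's current sequence number.
        return serial_lt(msg_seq, current_seq)

    assert should_ignore(5, 10)
    assert not should_ignore((10 + 1) % MOD, 10)  # a newer update is accepted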
[0189] An Operation field specifies the operation being performed
on the object. The operation may include SET or CLEAR. A SET
operation may create or modify the specified object in accordance
with the remainder of the message. A CLEAR operation may clear the
specified object.
[0190] A Status field specifies the status code, which is set to 0
for all Indication messages and set to the message specific value
on Confirmation. See the individual messages for details. A Key
Length field specifies the length of the type specific key data.
This information is used to correlate the object being operated on
between the controller and the node. All Node Indication messages
include their key data in the first Key Length bytes of their
message structures. The Node Indication Header includes a Reserved
field. This field is reserved. It is set to zero on transmission
and ignored on receipt. Certain messages may use the Node
Indication Header, including:
101: Port Attributes Indication
102: Port Attributes Confirmation
103: Capabilities Indication
104: Capabilities Confirmation
105: Services Indication
106: Services Confirmation
107: Endpoint Indication
108: Endpoint Confirmation
[0191] FIG. 10 is a block diagram illustrating an example Node
Configuration Header Structure 416. Node Configuration messages are
generated by the controller to configure the target node, and to
get responses back from the nodes. They are sent via the SRT
channel. The Node Configuration Header includes a Sequence Number
field that is used to correlate the Request with the Response. The
general rules for sequence numbers are the same as those described
for Node Indication Messages.
[0192] An Operation field specifies the operation being performed
on the object. The operation may include SET or CLEAR. A SET
operation may create or modify the specified object in accordance
with the remainder of the message. A CLEAR operation may clear the
specified object. A Status field includes a status code that is set
to 0 for all Request messages and set to the message specific value
on Responses. See the individual messages for details. A Key Length
field specifies the length of the type specific key data. This
information is used to correlate the object being operated on
between the controller and the node. All Node Configuration
messages include their key data in the first Key Length bytes of
their message structures.
[0193] The Node Configuration Header includes a Reserved field.
This field is reserved. It is set to zero on transmission and
ignored on receipt. The Node Configuration Header may be used with
certain messages, including:
201: MPLS FIB Request
202: MPLS FIB Response
203: Policer Request
204: Policer Response
205: CoS Request
206: CoS Response
207: Filter Request
208: Filter Response
209: Pseudo Wire Request
210: Pseudo Wire Response
211: Direct Switch Request
212: Direct Switch Response
213: MAC FIB Request
214: MAC FIB Response
[0194] FIG. 11 is a block diagram illustrating an example TLV
Structure 418. Some of the messages encode their attributes as TLV
(type, length, value) triples. TLVs are used generally to support
message attributes of variable length. They also ease message
extensions and can be used to support vendor specific attributes. A
TLV structure may include a Type field that specifies the TLV Type.
In some examples, a TLV of type 0 is used for vendor specific TLVs.
In some examples, the TLV type may be a message specific value
between 1 and 65535. See the message specific section for usage.
The Length field defines the length of the value portion in octets
(thus a TLV with no value portion would have a length of zero). A
Value field specifies the contents of the TLV. See the specific TLV
description for more information.
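A non-normative Python sketch of TLV encoding and decoding, assuming two-octet Type and Length fields in network byte order (the Type range of 1 to 65535 above suggests, but does not mandate, 16-bit fields):

    import struct

    def encode_tlv(tlv_type: int, value: bytes) -> bytes:
        # Length counts only the value portion, in octets.
        return struct.pack("!HH", tlv_type, len(value)) + value

    def decode_tlvs(buf: bytes):
        off = 0
        while off < len(buf):
            t, length = struct.unpack_from("!HH", buf, off)
            off += 4
            yield t, buf[off:off + length]
            off += length

    blob = encode_tlv(1, b"\x00" * 8) + encode_tlv(0, b"vendor")  # type 0: vendor specific
    assert [t for t, _ in decode_tlvs(blob)] == [1, 0]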
[0195] FIG. 12 is a block diagram illustrating an example Vendor
Specific TLV Structure 420. A vendor specific TLV is used by
vendors to extend messages with vendor specific attributes. The
Value of the TLV has the following structure. A Vendor OID field
specifies a Vendor Organization Identifier, including a unique
identifier for the vendor as obtained from the IEEE. A Value field
specifies the Vendor specific value. This may be of any length and
encoding as chosen by the vendor.
[0196] FIG. 13 is a block diagram illustrating an example Hello
Message Structure 422. The Hello message is a link-local broadcast
message used to discover a neighbor across a point to point link. A
node sends a Hello message periodically on all of its ports at a
rate chosen by the sending node. The Hello message is used for both
discovery and to determine the liveness of the link. The Source
Address of the Ethernet header is the MAC address of the sending
node and the Destination Address is all Fs. The Hello Message may
include a Port Index field. The port index is local to the sender.
Ports are indexed from 0 to 0xFE. The Port Index of 0xFF is
reserved for the control plane of the node. Therefore OCC nodes may
be restricted to 255 interfaces. In some examples, this may be a 16
or 32 bit index.
[0197] The Hello Message may include a Hop Count field that
indicates the number of hops the sender is from the controller.
Edge nodes set the hop count to 1 if they have a connection to the
controller. Non-edge nodes select the least cost Hop Count from
their neighbors and increment by 1 and transmit that hop count in
their messages. When a node sends a Hello message to the node it is
using as its TO_CONTROLLER nexthop, it sets the Hop Count to 0xFF
to ensure that the receiving node will not immediately attempt to
use the dependent node as an SRT path. When a node reboots, the
first Hello message also carries a Hop Count of 0xFF.
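The hop-count selection logic can be sketched as follows in Python; the function and variable names are hypothetical:

    UNREACHABLE = 0xFF  # advertised after reboot or toward the dependent next hop

    def my_hop_count(is_edge: bool, connected_to_controller: bool,
                     neighbor_hop_counts: dict) -> int:
        """Hop count this node advertises in its Hello messages."""
        if is_edge and connected_to_controller:
            return 1
        usable = [h for h in neighbor_hop_counts.values() if h != UNREACHABLE]
        return min(usable) + 1 if usable else UNREACHABLE

    def advertised_to(neighbor: str, to_controller_nexthop: str, base: int) -> int:
        # Toward the node used as the TO_CONTROLLER next hop, advertise 0xFF
        # so the receiver never selects this dependent node as its SRT path.
        return UNREACHABLE if neighbor == to_controller_nexthop else base

    counts = {"n1": 2, "n2": UNREACHABLE, "n3": 4}
    base = my_hop_count(False, False, counts)  # -> 3 (least cost via n1, plus 1)
    assert base == 3
    assert advertised_to("n1", "n1", base) == UNREACHABLE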
[0198] FIG. 14 is a block diagram illustrating an example Hello
Reply Message Structure 416. The Hello Reply message is a unicast
message used to reply to a Hello message. When a Hello Reply
message is received on a port, the link is set to active by the
receiver. The sender of the Hello Reply message MUST set the Source
Address of the Ethernet Header to its own MAC address, and the
Destination Address of the Ethernet Header to the Source Address of
the corresponding Hello message. The Hello Reply message is sent on
the same port from which the Hello message was received. The Hello
Reply message may include a Port Index field. The port index is
local to the sender. Upon receipt of this message, the receiving
node can unambiguously describe the link between itself and its
neighbor in terms of local and remote node identifiers (MAC
addresses) and local and remote port indexes. The Hello Reply
message may include a Hop Count field. The Hop Count field may be
substantially similar to the Hop Count field for the Hello
Message.
[0199] FIG. 15 is a block diagram illustrating an example Discover
Message Structure 418. The Discover message is generated by a node
when its neighbor list changes, when its currently active SRT times
out or when the least cost next-hop to the controller changes. In
all cases, a new generation number is generated. The Discover
message is periodically sent until its corresponding Discover Reply
is received.
[0200] A Discover Message may include an Instance field. The
Instance ID is a unique number for the instance of the node. A node
should generate a new instance ID each time it reboots or otherwise
resets its software state. The instance ID is used to disambiguate
a Discover message with the same generation number between resets.
The instance ID may be a random number or a monotonically
increasing integer for nodes having some ability to store
information between reboots.
[0201] A Discover Message may include a Generation Number field.
The Generation Number is a monotonically increasing number. The
controller ignores any Discover message with a generation number
less than the most recently received generation number (unless the
Instance changes). A Discover Message may include a Reserved field.
This field is reserved. It is set to zero on transmission and
ignored on receipt. A Discover Message may include a State bit
field ("S"). This bit is set to 1 if the node has state as
programmed by a controller. This bit is used by the controller to
synchronize state between the controller and the node.
Specifically, if the node has state but the controller does not
have any state for the node, the controller should request that the
node reset all of its state via the R flag of the Discovery Reply
message.
[0202] A Discover Message may include an Intermediate Node List
(INL) Start field that specifies the offset from the beginning of
the OCC Message Payload to the start of the Intermediate Node List.
This offset is required since the Neighbor Node List is variable in
length. A Discover Message may include an INL End field that
specifies the offset from the beginning of the OCC Message Payload
to the end of the Intermediate Node List. A Discover Message may
include a Neighbor Node List field that specifies the list of
neighbors associated with this node. Each element in the list
includes the neighbor's MAC address, the local port index on which
a Hello Reply message was received and the neighbor's port index as
indicated in the Hello Reply message.
[0203] A Discover Message may include an Intermediate Node List
field that specifies the MAC addresses and their corresponding
ingress and egress ports through which this packet traversed en
route from the originating node to the terminating edge node (EN)
inclusive.
[0204] FIG. 16 is a block diagram illustrating an example Neighbor
Node List Element Structure 420. A Neighbor MAC Address field
specifies the MAC address of the neighbor as reported in the
Ethernet Source Address of the Hello Reply Message. A Local Port
field specifies the local port index over which the Hello Reply was
received. A Remote Port field specifies the remote port index as
reported in the Port Index of the Hello Reply packet.
[0205] FIG. 17 is a block diagram illustrating an example
Intermediate Node List Element Structure 422. An Intermediate MAC
Address field specifies the MAC address of a node that received the
Discover message and sent the packet toward the controller. An
Ingress Port field specifies the index of the port on which the
packet was received. An Egress Port field specifies the index of
the port on which the packet was sent. This is also the port on the
least cost path to the controller.
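For illustration, a Python sketch that packs an Intermediate Node List element and appends it to a Discover payload as the message is forwarded toward the controller, assuming one-octet port indexes (the Hello message description above notes the index may alternatively be 16 or 32 bits):

    import struct

    def pack_inl_element(mac: bytes, ingress_port: int, egress_port: int) -> bytes:
        assert len(mac) == 6
        return mac + struct.pack("!BB", ingress_port, egress_port)

    def append_to_inl(discover_payload: bytearray, mac: bytes,
                      ingress: int, egress: int) -> None:
        # Each node forwarding the Discover toward the controller appends
        # its own element before sending on its least-cost egress port.
        discover_payload.extend(pack_inl_element(mac, ingress, egress))

    inl = bytearray()
    append_to_inl(inl, bytes.fromhex("001122334455"), ingress=2, egress=7)
    assert len(inl) == 8  # 6-byte MAC + ingress + egress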
[0206] FIG. 18 is a block diagram illustrating an example Discover
Reply Message Structure 424. The Discover Reply message is sent by
the controller to acknowledge receipt of the Discover message. The
Discover Reply is sent via an SRT indicating that there is an MPLS
label stack on the packet from the controller that corresponds to the
source routed egress interface list used to route the packet. A
Discover Reply message may include a Generation Number field. The
Generation number is used to correlate the Discover Reply with the
original Discover message. If the Generation number does not match
the current generation number, the node discards the message. If
they do match, the node initiates keepalive processing on the
shortest path to the controller. A Discover Reply message may
include a Reserved field. This field is reserved. It is set to zero
on transmission and ignored on receipt.
[0207] A Discover Reply message may include a Reset bit ("R")
field. This bit is set when the node indicates it has programmed
forwarding state via the S bit of the Discover message but the
controller has no state for the node. In this case the controller
sets the R bit to force the node to reset all of its state and to
generate another Discover packet.
[0208] A Discover Reply message may include an Age Time field. Age
time is used to indicate to the node that the controller needs to
synchronize its state with the node's state. When Age Time is
non-zero, the node marks all of its state as "dirty." When Age
Time, measured in seconds, expires, all state with the "dirty" bit
set is cleared. In the meantime, the controller is expected to
replay all the state that had been previously pushed to the node.
When state is pushed into the node, the "dirty" bit is cleared.
When Age Time is nonzero, the node resends all Node
Indication Messages.
[0209] A Discover Reply message may include a Ctrl IP Version field
that specifies the IP Version of the controller IP address, which
implies its length. A Discover Reply message may include a
Controller IP Address field that specifies the IPv4 or IPv6 address
of the controller. This is the address the node should use to
establish a TCP/IP control channel with the controller. It may also
be used for other networking functions that are outside the scope
of this specification.
[0210] The Keepalive message is used to maintain liveness of an
SRT. It is periodically sent by a node after it has received a
Discover Reply for the current generation number. The Keepalive
message is sent via the SRT using the TO_CONTROLLER label from a
node to the controller. The Keepalive message has no additional
content. The Keepalive Reply message is sent by the controller upon
receiving a Keepalive message. The Keepalive Reply message has no
content. The Keepalive Reply message is sent via the SRT to the
sender of the corresponding Keepalive message.
[0211] FIG. 19 is a block diagram illustrating an example SRT Down
Message Structure 426. The SRT Down message is used to indicate to
a sending node that the SRT over which it has sent a packet has
broken. This message provides immediate feedback to the sender that
the SRT is down. With this indication, the sender does not have to
wait for a keepalive timeout before taking corrective action.
[0212] When an NN receives an SRT Down message, it increments its
generation number and generates a new Discover message to establish
a new SRT with the controller. When a controller receives an SRT
Down message, it modifies its state for the affected node such that
the next Discover message from the node with a generation number
equal to or greater than the current generation number is
immediately accepted. That is, in some examples the controller
compares the generation number specified by the Discover message to
a current generation number received from the access node, and
updates its stored network topology information if the generation
number specified by the Discover message is greater than or equal
to the current generation number. This avoids
the condition where a Discover Reply for a given generation number
is not able to follow the SRT specified and all Discovers from the
node would be ignored since they specify a different INL.
[0213] When a controller receives an SRT Down message it may choose
to recompute a new SRT to the node based on its local knowledge of
the topology. Such a computation may shorten the required time to
repair the control channel to the node in question. It also opens
the opportunity for the controller to traffic engineer SRT paths to
the nodes in the network.
[0214] The node detecting the SRT Down constructs the SRT Down
message according to the following procedure: Construct an Ethernet
header where the Source Address is set to the MAC address of the
node sending the SRT Down message. The Destination Address is set
to the Source Address of the original packet. Add a new Message Header
with OCC message type SRT Down and set the reason code. Append the
first 256 bytes of the original packet including the Ethernet
Header, the Message Header and the Message Payload. If the original
packet is from the controller, send the packet along the
TO_CONTROLLER path. Specifically encapsulate the packet with a
single MPLS header with the TO_CONTROLLER label. If the packet is
from a node and the node is a direct neighbor, send the packet to
the direct neighbor. If the packet is from any other node, the only
choice is to drop the packet since the path from the sender is not
known.
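A non-normative Python sketch of this construction procedure follows; the header layout and the two-octet reason-code width are assumptions carried over from the earlier sketches:

    import struct

    TO_CONTROLLER = 17

    def mpls_header(label: int, ttl: int = 255) -> bytes:
        # Standard 4-byte MPLS shim: 20-bit label, S bit set (bottom of stack), TTL.
        return struct.pack("!I", (label << 12) | (1 << 8) | ttl)

    def occ_header(msg_type: int, payload: bytes) -> bytes:
        # Same assumed 6-octet header layout as the earlier sketch.
        return struct.pack("!BBHH", 1, 0, msg_type, len(payload)) + payload

    def build_srt_down(my_mac: bytes, original_frame: bytes, reason: int,
                       from_controller: bool) -> bytes:
        dst = original_frame[6:12]            # original frame's Source Address
        eth = dst + my_mac + b"\xA0\x00"      # Ethernet header, Ether type 0xA000
        payload = struct.pack("!H", reason) + original_frame[:256]
        occ = occ_header(7, payload)          # message type 7: SRT Down
        if from_controller:
            # Packet came from the controller: send it back along the
            # TO_CONTROLLER path under a single MPLS header.
            return eth + mpls_header(TO_CONTROLLER) + occ
        return eth + occ                      # direct-neighbor case

    demo = build_srt_down(b"\xaa" * 6, b"\x02" * 14, reason=1,
                          from_controller=True)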
[0215] An SRT Down Message may include an SRT Down Reason Code
field that specifies the reason the SRT went down. Possible choices
include:
0: Reserved
1: Neighbor at egress port is down
2: Invalid egress port for this node
4-65535: Reserved
[0217] FIG. 20 is a block diagram illustrating an example Port
Attributes Indication Message Structure 428. The Port Attributes
Indication message is sent by a node to describe its port
attributes to the controller. Port attributes only describe locally
discoverable characteristics of ports such as maximum bandwidth or
port type. Logical characteristics of ports such as port coloring
or metric values are not something a node describes but could be
something associated with the port at the controller via mechanisms
such as configuration. Generally, port attributes are attributes
that affect traffic engineering calculations.
[0218] A Port Attributes Indication Message may include a Port
Index field that specifies the port index for which the attributes
apply. The port index represents the key element for the message. A
Port Attributes Indication Message may include a Reserved field.
This field is reserved. It is set to zero on transmission and
ignored on receipt. A Port Attributes Indication Message may
include a Port Attribute TLVs field that includes a list of
byte-packed Port Attribute TLVs. Note that the MPLS-OCC message header
length is used to calculate the length of this element. Port
Attributes are encoded as TLVs to support extensibility. Port
Attribute fields are described using a set of Type/Length/Value
triplets as described above. The following Types are used for the
Port Attribute TLVs.
1: Port Bandwidth
2: Shared Resource Group
3: Shared Fate Group
4: Expected Transmission Time
5-65535: Reserved
[0219] The Port Bandwidth Attribute may include a 64 bit value in
bits per second for the port bandwidth. The Shared Resource Group
Port Attribute may include a structured 64-bit value that ties this
port to a set of other ports on the same node. The Shared Fate Group
Port Attribute may include a 32 bit integer that represents the
shared fate group for the specified port. The Expected Transmission
Time Port Attribute may include the time expected to transmit a
packet of 1K bytes across the port. Time is measured in
microseconds and is encoded as a 32 bit unsigned integer.
[0220] FIG. 21 is a block diagram illustrating an example Shared
Resource Group Structure 430. The Shared Resource Group Port
Attribute may include a structured 64 bit value that ties this port
to a set of other ports on the same node. Ports having the same
Shared Resource Group ID share bandwidth resources between
themselves. This value could be sent via the Hello message to the
node on the other end of the link so that the Shared Resource Group
is globally understood by the controller. A Local Group ID field
specifies a group ID local to the node. A Node MAC field specifies
the MAC address of the node generating the group ID. This construct
ensures that the group ID is globally unique.
[0221] FIG. 22 is a block diagram illustrating an example Port
Attributes Confirmation Message Structure 432. The Port Attributes
Confirmation message is sent by a controller to confirm the Port
Attributes Indication message. The Port Attributes Confirmation
Message may include a Port Index field that specifies the port
index for which the attributes apply. The port index represents the
key element for the message. The Port Attributes Confirmation
Message may include a Reserved field. This field is reserved. It is
set to zero on transmission and ignored on receipt.
[0222] FIG. 23 is a block diagram illustrating an example
Capabilities Indication Message Structure 434. The Capabilities
Indication message is used to signal to the controller the current
capabilities of the sending node. There is no key element
associated with this message since it is global to the node. A CoS
Scheduling Discipline field specifies a bit mask of supported CoS
Scheduling Disciplines. Possible values include:
0x0001: Deficit Weighted Round Robin
0x0002: Strict Priority
0x0004: Strict Priority Restricted
[0223] All other values are reserved and are set to zero on
transmission and ignored on receipt. A Per Port Queue Depth field
specifies the number of bytes of memory available for packet
queuing per port. A Number of Policer Instances field specifies the
number of policer instances that are supported. A Number of
Firewall Filter Rules field specifies the number of firewall filter
rules supported. A Number of MPLS Forwarding Entries field
specifies the number of MPLS forwarding entries supported. A Size
of MAC Table field specifies the number of MAC table entries
supported. A Max Label Stack Push field specifies the maximum
number of labels that may be pushed onto a packet. A Max Label
Stack Transit field specifies the maximum number of labels on the
label stack that can transit the node.
[0224] Capability TLVs are included to allow for variable length
capabilities, extensions and vendor specific attributes. The
following types are specified.
1: Vendor Name. A UTF-8 non-NULL terminated text string describing the vendor name.
2: Model. A UTF-8 non-NULL terminated text string containing the vendor specific model name or number.
3: Serial Number. A UTF-8 non-NULL terminated text string containing the node's serial number.
4-65535: Reserved
[0225] The Capabilities Confirmation is sent by the controller to
confirm receipt of capabilities from the node. It has no additional
content beyond the common headers.
[0226] FIG. 24 is a block diagram illustrating an example Services
Indication Message Structure 436. The Services Indication message
is used to signal to the controller the network services available
at the specified node. Generally the Services Indication message is
sent from an edge node. The Services Indication message is a byte
packed sequence of Service Name, Type, and Affinity. The Service
Name is the key element and the key length is specified in the Node
Indication Header. A Service Name field specifies the UTF-8 encoded
network service name and may be NULL terminated. The key_len field
of the Node Indication Header specifies the length of the string,
including the NULL character. The Service Name field is not padded
at the end. A Type
field specifies the type of service indicated. Service types
include:
Subnet (0): An IP subnet.
VPLS (1): A VPLS or E-VPN instance.
L3VPN (2): A layer 3 VPN service.
[0227] An Affinity field specifies a relative measure of how
strongly the controller should favor the indicating node for the
given service against other nodes indicating the same service. The
affinity value may be set up by a network management system, which
has a global view of the network beyond the OCC subsystem and is
used to configure the devices and services in the OCC subsystem
part of the network. The Services Confirmation message is sent by the
controller to confirm receipt of network services from the node. It
has no additional content beyond the common headers.
[0228] FIG. 25 is a block diagram illustrating an example Endpoint
Indication Structure 438. The Endpoint Indication message is sent
by a node to the controller to signal an endpoint status change. A
Type field specifies the type of endpoint. Types include:
Port-based (1): Port-based endpoint. Subscriber MAC is ignored.
MAC-based (2): MAC-based endpoint. Subscriber MAC is part of the
key.
[0229] A Port field specifies the port number. A Subscriber MAC
field specifies the optional MAC address of the endpoint, used when
specifying MAC-based endpoints. A Status field specifies the status
of the endpoint, and may take the following values: Up (1):
Endpoint is up. Down (2): Endpoint is down. The Endpoint
Confirmation message is sent by the controller to signal receipt of
an Endpoint Indication message from the node.
[0230] FIG. 26 is a block diagram illustrating an example MPLS FIB
Request Message Structure 440. The MPLS FIB Request message is
generated to download the pre-computed Label Information Base to an
individual network node in the OCC domain.
[0231] Upon receiving the message, the control plane software on
the network node parses the label configuration information and
programs the MPLS forwarding table. The key element for the FIB
entry is the concatenation of Path ID and Label. On ingress LERs
the Incoming Label is always set to 0. On LSRs, the Path ID need
not be used to uniquely identify the entry since a given label is
never used for more than one path. Note that optional words (A) and
(B) (see below) are only required for SET operations. They may be
omitted from CLEAR operations. Optional words (C) and (D) are
present if required by detour operations (see D and E bit
descriptions below). A Path ID field includes a 32-bit identifier
for this path and may occupy the first 32 bits of the key for this
entry. An Incoming Label field specifies the MPLS label for an
incoming packet. The Incoming Label is 0 for the ingress node of
the path and may occupy the second 32 bits of the key for this
entry.
[0232] A Policer ID field specifies the Policer ID to instantiate
and apply to traffic using this path. Note that this is generally
applied on the ingress LER. A policer ID of zero (0) indicates that
no policing is to be done on this LSP.
[0233] An MPLS FIB Request Message may include a "D" field. If set,
a detour path entry is specified. Specifically, optional word (C)
exists in the message. An MPLS FIB Request Message may include an
"E" field. If set, the detour path entry includes a second action.
Specifically, optional word (D) exists in the message. An MPLS FIB
Request Message may include an "M" field that specifies the CoS
mode for the detour path. Values for the "M" field may include:
0: Preserve the CoS bits in the original packet.
1: Replace the original CoS bits with a new CoS value specified in the "Value" field.
[0234] An MPLS FIB Request Message may include an "R" field. This
field is reserved. It is set to zero on transmission and ignored on
receipt. A Value field specifies the new CoS value to be used when
the M bit is "1". A Primary Port field specifies the Primary path
Port Index local to network node. A PA field specifies the MPLS
action to be operated on an incoming packet when it takes the
primary path. The actions include:
PUSH (1): Push the primary label to an incoming packet.
SWAP (2): Swap the label in an incoming packet with the primary label.
POP (3): Pop off the top-most label from an incoming packet.
[0235] Note that these numerical values are used for the DA1 and
DA2 message fields as well.
[0236] A Primary Egress Label field specifies the MPLS Label to be
pushed or swapped on to the outgoing packet. A Detour Port field
specifies the Port Index local to network node for a detour path if
present. A DA1 field specifies the first action to be operated on
an incoming packet when it takes the detour path. MPLS actions
include:
PUSH (1): Push Detour Egress Label 1.
SWAP (2): Swap the outermost label with Detour Egress Label 1.
POP (3): Pop off the outermost label.
[0238] A Detour Egress Label 1 field specifies the MPLS Label value
used by the label operations specified in DA1. A DA2 field
specifies the second action to be operated on an incoming packet
when it takes the detour path. MPLS actions include:
PUSH (1): Push Detour Egress Label 2.
SWAP (2): Swap the outermost label with Detour Egress Label 2.
POP (3): Pop off the outermost label.
[0240] A Detour Egress Label 2 field specifies the MPLS Label value
used by the label operations specified in DA2.
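For illustration, a Python sketch of how a node might apply the primary and detour label actions of such a FIB entry; the dictionary keys are hypothetical shorthand for the message fields above:

    def apply_mpls_action(action: int, stack: list, label: int) -> list:
        """PUSH (1), SWAP (2), POP (3) on an MPLS label stack
        (outermost label at index 0)."""
        if action == 1:                        # PUSH
            return [label] + stack
        if action == 2:                        # SWAP
            return [label] + stack[1:]
        if action == 3:                        # POP
            return stack[1:]
        raise ValueError("unknown MPLS action %d" % action)

    def forward(entry: dict, stack: list, primary_up: bool) -> tuple:
        """Choose the primary or detour path for a packet matching an
        MPLS FIB entry and return (egress port, new label stack)."""
        if primary_up:
            return entry["primary_port"], apply_mpls_action(
                entry["pa"], stack, entry["primary_egress_label"])
        out = apply_mpls_action(entry["da1"], stack, entry.get("detour_label_1"))
        if entry.get("da2"):                   # E bit set: second detour action
            out = apply_mpls_action(entry["da2"], out, entry.get("detour_label_2"))
        return entry["detour_port"], out

    entry = {"primary_port": 3, "pa": 2, "primary_egress_label": 100,
             "detour_port": 7, "da1": 2, "detour_label_1": 200,
             "da2": 1, "detour_label_2": 300}
    assert forward(entry, [55], primary_up=False) == (7, [300, 200])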
[0241] FIG. 27 is a block diagram illustrating an example MPLS FIB
Response Message Structure 442. The MPLS FIB Response message is
sent by a network node to acknowledge back to the controller that
the MPLS FIB Request message was received and processed with the
indicated status code. The following Status codes may be used in
the Node Configuration Header:
0: Success
1: Invalid Primary Egress Port
2: Invalid Primary Label Value
3: Invalid Primary Action
4: Invalid Detour Egress Port
5: Invalid Detour Label Value
6: Invalid Detour Action
7: Invalid Second Detour Egress Port
8: Invalid Second Detour Label Value
9: Invalid Second Detour Action
10: Invalid Incoming Label Value
[0242] A Path ID field specifies the path identifier from the MPLS
FIB Request message and may occupy the first 32 bits of the entry
key. An Incoming Label field specifies the Label from the MPLS FIB
Request message and may occupy the second 32 bits of the entry
key.
[0243] FIG. 28 is a block diagram illustrating an example Policer
Request Message Structure 444. The Policer Request message is used
to specify a policer for an individual network node in the OCC
domain. It specifies a BW Per CoS where BW is specified in bits per
second. The key element for the message is the policer ID. Note
that this message only specifies the policer. A policer instance is
actually created when the policer is associated with some other
object such as a filter or path. If the policer is modified, then
all derived instances must be updated.
[0244] A Policer ID field specifies a Policer ID that serves as a
unique identifier for this policer specification. This is the key
element. A Per CoS Entry field indicates the bandwidth to be
policed per CoS (see below).
[0245] FIG. 29 is a block diagram illustrating an example Per CoS
Entry Element Structure 446. A Per CoS Entry Element may include a
CoS field that specifies the Class of Service. Up to 8 classes of
service are supported. The class of service is also the same as the
EXP bits used on the encapsulated MPLS frames. A Per CoS Entry
Element may include an "A" field. If set, CoS is ignored and the
policer applies to all CoS. In this case, only one Per CoS Entry
may be present in the Policer Request Message. A Per CoS Entry
Element may include a Bandwidth field that specifies the bandwidth
allowed for the indicated class measured in bits per second. Note
that Bandwidth is a 64 bit unsigned integer. A Per CoS Entry
Element may include a Reserved field. This field is reserved. It is
set to zero on transmission and ignored on receipt.
[0246] FIG. 30 is a block diagram illustrating an example Policer
Response Message Structure 448. The Policer Response message is
sent by a network node to acknowledge back to the controller that
the Policer Request message was received and processed with the
indicated status code. The following Status codes may be used in
the Node Configuration Header:
0: Success
1: Invalid Attribute
[0247] A Policer ID field specifies the policer ID being
acknowledged.
[0248] FIG. 31 is a block diagram illustrating an example CoS
Scheduler Request Message Structure 450. The CoS Scheduler Request
message is used to configure the CoS schedulers for a specific port
of an individual network node in the OCC domain. The Port Index is
the primary element for the message. An Entries field specifies the
number of Per CoS Entries.
[0249] A Port Index field specifies the Port Index to which the CoS
entries are applied. A CoS Scheduler Request Message may include a
Reserved field. This field is reserved. It is set to zero on
transmission and ignored on receipt.
[0250] FIG. 32 is a block diagram illustrating an example Per CoS
Scheduler Entry Structure 452. The Per CoS Scheduler Entry
indicates the Per CoS Scheduling parameters. If there is no Per CoS
Entry for a given class of service then all packets matching that
class of service are dropped.
[0251] A Per CoS Scheduler Entry may include a CoS field that
specifies the Class of Service. Up to 8 classes of service are
supported. The class of service is also the same as the EXP bits
used on the encapsulated MPLS frames. A Per CoS Scheduler Entry may
include an "X" field. If set, the CoS should be given minimal
service. Only if there is no queued data for any other CoS is the
packet transmitted. Otherwise the packet is dropped. When X is set,
the remaining entries in the CoS Scheduler Entry are ignored. If X
is cleared, the CoS is scheduled as specified in the remainder of
the CoS Scheduler Entry. A Per CoS Scheduler Entry may include an
SD field that specifies the Scheduling Discipline to be applied to
this class of service. The following Scheduling Disciplines may be
specified:
0: Deficit Weighted Round Robin (DWRR). This CoS is scheduled according to its Bandwidth weight. The DWRR scheduler round robins across all DWRR CoS according to the Bandwidth Weight. Classes may go into deficit if excess bandwidth exists.
1: Strict Priority. When Strict Priority is selected, the CoS is always scheduled whenever a packet of that class exists. When strict priority is used, the Bandwidth for the class is ignored and other classes are subjected to starvation or may not be serviceable according to their bandwidth allocation.
2: Strict Priority Restricted. Strict Priority Restricted schedules any packets of this class immediately so long as the bandwidth allocation has not been exceeded. Once bandwidth has been exceeded, the class acts as any other DWRR class.
3-15: Reserved.
[0252] Note that scheduling behaviors among implementations may
vary under certain circumstances. The following behaviors are
unspecified: The behavior of a DWRR scheduler when a subset of the
classes have exceeded their round robin allotment yet excess
bandwidth capacity exists. Schedulers should schedule in proportion
to the respective weights of the classes.
[0253] A Per CoS Scheduler Entry may include a Bandwidth field that
specifies the Bandwidth for the CoS. The Bandwidth is specified as
a percentage of the total available bandwidth of the port where 255
represents 100%. The sum of all CoS Scheduler Entry bandwidth
values should equal 255. The bandwidth specified may be 0,
indicating that the CoS is only scheduled when all other non-zero
bandwidth classes have been scheduled. If all bandwidth allocations
do not add up to 255, the implementation should normalize.
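A small Python sketch of the normalization suggested above, scaling arbitrary bandwidth values so that they sum to 255:

    def normalize_weights(raw: list) -> list:
        """Scale CoS bandwidth values so they sum to 255 (255 == 100%)."""
        total = sum(raw)
        if total == 0 or total == 255:
            return list(raw)
        scaled = [v * 255 // total for v in raw]
        # Absorb any integer-rounding remainder into the largest class.
        scaled[scaled.index(max(scaled))] += 255 - sum(scaled)
        return scaled

    w = normalize_weights([100, 100, 100])
    assert sum(w) == 255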
[0254] A Per CoS Scheduler Entry may include a Queue Length field
that specifies the length of the queue specified in milliseconds.
The implementation is expected to convert this number to a byte
value based on the port bandwidth. If queue buffer resources cannot
be allocated for all CoS then the scheduler should allocate
according to the relative proportion of the Queue Lengths specified
for all queues.
[0255] A RED Thresh 1 field specifies the first RED threshold. This
is the percentage of queue full for the specified QoS. The value is
specified as an 8 bit integer where 0 indicates 0% full and 255
indicates 100% full. RED Thresh 1 must be less than RED Thresh 2.
When the queue length is greater than RED Thresh 1 but less than
RED Thresh 2 (if specified) the packet is dropped with probability
RED Prob 1.
[0256] A RED Prob 1 field specifies the drop probability for a
packet when the queue depth has reached RED Thresh 1. This
probability is specified as an integer from 0 to 255 where 0
represents 0% probability and 255 represents 100% probability. It
is assumed that the drop probability is 0% when the current queue
length is less than RED Thresh 1. This specification does not
specify if the RED algorithm should use head, tail or random
drop.
[0257] A RED Thresh 2 field specifies the second RED threshold. If
unused it is set to 0. Otherwise it must be greater than RED Thresh
1. When the queue depth is greater than RED Thresh 2, packets are
dropped with probability RED Prob 2. RED Thresh 2 follows the same
encoding scheme as RED Thresh 1.
[0258] A RED Prob 2 field specifies the drop probability for a
packet exceeding RED Thresh 2. RED Prob 2 follows the same encoding
scheme as RED Prob 1.
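The two-threshold drop decision can be sketched as follows in Python; the use of a uniform random draw is an assumption here, since the RED variant (head, tail, or random drop) is left unspecified above:

    import random

    def red_should_drop(queue_fill: int, t1: int, p1: int,
                        t2: int = 0, p2: int = 0) -> bool:
        """queue_fill and thresholds in 0..255 (255 == 100% full);
        drop probabilities in 0..255 (255 == 100% probability)."""
        if t2 and queue_fill > t2:
            prob = p2
        elif queue_fill > t1:
            prob = p1
        else:
            prob = 0  # below RED Thresh 1, never drop
        return random.random() * 255 < prob

    # E.g., above Thresh 2 with RED Prob 2 of 255, the packet is always dropped.
    assert red_should_drop(250, t1=128, p1=64, t2=200, p2=255)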
[0259] FIG. 33 is a block diagram illustrating an example CoS
Scheduler Response Message Structure 454. The CoS Scheduler
Response message is sent by a network node to acknowledge back to
the controller that the CoS Scheduler Request message was received
and processed with the indicated status code. The following Status
codes may be used in the Node Configuration Header:
0: Success
1: Invalid Attribute.
[0260] A CoS Scheduler Response Message may include a Port Index
field that specifies the Port Index from the original request. A
CoS Scheduler Response Message may include a Reserved field. This
field is reserved. It is set to zero on transmission and ignored on
receipt.
[0261] FIG. 34 is a block diagram illustrating an example Filter
Request Message Structure 456. A Filter Request message specifies a
set of packet matching filter rules and actions to be taken when a
rule is matched, for an individual network node in the OCC domain.
Rules are specified in order of priority.
[0262] A Filter ID field specifies a unique 32 bit identifier for
the filter. The Filter ID is the key element. A Filter Request
Message may include "N" Filter Rules, where "N" is any positive
integer. Filter Rules 1-N include a priority ordered list of Filter
rules. Filter rules are byte packed.
[0263] FIG. 35 is a block diagram illustrating an example Filter
Rule Structure 458. A Filter Rule Structure may include fields for
Type, Action Mask, Action Arguments, and Flow Spec.
[0264] A Type field specifies the filter type, which may be defined
as follows:
0: IPv4: indicating that the Destination and Source Prefixes as specified in the Flow Spec are to be interpreted as IPv4 addresses.
1: IPv6: indicating as above for IPv6.
2: MAC: indicating as above for MAC addresses. When MAC addresses are specified, the IP Protocol is instead interpreted as the Ether Type.
3-255: Reserved.
[0265] An Action Mask field specifies an Action Mask where bits are
defined as follows:
0x01 (Drop): Indicates that the packet is to be dropped. All other actions may be ignored when Drop is set.
0x02 (Forward): Indicates that the packet is to be forwarded normally.
0x04 (Set CoS): Indicates that the packet's Class of Service should be modified as indicated in the Action Argument. The Class of Service argument is a single byte of which only the three least significant bits are used. This action only sets the class of service for the packet, which is carried in the MPLS header EXP bits.
0x08 (Set DSCP): Indicates that the packet's DSCP field should be modified as indicated in the Action Argument. The DSCP argument is a single byte that is copied into the DSCP field of the IP header. This action is only valid for IPv4 and IPv6 type Filter Rules.
0x10 (Police): Indicates that the packet is to be policed by an instance of the Policer ID specified as an Action Argument. The Policer ID is encoded as a 32 bit identifier as specified in the Policer message.
0x20 (Redirect): Indicates that the packet is to be redirected to the specified next-hop. The Redirect Argument is a variable length argument.
[0266] An Action Arguments field includes a variable length
collection of arguments as specified in the Action Mask
descriptions. Individual arguments are byte packed and need not end
on a 32 bit boundary. In other words, Flow Spec may not start on a
32 bit boundary.
[0267] A Flow Spec field includes a flow-spec as defined in P.
Marques, "Dissemination of Flow Specification Rules," Network
Working Group RFC 5575, August 2009, the entire contents of which
are incorporated by reference herein. Note that the flow spec is
not encoded as an NLRI with the NLRI-specific length encoding. Only
the "NLRI value" from the Flow Specification NLRI is encoded in the
Flow Spec. The length value is not required since the flow-spec
entry's length is implicit in its encoding.
[0268] FIG. 36 is a block diagram illustrating an example Filter
Response Message Structure 460. The Filter Response message is sent
by a network node to acknowledge back to the controller that the
Filter Request message was received and processed. A Filter ID
field specifies a unique 32 bit identifier for the filter.
[0269] FIG. 37 is a block diagram illustrating an example Pseudo
Wire Request Message Structure 462. The Pseudo Wire Request message
is used to create a pseudo wire on the targeted node. The PW ID is
the primary key for the message. A PW ID field specifies the Pseudo
Wire ID and serves as the key for this message. A Switching Mode
field may include the following values:
Switch (0): Indicates that the PW is to act as a normal L2 switching port. This implies that unknowns may be flooded to the PW and MAC addresses are to be learned on the PW.
Authorized (1): Indicates that the PW is to handle only MAC addresses that have been associated with the PW via a MAC FIB Request message. Packets with SMAC addresses from the PW that do not match a MAC FIB entry must be dropped. Unknowns are never flooded to this PW as all MAC addresses reachable via the PW are known.
[0270] An NSN Length field specifies the length in octets of the
Network Service Name including the NULL termination character. A
Pseudo Wire Request Message may include a Reserved field. This
field is reserved. It is set to zero on transmission and ignored on
receipt. A Filter ID field specifies an optional filter ID to be
associated with all packets entering the PW, and is ignored if set
to 0. An Ingress Label field specifies the service label for
packets received from the PW. This is the local label. An Egress
Label field specifies the service label for packets transmitted to
the PW. This is the remote label. A Path ID field specifies the
egress Path carrying the PW. A Network Service Name field specifies
the network service to be carried via this PW. The name is a UTF-8
string of bytes terminated with the NULL (0) byte.
[0271] FIG. 38 is a block diagram illustrating an example Pseudo
Wire Response Message Structure 464. The Pseudo Wire Response
message is sent by a network node to acknowledge back to the
controller that the Pseudo Wire Request message was received and
processed with the indicated status code. The following Status
codes may be used in the Node Configuration Header:
0: Success.
1: Invalid Filter ID.
2: Invalid Path ID.
3: Invalid Switching Mode.
4: Parse Error.
[0272] A PW ID field specifies a unique identifier from the
corresponding Pseudo Wire Request.
[0273] FIG. 39 is a block diagram illustrating an example Direct
Switch Request Message Structure 466. The Direct Switch Request
Message is used by the controller to map all the traffic from an
endpoint in a network node to a specific PW. An endpoint is defined
as either a port-index or a (port-index, MAC) tuple. When the
endpoint is defined as a port-index, all traffic from the PW is
mapped directly to the port. When the endpoint is defined as a
(port-index, MAC) tuple, all traffic from the PW matching the MAC
is mapped to the specified port-index. An optional Filter ID may be
specified with the message to filter the traffic from the end-point
before transmitting it via the PW. It is ignored if set to 0.
[0274] An Endpoint Type (EPT) field specifies an endpoint type. The
endpoint type is chosen from the following values:
Access Port (0): The endpoint is specified by an Access Port index on the access node. When this type is specified, the MAC Address element is ignored. All packets are mapped directly from the PW to the port and vice versa. When this type is specified, the key to the object is {EPT, Port Index}.
Access (Port, MAC) (1): The endpoint is specified as an Access Port and MAC address. When this type is specified, all packets from the (port, MAC) are mapped directly to the PW and all packets from the PW having DMAC==MAC Address are mapped directly to the Port Index. When this type is specified, the key to the object is {EPT, Port Index, MAC Address}.
[0275] A Port Index field specifies the Port Index to be mapped. A
MAC Address field specifies the MAC address of the endpoint when
EPT is set to Access (Port, MAC). Otherwise this element is ignored
and must be set to 0. A Filter ID field specifies an optional
Filter to be applied to packets from the endpoint, and is ignored
if set to 0. A PW ID field specifies the PW ID to which all packets
from the endpoint are to be transmitted. All packets from the PW
are to be transmitted to the port when EPT is set to Access Port;
when EPT is set to Access (Port, MAC), only packets whose DMAC
matches the MAC Address are sent to the port. A Network Service Name field
specifies the UTF-8 encoded Network Service Name. The string is
terminated with the 0 byte. Its length can also be computed from
the total message length found in the OCC message header.
[0276] FIG. 40 is a block diagram illustrating an example Direct
Switch Response Message Structure 468. The Direct Switch Response
message is sent by a network node to acknowledge back to the
controller that the Direct Switch Request message was received and
processed with the indicated status code. The following Status
codes may be used in the Node Configuration Header:
0: Success.
1: Invalid Filter ID.
2: Invalid PW ID.
3: Invalid EPT.
4: Invalid Port Index.
5: Parse Error.
[0277] An Endpoint Type (EPT) field includes the Endpoint Type as
specified in the request. A Port Index field includes the Port
Index as specified in the request. A MAC Address field includes the
MAC Address as specified in the request.
[0278] FIG. 41 is a block diagram illustrating an example MAC FIB
Request Message Structure 470. The MAC FIB Request message is used
by the controller to add a MAC entry into an Ethernet Switching
table of a network node. The specific switching table to be used is
specified by the Network Service Name. The MAC entry may have
optional filters applied to the MAC address as it passes through
the FIB. Such filters may be associated with the
DMAC or the SMAC.
[0279] The key to the MAC FIB Request Message is the concatenation
of the MAC Address and the Network Service Name. A MAC Address
field includes the MAC Address to be added to the FIB. Packets
arriving at the Network Service whose DMAC matches this MAC address
are to be transmitted via the specified Next Hop. The MAC Address
is a key element. A Network Service Name field includes the Network
Service Name. The Network Service Name is typically a VLAN or
Bridge Domain to which this forwarding entry is to be associated.
The length of the Network Service Name may be computed from the Key
Length element of the Node Configuration Message header minus the
length of the MAC Address (6 bytes). The Network Service Name is
terminated with a 0 byte. The Network Service Name element is not
padded to a word boundary. A Next Hop Type field includes the Next
Hop Type, which may be chosen from the following values: Access
Port (0), and Pseudo Wire ID (1). The Access Port and Pseudo Wire
values are described in more detail in FIGS. 39 and 40,
respectively.
[0280] A MAC FIB Request Message may include a Reserved field. This
field is reserved. It is set to zero on transmission and ignored on
receipt. A Next Hop field may include a Port Index or PW ID as
defined by Next Hop Type. See Next Hop Type for type-specific
encoding. A DMAC Filter ID field specifies the filter to be
associated with all packets whose DMAC matches the MAC Address.
This is an optional element and is ignored if set to 0. A SMAC
Filter ID field specifies the filter to be associated with all
packets whose SMAC matches MAC Address. This is an optional element
and is ignored if set to 0.
[0281] FIG. 42 is a block diagram illustrating an example Next Hop
Port Descriptor 472. The Next Hop is specified by an Access Port
Index on the access node. The Next Hop element includes a Port
Index field which specifies the Port Index where the MAC is
located. In some examples, the Port Index may include 32 bits
rather than 8 to keep the implementation simpler.
[0282] FIG. 43 is a block diagram illustrating an example Next Hop
Pseudo Wire (PW) Descriptor 474. A Next Hop PW Descriptor includes
a PW ID field that specifies the endpoint as a Pseudo Wire ID.
[0283] FIG. 44 is a block diagram illustrating an example MAC FIB
Response Message Structure 476. The MAC FIB Response message is
sent by a network node to acknowledge back to the controller that
the MAC FIB Request message was received and processed with the
indicated status code.
[0284] The following Status codes may be used in the Node
Configuration Header:
0: Success.
1: Invalid Filter ID.
2: Invalid PW ID.
3: Invalid Next Hop Type.
4: Invalid Port Index.
5: Parse Error.
[0285] A MAC FIB Response Message may include a MAC Address field
that specifies the MAC Address from the corresponding MAC FIB
Request. A MAC FIB Response Message may include a Network Service
Name field that specifies the Network Service name from the
corresponding MAC FIB Request. The protocol may also include a
message for a node to signal to a controller that the node is
seeing multiple neighbors on a port. Multiple neighbors are illegal
since point-to-point (P2P) ports are assumed.
[0286] Generally speaking, many Ethernet based protocols assume an
Ethernet payload MTU of 1500 bytes. However, some MPLS control
packets may exceed this MTU. For example, the Discover message may
include up to 256 neighbors and 64 intermediate nodes resulting in
an Ethernet payload of 2576 bytes. However, most modern systems
support Ethernet Jumbo frames (Ethernet MTU is 9216). Therefore, in
order to support a network of maximum scale, the Ethernet
interfaces should support Jumbo frames of at least 2576 payload
bytes. 802.11 links have an MTU of 7981, which is sufficient for
MPLS-OCC as well.
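The 2576-byte figure can be verified from the element sizes above, assuming the 8-octet list elements of FIGS. 16 and 17 (a 6-octet MAC plus two 1-octet port indexes) and a 16-octet fixed portion for the remaining headers and Discover fields:

    # Worst-case Discover payload, under the stated size assumptions.
    NEIGHBOR_ELEM = 8   # Neighbor Node List element (FIG. 16)
    INL_ELEM = 8        # Intermediate Node List element (FIG. 17)
    FIXED = 16          # assumed fixed header and Discover fields
    worst_case = 256 * NEIGHBOR_ELEM + 64 * INL_ELEM + FIXED
    assert worst_case == 2576  # exceeds the standard 1500-byte Ethernet MTU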
[0287] Other MPLS-OCC messages may grow arbitrarily large, for
example, the Filter Request message. Since this message is
essentially unbounded in size, MPLS-OCC must include either a
streaming mechanism or a fragmentation and reassembly mechanism.
Rather than specifying these mechanisms, MPLS-OCC should reuse
existing mechanisms, for example TCP/IP. Section
6 discusses a general solution to this problem.
[0288] MPLS-OCC uses three control channels. (1) Physical Link: A
Physical Link message channel is a physical Ethernet Link between
the sending and receiving nodes. The message is carried in an
Ethernet frame with the MPLS-OCC Ether type. The Hello, Hello Reply
and Discover messages are direct link Messages. (2) SRT channel: An
SRT message channel is the channel used for the basic control
communication between the controller and the node. Messages from
the controller to the node use the MPLS label stack that describes
the source routed path to the node. Messages from the node to the
controller use the TO_CONTROLLER label to traverse the path
discovered via the Hello messages. The SRT channel is used for all
messages required to maintain the SRT and build the data plane
which include Discover Reply, Keepalive, Keepalive Reply, SRT Down,
MPLS FIB Request/Response, Policer Request/Response, CoS
Request/Response, Filter Request/Response, Pseudo Wire
Request/Response, Direct Switch Request/Response and MAC FIB
Request/Response (for node data plane only).
[0289] (3) TCP/IP channel: A TCP/IP channel may be used for any
other messages or any SRT messages as seen fit by the
implementation. Note that the implementation must fall back to the
SRT channel if the
TCP/IP channel goes down. TCP Keepalives should be used on the TCP
channel. The TCP channel should never be used for Discover, Hello,
Keepalive and SRT Down. The TCP channel should be used with care
for the MPLS FIB Request, Policer Request, CoS Request, Pseudo Wire
Request and MAC FIB Request as these messages are used to actually
build the Pseudo Wire over which the TCP/IP channel runs. Generally
speaking, the TCP/IP channel is intended to be used for messages
associated with endpoint authorization.
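The channel selection rules above can be summarized in a short
sketch. The following Python fragment is illustrative only; the
message-type strings, and the policy of letting messages that are
merely discouraged on the TCP channel still use it, are simplifying
assumptions.

    # Illustrative channel selection per the rules above.
    DIRECT_LINK = {"Hello", "Hello Reply", "Discover"}
    NEVER_TCP = {"Discover", "Hello", "Keepalive", "SRT Down"}

    def choose_channel(msg_type: str, tcp_channel_up: bool) -> str:
        """Hello, Hello Reply and Discover ride the physical link;
        messages barred from TCP, or any message when the TCP/IP
        channel is down, use the SRT channel; everything else may
        use TCP/IP."""
        if msg_type in DIRECT_LINK:
            return "physical link"
        if msg_type in NEVER_TCP or not tcp_channel_up:
            return "SRT"
        return "TCP/IP"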
[0290] TCP/IP Channel Establishment is now described. In some
examples, the MPLS-OCC may be run over a TCP/IP channel, as TCP/IP
solves the general problems of fragmentation and reassembly and
flow control. In order to establish the TCP/IP channel, the node
must be connected to an IP network and acquire an IP address over
that network. A node, therefore, may be treated as an endpoint with
the special property that this endpoint is actually the control
plane of the node itself. When the controller sees a new node, the
controller may first give the node connectivity to an IP network by
creating a pair of LSPs between the node and the edge node
connected to the desired IP network (typically the management
network). Second, the controller creates a PW between the node's
control plane and the edge node's network service.
[0291] A Direct Switch Request message with an endpoint of type
Port and a port index of 0xFF indicates to the node that this
Direct Switch Request message is to be used to connect the node to
an IP network. The node may then attempt to allocate an IPv4
address and/or an IPv6 address via DHCP, DHCPv6 or IPv6 AD
mechanisms.
[0292] Once the node allocates the IP address, the node may
construct an IP host stack over that interface. It may then connect
to the controller via the address specified in the Discover Reply
message to create a TCP/IP channel with the controller. Once the
channel is created, MPLS-OCC control messages may be received over
that channel.
[0293] MPLS-OCC messages sent over the TCP/IP stream include the
OCC Message Header followed by the OCC Message Payload. The OCC
Message Header allows socket applications to first read a known
quantity of bytes from the message stream, determine from the
header the number of bytes in the entire message, and then read
that number of bytes from the stream.
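This is conventional length-prefixed framing. The sketch below
assumes a 4-byte header carrying a 16-bit total-length field at
offset 2; both the header size and the field offset are assumptions
for illustration, not the documented OCC Message Header layout.

    import socket
    import struct

    HEADER_LEN = 4     # assumed OCC Message Header size
    LENGTH_OFFSET = 2  # assumed offset of a 16-bit total-length field

    def recv_exact(sock: socket.socket, n: int) -> bytes:
        """Read exactly n bytes from the stream, raising on EOF."""
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("stream closed mid-message")
            buf += chunk
        return buf

    def read_occ_message(sock: socket.socket) -> bytes:
        """Read the fixed-size header, learn the total message length
        from it, then read the rest of the message from the stream."""
        header = recv_exact(sock, HEADER_LEN)
        (total_len,) = struct.unpack_from("!H", header, LENGTH_OFFSET)
        return header + recv_exact(sock, total_len - HEADER_LEN)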
[0294] FIG. 45 is a flowchart illustrating example operation of
network devices in accordance with the techniques of this
disclosure. In the example of FIG. 45, a control channel and a data
channel are established between a network node and the controller.
When a network node (AG or AX) is connected to the network for the
first time, the network node discovers its neighbors using the
messages described above for the MPLS-OCC protocol (500) and
reports this information to the controller. The mechanism of
connecting to the controller involves a discovery process, in which
the Discover message is sent out on the active link with the
shortest distance to the Controller (504). The network node
receiving the Discover message then forwards the Discover message
from the initiating node to the Edge Node after updating the
intermediate node list, and the Edge Node in turn sends the
Discover message to the Controller over a UDP connection. The
controller receives the Discover message over the UDP connection
(504). The information sent to the controller by the initiating
node includes the list of its neighbors as well as the set of
interfaces and intermediate nodes that the Discover message
traversed on the path to the Edge Node.
[0295] In some examples, the Discover message specifies a
generation number. Once the Controller receives the Discover
message, the Controller can compare the generation number specified
by the Discover message to a current generation number received
from the access node (506), and update the stored network topology
information if the generation number specified by the discover
message is greater than or equal to the current generation number
(507). If the controller determines that the generation number
specified by the Discover message is less than the current
generation number, the Controller may discard the Discover
message.
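The generation-number check reduces to a simple comparison. A
minimal sketch, assuming the controller keeps per-node topology
records keyed by node identifier (the data layout is illustrative):

    def handle_discover(topology: dict, node_id: str,
                        generation: int, neighbors: list) -> bool:
        """Accept a Discover message only if its generation number
        is at least the stored one; otherwise discard it as stale
        (steps 506/507 of FIG. 45)."""
        current = topology.get(node_id, {}).get("generation", -1)
        if generation < current:
            return False  # stale: discard the Discover message
        topology[node_id] = {"generation": generation,
                             "neighbors": neighbors}
        return True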
[0296] The Controller, upon receiving this list of intermediate
nodes and interfaces, is able to reverse the path traversed by the
Discover message to reach the initiating network node using a
source-routed mechanism. In this manner, the initiating network
node is able to connect to the Controller and set up a
bidirectional control channel for sending control messages to the
access node. For example, the Controller may reverse the
intermediate node list in the received packet to create a stack of
labels which create the control channel (508). The labels are a
direct mapping from label value to port number, and so the
intermediate nodes may or may not need to be configured to handle
the label; i.e., depending on their capabilities, the intermediate
nodes may be able to determine the output port by looking at the
label value, or their FIBs may be configured to forward based on
the label. The controller
sends a discover reply message to the access node, where the
discover reply message bears the label stack determined by the
controller (509), and the access node receives the discover reply
message via the edge node to complete the control channel between
the controller and the access node (511).
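Building the label stack amounts to reversing the recorded hops. In
the sketch below, each recorded hop is assumed to be a (node, port)
pair naming the port on which the Discover message arrived at that
node, and the label value is taken to equal the port number per the
direct mapping described above; the hop encoding itself is an
assumption.

    def build_label_stack(hops):
        """hops: (node, ingress_port) pairs in the order the Discover
        message traversed them, originating node first. Sending back
        out each arrival port retraces the path, so the reversed hop
        list gives the top-to-bottom label stack."""
        return [port for _node, port in reversed(hops)]

    # Discover path: origin -> A (arrived on port 3) -> B (port 7)
    # -> edge node. The controller's reply stack is [7, 3]: B
    # forwards out port 7 to A, then A forwards out port 3 to the
    # originating node.
    assert build_label_stack([("A", 3), ("B", 7)]) == [7, 3]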
[0297] As each connected node reports its neighbors to the
Controller, the Controller is able to discover the topology of the
entire network. Once the topology is known, the Controller may, in
some examples, compute data channel paths between each Access Node
and each Edge Node based on the capacity available in the network,
the load in the network, the QoS required for the traffic to/from
the Endpoints connected to the Access Node and the overall policy
configured for subscribers (510). The paths are described as LSPs
between the Access Nodes and the Edge Nodes; the traffic from each
endpoint connected to the access node is carried over a Pseudo-Wire
within this LSP. In addition to the primary paths, the controller
may also compute detour paths.
[0298] Once the paths and the detours are computed, the Controller
outputs FIB configuration messages to configure the forwarding
tables (FIBs) in the edge nodes, aggregation nodes, and access
nodes with the appropriate ingress to egress label mapping for both
upstream and downstream directions, so that traffic forwarding is
enabled (512). In addition to the primary forwarding entries, the
controller configures the secondary forwarding entries as well so
that switchover, in case of link or node failure, can happen
without any Controller involvement. The controller may output the
FIB configuration messages with a label stack determined by the
controller based on the intermediate node list of a received
Discover message. When a node receives the FIB configuration
message from the controller (516), the node updates its FIB based
on the message (518) and forwards subscriber traffic to edge nodes
based on the FIB (e.g., via pseudo wires and LSPs configured by the
FIB configuration messages from the controller).
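The forwarding state can be pictured as FIB entries holding both a
primary and a pre-configured detour next hop, so a node can fail
over locally. A minimal sketch with illustrative data structures
(not the actual FIB encoding):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class NextHop:
        out_port: int
        out_label: int

    @dataclass
    class FibEntry:
        """Controller-installed forwarding entry: a primary next hop
        plus an optional pre-configured detour."""
        primary: NextHop
        detour: Optional[NextHop] = None

    def select_next_hop(entry: FibEntry, link_up: dict) -> NextHop:
        """Local failover: if the primary's link is down, switch to
        the detour with no controller involvement."""
        if link_up.get(entry.primary.out_port, False):
            return entry.primary
        if entry.detour is not None:
            return entry.detour
        raise RuntimeError("no usable next hop")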
[0299] The nodes and controller can adapt to changes in topology.
Once the node has joined the network and is forwarding traffic, the
node continues to send periodic messages to its neighbors. If a
link or node fails, that information is discovered via this
mechanism. Each of the nodes around the failure independently and
locally determines this change, and switches the impacted LSPs to
their pre-configured detours. While the data-plane continues its
operation uninterrupted, each node immediately exchanges messages
with its neighbors to check which links and nodes are active. Each
node then reports this information via another Discover message,
which is sent to the neighboring node with the shortest path to the
Controller. The neighboring node with the shortest path to the
Controller updates and forwards this Discover message to the
Controller, which is thus notified of the topology change. The
Controller re-computes the topology and the paths. Finally, the
Controller configures the required changes, if any, into the
relevant nodes.
[0300] The path is then repaired in a make-before-break fashion at
the node adjacent to the failure, and the old portion of the path
is removed. Note that if a detour becomes unused, it should not be
deleted until all the paths that rely on it have been
re-assigned.
[0301] The nodes and controller can also adapt to changes in link
capacity. For example, the Controller may compute and configure
paths (LSPs) based on the bandwidth required by that path (total of
all the pseudo-wires carried over it). If, for some reason, a
link's capacity on that path changes (e.g., fading due to rain on a
wireless backhaul link), this information is conveyed to the
Controller, which then re-computes an alternate path for the
traffic and re-configures the impacted nodes to switch the traffic
over in a make-before-break fashion.
[0302] FIG. 46 is a flowchart illustrating example operation of
network devices in accordance with the techniques of this
disclosure. In the example of FIG. 46, a network node (e.g., an
edge node) can send a services indication message to the
controller, where the services indication message indicates one or
more network services provided by the edge node (600). Details of
an example services indication message are described above. The
controller receives the services indication message (602). An
access node can detect that an endpoint has joined the network
(603), and in response, sends an endpoint indication message to the
controller (604). Details of an example endpoint indication message
are described above. The controller receives the endpoint
indication message (606). The controller may determine that a
pseudo wire is needed between the access node and the edge node to
provide to the endpoint a network service of the one or more
network services (YES branch of 608). In response, the controller
may output one or more pseudo wire request messages to the access
node and/or the edge node to install forwarding state for creating
the necessary pseudo wire between the access node and the edge node
(610). Details of an example pseudo wire request message are
described above. The access node and edge node receive the pseudo
wire request messages and install forwarding state based on them
(612, 614). When a pseudo wire is in place, the controller can
output a direct switch request message to configure the access node
to map traffic received from the endpoint to the pseudo wire (615).
Details of an example direct switch request message are described
above. The access node receives the direct switch request message
and installs forwarding state to map traffic from the endpoint to
the pseudo wire according to the direct switch request message. The
access node can then access network services for the endpoint via
the pseudo wire (620), and the edge node can provide the network
services to the endpoint via the pseudo wire (622).
[0303] As described herein, the forwarding plane of network devices
such as access nodes and aggregation nodes (e.g., data plane 301 of
FIG. 5) is based on MPLS. MPLS is chosen for various reasons,
including that MPLS is well supported in existing switching ASICs
of certain network devices, MPLS is a high performance forwarding
paradigm that requires minimal processing yet achieves a high
degree of service enablement, and MPLS has good support for
fast-reroute processing.
[0304] The systems described herein leverage the MPLS concept of
pseudo wires (PW). PWs are used to virtualize physical ports. A PW
is used to connect an Endpoint to a Network Service such that the
Endpoint appears as if it is directly connected to the Edge Node.
There are some variations to this that are described in the
following sections. A PW is bidirectional and therefore comprises a
pair of Transport LSPs. Multiple PWs can be carried
by the same pair of Transport LSPs. At the Edge Node, a PW is
mapped to a Bridge Domain analogously to how a physical interface
may be mapped to a Bridge Domain. The Bridge Domain supports the
Network Service defined on the Edge Node. The FIB in the Edge Node
is populated by learning MAC addresses on the PW. The PW appears as
a LAN segment to the Bridge.
[0305] When a packet is transmitted via a PW, the Edge Node
constructs an encapsulation for the packet that includes the PW
label, the LSP transport label and finally the Ethernet header with
the MPLS Ethertype. The packet is then transmitted out the
interface associated with the Transport LSP. Note that the
Transport LSP will not be present if the Path for the Transport LSP
includes no intermediate links between the ingress node and egress
node. When the packet is encapsulated, the CoS bits in the EXP
field of both the transport and PW labels are set.
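The encapsulation can be sketched directly from this description.
The fragment below uses the standard 4-byte MPLS label stack entry
layout (label, EXP, bottom-of-stack bit, TTL); the function names
and the fixed TTL are illustrative.

    import struct
    from typing import Optional

    MPLS_ETHERTYPE = 0x8847

    def label_entry(label: int, exp: int, bottom: bool,
                    ttl: int = 64) -> bytes:
        """Encode one 4-byte MPLS label stack entry:
        label(20) | EXP(3) | S(1) | TTL(8)."""
        word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
        return struct.pack("!I", word)

    def encapsulate_pw(payload: bytes, pw_label: int,
                       transport_label: Optional[int], cos: int,
                       dst_mac: bytes, src_mac: bytes) -> bytes:
        """Ethernet header with the MPLS Ethertype, then the
        transport label (omitted when the path has no intermediate
        links), then the PW label at the bottom of the stack; CoS
        goes in the EXP bits of both labels."""
        eth = dst_mac + src_mac + struct.pack("!H", MPLS_ETHERTYPE)
        stack = b""
        if transport_label is not None:
            stack += label_entry(transport_label, cos, bottom=False)
        stack += label_entry(pw_label, cos, bottom=True)
        return eth + stack + payload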
[0306] All intermediate nodes between the ingress and egress node
for the Transport LSP perform basic label swapping and forwarding
based on the FIB programming received from the central controller
in accordance with the MPLS-OCC protocol. The penultimate hop node
simply performs a label pop, exposing the PW label, and forwards
the packet to the egress node. The egress node receives the packet
with the PW label exposed. This label is used to identify the PW
and to switch the packet in the correct Bridge Domain (for an Edge
Node) or to transmit the packet directly out a physical port (for
an Access Node).
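Transit behavior is a label swap, with a pop at the penultimate
hop. A minimal sketch over an illustrative FIB representation:

    def transit_forward(label_stack, fib):
        """fib maps the incoming top label to {'op': 'swap',
        'out_label': L, 'port': P} at intermediate nodes, or
        {'op': 'pop', 'port': P} at the penultimate hop, where the
        pop exposes the PW label for the egress node."""
        entry = fib[label_stack[0]]
        if entry["op"] == "swap":
            new_stack = [entry["out_label"]] + label_stack[1:]
        else:  # 'pop': penultimate-hop behavior
            new_stack = label_stack[1:]
        return new_stack, entry["port"]

    # Example: a transit swap; a later pop would expose PW label 42.
    stack, port = transit_forward(
        [100, 42], {100: {"op": "swap", "out_label": 200, "port": 1}})
    assert stack == [200, 42]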
[0307] The ingress node and all transit nodes in the path may have
detours provisioned by the central controller to handle the case
where a node or link goes down. The nodes in the path can detect a
link or node failure locally and select the detour without any
controller interaction. This forms the foundation for the data
plane high availability (HA) employed by this architecture.
[0308] There are two different models that can be used to achieve
network integration. In the first model, network integration is
performed directly on the PE router. This is referred to herein as
the "Direct Integration" model. In the second model, the network
integration is performed through VLANs that connect the PE router
to the MPLS-OCC Edge Node (EN). This is referred to herein as the
"Edge Node Layer 2" (ENL2) model. Each model is addressed and
compared in the following sections.
[0309] FIG. 47 is a block diagram illustrating an example network
system 900 consistent with the Direct Integration Model, according
to one or more aspects of the techniques of this disclosure. In the
Direct integration model, a set of VLANs or Bridge Domains are
created that serve as the entry points into the Network Services.
For example, if a basic Ethernet service is required, this service
is configured on an Integrated Routing and Bridging (IRB)
interface, over which a set of services could be configured. This
interface may, in some examples, appear as any other interface to
the PE router.
[0310] Services are configured as follows: VPLS/E-VPN--The logical
interface is configured over a Bridge Domain that may include other
physical ports and tags to map them to the domain. This logical
interface is then configured so that its availability may be
signaled to other PE routers in the provider network. When a PW is
created and added to the Bridge Domain by MPLS-OCC, packets are
switched to and from the PW according to MAC learning or MAC
authorization. See PW switching model below.
[0311] L3VPN--Similar to VPLS, a Bridge Domain is created to which
MPLS-OCC adds PWs. A routing protocol may be configured over the
corresponding IRB interface to get routes from CE networks into the
L3VPN instance. The interface is also configured as a member of the
correct routing instance so its routes may be carried across the
provider core via BGP.
[0312] Basic Ethernet Service--In this case again a Bridge Domain
is created with a corresponding IRB interface on which a subnet and
mask may be configured. The interface may be included in some
routing instance. PWs are then added to the Bridge Domain as
sessions come up. PE routers may run VRRP between them.
[0313] In summary, all services models involve the creation of a
Bridge Domain over which the service and associated protocols are
configured as is done today. MPLS-OCC then requests the PWs that
are added and removed from the various Bridge Domains as required
by active sessions in the system. Note that in the direct
integration model, Bridge Domains may have no physical or logical
ports in their configuration.
[0314] FIG. 47 shows how the PWs carrying traffic from Endpoints
(EP) 1, 2, and 3 are mapped to the Network Services via the Bridge
Domains configured on the PE routers. The actual MPLS-OCC ports
connecting the ENs to the AGs are not mapped directly to the Bridge
Domain interfaces, but rather the PWs are mapped dynamically to
these domains when the Endpoints come up and their authorization
policy is established.
[0315] In the Direct integration model, the MPLS-OCC protocol is
running directly on an edge node's routing engine. The MPLS-OCC
protocol also executes over some subset of the edge node's
interfaces. These interfaces should only have the MPLS-OCC protocol
and MPLS configured on them. They should not be made members of any
Bridge Domain or be given any IP address configuration. They must
be able to forward MPLS-OCC control packets to the Controller via
an IP/UDP encapsulation into a particular routing instance. They
must also be able to accept packets from an IP/UDP encapsulation
and send them out one of the MPLS-OCC ports or up to the control
plane. The MPLS-OCC forwarding daemon runs on the routing engine
and can send and receive MPLS-OCC control packets to and from the
forwarding element. Note that MPLS-OCC control packets sent through
the edge node must not be sent to the routing engine for
processing, as system performance will suffer. MPLS packets are
also sent and received over the MPLS-OCC interfaces. The edge node
can demultiplex MPLS packets to the correct PWs and then
de-capsulate and L2 switch their payloads in the appropriate Bridge
Domain.
[0316] When a subscriber comes up, the Controller (not shown in
FIG. 47) must be able to identify the Bridge Domain to which the
subscriber must be admitted. The policy associated with the
subscriber therefore includes the Bridge Domain name to which the
subscriber must be admitted. The MPLS-OCC client daemon running on
the edge node therefore signals the Bridge Domain names to the
Controller.
[0317] FIG. 48 is a block diagram illustrating an example network
system 910 consistent with the Edge Node Layer 2 Model, according
to one or more aspects of the techniques of this disclosure. In the
Edge Node Layer 2 model, Edge Nodes are not Edge Routers or PE
routers. Instead they are simple L2 switches that map PWs to Bridge
Domains. On the PE router, a set of Bridge Domains is configured.
Some or all of the physical ports comprising these Bridge Domains
are connected to the ENs. If more than one physical port is
connected to a given EN, it may be a member of an aggregated
Ethernet.
[0318] In the Edge Node Layer 2 Model there are a few issues that
must be addressed. The EN must discover which ports are connected
to other MPLS-OCC nodes and which ports are connected to the PE
routers. MPLS-OCC Hello messages are used to discover ports
connecting to other MPLS-OCC nodes. Link Layer Discovery Protocol
(LLDP) is used to discover which ports are connected to the PE
router, thereby implying that LLDP must be configured on the PE
router ports facing the ENs.
[0319] LLDP is also used to discover the VLAN names and IDs that
are reachable via the ports connecting the EN to the PE. The VLAN
name then serves as the service identifier that the EN uses to map
PWs to VLANs based on session authorization records. LLDP is also
used to determine the management VLAN from which the EN allocates
an IP address via DHCP. Aggregated Ethernets are discovered via
Link Aggregation Control Protocol (LACP) (802.1AX).
[0320] Once the VLAN connectivity has been established between the
EN and the PE router, an IP address is allocated to the EN via
DHCP. The DHCP server is assumed to run anywhere in the management
network. The DHCP server indicates to the EN the DNS name or IP
address of the Controller(s) via the traditional Option 43. Once
this discovery phase has been completed, the EN is now able to
operate as an MPLS-OCC Edge Node and the remainder of the MPLS-OCC
network and protocol can operate.
[0321] STP may be executed on the ENs to ensure loop free operation
between the ENs and the PEs. However, STP can be eliminated if
certain topologies are excluded. Specifically, an EN may connect to
one and only one PE. This, however, does not limit resiliency as
the EN can be seen as an extension of the PE.
[0322] When sessions are authorized, the EN maps sessions to VLANs
in a manner analogous to the way sessions are mapped to Bridge
Domains in the Direct Integration Model. Therefore ENs are required
to support per subscriber packet filtering, fine grained policing
and per LSP policing.
[0323] Advantages of Edge Node Layer-2 Model include the fact that
specific code may not be required, assuming the edge node
adequately supports LLDP and LACP. In addition, this model can
interwork with any PE router from any vendor supporting LLDP and
LACP. Disadvantages of Edge Node Layer-2 Model may be that it
requires that the EN, which looks more like an Aggregation switch,
support LER functions, fine grained filtering and policing as well
as LSP policing in the forwarding plane. Hence, commodity hardware
may not be up to the task. This model also requires the
implementation of LLDP, LACP and potentially STP on the EN. If STP
is used, some ports may be put into the STP blocking state. Link
failure detection time depends on non-MPLS-OCC mechanisms existing
between the PE and EN, which might include STP with its slow
recovery properties. The Edge Node Layer-2 Model also eliminates
MPLS-OCC's ability to control the downstream interface schedulers on
the PE router. Note that an Edge Node in the Edge Node Layer 2
model can appear as a line card in a PE router. The Edge Node could
present a single physical interface to the PE router.
[0324] For the following description, it is assumed that the direct
integration model is followed. That said, the differences between
the two models are not apparent in most of what follows. The
Controller must be told by the Edge Node the set of Network
Services the Edge Node supports. This is done by the Edge Node
sending a Service Indication message to the controller. The Service
Indication message includes the names of all the Bridge Domains
configured on the Edge Node to which PWs may be added. Therefore
the service is defined as an L2 Bridge Domain which is mapped,
possibly at L3, to some other Network Service. Multiple Edge Nodes
may support the same service; in fact every service should be
supported by at least two Edge Nodes to support resiliency in case
one Edge Node goes down.
[0325] Downstream traffic may arrive at either Edge Node if each
Edge Node is advertising equal-cost routes for the subnet associated
with the Edge Node (EN). However, since we would like to apply per
subscriber policing at the EN for downstream traffic, all the
traffic for a specific subscriber must traverse one of the ENs.
Therefore one of two techniques must be available to steer traffic
to the correct EN when traffic is forwarded to the EN at L3.
[0326] The first technique is to use VLAN anchoring where the EN
anchoring the VLAN advertises the route to the VLAN subnet with a
higher priority. This technique may have more efficient forwarding
properties and may distribute load well if there are many services
that can be distributed across the ENs. Drawbacks may include
implementation complexity and failover speed. However, fast L3
failover mechanisms can be employed where appropriate.
[0327] The second technique is to have the ENs directly connected
at L2 and, by MAC learning, force all packets to the EN hosting the
PW to the AX requiring the service. This is the simplest
implementation but would result in half the downstream traffic
traversing the links between the ENs. It also has the nice property
that sessions could be individually distributed across the ENs. For
VPLS or E-VPN, multiple ENs being members of the same VLAN can load
share on a per MAC basis. MAC learning attracts packets to the
right EN, so there is no issue with policing since MAC addresses
are essentially fully qualified routes in this context.
[0328] FIG. 49 is a block diagram illustrating an example network
system 920 that includes a primary edge node (EN-P) and a secondary
edge node (EN-S). EN Resiliency and Opportunities for Node
Protection and Resilient pseudo wires: The next issue concerns
resiliency of the ENs. When an EN goes down, the other EN must be
able to take over for it using the fast detour techniques used
elsewhere in the MPLS system. However, the techniques used
elsewhere may not provide for node protection at the ingress or
egress of the LSP; only link protection is used at LSP egress.
However, since ENs providing specific Network Services are
typically deployed in pairs, it is possible to define an LSP with
primary and secondary ingress and egress nodes.
[0329] When the penultimate hop (PH) node detects that the next hop
link for some LSP goes down, the PH can detour the traffic to the
secondary EN. This technique will work regardless of whether the
Primary EN (EN-P) went down or if just the link between the PH node
and EN-P went down as it is assumed that the Secondary EN (EN-S)
can route or forward the packet appropriately. If possible, the
Primary and Secondary EN nodes would use the same service label for
the same service. In such a case, the required forwarding
operations at the PH node are the same as those supported with the
existing detour schemes. However, if the labels cannot be
guaranteed to be the same, then the PH node must pop the LSP label,
swap the service label and then optionally push a detour LSP label.
Support for this sequence of operations must be investigated. The
speed of convergence for downstream traffic is dependent on the
convergence characteristics of the northbound protocols.
[0330] FIG. 49 illustrates a Primary PW (PW-P) 922 between AX and
EN-P. EN-P and EN-S are members of the same Bridge Domain. When the
link from AG1 to EN-P goes down, AG1 fast-detours to EN-S where, if
the DMAC is known, it is sent to its destination; otherwise it is
flooded. EN-S is also now informed that PW-P-Detour (detour pseudo
wire) 924 is active and begins to forward packets toward AX via
PW-P-Detour 924. However, AX is not informed that PW-P 922 is in
detour state and will continue to use PW-P 922 until the controller
tears it down.
[0331] At the PW ingress, the path carrying traffic from EN-P to AX
can be specified with a secondary ingress node. This secondary
ingress node can effectively be thought of as an ingress detour. If
EN-P goes down, EN-S can send traffic to AG1 (or some other
convenient rendezvous point) using the label of the LSP carrying
PW-P. Since the PCE of the central controller knows that this path
is a detour, the path can be constructed using detour policy, which
may be less stringent than primary policy. Secondly, since the path
follows the original path from the rendezvous to the ingress, the
path is following the original traffic engineered path. Without the
secondary Ingress Node concept, a secondary LSP would be
constructed independent of the primary, which could result in
unnecessary allocation of resources to carry the secondary.
[0332] The following section details how the resilient egress PW is
used to support the L2 services provided by the architecture. This
section provides a detailed analysis of the supported Network
Services, Endpoint connectivity topology, session models and local
switching requirements under the assumption of service resiliency.
From this analysis a few basic patterns emerge that can be used to
realize the full set of requirements. There are several variables
that affect how to construct the forwarding model. The variables
are as follows in TABLE 2:
TABLE-US-00002 TABLE 2

  Variable               Values
  Network Service        VPLS/E-VPN; L2 Subnet; L3VPN
  Endpoint Connectivity  Single Connect: the Endpoint maintains a
                         single connection to the network. Dual
                         Connect: the Endpoint is dually connected
                         to the network.
  Session Model          Port Based: a session is identified by its
                         physical port connectivity. MAC Based: a
                         session is identified by its MAC address.
  Local Switching        Enabled/Disabled
[0333] This results in potentially twenty-four different
combinations of connectivity. However, several overlap. We explore
each scenario to identify the overlapping situations. Note
that in all cases we assume that EN redundancy exists. Lack of EN
redundancy is a degenerate case of EN redundancy. Also note that
this architecture maintains a loop-free property within the
MPLS-OCC nodes. However, Endpoints and Network Services may be
connected in such a way that loops are created via the MPLS-OCC/PW
cloud. It is assumed that under these situations, a loop avoidance
protocol such as STP is run transparently over the MPLS-OCC/PW
cloud.
[0334] FIG. 50 is a block diagram illustrating an example network
system 940 that shows a forwarding model for Virtual Private LAN
Switching (VPLS), single connect, port-based session. As shown in
FIG. 50, in this forwarding model access node switching is done
such that all packets from the access port are mapped directly to
PW-P, and vice versa. EN-P Switching is done by the controller
adding a PW to the Bridge Domain associated with the VPLS instance
where normal L2 forwarding applies, including broadcasting and
flooding over the PW. If EN-P or the link between the PH LSR and
EN-P goes down, packets are immediately switched to EN-S where
normal L2 forwarding applies.
[0335] For EN-S Switching, PW-P-Detour remains inactive until EN-S
discovers that EN-P is down. PW-P-Detour must remain inactive;
otherwise flooded packets from the VPLS will get duplicated at the
AX. PW-S becomes active when a packet is received over PW-S. The
following procedure is followed: (1) EN-S receives a packet over
PW-P-Detour for which it knows it is the Secondary. The SMAC, call
it MAC1, is learned over PW-P-Detour and the packet is forwarded,
and potentially flooded in the VPLS. The packet is not flooded
over PW-P-Detour. EN-S now attracts packets for MAC1. (2) When a
packet for MAC1 arrives at EN-S, the packet is forwarded via
PW-P-Detour. (3) The packet then arrives at AX. The PW label is the
same as if it came from PW-P so AX cannot know that PW-P is
actually down. This is of no consequence as the detour remains
active until the path carrying the PW is repaired. (4) If there are
packets destined to MAC1 sent to EN-P, they will be lost in the
network unless the PH router employs the same type of resilient PWs
that the MPLS-OCC cloud employs. Note that EN-P may still be up.
Only the interface between the next-hop node and EN-P went down. In
this case any packets arriving at EN-P can take a detour around the
primary next-hop node.
[0336] FIG. 51 is a block diagram illustrating an example network
system 950 that shows a forwarding model for VPLS, dual connect,
port-based session. EP may be dually connected to the same AX or
two different AXs without loss of generality. PW1 and PW2 may be
connected to the same or different ENs without loss of generality.
There is a loop in the network but it is outside the MPLS-OCC
domain and therefore must be resolved via outside mechanisms such
as STP. Otherwise, since PW1 and PW2 are completely independent
entities from the perspective of state, their operation is
identical to the Single Connect scenario, as described with respect
to FIG. 50.
[0337] FIG. 52 is a block diagram illustrating an example network
system 960 that shows a forwarding model for VPLS, single/dual
connect, MAC-based session. The basic architecture for VPLS,
Single/Dual Connect, Port Based still applies with the following
exceptions: (1) MAC addresses are not learned on the ENs, but are
placed on the PWs by the Controller. Such placement is made on both
EN-P and EN-S. (2) The AXs maintain a forwarding table that maps
SMACs to uplink PWs and DMACs to downlink subscriber ports. (3)
There is a resiliency optimization that can be made due to explicit
MAC authorization state. Specifically, when EN2 detects that it has
become primary for the PW, it can generate learning packets for
all MACs. This implies that when MACs are authorized, they are
added to both the primary and secondary PWs.
[0338] FIG. 53 is a block diagram illustrating an example network
system 970 that shows a layer two (L2) subnet arrangement. The L2
Subnet scenarios are the same as the VPLS scenarios. The primary
difference is that the ENs are the default gateways in the subnet.
It is assumed that they are running virtual router redundancy
protocol (VRRP) between themselves. The PW-Detour remains inactive
until either a packet is received at EN-S via PW-Detour or VRRP
timeouts indicate the PW-S should become active.
[0339] Packets for EPs may arrive at either EN-P or EN-S. Since PW
is attracting packets to EN-P, it is assumed that if a packet for
some MAC at EP arrives at EN-S it is transmitted to EN-P via the
non-MPLS-OCC link that completes the subnet between EN-P and EN-S.
Of course, if layer three (L3) routing is attracting packets to
EN-P for the subnet, then the cross traffic between EN-S and EN-P
is eliminated.
[0340] FIG. 54 is a block diagram illustrating an example network
system 980 that shows an L3 virtual private network (VPN)
arrangement. Since L3 VPN is an L3 service, it is assumed that EN-P
and EN-S do not share the same Bridge Domain for the same service.
As the above drawing shows, different customer edge (CE) ports are
mapped to different Bridge Domains on different ENs to support
resiliency. The PWs are carried over LSPs that may have nodes with
detours but there is no egress node protection as is possible with
the previous L2 scenarios described. Failover is then a function of
the L3 routing protocols. Finally, note that L3 VPN could be
configured to have EN1 and EN2 on the same Bridge Domain and even
support a dual connection from the CE. However, the CE would have
to be configured to know that both ports are on the same Bridge
Domain and that there are multiple routers on the subnet. In
addition, there would not be any benefit from a common subnet
between the EP and the ENs since each is considered a unique
routing adjacency so the L3 state would still have to be updated
before the network healed as EN1 and EN2 would use different MAC
addresses. Note that a resiliency model similar to the VPLS model
could be supported if L3 rather than L2 packets were encapsulated
in the PW.
[0341] FIG. 55 is a block diagram illustrating an example network
system 990 that shows a forwarding model for local switching. Local
Switching, also known as X2 interface support, is supported under
the following restrictions: Session Model must be MAC Based so that
MAC location can be tracked and FIBs set directly. If we did not
keep track of the location of all the MACs, then the system would
have to flood unknowns, and flooding is not acceptable given that
loops are created.
[0342] When it is determined that two Endpoints may local switch
between themselves, the controller sets up a PW between the AXs
hosting the Endpoints by sending messages to the AXs, and the
controller installs the MAC addresses reachable via PW in the
corresponding FIB tables. When local switching is enabled, packets
from EPs are switched to the default PW unless there exists a MAC
address in the FIB matching the DMAC of the request. If the DMAC
matches a FIB entry, the packet is switched via the PW pointed to
by the MAC address. Unknowns, broadcast and multicast are always
sent to the EN via PW1-P. When a packet is received from the local
switching PW (LSPW), it is only switched to an EP if there exists
an entry for it in its FIB and this entry must be a locally
attached Endpoint. Unknowns from LSPWs are never passed to the EN.
The set of endpoints between which local switching is enabled may
be determined by: (1) Static policy associated with the endpoints.
For example they may be made members of a local switching group.
(2) Analytics of traffic switching between two MPLS-OCC PWs in the
same bridge group. (3) Analysis of Address Resolution Protocol
(ARP) and Internet Protocol version six (IPv6) Neighbor Discovery
between EPs. Analysis of ARP in combination with policy may prove
to be the most effective mechanism.
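The switching rules above reduce to two lookups. A minimal sketch,
with illustrative table shapes (MAC string keys, PW and port
identifiers):

    def uplink_switch(dmac: str, fib: dict, default_pw: str) -> str:
        """With local switching enabled, a DMAC hit in the FIB
        selects the PW the entry points at (possibly a local
        switching PW); unknowns, broadcast and multicast all go to
        the EN over the default PW."""
        return fib.get(dmac, default_pw)

    def lspw_receive(dmac: str, local_endpoints: dict):
        """Packets from a local switching PW (LSPW) are delivered
        only to a locally attached Endpoint; unknowns are dropped,
        never passed to the EN."""
        return local_endpoints.get(dmac)  # None means drop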
[0343] FIG. 56 is a block diagram illustrating an example network
system 1000 that shows per subscriber (endpoint) packet policy and
next-hop chaining at the Access node for Uplink. Per subscriber
packet policy is basically a firewall filter that is inserted in
the packet processing path for some subscriber. A filter rule
contains packet matching tuples with actions for policing,
dropping, marking or forwarding. Policers may also mark, drop or
forward. A subscriber can be identified by either a MAC address or
a physical port location. For an Access Node, the general next-hop
chain for subscriber packet policy is shown in FIG. 56. FIG. 56
shows the forwarding blocks used when the subscriber is identified
per MAC and there is FIB switching at the AX. In the specific case
of port based policy, the Policy block would not exist and the next
hop chain associated with the ingress port would point directly to
the (optional) Filter block. In the case of direct mapping (no FIB
switching) the next-hop chain excludes the FIB element next-hop.
The general next-hop chain is encoded as follows, where a semicolon
separates the alternative next hops to which an element may point:

  ingress port -> policy lookup(SMAC); Filter; PW
  policy lookup -> Filter; FIB; PW
  Filter -> FIB; PW
  FIB -> PW
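One way to picture this encoding is as a linked chain of forwarding
elements, each of which either resolves the packet to a PW or hands
it to the next element. The sketch below is illustrative; element
behaviors are stubbed out.

    from typing import Callable, Optional

    class ChainElement:
        """One next-hop chain element (Policy, Filter, FIB or PW).
        An element may resolve the packet (returning a PW id) or
        defer to the next element in the chain."""
        def __init__(self, name: str,
                     handler: Callable[[bytes], Optional[str]],
                     nxt: Optional["ChainElement"] = None):
            self.name, self.handler, self.nxt = name, handler, nxt

        def process(self, pkt: bytes) -> Optional[str]:
            verdict = self.handler(pkt)
            if verdict is not None:
                return verdict
            return self.nxt.process(pkt) if self.nxt else None

    # Chain for a MAC-based session with filtering and FIB switching:
    pw = ChainElement("PW", lambda pkt: "pw-1")
    fib = ChainElement("FIB", lambda pkt: None, pw)
    filt = ChainElement("Filter", lambda pkt: None, fib)
    policy = ChainElement("Policy", lambda pkt: None, filt)
    assert policy.process(b"pkt") == "pw-1"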
[0344] The policy associated with the Endpoint therefore defines
the next-hop chain associated with the Endpoint. The next-hop chain
could be as simple as a single PW for port based sessions with no
packet policy or as complex as MAC based sessions with filtering
and local switching via a FIB. Any next-hop in the chain may modify
the next-hop chain by pushing new elements on the chain or by
replacing the entire chain as would be done by a Policy block or a
FIB.
[0345] On downlink, the process is similar but there is no filter
block as it is assumed that filtering has already been done on the
EN.
[0346] FIG. 57 is a block diagram illustrating an example network
system 1010 that shows next-hop chaining at an access node for
downlink. As shown in FIG. 57, on the downlink a packet arrives at
the AX via PW, and the PW is mapped either directly to an egress
port (in the case of port based session management) or a FIB
(assumed to be part of a named Bridge Domain) in the case of MAC
based policy.
  PW -> egress port; FIB
  FIB -> egress port
[0347] FIG. 58 is a block diagram illustrating an example network
system 1020 that shows Policy and Next-Hop Chaining at the Edge
Node for Downlink. On the edge node, for downlink, a packet
arrives at some Bridge Domain and is switched to the egress PW. If
policy exists the packet is first passed through the policy filter.
In the case of port based session, the filter is associated with
the DMAC when it is learned from the PW. In the case of MAC based
sessions, the MAC is installed in the FIB and a per-MAC policy
entry is associated.
  FIB -> Filter; PW
[0348] FIG. 59 is a block diagram illustrating an example system
1040 that shows Next-Hop Chaining at the Edge Node for Uplink. For
uplink, a packet is switched from the PW directly in the FIB
associated with the PW.
  PW -> FIB
[0349] The architecture described herein can also provide for
application control of packet policy. Network operators may use
dynamic packet policy insertion. Routers typically only support
static policy and require a configuration change to modify the
policy rules effective on the system. With dynamic policy, an
application could know via external means that a specific
subscriber flow requires special treatment, for example, policing,
dropping or re-marking. In some examples, the controller provides
an API that allows an application to modify the policy of a
particular user in real time.
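The API surface is not specified here, so the following client-side
sketch is hypothetical: the URL, resource layout and policy schema
are invented for illustration of how an application might modify a
subscriber's policy in real time.

    import json
    import urllib.request

    # Hypothetical controller endpoint and policy schema.
    CONTROLLER_URL = "http://controller.example/api/v1/subscribers"

    def set_subscriber_policy(subscriber_id: str, policy: dict) -> int:
        """PUT the new policy for one subscriber and return the
        HTTP status code."""
        req = urllib.request.Request(
            "%s/%s/policy" % (CONTROLLER_URL, subscriber_id),
            data=json.dumps(policy).encode(),
            method="PUT",
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return resp.status

    # e.g. classify a voice flow as EF and police it to 128 kbps:
    ef_policy = {"match": {"udp_dst_port": 5060},
                 "set_class": "EF", "police_kbps": 128}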
[0350] For example, a voice stream might be identified by a voice
signaling gateway. This gateway could request that the controller
classify the stream as Expedited Forwarding (EF) traffic and run a
policer on the traffic to ensure that the EF class is not abused.
However, realizing such a capability in real time may not be
feasible for two reasons: one is the potential amount of per-flow
signaling required in the network; the second is the ability of
existing systems to effect the policy in real time, since such a
capability may require rework of existing packet policy
mechanisms.
[0351] The architecture described herein may be targeted at some
specific example deployment scenarios. For example, in some aspects
the techniques of this disclosure may be used in mobile backhaul
networks for small-cell deployment. Examples of a central
controller operating with a mesh network of simple nodes are
described in U.S. Ser. No. 14/500,793, entitled "MESH NETWORK OF
SIMPLE NODES WITH CENTRALIZED CONTROL," filed Sep. 29, 2014, the
entire contents of which are incorporated by reference herein.
[0352] The limitations associated with licensed radio spectrum and
the increasing demands in data traffic from mobile users are
forcing service providers to think about solutions that require the
cell-size to shrink to increase spatial reuse of spectrum, as well
as solutions that require the use of unlicensed spectrum to
supplement the capacity provided by licensed bands. Increasing use
of small cells implies new demands on the backhaul
technologies--both wired and wireless. In some examples, the small
cells (say, on pole-tops) could be connected to the pre-aggregation
boxes either by fiber (using some variant of Passive Optical
Network (PON) technology) or by wireless technology. Wireless
backhaul technology could operate in the microwave range, 60 GHz
or sub 6 GHz ranges. Some of these technologies are inherently
line-of-sight (LOS) implying that the towers (or antennas) need to
be in clear view of each other, while others are either
near-line-of-sight or non-line-of-sight depending on how much the
waves can travel around obstacles in the line of view. Similarly, a
wireless backhaul device could be PTP (point-to-point) or PMP
(point-to-multi-point).
[0353] Support is also needed for Heterogeneous Networks (Het-Nets)
which use a combination of small cells and macro-cells to cover a
given area. The radio resources are shared between the small cell
radios and the macro-cellular radios and often traffic is
backhauled from the small cells to the macro-cell which acts as an
aggregator.
[0354] Service Providers are also deploying small-cells with Wi-Fi
access. Wi-Fi Access Points (APs), with radios for both Wi-Fi-based
access and Wi-Fi-based backhaul, could be deployed to operate as a
mesh, or could be used to extend the coverage of a wired network to
places without Ethernet cabling. In the case of Wi-Fi mesh, the
Root Access Point needs to have connectivity to the pre-aggregation
box, and simple mesh routing protocols are used to send packets to
the APs in the mesh.
[0355] There are several key requirements for this use case that
the architecture described herein can provide, including the
ability to operate at a large scale. The number of small cells in a
typical service provider deployment is likely to be in the
thousands. The techniques described herein can be used to easily
configure and provision these many devices and also ensure that the
experience of the connected users is of high quality. The previous
point about large scale also necessitates plug-and-play support to
avoid having the service provider send expert technicians to
help set up the equipment at every location. Untrained technicians
should be able to mount the devices to the pole-tops and connect
them to power, and then the device should be able to connect to the
network, find the Controller and configure itself. The architecture
also may need to have a small software footprint. The pole-top
mounted backhaul devices in small cells often have very limited
hardware capabilities in terms of CPU power and memory. The
software that runs on these devices for controlling the device
needs to be lightweight. In addition, the access devices may be
exposed to the elements, and so should be able to support extremes
in temperature, rain, etc.
[0356] The techniques of this disclosure may be
backhaul-technology-agnostic, and work well for both wired and
wireless backhaul of the small cell traffic. In certain
deployments, like urban areas, fiber access may be available to the
small cell devices, while in other deployments, because of the
environment, wireless backhaul, LOS or NLOS, may be preferred.
[0357] Support for X2 interface is also needed. Long-term evolution
(LTE) introduces the notion of an X2 interface between adjacent
cells, primarily for the purpose of transferring low latency
control traffic between cells. The techniques of this disclosure
can provide support for such east-west connections or paths between
cells (e.g., between access nodes).
[0358] The techniques of this disclosure may also allow the
controller to provide robust timing and synchronization out to the
cells, which can be useful for the proper operation of the radio
access network (RAN), for example. The overall solution can have
support for Institute of Electrical and Electronics Engineers
(IEEE) 1588v2 and Synchronous Ethernet (Sync-E). This is described
in further detail in U.S. Ser. No. 14/586,507, entitled
"CONTROLLER-BASED NETWORK DEVICE TIMING SYNCHRONIZATION," filed Dec.
30, 2014, the entire contents of which are incorporated by
reference herein.
[0359] Another example use-case is the support of macro-cellular
traffic. Typically a cell-site router will be placed in a hut at
the site of the cell-tower, and the cell-site router aggregates
traffic from a number of access devices--2G, 3G, 4G/LTE--and
carries it to the pre-aggregation box in the CO. With the
dramatic increase in mobile traffic, the backhaul of traffic from
the macro-cell sites to the core network is a key area of spend for
service providers. Also, the advent of LTE and LTE-A has created an
inflection point where the service providers are considering packet
transport for the backhaul traffic.
[0360] The key requirements from this use case are: (1)
backhaul-technology-agnostic, working well for both wired and
wireless backhaul of mobile traffic. In certain deployments, there
may be fiber or other wired access to the cell-tower, while in
other deployments, because of the terrain, wireless backhaul may be
preferred. (2) Support for X2 interface is also needed. Long-term
evolution (LTE) introduces the notion of an X2 interface between
adjacent cells, primarily for the purpose of transferring low
latency control traffic between cells. The techniques of this
disclosure can provide support for such east-west connections or
paths between cells (e.g., between access nodes). (3) Robust timing
and synchronization out to the cells, and (4) ruggedization. The
backhaul device at the cell tower typically sits in an enclosure
but is still exposed to the elements. Such devices need to be
ruggedized, as they operate under extreme conditions.
[0361] Another example use case for the techniques of this
disclosure is for fixed wireless broadband. In places where sites
are very far apart (as in rural areas) or in places where it is
hard to install Ethernet cable or fiber, the last hop from the CO
to the residence may need to be over wireless links. Fixed wireless
broadband access is quite common in developing countries as it
allows quick rollout of services and on-boarding of subscribers.
Also, in rural areas of developed countries, where houses are far
apart, such wireless access is commonly used to connect customers
to the network. Typically, a tower with some point-to-multipoint
technology is used to connect to wireless devices on the sides of
houses, from where wired or wireless (e.g., Wi-Fi) connectivity is
provided to the residents in the dwelling. The key requirements
from this use-case are: (1) Robust wireless backhaul support: The
key feature in this use-case is that the last hop to the
customer-premise is wireless. So, the solution needs to be able to
support high capacity and QoS over this wireless link. (2) Scale:
Since this use-case is about connectivity to customer premises, the
scale is likely to be large as the leaf-nodes (customer dwellings)
could be in the hundreds. Managing a large number of end-devices is
a key requirement as is monitoring and troubleshooting. (3)
Plug-and-play: The previous point about large scale also
necessitates plug-and-play support to avoid having the service
provider send expert technicians to help set up the equipment at
the customer premises. The customers should ideally be able to
connect the device and then the device should be able to connect to
the network, find the Controller and configure itself. (4) Small
software footprint: The CPE devices are usually very inexpensive
and have very limited hardware capabilities in terms of CPU power
and memory. The software that runs on these devices for controlling
the device needs to be lightweight. (5) Ruggedized: The backhaul
devices on the tower and on the side of the dwelling are typically
exposed to the elements. Such devices need to be ruggedized as they
operate under extreme conditions.
[0362] Another example use case is for a converged access and
aggregation network. Service providers today are under increased
pressure to provide bandwidth and services while keeping prices
flat. One typical way they are doing this is by converging the
mobile backhaul, residential and business networks, which
previously used to be run as three separate networks. By moving to
a common or universal backhaul infrastructure for these three key
networks, the SPs are able to save expenses by making better use of
capacity, by providing a common management infrastructure for
configuration, monitoring and troubleshooting and by using common
subscriber management functionality. Residential and business
networks typically use wired backhaul to the CO, running over DSL,
cable or optical fiber. Mobile backhaul networks could use wired or
wireless backhaul.
[0363] The key requirements from this use-case for the architecture
are: (1) Plug-and-play: The need to support large numbers of
Endpoints on customer premises, business sites and cell-sites
necessitates plug-and-play devices to avoid having the service
provider send expert technicians to help set up the equipment at
every location. Untrained technicians or even the customer should
be able to install the devices and connect them to power, and then
the device should be able to connect to the network, find the
Controller and configure itself. (2) Small software footprint: The
CPE devices are usually very inexpensive and have very limited
hardware capabilities in terms of CPU power and memory. The
software that runs on these devices for controlling the device
needs to be lightweight. (3) Interoperability: Most SPs already
have a lot of equipment out in the field, and they are unwilling to
completely rip-and-replace their gear for a new technology. So, any
new technology needs to be able to interoperate with the equipment
that is already in the field. Also, some SPs are typically
unwilling to buy all their equipment from a single vendor. They
prefer standards-based technologies that will work with equipment
from multiple vendors, rather than be locked into a proprietary
technology from a single vendor, regardless of how good the
technology is. There are, of course, other SPs that are willing to
deploy proprietary technology from a vendor, because their
preference is to deploy an end-to-end solution from a single
vendor. (4) Wired and wireless backhaul: The solution should be
backhaul-technology-agnostic, and work well for both wired and
wireless backhaul. In this use-case, the backhaul is primarily
wired, although there may be some wireless backhaul for the mobile
traffic.
[0364] The techniques described in this disclosure may be
implemented, at least in part, in hardware, software, firmware or
any combination thereof. For example, various aspects of the
described techniques may be implemented within one or more
processors, including one or more microprocessors, digital signal
processors (DSPs), application specific integrated circuits
(ASICs), field programmable gate arrays (FPGAs), or any other
equivalent integrated or discrete logic circuitry, as well as any
combinations of such components. The term "processor" or
"processing circuitry" may generally refer to any of the foregoing
logic circuitry, alone or in combination with other logic
circuitry, or any other equivalent circuitry. A control unit
comprising hardware may also perform one or more of the techniques
of this disclosure.
[0365] Such hardware, software, and firmware may be implemented
within the same device or within separate devices to support the
various operations and functions described in this disclosure. In
addition, any of the described units, modules or components may be
implemented together or separately as discrete but interoperable
logic devices. Depiction of different features as modules or units
is intended to highlight different functional aspects and does not
necessarily imply that such modules or units must be realized by
separate hardware or software components. Rather, functionality
associated with one or more modules or units may be performed by
separate hardware or software components, or integrated within
common or separate hardware or software components.
[0366] The techniques described in this disclosure may also be
embodied or encoded in a computer-readable medium, such as a
computer-readable storage medium, containing instructions.
Instructions embedded or encoded in a computer-readable medium may
cause a programmable processor, or other processor, to perform the
method, e.g., when the instructions are executed. Computer-readable
media may include non-transitory computer-readable storage media
and transient communication media. Computer readable storage media,
which is tangible and non-transitory, may include random access
memory (RAM), read only memory (ROM), programmable read only memory
(PROM), erasable programmable read only memory (EPROM),
electronically erasable programmable read only memory (EEPROM),
flash memory, a hard disk, a CD-ROM, a floppy disk, a cassette,
magnetic media, optical media, or other computer-readable storage
media. It should be understood that the term "computer-readable
storage media" refers to physical storage media, and not signals,
carrier waves, or other transient media.
[0367] Various embodiments of the invention have been described.
These and other embodiments are within the scope of the following
claims.
* * * * *