U.S. patent application number 11/413409 was filed with the patent office on April 28, 2006, and published on November 1, 2007, for differentiated services using weighted quality of service (QoS).
This patent application is currently assigned to Tellabs San Jose, Inc. Invention is credited to Robert J. Colvin, David S. Curry, Paul M. Hallinan, Man-Tung T. Hsiao, Sanjay Khanna, Rishi Mehta, Samer I. Nubani, and Ravindra Sunkad.
United States Patent Application 20070253438
Kind Code: A1
Inventors: Curry; David S.; et al.
Publication Date: November 1, 2007 (2007-11-01)
Application Number: 11/413409
Family ID: 38434805
Filed: April 28, 2006
Differentiated services using weighted quality of service (QoS)
Abstract
Differentiated services for network traffic using weighted
quality of service is provided. Network traffic is queued into
separate per flow queues, and traffic is scheduled from the per
flow queues into a group queue. Congestion management is performed
on traffic in the group queue. Traffic is marked with priority
values, and congestion management is performed based on the
priority values. For example, traffic can be marked as "in
contract" if it is within a contractual limit, and marked as "out
of contract" if it is not within the contractual limit. Marking can
also include classifying incoming traffic based on Differentiated
Service Code Point. Higher priority traffic can be scheduled from
the per flow queues in a strict priority over lower priority
traffic. The lower priority traffic can be scheduled in a round
robin manner.
Inventors: Curry; David S.; (San Jose, CA); Colvin; Robert J.; (San Jose, CA); Nubani; Samer I.; (Santa Clara, CA); Sunkad; Ravindra; (Pleasanton, CA); Hsiao; Man-Tung T.; (Cupertino, CA); Hallinan; Paul M.; (San Carlos, CA); Mehta; Rishi; (San Jose, CA); Khanna; Sanjay; (Fremont, CA)
Correspondence Address: FITZPATRICK CELLA HARPER & SCINTO, 30 ROCKEFELLER PLAZA, NEW YORK, NY 10112, US
Assignee: Tellabs San Jose, Inc., Naperville, IL
Family ID: 38434805
Appl. No.: 11/413409
Filed: April 28, 2006
Current U.S. Class: 370/412; 370/395.4
Current CPC Class: H04L 47/621 20130101; H04L 47/60 20130101; H04L 47/624 20130101; H04L 47/10 20130101; H04L 47/2441 20130101; H04L 49/90 20130101; H04L 47/2408 20130101; H04L 47/31 20130101
Class at Publication: 370/412; 370/395.4
International Class: H04L 12/56 20060101 H04L012/56; H04L 12/28 20060101 H04L012/28
Claims
1. A method for offering differentiated service of network traffic,
the method comprising: queuing the traffic into a first plurality
of separate per flow queues; scheduling the traffic from the per
flow queues into a group queue; and performing congestion
management on traffic in the group queue.
2. The method of claim 1, further comprising: marking traffic with
priority values according to priority, wherein the congestion
management is performed based on the priority values.
3. The method of claim 2, wherein the marking includes determining
whether the traffic is within a contractual limit, and marking the
traffic as "in contract" if the traffic is within the contractual
limit, and marking the traffic as "out of contract" if the traffic
is not within the contractual limit.
4. The method of claim 2, wherein the marking includes classifying
incoming traffic based on Differentiated Service Code Point.
5. The method of claim 1, further comprising: performing congestion
management in a per flow queue.
6. The method of claim 1, wherein the traffic is scheduled from the
per flow queues by scheduling higher priority traffic in a strict
priority over lower priority traffic.
7. The method of claim 6, wherein the lower priority traffic
includes a first lower priority traffic and a second lower priority
traffic, and the traffic is scheduled from the per flow queues by
scheduling the first lower priority traffic and the second lower
priority traffic based on a round robin process.
8. The method of claim 1, further comprising: scheduling traffic
from the group queue into a second plurality of separate per flow
queues based on priority; scheduling traffic from the second
plurality of separate per flow queues into either of a high
priority group queue or a low priority group queue, wherein traffic
in higher priority per flow queues of the second plurality of
separate per flow queues is scheduled into the high priority group
queue, and traffic in lower priority per flow queues of the second
plurality of per flow queues is scheduled into the low priority
group queue.
9. The method of claim 8, wherein traffic in the higher priority
per flow queues of the second plurality of separate per flow queues
is scheduled into the high priority group queue based on a round
robin process, and traffic in the lower priority per flow queues of
the second plurality of separate per flow queues is scheduled into
the low priority group queue based on a round robin process.
10. The method of claim 8, further comprising: scheduling traffic
from the high priority group queue in a strict priority over
traffic in the low priority group queue.
11. The method of claim 1, wherein the traffic includes a plurality
of types of traffic including user control traffic, expedited
forwarding traffic, assured forwarding traffic, and best effort
traffic, and the traffic is queued into the per flow queues
according to traffic type.
12. A method for offering differentiated service of network
traffic, the method comprising: queuing the traffic into a first
plurality of separate per flow queues; scheduling the traffic from
the first plurality of per flow queues into either of a high
priority group queue or a low priority group queue, wherein traffic
in higher priority per flow queues of the first plurality of
separate per flow queues is scheduled into the high priority group
queue, and traffic in lower priority per flow queues of the first
plurality of separate per flow queues is scheduled into the low
priority group queue; and performing congestion management on
traffic in the high priority group queue and traffic in the low
priority group queue.
13. The method of claim 12, wherein traffic in the higher priority
per flow queues of the first plurality of separate per flow queues
is scheduled into the high priority group queue based on a round
robin process, and traffic in the lower priority per flow queues of
the first plurality of separate per flow queues is scheduled into
the low priority group queue based on a round robin process.
14. The method of claim 12, further comprising: scheduling traffic
from the high priority group queue and the low priority group queue
onto a second plurality of separate per flow queues based on
priority; scheduling traffic from the per flow queues of the second
plurality of separate per flow queues onto either of a second high
priority group queue or a second low priority group queue, wherein
traffic in higher priority per flow queues of the second plurality
of separate per flow queues is scheduled onto the second high
priority group queue, and traffic in lower priority per flow queues
of the second plurality of separate per flow queues is scheduled
onto the second low priority group queue.
15. The method of claim 14, wherein traffic in the higher priority
per flow queues of the second plurality of separate per flow queues
is scheduled onto the second high priority group queue based on a
round robin process, and traffic in the lower priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second low priority group queue based on a round
robin process.
16. The method of claim 14, further comprising: scheduling traffic
from the second high priority group queue in a strict priority over
traffic in the second low priority group queue.
17. An apparatus for offering differentiated services of network
traffic, the apparatus comprising: a per flow scheduler to queue
the traffic into a first plurality of separate per flow queues; a
higher priority scheduler to schedule traffic in higher priority
per flow queues of the first plurality of separate per flow queues
into a group queue; a lower priority scheduler to schedule traffic
from lower priority per flow queues of the first plurality of per
flow queues into the group queue; and a group queue congestion
manager to perform congestion management on traffic in the group
queue.
18. The apparatus of claim 17, further comprising: a second per
flow scheduler to schedule traffic from the group queue into a
second plurality of separate per flow queues based on priority; a
second higher priority scheduler to schedule traffic from higher
priority per flow queues of the second plurality of separate per
flow queues into a high priority group queue; and a second lower
priority scheduler to schedule traffic from lower priority per flow
queues of the second plurality of separate per flow queues into a
low priority group queue.
19. The apparatus of claim 18, wherein traffic in the higher
priority per flow queues of the second plurality of separate per
flow queues is scheduled into the high priority group queue based
on a round robin process, and traffic in the lower priority per
flow queues of the second plurality of separate per flow queues is
scheduled into the low priority group queue based on a round robin
process.
20. An apparatus for offering differentiated services of network
traffic, the apparatus comprising: a per flow scheduler to queue
the traffic into a first plurality of separate per flow queues; a
high priority scheduler to schedule traffic from higher priority
per flow queues of the first plurality of separate per flow queues
into a high priority group queue; a low priority scheduler to
schedule traffic from lower priority per flow queues of the first
plurality of separate per flow queues into a low priority group
queue; a high queue congestion manager to perform congestion
management on traffic in the high priority group queue; and a low
queue congestion manager to perform congestion management on
traffic in the low priority group queue.
21. Computer-executable program instructions stored on a
computer-readable medium, the computer-executable program
instructions for offering differentiated services of network
traffic, the computer-executable instructions executable to perform
the method of: queuing the traffic in a first plurality of separate
per flow queues; scheduling the traffic from the per flow queues of
the first plurality of separate per flow queues into a group queue;
performing congestion management on traffic in the group queue;
scheduling traffic from the group queue into a second plurality of
separate per flow queues based on priority; and scheduling traffic
from the second plurality of separate per flow queues into either of
a high priority group queue or a low priority group queue, wherein
traffic in higher priority per flow queues of the second plurality
of separate per flow queues is scheduled into the high priority
group queue, and traffic in lower priority per flow queues of the
second plurality of per flow queues is scheduled into the low
priority group queue.
22. The computer-executable program instructions of claim 21,
wherein traffic in the higher priority per flow queues of the
second plurality of separate per flow queues is scheduled into the
high priority group queue based on a round robin process, and
traffic in the lower priority per flow queues of the second
plurality of separate per flow queues is scheduled into the low
priority group queue based on a round robin process.
23. Computer-executable program instructions stored on a
computer-readable medium, the computer-executable program
instructions for offering differentiated services of network
traffic, the computer-executable instructions executable to perform
the method of: queuing the traffic into a first plurality of
separate per flow queues; scheduling the traffic from the first
plurality of per flow queues into either of a high priority group
queue or a low priority group queue, wherein traffic in higher
priority per flow queues of the first plurality of separate per
flow queues is scheduled into the high priority group queue, and
traffic in lower priority per flow queues of the first plurality of
separate per flow queues is scheduled into the low priority group
queue; and performing congestion management on traffic in the high
priority group queue and traffic in the low priority group queue.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] This invention pertains to devices, methods, and computer
programs providing differentiated services for network traffic.
Specifically, the invention relates to a system in which
differentiated services are provided using a combination of traffic
flow weighting, group queues, and congestion management.
[0003] 2. Description of Related Art
[0004] Network service providers offer differentiated services in
order to tailor customer bandwidth demands based on priority levels
of a customer's network traffic. In particular, higher priority
traffic is generally given preference over lower priority traffic,
thus increasing bandwidth and reducing delay for higher priority
traffic at the expense of the lower priority traffic. However, many
traditional differentiated services methods do not properly balance
high priority and low priority traffic. As a result, lower priority
traffic sometimes can be prematurely discarded in conditions of
network congestion.
SUMMARY OF THE INVENTION
[0005] To address the foregoing, the present invention provides a
method, apparatus, and computer program for providing
differentiated services for network traffic. In one embodiment, the
traffic is queued into a first plurality of separate per flow
queues, and the traffic is scheduled from the per flow queues into
a group queue. Congestion management is performed on traffic in the
group queue.
[0006] In at least one embodiment of the present invention, traffic
is marked with priority values according to priority, and
congestion management is performed based on the priority values.
For example, the marking can include determining whether the
traffic is within a contractual limit, and marking the traffic as
"in contract" if the traffic is within the contractual limit, and
marking the traffic as "out of contract" if the traffic is not
within the contractual limit. In another example, the marking can
include classifying incoming traffic based on Differentiated
Service Code Point.
[0007] According to one embodiment of the present invention,
congestion management is performed in a per flow queue.
[0008] In another embodiment of the present invention, the traffic
is scheduled from the per flow queues by scheduling higher priority
traffic in a strict priority over lower priority traffic. For
example, the lower priority traffic can include a first lower
priority traffic and a second lower priority traffic. In this case,
traffic is scheduled from the per flow queues by scheduling the
first lower priority traffic and the second lower priority traffic
based on a round robin process.
[0009] The traffic can include a plurality of types of traffic
including user control traffic, expedited forwarding traffic,
assured forwarding traffic, and best effort traffic, and the
traffic can be queued into the per flow queues according to traffic
type.
[0010] In another embodiment, traffic from the group queue is
scheduled into a second plurality of separate per flow queues based
on priority, and traffic from the second plurality of separate per
flow queues is scheduled into either of a high priority group queue
or a low priority group queue. In this case, traffic in higher
priority per flow queues of the second plurality of separate per
flow queues is scheduled into the high priority group queue, and
traffic in lower priority per flow queues of the second plurality
of per flow queues is scheduled into the low priority group
queue.
[0011] In another aspect of the present invention, traffic in the
higher priority per flow queues of the second plurality of separate
per flow queues is scheduled into the high priority group queue
based on a round robin process, and traffic in the lower priority
per flow queues of the second plurality of separate per flow queues
is scheduled into the low priority group queue based on a round
robin process.
[0012] In a further aspect, traffic can also be scheduled from the
high priority group queue in a strict priority over traffic in the
low priority group queue.
[0013] In another embodiment of the present invention, the network
traffic is queued into a first plurality of separate per flow
queues, and the traffic is scheduled from the first plurality of
per flow queues into either of a high priority group queue or a low
priority group queue. Traffic in higher priority per flow queues of
the first plurality of separate per flow queues is scheduled into
the high priority group queue, and traffic in lower priority per
flow queues of the first plurality of separate per flow queues is
scheduled into the low priority group queue. Congestion management
is performed on traffic in the high priority group queue and
traffic in the low priority group queue.
[0014] In another aspect of the present invention, traffic in the
higher priority per flow queues of the first plurality of separate
per flow queues is scheduled into the high priority group queue
based on a round robin process, and traffic in the lower priority
per flow queues of the first plurality of separate per flow queues
is scheduled into the low priority group queue based on a round
robin process.
[0015] In a further aspect, traffic is scheduled from the high
priority group queue and the low priority group queue onto a second
plurality of separate per flow queues based on priority, and
traffic is scheduled from the per flow queues of the second
plurality of separate per flow queues onto either of a second high
priority group queue or a second low priority group queue. In this
case, traffic in higher priority per flow queues of the second
plurality of separate per flow queues is scheduled onto the second
high priority group queue, and traffic in lower priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second low priority group queue.
[0016] In another aspect, traffic in the higher priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second high priority group queue based on a
round robin process, and traffic in the lower priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second low priority group queue based on a round
robin process. Traffic from the second high priority group queue
can be scheduled in a strict priority over traffic in the second
low priority group queue.
[0017] The invention can be embodied in, without limitation, a
method, apparatus, or computer-executable program instructions.
[0018] This brief summary has been provided so that the nature of
the invention may be understood quickly. A more complete
understanding of the invention can be obtained by reference to the
following detailed description in connection with the attached
drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] The present invention will be more readily understood from a
detailed description of the preferred embodiments taken in
conjunction with the following figures:
[0020] FIG. 1 is a block diagram of an exemplary environment in
which an embodiment of the present invention can be
implemented.
[0021] FIG. 2A is a block diagram of an exemplary data processing
system in which an embodiment of the present invention can be
implemented.
[0022] FIG. 2B is a perspective view of a configuration of an
exemplary network router in which an embodiment of the present
invention can be implemented.
[0023] FIG. 2C is a block diagram of an exemplary implementation of
the network router of FIG. 2B.
[0024] FIG. 3 is a traffic flow diagram of an ingress (originating)
node according to one embodiment of the present invention.
[0025] FIG. 4 is a traffic flow diagram of an ingress (originating)
node according to another embodiment of the present invention.
[0026] FIGS. 5A and 5B are process flowcharts of a method of
servicing traffic in an ingress (originating) node according to one
embodiment of the present invention.
[0027] FIG. 5C is a collaboration diagram for functional modules
for servicing traffic in an ingress (originating) node according to
one embodiment of the present invention.
[0028] FIG. 6 is a traffic flow diagram of a transit node according
to one embodiment of the present invention.
[0029] FIG. 7A is a process flowchart of a method of servicing
traffic in a transit node according to one embodiment of the
present invention.
[0030] FIG. 7B is a collaboration diagram for functional modules
for servicing traffic in a transit node according to one embodiment
of the present invention.
[0031] FIG. 8 is a block diagram of an exemplary network
interface.
[0032] FIG. 9 is a traffic flow diagram of an egress (terminating)
node according to one embodiment of the present invention.
[0033] FIG. 10 is a traffic flow diagram of an egress (terminating)
node according to another embodiment of the present invention.
[0034] FIGS. 11A and 11B are process flowcharts of a method of
servicing traffic in an egress (terminating) node according to one
embodiment of the present invention.
[0035] FIG. 11C is a collaboration diagram for functional modules
for servicing traffic in an egress (terminating) node according to
one embodiment of the present invention.
[0036] FIG. 12 is a diagram of a congestion management
configuration according to one embodiment of the present
invention.
[0037] FIG. 13 is a diagram of a congestion algorithm according to
one embodiment of the present invention.
[0038] FIG. 14 is a process flowchart of an exemplary congestion
management algorithm according to one embodiment of the
invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0039] Preferred embodiments of the present invention are described
below with reference to the accompanying drawings. The embodiments
include an apparatus, system, method, and computer program
providing differentiated services for network traffic using a
combination of traffic flow weighting, group queues, and congestion
management.
[0040] As one example of a use of the present invention, a network
service provider could offer differentiated services enabled by the
present invention to its customers. In particular, customers
contract with the network provider via a service level agreement
(SLA) to receive a particular type and level of service for the
customer's network traffic. In one exemplary environment of the
present invention, described below in reference to FIG. 1, the
network provider operates a network, or "cloud," of network traffic
routers. The customer sends network traffic through the provider's
network to various network destinations. The network provider
services the customer's traffic utilizing embodiments of the
present invention to achieve the terms of the SLA.
[0041] In particular, differentiated services are provided to the
customer. That is, the customer's network traffic is differentiated
based on levels of priority. Different levels of weighting, which
can be customizable by the customer, can then be applied to the
differentiated traffic. In addition, the use of group queues and
congestion management allow balanced service of the customer's
network traffic. In contrast with conventional methods of
differentiated services, which can sometimes discard lower priority
traffic prematurely, the present invention allows lower priority
traffic to compete more fairly with higher priority traffic. Other
advantages of the present invention will become apparent in the
description of the preferred embodiments below.
[0042] FIG. 1 depicts one example of an environment in which the
present invention can be implemented, which is a multiprotocol
label switching (MPLS) network 101. However, one skilled in the art
will understand that the present invention can be utilized in other
types of networks as well. FIG. 1 shows label edge routers (LERs),
including LER 100 and LER 120, and label switching routers (LSRs),
including LSRs 105, 110, and 115. Network traffic entering LER 100,
for example, is assigned a label switched path (LSP), which defines
a path (or partial path) for the traffic through the network. For
example, in the illustrated embodiment, an LSP is shown as
traversing LER 100 to LSR 105 to LSR 110 to LSR 115 to LER 120. For
this LSP, LER 100 is an ingress (originating) node for traffic,
while LSRs 105, 110, and 115 are transit nodes, and LER 120 is an
egress (terminating) node. In other words, when a customer's
network traffic enters LER 100, which is serving as an ingress
(originating) node, an LSP is defined to allow the traffic to reach
its destination. Preferred embodiments of the present invention,
described below, are implemented in the provider's LERs and LSRs to
allow the customer's network traffic to traverse the LSP.
[0043] The LERs and LSRs depicted in FIG. 1 can be implemented as
data processing systems, and the present invention can be
implemented as computer-executable program instructions stored on a
computer-readable medium of the data processing systems. In other
embodiments, software modules or circuitry can be used to implement
the present invention.
[0044] For example, the present invention can be implemented on a
general purpose computer. FIG. 2A is an architecture diagram for an
exemplary data processing system 1100, which could be used as an
LER and/or LSR for performing operations as an originating,
transit, or egress node in accordance with exemplary embodiments of
the present invention described in detail below.
[0045] Data processing system 1100 includes a processor 1110
coupled to a memory 1120 via system bus 1130. The processor is also
coupled to external Input/Output (I/O) devices (not shown) via the
system bus 1130 and an I/O bus 1140. A storage device 1150 having a
computer-readable medium is coupled to the processor 1110 via a
storage device controller 1160 and the I/O bus 1140 and the system
bus 1130. The storage device 1150 is used by the processor 1110 and
controller 1160 to store and read/write data 1170 and program
instructions 1180 used to implement the procedures described below.
For example, those instructions 1180 can perform any of the methods
described below for operation as an originating node (in
conjunction with FIGS. 3, 4, 5A, 5B and 5C), a transit node (in
conjunction with FIGS. 6, 7A and 7B), and/or a terminating node (in
conjunction with FIGS. 9, 10, 11A, 11B and 11C).
[0046] The processor 1110 may be further coupled to a
communications device 1190 via a communications device controller
1200 coupled to the I/O bus 1140. The processor 1110 uses the
communications device 1190 to communicate with a network (not shown
in FIG. 2A) transmitting multiple flows of data as described
below.
[0047] In operation, the processor 1110 loads the program
instructions 1180 from the storage device 1150 into the memory
1120. The processor 1110 then executes the loaded program
instructions 1180 to offer differentiated services for network
traffic that arrives at data processing system 1100 and is enqueued
in memory 1120. Thus, processor 1110 operates under the control of
the instructions 1180 to perform the methods of this invention, as
described in more detail below.
[0048] The present invention can also be implemented, for example,
in a network router. FIG. 2B is a perspective view of one
configuration of a network router 1210 having sixteen universal
line cards (ULCs) 1215. Each ULC 1215 has four physical line
modules (PLMs) 1220 for connecting with physical network lines to
receive and transmit network traffic. The router 1210 also has
three switch and control cards (SCCs) 1225, which serve as switched
paths between each of the ULCs 1215. The SCCs 1225 allow traffic to
flow from a first ULC 1215 to a second ULC 1215, for example, when
ingress traffic received at a physical line of the first ULC 1215
must be transmitted on a physical line of the second ULC 1215. In
particular, each ULC 1215 is connected to the three SCCs 1225
(connections not shown). Network router 1210 could be used as an
LER and/or LSR for performing operations as originating, transit,
and egress nodes in accordance with exemplary embodiments of the
present invention described in detail below.
[0049] FIG. 2C is a block diagram showing details of a ULC 1215, in
which operations according to one embodiment of the present
invention are performed. Specifically, ULC 1215 preferably includes
several application specific integrated circuits (ASICs), such as
packet processor (PP) 1230, packet manager (PM) 1235, packet
scheduler (PS) 1240, fabric gateway (FG) 1245, PM 1250, PS 1255,
and PP 1260. ULC 1215 also includes a general purpose processor CPU
1265 (connections not shown). The ASICs and CPU 1265 of ULC 1215
are configured to perform the methods of the present invention, as
described in more detail below.
[0050] Generally, ingress traffic received through a PLM 1220
preferably is processed by an "ingress side" of ULC 1215, which
includes PP 1230, PM 1235, and PS 1240. Traffic from the "ingress
side" is sent to FG 1245, which transmits the traffic through one
of the SCCs 1225 to an "egress side" of one of the ULCs 1215 having
the correct PLM 1220 for transmitting the traffic to the next
network destination. The "egress side," which is not necessarily on
the same ULC 1215 as the "ingress side" (although it is not shown
as such in FIG. 2C, for convenience), includes PM 1250, PS 1255,
and PP 1260. The traffic is received by the corresponding FG 1245
at the "egress side," and the traffic is processed by PM 1250, PS
1255, and PP 1260 for transmitting onto the network via a PLM 1220.
CPU 1265 performs functions such as communicating with the SCCs and
programming the PPs, PMs, and PSs as needed. FG 1245 serves as a
gateway to the SCCs to transmit traffic to and receive traffic from
the SCCs.
[0051] Illustrative embodiments of the present invention will now
be described. Each embodiment is described as being implemented
within an MPLS-capable network environment, such as the network
shown in FIG. 1, although one skilled in the art will appreciate
that the invention can be applied in different network environments
as well, and can operate using many different protocols and traffic
types. The exemplary embodiments of the present invention described
below are implemented in an MPLS network carrying RFC2547 IP-VPN
traffic, and in an MPLS network carrying VPLS/VPWS traffic. These
two implementations will be described concurrently, where the
embodiments overlap. For convenience, the configuration and
behavior for the RFC2547 IP-VPN implementation will be described in
detail, and any differences in the VPLS/VPWS implementation will be
highlighted.
[0052] To illustrate the QoS behavior and implementation of the
present invention, the description below divides the network
environment into three portions, as illustrated in the following
traffic flow diagrams: (1) an ingress (originating) node,
corresponding to FIGS. 3 and 4; (2) a transit node, corresponding
to FIG. 6; and (3) an egress (terminating) node, corresponding to
FIGS. 9 and 10.
[0053] Referring first to FIGS. 3 and 4, FIG. 3 illustrates an
ingress node of the RFC2547 IP-VPN embodiment, and FIG. 4 shows an
ingress node of the VPLS/VPWS embodiment. For example, the
embodiments of an ingress node according to the present invention
shown in FIGS. 3 and 4 can be implemented on LER 100 of FIG. 1,
which would initially receive a customer's network traffic for
subsequent transmission through network 101 via an LSP. For the
following description, the ingress node is described as having two
parts. First, a user access side (labeled "Ingress Side")
corresponds to the "ingress side" of ULC 1215, described above.
Second, an originating label switched path (LSP) side (i.e., a
network side, labeled "Egress Side") corresponds to the "egress
side" of ULC 1215, described above. The embodiments of ingress
nodes described below illustrate the present invention's use of
separate per flow queuing, aggregation into group queues, and
congestion management to provide differentiated services.
[0054] Various types of traffic are shown, including user control
traffic (cntl 201), expedited forwarding traffic (EF 203), best
effort traffic (BE 213), and several types of assured forwarding
(AF) traffic such as AF4 (205), AF3 (207), AF2 (209), AF1 (211),
AF4.1+AF3.1 (215), AF4.23+AF3.23 (217), AF2.1+AF1.1 (219), and
AF2.23+AF1.23 (221). Also shown are policers 223 (only one policer
is labeled), traffic shapers S.sub.1% 227, S.sub.max1 229, and
S.sub.max2 231, group queues GID.sub.low1 233, GID.sub.low2 235,
and GID.sub.high1 237, and deficit round robin (DRR) schedulers
DRR1 239, DRR2 241, and DRR3 243. The policers, shapers, and
schedulers can be implemented as software modules, which operate on
traffic stored in the queues, although in other embodiments they
may be circuitry or a combination of circuitry and software.
[0055] Referring now to FIG. 5A, along with FIGS. 2C, 3 and 4, the
operations of the Ingress Side of the ingress node will now be
described in detail.
[0056] Incoming traffic is classified (401) by PP 1230 based on
priority using Differentiated Service Code Point (DSCP) via access
control lists (ACLs). While in other embodiments, other methods of
classifying traffic can be used, ACLs are preferred because they
can offer flexibility in the classification procedure. For example,
internet protocol (IP) and media access control (MAC) ACLs can be
used. Using IP ACLs, a user, such as a customer, can classify the
traffic not only on DSCP but also on other fields in Layer3 and
Layer4 headers such as, for example, source IP address, destination
IP address, IP Protocol, source port, destination port, and others.
Similarly, the embodiment can receive traffic that has been
previously classified by a user; using MAC ACLs, a user can classify
the traffic not only on 802.1p bits but also on other fields in the
Ethernet header such as, for example, source MAC address,
destination MAC address, ethertype, and others.
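As a rough illustration of this kind of ACL-based classification (a sketch only, not the classifier implemented in PP 1230), the following Python fragment matches a packet against an ordered list of rules keyed on DSCP and on assumed Layer 3/Layer 4 field names; the rule structure, field names, and class labels are all illustrative assumptions.

    # Illustrative only: ordered-ACL classification on DSCP and L3/L4 fields.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class AclRule:
        traffic_class: str                 # e.g. "EF", "AF4", "BE" (assumed labels)
        dscp: Optional[int] = None         # None means "match any value"
        src_ip: Optional[str] = None
        dst_ip: Optional[str] = None
        protocol: Optional[int] = None
        dst_port: Optional[int] = None

        def matches(self, pkt: dict) -> bool:
            fields = [("dscp", self.dscp), ("src_ip", self.src_ip),
                      ("dst_ip", self.dst_ip), ("protocol", self.protocol),
                      ("dst_port", self.dst_port)]
            return all(v is None or pkt.get(k) == v for k, v in fields)

    def classify(pkt: dict, acl: list) -> str:
        """Return the class of the first matching rule, else best effort."""
        for rule in acl:
            if rule.matches(pkt):
                return rule.traffic_class
        return "BE"

    # Example rules: EF by DSCP 46, AF4 by DSCP 34, plus a port-based override.
    acl = [AclRule("EF", dscp=46),
           AclRule("AF4", dscp=34),
           AclRule("AF1", protocol=6, dst_port=8080)]
    print(classify({"dscp": 46, "protocol": 17}, acl))   # -> "EF"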
[0057] Next, the traffic is queued (402) into separate per flow
queues. In particular, PP 1230 determines the queue in which to
place the traffic, and PM 1235 stores the traffic in queues.
Referring specifically to FIG. 3, separate per flow queues are used
for the following traffic--cntl 201 (e.g. MP-eBGP in the RFC2547
case), EF 203, AF4 (205), AF3 (207), AF2 (209), AF1 (211) and BE
213. In contrast to the RFC2547 IP-VPN embodiment, in the VPLS/VPWS
embodiment shown in FIG. 4, AF4 (205) and AF3 (207) share a queue,
and AF2 (209) and AF1 (211) share a queue.
[0058] Returning to FIG. 5A, the traffic is policed/colored (403)
by PS 1240. In particular, policing determines that traffic is "in
contract" if the amount of traffic in the particular queue is less
than or equal to a contractual limit set forth in the SLA. If the
amount of traffic exceeds the contractual limit, policing
determines the traffic is "out of contract." As shown in FIGS. 3
and 4, the queue carrying EF 203 is policed by a policer 223, and
only "in contract" traffic is allowed to pass, while "out of
contract" traffic is discarded. The cntl 201 and EF 203 are marked
as "green" traffic for the purpose of congestion control, which is
described in more detail below. The AF queues are policed by
policers 223 to mark the traffic as "in contract" or "out of
contract." The "in contract" AF4 (205) and AF3 (207) traffic is
marked "green," and the "in contract" AF2 (209) and AF1 (211)
traffic is marked as "yellow" traffic for congestion control. The
"out of contract" AF traffic is marked as "red" traffic, e.g., AF4
Red 245, AF3 Red 247, AF2 Red 249, and AF1 Red 251 (FIG. 3). BE 213
is marked as "red," i.e., "out of contract."
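The policing and coloring step can be pictured with a small token-bucket sketch; this is a minimal illustration assuming a single committed rate per queue, not the policer 223 itself, and the class names and color mapping below simply restate the marking rules described in this paragraph.

    import time

    class Policer:
        """Token-bucket contract check: traffic within the committed rate is
        "in contract"; traffic exceeding it is "out of contract"."""
        def __init__(self, cir_bps: float, burst_bytes: int):
            self.rate = cir_bps / 8.0          # bytes per second
            self.burst = float(burst_bytes)
            self.tokens = float(burst_bytes)
            self.last = time.monotonic()

        def in_contract(self, pkt_len: int) -> bool:
            now = time.monotonic()
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= pkt_len:
                self.tokens -= pkt_len
                return True
            return False

    def color(traffic_class: str, in_contract: bool) -> str:
        """Color marking as described above; out-of-contract EF is assumed to
        have been discarded before this point."""
        if traffic_class in ("cntl", "EF"):
            return "green"
        if traffic_class in ("AF4", "AF3"):
            return "green" if in_contract else "red"
        if traffic_class in ("AF2", "AF1"):
            return "yellow" if in_contract else "red"
        return "red"                            # BE is always marked "red"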
[0059] Returning to FIG. 5A, after policing, the traffic is
scheduled (404) by PS 1240 onto a group queue GID.sub.low1 233,
which is preferably stored in PS 1240. In particular, as shown in
FIGS. 3 and 4, cntl 201 and EF 203 are shaped by shapers S.sub.1%
227 and S.sub.max1 229, respectively. Specifically, the shapers
limit the traffic flow to a predetermined amount. The queues
carrying EF 203 and cntl 201 are considered strict priority queues.
The strict priority queues (carrying cntl 201 and EF 203) take
precedence over weighted queues (carrying AF traffic and BE 213) in
terms of scheduling, with all the per flow queues aggregating into
GID.sub.low1 233. The queues carrying the AF and BE traffic are
considered weighted queues and the traffic in the weighted queues
is scheduled by DRR1 239 onto GID.sub.low1 233. The user can change
the weights of weighted queues (carrying AF and BE traffic), though
default weights can be set according to the user's needs.
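The scheduling order onto GID.sub.low1 233 can be sketched as follows: strict priority queues are always drained first, and the weighted queues are then visited in turn (a plain round robin stands in here for DRR1 239, whose deficit bookkeeping is sketched separately further below). Queue names and packet contents are illustrative.

    from collections import deque
    from itertools import cycle

    def make_scheduler(strict_queues, weighted_queues):
        """Strict priority queues (cntl, EF) always go first; weighted AF/BE
        queues are then served round robin (a stand-in for DRR1)."""
        rr = cycle(range(len(weighted_queues)))
        def next_packet():
            for q in strict_queues:
                if q:
                    return q.popleft()
            for _ in range(len(weighted_queues)):   # one full pass over weighted queues
                q = weighted_queues[next(rr)]
                if q:
                    return q.popleft()
            return None                             # everything is empty
        return next_packet

    # Usage: pull packets one at a time toward the group queue GID.low1.
    cntl, ef = deque(["c1"]), deque(["e1", "e2"])
    af4, be = deque(["a1"]), deque(["b1"])
    sched = make_scheduler([cntl, ef], [af4, be])
    print([sched() for _ in range(5)])   # -> ['c1', 'e1', 'e2', 'a1', 'b1']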
[0060] The rate of traffic flow of GID.sub.low1 233 can be set by
S.sub.max2 231, which, for example, helps a service provider to
control the incoming traffic and enforce service level agreements
(SLAs), offering aggregate committed information rate and excess
information rate (CIR+EIR) to customers.
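A shaper such as S.sub.max2 231 differs from a policer in that it delays traffic rather than marking or discarding it. The sketch below models a shaper as a single rate with a burst allowance and reports how long a packet must wait; the rate and burst values are assumptions chosen only for illustration.

    class Shaper:
        """Token-bucket shaper: returns how long (in seconds) a packet of the
        given size must wait before it may leave the group queue."""
        def __init__(self, rate_bps: float, burst_bytes: int):
            self.rate = rate_bps / 8.0        # bytes per second
            self.burst = burst_bytes
            self.level = 0.0                  # bytes already committed to the line
            self.last = 0.0                   # time of the previous packet (seconds)

        def delay_for(self, pkt_len: int, now: float) -> float:
            self.level = max(0.0, self.level - (now - self.last) * self.rate)
            self.last = now
            self.level += pkt_len
            return max(0.0, (self.level - self.burst) / self.rate)

    shaper = Shaper(rate_bps=1_000_000, burst_bytes=1500)
    print(shaper.delay_for(1500, now=0.0))   # fits in the burst -> 0.0
    print(shaper.delay_for(1500, now=0.0))   # exceeds the burst -> 0.012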
[0061] Congestion management is performed (405) by PS 1240 at the
Ingress Side (FIGS. 3 and 4). Specifically, a congestion algorithm
is utilized at GID.sub.low1 233 to discard traffic based on a
specified preference order. The congestion algorithm is described
in more detail below in reference to FIGS. 12, 13 and 14. At
GID.sub.low1 233, the discard order is BE 213 and AF "out of contract"
traffic, followed by AF1 (211) and AF2 (209) "in contract" traffic,
followed by AF3 (207) and AF4 (205) "in contract" traffic, and
finally EF 203 and cntl 201. For AF and BE traffic under congestion control,
higher-weighted queues receive preference over, i.e., will be
discarded less often than, "out of contract" AF and BE traffic and
"in contract" AF traffic (however, they will not result in "out of
contract" AF traffic getting preference over "in contract" traffic
from lower weighted AF classes). In particular, higher-weighted
queues preferably are marked with a higher-priority color than "out
of contract" traffic.
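The congestion algorithm itself is described later with reference to FIGS. 12, 13 and 14; the fragment below is only a generic threshold sketch of the discard order stated in this paragraph, in which lower-priority colors are dropped at lower queue occupancies. The threshold values are arbitrary examples, not parameters of the invention.

    # Illustrative discard decision by color as the group queue fills.
    THRESHOLDS = {        # fraction of queue occupancy above which a color is dropped
        "red": 0.50,      # BE and "out of contract" AF go first
        "yellow": 0.75,   # "in contract" AF1/AF2
        "green": 0.95,    # "in contract" AF3/AF4, EF, and cntl go last
    }

    def should_discard(color: str, occupancy: float) -> bool:
        """occupancy is the group queue fill level, between 0 and 1."""
        return occupancy >= THRESHOLDS[color]

    print(should_discard("red", 0.6))     # True  - dropped under mild congestion
    print(should_discard("green", 0.6))   # False - kept until the queue is nearly full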
[0062] Referring to FIG. 5B, along with FIGS. 2C, 3 and 4, the
features of the Egress Side of the ingress node will now be
described in detail.
[0063] The traffic is marked (406) with an EXP number by PP 1260.
The EXP number preferably corresponds to a priority of the traffic,
as shown in the following table:

TABLE 1 - Traffic type to EXP mapping

  Traffic                          EXP Value
  User Control                     6
  EF                               5
  (AF4 and AF3) in contract        4
  (AF2 and AF1) in contract        3
  (AF4 and AF3) out of contract    2
  (AF2 and AF1) out of contract    1
  BE                               0
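Table 1 amounts to a simple lookup; the following is a minimal sketch, assuming the traffic class and contract status are already known for each packet (the key structure is an assumption).

    # EXP marking per Table 1.
    EXP_MAP = {
        ("cntl", True): 6,  ("EF", True): 5,
        ("AF4", True): 4,   ("AF3", True): 4,
        ("AF2", True): 3,   ("AF1", True): 3,
        ("AF4", False): 2,  ("AF3", False): 2,
        ("AF2", False): 1,  ("AF1", False): 1,
        ("BE", False): 0,
    }

    def exp_value(traffic_class: str, in_contract: bool) -> int:
        return EXP_MAP[(traffic_class, in_contract)]

    print(exp_value("AF3", True))    # -> 4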
[0064] The traffic is separated (407) into per flow queues based on
priority. In particular, PP 1260 determines the queue in which to
store the traffic, and PM 1250 stores the traffic in the queues.
The traffic is then scheduled (408) by PS 1255 onto separate high
or low group queues, which are located in PS 1255, according to
priority. Specifically, cntl 201 and EF 203, which are marked with
EXP values of EXP=6 and EXP=5, respectively, are placed on separate
per flow queues 255. These queues 255 aggregate into GID.sub.high1
237, which is a group queue of the highest priority. Cntl 201 and
EF 203 queues 255 preferably are scheduled in a round robin manner
among themselves, such as by DRR3 243, or are scheduled based on
some other scheduling criteria.
[0065] The AF and BE traffic are placed on weighted queues 257.
Specifically, "in contract" AF4 (205) and AF3 (207), which are
marked with EXP=4, are placed in a queue 215 for AF4.1+AF3.1
traffic. "In contract" AF2 (209) and AF1 (211), which are marked
with EXP=3, are placed in a queue 219 for AF2.1+AF1.1 traffic. AF4
Red 245 and AF3 Red 247, which are marked with EXP=2, are placed in
a queue 217 for AF4.23+AF3.23 traffic. AF2 Red 249 and AF1 Red
251, which are marked with EXP=1, are placed in a queue 221 for
AF2.23+AF1.23 traffic. BE 213, which is marked with EXP=0, is
placed in a queue for BE traffic. Accordingly, in GID.sub.low2 235
in both the RFC2547 IP-VPN and the VPLS/VPWS embodiments, the
traffic classes AF4 and AF3 share a queue, and traffic classes AF2
and AF1 share a queue. BE continues to be marked as "out of
contract" traffic and is on a separate BE queue, usually with a
minimal weight.
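Taken together, the last two paragraphs describe a mapping from EXP value to a second-stage per flow queue and its group queue; the sketch below restates that mapping, with the queue identifiers drawn from the figures but the exact names assumed for illustration.

    # Second-stage queue selection on the Egress Side of an originating node.
    EXP_TO_QUEUE = {
        6: ("q_cntl",        "GID_high1"),
        5: ("q_ef",          "GID_high1"),
        4: ("q_af41_af31",   "GID_low2"),   # AF4.1 + AF3.1 ("in contract")
        3: ("q_af21_af11",   "GID_low2"),   # AF2.1 + AF1.1 ("in contract")
        2: ("q_af423_af323", "GID_low2"),   # AF4.23 + AF3.23 ("out of contract")
        1: ("q_af223_af123", "GID_low2"),   # AF2.23 + AF1.23 ("out of contract")
        0: ("q_be",          "GID_low2"),
    }

    def second_stage_queue(exp: int):
        """Return (per flow queue, group queue) for an EXP-marked packet."""
        return EXP_TO_QUEUE[exp]

    print(second_stage_queue(4))   # -> ('q_af41_af31', 'GID_low2')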
[0066] The weighted queues 257 are scheduled by DRR2 241 and
aggregate into GID.sub.low2 235, which is a group queue of the
lower priority. The weights can be set in a multiple of fractional
maximum transmission unit (MTU) bytes for the weighted queues
257.
[0067] The group queues, GID.sub.high1 237 and GID.sub.low2 235
correspond to a single interface, such as a physical network line,
in a preferred embodiment. As a consequence, the user control and
EF traffic from all LSPs goes through the same GID.sub.high1 237,
and weighted traffic (AF and BE) from all LSPs goes through the
same GID.sub.low2 235.
[0068] Congestion management is performed (409) by PS 1255 at the
Egress Side (congestion at the interface). Preferably, the
congestion algorithm described in more detail below is utilized at
GID.sub.low2 235 and GID.sub.high1 237 to discard traffic based on
a specified preference order, such as, for example, the "out of
contract" traffic (e.g., BE 213, AF1 Red 251, AF2 Red 249, AF3 Red
247, AF4 Red 245 in FIG. 3) in an LSP being discarded before the AF
"in contract" traffic, followed by EF 203 and cntl 201. Among the
AF "in contract" traffic, AF1 and AF2 "in contract" traffic is
discarded before AF3 and AF4 "in contract" traffic.
[0069] The group queues, GID.sub.high1 237 and GID.sub.low2 235,
are scheduled (410) in strict priority mode with respect to each
other. Preferably, GID.sub.high1 237 has higher priority than
GID.sub.low2 235. This ensures that user control and EF traffic get
precedence over the AF and BE traffic.
[0070] In order to support open bandwidth or auto open bandwidth
LSPs, shaping on GID.sub.high1 237 and GID.sub.low2 235 can be
turned off, that is, shaping is set to a high rate. In addition,
none of the individual queues (user control, EF, AF and BE traffic)
are policed.
[0071] In the case of multiple open bandwidth or auto open
bandwidth LSPs going over an interface, in one embodiment all LSPs
can be treated as equal; in other words, there is no prioritization
or bias among the LSPs. Moreover, as described earlier, all of the open
bandwidth LSPs going through an interface share the same group
queues (GID.sub.high1 237 and GID.sub.low2 235) which can help
ensure that fairness among the LSPs is maintained.
[0072] Having described the sequence of operations within an
ingress (originating) node, specific functional modules
implementing the above-described operations from FIGS. 5A and 5B
will now be described. FIG. 5C is a collaboration diagram for
functional modules deployed in an ingress (originating) node, such
as LER 100, for offering differentiated services in accordance with
an exemplary embodiment of the present invention. The functional
modules can be implemented as software modules or objects. In other
embodiments, the functional modules may be implemented using
hardware modules or other types of circuitry, or a combination of
software and hardware modules. In particular, the functional
modules can be implemented via the PPs, PMs, and PSs described
above.
[0073] In operation, an ingress node classifier 411 classifies
incoming traffic preferably according to DSCP, and a per flow
scheduler 413 queues traffic into separate per flow queues. For
cntl and EF traffic, a policer 415 discards "out of contract" EF
traffic and marks cntl and "in contract" EF traffic as "green." A
shaper/scheduler 417 limits cntl and EF traffic to predefined
limits and schedules the traffic onto GID.sub.low1 233 in strict
priority over the AF and BE traffic. For the AF and BE traffic, a
policer 419 marks the AF traffic as "in contract" or "out of
contract." The "in contract" AF4 (205) and AF3 (207) traffic is
marked "green," and the "in contract" AF2 (209) and AF1 (211)
traffic is marked as "yellow." The "out of contract" AF traffic is
marked as "red" traffic, and BE 213 is marked as "red." A DRR
scheduler 421 schedules AF and BE traffic onto GID.sub.low1 233 in
a deficit round robin manner, or using other scheduling
criteria.
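Deficit round robin is a standard technique; the sketch below shows the quantum and deficit bookkeeping such a scheduler typically performs, with per-queue quanta playing the role of the configurable weights mentioned above. The queue names and quantum values are illustrative assumptions, not the DRR schedulers of the figures.

    from collections import deque

    class DrrScheduler:
        """Classic deficit round robin: each queue receives a byte quantum per
        round, and a packet is sent only when the queue's deficit covers it."""
        def __init__(self, quanta):
            self.queues = {name: deque() for name in quanta}
            self.quanta = dict(quanta)
            self.deficit = {name: 0 for name in quanta}

        def enqueue(self, name, pkt_len):
            self.queues[name].append(pkt_len)

        def round(self):
            """Serve one full round; returns a list of (queue, packet_length)."""
            sent = []
            for name, q in self.queues.items():
                if not q:
                    continue
                self.deficit[name] += self.quanta[name]
                while q and q[0] <= self.deficit[name]:
                    pkt = q.popleft()
                    self.deficit[name] -= pkt
                    sent.append((name, pkt))
                if not q:
                    self.deficit[name] = 0   # unused deficit is not carried over
            return sent

    # AF4+AF3 weighted twice as heavily as AF2+AF1 (illustrative quanta, in bytes).
    drr = DrrScheduler({"AF43": 3000, "AF21": 1500, "BE": 500})
    for q, size in [("AF43", 1500), ("AF43", 1500), ("AF21", 1500), ("BE", 400)]:
        drr.enqueue(q, size)
    print(drr.round())
    # -> [('AF43', 1500), ('AF43', 1500), ('AF21', 1500), ('BE', 400)]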
[0074] A group queue congestion manager 423 applies congestion
control to the traffic, and a shaper 425 limits traffic from
GID.sub.low1 233. A traffic marker 427 marks traffic with
corresponding EXP numbers, and a per flow scheduler 429 schedules
the marked traffic into separate per flow queues. A DRR scheduler
431 preferably schedules EXP=6 and EXP=5 traffic onto GID.sub.high1
237 in a deficit round robin manner (or using another suitable
scheduling technique), and a high queue congestion manager 433
applies congestion management to GID.sub.high1 237. For the EXP=0
to EXP=4 traffic, a DRR scheduler 435 schedules the traffic onto
GID.sub.low2 235, and a low queue congestion manager 437 applies
congestion management to GID.sub.low2 235. Finally, an egress
scheduler 439 schedules traffic from GID.sub.high1 237 and
GID.sub.low2 235, with traffic from GID.sub.high1 237 preferably
being scheduled in strict priority over traffic from GID.sub.low2
235.
[0075] Traffic scheduled from an ingress (originating) node in the
above-described manner is then sent to a transit node (or an egress
(terminating) node). For example, a customer's network traffic is
sent from an ingress node implemented in LER 100 to a transit node
implemented in LSR 105. Turning now to FIG. 6, a transit node
according to one embodiment of the present invention will now be
described. For example, the embodiment shown in FIG. 6 could be
implemented in LSR 105 of FIG. 1. The transit node of the present
embodiment is operable to receive traffic from an ingress node of
the RFC2547 IP-VPN embodiment (FIG. 3) or from an ingress node of
the VPLS/VPWS embodiment (FIG. 4). The transit node is shown as having two parts: an
"Ingress Side"; and an "Egress Side." The transit node includes
deficit round robin schedulers, DRR4 501, DRR5 503, DRR6 505, and
DRR7 507, and group queues GID.sub.high2 509, GID.sub.high3 511,
GID.sub.low3 513, and GID.sub.low4 515. The transit node preserves
the EXP values for LSP traffic flowing through it. In addition,
traffic prioritization in a transit node is the same as in the
originating node. The policers, shapers, and schedulers can be implemented as
software modules, which operate on traffic stored in the queues,
although in other embodiments they may be circuitry or a
combination of circuitry and software.
[0076] The embodiment of a transit node described below illustrates
the present invention's use of separate per flow queuing,
aggregation into group queues, and congestion management to provide
differentiated services.
[0077] Referring to FIG. 7A, along with FIGS. 2C and 6, the
operations performed by the transit node will now be described in
detail.
[0078] At ingress, traffic is queued (601) in separate per flow
queues. In particular, PP 1230 determines the queue in which to
store the traffic, and PM 1235 stores the traffic in queues. The
traffic is scheduled (602) by PS 1240 into a high priority or low
priority group queue, where congestion management, described in
more detail below, is performed (603). At egress the process is
repeated. Specifically, traffic is queued (604) by PP 1260 and PM
1250 in separate per flow queues, and scheduled (605) by PS 1255
into a high priority or low priority group queue, where congestion
management, described in more detail below, is performed (606).
Traffic is then scheduled (607) from the high priority and low
priority group queues by PS 1255.
[0079] For example, in a preferred embodiment of the invention,
user control and EF traffic are carried by the strict priority
queues 517 and 521, which are scheduled by DRR4 501 and DRR6 505,
respectively, and aggregated into the group queues GID.sub.high2
509 and GID.sub.high3 511. The user control and EF queues 517 and
521 are scheduled in round robin manner among themselves through
DRR4 501 and DRR6 505, although other scheduling criteria also may
be used.
[0080] AF and BE traffic are mapped to weighted queues 519 and 523
for scheduling by DRR5 503 and DRR7 507, respectively, and the
resulting traffic is aggregated into the group queues GID.sub.low3
513 and GID.sub.low4 515. Instead of putting each AF traffic type
on a separate queue, AF4 and AF3 traffic are mapped on one queue
and AF2 and AF1 traffic are mapped on another queue. This setup
allows the number of per flow queues on a universal line card (ULC)
to be conserved, and thereby can help to scale the transit LSPs
over an interface to a reasonably large number.
[0081] Similar to the description above regarding the ingress node,
the group queues at each interface are preferably scheduled in
strict priority mode, with GID.sub.high2 509 having higher priority
than GID.sub.low3 513, and GID.sub.high3 511 having higher priority
than GID.sub.low4 515. The group queues, GID.sub.high3 511 and
GID.sub.low4 515 are per interface, that is, all the LSPs on a
given interface are transmitted through GID.sub.high3 511 and
GID.sub.low4 515.
[0082] Preferably, when a node is acting as a transit node, the
shapers at the group queues are kept disabled. Also, there is no
policing at the per flow queues (user control, EF, AF, and BE
queues). Accordingly, traffic shapers and policers are not shown in
FIG. 6.
[0083] In addition, under congestion at the interfaces, either the
Ingress Side or the Egress Side, a congestion algorithm described
in more detail below is utilized at the group queues of the
corresponding interface to discard traffic based on a specified
preference order. For example, at the Ingress Side, which
corresponds to GID.sub.high2 509 and GID.sub.low3 513, BE traffic
and "out of contract" AF traffic (EXP=0, EXP=1, and EXP=2 traffic)
preferably is discarded before "in contract" AF1 and AF2 traffic
(EXP=3 traffic), followed by "in contract" AF3 and AF4 traffic
(EXP=4), EF traffic (EXP=5), and user control traffic (EXP=6).
[0084] Because the GID.sub.high traffic preferably has higher
scheduling priority than GID.sub.low traffic, the EF and user
control traffic from all the transit and ingress nodes take
precedence over AF and BE traffic. AF and BE queues 519 and 523 are
scheduled in deficit round robin (DRR) mode, or using another
scheduling technique.
[0085] Having described the sequence of operations within a transit
node, specific functional modules implementing the operations of
the node will now be described. FIG. 7B is a collaboration diagram
for functional modules deployed in a transit node, such as LSR 105,
for offering differentiated services in accordance with an
exemplary embodiment of the present invention. The functional
modules can be implemented as software modules or objects. In other
embodiments, the functional modules may be implemented using
hardware modules or other types of circuitry, or a combination of
software and hardware modules. In particular, the functional
modules can be implemented via the PPs, PMs, and PSs described
above.
[0086] In operation, an ingress per flow scheduler 609 schedules
traffic onto high priority or low priority queues. The cntl and EF
traffic is scheduled by a DRR scheduler 611 into GID.sub.high2 509,
and a high queue congestion manager 613 performs congestion
management on GID.sub.high2 509. A DRR scheduler 615 schedules
traffic from GID.sub.high2 509 onto GID.sub.high3 511, and high
queue congestion manager 617 performs congestion management on
GID.sub.high3 511.
[0087] Similarly, the AF and BE traffic is scheduled by a DRR
scheduler 619 into GID.sub.low3 513, and a low queue congestion
manager 621 applies congestion management. A DRR scheduler 623 then
schedules traffic from GID.sub.low3 513 onto GID.sub.low4 515, and
low queue congestion manager 625 applies congestion management. An
egress scheduler 627 schedules traffic from GID.sub.high3 511 and
GID.sub.low4 515, with GID.sub.high3 511 preferably given strict
priority over GID.sub.low4 515.
[0088] Having described originating and transit nodes, it is noted
that transit and originating LSPs can use a common interface (e.g.,
a physical network line), for example, by aggregating all the per
flow queues from transit and originating LSPs behind the same group
queues. FIG. 8 shows an example of a single port/interface 700
preferably used by multiple per flow queues of various transit LSPs
703 and originating LSPs 705. Multiple queues of AF and BE traffic
are aggregated behind a GID.sub.low 707 group queue having a byte count
of, for example, 10 Gbytes. Multiple per flow queues of user
control and EF traffic are aggregated into a GID.sub.high 709 group
queue having a byte count of, for example, 10 Gbytes.
[0089] After having traversed an ingress node, such as LER 100, and
possibly one or more transit nodes, such as LSR 105, network
traffic reaches an egress (terminating) node, such as LER 120. For
example, LER 120 would be the final node of the LSP traversed by a
customer's network traffic, and this egress (terminating) node
schedules the customer's traffic for forwarding to its final
destination.
[0090] Referring now to FIGS. 9 and 10, an egress (terminating)
node will now be described in detail. FIG. 9 illustrates an egress
node of the RFC2547 IP-VPN embodiment, and FIG. 10 shows an egress
node of the VPLS/VPWS embodiment. Similar to the description above
regarding the ingress node, for the following description the
egress node is described as having two parts: a terminating LSP
side (labeled "Ingress Side"); and an egress access side (labeled
"Egress Side"). The policers, shapers, and schedulers can be
implemented as software modules, which operate on traffic stored in
the queues, although in other embodiments they can be circuitry or
a combination of circuitry and software.
[0091] The embodiments of egress nodes described below illustrate
the present invention's use of separate per flow queuing,
aggregation into group queues, and congestion management to provide
differentiated services.
[0092] Referring also to FIG. 11A in conjunction with FIGS. 2C, 9
and 10, operation of the Ingress Side of the egress node will now
be described in detail.
[0093] The traffic prioritization in a terminating LSP is handled
in the same manner as transit and originating LSPs. Incoming
traffic is queued (1001) in separate per flow queues. In
particular, PP 1230 determines the queue in which to schedule the
traffic, and PM 1235 stores the traffic in the queues. The traffic
is scheduled (1002) by PS 1240 onto high or low priority group
queues, where congestion management is performed (1003).
[0094] Preferably, the user control and EF traffic are placed
(1001) into separate queues 821, which are scheduled (1002) by DRR8
801 onto group queue, GID.sub.high4 803. AF and BE traffic are
mapped (1001) to weighted queues 823 for scheduling (1002) by DRR9
805 onto group queue, GID.sub.low5 807. In both the RFC2547 IP-VPN
and the VPLS/VPWS embodiments, the AF4 and AF3 traffic share a
queue, and the AF2 and AF1 traffic share a queue. The BE traffic is
on a separate queue, usually with a minimal weight.
[0095] The group queue GID.sub.high4 803 corresponds to a single
physical interface (line), and GID.sub.low5 807 likewise corresponds
to a single interface (line). That is, the user control and EF traffic
from all LSPs over an interface is transmitted through a same
GID.sub.high4 803 group queue, and the weighted traffic (AF and BE)
from all LSPs over an interface is transmitted through the same
GID.sub.low5 807 group queue.
[0096] GID.sub.high4 803 and GID.sub.low5 807 are scheduled in
strict priority mode with respect to each other. The GID.sub.high4
803 preferably has higher priority than GID.sub.low5 807. This
helps ensure that EF and user control traffic get precedence over
the AF and BE traffic.
[0097] The shapers for GID.sub.high4 803 and GID.sub.low5 807, as
well as policing on individual queues, can be turned off in order
to accommodate open bandwidth and auto open bandwidth LSPs.
Accordingly, traffic shapers and policers are not shown for these
GIDs.
[0098] Under congestion at the interface, congestion management is
performed (1003) by PS 1240. Preferably, BE traffic and "out of
contract" AF traffic (EXP=0, EXP=1, and EXP=2) are discarded before
"in contract" AF1 and AF2 traffic (EXP=3), followed by "in
contract" AF3 and AF4 traffic (EXP=4), EF traffic (EXP=5) and user
control traffic (EXP=6).
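The discard precedence described here can be summarized as an ordering over
EXP values. This is a hedged sketch only: the EXP values come from the
text, the tier numbers are an assumption used solely for ordering, and the
actual discard mechanism is the RED algorithm described later.

DROP_TIER = {
    0: 0, 1: 0, 2: 0,   # BE and "out of contract" AF -- discarded first
    3: 1,               # "in contract" AF1 and AF2
    4: 2,               # "in contract" AF3 and AF4
    5: 3,               # EF
    6: 4,               # user control -- discarded last
}

def discard_before(exp_a, exp_b):
    """True if a packet with EXP value exp_a is dropped before one with exp_b."""
    return DROP_TIER[exp_a] < DROP_TIER[exp_b]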
[0099] Referring to FIG. 11B, along with FIGS. 2C, 9 and 10,
operations of the Egress Side of the egress node will now be
described in detail.
[0100] For RFC2547 IP-VPN (FIG. 9), the outgoing IP traffic is
classified (1004) by PP 1260 based on DSCP into various diffserv
classes, enabling the user to apply QoS parameters (policing,
appropriate service class, etc.) per diffserv class. The
outgoing IP traffic is queued (1005) into multiple per flow queues
825, that is, separate queues for user control, EF, AF4, AF3, AF2,
AF1 and BE traffic. In particular, PP 1260 determines the queue in
which to schedule traffic, and PM 1250 stores the traffic in the
queues. DSCP is preserved in the outgoing IP stream for RFC 2547
IP-VPN traffic. In contrast, for VPLS and multi-class VPWS (FIG.
10), the outgoing AF4 and AF3 traffic share a common queue, and the
outgoing AF2 and AF1 traffic share another common queue.
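As a sketch of this classification step, the following maps DSCP
codepoints to the diffserv classes named above. The text states only that
classification is based on DSCP; the specific codepoint values below are
the standard RFC 2474/2597/3246 assignments and are an assumption here,
not values taken from the embodiment.

AF_CODEPOINTS = {
    10: "af1", 12: "af1", 14: "af1",   # AF11-AF13
    18: "af2", 20: "af2", 22: "af2",   # AF21-AF23
    26: "af3", 28: "af3", 30: "af3",   # AF31-AF33
    34: "af4", 36: "af4", 38: "af4",   # AF41-AF43
}

def classify(dscp):
    """Step 1004: map a DSCP codepoint to a per flow queue class (assumed mapping)."""
    if dscp == 46:            # EF codepoint
        return "ef"
    if dscp >= 48:            # CS6/CS7, treated here as user control traffic
        return "cntl"
    return AF_CODEPOINTS.get(dscp, "be")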
[0101] QoS parameters can be applied to the outgoing traffic
through policing (1006) the per flow queues 825 by PS 1255.
Specifically, the per flow queues carrying EF and AF traffic are
policed by policers 809. Preferably, the EXP=3 traffic is treated
as "green" upon entering policer 809; this helps EXP=3 traffic
compete with EXP=4 traffic for bandwidth. The traffic is then
scheduled (1007) by PS 1255 onto a group queue, which preferably is
stored in PS 1255. In particular, the user control traffic is
shaped by traffic shaper S.sub.1%2 811, and the EF traffic is shaped
by traffic shaper S.sub.max3 813. The user control and EF traffic are
considered strict priority queues. The strict priority queues take
precedence over weighted queues (carrying AF traffic and BE 213) in
terms of scheduling, with all the per flow queues aggregating into
GID.sub.low6 817. The policed AF traffic and the BE traffic are
mapped into separate queues for AF1, AF2, AF3, AF4, and BE traffic,
and scheduled by DRR10 815 onto GID.sub.low6 817. The rate of
traffic flow of GID.sub.low6 817 can be set by S.sub.max4 819.
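The policing step (1006) can be sketched as a token bucket that colors
conforming traffic "green" and excess traffic "yellow". The text does not
specify the policing algorithm, and the rate and burst parameters below
are illustrative assumptions; per the description above, EXP=3 traffic
would be handed to policer 809 already treated as green.

import time

class Policer:
    """Simple single-rate token bucket marker (assumed algorithm, not the patent's)."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0          # token fill rate in bytes per second
        self.burst = burst_bytes            # bucket depth
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def color(self, packet_len):
        """Return 'green' if the packet conforms to the contract, else 'yellow'."""
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return "green"
        return "yellow"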
[0102] Congestion management is performed (1008) by PS 1255 in
GID.sub.low6 817. Under congestion at the interface, the "out of
contract" traffic (BE, AF1 Red, AF2 Red, AF3 Red, AF4 Red) is
discarded before the AF "in contract" traffic, EF and control
traffic. The queue thresholds preferably are set in such a way that
Red traffic is discarded first, followed by Yellow and finally
Green. Finally, the traffic is scheduled (1009) from GID.sub.low6
817.
[0103] Having described the sequence of operations within an egress
(terminating) node, specific functional modules implementing the
operations will now be described. FIG. 11C is a collaboration
diagram for functional modules deployed in an egress (terminating)
node, such as LER 120, for offering differentiated services in
accordance with an exemplary embodiment of the present invention.
The functional modules can be implemented as software modules or
objects. In other embodiments, the functional modules can be
implemented using hardware modules or other types of circuitry, or
a combination of software and hardware modules. In particular, the
functional modules can be implemented via the PPs, PMs, and PSs
described above.
[0104] In operation, an ingress per flow scheduler 1011 schedules
incoming traffic onto high priority or low priority queues. The
cntl and EF traffic is scheduled by a DRR scheduler 1013 into
GID.sub.high4 803, and a high queue congestion manager 1015
performs congestion management on GID.sub.high4 803. A policer 1017
polices EF traffic and marks "in contract" EF traffic as "green,"
and "out of contract" EF traffic as "yellow." A shaper/scheduler
1019 limits cntl and EF traffic and preferably schedules the
traffic onto GID.sub.low6 817 in strict priority over AF and BE
traffic.
[0105] Similarly, the AF and BE traffic is scheduled by a DRR
scheduler 1021 into GID.sub.low5 807, and a low queue congestion
manager 1023 applies congestion management. A DSCP classifier 1025
classifies traffic preferably according to DSCP, and a policer 1027
marks "in contract" EXP=4 and EXP=3 traffic as "green," marks "out
of contract" EXP=4 and EXP=3 traffic as "yellow," and marks EXP=2
and EXP=1 traffic as "red." A DRR scheduler 1029 then schedules
traffic onto GID.sub.low6 817.
[0106] A group queue congestion manager 1031 applies congestion
management to GID.sub.low6 817, and an egress shaper/scheduler 1033
limits traffic from GID.sub.low6 817 and schedules traffic from
GID.sub.low6 817.
[0107] In one advantage of the above embodiments, if open bandwidth
transit and terminating LSPs are transmitted over the same
interface, the LSPs are fairly treated, since the group queues are
shared by all the open bandwidth LSPs traversing over the same
interface. This is similar to the case explained above in which
open bandwidth transit and originating LSPs traverse over the same
interface.
[0108] The embodiments described above utilize a congestion
algorithm to determine when and how to discard traffic. The
congestion algorithm will now be described in detail; however, one
skilled in the art will recognize that other suitable congestion
methods can be used.
[0109] The congestion algorithm, or random early discard (RED)
algorithm, uses the following factors to decide whether to discard
a packet: the color of the packet; the queue byte count size; and
congestion parameters. In the case of three colors of traffic (red,
yellow, and green), there are four congestion parameters, RedMin,
YelMin, GrnMin, and GrnMax. Referring to FIG. 12, the byte count of
the queue is divided into regions that are defined by the
congestion parameters. Specifically, a Pass value equals
2.sup.RedMin, a Red value equals Pass+2.sup.YelMin, a Yellow value
equals Pass+Red+2.sup.GrnMin, and a Green value equals
Pass+Red+Yellow+2.sup.GrnMax. A Pass Region corresponds to a byte
count range of zero (0) to Pass, a Red Region corresponds to a byte
count range of Pass to Red, a Yellow Region corresponds to a byte
count range of Red to Yellow, a Green Region corresponds to a byte
count range of Yellow to Green, and a Fail Region corresponds to a
byte count range above Green.
[0110] The congestion parameters preferably work in powers of two
as shown in FIG. 12. This ensures that the distance between two
levels is always a power of two, which keeps the number of bits
used to hold the parameters to a minimum while allowing values up
to 2.sup.31. For example, if the value in RedMin is 5, the Pass
value is 2.sup.5=32, and furthermore if YelMin is 6, the Red value
is 2.sup.5+2.sup.6=96. One exception exists: if the parameter is 0,
the value is zero rather than 2.sup.0.
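The threshold arithmetic of paragraphs [0109] and [0110], including the
zero-parameter exception, can be reproduced directly. The additive form
below follows the text as written, and the asserts check the worked
example with RedMin=5 and YelMin=6.

def term(param):
    """A parameter of 0 contributes 0 rather than 2**0, per the stated exception."""
    return 0 if param == 0 else 2 ** param

def thresholds(red_min, yel_min, grn_min, grn_max):
    """Return the (Pass, Red, Yellow, Green) values defined above."""
    pass_val = term(red_min)
    red = pass_val + term(yel_min)
    yellow = pass_val + red + term(grn_min)
    green = pass_val + red + yellow + term(grn_max)
    return pass_val, red, yellow, green

# Worked example from paragraph [0110]:
pass_val, red, _, _ = thresholds(5, 6, 0, 0)
assert pass_val == 32    # 2**5
assert red == 96         # 2**5 + 2**6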
[0111] When a packet arrives at the queue for which congestion
management is performed, the byte count of the queue is compared to
the threshold corresponding to the packet color to determine if it
is to be passed (scheduled) or discarded.
[0112] FIG. 13 is a graphical representation of the RED algorithm.
If a packet arrives when the byte count of the queue is between 0
and Pass, the packet is passed regardless of the color of the
packet. If a packet arrives when the byte count is greater than
Green, the packet is discarded regardless of the color of the
packet. If a packet arrives when the byte count is between Pass and
Green, the decision to discard or pass the packet depends on the
color of the packet. For example, if a yellow packet arrives when
the byte count is between 0 and Red, the packet is passed. If a
yellow packet arrives when the byte count is greater than Yellow,
the packet is discarded. Lastly, if a yellow packet arrives when
the byte count is between Red and Yellow, the packet is kept or
discarded with a linear probability: for example, if the byte count
is 75% of the way from Red to Yellow, there is a 75% chance that the
packet will be discarded.
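As a small numeric check of the linear probability just described (the
threshold values below are arbitrary placeholders):

red_thr, yellow_thr = 96, 224                        # placeholder region boundaries
byte_count = red_thr + 0.75 * (yellow_thr - red_thr)
p_discard = (byte_count - red_thr) / (yellow_thr - red_thr)
assert abs(p_discard - 0.75) < 1e-9                  # 75% of the way -> 75% drop chance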
[0113] FIG. 14 is a flowchart of an exemplary RED algorithm that
can be utilized in conjunction with the present invention. When a
packet arrives, for example at a group queue, the color of the
packet is determined (1401), and the byte count of the group queue
is determined (1402). The byte count of the queue is compared
(1403) to GrnMax. If the byte count is greater than GrnMax, the
packet is discarded (1404). However, if the byte count is not
greater than GrnMax, the byte count is compared (1405) to Pass. If
the byte count is less than Pass, the packet is enqueued (1406). On
the other hand, if the byte count is not less than Pass, a
determination is made (1407) whether the packet is "green." If the
packet is "green," the byte count is compared (1408) to GrnMin. If
the byte count is less than GrnMin, the packet is enqueued (1409).
On the other hand, if the byte count is not less than GrnMin, the
linear probability described above is applied (1410) to determine
if the packet is enqueued or discarded.
[0114] At 1407, if the packet's color is determined not to be
"green," a determination is made (1411) whether the color of the
packet is "yellow." If the packet is "yellow," the byte count is
compared (1412) to YelMin. If the byte count is less than YelMin,
the packet is enqueued (1413). On the other hand, if the byte count
is not less than YelMin, the byte count is compared (1414) to
GrnMin. If the byte count is greater than GrnMin, the packet is
discarded (1415). However, if the byte count is not greater than
GrnMin, the linear probability described above is applied (1416) to
determine whether the packet is enqueued or discarded.
[0115] At 1411, if the packet's color is determined not to be
"yellow," the byte count is compared (1417) to RedMin. If the byte
count is not greater than RedMin, the linear probability described
above is applied (1418) to determine if the packet is enqueued or
discarded. On the other hand, if the byte count is greater than
RedMin, the packet is discarded (1419).
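The flowchart of FIG. 14 can be rendered as a short decision function. The
sketch below follows the flowchart literally, treating Pass, RedMin,
YelMin, GrnMin, and GrnMax as byte count thresholds as the flowchart does;
the boundaries used for each linear probability region are an
interpretation of FIG. 13 and are not spelled out in the flowchart itself.

import random

def linear_discard(byte_count, lo, hi):
    """Discard with probability rising linearly from 0 at lo to 1 at hi."""
    if hi <= lo:
        return False
    return random.random() < (byte_count - lo) / (hi - lo)

def red_decision(color, byte_count, pass_thr, red_min, yel_min, grn_min, grn_max):
    """Return 'enqueue' or 'discard' for an arriving packet (steps 1401-1419)."""
    if byte_count > grn_max:                                  # 1403 -> 1404
        return "discard"
    if byte_count < pass_thr:                                 # 1405 -> 1406
        return "enqueue"
    if color == "green":                                      # 1407
        if byte_count < grn_min:                              # 1408 -> 1409
            return "enqueue"
        return "discard" if linear_discard(byte_count, grn_min, grn_max) else "enqueue"
    if color == "yellow":                                     # 1411
        if byte_count < yel_min:                              # 1412 -> 1413
            return "enqueue"
        if byte_count > grn_min:                              # 1414 -> 1415
            return "discard"
        return "discard" if linear_discard(byte_count, yel_min, grn_min) else "enqueue"
    # red packet                                              # 1417
    if byte_count > red_min:                                  # -> 1419
        return "discard"
    return "discard" if linear_discard(byte_count, pass_thr, red_min) else "enqueue"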
[0116] In other embodiments, the above algorithm can be employed in
conjunction with CIDs ("connection identifiers," which correspond
to per flow queues on a line card), GIDs and VOs (virtual output
queues) in a hierarchical manner. For example, each resource has
its own set of thresholds and byte counts. The byte counts are
summed across the resources. So, for instance, if there are 10 CIDs
to the same GID each with a byte count of 100, then the GID byte
count will be 100.times.10=1000 bytes. Similarly, if there are 3
GIDs to a VO (port+priority), then the VO byte count is the sum of
the byte counts of all 3 GIDs corresponding to that VO. When a
packet arrives at a resource, the total byte count for that
resource is compared to the threshold of that resource that
corresponds to the color of the packet. When a packet is accepted
(i.e. not discarded) the byte counts of the associated CID, GID,
and VO are incremented by the packet size at the same time. When
the packet is transmitted the byte counts of the CID, GID, and VO
are decremented by the packet size. This model is, in effect, a
hierarchical RED model.
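The hierarchical accounting described in this paragraph can be sketched as
shared byte counters that are incremented on acceptance and decremented on
transmission; the dictionary layout and resource naming below are
illustrative assumptions.

from collections import defaultdict

byte_count = defaultdict(int)      # keyed by ("cid", id), ("gid", id), ("vo", id)

def accept(packet_len, cid, gid, vo):
    """Called when the packet is accepted (not discarded) at a CID."""
    for key in (("cid", cid), ("gid", gid), ("vo", vo)):
        byte_count[key] += packet_len

def transmit(packet_len, cid, gid, vo):
    """Called when the packet is transmitted; reverses the accounting above."""
    for key in (("cid", cid), ("gid", gid), ("vo", vo)):
        byte_count[key] -= packet_len

# Example from the text: 10 CIDs, each holding 100 bytes, mapped to one GID.
for cid in range(10):
    accept(100, cid=cid, gid=0, vo=0)
assert byte_count[("gid", 0)] == 1000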
[0117] The thresholds for ingress are used to enforce the traffic
contract, and the final values reflect a combination of competing
factors. One factor is the intended delay of the traffic class. The
delay is longer for lower priority traffic classes to absorb
bursting, and shorter for higher priority traffic classes to
minimize delay. For example, for EF and the user control traffic
(each has its own CID), which are shaped at the ingress user/access
side interface, the delay is lower as compared to AF and BE traffic
classes.
[0118] Another factor is a minimum value for the "pass" region
(RedMin threshold) that allows a certain number of MTUs. This is
to prevent prematurely discarding packets due to any internal
delays or jitter within the hardware. Another factor is a fixed
maximum value per traffic class to prevent allocating too large a
threshold for a particular CID. An additional factor is a maximum
burst size (MBS) calculation where appropriate for the service
class and circuit type.
[0119] Once an overall "buffer size" (maximum byte count) has been
calculated and the RedMin adjustment determined, the thresholds are
divided up among the possible colors. If any colors are missing,
the corresponding thresholds are zero (not used). For instance, the
user control traffic class at the user/interface side has only green
packets, so its YelMin and GrnMin values are zero.
[0120] For egress, the goals of congestion control preferably are
(1) to isolate non-congested queues from the impact of one or a few
connections, (2) to guarantee minimum rates--discard all
red before green, (3) to minimize delay under congestion
(especially for higher priorities), (4) to enforce traffic
contracts, (5) to buffer reasonable bursts without discarding, and
(6) to allow more buffering for lower priorities. The CID, GID, and
VO thresholds combine to allow realization of these goals. Within a
traffic class the CID, GID and VO have their separate RedMin,
YelMin and GrnMin thresholds, though the Green threshold is the same
for all the queues (CID, GID, VO). Each traffic class (e.g., EF, AF,
BE) has its own CID threshold, with lower thresholds for higher
priority classes (like EF). The individual CID thresholds and GID,
VO thresholds are adjusted such that BE and "out of contract" AF
traffic is discarded before "in contract" AF, EF and user control
traffic.
[0121] Although this invention has been described in certain
specific embodiments, many additional modifications and variations
would be apparent to those skilled in the art. It is therefore to
be understood that this invention may be practiced otherwise than
as specifically described. Thus, the present embodiments of the
invention should be considered in all respects as illustrative and
not restrictive, the scope of the invention to be determined by any
claims supportable by this application and the claims' equivalents
rather than the foregoing description.
APPENDIX A--ACRONYM LIST
[0122] ACL--access control list
[0123] AF--assured forwarding traffic
[0124] ASIC--application specific integrated circuit
[0125] BE--best effort traffic
[0126] CID--connection identifier
[0127] CIR+EIR--committed information rate and excess information rate
[0128] DRR--deficit round robin
[0129] DSCP--Differentiated Service Code Point
[0130] EF--expedited forwarding traffic
[0131] EXP--experimental
[0132] FG--fabric gateway
[0133] GID--group identifier (group queue)
[0134] IP--internet protocol
[0135] LER--label edge router
[0136] LSP--label switched path
[0137] LSR--label switching router
[0138] MAC--media access control
[0139] MBS--maximum burst size
[0140] MPLS--multiprotocol label switching
[0141] MTU--maximum transmission unit
[0142] PLM--physical line module
[0143] PM--packet manager
[0144] PP--packet processor
[0145] PS--packet scheduler
[0146] QoS--quality of service
[0147] RED--random early discard
[0148] SCC--switch and control card
[0149] SLA--service level agreement
[0150] ULC--universal line card
[0151] VO--virtual output queue
[0152] WRR--weighted round robin
* * * * *