U.S. patent application number 11/413409 was filed with the patent office on April 28, 2006, and published on November 1, 2007, for differentiated services using weighted quality of service (QoS).
This patent application is currently assigned to Tellabs San Jose, Inc. Invention is credited to Robert J. Colvin, David S. Curry, Paul M. Hallinan, Man-Tung T. Hsiao, Sanjay Khanna, Rishi Mehta, Samer I. Nubani, and Ravindra Sunkad.
United States Patent Application 20070253438
Kind Code: A1
Inventors: Curry; David S.; et al.
Publication Date: November 1, 2007 (2007-11-01)
Application Number: 11/413409
Family ID: 38434805
Filed: April 28, 2006
Differentiated services using weighted quality of service (QoS)
Abstract
Differentiated services for network traffic using weighted
quality of service is provided. Network traffic is queued into
separate per flow queues, and traffic is scheduled from the per
flow queues into a group queue. Congestion management is performed
on traffic in the group queue. Traffic is marked with priority
values, and congestion management is performed based on the
priority values. For example, traffic can be marked as "in
contract" if it is within a contractual limit, and marked as "out
of contract" if it is not within the contractual limit. Marking can
also include classifying incoming traffic based on Differentiated
Service Code Point. Higher priority traffic can be scheduled from
the per flow queues in a strict priority over lower priority
traffic. The lower priority traffic can be scheduled in a round
robin manner.
Inventors: Curry; David S.; (San Jose, CA); Colvin; Robert J.; (San Jose, CA); Nubani; Samer I.; (Santa Clara, CA); Sunkad; Ravindra; (Pleasanton, CA); Hsiao; Man-Tung T.; (Cupertino, CA); Hallinan; Paul M.; (San Carlos, CA); Mehta; Rishi; (San Jose, CA); Khanna; Sanjay; (Fremont, CA)
Correspondence Address: FITZPATRICK CELLA HARPER & SCINTO, 30 ROCKEFELLER PLAZA, NEW YORK, NY 10112, US
Assignee: Tellabs San Jose, Inc., Naperville, IL
Family ID: 38434805
Appl. No.: 11/413409
Filed: April 28, 2006
Current U.S. Class: 370/412; 370/395.4
Current CPC Class: H04L 47/621 20130101; H04L 47/60 20130101; H04L 47/624 20130101; H04L 47/10 20130101; H04L 47/2441 20130101; H04L 49/90 20130101; H04L 47/2408 20130101; H04L 47/31 20130101
Class at Publication: 370/412; 370/395.4
International Class: H04L 12/56 20060101 H04L012/56; H04L 12/28 20060101 H04L012/28
Claims
1. A method for offering differentiated service of network traffic,
the method comprising: queuing the traffic into a first plurality
of separate per flow queues; scheduling the traffic from the per
flow queues into a group queue; and performing congestion
management on traffic in the group queue.
2. The method of claim 1, further comprising: marking traffic with
priority values according to priority, wherein the congestion
management is performed based on the priority values.
3. The method of claim 2, wherein the marking includes determining
whether the traffic is within a contractual limit, and marking the
traffic as "in contract" if the traffic is within the contractual
limit, and marking the traffic as "out of contract" if the traffic
is not within the contractual limit.
4. The method of claim 2, wherein the marking includes classifying
incoming traffic based on Differentiated Service Code Point.
5. The method of claim 1, further comprising: performing congestion
management in a per flow queue.
6. The method of claim 1, wherein the traffic is scheduled from the
per flow queues by scheduling higher priority traffic in a strict
priority over lower priority traffic.
7. The method of claim 6, wherein the lower priority traffic
includes a first lower priority traffic and a second lower priority
traffic, and the traffic is scheduled from the per flow queues by
scheduling the first lower priority traffic and the second lower
priority traffic based on a round robin process.
8. The method of claim 1, further comprising: scheduling traffic
from the group queue into a second plurality of separate per flow
queues based on priority; scheduling traffic from the second
plurality of separate per flow queues into either of a high
priority group queue or a low priority group queue, wherein traffic
in higher priority per flow queues of the second plurality of
separate per flow queues is scheduled into the high priority group
queue, and traffic in lower priority per flow queues of the second
plurality of per flow queues is scheduled into the low priority
group queue.
9. The method of claim 8, wherein traffic in the higher priority
per flow queues of the second plurality of separate per flow queues
is scheduled into the high priority group queue based on a round
robin process, and traffic in the lower priority per flow queues of
the second plurality of separate per flow queues is scheduled into
the low priority group queue based on a round robin process.
10. The method of claim 8, further comprising: scheduling traffic
from the high priority group queue in a strict priority over
traffic in the low priority group queue.
11. The method of claim 1, wherein the traffic includes a plurality
of types of traffic including user control traffic, expedited
forwarding traffic, assured forwarding traffic, and best effort
traffic, and the traffic is queued into the per flow queues
according to traffic type.
12. A method for offering differentiated service of network
traffic, the method comprising: queuing the traffic into a first
plurality of separate per flow queues; scheduling the traffic from
the first plurality of per flow queues into either of a high
priority group queue or a low priority group queue, wherein traffic
in higher priority per flow queues of the first plurality of
separate per flow queues is scheduled into the high priority group
queue, and traffic in lower priority per flow queues of the first
plurality of separate per flow queues is scheduled into the low
priority group queue; and performing congestion management on
traffic in the high priority group queue and traffic in the low
priority group queue.
13. The method of claim 12, wherein traffic in the higher priority
per flow queues of the first plurality of separate per flow queues
is scheduled into the high priority group queue based on a round
robin process, and traffic in the lower priority per flow queues of
the first plurality of separate per flow queues is scheduled into
the low priority group queue based on a round robin process.
14. The method of claim 12, further comprising: scheduling traffic
from the high priority group queue and the low priority group queue
onto a second plurality of separate per flow queues based on
priority; scheduling traffic from the per flow queues of the second
plurality of separate per flow queues onto either of a second high
priority group queue or a second low priority group queue, wherein
traffic in higher priority per flow queues of the second plurality
of separate per flow queues is scheduled onto the second high
priority group queue, and traffic in lower priority per flow queues
of the second plurality of separate per flow queues is scheduled
onto the second low priority group queue.
15. The method of claim 14, wherein traffic in the higher priority
per flow queues of the second plurality of separate per flow queues
is scheduled onto the second high priority group queue based on a
round robin process, and traffic in the lower priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second low priority group queue based on a round
robin process.
16. The method of claim 14, further comprising: scheduling traffic
from the second high priority group queue in a strict priority over
traffic in the second low priority group queue.
17. An apparatus for offering differentiated services of network
traffic, the apparatus comprising: a per flow scheduler to queue
the traffic into a first plurality of separate per flow queues; a
higher priority scheduler to schedule traffic in higher priority
per flow queues of the first plurality of separate per flow queues
into a group queue; a lower priority scheduler to schedule traffic
from lower priority per flow queues of the first plurality of per
flow queues into the group queue; and a group queue congestion
manager to perform congestion management on traffic in the group
queue.
18. The apparatus of claim 17, further comprising: a second per
flow scheduler to schedule traffic from the group queue into a
second plurality of separate per flow queues based on priority; a
second higher priority scheduler to schedule traffic from higher
priority per flow queues of the second plurality of separate per
flow queues into a high priority group queue; and a second lower
priority scheduler to schedule traffic from lower priority per flow
queues of the second plurality of separate per flow queues into a
low priority group queue.
19. The apparatus of claim 18, wherein traffic in the higher
priority per flow queues of the second plurality of separate per
flow queues is scheduled into the high priority group queue based
on a round robin process, and traffic in the lower priority per
flow queues of the second plurality of separate per flow queues is
scheduled into the low priority group queue based on a round robin
process.
20. An apparatus for offering differentiated services of network
traffic, the apparatus comprising: a per flow scheduler to queue
the traffic into a first plurality of separate per flow queues; a
high priority scheduler to schedule traffic from higher priority
per flow queues of the first plurality of separate per flow queues
into a high priority group queue; a low priority scheduler to
schedule traffic from lower priority per flow queues of the first
plurality of separate per flow queues into a low priority group
queue; a high queue congestion manager to perform congestion
management on traffic in the high priority group queue; and a low
queue congestion manager to perform congestion management on
traffic in the low priority group queue.
21. Computer-executable program instructions stored on a
computer-readable medium, the computer-executable program
instructions for offering differentiated services of network
traffic, the computer-executable instructions executable to perform
the method of: queuing the traffic in a first plurality of separate
per flow queues; scheduling the traffic from the per flow queues of
the first plurality of separate per flow queues into a group queue;
performing congestion management on traffic in the group queue;
scheduling traffic from the group queue into a second plurality of
separate per flow queues based on priority; and scheduling traffic
from the second plurality of separate per flow queues into either of
a high priority group queue or a low priority group queue, wherein
traffic in higher priority per flow queues of the second plurality
of separate per flow queues is scheduled into the high priority
group queue, and traffic in lower priority per flow queues of the
second plurality of per flow queues is scheduled into the low
priority group queue.
22. The computer-executable program instructions of claim 21,
wherein traffic in the higher priority per flow queues of the
second plurality of separate per flow queues is scheduled into the
high priority group queue based on a round robin process, and
traffic in the lower priority per flow queues of the second
plurality of separate per flow queues is scheduled into the low
priority group queue based on a round robin process.
23. Computer-executable program instructions stored on a
computer-readable medium, the computer-executable program
instructions for offering differentiated services of network
traffic, the computer-executable instructions executable to perform
the method of: queuing the traffic into a first plurality of
separate per flow queues; scheduling the traffic from the first
plurality of per flow queues into either of a high priority group
queue or a low priority group queue, wherein traffic in higher
priority per flow queues of the first plurality of separate per
flow queues is scheduled into the high priority group queue, and
traffic in lower priority per flow queues of the first plurality of
separate per flow queues is scheduled into the low priority group
queue; and performing congestion management on traffic in the high
priority group queue and traffic in the low priority group queue.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] This invention pertains to devices, methods, and computer
programs providing differentiated services for network traffic.
Specifically, the invention relates to a system in which
differentiated services are provided using a combination of traffic
flow weighting, group queues, and congestion management.
[0003] 2. Description of Related Art
[0004] Network service providers offer differentiated services in
order to tailor customer bandwidth demands based on priority levels
of a customer's network traffic. In particular, higher priority
traffic is generally given preference over lower priority traffic,
thus increasing bandwidth and reducing delay for higher priority
traffic at the expense of the lower priority traffic. However, many
traditional differentiated services methods do not properly balance
high priority and low priority traffic. As a result, lower priority
traffic sometimes can be prematurely discarded in conditions of
network congestion.
SUMMARY OF THE INVENTION
[0005] To address the foregoing, the present invention provides a
method, apparatus, and computer program for providing
differentiated services for network traffic. In one embodiment, the
traffic is queued into a first plurality of separate per flow
queues, and the traffic is scheduled from the per flow queues into
a group queue. Congestion management is performed on traffic in the
group queue.
[0006] In at least one embodiment of the present invention, traffic
is marked with priority values according to priority, and
congestion management is performed based on the priority values.
For example, the marking can include determining whether the
traffic is within a contractual limit, and marking the traffic as
"in contract" if the traffic is within the contractual limit, and
marking the traffic as "out of contract" if the traffic is not
within the contractual limit. In another example, the marking can
include classifying incoming traffic based on Differentiated
Service Code Point.
[0007] According to one embodiment of the present invention,
congestion management is performed in a per flow queue.
[0008] In another embodiment of the present invention, the traffic
is scheduled from the per flow queues by scheduling higher priority
traffic in a strict priority over lower priority traffic. For
example, the lower priority traffic can include a first lower
priority traffic and a second lower priority traffic. In this case,
traffic is scheduled from the per flow queues by scheduling the
first lower priority traffic and the second lower priority traffic
based on a round robin process.
[0009] The traffic can include a plurality of types of traffic
including user control traffic, expedited forwarding traffic,
assured forwarding traffic, and best effort traffic, and the
traffic can be queued into the per flow queues according to traffic
type.
[0010] In another embodiment, traffic from the group queue is
scheduled into a second plurality of separate per flow queues based
on priority, and traffic from the second plurality of separate per
flow queues is scheduled into either of a high priority group queue
or a low priority group queue. In this case, traffic in higher
priority per flow queues of the second plurality of separate per
flow queues is scheduled into the high priority group queue, and
traffic in lower priority per flow queues of the second plurality
of per flow queues is scheduled into the low priority group
queue.
[0011] In another aspect of the present invention, traffic in the
higher priority per flow queues of the second plurality of separate
per flow queues is scheduled into the high priority group queue
based on a round robin process, and traffic in the lower priority
per flow queues of the second plurality of separate per flow queues
is scheduled into the low priority group queue based on a round
robin process.
[0012] In a further aspect, traffic can also be scheduled from the
high priority group queue in a strict priority over traffic in the
low priority group queue.
[0013] In another embodiment of the present invention, the network
traffic is queued into a first plurality of separate per flow
queues, and the traffic is scheduled from the first plurality of
per flow queues into either of a high priority group queue or a low
priority group queue. Traffic in higher priority per flow queues of
the first plurality of separate per flow queues is scheduled into
the high priority group queue, and traffic in lower priority per
flow queues of the first plurality of separate per flow queues is
scheduled into the low priority group queue. Congestion management
is performed on traffic in the high priority group queue and
traffic in the low priority group queue.
[0014] In another aspect of the present invention, traffic in the
higher priority per flow queues of the first plurality of separate
per flow queues is scheduled into the high priority group queue
based on a round robin process, and traffic in the lower priority
per flow queues of the first plurality of separate per flow queues
is scheduled into the low priority group queue based on a round
robin process.
[0015] In a further aspect, traffic is scheduled from the high
priority group queue and the low priority group queue onto a second
plurality of separate per flow queues based on priority, and
traffic is scheduled from the per flow queues of the second
plurality of separate per flow queues onto either of a second high
priority group queue or a second low priority group queue. In this
case, traffic in higher priority per flow queues of the second
plurality of separate per flow queues is scheduled onto the second
high priority group queue, and traffic in lower priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second low priority group queue.
[0016] In another aspect, traffic in the higher priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second high priority group queue based on a
round robin process, and traffic in the lower priority per flow
queues of the second plurality of separate per flow queues is
scheduled onto the second low priority group queue based on a round
robin process. Traffic from the second high priority group queue
can be scheduled in a strict priority over traffic in the second
low priority group queue.
[0017] The invention can be embodied in, without limitation, a
method, apparatus, or computer-executable program instructions.
[0018] This brief summary has been provided so that the nature of
the invention may be understood quickly. A more complete
understanding of the invention can be obtained by reference to the
following detailed description in connection with the attached
drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] The present invention will be more readily understood from a
detailed description of the preferred embodiments taken in
conjunction with the following figures:
[0020] FIG. 1 is a block diagram of an exemplary environment in
which an embodiment of the present invention can be
implemented.
[0021] FIG. 2A is a block diagram of an exemplary data processing
system in which an embodiment of the present invention can be
implemented.
[0022] FIG. 2B is a perspective view of a configuration of an
exemplary network router in which an embodiment of the present
invention can be implemented.
[0023] FIG. 2C is a block diagram of an exemplary implementation of
the network router of FIG. 2B.
[0024] FIG. 3 is a traffic flow diagram of an ingress (originating)
node according to one embodiment of the present invention.
[0025] FIG. 4 is a traffic flow diagram of an ingress (originating)
node according to another embodiment of the present invention.
[0026] FIGS. 5A and 5B are process flowcharts of a method of
servicing traffic in an ingress (originating) node according to one
embodiment of the present invention.
[0027] FIG. 5C is a collaboration diagram for functional modules
for servicing traffic in an ingress (originating) node according to
one embodiment of the present invention.
[0028] FIG. 6 is a traffic flow diagram of a transit node according
to one embodiment of the present invention.
[0029] FIG. 7A is a process flowchart of a method of servicing
traffic in a transit node according to one embodiment of the
present invention.
[0030] FIG. 7B is a collaboration diagram for functional modules
for servicing traffic in a transit node according to one embodiment
of the present invention.
[0031] FIG. 8 is a block diagram of an exemplary network
interface.
[0032] FIG. 9 is a traffic flow diagram of an egress (terminating)
node according to one embodiment of the present invention.
[0033] FIG. 10 is a traffic flow diagram of an egress (terminating)
node according to another embodiment of the present invention.
[0034] FIGS. 11A and 11B are process flowcharts of a method of
servicing traffic in an egress (terminating) node according to one
embodiment of the present invention.
[0035] FIG. 11C is a collaboration diagram for functional modules
for servicing traffic in an egress (terminating) node according to
one embodiment of the present invention.
[0036] FIG. 12 is a diagram of a congestion management
configuration according to one embodiment of the present
invention.
[0037] FIG. 13 is a diagram of a congestion algorithm according to
one embodiment of the present invention.
[0038] FIG. 14 is a process flowchart of an exemplary congestion
management algorithm according to one embodiment of the
invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0039] Preferred embodiments of the present invention are described
below with reference to the accompanying drawings. The embodiments
include an apparatus, system, method, and computer program
providing differentiated services for network traffic using a
combination of traffic flow weighting, group queues, and congestion
management.
[0040] As one example of a use of the present invention, a network
service provider could offer differentiated services enabled by the
present invention to its customers. In particular, customers
contract with the network provider via a service level agreement
(SLA) to receive a particular type and level of service for the
customer's network traffic. In one exemplary environment of the
present invention, described below in reference to FIG. 1, the
network provider operates a network, or "cloud," of network traffic
routers. The customer sends network traffic through the provider's
network to various network destinations. The network provider
services the customer's traffic utilizing embodiments of the
present invention to achieve the terms of the SLA.
[0041] In particular, differentiated services are provided to the
customer. That is, the customer's network traffic is differentiated
based on levels of priority. Different levels of weighting, which
can be customizable by the customer, can then be applied to the
differentiated traffic. In addition, the use of group queues and
congestion management allow balanced service of the customer's
network traffic. In contrast with conventional methods of
differentiated services, which can sometimes discard lower priority
traffic prematurely, the present invention allows lower priority
traffic to compete more fairly with higher priority traffic. Other
advantages of the present invention will become apparent in the
description of the preferred embodiments below.
[0042] FIG. 1 depicts one example of an environment in which the
present invention can be implemented, which is a multiprotocol
label switching (MPLS) network 101. However, one skilled in the art
will understand that the present invention can be utilized in other
types of networks as well. FIG. 1 shows label edge routers (LERs),
including LER 100 and LER 120, and label switching routers (LSRs),
including LSRs 105, 110, and 115. Network traffic entering LER 100,
for example, is assigned a label switched path (LSP), which defines
a path (or partial path) for the traffic through the network. For
example, in the illustrated embodiment, an LSP is shown as
traversing LER 100 to LSR 105 to LSR 110 to LSR 115 to LER 120. For
this LSP, LER 100 is an ingress (originating) node for traffic,
while LSRs 105, 110, and 115 are transit nodes, and LER 120 is an
egress (terminating) node. In other words, when a customer's
network traffic enters LER 100, which is serving as an ingress
(originating) node, an LSP is defined to allow the traffic to reach
its destination. Preferred embodiments of the present invention,
described below, are implemented in the provider's LERs and LSRs to
allow the customer's network traffic to traverse the LSP.
[0043] The LERs and LSRs depicted in FIG. 1 can be implemented as
data processing systems, and the present invention can be
implemented as computer-executable program instructions stored on a
computer-readable medium of the data processing systems. In other
embodiments, software modules or circuitry can be used to implement
the present invention.
[0044] For example, the present invention can be implemented on a
general purpose computer. FIG. 2A is an architecture diagram for an
exemplary data processing system 1100, which could be used as an
LER and/or LSR for performing operations as an originating,
transit, or egress node in accordance with exemplary embodiments of
the present invention described in detail below.
[0045] Data processing system 1100 includes a processor 1110
coupled to a memory 1120 via system bus 1130. The processor is also
coupled to external Input/Output (I/O) devices (not shown) via the
system bus 1130 and an I/O bus 1140. A storage device 1150 having a
computer-readable medium is coupled to the processor 1110 via a
storage device controller 1160 and the I/O bus 1140 and the system
bus 1130. The storage device 1150 is used by the processor 1110 and
controller 1160 to store and read/write data 1170 and program
instructions 1180 used to implement the procedures described below.
For example, those instructions 1180 can perform any of the methods
described below for operation as an originating node (in
conjunction with FIGS. 3, 4, 5A, 5B and 5C), a transit node (in
conjunction with FIGS. 6, 7A and 7B), and/or a terminating node (in
conjunction with FIGS. 9, 10, 11A, 11B and 11C).
[0046] The processor 1110 may be further coupled to a
communications device 1190 via a communications device controller
1200 coupled to the I/O bus 1140. The processor 1110 uses the
communications device 1190 to communicate with a network (not shown
in FIG. 2A) transmitting multiple flows of data as described
below.
[0047] In operation, the processor 1110 loads the program
instructions 1180 from the storage device 1150 into the memory
1120. The processor 1110 then executes the loaded program
instructions 1180 to offer differentiated services for network
traffic that arrives at data processing system 1100 and is enqueued
in memory 1120. Thus, processor 1110 operates under the control of
the instructions 1180 to perform the methods of this invention, as
described in more detail below.
[0048] The present invention can also be implemented, for example,
in a network router. FIG. 2B is a perspective view of one
configuration of a network router 1210 having sixteen universal
line cards (ULCs) 1215. Each ULC 1215 has four physical line
modules (PLMs) 1220 for connecting with physical network lines to
receive and transmit network traffic. The router 1210 also has
three switch and control cards (SCCs) 1225, which serve as switched
paths between each of the ULCs 1215. The SCCs 1225 allow traffic to
flow from a first ULC 1215 to a second ULC 1215, for example, when
ingress traffic received at a physical line of the first ULC 1215
must be transmitted on a physical line of the second ULC 1215. In
particular, each ULC 1215 is connected to the three SCCs 1225
(connections not shown). Network router 1210 could be used as an
LER and/or LSR for performing operations as originating, transit,
and egress nodes in accordance with exemplary embodiments of the
present invention described in detail below.
[0049] FIG. 2C is a block diagram showing details of a ULC 1215, in
which operations according to one embodiment of the present
invention are performed. Specifically, ULC 1215 preferably includes
several application specific integrated circuits (ASICs), such as
packet processor (PP) 1230, packet manager (PM) 1235, packet
scheduler (PS) 1240, fabric gateway (FG) 1245, PM 1250, PS 1255,
and PP 1260. ULC 1215 also includes a general purpose processor CPU
1265 (connections not shown). The ASICs and CPU 1265 of ULC 1215
are configured to perform the methods of the present invention, as
described in more detail below.
[0050] Generally, ingress traffic received through a PLM 1220
preferably is processed by an "ingress side" of ULC 1215, which
includes PP 1230, PM 1235, and PS 1240. Traffic from the "ingress
side" is sent to FG 1245, which transmits the traffic through one
of the SCCs 1225 to an "egress side" of one of the ULCs 1215 having
the correct PLM 1220 for transmitting the traffic to the next
network destination. The "egress side," which is not necessarily on
the same ULC 1215 as the "ingress side" (although it is not shown
as such in FIG. 2C, for convenience), includes PM 1250, PS 1255,
and PP 1260. The traffic is received by the corresponding FG 1245
at the "egress side," and the traffic is processed by PM 1250, PS
1255, and PP 1260 for transmitting onto the network via a PLM 1220.
CPU 1265 performs functions such as communicating with the SCCs and
programming the PPs, PMs, and PSs as needed. FG 1245 serves as a
gateway to the SCCs to transmit traffic to and receive traffic from
the SCCs.
[0051] Illustrative embodiments of the present invention will now
be described. Each embodiment is described as being implemented
within an MPLS-capable network environment, such as the network
shown in FIG. 1, although one skilled in the art will appreciate
that the invention can be applied in different network environments
as well, and can operate using many different protocols and traffic
types. The exemplary embodiments of the present invention described
below are implemented in an MPLS network carrying RFC2547 IP-VPN
traffic, and in an MPLS network carrying VPLS/VPWS traffic. These
two implementations will be described concurrently, where the
embodiments overlap. For convenience, the configuration and
behavior for the RFC2547 IP-VPN implementation will be described in
detail, and any differences in the VPLS/VPWS implementation will be
highlighted.
[0052] To illustrate the QoS behavior and implementation of the
present invention, the description below divides the network
environment into three portions, as illustrated in the following
traffic flow diagrams: (1) an ingress (originating) node,
corresponding to FIGS. 3 and 4; (2) a transit node, corresponding
to FIG. 6; and (3) an egress (terminating) node, corresponding to
FIGS. 9 and 10.
[0053] Referring first to FIGS. 3 and 4, FIG. 3 illustrates an
ingress node of the RFC2547 IP-VPN embodiment, and FIG. 4 shows an
ingress node of the VPLS/VPWS embodiment. For example, the
embodiments of an ingress node according to the present invention
shown in FIGS. 3 and 4 can be implemented on LER 100 of FIG. 1,
which would initially receive a customer's network traffic for
subsequent transmission through network 101 via an LSP. For the
following description, the ingress node is described as having two
parts. First, a user access side (labeled "Ingress Side")
corresponds to the "ingress side" of ULC 1215, described above.
Second, an originating label switched path (LSP) side (i.e., a
network side, labeled "Egress Side") corresponds to the "egress
side" of ULC 1215, described above. The embodiments of ingress
nodes described below illustrate the present invention's use of
separate per flow queuing, aggregation into group queues, and
congestion management to provide differentiated services.
[0054] Various types of traffic are shown, including user control
traffic (cntl 201), expedited forwarding traffic (EF 203), best
effort traffic (BE 213), and several types of assured forwarding
(AF) traffic such as AF4 (205), AF3 (207), AF2 (209), AF1 (211),
AF4.1+AF3.1 (215), AF4.23+AF3.23 (217), AF2.1+AF1.1 (219), and
AF2.23+AF1.23 (221). Also shown are policers 223 (only one policer
is labeled), traffic shapers S.sub.1% 227, S.sub.max1 229, and
S.sub.max2 231, group queues GID.sub.low1 233, GID.sub.low2 235,
and GID.sub.high1 237, and deficit round robin (DRR) schedulers
DRR1 239, DRR2 241, and DRR3 243. The policers, shapers, and
schedulers can be implemented as software modules, which operate on
traffic stored in the queues, although in other embodiments they
may be circuitry or a combination of circuitry and software.
[0055] Referring now to FIG. 5A, along with FIGS. 2C, 3 and 4, the
operations of the Ingress Side of the ingress node will now be
described in detail.
[0056] Incoming traffic is classified (401) by PP 1230 based on
priority using Differentiated Service Code Point (DSCP) via access
control lists (ACLs). While in other embodiments, other methods of
classifying traffic can be used, ACLs are preferred because they
can offer flexibility in the classification procedure. For example,
internet protocol (IP) and media access control (MAC) ACLs can be
used. Using IP ACLs, a user, such as a customer, can classify the
traffic not only on DSCP but also on other fields in Layer3 and
Layer4 headers such as, for example, source IP address, destination
IP address, IP Protocol, source port, destination port, and others.
Similarly, the embodiment can receive traffic that has been
previously classified by a user; using MAC ACLs, a user can classify
the traffic not only on 802.1p bits but also on other fields in the
Ethernet header such as, for example, source MAC address,
destination MAC address, ethertype, and others.
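As a rough illustration of this kind of ACL-based classification (a sketch only, not the classifier implemented in PP 1230), the following Python fragment matches a packet against an ordered list of rules keyed on DSCP and on assumed Layer 3/Layer 4 field names; the rule structure, field names, and class labels are all illustrative assumptions.

    # Illustrative only: ordered-ACL classification on DSCP and L3/L4 fields.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class AclRule:
        traffic_class: str                 # e.g. "EF", "AF4", "BE" (assumed labels)
        dscp: Optional[int] = None         # None means "match any value"
        src_ip: Optional[str] = None
        dst_ip: Optional[str] = None
        protocol: Optional[int] = None
        dst_port: Optional[int] = None

        def matches(self, pkt: dict) -> bool:
            fields = [("dscp", self.dscp), ("src_ip", self.src_ip),
                      ("dst_ip", self.dst_ip), ("protocol", self.protocol),
                      ("dst_port", self.dst_port)]
            return all(v is None or pkt.get(k) == v for k, v in fields)

    def classify(pkt: dict, acl: list) -> str:
        """Return the class of the first matching rule, else best effort."""
        for rule in acl:
            if rule.matches(pkt):
                return rule.traffic_class
        return "BE"

    # Example rules: EF by DSCP 46, AF4 by DSCP 34, plus a port-based override.
    acl = [AclRule("EF", dscp=46),
           AclRule("AF4", dscp=34),
           AclRule("AF1", protocol=6, dst_port=8080)]
    print(classify({"dscp": 46, "protocol": 17}, acl))   # -> "EF"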
[0057] Next, the traffic is queued (402) into separate per flow
queues. In particular, PP 1230 determines the queue in which to
place the traffic, and PM 1235 stores the traffic in queues.
Referring specifically to FIG. 3, separate per flow queues are used
for the following traffic--cntl 201 (e.g. MP-eBGP in the RFC2547
case), EF 203, AF4 (205), AF3 (207), AF2 (209), AF1 (211) and BE
213. In contrast to the RFC2547 IP-VPN embodiment, in the VPLS/VPWS
embodiment shown in FIG. 4, AF4 (205) and AF3 (207) share a queue,
and AF2 (209) and AF1 (211) share a queue.
[0058] Returning to FIG. 5A, the traffic is policed/colored (403)
by PS 1240. In particular, policing determines that traffic is "in
contract" if the amount of traffic in the particular queue is less
than or equal to a contractual limit set forth in the SLA. If the
amount of traffic exceeds the contractual limit, policing
determines the traffic is "out of contract." As shown in FIGS. 3
and 4, the queue carrying EF 203 is policed by a policer 223, and
only "in contract" traffic is allowed to pass, while "out of
contract" traffic is discarded. The cntl 201 and EF 203 are marked
as "green" traffic for the purpose of congestion control, which is
described in more detail below. The AF queues are policed by
policers 223 to mark the traffic as "in contract" or "out of
contract." The "in contract" AF4 (205) and AF3 (207) traffic is
marked "green," and the "in contract" AF2 (209) and AF1 (211)
traffic is marked as "yellow" traffic for congestion control. The
"out of contract" AF traffic is marked as "red" traffic, e.g., AF4
Red 245, AF3 Red 247, AF2 Red 249, and AF1 Red 251 (FIG. 3). BE 213
is marked as "red," i.e., "out of contract."
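The policing and coloring step can be pictured with a small token-bucket sketch; this is a minimal illustration assuming a single committed rate per queue, not the policer 223 itself, and the class names and color mapping below simply restate the marking rules described in this paragraph.

    import time

    class Policer:
        """Token-bucket contract check: traffic within the committed rate is
        "in contract"; traffic exceeding it is "out of contract"."""
        def __init__(self, cir_bps: float, burst_bytes: int):
            self.rate = cir_bps / 8.0          # bytes per second
            self.burst = float(burst_bytes)
            self.tokens = float(burst_bytes)
            self.last = time.monotonic()

        def in_contract(self, pkt_len: int) -> bool:
            now = time.monotonic()
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= pkt_len:
                self.tokens -= pkt_len
                return True
            return False

    def color(traffic_class: str, in_contract: bool) -> str:
        """Color marking as described above; out-of-contract EF is assumed to
        have been discarded before this point."""
        if traffic_class in ("cntl", "EF"):
            return "green"
        if traffic_class in ("AF4", "AF3"):
            return "green" if in_contract else "red"
        if traffic_class in ("AF2", "AF1"):
            return "yellow" if in_contract else "red"
        return "red"                            # BE is always marked "red"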
[0059] Returning to FIG. 5A, after policing, the traffic is
scheduled (404) by PS 1240 onto a group queue GID.sub.low1 233,
which is preferably stored in PS 1240. In particular, as shown in
FIGS. 3 and 4, cntl 201 and EF 203 are shaped by shapers S.sub.1%
227 and S.sub.max1 229, respectively. Specifically, the shapers
limit the traffic flow to a predetermined amount. The queues
carrying EF 203 and cntl 201 are considered strict priority queues.
The strict priority queues (carrying cntl 201 and EF 203) take
precedence over weighted queues (carrying AF traffic and BE 213) in
terms of scheduling, with all the per flow queues aggregating into
GID.sub.low1 233. The queues carrying the AF and BE traffic are
considered weighted queues and the traffic in the weighted queues
is scheduled by DRR1 239 onto GID.sub.low1 233. The user can change
the weights of weighted queues (carrying AF and BE traffic), though
default weights can be set according to the user's needs.
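The scheduling order onto GID.sub.low1 233 can be sketched as follows: strict priority queues are always drained first, and the weighted queues are then visited in turn (a plain round robin stands in here for DRR1 239, whose deficit bookkeeping is sketched separately further below). Queue names and packet contents are illustrative.

    from collections import deque
    from itertools import cycle

    def make_scheduler(strict_queues, weighted_queues):
        """Strict priority queues (cntl, EF) always go first; weighted AF/BE
        queues are then served round robin (a stand-in for DRR1)."""
        rr = cycle(range(len(weighted_queues)))
        def next_packet():
            for q in strict_queues:
                if q:
                    return q.popleft()
            for _ in range(len(weighted_queues)):   # one full pass over weighted queues
                q = weighted_queues[next(rr)]
                if q:
                    return q.popleft()
            return None                             # everything is empty
        return next_packet

    # Usage: pull packets one at a time toward the group queue GID.low1.
    cntl, ef = deque(["c1"]), deque(["e1", "e2"])
    af4, be = deque(["a1"]), deque(["b1"])
    sched = make_scheduler([cntl, ef], [af4, be])
    print([sched() for _ in range(5)])   # -> ['c1', 'e1', 'e2', 'a1', 'b1']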
[0060] The rate of traffic flow of GID.sub.low1 233 can be set by
S.sub.max2 231, which, for example, helps a service provider to
control the incoming traffic and enforce service level agreements
(SLAs), offering aggregate committed information rate and excess
information rate (CIR+EIR) to customers.
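A shaper such as S.sub.max2 231 differs from a policer in that it delays traffic rather than marking or discarding it. The sketch below models a shaper as a single rate with a burst allowance and reports how long a packet must wait; the rate and burst values are assumptions chosen only for illustration.

    class Shaper:
        """Token-bucket shaper: returns how long (in seconds) a packet of the
        given size must wait before it may leave the group queue."""
        def __init__(self, rate_bps: float, burst_bytes: int):
            self.rate = rate_bps / 8.0        # bytes per second
            self.burst = burst_bytes
            self.level = 0.0                  # bytes already committed to the line
            self.last = 0.0                   # time of the previous packet (seconds)

        def delay_for(self, pkt_len: int, now: float) -> float:
            self.level = max(0.0, self.level - (now - self.last) * self.rate)
            self.last = now
            self.level += pkt_len
            return max(0.0, (self.level - self.burst) / self.rate)

    shaper = Shaper(rate_bps=1_000_000, burst_bytes=1500)
    print(shaper.delay_for(1500, now=0.0))   # fits in the burst -> 0.0
    print(shaper.delay_for(1500, now=0.0))   # exceeds the burst -> 0.012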
[0061] Congestion management is performed (405) by PS 1240 at the
Ingress Side (FIGS. 3 and 4). Specifically, a congestion algorithm
is utilized at GID.sub.low1 233 to discard traffic based on a
specified preference order. The congestion algorithm is described
in more detail below in reference to FIGS. 12, 13 and 14. At
GID.sub.low1 233, the discard order is BE 213 and AF "out of contract"
traffic, followed by AF1 (211) and AF2 (209) "in contract" traffic,
followed by AF3 (207) and AF4 (205) "in contract" traffic, and
finally EF 203 and cntl 201. For AF and BE traffic under congestion control,
higher-weighted queues receive preference over, i.e., will be
discarded less often than, "out of contract" AF and BE traffic and
"in contract" AF traffic (however, they will not result in "out of
contract" AF traffic getting preference over "in contract" traffic
from lower weighted AF classes). In particular, higher-weighted
queues preferably are marked with a higher-priority color than "out
of contract" traffic.
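The congestion algorithm itself is described later with reference to FIGS. 12, 13 and 14; the fragment below is only a generic threshold sketch of the discard order stated in this paragraph, in which lower-priority colors are dropped at lower queue occupancies. The threshold values are arbitrary examples, not parameters of the invention.

    # Illustrative discard decision by color as the group queue fills.
    THRESHOLDS = {        # fraction of queue occupancy above which a color is dropped
        "red": 0.50,      # BE and "out of contract" AF go first
        "yellow": 0.75,   # "in contract" AF1/AF2
        "green": 0.95,    # "in contract" AF3/AF4, EF, and cntl go last
    }

    def should_discard(color: str, occupancy: float) -> bool:
        """occupancy is the group queue fill level, between 0 and 1."""
        return occupancy >= THRESHOLDS[color]

    print(should_discard("red", 0.6))     # True  - dropped under mild congestion
    print(should_discard("green", 0.6))   # False - kept until the queue is nearly full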
[0062] Referring to FIG. 5B, along with FIGS. 2C, 3 and 4, the
features of the Egress Side of the ingress node will now be
described in detail.
[0063] The traffic is marked (406) with an EXP number by PP 1260.
The EXP number preferably corresponds to a priority of the traffic,
as shown in the following table:

TABLE 1 - Traffic type to EXP mapping

  Traffic                          EXP Value
  User Control                     6
  EF                               5
  (AF4 and AF3) in contract        4
  (AF2 and AF1) in contract        3
  (AF4 and AF3) out of contract    2
  (AF2 and AF1) out of contract    1
  BE                               0
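Table 1 amounts to a simple lookup; the following is a minimal sketch, assuming the traffic class and contract status are already known for each packet (the key structure is an assumption).

    # EXP marking per Table 1.
    EXP_MAP = {
        ("cntl", True): 6,  ("EF", True): 5,
        ("AF4", True): 4,   ("AF3", True): 4,
        ("AF2", True): 3,   ("AF1", True): 3,
        ("AF4", False): 2,  ("AF3", False): 2,
        ("AF2", False): 1,  ("AF1", False): 1,
        ("BE", False): 0,
    }

    def exp_value(traffic_class: str, in_contract: bool) -> int:
        return EXP_MAP[(traffic_class, in_contract)]

    print(exp_value("AF3", True))    # -> 4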
[0064] The traffic is separated (407) into per flow queues based on
priority. In particular, PP 1260 determines the queue in which to
store the traffic, and PM 1250 stores the traffic in the queues.
The traffic is then scheduled (408) by PS 1255 onto separate high
or low group queues, which are located in PS 1255, according to
priority. Specifically, cntl 201 and EF 203, which are marked with
EXP values of EXP=6 and EXP=5, respectively, are placed on separate
per flow queues 255. These queues 255 aggregate into GID.sub.high1
237, which is a group queue of the highest priority. Cntl 201 and
EF 203 queues 255 preferably are scheduled in a round robin manner
among themselves, such as by DRR3 243, or are scheduled based on
some other scheduling criteria.
[0065] The AF and BE traffic are placed on weighted queues 257.
Specifically, "in contract" AF4 (205) and AF3 (207), which are
marked with EXP=4, are placed in a queue 215 for AF4.1+AF3.1
traffic. "In contract" AF2 (209) and AF1 (211), which are marked
with EXP=3, are placed in a queue 219 for AF2.1+AF1.1 traffic. AF4
Red 245 and AF3 Red 247, which are marked with EXP=2, are placed in
a queue 217 for AF4.23+AF3.23 traffic. AF2 Red 249 and AF1 Red
251, which are marked with EXP=1, are placed in a queue 221 for
AF2.23+AF1.23 traffic. BE 213, which is marked with EXP=0, is
placed in a queue for BE traffic. Accordingly, in GID.sub.low2 235
in both the RFC2547 IP-VPN and the VPLS/VPWS embodiments, the
traffic classes AF4 and AF3 share a queue, and traffic classes AF2
and AF1 share a queue. BE continues to be marked as "out of
contract" traffic and is on a separate BE queue, usually with a
minimal weight.
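Taken together, the last two paragraphs describe a mapping from EXP value to a second-stage per flow queue and its group queue; the sketch below restates that mapping, with the queue identifiers drawn from the figures but the exact names assumed for illustration.

    # Second-stage queue selection on the Egress Side of an originating node.
    EXP_TO_QUEUE = {
        6: ("q_cntl",        "GID_high1"),
        5: ("q_ef",          "GID_high1"),
        4: ("q_af41_af31",   "GID_low2"),   # AF4.1 + AF3.1 ("in contract")
        3: ("q_af21_af11",   "GID_low2"),   # AF2.1 + AF1.1 ("in contract")
        2: ("q_af423_af323", "GID_low2"),   # AF4.23 + AF3.23 ("out of contract")
        1: ("q_af223_af123", "GID_low2"),   # AF2.23 + AF1.23 ("out of contract")
        0: ("q_be",          "GID_low2"),
    }

    def second_stage_queue(exp: int):
        """Return (per flow queue, group queue) for an EXP-marked packet."""
        return EXP_TO_QUEUE[exp]

    print(second_stage_queue(4))   # -> ('q_af41_af31', 'GID_low2')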
[0066] The weighted queues 257 are scheduled by DRR2 241 and
aggregate into GID.sub.low2 235, which is a group queue of the
lower priority. The weights can be set in a multiple of fractional
maximum transmission unit (MTU) bytes for the weighted queues
257.
[0067] The group queues, GID.sub.high1 237 and GID.sub.low2 235
correspond to a single interface, such as a physical network line,
in a preferred embodiment. As a consequence, the user control and
EF traffic from all LSPs goes through the same GID.sub.high1 237,
and weighted traffic (AF and BE) from all LSPs goes through the
same GID.sub.low2 235.
[0068] Congestion management is performed (409) by PS 1255 at the
Egress Side (congestion at the interface). Preferably, the
congestion algorithm described in more detail below is utilized at
GID.sub.low2 235 and GID.sub.high1 237 to discard traffic based on
a specified preference order, such as, for example, the "out of
contract" traffic (e.g., BE 213, AF1 Red 251, AF2 Red 249, AF3 Red
247, AF4 Red 245 in FIG. 3) in an LSP being discarded before the AF
"in contract" traffic, followed by EF 203 and cntl 201. Among the
AF "in contract" traffic, AF1 and AF2 "in contract" traffic is
discarded before AF3 and AF4 "in contract" traffic.
[0069] The group queues, GID.sub.high1 237 and GID.sub.low2 235,
are scheduled (410) in strict priority mode with respect to each
other. Preferably, GID.sub.high1 237 has higher priority than
GID.sub.low2 235. This ensures that user control and EF traffic get
precedence over the AF and BE traffic.
[0070] In order to support open bandwidth or auto open bandwidth
LSPs, shaping on GID.sub.high1 237 and GID.sub.low2 235 can be
turned off, that is, shaping is set to a high rate. In addition,
none of the individual queues (user control, EF, AF and BE traffic)
are policed.
[0071] In the case of multiple open bandwidth or auto open
bandwidth LSPs going over an interface, in one embodiment all LSPs
can be treated as equal; in other words, there is no prioritization
or bias among the LSPs. Moreover, as described earlier, all of the open
bandwidth LSPs going through an interface share the same group
queues (GID.sub.high1 237 and GID.sub.low2 235) which can help
ensure that fairness among the LSPs is maintained.
[0072] Having described the sequence of operations within an
ingress (originating) node, specific functional modules
implementing the above-described operations from FIGS. 5A and 5B
will now be described. FIG. 5C is a collaboration diagram for
functional modules deployed in an ingress (originating) node, such
as LER 100, for offering differentiated services in accordance with
an exemplary embodiment of the present invention. The functional
modules can be implemented as software modules or objects. In other
embodiments, the functional modules may be implemented using
hardware modules or other types of circuitry, or a combination of
software and hardware modules. In particular, the functional
modules can be implemented via the PPs, PMs, and PSs described
above.
[0073] In operation, an ingress node classifier 411 classifies
incoming traffic preferably according to DSCP, and a per flow
scheduler 413 queues traffic into separate per flow queues. For
cntl and EF traffic, a policer 415 discards "out of contract" EF
traffic and marks cntl and "in contract" EF traffic as "green." A
shaper/scheduler 417 limits cntl and EF traffic to predefined
limits and schedules the traffic onto GID.sub.low1 233 in strict
priority over the AF and BE traffic. For the AF and BE traffic, a
policer 419 marks the AF traffic as "in contract" or "out of
contract." The "in contract" AF4 (205) and AF3 (207) traffic is
marked "green," and the "in contract" AF2 (209) and AF1 (211)
traffic is marked as "yellow." The "out of contract" AF traffic is
marked as "red" traffic, and BE 213 is marked as "red." A DRR
scheduler 421 schedules AF and BE traffic onto GID.sub.low1 233 in
a deficit round robin manner, or using other scheduling
criteria.
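Deficit round robin is a standard technique; the sketch below shows the quantum and deficit bookkeeping such a scheduler typically performs, with per-queue quanta playing the role of the configurable weights mentioned above. The queue names and quantum values are illustrative assumptions, not the DRR schedulers of the figures.

    from collections import deque

    class DrrScheduler:
        """Classic deficit round robin: each queue receives a byte quantum per
        round, and a packet is sent only when the queue's deficit covers it."""
        def __init__(self, quanta):
            self.queues = {name: deque() for name in quanta}
            self.quanta = dict(quanta)
            self.deficit = {name: 0 for name in quanta}

        def enqueue(self, name, pkt_len):
            self.queues[name].append(pkt_len)

        def round(self):
            """Serve one full round; returns a list of (queue, packet_length)."""
            sent = []
            for name, q in self.queues.items():
                if not q:
                    continue
                self.deficit[name] += self.quanta[name]
                while q and q[0] <= self.deficit[name]:
                    pkt = q.popleft()
                    self.deficit[name] -= pkt
                    sent.append((name, pkt))
                if not q:
                    self.deficit[name] = 0   # unused deficit is not carried over
            return sent

    # AF4+AF3 weighted twice as heavily as AF2+AF1 (illustrative quanta, in bytes).
    drr = DrrScheduler({"AF43": 3000, "AF21": 1500, "BE": 500})
    for q, size in [("AF43", 1500), ("AF43", 1500), ("AF21", 1500), ("BE", 400)]:
        drr.enqueue(q, size)
    print(drr.round())
    # -> [('AF43', 1500), ('AF43', 1500), ('AF21', 1500), ('BE', 400)]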
[0074] A group queue congestion manager 423 applies congestion
control to the traffic, and a shaper 425 limits traffic from
GID.sub.low1 233. A traffic marker 427 marks traffic with
corresponding EXP numbers, and a per flow scheduler 429 schedules
the marked traffic into separate per flow queues. A DRR scheduler
431 preferably schedules EXP=6 and EXP=5 traffic onto GID.sub.high1
237 in a deficit round robin manner (or using another suitable
scheduling technique), and a high queue congestion manager 433
applies congestion management to GID.sub.high1 237. For the EXP=0
to EXP=4 traffic, a DRR scheduler 435 schedules the traffic onto
GID.sub.low2 235, and a low queue congestion manager 437 applies
congestion management to GID.sub.low2 235. Finally, an egress
scheduler 439 schedules traffic from GID.sub.high1 237 and
GID.sub.low2 235, with traffic from GID.sub.high1 237 preferably
being scheduled in strict priority over traffic from GID.sub.low2
235.
[0075] Traffic scheduled from an ingress (originating) node in the
above-described manner is then sent to a transit node (or an egress
(terminating) node). For example, a customer's network traffic is
sent from an ingress node implemented in LER 100 to a transit node
implemented in LSR 105. Turning now to FIG. 6, a transit node
according to one embodiment of the present invention will now be
described. For example, the embodiment shown in FIG. 6 could be
implemented in LSR 105 of FIG. 1. The transit node of the present
embodiment is operable to receive traffic from an ingress node of
the RFC2547 IP-VPN embodiment (FIG. 3) or from an ingress node of
the VPLS/VPWS embodiment (FIG. 4). The transit node is shown as having two parts: an
"Ingress Side"; and an "Egress Side." The transit node includes
deficit round robin schedulers, DRR4 501, DRR5 503, DRR6 505, and
DRR7 507, and group queues GID.sub.high2 509, GID.sub.high3 511,
GID.sub.low3 513, and GID.sub.low4 515. The transit node preserves
the EXP values for LSP traffic flowing through it. In addition,
traffic prioritization in a transit node is the same as in the
originating node. The policers, shapers, and schedulers can be implemented as
software modules, which operate on traffic stored in the queues,
although in other embodiments they may be circuitry or a
combination of circuitry and software.
[0076] The embodiment of a transit node described below illustrates
the present invention's use of separate per flow queuing,
aggregation into group queues, and congestion management to provide
differentiated services.
[0077] Referring to FIG. 7A, along with FIGS. 2C and 6, the
operations performed by the transit node will now be described in
detail.
[0078] At ingress, traffic is queued (601) in separate per flow
queues. In particular, PP 1230 determines the queue in which to
store the traffic, and PM 1235 stores the traffic in queues. The
traffic is scheduled (602) by PS 1240 into a high priority or low
priority group queue, where congestion management, described in
more detail below, is performed (603). At egress the process is
repeated. Specifically, traffic is queued (604) by PP 1260 and PM
1250 in separate per flow queues, and scheduled (605) by PS 1255
into a high priority or low priority group queue, where congestion
management, described in more detail below, is performed (606).
Traffic is then scheduled (607) from the high priority and low
priority group queues by PS 1255.
[0079] For example, in a preferred embodiment of the invention,
user control and EF traffic are carried by the strict priority
queues 517 and 521, which are scheduled by DRR4 501 and DRR6 505,
respectively, and aggregated into the group queues GID.sub.high2
509 and GID.sub.high3 511. The user control and EF queues 517 and
521 are scheduled in round robin manner among themselves through
DRR4 501 and DRR6 505, although other scheduling criteria also may
be used.
[0080] AF and BE traffic are mapped to weighted queues 519 and 523
for scheduling by DRR5 503 and DRR7 507, respectively, and the
resulting traffic is aggregated into the group queues GID.sub.low3
513 and GID.sub.low4 515. Instead of putting each AF traffic type
on a separate queue, AF4 and AF3 traffic are mapped on one queue
and AF2 and AF1 traffic are mapped on another queue. This setup
allows the number of per flow queues on a universal line card (ULC)
to be conserved, and thereby can help to scale the transit LSPs
over an interface to a reasonably large number.
[0081] Similar to the description above regarding the ingress node,
the group queues at each interface are preferably scheduled in
strict priority mode, with GID.sub.high2 509 having higher priority
than GID.sub.low3 513, and GID.sub.high3 511 having higher priority
than GID.sub.low4 515. The group queues, GID.sub.high3 511 and
GID.sub.low4 515 are per interface, that is, all the LSPs on a
given interface are transmitted through GID.sub.high3 511 and
GID.sub.low4 515.
[0082] Preferably, when a node is acting as a transit node, the
shapers at the group queues are kept disabled. Also, there is no
policing at the per flow queues (user control, EF, AF, and BE
queues). Accordingly, traffic shapers and policers are not shown in
FIG. 6.
[0083] In addition, under congestion at the interfaces, either the
Ingress Side or the Egress Side, a congestion algorithm described
in more detail below is utilized at the group queues of the
corresponding interface to discard traffic based on a specified
preference order. For example, at the Ingress Side, which
corresponds to GID.sub.high2 509 and GID.sub.low3 513, BE traffic
and "out of contract" AF traffic (EXP=0, EXP=1, and EXP=2 traffic)
preferably is discarded before "in contract" AF1 and AF2 traffic
(EXP=3 traffic), followed by "in contract" AF3 and AF4 traffic
(EXP=4), EF traffic (EXP=5), and user control traffic (EXP=6).
[0084] Because the GID.sub.high traffic preferably has higher
scheduling priority than GID.sub.low traffic, the EF and user
control traffic from all the transit and ingress nodes take
precedence over AF and BE traffic. AF and BE queues 519 and 523 are
scheduled in deficit round robin (DRR) mode, or using another
scheduling technique.
[0085] Having described the sequence of operations within a transit
node, specific functional modules implementing the operations of
the node will now be described. FIG. 7B is a collaboration diagram
for functional modules deployed in a transit node, such as LSR 105,
for offering differentiated services in accordance with an
exemplary embodiment of the present invention. The functional
modules can be implemented as software modules or objects. In other
embodiments, the functional modules may be implemented using
hardware modules or other types of circuitry, or a combination of
software and hardware modules. In particular, the functional
modules can be implemented via the PPs, PMs, and PSs described
above.
[0086] In operation, an ingress per flow scheduler 609 schedules
traffic onto high priority or low priority queues. The cntl and EF
traffic is scheduled by a DRR scheduler 611 into GID.sub.high2 509,
and a high queue congestion manager 613 performs congestion
management on GID.sub.high2 509. A DRR scheduler 615 schedules
traffic from GID.sub.high2 509 onto GID.sub.high3 511, and high
queue congestion manager 617 performs congestion management on
GID.sub.high3 511.
[0087] Similarly, the AF and BE traffic is scheduled by a DRR
scheduler 619 into GID.sub.low3 513, and a low queue congestion
manager 621 applies congestion management. A DRR scheduler 623 then
schedules traffic from GID.sub.low3 513 onto GID.sub.low4 515, and
low queue congestion manager 625 applies congestion management. An
egress scheduler 627 schedules traffic from GID.sub.high3 511 and
GID.sub.low4 515, with GID.sub.high3 511 preferably given strict
priority over GID.sub.low4 515.
[0088] Having described originating and transit nodes, it is noted
that transit and originating LSPs can use a common interface (e.g.,
a physical network line), for example, by aggregating all the per
flow queues from transit and originating LSPs behind the same group
queues. FIG. 8 shows an example of a single port/interface 700
preferably used by multiple per flow queues of various transit LSPs
703 and originating LSPs 705. Multiple queues of AF and BE traffic
are aggregated behind a GID.sub.low 707 group queue having a byte count
of, for example, 10 Gbytes. Multiple per flow queues of user
control and EF traffic are aggregated into a GID.sub.high 709 group
queue having a byte count of, for example, 10 Gbytes.
[0089] After having traversed an ingress node, such as LER 100, and
possibly one or more transit nodes, such as LSR 105, network
traffic reaches an egress (terminating) node, such as LER 120. For
example, LER 120 would be the final node of the LSP traversed by a
customer's network traffic, and this egress (terminating) node
schedules the customer's traffic for forwarding to its final
destination.
[0090] Referring now to FIGS. 9 and 10, an egress (terminating)
node will now be described in detail. FIG. 9 illustrates an egress
node of the RFC2547 IP-VPN embodiment, and FIG. 10 shows an egress
node of the VPLS/VPWS embodiment. Similar to the description above
regarding the ingress node, for the following description the
egress node is described as having two parts: a terminating LSP
side (labeled "Ingress Side"); and an egress access side (labeled
"Egress Side"). The policers, shapers, and schedulers can be
implemented as software modules, which operate on traffic stored in
the queues, although in other embodiments they can be circuitry or
a combination of circuitry and software.
[0091] The embodiments of egress nodes described below illustrate
the present invention's use of separate per flow queuing,
aggregation into group queues, and congestion management to provide
differentiated services.
[0092] Referring also to FIG. 11A in conjunction with FIGS. 2C, 9
and 10, operation of the Ingress Side of the egress node will now
be described in detail.
[0093] The traffic prioritization in a terminating LSP is handled
in the same manner as transit and originating LSPs. Incoming
traffic is queued (1001) in separate per flow queues. In
particular, PP 1230 determines the queue in which to schedule the
traffic, and PM 1235 stores the traffic in the queues. The traffic
is scheduled (1002) by PS 1240 onto high or low priority group
queues, where congestion management is performed (1003).
[0094] Preferably, the user control and EF traffic are placed
(1001) into separate queues 821, which are scheduled (1002) by DRR8
801 onto group queue, GID.sub.high4 803. AF and BE traffic are
mapped (1001) to weighted queues 823 for scheduling (1002) by DRR9
805 onto group queue, GID.sub.low5 807. In both the RFC2547 IP-VPN
and the VPLS/VPWS embodiments, the AF4 and AF3 traffic share a
queue, and the AF2 and AF1 traffic share a queue. The BE traffic is
on a separate queue, usually with a minimal weight.
[0095] The group queue GID.sub.high4 803 corresponds to a single
physical interface (line), and GID.sub.low5 807 likewise corresponds
to a single interface (line). That is, the user control and EF traffic
from all LSPs over an interface is transmitted through a same
GID.sub.high4 803 group queue, and the weighted traffic (AF and BE)
from all LSPs over an interface is transmitted through the same
GID.sub.low5 807 group queue.
[0096] GID.sub.high4 803 and GID.sub.low5 807 are scheduled in
strict priority mode with respect to each other. The GID.sub.high4
803 preferably has higher priority than GID.sub.low5 807. This
helps ensure that EF and user control traffic get precedence over
the AF and BE traffic.
[0097] The shapers for GID.sub.high4 803 and GID.sub.low5 807, as
well as policing on individual queues, can be turned off in order
to accommodate open bandwidth and auto open bandwidth LSPs.
Accordingly, traffic shapers and policers are not shown for these
GIDs.
[0098] Under congestion at the interface, congestion management is
performed (1003) by PS 1240. Preferably, BE traffic and "out of
contract" AF traffic (EXP=0, EXP=1, and EXP=2) are discarded before
"in contract" AF1 and AF2 traffic (EXP=3), followed by "in
contract" AF3 and AF4 traffic (EXP=4), EF traffic (EXP=5) and user
control traffic (EXP=6).
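The discard precedence described here can be summarized as an ordering over
EXP values. This is a hedged sketch only: the EXP values come from the
text, the tier numbers are an assumption used solely for ordering, and the
actual discard mechanism is the RED algorithm described later.

DROP_TIER = {
    0: 0, 1: 0, 2: 0,   # BE and "out of contract" AF -- discarded first
    3: 1,               # "in contract" AF1 and AF2
    4: 2,               # "in contract" AF3 and AF4
    5: 3,               # EF
    6: 4,               # user control -- discarded last
}

def discard_before(exp_a, exp_b):
    """True if a packet with EXP value exp_a is dropped before one with exp_b."""
    return DROP_TIER[exp_a] < DROP_TIER[exp_b]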
[0099] Referring to FIG. 11B, along with FIGS. 2C, 9 and 10,
operations of the Egress Side of the egress node will now be
described in detail.
[0100] For RFC2547 IP-VPN (FIG. 9), the outgoing IP traffic is
classified (1004) by PP 1260 based on DSCP into various diffserv
classes, enabling the user to apply QoS parameters (policing,
appropriate service class, etc.) per diffserv class. The
outgoing IP traffic is queued (1005) into multiple per flow queues
825, that is, separate queues for user control, EF, AF4, AF3, AF2,
AF1 and BE traffic. In particular, PP 1260 determines the queue in
which to schedule traffic, and PM 1250 stores the traffic in the
queues. DSCP is preserved in the outgoing IP stream for RFC 2547
IP-VPN traffic. In contrast, for VPLS and multi-class VPWS (FIG.
10), the outgoing AF4 and AF3 traffic share a common queue, and the
outgoing AF2 and AF1 traffic share another common queue.
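As a sketch of this classification step, the following maps DSCP
codepoints to the diffserv classes named above. The text states only that
classification is based on DSCP; the specific codepoint values below are
the standard RFC 2474/2597/3246 assignments and are an assumption here,
not values taken from the embodiment.

AF_CODEPOINTS = {
    10: "af1", 12: "af1", 14: "af1",   # AF11-AF13
    18: "af2", 20: "af2", 22: "af2",   # AF21-AF23
    26: "af3", 28: "af3", 30: "af3",   # AF31-AF33
    34: "af4", 36: "af4", 38: "af4",   # AF41-AF43
}

def classify(dscp):
    """Step 1004: map a DSCP codepoint to a per flow queue class (assumed mapping)."""
    if dscp == 46:            # EF codepoint
        return "ef"
    if dscp >= 48:            # CS6/CS7, treated here as user control traffic
        return "cntl"
    return AF_CODEPOINTS.get(dscp, "be")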
[0101] QoS parameters can be applied to the outgoing traffic
through policing (1006) the per flow queues 825 by PS 1255.
Specifically, the per flow queues carrying EF and AF traffic are
policed by policers 809. Preferably, the EXP=3 traffic is treated
as "green" upon entering policer 809; this helps EXP=3 traffic
compete with EXP=4 traffic for bandwidth. The traffic is then
scheduled (1007) by PS 1255 onto a group queue, which preferably is
stored in PS 1255. In particular, the user control traffic is
shaped by traffic shaper S.sub.1%2 811, and the EF traffic is shaped
by traffic shaper S.sub.max3 813. The user control and EF traffic are
considered strict priority queues. The strict priority queues take
precedence over weighted queues (carrying AF traffic and BE 213) in
terms of scheduling, with all the per flow queues aggregating into
GID.sub.low6 817. The policed AF traffic and the BE traffic are
mapped into separate queues for AF1, AF2, AF3, AF4, and BE traffic,
and scheduled by DRR10 815 onto GID.sub.low6 817. The rate of
traffic flow of GID.sub.low6 817 can be set by S.sub.max4 819.
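The policing step (1006) can be sketched as a token bucket that colors
conforming traffic "green" and excess traffic "yellow". The text does not
specify the policing algorithm, and the rate and burst parameters below
are illustrative assumptions; per the description above, EXP=3 traffic
would be handed to policer 809 already treated as green.

import time

class Policer:
    """Simple single-rate token bucket marker (assumed algorithm, not the patent's)."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0          # token fill rate in bytes per second
        self.burst = burst_bytes            # bucket depth
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def color(self, packet_len):
        """Return 'green' if the packet conforms to the contract, else 'yellow'."""
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_len <= self.tokens:
            self.tokens -= packet_len
            return "green"
        return "yellow"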
[0102] Congestion management is performed (1008) by PS 1255 in
GID.sub.low6 817. Under congestion at the interface, the "out of
contract" traffic (BE, AF1 Red, AF2 Red, AF3 Red, AF4 Red) is
discarded before the AF "in contract" traffic, EF and control
traffic. The queue thresholds preferably are set in such a way that
Red traffic is discarded first, followed by Yellow and finally
Green. Finally, the traffic is scheduled (1009) from GID.sub.low6
817.
[0103] Having described the sequence of operations within an egress
(terminating) node, specific functional modules implementing the
operations will now be described. FIG. 11C is a collaboration
diagram for functional modules deployed in an egress (terminating)
node, such as LER 120, for offering differentiated services in
accordance with an exemplary embodiment of the present invention.
The functional modules can be implemented as software modules or
objects. In other embodiments, the functional modules can be
implemented using hardware modules or other types of circuitry, or
a combination of software and hardware modules. In particular, the
functional modules can be implemented via the PPs, PMs, and PSs
described above.
[0104] In operation, an ingress per flow scheduler 1011 schedules
incoming traffic onto high priority or low priority queues. The
cntl and EF traffic is scheduled by a DRR scheduler 1013 into
GID.sub.high4 803, and a high queue congestion manager 1015
performs congestion management on GID.sub.high4 803. A policer 1017
polices EF traffic and marks "in contract" EF traffic as "green,"
and "out of contract" EF traffic as "yellow." A shaper/scheduler
1019 limits cntl and EF traffic and preferably schedules the
traffic onto GID.sub.low6 817 in strict priority over AF and BE
traffic.
[0105] Similarly, the AF and BE traffic is scheduled by a DRR
scheduler 1021 into GID.sub.low5 807, and a low queue congestion
manager 1023 applies congestion management. A DSCP classifier 1025
classifies traffic preferably according to DSCP, and a policer 1027
marks "in contract" EXP=4 and EXP=3 traffic as "green," marks "out
of contract" EXP=4 and EXP=3 traffic as "yellow," and marks EXP=2
and EXP=1 traffic as "red." A DRR scheduler 1029 then schedules
traffic onto GID.sub.low6 817.
[0106] A group queue congestion manager 1031 applies congestion
management to GID.sub.low6 817, and an egress shaper/scheduler 1033
limits traffic from GID.sub.low6 817 and schedules traffic from
GID.sub.low6 817.
[0107] In one advantage of the above embodiments, if open bandwidth
transit and terminating LSPs are transmitted over the same
interface, the LSPs are fairly treated, since the group queues are
shared by all the open bandwidth LSPs traversing over the same
interface. This is similar to the case explained above in which
open bandwidth transit and originating LSPs traverse over the same
interface.
[0108] The embodiments described above utilize a congestion
algorithm to determine when and how to discard traffic. The
congestion algorithm will now be described in detail; however, one
skilled in the art will recognize that other suitable congestion
methods can be used.
[0109] The congestion algorithm, or random early discard (RED)
algorithm, uses the following factors to decide whether to discard
a packet: the color of the packet; the queue byte count size; and
congestion parameters. In the case of three colors of traffic (red,
yellow, and green), there are four congestion parameters, RedMin,
YelMin, GrnMin, and GrnMax. Referring to FIG. 12, the byte count of
the queue is divided into regions that are defined by the
congestion parameters. Specifically, a Pass value equals
2.sup.RedMin, a Red value equals Pass+2.sup.YelMin, a Yellow value
equals Pass+Red+2.sup.GrnMin, and a Green value equals
Pass+Red+Yellow+2.sup.GrnMax. A Pass Region corresponds to a byte
count range of zero (0) to Pass, a Red Region corresponds to a byte
count range of Pass to Red, a Yellow Region corresponds to a byte
count range of Red to Yellow, a Green Region corresponds to a byte
count range of Yellow to Green, and a Fail Region corresponds to a
byte count range above Green.
[0110] The congestion parameters preferably work in powers of two
as shown in FIG. 12. This ensures that the distance between two
levels is always a power of two, which keeps the number of bits
used to hold the parameters to a minimum while allowing values up
to 2.sup.31. For example, if the value in RedMin is 5, the Pass
value is 2.sup.5=32, and furthermore if YelMin is 6, the Red value
is 2.sup.5+2.sup.6=96. One exception exists: if the parameter is 0,
the value is zero rather than 2.sup.0.
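The threshold arithmetic of paragraphs [0109] and [0110], including the
zero-parameter exception, can be reproduced directly. The additive form
below follows the text as written, and the asserts check the worked
example with RedMin=5 and YelMin=6.

def term(param):
    """A parameter of 0 contributes 0 rather than 2**0, per the stated exception."""
    return 0 if param == 0 else 2 ** param

def thresholds(red_min, yel_min, grn_min, grn_max):
    """Return the (Pass, Red, Yellow, Green) values defined above."""
    pass_val = term(red_min)
    red = pass_val + term(yel_min)
    yellow = pass_val + red + term(grn_min)
    green = pass_val + red + yellow + term(grn_max)
    return pass_val, red, yellow, green

# Worked example from paragraph [0110]:
pass_val, red, _, _ = thresholds(5, 6, 0, 0)
assert pass_val == 32    # 2**5
assert red == 96         # 2**5 + 2**6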
[0111] When a packet arrives at the queue for which congestion
management is performed, the byte count of the queue is compared to
the threshold corresponding to the packet color to determine if it
is to be passed (scheduled) or discarded.
[0112] FIG. 13 is a graphical representation of the RED algorithm.
If a packet arrives when the byte count of the queue is between 0
and Pass, the packet is passed regardless of the color of the
packet. If a packet arrives when the byte count is greater than
Green, the packet is discarded regardless of the color of the
packet. If a packet arrives when the byte count is between Pass and
Green, the decision to discard or pass the packet depends on the
color of the packet. For example, if a yellow packet arrives when
the byte count is between 0 and Red, the packet is passed. If a
yellow packet arrives when the byte count is greater than Yellow,
the packet is discarded. Lastly, if a yellow packet arrives when
the byte count is between Red and Yellow, the packet is kept or
discarded with a linear probability: for example, if the byte count
is 75% of the way from Red to Yellow, there is a 75% chance that the
packet will be discarded.
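As a small numeric check of the linear probability just described (the
threshold values below are arbitrary placeholders):

red_thr, yellow_thr = 96, 224                        # placeholder region boundaries
byte_count = red_thr + 0.75 * (yellow_thr - red_thr)
p_discard = (byte_count - red_thr) / (yellow_thr - red_thr)
assert abs(p_discard - 0.75) < 1e-9                  # 75% of the way -> 75% drop chance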
[0113] FIG. 14 is a flowchart of an exemplary RED algorithm that
can be utilized in conjunction with the present invention. When a
packet arrives, for example at a group queue, the color of the
packet is determined (1401), and the byte count of the group queue
is determined (1402). The byte count of the queue is compared
(1403) to GrnMax. If the byte count is greater than GrnMax, the
packet is discarded (1404). However, if the byte count is not
greater than GrnMax, the byte count is compared (1405) to Pass. If
the byte count is less than Pass, the packet is enqueued (1406). On
the other hand, if the byte count is not less than Pass, a
determination is made (1407) whether the packet is "green." If the
packet is "green," the byte count is compared (1408) to GrnMin. If
the byte count is less than GrnMin, the packet is enqueued (1409).
On the other hand, if the byte count is not less than GrnMin, the
linear probability described above is applied (1410) to determine
if the packet is enqueued or discarded.
[0114] At 1407, if the packet's color is determined not to be
"green," a determination is made (1411) whether the color of the
packet is "yellow." If the packet is "yellow," the byte count is
compared (1412) to YelMin. If the byte count is less than YelMin,
the packet is enqueued (1413). On the other hand, if the byte count
is not less than YelMin, the byte count is compared (1414) to
GrnMin. If the byte count is greater than GrnMin, the packet is
discarded (1415). However, if the byte count is not greater than
GrnMin, the linear probability described above is applied (1416) to
determine whether the packet is enqueued or discarded.
[0115] At 1411, if the packet's color is determined not to be
"yellow," the byte count is compared (1417) to RedMin. If the byte
count is not greater than RedMin, the linear probability described
above is applied (1418) to determine if the packet is enqueued or
discarded. On the other hand, if the byte count is greater than
RedMin, the packet is discarded (1419).
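The flowchart of FIG. 14 can be rendered as a short decision function. The
sketch below follows the flowchart literally, treating Pass, RedMin,
YelMin, GrnMin, and GrnMax as byte count thresholds as the flowchart does;
the boundaries used for each linear probability region are an
interpretation of FIG. 13 and are not spelled out in the flowchart itself.

import random

def linear_discard(byte_count, lo, hi):
    """Discard with probability rising linearly from 0 at lo to 1 at hi."""
    if hi <= lo:
        return False
    return random.random() < (byte_count - lo) / (hi - lo)

def red_decision(color, byte_count, pass_thr, red_min, yel_min, grn_min, grn_max):
    """Return 'enqueue' or 'discard' for an arriving packet (steps 1401-1419)."""
    if byte_count > grn_max:                                  # 1403 -> 1404
        return "discard"
    if byte_count < pass_thr:                                 # 1405 -> 1406
        return "enqueue"
    if color == "green":                                      # 1407
        if byte_count < grn_min:                              # 1408 -> 1409
            return "enqueue"
        return "discard" if linear_discard(byte_count, grn_min, grn_max) else "enqueue"
    if color == "yellow":                                     # 1411
        if byte_count < yel_min:                              # 1412 -> 1413
            return "enqueue"
        if byte_count > grn_min:                              # 1414 -> 1415
            return "discard"
        return "discard" if linear_discard(byte_count, yel_min, grn_min) else "enqueue"
    # red packet                                              # 1417
    if byte_count > red_min:                                  # -> 1419
        return "discard"
    return "discard" if linear_discard(byte_count, pass_thr, red_min) else "enqueue"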
[0116] In other embodiments, the above algorithm can be employed in
conjunction with CIDs ("connection identifiers," which correspond
to per flow queues on a line card), GIDs and VOs (virtual output
queues) in a hierarchical manner. For example, each resource has
its own set of thresholds and byte counts. The byte counts are
summed across the resources. So, for instance, if there are 10 CIDs
to the same GID each with a byte count of 100, then the GID byte
count will be 100.times.10=1000 bytes. Similarly, if there are 3
GIDs to a VO (port+priority), then the VO byte count is the sum of
the byte counts of all 3 GIDs corresponding to that VO. When a
packet arrives at a resource, the total byte count for that
resource is compared to the threshold of that resource that
corresponds to the color of the packet. When a packet is accepted
(i.e. not discarded) the byte counts of the associated CID, GID,
and VO are incremented by the packet size at the same time. When
the packet is transmitted the byte counts of the CID, GID, and VO
are decremented by the packet size. This model is, in effect, a
hierarchical RED model.
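The hierarchical accounting described in this paragraph can be sketched as
shared byte counters that are incremented on acceptance and decremented on
transmission; the dictionary layout and resource naming below are
illustrative assumptions.

from collections import defaultdict

byte_count = defaultdict(int)      # keyed by ("cid", id), ("gid", id), ("vo", id)

def accept(packet_len, cid, gid, vo):
    """Called when the packet is accepted (not discarded) at a CID."""
    for key in (("cid", cid), ("gid", gid), ("vo", vo)):
        byte_count[key] += packet_len

def transmit(packet_len, cid, gid, vo):
    """Called when the packet is transmitted; reverses the accounting above."""
    for key in (("cid", cid), ("gid", gid), ("vo", vo)):
        byte_count[key] -= packet_len

# Example from the text: 10 CIDs, each holding 100 bytes, mapped to one GID.
for cid in range(10):
    accept(100, cid=cid, gid=0, vo=0)
assert byte_count[("gid", 0)] == 1000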
[0117] The thresholds for ingress are used to enforce the traffic
contract, and the final values reflect a combination of competing
factors. One factor is the intended delay of the traffic class. The
delay is longer for lower priority traffic classes to absorb
bursting, and shorter for higher priority traffic classes to
minimize delay. For example, for EF and the user control traffic
(each has its own CID), which are shaped at the ingress user/access
side interface, the delay is lower as compared to AF and BE traffic
classes.
[0118] Another factor is a minimum value for the "pass" region
(RedMin threshold) that allows a certain number of MTUs. This is
to prevent prematurely discarding packets due to any internal
delays or jitter within the hardware. Another factor is a fixed
maximum value per traffic class to prevent allocating too large a
threshold for a particular CID. An additional factor is a maximum
burst size (MBS) calculation where appropriate for the service
class and circuit type.
[0119] Once an overall "buffer size" (maximum byte count) has been
calculated and the RedMin adjustment determined, the thresholds are
divided up among the possible colors. If any colors are missing,
the corresponding thresholds are zero (not used). For instance, the
user control traffic class at the user/interface side has only green
packets, so its YelMin and GrnMin values are zero.
[0120] For egress, the goals of congestion control preferably are
(1) to isolate non-congested queues from the impact of one or a few
connections, (2) to guarantee minimum rates--discard all
red before green, (3) to minimize delay under congestion
(especially for higher priorities), (4) to enforce traffic
contracts, (5) to buffer reasonable bursts without discarding, and
(6) to allow more buffering for lower priorities. The CID, GID, and
VO thresholds combine to allow realization of these goals. Within a
traffic class the CID, GID and VO have their separate RedMin,
YelMin and GrnMin thresholds, though the Green threshold is the same
for all the queues (CID, GID, VO). Each traffic class (e.g., EF, AF,
BE) has its own CID threshold, with lower thresholds for higher
priority classes (like EF). The individual CID thresholds and GID,
VO thresholds are adjusted such that BE and "out of contract" AF
traffic is discarded before "in contract" AF, EF and user control
traffic.
[0121] Although this invention has been described in certain
specific embodiments, many additional modifications and variations
would be apparent to those skilled in the art. It is therefore to
be understood that this invention may be practiced otherwise than
as specifically described. Thus, the present embodiments of the
invention should be considered in all respects as illustrative and
not restrictive, the scope of the invention to be determined by any
claims supportable by this application and the claims' equivalents
rather than the foregoing description.
APPENDIX A--ACRONYM LIST
[0122] ACL--access control list
[0123] AF--assured forwarding traffic
[0124] ASIC--application specific integrated circuit
[0125] BE--best effort traffic
[0126] CID--connection identifier
[0127] CIR+EIR--committed information rate and excess information rate
[0128] DRR--deficit round robin
[0129] DSCP--Differentiated Service Code Point
[0130] EF--expedited forwarding traffic
[0131] EXP--experimental
[0132] FG--fabric gateway
[0133] GID--group identifier (group queue)
[0134] IP--internet protocol
[0135] LER--label edge router
[0136] LSP--label switched path
[0137] LSR--label switching router
[0138] MAC--media access control
[0139] MBS--maximum burst size
[0140] MPLS--multiprotocol label switching
[0141] MTU--maximum transmission unit
[0142] PLM--physical line module
[0143] PM--packet manager
[0144] PP--packet processor
[0145] PS--packet scheduler
[0146] QoS--quality of service
[0147] RED--random early discard
[0148] SCC--switch and control card
[0149] SLA--service level agreement
[0150] ULC--universal line card
[0151] VO--virtual output queue
[0152] WRR--weighted round robin
* * * * *