U.S. patent number 7,580,350 [Application Number 10/835,532] was granted by the patent office on 2009-08-25 for system for deriving packet quality of service indicator.
This patent grant is currently assigned to Extreme Networks, Inc. Invention is credited to David K. Parker.
United States Patent |
7,580,350 |
Parker |
August 25, 2009 |
Please see images for: (Certificate of Correction) |
System for deriving packet quality of service indicator
Abstract
A system for deriving a quality of service indicator for a
packet is described. A register holds a control element. A first
resource derives a first candidate quality of service indicator for
the packet. A second resource derives a second candidate quality of
service indicator for the packet responsive at least in part to the
control element held in the register. Resolution logic derives the
quality of service indicator for the packet from the first and
second candidate quality of service indicators.
Inventors: |
Parker; David K. (Cheltenham,
GB) |
Assignee: |
Extreme Networks, Inc. (Santa
Clara, CA)
|
Family
ID: |
40417079 |
Appl.
No.: |
10/835,532 |
Filed: |
April 28, 2004 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number | Issue Date
60558039 | Mar 30, 2004 | |
Current U.S.
Class: |
370/230;
370/395.21; 370/395.42; 370/395.43; 370/412; 709/223 |
Current CPC
Class: |
H04L
12/56 (20130101); H04L 45/745 (20130101) |
Current International
Class: |
H04L
12/56 (20060101) |
References Cited
Other References
Van Ess, D., A Circular FIFO, PSoC Style, Cypress Microsystems,
Application Note AN2036, Rev. B, pp. 1-5, Jun. 21, 2002. cited by
other .
Network Working Group, RFC 1071--Computing the Internet Checksum,
Sep. 1988, available at: http://www.faqs.org/rfcs/rfc1071.html.
cited by other .
Network Working Group, RFC 1141--Incremental Updating of the
Internet Checksum, Jan. 1990, available at:
http://www.faqs.org/rfcs/rfc1141.html. cited by other .
Network Working Group, RFC 1624--Computation of the Internet
Checksum via Incremental Update, May 1994, available at:
http://www.faqs.org/rfcs/rfc1624.htm1. cited by other .
Netlogic Microsystems, Product Brief NSE5000GLQ, Copyright 2004,
available at:
http://www.netlogicmicro.com/datasheets/nse5000g1q.html. cited by
other .
Zhang, Z., Recovery of Memory and Process in DSM Systems: HA Issue
#1, Hewlett-Packard Co. HPL-2001-76, Mar. 30, 2001. cited by
other.
|
Primary Examiner: Pham; Chi H.
Assistant Examiner: Hom; Shick
Attorney, Agent or Firm: Howrey LLP
Parent Case Text
This application claims the benefit of U.S. Provisional Application
Ser. No. 60/558,039, filed Mar. 30, 2004, which is hereby fully
incorporated herein by reference as though set forth in full.
Claims
What is claimed is:
1. A system for deriving a quality of service indicator for a
packet comprising: a register for holding a control element; a
first resource configured to derive a first candidate quality of
service indicator for the packet from packet processing state data
relating to the packet; a second resource configured to derive a
second candidate quality of service indicator for the packet from
packet processing state data relating to the packet responsive to
at least a portion of the control element; and resolution logic for
deriving the quality of service indicator for the packet from the
first and second candidate quality of service indicators for the
packet; wherein the resolution logic utilizes a configurable
priority resolution scheme to select one of the first and second
quality of service indicators for assignment to the packet.
2. The system of claim 1 further comprising a memory, wherein the
control element is selected from a plurality of control elements
held in the memory.
3. The system of claim 2 wherein the plurality of control elements
comprises a sequence of control elements.
4. The system of claim 3 wherein the sequence of control elements
comprises a sequence of software commands or instructions.
5. The system of claim 1 wherein the first resource comprises logic
for mapping packet processing state data relating to the packet
into the first candidate quality of service indicator.
6. The system of claim 1 wherein the second resource comprises
logic for mapping packet processing state data relating to the
packet into the second candidate quality of service indicator
responsive to at least the portion of the control element.
7. The system of claim 6 wherein the at least the portion of the
control element specifies a mapping mode selected from a plurality
of possible mapping modes.
8. The system of claim 1 wherein the second resource comprises
logic for searching for a corresponding quality of service
indicator using a key derived from packet processing state data
relating to the packet responsive to at least the portion of the
control element.
9. The system of claim 8 wherein the key is derived from a subset
of packet processing state data relating to the packet selected by
the values of one or more fields of the control element.
10. The system of claim 1 wherein the resolution logic comprises a
packet processor.
11. A system for deriving a quality of service indicator for a
packet comprising: a register for holding a control element; a
first resource configured to derive a first candidate quality of
service indicator for the packet from packet processing state data
relating to the packet; a second resource configured to derive a
second candidate quality of service indicator for the packet from
packet processing state data relating to the packet responsive to
at least a portion of the control element; and resolution logic for
deriving the quality of service indicator for the packet from the
first and second candidate quality of service indicators for the
packet; wherein the resolution logic includes first resolution
logic configured to derive a first priority for the first candidate
quality of service indicator, and second resolution logic is
configured to derive a second priority for the second candidate
quality of service indicator, and the resolution logic is
configured to derive the quality of service indicator for the
packet from the first and second candidate quality of service
indicators, and the first and second priorities.
12. The system of claim 11 wherein the packet has a current quality
of service indicator and priority, and the resolution logic is
configured to replace the current quality of service indicator for
the packet with the first candidate quality of service indicator if
the priority of the first candidate quality of service indicator
exceeds that of the current quality of service indicator and that
of the second candidate quality of service indicator, and replace
the current quality of service indicator for the packet with the
second candidate quality of service indicator if the priority of
the second candidate quality of service indicator exceeds that of
the current quality of service indicator for the packet and that of
the first candidate quality of service indicator.
13. The system of claim 1 wherein the quality of service indicator
for the packet is an egress quality of service indicator.
14. The system of claim 1 wherein the quality of service indicator
for the packet is an ingress quality of service indicator.
15. The system of claim 1 wherein the quality of service indicator
for the packet is a host quality of service indicator.
16. The system of claim 1 wherein the quality of service indicator
for the packet is a multi-dimensional quality of service indicator
comprising ingress, egress, and host quality of service indicator
components.
17. The system of claim 1 comprising three or more resources, each
configured to derive a candidate quality of service indicator for
the packet, wherein the resolution logic is configured to derive
the quality of service indicator for the packet from the candidate
quality of service indicators derived by the three or more
resources.
18. The system of claim 1 wherein the first resource is configured
to derive the first candidate quality of service indicator
responsive to the at least the portion of the control element.
19. The system of claim 1 wherein the first resource is configured
to derive the first candidate quality of service indicator
independent of the control element.
20. In a packet classification system, a method of deriving a
quality of service indicator for a packet comprising: holding a
control element in a register; deriving a first candidate quality
of service indicator for the packet from packet processing state
data relating to the packet; deriving a second candidate quality of
service indicator for the packet from packet processing state data
relating to the packet responsive to at least a portion of the
control element; and deriving the quality of service indicator for
the packet from the first and second candidate quality of service
indicators for the packet, comprising utilizing a configurable
priority resolution scheme to select one of the first and second
quality of service indicators for assignment to the packet.
21. The method of claim 20 further comprising selecting the control
element from a plurality of control elements held in a memory.
22. The method of claim 21 wherein the plurality of control
elements comprises a sequence of control elements.
23. The method of claim 22 wherein the sequence of control elements
comprises a sequence of software commands or instructions.
24. The method of claim 20 wherein the first deriving step
comprises mapping packet processing state data relating to the
packet into the first candidate quality of service indicator.
25. The method of claim 20 wherein the second deriving step
comprises mapping packet processing state data relating to the
packet into the second candidate quality of service indicator
responsive to at least the portion of the control element.
26. The method of claim 25 wherein the at least the portion of the
control element specifies a mapping mode selected from a plurality
of possible mapping modes.
27. The method of claim 20 wherein the second deriving step
comprises searching for a corresponding quality of service
indicator using a key derived from packet processing state data
relating to the packet responsive to at least the portion of the
control element.
28. The method of claim 27 wherein the key is derived from a subset
of packet processing state data relating to the packet selected by
the values of one or more fields of the control element.
29. In a packet classification system, a method of deriving a
quality of service indicator for a packet comprising: holding a
control element in a register; deriving a first candidate quality
of service indicator for a packet from packet processing state data
relating to the packet; deriving a second candidate quality of
service indicator for the packet from packet processing state data
relating to the packet responsive to at least a portion of the
control element; deriving a first priority for the first candidate
quality of service indicator, deriving a second priority for the
second candidate quality of service indicator, and deriving the
quality of service indicator for the packet from the first and
second candidate quality of service indicators, and the first and
second priorities.
30. The method of claim 29 wherein the packet has a current quality
of service indicator and priority, and the method further comprises
replacing the current quality of service indicator for the packet
with the first candidate quality of service indicator if the
priority of the first candidate quality of service indicator
exceeds that of the current quality of service indicator and that
of the second candidate quality of service indicator.
31. The method of claim 29 wherein the packet has a current quality
of service indicator and priority, and the method further comprises
replacing the current quality of service indicator for the packet
with the second candidate quality of service indicator if the
priority of the second candidate quality of service indicator
exceeds that of the current quality of service indicator and that
of the first candidate quality of service indicator.
32. The method of claim 20 wherein the quality of service indicator
for the packet is an egress quality of service indicator.
33. The method of claim 20 wherein the quality of service indicator
for the packet is an ingress quality of service indicator.
34. The method of claim 20 wherein the quality of service indicator
for the packet is a host quality of service indicator.
35. The method of claim 20 wherein the quality of service indicator
for the packet is a multi-dimensional quality of service indicator
comprising ingress, egress and host quality of service indicator
components.
36. The method of claim 20 further comprising deriving three or
more candidate quality of service indicators for the packet, and
deriving the quality of service indicator for the packet from the
three or more candidate quality of service indicators.
37. A system for deriving a quality of service indicator for a
packet, the packet having a state, comprising: means for holding a
control element; first means for deriving a first candidate quality
of service indicator for the packet from packet processing state
data relating to the packet; second means for deriving a second
candidate quality of service indicator for the packet from packet
processing state data relating to the packet responsive to at least
a portion of the control element; and resolution means for deriving
the quality of service indicator for the packet from the first and
second candidate quality of service indicators for the packet, by
utilizing a configurable priority resolution scheme to select one
of the first and second quality of service indicators for
assignment to the packet.
38. The system of claim 1, wherein the packet processing state data
comprises data representative of at least a portion of the packet,
data representative of at least a portion of the state of
processing of the packet, or both.
39. The system of claim 11, wherein the packet processing state
data comprises data representative of at least a portion of the
packet, data representative of at least a portion of the state of
processing of the packet, or both.
40. The method of claim 20, wherein the packet processing state
data comprises data representative of at least a portion of the
packet, data representative of at least a portion of the state of
processing of the packet, or both.
41. The method of claim 29, wherein the packet processing state
data comprises data representative of at least a portion of the
packet, data representative of at least a portion of the state of
processing of the packet, or both.
Description
FIELD OF THE INVENTION
This invention relates to the field of packet processing, and more
specifically, to deriving a quality of service indicator for a
packet.
RELATED ART
Current packet processing systems are under increasing pressure to
handle higher and higher data throughputs of, e.g., 10 GB/s or
more, and more complex and diverse data packet formats, e.g.,
embedded packet formats. However, these systems are subject to
various bottlenecks and constraints that limit the data throughput
that is achievable and the packet formats that can be handled.
Hence, there is a need for a packet processing system that
overcomes the problems of the prior art.
SUMMARY OF THE INVENTION
A system for deriving a quality of service indicator for a packet
is provided. In this system, a register holds a control element. A
first resource is configured to derive a first candidate quality of
service indicator for the packet from packet processing state data.
A second resource is configured to derive a second candidate
quality of service indicator for the packet from packet processing
state data responsive to at least a portion of the control element.
Resolution logic is configured to derive a quality of service
indicator for the packet from the first and second candidate
quality of service indicators for the packet.
A method of deriving a quality of service indicator for a packet is
also provided. This method comprises the step of holding a control
element. It also comprises deriving a first candidate quality of
service indicator for a packet from packet processing state data.
It also comprises deriving a second candidate quality of service
indicator for the packet from packet processing state data
responsive to at least a portion of the control element. The method
further comprises deriving a quality of service indicator for the
packet from the first and second candidate quality of service
indicators for the packet.
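The derivation summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation described in this patent: the names `Candidate`, `first_resource`, `second_resource`, and `resolve`, the dictionary layout of the packet processing state data, the `mapping_mode` field of the control element, and the rule that a strictly higher priority value wins are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """A candidate QoS indicator plus the priority used to resolve it (hypothetical)."""
    qos: int
    priority: int


def first_resource(state: dict) -> Optional[Candidate]:
    # Assumption: the first resource maps a VLAN priority field directly to a QoS value.
    vpri = state.get("vlan_priority")
    return None if vpri is None else Candidate(qos=vpri, priority=1)


def second_resource(state: dict, control_element: dict) -> Optional[Candidate]:
    # Assumption: one field of the control element selects which piece of
    # state data is mapped (a "mapping mode").
    mode = control_element.get("mapping_mode")
    if mode == "ip_tos" and "ip_tos" in state:
        return Candidate(qos=state["ip_tos"] >> 5, priority=2)
    if mode == "mpls_exp" and "mpls_exp" in state:
        return Candidate(qos=state["mpls_exp"], priority=3)
    return None


def resolve(current: Candidate, *candidates: Optional[Candidate]) -> Candidate:
    # A candidate replaces the running best only if its priority strictly
    # exceeds it, so the current indicator survives ties.
    best = current
    for cand in candidates:
        if cand is not None and cand.priority > best.priority:
            best = cand
    return best


state = {"vlan_priority": 3, "ip_tos": 0xB8}
control = {"mapping_mode": "ip_tos"}  # stands in for the control element held in the register
result = resolve(Candidate(qos=0, priority=0),
                 first_resource(state),
                 second_resource(state, control))
```

Because `resolve` accepts any number of candidates, the same sketch extends to three or more resources without modification.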
Related systems, methods, features and advantages of the invention
or combinations of the foregoing will be or will become apparent to
one with skill in the art upon examination of the following figures
and detailed description. It is intended that all such additional
systems, methods, features, advantages and combinations be included
within this description, be within the scope of the invention, and
be protected by the accompanying claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The components in the figures are not necessarily to scale,
emphasis instead being placed upon illustrating the principles of
the invention. In the figures, like reference numerals designate
corresponding parts throughout the different views.
FIG. 1 is a block diagram of an embodiment of a packet processing
system that comprises a receive-side packet classification system
and a transmit-side packet modification system.
FIG. 2 illustrates an example of the format of a packet header as
produced by an embodiment of a packet classification system in a
packet processing system.
FIG. 3 is a block diagram of an embodiment of a receive-side packet
classification system.
FIGS. 4A-4B are a block diagram of an embodiment of a transmit-side
packet modification system.
FIG. 5 is a block diagram of an embodiment of a cascade of multiple
packet processing systems.
FIG. 6 is a flowchart of an embodiment of a method of processing a
packet which comprises multiple parsing steps.
FIG. 7 is a flowchart of an embodiment of a method of performing
egress mirroring of a packet.
FIG. 8 is a flowchart of an embodiment of a method of performing
egress marking of a packet.
FIG. 9 is a flowchart of an embodiment of a method of resolving a
plurality of quality of service (QoS) indicators for a packet
utilizing a configurable priority resolution scheme.
FIG. 10 is a flowchart of an embodiment of a method of classifying
a packet in which sliced packet data is provided to a packet
classification engine over a wide data path.
FIG. 11 is a flowchart of an embodiment of a method of modifying a
packet in which sliced packet data is provided to a packet
modification engine over a wide data path.
FIG. 12 is a flowchart of an embodiment of a method of controlling
packet classification processing of a packet through first and
second stacks.
FIG. 13 is a flowchart of an embodiment of a method of maintaining
packet statistics which involves allocating a packet size
determiner to a packet from a pool of packet size determiners.
FIG. 14 is a flowchart of an embodiment of a method of classifying
a packet which involves buffering the packet in a buffer upon or
after ingress thereof, and associating packet classification data
with the packet as retrieved directly from the buffer to form a
classified packet on an egress data path.
FIG. 15 is a flowchart of an embodiment of a method of modifying a
packet which involves buffering the packet in a buffer upon or
after ingress thereof, and assembling a packet on an egress data
path from one or more modified portions of the packet, and one or
more unmodified portions as retrieved directly from the buffer.
FIG. 16 is a flowchart of an embodiment of a method of performing
classification processing of a packet in a cascaded combination of
multiple, replicated packet classification systems.
FIG. 17 is a flowchart of an embodiment of a method of preventing
re-ordering of packets in a packet processing system.
FIG. 18 is a block diagram of an embodiment of a pipelined packet
processing system.
FIG. 19 is a diagram illustrating operation of the pipeline in one
embodiment of the system of FIG. 18.
FIG. 20 illustrates one example of the categories of working state
information in the system of FIG. 18.
FIG. 21 illustrates one implementation of the pipeline of FIG. 19,
as configured to process the multiple categories of state
information illustrated in FIG. 20.
FIG. 22 illustrates an example of the control portion of state data
maintained in one embodiment of the processing pipeline for a
packet.
FIG. 23 illustrates an example of the AFH portion of state data
maintained in one embodiment of the processing pipeline for a
packet.
FIG. 24 illustrates an example of the statistics portion of state
data maintained in one embodiment of the processing pipeline for a
packet.
FIG. 25 illustrates an example of the consolidated state data
maintained in one embodiment of the processing pipeline for a
packet.
FIGS. 26A-26B illustrate an example of the format of the state data
of FIG. 25 at the nibble level of detail.
FIG. 27 illustrates an example of the format of the first 128 bytes
of packet data at the nibble level of detail.
FIGS. 28A-28C illustrate an implementation example of the format of a
SCT entry.
FIG. 29 illustrates one embodiment of data path logic for deriving
a CAM key.
FIG. 30 illustrates one embodiment of SCT-supplied selection data
used in the data path logic of FIG. 29.
FIG. 31 illustrates several examples of CAM key formats.
FIGS. 32A-32B illustrate an implementation example of the format
of an ARAM entry.
FIG. 33 illustrates an embodiment of logic for updating context
select values.
FIG. 34 illustrates an embodiment of logic for updating packet
context pointers, current working VLAN, and current L3 Header using
the context select values of FIG. 33.
FIG. 35 illustrates an embodiment of logic for updating the index
of the next SCT entry.
FIG. 36 illustrates an embodiment of logic for updating
priority-based working state information.
FIG. 37 is a flowchart of one embodiment of a method of performing
pipelined processing of a packet.
FIG. 38 is a flowchart of one embodiment of a method of performing
a cycle of processing on the data in a filled slot of the
pipeline.
FIG. 39 is a block diagram of an embodiment of a system for
deriving a quality of service indicator for a packet from a
plurality of candidate quality of service indicators.
FIG. 40 is a block diagram of an implementation example of the
system of FIG. 39, wherein three candidate quality of service
indicators may be derived for a given processor slot, the first
using a VLAN state table, the second using a QoS mapping process,
and the third using a CAM-based searching process.
FIG. 41A illustrates an example format of the VLAN State Table
(VST), and FIG. 41B illustrates an example format of an entry of
the VST.
FIG. 42A illustrates an example format of the Vpri QoS Mapping
Table, and FIG. 42B illustrates an example format of an entry of
the Vpri QoS Mapping Table.
FIG. 43A illustrates an example format of the MPLS Exp QoS Mapping
Table, and FIG. 43B illustrates an example format of an entry of
the MPLS Exp QoS Mapping Table.
FIG. 44A illustrates an example format of the IP v4 ToS QoS Mapping
Table, and FIG. 44B illustrates an example format of an entry of
the IP v4 ToS QoS Mapping Table.
FIG. 45A illustrates an example format of the IP v6 ToS Mapping
Table, and
FIG. 45B illustrates an example format of an entry of the IP v6 ToS
Mapping Table.
FIG. 46A illustrates an example format of the Port State Table
(PST), and
FIG. 46B illustrates an example format of an entry of the PST.
FIG. 47A illustrates an example format of the QoS Priority Table,
and FIG. 47B illustrates an example format of an entry of the QoS
Priority Table.
FIG. 48 is a flowchart of an embodiment of a method of deriving a
quality of service indicator for a packet from a plurality of
candidate quality of service indicators.
RELATED APPLICATIONS
The following applications are commonly owned by the assignee
hereof, and are each incorporated by reference herein as though set
forth in full:
U.S. patent application No. | Title | Filing date
10/814,725, now U.S. Pat. No. 7,292,591 | PACKET PROCESSING SYSTEM ARCHITECTURE AND METHOD | Mar. 30, 2004
10/814,552, now U.S. Pat. No. 7,385,984 | PACKET PROCESSING SYSTEM ARCHITECTURE AND METHOD | Mar. 30, 2004
10/814,556 | PACKET DATA MODIFICATION PROCESSOR | Mar. 30, 2004
10/814,728 | SYSTEM AND METHOD FOR PACKET PROCESSOR STATUS MONITORING | Mar. 30, 2004
10/814,545, now U.S. Pat. No. 7,453,874 | METHOD AND SYSTEM FOR INCREMENTALLY UPDATING A CHECKSUM IN A NETWORK DATA PACKET | Mar. 30, 2004
10/814,729 | SYSTEM AND METHOD FOR EGRESS PACKET MARKING | Mar. 30, 2004
10/813,731, now U.S. Pat. No. 7,304,996 | SYSTEM AND METHOD FOR ASSEMBLING A DATA PACKET | Mar. 30, 2004
10/814,727, now U.S. Pat. No. 7,463,628 | PACKET DATA MODIFICATION PROCESSOR COMMAND INSTRUCTION SET | Mar. 30, 2004
10/814,774 | DATA STRUCTURES FOR SUPPORTING PACKET DATA MODIFICATION OPERATIONS | Mar. 30, 2004
10/835,272 | PACKET PARSER | Apr. 28, 2004
10/835,598 | PIPELINED PACKET PROCESSOR | Apr. 28, 2004
10/834,566, now U.S. Pat. No. 7,502,374 | SYSTEM FOR DERIVING HASH VALUES FOR PACKETS IN A PACKET PROCESSING SYSTEM | Apr. 28, 2004
10/835,271 | SYSTEM FOR SUPPORTING PACKET PROCESSING OPERATIONS | Apr. 28, 2004
10/834,576 | SYSTEM FOR ACCESSING CONTENT-ADDRESSABLE MEMORY IN PACKET PROCESSOR | Apr. 28, 2004
10/834,573 | SYSTEM FOR STATISTICS GATHERING AND SAMPLING IN A PACKET PROCESSING SYSTEM | Apr. 28, 2004
10/835,252, now U.S. Pat. No. 7,522,516 | EXCEPTION HANDLING SYSTEM FOR PACKET PROCESSING SYSTEM | Apr. 28, 2004
DETAILED DESCRIPTION
As utilized herein, terms such as "about" and "substantially" and
"near" are intended to allow some leeway in mathematical exactness
to account for tolerances that are acceptable in the trade.
Accordingly, any deviations upward or downward from the value
modified by the terms "about" or "substantially" or "near" in the
range of 1% to 20% or less should be considered to be explicitly
within the scope of the stated value.
As used herein, the terms "software" or "instructions" or "commands"
include source code, assembly language code, binary code, firmware,
macro-instructions, micro-instructions, or the like, or any
combination of two or more of the foregoing.
The term "memory" refers to any processor-readable physical or
logical medium, including but not limited to RAM, ROM, EPROM, PROM,
EEPROM, disk, floppy disk, hard disk, CD-ROM, DVD, queue, FIFO or
the like, or any combination of two or more of the foregoing, on
which may be stored one or more instructions or commands executable
by a processor, data, or packets in whole or in part.
The terms "processor" or "CPU" or "engine" refer to any device
capable of executing one or more commands or instructions and
includes, without limitation, a general- or special-purpose
microprocessor, finite state machine, controller, computer, digital
signal processor (DSP), or the like.
The term "logic" refers to implementations in hardware, software,
or combinations of hardware and software.
The term "stack" may be implemented through a first-in-first-out
memory such as a FIFO.
The term "packet" means (1) a group of binary digits including data
and control elements which is switched and transmitted as a
composite whole, wherein the data and control elements and possibly
error control information are arranged in a specified format; (2) a
block of information that is transmitted within a single transfer
operation; (3) a collection of symbols that contains addressing
information and possibly error detection or correction information;
(4) a sequence of characters with a specific order and format, such
as destination followed by a payload; (5) a grouping of data of
some finite size that is transmitted as a unit; (6) a frame; (7)
the logical organization of control and data fields defined for any
of the layers or sub-layers of an applicable reference model,
including the OSI or TCP/IP reference models, e.g., MAC sub-layer;
or (8) a unit of transmission for any of the layers or sub-layers
of an applicable reference model, including the OSI or TCP/IP
reference models.
The term "layer two of the OSI reference model" includes the MAC
sub-layer.
The term "port" or "channel" refers to any point of ingress or
egress to or from a switch or other entity, including any port
channel or sub-channel, or any channel or sub-channel of a bus
coupled to the port.
The term "register" refers to any physical medium for holding a
data element, including, but not limited to, a buffer, FIFO, or the
like.
The term "packet processing state data" in relation to a packet
refers to data representative of at least a portion of the packet,
data representative of at least a portion of the state of
processing of the packet, or both.
Example Environment
An example environment for the subject invention will now be
described. Many other examples are possible, so nothing in this
example should be taken as limiting.
A. Overall Packet-Processing System
FIG. 1 illustrates an embodiment 100 of a packet processing system
comprising a packet classification system 102 and a packet
modification system 104. The packet classification system 102 has
an ingress portion 106 and an egress portion 108. Similarly, the
packet modification system 104 has an ingress portion 110 and an
egress portion 112. The ingress portion 106 of the packet
classification system 102 is coupled, through interface 118, to one
or more network-side devices 114, and the egress portion 108 of the
packet classification system 102 is coupled, through interface 120,
to one or more switch-side devices 116. The ingress portion 110 of
the packet modification system 104 is coupled, through interface
122, to the one or more switch-side devices 116, and the egress
portion 112 of the packet modification system 104 is coupled,
through interface 124, to the one or more network-side devices
114.
The packet classification system 102 comprises an ingress portion
106, a first packet parser 126 for parsing a packet and providing
first data representative thereof, and a packet classification
engine 128 for classifying the packet responsive to the first data.
The packet modification system 104 comprises a second packet parser
130 for parsing the classified packet (after a round trip through
the one or more switch-side devices 116) or a packet derived
therefrom and providing second data representative thereof, a
packet modification engine 132 for modifying some or all of the
packet responsive to the second data, a third packet parser 134 for
parsing the modified packet and providing third data representative
thereof, and a packet post-processor 136 for post-processing the
modified packet responsive to the third data.
In one embodiment, the packet undergoing processing by the system
has a plurality of encapsulated layers, and each of the first,
second and third parsers 126, 130, 134 is configured to parse the
packet by providing context pointers pointing to the start of one
or more of the encapsulated layers. In a second embodiment, the
packet undergoing processing by the system comprises a first packet
forming the payload portion of a second packet, each of the first
and second packets having a plurality of encapsulated layers, and
each of the first, second and third parsers 126, 130, 134 is
configured to parse the packet by providing context pointers
pointing to the start of one or more of the encapsulated layers of
the first packet and one or more of the encapsulated layers of the
second packet.
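As a rough illustration of the first embodiment, the following Python sketch walks a frame and records the byte offset at which each recognized layer begins. It is a sketch under assumptions rather than the parser of this system: the function name `context_pointers`, the returned dictionary keys, and the restriction to Ethernet II framing with an optional single 802.1Q tag and an IPv4 payload are all hypothetical simplifications. It does not handle the nested-packet case of the second embodiment.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_IPV4 = 0x0800


def context_pointers(frame: bytes) -> dict:
    """Return the byte offsets at which each recognized layer starts."""
    pointers = {"l2": 0}
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if ethertype == ETHERTYPE_VLAN:
        pointers["vlan"] = offset - 2  # the tag starts at its TPID
        # The encapsulated EtherType follows the 2-byte TCI.
        (ethertype,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if ethertype == ETHERTYPE_IPV4:
        pointers["l3"] = offset
        ihl = frame[offset] & 0x0F  # IPv4 header length in 32-bit words
        pointers["l4"] = offset + ihl * 4
    return pointers


# A VLAN-tagged frame carrying a minimal IPv4 header (IHL = 5) and 8 payload bytes.
tagged = bytes(12) + b"\x81\x00" + b"\x60\x0a" + b"\x08\x00" + b"\x45" + bytes(19) + bytes(8)
untagged = bytes(12) + b"\x08\x00" + b"\x45" + bytes(19) + bytes(8)
tagged_pointers = context_pointers(tagged)
untagged_pointers = context_pointers(untagged)
```

Downstream stages can then address fields relative to these offsets instead of re-parsing the packet.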
In one implementation, the packet post-processor 136 is configured
to compute a checksum for a modified packet responsive to the third
data provided by parser 134. In one embodiment, the packet
post-processor 136 is configured to independently calculate a layer
three (IP) and layer four (TCP/UDP) checksum.
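For illustration only, the layer three checksum may be sketched in
software using the one's-complement algorithm of RFC 1071, which is
cited among the references. The function name internet_checksum is
hypothetical, and the sketch does not represent the hardware of the
packet post-processor 136.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented (RFC 1071).

    Hypothetical helper for illustration; not the post-processor logic.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Worked example from RFC 1071: words 0001 f203 f4f5 f6f7.
example = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(example)
```

Appending the computed checksum to the data and running the same
function again yields zero, which is how a receiver would typically
verify the field.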
In one embodiment, packet post-processor 136 comprises Egress
Access Control List (ACL) logic 136a and Packet Marking logic 136b.
The Egress ACL logic 136a is configured to arrive at an ACL
decision with respect to a packet. In one implementation, four ACL
decisions can be independently performed: 1) default ACL action; 2)
CPU copy; 3) mirror copy; and 4) kill. The default ACL action may
be set to kill or allow. The CPU copy action forwards a copy of the
packet to a host 138 coupled to the system. The mirror copy action
implements an egress mirroring function (to be discussed in more
detail later), in which a copy of the packet is forwarded to mirror
FIFO 140 and then on to the egress portion 108 of the packet
classification system 102. The kill action either kills the packet
or marks it for killing by a downstream Medium Access Control (MAC)
processor.
The Packet Marking logic 136b is configured to implement a packet
egress marking function in which certain packet marking control
information for a packet generated by the packet classification
system 102 is used to selectively modify one or more quality of
service (QoS) fields in the packet.
In one embodiment, Content Addressable Memory (CAM) 142 is used by
the packet classification system 102 to perform packet searches to
arrive at a classification decision for a packet. In one
implementation, the CAM searches are ternary in that all entries of
the CAM have a data and mask field allowing don't care setting of
any bit position in the data field. In another implementation, the
CAM searches are binary, or combinations of binary and ternary.
The associated RAM (ARAM) 144 provides associated data for each
entry in the CAM 142. The ARAM 144 is accessed using the match
address returned by the CAM 142 as a result of a search operation.
The ARAM 144 entry data is used to supply intermediate
classification information for the packet that is used by the
classification engine 128 in making a final classification decision
for the packet.
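The ternary search and the ARAM access keyed by the match address
may be sketched as follows. This is a minimal software model under
stated assumptions: a mask bit of 1 is taken to mean "compare" and 0
to mean "don't care", the lowest matching address is assumed to win,
and the names cam_search and classify_lookup are hypothetical.

```python
from typing import Any, Optional, Sequence, Tuple

# Hypothetical CAM entry: (data, mask). Assumption: mask bit 1 = compare,
# mask bit 0 = don't care.
CamEntry = Tuple[int, int]


def cam_search(cam: Sequence[CamEntry], key: int) -> Optional[int]:
    """Return the address of the first matching entry, or None on a miss."""
    for address, (data, mask) in enumerate(cam):
        if (key & mask) == (data & mask):
            return address
    return None


def classify_lookup(cam: Sequence[CamEntry], aram: Sequence[Any], key: int) -> Any:
    """Use the CAM match address to index the associated RAM (ARAM)."""
    address = cam_search(cam, key)
    return None if address is None else aram[address]


# Entry 0 matches any key whose top byte is 0x0A; entry 1 matches anything.
cam = [(0x0A000000, 0xFF000000), (0x00000000, 0x00000000)]
aram = ["net-10", "default"]
```

A binary search corresponds to the special case in which every mask
bit is set.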
The statistics RAM 146 is used to maintain various packet
statistics, including, for each CAM entry, the cumulative number
and size of packets that hit or matched that entry.
The modification RAM 148 provides data and control structures for
packet modification operations performed by the modification engine
132.
In one implementation, the interfaces 150, 152, 154, and 156 with
any of the RAMs or CAMs may be a QDR- or DDR-type interface as
described in U.S. patent application Ser. No. 10/655,742, filed
Sep. 4, 2003, which is hereby fully incorporated by reference
herein as though set forth in full.
FIG. 2 illustrates the format of classification data 200 for a
packet as produced by one embodiment of packet classification
system 102. The classification data 200 in this embodiment has
first and second portions, identified respectively with numerals
202 and 204. The first portion 202 is a 64 bit Address Filtering
Header (AFH) which is pre-pended to the packet. The second portion
204 is a 20 bit grouping of flags that are encoded as control bits
maintained by the system 100.
In one embodiment, the Port Tag Index (PTI) field is an identifier
of the port or list of ports within interface 124 over which the
packet will be sent by the packet modification engine. (The
assumption in this embodiment is that the interface 124 is a
multi-port interface).
The Egress Quality of Service (EQoS) field may be used to perform
an egress queue selection function in a device encountering the
packet. In one embodiment, this field also encodes one of the
following functions: nothing, pre-emptive kill, normal kill,
thermonuclear kill, egress mirror copy, pre-emptive intercept to
host, and normal intercept to host.
The Link Aggregation Index (LAI) field may be used to implement
physical link selection, ingress alias, echo kill alias, or equal
cost multi-path functions in a device encountering the packet.
The JUMBO flag, if asserted, directs a device encountering the
packet to perform a JUMBO-allowed check. In one embodiment, the
flag is used to implement the policy that the only valid JUMBO
packets are IP packets. Therefore, if the packet is a non-IP JUMBO
packet, the device either sends it to a host, fragments it, or
kills it.
The DON'T FRAG flag, if asserted, directs a device encountering the
packet not to fragment it in the course of implementing a
JUMBO-allowed check.
The IF TYPE flag indicates whether the ingress interface over which
the packet was received is an Ethernet or Packet Over Sonet (POS)
interface.
The ROUTE flag, if asserted, indicates that the packet is being
bridged not routed, and may be used by devices encountering the
packet to implement an echo kill suppress function.
The RANDOM EARLY DROP (RED) flag may be used to implement a random
early drop function in devices encountering the packet.
The CTL flag indicates the format of the AFH. FIG. 2 illustrates
the format of the header for packets exiting the packet
classification system 102 and destined for the one or more
switch-side devices 116. Another format applies for packets exiting
the one or more switch-side devices 116 and destined for the packet
modification system 104. The CTL flag indicates which of these two
formats is applicable.
The Transmit Modification Index (TXMI) field is used by the
modification engine 132 to retrieve control and data structures
from Modification RAM 148 for use in performing any necessary
modifications to the packet.
The CPU Quality of Service (CQoS) field may be used to perform an
ingress queue select function in a host coupled to the packet
processing system.
In one embodiment, the CPU Copy flag, if asserted, directs one or
more of the switch-side devices 116 to forward a copy of the packet
to a host coupled to the packet processing system. In another
embodiment, the CPU Copy flag, if asserted, directs a copy of a
packet to be forwarded to the host through a host bus or another
PBUS.
The Redirect flag, if asserted, directs one or more of the
switch-side devices 116 to forward a copy of the packet to the host
for redirect processing. In redirect processing, the host receives
the packet copy and redirects it to the sender, with an indication
that the sender should switch the packet, not route it.
The Statistical Sample (SSAMPLE) flag, if asserted, indicates to
one or more of the switch-side devices 116 that the packet is a
candidate for statistical sampling. If the packet is ultimately
selected for statistical sampling, a copy of the packet is directed
to the host, which performs a statistical analysis of the packet
for the purpose of accurately characterizing the network traffic of
which the packet is a part.
The LEARN flag, if asserted, directs one or more of the switch-side
devices 116 to forward a copy of the packet to the host so the host
can perform learn processing. In learn processing, the host
analyzes the packet to "learn" the sender's MAC address for future
packet switching of packets to that address.
The Egress Mirror (EMIRROR) flag, if asserted, implements egress
mirroring by directing one or more of the switch-side devices 116
to send a copy of the packet to mirror FIFO 140. From mirror FIFO
140, the packet passes through the egress portion 108 of the packet
classification system 102 en route to the one or more switch-side
devices 116.
The Ingress Quality of Service (IQoS) field may be used to perform
an ingress queue selection function in a device encountering the
packet.
The Egress Mark Select (EMRK SEL) field selects one of several
possible egress mark functions. The Egress Mask (EMRK MASK) field
selects one of several possible egress masks. Together, the EMRK
SEL and EMRK MASK fields form an embodiment of packet egress
marking control information which may be used by packet marking
logic 136b to mark the packet, i.e., selectively modify one or more
QoS fields within the packet.
The Ingress Mirror (IMIRROR) flag, if asserted, directs one or more
of the switch-side devices 116 to forward a copy of the packet to
the designated ingress mirror port on the switch.
The Parity Error Kill (PERR KILL) flag, if asserted, directs the
interface 120 to kill the packet due to detection of an ARAM parity
error.
In one embodiment, the EMIRROR bit is normally in an unasserted
state. If the packet classification system 102, after analyzing the
packet, determines that egress mirroring of the packet is
appropriate, the packet classification system 102 changes the state
of the EMIRROR bit to place it in the asserted state.
The packet, along with a pre-pended AFH containing the EMIRROR bit,
is then forwarded to the one or more switch-side devices 116. After
processing the packet, the one or more devices transmit the packet,
with the EMIRROR bit preserved in a pre-pended packet header, back
to the packet modification system 104 over interface 122. In
response, the packet modification system 104 is configured to
detect the state of the EMIRROR bit to determine if egress
mirroring of the modified packet is activated, and if so, provide a
copy of the modified packet to the egress portion 108 of the packet
classification system 102 through the mirror FIFO 140.
In one embodiment, the EQoS, CQoS, IQoS, EMRK SEL and EMRK MASK
fields define a multi-dimensional quality of service indicator for
the packet. In this embodiment, the EMRK SEL and EMRK MASK fields
form packet egress marking control information that is utilized by
packet modification system 104 to selectively modify one or more
quality of service fields within the packet, or a packet derived
there-from.
The quality of service indicator for a packet may be derived from a
plurality of candidate quality of service indicators derived from
diverse sources. In one embodiment, a plurality of candidate
quality of service indicators are derived for a packet, each with
an assigned priority, and a configurable priority resolution scheme
is utilized to select one of the plurality of quality of service
indicators for assigning to the packet. In one embodiment, one or
more of the candidate quality of service indicators, and associated
priorities, are derived by mapping one or more fields of the packet
into one or more candidate quality of service indicators for the
packet and associated priorities. In a second embodiment, one or
more searches are conducted to obtain one or more candidate quality
of service indicators for the packet and associated priorities. In
a third embodiment, a combination of these two approaches is
utilized.
In one example, candidate quality of service indicators, and
associated priorities, are derived from three sources. The first is
a VLAN mapping scheme in which a VLAN from the packet is mapped
into a candidate quality of service indicator and associated
priority using a VLAN state table (VST). The VLAN from the packet
may represent a subnet or traffic type, and the associated priority
may vary based on the subnet or traffic type. The second is a
CAM-based search that yields an associated ARAM entry that in turn
yields a candidate quality of service indicator. A field of an
entry in a Sequence Control Table (SCT) RAM, which provides the
sequence of commands controlling the operation of one embodiment of
the packet classification engine 102, provides the associated
priority. The third is a QoS mapping scheme, which operates in one
of three modes, as determined by a field in a SCT RAM entry.
In the first mode, the 802.1p mapping mode, the VST provides the four
QSEGment bits. The QSEG and the 802.1p bits are mapped into a
candidate quality of service indicator, and the VLAN itself is
mapped into an associated priority using the VST. In the second
mode, the MPLS mapping mode, the EXP/QoS fields from the packet are
mapped into a candidate quality of service indicator, and a VLAN
from the packet is mapped into the associated priority using the
VST. In the third mode, the ToS mapping mode, the IPv4 ToS, IPv6
Traffic Class, or IPv6 Flow Label based QoS fields are mapped into
a candidate quality of service indicator, and a VLAN from the
packet is mapped into an associated priority using the VST.
In this example, the candidate quality of service indicator with
the highest priority is assigned to the packet. Moreover, a
candidate from one of the sources can be established as the
default, which may be overridden by a candidate obtained from one
of the other sources, at least a candidate that has a higher
priority than the default selection. For example, the candidate
quality of service indicator resulting from the 802.1p mapping mode
can be established as the default selection, and this default
overridden only by a candidate quality of service indicator
resulting from an ARAM entry in turn resulting from a CAM-based
search.
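The resolution of a default candidate against candidates from the
other sources may be sketched as follows. The sketch assumes that a
numerically larger priority wins and that a tie leaves the default
in place; neither convention is stated above. The Candidate type,
its source labels, and resolve_qos are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    source: str    # illustrative label, e.g. "vst", "cam", "qos_map"
    qos: int       # candidate quality of service indicator
    priority: int  # assumption: larger value means higher priority


def resolve_qos(default: Candidate,
                others: Iterable[Optional[Candidate]]) -> Candidate:
    """Keep the default unless another candidate has strictly higher priority."""
    winner = default
    for cand in others:
        # A source that produced no candidate (e.g. a CAM miss) is None.
        if cand is not None and cand.priority > winner.priority:
            winner = cand
    return winner


default = Candidate("qos_map", qos=3, priority=1)
cam_hit = Candidate("cam", qos=6, priority=5)
vst = Candidate("vst", qos=2, priority=0)
chosen = resolve_qos(default, [cam_hit, vst])
```

Here the CAM-derived candidate overrides the default because its
priority is higher, while the VST-derived candidate does not.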
FIG. 3 illustrates an embodiment 300 of a packet classification
system. In this embodiment, the packet classification system is
coupled to one or more network-side devices through a multi-port
packet bus (PBUS) 302, as described in U.S. patent application Ser.
Nos. 10/405,960 and 10/405,961, filed Apr. 1, 2003, which are both
hereby fully incorporated herein by reference. PBUS ingress logic
304 is configured to detect a start of packet (SOP) condition for
packets arriving at the packet classification system over the
PBUS.
Upon or after detection of the SOP condition, the packet, or a
portion thereof, is stored in slicer 306. Slicer 306 is configured
to slice some or all of a packet into portions and provide the
portions in parallel over first data path 308 having a first width
to classification engine 310. In one embodiment, the slicer 306 is
a FIFO which stores the first 128 bytes of a packet (or the
entirety of the packet if less than 128 bytes), and provides the
1024 bits thereof in parallel to the packet classification engine
310 over the first data path 308.
Upon or after detection of the SOP condition, parser 312 parses the
packet in the manner described previously, and stores the resultant
context pointers (and other flags resulting from the parsing
process) in parser result RAM 314. Concurrently with this parsing
process, the packet is stored in buffer 318, which in one
embodiment, is a FIFO buffer.
The packet classification engine 310 is configured to classify the
packet responsive to the packet portions received over the first
data path 308 and the parser results as stored in the parser result
RAM 314, and store data representative of the packet classification
in classification RAM 316. In one embodiment, the classification
data is the AF header illustrated in FIG. 2.
An associator 320 is configured to associate the data
representative of the packet classification with some or all of the
packet, and provide the associated packet over a second data path
322 having a second width less than the first width.
The packet classification system is coupled to one or more
switch-side devices over a multi-port PBUS 326, and PBUS egress
logic 324 is configured to transmit the associated packet over the
PBUS 326.
In one embodiment, slicer 306 comprises a plurality of memories
configured to store some or all of the packet, and provide the
portions thereof in parallel over the first data path 308 to the
classification engine 310. In one example, the slicer 306 is
configured as eight (8) memories configured to provide the first
1024 bits of the packet (or less if the packet is less
than 128 bytes) in parallel over the first data path 308 to
classification engine 310.
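The slicing of the first 128 bytes across eight memories may be
sketched as follows. The equal division into 16-byte (128-bit)
lanes is an assumption made for illustration, as the distribution
of bits among the memories is not specified above; slice_packet is
a hypothetical name.

```python
from typing import List

SLICE_BYTES = 128   # the slicer holds the first 128 bytes (1024 bits)
NUM_MEMORIES = 8    # eight memories, per the example above
LANE_BYTES = SLICE_BYTES // NUM_MEMORIES  # assumption: 16 bytes per memory


def slice_packet(packet: bytes) -> List[bytes]:
    """Split the head of a packet into lanes that can be read in parallel."""
    head = packet[:SLICE_BYTES]
    return [head[i * LANE_BYTES:(i + 1) * LANE_BYTES]
            for i in range(NUM_MEMORIES)]


lanes = slice_packet(bytes(range(200)))
```

For a packet shorter than 128 bytes, the trailing lanes in this
sketch are simply short or empty.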
In one embodiment, the associator 320 comprises a multiplexor
configured to multiplex onto the second data path 322 the data
representative of the packet classification as stored in
classification RAM 316 and some or all of the packet as stored in
buffer 318. In one implementation, the multiplexor multiplexes the
first 8 byte portion 202 of the AF data illustrated in FIG. 2
(which may be referred to as the AF header) onto the second data
path followed by the packet as stored in buffer 318, thereby
effectively pre-pending the AF header to the packet. In this
implementation, control logic 328 controls the operation of the
multiplexor through one or more signals provided over control data
path 334.
More specifically, the multiplexor in this implementation is
configured to select one of three inputs and output the selected
input to the second data path 322 under the control of the control
logic 328. The first input is the classification data as stored in
classification RAM 316. The second input is the packet as stored in
buffer 318. The third input is the output of the mirror FIFO 140.
This third input is selected when the egress mirroring function,
discussed previously, is activated.
In one embodiment, the control logic 328 is also configured to
maintain first and second FIFO buffers, identified respectively
with numerals 330 and 332, the first FIFO buffer 330 for
identifying those packets which are awaiting classification by the
packet classification system, and the second FIFO buffer 332 for
identifying those packets which are undergoing classification by
the classification system.
In this embodiment, the control logic 328 is configured to place an
identifier of a packet on the first FIFO buffer 330 upon or after
receipt of the packet by the packet classification system, pop the
identifier off the first FIFO buffer 330 and place it on the second
FIFO buffer 332 upon or after initiation of classification
processing of the packet by the packet classification system, and
pop the identifier off the second FIFO buffer 332 upon or after
completion of classification processing of the packet by the packet
classification system.
The control logic 328 is configured to prevent the packet
classification system from outputting a packet onto PBUS 326 while
an identifier of the same is placed on either the first or second
FIFO buffers 330, 332, and allows the packet classification system
to output the packet onto PBUS 326 upon or after the identifier of
the packet has been popped off the second FIFO buffer 332. In one
implementation, the control logic 328 prevents the associator 320
from outputting data on the second data path 322 through one or
more signals provided over control data path 334. In one
implementation, the control logic 328 is a state machine.
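The handling of packet identifiers on the two FIFO buffers may be
modeled in software as follows. This is a sketch only; the class
ClassificationGate and its method names are hypothetical, and
packets are assumed to begin and complete classification in arrival
order.

```python
from collections import deque


class ClassificationGate:
    """Toy model of the two identifier FIFOs kept by control logic 328."""

    def __init__(self):
        self.awaiting = deque()     # stands in for first FIFO buffer 330
        self.in_progress = deque()  # stands in for second FIFO buffer 332

    def on_receive(self, packet_id):
        # Upon receipt, the identifier is placed on the first FIFO.
        self.awaiting.append(packet_id)

    def on_start(self):
        # Classification begins: move the identifier to the second FIFO.
        packet_id = self.awaiting.popleft()
        self.in_progress.append(packet_id)
        return packet_id

    def on_complete(self):
        # Classification ends: pop the identifier off the second FIFO.
        return self.in_progress.popleft()

    def may_output(self, packet_id):
        # Output is blocked while the identifier sits on either FIFO.
        return (packet_id not in self.awaiting
                and packet_id not in self.in_progress)


gate = ClassificationGate()
gate.on_receive("p1")
blocked_while_waiting = not gate.may_output("p1")
gate.on_start()
blocked_while_classifying = not gate.may_output("p1")
gate.on_complete()
released = gate.may_output("p1")
```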
In one embodiment, the control logic 328 forms the basis of a
packet statistics maintaining system within the packet
classification system. In this embodiment, the control logic 328 is
configured to maintain a pool of packet size determiners, and
allocate a packet size determiner to a packet from the pool upon or
after receipt thereof by the packet classification system.
In one implementation, the control logic 328 allocates a packet
size determiner to a packet upon or after the PBUS ingress logic
304 signals a SOP condition for the packet. The packet size
determiner is configured to determine the size of the packet, and
the control logic 328 is configured to return the packet size
determiner to the pool upon or after the same has determined the
size of the packet. In one implementation example, the packet size
determiners are counters.
Statistics RAM 330 in this embodiment maintains packet statistics,
and statistics update logic 336 is configured to update the packet
statistics responsive to the determined size of the packet. In one
implementation, the statistics update logic 336 includes a queue
for queuing statistics update requests issued by the control logic
328.
In one configuration, the packet statistics maintaining system is
configured to maintain packet statistics indicating the cumulative
size of packets which have met specified processing conditions or
hits, and the statistics update logic 336, upon or after a packet
size determiner has determined the size of a packet, is configured
to increment a cumulative size statistic for a particular
processing condition or hit by the determined size of the packet if
the packet satisfies that particular processing condition or hit.
In one example, the system maintains statistics indicating the
cumulative size and number of packets that have resulted in each of
a plurality of ternary CAM 142 hits.
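The per-hit statistics described above may be sketched as follows.
The class HitStatistics is hypothetical, and the update request is
assumed to carry the CAM match address together with the size
reported by the packet size determiner.

```python
from collections import defaultdict


class HitStatistics:
    """Cumulative packet count and size for each CAM entry that was hit."""

    def __init__(self):
        self.count = defaultdict(int)  # packets per CAM match address
        self.size = defaultdict(int)   # cumulative bytes per CAM match address

    def update(self, cam_address: int, packet_size: int) -> None:
        # Applied once the size determiner has reported the packet size.
        self.count[cam_address] += 1
        self.size[cam_address] += packet_size


stats = HitStatistics()
stats.update(7, 64)
stats.update(7, 1500)
stats.update(2, 128)
```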
FIGS. 4A-4B illustrate an embodiment 400 of a packet modification
system having PBUS ingress logic 404 that is coupled to one or more
switch-side devices through PBUS 402. In this embodiment, the
packets are received over the PBUS channels in bursts. The PBUS
ingress logic 404 is configured to monitor the PBUS channels in a
round robin fashion. When the PBUS ingress logic 404 detects a SOP
condition on one of the channels, the Transmit Modification Index
(TXMI) is extracted from the AF header of the packet, and it, along
with the length of the initial packet burst, and an end of packet
(EOP) marker if the packet length is less than or equal to the
burst length, is placed on Transmit In Control FIFO 406. The packet
or packet burst is stored in Transmit In Data FIFO 428, and a
pointer to the start of the packet or packet burst (SOP pointer) is
stored in Transmit Engine FIFO 408, along with an identifier of the
PBUS channel over which the packet or packet burst was received. In
one implementation, the packet bursts are 128 bytes in length.
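The entry placed on the Transmit In Control FIFO 406 may be
sketched as follows, assuming the 128-byte burst length of the
implementation above. The dictionary layout and the name
control_entry are hypothetical.

```python
BURST_BYTES = 128  # burst length in the implementation described above


def control_entry(txmi: int, packet_length: int) -> dict:
    """Build a hypothetical control FIFO entry for the initial burst."""
    return {
        "txmi": txmi,
        # Length of the initial burst: the whole packet if it fits.
        "burst_length": min(packet_length, BURST_BYTES),
        # EOP is marked when the packet ends within the initial burst.
        "eop": packet_length <= BURST_BYTES,
    }


small = control_entry(txmi=5, packet_length=64)
large = control_entry(txmi=5, packet_length=1500)
```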
Transmit In Data FIFO 428 stores the packet data such that portions
of the packet can be passed in parallel over a first data path 402
having a first width to a modification engine 422. In one
implementation, the Transmit In Data FIFO 428 comprises a plurality
of FIFOs, with the outputs of the FIFOs coupled in parallel to the
modification engine 422 and collectively forming the first data
path 402. Incoming packet or packet bursts are copied into each of
the plurality of FIFOs, thereby providing the modification engine
with sliced portions of the packets or packet bursts in
parallel.
The incoming packets or packet bursts are also input to the second
packet parser 424, which parses the packets or packet bursts in the
manner described previously. The context pointers and status bits
resulting from the parsing process are stored in parser result RAM
426.
The Transmit Command Sequencer 410 is configured to read a SOP
pointer and channel from the Transmit Engine FIFO 408, and utilize
this information to locate the packet or packet bursts in the
Transmit In Control FIFO 406. The Transmit Modification Index
(TXMI) within the AF header of this packet or packet burst is then
located and used to access a TXMI link in External Transmit SRAM
412, an SRAM located off-chip in relation to modification engine
422. The TXMI link may either be 1) an internal recipe link to a
recipe of modification commands stored in Internal Recipe RAM 414,
an on-chip RAM in relation to modification engine 422, and related
data structures stored in External Transmit SRAM 412, or 2) an
external recipe link to a recipe of modification commands stored in
External Transmit SRAM 412 and related data structures also stored
in External Transmit SRAM 412.
The sequencer 410 also assigns a sequence number to the packet to
prevent packet re-ordering. It then directs the Transmit RAM
arbiter 416 to read the recipe of modification commands stored in
the External Transmit SRAM 412 (assuming the TXMI link is an
external recipe link) or Internal Recipe RAM 414 (assuming the TXMI
link is an internal recipe link) and store the same in Recipe RAM
418, an on-chip RAM in relation to modification engine 422. It
further directs the arbiter 416 to read the data structures
associated with the specified internal or external recipe command
sequence, and store the same in Data RAM 420, another on-chip RAM
in relation to modification engine 422.
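The selection between an internal and an external recipe link may
be sketched as follows. The TxmiLink record, its field names, and
the use of dictionaries to stand in for the memories are all
assumptions made for illustration; the actual link format is not
specified above.

```python
from dataclasses import dataclass


@dataclass
class TxmiLink:
    internal: bool    # True: recipe in Internal Recipe RAM 414
    recipe_addr: int  # hypothetical address of the recipe
    data_addr: int    # related data structures, in External Transmit SRAM 412


def load_recipe(txmi, link_table, internal_recipe_ram, external_sram):
    """Fetch the recipe and its data structures for a given TXMI."""
    link = link_table[txmi]
    recipe_source = internal_recipe_ram if link.internal else external_sram
    recipe = recipe_source[link.recipe_addr]
    data = external_sram[link.data_addr]  # always external in this sketch
    # In the system above these would be copied into Recipe RAM 418 and
    # Data RAM 420 respectively.
    return recipe, data


links = {3: TxmiLink(True, 0, 10), 4: TxmiLink(False, 20, 10)}
internal_ram = {0: ["replace_mac"]}
external_ram = {10: {"mac": "00:11"}, 20: ["push_vlan"]}
```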
The sequencer 410 then awaits an available slot in the pipeline of
the modification engine 422. When such is available, the sequencer
410 passes to the engine 422 for placement in the slot a pointer to
the recipe as stored in Recipe RAM 418 and other related
information.
The sequencer 410 assigns a fragment buffer to the packet. The
fragment buffer is a buffer within a plurality of fragment buffers
which collectively may be referred to as TX work buffer 436. The
modification engine then executes the recipe for the packet or
packet burst, through one or more passes through the modification
engine pipeline. In one embodiment, the recipe comprises one or
more entries, and one or more passes through the pipeline are
performed to execute each entry of the recipe.
In the process of executing the recipe, the modification engine 422
stores the modified fragments of the packet in the fragment buffer
allocated to the packet in TX work buffer 436. At the same time,
the modification engine 422 stores, in ascending order in fragment
format RAM 438, pointers to the modified fragments of the packet as
stored in the fragment buffer and pointers to the unmodified
fragments of the packet as stored in Transmit In Data FIFO 428.
When all the recipe entries have been executed, the modification
engine 422 writes an entry to the fragment CAM 440, the entry
comprising the PBUS channel over which the packet was received, the
sequence number for the packet, the SOP pointer to the packet (as
stored in the Transmit In Data FIFO 428), a packet to be filled
flag, a packet offset in the Transmit In Data FIFO 428, and the
total length of the list of fragments as stored in the fragment
format RAM 438. This completes the processing of the packet by the
modification engine 422.
Fragment/burst processor 442 assembles the packets for ultimate
egress from the system. To prevent packet re-ordering, the
fragment/burst processor 442 processes, for each PBUS channel, the
packets in the order in which they were received by the
modification system 400. More specifically, the fragment/burst
processor 442 maintains an expected next sequence number for each
PBUS channel, and then performs, in round robin fashion, CAM
searches in fragment CAM 440 for an entry bearing the expected next
sequence number for the channel. If an entry is found with that
sequence number, the fragment/burst processor 442 processes it. If
such an entry is not found, the fragment/burst processor 442 takes
no action with respect to the channel at that time, and proceeds to
process the next channel.
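The per-channel ordering performed by the fragment/burst processor
442 may be sketched as follows. A dictionary keyed by channel and
sequence number stands in for the fragment CAM 440; sequence
numbers are assumed to start at zero and wraparound is ignored. The
class FragmentScheduler is a hypothetical name.

```python
class FragmentScheduler:
    """Release fragment CAM entries in per-channel sequence order."""

    def __init__(self, num_channels: int):
        self.expected = [0] * num_channels  # expected next sequence per channel
        self.cam = {}                       # (channel, sequence) -> entry

    def write_entry(self, channel: int, sequence: int, entry) -> None:
        # Written by the modification engine when a recipe completes.
        self.cam[(channel, sequence)] = entry

    def poll_round(self) -> list:
        """Visit each channel once; return the entries released this round."""
        released = []
        for channel in range(len(self.expected)):
            entry = self.cam.pop((channel, self.expected[channel]), None)
            if entry is None:
                continue  # nothing ready; take no action for this channel
            released.append(entry)
            self.expected[channel] += 1
        return released


sched = FragmentScheduler(num_channels=2)
sched.write_entry(0, 1, "a1")  # finished early, out of order
sched.write_entry(1, 0, "b0")
first = sched.poll_round()     # channel 0 still waits for sequence 0
sched.write_entry(0, 0, "a0")
second = sched.poll_round()
third = sched.poll_round()
```

In this example the entry "a1" is held back until "a0" has been
released, so packets on channel 0 leave in the order received.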
When a fragment CAM entry with the expected next sequence number is
located, the fragment/burst processor 442 directs assembler 446 to
assemble the packet responsive to the fragment list for the packet
as stored in the fragment format RAM 438. In one embodiment, the
assembler 446 is a multiplexor, which is directed to multiplex
between outputting on second data path 444, responsive to the
fragment list, the modified packet fragments as stored in the TX
work buffer 436 and the unmodified packet fragments as stored in
the Transmit In Data FIFO 428 (as provided to the multiplexor 446
over data path 434). Through this process, the packet is assembled
in ascending order on second data path 444. In one embodiment, the
second data path 444 has a width less than the width of the first
data path 402. In one implementation, the fragment/burst processor
442 outputs the packets over data path 444 in the form of
bursts.
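The multiplexing performed by the assembler 446 may be sketched as
follows. The fragment-list entry format (source, offset, length) is
an assumption made for illustration, as is the name assemble; byte
strings stand in for the TX work buffer 436 and the Transmit In
Data FIFO 428.

```python
from typing import List, Tuple

# Hypothetical fragment-list entry: (source, offset, length), where source
# "work" selects modified bytes and "fifo" selects unmodified bytes.
Fragment = Tuple[str, int, int]


def assemble(fragments: List[Fragment], work: bytes, fifo: bytes) -> bytes:
    """Concatenate fragments in list order, switching source per fragment."""
    sources = {"work": work, "fifo": fifo}
    out = bytearray()
    for source, offset, length in fragments:
        out += sources[source][offset:offset + length]
    return bytes(out)


fifo_data = b"OLDHDRpayload"  # original packet as received
work_data = b"NEWHEADER"      # header rewritten by the modification engine
packet = assemble([("work", 0, 9), ("fifo", 6, 7)], work_data, fifo_data)
```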
The assembled packet is parsed by the third packet parser 448 in
the manner described previously. The resultant context pointers and
status flags are then passed, along with the packet, for concurrent
processing by Transmit Processor Block 452 and Transmit ACL Logic
454.
The Transmit Processor Block 452 performs two main functions.
First, it performs egress mark processing by selectively modifying
one or more QoS fields in the packet responsive to the egress mark
control information from the packet stored by the modification
engine in Transmit Post Processor RAM 456. In one example, any of
the VLAN VPRI, MPLS EXP, and IPv4/IPv6 TOS fields may be modified
through this process utilizing the VPRI/EXP/IPToS RAMs 458 as
appropriate. The egress mark control information may be derived
from one or more egress mark commands specified by an AFH
pre-pended to the packet, or from one or more egress mark commands
within a recipe for the packet. Second, it performs OSI Layer
3/Layer 4 checksum calculation or modification.
The Transmit ACL logic 454 conducts a CAM search for the packet in
Egress ACL CAM 460 to determine if the packet should be killed,
copied to the host, or mirrored to the egress mirror FIFO 140.
The packet then exits the packet modification system 400 through
the egress portion 462 of the system 400, and is output onto PBUS
464.
FIG. 5 illustrates a cascaded combination 500 of multiple,
replicated packet systems, each of which is either a packet
classification system or a packet modification system. In one
embodiment, the cascaded combination comprises a first one 502 of
the replicated packet systems having ingress and egress portions,
identified respectively with numerals 504 and 506, and a second one
508 of the replicated packet systems having ingress and egress
portions, identified respectively with numerals 510 and 512.
In this embodiment, the egress portion 506 of the first packet
system 502 is coupled to the ingress portion 510 of the second
packet system 508. Moreover, the first one 502 of the replicated
packet systems is configured to perform partial processing of a
packet, either classification or modification processing as the
case may be, and the second one 508 of the replicated packet
systems is configured to complete processing of the packet.
In one configuration, packet system 508 forms the last one of a
plurality of systems in the cascaded combination, and packet system
502 forms either the first or the next to last one of the systems
in the cascaded combination.
In one example, each of the replicated systems performs a limited
number of processing cycles, and the number of replicated systems
is chosen to increase the number of processing cycles to a desired
level beyond that achievable with a single system.
In a second example, a complete set of processing functions or
tasks is allocated amongst the replicated systems. In one
configuration, a first replicated system is allocated ACL and QoS
classification processing tasks, and a second replicated system is
allocated PTI/TXMI classification processing tasks.
FIG. 6 is a flowchart of one embodiment 600 of a method of
processing a packet. In this embodiment, the method comprises step
602, parsing a packet and providing first data representative
thereof, and step 604, classifying the packet responsive to the
first data.
In step 606, the packet is forwarded to and received from switching
fabric, which may perform additional processing of the packet. Step
608 comprises parsing the packet received from the switching fabric
(which may be the packet forwarded to the switching fabric, or a
packet derived there-from), and providing second data
representative thereof.
Step 610 comprises modifying the packet responsive to the second
data, and step 612 comprises parsing the modified packet and
providing third data representative thereof. Step 614 comprises
post-processing the modified packet responsive to the third
data.
In one embodiment, the packet undergoing processing has a plurality
of encapsulation layers, and each of the first, second and third
parsing steps 602, 608, 612 comprises providing context pointers
pointing to the start of one or more of the encapsulated layers of
the packet.
In a second embodiment, the packet undergoing processing comprises
a first packet forming the payload portion of a second packet, each
of the first and second packets having a plurality of encapsulation
layers, and each of the first, second and third parsing steps 602,
608, 612 comprises providing context pointers pointing to the start
of one or more of the encapsulated layers of the first packet and
one or more of the encapsulated layers of the second packet.
In one implementation, the post-processing step comprises computing
a checksum for the modified packet. In a second implementation, the
post-processing step comprises egress marking of the packet. In a
third implementation, the post-processing step comprises the
combination of the foregoing two implementations.
FIG. 7 is a flowchart of a second embodiment 700 of a method of
processing a packet. In this embodiment, step 702 comprises
analyzing a packet in a packet classification system and,
responsive thereto, selectively changing the state of a control bit
from a first state to a second state. Step 704 comprises forwarding
the packet to and from switching fabric. Step 706 comprises
modifying, in a packet modification system, the packet received
from the switching fabric (either the packet forwarded to the
switching fabric, or a packet derived there-from), detecting the
control bit to determine if egress mirroring of the modified packet
is activated, and if so, providing a copy of the modified packet to
the packet classification system.
In one implementation, the control bit is associated with the
packet received from the switching fabric. In one example, the
control bit is in a packet header pre-pended to the packet received
from the switching fabric.
FIG. 8 is a flowchart of a third embodiment 800 of a method of
processing a packet. Step 802 comprises providing a
multi-dimensional quality of service (QoS) indicator for a packet.
Step 804 comprises forwarding the packet to and from switching
fabric. Step 806 comprises egress marking of the packet received
from the switching fabric (either the packet forwarded to the
switching fabric, or a packet derived there-from), responsive to at
least a portion of the multi-dimensional QoS indicator.
In one implementation, step 806 comprises selectively modifying one
or more quality of service fields within the packet received from
the switching fabric responsive to at least a portion of the
multi-dimensional quality of service indicator.
In one configuration, the multi-dimensional quality of service
indicator comprises an ingress quality of service indicator, an
egress quality of service indicator, and packet marking control
information, and step 806 comprises selectively modifying one or
more quality of service fields within the packet received from the
switching fabric responsive to the packet marking control
information. In one example, the multi-dimensional quality of
service indicator further comprises a host quality of service
indicator.
In one embodiment, the method further comprises utilizing the
ingress quality of service indicator as an ingress queue select. In
a second embodiment, the method further comprises utilizing the
egress quality of service indicator as an egress queue select. In a
third embodiment, the method further comprises utilizing the host
quality of service indicator as an ingress queue select for a
host.
FIG. 9 is a flowchart of an embodiment 900 of assigning a quality
of service indicator to a packet. In this embodiment, step 902
comprises providing a plurality of quality of service indicators
for a packet, each with an assigned priority, and step 904
comprises utilizing a configurable priority resolution scheme to
select one of the plurality of quality of service indicators for
assigning to the packet.
In one implementation, step 902 comprises mapping one or more
fields of the packet into a quality of service indicator for the
packet and an associated priority. In a second implementation, step
902 comprises performing a search to obtain a quality of service
indicator for the packet and an associated priority. A third
implementation comprises a combination of the foregoing two
implementations.
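Steps 902 and 904 can be sketched in Python as follows. This is only an illustrative sketch: the names `Candidate` and `resolve_qos`, the `higher_wins` switch standing in for the configurable resolution scheme, and the tie-break rule (the earliest candidate is kept on equal priority) are assumptions and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    source: str    # hypothetical label, e.g. "field-map" or "search"
    qos: int       # candidate quality of service indicator
    priority: int  # priority assigned to this candidate (step 902)


def resolve_qos(candidates: Iterable[Candidate],
                higher_wins: bool = True) -> Optional[Candidate]:
    """Select one candidate (step 904).

    `higher_wins` is an assumed stand-in for the configurable scheme:
    True selects the numerically highest priority, False the lowest.
    On equal priority the earliest candidate is kept.
    """
    best = None
    for cand in candidates:
        if best is None:
            best = cand
        elif higher_wins and cand.priority > best.priority:
            best = cand
        elif not higher_wins and cand.priority < best.priority:
            best = cand
    return best
```

For example, a candidate mapped from packet fields and a candidate obtained from a search can be passed together to `resolve_qos`, and the indicator of the returned candidate is the one assigned to the packet.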
FIG. 10 is a flowchart of an embodiment 1000 of a method of
classifying a packet. In this embodiment, step 1002 comprises
slicing some or all of a packet into portions and providing the
portions in parallel over a first data path having a first width to
a classification engine. Step 1004 comprises classifying, in the
packet classification engine, the packet responsive to the packet
portions received over the first data path and providing data
representative of the packet classification. Step 1006 comprises
associating the data representative of the packet classification
with the packet to form an associated packet, and providing the
associated packet over a second data path having a second width
less than the first width.
In one implementation, the step of providing the packet portions
over the first data path comprises providing each of the bits of
some or all of the packet in parallel over the first data path to
the classification engine.
In a second implementation, the associating step comprises
multiplexing the data representative of the packet classification
and some or all of the packet onto the second data path.
FIG. 11 is a flowchart of an embodiment 1100 of a method of
modifying a packet. Step 1102 comprises providing some or all of a
packet as packet portions and providing the portions in parallel
over a first data path having a first width to a modification
engine. Step 1104 comprises modifying, in the modification engine,
one or more of the packet portions. Step 1106 comprises assembling
a packet from the one or more modified and one or more unmodified
packet portions, and providing the assembled packet over a second
data path having a second width less than the first width.
FIG. 12 is a flowchart 1200 of an embodiment of a method of
classifying a packet. Step 1202 comprises placing an identifier of
a packet on a first FIFO buffer. Step 1204 comprises popping the
identifier off the first FIFO buffer and placing it on a second
FIFO buffer upon or after initiation of classification processing
of the packet. Step 1206 comprises avoiding outputting the packet
while an identifier of the same is placed on either the first or
second FIFO buffers. Step 1208 comprises outputting the packet upon
or after the identifier of the packet has been popped off the
second FIFO buffer.
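The two-FIFO bookkeeping of steps 1202-1208 might be sketched in Python as below. The class and method names are hypothetical, and the sketch assumes that popping the identifier off the second FIFO buffer coincides with the completion of classification processing.

```python
from collections import deque


class ClassificationTracker:
    """Tracks packet identifiers across two FIFO buffers (assumed names)."""

    def __init__(self):
        self.pending = deque()      # first FIFO: awaiting classification
        self.in_progress = deque()  # second FIFO: classification initiated

    def enqueue(self, packet_id):
        # Step 1202: place the identifier on the first FIFO buffer.
        self.pending.append(packet_id)

    def start_classification(self):
        # Step 1204: pop off the first FIFO and place on the second.
        packet_id = self.pending.popleft()
        self.in_progress.append(packet_id)
        return packet_id

    def may_output(self, packet_id):
        # Step 1206: no output while the identifier is on either FIFO.
        return (packet_id not in self.pending
                and packet_id not in self.in_progress)

    def finish_classification(self):
        # Step 1208: once popped off the second FIFO, output is allowed.
        return self.in_progress.popleft()
```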
FIG. 13 is a flowchart illustrating an embodiment 1300 of a method
of maintaining packet statistics. Step 1302 comprises allocating a
packet size determiner to a packet from a pool of packet size
determiners. Step 1304 comprises using the packet size determiner
to determine the size of the packet. Step 1306 comprises updating
one or more packet statistics responsive to the determined size of
the packet. Step 1308 comprises returning the packet size
determiner to the pool upon or after the same has determined the
size of the packet.
In one implementation, the packet size determiner is a counter that
counts the size of the packet. In a second implementation, the
method further comprises queuing one or more statistics update
requests.
In one implementation example, the one or more packet statistics
indicate the cumulative size of packets which have met specified
processing conditions or hits, and step 1306 comprises incrementing
a cumulative size statistic for a particular processing condition
or hit by the determined size of the packet if the packet meets
that particular processing condition or hit.
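A minimal Python sketch of steps 1302-1308 follows, assuming the packet size determiner is a counter as in the first implementation. The names `SizeCounter`, `StatsUnit` and `account`, and the use of string labels for processing conditions or hits, are illustrative assumptions.

```python
class SizeCounter:
    """A packet size determiner implemented as a byte counter."""

    def __init__(self):
        self.count = 0

    def measure(self, packet: bytes) -> int:
        self.count = 0
        for _ in packet:        # count the packet one byte at a time
            self.count += 1
        return self.count


class StatsUnit:
    def __init__(self, pool_size: int):
        self.pool = [SizeCounter() for _ in range(pool_size)]
        self.cumulative_bytes = {}  # cumulative size per condition or hit

    def account(self, packet: bytes, hits) -> int:
        counter = self.pool.pop()              # step 1302: allocate
        try:
            size = counter.measure(packet)     # step 1304: determine size
            for hit in hits:                   # step 1306: update statistics
                self.cumulative_bytes[hit] = (
                    self.cumulative_bytes.get(hit, 0) + size)
        finally:
            self.pool.append(counter)          # step 1308: return to pool
        return size
```

In this sketch an empty pool raises `IndexError`; a hardware implementation would presumably stall or queue the request instead, which the sketch does not model.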
FIG. 14 illustrates an embodiment 1400 of a method of classifying a
packet. Step 1402 comprises buffering a packet in a buffer upon or
after ingress thereof. Step 1404 comprises classifying the packet
and providing data representative of the packet classification.
Step 1406 comprises associating the data representative of the
packet classification with some or all of the packet as directly
retrieved from the buffer to form a packet on an egress data
path.
In one implementation, step 1406 comprises multiplexing the data
representative of the packet classification onto a data path
followed by some or all of the packet as directly retrieved from
the buffer.
FIG. 15 illustrates an embodiment 1500 of a method of modifying a
packet. Step 1502 comprises buffering the packet in a buffer upon
ingress thereof. Step 1504 comprises modifying one or more portions
of the packet. Step 1506 comprises assembling the one or more
modified portions of the packet with one or more unmodified
portions of the packet as retrieved directly from the buffer to
form an assembled packet on an egress data path.
In one implementation, the method comprises providing a list
indicating which portions of the assembled packet are to comprise
modified portions of an ingress packet, and which portions are to
comprise unmodified portions of the ingress packet, and step 1506
comprises assembling the assembled packet responsive to the
list.
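List-driven assembly of this kind can be sketched in Python as follows. The encoding of the list (tuples tagged "mod" or "raw") and the function name `assemble` are assumptions made for illustration; the text does not specify how the list is represented.

```python
def assemble(buffered: bytes, modified: dict, plan: list) -> bytes:
    """Build the egress packet from a list of portions (step 1506).

    Each plan entry is either ("mod", key), taking a modified portion
    from `modified`, or ("raw", start, end), taking unmodified bytes
    directly from the ingress buffer.
    """
    out = bytearray()
    for entry in plan:
        if entry[0] == "mod":
            out += modified[entry[1]]
        else:
            _, start, end = entry
            out += buffered[start:end]
    return bytes(out)
```

The unmodified portions are sliced straight out of `buffered`, mirroring the idea that they are retrieved directly from the buffer rather than passing through the modification logic.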
FIG. 16 illustrates an embodiment 1600 of a method of processing a
packet in a cascaded combination of multiple, replicated packet
processing systems. In one implementation, each of the systems is
either a packet classification system or a packet modification
system, and the processing which is performed by each system is
either classification processing or modification processing as the
case may be. Step 1602 comprises performing partial processing of a
packet in a first of the replicated packet processing systems, and
step 1604 comprises completing processing of the packet in a second
of the replicated packet processing systems.
In one implementation, the second packet processing system is the
last of a plurality of replicated packet processing systems, and
the first packet processing system is either the first or next to
last packet processing system in the plurality of packet processing
systems, wherein partial processing of a packet is performed in the
first replicated packet processing system, and processing is
completed in the second replicated packet processing system.
FIG. 17 illustrates an embodiment 1700 of a method of preventing
re-ordering of packets in a packet processing system. Step 1702
comprises assigning a sequence number to a packet upon or after
ingress thereof to the system. Step 1704 comprises processing the
packet. Step 1706 comprises storing data representative of the
packet in a buffer. Step 1708 comprises checking the buffer for an
entry matching an expected next sequence number. Inquiry step 1710
comprises determining if a match is present. If so, steps 1712 and
1714 are performed. Step 1712 comprises outputting the
corresponding packet, and step 1714 comprises updating the expected
next sequence number to reflect the outputting of the packet. If
not, the method loops back to step 1708, thus deferring outputting
a packet if a match is not present.
In one implementation, steps 1708-1714 comprise maintaining an
expected next sequence number for each of a plurality of output
channels, checking the buffer for a match for each of the channels,
outputting the corresponding packet on a channel if a match for
that channel is present and updating the expected next sequence
number for that channel, and deferring outputting a packet on a
channel if a match for that channel is not present.
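The single-channel case of steps 1702-1714 might look like the following Python sketch. The class name `Reorderer` and the use of a dictionary as the buffer are assumptions; a multi-channel variant would keep one expected next sequence number per output channel.

```python
class Reorderer:
    """Releases packets strictly in order of assigned sequence number."""

    def __init__(self):
        self.next_assigned = 0  # next sequence number to hand out
        self.expected = 0       # expected next sequence number on output
        self.buffer = {}        # sequence number -> processed packet data

    def ingress(self) -> int:
        # Step 1702: assign a sequence number upon ingress.
        seq = self.next_assigned
        self.next_assigned += 1
        return seq

    def store(self, seq: int, packet) -> None:
        # Step 1706: store data representative of the processed packet.
        self.buffer[seq] = packet

    def drain(self) -> list:
        # Steps 1708-1714: output while the expected entry is present,
        # otherwise defer until a later call.
        released = []
        while self.expected in self.buffer:
            released.append(self.buffer.pop(self.expected))
            self.expected += 1
        return released
```

If the second packet finishes processing first, `drain` returns nothing until the first packet has also been stored, after which both are released in order.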
B. Pipelined Packet Processing System
An embodiment of a pipelined packet processing system 1800 is
illustrated in FIG. 18. The system 1800 comprises a packet
processor 1802 that maintains at least one pipeline having a
predetermined number of slots, such as illustrated in FIG. 19, for
placement of packet data. Three such slots are identified in FIG.
19 with numerals 1902a, 1902b, and 1902c. The packet processor 1802
is configured to load each of one or more empty ones of the slots
with available packet data, process each of one or more filled ones
of the slots in sequence during a cycle of processing, and process
each of the one or more filled ones of the slots for a
predetermined number of cycles of processing.
In one embodiment, the processor 1802 is configured to process the
data in a filled slot during a cycle by accessing one or more
resources responsive to state data corresponding to the packet data
stored in the slot, retrieving data from the one or more resources,
and selectively updating the state data responsive to the data
retrieved from the one or more resources.
Upon or after the data in the filled slot has undergone the
predetermined number of cycles of processing, the processor 1802 is
configured to unload the data, and derive packet classification or
forwarding information from the state data for the packet. In one
embodiment, the processor 1802 assigns the packet classification or
forwarding information to the packet such as by pre-pending it to
the packet.
In one application, the processor 1802 forms the packet
classification engine 128 illustrated in FIG. 1, or the
classification engine 310 illustrated in FIG. 3, and the packet
classification or forwarding information derived by the processor
1802 is the AFH, illustrated in FIG. 2, which is pre-pended to the
packet.
Turning back to FIGS. 18 and 19, in one implementation, the
processor 1802 is configured to fill the one or more of the
unfilled ones of the slots 1902a, 1902b, 1902c during a loading
mode of operation, and process one or more of the filled ones of
the slots during a subsequent processing mode of operation that
commences after the loading mode of operation has been
completed.
In one embodiment, the processor 1802 is configured to fill the one
or more of the unfilled slots with available packet data as
obtained from a queue 1903. In one example, the processor 1802 is
configured to bypass unfilled ones of the slots if and while the
queue is empty. Thus, in FIG. 19, filled ones of the slots are
identified with "P" while unfilled ones of the slots are identified
with "X." In one configuration, the packet data that is taken from
queue 1903 and stored in a slot is an identifier of packet data as
stored in FIFO buffer 1804. In one application, the queue 1903 is
the queue 330, illustrated in FIG. 3, which maintains identifiers
of packets that are awaiting classification, and the FIFO buffer
1804 is slicer 306.
In one embodiment, working state data is stored in the slots along
with the corresponding packet data. In FIG. 19, this working state
data is shown in phantom and identified with numerals 1904a, 1904b,
1904c.
In one implementation example, the predetermined number of slots
maintained by the processor 1802 is a programmable variable having
a default value of 20 slots, and the predetermined number of
processing cycles that each slot undergoes is also a programmable
variable having a default value of 5 cycles. In this implementation
example, identifiers of packets awaiting processing by processor
1802 are stored in the queue 1903. During a loading mode of
operation, each of the slots 1902a, 1902b, 1902c in the pipeline
is sequentially loaded with packet identifiers popped off the
queue 1903. The process of loading slots is identified in FIG. 19
with numeral 1906. During the loading mode of operation, if the
queue 1903 is empty when a slot is presented for loading, the slot
is bypassed and not loaded with packet data. This process continues
until all the slots have either been filled or bypassed. At that
point, the processor enters a processing mode of operation, during
which each of the filled slots undergoes the predetermined number
of cycles of processing.
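The loading and processing modes described above can be sketched in Python as follows. The defaults of 20 slots and 5 cycles come from the implementation example; the function names, the use of `None` to mark a bypassed slot, and the `process_cycle` callback are illustrative assumptions.

```python
from collections import deque


def load_slots(queue: deque, num_slots: int = 20) -> list:
    """Loading mode: fill each slot in turn, bypassing when the queue is empty."""
    slots = []
    for _ in range(num_slots):
        if queue:
            slots.append(queue.popleft())  # slot filled ("P")
        else:
            slots.append(None)             # slot bypassed ("X")
    return slots


def run_processing(slots: list, process_cycle, cycles_per_slot: int = 5) -> None:
    """Processing mode: every filled slot undergoes each cycle in sequence."""
    for cycle in range(cycles_per_slot):
        for index, packet_id in enumerate(slots):
            if packet_id is not None:
                process_cycle(index, packet_id, cycle)
```

The outer loop over cycles and inner loop over slots reflects the description that the filled slots are processed in sequence during a cycle, and that each filled slot receives the predetermined number of cycles.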
Turning back to FIG. 18, the processor 1802 performs a cycle of
processing on a slot by retrieving an entry from sequence control
table (SCT) 1806. During the first cycle of processing of the data
in the slot, the address of the command in the SCT is obtained from
an entry in First Command CAM 1818. That entry is obtained from a
search of the First Command CAM 1818 using a key derived from the
results of parsing the packet as stored in Parser Result RAM 1820.
During subsequent cycles of processing of the data in the slot, the
address of the command is obtained from working state data stored
in the slot itself alongside the corresponding packet data. In one
implementation, this address is stored in the slot at the
conclusion of the previous cycle of processing. In one example,
this address is derived during the previous cycle of processing
from the SCT command that is being executed during that cycle of
processing. In one application, the Parser Result RAM 1820 is the
Parser Result RAM 314 identified in FIG. 3.
Turning back to FIG. 18, in one implementation example, the command
from SCT 1806 is processed by data path logic 1808 to form a key to
CAM 1810. In this implementation example, a matching entry in the
CAM 1810 is located. This matching entry identifies a corresponding
entry in associated RAM (ARAM) 1812. The ARAM and/or SCT entries
either provide working state data for the packet undergoing
processing, or provide data from which that working state data is
updated. In one application, CAM 1810 forms the CAM 142 illustrated
in FIG. 1, and ARAM 1812 forms the ARAM 144 illustrated in FIG.
1.
The steps of updating the working state information for a packet
are reflected in FIG. 19. In particular, once a slot is loaded with
packet data as identified with numeral 1906, in one implementation
example, the slot conceptually moves through the pipeline in a
counter-clockwise fashion. At the point identified with numeral
1908, an access is made to SCT 1806 for the command to be executed.
As discussed, during the first cycle of processing of a slot, the
address of this first command is obtained from First Command CAM
1818. During subsequent cycles of processing, the address of the
command is obtained from the working state for the packet stored in
the slot alongside the packet.
At the point identified with numeral 1910, the SCT command
resulting from this access is obtained. At the point identified
with numeral 1912, this command is processed by data path logic
1808 to result in a CAM key. At the point identified with numeral
1914, an access is made to CAM 1810 using this key. Because of the
latency of this CAM, the result of this access is not available
until the point identified with numeral 1916. At the point
identified with numeral 1918, the CAM entry resulting from this
access is used to access a corresponding entry in ARAM 1812. At the
point identified with numeral 1920, the result of this access is
available. At the point identified with numeral 1922, data
resulting from the ARAM access and/or the SCT command data is
resolved with the current working state data for the packet. For
priority-based items, an element of the ARAM/SCT data supersedes an
existing element of state data if it has a higher priority. For
non-priority based items, an element of the ARAM/SCT data may
supersede an existing element of state data without regard to
priority.
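The resolution performed at the point identified with numeral 1922 can be illustrated with the short Python sketch below. The dictionary representation, the function name `resolve`, and the choice of 0 as the starting priority of every priority-based item are assumptions made for the example.

```python
def resolve(state: dict, priorities: dict,
            new_values: dict, new_priorities: dict) -> None:
    """Merge ARAM/SCT data into the working state, in place.

    Fields listed in `new_priorities` are priority-based: the new value
    supersedes the existing one only if its priority is strictly higher.
    All other fields are non-priority based and simply supersede.
    """
    for field, value in new_values.items():
        if field in new_priorities:
            if new_priorities[field] > priorities.get(field, 0):
                state[field] = value
                priorities[field] = new_priorities[field]
        else:
            state[field] = value
```

Calling `resolve` once per processing cycle lets a later cycle override an earlier PTI only when the later cycle carries a higher PTI priority, while a flag such as LEARN is written without a priority check.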
In one embodiment, as discussed, the working state data for a
packet is stored in the corresponding slot alongside the packet
data. In a second embodiment, an identifier of the working state
data as stored in a buffer is stored in the corresponding slot
along with the packet data.
In one embodiment, the working state data for a packet is control
data, such as, for example, pipeline management information, packet
process state data, or static packet information. In a second
embodiment, the working state data for a packet is packet
classification or forwarding information for the packet such as,
for example, priority-based packet classification/forwarding
information or non-priority-based packet classification/forwarding
information. In a third embodiment, the working state data for the
packet is statistical information relating to the packet. In a
fourth embodiment, the working state data for a packet is any
combination of the foregoing.
In one implementation example, as illustrated in FIG. 20, the
working state data maintained for a packet comprises control data
2002, AFH data 2004, and statistical information 2006. In this
implementation example, the working state data is stored in the
slot along with an identifier of the packet as stored in a buffer
(such as slicer 306 in FIG. 3). In one configuration, the control
data 2002 comprises:
Pipeline management data, including:
  host/packet indicator, indicating whether the slot is occupied by
    packet data or data from a CPU host.
  cycle count, the number of cycles of processing the data in the
    slot has undergone to date.
  first/done indicators, indicating respectively whether the current
    cycle of processing is the first cycle for the data in the slot,
    and whether the slot has completed all required cycles of
    processing.
Packet process state data, including:
  Page selector, the page selector applicable to the current
    processing cycle.
  VLAN selector, the VLAN selector applicable to the current
    processing cycle.
  IP selector, the IP header selector applicable to the current
    processing cycle.
  ARAM VLAN flag, indicating whether the working VLAN for the packet
    is to be taken from the ARAM entry.
  SCT index, identifying the address of the next SCT command to be
    executed.
Static packet information, including:
  Packet length.
  Packet pointer, a pointer to the packet as stored in a buffer.
  Interface type, e.g., Ethernet or POS.
  Ingress port number, an identifier of the ingress port of the
    packet.
  Port State flag, a flag indicating whether the Port State Table is
    being used for this processor slot.
Debug management information.
In one configuration, the AFH data 2004 comprises:
Priority based information, including:
  PTI.
  TXMI.
  IQoS, EQoS, CQoS.
  Egress Mark data.
Non-priority based information, including the following "sticky"
flags that, once set, remain set:
  Learn flag, a flag that, if asserted, directs a switch-side device
    to forward a copy of the packet to the host for learn processing.
  Redirect flag, a flag that, if asserted, directs a switch-side
    device to forward a copy of the packet to the host for redirect
    processing.
  Ingress Mirror flag, a flag that, if asserted, directs a
    switch-side device to forward a copy of the packet to a
    designated ingress mirror port on the switch.
  Egress Mirror flag, a flag that, if asserted, directs a
    switch-side device to forward a copy of the packet to a
    designated mirror FIFO on the switch.
  Random Early Drop flag, a flag that, if asserted, increases the
    priority of the packet for dropping.
  Jumbo check flag, a flag that, if asserted, directs a device
    encountering the packet to perform a Jumbo-allowed check.
In one configuration, the statistical data comprises: Matrix mode
statistics, whereby a multi-dimensional statistic for a packet is
accumulated over each of the processing cycles undertaken by the
packet.
In one embodiment, the pipeline of FIG. 19 comprises three separate
but related pipelines, identified with numerals 2102, 2104, 2106 in
FIG. 21, that are respectively used to update the control, AFH, and
statistical portions of the working state data.
FIG. 22 illustrates an implementation example of the control data
portion of the state data corresponding to a packet, FIG. 23 is an
implementation example of the AFH portion of the state data
corresponding to a packet, and FIG. 24 is the statistics data
portion of the state data corresponding to a packet. The functions
of the various bits and fields illustrated in FIG. 22 are as
follows:
BUSY--a bit that, if asserted, indicates the pipeline slot is
  processing a packet.
CPU--a bit that, if asserted, indicates the pipeline slot is
  processing a CPU or host access.
FIRST--a bit that, if asserted, indicates the current cycle is the
  first processing cycle for the packet.
DONE PEND--a bit that, if asserted, indicates that the packet has
  undergone all required cycles of processing and that an AFH
  assignment to the packet is pending.
PTR--a pointer or reference handle to the packet in a receive FIFO.
LEN--packet length up to 128 bytes total.
IF TYPE--ingress interface type; 0=Ethernet, 1=POS.
IF PST ACTIVE--an indicator of whether the Port State Table is
  active during this processor cycle.
PORT--the ingress port of the packet being processed.
VLAN--the working VLAN for the current processing cycle.
C1--the C1 context pointer for the current processing cycle.
C2--the C2 context pointer for the current processing cycle.
C3--the C3 context pointer for the current processing cycle.
C4--the C4 context pointer for the current processing cycle.
C5--the C5 context pointer for the current processing cycle.
C6--the C6 context pointer for the current processing cycle.
LKUP COUNT--a count of the number of cycles of processing
  undertaken to date for the packet.
SCT--the SCT index for the current processing cycle.
PAGE SEL--the page selector for the current processing cycle.
VLAN SEL--the VLAN selector for the current processing cycle.
L3 SEL--the L3 Header selector for the current processing cycle.
VLAN ARAM--an indicator that the working VLAN for the current
  processing cycle was derived from an ARAM entry.
DEBUG ACTIVE--a flag that, if asserted, indicates that a Debug
  Process is active.
DEBUG LAST SLOT--an indicator to the Debug Process that the current
  slot is the last slot in the pipeline.
DEBUG LAST LKUP--an indicator to the Debug Process that the current
  processing cycle is the last processing cycle in the pipeline.
DEBUG VALID--Debug Valid bits to control debug triggering.
The functions of the bits and fields illustrated in FIG. 23 are as
follows:
PTI--see discussion of FIG. 2.
TXMI--see discussion of FIG. 2.
EQoS--see discussion of FIG. 2.
IQoS--see discussion of FIG. 2.
CQoS--see discussion of FIG. 2.
CPU Copy--see discussion of FIG. 2. In one implementation, set when
  a QoS source returns a valid CPU QoS value.
EMRK SEL--see discussion of FIG. 2.
PERR KILL--see discussion of FIG. 2.
LAI--see discussion of FIG. 2.
LAI KEEP--an indicator whether the LAI was supplied by ARAM.
EMIRROR--see discussion of FIG. 2. In one implementation, this flag
  is set if the ARAM EMirror flag is set or if an Egress QoS is
  returned with a special Mirror Copy encode value.
IMIRROR--see discussion of FIG. 2. In one implementation, this flag
  is set if either the ARAM IMirror or VPST Mirror flags are set.
ROUTE--see discussion of FIG. 2. In one implementation, this flag
  is set when any SCT entry in the lookup sequence for the packet
  requests that it be set.
LEARN--see discussion of FIG. 2. In one implementation, this flag
  may be set when an SCT-enabled comparison indicates that the
  ingress port does not equal the least significant bits of the PTI
  obtained from a matching CAM entry, or that the CAM search did not
  result in a match (also subject to VPST.Learn enable control).
REDIRECT--see discussion of FIG. 2. In one implementation, this
  flag is set when an SCT-enabled comparison determines that the
  ingress and egress (ARAM-supplied) VLANs are equal.
JUMBO--see discussion of FIG. 2. In one implementation, this flag
  is set when any SCT entry in the lookup sequence for the packet
  requests that it be set.
DON'T FRAG--see discussion of FIG. 2. In one implementation, this
  flag is always set for IPv6 processing, and set for IPv4
  processing if the Don't Fragment bit in the IPv4 header is set. In
  one example, unlike the other flags in this table, which are all
  persistent, i.e. once set, remain set, this flag is
  pseudo-persistent, i.e., once set, normally remains set, but may
  be overwritten in limited circumstances. For example, the bit may
  be initially set based on the processing of an outer IP header,
  but then is updated (through a SCT request) based on the
  processing of an inner UDP header.
RED--see discussion of FIG. 2. In one implementation, this flag is
  set when a QoS source returns this flag set.
IF TYPE--see discussion of FIG. 2.
PTI PRI--current PTI priority.
TXMI PRI--current TXMI priority.
EQoS PRI--current EQoS priority.
IQoS PRI--current IQoS priority.
CQoS PRI--current CQoS priority.
EMS/EMM PRI--current Egress Mark Select/Mask priority.
SSAMPLE BIN--Statistical Sample bin.
SAMPLE ARAM--indicator that Statistical Sample bin is supplied by
  ARAM.
The functions of the bits and fields illustrated in FIG. 24 are
explained in co-pending U.S. patent application Ser. No.
10/834,573.
In one embodiment, the data of FIGS. 22, 23 and 24 is consolidated
with other data to form the process data illustrated in FIG. 25. In
particular, the control data of FIG. 22 forms the 116 bit CONTROL
SET referred to in FIG. 25; the AFH data of FIG. 23 forms the 112
bit AFH SET referred to in FIG. 25; and the statistics data of FIG.
24 forms the 56 bit STATS SET referred to in FIG. 25. This process
data, which includes a pointer to the corresponding packet, forms
the state data that is stored in a slot. The functions of the other
fields referred to in FIG. 25 are as follows:
CID--an identifier of the CAM key as used in the current processing
  cycle.
RID--a Router identifier as obtained from the PST or VST during the
  current processing cycle.
PORT--the ingress port of the packet being processed.
CONSTANT--the CONSTANT field from the SCT used in the current
  processing cycle.
RT0-RT3 RESULTS--the results, respectively, of Reduction Tables 0-3
  during the current processing cycle.
IP PROTOCOL--the IP protocol field of the IP Header currently being
  processed.
ARAM DATA--the ARAM entry data from the previous processing cycle.
This process data forms a 128 byte, nibble addressable data
structure that is represented in FIGS. 26A-26B. This process data
is to be contrasted with a 128 byte, nibble addressable data
structure, representing the first 128 bytes of packet data, which
is also maintained. This data structure is illustrated in FIG. 27.
In one embodiment, the first cycle of processing is preceded by the
following initialization steps of the CONTROL SET data:
current SCT index loaded with initial SCT index as obtained from
  the First Command CAM 1818.
current PAGE SEL set to 0 (representing Page 0).
current VLAN SEL set to 0 (representing the only or outer VLAN of
  Page 0).
current VLAN set to Page 0, VLAN0 (or in the case of a routed POS
  service, the current VLAN is set to the VLAN supplied by the
  First Command CAM 1818).
current context pointer set (C1-C6) loaded with Page 0 context
  pointers.
current L3 SEL set to 0 (representing the only or outer L3 Header
  of Page 0).
current IP control set (consisting of Fragment Type,
  Don't_Fragment, Protocol, Next Header, and Exception Control
  values) set to Page 0 L3 0 (representing the only or outer Header
  of Page 0).
LKUP COUNT reset to 0 (if counting upwards) or to the predetermined
  number of cycles per packet (if counting down).
All the data in the AFH SET is initialized to 0. The data in the
STATISTICS SET is initialized to values specified in the PST/VST
table.
In one embodiment, a cycle of processing comprises the following
steps:
fetch SCT entry based on current SCT index value.
form CAM key (using data path logic 1808).
execute CAM search.
select active Exception Handler, as described in U.S. patent
  application Ser. No. 10/835,252.
execute QoS mapping operations, using PST, VST and QoS Map tables
  as described herein.
execute VPST access, as described in U.S. patent application Ser.
  No. 10/835,271.
if CAM hit, fetch corresponding ARAM entry.
selectively update process and statistics data based on SCT and/or
  ARAM entry data (as well as QoS mapping operations, VPST access,
  and exception handling operations).
unload operation if last cycle of processing for packet.
In one example, CAM 1810 is organized so that higher priority
entries precede lower priority entries. If there are multiple
matches or hits with the CAM key, the first such match or hit is
selected, consistent with the higher priority of this entry
compared to the other entries.
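First-match selection of this kind can be sketched in Python as below. The sketch assumes a ternary CAM in which each entry is a (value, mask) pair; the text does not state the entry format, and the function name `cam_search` is hypothetical.

```python
def cam_search(entries, key: int):
    """Return the index of the first matching entry, or None on a miss.

    `entries` is ordered with higher priority entries first, so the
    first match is also the highest priority match. Each entry is an
    assumed (value, mask) pair; mask bits set to 0 are "don't care".
    """
    for index, (value, mask) in enumerate(entries):
        if (key & mask) == (value & mask):
            return index
    return None
```

With entries ordered this way, a key that matches both a specific entry near the top and a catch-all entry at the bottom resolves to the entry nearer the top.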
In one implementation example, the format of a SCT entry is as
illustrated in FIGS. 28A-28C. The following elements of the SCT
entry format of FIGS. 28A-28C are relevant to this discussion:
NEXT SCT HIT--the index of the next SCT command assuming a CAM hit
  during this processing cycle.
NEXT SCT MISS--the index of the next SCT command assuming a CAM
  miss during this processing cycle.
PTI PRIORITY--the priority of the PTI during this processing cycle.
TXMI PRIORITY--the priority of the TXMI during this processing
  cycle.
EQoS PRIORITY--the priority of the ARAM-supplied EQoS field during
  this processing cycle.
IQoS PRIORITY--the priority of the ARAM-supplied IQoS field during
  this processing cycle.
CQoS PRIORITY--the priority of the ARAM-supplied CQoS field during
  this processing cycle.
LEARN OP--enable Learn processing operation.
ROUTE OP--set the Unicast Route flag during the current processing
  cycle.
DON'T FRAG OP--enable Don't Frag processing operation during the
  current processing cycle.
JUMBO OP--enable a Jumbo processing operation during the current
  processing cycle.
CAM KEY SEL NIBBLE 0-17--Eighteen CAM Key Selection Fields,
  discussed below.
In one implementation, the CAM key used to search through CAM 1810
during a processing cycle is derived by the data path logic 1808 of
FIG. 18 from the process and packet data for that processing cycle,
as well as the current SCT entry. In FIG. 18, the packet and
process data is provided to the data path logic 1808 over one or
more signal lines 1814, and selection data, used to narrow the
combined 256 bytes of data represented by this process and packet
data down to the desired size of the CAM key, is provided to the
data path logic 1808 from the current SCT entry over one or more
signal lines 1816.
FIG. 29 illustrates one example 2900 of the data path logic 1808.
In this particular example, the data path logic produces a 72 bit
CAM key 2902 that comprises 18 4-bit nibbles. Each of the nibbles
is produced by a corresponding 4-bit wide multiplexor. Thus, in
FIG. 29, nibble 0 of CAM key 2902 is produced by multiplexor 2904a,
while nibble 17 of CAM key 2902 is produced by multiplexor 2904b.
Each of these multiplexors receives the same inputs in the same
order, 512 4-bit nibbles, 256 nibbles representing the process
data, and 256 nibbles representing the packet data. Each of these
multiplexors receives its own 12-bit selection field from the
current SCT entry. Thus, multiplexor 2904a receives the 12-bit
SELECT.sub.0 field, referred to in FIG. 28 as CAM KEY SEL NIBBLE 0,
while multiplexor 2904b receives the 12-bit SELECT.sub.17 field,
referred to in FIG. 28 as CAM KEY SEL NIBBLE 17. There are a total
of 18 selection fields represented in FIG. 28, which may be
referred to respectively as CAM KEY SEL NIBBLE 0-17, each of which
is assigned its own multiplexor in the implementation of data path
logic illustrated in FIG. 29.
FIG. 30 illustrates the format of each of these 12-bit selection
fields. The functions performed by the bits and fields in this
format are as follows:
  NIBBLE SELECT--selects one of the two nibbles in the selected
    byte.
  BYTE SELECT--selects one of 128 bytes in the selected data
    structure (either process or packet data).
  PROCESS PACKET DATA SELECT--selects either the process or packet
    data structures.
  CONTEXT SELECT--must be 0 if the process data structure is
    selected; otherwise, selects one of seven packet contexts as
    follows:
      0--Context 0--beginning of packet.
      1--Context 1--MAC Header Start.
      2--Context 2--Encapsulation/EtherType Start.
      3--Context 3--MPLS Start.
      4--Context 4--L3 Outer Start.
      5--Context 5--L3 Inner Start.
      6--Context 6--L4 Start.
      7--Reserved.
In a second example, a 144 bit CAM key is formed using the structure
of FIG. 29 from two successive retrievals of SCT entries over two
successive half cycles. The selection fields from the two successive
SCT entries are successively input to the multiplexors of FIG. 29
with the same process and packet data as inputs. Through this
process, two 72-bit data structures are formed that are concatenated
to form the 144 bit CAM key. Other examples are possible, so nothing
in this or the previous example should be taken as limiting. FIG. 31
illustrates several possible examples of 72 bit keys.
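By way of illustration only, the formation of a 72 bit CAM key from
eighteen 12-bit selection fields may be sketched in Python as
follows. Several details are assumptions made for this sketch and
are not taken from FIG. 30: the bit positions of the four sub-fields
within the 12-bit selection field, the convention that a NIBBLE
SELECT of 1 picks the upper nibble, the placement of nibble 0 in the
least significant position of the key, and the modeling of each
packet context as a byte offset into the packet data. The names
decode_select, select_nibble and build_cam_key are hypothetical.

```python
from typing import Sequence, Tuple

NIBBLES_PER_KEY = 18  # 18 nibbles x 4 bits = 72-bit CAM key


def decode_select(field: int) -> Tuple[int, int, int, int]:
    """Split a 12-bit selection field using an ASSUMED layout:
    bit 0 = NIBBLE SELECT, bits 1-7 = BYTE SELECT,
    bit 8 = PROCESS PACKET DATA SELECT, bits 9-11 = CONTEXT SELECT."""
    nibble = field & 0x1
    byte = (field >> 1) & 0x7F
    use_packet = (field >> 8) & 0x1
    context = (field >> 9) & 0x7
    return nibble, byte, use_packet, context


def select_nibble(field: int, process: bytes, packet: bytes,
                  context_offsets: Sequence[int]) -> int:
    """Model one 4-bit wide multiplexor of the data path logic."""
    nibble, byte, use_packet, context = decode_select(field)
    if use_packet:
        # Assumed: the byte index is relative to the selected context.
        index = context_offsets[context] + byte
        value = packet[index] if index < len(packet) else 0
    else:
        value = process[byte]
    return (value >> 4) & 0xF if nibble else value & 0xF


def build_cam_key(fields: Sequence[int], process: bytes, packet: bytes,
                  context_offsets: Sequence[int]) -> int:
    """Concatenate the 18 selected nibbles into a 72-bit integer key."""
    if len(fields) != NIBBLES_PER_KEY:
        raise ValueError("expected 18 selection fields")
    key = 0
    for i, field in enumerate(fields):
        nib = select_nibble(field, process, packet, context_offsets)
        key |= nib << (4 * i)  # nibble 0 in the low-order position
    return key
```

Under the same assumptions, a 144 bit key could be modeled by calling
build_cam_key twice with the selection fields of two successive SCT
entries and concatenating the two results.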
Once formed, the CAM key is used to search through CAM 1810. If
there is a hit, the process yields an ARAM entry. In one
implementation, the format of an ARAM entry is as illustrated in
FIGS. 32A-32B.
The following elements of the ARAM entry format of FIGS. 32A-32B
are relevant to this discussion:
  PTI--see discussion of FIG. 2.
  TXMI--see discussion of FIG. 2.
  EQoS--see discussion of FIG. 2.
  IQoS--see discussion of FIG. 2.
  CQoS--see discussion of FIG. 2.
  PTI VALID--indicates whether ARAM-supplied PTI field is valid.
  TXMI VALID--indicates whether ARAM-supplied TXMI field is valid.
  EQoS VALID--indicates whether ARAM-supplied EQoS field is valid.
  IQoS VALID--indicates whether ARAM-supplied IQoS field is valid.
  CQoS VALID--indicates whether ARAM-supplied CQoS field is valid.
  RED--if asserted, sets the AFH RED flag.
  NEXT SCT--the next SCT address or index (depending on state of
    NEXT SCT VALID flag).
  NEXT SCT VALID--a flag that, if asserted, indicates the NEXT SCT
    field is valid.
  VLAN ID--replaces the working VLAN for the packet if REPLACE VLAN
    flag asserted (see below).
  CONT UPDATE--a 4 bit field that, if non-zero, selects one of 15
    context update registers for updating the packet context for the
    current processing cycle.
  EMIRROR--when asserted, selects egress mirroring.
  IMIRROR--when asserted, selects ingress mirroring.
  REPLACE VLAN--when asserted, specifies that the VLAN represented
    by the VLAN ID field becomes the next working VLAN for the
    packet.
In one embodiment, the current SCT and/or ARAM entries yield data
that is used to selectively update the state data for the slot.
Other resources may be accessed as well for the purpose of
retrieving data for use in updating the current state data as
described in U.S. patent application Ser. No. 10/835,271; U.S.
patent application Ser. No. 10/834,576.
In one implementation example, the state data for a slot is the
process data illustrated in FIG. 25. In one implementation, this
process data is selectively updated at the conclusion of a
processing cycle in the following order: CONTROL SET, AFH SET, and
STATS SET.
The CONTROL SET data is updated in part based on the ARAM field
CONT UPDATE. As illustrated in FIG. 33, this field, if non-zero, is
used to select one of fifteen registers in register bank 3302. A
first predetermined bit 3304a in the selected register 3303 forms
the updated value of PAGE SEL. A second predetermined bit 3304b in
the selected register 3303 forms the updated value of VLAN SEL. A
third predetermined bit 3304c in the selected register 3303 forms
the updated value of L3 SEL. In one embodiment, one or more
selected bits in the selected register 3303, such as the bit
identified with numeral 3304d, may be used to selectively update
specific context pointers to handle, for example, the situation in
which the parser did not recognize the corresponding protocol and
thus inaccurately determined the context pointer. The selected bit
may be used to replace the selected context pointer with an updated
value in this embodiment.
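By way of illustration only, the selection of a context update
register by the CONT UPDATE field may be sketched in Python as
follows. The text above says only that the bits are "predetermined",
so the bit positions used here, and the mapping of CONT UPDATE values
1-15 onto register indices 0-14, are assumptions made for this
sketch. The function name apply_cont_update is hypothetical.

```python
from typing import Dict, Sequence

# ASSUMED positions of the predetermined bits 3304a, 3304b and 3304c.
PAGE_SEL_BIT, VLAN_SEL_BIT, L3_SEL_BIT = 0, 1, 2


def apply_cont_update(cont_update: int, register_bank: Sequence[int],
                      control: Dict[str, int]) -> Dict[str, int]:
    """Return updated PAGE SEL, VLAN SEL and L3 SEL values."""
    updated = dict(control)
    if cont_update == 0:
        return updated  # a zero field selects no register
    # ASSUMED: values 1-15 select registers 0-14 of the bank.
    register = register_bank[cont_update - 1]
    updated["PAGE SEL"] = (register >> PAGE_SEL_BIT) & 1
    updated["VLAN SEL"] = (register >> VLAN_SEL_BIT) & 1
    updated["L3 SEL"] = (register >> L3_SEL_BIT) & 1
    return updated
```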
The updated PAGE SEL, VLAN SEL, and L3 SEL values form part of the
updated state data for the current slot, but they are used to
update other portions of this state data, such as the context
pointers C1-C6, and the working VLAN. An embodiment of multiplexing
logic for updating this other state data, which may be part of
processor 1802 or data path logic 1808, is illustrated in FIG. 34.
Numeral 3402a identifies page 0 context information, while numeral
3402b identifies page 1 context information. The page 0 context
information comprises the C1-C6 context pointers, up to two VLANs,
VLAN0 and VLAN1, and up to two nested L3 IP Headers, IPHDR0 and
IPHDR1. Similarly, the page 1 context information comprises the
C1-C6 context pointers, up to two VLANs, VLAN0 and VLAN1, and up to
two nested L3 IP Headers, IPHDR0 and IPHDR1.
Multiplexor 3404 selects between these two groupings of information
based on the value of PAGE SEL. If two L3 IP headers are present in
the selected page, multiplexor 3410 selects between these two
headers based on the value of L3 SEL. Similarly, if two VLANs are
present in the selected page, multiplexor 3406 selects between
these two VLANs based on the value of VLAN SEL. And multiplexor
3408 selects between the VLAN selected by multiplexor 3406 and any
ARAM-supplied VLAN based on the value of REPLACE VLAN (from the
ARAM entry).
The output of multiplexor 3408 forms the updated working VLAN in
the CONTROL SET portion of the process data. Similarly, the
selected C1-C6 context pointers output by multiplexor 3404,
identified with numeral 3412, form the updated C1-C6 context
pointers in the CONTROL SET portion of the process data, except
that the C3 context pointer may be modified if there are nested L3
headers in the selected page and the inner header is selected by
multiplexor 3410 as the current L3 header. In that case, the C3
context pointer is updated to point to the inner L3 header.
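By way of illustration only, the multiplexing logic of FIG. 34 may be
sketched in Python as follows. The PageContext structure, the
representation of L3 headers as offsets, and the function name
select_context are assumptions made for this sketch; the comments
relate each selection to the multiplexor described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class PageContext:
    pointers: List[int]    # context pointers C1-C6
    vlans: List[int]       # VLAN0 and, if present, VLAN1
    ip_headers: List[int]  # offsets of IPHDR0 and, if present, IPHDR1


def select_context(pages: Sequence[PageContext], page_sel: int,
                   vlan_sel: int, l3_sel: int, replace_vlan: bool,
                   aram_vlan: int
                   ) -> Tuple[Optional[int], List[int], Optional[int]]:
    """Return (working VLAN, C1-C6 pointers, current L3 header)."""
    page = pages[page_sel]                       # multiplexor 3404
    nested = len(page.ip_headers) == 2
    l3 = None
    if page.ip_headers:                          # multiplexor 3410
        l3 = page.ip_headers[l3_sel] if nested else page.ip_headers[0]
    vlan = None
    if page.vlans:                               # multiplexor 3406
        two = len(page.vlans) == 2
        vlan = page.vlans[vlan_sel] if two else page.vlans[0]
    working_vlan = aram_vlan if replace_vlan else vlan  # multiplexor 3408
    pointers = list(page.pointers)
    if nested and l3_sel == 1:
        # Inner header selected: C3 is updated to point to it.
        pointers[2] = page.ip_headers[1]
    return working_vlan, pointers, l3
```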
The value of LKUP COUNT in the CONTROL SET portion of the process
data is incremented by one. In one embodiment, the SCT field in
this CONTROL SET, representing the index of the next SCT entry, is
updated using the logic illustrated in FIG. 35, which may be part
of the processor 1802 or the data path logic 1808. As illustrated,
multiplexor 3502 selects between the NEXT SCT HIT and NEXT SCT MISS
values provided by the current SCT entry based on HIT, an indicator
of whether there was a CAM hit or not. If a CAM hit occurred, NEXT
SCT HIT is selected. If a CAM miss occurred, NEXT SCT MISS is
selected.
Multiplexor 3504 selects between the selected SCT-supplied next SCT
index output by multiplexor 3502 and the ARAM-supplied next SCT
index (NEXT SCT) based on the logical ANDing of HIT and the
ARAM-supplied NEXT SCT VALID field. In other words, if there was a
CAM hit and the ARAM-supplied next SCT index is valid, the
ARAM-supplied next SCT index (NEXT SCT) is selected. Otherwise, the
selected SCT-supplied next SCT index (output by multiplexor 3502)
is selected. The selected value output by multiplexor 3504 forms
the SCT field in the CONTROL SET portion of the process data.
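By way of illustration only, the two-stage selection of the next SCT
index shown in FIG. 35 may be sketched in Python as follows. The
function and parameter names are hypothetical; the logic follows the
description above, with the first stage choosing between the
SCT-supplied values and the second stage letting a valid
ARAM-supplied value override that choice on a CAM hit.

```python
from typing import Optional


def next_sct_index(next_sct_hit: int, next_sct_miss: int, hit: bool,
                   aram_next_sct: Optional[int] = None,
                   aram_next_sct_valid: bool = False) -> Optional[int]:
    """Return the index of the next SCT entry."""
    # First stage (multiplexor 3502): pick the SCT-supplied index.
    sct_choice = next_sct_hit if hit else next_sct_miss
    # Second stage (multiplexor 3504): select is HIT AND NEXT SCT VALID.
    if hit and aram_next_sct_valid:
        return aram_next_sct
    return sct_choice
```

Note that the ARAM-supplied index is ignored on a miss even if its
valid flag happens to be set, since the select signal is the logical
AND of HIT and NEXT SCT VALID.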
The updating of the AFH SET portion of the process data will now be
described. FIG. 36 illustrates an embodiment in which logic 3602
updates priority-based values within this AFH SET, such as PTI,
IQoS, EQoS, CQoS, EMS/EMM, TXMI, and LAI. This logic, which may
either be part of processor 1802 or data path logic 1808, is
configured to update the current value of a priority-based element
3604, such as PTI or TXMI, if two conditions are met. First, the
next potential value 3606 of this element must be valid. Second, the
priority 3608 of the next potential value must exceed the priority
3610 of the current value 3604. If these two conditions are met, the
next potential value 3606 replaces the current value 3604 in the
state data, and the priority 3608 of the next potential value
replaces the priority 3610 in the state data.
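By way of illustration only, the two-condition update performed by
logic 3602 may be sketched in Python as follows. The PrioritizedValue
structure and the function name update_element are assumptions made
for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrioritizedValue:
    value: int     # e.g. the current PTI or TXMI (element 3604)
    priority: int  # priority of that value (element 3610)


def update_element(current: PrioritizedValue, next_value: int,
                   next_priority: int,
                   next_valid: bool) -> PrioritizedValue:
    """Replace the current value only if the next potential value is
    valid and its priority strictly exceeds the current priority."""
    if next_valid and next_priority > current.priority:
        return PrioritizedValue(next_value, next_priority)
    return current
```

Because the comparison is strict, a candidate whose priority merely
equals the current priority leaves the state data unchanged.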
In one implementation, the specific manner of updating several
elements of the AFH SET proceeds as follows:
  PTI--the possible sources of the next PTI field include an ARAM
    entry, if any, corresponding to a CAM hit, and one or more of
    the Exception Handlers. The ARAM-supplied PTI value has a
    priority determined by the current SCT entry, and the priority
    of any Exception Handler value is supplied by the Exception
    Handler. The next PTI is taken to be the PTI value from any of
    these sources that has the highest priority that exceeds the
    current priority. If there is a tie, the first value is used. If
    there is no CAM hit, a default PTI value is obtained from one or
    more of the Exception Handlers. This default value only
    supplants the current PTI if its priority exceeds that of the
    current PTI.
  IQoS--the possible sources of the next IQoS field include any of
    802.1p, MPLS, or ToS QoS mapping (if enabled by the current SCT
    entry), the PST (or VST), and the current ARAM entry (assuming a
    CAM hit). The SCT supplies the priority associated with the
    ARAM-supplied IQoS. A 4-bit PST (or VST) resident field is used
    to select a QoS Priority control structure from 16 possible
    structures. This structure indicates the priority for the PST,
    VST, 802.1p, MPLS, and ToS IQoS values. The next IQoS value is
    taken to be the IQoS value from any of these sources that has
    the highest priority that exceeds the current priority. If there
    is a tie, the first value is used. In the case of MPLS parallel
    label processing, as described in U.S. patent application Ser.
    No. 10/835,271, parallel IQoS mappings are performed for each of
    the MPLS labels, and an ARAM-supplied field (the MPLS field) is
    used to select the next IQoS value from these parallel
    operations.
  EQoS--EQoS updating is performed the same way as IQoS, but using
    an independent set of resources. In one mode of operation, the
    least significant bits of the EQoS value encode the following
    egress side decisions:
      None.
      Pre-emptive Kill.
      Normal Kill.
      Thermonuclear Kill.
      Egress Mirror Copy.
      Pre-emptive Intercept (to CPU or host).
      Normal Intercept (to CPU).
  CQoS--CQoS updating is performed the same way as IQoS, but using
    an independent set of resources. The assertion of a CQoS valid
    flag for any resource that wins the priority contest causes a
    copy of the packet to be sent to the CPU regardless of the
    setting of any CPU_Copy or CPU_Alert flags.
  EMS/EMM--EMS/EMM updating is performed the same way as IQoS, but
    using an independent set of resources.
  TXMI--assuming a CAM hit, the SCT-supplied priority of the
    ARAM-supplied TXMI value is compared with the current priority,
    and if it exceeds the current priority, the ARAM-supplied TXMI
    value becomes the next TXMI value.
  LAI--the next LAI may be supplied by two possible methods. First,
    if the ARAM-supplied LAI VALID field is asserted, the next LAI
    value is taken to be the value of the ARAM-supplied LAI field.
    Second, the next LAI value may be accumulated over one or more
    of the processing cycles using a hash-based lookup scheme as
    described in U.S. patent application Ser. No. 10/834,566.
The process of updating values in the STATS SET portion of the
process data, and the process of updating the statistics data
structures as maintained in the Statistics RAM 146 at the end of a
processing cycle is described in U.S. patent application Ser. No.
10/834,573.
FIG. 37 illustrates one embodiment 3700 of a method of performing
pipelined processing of one or more packets in a pipeline having a
predetermined number of slots for placement of packet data. The
method comprises step 3702, loading each of one or more empty ones
of the slots of the pipeline with available packet data. The method
further comprises step 3704, processing the data in each of one or
more filled ones of the slots in sequence during a cycle of
processing, and also processing the data in each of one or more
filled ones of the slots for a predetermined number of cycles of
processing. The method further comprises step 3706,
unloading the data in each of one or more filled ones of the slots
upon or after the data in the filled slot has undergone the
predetermined number of cycles of processing, and deriving
classification or forwarding information for the packet from
related state information for the packet.
In one embodiment, the predetermined number of slots in the
pipeline is fixed. In another embodiment, it is a programmed
variable. In one implementation, the step of loading the pipeline
comprises filling one or more unfilled ones of the slots with
packet data as obtained from a queue. In one example, the step
further comprises bypassing one or more unfilled ones of the slots
if and while the queue is empty.
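By way of illustration only, the load, process and unload steps of
FIG. 37 may be sketched in Python as follows. This is a simplified
software model, not the hardware pipeline: the function name
run_pipeline, the use of a dictionary for per-slot state, and the
caller-supplied process callback are all assumptions made for this
sketch.

```python
from collections import deque
from typing import Any, Callable, Dict, Iterable, List, Tuple

State = Dict[str, Any]


def run_pipeline(packets: Iterable[Any], num_slots: int,
                 cycles_per_packet: int,
                 process: Callable[[Any, State], State]
                 ) -> List[Tuple[Any, State]]:
    """Process every packet for a fixed number of cycles (at least 1)."""
    queue = deque(packets)
    slots: List[Any] = [None] * num_slots  # [packet, state, cycles done]
    results: List[Tuple[Any, State]] = []
    while queue or any(slot is not None for slot in slots):
        for i in range(num_slots):          # visit the slots in sequence
            if slots[i] is None:
                if not queue:
                    continue                # bypass: nothing to load
                slots[i] = [queue.popleft(), {}, 0]   # step 3702: load
            packet, state, done = slots[i]
            state = process(packet, state)            # step 3704: process
            done += 1
            if done == cycles_per_packet:
                results.append((packet, state))       # step 3706: unload
                slots[i] = None
            else:
                slots[i] = [packet, state, done]
    return results
```

In this model the state returned at unload time stands in for the
classification or forwarding information derived for the packet.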
In one implementation example, the packet data loaded into a slot
is an identifier of the packet as stored in a buffer. In another
implementation example, the state data relating to a packet is
stored in a slot along with the packet data corresponding to the
packet.
In one configuration, the related state data for a packet is
control data, such as pipeline management data, or packet process
state data. In one example, the control data is static packet
information. In another example, the related state data is packet
classification/forwarding information, such as priority-based
packet classification/forwarding information or non-priority-based
packet classification/forwarding information. The related state
data may also comprise one or more "sticky" flags relating to the
packet, or statistical information relating to the packet,
including statistical information relating to each of a plurality
of processing cycles performed on the corresponding packet
data.
FIG. 38 illustrates an embodiment 3800 of a method of processing
the data in a filled slot of the pipeline during a processing
cycle. As illustrated, in this embodiment, the method comprises
step 3802, accessing one or more resources responsive to current
working state data corresponding to the slot. The method also
comprises step 3804, retrieving data from one or more of the
resources. In one implementation, this step comprises retrieving an
SCT entry using an SCT index as obtained from the working state
data, deriving a CAM key from this entry, and using this CAM key to
perform a CAM search. If the search results in a hit, a
corresponding ARAM entry is retrieved. The data in the SCT and/or
ARAM entries form the data retrieved in step 3804. In one
implementation, data from other resources besides the SCT and ARAM
are retrieved in this step, including but not limited to QoS
mapping tables, PST, VST or VPST tables, Exception Handlers,
etc.
The method further comprises step 3806, selectively updating the
working state data responsive to the data retrieved in step
3804.
Preferred Embodiments of the Invention
FIG. 39 illustrates an embodiment 3900 of a system for deriving a
quality of service indicator for a packet. In this embodiment, a
register 3906 holds a control element. A first resource 3908a is
configured to derive a first candidate quality of service indicator
for the packet from packet processing state data. In the example
illustrated, this packet processing state data comprises data 3902
representative of at least a portion of the packet, data 3904
representative of at least a portion of the state of processing of
the packet, or a combination of the data 3902 and data 3904.
A second resource 3908b is configured to derive a second candidate
quality of service indicator for the packet from packet processing
state data. Again, in the example shown, the packet processing
state data comprises data 3902 representative of at least a portion
of the packet, data 3904 representative of the state of processing
of the packet, or a combination of the data 3902 and data 3904. The
second resource 3908b is also configured to derive the second
candidate quality of service indicator responsive to at least a
portion of the control element held in the register 3906. (The
first resource 3908a may or may not be configured to derive the
first candidate quality of service indicator responsive to at least
a portion of the control element held in the register 3906).
Resolution logic 3910 is configured to derive a quality of service
indicator 3914 for the packet from the first and second candidate
quality of service indicators 3912a, 3912b for the packet as
derived by the first and second resources 3908a, 3908b.
In one implementation, the system further comprises a memory 3916,
and the control element held in the register 3906 is selected from
a plurality of control elements 3918a, 3918b, 3918c held in the
memory 3916. In one implementation example, the plurality of
control elements is a sequence of control elements, such as but not
limited to a sequence of software commands or instructions that
define a processing or program sequence for the packet.
In one embodiment, the first resource 3908a comprises logic for
mapping the packet processing state data into the first candidate
quality of service indicator through a lookup table or the
like.
In one embodiment, the second resource 3908b comprises logic for
mapping packet processing state data into the second candidate
quality of service indicator responsive to at least a portion of
the control element. In one implementation, the logic comprises a
plurality of lookup tables, and one of these lookup tables is
selected responsive to a predetermined field of the control element
held in the register 3906 that specifies a mapping mode selected
from a plurality of possible mapping modes.
In a second embodiment, the second resource 3908b comprises logic
for searching for a corresponding quality of service indicator
using a key derived from packet processing state data responsive to
at least a portion of the control element held in the register
3906. In one implementation, one or more fields of this control
element selects a subset of the packet processing state data, and
this subset forms a key that is used to search a CAM for a
corresponding entry, i.e., an entry having a tag portion that
matches the value of the key. If such an entry is found, the
content portion of the corresponding entry either forms the quality
of service indicator, or forms the address of the quality of service
indicator as held in another memory, e.g., ARAM.
In one embodiment, the resolution logic 3910 comprises a packet
processor. In one implementation, the first resource 3908a is
configured to derive a first priority for the first candidate
quality of service indicator, the second resource 3908b is
configured to derive a second priority for the second candidate
quality of service indicator, and the packet processor is
configured to derive a quality of service indicator for the packet
from the first and second candidate quality of service indicators,
and the first and second priorities.
In one example, the packet processor maintains a current quality of
service indicator and priority for the packet, and is configured to
replace the current quality of service indicator for the packet
with the first candidate quality of service indicator if the
priority of the first candidate quality of service indicator
exceeds that of the current quality of service indicator and that
of the second candidate quality of service indicator. In this
example, the packet processor is also configured to replace the
current quality of service indicator for the packet with the second
candidate quality of service indicator if the priority of the
second candidate quality of service indicator exceeds that of the
current quality of service indicator for the packet and that of the
first candidate quality of service indicator.
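By way of illustration only, the resolution between the current
indicator and two candidates may be sketched in Python as follows.
The tuple representation (indicator, priority), the use of None for
a resource that supplies no candidate, the assumption that priorities
are non-negative integers, and the function name resolve are all
assumptions made for this sketch. The sketch follows the two
replacement conditions literally, so when both candidates have equal
priority neither strictly exceeds the other and the current indicator
is retained; an implementation that resolves such ties in favor of
one candidate would differ on that point.

```python
from typing import Optional, Tuple

Candidate = Tuple[int, int]  # (quality of service indicator, priority)


def resolve(current: Candidate, first: Optional[Candidate],
            second: Optional[Candidate]) -> Candidate:
    """Return the indicator and priority to hold after this cycle."""
    floor = -1  # below any real priority (priorities assumed >= 0)
    p1 = first[1] if first is not None else floor
    p2 = second[1] if second is not None else floor
    if first is not None and p1 > current[1] and p1 > p2:
        return first
    if second is not None and p2 > current[1] and p2 > p1:
        return second
    return current
```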
In one example, the quality of service indicator for the packet is
an egress quality of service indicator. In a second example, the
quality of service indicator for the packet is an ingress quality
of service indicator. In a third example, the quality of service
indicator for the packet is a host quality of service indicator. In
a fourth example, the quality of service indicator for the packet
is a multi-dimensional quality of service indicator comprising
ingress, egress, and host quality of service indicator
components.
In one embodiment, the system comprises three or more resources,
3908a, 3908b, 3908c, each configured to derive a candidate quality
of service indicator for the packet, wherein the resolution logic
3910 is configured to derive the quality of service indicator for
the packet from the candidate quality of service indicators derived
by the three or more resources. The details of this third resource
3908c are not important to this embodiment; one of skill in the art
would appreciate that this third resource may be configured to
provide a candidate quality of service indicator through mapping,
searching, or a combination of mapping and searching. Furthermore, one
of skill in the art would appreciate that this third resource may
provide a candidate quality of service indicator that is or is not
responsive to control information (such as from the control element
held in the register 3906). Other examples are possible, so nothing
in the foregoing should be taken as limiting.
FIG. 40 illustrates an implementation example 4000 of a system for
deriving a quality of service indicator for a packet. In this
implementation, the quality of service indicator is
multi-dimensional, and has four components. The first component,
IQoS, is an ingress quality of service indicator; the second, EQoS,
is an egress quality of service indicator; the third, CQoS, is a
host quality of service indicator; and the fourth, EMRK SEL and EMRK
MASK, together form an indicator of a
maskable, quality of service related, egress marking operation to
be performed on the packet.
In this example, the system provides data 4002 representative of a
packet and data 4004 representative of the working state of
processing of the packet. In this example, the data 4002 is the
parsed data representative of the packet as stored in Parser Result
RAM 1820 in FIG. 18, and the state data 4004 is the data
illustrated in FIGS. 22-25. The control element held in the
register in this example is the SCT entry (illustrated in FIG. 28)
for the current processing slot.
VLAN state table 4008 (VST) comprises a first resource that maps
the current VLAN identifier for the packet (the VLAN field
illustrated in FIG. 22) into a first candidate quality of service
indicator 4010. FIG. 41A illustrates the format of one example of
the VST, and FIG. 41B illustrates the format of a VST entry in this
example. In FIG. 41B, the EQOS, IQOS, and CQOS fields are,
respectively, the egress, ingress and host components of the
candidate quality of service indicator, and the EQOS VALID, IQOS
VALID, and CQOS VALID fields are flags indicating respectively
whether the EQOS, IQOS and CQOS fields are valid. The EMRK SEL and
EMRK MASK fields together form the packet marking component of the
candidate quality of service indicator.
The QOS SEG field, identified with numeral 4012 in FIG. 40,
specifies one of sixteen possible QoS segment values for the
packet. This QoS segment value forms an input to QoS priority table
4014. An example of the format of this table and an example of the
format of an entry in this table are discussed below. Suffice it to
say here that in this example this table provides, with one
exception, a priority value 4016 for each of the candidate quality
of service indicators produced by the various resources in the
system. The one exception is that the priority for the CAM-based,
ARAM-supplied candidate quality of service indicators is provided
by the current SCT entry (FIG. 28). More specifically, the EQOS
PRIORITY, IQOS PRIORITY, and CQOS PRIORITY fields provide the
priority, respectively, for the CAM-based, ARAM-supplied EQoS,
IQoS, and CQoS indicators.
In FIG. 40, QoS mapping tables 4018 form a second resource for
deriving a candidate quality of service indicator. In this example,
four QoS mapping tables are provided, the Vpri QoS mapping table,
the MPLS Exp QoS mapping table, the IP v4 ToS QoS mapping table,
and the IP v6 ToS QoS mapping table. Assuming the QoS mapping mode
is activated (determined by the setting of the SCT QOS MAP OP
control bit illustrated in FIG. 28), one of these tables is
selected using the QOS MAP field (FIG. 28) of the SCT entry for the
current processing slot (the QOS MAP field is illustrated in FIG.
40 with numeral 4020). The value of this field is provided as an
input to logic implementing the QoS mapping tables over one or more
signal lines 4022. A value of 0 selects the Vpri QoS table, a value
of 1, the MPLS Exp table, a value of 2, the IP v4 ToS table, and a
value of 3, the IP v6 ToS table.
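By way of illustration only, the selection of a QoS mapping table by
the QOS MAP field may be sketched in Python as follows. The value to
table assignment is taken from the description above; the function
name select_qos_table and the use of table names as strings are
assumptions made for this sketch.

```python
from typing import Optional

# QOS MAP field value -> selected QoS mapping table.
QOS_MAP_TABLES = {
    0: "Vpri",
    1: "MPLS Exp",
    2: "IP v4 ToS",
    3: "IP v6 ToS",
}


def select_qos_table(qos_map_op: bool, qos_map: int) -> Optional[str]:
    """Return the selected table name, or None if the QoS mapping mode
    is not activated by the SCT QOS MAP OP control bit."""
    if not qos_map_op:
        return None
    return QOS_MAP_TABLES[qos_map]
```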
The format of one example of the Vpri QoS mapping table is
illustrated in FIG. 42A. An address to this table is formed from
the QoS segment value 4012 and the most significant bits of the
Vpri (802.1p) field of the packet. Each entry in the table contains
two QoS entries. The least significant bit of the Vpri (802.1p)
field selects one of the two as the active entry.
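By way of illustration only, the addressing of the Vpri QoS mapping
table may be sketched in Python as follows. The sketch assumes a
4-bit QoS segment value (sixteen segments) and a 3-bit Vpri field,
and it assumes that the segment value occupies the high-order address
bits; the actual bit ordering is defined by FIG. 42A and is not
reproduced here. The function name vpri_lookup is hypothetical.

```python
from typing import Any, Sequence, Tuple


def vpri_lookup(table: Sequence[Tuple[Any, Any]], qos_segment: int,
                vpri: int) -> Any:
    """Return the active QoS entry for a segment and Vpri value."""
    # Address = segment (assumed high-order) + two MSBs of the Vpri.
    address = ((qos_segment & 0xF) << 2) | ((vpri & 0x7) >> 1)
    pair = table[address]    # each table entry holds two QoS entries
    return pair[vpri & 0x1]  # the Vpri LSB selects the active entry
```

Packing two QoS entries per table entry halves the number of
addressable rows, which is why only the most significant bits of the
field take part in the address. The other three mapping tables are
described as using the same scheme with their own packet fields.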
The format of one example of a Vpri QoS table entry is illustrated
in FIG. 42B. As in FIG. 41B, the EQOS, IQOS, and CQOS fields are,
respectively, the egress, ingress and host components of the
candidate quality of service indicator, and the EQOS VALID, IQOS
VALID, and CQOS VALID fields are flags indicating respectively
whether the EQOS, IQOS and CQOS fields are valid. The EMRK SEL and
EMRK MASK fields together form the packet marking component of the
candidate quality of service indicator.
The format of one example of the MPLS Exp QoS mapping table is
illustrated in FIG. 43A. An address to this table is formed from
the QoS segment value 4012 and the most significant bits of the
MPLS Exp (CoS) field of the packet (extracted from the packet by
appropriate settings of the QOS MAP SEL fields illustrated in FIG.
28). Each entry in the table contains two QoS entries. The least
significant bit of the MPLS Exp (CoS) field selects one of the two
as the active entry.
The format of one example of a MPLS Exp QoS table entry is
illustrated in FIG. 43B. As in FIG. 42B, the EQOS, IQOS, and CQOS
fields are, respectively, the egress, ingress and host components
of the candidate quality of service indicator, and the EQOS VALID,
IQOS VALID, and CQOS VALID fields are flags indicating respectively
whether the EQOS, IQOS and CQOS fields are valid. The EMRK SEL and
EMRK MASK fields together form the packet marking component of the
candidate quality of service indicator.
The format of one example of the IP v4 ToS QoS mapping table is
illustrated in FIG. 44A. An address to this table is formed from
the QoS segment value 4012 and the most significant bits of the IP
v4 ToS field of the packet (extracted from the packet through
suitable settings of the QOS MAP SEL fields illustrated in FIG.
28). Each entry in the table contains two QoS entries. The least
significant bit of the IP v4 ToS field selects one of the two as
the active entry.
The format of one example of an IP v4 QoS table entry is
illustrated in FIG. 44B. As in FIG. 43B, the EQOS, IQOS, and CQOS
fields are, respectively, the egress, ingress and host components
of the candidate quality of service indicator, and the EQOS VALID,
IQOS VALID, and CQOS VALID fields are flags indicating respectively
whether the EQOS, IQOS and CQOS fields are valid. The EMRK SEL and
EMRK MASK fields together form the packet marking component of the
candidate quality of service indicator.
The format of one example of the IP v6 ToS QoS mapping table is
illustrated in FIG. 45A. An address to this table is formed from
the QoS segment value 4012 and the most significant bits of the
IPv6 Traffic Class or IPv6 Flow Label based QoS fields of the
packet (extracted from the packet through suitable settings of the
QOS MAP SEL fields illustrated in FIG. 28). Each entry in the table
contains two QoS entries. The least significant bit of the IP v6
field selects one of the two as the active entry.
The format of one example of an IP v6 QoS table entry is
illustrated in FIG. 45B. As in FIG. 44B, the EQOS, IQOS, and CQOS
fields are, respectively, the egress, ingress and host components
of the candidate quality of service indicator, and the EQOS VALID,
IQOS VALID, and CQOS VALID fields are flags indicating respectively
whether the EQOS, IQOS and CQOS fields are valid. The EMRK SEL and
EMRK MASK fields together form the packet marking component of the
candidate quality of service indicator.
In one implementation, only one of these four tables is selected at
a time. The candidate quality of service indicator as produced by
the selected table is output on the one or more signal lines
4022.
In FIG. 40, the search logic 4024 comprises a third resource that
is configured to provide a candidate quality of service indicator
for the packet. In one example, the search logic 4024 comprises the
combination of the data path logic 1808, CAM 1810, and ARAM 1812
illustrated in FIG. 18. The CAM KEY fields 4028 from the current
SCT entry (FIG. 28) form inputs to the data path logic 1808 over
one or more signal lines 4030. Responsive thereto, the data path
logic 1808 selects a subset of the combined packet and process
data. This subset forms a key, which is input to the CAM 1810. A
search is conducted for a CAM entry having a tag portion that
matches the key.
If a hit occurs, the content portion of the entry forms the address
to the ARAM 1812. An example of the addressed entry of the ARAM has
the format illustrated in FIGS. 32A-32B. In FIGS. 32A-32B, as in
FIG. 44B, the EQOS, IQOS, and CQOS fields are, respectively, the
egress, ingress and host components of the candidate quality of
service indicator, and the EQOS VALID, IQOS VALID, and CQOS VALID
fields are flags indicating respectively whether the EQOS, IQOS and
CQOS fields are valid. The EMRK SEL and EMRK MASK fields together
form the packet marking component of the candidate quality of
service indicator. If there is a miss, this third resource does not
supply a candidate quality of service indicator for the current
processing slot.
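By way of illustration only, the hit and miss behavior of this third resource
might be modeled as follows in Python. The CAM is simplified here to an
exact-match mapping from tag to content, and the ARAM to a mapping from
address to entry; the function name search_candidate and the dictionary
representation of an ARAM entry are assumptions made for this sketch.

```python
from typing import Dict, Optional


def search_candidate(
    key: bytes,
    cam: Dict[bytes, int],
    aram: Dict[int, dict],
) -> Optional[dict]:
    """Return the ARAM entry addressed by a CAM hit, or None on a miss.

    cam maps a tag to its content portion, which serves as the ARAM address.
    """
    aram_address = cam.get(key)
    if aram_address is None:
        # Miss: no candidate is supplied for the current processing slot.
        return None
    # Hit: the content portion of the CAM entry addresses the ARAM.
    return aram[aram_address]
```

A return value of None corresponds to the case in which this resource does
not supply a candidate quality of service indicator.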
An optional port state table (PST) (not shown in FIG. 40) comprises
a fourth possible resource that maps the current ingress port
identifier for the packet (the PORT field illustrated in FIG. 22)
into a fourth possible candidate quality of service indicator. FIG.
46A illustrates an example of the format of the PST in this
example, and FIG. 46B illustrates an example of the format of a PST
entry in this example. In FIG. 46B, the EQOS, IQOS, and CQOS fields
are, respectively, the egress, ingress and host components of the
candidate quality of service indicator, and the EQOS VALID, IQOS
VALID, and CQOS VALID fields are flags indicating respectively
whether the EQOS, IQOS and CQOS fields are valid. The EMRK SEL and
EMRK MASK fields together form the packet marking component of the
candidate quality of service indicator.
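By way of illustration only, a table entry carrying these fields, and a port
lookup over such entries, might be modeled as follows in Python. The field
names follow the text; the class name QosEntry, the function name pst_lookup,
the integer field types, and the choice to return an all-invalid entry for an
unconfigured port are assumptions made for this sketch and are not taken from
FIGS. 46A-46B.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class QosEntry:
    """One candidate quality of service indicator (field widths not modeled)."""
    eqos: int = 0             # egress component
    iqos: int = 0             # ingress component
    cqos: int = 0             # host component
    eqos_valid: bool = False  # flags indicating which components are valid
    iqos_valid: bool = False
    cqos_valid: bool = False
    emrk_sel: int = 0         # packet marking component (select and mask)
    emrk_mask: int = 0


def pst_lookup(pst: Dict[int, QosEntry], port: int) -> QosEntry:
    """Map an ingress port identifier to its candidate indicator.

    Assumption: a port with no table entry yields an entry whose components
    are all marked invalid.
    """
    return pst.get(port, QosEntry())
```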
A configuration table (not shown in FIG. 40) indicates whether the
VST or PST will be active for a given ingress port. In the example
illustrated in FIG. 40, one or the other but not both of the VST
and the PST are active for a given port. The IF PST ACTIVE flag in
FIG. 22, if asserted, indicates that the PST is active for the
current processing sequence.
In one embodiment, with one exception, each of the candidate
quality of service indicators produced by the various resources is
assigned a priority by the QoS priority table 4014. The one
exception is that the priority for the CAM-based, ARAM-supplied
candidate quality of service indicators is provided by the current
SCT entry (FIG. 28). More specifically, the EQOS PRIORITY, IQOS
PRIORITY, and CQOS PRIORITY fields provide the priority,
respectively, for the CAM-based, ARAM-supplied EQOS, IQOS, and CQOS
indicators. In one example, the format of the QoS priority table is
as illustrated in FIG. 47A. The entries of this table that are
relevant to this discussion are the QoS Priority entries 4702. An
address into this table is formed by zero extending the QOS SEG
value by 8 bits, and concatenating an MSB of 1'b1 into bit [8] of
the address. An example of the format of an entry to this table is
illustrated in FIG. 47B. As illustrated, an entry in this table
separately assigns a priority to each of the EQoS, IQoS, CQoS, EMRK
SEL/MASK components as produced by all but one of the resources
that have been discussed. The exception is the ARAM-supplied QoS
components, which are supplied with priority values by the current
SCT entry. In FIG. 28, the priorities of the ARAM-supplied EQOS,
IQOS, CQOS, and EMRK QoS components are respectively provided by
the EQOS PRIORITY, IQOS PRIORITY, CQOS PRIORITY, and EMRK PRIORITY
fields.
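By way of illustration only, the formation of an address into the QoS
priority table might be sketched as follows in Python. The sketch adopts one
reading of the description above, namely that the zero-extended QOS SEG value
occupies bits [7:0] of the address and that bit [8] is set to 1; this reading,
and the function name qos_priority_address, are assumptions made for this
sketch.

```python
def qos_priority_address(qos_seg: int) -> int:
    """Form a 9-bit table address from the QOS SEG value.

    Assumed layout: bits [7:0] hold the zero-extended QOS SEG value and
    bit [8] is forced to 1.
    """
    if not 0 <= qos_seg <= 0xFF:
        raise ValueError("QOS SEG value must fit in 8 bits")
    return (1 << 8) | qos_seg
```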
In FIG. 40, the resolution logic 4032 comprises the packet
processor 1802 of FIG. 18. The packet processor resolves the
candidate quality of service indicators as received from each of
the various resources with the current QoS components (represented
by the EQOS, IQOS, CQOS, EMRK SEL, and EMRK MASK fields of FIG. 23).
If a candidate QoS component received from a resource is not
indicated as valid, it is ignored. Otherwise, the component is
considered in the resolution process. During this process, for each
QoS component that is indicated as being valid, the packet
processor 1802 compares the priority of that candidate with that of
all the other valid candidates and that of the current QoS
component, and replaces the current component with any candidate
component that has the highest priority of all the other valid
candidate components and that exceeds the priority of the current
component. If more than one valid candidate component is provided
that has a priority that exceeds that of the current component, the
first encountered candidate component is selected as the next
component. In one example, the candidate components are evaluated
in the following order: PST, VST, Vpri QoS Mapping, MPLS Exp QoS
Mapping, IP v4 ToS Mapping, IP v6 ToS Mapping, and ARAM-supplied.
The resulting QoS component and priority values, identified in FIG.
40 with numeral 4034, become the current QoS component and priority
values during the next processing slot that the packet
undergoes.
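By way of illustration only, the resolution of a single QoS component might be
sketched as follows in Python. The sketch assumes that the candidates are
presented in the evaluation order given above, that a candidate must strictly
exceed the priority of the current component to replace it, and that among
candidates of equal priority the first encountered prevails. The names
Component and resolve are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    value: int
    priority: int
    valid: bool = True


def resolve(current: Component, candidates: List[Component]) -> Component:
    """Return the component that is current for the next processing slot.

    candidates must already be in evaluation order (for example PST first
    and ARAM-supplied last).
    """
    best = current
    for cand in candidates:
        if not cand.valid:
            continue  # a candidate not indicated as valid is ignored
        # Strict comparison: the current component survives a tie, and an
        # earlier candidate beats a later one of equal priority.
        if cand.priority > best.priority:
            best = cand
    return best
```

In such a sketch, resolve would be invoked once for each of the EQOS, IQOS,
CQOS, and EMRK SEL/MASK components, since each carries its own priority.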
FIG. 48 is a flowchart illustrating one embodiment 4800 of a method
of deriving a quality of service indicator for a packet, the packet
having a state. In this method, step 4802 comprises holding a
control element. Step 4804 comprises deriving a first candidate
quality of service indicator for a packet from data representative
of at least a portion of the packet, data representative of at
least a portion of the state of the packet, or both. Step 4806
comprises deriving a second candidate quality of service indicator
for the packet from data representative of at least a portion of
the packet, data representative of at least a portion of the state
of the packet, or both, responsive to at least a portion of the
control element. Step 4808 comprises deriving a quality of service
indicator for the packet from the first and second candidate
quality of service indicators for the packet.
In one implementation, the method further comprises selecting the
control element from a plurality of control elements held in a
memory. In one example, the plurality of control elements comprises
a sequence of control elements. In one configuration, the sequence
of control elements comprises a sequence of software commands or
instructions that form a program or processing sequence for the
packet.
In one embodiment, the first deriving step 4804 comprises mapping
packet processing state data relating to a packet into the first
candidate quality of service indicator. In another embodiment, the
second deriving step 4806 comprises mapping packet processing state
data relating to the packet into the second candidate quality of
service indicator responsive to at least a portion of the control
element. In one implementation, the at least a portion of the
control element specifies a mapping mode selected from a plurality
of possible mapping modes.
In a second embodiment, the second deriving step 4806 comprises
searching for a corresponding quality of service indicator using a
key derived from packet processing state data relating to the
packet responsive to at least a portion of the control element. In
one implementation, the key is derived from data representative of
packet processing state data relating to the packet selected by the
values of one or more fields of the control element.
In one embodiment, the method further comprises deriving a first
priority for the first candidate quality of service indicator,
deriving a second priority for the second candidate quality of
service indicator, and deriving a quality of service indicator for
the packet from the first and second candidate quality of service
indicators, and the first and second priorities.
In one implementation, the packet has a current quality of service
indicator and priority, and the method further comprises replacing
the current quality of service indicator for the packet with the
first candidate quality of service indicator if the priority of the
first candidate quality of service indicator exceeds that of the
current quality of service indicator and that of the second
candidate quality of service indicator. In this implementation, the
method further comprises replacing the current quality of service
indicator for the packet with the second candidate quality of
service indicator if the priority of the second candidate quality
of service indicator exceeds that of the current quality of service
indicator and that of the first candidate quality of service
indicator.
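By way of illustration only, the two replacement conditions of this
implementation might be sketched as follows in Python. Each indicator is
modeled as an (indicator, priority) pair, and the function name
replace_current is hypothetical. The text does not address the case in which
the two candidates have equal priority; the sketch assumes, under a strict
reading of "exceeds," that the current indicator is then retained.

```python
from typing import Tuple

Indicator = Tuple[str, int]  # (indicator, priority); representation assumed


def replace_current(
    current: Indicator, first: Indicator, second: Indicator
) -> Indicator:
    """Apply the two replacement conditions to the current indicator."""
    cur_pri = current[1]
    # The first candidate replaces the current indicator only if its
    # priority exceeds both the current priority and the second candidate's.
    if first[1] > cur_pri and first[1] > second[1]:
        return first
    # Likewise for the second candidate.
    if second[1] > cur_pri and second[1] > first[1]:
        return second
    # Otherwise (including the assumed tie case) the current one is kept.
    return current
```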
In one example, the quality of service indicator for the packet is
an egress quality of service indicator. In a second example, the
quality of service indicator for the packet is an ingress quality
of service indicator. In a third example, the quality of service
indicator for the packet is a host quality of service indicator. In
a fourth example, the quality of service indicator for the packet
is a multi-dimensional quality of service indicator comprising
ingress, egress, and host quality of service indicator
components.
In another embodiment, the method comprises deriving three or more
candidate quality of service indicators for the packet, and
deriving the quality of service indicator for the packet from the
three or more candidate quality of service indicators.
In one implementation example, any of the foregoing systems and
methods may be implemented or embodied as one or more application
specific integrated circuits (ASICs).
While various embodiments of the invention have been described, it
will be apparent to those of ordinary skill in the art that many
more embodiments and implementations are possible that are within
the scope of this invention.
* * * * *