Hash collision avoidance in network routing Patent Grant Sites February 23, 2 [Google Inc.]

Hash collision avoidance in network routing

Sites February 23, 2

Patent Grant 9270592

U.S. patent number 9,270,592 [Application Number 14/163,562] was granted by the patent office on 2016-02-23 for hash collision avoidance in network routing. This patent grant is currently assigned to Google Inc.. The grantee listed for this patent is Google Inc.. Invention is credited to Richard Lee Sites.

United States Patent	9,270,592
Sites	February 23, 2016

Hash collision avoidance in network routing

Abstract

Network device and method for routing a packet and setting up a new flow. The device includes a packet classifier, a field-selection table, a hash module, and a routing table. A packet is routed by finding an entry in the field-selection table using the packet classifier, selecting bits from the packet based on the entry in the field-selection table, and hashing the selected bits along with an identifier from the packet classifier or the field-selection table, using the hash module. The hash result is used to locate instructions in the routing table. When setting up a new flow, the hash module result may point to an existing entry in the routing table. In such instances, a new entry is added to the packet classifier, such that the hash module will produce a different result that points to an available entry in the routing table.

Inventors:

Sites; Richard Lee (Mountain View, CA)

Applicant:

Name	City	State	Country	Type
Google Inc.	Mountain View	CA	US

Assignee:

Google Inc. (Mountain View, CA)

Family ID:

55314791

Appl. No.:

14/163,562

Filed:

January 24, 2014

Current U.S. Class:	1/1
Current CPC Class:	H04L 45/745 (20130101); H04L 45/7453 (20130101); H04L 45/7457 (20130101)
Current International Class:	H04L 12/743 (20130101); H04L 12/741 (20130101)

References Cited [Referenced By]

U.S. Patent Documents


7002965	February 2006	Cheriton
7219211	May 2007	Greene et al.
7290084	October 2007	Miller et al.
7653670	January 2010	Hasan et al.
2008/0133494	June 2008	Won-Kyoung et al.
2011/0273987	November 2011	Schlansker et al.
2013/0086017	April 2013	Chao et al.

Foreign Patent Documents


1 551 141	Jul 2005	EP
2 512 073	Oct 2012	EP

Other References

Nie, et al., IP Address Lookup Using a Dynamic Hash Function, pp. 1642-1647, Canadian Conference on Electrical and Computer Engineering, IEEE, May 2005. cited by applicant .
Pagiamtzis, et al., Content Addressable Memory (CAM) Circuits and Architectures: A Tutorial and Survey, pp. 712-727, IEEE Journal of Solid-State Circuits, vol. 41. No. 3, Mar. 2006. cited by applicant.

Primary Examiner: Yao; Kwang B
Assistant Examiner: Castaneyra; Ricardo
Attorney, Agent or Firm: Gordon; Edward A. Foley & Lardner LLP

Claims

What is claimed is:

1. A network device, comprising: a packet classifier; a field-selection table; a hash module; a routing table comprising entries each associated with a respective hash value; and a routing module configured to route a packet by: determining an entry in the packet classifier using the packet, retrieving a first identifier associated with the determined packet classifier entry, choosing a first field-selection table entry using the first identifier, wherein the first field-selection table entry specifies a first set of bits, generating a first hash module input by identifying values of the first set of bits of the packet specified by the chosen first field-selection table entry, causing the hash module to compute a first hash result based on the first hash module input and based on the first identifier, matching the first hash result to a first entry in the routing table, and obtaining processing data for the packet from the first routing table entry associated, by the matching, with the first hash result; and a maintenance module configured to resolve a collision between the first hash result associated with the first entry in the routing table and a second hash result for a new data communication flow by: adding, responsive to detecting the collision, a new entry to the packet classifier corresponding to the new data communication flow, wherein the new entry includes a new identifier that is different from the first identifier; adding, to the field-selection table, a new field-selection table entry corresponding to the new identifier, wherein the new field-selection table entry specifies a second set of bits; generating a second hash module input by identifying values of the second set of bits of a packet of the new data communication flow; causing the hash module to compute a third hash result based on the second hash module input and the new identifier; and adding, in association with the third hash result, an entry to the routing table comprising processing data associated with the new data communication flow.

2. The network device of claim 1, wherein the new entry added to the packet classifier matches at least one field of a packet associated with the new data communication flow.

3. The network device of claim 1, wherein the packet classifier is a ternary content addressable memory (TCAM).

4. The network device of claim 3, wherein the first identifier associated with the first packet classifier entry comprises an index in the TCAM.

5. The network device of claim 1, wherein the field-selection table comprises a table of byte-masks.

6. The network device of claim 1, wherein causing the hash module to compute the first hash result based on the first hash module input and based on the first identifier comprises causing the hash module to use the first identifier as a seed value.

7. The network device of claim 1, wherein the routing module is configured to match the first hash result to the first routing table entry by: deriving a routing table index from a first subset of a set of hash result bits, wherein the hash result bits are a binary representation of the hash result; locating an entry in the routing table from the routing table index; identifying a match data item stored in the entry; and verifying that the match data item is equal to a second subset of the set of hash result bits, the second subset comprising at least one bit not in the first subset.

8. The network device of claim 1, wherein the first hash module input consists of the results of an application of a mask associated with the chosen first field-selection table entry to a single contiguous block of bits from the first packet, wherein the single contiguous block of bits comprises at least a portion of a header of the first packet.

9. The network device of claim 1, the maintenance module configured to detect the collision between the first hash result and the second hash result for the new data communication flow.

10. The network device of claim 1, the routing module configured to retrieve the first identifier associated with the first packet classifier entry by retrieving an index value of the first packet classifier entry.

11. The network device of claim 1, wherein the second set of bits specified in the new field-selection table entry is different from the first set of bits specified by the first field-selection table entry.

12. A method, comprising: receiving a first packet associated with a first data communication flow; determining a first entry in a packet classifier using the received first packet; retrieving a first identifier associated with the determined first packet classifier entry; choosing a first field-selection table entry using the retrieved first identifier, wherein the first field-selection table entry specifies a first set of bits; generating a first hash module input by identifying values of the first set of bits of the received first packet specified by the chosen first field-selection table entry; causing a hash module to compute a first hash result based on the first hash module input and the retrieved first identifier; matching the first hash result to a first entry in a routing table; obtaining processing data for the first packet from the matching first routing table entry associated, by the matching, with the first hash result; detecting a collision between a second hash result for a second data communication flow and the first hash result associated with the first entry in the routing table; adding, responsive to detecting the collision, a new entry to the packet classifier corresponding to the second data communication flow, wherein the new entry includes a new identifier that is different from the first identifier; adding, to the field-selection table, a new field-selection entry corresponding to the new identifier, wherein the new field-selection entry specifies a second set of bits; generating a second hash module input by identifying values of the second set of bits of a packet of the new data communication flow; causing the hash module to compute a third hash result based on the second hash module input and the new identifier; and adding an entry to the routing table comprising processing data associated with the new data communication flow, the added entry associated with the third hash result.

13. The method of claim 12, wherein adding the new entry to the packet classifier comprises adding an entry that matches at least one field of a packet associated with the new data communication flow.

14. The method of claim 12, wherein retrieving the first identifier associated with the first packet classifier entry comprises retrieving an index value of the first packet classifier entry.

15. The method of claim 12, wherein the first hash module input consists of the results of an application of a mask associated with the chosen first field-selection table entry to a single contiguous block of bits from the received packet.

16. The method of claim 12, wherein matching the first hash result to the first routing table entry in the routing table comprises: deriving a routing table index from a first subset of a set of hash result bits, wherein the hash result bits are a binary representation of the hash result; locating an entry in the routing table from the routing table index; identifying a match data item stored in the entry; and verifying that the match data item is equal to a second subset of the set of hash result bits, the second subset comprising at least one bit not in the first subset.

17. The method of claim 12, wherein the second set of bits specified in the new field-selection table entry is different from the first set of bits specified by the first field-selection table entry.

18. A non-transitory computer-readable medium storing computer-readable instructions that, when executed by one or more computing devices, cause at least one of the one or more computing devices to: receive a first packet associated with a first data communication flow; determine a first entry in a packet classifier using the received first packet; retrieve a first identifier associated with the determined first packet classifier entry; choose a first field-selection table entry using the retrieved first identifier, wherein the first field-selection table entry specifies a first set of bits; generate a first hash module input by identifying values of the first set of bits of the received first packet specified by the chosen first field-selection table entry; cause a hash module to compute a first hash result based on the first hash module input and the retrieved first identifier; match the first hash result to a first entry in a routing table; obtain processing data for the first packet from the first routing table entry associated, by the matching, with the first hash result; detect a collision between a second hash result for a second data communication flow and the first hash result associated with the first entry in the routing table; add, responsive to detecting the collision, a new entry to the packet classifier corresponding to the second data communication flow, wherein the new entry includes a new identifier that is different from the first identifier; add, to the field-selection table, a new field-selection entry corresponding to the new identifier, wherein the new field-selection entry specifies a second set of bits; generate a second hash module input by identifying values of the second set of bits of a packet of the new data communication flow; cause the hash module to compute a third hash result based on the second hash module input and the new identifier; and add an entry to the routing table comprising processing data associated with the new data communication flow, the added entry associated with the third hash result.

19. The non-transitory computer-readable medium in claim 18, further comprising additional instructions that, when executed by one or more computing devices, cause at least one of the one or more computing devices to add the new entry to the packet classifier by adding an entry that matches at least one field of a packet associated with the new data communication flow.

20. The non-transitory computer-readable medium in claim 18, further comprising additional instructions that, when executed by one or more computing devices, cause at least one of the one or more computing devices to: retrieve the first identifier associated with the first packet classifier entry by retrieving an index value of the first packet classifier entry.

21. The non-transitory computer-readable medium in claim 18, wherein the first hash module input consists of results of an application of a mask associated with the chosen first field-selection table entry to a single contiguous block of bits from the received first packet.

22. The non-transitory computer-readable medium in claim 18, further comprising additional instructions that, when executed by one or more computing devices, cause at least one of the one or more computing devices to match the first hash result to the first routing table entry in the routing table by: deriving a routing table index from a first subset of a set of hash result bits, wherein the hash result bits are a binary representation of the hash result; locating an entry in the routing table from the routing table index; identifying a match data item stored in the entry; and verifying that the match data item is equal to a second subset of the set of hash result bits, the second subset comprising at least one bit not in the first subset.

23. The non-transitory computer-readable medium in claim 18, wherein the second set of bits specified in the new field-selection table entry is different from the first set of bits specified by the first field-selection table entry.

Description

BACKGROUND

Network devices, e.g., switches, routers, and filters, play an important role in data communications. Countless amounts of data are transferred as data packets transmitted over different networks across the world. Each data packet must be channeled from its source to its destination, and network devices play an important role in directing the traffic. In order to limit latency, it is important that network device can route traffic efficiently and accurately.

There are many different components within network devices. One component found in some network routers is a routing table. A routing table stores handling instructions, e.g., a next-hop destination or an egress port identifier, for data packets that are to be routed through the device. Each entry in the table corresponds to a route. These routing tables are sometimes stored using volatile integrated circuit memory, e.g., SRAMs. Generally, routing tables have limited capacities. In order to stay within the limited capacity, it is important that network devices store a minimal amount of data for each route in the routing table.

SUMMARY

In one aspect, the disclosure relates to a network device. The network device includes a packet classifier, a field-selection table, a hash module, a routing table, and a routing module configured to route a packet. The routing module is configured to determine an entry in the packet classifier using a received packet, retrieve an identifier associated with the determined packet classifier entry, choose a field-selection table entry using the retrieved identifier, generate a hash module input by identifying a set of bits of the packet based on the chosen field-selection table entry, cause the hash module to compute a hash result based on the generated hash module input and based on the retrieved identifier, match the hash result to an entry in the routing table, and obtain processing data for the data packet from the matching routing table entry.

In another aspect, the disclosure relates to a method. The method includes receiving a packet from a source, determining an entry in a packet classifier using the received packet, retrieving an identifier associated with the determined packet classifier entry, and choosing a field-selection table entry using the retrieved identifier. The method further includes generating a hash module input by identifying a set of bits of the packet based on the chosen field-selection table entry, causing a hash module to compute a hash result based on the generated hash module input and the retrieved identifier, and matching the hash result to an entry in a routing table. The method includes obtaining processing data for the packet from the matching routing table entry.

In another aspect, the disclosure relates to a non-transitory computer-readable medium storing computer-readable instructions that, when executed by one or more computing devices, cause at least one of the one or more computing devices to perform operations that include receiving a packet from a source, determining an entry in a packet classifier using the received packet, retrieving an identifier associated with the determined packet classifier entry, and choosing a field-selection table entry using the retrieved identifier. The operations further include generating a hash module input by identifying a set of bits of the packet based on the chosen field-selection table entry, causing a hash module to compute a hash result based on the generated hash module input and the retrieved identifier, and matching the hash result to an entry in a routing table. The operations further include obtaining processing data for the packet from the matching routing table entry.

BRIEF DESCRIPTION OF THE DRAWINGS

These diagrams and flowcharts are not intended to limit the scope of the present teachings in any way. The devices and methods may be better understood from the following illustrative description with reference to the following figures in which:

FIG. 1 is a diagram of an example network device;

FIG. 2 is a flowchart of an example method for routing a packet using the network device shown in FIG. 1;

FIG. 3A is the layout for a typical TCP/IPv4 packet header, including the Ethernet frame;

FIG. 3B is the layout for a typical TCP/IPv6 packet header, including the Ethernet frame;

FIG. 4 shows an example packet classifier and an example field-selection table, as used by the network device shown in FIG. 1;

FIG. 5 shows an example of the hash module used in the network device shown in FIG. 1;

FIG. 6 shows an example routing table; and

FIG. 7 is a flowchart of a method for initiating a new flow using the network device shown in FIG. 1.

Like reference numbers and designations in the various drawings indicate like elements.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The various concepts introduced above and discussed in greater detail below may be implemented in any of numerous ways, as the described concepts are not limited to any particular manner of implementation. Examples of specific implementations and applications are provided primarily for illustrative purposes.

FIG. 1 shows a diagram of an example network device 100. The illustrated network device 100 includes a routing module 101, a maintenance module 102, a packet classifier 120, a field-selection table 130, a hash module 140, and a routing table 150. The network device 100 includes network interfaces 115, through which the network device 100 can receive data packets from one or more of sources 110.sub.a-110.sub.n (generally "source 110") and can forward the data packets to any number of destinations 190.sub.a-190.sub.m (generally "destination 190"). The sources 110 and destinations 190 can each be a server, a computer, a processor, a mobile device such as a smart phone or tablet, a set-top device, or any other suitable network device. In general, a set of data packets forming a communication between a source 110 and a destination 190 constitutes a flow. In a two-way exchange between two end points, the end points act as both sources and destinations. Thus, a source 110 in one flow may be a destination 190 for another flow. In some implementations, the network device 100 is an end-host device. For example, the network device 100 can be a server that receives requests from one or more of sources 110.sub.a-110.sub.n, and responds to each request, e.g., by originating data packets to transmit, or causing another server to transmit, data packets to the requesting source 110. In some implementations, the network device 100 uses the routing module 101 to direct packets to another network device, to one of multiple computing processors or modules in the network device 100, to a particular core of a multi-core processor, to a particular hypervisor, virtual computer, or operating system, or to a specific application or service instance executing on an end-host server.

In some implementations, one or more of the various components of the network device 100 are implemented as hardware in an integrated circuit, such as an application specific integrated circuit (ASIC) or field programmable gate array (FPGA). In some implementations, one or more of the various components of the network device 100 are implemented as computer executable instructions that are executed by one or more special purpose or general purpose processors. In some implementations, one or more of the various components of the network device 100 are implemented as a mix of special purpose circuits and computer executable instructions that are executed by a processor. For example, in some implementations, the packet classifier 120 is implemented as a ternary content-addressable memory (TCAM) circuit, the hash module 140 is implemented as a special purpose hashing circuit, the field-selection table 130 and the routing table 150 are implemented using random access memory (RAM), and the routing module 101 and the maintenance module 102 are implemented as computer executable instructions that are executed by a general purpose computing processor. In some implementations, one or more of the components or modules are remote from the network device 100. For example, in some implementations, the maintenance module 102 is implemented in a separate controller, e.g., in a controller for a software-defined network (SDN). In some implementations, the field-selection table 130 and/or the routing table 150 are implemented using volatile memory, such as DRAM, SRAM, FLASH, or other integrated circuit memory. In some implementations, a computing processor is multi-core. In some implementations, the network device 100 is implemented with multiple computing processors.

In operation, the network device 100 receives a packet and determines how to handle the received packet, e.g., by identifying and forwarding the packet to a next-hop network device. The process is generally governed by the routing module 101. Each packet begins with a sequence of header bits that identify a communication protocol for the packet and any additional packet information in accordance with the communication protocol, e.g., addressing information for the packet. Two example formats are illustrated in FIGS. 3A and 3B, described below. The routing module 101 processes the initial bits of the packet to identify an entry in the routing table 150 with handling instructions for the packet. For example, the handling instruction may specify a network interface 115 to use for forwarding the packet. In some implementations, the network device 100 includes multiple computing processors and the handling instructions specify which processor to use for processing the packet. In some implementations, the network device 100 includes multiple service instances for handling network traffic, and the handling instruction may identify a particular service instance to handle the received packet.

The routing module 101 identifies the entry in the routing table 150 by processing information in the header of the packet. In general, each communication protocol defines an assignment of the header bits into meaningful fields, where each field has a number of bits and the values of those bits is the value of the respective header field. The meaning of a header field value is understood within the context of the communication protocol used. The number of bits in a field is typically specified by either the protocol or by the contents of another field in the header. For example, an address in IPv4 is represented by thirty-two bits, starting at the ninety seventh bit of the IPv4 header (as shown, for example, in the IPv4 header 340 in FIG. 3A), whereas an address in IPv6 is represented by one hundred twenty eight bits, starting at the sixty fifth bit of the IPv6 header (as shown, for example, in the IPv6 header 350 in FIG. 3B). Furthermore, a typical data packet uses multiple protocols in a layered manner. For example, the transmission control protocol (TCP) defines a communication verification protocol that relies on a separate addressing protocol such as the Internet Protocol (IP), which, in turn, relies on a framing protocol such as Ethernet. Thus, a single TCP/IP packet has at least three layers of header information. For example, as shown in FIG. 3A, a typical TCP/IPv4 packet, after a frame delimiter, begins with a fourteen-byte Ethernet frame header 320, followed by a twenty-byte IPv4 header 340, followed by a TCP header 360, which is also typically twenty bytes. Each protocol header contributes information used by the routing module 101. For example, the IPv4 header includes address information and the TCP header includes port information. Therefore, the routing module may be configured to distinguish between protocols and extract information from the different protocol headers of a packet using the correct bits, or fields of bits, of the packet.

Accordingly, still referring to FIG. 1, the routing module 101 uses a packet classifier 120 to classify the packet. In some implementations, the packet classifier 120 determines which communication protocols are used by the packet. In some implementations, the packet classifier 120 compares the initial bits of a packet to one or more patterns, each associated with an entry in the field selection table 130. If a pattern matches to the initial bits of a packet, then the associated entry in the field selection table 130 indicates which fields (sets of bits) in the packet to use. In some implementations, the pattern matching is performed using ternary-logic content-addressable memory (TCAM). In a TCAM, the pattern for each bit can match to a 1, to a 0, or to either. That is, the pattern can designate some of the bits as "don't care" bits that will satisfy a comparison with the pattern regardless of the value of those particular bits. The "don't care" bits are typically indicated by an "x". In some implementations, the patterns are ordered in such that if multiple patterns can may match to the same packet, the first pattern in the ordering is used. The last pattern can then be a generic pattern that will match all packet headers and is associated with a default rule. In some implementations, the packet classifier 120 returns an identifier for an entry in the field selection table 130. In some implementations, the packet classifier 120 returns two values: a classification result and an identifier for an entry in the field selection table 130. In some implementations, the classification result is unique to the pattern matched.

The field selection table 130 is used to identify which data fields in the header of a packet to process based on the determined classification. In some implementations, the field selection table 130 specifies, for each communication protocol or combination of communication protocols, which bits of the header(s) to use. The selected bits are used as part of an input to the hash module 140. For example, an entry in the field selection table 130 may be a bit mask that, when applied to the packet header with a logical AND operation, zeros-out or clears the header bits that are not selected (effectively leaving only the selected field values). In some implementations, the routing module 101 extracts the results into a data structure, an n-tuple, holding the selected field data. In such implementations, the n-tuple is passed to the hash module 140. In some implementations, the bit mask identified by an entry in the field selection table 130 is applied to the header, or to a single contiguous block of bits from the header, and the result is passed to the hash module 140. That is, the entire packet prefix, or a single contiguous block of bits from the packet, is used with bits for non-selected fields simply set to a constant, e.g., zero. In some implementations, the bit mask is stored in a compressed manner, where each bit of the mask represents multiple bits, e.g., one bit in a byte-mask represents eight bits of a bit-mask. In some implementations, an entry in the field selection table 130 identifies specific bits, or sets of bits, to use from the header. In some implementations, an entry in the field selection table 130 identifies specific sets of bits using a compressed encoding wherein each bit of the encoding corresponds to a range of bits in the header. For example, if a bit is "on" (i.e., set to 1), then the corresponding range (e.g., the n.sup.th octet or byte, or the bits from bit x to bit y of the header) is extracted. In some implementations, the resulting value(s) is copied into a memory register.

The network device 100 includes a hash module 140, used to hash an input value (or values). For example, the hash module 140 may be used to hash bit values selected from a packet's header data, as indicated by the entry in the field selection table 130. In some implementations, values in addition to the bits selected from a packet header are included in the input values to the hash module 140. The hash module 140 processes the input values and produces a hash value. In some implementations, the hash value is of a fixed bit-length. In some implementations, an identifier associated with a result from the packet classifier 120 is included in the input values. For example, in some implementations, the classification result is included with the bit values from the packet's header data. In some implementations, the hash module 140 produces a hash result value from a sequential stream of input values. For example, the hash module may accept any number of input values. In some such implementations, each input value is limited to a predetermined number of bits (e.g., 32 or 64). In some such implementations, the hash module result is updated as each input value is received, such that the result is impacted by every bit received as input. In some implementations, the hash module is placed in an initial state prior to generating a result. In some implementations, the first input to the hash module is a seed value. In some implementations, the hash module implements a hash algorithm in special hardware, e.g., in an ASIC or FPGA.

The routing module 101 uses the resulting hash value, or a portion of the resulting hash value, to select an appropriate entry in the routing table 150. In some implementations, the hash module 140 produces a 2N-bit (e.g., 32 bit) hash value and the routing module 101 only uses the lower order (or, alternatively, the higher order) N bits (e.g., 16 bits) of the hash value. In some implementations, the routing table 150 is stored in a manner facilitating fast look-ups using the hash value. For example, in some implementations, the routing table 150 may be keyed to, or indexed by, the hash module 140 output values. In some implementations, the routing table 150 may be implemented as an array, two-way associative array, four-way associative array, n-way associative array, or successive-row-lookup table.

The maintenance module 102 maintains, and modifies as necessary, the contents of the packet classifier 120, field selection table 130, and routing table 150. In some implementations, the maintenance module is implemented as hardware in an integrated circuit, such as an ASIC or FPGA. In some implementations, the maintenance module 102 is implemented as computer executable instructions that are executed by a special purpose or general purpose processor. In some implementations, the maintenance module 102 is on a separate network controller, such as an SDN controller, remotely maintaining the components of the network device 100. In some implementations, the maintenance module 102 updates the routing table 150 for each new packet flow. In some implementations, the maintenance module 102 updates the routing table 150 for a new packet flow if the new flow meets certain requirements. For example, in some implementations, the maintenance module 102 updates the routing table 150 to ensure a consistent route when a flow indicates a need for a certain quality of service (QoS) or in-order delivery. In some implementations, the maintenance module 102 updates the packet classifier 120. For example, in some implementations, the maintenance module 102 updates the packet classifier 120 with an additional classifier pattern used to differentiate between two distinct packet flows that result in the same hash result value from the hash module 140. The new pattern will have a new, internally unique, classification result, and may be associated with an existing entry in the field selection table 130 or may be associated with a new entry in the field selection table 130.

FIG. 2 is a flowchart of an example method 200 for routing a packet using the network device shown in FIG. 1. In brief overview, the method 200 includes receiving a packet (step 210), retrieving an identifier associated with a packet classifier entry (step 220), and choosing a field selection table entry (step 230). The method further includes generating a first hash module input by identifying a set of bits of the received packet (step 240), hashing the first hash module input together with the identifier (step 250), matching the hash result to an entry in the routing table (step 260), and obtaining a routing instruction from the routing table entry (step 270). The network device then processes the packet according to the obtained instruction.

As indicated above, the method 200 begins with receiving a packet (step 210). The packet can be received from any of the sources 110 connected to the network device 100. The packet can be received via a wired network connection or wirelessly. In general, each packet begins with a sequence of header bits that identify a communication protocol for the packet and any additional packet information in accordance with the communication protocol, e.g., addressing information for the packet. A data packet includes, after the header bits, additional data bits referred to as the payload. The payload may encapsulate another packet, e.g., a packet in a format of another protocol. The payload for an encapsulated packet begins with another header. Thus, as shown in FIGS. 3A and 3B, an Ethernet packet may encapsulate an IP (IPv4 or IPv6) packet, which may encapsulate a TCP packet (or a UDP packet, or an ICMP packet, or any other protocol packet). Each encapsulation is a layer, and the header for each layer specifies information that may be useful in determining how to handle the packet. For example, the IP layer includes a source address, a destination address, and a protocol indicator for the next-layer protocol of an encapsulated packet (e.g., 1 for ICMP, 6 for TCP, 17 for UDP, etc.). Similarly, the TCP layer includes identifiers for a source port and a destination port, and also includes control flags indicating a flow state, e.g., a synchronization (SYN) flag used to initiate a flow and a final packet (FIN) flag used to terminate an existing flow.

Continuing with FIG. 2, after receiving a packet (step 210), the routing module 101 retrieves an identifier for the packet using the packet classifier 120 (step 220). A packet classifier 120 matches a received packet with a packet classifier entry. In some implementations, the packet classifier 120 will parse the packet header to determine specific information associated with the packet, such as protocol, source IP address and destination IP address. In some implementations, the packet classifier 120 will compare the packet header to one or more classification patterns. For example, in some implementations, the packet classifier 120 is a TCAM. In general, a packet may satisfy conditions for multiple possible classifications. In some implementations, the packet classifier 120 prioritizes or orders the classification patterns such that the packet is classified according to the highest priority (highest order or "first") pattern it satisfies. For example, the packet classifier 120 may have a low order generic pattern for all IPv4 packets, a higher order pattern for all TCP/IPv4 packets, and a higher order pattern for TCP/IPv4 packets addressed to a particular address or range of addresses (e.g., a sub-net). If a TCP/IPv4 packet arrives addressed to an address in that range, it would satisfy all three patterns, but the prioritization determines that it should be classified using the highest order pattern. In some implementations, the packet classifier 120 has a maximum number of entries. In some implementations, a smaller number of entries may result in a reduced level of electrical power consumption. For example, where the packet classifier 120 is implemented as a TCAM, a TCAM with at most 128 or 256 entries will use significantly less power than a TCAM with thousands of entries.

The routing module 101 uses the retrieved identifier to identify an entry in the field-selection table (step 230). Each entry in the field-selection table 130 indicates how to parse or process the header information for a packet. In some implementations, an entry indicates which bits (or sets of bits) of the packet are to be used as input values to a hash module 140. In some implementations, multiple packet classifier entries may correspond to a same field-selection table entry. In some implementations, the packet classifier 120 is implemented as a TCAM and each entry in the TCAM corresponds to (or indexes) an entry in the field-selection table 130. In some such implementations, the entries in the field-selection table 130 are data structures including an identifier (e.g., an identifier for an entry in the packet classifier 120) and a field selection indicator (e.g., a bit selection pattern, as described above). In some implementations, the identifier is an arbitrary number selected as a "Seed" value that, when passed to the hash module 140 as an input value, causes the hash module 140 to generate a particular hash result value (or causes the hash module 140 to generate a hash result value other than a particular hash result value). In brief, as discussed in more detail below, in some implementations, the hash result value corresponds to an entry in the routing table 150 and the Seed value may be selected so that hash result value corresponds to a particular entry in the routing table 150. Thus, in some implementations, at steps 220 and 230, the routing module 101 uses the packet classifier 120 to identify an entry in the field-selection table that specifies an identifier or Seed value and a bit-selection instruction.

The routing module 101 then selects bits from the header(s) of the received packet (step 240) based on the bit-selection instruction from the entry in the field selection table. One or more of the fields of the packet header may be selected. For example, the entry may indicate selection of bits representing the packet's source IP address, destination IP address, next level protocol, destination port, and source port. As another example, the entry may indicate selection of bits representing the packet's destination IP address sub-net (e.g., the first 24 bits of an IPv4 address), next level protocol, destination port, and the TCP synchronization control flag (SYN). In some implementations, the routing module 101 extracts the designated bits from the packet header and passes the designated bits to a hash module 140 as input. In some implementations, the routing module 101 identifies a single contiguous block of bits from the packet, applies a bit mask to the block, the bit mask identified in the entry in the field selection table, and passes the result to a hash module 140 as input. For example, the single contiguous block of bits may be the first forty-four bytes (three hundred fifty two bits) after the Ethernet header, which is sufficient to include the twenty bytes of an IPv4 header or the forty bytes of an IPv6 header, and the first few bytes of an encapsulated header. In some implementations, the routing module 101 also passes the identifier (from step 220) or Seed value (from step 230) to the hash module 140 as input.

The routing module 101 uses the hash module 140 to determine a hash value for the input values (step 250). In general, the hash module 140 accepts the bits selected from the packet header, and any additional input bits (e.g., an identifier or Seed value), as input. The hash module 140 implements a hash function which generates a hash value based on the input values. In some implementations, the hash module 140 calculates the hash result using a hash function such as MD5, Jenkins, or MurMur. In some implementations, the hash module 140 uses a table of random numbers for generating hash values. In some implementations, the hash module 140 uses a linear-feedback shift register (LFSR). Typically, a hash function uses every input value such that a change in any one input bit will result in a different output value. The output value for a hash function is typically represented with fewer bits than the input value. For example, the input values may be 128 bits that include two 32-bit IPv4 addresses, port information, protocol information, and a seed value, and the input values may be reduced to, for example, a 32-bit hash result value. This is a form of lossy compression, which means that, for such functions, there must be at least one output value that can be reached from at least two different input values. When this happens, there is a collision between the different input values that resulted in the same output hash value. As introduced above, in some implementations, the Seed value may be adjusted to avoid collision events. Further discussion of collisions, and methods of addressing collision events, is presented below.

Continuing to refer to FIG. 2, in the method 200, the routing module 101 takes the output of the hash module 140 and matches it to an entry in the routing table (step 260). In some implementations, the routing table 150 is a hash table keyed to the results of the hash module 140. In some implementations, the hash result value is an index into the routing table 150. In some implementations, the routing module 101 uses the hash result, or a portion of the hash result, to calculate an index into the routing table 150. In some implementations, an index into the routing table 150 is a memory address allowing for direct access to memory storing an entry in the routing table 150. In some implementations, the routing table 150 has a fixed number of entries such that each possible hash result value can be translated to a specific entry in the routing table 150. For example, there could by 2.sup.16 entries and the lower-order 16 bits of the hash result value identifies a respective entry. Each entry is either empty or is populated with routing instructions that corresponds to a packet with header information that results in a corresponding hash value. In some implementations, the entry identified is a generic entry for packets flowing to a subnet.

In some implementations, matching the hash result to an entry in the routing table includes verifying the match. For example, each routing table entry may include match data that can be used to confirm the entry is correct for a particular packet. Match data is described in more detail below. In some implementations, matching the hash result to an entry in a the routing table includes deriving a routing table index from a first subset of a set of hash result bits (the binary representation of the hash result), locating an entry in the routing table from the routing table index, identifying a match data item stored in the entry, and verifying that the match data item is equal to a second subset of the set of hash result bits. In some implementations, the second subset of bits includes at least one bit not in the first subset of bits. In some implementations, the first subset of bits does not intersect with the second subset of bits. In some implementations, the first subset of bits is the x lower order bits of an n-bit hash result, and the second subset of bits is the upper n-x bits of the n-bit hash result. In some implementations, another characteristic of the packet is used to verify the match.

In some implementations, the entry identified is a specific entry created for a particular flow of data packets. For example, in some implementations, a new entry is added to the routing table 150 when a new flow is detected, and the new entry indicates specific instructions for the new flow. A new flow may be detected, for example, when a TCP/IP packet arrives with the SYN flag set, indicating the beginning of a TCP handshake. In some implementations, if a new flow is detected and the resulting hash value points to (or indicates) an entry in the routing table 150 that is already in use, this indicates a collision. In some implementations, when a collision is detected, a new entry is added to the packet classifier 120 such that the identifier for the entry in the packet classifier 120 is changed, thereby altering the input to the hash module 140 and generating a new hash result value. In some implementations, a collision may be detected in other ways, as described below.

The routing module 101 obtains the routing instruction from the entry in the routing table (step 270) and processes the packet using the routing instruction. In some implementations, the routing table entry identifies a network interface 115 through which the network device 100 forwards the packet. In some implementations, the routing table entry identifies a next-hop address. In some implementations, the routing table entry includes an instruction to allow or drop the packet. In some implementations, the routing table entry includes an instruction to process the packet before forwarding, e.g., to fragment the packet or to update a time-to-live field or a hop limit field. The network device processes the packet using the routing instructions. Thus, for example, the network device can forward the packet to the proper destination 190.

FIG. 3A shows the format 314 for the headers of a typical TCP/IPv4 packet transmitted via Ethernet. In broad overview, the illustrated format includes an Ethernet frame 320, an Internet Protocol (IP) version 4 header 340, a transmission control protocol (TCP) header 360, and the beginning of the encapsulated data 380, i.e., the payload.

A TCP/IPv4 packet, as shown in FIG. 3A, begins with a new packet preamble and delimiter, most of which is not shown. After the delimiter, an Ethernet frame header 320 includes a media access control (MAC) address for the packet's immediate destination (i.e., the network device receiving the packet) and a MAC address for the packet's immediate source (i.e., the network device transmitting the packet). A MAC address is 48 bits, or six 8-bit octets. The Ethernet frame header 320 also includes a 16-bit "Ethertype" indicator, which may indicate the size of the frame or the protocol for the Ethernet payload (i.e., the next level protocol). The Ethernet frame header 320 is followed by the Ethernet payload, which begins with a header for the encapsulated packet.

FIG. 3A shows the format 314 for the headers of a typical TCP/IPv4 packet, thus the Ethernet frame header 320 is followed by an IPv4 header 340. The first four bits indicate the Internet Protocol version (i.e., 4). The next sets of bits indicate the header length (IHL), flags to differentiate service requirements (DSCP, used, e.g., to express quality of service (QoS) requirements), explicit congestion notification (ECN), a length for the IP packet, a packet identification shared across packet fragments, IP flags, and a fragment offset. After the packet fragmentation bits, the IPv4 header 340 indicates a time to live (TTL) for the packet, which may be measured in time (e.g., seconds) or hops (number of network devices that can forward the packet). After the TTL, the IPv4 header 340 indicates the protocol for the next level encapsulated packet. For example, a 1 indicates the Internet control message protocol (ICMP), a 6 indicates TCP, and 17 indicates the user datagram protocol (UDP). The IPv4 header 340 further includes a header checksum, which must be recalculated every time the header changes, e.g., whenever the TTL is updated. The IPv4 header 340 next specifies a 32-bit source address and a 32-bit destination address. Additional header fields may be used, but may be omitted and are not shown in FIG. 3A.

After the IPv4 header 340, FIG. 3A shows a TCP header 360. The typical TCP header begins with a 16-bit source port identifier and a 16-bit destination port identifier. A TCP port is a virtual port, typically used to indicate the type of data in the payload so that the receiver can pass the packet to the correct application. The TCP header 360 then specifies sequencing information including a sequence number for the packet, an acknowledgement number, and a data offset. The TCP header 360 includes control flags, e.g., SYN, FIN, and ACK, and additional control information such as the window size, a checksum, and other options. The data encapsulated 380 begins after the TCP header 360.

FIG. 3B shows the format 316 for the headers of a typical TCP/IPv6 packet transmitted via Ethernet. In broad overview, the illustrated format includes an Ethernet frame 320, an Internet Protocol (IP) version 6 header 350, a transmission control protocol (TCP) header 370, and the beginning of the encapsulated data 390, i.e., the payload. The Ethernet frame 320 in the illustrated packet is identical to the Ethernet frame 320 in FIG. 3A.

FIG. 3B shows the format 316 for the headers of a typical TCP/IPv6 packet, thus the Ethernet frame header 320 is followed by an IPv6 header 350. The first four bits indicate the Internet Protocol version (i.e., 6). The next sets of bits indicate a traffic class, a flow label, and the payload length. After the payload length, the IPv6 header 350 indicates a Next Header, which is the same as the protocol identifier used in IPv4. That is, it may be a 1 for ICMP, a 6 for TCP, a 17 for UDP, or any other number indicating an associated protocol for the next header in the packet. The IPv6 header 350 then indicates a hop limit for the packet, equivalent to the TTL of IPv4 when used to specify the number of network devices that can forward the packet. After the hop limit, the IPv6 header 350, specifies a 128-bit source address and a 128-bit destination address. Additional header fields may be used, but may be omitted and are not shown in FIG. 3B. There is no checksum for an IPv6 header, eliminating one of the bottlenecks in IPv4 packet processing.

After the IPv6 header 350, FIG. 3B shows a TCP header 370. The TCP header 370 is identical to the TCP header 360 shown in FIG. 3A, but the offsets from the Ethernet frame 320 are increased because the size of an IPv6 header 350 is larger than the size of an IPv4 header 340. The data encapsulated 390 begins after the TCP header 370.

FIG. 4 illustrates an example of a packet classifier 420 and an example of a field selection table 430. The illustrated packet classifier 420 is shown with patterns expressed in hexadecimal, such that each four bit section of the header is represented by a value in the range 0-9, A-F, or by an X for "don't care." The patterns illustrated begin with the IP header, although in some implementations the patterns begin with the Ethernet frame, e.g., so that the source MAC address can be used in the patterns. The illustrated packet classifier 420 begins with a lowest priority filter 421 matching any IPv4 packet with an IP header of twenty octets and a filter 422 matching any IPv6 packet. The packet classifier 420 also includes a filter 423 matching any TCP/IPv4 packet and a filter 424 matching any TCP/IPv6 packet. The packet classifier 420 includes a higher priority filter 425 matching a TCP/IPv4 packet from address 1.2.3.4 to address 5.6.7.8, with a destination port of 80 (shown in hexadecimal as 0050). The packet classifier 420 also includes a higher priority filter 426 matching a TCP/IPv6 packet from address 0102:0304:0506:0708:0910:1112:1314:1516 to address 1718:1920:2122:2324:2526:2728:2930:3132.

Each row of the example field selection table 430 corresponds to a row of the example of a packet classifier 420. Thus a packet that satisfies the highest priority filter 426 is associated with a corresponding entry 436 that indicates where the address bits are in the header. An IPv4 packet that does not satisfy any of the higher priority filters will match the lowest priority filter 421, which corresponds to an entry 431 that indicates where IPv4 address bits are located in the header. As indicated above, while the example field selection table 430 includes one entry for each entry in the packet classifier 420, in some other implementations, more than one entry in the packet classifier 420 may correspond to a common entry in the field selection table 430. For example, in some implementations, the filters 421, 423, and 425, which each match to an IPv4 packet, each correspond to a single entry in the field selection table 430 that identifies, for example, bits common to any IPv4 header.

FIG. 5 shows an example structure of the inputs and outputs of a hash module used in the network device shown in FIG. 1. The example structure of a hash module 140 takes, as input, bits from the packet 520 and a seed value 510, and generates a hash result value 530. In some implementations, the bits from the packet 520 are the values for various protocol fields indicated by an entry in the field selection table 130. In some implementations the bits from the packet 520 are a block of bits from the header with some of the bits therein set to zero. Although shown in FIG. 5 as multiple fields 520, the bits from the packet 520 may only be one field, may be a single block of data representing multiple fields, may be multiple fields, and/or may include bit values that have been set to a constant. In some implementations, the seed value 510 is an identifier specified by the packet classifier 120. In some implementations, the seed value 510 is an identifier or Seed value specified by an entry in the field selection table 130. Because the hash module processes all of the input bits, different values for the seed value 510 will cause the hash module 140 to produce different values for the hash result 530, without any change to the input values 520 selected from the packet header. In some implementations, the hash result 530 is a fixed number of bits. In some implementations, the lower order bits of the hash result 530 are used to locate an entry in the routing table 150. For example, in some implementations, the routing table 150 has space for 2.sup.16 entries, and the lower order sixteen bits of the hash result 530 uniquely identify one of those 2.sup.16 spaces in the routing table 150. In some implementations, the routing table 150 is dynamically sized and an intermediary index maps hash result values to indices into the dynamically sized routing table 150. In some such implementations, the intermediary index has the same number of entries as possible hash result values 530, e.g., 2.sup.16 entries for a 16 bit result. In some such implementations, the intermediary index has a smaller number of entries, e.g., m, and the entry for a hash result values is found at the hash result value modulo m. These implementations have an increased likelihood of hash collisions.

In some implementations, the hash result 530 is an n-bit value (e.g., a 64-bit value or a 32-bit value) and the lower order n-x bits (e.g., 16 bits) are used to determine an index into the routing table 150. In some implementations, the routing module 101 confirms that the entry at the determined index is the correct entry (and not an entry reached via a hash collision) by matching one or more stored values with confirmation values (referred to as "match data"). In some such implementations, the full n-bit hash result value 530 may be used as match data. For example, in some implementations, the routing table 150 includes, in each entry, a stored copy of the n-bit value that matches to the entry even though only n-x bits are used to locate the entry. In some such implementations, only the additional bits of the hash value not included in the n-x bits used to locate the entry are stored as match data. The routing module 101 can confirm that the entry is correct by matching the stored n-bit value (or stored portion thereof) to the n-bit hash result value 530. As an example, in some implementations, the hash result 530 is 64 bits and only the higher order 48 bits are stored as match data. The lower order 16 bits are used to locate an entry in the routing table 150 and the remaining bits are used to confirm a match. In some such implementations, no other match data is stored. In some implementations, the match data includes bits from one or more of the input fields 520. For example, in some implementations, the match data includes an address field from the packet header. In some implementations, the match data is an input value 520 that is common to all packets for which the entry should match. In some implementations, the match data is stored in a compressed form. In general, a routing table storing less match data will use less memory and require less circuitry and electrical power for verifications with match data. In some implementations, no match data is stored and no verification is performed.

FIG. 6 shows an example structure of a routing table 150 with stored match data. The example structure shows the routing table 150 implemented as an associative array containing two tables, an index table 651 and a data table 652. The routing table 150 is accessed using key data 610, which includes a hash value 614, e.g., bits from a hash result 530. When a routing module 101 accesses the routing table 150 to identify routing instructions for a packet, the routing module 101 passes in the hash value 614, which is then converted to an index using the index table 651. The index is used to locate a corresponding entry in the data table 652. In some implementations, the key data 610 also includes match data 612.

The index table 651 maps hash values to indices. The index table 651 is ordered by hash values such that an entry for an input hash value 614 can be quickly located in the index table 651. For example, in some implementations, the index table 651 is structured such that each entry is at a memory location addressed by a start address plus the respective input hash value multiplied by a constant (e.g., the size of an entry in the index table 651).

The data table 652 is ordered by the indices from the index table 651. As shown in FIG. 6, in some implementations, each populated entry in the data table 652 includes confirmation match data and routing instructions. In some implementations, the data table 652 does not include the confirmation match data. In some implementations, the confirmation match data includes multiple values, e.g., a value representative of a source or destination address, a value representative of a source or destination sub-net, a hash value, or any other data that may be used as confirmation match data. In some implementations, the confirmation match data is the hash result value 530 produced by the hash module 140 (see FIG. 5). In some such implementations, the hash value 614 used as key data 610 for the routing table 150 is a portion of the hash result value 530 stored as confirmation match data. For example, the input hash value 614 may be only the lower (or higher) order bits of the hash result value 530. In some implementations, the key data 610 is a subset of the bits representing the hash result value 530, and the match data is a different subset of the bits representing the hash result value 530. In some implementations, the confirmation match data is the same data that was passed into the hash module 140. In some implementations, the confirmation match data is a subset of the data that was passed into the hash module 140. In some such implementations, the confirmation match data may include less information than the data that was passed into the hash module 140. In some implementations, the confirmation match data is compressed. In some implementations, the index table 651 and the data table 652 are n-way associative. For example, the index table 651 and the data table 652 may be two-way associative, such that the data table 652 includes, for each entry, the hash value that corresponds to that entry (from the index table 651).

In some implementations, the match data 612 is also passed in as an input value 610. In such implementations, if the match data stored in the entry matches the input match data 612, then the routing instruction stored at the entry is returned 620. If there is no entry, the packet may belong to a new flow and the maintenance module 102 updates the tables accordingly. If there is an entry, but the match data in the data table 652 does not match to the input match data 612, then there is a collision. That is, the values selected from the packet hashed into a hash value that is already in use by packets in another flow. This packet, too, is for a new flow and the maintenance module 102 updates the tables accordingly. In some implementations, the match data 612 is only passed in as an input value 610 for a new flow. For example, in some implementations, the maintenance module 102 may receive an instruction to establish a new flow and uses the match data 612 to verify that the newly established flow does not have a collision with an existing flow.

In some implementations, when there is a new flow with a hash collision with an existing entry in the routing table 150, the maintenance module 102 updates the packet classifier 120 to add a new entry for the new flow, such that the new flow is associated with a new identifier. The new entry in the packet classifier 120 may correspond to an already existing entry in the field selection table 130, or the new entry in the packet classifier 120 may correspond to a new entry in the field selection table 130. In either case, the updates result in a configuration for the new flow such that the data extracted from packets in the flow, along with the new identifier, cause the hash module 140 to generate a distinct hash result 530 (as shown in FIG. 5). The new hash result can be used, e.g., as an input value 610, to identify an entry in the routing table 150 for the new flow. In some implementations, the maintenance module 102 configures the entry with routing instructions for the flow. In some implementations, the maintenance module 102 verifies that the entry is not already in use, and, if it is, modifies the identifier associated with the flow so that the hash module 140 will generate a different result.

FIG. 7 is a flowchart of an example method for initiating a new flow using the network device shown in FIG. 1. In brief overview, the method 700 includes receiving an indication of a new flow (step 701), retrieving an identifier associated with packet classifier entry (step 702), and choosing a field selection table entry (step 703). The method further includes generating a hash module input by processing a packet of the new flow (step 704), hashing the generated input along with the identifier (step 705), and detecting a collision (step 706). If a collision is detected, the method 700 includes adding a new entry to the data packet classifier (step 707), assigning a new identifier for the new entry in the data packet classifier (step 708), and hashing the generated hash module input and the new identifier to produce a new hash result (step 709). The method 700 then verifies that the new hash result does not have a collision with an existing routing table entry (step 710). If there is a collision at step 710, the method 700 repeats step 708, assigning a different new identifier for the new entry. Once there are no collisions (at step 706 or step 710), the method 700 includes adding an entry to the routing table (step 720).

As indicated above, the method 700 begins with the maintenance module 102 receiving an indication to initiate a new data communication flow (step 701). In some implementations, the indication is an instruction received by the maintenance module 102. For example, in some implementations, an application or service instance on an host server may initiate a flow by sending an instruction to the maintenance module 102. In some such implementations, the application or service instance designates specific packet handling instructions for the new flow. For example, the host server can request that acknowledgment packets for a new data stream be directed to a particular stream management instance. In some implementations, the maintenance module 102 detects a new flow as a previously unseen flow. In some implementations, the packet classifier 120 classifies a packet as one initiating a new flow. For example, the packet classifier 120 may detect the beginning of a TCP handshake, as indicated by a SYN flag set in the TCP packet header. In some implementations where the data packet classifier 120 is implemented as a TCAM, there is a TCAM pattern to match one or more flow initiation packets. For example, in some implementations, the TCAM has a high priority pattern for identifying a TCP packet with the SYN flag set, which is interpreted by the network device to indicate a new flow. In some implementations where the data packet classifier 120 is implemented as a TCAM, receiving a packet with a packet header that matches to the default row of the TCAM may initiate a new flow. In some implementations, an indication of a new flow includes an indication of a desired routing instruction for the new flow. In some implementations, the maintenance module 102 determines a desired routing instruction for a new flow. For example, the maintenance module 102 may access additional network topology information to identify a next-hop device for packets in the new flow.

The method 700 includes the maintenance module 102 retrieving an identifier associate with a packet classifier entry (step 702). This step is analogous to step 220 in FIG. 2. The method 700 includes the maintenance module 102 choosing a field-selection table entry (step 703), this step is similarly analogous to step 230 in FIG. 2. The method 700 includes the maintenance module 102 selecting bits from a packet of the flow (step 704). This step is analogous to step 240 in FIG. 2. The method 700 also includes hashing the selected bits and the identifier (step 705). This step is analogous to step 250 in FIG. 2. As with step 250 in FIG. 2, step 705 in FIG. 7 includes using the identifier as a seed value to a hashing function.

In the method 700, the maintenance module 102 then detects whether a hash collision has occurred (step 706). A collision is detected, for example, when the routing table 150 does not have an empty location at the index specified by the hash result 530 for the new flow. In some implementations, a collision is detected if there exists an entry with different match data as compared to match data for the new flow. In some implementations, a collision is detected if there exists an entry with a different routing instruction as compared to the desired routing instruction for the new flow, and no collision is detected if the routing instruction in the routing table 150 is the desired routing instruction for the new flow. That is, if a new flow is established and packets for that new flow would cause the routing module 101 to access an entry in the routing table that is populated with handling instructions prior to establishing the flow, then there is a collision. However, if those handling instructions are the same as the desired handling instructions, in some implementations, the collision is ignored because the new entry would have the same handling instructions as the existing entry. In some implementations, a collision is detected unless the returned location in the routing table 150 is empty, which indicates that the routing instruction for the new data communication flow may be stored at the returned location in the routing table 150.

If a collision is detected at step 706, the method 700 includes establishing a new identifier for the flow, e.g., by adding a new entry to the data packet classifier (step 707), assigning a new identifier for the new entry in the data packet classifier (step 708), and hashing the generated hash module input and the new identifier to produce a new hash result (step 709). When adding a new entry to the data packet classifier (step 707), the new entry in the data packet classifier 120 is selected to match packets of the new flow, but not packets of flows already being processed. In some implementations, the next entry is selected to match packets only of the new flow. In some implementations, the maintenance module 102 identifies the pattern previously used to classify a packet for the flow and adds additional criteria to the pattern, specific to the packet classified. For example, if the packet matched a classification pattern that specified the packet's destination sub-net, but not the full destination address, the maintenance module 102 adds a new classification pattern that specifies the packet's full destination address.

The new classification pattern corresponds to a new identifier assigned for the new entry in the data packet classifier (step 708). In some implementations, the new identifier is an output value from the packet classifier. In some implementations, the new identifier is stored in the field selection table 130, e.g., as a Seed value. In some implementations, the new identifier is selected at random. In some implementations, the new identifier is selected in a deterministic manner, e.g., by incrementing a counter.

If a collision was detected at step 706, the method 700 includes, after establishing a new identifier for the flow in step 708, hashing the generated hash input from step 704 and the new identifier (step 709). Because the inputs are now different, the hashing at step 709 results in a new a new hash value. The result of the hashing at step 709 is used to identify an entry location in the routing table. The new hash value will almost always correspond to a different entry in the routing table 150. If there is a collision (step 710), the method 700 returns to step 708 and assigns a new identifier for the entry in the data packet classifier (added in step 707). In some implementations, steps 708, 709, and 710 are repeated until there is no collision. In some implementations, after a predetermined number of iterations without avoiding a collision, an error condition is triggered. In some implementations, after a predetermined number of iterations without avoiding a collision, additional steps are taken to modify the hash output, e.g., by associating the new entry in the packet classifier with a different field-selection table entry.

When there are no collisions detected at step 706 and/or step 710, the method 700 includes adding, by the maintenance module 102, a new entry to the routing table (step 720), the entry containing the desired routing instruction and match data for the flow. In some implementations, the new entry also contains the hash result from step 709. In some implementations, the match data is a portion of the hash result from step 709. For example, in some implementations, the hash result is a 64-bit value and the match data is the higher order 48 bits of the hash result. In some such implementations, the lower order 16 bits of the hash result are used to locate an entry in the routing table and the higher order 48 bits are used to verify a match. Packets received after completion of the method 700 are routed using the method explained in relation to FIG. 2, which locates and retrieves the routing instructions and associated data stored in step 720.

In some implementations, the maintenance module 102 removes entries from the routing table 150 for flows that have ended, terminated, or become stale. For example, the packet classifier 120 may detect a TCP teardown, as indicated by a FIN flag set in the TCP packet header. In some implementations where the data packet classifier 120 is implemented as a TCAM, there is a TCAM pattern to match one or more flow termination packets. In some implementations, the maintenance module 102 periodically removes entries from the routing table 150 that have not been used for a threshold length of time or that were established more than some predetermined period of time prior.

Implementations of the subject matter and the operations described in this specification can be implemented in digital electronic circuitry, or in computer software embodied on a tangible medium, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer programs embodied on a tangible medium, i.e., one or more modules of computer program instructions, encoded on one or more computer storage media for execution by, or to control the operation of, a data processing apparatus. A computer storage medium can be, or be included in, a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them. The computer storage medium can also be, or be included in, one or more separate components or media (e.g., multiple CDs, disks, or other storage devices). The computer storage medium may be tangible and non-transitory.

The operations described in this specification can be implemented as operations performed by a data processing apparatus on data stored on one or more computer-readable storage devices or received from other sources.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular implementations. Certain features that are described in this specification in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a sub-combination or variation of a sub-combination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

References to "or" may be construed as inclusive so that any terms described using "or" may indicate any of a single, more than one, and all of the described terms. The labels "first," "second," "third," and so forth are not necessarily meant to indicate an ordering and are generally used merely to distinguish between like or similar items or elements.

Thus, particular implementations of the subject matter have been described. Other implementations are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking or parallel processing may be utilized.

* * * * *