U.S. patent application number 12/555,801 was filed with the patent office on 2009-09-08 and published on 2009-12-31 for distributing information across equal-cost paths in a network.
This patent application is currently assigned to Broadcom Corporation. Invention is credited to Mohan Kalkunte, Karagada Ramarao Kishore, and Srinivas Sampath.
Application Number: 20090323535 / 12/555,801
Family ID: 33545356
Publication Date: 2009-12-31

United States Patent Application 20090323535
Kind Code: A1
Kalkunte; Mohan; et al.
December 31, 2009
DISTRIBUTING INFORMATION ACROSS EQUAL-COST PATHS IN A NETWORK
Abstract
A method of distributing data across a network having a
plurality of equal-cost paths. Also, a device for distributing data
over a network according to the method. The data, which is
typically contained in data packets, may be distributed based on at
least one attribute of each of the packets. The data may also be
distributed according to a weighted distribution function that
allows for unequal amounts of traffic to be distributed to each of
the equal-cost paths.
Inventors: Kalkunte; Mohan (Sunnyvale, CA); Sampath; Srinivas (Sunnyvale, CA); Kishore; Karagada Ramarao (Saratoga, CA)
Correspondence Address: BRAKE HUGHES BELLERMANN LLP, c/o CPA Global, P.O. Box 52050, Minneapolis, MN 55402, US
Assignee: Broadcom Corporation, Irvine, CA
Family ID: 33545356
Appl. No.: 12/555,801
Filed: September 8, 2009
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
10/825,656 (parent of 12/555,801) | Apr 16, 2004 | 7,606,161
60/483,026 (provisional) | Jun 27, 2003 |
60/529,617 (provisional) | Dec 16, 2003 |
Current U.S. Class: 370/238
Current CPC Class: H04L 45/12 (20130101); H04L 45/24 (20130101); H04L 45/00 (20130101)
Class at Publication: 370/238
International Class: H04L 12/56 (20060101)
Claims
1. A distribution device, the distribution device comprising: a
first distribution unit, including a device logic, wherein the
first distribution unit is configured to use the device logic to
distribute a packet of data entering the device through an ingress
port among a set of ports to one of at least a first output port
and a second output port among the set of ports, the device logic
configured to determine whether at least two equal-cost paths exist
between the distribution device and a destination device, wherein a
first equal-cost path is associated with the first output port and
a second equal-cost path is associated with the second output port;
determine, based on a packet attribute of the packet, a first
weight associated with the first output port and the first
equal-cost path; determine, based on the packet attribute of the
packet, a second weight associated with the second output port and
the second equal-cost path, wherein the first weight is larger than
the second weight; and distribute the packet based on the first
weight and the second weight, including distributing the packet
with a higher likelihood of output to the first output port and the
first equal-cost path based on a relation of the first weight to
the second weight.
2. The distribution device of claim 1, wherein the device logic is
configured to distribute a set of packets including the packet,
including distributing a first subset of the set of packets through
the first output port and a second subset of the set of packets
through the second output port, wherein the first subset is larger
than the second subset in proportion to which the first weight is
larger than the second weight.
3. The distribution device of claim 1, wherein each packet of the
set of packets includes the packet attribute.
4. The distribution device of claim 1, wherein the first weight and
the second weight correspond, respectively, to a first number of
entries stored in at least one memory and to a second number of
entries stored in the at least one memory.
5. The distribution device of claim 1, wherein the first equal-cost
path and the second equal-cost path are associated with first
network traffic and second network traffic, respectively, that are
unrelated to the packet, and wherein the first network traffic is
less than the second network traffic.
6. The distribution device of claim 5, wherein the first network
traffic is less than the second network traffic in approximate
proportion to an extent to which the first weight is larger than
the second weight.
7. The distribution device of claim 1, wherein the device logic
comprises: a first lookup unit including an acknowledgment unit
configured to acknowledge whether multiple equal-cost paths exist,
a first referencing unit configured to reference a second lookup
unit when multiple equal-cost paths do exist, and a second
referencing unit configured to reference a third lookup unit
otherwise, the second lookup unit including a second distribution
unit configured to distribute the packet across the set of ports
and a third referencing unit for referencing the third lookup unit,
and the third lookup unit including a selection unit configured to
select between the first port and the second port.
8. The distribution device of claim 7 wherein the second lookup
unit includes multiple entries, each referencing a common set of
instructions in the third lookup unit.
9. The distribution device of claim 8 wherein, in the second lookup
unit, a first number of entries associated with the first packet
determines the first weight, and a second number of entries
associated with the second packet determines the second weight.
10. A method of distributing data across a network, comprising:
providing a distribution device configured to distribute a set of
packets across a set of equal-cost paths in the network; and
distributing each packet in the set of packets across the set of
equal-cost paths according to a weighted distribution in which a
weight is assigned to each packet in proportion to relative loads
of the set of equal-cost paths and a packet with a relatively
higher weight is more likely to be distributed to an equal-cost
path of the set of equal-cost paths having a relatively smaller
network traffic load.
11. The method of claim 10 wherein the weight of each packet
corresponds to a number of entries in a memory.
12. The method of claim 10 wherein the distributing further
comprises using a packet attribute from each packet to determine
the weighted distribution.
13. The method of claim 12 wherein the distributing comprises
performing a hashing function on the packet attribute.
14. The method of claim 13 wherein the packet attribute includes
one or more of a source address, a next-hop address, or a
destination address.
15. The method of claim 10, wherein the distributing comprises
obtaining a match between a longest prefix in a first packet and a
portion of a first set of instructions in a first compilation of
sets of instructions.
16. The method of claim 15, wherein the distributing comprises
using a pointer portion from the first set of instructions to
select a second set of instructions from a second compilation of
sets of instructions, wherein the first set of instructions
includes a first value that specifies how much weight is to be
given to each equal-cost path in the set of equal-cost paths.
17. The method of claim 10, further comprising updating a
compilation of sets of instructions used to determine the weighted
distribution, wherein the compilation is updated based on a
best-fit algorithm.
18. A device for distributing packets across a network, the device
comprising: a set of interface means; distribution means for
distributing a set of packets entering the device through a first
interface means of the set of interface means such that packets in
the set of packets are distributed across all interface means in
the set of interface means operably connected to equal-cost paths
according to a weighted distribution so that at least one of said
packets is given greater weight to be distributed across at least
one of said equal-cost paths than at least one other of said
equal-cost paths.
19. The device of claim 18 wherein said packet weight corresponds
to a number of entries stored in a memory.
20. The device of claim 18 wherein the distribution means is
configured to distribute the packets based on attributes of the
packets.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of, and claims priority
under 35 U.S.C. § 120 to, U.S. patent application Ser. No.
10/825,656, filed on Apr. 16, 2004 and titled Distributing
Information Across Equal-Cost Paths in a Network, now U.S. Pat. No.
7,606,161, which itself claims priority from U.S. Provisional Patent
Application Ser. No. 60/483,026, entitled "ECMP IN XGS" and filed
on Jun. 27, 2003, and U.S. Provisional Patent Application Ser. No.
60/529,617, entitled "Distributing Information Across Equal-Cost
Paths in a Network" and filed on Dec. 16, 2003. The contents of all
of the above-referenced applications are hereby incorporated in
their entirety by reference.
BACKGROUND OF THE INVENTION
[0002] 1. Field of the Invention
[0003] Certain embodiments of the present invention are directed to
methods of distributing datagrams across a network. Certain other
embodiments of the present invention are directed to devices for
implementing such methods.
[0004] 2. Description of the Related Art
[0005] Telecommunications systems typically distribute information,
usually in the form of datagrams such as, but not limited to, data
cells and packets, over networks that include network devices such
as, but not limited to, hosts, servers, modules, nodes, and
distribution devices such as, but not limited to, switches and
routers. A small portion of a representative network is illustrated
in FIG. 1.
[0006] In FIG. 1, a switch or router 100, a first host 110, a
second host 120, and a server 130 are all illustrated as being
connected/linked to each other, thereby forming a portion of a
network. The router 100 includes a plurality of ports. Among the
plurality of ports, a first port 140 functions as an ingress port
for a first datagram, illustrated as a first packet P1 in FIG. 1.
The first packet P1 has been forwarded to the router 100 from a
source that is not illustrated in FIG. 1.
[0007] A second port 150, also among the plurality of ports, is one
possible egress for the data that came into the router 100 as part
of the first packet P1. In a Layer-3 switching environment, upon
egress from the router 100, the source address (SA) of the packet
is changed to the router MAC address of the router 100. The
destination address (DA) of the packet is changed to the next-hop
address (NHA) or, in the example illustrated in FIG. 1, the address
of the first host 110. The data egresses in the form of a second
packet P2 from the router 100 to the first host 110.
[0008] A third port 160 is another potential egress port for the
data that entered the router 100 through the first port 140. If the
data egresses through the third port 160, it does so as a third
datagram, illustrated as a third packet P3 in FIG. 1. The third
packet P3 has a different SA than the first packet P1 and a
different DA, or NHA, than the second packet P2, since the third
packet's P3 next hop is to the second host 120 while the second
packet's P2 next hop is to the first host 110. It should also be
noted that other packet attributes, such as time-to-live (TTL) and
Header Checksum typically change. It should also be noted that, in
FIG. 1, the SA of the second packet P2 and third packet P3 will be
the same, but different than the SA of the first packet P1.
[0009] Although many other factors frequently come into play,
according to a simplified model, calculating the "cost" of a path
in a network involves counting the number of "hops" that a datagram
or packet has to make between a source and a destination. For
example, in FIG. 1, an IP packet traveling from the router 100 to
the server 130 may hop to either the first host 110 or second host
120 before hopping to the server 130. Since the number of hops is
the same for the IP packet, regardless of whether it travels to the
first host 110 or second host 120, the IP packet at the router
100 is said to have two equal-cost paths available to it.
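To make the simplified hop-count model concrete, the following
minimal Python sketch computes costs by counting hops over a toy
topology mirroring FIG. 1 and collects the equal-cost next hops
available at the router; the graph, node names, and helper function
are illustrative assumptions, not part of this application:

    # Cost of a path = number of hops (breadth-first search).
    from collections import deque

    def hop_counts(graph, src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return dist

    graph = {
        "router": ["host1", "host2"],
        "host1": ["router", "server"],
        "host2": ["router", "server"],
        "server": ["host1", "host2"],
    }

    # A next hop lies on an equal-cost path when its total hop count
    # to the server matches the minimum over all next hops.
    costs = {nh: 1 + hop_counts(graph, nh)["server"] for nh in graph["router"]}
    ecmp_set = [nh for nh, c in costs.items() if c == min(costs.values())]
    print(ecmp_set)  # ['host1', 'host2'] -- two equal-cost paths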
[0010] According to this simplified model, since the router 100
relies exclusively upon the number of hops between it and the
datagram or packet destination to determine the cost of a path, the
router 100 can make no cost-based distinction between the path
through the first host 110 and the path through the second host
120. Hence, in order to determine whether to forward the data in
the first packet P1 as the second packet P2 or the third packet P3,
the router 100 often makes use of an equal-cost multi-path (ECMP)
algorithm, which is well known in the related art.
[0011] Unfortunately, although ECMP algorithms according to the
related art are often useful for distributing traffic evenly over a
set of equal-cost paths, ECMP algorithms according to the related
art fail to account for general network traffic that also commonly
flows through the various network devices along the equal-cost
paths. Hence, in the partial network illustrated in FIG. 1, if the
first host 110 has more general network traffic flowing across it
than the second host 120, when the router 100 begins distributing
packet traffic via the ECMP algorithm discussed above, then the
first host 110 could become overly burdened with total traffic
relative to the second host 120. Under such circumstances, the
network devices are generally not utilized optimally and the
network is not operating at maximum efficiency.
[0012] In addition to the general inability of ECMP algorithms
according to the related art to account for general network
traffic, these algorithms also typically require a considerable
amount of distribution device resources. These resources are
generally in the form of time allocated to performing the
algorithms and in the form of hardware that is used and re-used
while performing the algorithms.
[0013] At least in view of the above, there is a need for methods
that are capable of distributing datagram, data, and/or packet
traffic across equal-cost paths in a network in a manner that
reduces the possibility that certain network devices along the
equal-cost paths will be overly burdened. There is also a need for
devices capable of implementing such methods.
[0014] In addition, at least in view of the above, there is also a
need for methods that reduce the amount of time spent by the
distribution device in performing lookup algorithms and/or that
reduce the amount of hardware that is used and re-used while
performing such algorithms. Further, there is a need for devices
capable of implementing such methods.
SUMMARY OF THE INVENTION
[0015] In order to address and/or overcome at least some of the
above-discussed shortcomings of the related art, new devices and
methods are provided. Some of these methods and devices are
summarized below.
[0016] According to certain embodiments of the present invention, a
first method of distributing data across a network is provided.
According to this first method, a step of providing a distribution
device configured to distribute packets of data across a set of
equal-cost paths in a network is typically provided. According to
this first method, distribution of the packets across the paths is
usually based on at least one attribute of each of the packets.
[0017] According to certain other embodiments of the present
invention, a second method of distributing data across the network
is provided. According to this second method, a distribution device
is normally provided, and this distribution device is generally
configured to distribute a set of packets of data across a set of
equal-cost paths in the network. According to this second method,
each packet in the set of packets is typically distributed across
the set of equal-cost paths according to a weighted
distribution.
[0018] According to yet other embodiments of the present invention,
a first data packet distribution device is provided. This
distribution device typically includes a set of ports and a first
distribution unit. The first distribution unit often includes a
device logic. Usually, the first distribution unit is configured to
use the device logic to distribute a packet of data entering the
device through a first port among the set of ports to a second port
among the set of ports. Normally, the device logic includes a first
lookup unit that itself generally includes an acknowledgment unit
for acknowledging whether multiple equal-cost paths exist, a first
referencing unit for referencing a second lookup unit when multiple
equal-cost paths do exist, and a second referencing unit for
referencing a third lookup unit otherwise. Typically, the device
logic also includes the second lookup unit that itself includes a
second distribution unit for distributing the packet across the set
of ports and a third referencing unit for referencing the third
lookup unit. Also, the device logic commonly includes the third
lookup unit, that itself usually includes a selection unit for
selecting the second port.
[0019] In addition, certain embodiments of the present invention
include a second device for distributing Internet Protocol (IP)
packets across a network. The device generally includes a set of
interface means for interfacing the device with the network. The
device also routinely includes distribution means for distributing
a set of IP packets entering the device through a first interface
means in the set of interface means such that packets in the set of
IP packets are distributed across all of the interface means in the
set of interface means that are operably connected to equal-cost
paths according to a weighted distribution.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] For proper understanding of certain embodiments of the
invention, reference should be made to the accompanying drawings,
wherein:
[0021] FIG. 1 illustrates two equal-cost paths for a first packet
between a router and a server;
[0022] FIG. 2 illustrates a packet, an LPM Table, an ECMP Table,
and an L3 Table that may be used according to certain embodiments
of the present invention;
[0023] FIG. 3 illustrates a distribution device according to
certain embodiments of the present invention, along with a pair of
equal-cost paths between the distribution device and a server;
[0024] FIG. 4 illustrates the steps of a first representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network;
[0025] FIG. 5 illustrates the steps of a second representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network;
[0026] FIG. 6 illustrates the steps of a third representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network;
[0027] FIG. 7 illustrates the steps of a fourth representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network;
[0028] FIG. 8 illustrates the steps of a fifth representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network;
[0029] FIG. 9 illustrates the steps of a sixth representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network; and
[0030] FIG. 10 illustrates the steps of a seventh representative
algorithm that may be used, according to certain embodiments of the
present invention, for distributing data across a network.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0031] As telecommunications networks grow and increase in speed,
more and more information and/or data is distributed over the
networks, often in the form of datagrams. Hence, it becomes more
and more desirable to enhance network efficiency at every level.
Below are described several representative methods and devices for
enhancing network efficiency by more efficiently distributing
datagrams such as, but not limited to, Internet Protocol (IP)
packets, over a network.
[0032] According to certain embodiments of the present invention,
distribution and/or routing methods for distributing datagrams,
often in the form of IP packets, across a network are provided.
These methods typically help alleviate traffic congestion across
network devices and/or entire portions of the network. The network
devices may include, for example, nodes, modules, routers,
distribution devices, and links.
[0033] In order to illustrate how some of these methods operate,
the representative packet 200 and the representative set of tables
210, 220, 230 illustrated in FIG. 2 will frequently be referenced.
One skilled in the art of the present invention will recognize that
the tables shown in FIG. 2 may be thought of as collections or
compilations of sets of instructions and that each entry in each of
the tables may be thought of as a set of instructions. One skilled
in the art will also recognize that the packet 200 is only a
representative datagram and that datagrams having attributes other
than the source IP (SIP) address and destination IP (DIP) address
illustrated in FIG. 2 may also be used according to certain
embodiments of the present invention. One skilled in the art will
further recognize that no particular restrictions are made on the
type or form of the DATA that is included in each packet 200.
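As one way to picture these compilations, the following Python
sketch models the three tables of FIG. 2 as plain records. Every
field name here (prefix, ecmp, ecmp_tbl_base_ptr, count, l3_index,
egress_port, next_hop) is an illustrative guess assembled from the
description below, not the application's actual record layout:

    from dataclasses import dataclass

    @dataclass
    class LpmEntry:                 # one row of the LPM Table 210
        prefix: str                 # matched against the packet's longest prefix
        ecmp: bool                  # binary ECMP value: do equal-cost paths exist?
        ecmp_tbl_base_ptr: int      # base pointer into the ECMP Table 220
        count: int                  # COUNT: ECMP Table entries for this route
        l3_index: int               # direct L3 Table index, used when ecmp is False

    @dataclass
    class EcmpEntry:                # one row of the ECMP Table 220
        l3_index: int               # references a single L3 Table entry

    @dataclass
    class L3Entry:                  # one row of the L3 Table 230
        egress_port: int            # port through which the packet leaves the device
        next_hop: str               # next-hop address used to rewrite the DA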
[0034] As mentioned above, according to certain embodiments of the
present invention, methods of distributing data across a network
are provided. According to some of these methods, such as the first
representative method 400 illustrated in FIG. 4, a distribution
device, such as, for example, the switch 300 illustrated in FIG. 3,
is provided in a network according to a first step 410. Generally,
multiple equal-cost paths exist between the distribution device and
a destination to which multiple datagrams or packets are to be
forwarded over time. Hence, also according to this first step 410,
the distribution device is typically configured to distribute the
datagrams across the set of equal-cost paths. Preferably, the
distribution is made in a manner that can avoid overburdening
certain network devices, especially when other network devices that
are less encumbered by network traffic are available. In FIG. 3, a
representative destination is illustrated as the server 305, and
two representative network devices are illustrated as a first
network device 315 and a second network device 325.
[0035] According to certain embodiments of the present invention,
as shown in the second step 420 of the first representative method
400, the datagrams are distributed, over time, across each of the
available equal-cost paths. In this second step the distribution is
made as a function of at least one attribute of each of the
datagrams.
[0036] According to a second representative method 500, the steps
of which are illustrated in FIG. 5, a distribution device
configured to distribute packets of data across a set of equal-cost
paths in the network is provided according to a first step 510.
Typically, when choosing over which of the set of equal-cost paths
a given packet is to be distributed, the choice is made based, at
least partially, on a packet attribute. For example, according to
the second step 520 illustrated in FIG. 5, the attribute may be the
SIP. The attribute may also be the DIP, as shown in the third step
530, and so on.
[0037] Another representative embodiment of the present invention
that illustrates how the packet attribute is used to choose from
among a set of equal-cost paths is discussed below, and is
explained with reference to the exemplary packet 200 and tables
210, 220, 230 illustrated in FIG. 2. According to this embodiment,
the packet 200 may be thought of as a more detailed view or version
of a first datagram D1 that enters a distribution device, such as
the switch 300 illustrated in FIG. 3, through an ingress port, such
as the first port 330 or another interface means.
[0038] The distribution device typically includes, or is operably
connected to, a first distribution unit 310 that itself typically
includes a device logic 320 and/or memory. Usually, the first
distribution unit 310 is configured to make use of the device logic
320 and/or memory when distributing DATA contained in the first
datagram D1 that enters the distribution device. The first datagram
D1 typically enters through the first port 330 and is distributed
to an egress port, such as a second port 340 or a third port 345,
chosen from among a plurality of possible egress ports. If the
second port 340 illustrated in FIG. 3 is chosen, the second
datagram D2 illustrated egresses from the switch 300 and, if the
third port 345 is chosen, the third datagram D3 egresses from the
switch 300.
[0039] According to certain embodiments of the present invention,
the device logic 320 includes a first lookup unit 350, which often
stores the LPM Table 210 illustrated in FIG. 2. The first lookup
unit 350 generally includes an acknowledgment unit 360 for
acknowledging whether multiple equal-cost paths exist between the
distribution device and a location to which a datagram or packet of
data is to be forwarded. When an LPM Table 210 is stored in the
first lookup unit 350, the acknowledgment unit 360 typically makes
use of instructions in the LPM Table 210 to acknowledge whether
multiple equal-cost paths are present. In the LPM Table 210
illustrated in FIG. 2, these instructions are typically contained
in the column labeled ECMP and commonly take the form of binary
ECMP values.
[0040] In operation, according to certain embodiments of the
present invention, the distribution device or switch 300 performs a
Longest Prefix match between the packet 200 and a portion of an
entry in the LPM Table 210. Typically, only a single entry in the
LPM Table 210 includes a portion that matches the Longest Prefix of
the packet 200, regardless of whether or not multiple equal-cost
paths exist in the network for the packet 200. This LPM Table 210
entry is referred to below as the "matched entry".
[0041] Usually, a distribution device that includes and makes use
of a first lookup unit 350 in which an LPM Table 210 is stored
relies on the ECMP value included in the matched LPM Table entry to
specify whether equal-cost paths exist. Normally, the ECMP value is
a binary value that either specifies that multiple equal-cost paths
exist or that they do not.
[0042] As illustrated in FIG. 2, the LPM Table 210 according to
certain embodiments of the present invention also includes an ECMP
Table Base Pointer (ECMP_TBL_BASE_PTR), which may be contained or
stored within and/or used by a first referencing unit 370, such as
the one illustrated as part of the first lookup unit 350 in FIG. 3.
Generally, the ECMP Table Base Pointer may be thought of as an
instruction, or a part of an instruction, for referencing a second
compilation of sets of instructions, which are often contained or
stored in the second lookup unit 380 illustrated in FIG. 3.
Typically, the ECMP Table Base Pointer is only used when the
presence of equal-cost paths has been specified by the ECMP value.
In FIG. 2, the second compilation of sets of instructions is
represented as the ECMP Table 220, which will be discussed
shortly.
[0043] The ECMP Table 220 is usually only referenced when multiple
equal-cost paths do exist. When the ECMP value in the LPM Table 210
indicates that no equal-cost paths are present, the ECMP Table 220
or, more generally, the second compilation of sets of instructions
used by the second lookup unit 380, is normally not referenced. In
such cases, instructions for referencing a third compilation of
sets of instructions are used.
[0044] These instructions for referencing the third compilation of
sets of instructions directly from the first compilation of sets of
instructions are commonly included in the first compilation of sets
of instructions and may, for example, take the form of an L3 Table
Index contained in the matched entry of the LPM Table 210. The L3
Table Index may be stored in and/or used by a second referencing
unit 365 in the first lookup unit 350.
[0045] Typically, the L3 Table Index is used to pick an entry from
an L3 Table. The above-discussed ECMP Table may include both the L3
Table Index and the L3 Interface Index. The L3 Interface Index may
be used to find the L3 Interface attribute that may eventually be
used during datagram transmission but that is not typically
relevant to selection from among equal cost paths. This operation
is generally simply a direct index from the ECMP Table into the L3
Table at the L3 Table Index address.
[0046] As stated above, according to certain embodiments of the
present invention, the second compilation of sets of instructions
takes the form of the ECMP Table 220 illustrated in FIG. 2 and is
commonly stored and/or referenced by the second lookup unit 380.
Often, as will be discussed below, the number of sets of
instructions in the second compilation is determinative of how a
packet is distributed across the set of ports of the distribution
device, which act as a set of interface means for interfacing the
distribution device with a network, according to a weighted
distribution. Usually, the number of sets of instructions is
specified by a second distribution unit 375 in the second lookup
unit 380. According to certain embodiments, the second distribution
unit 375 relies on the COUNT values from the LPM Table 210 to
determine how many entries in the ECMP Table 220 correspond to each
equal-cost path in the network.
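A sketch of this replication idea, under the same illustrative
assumptions as the earlier sketches (the helper name and the weights
are hypothetical), populates the ECMP Table with one identical entry
per unit of weight for each path:

    def build_ecmp_entries(weight_by_l3_index):
        # Replicate each path's L3 Table index once per unit of weight.
        entries = []
        for l3_index, weight in weight_by_l3_index.items():
            entries.extend([l3_index] * weight)
        return entries

    ecmp_table = build_ecmp_entries({0: 9, 1: 3})  # nine entries for path 0, three for path 1
    count = len(ecmp_table)  # 12 -- one reading of the COUNT modulus used in FIG. 7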
[0047] Sets of instructions for referencing the third compilation
of sets of instructions are also commonly stored and/or used by a
third referencing unit 385 in the second lookup unit 380. These
sets of instructions commonly take the form of L3 Table Indices
that point to a specific entry in the L3 Table 230. From the
discussion above, it should be clear to one of skill in the art
that, when multiple equal-cost paths are available, L3 Table
Indices from the ECMP Table 220 may be used and that L3 Table
Indices from the LPM Table 210 may be used in the absence of
multiple equal-cost paths.
[0048] In the example illustrated in FIG. 2, one or more identical
entries in the ECMP Table 220, each referencing a common entry in
the L3 Table 230, may be included. The ECMP Table entries then
typically use L3 Table Index values to reference the L3 Table
230.
[0049] One advantage of including multiple, identical entries in
the ECMP Table 220 is that these multiple entries allow for the
distribution of packet traffic across a set of equal-cost paths
according to a weighted distribution over time. How such a
distribution is achieved is discussed below.
[0050] The distribution of each packet in a set of packets across a
set of equal-cost paths according to a weighted distribution is
stipulated in the second step 620 of the third representative
method 600 illustrated in FIG. 6. The second step 620 generally
follows the first step 610 of this third method, wherein a
distribution device configured to distribute a set of packets of
data across a set of equal-cost paths in a network is provided.
[0051] Returning to FIG. 3, it should be noted that the third
compilation of sets of instructions discussed above is commonly
stored and/or referenced by the third lookup unit 355 and often
takes the form of the L3 Table 230 discussed above. More generally,
the third compilation typically includes a selection unit 390 that
stores and/or references instructions for selecting the egress port
of the distribution device through which the datagram or packet is
to exit the distribution device. As discussed above, the third
lookup unit 355 may be referenced directly from the first lookup
unit 350 when no equal-cost paths exist and is generally referenced
from the second lookup unit 380 when multiple equal-cost paths do
exist.
[0052] Returning now to the description of methods that make use of
the devices such as the device illustrated in FIG. 3 and that
distribute packets based on packet attributes, we note that the
packet attributes are usually taken into account when the LPM Table
210 references the ECMP Table 220. In other words, the packet
attributes commonly come into play when a set of instructions in
the first compilation references the second compilation.
[0053] More specifically, according to one representative example,
before the ECMP Table Base Pointer is used to reference the ECMP
Table 220, the ECMP Table Base Pointer first undergoes a
mathematical transformation involving a packet attribute. The
mathematical transformation of the ECMP Table Base Pointer may, for
example, begin with the selection of a packet attribute, which is
often chosen to be the SIP, since different packets passing through
the distribution device are more likely to have unique sources and
destinations. Then, according to certain embodiments and as shown
in the fourth step 540 of the representative method illustrated in
FIG. 5, a hashing function is performed on the packet attribute.
Among the advantages of performing a hashing function on the packet
attribute at this point is that the attribute, which may be
relatively lengthy, is reduced to a smaller, yet likely unique,
hash value. Since the smaller hash value can be represented
with fewer data bits, the hash value is therefore easier to process
than the attribute itself.
[0054] Pursuant to the hashing of the packet attribute, the hash
value may be mathematically manipulated further, usually by adding
the hash value to the ECMP Table Base Pointer to generate a
modified pointer into the ECMP Table 220. Because the hash value
is, by definition, highly likely to be unique, each modified
pointer can reference a different ECMP Table entry or, when
multiple identical ECMP Table entries are present, a different set
of such identical ECMP Table entries. Since each ECMP Table entry
references a single L3 Table entry, each of the packets will be
distributed, based on the hashed attribute, over all of the
equal-cost paths available, as specified in the fifth step 550 of
the method illustrated in FIG. 5.
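A minimal sketch of this modified-pointer computation follows,
assuming a CRC-32 hash of the SIP as a stand-in for the device's
unspecified hash and reducing the hash modulo the group size so the
pointer stays within the route's block of entries (as the steps of
FIG. 7 later make explicit):

    import zlib

    def modified_pointer(sip, base_ptr, count):
        hashed = zlib.crc32(sip.encode())   # hash of the packet attribute (SIP)
        return base_ptr + (hashed % count)  # offset added to the ECMP Table Base Pointer

    # The resulting index selects one ECMP Table entry, which in turn
    # names one L3 Table entry and thus one equal-cost path.
    ecmp_index = modified_pointer("192.0.2.7", base_ptr=64, count=12)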
[0055] It should be noted that, according to certain embodiments of
the present invention, a user-programmable option may be included
to determine a hash selection from the ECMP Table. When such an
option is included, a hash of the Layer 3 source address (SIP) and
Layer 3 destination address (DIP) is typically used. For example,
the following function may be used to select an entry:
[0056] Entry_select = [hash_function(SIP, DIP) % sizeof(ECMP_entries)]
[0057] The above function typically concatenates SIP and DIP, both
of which may be 32 bits long, to form, for example, a 64-bit
quantity. Then, the 64-bit quantity generally has a hash function
performed upon it. The results of the hash function may then be
used to obtain a modulo result on the divisor, which may be, for
example, the number of equal-cost (ECMP) paths available. Since the
result will typically have a value between 0 and one less than the
number of ECMP paths available, this value may represent the ECMP
entry chosen.
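Under the assumption (stated only for illustration) that CRC-32 over
the packed 8-byte value stands in for the unspecified hash function,
the selection above might look like:

    import struct
    import zlib

    def entry_select(sip32, dip32, num_ecmp_entries):
        key64 = struct.pack(">II", sip32, dip32)     # SIP and DIP concatenated to 64 bits
        return zlib.crc32(key64) % num_ecmp_entries  # modulo picks the ECMP entry

    idx = entry_select(0xC0000207, 0xC0000101, 12)   # a value in [0, 11]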
[0058] When different numbers of identical ECMP Table entries point
to a first L3 Table entry and a second L3 Table entry, the
above-mentioned weighted distribution of packets among the
equal-cost paths becomes possible. For example, if nine identical
ECMP Table entries, each pointing to a first L3 Table entry that
instructs that the packet be distributed to a first path via a
first egress port are present in the ECMP Table 220, and only three
identical ECMP Table entries that point to a second L3 Table entry
that instructs that the packet be distributed to a second path via
a second egress port are present, then the packet is three times as
likely to be distributed to the first path than to the second path.
Under such circumstances, over time, packet traffic is generally
three times higher on the first path than on the second path and
the traffic is said to be distributed across the equal-cost paths
according to a weighted distribution.
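A quick simulation of this nine-versus-three example shows the
weighting emerge over many flows; the random flow addresses and the
CRC-32 stand-in hash are assumptions, and only the 3:1 replication
comes from the text:

    import random
    import zlib
    from collections import Counter

    ecmp_table = ["first path"] * 9 + ["second path"] * 3  # identical entries per path
    tally = Counter()
    for _ in range(12000):
        sip = random.getrandbits(32).to_bytes(4, "big")    # a random flow's SIP
        tally[ecmp_table[zlib.crc32(sip) % len(ecmp_table)]] += 1
    print(tally)  # roughly {'first path': 9000, 'second path': 3000} -- a 3:1 split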
[0059] Such a weighted distribution of traffic is especially
beneficial when network modules are unevenly loaded with traffic
from the rest of the network. For example, in FIG. 3, if the first
network device 315 is loaded with more general network traffic than
the second network device 325, then the switch 300 may distribute
more traffic to the second network device 325 by including more
ECMP Table entries in the ECMP Table 220 that point to or reference
an L3 Table entry that routes traffic to the second network device
325 than ECMP Table entries that point to an L3 Table entry that
routes traffic to the first network device 315.
[0060] Another advantage of the above-described method has to do
with the fact that, instead of using multiple LPM Table entries to
establish the presence of equal-cost paths, an ECMP value is used.
Hence, the LPM Table 210 according to certain embodiments of the
present invention is typically of a small size. Therefore, the
amount of time spent performing lookup algorithms is reduced.
Further, because of the small LPM Table, there is generally a
reduction in the amount of hardware, such as memory, that is used
and/or re-used.
[0061] FIG. 7 illustrates the steps of a representative algorithm
that is used, according to certain embodiments of the present
invention, for distributing data across a network according to a
weighted distribution. According to the first step 710 of the
algorithm, a match is made between a Longest Prefix of a first IP
packet and a first entry in an LPM Table. Then, according to the
second step 720, COUNT values, generally found in each of the
entries of the LPM Table, are used to specify how many ECMP Table
entries are allocated for each equal-cost path available to the
packet.
[0062] Next, according to the third step 730, a hashing function is
performed on a packet attribute of the first packet, thereby
obtaining a hashed result. According to the fourth step 740, the
hashed result is divided by the COUNT value of the first entry of
the LPM Table to obtain a remainder value. Then, according to the
fifth step 750, the remainder value is used as an offset that is
added to the ECMP Table Base Pointer found in the first entry of
the LPM Table to generate a modified pointer.
[0063] Following the above-listed steps, sixth step 760 specifies
that an ECMP Table entry be selected based on the modified pointer.
According to the seventh step 770, a pointer in the selected ECMP
Table entry is used to reference an entry in an L3 Table. According to
certain embodiments, the L3 Table Index is used to reference an
entry in the L3 Table. Finally, the eighth step 780 specifies that
the packet be forwarded to the distribution device egress port
specified in the selected L3 Table entry.
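Tying the steps together, the following end-to-end sketch follows
FIG. 7 under the same illustrative assumptions as the earlier
sketches (dictionary table layouts and a CRC-32 hash, neither of
which is the device's actual implementation):

    import zlib

    def forward(packet_sip, lpm_entry, ecmp_table, l3_table):
        if not lpm_entry["ecmp"]:                          # no equal-cost paths:
            l3 = l3_table[lpm_entry["l3_index"]]           # use the direct L3 Table Index
        else:
            hashed = zlib.crc32(packet_sip.encode())       # step 730: hash the attribute
            offset = hashed % lpm_entry["count"]           # step 740: remainder via COUNT
            ptr = lpm_entry["ecmp_tbl_base_ptr"] + offset  # step 750: modified pointer
            l3 = l3_table[ecmp_table[ptr]]                 # steps 760-770: ECMP -> L3 entry
        return l3["egress_port"]                           # step 780: forward the packet

    lpm = {"ecmp": True, "ecmp_tbl_base_ptr": 0, "count": 3, "l3_index": 0}
    ecmp = [0, 0, 1]                                       # path 0 carries twice the weight
    l3 = [{"egress_port": 2}, {"egress_port": 3}]
    print(forward("192.0.2.7", lpm, ecmp, l3))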
[0064] If the above-discussed algorithm, or variations thereof, is
performed on all packets entering a distribution device pursuant to
a determination that the packets have multiple equal-cost paths
available to them, then a weighted distribution of traffic over the
paths can be obtained. According to certain embodiments of the
present invention, as the presence or absence of equal-cost paths
varies over time, the various tables used to perform the weighted
distribution may be updated. Generally, this may be done according
to a best-fit algorithm, as specified in step 630 of FIG. 6,
wherein a compilation of sets of instructions used to perform the
weighted distribution is updated based on the best-fit algorithm.
However, other methods may also be used, including manual
alteration by a user.
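The application names a best-fit algorithm without detailing it; one
conventional reading, offered here purely as an assumption, is
best-fit allocation of contiguous ECMP Table blocks, choosing the
smallest free region that still holds a group's entries so the table
fragments slowly as routes come and go:

    # Hypothetical best-fit allocator for contiguous ECMP Table regions.
    def best_fit(free_regions, needed):
        # free_regions: list of (start, length) pairs describing free space
        candidates = [r for r in free_regions if r[1] >= needed]
        if not candidates:
            return None                                 # no region fits
        start, _ = min(candidates, key=lambda r: r[1])  # tightest fit wins
        return start

    print(best_fit([(0, 16), (32, 4), (48, 8)], needed=4))  # 32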
[0065] In order to illustrate a fifth representative method 800
according to certain embodiments of the present invention, FIG. 8
is provided. In FIG. 8, the first step 810 specifies that a
distribution device configured to distribute a set of packets of
data across a set of equal-cost paths in the network be provided.
The second step 820 then specifies that each packet in the set of
packets be distributed across the set of equal-cost paths according
to a weighted distribution. The second step 820 also specifies
using a packet attribute from each packet to perform the weighted
distribution.
[0066] In order to illustrate a sixth representative method 900
according to certain embodiments of the present invention, FIG. 9
is provided. In FIG. 9, the first step 910 specifies providing a
distribution device configured to distribute a set of packets of
data across a set of equal-cost paths in the network. The second
step 920 then specifies distributing each packet in the set of
packets across the set of equal-cost paths according to a weighted
distribution, performing a hashing function on the packet
attribute, and using the hashed packet attribute from each packet
to perform the weighted distribution.
[0067] In order to illustrate a seventh representative method 1000
according to certain embodiments of the present invention, FIG. 10
is provided. In FIG. 10, the first step 1010 specifies providing a
distribution device configured to distribute a set of packets of
data across a set of equal-cost paths in the network. The second
step 1020 then specifies obtaining a match between a longest prefix
in a first packet and a portion of a first set of instructions in a
first compilation of sets of instructions. Then, the third step
1030 specifies using a pointer portion from the first set of
instructions to select a second set of instructions from a second
compilation of sets of instructions, wherein the first set of
instructions includes a first value that specifies how much weight
is to be given to each equal-cost path in the set of equal-cost
paths.
[0068] Following these steps, the fourth step 1040 specifies
performing a hashing function on an attribute of the first packet
to obtain a hashed result. Then, fifth step 1050 specifies dividing
the hashed result by the first value, thereby obtaining a remainder
value, and using the remainder value to obtain an offset. Next,
sixth step 1060 specifies adding the offset to the first pointer to
select the second set of instructions. According to this step, the
second set of instructions typically includes a pointer to a third
set of instructions in a third compilation of sets of instructions.
Following the sixth step 1060, the seventh step 1070 specifies
forwarding the first packet to a port designated in the third set
of instructions. Finally, the eighth step 1080 specifies
distributing each packet in the set of packets across the set of
equal-cost paths.
[0069] One having ordinary skill in the art will readily understand
that the invention, as discussed above, may be practiced with steps
in a different order, and/or with hardware elements in
configurations which are different than those which are disclosed.
Therefore, although the invention has been described based upon
these preferred embodiments, it would be apparent to those of skill
in the art that certain modifications, variations, and alternative
constructions would be apparent, while remaining within the spirit
and scope of the invention. In order to determine the metes and
bounds of the invention, therefore, reference should be made to the
appended claims.
* * * * *