U.S. patent application number 12/346734 was filed with the patent office on 2008-12-30 and published on 2010-07-01 for filter for network intrusion and virus detection.
Invention is credited to Christopher F. Clark, Wajdi Feghali, Vinodh Gopal, Gilbert Wolrich.
Application Number | 12/346734 |
Publication Number | 20100169401 |
Family ID | 42286195 |
Filed Date | 2008-12-30 |
Publication Date | 2010-07-01 |
United States Patent Application | 20100169401
Kind Code | A1
GOPAL; VINODH ; et al. | July 1, 2010
FILTER FOR NETWORK INTRUSION AND VIRUS DETECTION
Abstract
Methods and apparatus to perform string matching for network
packet inspection are disclosed. In some embodiments there is a set
of string matching slice circuits, each slice circuit of the set
being configured to perform string matching steps in parallel with
other slice circuits. Each slice circuit may include an input
window storing some number of bytes of data from an input data
stream. The input window of data may be padded if necessary, and
then multiplied by a polynomial modulo an irreducible Galois-field
polynomial to generate a hash index. A storage location of a memory
corresponding to the hash index may be accessed to generate a
slice-hit signal of a set of H slice-hit signals. The slice-hit
signal may be provided to an AND-OR logic array where the set of H
slice-hit signals is logically combined into a match result.
Inventors: |
GOPAL; VINODH (Westboro, MA)
Clark; Christopher F. (Hopkinton, MA)
Wolrich; Gilbert (US)
Feghali; Wajdi (Boston, MA)
Correspondence
Address: |
LARRY MENNEMEIER;c/o Intellevate, LLC
P.O.Box 52050
Minneapolis
MN
55402
US
|
Family ID: | 42286195
Appl. No.: | 12/346734
Filed: | December 30, 2008
Current U.S. Class: | 708/316 ; 708/300
Current CPC Class: | H04L 63/1416 20130101; G06F 2207/025 20130101; G06F 16/9014 20190101; G06F 7/02 20130101; G06F 16/90344 20190101; G06F 21/567 20130101; H04L 63/0245 20130101; H04L 63/145 20130101
Class at Publication: | 708/316 ; 708/300
International Class: | G06F 17/10 20060101 G06F017/10
Claims
1. A method to perform string matching for network packet
inspection, the method comprising: configuring a set of H slice
circuits, each i.sup.th slice circuit of the set of H slice
circuits being configured to perform the steps of: storing an
i.sup.th input window of W.sub.i bytes of data from an input data
stream; padding the W.sub.i bytes of data if necessary, and
multiplying the W.sub.i bytes of data by a polynomial modulo an
irreducible Galois-field polynomial to generate an i.sup.th hash
index; and accessing a storage location of a memory corresponding
to the i.sup.th hash index to generate an i.sup.th slice-hit signal
of a set of H slice-hit signals; and providing the i.sup.th
slice-hit signal to an AND-OR logic array as one of the set of H
slice-hit signals.
2. The method of claim 1 wherein configuring each i.sup.th slice
circuit of the set of H slice circuits to perform the step of
providing the i.sup.th slice-hit signal to the AND-OR logic array
comprises: storing the i.sup.th slice-hit signal in the storage
location of the memory corresponding to the i.sup.th hash
index.
3. The method of claim 2 wherein each i.sup.th input window of
W.sub.i bytes of data from the input data stream comprises a
complete data pattern.
4. The method of claim 2 wherein providing the i.sup.th slice-hit
signal to the AND-OR logic array comprises: reading out the
i.sup.th slice-hit signal, from the storage location of the memory
corresponding to the i.sup.th hash index, to the AND-OR logic array
as the i.sup.th one of the set of H slice-hit signals.
5. The method of claim 2 wherein providing the i.sup.th slice-hit
signal to the AND-OR logic array comprises: multiplexing the
i.sup.th slice-hit signal from the storage location of the memory
corresponding to the i.sup.th hash index, to the AND-OR logic array
as the i.sup.th one of the set of H slice-hit signals.
6. The method of claim 1 further comprising: configuring the AND-OR
logic array to receive the set of H slice-hit signals and to
combine the set of H slice-hit signals into a match result.
7. The method of claim 6 wherein the AND-OR logic array is
configured to receive the set of H slice-hit signals and to
logically AND the set of H slice-hit signals into a match
result.
8. The method of claim 6 wherein the AND-OR logic array is
configured to receive the set of H slice-hit signals and to
logically OR the set of H slice-hit signals into a match
result.
9. The method of claim 6 wherein the AND-OR logic array is
configured to receive the set of H slice-hit signals and to
logically AND subsets of the set of H slice-hit signals into
temporary results, and to logically OR the temporary results into a
match result.
10. An apparatus comprising: an AND-OR logic array configurable to
receive a set of H slice-hit signals and to combine the set of H
slice-hit signals into a match result; and a set of H slice
circuits, each i.sup.th slice circuit of the set comprising: an
input window configurable to store W.sub.i bytes of data from an
input data stream; a Ghash unit coupled with the input window and
configurable to receive the W.sub.i bytes of data, pad the W.sub.i
bytes of data if necessary, and multiply the W.sub.i bytes of data
by a polynomial modulo an irreducible Galois-field polynomial to
generate an index; and a memory coupled with the Ghash unit and
configurable to access a storage location responsive to the index
to generate a slice-hit signal and to provide the slice-hit signal
to said AND-OR logic array as one of the set of H slice-hit
signals.
11. The apparatus of claim 10 wherein providing the slice-hit
signal to the AND-OR logic array comprises: reading out the
slice-hit signal, from the storage location of the memory
corresponding to the index of the i.sup.th slice circuit, to the
AND-OR logic array as the i.sup.th one of the set of H slice-hit
signals.
12. The apparatus of claim 10 wherein providing the slice-hit
signal to the AND-OR logic array comprises: multiplexing the
slice-hit signal, from the storage location of the memory
corresponding to the index of the i.sup.th slice circuit, to the
AND-OR logic array as the i.sup.th one of the set of H slice-hit
signals.
13. The apparatus of claim 10 wherein the AND-OR logic array is
configurable to receive the set of H slice-hit signals and to
logically AND the set of H slice-hit signals into a match
result.
14. The apparatus of claim 10 wherein the AND-OR logic array is
configurable to receive the set of H slice-hit signals and to
logically OR the set of H slice-hit signals into a match
result.
15. The apparatus of claim 10 wherein the AND-OR logic array is
configurable to receive the set of H slice-hit signals and to
logically AND subsets of the set of H slice-hit signals into
temporary results, and to logically OR the temporary results into a
match result.
16. The apparatus of claim 10 wherein the same irreducible
Galois-field polynomial is used in each i.sup.th slice circuit of
the set of H slice circuits.
17. The apparatus of claim 16 wherein the W.sub.i bytes of
data are multiplied by a different distinct polynomial in each
i.sup.th slice circuit of the set of H slice circuits.
18. A packet processing system to perform string matching for
network packet inspection, the system comprising: a system
processor; an AND-OR logic array configurable to receive a set of H
slice-hit signals and to combine the set of H slice-hit signals
into a match result; and a set of H slice circuits, each i.sup.th
slice circuit of the set comprising: an input window configurable
to store W.sub.i bytes of data from an input data stream; a Ghash
unit coupled with the input window and configurable to receive the
W.sub.i bytes of data, pad the W.sub.i bytes of data if necessary,
and multiply the W.sub.i bytes of data by a polynomial modulo an
irreducible Galois-field polynomial to generate an index; and a
memory coupled with the Ghash unit and configurable to access a
storage location responsive to the index to generate a slice-hit
signal and to provide the slice-hit signal to said AND-OR logic
array as one of the set of H slice-hit signals; and a machine
readable medium to store executable instructions, such that when
said executable instructions are executed by the system processor,
the system processor is caused to: set a pointer to a first
character of the input data stream to establish a starting point for
the input window of each i.sup.th slice circuit, and increment the
pointer until the match result is positive or until an end-of-file
is reached in the input data stream.
19. The system of claim 18 wherein the same irreducible
Galois-field polynomial is used in each i.sup.th slice circuit of
the set of H slice circuits.
20. The system of claim 19 wherein the W.sub.i bytes of data
are multiplied by a different distinct polynomial in each i.sup.th
slice circuit of the set of H slice circuits.
21. The system of claim 18 wherein the AND-OR logic array is
configurable to receive the set of H slice-hit signals and to
logically AND the set of H slice-hit signals into a match
result.
22. The system of claim 18 wherein the AND-OR logic array is
configurable to receive the set of H slice-hit signals and to
logically OR the set of H slice-hit signals into a match
result.
23. The system of claim 18 wherein the AND-OR logic array is
configurable to receive the set of H slice-hit signals and to
logically AND subsets of the set of H slice-hit signals into
temporary results, and to logically OR the temporary results into a
match result.
24. The system of claim 18 wherein providing the slice-hit signal
to the AND-OR logic array comprises: reading out the slice-hit
signal, from the storage location of the memory corresponding to
the index of the i.sup.th slice circuit, to the AND-OR logic array
as the i.sup.th one of the set of H slice-hit signals.
25. The system of claim 18 wherein providing the slice-hit signal
to the AND-OR logic array comprises: multiplexing the slice-hit
signal, from the storage location of the memory corresponding to
the index of the i.sup.th slice circuit, to the AND-OR logic array
as the i.sup.th one of the set of H slice-hit signals.
Description
FIELD OF THE DISCLOSURE
[0001] This disclosure relates generally to the field of network
processing. In particular, the disclosure relates to a novel filter
architecture to accelerate string matching in packet inspection for
network applications such as intrusion detection/prevention and
virus detection.
BACKGROUND OF THE DISCLOSURE
[0002] In modern networks, applications such as intrusion
detection/prevention and virus detection are important for
protecting the networks and/or network users from attacks. In such
applications network packets are often inspected to identify
problematic packets by finding matches to a known set of data
patterns. Matching every byte of an incoming data stream against a
large database of patterns (e.g. up to hundreds of thousands) is
very compute-intensive. Programs have used techniques such as
finite-state machines and filters to find matches to known
sets.
[0003] A Bloom filter, conceived by Burton H. Bloom in 1970, is a
probabilistic structure for determining whether an element is a
member of a set. Hashing is performed on the element. Multiple
different hash functions are used to generate multiple different
hash indices into an array of bits. To add or insert an element
into the set, these hash functions are used to index multiple bit
locations in the array for the element and these bit locations are
then set to one. To query the filter for an arbitrary element the
hash functions are used to index multiple bit locations in the
array for the element and these bit locations are then checked to
see if they are all set to one. If they are not all set to one, the
arbitrary element in question is not a member of the set.
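By way of illustration only, the insert and query operations described above may be sketched as follows. The class name `BloomFilter`, the parameters `m` and `k`, and the use of salted BLAKE2b digests as the k different hash functions are assumptions made for brevity and do not appear in the disclosure.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: an m-bit array probed by k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for readability

    def _indices(self, element: bytes):
        # Assumption: k different hash functions are derived by salting BLAKE2b.
        for j in range(self.k):
            digest = hashlib.blake2b(element, salt=j.to_bytes(2, "big")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, element: bytes) -> None:
        # Set every indexed bit location to one.
        for idx in self._indices(element):
            self.bits[idx] = 1

    def query(self, element: bytes) -> bool:
        # False: definitely not a member. True: possibly a member.
        return all(self.bits[idx] for idx in self._indices(element))
```

A query that returns False is conclusive, whereas a query that returns True may still be a false positive.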
[0004] Whenever a filter generates a positive outcome for an
element, which is not actually a member of the set, the outcome is
called a false positive. The Bloom filter will not generate a false
negative. It is a goal of any particular filter design, that the
probability of false positives is "small." For Bloom filters, after
inserting n elements into a set represented by an array of m bits
using k different hash functions, the probability of a false
positive is $(1-(1-1/m)^{kn})^{k}$.
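As a worked illustration, the false-positive expression above may be evaluated directly. The function name `bloom_false_positive_rate` is hypothetical; m, n and k have the meanings given in the preceding paragraph.

```python
def bloom_false_positive_rate(m: int, n: int, k: int) -> float:
    """Return (1 - (1 - 1/m)**(k*n))**k for m bits, n elements, k hash functions."""
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k
```

For a fixed m and k the value grows as more elements are inserted, which is why the array size is chosen with the expected pattern count in mind.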
[0005] Designing a filter for a specific problem may be tedious,
and at high data rates it is difficult or impossible for
state-of-the-art processors to implement the design at rates even
close to line-rate. To achieve rates close to one or more gigabits
per second, specialized field-programmable gate array solutions or
custom circuits have been proposed.
[0006] To date, more generalized reconfigurable architectures to
accelerate string matching in packet inspection for network
applications such as intrusion detection/prevention and virus
detection have not been fully explored.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] The present invention is illustrated by way of example and
not limitation in the figures of the accompanying drawings.
[0008] FIG. 1 illustrates one embodiment of a filter apparatus to
accelerate string matching in packet inspection for network
applications such as intrusion detection/prevention and virus
detection.
[0009] FIG. 2 illustrates a flow diagram for one embodiment of a
process to initialize a filter apparatus for string matching in
packet inspection.
[0010] FIG. 3 illustrates a flow diagram for one embodiment of a
process to utilize a filter apparatus for string matching in packet
inspection.
[0011] FIG. 4 illustrates one embodiment of a system employing a
filter apparatus to accelerate string matching in packet inspection
for network applications such as intrusion detection/prevention and
virus detection.
DETAILED DESCRIPTION
[0012] Methods and apparatus to perform string matching for network
packet inspection are disclosed below. In some embodiments, a
filter apparatus may be configured as a set of string matching
slice circuits, each slice circuit of the set being configured to
perform string matching steps in parallel with other slice
circuits. Each slice circuit may include an input window storing
some number of bytes of data from an input data stream. The input
window of data may be padded if necessary, and may be multiplied by
a distinct Galois-field polynomial modulo an irreducible
Galois-field polynomial to generate a hash index. A storage
location of a memory slice corresponding to the hash index may be
accessed to generate a slice-hit signal of a plurality of slice-hit
signals. The slice-hit signal may be provided to an AND-OR logic
array where the plurality of slice-hit signals is logically
combined into a match result.
[0013] Embodiments of such methods and apparatus represent
reconfigurable architectures to accelerate string matching in
packet inspection for network applications such as intrusion
detection/prevention and virus detection.
[0014] In the following description, numerous specific details are
set forth. However, it is understood that embodiments of the
invention may be practiced without these specific details. In other
instances, well-known circuits, structures and techniques have not
been shown in detail in order not to obscure the understanding of
this description. These and other embodiments of the present
invention may be realized in accordance with the following
teachings and it should be evident that various modifications and
changes may be made in the following teachings without departing
from the broader spirit and scope of the invention. The
specification and drawings are, accordingly, to be regarded in an
illustrative rather than a restrictive sense and the invention
measured only in terms of the claims and their equivalents.
[0015] FIG. 1 illustrates one embodiment of a filter apparatus 101
to accelerate string matching in packet inspection for network
applications such as intrusion detection/prevention and virus
detection. Filter apparatus 101 as shown includes an input data
stream 120, which may be in a system memory or may comprise an
optional data stream buffer of filter apparatus 101 for storing
packed data for inspection and/or a pattern database to initialize
filter apparatus 101. Filter apparatus 101 also includes a set of H
(e.g. 1-8) slice circuits 110-150, each i.sup.th slice circuit of
the set being configurable to provide an i.sup.th slice-hit signal
to a configurable AND-OR logic array 140 as one of a set of H
slice-hit signals. Slice circuits 110-150 respectively include
input windows 111-151, each configurable to store W.sub.i (e.g. 2-8)
bytes of data from input data stream 120, and Ghash units 112-152
coupled with input windows 111-151 and configurable to receive the
W.sub.i bytes of data, to pad the W.sub.i bytes of data if
necessary, and to multiply their respective W.sub.i bytes of data
by a polynomial modulo an irreducible Galois-field polynomial to
generate an index.
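The operation of a Ghash unit may be sketched in software as a carry-less multiplication followed by a polynomial reduction. The disclosure does not specify the degree of the irreducible polynomial, the padding rule, or the byte order. The sketch below assumes an 8-byte window, zero padding on the right, big-endian interpretation, and the degree-64 polynomial x^64 + x^4 + x^3 + x + 1. The names `clmul`, `gf_reduce` and `ghash_index` are hypothetical.

```python
# Assumption: irreducible polynomial x^64 + x^4 + x^3 + x + 1 over GF(2).
IRREDUCIBLE = (1 << 64) | 0x1B


def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)) polynomial multiplication of two integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf_reduce(value: int, modulus: int = IRREDUCIBLE) -> int:
    """Reduce a GF(2) polynomial modulo `modulus`."""
    mod_degree = modulus.bit_length() - 1
    while value.bit_length() > mod_degree:
        # Cancel the current leading term with a shifted copy of the modulus.
        shift = value.bit_length() - 1 - mod_degree
        value ^= modulus << shift
    return value


def ghash_index(window: bytes, multiplier: int, width: int = 8) -> int:
    """Pad the window if necessary, multiply by `multiplier`, reduce to an index."""
    padded = window.ljust(width, b"\x00")  # assumption: zero padding on the right
    return gf_reduce(clmul(int.from_bytes(padded, "big"), multiplier))
```

Because the arithmetic is over GF(2), the hash is linear: the index of the XOR of two windows equals the XOR of their indices.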
[0016] It will be appreciated that some embodiments of filter
apparatus 101 may use the same irreducible Galois-field polynomial
in each of the Ghash units 112-152 with H distinct polynomial
multipliers selected at random (each having a good mixture of 1's
and 0's) to generate H distinct hash indices, thus simplifying the
task of generating distinct hash indices for each Ghash unit. It
will also be appreciated that in embodiments of filter apparatus
101 where, unlike the Bloom filter, input windows 111-151 are
independently configurable to store W.sub.i bytes of data from
input data stream 120, the filter apparatus 101 may be used to solve
multiple problems of different sizes (e.g. a 2-byte match, a 3-byte
match, a 6-byte match, and an 8-byte match, etc.) at the same time
in parallel.
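One possible way to select H distinct multipliers at random is sketched below. The disclosure does not quantify a "good mixture" of 1's and 0's. The sketch assumes that between three eighths and five eighths of the bits should be set. The function name `pick_multipliers` and its parameters are hypothetical.

```python
import random
from typing import List


def pick_multipliers(h: int, bits: int = 64, seed: int = 0) -> List[int]:
    """Pick h distinct random multipliers with a balanced mix of 1s and 0s."""
    rng = random.Random(seed)
    chosen: List[int] = []
    while len(chosen) < h:
        candidate = rng.getrandbits(bits)
        ones = bin(candidate).count("1")
        # Assumption: "good mixture" means 3/8 to 5/8 of the bits are set.
        if bits * 3 // 8 <= ones <= bits * 5 // 8 and candidate not in chosen:
            chosen.append(candidate)
    return chosen
```

With a shared irreducible polynomial, each Ghash unit then differs only in the multiplier it is loaded with.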
[0017] Slice circuits 110-150, respectively, also include memories
113-153 coupled with the Ghash units 112-152 and configurable to
access respective storage locations responsive to their respective
indices (e.g. at the addresses specified by some field of bits from
respective indices) to each generate an i.sup.th slice-hit signal
and to provide the i.sup.th slice-hit signal to AND-OR logic
array 140 as one of the set of H slice-hit signals 115-155. Some
embodiments of memories 113-153 are configurable from a larger
memory 130 to serve as individual memories 113-153 for slice
circuits 110-150 respectively. Some alternative embodiments of
memories 113-153 may be N-entry (e.g. 1K entries) read/write
random-access memories (RAMs) of fixed width (e.g. 64-bits wide)
and are configurable to be combined into larger memories (e.g.
memory 130) as necessary (e.g. when a very large set of patterns is
required). Slice circuits 110-150 may also include multiplexers
114-154, respectively, configurable to access respective bit
storage locations responsive to portions of their respective
indices to generate the i.sup.th slice-hit signal and to provide
the i.sup.th slice-hit signal to AND-OR logic array 140 as one of
the set of H slice-hit signals 115-155.
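The memory access and multiplexer selection described above may be modeled as follows. `N_ENTRIES` and `WORD_BITS` take the example values from the text (1K entries, 64 bits wide). The particular split of the index into a 10-bit word address and a 6-bit bit select is an assumption, as are the function names.

```python
from typing import List, Tuple

N_ENTRIES = 1024  # e.g. 1K entries
WORD_BITS = 64    # e.g. 64-bit-wide words


def split_index(index: int) -> Tuple[int, int]:
    """Split a hash index into (word address, bit select).

    Assumption: the low 6 bits select a bit and the next 10 bits select a word.
    """
    bit_select = index % WORD_BITS
    address = (index // WORD_BITS) % N_ENTRIES
    return address, bit_select


def set_hit(memory: List[int], index: int) -> None:
    """Initialization: set the slice-hit bit addressed by the index."""
    address, bit_select = split_index(index)
    memory[address] |= 1 << bit_select


def read_hit(memory: List[int], index: int) -> int:
    """Matching: read one word, then select a single bit from it."""
    address, bit_select = split_index(index)
    word = memory[address]           # RAM read of one 64-bit entry
    return (word >> bit_select) & 1  # multiplexer picks the slice-hit bit
```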
[0018] AND-OR logic array 140 is configurable to receive a set of H
slice-hit signals 115-155 and to combine the set of H slice-hit
signals 115-155 into a match result 145, a copy of which may be
stored as a match result 185. Some embodiments of AND-OR logic
array 140 may be configurable to perform a simple AND (e.g. as in a
Bloom filter) or a simple OR (e.g. as in solving multiple problems
of different sizes in parallel) of the set of H slice-hit signals
115-155 to get a match result 145. Alternative embodiments of
AND-OR logic array 140 may be configurable to perform a complex
AND-OR of the set of H slice-hit signals 115-155 (e.g.
$\mathrm{temp}_k = \bigwedge_{i \in S_k} \text{slice-hit signal}_i$, and then
the final match result $= \bigvee_k \mathrm{temp}_k$) to get
a match result 145. The complex AND-OR of the set of H slice-hit
signals 115-155 may be used, for example, in embodiments of filter
apparatus 101 to provide multiple Bloom filters in parallel.
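The three configurations of the AND-OR logic array may be expressed with a single function, sketched below. The name `and_or_match`, the zero-based slice numbering, and the representation of each set S.sub.k as a list of slice numbers are assumptions.

```python
from typing import Iterable, Sequence


def and_or_match(hits: Sequence[int], groups: Iterable[Iterable[int]]) -> bool:
    """OR together the AND of each group of slice-hit signals.

    Each group plays the role of one set S_k of slice numbers.
    """
    return any(all(hits[i] for i in group) for group in groups)
```

A single group containing every slice gives the simple AND of a Bloom filter, one single-slice group per slice gives the simple OR, and groups such as `[[0, 1], [2, 3]]` model two Bloom filters operating in parallel.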
[0019] It will be appreciated that when a final match result is
positive, a verification process may be used to check against false
positives. Such a verification process may be slower than
using filter apparatus 101, so filter apparatus 101 should be
configured carefully to avoid frequent false
positives.
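The disclosure does not describe the verification process itself. One simple possibility, shown only as an assumption, is an exact comparison against the pattern set at the position where the filter reported a hit. The name `verify_match` is hypothetical.

```python
from typing import Iterable


def verify_match(data: bytes, pos: int, patterns: Iterable[bytes]) -> bool:
    """Exact check run only after the filter reports a positive match result."""
    return any(data.startswith(p, pos) for p in patterns)
```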
[0020] FIG. 2 illustrates a flow diagram for one embodiment of a
process 201 to initialize a filter apparatus for string matching in
packet inspection. Process 201 and other processes herein disclosed
are performed by processing blocks that may comprise dedicated
hardware or software or firmware operation codes executable by
general purpose machines or by special purpose machines or by a
combination of both.
[0021] In processing block 211 a set of H slice circuits are
configured. In processing block 212, i is set to zero (0). In
processing block 213, i is incremented. In processing block 214, i
is checked to see if it has exceeded H. It will be appreciated that
even though initialization of the H slice circuits is shown as an
iterative process 201, in at least some preferred embodiments of
process 201, the set of H slice circuits are configured to
concurrently perform initialization according to processing blocks
215-220 of process 201 for use in string matching during network
packet inspections. Therefore, for each of the H slice circuits
processing blocks 215-220 are executed as follows, before
proceeding to processing block 222.
[0022] In processing block 215 W.sub.i bytes of data are stored from
an input data stream in an i.sup.th input window. In processing
block 216 the W.sub.i bytes of data are padded if necessary. Then
in processing block 217 the W.sub.i bytes of data are multiplied by
a Galois-field polynomial modulo an irreducible Galois-field
polynomial to generate an i.sup.th hash index. In processing block
218 a storage location of a memory corresponding to the i.sup.th
hash index is accessed, and in processing block 220 an i.sup.th
slice-hit signal is stored (i.e. set) in the storage location of
the memory corresponding to the i.sup.th hash index. When all of
the H slice circuits have completed processing blocks 215-220 of
process 201, processing proceeds to processing block 222 where a
pointer in the input data stream is moved (e.g. to a new string in
the database). Then from processing block 224, if the data stream
is empty processing terminates. Otherwise processing repeats in
processing block 212.
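A software model of initialization process 201 is sketched below. Several details are assumptions rather than part of the disclosure: the degree-64 polynomial x^64 + x^4 + x^3 + x + 1, zero padding to 8 bytes, a slice memory of 65,536 bits (1K entries of 64 bits) addressed by the low bits of the index, and the rule that the i.sup.th slice is loaded only with patterns exactly W.sub.i bytes long. The names `gf_hash` and `init_slices` are hypothetical.

```python
from typing import List, Sequence, Tuple

POLY = (1 << 64) | 0x1B  # assumed irreducible polynomial
TABLE_BITS = 1 << 16     # assumed bits per slice memory (1K entries x 64 bits)


def gf_hash(window: bytes, multiplier: int) -> int:
    """Multiply a window of at most 8 bytes by `multiplier` over GF(2), modulo POLY."""
    a = int.from_bytes(window.ljust(8, b"\x00"), "big")  # pad if necessary
    acc = 0
    while multiplier:
        if multiplier & 1:
            acc ^= a
        a <<= 1
        if a >> 64:      # keep the running term reduced below degree 64
            a ^= POLY
        multiplier >>= 1
    return acc


def init_slices(patterns: Sequence[bytes],
                slices: Sequence[Tuple[int, int]]) -> List[bytearray]:
    """Return one bit table per (width, multiplier) slice, loaded from `patterns`."""
    tables = [bytearray(TABLE_BITS) for _ in slices]
    for pattern in patterns:                      # blocks 222/224: walk the database
        for i, (width, multiplier) in enumerate(slices):
            if len(pattern) != width:             # assumption: slice i takes W_i-byte patterns
                continue
            index = gf_hash(pattern, multiplier)  # blocks 215-217
            tables[i][index % TABLE_BITS] = 1     # blocks 218-220: set the slice-hit bit
    return tables
```

In hardware the inner loop over slices would run concurrently. The sequential loop here only mirrors the iterative form of the flow diagram.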
[0023] It will be appreciated that the process 201 may be iterated
hundreds to hundreds of thousands of times in order to
initialize a filter apparatus for string matching patterns in
packet inspection. Thus when the set of H slice circuits are
configured to concurrently perform initialization substantial
performance improvements may be realized. It will also be
appreciated that the process 201 of initializing a filter apparatus
(by setting slice-hit signals) may be performed in a manner
substantially similar to a process of utilizing a filter apparatus
for string matching (by reading the slice-hit signals) in packet
inspection. In some embodiments of processing block 222 a pointer
into the input data stream may be moved for each i.sup.th slice, in
such a way as to provide each i.sup.th slice with a new complete
pattern, whereas in utilizing a filter apparatus for string
matching a pointer into the input data stream may be simply
incremented.
[0024] FIG. 3 illustrates a flow diagram for one embodiment of a
process 301 to utilize a filter apparatus for string matching in
packet inspection. In processing block 311 a set of H slice
circuits are configured. In processing block 312, i is set to zero
(0). In processing block 313, i is incremented. In processing block
314, i is checked to see if it has exceeded H. Again, it will be
appreciated that even though utilization of the H slice circuits is
shown as an iterative process 301, in at least some preferred
embodiments of process 301, the set of H slice circuits are
configured to concurrently perform string matching according to
processing blocks 315-321 of process 301 for use during network
packet inspections. Therefore, for each of the H slice circuits
processing blocks 315-321 are executed as follows, before
proceeding to processing block 323.
[0025] In processing block 315 W.sub.i bytes of data are stored from
an input data stream in an i.sup.th input window. In processing
block 316 the W.sub.i bytes of data are padded if necessary. Then
in processing block 317 the W.sub.i bytes of data are multiplied by
a Galois-field polynomial modulo an irreducible Galois-field
polynomial to generate an i.sup.th hash index. In processing block
319 a storage location of a memory corresponding to the i.sup.th
hash index is accessed to generate an i.sup.th slice-hit signal of
a set of H slice-hit signals. In processing block 321 the i.sup.th
slice-hit signal is provided to an AND-OR logic array as one of the
set of H slice-hit signals. When all of the H slice circuits have
completed processing blocks 315-321 of process 301, processing
proceeds to processing block 323 where the AND-OR logic array is
configured to receive the set of H slice-hit signals and to combine
the set of H slice-hit signals into a match result. Then from
processing block 323 processing terminates.
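The matching steps of process 301 may be modeled as shown below. To keep the sketch short, the Ghash unit of each slice is represented by a caller-supplied function `hash_fn`, and each slice memory by a set of indices whose slice-hit bit is set. The names `SliceConfig`, `slice_hits` and `match_result` are hypothetical, and zero padding is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Set


@dataclass
class SliceConfig:
    width: int                       # W_i, bytes in this slice's input window
    hash_fn: Callable[[bytes], int]  # stand-in for the slice's Ghash unit
    table: Set[int]                  # indices whose slice-hit bit is set


def slice_hits(data: bytes, pos: int, slices: Sequence[SliceConfig]) -> List[int]:
    """Compute the H slice-hit signals for the windows starting at `pos`."""
    hits = []
    for s in slices:
        # Blocks 315-316: store W_i bytes in the window and pad if necessary.
        window = data[pos:pos + s.width].ljust(s.width, b"\x00")
        # Blocks 317-321: hash, access the memory, provide the slice-hit signal.
        hits.append(1 if s.hash_fn(window) in s.table else 0)
    return hits


def match_result(hits: Sequence[int], groups: Sequence[Sequence[int]]) -> bool:
    """Block 323: AND within each group of slices, then OR across groups."""
    return any(all(hits[i] for i in group) for group in groups)
```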
[0026] It will be appreciated that iterations of process 301 may be
configured in accordance with embodiments of filter apparatus 101
to substantially accelerate string matching in packet
inspection.
[0027] FIG. 4 illustrates one embodiment of a system 401 employing
a filter 480 to accelerate string matching in packet inspection for
network applications such as intrusion detection/prevention and
virus detection.
[0028] System 401 includes an input data stream 420, which may be
in system memory 470 as shown, or may comprise an optional data
stream buffer of filter 480 for storing packed data for inspection
and/or a pattern database to initialize filter 480.
[0029] Filter 480 includes a set of H slice circuits 410-450, each
i.sup.th slice circuit of the set being configurable to provide an
i.sup.th slice-hit signal to a configurable AND-OR logic array 440
as one of a set of H slice-hit signals. Slice circuits 410-450
respectively include input windows 411-451, each configurable to
store W.sub.i bytes of data from input data stream 420, and Ghash
units 412-452 coupled with input windows 411-451 and configurable
to receive the W.sub.i bytes of data, to pad the W.sub.i bytes of
data if necessary, and to multiply their respective W.sub.i bytes of
data by a polynomial modulo an irreducible Galois-field polynomial
to generate an index.
[0030] Slice circuits 410-450, respectively, also include memories
413-453 coupled with the Ghash units 412-452 and configurable to
access respective storage locations responsive to their respective
indices to each generate an i.sup.th slice-hit signal and to
provide the i.sup.th slice-hit signal to AND-OR logic array 440
as one of the set of H slice-hit signals 415-455. Memories 413-453
may be N-entry read/write RAMs of any fixed width and configurable
to be combined into larger memories (e.g. memory 430) as necessary.
Alternatively some embodiments of memories 413-453 may be
configurable from a larger memory 430. Slice circuits 410-450 may
also include multiplexers 414-454, respectively, configurable to
access respective bit storage locations responsive to portions of
their respective indices to generate the i.sup.th slice-hit signal
and to provide the i.sup.th slice-hit signal to AND-OR logic array
440 as one of the set of H slice-hit signals 415-455. AND-OR logic
array 440 may receive the set of H slice-hit signals 415-455 and
combine the set of H slice-hit signals 415-455 into a match result
445.
[0031] System 401 also includes system processor 460 to execute a
program 471 in system memory 470 to accelerate string matching in
packet inspection for network applications using filter 480, and to
move or increment a pointer 461 into input data stream 420 until a
match result 445 is positive (in the case of string matching for
packet inspections) or until an end-of-file is reached in the input
data stream 420. In some embodiments of system 401, processor 460
may check a copy of match result 445 stored in system memory 470 as
a match result 485 when string matching for packet inspections to
determine if match result 445 was positive.
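The pointer handling performed by the program on the system processor may be sketched as a simple loop. The filter is represented here by a caller-supplied function `match_at`, which is an assumption standing in for the hardware and is expected to return the match result for the windows starting at a given pointer value. The name `scan` is hypothetical.

```python
from typing import Callable, Optional


def scan(data: bytes, match_at: Callable[[bytes, int], bool]) -> Optional[int]:
    """Advance a pointer through `data` until the filter reports a match.

    Returns the pointer value at the first positive match result, or None
    if the end of the data is reached first.
    """
    pointer = 0                      # start at the first character
    while pointer < len(data):       # stop at end-of-file
        if match_at(data, pointer):  # positive match result from the filter
            return pointer
        pointer += 1                 # otherwise increment the pointer
    return None
```

A positive return value would typically be handed to a slower verification step before the packet is flagged.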
[0032] The above description is intended to illustrate preferred
embodiments of the present invention. From the discussion above it
should also be apparent that especially in such an area of
technology, where growth is fast and further advancements are not
easily foreseen, the invention may be modified in arrangement
and detail by those skilled in the art without departing from the
principles of the present invention within the scope of the
accompanying claims and their equivalents.
* * * * *