U.S. patent application number 10/361067 was filed with the patent office on 2003-02-07 and published on 2003-09-11 for systems and methods for automated whitelisting in monitored communications.
Invention is credited to Judge, Paul, Rajan, Guru.
Application Number | 20030172291 10/361067 |
Document ID | / |
Family ID | 29554084 |
Filed Date | 2003-02-07 |
United States Patent Application | 20030172291 |
Kind Code | A1 |
Judge, Paul; et al. | September 11, 2003 |
Systems and methods for automated whitelisting in monitored communications
Abstract
The present invention is directed to systems and methods for
detecting and preventing the delivery of unsolicited
communications. A communication transmitted over a communications
network is received and analyzed by a system processor. The system
processor can extract attributes from the communication and compare
extracted attributes to information stored in a system data store.
In processing the communication, the system processor may assign a
confidence level, a trust level, or other indicia of content. The
results of that processing, analysis, and comparison can be used to
direct the further handling of the communication. The system
processor can dispose of communications by quarantining, deleting,
or forwarding.
Inventors: | Judge, Paul (Alpharetta, GA); Rajan, Guru (Duluth, GA) |
Correspondence Address: | NEEDLE & ROSENBERG, P.C., SUITE 1000, 999 PEACHTREE STREET, ATLANTA, GA 30309-3915, US |
Family ID: | 29554084 |
Appl. No.: | 10/361067 |
Filed: | February 7, 2003 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10361067 | Feb 7, 2003 | |
10093553 | Mar 8, 2002 | |
10361067 | Feb 7, 2003 | |
10094211 | Mar 8, 2002 | |
10361067 | Feb 7, 2003 | |
10094266 | Mar 8, 2002 | |
Current U.S. Class: | 726/1 |
Current CPC Class: | H04L 63/1416 20130101; H04L 63/0245 20130101; H04L 51/212 20220501; G06F 21/554 20130101; H04L 63/14 20130101; G06F 21/577 20130101; H04L 63/1441 20130101; H04L 63/0236 20130101; H04L 63/0428 20130101; H04L 63/145 20130101; H04L 63/0263 20130101; H04L 63/1408 20130101; H04L 63/1433 20130101; H04L 63/1425 20130101 |
Class at Publication: | 713/200 |
International Class: | G06F 011/30 |
Claims
What is claimed is:
1. A system for detecting an unsolicited communication transmitted
over a communications network, the system comprising: a) an
interface coupling the system with the communications network; b) a
system data store capable of storing data associated with
communications transmitted over the communications network and one
or more whitelists; c) a system processor in communication with the
interface and the system data store, wherein the system processor
comprises one or more processing elements and wherein the system
processor: 1) receives a communication via the interface; 2)
compares the communication to at least one whitelist; and 3)
modifies at least one whitelist based on the communication.
2. The system of claim 1, wherein the system processor is
programmed or adapted to modify the at least one whitelist by
updating the at least one whitelist with data derived from the
received communication.
3. The system of claim 1, wherein the system processor is further
programmed or adapted to modify the at least one whitelist by
updating the at least one whitelist with data derived from inbound
or outbound communication traffic patterns.
4. The system of claim 3, wherein the system processor modifies the
at least one whitelist by updating the at least one whitelist with
data derived from inbound and outbound communication traffic
patterns.
5. The system of claim 1, wherein the system processor is
programmed or adapted to modify the at least one whitelist by
adding an entry to the at least one whitelist corresponding to a
destination address associated with the received communication.
6. The system of claim 1, wherein the system processor is further
programmed or adapted to assign a confidence level to received
communications.
7. The system of claim 6, wherein the system processor is further
programmed or adapted to forward a communication with an indication
of its confidence level.
8. The system of claim 1, wherein the communication is transmitted
or received over the Internet.
9. The system of claim 8, wherein the communication is an e-mail
communication.
10. The system of claim 1, wherein the communication comprises an
e-mail communication, an HTTP communication, an FTP communication,
a WAIS communication, a telnet communication or a Gopher
communication.
11. The system of claim 1, wherein the system processor is further
programmed or adapted to provide an interface for modifying the at
least one whitelist.
12. The system of claim 11, wherein the system processor is further
programmed or adapted to receive information from the provided
interface and apply changes to at least one whitelist based on
information received from the interface.
13. The system of claim 11, wherein the interface provides for
manual editing of the at least one whitelist.
14. The system of claim 1, wherein the system processor is further
programmed or adapted to determine deliverability of a received
communication by applying one or more tests.
15. The system of claim 14, wherein received communications
determined to be undeliverable are quarantined, discarded, or
forwarded.
16. The system of claim 14, wherein the system processor is further
programmed or adapted to forward the received communication for
delivery if it was determined to be deliverable.
17. The system of claim 14, wherein the system processor applies
each of the one or more tests in a parallel fashion.
18. The system of claim 14, wherein the system processor applies
each of the one or more tests in a sequential fashion.
19. The system of claim 14, wherein the system data store stores
configuration information and wherein the system processor applies
each of the one or more tests based upon configuration information
stored in the system data store.
20. The system of claim 14, wherein the system processor determines
deliverability by calculating a level of trust.
21. The system of claim 20, wherein the system processor determines
deliverability by comparing the level of trust to a threshold
level.
22. The system of claim 14, wherein the system processor determines
whether to deliver a received communication further based upon
configuration information stored in the system data store.
23. The system of claim 22, wherein the configuration information
comprises communication types, confidence information, time period
information, or combinations thereof.
24. The system of claim 14, wherein the system processor is further
programmed or adapted to select the one or more tests to determine
deliverability.
25. The system of claim 24, wherein the system processor selects
the one or more tests based upon communication type, configuration
information, or combinations thereof.
26. The system of claim 24, wherein the system processor is further
programmed or adapted to compare a received communication to the at
least one whitelist and wherein the system processor selects the
one or more tests based upon the comparison.
27. The system of claim 14, wherein the system processor is further
programmed or adapted to compare a received communication to the at
least one whitelist and to selectively bypass the determination of
deliverability based upon the comparison.
28. A method for detecting an unsolicited communication transmitted
over a communications network, the method comprising the steps of:
a) providing an interface for manually modifying at least one
whitelist; b) receiving an outbound communication of a type
selected from the group consisting of an e-mail communication, an
HTTP communication, an FTP communication, a WAIS communication, a
telnet communication and a Gopher communication; c) storing the
received outbound communication; and d) modifying the at least one
whitelist by adding or modifying an entry on the at least one
whitelist based upon a destination of the received outbound
communication.
29. The method of claim 28, and further comprising the steps of: a)
receiving an inbound communication; b) comparing the received
inbound communication to the at least one whitelist; c) selecting a
plurality of trust level tests based on a type associated with the
received inbound communication, configuration information, the
whitelist comparison or combinations thereof; d) determining
deliverability of the received inbound communication by applying
the selected plurality of trust level tests; e) assigning a
confidence level to the communication based upon the determined
deliverability; and f) quarantining, discarding, or forwarding the
received communication based on the assigned confidence level.
30. A system for detecting an unsolicited communication transmitted
over a communications network, the system comprising: a) means for
receiving an electronic communication; b) direction determination
means for determining if the received electronic communication is
inbound or outbound; c) whitelist modification means for updating
at least one whitelist based upon a received communication
determined to be outbound by the direction determination means by
adding an entry or updating an existing entry in the at least one
whitelist based upon the received communication; and d)
communication disposition means for disposing of a received
communication determined to be inbound by the direction
determination means by: 1) comparing the received inbound
communication to the at least one whitelist; 2) selecting a
plurality of trust level tests based on a type associated with the
received inbound communication, configuration information, the
whitelist comparison or combinations thereof; 3) assigning a
confidence level to the communication based upon the determined
deliverability; and 4) quarantining, discarding, or forwarding the
received communication based on the assigned confidence level.
31. Computer readable media storing instructions that upon execution
by a system processor cause the system processor to automatically
generate a whitelist based upon received outbound communication by
performing steps comprising: a) providing an interface for
manually modifying at least one whitelist; b) receiving an outbound
communication of a type selected from the group consisting of an
e-mail communication, an HTTP communication, an FTP communication,
a WAIS communication, a telnet communication and a Gopher
communication; c) storing the received outbound communication; and
d) modifying the at least one whitelist by adding or modifying an
entry on the at least one whitelist based upon a destination of
the received outbound communication.
Description
CROSS-REFERENCE TO RELATED PATENT APPLICATIONS
[0001] This application is a continuation-in-part of commonly
assigned U.S. patent application Ser. Nos. 10/093,553; 10/094,211;
and 10/094,266 all filed on Mar. 8, 2002, which are hereby
incorporated herein in their entirety.
BACKGROUND
[0002] The present invention is directed to methods and systems for
automated and/or authenticated whitelisting for accurate
communications filtering. More specifically, without limitation,
the present invention relates to computer-based systems and methods
for automated whitelist generation based on outbound traffic
associated with electronic communications transmitted over a
communications network.
[0003] The Internet is a global network of connected computer
networks. Over the last several years, the Internet has grown in
significant measure. A large number of computers on the Internet
provide information in various forms. Anyone with a computer
connected to the Internet can potentially tap into this vast pool
of information.
[0004] The information available via the Internet encompasses
information available via a variety of types of application layer
information servers such as SMTP (simple mail transfer protocol),
POP3 (Post Office Protocol), GOPHER (RFC 1436), WAIS, HTTP
(Hypertext Transfer Protocol, RFC 2616) and FTP (file transfer
protocol, RFC 1123).
[0005] One of the most widespread methods of providing information
over the Internet is via the World Wide Web (the Web). The Web
consists of a subset of the computers connected to the Internet;
the computers in this subset run Hypertext Transfer Protocol (HTTP)
servers (Web servers). Several extensions and modifications to HTTP
have been proposed including, for example, an extension framework
(RFC 2774) and authentication (RFC 2617). Information on the
Internet can be accessed through the use of a Uniform Resource
Identifier (URI, RFC 2396). A URI uniquely specifies the location
of a particular piece of information on the Internet. A URI will
typically be composed of several components. The first component
typically designates the protocol by which the addressed piece of
information is accessed (e.g., HTTP, GOPHER, etc.). This first
component is separated from the remainder of the URI by a colon
(`:`). The remainder of the URI will depend upon the protocol
component. Typically, the remainder designates a computer on the
Internet by name, or by IP number, as well as a more specific
designation of the location of the resource on the designated
computer. For instance, a typical URI for an HTTP resource might
be:
[0006] http://www.server.com/dir1/dir2/resource.htm where http is
the protocol, www.server.com is the designated computer and
/dir1/dir2/resource.htm designates the location of the resource on
the designated computer. The term URI includes Uniform Resource
Names (URN's) including URN's as defined according to RFC 2141.
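By way of illustration only, the decomposition of the example URI into its protocol, designated computer and resource location can be sketched with Python's standard `urllib.parse` module. The variable names are assumptions chosen for this sketch and do not appear in the specification.

```python
from urllib.parse import urlsplit

# Example URI taken from the paragraph above.
uri = "http://www.server.com/dir1/dir2/resource.htm"

parts = urlsplit(uri)
protocol = parts.scheme    # component before the first colon
computer = parts.hostname  # designated computer, by name or IP number
location = parts.path      # location of the resource on that computer

print(protocol, computer, location)
```

The same call accepts other protocol components (for example a `gopher:` or `ftp:` URI); only the interpretation of the remainder differs.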
[0007] Web servers host information in the form of Web pages;
collectively the server and the information hosted are referred to
as a Web site. A significant number of Web pages are encoded using
the Hypertext Markup Language (HTML), although other encodings such as
eXtensible Markup Language (XML) or XHTML are also used. The published
specifications for these languages are incorporated by reference
herein; such specifications are available from the World Wide Web
Consortium and its Web site (http://www.w3c.org). Web pages in
these formatting languages may include links to other Web pages on
the same Web site or another. As will be known to those skilled in
the art, Web pages may be generated dynamically by a server by
integrating a variety of elements into a formatted page prior to
transmission to a Web client. Web servers, and information servers
of other types, await requests for the information from Internet
clients.
[0008] Client software has evolved that allows users of computers
connected to the Internet to access this information. Advanced
clients such as Netscape's Navigator and Microsoft's Internet
Explorer allow users to access software provided via a variety of
information servers in a unified client environment. Typically,
such client software is referred to as browser software.
[0009] Electronic mail (e-mail) is another widespread application
using the Internet. A variety of protocols are often used for
e-mail transmission, delivery and processing including SMTP and
POP3 as discussed above. These protocols refer, respectively, to
standards for communicating e-mail messages between servers and for
server-client communication related to e-mail messages. These
protocols are defined respectively in particular RFC's (Request for
Comments) promulgated by the IETF (Internet Engineering Task
Force). The SMTP protocol is defined in RFC 821, and the POP3
protocol is defined in RFC 1939.
[0010] Since the inception of these standards, various needs have
evolved in the field of e-mail leading to the development of
further standards including enhancements or additional protocols.
For instance, various enhancements have evolved to the SMTP
standards leading to the evolution of extended SMTP. Examples of
extensions may be seen in (1) RFC 1869 that defines a framework for
extending the SMTP service by defining a means whereby a server
SMTP can inform a client SMTP as to the service extensions it
supports and in (2) RFC 1891 that defines an extension to the SMTP
service, which allows an SMTP client to specify (a) that delivery
status notifications (DSNs) should be generated under certain
conditions, (b) whether such notifications should return the
contents of the message, and (c) additional information, to be
returned with a DSN, that allows the sender to identify both the
recipient(s) for which the DSN was issued, and the transaction in
which the original message was sent.
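As a rough sketch of the extension framework described above, a client may inspect the multi-line reply a server returns to the EHLO command to learn which service extensions, such as DSN, are supported. The function name, the sample host name and the sample reply text below are hypothetical; only the general reply layout (a reply code, then a hyphen on continuation lines or a space on the final line) is taken from the extended SMTP framework.

```python
def parse_ehlo_reply(reply):
    """Map each advertised extension keyword to its list of parameters."""
    extensions = {}
    lines = reply.splitlines()
    # The first line carries the server's domain and greeting, not an extension.
    for line in lines[1:]:
        if not line.startswith("250"):
            continue
        # Characters 0-2 are the reply code; character 3 is '-' or ' '.
        fields = line[4:].split()
        if fields:
            extensions[fields[0].upper()] = fields[1:]
    return extensions


# Hypothetical server reply used only for illustration.
SAMPLE_REPLY = (
    "250-mail.example.com Hello\r\n"
    "250-SIZE 10240000\r\n"
    "250-DSN\r\n"
    "250 8BITMIME"
)

supported = parse_ehlo_reply(SAMPLE_REPLY)
supports_dsn = "DSN" in supported
```

A client would request delivery status notifications only when `supports_dsn` is true.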
[0011] In addition, the IMAP protocol has evolved as an alternative
to POP3 that supports more advanced interactions between e-mail
servers and clients. This protocol is described in RFC 2060.
[0012] The various standards discussed above by reference to
particular RFC's are hereby incorporated by reference herein for
all purposes. These RFC's are available to the public through the
IETF and can be retrieved from its Web site
(http://www.ietf.org/rfc.html). The specified protocols are not
intended to be limited to the specific RFC's quoted herein above
but are intended to include extensions and revisions thereto. Such
extensions and/or revisions may or may not be encompassed by
current and/or future RFC's.
[0013] A host of e-mail server and client products have been
developed in order to foster e-mail communication over the
Internet. E-mail server software includes such products as
sendmail-based servers, Microsoft Exchange, Lotus Notes Server, and
Novell GroupWise; sendmail-based servers refer to a number of
variations of servers originally based upon the sendmail program
developed for the UNIX operating systems. A large number of e-mail
clients have also been developed that allow a user to retrieve and
view e-mail messages from a server; example products include
Microsoft Outlook, Microsoft Outlook Express, Netscape Messenger,
and Eudora. In addition, some e-mail servers, or e-mail servers in
conjunction with a Web server, allow a Web browser to act as an
e-mail client using the HTTP standard.
[0014] As the Internet has become more widely used, it has also
created new risks for corporations. Breaches of computer security
by hackers and intruders and the potential for compromising
sensitive corporate information are a very real and serious threat.
Organizations have deployed some or all of the following security
technologies to protect their networks from Internet attacks:
[0015] Firewalls have been deployed at the perimeter of corporate
networks. Firewalls act as gatekeepers and allow only authorized
users to access a company network. Firewalls play an important role
in controlling traffic into networks and are an important first
step to provide Internet security.
[0016] Intrusion detection systems (IDS) are being deployed
throughout corporate networks. While the firewall acts as a
gatekeeper, IDS act like a video camera. IDS monitor network
traffic for suspicious patterns of activity, and issue alerts when
that activity is detected. IDS proactively monitor your network 24
hours a day in order to identify intruders within a corporate or
other local network.
[0017] Firewall and IDS technologies have helped corporations to
protect their networks and defend their corporate information
assets. However, as use of these devices has become widespread,
hackers have adapted and are now shifting their point-of-attack
from the network to Internet applications. The most vulnerable
applications are those that require a direct, "always-open"
connection with the Internet such as web and e-mail. As a result,
intruders are launching sophisticated attacks that target security
holes within these applications.
[0018] Many corporations have installed a network firewall, as one
measure in controlling the flow of traffic in and out of corporate
computer networks, but when it comes to Internet application
communications such as e-mail messages and Web requests and
responses, corporations often allow employees to send and receive
from or to anyone or anywhere inside or outside the company. This
is done by opening a port, or hole in their firewall (typically,
port 25 for e-mail and port 80 for Web), to allow the flow of
traffic. Firewalls do not scrutinize traffic flowing through this
port. This is similar to deploying a security guard at a company's
entrance but allowing anyone who looks like a serviceman to enter
the building. An intruder can pretend to be a serviceman, bypass
the perimeter security, and compromise the serviced Internet
application.
[0019] FIG. 1 depicts a typical prior art server access
architecture. Within a corporation's local network 190, a variety
of computer systems may reside. These systems typically include
application servers 120 such as Web servers and e-mail servers,
user workstations running local clients 130 such as e-mail readers
and Web browsers, and data storage devices 110 such as databases
and network connected disks. These systems communicate with each
other via a local communication network such as Ethernet 150.
Firewall system 140 resides between the local communication network
and Internet 160. Connected to the Internet 160 are a host of
external servers 170 and external clients 180.
[0020] Local clients 130 can access application servers 120 and
shared data storage 110 via the local communication network.
External clients 180 can access external application servers 170
via the Internet 160. In instances where a local server 120 or a
local client 130 requires access to an external server 170 or where
an external client 180 or an external server 170 requires access to
a local server 120, electronic communications in the appropriate
protocol for a given application server flow through "always open"
ports of firewall system 140.
[0021] The security risks do not stop there. After taking over the
mail server, it is relatively easy for the intruder to use it as a
launch pad to compromise other business servers and steal critical
business information. This information may include financial data,
sales projections, customer pipelines, contract negotiations, legal
matters, and operational documents. This kind of hacker attack on
servers can cause immeasurable and irreparable losses to a
business.
[0022] In the 1980's, viruses were spread mainly by floppy
diskettes. In today's interconnected world, applications such as
e-mail serve as a transport for easily and widely spreading
viruses. Viruses such as "I Love You" use the technique exploited
by distributed Denial of Service (DDOS) attackers to mass
propagate. Once the "I Love You" virus is received, the recipient's
Microsoft Outlook sends emails carrying viruses to everyone in the
Outlook address book. The "I Love You" virus infected millions of
computers within a short time of its release. Trojan horses, such
as Code Red, use this same technique to propagate themselves.
Viruses and Trojan horses can cause significant lost productivity
due to down time and the loss of crucial data.
[0023] The Nimda worm simultaneously attacked both email and web
applications. It propagated itself by creating and sending
infectious email messages, infecting computers over the network and
striking vulnerable Microsoft IIS Web servers, deployed on Exchange
mail servers to provide web mail.
[0024] Most e-mail and Web requests and responses are sent in plain
text today, making them just as exposed as a postcard. This includes
the e-mail message, its header, and its attachments, or in a Web
context, a user name and password and/or cookie information in an
HTTP request. In addition, when you dial into an Internet Service
Provider (ISP) to send or receive e-mail messages, the user ID and
password are also sent in plain text, which can be snooped, copied,
or altered. This can be done without leaving a trace, making it
impossible to know whether a message has been compromised.
[0025] As the Internet has become more widely used, it has also
created new troubles for users. In particular, the amount of "spam"
received by individual users has increased dramatically in the
recent past. Spam, as used in this specification, refers to any
communication the receipt of which is either unsolicited or not desired
by its recipient.
[0026] The following are additional security risks caused by
Internet applications:
[0027] E-mail spamming consumes corporate resources and impacts
productivity. Furthermore, spammers use a corporation's own mail
servers for unauthorized email relay, making it appear as if the
message is coming from that corporation.
[0028] E-mail and Web abuse, such as sending and receiving
inappropriate messages and Web pages, are creating liabilities for
corporations. Corporations are increasingly facing litigation for
sexual harassment or slander due to e-mail their employees have
sent or received.
[0029] Regulatory requirements such as the Health Insurance
Portability and Accountability Act (HIPAA) and the
Gramm-Leach-Bliley Act (regulating financial institutions) create
liabilities for companies where confidential patient or client
information may be exposed in e-mail and/or Web servers or
communications including e-mails, Web pages and HTTP requests.
[0030] Using the "always open" port, a hacker can easily reach an
appropriate Internet application server, exploit its
vulnerabilities, and take over the server. This provides hackers
easy access to information available to the server, often including
sensitive and confidential information. The systems and methods
according to the present invention provide enhanced security for
communications involved with such Internet applications requiring
an "always-open" connection.
[0031] Anti-spam systems in use today include fail-open systems in
which all incoming messages are filtered for spam. In these
systems, a message is considered not to be spam until some form of
examination proves otherwise. A message is determined to be spam
based on an identification technique. Operators of such systems
continue to invest significant resources in efforts to reduce the
number of legitimate messages that are misclassified as spam. The
penalties for any misclassification are significant and therefore
most systems are designed to be predisposed not to classify
messages as spam.
[0032] One such approach requires a user to explicitly list users
from whom email is desirable. Such a list is one type of
"whitelist". There are currently two approaches for creating such a
whitelist. In a desktop environment, an end-user can import an
address book as the whitelist. This approach can become a burden
when operated at a more central location such as the gateway of an
organization. Therefore, some organizations only add a few entries
to the whitelist as necessary. In that case, however, the full
effect of whitelisting is not achieved. The present invention
improves upon these systems by including a system that allows a
more effective solution for whitelisting while requiring reduced
manual effort by end-users or administrators. The present invention
also allows a whitelist system to be strengthened by authenticating
sender information. Some exemplary known whitelist and/or spam
detection systems are described in U.S. Pat. No. 6,052,709, U.S.
Pat. No. 6,161,130 and U.S. patent application Ser. No. 10/154,137
(publication 2002/0199095 A1), the disclosures of which are
incorporated herein by this reference.
[0033] Many systems in use today employ a fail-closed system in
which a sender must prove its legitimacy. A common example of this
type of system uses a challenge and response. Such a system blocks
all messages from unknown senders and itself sends a confirmation
message to the sender. The sender must respond to verify that it is
a legitimate sender. If the sender responds, the sender is added to
the whitelist. However, spammers can create tools to respond to the
confirmation messages. Some confirmation messages are more advanced
in an effort to require that a human send the response. The present
invention is an improvement upon these systems. The present
invention can reference information provided by users to determine
who should be whitelisted rather than rely on the sender's
confirmation. The systems and methods according to the present
invention provide enhanced accuracy in the automated processing of
electronic communications.
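For context, the fail-closed challenge and response approach described above might be sketched as follows. This is an illustrative model of the prior approach, not of the present invention; the class name, method names and token scheme are assumptions made for the sketch.

```python
import secrets


class ChallengeResponseFilter:
    """Fail-closed filter: unknown senders are held until they confirm."""

    def __init__(self):
        self.whitelist = set()
        self.pending = {}  # confirmation token -> (sender, held message)

    def receive(self, sender, message):
        sender = sender.lower()
        if sender in self.whitelist:
            return ("deliver", None)
        # Unknown sender: hold the message and issue a confirmation token.
        token = secrets.token_hex(8)
        self.pending[token] = (sender, message)
        return ("challenge", token)

    def confirm(self, token):
        """Release the held message and whitelist its sender, if the token is valid."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return None
        sender, message = entry
        self.whitelist.add(sender)
        return message


challenge_filter = ChallengeResponseFilter()
action, token = challenge_filter.receive("Sender@example.com", "first message")
released = challenge_filter.confirm(token)
second = challenge_filter.receive("sender@example.com", "second message")
```

Because `confirm` accepts any party that returns the token, an automated tool can satisfy it, which is the weakness noted above.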
SUMMARY
[0034] The present invention is directed to methods and systems for
automated and/or authenticated whitelisting for accurate
communications filtering. One preferred embodiment according to the
present invention includes a system data store (SDS), a system
processor and one or more interfaces to one or more communications
networks over which electronic communications are transmitted and
received. The SDS stores data needed to provide the desired system
functionality and may include, for example, received
communications, data associated with such communications,
information related to known security risks, information related to
corporate policy with respect to communications for one or more
applications (e.g., corporate e-mail policy, Web access guidelines,
message interrogation parameters, and whitelists) and predetermined
responses to the identification of particular security risks,
situations or anomalies.
[0035] The SDS may include multiple physical and/or logical data
stores for storing the various types of information. Data storage
and retrieval functionality may be provided by either the system
processor or data storage processors associated with the data
store. The system processor is in communication with the SDS via
any suitable communication channel(s); the system processor is in
communication with the one or more interfaces via the same, or
differing, communication channel(s). The system processor may
include one or more processing elements that provide electronic
communication reception, transmission, interrogation, analysis
and/or other functionality.
[0036] Accordingly, one preferred method of automated whitelisting
includes a variety of steps that may, in certain embodiments, be
executed by the environment summarized above and more fully
described below or be stored as computer executable instructions in
and/or on any suitable combination of computer-readable media. In
some embodiments, an electronic communication directed to or
originating from an application server is received. The source of
the electronic communication may be any appropriate internal or
external client or any appropriate internal or external application
server. One or more tests are applied to the received electronic
communication to evaluate the received electronic communication for
a particular security risk. A risk profile associated with the
received electronic communication is stored based upon this
testing. The stored risk profile is compared against data
accumulated from previously received electronic communications to
determine whether the received electronic communication is
anomalous. If the received communication is determined to be
anomalous, an anomaly indicator signal is output. The output
anomaly indicator signal may, in some embodiments, notify an
application server administrator of the detected anomaly by an
appropriate notification mechanism (e.g., pager, e-mail, etc.) or
trigger some corrective measure such as shutting down the
application server totally, or partially (e.g., deny access to all
communications from a particular source).
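The comparison of a stored risk profile against data accumulated from previously received communications can be illustrated with a minimal sketch. The specification does not state how the comparison is made; the deviation-from-mean rule, the threshold of three standard deviations, the minimum history size and all names below are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


class AnomalyDetector:
    """Flags test scores that deviate strongly from accumulated history."""

    def __init__(self, threshold=3.0, min_history=5):
        self.history = {}  # test name -> list of past scores
        self.threshold = threshold
        self.min_history = min_history

    def check(self, risk_profile):
        """Return names of anomalous tests, then store the profile."""
        anomalous = []
        for test_name, score in risk_profile.items():
            past = self.history.setdefault(test_name, [])
            if len(past) >= self.min_history:
                mu, sigma = mean(past), pstdev(past)
                if sigma == 0:
                    # Constant history: any different score is anomalous.
                    is_anomalous = score != mu
                else:
                    is_anomalous = abs(score - mu) / sigma > self.threshold
                if is_anomalous:
                    anomalous.append(test_name)
            past.append(score)
        return anomalous


detector = AnomalyDetector()
normal_results = [detector.check({"attachment_size": s}) for s in [1, 2, 1, 2, 1, 2]]
spike_result = detector.check({"attachment_size": 50})
```

A non-empty return value would correspond to outputting the anomaly indicator signal, for example by notifying an administrator.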
[0037] In some embodiments, an electronic communication directed to
or originating from an email server is received. One or more tests
can be applied to the received electronic communication to compare
the sender's address in the received electronic communication to
addresses contained in one or more whitelists.
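One way to picture the whitelist comparison, together with the automated whitelist generation from outbound traffic described in the Background, is the following minimal sketch. Determining direction from the sender's domain, and the class and method names, are assumptions made for this illustration; the specification does not prescribe them.

```python
class AutoWhitelist:
    """Whitelist populated from the destinations of outbound messages."""

    def __init__(self, local_domains):
        self.local_domains = {d.lower() for d in local_domains}
        self.entries = set()

    def _is_local(self, address):
        return address.rsplit("@", 1)[-1].lower() in self.local_domains

    def process(self, sender, recipients):
        """Return 'outbound', 'whitelisted', or 'unknown' for one message."""
        if self._is_local(sender):
            # Outbound: add each external destination address to the whitelist.
            for recipient in recipients:
                if not self._is_local(recipient):
                    self.entries.add(recipient.lower())
            return "outbound"
        # Inbound: compare the sender's address to the whitelist.
        return "whitelisted" if sender.lower() in self.entries else "unknown"


whitelist = AutoWhitelist(["corp.example"])
before = whitelist.process("bob@partner.example", ["alice@corp.example"])
sent = whitelist.process("alice@corp.example", ["Bob@partner.example"])
after = whitelist.process("bob@partner.example", ["alice@corp.example"])
```

In this sketch an `unknown` result would lead to further testing rather than rejection, and a sender address is trusted as given, so it does not address forged sender information.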
[0038] Some embodiments may also support a particular approach to
testing the received electronic communication, which may also be
applicable for use in network level security and intrusion
detection. In such embodiments, each received communication is
interrogated by a plurality of interrogation engines where each
such interrogation engine is of a particular type designed to test
the communication for a particular security risk. Each received
communication is interrogated by a series of interrogation engines
of differing types. The ordering and selection of interrogation
engine types for use with received communications may, in some
embodiments, be configurable, whereas in others the ordering and
selection may be fixed.
[0039] Associated with each interrogation engine is a queue of
indices for communications to be evaluated by the particular
interrogation engine. When a communication is received, it is
stored and assigned an index. The index for the received
communication is placed in a queue associated with an interrogation
engine of a particular type as determined by the interrogation engine
ordering. Upon completion of the assessment of the received
communication by the interrogation engine associated with the
assigned queue, the index is assigned to a new queue associated
with an interrogation engine of the next type as determined by the
interrogation engine ordering. The assignment process continues
until the received communication has been assessed by an
interrogation engine of each type as determined by the
interrogation engine selection. If the communication successfully
passes an interrogation engine of each type, the communication is
forwarded to its appropriate destination. In some embodiments, if
the communication fails any particular engine, a warning indicator
signal may be output; in some such embodiments, the communication
may then be forwarded with or without an indication of its failure
to its appropriate destination, to an application administrator,
or to both.
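The index-queue approach described above can be sketched in Python roughly as follows. This is a simplified, single-threaded illustration: the name `run_pipeline`, the engine callables and the result strings are hypothetical, and the sketch stops a communication at its first failure, whereas the description also contemplates forwarding a failed communication with a warning.

```python
from collections import deque
from typing import Callable

Engine = Callable[[str], bool]  # returns True when the communication passes


def run_pipeline(messages: list[str],
                 ordering: list[tuple[str, Engine]]) -> dict[int, str]:
    """Move each message index through one queue per engine type, in order."""
    store: dict[int, str] = {}
    queues = {name: deque() for name, _ in ordering}
    results: dict[int, str] = {}

    # Store each received communication, assign it an index, and place the
    # index in the queue of the first engine type in the ordering.
    for index, message in enumerate(messages):
        store[index] = message
        queues[ordering[0][0]].append(index)

    # Each engine drains its own queue; indices that pass are assigned to
    # the queue of the next engine type.
    for position, (name, engine) in enumerate(ordering):
        queue = queues[name]
        while queue:
            index = queue.popleft()
            if not engine(store[index]):
                results[index] = f"failed:{name}"
            elif position + 1 < len(ordering):
                queues[ordering[position + 1][0]].append(index)
            else:
                results[index] = "forwarded"
    return results
```

Only the indices move between queues; the stored communications themselves stay in place, which is the point of associating each engine with a queue of indices.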
[0040] In some embodiments using this queuing approach, the
assignment of an index for a received communication to a queue for
an interrogation engine of a particular type may involve an
evaluation of the current load across all queues for the particular
interrogation engine type. If a threshold load exists, a new
instance of an interrogation engine of the particular type may be
spawned with an associated index queue. The index for the received
communication may then be assigned to the queue associated with the
interrogation engine instance. In some embodiments, the load across
the queues associated with the particular type may be redistributed
across the queues including the one associated with the new
interrogation engine instance prior to the assignment of the index
associated with the newly received communication to the queue. Some
embodiments may also periodically, or at particular times such as a
determination that a particular queue is empty, evaluate the load
across queues for a type of interrogation engine and, if an
inactivity threshold is met, shut down excess interrogation engine
instances of that type and disassociate or deallocate the index
queues associated with the shut-down instances.
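A minimal Python sketch of the load-based instance management described above follows. The class name `EnginePool`, the use of average pending indices per queue as the load measure, and the rule of retiring only empty queues are all assumptions made for illustration; the description leaves the load metric and the inactivity threshold open.

```python
from collections import deque


class EnginePool:
    """Index queues for the instances of one interrogation engine type."""

    def __init__(self, spawn_threshold: int):
        self.spawn_threshold = spawn_threshold  # assumed: pending indices per queue
        self.queues = [deque()]                 # one queue per engine instance

    def total_load(self) -> int:
        return sum(len(queue) for queue in self.queues)

    def assign(self, index: int) -> None:
        # Spawn a new instance (with its own queue) when the threshold is met,
        # then redistribute the pending indices before assigning the new one.
        if self.total_load() >= self.spawn_threshold * len(self.queues):
            self.queues.append(deque())
            self._redistribute()
        min(self.queues, key=len).append(index)

    def _redistribute(self) -> None:
        pending = [index for queue in self.queues for index in queue]
        for queue in self.queues:
            queue.clear()
        for position, index in enumerate(pending):
            self.queues[position % len(self.queues)].append(index)

    def reap_idle(self) -> None:
        # Retire instances whose queues are empty, keeping at least one.
        busy = [queue for queue in self.queues if queue]
        self.queues = busy or [deque()]
```

Calling `reap_idle` periodically, or when a queue is found empty, corresponds to the shutdown of excess instances mentioned above.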
[0041] Alternatively, a fixed number of interrogation engines of
each particular type may be configured in which case dynamic
instance creation may or may not occur. In fixed instance
embodiments not supporting dynamic instance creation, assignment to
a particular queue may result from any appropriate allocation
approach including load evaluation or serial cycling through queues
associated with each interrogation engine instance of the
particular type desired.
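For the fixed-instance case, serial cycling through the queues might look like the following Python sketch. The helper name `make_round_robin` is hypothetical, and round-robin is only one of the allocation approaches the description permits.

```python
from collections import deque
from itertools import cycle


def make_round_robin(queue_count: int):
    """Create a fixed set of index queues and a serial-cycling assigner."""
    queues = [deque() for _ in range(queue_count)]
    order = cycle(range(queue_count))

    def assign(index: int) -> int:
        chosen = next(order)           # next queue in the fixed cycle
        queues[chosen].append(index)
        return chosen

    return queues, assign


queues, assign = make_round_robin(3)
placements = [assign(index) for index in range(5)]
```

With three queues and five indices, the placements wrap around to the first queue after the third assignment.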
[0042] In some embodiments, anomaly detection may occur through a
process outlined as follows. In such a process, data associated
with a received communication is collected. The data may be
accumulated from a variety of sources such as from the communication
itself and from the manner of its transmission and receipt. The
data may be collected in any appropriate manner such as the
multiple queue interrogation approach summarized above and
discussed in greater detail below. Alternatively, the data
collection may result from a parallel testing process where a
variety of tests are individually applied to the received
communication in parallel. In other embodiments, a single combined
analysis such as via neural network may be applied to
simultaneously collect data associated with the received
communication across multiple dimensions.
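The parallel testing alternative can be illustrated with a short Python sketch using a thread pool. The function name `collect_in_parallel` and the two sample tests are hypothetical; the description does not prescribe a concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def collect_in_parallel(message: str,
                        tests: dict[str, Callable[[str], Any]]) -> dict[str, Any]:
    """Apply every test to the same message concurrently and gather results."""
    with ThreadPoolExecutor(max_workers=len(tests)) as pool:
        futures = {name: pool.submit(test, message) for name, test in tests.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical tests, each producing one item of collected data.
sample_tests = {
    "length": len,
    "mentions_attachment": lambda text: "attachment" in text,
}
collected = collect_in_parallel("hello", sample_tests)
```

Each test sees the same received communication, and the gathered dictionary plays the role of the collected data that is later analyzed.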
[0043] The collected data is then analyzed to determine whether the
received communication represents an anomaly. The analysis will
typically be based upon the collected data associated with the
received communication in conjunction with established
communication patterns over a given time period represented by
aggregated data associated with previously received communications.
The analysis may further be based upon defined and/or configurable
anomaly rules. In some embodiments, analysis may be combined with
the data collection; for instance, a neural network could both
collect the data associated with the received communication and
analyze it.
[0044] The adaptive communication interrogation can use established
communication patterns over a given time period represented by
aggregated data associated with previously received communications.
The analysis can further be based upon defined and/or configurable
spam rules. In some embodiments, analysis can be combined with the
data collection; for instance, a neural network could both collect
the data associated with the received communication and analyze
it.
[0045] Finally, if an anomaly is detected with respect to the
received communication, an indicator signal is generated. The
generated signal may provide a warning to an application
administrator or trigger some other appropriate action. In some
embodiments, the indicator signal generated may provide a
generalized indication of an anomaly; in other embodiments, the
indicator may provide additional data as to a specific anomaly, or
anomalies, detected. In the latter embodiments, any warning and/or
actions resulting from the signal may be dependent upon the
additional data.
[0046] Data collected from received communications can be analyzed
to determine whether the received communication is on one or more
whitelists. The analysis is typically based upon the collected data
associated with the received communication in conjunction with
reference to one or more whitelists. If no match to a whitelist is
found, the communication can be subject to a certain level of
interrogation. If a match to the whitelist is found, the
communication can either bypass any message interrogation or it can
be subject to a different level of interrogation. In one preferred
embodiment, if a match to a whitelist is found, the message can be
subject to either adaptive message interrogation or no message
interrogation. If no match to a whitelist is found, the message can
be subject to normal message interrogation. Additionally, a
whitelist can be created and/or updated based on outbound
communication. In one preferred embodiment, some or all of the
destination addresses of outbound communications are added to a
whitelist. If a destination address already appears on a whitelist,
a confidence value associated with the destination can be modified
based upon the destination address' presence. For instance, a usage
count may be maintained; such a usage count can reflect absolute
usage of the address or usage of the address over a given period of
time.
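As a rough Python sketch of the two behaviors just described, the whitelist below is modeled as a mapping from address to usage count. The function names and the level labels "adaptive", "none" and "normal" are illustrative assumptions, not terms defined by the description.

```python
def interrogation_level(sender: str, whitelist: dict[str, int],
                        adaptive: bool = True) -> str:
    """Choose how rigorously an inbound message is interrogated."""
    if sender in whitelist:
        # A whitelist match gets adaptive interrogation or none at all.
        return "adaptive" if adaptive else "none"
    return "normal"


def note_outbound_destination(destination: str, whitelist: dict[str, int]) -> int:
    """Add an outbound destination, or raise its usage count if present."""
    whitelist[destination] = whitelist.get(destination, 0) + 1
    return whitelist[destination]


whitelist: dict[str, int] = {}
note_outbound_destination("client@example.com", whitelist)
note_outbound_destination("client@example.com", whitelist)
```

After two outbound messages to the same destination, that address is whitelisted with a usage count of two, and later inbound mail from it is routed to the lighter interrogation path.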
[0047] Additional advantages of the invention will be set forth in
part in the description which follows, and in part will be obvious
from the description, or may be learned by practice of the
invention. The advantages of the invention will be realized and
attained by means of the elements and combinations particularly
pointed out in the appended claims. It is to be understood that
both the foregoing general description and the following detailed
description are exemplary and explanatory only and are not
restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0048] The accompanying drawings, which are incorporated in and
constitute a part of this specification, illustrate embodiments of
the invention and together with the description, serve to explain
the principles of the invention.
[0049] FIG. 1 depicts a typical prior art access environment.
[0050] FIG. 2 depicts a hardware diagram for an environment using
one preferred embodiment according to the present invention.
[0051] FIG. 3 is a logical block diagram of the components in a
typical embodiment of the present invention.
[0052] FIG. 4 is a flow chart of an exemplary anomaly detection
process according to the present invention.
[0053] FIG. 5 is a sample anomaly detection configuration interface
screen.
[0054] FIG. 6 is a block diagram depicting the architecture of an
exemplary embodiment of a security enhancement system according to
the present invention.
[0055] FIG. 7 is a block diagram depicting the architecture of an
exemplary embodiment of a risk assessment approach according to the
present invention using multiple queues to manage the application
of a plurality of risk assessments to a received communication.
[0056] FIGS. 8A-8B are a flow chart depicting the process of
assessing risk associated with a received communication using the
architecture depicted in FIG. 7.
[0057] FIG. 9 is a flow chart of an exemplary communication
assessment process according to the present invention.
[0058] FIG. 10 is a flow chart of an exemplary whitelist management
process according to the present invention.
[0059] FIG. 11 is a flow chart of an exemplary interrogation
process according to the present invention.
DETAILED DESCRIPTION
[0060] Exemplary embodiments of the present invention are now
described in detail. Referring to the drawings, like numbers
indicate like parts throughout the views. As used in the
description herein and throughout the claims that follow, the
meaning of "a," "an," and "the" includes plural reference unless
the context clearly dictates otherwise. Also, as used in the
description herein and throughout the claims that follow, the
meaning of "in" includes "in" and "on" unless the context clearly
dictates otherwise. Finally, as used in the description herein and
throughout the claims that follow, the meanings of "and" and "or"
include both the conjunctive and disjunctive and may be used
interchangeably unless the context clearly dictates otherwise.
[0061] Ranges may be expressed herein as from "about" one
particular value, and/or to "about" another particular value. When
such a range is expressed, another embodiment includes from the one
particular value and/or to the other particular value. Similarly,
when values are expressed as approximations, by use of the
antecedent "about," it will be understood that the particular value
forms another embodiment. It will be further understood that the
endpoints of each of the ranges are significant both in relation to
the other endpoint, and independently of the other endpoint.
[0062] Architecture of a Typical Access Environment
[0063] FIG. 2 depicts a typical environment according to the
present invention. As compared with FIG. 1, the access environment
using systems and methods according to the present invention may
include a hardware device 210 connected to the local communication
network such as Ethernet 180 and logically interposed between the
firewall system 140 and the local servers 120 and clients 130. All
application related electronic communications attempting to enter
or leave the local communications network through the firewall
system 140 are routed to the hardware device 210 for application
level security assessment and/or anomaly detection. Hardware device
210 need not be physically separate from existing hardware elements
managing the local communications network. For instance, the
methods and systems according to the present invention could be
incorporated into a standard firewall system 140 or router (not
shown) with equal facility. In environments not utilizing a firewall
system, the hardware device 210 may still provide application level
security assessment and/or anomaly detection.
[0064] For convenience and exemplary purposes only, the foregoing
discussion makes reference to hardware device 210; however, those
skilled in the art will understand that the hardware and/or
software used to implement the systems and methods according to the
present invention may reside in other appropriate network
management hardware and software elements. Moreover, hardware
device 210 is depicted as a single element. In various embodiments,
a multiplicity of actual hardware devices may be used. Multiple
devices that provide security enhancement for application servers
of a particular type such as e-mail or Web may be used where
communications of the particular type are allocated among the
multiple devices by an appropriate allocation strategy such as (1)
serial assignment that assigns a communication to each device
sequentially or (2) via the use of a hardware and/or software load
balancer that assigns a communication to the device based upon
current device burden. A single device may provide enhanced
security across multiple application server types, or each device
may only provide enhanced security for a single application server
type.
[0065] In one embodiment, hardware device 210 may be a rack-mounted
Intel-based server at either 1U or 2U sizes. The hardware device
210 can be configured with redundant components such as power
supplies, processors and disk arrays for high availability and
scalability. The hardware device 210 may include SSL/TLS
accelerators for enhanced performance of encrypted messages.
[0066] The hardware device 210 will include a system processor
potentially including multiple processing elements where each
processing element may be supported via Intel-compatible processor
platforms preferably using at least one PENTIUM III or CELERON
(Intel Corp., Santa Clara, Calif.) class processor; alternative
processors such as UltraSPARC (Sun Microsystems, Palo Alto, Calif.)
could be used in other embodiments. In some embodiments, security
enhancement functionality, as further described below, may be
distributed across multiple processing elements. The term
processing element may refer to (1) a process running on a
particular piece, or across particular pieces, of hardware, (2) a
particular piece of hardware, or either (1) or (2) as the context
allows.
[0067] The hardware device 210 would have an SDS that could include
a variety of primary and secondary storage elements. In one
preferred embodiment, the SDS would include RAM as part of the
primary storage; the amount of RAM might range from 128 MB to 4 GB
although these amounts could vary and represent overlapping use
such as where security enhancement according to the present
invention is integrated into a firewall system. The primary storage
may in some embodiments include other forms of memory such as cache
memory, registers, non-volatile memory (e.g., FLASH, ROM, EPROM,
etc.), etc.
[0068] The SDS may also include secondary storage including single,
multiple and/or varied servers and storage elements. For example,
the SDS may use internal storage devices connected to the system
processor. In embodiments where a single processing element
supports all of the security enhancement functionality, a local
hard disk drive may serve as the secondary storage of the SDS, and
a disk operating system executing on such a single processing
element may act as a data server receiving and servicing data
requests.
[0069] It will be understood by those skilled in the art that the
different information used in the security enhancement processes
and systems according to the present invention may be logically or
physically segregated within a single device serving as secondary
storage for the SDS; multiple related data stores accessible
through a unified management system, which together serve as the
SDS; or multiple independent data stores individually accessible
through disparate management systems, which may in some embodiments
be collectively viewed as the SDS. The various storage elements
that comprise the physical architecture of the SDS may be centrally
located, or distributed across a variety of diverse locations.
[0070] The architecture of the secondary storage of the system data
store may vary significantly in different embodiments. In several
embodiments, database(s) are used to store and manipulate the data;
in some such embodiments, one or more relational database
management systems, such as DB2 (IBM, White Plains, N.Y.), SQL
Server (Microsoft, Redmond, Wash.), ACCESS (Microsoft, Redmond,
Wash.), ORACLE 8i (Oracle Corp., Redwood Shores, Calif.), Ingres
(Computer Associates, Islandia, N.Y.), MySQL (MySQL AB, Sweden) or
Adaptive Server Enterprise (Sybase Inc., Emeryville, Calif.), may
be used in connection with a variety of storage devices/file
servers that may include one or more standard magnetic and/or
optical disk drives using any appropriate interface including,
without limitation, IDE and SCSI. In some embodiments, a tape
library such as Exabyte X80 (Exabyte Corporation, Boulder, Colo.),
a storage area network (SAN) solution such as available from
EMC, Inc. (Hopkinton, Mass.), a network attached storage (NAS)
solution such as a NetApp Filer 740 (Network Appliance, Sunnyvale,
Calif.), or combinations thereof may be used. In other embodiments,
the data store may use database systems with other architectures
such as object-oriented, spatial, object-relational or hierarchical
or may use other storage implementations such as hash tables or
flat files or combinations of such architectures. Such alternative
approaches may use data servers other than database management
systems such as a hash table look-up server, procedure and/or
process and/or a flat file retrieval server, procedure and/or
process. Further, the SDS may use a combination of any of such
approaches in organizing its secondary storage architecture.
[0071] The hardware device 210 would have an appropriate operating
system such as WINDOWS/NT, WINDOWS 2000 or WINDOWS/XP Server
(Microsoft, Redmond, Wash.), Solaris (Sun Microsystems, Palo Alto,
Calif.), or LINUX (or other UNIX variant). In one preferred
embodiment, the hardware device 210 includes a pre-loaded,
pre-configured, and hardened UNIX operating system based upon
FreeBSD (FreeBSD, Inc., http://www.freebsd.org). In this
embodiment, the UNIX kernel has been vastly reduced, eliminating
non-essential user accounts, unneeded network services, and any
functionality that is not required for security enhancement
processing. The operating system code has been significantly
modified to eliminate security vulnerabilities.
[0072] Depending upon the hardware/operating system platform,
appropriate server software may be included to support the desired
access for the purpose of configuration, monitoring and/or
reporting. Web server functionality may be provided via an Internet
Information Server (Microsoft, Redmond, Wash.), an Apache HTTP
Server (Apache Software Foundation, Forest Hill, Md.), an iPlanet
Web Server (iPlanet E-Commerce Solutions--A Sun--Netscape Alliance,
Mountain View, Calif.) or other suitable Web server platform. The
e-mail services may be supported via an Exchange Server (Microsoft,
Redmond, Wash.), sendmail or other suitable e-mail server. Some
embodiments may include one or more automated voice response (AVR)
systems that are in addition to, or instead of, the aforementioned
access servers. Such an AVR system could support a purely
voice/telephone driven interface to the environment with hard copy
output delivered electronically to a suitable hard copy output device
(e.g., printer, facsimile, etc.), and forwarded as necessary through
regular mail, courier, inter-office mail, facsimile or other
suitable forwarding approach. In one preferred embodiment, an
Apache server variant provides an interface for remotely
configuring the hardware device 210. Configuration, monitoring,
and/or reporting can be provided using some form of remote access
device or software. In one preferred embodiment, SNMP is used to
configure and/or monitor the device. In one preferred embodiment,
any suitable remote client device is used to send and retrieve
information and commands to/from the hardware device 210. Such a
remote client device can be provided in the form of a Java client
or a Windows-based client running on any suitable platform such as
a conventional workstation or a handheld wireless device or a
proprietary client running on an appropriate platform also
including a conventional workstation or handheld wireless
device.
[0073] Application Layer Electronic Communication Security
Enhancement
[0074] FIG. 3 depicts a block diagram of the logical components of
a security enhancement system according to the present invention.
The overall analysis, reporting and monitoring functionality is
represented by block 310, and anomaly detection is represented by
block 370.
[0075] Blocks 320-360 represent different assessments that may be
applied to electronic communications. These blocks are
representative of assessments that may be performed and do not
constitute an exhaustive representation of all possible assessments
for all possible application server types. The terms "test" and
"testing" may be used interchangeably with the terms "assess",
"assessment" or "assessing" as appropriate in the description
herein and in the claims that follow.
[0076] Application specific firewall 320 provides functionality to
protect against application-specific attacks. For instance in the
context of e-mail, this assessment could protect against attacks
directed towards Extended SMTP, buffer overflow, and denial of
service.
[0077] Application specific IDS 330 provides real-time monitoring
of activities specific to the application server. This may also
retrieve information from multiple layers including the application
layer, network layer and operating system layer. This complements a
network intrusion detection system by adding an additional layer of
application specific IDS monitoring.
[0078] Application specific anti-virus protection and anti-spam
protection 340 provides support for screening application specific
communications for associated viruses and/or spam.
[0079] Policy management 350 allows definition of corporate
policies with respect to the particular application in regard to
how and what application specific communications are sent, copied
or blocked. Executable attachments or communication components,
often sources of viruses and/or worms, and/or questionable content
can be stripped or quarantined before they get to the application
server or client. Mail messages from competitors can be blocked or
copied. Large messages can be relegated to off-peak hours to avoid
network congestion.
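The policy examples above (stripping executable attachments, blocking mail from competitors, and relegating large messages to off-peak hours) can be sketched in Python as follows. The `Message` fields, the suffix list, the domain set and the size limit are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field

# Assumed list of attachment suffixes treated as executable.
EXECUTABLE_SUFFIXES = (".exe", ".bat", ".vbs", ".scr")


@dataclass
class Message:
    sender: str
    size_bytes: int
    attachments: list[str] = field(default_factory=list)


def apply_policy(message: Message, blocked_domains: set[str],
                 large_bytes: int) -> str:
    """Return 'block', 'defer' or 'deliver' and strip executable attachments."""
    domain = message.sender.rsplit("@", 1)[-1].lower()
    if domain in blocked_domains:
        return "block"
    # Strip attachments whose names end in an executable suffix.
    message.attachments = [
        name for name in message.attachments
        if not name.lower().endswith(EXECUTABLE_SUFFIXES)
    ]
    if message.size_bytes > large_bytes:
        return "defer"   # hold for off-peak delivery
    return "deliver"
```

A fuller policy engine would also support copying messages and quarantining stripped content, which this sketch omits.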
[0080] Application encryption 360 provides sending and receiving
application communications securely, potentially leveraging
hardware acceleration for performance.
[0081] The application security system processes incoming
communications and appears to network intruders as the actual
application servers. This prevents the actual enterprise
application server from a direct or indirect attack.
[0082] Electronic communications attempting to enter or leave a
local communications network can be routed through the present
invention for assessment. The results of that assessment can
determine if that message will be delivered to its intended
recipient.
[0083] An incoming or outgoing communication, and attachments
thereto, are received by a security system according to the present
invention. The communication in one preferred embodiment is an
e-mail message. In other embodiments, the communication may be an
HTTP request or response, a GOPHER request or response, an FTP
command or response, telnet or WAIS interactions, or other suitable
Internet application communication.
[0084] The automated whitelist generation of the present invention
allows the system to automatically create and/or maintain one or
more whitelists based on the outbound email traffic. In some
embodiments, the system can monitor outbound, and/or inbound, email
traffic and thereby determine the legitimate email addresses to add
to the whitelist. The software can use a set of metrics to decide
which outbound addresses are actually legitimate addresses.
[0085] A data collection process occurs that applies one or more
assessment strategies to the received communication. The multiple
queue interrogation approach summarized above and described in
detail below provides the data collection functionality in one
preferred embodiment. Alternatively, the assessments may be
performed on each received message in parallel. A separate
processing element of the system processor would be responsible for
applying each assessment to the received message. In other
embodiments, multiple risk assessments may be performed on the
received communication simultaneously using an approach such as a
neural network. The application of each assessment, or the
assessments in the aggregate, generates one or more risk profiles
associated with the received communication. The risk profile or log
file generated based upon the assessment of the received
communication is stored in the SDS. The collected data may be used
to perform threat analysis or forensics. This processing may take
place after the communication is already received and
forwarded.
[0086] In one preferred embodiment, particular assessments may be
configurably enabled or disabled by an application administrator.
An appropriate configuration interface system may be provided as
discussed above in order to facilitate configuration by the
application administrator.
[0087] An anomaly detection process analyzes the stored risk
profile associated with the received communication in order to
determine whether it is anomalous in light of data associated with
previously received communications. In one preferred embodiment,
the anomaly detection process summarized above and described in
detail below supports this detection functionality. Anomaly
detection in some embodiments may be performed simultaneously with
assessment. For instance, an embodiment using a neural network to
perform simultaneous assessment of a received communication for
multiple risks may further analyze the received communication for
anomalies; in such an embodiment, the data associated with the
previously received communications may be encoded as weighting
factors in the neural network.
[0088] In some embodiments, the thresholds for various types of
anomalies may be dynamically determined based upon the data
associated with previously received communications. Alternatively,
an interface may be provided to an application administrator to
allow configuration of particular thresholds with respect to
individual anomaly types. In some embodiments, thresholds by
default may be dynamically derived unless specifically configured
by an application administrator.
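One possible reading of dynamically derived thresholds with administrator override is sketched below in Python. The mean-plus-k-standard-deviations rule is an assumption chosen for illustration; the description does not state how a threshold is derived from prior communications.

```python
from statistics import mean, pstdev


def derive_threshold(history: list[float], k: float = 3.0) -> float:
    """Derive a threshold from prior observations (assumed rule: mean + k * stdev)."""
    return mean(history) + k * pstdev(history)


def effective_threshold(anomaly_type: str, history: list[float],
                        configured: dict[str, float]) -> float:
    """Prefer an administrator-configured value; otherwise derive one."""
    if anomaly_type in configured:
        return configured[anomaly_type]
    return derive_threshold(history)


# Hypothetical per-hour message counts observed for one sender.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
derived = effective_threshold("volume", counts, {})
overridden = effective_threshold("volume", counts, {"volume": 50.0})
```

The sample counts have a mean of 5 and a population standard deviation of 2, so the derived threshold is 11, while the configured value of 50 takes precedence when present.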
[0089] Anomalies are typically detected based upon a specific time
period. Such a time period could be a particular fixed period
(e.g., prior month, prior day, prior year, since security device's
last reboot, etc.) and apply to all anomaly types. Alternatively,
the time period for all anomaly types, or each anomaly type
individually, may be configurable by an application administrator
through an appropriate interface. Some embodiments may support a
fixed period default for all anomaly types, or each anomaly type
individually, which may be overridden by application administrator
configuration.
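The fixed default period with per-type override might be sketched in Python as follows. The one-day default and the function names are assumptions made for this illustration.

```python
from datetime import datetime, timedelta

DEFAULT_PERIOD = timedelta(days=1)  # assumed fixed default for all anomaly types


def period_for(anomaly_type: str, overrides: dict[str, timedelta]) -> timedelta:
    """Use the administrator's period for this anomaly type, else the default."""
    return overrides.get(anomaly_type, DEFAULT_PERIOD)


def within_period(timestamps: list[datetime], now: datetime,
                  period: timedelta) -> list[datetime]:
    """Keep only the prior communications that fall inside the period."""
    cutoff = now - period
    return [stamp for stamp in timestamps if cutoff <= stamp <= now]


now = datetime(2003, 2, 7, 12, 0)
stamps = [now - timedelta(hours=1), now - timedelta(days=2)]
recent = within_period(stamps, now, period_for("volume", {}))
```

With the default period only the communication from one hour earlier is retained; an override of seven days for the same anomaly type would retain both.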
[0090] In one preferred embodiment, the stored risk profile
associated with the received communication is aggregated with data
associated with previously received communications of the same
type. This newly aggregated data set is then used in analysis of
subsequently received communications of that type.
[0091] If an anomaly is detected, an anomaly indicator signal is
output. The outputted signal may include data identifying the
anomaly detected and the communication in which the anomaly was
detected. Various types of anomalies are discussed below with
respect to e-mail application security. These types of anomalies
may be detected using the specific detection approach discussed
below or any of the aforementioned alternative anomaly detection
approaches.
[0092] The outputted signal may trigger a further response in some
embodiments; alternatively, the outputted signal may be the
response. In one preferred embodiment, the outputted signal may be
a notification to one or more designated recipients via one or more
respective, specified delivery platforms. For instance, the
notification could be in the form of an e-mail message, a page, a
facsimile, an SNMP (Simple Network Management Protocol) alert, an
SMS (Short Message Service) message, a WAP (Wireless Application
Protocol) alert, an OPSEC (Operations Security) warning, a voice
phone call or other suitable message. Alternatively, such a notification
could be triggered by the outputted signal.
[0093] Using SNMP allows interfacing with network level security
using a manager and agent; an example would be monitoring traffic
flow through a particular router. OPSEC is a formalized process and
method for protecting critical information. WAP is an open, global
specification that empowers mobile users with wireless devices to
easily access and interact with information and services instantly.
An example would be formatting a WAP page to a wireless device that
supports WAP when an anomaly is detected. WAP pages are stripped
down versions of HTML and are optimized for wireless networks and
devices with small displays. SMS is a wireless technology that
utilizes SMTP and SNMP for transports to deliver short text
messages to wireless devices such as a Nokia 8260 phone. SMS
messages could be sent out to these devices to alert a user of an
intrusion detection or anomaly alert.
[0094] Instead of or in addition to a notification, one or more
corrective measures could be triggered by the outputted signal.
Such corrective measures could include refusing acceptance of
further communications from the source of the received
communication, quarantining the communication, stripping the
communication so that it can be safely handled by the application
server, and/or throttling excessive numbers of incoming connections
per second to levels manageable by internal application
servers.
[0095] In one preferred embodiment, an interface may be provided
that allows an application administrator to selectively configure a
desired response and associate this configured response with a
particular anomaly type such that when an anomaly of that type is
detected the configured response occurs.
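Associating a configured response with an anomaly type can be modeled as a simple lookup table, as in the Python sketch below. The class name `ResponseRegistry`, the anomaly type label and the sample response are hypothetical.

```python
from typing import Callable, Optional


class ResponseRegistry:
    """Maps each anomaly type to the response configured for it."""

    def __init__(self) -> None:
        self._responses: dict[str, Callable[[str], str]] = {}

    def configure(self, anomaly_type: str, response: Callable[[str], str]) -> None:
        self._responses[anomaly_type] = response

    def on_anomaly(self, anomaly_type: str, source: str) -> Optional[str]:
        # Run the configured response, if any, for the detected anomaly type.
        response = self._responses.get(anomaly_type)
        return response(source) if response is not None else None


registry = ResponseRegistry()
registry.configure("connection_flood", lambda source: f"throttle {source}")
outcome = registry.on_anomaly("connection_flood", "192.0.2.10")
```

An anomaly type with no configured response returns nothing here; an actual embodiment might instead fall back to a default response.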
[0096] Finally, if an anomaly is detected with respect to a
received communication, the communication may or may not be
forwarded to the intended destination. Whether communications
determined to be anomalous are forwarded or not may, in certain
embodiments, be configurable with respect to all anomaly types.
Alternatively, forwarding of anomalous communications could be
configurable with respect to individual anomaly types. In some such
embodiments, a default forwarding setting could be available with
respect to any individual anomaly types not specifically
configured.
[0097] Whitelisting
[0098] In one embodiment, the system can be configured so that
communications matched to a whitelist entry may be subject to
either no interrogation or less rigorous interrogation. Once a
whitelist has at least one entry, the incoming message
interrogation system can utilize it in connection with the
interrogation of a message.
[0099] FIG. 10 depicts operations that can be performed on a
whitelist to add an entry. Once an outgoing address passes any
exclusion conditions 1005 described above, it can be added to a
whitelist. The whitelist can be stored on the SDS. The system first
checks to see if the address is already present on the list 1010.
If present, the list can be updated with any new information 1015.
Before new information is updated, the system can check for
sufficient space in the SDS 1025. If sufficient space is not
available, additional space is allocated from the SDS 1030. If an
address is not found in a whitelist, an initial record can be added
for that address. Before a new address is added to a whitelist
1040, the system can check for sufficient space in the SDS 1020. If
sufficient space is not available, additional space is allocated
from the SDS 1035. In many embodiments, explicit space allocation
need not occur; rather, implicit space allocation occurs as a result
of an information update 1015 or an add entry 1040.
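As a non-limiting illustration, the add-or-update flow of FIG. 10 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the in-memory dictionary stands in for the SDS, the names WhitelistEntry, add_or_update and is_excluded are hypothetical, the field names are borrowed from the ct_whitelist example in this description, and lower-casing the address to form the lookup key is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class WhitelistEntry:
    out_emailaddress: str      # external (recipient) address
    in_emailaddress: str       # internal (sending) address
    lastupdatetime: datetime   # time of the most recent outbound message
    curr_count: int = 1        # number of times the address was used


def add_or_update(whitelist, out_addr, in_addr, sent_at,
                  is_excluded=lambda addr: False):
    """Add out_addr to the whitelist, or update its record if present.

    `whitelist` is a plain dict standing in for the SDS (assumption).
    Returns the entry, or None when an exclusion condition applies.
    """
    if is_excluded(out_addr):            # exclusion conditions (1005)
        return None
    key = out_addr.lower()               # assumed normalization
    entry = whitelist.get(key)           # already on the list? (1010)
    if entry is None:
        # add entry (1040); dict growth plays the role of implicit
        # space allocation
        entry = WhitelistEntry(out_addr, in_addr, sent_at)
        whitelist[key] = entry
    else:
        # update existing record with new information (1015)
        entry.curr_count += 1
        entry.lastupdatetime = sent_at
        entry.in_emailaddress = in_addr
    return entry
```

In this sketch a single summary record is kept per outbound address; an embodiment that keeps one record per use would append to a list instead of incrementing a counter.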
[0100] The initial record for an outbound address can include the
email address, the internal email address, the message sent time,
usage count, last time used and/or any other characteristics one
skilled in the art would find relevant or useful. In the case of an
email address that is already present on a whitelist, the system
can use a separate record for each instance of that email address
being used as an outbound address or the system can maintain a
single record for each outbound address with a summary of
information in that entry, including information describing
instances of use. The system can store records in a number of other
ways using different data structures. The records may include other
representations of data in addition to the email address, including
but not limited to a hash of the email address.
[0101] In a preferred embodiment, the system can store records in a
MySQL database. As a non-limiting example, the following command
can be used to build a database comprising the external and
internal email addresses, date of last update, and an occurrence
counter.
create table ct_whitelist (
    out_emailaddress varchar(255) not null,  -- External email address
    in_emailaddress  varchar(255) not null,  -- Internal email address
    lastupdatetime   datetime,               -- Last update of this address
    curr_count       integer                 -- Address occurrence counter
);
[0102] Maintaining the Whitelist
[0103] In some embodiments, the system can allow unlimited storage.
In other embodiments, the storage available for the list can be
limited. In still other embodiments, the system can allow for
management of the size of the list. A number of caching techniques
can be used, including but not limited to first in first out and
least recently used. Other techniques can include an accounting of
the number of internal users that reported the outbound address.
List cleanup can occur in real-time or periodically. Additionally,
one skilled in the art will recognize that a wide variety of list
management techniques and procedures can be used to manage a
whitelist in connection with the present invention.
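As a non-limiting illustration of one of the caching techniques named above, a size-limited whitelist with least-recently-used eviction might be sketched as follows. The class name LruWhitelist and the parameter max_entries are hypothetical, and treating only outbound use (not inbound lookups) as "use" is an assumption.

```python
from collections import OrderedDict


class LruWhitelist:
    """Size-limited whitelist that evicts the least recently used address."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()   # address -> usage count, oldest first

    def record_outbound(self, address):
        # Remove and re-insert so the address becomes the most recent one.
        count = self._entries.pop(address, 0) + 1
        self._entries[address] = count
        # Real-time cleanup: evict from the old end while over the limit.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def __contains__(self, address):
        return address in self._entries
```

Here cleanup occurs in real time on every insertion; a periodic variant would move the eviction loop into a separately scheduled task.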
[0104] Whitelist Usage
[0105] An example of a system using a whitelist according to the
present invention is shown in FIG. 9. One or more relevant
parameters of inbound communication 905 are compared against one or
more whitelists 910. In some embodiments, the whitelist is checked
at each incoming email message. In a preferred embodiment, the
comparison includes origination email addresses. If the check
against a whitelist 910 reveals no match, then the message is
subject to normal message interrogation 915. Normal message
interrogation can employ analysis criteria that are the most
sensitive to spam or other threats as discussed hereinabove. If a
message passes normal interrogation 915, i.e. it is determined not
to be spam or a threat (or to have a lower likelihood of being spam
or a threat), it can be presented to its intended recipient for
delivery 920. If the check against a whitelist 910 reveals a match,
the system can be configured to process the message in a variety of
ways. In one embodiment, the system can be programmed or arranged
to bypass 925 any message interrogation and deliver the message to
its intended recipient 920. In an alternative embodiment, the
system can be programmed or arranged to process the message using
adaptive message interrogation 930. If adaptive message
interrogation 930 determines a message is not spam, it can forward
the message for delivery 920.
[0106] In some embodiments, both options 925, 930 are selectively
available. The decision whether to pass whitelisted communications
through adaptive message interrogation 930 or to bypass any message
interrogation 925 can be made per deployment or can be based on the
details of the whitelist entry. For instance, messages from more
frequently used outbound addresses can bypass 925 interrogation
completely whereas messages from less frequently used outbound
addresses can be subjected to adaptive message interrogation
930.
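As a non-limiting illustration, the selection among normal interrogation 915, bypass 925 and adaptive interrogation 930 based on the details of the whitelist entry might be sketched as follows. The function name route_inbound, the usage_counts mapping and the cut-off bypass_min_count are hypothetical; the description does not fix any particular frequency value.

```python
def route_inbound(sender, usage_counts, bypass_min_count=10):
    """Choose how to handle an inbound message from `sender`.

    `usage_counts` maps whitelisted addresses to how often they were
    used as outbound addresses (assumed representation).
    """
    count = usage_counts.get(sender)
    if count is None:
        return "normal"      # no whitelist match: normal interrogation (915)
    if count >= bypass_min_count:
        return "bypass"      # frequently used address: bypass (925)
    return "adaptive"        # less frequently used: adaptive interrogation (930)
```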
[0107] If the message goes through normal or adaptive interrogation
with the whitelist information, the interrogation module can
utilize the whitelist information to affect the type and/or level
of interrogation. In some preferred embodiments, the adaptive
message interrogation can use multiple levels of trust, as further
described below and in FIG. 11. In other embodiments, the adaptive
message interrogation can set a confidence indicator indicative of
the confidence the interrogator has in its characterization.
[0108] Messages that are not delivered to the intended recipient
can be either quarantined or deleted. In an alternative embodiment,
messages determined to be spam can be indicated as spam or a threat
and forwarded to the intended recipient.
[0109] Additionally, each outbound email address can be assigned a
confidence value. According to the confidence value associated with
a given incoming email address, incoming messages can be subjected
to variable levels of interrogation. In one preferred embodiment,
incoming messages associated with lower confidence values are
subjected to more aggressive spam interrogation and incoming
messages associated with higher confidence values are subjected to
less aggressive spam interrogation. In other embodiments, the
message can be given positive credits to offset any negative spam
detection points based on the confidence value.
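As a non-limiting illustration of the credit approach, positive credits proportional to the confidence value might offset negative spam detection points as follows. The scaling of confidence to the range 0 to 1, the parameter max_credit and the spam threshold of 5.0 are all assumptions made for the sketch.

```python
def adjusted_spam_score(raw_points, confidence, max_credit=5.0):
    """Offset spam detection points with credits earned from confidence.

    `confidence` is assumed to lie in [0, 1]; a fully trusted address
    earns `max_credit` points of offset.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return max(0.0, raw_points - confidence * max_credit)


def is_spam(raw_points, confidence, spam_threshold=5.0):
    """Classify using the credited score rather than the raw points."""
    return adjusted_spam_score(raw_points, confidence) >= spam_threshold
```

With these assumed values, a message scoring 6.0 points from an address with no confidence is classified as spam, while the same message from an address with confidence 0.5 is not.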
[0110] One preferred embodiment of the system allows some or all
external email recipients to be whitelisted 935. Some embodiments
can have a metric that describes the number of outgoing messages to
a particular email address. When the metric reaches a certain
threshold, the email address can be whitelisted. Other embodiments
can include the ability to track addresses over time. In those
embodiments, if the metric exceeds a certain value for a particular
outbound email address during a particular time, then that entry
can be whitelisted.
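As a non-limiting illustration, a metric counting outgoing messages to an address within a particular time window might be sketched as follows. The class name OutboundTracker and its parameters are hypothetical, timestamps are assumed to arrive in order, and the sketch treats reaching the threshold as sufficient.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class OutboundTracker:
    """Count outgoing messages per address inside a sliding time window."""

    def __init__(self, threshold, window):
        self.threshold = threshold        # messages needed to qualify
        self.window = window              # timedelta covered by the metric
        self._sent = defaultdict(deque)   # address -> recent send times

    def record(self, address, sent_at):
        """Record one outgoing message; return True if the address qualifies."""
        times = self._sent[address]
        times.append(sent_at)
        # Drop send times that have fallen out of the window.
        while times and sent_at - times[0] > self.window:
            times.popleft()
        return len(times) >= self.threshold
```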
[0111] The parameters described above may be configurable by an
application administrator through an appropriate interface. Some
embodiments may support fixed parameters which may be overridden by
application administrator configuration.
[0112] In some embodiments, the threshold for characterization as
spam or a threat may be dynamically determined based upon the data
associated with previously received communications. Alternatively,
an interface may be provided to an application administrator to
allow configuration of particular thresholds with respect to
individual addresses. In some embodiments, thresholds by default
may be dynamically derived unless specifically configured by an
application administrator.
[0113] When spam or a threat is detected, instead of, or in
addition to, a notification, one or more response measures could be
triggered. Such responsive measures could include refusing
acceptance of further communications from the source of the
received communication, quarantining the communication, stripping
the communication so that it can be forwarded to its intended
recipient, and/or throttling excessive numbers of incoming
communications from certain sources.
[0114] Authenticated Whitelist
[0115] One issue with whitelists is that attackers or spammers can
pretend to send messages from whitelisted addresses and therefore
bypass filtering and anti-spam tools. It is relatively easy for an
attacker to forge the sender information on messages. To overcome
this limitation of whitelists, the system of the present invention
allows the authentication of the sender information. There are
several methods for integrating sender authentication with a
whitelist system. In one embodiment, only authenticated senders can
be whitelisted. Such a procedure can reduce the likelihood of
forged senders being whitelisted. However, in many environments,
the percentage of messages that are authenticated is low, thereby
reducing the effectiveness of whitelisting. Some embodiments of the
present invention can allow both authenticated and unauthenticated
senders to be whitelisted. In these embodiments, a higher trust
value is given to messages from authenticated senders. S/MIME and
PGP offer mechanisms for providing authentication.
[0116] One such embodiment is depicted in FIG. 11. As a
non-limiting example, when a message 1105 is received from a sender
on a whitelist 1115 an associated level of trust is retrieved or
calculated 1135. In some embodiments, the trust level value is a
single value associated with the whitelist entry that simply
requires retrieval. In other embodiments, the trust level value can
be calculated as a weighted sum of various characteristics of the
entry; in some such embodiments, the weights can be statically
defined, defaulted subject to override by a user or other computer
system or dynamically configurable. That associated level of trust
can be compared to a threshold level 1140. Any communications that
have a trust level that meets or exceeds the trust level threshold
can bypass message interrogation 1120 while communications that do
not have a sufficient trust level will be processed with at
least some interrogation 1125. Messages that bypass interrogation
1120 as well as messages that pass interrogation 1125 can be
delivered to the intended recipient 1145. In such an embodiment,
messages not associated with a whitelist entry are subjected to
interrogation and further processing 1150.
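As a non-limiting illustration, the weighted-sum trust calculation and threshold comparison of FIG. 11 might be sketched as follows. The characteristic names ("authenticated", "frequent"), the weights and the threshold are hypothetical values chosen for the sketch; the description leaves the characteristics and weights open.

```python
def trust_level(entry, weights):
    """Weighted sum of the characteristics of a whitelist entry (1135)."""
    return sum(w * float(entry.get(name, 0)) for name, w in weights.items())


def handle_message(sender, whitelist, weights, threshold):
    """Decide handling for a message from `sender`.

    `whitelist` maps addresses to dicts of numeric characteristics
    (assumed representation).
    """
    entry = whitelist.get(sender)
    if entry is None:
        return "interrogate"                       # not whitelisted (1150)
    if trust_level(entry, weights) >= threshold:   # compare to threshold (1140)
        return "bypass"                            # bypass interrogation (1120)
    return "interrogate_whitelisted"               # some interrogation (1125)
```

In this sketch an authenticated sender can be given a larger weight than an unauthenticated one, so that authentication alone may be enough to meet the threshold.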
[0117] Some embodiments of the present invention can allow the
trust level threshold 1130 to be configured by an administrator,
other user of the system or other computer systems.
[0118] Exclusions from Whitelist
[0119] The spam/threat detection according to the present invention
examines every outbound message and maintains a list of known
outbound email addresses. The resulting list can then be used as
the list of trusted senders. However, it may not be advisable in
all cases to add every outbound message recipient to the list of
trusted senders for incoming mail. For example, while a user may
send a message to a newsgroup, that does not indicate that messages
from this newsgroup should necessarily bypass mail filtering. To
further illustrate, a user may send an unsubscribe message to a
newsletter or in response to a spam message. Thus, there can be
situations in which unconditional whitelist addition is not
advisable. The system of the present invention allows certain
exclusion conditions to be entered and applied.
[0120] These exclusion conditions can include rule sets,
heuristics, artificial intelligence, decision trees, or any
combination thereof. The conditions can be set by an administrator
or other user of the system.
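As a non-limiting illustration, exclusion conditions expressed as a simple rule set might be sketched as follows. The two rules (an unsubscribe request and a list-style recipient address) and the message representation as a dict with "to" and "subject" keys are assumptions made for the sketch, not conditions prescribed by this description.

```python
import re

# Each rule takes an outbound message and returns True when the
# recipient should NOT be added to the whitelist (hypothetical rules).
EXCLUSION_RULES = [
    # Unsubscribe requests, e.g. to a newsletter or in reply to spam.
    lambda msg: "unsubscribe" in msg["subject"].lower(),
    # Recipients that look like newsgroup or mailing-list addresses.
    lambda msg: re.match(r"^(list|news|noreply)[-+@]", msg["to"].lower()) is not None,
]


def is_excluded(msg, rules=EXCLUSION_RULES):
    """Return True if any exclusion rule matches the outbound message."""
    return any(rule(msg) for rule in rules)
```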
[0121] Multiple Queue Approach to Interrogation of Electronic
Communications
[0122] With reference to FIG. 7, a multiple queue approach is
provided for applying a plurality of risk assessments to a received
communication.
[0123] Messages are first placed in an unprocessed message store
730, a portion of the SDS, for advanced processing and
administration. Messages come in from an external source 740 and
are placed in this store 730. This store 730 maintains physical
control over the message until the end of the process or until a
message fails the interrogation criteria and is, therefore,
quarantined.
[0124] An index to the message in the store 730 is used to pass
through each of the queues 771B, 781B-784B, 791B in the queuing
layer 720 and to the interrogation engines 771A, 781A-784A, 791A
instead of the actual message itself to provide scalability and
performance enhancements as the index is significantly smaller than
the message itself.
[0125] Both the queues and the interrogation engines use the index
to point back to the actual message in the unprocessed message
store 730 to perform actions on the message. Any suitable index
allocation approach may be used to assign an index to a received
message, or communication. For instance, indices may be assigned
by incrementing the index assigned to the previously received
communication beginning with some fixed index such as 0 for the
first received communication; the index could be reset to the fixed
starting point after a sufficiently large index has been assigned.
In some embodiments, an index may be assigned based upon
characteristics of the received communication such as type of
communication, time of arrival, etc.
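As a non-limiting illustration, the incrementing index allocation with reset described above might be sketched as follows. The class name IndexAllocator and the default limit are hypothetical; the sketch also assumes that any message still holding an old index has left the store before that index is reused.

```python
class IndexAllocator:
    """Assign indices by incrementing, resetting after a large value."""

    def __init__(self, start=0, limit=2**32):
        self.start = start     # fixed starting index
        self.limit = limit     # reset once this value would be reached
        self._next = start

    def allocate(self):
        index = self._next
        self._next += 1
        if self._next >= self.limit:
            self._next = self.start   # wrap back to the starting point
        return index
```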
[0126] This approach provides independent processing of messages by
utilizing a multi-threaded, multi-process methodology, thereby
providing a scalable mechanism to process high volumes of
messages.
[0127] By processing messages independently, the queuing layer 720
decides the most efficient means of processing by either placing an
index to the message on an existing queue or creating a new queue
and placing the index to the message on that queue. In the event
that a new queue is created, a new instance of the particular
interrogation engine type will be created that will be acting on
the new queue.
[0128] Queues can be added or dropped dynamically for scalability
and administration. The application administrator can, in one
preferred embodiment, configure the original number of queues to be
used by the system at start-up. The administrator also has the
capability of dynamically dropping or adding specific queues or
types of queues for performance and administration purposes. Each
queue is tied to a particular interrogation engine where multiple
queues and multiple processes can exist.
[0129] Proprietary application-specific engines can act on each
queue for performing content filtering, rules-based policy
enforcement, and misuse prevention, etc. A loosely coupled system
allows for proprietary application-specific applications to be
added, enhancing functionality.
[0130] This design provides the adaptive method for message
interrogation. Application-specific engines act on the message via
the index to the message in the unprocessed message store for
completing content interrogation.
[0131] Administration of the queues provides for retrieving message
details via an appropriate interface such as a Web, e-mail and/or
telephone based interface system as discussed above in order to
facilitate access and management by the application administrator.
Administration of the queues allows the administrator to select
message queue order (other than the system default) to customize
the behavior of the system to best meet the needs of the
administrator's particular network and system configuration.
[0132] FIGS. 8A-8B are flow charts depicting use of the multiple
queue approach to assess risk associated with a received
communication. At step 802 a determination is made if the start-up
of the process is being initiated; if so, steps 805 and 807 are
performed to read appropriate configuration files from the SDS to
determine the type, number and ordering of interrogation engines
and the appropriate queues and instances are created. If not, the
process waits at step 810 for receipt of a communication.
[0133] Upon receipt at step 812, the communication is stored in a
portion of the SDS referred to as the unprocessed message store.
The communication is assigned at step 815 an index used to uniquely
identify it in the unprocessed message store, and this index is
placed in the first queue based upon the ordering constraints.
[0134] The processing that occurs at step 810 awaiting receipt of
communication continues independently of the further steps in this
process, and will consequently spawn a new traversal of the
remainder of the flow chart with each received communication. In
some embodiments, multiple instances of step 810 may be
simultaneously awaiting receipt of communications.
[0135] In some embodiments, the receipt of a communication may
trigger a load evaluation to determine if additional interrogation
engines and associated queues should be initiated. In other
embodiments, a separate process may perform this load analysis on a
periodic basis and/or at the direction of an application
administrator.
[0136] The index moves through the queue 820 until it is ready to
be interrogated by the interrogation engine associated with the
queue as determined in step 825. This incremental movement is
depicted as looping between steps 820 and 825 until ready for
interrogation. If the communication is not ready for evaluation at
step 825, the communication continues to move through the
queue at step 820. If the communication is ready, the index is
provided to the appropriate interrogation engine at step 830 in
FIG. 8B.
[0137] The interrogation engine processes the communication based
upon its index in step 830. Upon completion of interrogation in
step 835, the interrogation engine creates a new risk profile associated
with the received communication based upon the interrogation.
[0138] If additional interrogations are to occur (step 840), the
index for the communication is placed in a queue for an instance of
the next interrogation type in step 845. Processing continues with
step 820 as the index moves through this next queue.
[0139] If no more interrogations are required (step 840), a further
check is made to determine if the communication passed
interrogation by all appropriate engines at step 850. If the
communication passed all interrogations, then it is forwarded to
its destination in step 855 and processing with respect to this
communication ends at step 870.
[0140] If the communication failed one or more interrogations as
determined at step 850, failure processing occurs at step 860. Upon
completion of appropriate failure processing, processing with
respect to this communication ends at step 870.
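As a non-limiting illustration, the flow of FIGS. 8A-8B might be sketched in simplified form as follows. This single-threaded sketch is an assumption made for brevity: the described approach is multi-threaded and multi-process, with queues created dynamically. The function name run_pipeline and the representation of an interrogation engine as a (name, check) pair are hypothetical.

```python
from collections import deque


def run_pipeline(messages, engines):
    """Pass message indices through one queue per interrogation engine.

    `engines` is an ordered list of (name, check) pairs, where
    check(message) returns True when the message passes.
    Returns (delivered, quarantined) lists of messages.
    """
    store = {}                              # unprocessed message store
    queues = [deque() for _ in engines]     # one index queue per engine
    profiles = {}                           # index -> {engine name: passed}

    for index, message in enumerate(messages):   # steps 812-815
        store[index] = message
        profiles[index] = {}
        if queues:
            queues[0].append(index)              # first queue in the ordering

    for stage, (name, check) in enumerate(engines):
        while queues[stage]:
            index = queues[stage].popleft()      # steps 820-830
            # The engine works on the stored message via its index (835).
            profiles[index][name] = check(store[index])
            if stage + 1 < len(engines):         # more interrogations? (840)
                queues[stage + 1].append(index)  # next queue (845)

    delivered, quarantined = [], []
    for index, profile in profiles.items():      # passed all engines? (850)
        target = delivered if all(profile.values()) else quarantined
        target.append(store[index])              # forward (855) or fail (860)
    return delivered, quarantined
```

Only the small index travels between queues; the message itself stays in the store, which mirrors the scalability rationale given above.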
[0141] Failure processing may involve a variety of notification
and/or corrective measures. Such notifications and/or corrective
measures may include those as discussed above and in further detail
below with respect to anomaly detection.
[0142] Anomaly Detection Process
[0143] The Anomaly Detection process according to an exemplary
embodiment of the present invention uses three components as
depicted in FIG. 6:
[0144] 1. Collection Engine
[0145] This is where the actual collection of data occurs. The
collection engine receives a communication directed to or
originating from an application server. One or more tests are
applied to the received communication. These one or more tests may
correspond to the various risk assessments discussed above.
[0146] The collection engine in one preferred embodiment as
depicted in FIG. 6 uses the multiple queue approach discussed
above; however, this particular collection engine architecture is
intended as exemplary rather than restrictive with respect to
collection engines usable within the context of this anomaly
detection process.
[0147] As depicted in FIG. 6, the collection engine includes one or
more interrogation engines of one or more interrogation engine
types in an interrogation layer 610. Associated with each
interrogation engine type in a queuing layer 620 is at least one
indices queue containing the indices of received communications
awaiting interrogation by an interrogation engine of the associated
type. Collectively, the queuing layer 620 and the interrogation
layer 610 form the collection engine. A received communication is
received, stored in the SDS and assigned an index. The index is
queued in the queuing layer for processing through the collection
engine.
[0148] 2. Analysis Engine
[0149] The data collected by the previous component is analyzed for
unusual activity by the anomaly detection engine 640. The analysis
is based on data accumulated from analysis of previously received
communications over a period of time. A set of predefined
heuristics may be used to detect anomalies using dynamically
derived or predetermined thresholds. A variety of anomaly types may
be defined generally for all types of Internet application
communications while others may be defined for only particular
application types such as e-mail or Web. The data associated with
previously received communications and appropriate configuration
data 630 are stored in the SDS.
[0150] The set of anomaly types that the analysis engine will
detect may be selected from a larger set of known anomaly types.
The set of interest may be set at compile time or configurable at
run time, or during execution in certain embodiments. In
embodiments using the set approach all anomaly types and
configuration information are set within the analysis engine. In
some such embodiments, different sets of anomalies may be of
interest depending upon the type of communication received. In
configurable at run time embodiments, anomaly types are read from a
configuration file or interactively configured at run time of the
analysis engine. As with the set approach, certain anomaly types
may be of interest with respect to only selected types of
communication. Finally, in some embodiments (including some set or
configurable ones), an interface such as described above may be
provided allowing reconfiguration of the anomaly types of interest
and parameters associated therewith while the analysis engine is
executing.
[0151] The thresholds for various types of anomalies may be
dynamically determined based upon the data associated with
previously received communications. Alternatively, an interface may
be provided to an application administrator to allow configuration
of particular thresholds with respect to individual anomaly types.
In some embodiments, thresholds by default may be dynamically
derived unless specifically configured by an application
administrator.
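As a non-limiting illustration, a threshold that is dynamically derived by default but can be overridden by administrator configuration might be sketched as follows. Deriving the threshold as the historical mean plus a multiple of the standard deviation is an assumption chosen for the sketch; this description does not specify the derivation. The names dynamic_threshold, effective_threshold and k are hypothetical.

```python
from statistics import mean, pstdev


def dynamic_threshold(history, k=3.0):
    """Derive a threshold from counts observed in previous periods.

    Assumed rule: mean of the history plus k population standard
    deviations.
    """
    return mean(history) + k * pstdev(history)


def effective_threshold(history, configured=None, k=3.0):
    """Use the administrator's value when set, else the derived one."""
    if configured is not None:
        return configured
    return dynamic_threshold(history, k)
```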
[0152] Anomalies are typically detected based upon a specific time
period. Such a time period could be a particular fixed period
(e.g., prior month, prior day, prior year, since security device's
last reboot, etc.) and apply to all anomaly types. Alternatively,
the time period for all anomaly types, or each anomaly type
individually, may be configurable by an application administrator
through an appropriate interface such as those discussed above.
Some embodiments may support a fixed period default for all anomaly
types, or each anomaly type individually, which may be overridden
by application administrator configuration.
[0153] In one preferred embodiment, as depicted in FIG. 6,
information from the risk profiles 642, 644, 646 generated by the
collection engine is compared with the acquired thresholds for
anomaly types of interest. Based upon these comparisons, a
determination is made as to whether the received communication is
anomalous, and if so, in what way (anomaly type) the communication
is anomalous.
[0154] In one preferred embodiment, the stored risk profile
associated with the received communication is aggregated with data
associated with previously received communications of the same
type. This newly aggregated data set is then used in analysis of
subsequently received communications of that type.
[0155] If an anomaly is detected, an anomaly indicator signal is
output. The outputted signal may include data identifying the
anomaly type detected and the communication in which the anomaly
was detected such as alert data 650. Various types of anomalies are
discussed below with respect to e-mail application security. These
types of anomalies may be detected using the specific detection
approach discussed below or any of the aforementioned alternative
anomaly detection approaches.
[0156] 3. Action Engine
[0157] Based on the analysis, this component decides what sort of
action needs to be triggered. Generally the action
involves alerting the administrator of the ongoing unusual
activity. An alert engine 660 performs this task by providing any
appropriate notifications and/or initiating any appropriate
corrective actions.
[0158] The outputted signal may trigger a further response in some
embodiments; alternatively, the outputted signal may be the
response. In one preferred embodiment, the outputted signal may be
a notification to one or more designated recipients via one or more
respective, specified delivery platforms. For instance, the
notification could be in the form of an e-mail message, a page, a
facsimile, an SNMP alert, an SMS message, a WAP alert, an OPSEC
warning, a voice phone call or other suitable message.
Alternatively, such a notification could be triggered by the
outputted signal.
[0159] Instead of or in addition to a notification, one or more
corrective measures could be triggered by the outputted signal.
Such corrective measures could include refusing acceptance of
further communications from the source of the received
communication, quarantining the communication, stripping the
communication so that it can be safely handled by the application
server, and/or throttling excessive numbers of incoming connections
per second to levels manageable by internal application
servers.
[0160] In one preferred embodiment, an interface may be provided
that allows an application administrator to selectively configure a
desired response and associate this configured response with a
particular anomaly type such that when an anomaly of that type is
detected the configured response occurs.
[0161] FIG. 4 depicts a flow chart in a typical anomaly detection
process according to one preferred embodiment of the present
invention. The process starts in step 410 by initializing various
constraints of the process including the types of anomalies,
thresholds for these types and time periods for which prior data is
to be considered. This information may be configured interactively
at initiation. In addition to, or instead of, the interactive
configuration, previously stored configuration information may be
loaded from the SDS.
[0162] The process continues at step 420 where anomaly definitional
information is read (e.g., Incoming messages that have the same
attachment within a 15 minute interval.). A determination is then
made as to whether a new thread is needed; this determination is
based upon the anomaly details read (step not shown). In step
430, if a new thread is required, the thread is spun for processing
in step 450. In step 440, the process sleeps for a specified period
of time before returning to step 420 to read information regarding
an anomaly.
[0163] Once processing of the new thread commences in step 450,
information needed to evaluate the anomaly is retrieved from
appropriate locations in the SDS, manipulated if needed, and
analyzed in step 460. A determination in step 470 occurs to detect
an anomaly. In one preferred embodiment, this step uses
predetermined threshold values to make the determination; such
predetermined threshold values could be provided interactively or
via a configuration file. If an anomaly is not detected, the
process stops.
[0164] If an anomaly is detected, an anomaly indicator signal is
output at step 480 which may result in a notification. The possible
results of anomaly detection are discussed in more detail above
with respect to the Action Engine.
[0165] The types of anomalies may vary depending upon the type and
nature of the particular application server. The following
discussion provides exemplary definitions of anomalies where e-mail
is the application context in question. Anomalies similar, or
identical, to these can be defined with respect to other
application server types.
[0166] There are many potential anomaly types of interest in an
e-mail system. The analysis is based on the collected data and
dynamic rules for normality based on the historic audited data. In
some embodiments, an application administrator can be provided with
an interface for configuring predefined rules with respect to
different anomaly types. FIG. 5 provides a sample screen for such
an interface. The interface functionality may be provided via a Web
server running on the security enhancement device or other suitable
interface platform as discussed above.
[0167] In one preferred embodiment, the threshold value for the
analysis for each anomaly is derived from an anomaly action table.
The action for each anomaly is also taken from this table. The
analysis identifies that something unusual has occurred and hands
over to the action module. Enumerated below with respect to e-mail
are anomalies of various types.
[0168] 1. Messages from same IP Address--The point of collection
for this anomaly is the SMTPI/SMTPIS service. SMTPI/SMTPIS has
information about the IP address from which the messages originate.
The IP address is stored in the SDS. The criterion for this anomaly
is that the number of messages for the given period from the same IP
address should be greater than the threshold. Based on the
threshold level, a suitable alert is generated.
[0169] 2. Messages from same Address (MAIL FROM)--The point of
collection for this anomaly is the SMTPI/SMTPIS service. SMTPI/SMTPIS
has information about the address (MAIL FROM) from which the
messages originate. The determined address is stored in the SDS.
The criterion for this anomaly is that the number of messages for
the given period with the same MAIL FROM address should be greater
than the threshold. Based on the threshold level, a suitable alert
is generated.
[0170] 3. Messages having same Size--The point of collection for
this anomaly is the SMTPI/SMTPIS service. SMTPI/SMTPIS has information
about the size of the messages. The size of the message is stored
in the SDS. This size denotes the size of the message body and does
not include the size of the headers. The criterion for this anomaly
is that the number of messages for the given period with the same size
should be greater than the threshold. Based on the threshold
level, a suitable alert is generated.
[0171] 4. Messages having same Subject--The point of collection for
this anomaly is the SMTPI/SMTPIS service. SMTPI/SMTPIS has information
about the subject line of the message. The subject line information
for the message is stored in the SDS. The criterion for this
anomaly is that the number of messages for the given period with the
same subject line should be greater than the threshold. Based on
the threshold level, a suitable alert is generated.
[0172] 5. Messages having same Attachment--The point of collection
for this anomaly is the MIME Ripper Queue. The MIME Ripper Queue
parses the actual message into the constituent MIME parts and
stores the information in the SDS. A part of this information is
the attachment file name. The criterion for this anomaly is that
the number of messages for the given period with the same attachment
name should be greater than the threshold. Based on the
threshold level, a suitable alert is generated.
[0173] 6. Messages having same Attachment Extension--The point of
collection for this anomaly is the MIME Ripper Queue. The MIME
Ripper Queue parses the actual message into the constituent MIME
parts and stores the information in the SDS. A part of this
information is the attachment file extension. The criterion for
this anomaly is that the number of messages for the given period
with the same extension should be greater than the threshold. Based on
the threshold level, a suitable alert is generated.
[0174] 7. Messages having Viruses--This anomaly will be detected
only if any of the anti-virus queues are enabled. The point of
collection for this anomaly is the anti-virus Queue. The anti-virus
Queue scans for any viruses on each individual MIME part of the
message. The scan details are stored in the SDS. A part of this
information is the virus name. The criterion for this anomaly is
that the number of messages for the given period detected with
viruses should be greater than the threshold. Based on the
threshold level, a suitable alert is generated.
[0175] 8. Messages having same Virus--This anomaly is detected only
if at least one of the anti-virus queues is enabled. The point of
collection for this anomaly is the anti-virus Queue. The anti-virus
Queue scans each individual MIME part of the message for viruses.
The scan details are stored in the SDS. Part of this information is
the virus name. The criterion for this anomaly is that the number of
messages in the given period in which the same virus is detected
exceeds the threshold. Based on the threshold level, a suitable
alert is generated.
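The anomaly types listed above share one pattern: count the messages in the given period that have a common attribute value, then compare the count with a threshold. A minimal Python sketch of that pattern follows. The function name `find_anomalies`, the record layout (a dict with a "received" timestamp), and the sample values are illustrative assumptions and are not taken from this application.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_anomalies(records, attribute, now, period_minutes, threshold):
    """Return {value: count} for attribute values seen more than
    `threshold` times within the last `period_minutes`.

    `records` is an iterable of dicts holding a "received" datetime and
    the attribute of interest (hypothetical layout, for illustration).
    """
    window_start = now - timedelta(minutes=period_minutes)
    counts = Counter(
        rec[attribute]
        for rec in records
        if window_start <= rec["received"] <= now
        and rec.get(attribute) is not None
    )
    # The stated criterion is "greater than the threshold", so use ">".
    return {value: n for value, n in counts.items() if n > threshold}


# Made-up sample data: four recent messages and one outside the period.
now = datetime(2003, 2, 7, 12, 0)
records = [
    {"received": now - timedelta(minutes=m), "subject": s}
    for m, s in [(1, "Hello"), (2, "Hello"), (3, "Hello"),
                 (4, "Report"), (90, "Hello")]
]
flagged = find_anomalies(records, "subject", now,
                         period_minutes=60, threshold=2)
```

The same function could be called with "size", "attachment_name", or "virus_name" as the attribute, provided the stored records carry those keys; those key names are likewise assumptions.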
[0176] The table below depicts the fields in an anomaly table in
one preferred embodiment using a relational database model for
storing this information in the SDS.
TABLE 2
Sl No.  Field Name   Data Type  Remarks
1.      anm_type     int        Primary key. Unique identifier for all
                                anomalies. The list is given in next
                                section.
2.      anm_name     varchar    Name of the Anomaly (Tag for the UI to
                                display)
3.      can_display  tinyint    Anomaly is displayable or not in UI.
                                0--Do not display
                                1--Display
4.      is_enabled   tinyint    Specifies if the anomaly is enabled or
                                not.
                                0--Disabled
                                1--Enabled
5.      anm_period   int        Time in minutes. This time specifies
                                the period for the anomaly check.
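As an illustration only, the anomaly table fields above could be declared as follows. This sketch uses Python's built-in `sqlite3` module with an in-memory database. The application does not name a database engine, so the choice of SQLite, the table name `anomaly`, the CHECK constraints, and the sample rows (including the `anm_type` values 4 and 8) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE anomaly (
        anm_type    INTEGER PRIMARY KEY,  -- unique identifier for the anomaly
        anm_name    VARCHAR NOT NULL,     -- tag shown in the UI
        can_display TINYINT NOT NULL CHECK (can_display IN (0, 1)),
        is_enabled  TINYINT NOT NULL CHECK (is_enabled IN (0, 1)),
        anm_period  INTEGER NOT NULL      -- period of the check, in minutes
    )
    """
)

# Made-up rows; the real anm_type values are not given in this section.
conn.executemany(
    "INSERT INTO anomaly VALUES (?, ?, ?, ?, ?)",
    [
        (4, "Messages having same Subject", 1, 1, 60),
        (8, "Messages having same Virus", 1, 0, 30),
    ],
)

# Only enabled anomalies would be checked, each over its own period.
enabled = conn.execute(
    "SELECT anm_type, anm_period FROM anomaly "
    "WHERE is_enabled = 1 ORDER BY anm_type"
).fetchall()
```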
[0177] The table below depicts the fields in an anomaly action
table in one preferred embodiment using a relational database model
for storing this information in the SDS.
TABLE 3
Sl No.  Field Name   Data Type  Remarks
1.      anm_type     int        Foreign key from anomaly table.
2.      anm_thresh   int        This value specifies the threshold for
                                a particular action to be taken.
3.      alert_type   int        This is foreign key from alert type
                                table. This value specifies the type of
                                alert to be sent to the alert manager
                                when this anomaly is detected.
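One possible reading of the anomaly action table is that an anomaly may have several rows, each pairing a threshold with an alert type, so that a higher message count triggers a different alert. The Python sketch below follows that reading. The function name `select_alert`, the rule of choosing the highest exceeded threshold, and the sample rows are assumptions for illustration and are not stated in this application.

```python
def select_alert(actions, anm_type, count):
    """Return the alert_type for the highest threshold that `count`
    exceeds, or None when no threshold is exceeded.

    `actions` is a list of (anm_type, anm_thresh, alert_type) rows,
    mirroring the three fields of the anomaly action table.
    """
    exceeded = [
        (thresh, alert)
        for a_type, thresh, alert in actions
        if a_type == anm_type and count > thresh
    ]
    if not exceeded:
        return None
    # Assumed rule: the highest exceeded threshold decides the alert.
    return max(exceeded, key=lambda row: row[0])[1]


# Made-up rows: anomaly 4 has two threshold levels, anomaly 8 has one.
actions = [(4, 50, 1), (4, 200, 2), (8, 5, 2)]
low = select_alert(actions, 4, 120)    # exceeds 50 only
high = select_alert(actions, 4, 250)   # exceeds 50 and 200
none = select_alert(actions, 4, 50)    # equal to 50, not greater
```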
[0178] Throughout this application, various publications may have
been referenced. The disclosures of these publications in their
entireties are hereby incorporated by reference into this
application in order to more fully describe the state of the art to
which this invention pertains.
[0179] The embodiments described above are given as illustrative
examples only. It will be readily appreciated by those skilled in
the art that many deviations may be made from the specific
embodiments disclosed in this specification without departing from
the invention. Accordingly, the scope of the invention is to be
determined by the claims below rather than being limited to the
specifically described embodiments above.
* * * * *