U.S. patent number 10,218,717 [Application Number 15/042,135] was granted by the patent office on 2019-02-26 for system and method for detecting a malicious activity in a computing environment.
This patent grant is currently assigned to AWAKE SECURITY, INC. The grantee listed for this patent is Awake Networks, Inc. Invention is credited to Keith Amidon, Michael Callahan, Manasa Chalasani, Debabrata Dash, Gary Golomb.
United States Patent 10,218,717
Amidon, et al.
February 26, 2019
System and method for detecting a malicious activity in a computing
environment
Abstract
System and method for detecting a likely threat from a malicious
attack is disclosed. Communication between a user computer and a
destination computer is monitored by a security appliance.
Selective information from the communication is extracted.
Selective information is associated to one or more attributes of a
security entity. A knowledge graph is generated for a plurality of
security entities based on the associated selective
information.
Inventors: Amidon; Keith (Los Altos, CA), Callahan; Michael (Palo
Alto, CA), Chalasani; Manasa (Mountain View, CA), Dash; Debabrata
(San Jose, CA), Golomb; Gary (Los Gatos, CA)
Applicant: Awake Networks, Inc. (Mountain View, CA, US)
Assignee: AWAKE SECURITY, INC. (Mountain View, CA)
Family ID: 65410965
Appl. No.: 15/042,135
Filed: February 11, 2016
Current U.S. Class: 1/1
Current CPC Class: H04L 63/20 (20130101); H04L 63/1425 (20130101);
H04L 63/1416 (20130101)
Current International Class: G06F 12/14 (20060101); H04L 29/06
(20060101)
Field of Search: 726/22-26; 713/188
Primary Examiner: Tabor; Amare F
Attorney, Agent or Firm: Minisandram Law Firm Minisandram;
Raghunath S.
Claims
What is claimed is:
1. A method for detecting a likely threat from a malicious attack,
comprising: monitoring a communication between a user computer and
at least one destination computer by a security appliance;
extracting selective information from the communication by the
security appliance; associating selective information to one or
more attributes of a security entity; and generating a knowledge
graph for a plurality of security entities based on the associated
selective information, the knowledge graph indicative of a time
based association between the security entity and one or more
attributes of the security entity.
2. The method of claim 1, further including, determining a likely
value for an attribute of the security entity based on one or more
events.
3. The method of claim 2, wherein, the likely value for the
attribute of the security entity is determined based on the
performance of one event.
4. The method of claim 2, wherein, the likely value for the
attribute of the security entity is determined based on an
occurrence or non-occurrence of a subsequent event, after the
performance of one event.
5. The method of claim 2, further including generating a connection
record table for one or more security entities based on extracted
selective information from the communication.
6. The method of claim 5, further including: detecting one or more
indicators of a likely threat based on the selective information
stored in the security appliance; and storing an entry in an
indicator table with one or more attributes of the indicator
associated with at least one security entity.
7. The method of claim 6, further including confirming a likely
threat for the security entity based on an analysis of one or more
entries in the knowledge graph, connection record table and the
indicator table.
8. The method of claim 4, wherein one of the security entity is a
computing device and whether the computing device booted is
determined based on an event that occurred subsequently.
9. The method of claim 4, wherein one of the security entity is a
computing device and whether the computing device executed a
program is determined based on an event that occurred
subsequently.
10. A system to detect a likely threat of a malware attack,
comprising: a security appliance configured to monitor a
communication between a user computer and a destination computer;
extract selective information from the communication; associate
selective information to one or more attributes of a security
entity; and generate a knowledge graph for a plurality of security
entities, based on the associated selective information, the
knowledge graph indicative of a time based association between the
security entity and one or more attributes of the security
entity.
11. The system of claim 10, wherein, the security appliance
determines a likely value for an attribute of the security entity
based on one or more events.
12. The system of claim 11, wherein, the likely value for the
attribute of the security entity is based on the performance of one
event.
13. The system of claim 11, wherein, the likely value for the
attribute of the security entity is determined based on an
occurrence or non-occurrence of a subsequent event, after the
performance of one event.
14. The system of claim 11, wherein a connection record table for
one or more security entities are generated based on extracted
selective information from the communication.
15. The system of claim 14, wherein one or more of a likely threat
is detected based on the selective information stored in the
security appliance; and an entry is stored in an indicator table
with one or more attributes of the indicator associated with at
least one security entity.
16. The system of claim 15, wherein a likely threat for the
security entity is confirmed based on an analysis of one or more
entities in the knowledge graph, connection record table and the
indicator table.
17. The system of claim 13, wherein one of the security entity is a
computing device and whether the computing device booted is
determined based on an event that occurred subsequently.
18. The system of claim 13, wherein one of the security entity is a
computing device and whether the computing device executed a
program is determined based on an event that occurred subsequently.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
None.
TECHNICAL FIELD
The present invention relates generally to detecting a malicious
activity in a computing environment and, more particularly, to
detecting a malicious activity based on network communication in
the computing environment.
DESCRIPTION OF RELATED ART
Detecting malicious activity in a computing environment is becoming
complex. Sometimes, malicious code is downloaded on to a computing
device at one instant. The malicious code remains dormant for a
period of time while awaiting further command. At a later stage,
additional commands are issued to the malicious code to initiate
the malicious activity.
Generally, after the malicious attack has occurred and been detected, a
signature of the malicious code is identified. Thereafter, a
malware scanner may look for a partial or full match of the
identified signature of the malicious code to identify and prevent
future attacks. In other words, a corrective action is taken after
an attack has occurred.
It may be desirable to predict a possible malicious attack before
the attack takes place. It is with these needs in mind that this
disclosure arises.
SUMMARY OF THE INVENTION
In one embodiment, a method for detecting a likely threat from a
malicious attack is disclosed. Communication between a user
computer and a destination computer is monitored by a security
appliance. Selective information from the communication is
extracted by the security appliance. Extracted selective
information is associated with one or more attributes of a security
entity. A knowledge graph is generated for a plurality of security
entities, based on the associated selective information.
In yet another embodiment, a system to detect a likely threat from
a malicious attack is disclosed. Communication between a user
computer and a destination computer is monitored by a security
appliance. Selective information from the communication is
extracted by the security appliance. Extracted selective
information is associated with one or more attributes of a security
entity. A knowledge graph is generated for a plurality of security
entities, based on the associated selective information.
This brief summary has been provided so that the nature of the
disclosure may be understood quickly. A more complete understanding
of the disclosure can be obtained by reference to the following
detailed description of the preferred embodiments thereof in
connection with the attached drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other features of several embodiments are now
described with reference to the drawings. In the drawings, the same
components have the same reference numerals. The illustrated
embodiments are intended to illustrate but not limit the invention.
The drawings include the following Figures:
FIG. 1 shows an example computing environment with an example
security appliance of this disclosure, according to an example of
this disclosure;
FIG. 2 depicts a block diagram of an example security appliance of
this disclosure;
FIG. 3 shows various phases of an example malicious attack in an
example computing environment;
FIGS. 4-1 to 4-7 show an example table showing information related
to various communication events occurring between a plurality of
computing devices;
FIG. 5 shows an example knowledge graph generated based on various
communication events occurring between a plurality of computing
devices;
FIG. 6 shows an example connection record table generated based on
various communication events occurring between a plurality of
computing devices;
FIG. 7 shows an example indicator table generated based on various
communication events occurring between a plurality of computing
devices; and
FIG. 8 shows an example flow diagram to detect one or more
indicators of a likely threat, according to an example of this
disclosure.
DETAILED DESCRIPTION
The embodiments herein and the various features and advantageous
details thereof are explained more fully with reference to the
non-limiting embodiments that are illustrated in the accompanying
drawings and detailed in the following description. Descriptions of
well-known components and processing techniques are omitted so as
to not unnecessarily obscure the embodiments herein. The examples
used herein are intended merely to facilitate an understanding of
ways in which the embodiments herein may be practiced and to
further enable those of skill in the art to practice the
embodiments herein. Accordingly, the examples should not be
construed as limiting the scope of the embodiments herein.
The embodiments herein disclose systems and methods for detecting
a malicious activity in a computing environment. Referring now to
the drawings, where similar reference characters denote
corresponding features consistently throughout the figures, various
examples of this disclosure are described.
FIG. 1 depicts an example computing environment 100, with a
security appliance 102 of this disclosure. The computing
environment 100 includes a plurality of user computers, for
example, a first user computer 104-1, a second user computer 104-2
and a third user computer 104-3. The computing environment also
includes a plurality of network interconnect devices 106, 108 and
110. In some examples, network interconnect device 106 may couple
first user computer 104-1, second user computer 104-2 and third
user computer 104-3 to form a local area network, for example, an
office network. The network interconnect device 108 may be a
wireless router, for example, in a conference room, that may couple
one or more user computers to form another network, for example, a
conference room wireless network. For example, the first user
computer 104-1 may also selectively couple to the network
interconnect device 108, when the first user computer 104-1 is in
the conference room.
The network interconnect device 110 may be configured to couple to
a network firewall device 112, which may couple the network
interconnect device 110 to a wide area network 114. The network
interconnect devices 106 and 108 may couple to network interconnect
device 110 to access the wide area network 114. A plurality of
servers, for example, a first server 116, a second server 118, a
third server 120 and a fourth server 122 may be coupled to the wide
area network 114. The plurality of servers may be accessible to the
first user computer 104-1, second user computer 104-2 and the third
user computer 104-3 through the network interconnect device
110.
In one example, a network tap device 124 may be disposed between
the network interconnect device 110 and the firewall device 112.
The network tap device 124 may be configured to intercept and
forward any communication between a user computer and a server,
over the wide area network 114, to the security appliance 102.
Various functions and features of the security appliance 102 will
now be described with reference to FIG. 2.
Now, referring to FIG. 2, an example security appliance 102 of this
disclosure will be described. The security appliance 102 includes a
packet receiver 202, a protocol analysis and data extraction module
204 (sometimes referred to as PADE module 204), a data buffer 206,
a statistics engine 208, a transaction processor 210, an analytics
engine 212, a knowledge graph 214, a signal and story store 216, a
packet and session store 218, an object store 220 and a transaction
store 222. The security appliance may additionally have an external
integration interface 224, a threat info feed interface 226 and an
application programming interface (API) 228. Various functions and
features of the security appliance 102 will now be described.
Detailed operation of the security appliance 102 will be described
later with reference to additional examples and figures.
The packet receiver 202 is configured to receive information from
the network tap device 124. For example, packet receiver 202 may
receive information related to network communication between a user
computer and one or more servers, from the network tap device 124
in real time. Information related to network communication may be one
or more packets of information transmitted and received by the user
computer. In some examples, the packet receiver 202 may be
configured to receive information related to network communication
between a user computer and one or more servers that might have
been captured by a capture device (not shown) and stored in a data
store (not shown). The information related to network communication
between a user computer and one or more servers may sometimes be
referred to as packets or packet of information in this disclosure.
As one skilled in the art appreciates, the packet of information
may contain information encapsulated in multiple layers. Analysis
and extraction of information from each layer may lead to
information in subsequent layers.
The PADE module 204 includes a protocol and session identification
module 230 (sometimes referred to as PSI module 230), a prioritized
analysis queue 232 (sometimes referred to as PAQ module 232) and a
parsing and matching module 234 (sometimes referred to as PAM
module 234). The PADE module 204 is configured to receive a packet of
information. The PADE module 204 queues the received packet to be
stored in the packet and session store 218. Further, the PADE
module 204 queues the received packet with an initial priority for
further analysis by the PAQ module 232. The PAM module 234 analyzes
the received packet by parsing protocol information from the packet
content for each protocol encapsulated in the packet, and matches
that data with feature patterns of interest, for example, security
or network visibility. Processing of the packets by the PADE module
204 is an iterative process, where one level of encapsulation is
processed to determine and discover information in that protocol
and the protocol of the next encapsulation.
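The iterative, layer-by-layer processing described above can be illustrated with a minimal sketch. It assumes a toy packet represented as nested dictionaries rather than a real wire format; the function name parse_layers and the dictionary keys are hypothetical and are not taken from this disclosure.

```python
def parse_layers(packet):
    """Walk a toy packet one encapsulation level at a time.

    Each layer is a dict with a "protocol" name, its parsed "fields", and a
    "payload" holding the next encapsulated layer (or None at the innermost).
    This representation is an assumption made for illustration only.
    """
    extracted = []
    layer = packet
    while layer is not None:
        # Record what this level reveals, then descend to the next level.
        extracted.append((layer["protocol"], layer["fields"]))
        layer = layer.get("payload")
    return extracted


toy_packet = {
    "protocol": "ethernet",
    "fields": {"ethertype": "ipv4"},
    "payload": {
        "protocol": "ipv4",
        "fields": {"src": "1.1.1.1", "dst": "2.2.2.2"},
        "payload": {"protocol": "udp", "fields": {"dst_port": 67}, "payload": None},
    },
}
layers = parse_layers(toy_packet)
```

Each pass through the loop handles exactly one level of encapsulation, so the fields extracted at one level are available before the next level is examined.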
In one example, the prioritization used for analysis of the packet
is based on a probability that the packet may be associated with a
threat. This prioritization may be periodically updated, as the
analysis of the packet proceeds. In some situations, there may be
insufficient resources available at the packet and session store
218 to store all packets that are queued for storage. In one
example, the selection of packet information to write (or store) to
the packet and session store 218 may be based on a value of threat
probability. In some examples, the selection of packet information
to store may be based on a value of threat probability at the time
selection is made, rather than when the packet was queued for
storage. In other words, the queue to store the packet information
is prioritized based on a value of threat probability.
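As a minimal sketch of the storage selection described above, the following assumes a simple in-memory queue in which each queued packet carries a threat probability that analysis may revise while the packet waits. The class name StorageQueue and its methods are hypothetical; the disclosure does not specify a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class StorageQueue:
    """Hypothetical queue of packets awaiting storage, keyed by packet id."""

    # packet id -> current threat probability (may change while queued)
    probabilities: dict = field(default_factory=dict)

    def enqueue(self, packet_id, threat_probability):
        self.probabilities[packet_id] = threat_probability

    def update_probability(self, packet_id, threat_probability):
        # Analysis may revise the estimate after the packet was queued.
        if packet_id in self.probabilities:
            self.probabilities[packet_id] = threat_probability

    def select_for_storage(self, capacity):
        """Pick up to `capacity` packets using the probabilities as of now."""
        ranked = sorted(self.probabilities, key=self.probabilities.get, reverse=True)
        chosen = ranked[:capacity]
        for packet_id in chosen:
            del self.probabilities[packet_id]
        return chosen


queue = StorageQueue()
queue.enqueue("p1", 0.10)
queue.enqueue("p2", 0.40)
queue.enqueue("p3", 0.20)
queue.update_probability("p1", 0.90)  # later analysis raises p1's estimate
selected = queue.select_for_storage(capacity=2)
```

Because ranking happens inside select_for_storage, packet p1 is chosen on the strength of its revised probability even though it was queued with the lowest value.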
Once a packet has been selected for storage, raw data of the packet
may be written into the packet and session store 218 in a
compressed form. The packet and session store 218 may also have
indexing data for the packets to facilitate retrieval of the
packets based on one or more attributes. For example, the
attributes for indexing may be one or more of packet timestamp,
network addresses, protocol and the like. Connection information
extracted and generated by the PADE module 204 from one or more
packets may contain references to corresponding sessions in the
packet and session store 218. In one example, connection
information may be stored in the knowledge graph 214, after further
processing. Connection information may correspond to a plurality of
attributes like user computer, details about user of the user
computer, host server, organization of the user of the user
computer and the like.
The PADE module 204, based on the analysis of the packets,
identifies signal records, which may sometimes be referred to as
weak signals indicative of a threat, transaction records and
connection records. The identified signal records 236, transaction
records 238 and the connection records 240 are stored in the data
buffer 206 for further processing.
The statistics engine 208 processes the connection records 240
stored in the data buffer 206 and profiles the connection
information from the connection records. Connection information may
be stored in the knowledge graph 214, after further processing by
the statistics engine 208. Connection information may correspond to
a plurality of attributes like user computer, details about user of
the user computer, host server, organization of the user of the
user computer and the like.
The transaction processor 210 processes the transaction records 238
and extracts transaction information from the transaction records.
Extracted transaction information by the transaction processor 210
is stored in the knowledge graph 214. Selective extracted
transaction information is also stored in the signal and story
store 216.
The analytics engine 212 processes the signal records 236. As
previously indicated, signal records 236 may indicate weak signals
of an impending threat. The analytics engine 212 analyzes the
signal records 236 and develops a possible story of a likely
threat. The story may be a sequence of signals about user computer,
activity being performed and the like. The hypothesis tester 242
evaluates one or more weak signals for a likely threat. For
example, one or more threshold values may be used to evaluate a
likely threat. The story builder 244 builds a possible scenario for
a likely threat, based on analyzed signal records. Selective
generated story and corresponding signal records may be stored in
the signal and story store 216.
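The threshold-based evaluation of weak signals can be sketched as follows. This is only an illustration under stated assumptions: each weak signal is assumed to carry a numeric score, scores are assumed to be combined per device by simple summation, and the function name likely_threats and the threshold value are hypothetical. The disclosure states only that one or more threshold values may be used.

```python
from collections import defaultdict


def likely_threats(signals, threshold):
    """Return devices whose summed weak-signal scores reach the threshold.

    `signals` is an iterable of (device, score) pairs. Summation is an
    assumed way to combine scores, chosen for illustration.
    """
    totals = defaultdict(float)
    for device, score in signals:
        totals[device] += score
    return {device for device, total in totals.items() if total >= threshold}


signals = [("1.1.1.1", 0.3), ("1.1.1.2", 0.2), ("1.1.1.1", 0.5)]
flagged = likely_threats(signals, threshold=0.7)
```

No single signal in this example reaches the threshold on its own; only the device with two accumulated signals is flagged.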
As one skilled in the art appreciates, the information previously
stored in the signal and story store 216 may be used by the
analytics engine 212 during evaluation of subsequent signal records
to further update or modify a possible scenario for a likely
threat. Additionally, the analytics engine 212 may use information
stored in the knowledge graph 214 during evaluation of signal
records and building of a story for a likely threat. The story
builder 244 also uses the analyzed signal records to generate
information to update priority of analysis of incoming packets by
the PADE module 204.
As one skilled in the art appreciates, the data buffer 206 may
store information related to signal records 236, transaction
records 238 and connection records 240 on a temporary basis. One or
more additional data stores may be provided to store this
information for an extended period of time, for possible future
use. Object store 220 is a data store to store information related
to various objects. For example, in some examples, objects may be
files exchanged between a user computer and destination computer.
Transaction store 222 stores information related to transactions,
for example, for an extended period of time.
External integration interface 224 may provide an interface to
communicate with other appliances, for example, other security
appliances. Threat info feed interface 226 may provide an interface
to communicate with external threat information feeds. These
external threat information feeds may be used by the security
appliance 102 during various stages of analysis and story building.
Application programming interface 228 may provide an interface to one
or more applications. For example, application programming
interface 228 may provide an interface to a user interface
application to permit a user to interact with the security
appliance 102.
Having described an example security appliance 102 of this
disclosure, now referring to FIG. 3, flow diagram 300 shows various
phases of an example malicious attack. FIG. 3 shows a compromised
server 302, a victim user computer 304 and a command and control
server 306 (sometimes referred to as a CnC server 306). In some
examples, the victim user computer 304 may correspond to one of the
first user computer 104-1, second user computer 104-2 and third
user computer 104-3 described with reference to FIG. 1. In some
examples, the compromised server 302 may correspond to first server
116 described with reference to FIG. 1. In some examples, the CnC
server 306 may correspond to one or more of the second server 118,
third server 120 and fourth server 122 described with reference to
FIG. 1.
In general, a hacker compromises an external website running on a
server the victim user computer 304 visits regularly, and injects
malicious content 308 (sometimes referred to as malicious code 308)
into the website. For example, the malicious content 308 may be
present on the compromised server 302. When a user from the victim
user computer 304 visits the website on the compromised server 302,
the malicious code 308 may be executed. In some examples, the
malicious code 308 may be an executable JavaScript. This phase may
sometimes be referred to as an exploit phase. In some examples, the
malicious code 308 may load a malware 310 on to the victim user
computer 304.
The malware 310 loaded on to the victim user computer 304 may be an
executable code. This phase may sometimes be referred to as a
compromise phase. The malware executable code may then connect to
the CnC server 306 and wait for commands from the CnC server 306
to be executed on the victim user computer 304. This phase may
sometimes be referred to as a command and control phase.
According to an example of this disclosure, one or more weak
signals of a possible threat may be detected by the security
appliance 102, in each of the exploit phase, compromise phase and
command and control phase. For example, in the exploit phase, the
malicious code 308 typically contains long lines of code. For
example, malicious code 308 may contain about 1000 characters or
more. On the other hand, legitimate JavaScript code may contain
short lines of code, for example, about 80 characters. In other
words, in an example implementation, a threshold length of code may
be defined and if a suspect code is greater than the threshold
length of code, it may indicate a likely weak signal of a threat.
As an example, if an anticipated average code length is about 80
characters, a threshold length of code may be set as a multiple of
the anticipated average length of code, for example, two to ten
times the anticipated average length of code. As one skilled in the
art appreciates, the length of malicious code 308 may be detected
or measured when the malicious code 308 is downloaded into the
victim user computer 304 for execution. In some examples, the
length of malicious code 308 may be measured by the security
appliance 102, by intercepting the communication between the
compromised server 302 and victim user computer 304.
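The code-length check described above can be sketched as follows, assuming that length is measured per line of the script text and that the threshold is a multiple of an anticipated average line length. The function name long_line_signal and the default multiple of five are illustrative assumptions within the two-to-ten range mentioned above.

```python
def long_line_signal(script_text, average_length=80, multiple=5):
    """Return True if any line of the script exceeds the threshold length.

    The threshold is `average_length * multiple`; both defaults are
    assumptions chosen for illustration.
    """
    threshold = average_length * multiple
    return any(len(line) > threshold for line in script_text.splitlines())


short_script = "var a = 1;\nvar b = 2;"
packed_script = "x" * 1000  # a single 1000-character line
short_is_suspect = long_line_signal(short_script)
packed_is_suspect = long_line_signal(packed_script)
```

With the defaults, the threshold is 400 characters, so the 1000-character line raises a weak signal while the short script does not.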
In some examples, the malicious code may modify the entire document
content. For example, the JavaScript code may modify the entire
document using the document.write function. In other words, in an
example implementation, a function executed by a likely malicious
code is determined and based on the function executed by the likely
malicious code, a likely weak signal of a threat may be generated
or triggered. As an example, the malicious code 308 is evaluated
for type of function being performed. In some examples, the
malicious code 308 is evaluated for the type of function being
performed, in the security appliance 102, by intercepting the
communication between the compromised server 302 and victim user
computer 304.
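One simple way to approximate the function-type check described above is a textual match on intercepted script content. The sketch below is an assumption-laden simplification: it looks only for a literal call to document.write or document.writeln and would miss obfuscated or dynamically constructed calls. The names DOCUMENT_WRITE and rewrites_document are hypothetical.

```python
import re

# Matches a literal call such as document.write( or document.writeln(
DOCUMENT_WRITE = re.compile(r"\bdocument\s*\.\s*write(ln)?\s*\(")


def rewrites_document(script_text):
    """Return True if the script text contains a literal document.write call."""
    return DOCUMENT_WRITE.search(script_text) is not None


suspect = rewrites_document('document.write("<iframe src=x></iframe>");')
benign = rewrites_document('console.log("hello");')
```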
In the compromise phase, the malware 310 typically is a small
executable file. Generally, malware file sizes are in the range of
about 100 kilobytes to 300 kilobytes. On the other hand, a
legitimate installation file will typically be larger, for example,
about 1 MB or greater. In other words, in
an example implementation, a threshold value for a file size of the
likely malware may be defined and if a suspect malware is less than
or equal to the threshold file size, it may indicate a likely weak
signal of a threat. As an example, an average malware size may be
set, and a multiple of the average malware size may be set as the
threshold value. For example, a multiple of one to three may be
used. If, for example, the average malware size is set at 200
kilobytes and a multiple of three is used, the threshold value of
the file size will be 600 kilobytes. If an executable file of less
than or equal to 600 kilobytes is downloaded, the executable file may
be malware, indicating a likely weak signal. In some examples,
the malware 310 may be encrypted or obfuscated. In other words, in
an example implementation, an encrypted or obfuscated file may
indicate a likely weak signal of a threat.
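The file-size threshold arithmetic above can be sketched as follows. The function name small_executable_signal is hypothetical, and treating a kilobyte as 1000 bytes is an assumption, since the disclosure does not say whether 1000 or 1024 is intended.

```python
KILOBYTE = 1000  # assumption: the disclosure does not specify 1000 vs 1024


def small_executable_signal(file_size_bytes, average_malware_kb=200, multiple=3):
    """Return True if an executable is at or below the threshold file size.

    The threshold is the average malware size times a multiple, for
    example 200 kilobytes * 3 = 600 kilobytes.
    """
    threshold_bytes = average_malware_kb * multiple * KILOBYTE
    return file_size_bytes <= threshold_bytes


small_file = small_executable_signal(250 * KILOBYTE)
large_file = small_executable_signal(2_000_000)
```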
In the command and control phase, the malware 310 may send one or
more HTTP POST requests with small random looking content to the
CnC server 306. In response, the CnC server 306 may send empty
responses to these HTTP POST requests. In some examples, the posted
content may be different, but of same size. In other words, in an
example implementation, communication between a victim user
computer and a server is evaluated for the type of communication
and content exchanged between the victim user computer and the
server for a pattern. If the communication matches the pattern, it
may indicate a likely weak signal of a threat.
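The command and control pattern described above (small POST bodies of the same size but differing content, answered by empty responses) can be sketched as follows. The HttpExchange record, the function name beacon_signal, and the limits of 64 bytes and three requests are assumptions made for illustration; the disclosure does not give numeric values.

```python
from dataclasses import dataclass


@dataclass
class HttpExchange:
    """Hypothetical summary of one HTTP request and its response."""

    method: str
    request_body: bytes
    response_body: bytes


def beacon_signal(exchanges, max_body_size=64, min_posts=3):
    """Return True if the exchanges match the described beaconing pattern."""
    posts = [e for e in exchanges if e.method == "POST"]
    if len(posts) < min_posts:
        return False
    sizes = {len(e.request_body) for e in posts}
    bodies = {e.request_body for e in posts}
    small_same_size = len(sizes) == 1 and next(iter(sizes)) <= max_body_size
    differing_content = len(bodies) > 1
    empty_responses = all(len(e.response_body) == 0 for e in posts)
    return small_same_size and differing_content and empty_responses


beacons = [
    HttpExchange("POST", b"a3f9c07d11be52e8", b""),
    HttpExchange("POST", b"77d0e1aa94c3b6f2", b""),
    HttpExchange("POST", b"0c5b8e2f6a19d473", b""),
]
is_beaconing = beacon_signal(beacons)
```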
Having described various phases of a likely malicious attack and
identification of likely weak signals of threat by the security
appliance 102, now, referring to FIG. 4, an example table 400 is
shown, which shows various network communications occurring between
various computing devices. For example, the security appliance 102
may intercept the communication between various computing
devices.
The computing devices may be, for example, the various computing
devices shown in the computing environment 100 of FIG. 1. The network
communication may have a plurality of sessions. Sessions may be
network sessions consisting of information transferred over a
single communication channel (for example, a TCP connection or a
UDP connection) between communication software on different
computing devices. Generally, sessions consist of data sent back
and forth between two computing devices. In some examples, more
than two computing devices may participate, for example, in a
broadcast session or a multicast session.
Column 402 shows time, column 404 shows Source IP address of a
computing device, column 406 shows Destination IP address of a
computing device and column 408 shows events occurring during a
given time.
Now, referring to rows 410-420 for a time range of T11-T12, various
activities performed as part of session S1 will now be described.
Referring to row 410, at time T11, DHCP session S1 is started.
Referring to row 412, in session S1, IP address is requested. For
example, request for IP address is sent to a DHCP server (not shown
in FIG. 1). Referring to row 414, a reply is received from the DHCP
server. For example, the DHCP server IP address is 2.2.2.2 and
assigned IP address is 1.1.1.1. In this example, the IP address
1.1.1.1 is assigned to first user computer 104-1 of FIG. 1.
Referring to row 418, the session S1 is ended.
As the security appliance 102 evaluates various network
communication between computing devices, selective information is
extracted from the network communication and stored in one or more
tables in a data store. For example, these tables may be stored in
knowledge graph 214 or signal and story store 216 of the security
appliance 102, as shown in FIG. 2. An example knowledge graph table
500 is shown in FIG. 5 and an example connection record table 600
is shown in FIG. 6. Various entries in the knowledge graph table
500 and connection record table 600 are made based on the analysis
of the network communication.
As an example, selective information derived from session S1, for
example, as shown in row 416 may be stored in the knowledge graph
table 500. As another example, selective information derived from
session S1, for example, as shown in row 420 may be stored in the
connection record table 600.
Now, referring to FIG. 5 and knowledge graph table 500, column 502
shows time, column 504 shows session, column 506 shows device IP,
column 508 shows device entity, column 510 shows user entity and
column 512 shows relationship. As an example, referring to row 514,
at time T12, based on session S1, the device with IP address of
1.1.1.1 was first user computer and the relationship was "IP
assigned". For example, selective information extracted from the
network communication and populated in row 514 of the knowledge
graph table 500 is shown in row 416 of the table 400 of FIG. 4.
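A row of the knowledge graph table, and a time-based lookup over such rows, can be sketched as follows. The field names mirror the column descriptions above; representing time T12 as the integer 12, leaving the user entity unset, and the helper name device_at are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KnowledgeGraphRow:
    time: int  # assumption: T12 is represented as the integer 12
    session: str
    device_ip: str
    device_entity: str
    user_entity: Optional[str]
    relationship: str


def device_at(rows, device_ip, time):
    """Return the device most recently assigned `device_ip` at or before `time`."""
    candidates = [
        row
        for row in rows
        if row.device_ip == device_ip
        and row.relationship == "IP assigned"
        and row.time <= time
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row.time).device_entity


rows = [
    KnowledgeGraphRow(
        time=12,
        session="S1",
        device_ip="1.1.1.1",
        device_entity="first user computer",
        user_entity=None,
        relationship="IP assigned",
    )
]
owner = device_at(rows, "1.1.1.1", time=20)
```

Keeping the time with each row is what allows a later query to resolve an IP address to the device that held it at a particular moment.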
Now, referring to FIG. 6 and connection record table 600, column
602 shows time range, column 604 shows session, column 606 shows
source IP, column 608 shows destination IP, column 610 shows
protocol and column 612 shows metadata. As an example, referring
to row 614, during time range T11-T12, based on session S1, the source IP
address of 1.1.1.1 sent to destination IP address of 2.2.2.2 one
packet of 100 bytes and received one packet of 500 bytes using DHCP
protocol. For example, selective information extracted from the
network communication and populated in row 614 of the connection
record table 600 is shown in row 420 of the table 400 of FIG.
4.
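A connection record such as row 614 can be thought of as a per-session summary of packet counts and byte counts. The sketch below assumes packets are available as (time, sender IP, size in bytes) tuples and that times T11 and T12 are represented as the integers 11 and 12; the names ConnectionRecord and summarize are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConnectionRecord:
    time_range: tuple
    session: str
    source_ip: str
    destination_ip: str
    protocol: str
    packets_sent: int = 0
    bytes_sent: int = 0
    packets_received: int = 0
    bytes_received: int = 0


def summarize(session, protocol, source_ip, destination_ip, packets):
    """Build one connection record from (time, sender_ip, size_bytes) tuples."""
    packets = list(packets)
    if not packets:
        raise ValueError("at least one packet is required")
    times = [time for time, _, _ in packets]
    record = ConnectionRecord(
        (min(times), max(times)), session, source_ip, destination_ip, protocol
    )
    for _, sender_ip, size_bytes in packets:
        if sender_ip == source_ip:
            record.packets_sent += 1
            record.bytes_sent += size_bytes
        else:
            record.packets_received += 1
            record.bytes_received += size_bytes
    return record


record = summarize(
    "S1", "DHCP", "1.1.1.1", "2.2.2.2",
    [(11, "1.1.1.1", 100), (12, "2.2.2.2", 500)],
)
```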
Now, referring back to FIG. 4, entries between rows 422 and 424
correspond to session S2. Based on the entries between rows 422 and
424, selective information from the network communication is
extracted. For example, extracted selective information is
populated in the knowledge graph table 500 at row 516 of FIG. 5. As
an example, referring to row 516, at time T50, based on session S2,
the device with IP address of 1.1.1.2 was second user computer and
the relationship was "IP assigned". For example, selective
information extracted from the network communication and populated
in row 516 of the knowledge graph table 500 is shown in row 426 of
the table 400 of FIG. 4.
Now, referring to row 616 of connection record table 600 of FIG. 6,
selective information for session S2 is entered in the connection
record table 600. As an example, referring to row 616, at time T49-50,
based on session S2, the source IP address of 1.1.1.2 sent to
destination IP address of 2.2.2.2 one packet of 100 bytes and
received one packet of 500 bytes using DHCP protocol. For example,
selective information extracted from the network communication and
populated in row 616 of the connection record table 600 is shown in
row 424 of the table 400 of FIG. 4.
Now, referring back to FIG. 4, entries between rows 428 and 430
correspond to sessions S3 and S4. Based on the entries between rows
428 and 430, selective information from the network communication
is extracted. For example, extracted selective information is
populated in the knowledge graph table 500 at row 518 of FIG. 5. As
an example, referring to row 518, at time T51, based on session S3,
the device with IP address of 1.1.1.2 was second user computer and
the relationship was "system booted". For example, selective
information extracted from the network communication and populated
in row 518 of the knowledge graph table 500 is shown in row 432 of
the table 400 of FIG. 4. In this example, the security appliance
102 is able to conclude that the second user computer booted in
session S3, based on the HTTP request GET/update issued by a
specific application and corresponding response.
In this example, the security appliance 102 concludes that an event
occurred or did not occur (for example, Event A) based on whether
another event (for example, Event B) occurred or did not occur. For
these types of inferences or conclusions, Event B may sometimes be
referred to as a consequential artifact. In other words, the HTTP
request GET/update issued by a specific application and the
corresponding response correspond to Event B, and the conclusion
that the second user computer booted in session S3 corresponds to
Event A. In some examples, the security appliance 102 may conclude
that an event occurred based on the event itself. In other words,
if a file was downloaded in session S3, that event of downloading a
file may be referred to as a direct artifact.
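The inference of Event A (system boot) from Event B (the update request and its response) may be sketched as follows. This is a minimal sketch: the observation fields, the application identifier "ExampleUpdater/1.0" and the rule that this application requests /update only at boot are hypothetical and stand in for whatever application knowledge the security appliance actually holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HttpObservation:
    session: str
    time: int
    source_ip: str
    method: str
    path: str
    application: str      # hypothetical identifier of the requesting application
    got_response: bool


# Assumption for this sketch: these applications request /update only at boot.
BOOT_TIME_APPLICATIONS = {"ExampleUpdater/1.0"}


def infer_boot(obs: HttpObservation) -> Optional[dict]:
    """Return a knowledge graph entry for Event A if Event B was observed."""
    event_b = (
        obs.method == "GET"
        and obs.path == "/update"
        and obs.application in BOOT_TIME_APPLICATIONS
        and obs.got_response
    )
    if not event_b:
        return None
    return {
        "time": obs.time,
        "session": obs.session,
        "device_ip": obs.source_ip,
        "relationship": "system booted",
        "artifact": "consequential",
    }


s3 = HttpObservation("S3", 51, "1.1.1.2", "GET", "/update",
                     "ExampleUpdater/1.0", True)
boot_entry = infer_boot(s3)
```

A direct artifact, by contrast, would need no such rule, because the observed event is itself the event of interest.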
Now, referring to row 618 of connection record table 600 of FIG. 6,
selective information for session S3 is entered in the connection
record table 600. As an example, referring to row 618, at time T50-51,
based on session S3, the source IP address of 1.1.1.2 sent to
destination IP address of 103.4.4.4 five packets with a total of
200 bytes and received four packets with a total of 150 bytes using
HTTP protocol. For example, selective information extracted from
the network communication and populated in row 618 of the
connection record table 600 is shown in row 430 of the table 400 of
FIG. 4.
Now, referring to row 434, during session S4, an unknown protocol
session was initiated. This information is stored in an indicator
table 700, shown in FIG. 7. Now, referring to FIG. 7, indicator
table 700, column 702 shows time, column 704 shows session, column
706 shows source IP, column 708 shows destination IP, column 710
shows indicator. In some examples, the indicator 710 may correspond
to a weak signal. Referring to row 712, at time T51, during session
S4, computing device with IP address of 1.1.1.2 communicated with
computing device with IP address of 200.1.1.1 using an unknown
protocol. For example, selective information extracted from the
network communication and populated in row 712 of the indicator
table 700 is shown in row 436 of the table 400 of FIG. 4.
Further, referring to row 714 of indicator table 700, another
indicator "on system boot" is recorded for session S4 at time T51.
As one skilled in the art appreciates, this entry was based on an
analysis of session S3, where it was concluded that second user
computer booted at time T51, as shown in row 518 of knowledge graph
table 500. As one skilled in the art appreciates, the indicators shown in
rows 712 and 714 may indicate a possible command and control phase
communication between second user computer and a malicious server,
for example, a CnC server with an IP address of 200.1.1.1.
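One way to picture how the two weak signals in rows 712 and 714 combine is sketched below. The dictionary layout, the grouping by session and source IP, and the pattern set are assumptions for illustration; the destination IP of row 714 is not stated above and is left empty here.

```python
from collections import defaultdict

# Rows 712 and 714 of the indicator table 700 (T51 modeled as 51).
indicator_table = [
    {"time": 51, "session": "S4", "source_ip": "1.1.1.2",
     "destination_ip": "200.1.1.1", "indicator": "unknown protocol"},
    {"time": 51, "session": "S4", "source_ip": "1.1.1.2",
     "destination_ip": None, "indicator": "on system boot"},
]

# Hypothetical pattern: both weak signals present for the same session.
CNC_PATTERN = frozenset({"unknown protocol", "on system boot"})


def possible_cnc_sessions(rows):
    """Return (session, source_ip) pairs whose indicators cover CNC_PATTERN."""
    seen = defaultdict(set)
    for row in rows:
        seen[(row["session"], row["source_ip"])].add(row["indicator"])
    return sorted(key for key, found in seen.items() if CNC_PATTERN <= found)


suspects = possible_cnc_sessions(indicator_table)
```

Neither indicator alone is flagged in this sketch; only their co-occurrence for the same session is treated as a possible command and control communication.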
Now, referring to row 620 of connection record table 600 of FIG. 6,
selective information for session S4 is entered in the connection
record table 600. As an example, referring to row 620, at time T50-51,
based on session S4, the source IP address of 1.1.1.2 sent to
destination IP address of 200.1.1.1 one packet of 70 bytes and
received one packet of 50 bytes using an unknown protocol. For
example, selective information extracted from the network
communication and populated in row 620 of the connection record
table 600 is shown in row 438 of the table 400 of FIG. 4.
Now, referring back to FIG. 4, entries between rows 440 and 442
correspond to session S5. Based on the entries between rows 440 and
442, selective information from the network communication is
extracted. For example, extracted selective information
corresponding to row 442 is populated in the connection record
table 600 at row 622 of FIG. 6. Additionally, referring to row 444
of FIG. 4, the file from obscuresite.com is added to the object
store associated with the first user computer, for example, in
object store 220 of the security appliance 102. In this example,
downloading of the file from obscuresite.com is a direct
artifact.
Entries between rows 446 and 448 correspond to session S6. Based on
the entries between rows 446 and 448, selective information from
the network communication is extracted. In this example, in session
S6, the first user computer has moved to a new location and
connected to network interconnect 108 of FIG. 1. When the first
user computer tries to renew its IP address of 1.1.1.1, the DHCP
server rejects the IP address, due to its new location and assigns
a new IP address, in this case, an IP address of 3.3.3.3. For
example, extracted selective information corresponding to row 450
is populated in the knowledge graph table 500 at row 520 of FIG. 5.
And, extracted selective information corresponding to row 448 is
populated in the connection record table 600 at row 624 of FIG.
6.
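Because the first user computer changes its IP address from 1.1.1.1 to 3.3.3.3, an IP address alone does not identify a device; the "IP assigned" rows of the knowledge graph table have to be consulted for the time in question. A minimal sketch of such a lookup follows. The dictionary layout and function names are hypothetical, and the time of session S6 is not given above, so the value 60 is a placeholder.

```python
# "IP assigned" rows 514, 516 and 520 of the knowledge graph table 500.
knowledge_graph = [
    {"time": 12, "session": "S1", "device_ip": "1.1.1.1",
     "device": "first user computer", "relationship": "IP assigned"},
    {"time": 50, "session": "S2", "device_ip": "1.1.1.2",
     "device": "second user computer", "relationship": "IP assigned"},
    {"time": 60, "session": "S6", "device_ip": "3.3.3.3",   # time assumed
     "device": "first user computer", "relationship": "IP assigned"},
]


def ip_of_device(rows, device, at_time):
    """Most recent IP address assigned to the device at or before at_time."""
    rows = [r for r in rows
            if r["relationship"] == "IP assigned"
            and r["device"] == device and r["time"] <= at_time]
    return max(rows, key=lambda r: r["time"])["device_ip"] if rows else None


def device_for_ip(rows, ip, at_time):
    """Device holding the IP address at at_time, or None if it is unassigned."""
    rows_for_ip = [r for r in rows
                   if r["relationship"] == "IP assigned"
                   and r["device_ip"] == ip and r["time"] <= at_time]
    if not rows_for_ip:
        return None
    latest = max(rows_for_ip, key=lambda r: r["time"])
    # The assignment is stale if the device has since received another address.
    if ip_of_device(rows, latest["device"], at_time) != ip:
        return None
    return latest["device"]
```

With these rows, a later session from source IP 3.3.3.3 resolves to the first user computer, while 1.1.1.1 no longer resolves to any device after the reassignment.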
Entries between rows 452 and 454 correspond to session S7. Based on
the entries between rows 452 and 454, selective information from
the network communication is extracted. In this example, in session
S7, the first user computer sends a request to get an image from
www.google.com and receives the image file in response. Referring
to row 456, the image file received from www.google.com is stored
in the object store associated with first user computer, for
example, in object store 220 of the security appliance 102. For
example, extracted selective information corresponding to row 454
is populated in the connection record table 600 at row 626 of FIG.
6. As the security appliance 102 concluded that there was no
information of interest in session S7 to be recorded in the
knowledge graph table 500, there is no corresponding entry in the
knowledge graph table 500 for session S7.
Entries between rows 458 and 460 correspond to start of an instant
messaging (IM) session S8. Based on the entries between rows 458
and 460, selective information from the network communication is
extracted. In this example, in session S8, the first user using the
first user computer sends an IM registration request and receives
an acknowledgement. For example, extracted selective information
corresponding to row 460 is populated in the knowledge graph table
500 at row 522 of FIG. 5. In this example, relationship between a
device entity, in this case, first user computer and a user entity,
the first user is established and maintained in the knowledge
graph.
Entries between rows 462 and 464 correspond to session S9. Based on
the entries between rows 462 and 464, selective information from
the network communication is extracted. For example, extracted
selective information corresponding to row 466 is populated in the
knowledge graph table 500 at row 524 of FIG. 5. And, extracted
selective information corresponding to row 464 is populated in the
connection record table 600 at row 628 of FIG. 6.
Entries between rows 468 and 470 correspond to sessions S10 and
S11. Based on the entries between rows 468 and 470, selective
information from the network communication is extracted. For
example, extracted selective information corresponding to row 472
is populated in the indicator table 700 at row 716 of FIG. 7.
Extracted selective information corresponding to row 474 is
populated in the knowledge graph table 500 at row 526 of FIG. 5.
Extracted selective information corresponding to row 476 is
populated in the indicator table 700 at row 718 of FIG. 7.
Extracted selective information corresponding to row 478 is
populated in the connection record table 600 at row 630 of FIG. 6.
And, extracted selective information corresponding to row 470 is
populated in the connection record table 600 at row 632 of FIG.
6.
Entries between rows 480 and 482 correspond to start of an instant
messaging (IM) session S12. Based on the entries between rows 480
and 482, selective information from the network communication is
extracted. In this example, in session S12, the second user using
the second user computer sends an IM registration request and
receives an acknowledgement. For example, extracted selective
information corresponding to row 482 is populated in the knowledge
graph table 500 at row 528 of FIG. 5.
Entries between rows 484 and 486 correspond to instant messages
between first user and the second user. The first user is using the
instant messaging session started in session S8 and the second user
is using the instant messaging session started in session S12. In
this example, the IM server has an IP address of 2.2.2.30. Now,
referring to row 488, an instant message is sent to first user,
with a hyperlink, through the IM server. For example, the source IP
address of 1.1.1.2 sends the instant message to the IM server with
an IP address of 2.2.2.30, using session S12 which is registered to
second user. Referring to row 486, the instant message received
from the second user is now sent to the first user, using session
S8, along with the hyperlink. In this example, the hyperlink may be
to a malicious host.
Entries between rows 490 to 492 correspond to session S13. In this
session, the first user computer (based on source IP address of
3.3.3.3) starts an HTTP session with a host with IP address of
201.2.2.2. In one example, the host with IP address of 201.2.2.2
may be a malicious host, which may be accessed when the hyperlink
from the instant message is activated. In one example, this
activity may correspond to an exploit phase described with
reference to FIG. 3.
In response, the first user computer receives a file. In one
example, this activity may correspond to a compromise phase
described with reference to FIG. 3. Now, referring to row 494, the
received file is added to an object store associated with first
user computer, for example, in object store 220 of the security
appliance 102. Referring to row 496, extracted selective
information corresponding to row 498 is populated in the indicator
table 700 at row 720 of FIG. 7. Extracted selective information
corresponding to row 498 is populated in the connection record
table 600 at row 634 of FIG. 6.
Now, the security appliance waits for the execution of the exploit
code downloaded to the first user computer. In one example, prior
to execution of the exploit code, the first user computer performs
a certificate revocation check. In one example, the certificate
revocation check is performed within a known number of time units,
for example, two time units, after the download of the executable.
In this example, the exploit code was downloaded at time T113, and
no new session was initiated by the first user computer to perform
a certificate revocation check by time T115. So, based on this
analysis, the security appliance concludes that the exploit code
was not executed by the first user computer.
As previously discussed, this is an example of a consequential
artifact, where an event (certificate revocation check) did not
occur and, based on the event not occurring, a conclusion is reached
that the execution of the exploit code did not occur (again, another
example of an event not occurring). In other words, in this
example, Event B (the certificate revocation check) did not occur
and so it is inferred that Event A (execution of the malicious
code) did not occur. Referring to row 492, extracted selective
information corresponding to row 492 is populated in the indicator
table 700 at row 722 of FIG. 7.
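The inference from a missing certificate revocation check can be sketched as a time-window test. This is a sketch under stated assumptions: the session dictionaries, the "kind" label and the function name are hypothetical, and how the security appliance recognizes a revocation check session is not specified here.

```python
def revocation_check_seen(download_time, source_ip, sessions, window=2):
    """True if source_ip started a revocation check within the window.

    The window of two time units follows the example in the text; the
    "kind" label on each session is a hypothetical classification.
    """
    deadline = download_time + window
    return any(
        s["source_ip"] == source_ip
        and s["kind"] == "certificate revocation check"
        and download_time < s["time"] <= deadline
        for s in sessions
    )


# Sessions observed from the first user computer around the download at T113.
observed = [
    {"time": 113, "source_ip": "3.3.3.3", "kind": "http download"},
]

# No revocation check by T115, so execution is inferred not to have occurred.
exploit_likely_executed = revocation_check_seen(113, "3.3.3.3", observed)
```

The absence of the check is evidence rather than proof in this sketch, so the result is named as a likelihood and not as a certainty.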
In one example, the security appliance 102 may trigger a message to
a user to indicate that exploit code has been loaded onto the first
user computer by the first user but has not been executed. The user
may then take actions to minimize threat posed by the exploit code.
For example, the user may selectively delete the exploit code. As
the exploit code is stored in the object store, one or more
signatures for the exploit code may be generated. The generated
signature may be advantageously used to prevent future malicious
activity.
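The form of the generated signature is not specified above. As one simple possibility, assumed here purely for illustration, a signature could be a SHA-256 digest of each stored object; the object store layout, the placeholder file contents and the function names are likewise hypothetical.

```python
import hashlib

# Hypothetical view of object store 220: device entity -> stored file contents.
object_store = {
    "first user computer": [b"placeholder bytes standing in for the exploit"],
}


def generate_signatures(store):
    """Return the set of SHA-256 hex digests of every stored object."""
    return {
        hashlib.sha256(content).hexdigest()
        for contents in store.values()
        for content in contents
    }


def matches_signature(content, signatures):
    """True if the content's digest equals a previously generated signature."""
    return hashlib.sha256(content).hexdigest() in signatures


signatures = generate_signatures(object_store)
```

An exact digest only matches byte-identical files, so a practical signature would likely need to be more tolerant of small changes to the exploit code.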
As one skilled in the art appreciates, the data stored in the
knowledge graph table 500, connection record table 600 and
indicator table 700 may include additional attributes, in addition
to attributes described herein. For example, the knowledge graph
table 500 may include additional attributes related to various
security entities like data, network, organization, device, persona
(or user attributes) and application. In one example, the security
entities are entities that may have attributes that may be directly
or indirectly relevant from a security or threat analysis
perspective.
As one skilled in the art appreciates, the security appliance 102
selectively extracts information from communication between two
computing devices and builds one or more tables of useful
information, for example, the knowledge graph table 500, the
connection record table 600 and indicator table 700. Various
entries in the knowledge graph table 500, the connection record
table 600 and indicator table 700 may be used by the security
appliance to proactively detect various anomalies or likely threats
in the network environment. Additionally, data stored in the
security appliance may be advantageously used to recreate a roadmap
of events that led to a likely threat.
Now, referring to FIG. 8 an example flow diagram 800 is described.
In block S802, communication between a user computer and a
destination computer is monitored. For example, the user computer
may be the victim user computer 304 and the destination computer
may be a compromised server 302 as described with reference to FIG.
3. In some examples, the user computer may be one or more of the
user computers, for example, first user computer 104-1, second user
computer 104-2 and third user computer 104-3 as shown and described
with reference to FIG. 1. In some examples, the destination
computing device may be one or more of the servers, for example,
first server 116, second server 118, third server 120 and the fourth
server 112 as shown and described with reference to FIG. 1.
In block S804, selective information from the communication is
extracted. For example, as described with reference to security
appliance 102 of FIG. 2, selective information from the packets is
extracted.
In block S806, selective information is associated with one or more
attributes of a security entity. For example, table 400 of FIG. 4
shows various information exchanged in a network session between two
computing devices. Selective information from the network session
is associated with one or more attributes of a security entity. For
example, various entries in the connection record table 600 of FIG.
6 show association of one or more attributes of a security entity.
As an example, referring to row 616 of table 600, a security entity
with a source IP address of 1.1.1.2 communicated with another
security entity with a destination IP address of 2.2.2.2, using
DHCP protocol. The row 616 further shows number of packets sent,
size of the packet sent, number of packets received and size of the
packet received.
In block S808, a knowledge graph is generated for the security
entity based on the associated selective information. For example,
referring to table 500 of FIG. 5, a knowledge graph table is
generated. As an example, referring to row 516 of the knowledge
graph table 500, a security entity with a device IP address of
1.1.1.2 was "second user computer". And, during session S2 at time
T50, the IP address of 1.1.1.2 was assigned. As another example,
referring to row 518 of the knowledge graph table 500, at time T51,
during session S3, the second user computer with an IP address of
1.1.1.2 was booted.
In block S810, one or more indicators of a likely threat are
detected based on the selective information. For example, one or
more indicators of a likely threat are stored in the indicator
table 700 of FIG. 7. As an example, referring to row 620 of the
connection record table 600, during session S4, security entity
with an IP address of 1.1.1.2 communicated with another security
entity with an IP address of 200.1.1.1 using an
unknown protocol. As the protocol of the communication was unknown,
an entry is created in the indicator table, as shown in row 712 of
table 700 with selective information from the communication. For
example, time (T51), session (S4), source IP (1.1.1.2), destination
IP (200.1.1.1) and an indicator of likely threat (Unknown
Protocol). Additionally, as shown in row 714 of table 700, another
entry is created indicating that at time T51, during session S4,
security entity with a source IP of 1.1.1.2 booted up. As
previously described, entries 712 and 714 may together indicate
that security entity with a source IP of 1.1.1.2 may be
communicating with a CnC server with a destination IP of 200.1.1.1.
Further, based on row 518 of the knowledge graph table 500, the
source IP of 1.1.1.2 is associated with second user computer.
As one skilled in the art appreciates, the security appliance 102
may analyze various entries in the knowledge graph table 500,
connection record table 600 and indicator table 700 to identify
a likely threat to a security entity.
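Blocks S802 through S810 can be pictured as a single pass over monitored packets that fills the three tables. The sketch below is not the implementation of the security appliance: the event fields, the list of known protocols and the function name are assumptions, and "src" and "dst" denote the session's client and server regardless of packet direction.

```python
KNOWN_PROTOCOLS = {"DHCP", "HTTP"}  # assumed list, for illustration only


def analyze(events):
    knowledge_graph = []   # in the manner of table 500
    connections = {}       # in the manner of table 600, keyed by session
    indicators = []        # in the manner of table 700
    for e in events:                                   # S802: monitor
        rec = connections.get(e["session"])
        first_packet = rec is None
        if first_packet:                               # S804: extract
            rec = {"session": e["session"], "source_ip": e["src"],
                   "destination_ip": e["dst"], "protocol": e["protocol"],
                   "packets_sent": 0, "bytes_sent": 0,
                   "packets_received": 0, "bytes_received": 0}
            connections[e["session"]] = rec
        side = "sent" if e["dir"] == "out" else "received"
        rec["packets_" + side] += 1                    # S806: associate
        rec["bytes_" + side] += e["size"]
        if e["protocol"] == "DHCP" and "assigned_ip" in e:   # S808
            knowledge_graph.append(
                {"time": e["time"], "session": e["session"],
                 "device_ip": e["assigned_ip"], "device": e["device"],
                 "relationship": "IP assigned"})
        if first_packet and e["protocol"] not in KNOWN_PROTOCOLS:  # S810
            indicators.append(
                {"time": e["time"], "session": e["session"],
                 "source_ip": e["src"], "destination_ip": e["dst"],
                 "indicator": "unknown protocol"})
    return knowledge_graph, list(connections.values()), indicators


events = [
    {"time": 49, "session": "S2", "src": "1.1.1.2", "dst": "2.2.2.2",
     "protocol": "DHCP", "dir": "out", "size": 100},
    {"time": 50, "session": "S2", "src": "1.1.1.2", "dst": "2.2.2.2",
     "protocol": "DHCP", "dir": "in", "size": 500,
     "assigned_ip": "1.1.1.2", "device": "second user computer"},
    {"time": 51, "session": "S4", "src": "1.1.1.2", "dst": "200.1.1.1",
     "protocol": "unknown", "dir": "out", "size": 70},
    {"time": 51, "session": "S4", "src": "1.1.1.2", "dst": "200.1.1.1",
     "protocol": "unknown", "dir": "in", "size": 50},
]
kg, conns, inds = analyze(events)
```

With these inputs the sketch yields one knowledge graph entry comparable to row 516, two connection records comparable to rows 616 and 620, and one indicator comparable to row 712.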
The embodiments disclosed herein can be implemented through at
least one software program running on at least one hardware device
and performing various functions of the security appliance. Various
functions of the security appliance as described herein can be
implemented by at least one of a hardware device, or a combination
of a hardware device and a software module.
The hardware device can be any kind of device which can be
programmed including e.g. any kind of computer like a server or a
personal computer, or the like, or any combination thereof, e.g.
one processor and two FPGAs. The device may also include means
which could be e.g. hardware means like e.g. an ASIC, or a
combination of hardware and software means, e.g. an ASIC and an
FPGA, or at least one microprocessor and at least one memory with
software modules located therein. Thus, the means are at least one
hardware means, and at least one software means. The method
embodiments described herein could be implemented in pure hardware
or partly in hardware and partly in software. Alternatively, the
invention may be implemented on different hardware devices, e.g.
using a plurality of CPUs.
The foregoing description of the specific embodiments will so fully
reveal the general nature of the embodiments herein that others
can, by applying current knowledge, readily modify and/or adapt for
various applications such specific embodiments without departing
from the generic concept, and, therefore, such adaptations and
modifications should and are intended to be comprehended within the
meaning and range of equivalents of the disclosed embodiments. It
is to be understood that the phraseology or terminology employed
herein is for the purpose of description and not of limitation.
Therefore, while the embodiments herein have been described in
terms of preferred embodiments, those skilled in the art will
recognize that the embodiments herein can be practiced with
modification within the spirit and scope of the claims as described
herein.
* * * * *