U.S. patent application number 15/262884 was filed with the patent office on 2016-09-12 and published on 2017-08-31 for a method for tracking machines on a network using multivariable fingerprinting of passively available information.
The applicant listed for this patent is THREATMETRIX PTY LTD. Invention is credited to ALISDAIR FAULKNER, DAVID G. JONES, SCOTT THOMAS.
Application Number | 20170251004 15/262884 |
Document ID | / |
Family ID | 39796673 |
Filed Date | 2016-09-12 |
United States Patent Application | 20170251004 |
Kind Code | A1 |
THOMAS; SCOTT; et al. | August 31, 2017 |
Method For Tracking Machines On A Network Using Multivariable
Fingerprinting Of Passively Available Information
Abstract
A method for tracking machines on a network of computers
includes determining one or more assertions to be monitored by a
first web site which is coupled to a network of computers. The
method monitors traffic flowing to the web site through the network
of computers and identifies the one or more assertions from the
traffic coupled to the network of computers to determine a
malicious host coupled to the network of computers. The method
includes associating a first IP address and first hardware finger
print to the assertions of the malicious host and storing
information associated with the malicious host in one or more
memories of a database. The method also includes identifying an
unknown host from a second web site, determining a second IP
address and second hardware finger print with the unknown host, and
determining if the unknown host is the malicious host.
Inventors: | THOMAS; SCOTT; (ROCKHAMPTON NORTH, AU); JONES; DAVID G.; (CRESCENT FORESTVILLE, AU); FAULKNER; ALISDAIR; (POTTS POINT, AU) |
Applicant: |
Name | City | State | Country | Type |
THREATMETRIX PTY LTD | CHATSWOOD | | AU | |
Family ID: | 39796673 |
Appl. No.: | 15/262884 |
Filed: | September 12, 2016 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14455874 | Aug 8, 2014 | 9444835 |
15262884 | | |
13442857 | Apr 10, 2012 | 9332020 |
14455874 | | |
12022022 | Jan 29, 2008 | 8176178 |
13442857 | | |
60887049 | Jan 29, 2007 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | H04L 63/0281 20130101; H04L 63/1425 20130101; H04L 43/10 20130101; H04L 2463/144 20130101; H04L 63/1408 20130101; H04L 2463/121 20130101 |
International Class: | H04L 29/06 20060101 H04L029/06 |
Claims
1. A method for remote tracking of machines on a network of
computers, the method comprising: determining one or more
assertions to be monitored by a first web site, the first web site
being coupled to a network of computers; monitoring traffic flowing
to the web site through the network of computers; identifying the
one or more assertions from the traffic coupled to the network of
computers to determine a malicious host coupled to the network of
computers; associating a first IP address and first hardware
fingerprint to the one or more assertions of the malicious host;
storing information associated with the IP address, hardware
fingerprint, and the one or more assertions of the malicious host
in one or more memories of a database; identifying an unknown host
from a second web site; determining a second IP address and second
hardware fingerprint with the unknown host; and determining if the
unknown host is a malicious host.
2. The method of claim 1 wherein the hardware fingerprint comprises
a device fingerprint.
3. The method of claim 1 wherein the hardware fingerprint includes
sampled attributes associated with one or more of `stack ticks`,
`time-skew`, and TCP Window size.
4. The method of claim 1 wherein the sampled attributes further
include remote determination of one or more of ISP, Local Storage
Object, first Party Browser Cookie, third Party Browser Cookie, TCP
IP Address, HTTP IP Address, HTTPS IP Address, RTSP IP Address, RTP
IP Address, FTP IP Address, DNS Name Server IP Address, Maximum
Transmission Unit, Connection Type, Connection Speed, Bogon Hijack
Address, Static/Dynamic Address, Proxy IP Address, TCP Sequence
Number, Browser string, Screen Resolution, Screen DPI, PC Start
Time, HTTP Header information, Local Time, Clock-Offset,
Clock-Drift, PC Time Zone, Browser Plugins, Enabled and Disabled
Browser functions, Browser Document Object Model, Operating System,
and Listening, Open and Closed Sockets or other available or
derivable information.
5. The method of claim 1 wherein the connecting host is protected
by a proxy.
6. The method of claim 1 wherein the connecting host is protected
by an intermediary network device for the purposes of disguising
its IP Address.
7. The method of claim 1 wherein the hardware fingerprint is formed
by a fingerprinting device associated with a protected host.
8. The method of claim 1 wherein the hardware fingerprint is formed
by a fingerprinting device that resides on a data path between the
unknown host and a protected host.
9. A method for fingerprinting a connecting host machine on a
network, the method comprising: forcing the connecting host into a
TCP connection, wherein timestamps are transmitted with each packet
associated with the connection; assigning a session handle to the
connection, some or all of subsequent connections that are
associated with the session handle being able to exchange data with
one another; extending a longevity of the connection, the longevity
allowing extended sampling of the host for the purposes of
fingerprinting; sampling attributes associated with the connection;
queuing the sampled attributes, IP address, and session handle to a
correlator process, the correlator process including one or more
algorithms for processing the sampled attributes; processing the
sampled attributes, IP address, and session handle; and forming a
fingerprint for the connecting host.
10. The method of claim 9 wherein the sampled attributes are
associated with one or more of `stack ticks`, `time-skew`, TCP
Window size, and IP address.
11. The method of claim 9 wherein the sampled attributes further
include remote determination of one or more of ISP, Local Storage
Object, first Party Browser Cookie, third Party Browser Cookie, TCP
IP Address, HTTP IP Address, HTTPS IP Address, RTSP IP Address, RTP
IP Address, FTP IP Address, DNS name server, Maximum Transmission
Unit, Connection Type, Connection Speed, Bogon Hijack Address,
Static/Dynamic Address, Proxy Address, TCP Sequence Number, Browser
string, Screen Resolution, Screen DPI, PC Start Time, HTTP Header
information, Local Time, Clock-Offset, TCP-Time Stamp, TCP
stack-tick, Clock-Drift, Time Zone, Browser Plugins, Enabled and
Disabled Browser functions, Browser Document Object Model,
Operating System, and Listening, Open and Closed Sockets or other
available or derivable information.
12. The method of claim 9 wherein the extending of the longevity of
the connection includes a tar-pitting process.
13. The method of claim 12 wherein the extending of the longevity
of the connection includes: delivering requested payload data in a
delayed or retarded manner; and requesting repeated transmission of
data and/or requests by simulating TCP data loss.
14. The method of claim 12 further comprising selecting a second
plurality of identity attributes characterized by quality measures
higher than a predetermined value.
15. A computer based system for populating a database to form a
knowledge base of malicious host entities, the system comprising a
machine readable memory or memories, the memory or memories
comprising: one or more codes directed to determining a plurality
of identity attributes; one or more codes directed to assigning a
quality measure to each of the plurality of identity attributes;
one or more codes directed to collecting one or more evidences from
the unknown host; one or more codes directed to determining an
attribute fuzzy GUID for each of the plurality of identity
attributes for the unknown host, the attribute fuzzy GUID being
associated with the evidences; one or more codes directed to
processing the attribute fuzzy GUID for each of the plurality of
attributes to determine a host fuzzy GUID for the unknown host; and
one or more codes directed to storing the host fuzzy GUID for the
unknown host in one or more memories of a database to form a
knowledge base.
16. The system of claim 15 wherein the unknown host is one of a
plurality of computing devices in a world wide network of
computers.
17. The system of claim 15 wherein the one or more codes directed
to storing is an executable code.
18. The system of claim 15 wherein the knowledge base comprises a
plurality of malicious host information.
19. The system of claim 15 wherein the host fuzzy GUID comprises an
identifier.
20. The system of claim 19 wherein the identifier is an IP address.
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] This application is a continuation application of U.S.
patent application Ser. No. 14/455,874, filed Aug. 8, 2014, which
is a continuation of U.S. patent application Ser. No. 13/442,857,
filed Apr. 10, 2012, now U.S. Pat. No. 9,332,020, which is a
divisional application of U.S. patent application Ser. No.
12/022,022, filed Jan. 29, 2008, now U.S. Pat. No. 8,176,178, which
claims priority to U.S. Provisional Patent Application No.
60/887,049 filed Jan. 29, 2007. All of the above applications are
commonly assigned and incorporated herein by reference in their
entirety for all purposes.
[0002] This application is also related to U.S. patent application
Ser. No. 11/550,393 filed Oct. 17, 2006, now U.S. Pat. No.
8,763,113, entitled "METHOD AND SYSTEM FOR PROCESSING A STREAM OF
INFORMATION FROM A COMPUTER NETWORK USING NODE BASED REPUTATION
CHARACTERISTICS" and U.S. patent application Ser. No. 11/550,395
filed Oct. 17, 2006, now U.S. Pat. No. 8,141,148, entitled "A
METHOD AND SYSTEM FOR TRACKING MACHINES ON A NETWORK USING FUZZY
GUID TECHNOLOGY", commonly assigned and incorporated herein by
reference for all purposes.
COPYRIGHT NOTICE
[0003] All content included such as text, graphics, logos, button
icons, images, audio clips, digital downloads, data compilations,
and software, is the property of its supplier and protected by
United States and international copyright laws. The compilation of
all content is protected by U.S. and international copyright laws.
Copyright © 2006 ThreatMETRIX PTY LTD. All rights reserved.
BACKGROUND OF THE INVENTION
[0004] The present invention generally relates to network
monitoring techniques. More particularly, the invention provides a
method and system for tracking machines on a network using
fingerprinting technology. Merely by way of example, the invention
has been applied to a computer network environment. But it would be
recognized that the invention has a much broader range of
applicability. For example, the invention can be applied to a
firewall, an intrusion detection/prevention system, a server, a
content filter device, an anti-virus process, an anti-SPAM device,
a web proxy content filter, spyware, web security process,
electronic mail filter, any combination of these, and others.
[0005] Telecommunication techniques have been around for numerous
years. In the 1990s, another significant development in the
telecommunication industry occurred. People began communicating to
each other by way of computers, which are coupled to the telephone
lines or telephone network. These computers or workstations coupled
to each other can transmit many types of information from one
geographical location to another geographical location. This
information can be in the form of voice, video, and data, which
have been commonly termed as "multimedia." Information transmitted
over the Internet or Internet "traffic" has increased dramatically
in recent years. Information is now transmitted through networks,
wide-area networks, telephone systems, and the Internet. This
results in rapid transfer of information such as computer data,
voice or other multimedia information.
[0006] Although the telecommunication industry has achieved major
successes, certain drawbacks have also grown with widespread
communication networks. As merely an example, negative effects
include an actor (initiator) connecting to another actor (acceptor)
in a manner not acceptable to the acceptor. The inability of the
acceptor to assess the risk of allowing a connection from any given
initiator is a problem for efficient resource management and
protection of assets.
[0007] As the size and speed of these networks increase, malicious
events using telecommunications techniques have grown similarly:
stalking, cyber-stalking, harassment, hacking, spam, computer-virus
outbreaks, Denial of Service attacks, extortion, and fraudulent
behaviors (e.g., fraudulent websites, scams, 419 spam, and
so-called phishing) have all continued to increase. The goal of
the malicious entity (Offender) is to inflict damage at minimum
risk of detection or accountability. In the current realm of
internet malicious activity, offenders make use of anonymizing
elements to achieve the latter. A broad range of options is
available to the offender because of the current rate of
compromised hosts ("Bots") on the internet.
[0008] Various methods have been proposed to detect compromised
hosts. For example, prior work has been performed and published
that addresses the concept of machine-based fingerprinting. For
example, see
http://www.cse.ucsd.edu/users/tkohno/papers/PDF/KoBrC105PDF-lowres.pdf
These and other conventional methods have certain limitations that
are described throughout the present specification and more
particularly below.
[0009] From the above, it is seen that a technique for improving
security over a wide area network is highly desirable.
BRIEF SUMMARY OF THE INVENTION
[0010] The present invention generally relates to network
monitoring techniques. More particularly, the invention provides a
method and system for tracking machines on a network using
fingerprinting technology. Merely by way of example, the invention
has been applied to a computer network environment. But it would be
recognized that the invention has a much broader range of
applicability. For example, the invention can be applied to a
firewall, an intrusion detection/prevention system, a server, a
content filter device, an anti-virus process, an anti-SPAM device,
a web proxy content filter, application firewall, spyware, web
security process, electronic mail filter, any combination of these,
and others.
[0011] According to an embodiment of the invention, a method is
provided for tracking machines on a network of computers. The
method includes determining one or more assertions to be monitored
by a first web site which is coupled to a network of computers. The
method includes monitoring traffic flowing to the web site through
the network of computers and identifying the one or more assertions
from the traffic coupled to the network of computers to determine a
malicious host coupled to the network of computers. The method
associates a first IP address and first hardware finger print to
the one or more assertions of the malicious host and stores
information associated with the IP address, hardware finger print,
and the one or more assertions of the malicious host in one or more
memories of a database. The method also includes identifying an
unknown host from a second web site and determining a second IP
address and second hardware finger print with the unknown host. The
method then determines if the unknown host is the malicious host.
In a specific embodiment, the network of computers includes a world
wide network of computers. In an embodiment, the hardware
fingerprint includes information associated with one or more of
`stack ticks`, `time-skew`, TCP Window size, and IP address. In an
embodiment, the fingerprint is formed by a fingerprinting device
associated with a protected host. In another embodiment, the
fingerprint is formed by a fingerprinting device associated with a
stream-based host. In some embodiments, the connecting host is
protected by a proxy. In those embodiments, the fingerprint may be
formed by a fingerprinting device that resides on a data path
between the connecting host and a protected host.
[0012] According to another embodiment of the invention, a method
is provided for fingerprinting of a connecting host machine on a
network. The method includes forcing the connecting host into a TCP
connection mode, in which timestamps are transmitted with each
packet associated with the connection. The method includes
assigning a session handle to the connection. Some or all of
subsequent connections that are associated with the session handle
are able to exchange data with one another. The method extends a
longevity of the connection, such that the longevity allows
extended sampling of the host for the purposes of GUID fingerprinting.
The method includes sampling communication information associated
with the connection, and queuing the sampled information, IP
address and session handle to a correlator process. In an
embodiment, the correlator process includes one or more algorithms
for processing the sampled information. The method processes the
sampled information, IP address and session handle to form a
fingerprint for the connecting host.
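The sampling and queuing steps of this embodiment can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the function names, the dictionary layout of a sample, and the use of a random hexadecimal string as the session handle are all assumptions made for the example.

```python
import queue
import uuid
from collections import defaultdict


def new_session_handle():
    """Return a hypothetical opaque session handle for a new connection."""
    return uuid.uuid4().hex


def enqueue_sample(q, session_handle, ip_address, attributes):
    """Queue one set of sampled attributes, with IP address and handle."""
    q.put({"session": session_handle, "ip": ip_address,
           "attributes": dict(attributes)})


def correlate(q):
    """Drain the queue and group samples by (session handle, IP address).

    A real correlator would run its algorithms over each group; this
    sketch only shows the grouping that makes that possible.
    """
    grouped = defaultdict(list)
    while True:
        try:
            item = q.get_nowait()
        except queue.Empty:
            break
        grouped[(item["session"], item["ip"])].append(item["attributes"])
    return dict(grouped)
```

Because every sample carries the session handle, samples taken over several connections in the same session end up in one group, which is what allows extended sampling to feed a single fingerprint.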
[0013] In a specific embodiment of the method for fingerprinting of
a connecting host machine on a network, the sampled information is
associated with one or more of `stack ticks`, `time-skew`, TCP
Window size, and IP address. In an embodiment, the extending of the
longevity of the connection includes a tar-pitting process. For
example, the extending of the longevity of the connection includes
delivering requested payload data in a delayed or retarded manner
and requesting repeated transmission of data and/or requests by
simulating TCP data loss. In an embodiment, the correlator may be
local or remote. In a specific embodiment, the one or more
algorithms include linear regression, auto-correlation and support
vector machines. In certain embodiments, sampling the
communication includes sampling the previous host reputation of the IP
address. In an embodiment, the correlator process includes sample
data from external sensors, using an infrastructure that aggregates
and shares reputation and fingerprint data across multiple users of
reputation and fingerprinting services. In a specific embodiment of
the method, the correlator process incorporates the sampled
measurements to normalize localized jitter, load or latency
transients. In another embodiment, the correlator process includes
correlating host fingerprints with an identical IP address to
identify specific individual hosts. In an embodiment in which a
proxy is involved, the method replaces an HTTP delivered pixel with
an HTTPS delivered pixel with a suitably generated SSL
certificate to force the browser to bypass the HTTP proxy. In an
embodiment, the method also includes splitting data streams of HTTP
and HTTPS, wherein the HTTP host fingerprint and the HTTPS host
fingerprint are compared and correlated either in a single session
or across multiple sessions emerging from the same initial HTTPS IP
address. In an embodiment, wherein the connecting host is protected
by an anonymizing proxy service, the method further includes
forcing communication via a random port to retrieve data to bypass
the proxy. In an embodiment, wherein the connecting host is protected
by an anonymizing proxy service, the method further includes using
FTP to bypass the proxy. In an embodiment, the fingerprint is formed
by a fingerprinting device that resides on a data path between the
connecting host and a protected host. In another embodiment, the
hardware fingerprint includes information associated with one or
more of `stack ticks`, `time-skew`, TCP Window size, and IP
address. In certain embodiments, the fingerprint is formed by a
fingerprinting device associated with a protected host. In other
embodiments, the fingerprint is formed by a fingerprinting device
associated with a stream-based host. In some embodiments, the
connecting host is protected by a proxy, in which case, the
fingerprint is formed by a fingerprinting device that resides on a
data path between the connecting host and a protected host.
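As one illustration of how linear regression could be applied to the sampled timestamps, the sketch below fits an ordinary least-squares line to the offset between a remote host's TCP timestamp clock and the local receive clock; the slope of that line is an estimate of `time-skew`. The function name, the input layout, and the assumption that the remote tick rate `hz` is already known are all hypothetical, and the specification does not say that this particular fit is the one used.

```python
def estimate_clock_skew_ppm(samples, hz):
    """Estimate remote clock skew in parts per million.

    samples: list of (local_receive_time_seconds, remote_timestamp_ticks)
    hz: assumed tick rate of the remote timestamp clock (ticks per second)
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    t0, ts0 = samples[0]
    # x: local elapsed time; y: remote elapsed time minus local elapsed time
    xs = [t - t0 for t, _ in samples]
    ys = [(ts - ts0) / hz - x for (_, ts), x in zip(samples, xs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one instant")
    # Ordinary least-squares slope of offset against local time
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope * 1e6
```

A longer-lived connection yields more samples spread over a longer interval, which is one reason extending the longevity of the connection helps the estimate converge.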
[0014] Various additional objects, features, and advantages of the
present invention can be more fully appreciated with reference to
the detailed description and accompanying drawings that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] FIG. 1 is a simplified view diagram of layers in an internet
transaction according to an embodiment of the present
invention;
[0016] FIG. 2 is a simplified diagram of a method for evidence
gathering according to an embodiment of the present invention;
[0017] FIG. 3 is a simplified diagram of a method for evidence
processing according to an embodiment of the present invention;
[0018] FIG. 4 is a simplified flow diagram of a method for tracking
machines on a network of computers according to an embodiment of
the present invention; and
[0019] FIG. 5 is a simplified flow diagram of a method for querying
a knowledgebase of malicious hosts according to an embodiment of
the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0020] The present invention generally relates to network
monitoring techniques. More particularly, the invention provides a
method and system for tracking machines on a network using
fingerprinting technology. Merely by way of example, the invention
has been applied to a computer network environment. But it would be
recognized that the invention has a much broader range of
applicability. For example, the invention can be applied to a
firewall, an intrusion detection/prevention system, a server, a
content filter device, an anti-virus process, an anti-SPAM device,
a web proxy content filter, spyware, web security process,
electronic mail filter, any combination of these, and others.
[0021] According to embodiments of the invention, many factors are
considered in designing a method for tracking machines on a
network. Some of these factors are discussed below.
[0022] Availability of Data--Not all operating systems immediately
provide all of the information needed for the methods described
here. Required data may be variably implemented, because software
developers interpret the RFCs for TCP/IP communication at their own
discretion whilst testing for interoperability.
[0023] Reliability of Data--Some attributes may vary with operating
conditions. For example, heat, or the presence of mains or
battery power at a host being `GUID fingerprinted`, may have an
effect on one or more parameters. The rebooting of a host also
resets stack-ticks, which are therefore of only transient interest.
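One way to account for the reset of stack-ticks on reboot is to flag the point where a sampled timestamp counter jumps backwards. The sketch below is an assumption-laden illustration: it treats stack-ticks as a 32-bit counter, and it distinguishes an ordinary counter wraparound from a reset by the size of the forward step modulo 2**32. The function name and the `wrap_margin` parameter are hypothetical.

```python
WRAP = 2 ** 32  # assumed width of the timestamp counter


def detect_resets(tsvals, wrap_margin=2 ** 31):
    """Return indexes at which the counter appears to have been reset.

    A counter that merely wrapped has moved forward by a small amount
    modulo 2**32; a counter that restarted from a low value has not.
    """
    resets = []
    for i in range(1, len(tsvals)):
        prev, cur = tsvals[i - 1], tsvals[i]
        if cur < prev:
            forward = (cur - prev) % WRAP
            if forward >= wrap_margin:
                resets.append(i)
    return resets
```

Samples taken before and after a flagged index would then be treated as separate series rather than fitted together.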
[0024] Network Effects--Delay and latency are similar terms that
refer to the amount of time it takes a bit to be transmitted from
source to destination. Jitter is delay that varies over time. One
way to view latency is how long a system holds on to a packet. That
system may be a single device like a router, or a complete
communication system including routers and links. According to an
embodiment of the invention, methods are provided to account for
transient and perceived steady-state effects in the convergence to
`GUID fingerprinting`.
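The distinction between latency and jitter drawn above can be made concrete with a short calculation. In this sketch, latency is taken as the mean of a series of measured delays and jitter as the mean absolute change between consecutive delays; both definitions, and the function name, are assumptions chosen for illustration rather than the measures the specification prescribes.

```python
from statistics import mean


def latency_and_jitter(delays_ms):
    """Return (mean delay, mean absolute change between consecutive delays)."""
    if len(delays_ms) < 2:
        raise ValueError("need at least two delay measurements")
    changes = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return mean(delays_ms), mean(changes)
```

A steady-state effect shows up in the first value and a transient effect in the second, so a correlator can treat the two differently when deciding how much weight a sample deserves.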
[0025] Evidentiary Quality--Part 1 (inaccuracy)--At the extreme end
of poor evidentiary quality are false positives and
maliciously inaccurate sensor reports. In an embodiment, the
invention provides a method to allow `acceptance of false
positives` in context of supporting evidence. The practical
application of this is that some data is unreliable (as stated
above) but can be accepted as a source for correlation which
supports convergence.
[0026] Evidentiary Quality--Part 2 (spoofability or spoof
susceptibility)--Conventional methods may assert that a single
detection is adequate, the equivalent of a DNA fingerprint that is
not "spoofable" or forgeable at detection time. This panacea may be
probable, but no known method has been proven "un-spoofable". In the
situation of Bots, where the host is compromised by code (often at
the kernel level), the machine and its DNA are potentially under the
control of the Offender or Intermediary and, as such, the attributes
may later also be under their control. For example,
MAC address or clock-skew are modifiable if kernel control has been
surrendered. Multiple failings of this technique are discussed
in the public domain at these sites:
[0027] http://www.cloppert.org/blog/2005_03_01_archive.html
[0028]
http://it.slashdot.org/it/05/03/04/1355253.shtml?tid=172&tid=158
[0029] The table below lists examples of host action and host
attribute.
TABLE-US-00001
 | Host attribute not spoofed | Host attribute spoofed
Host Action - Good | High evidentiary quality | Low evidentiary quality
Host Action - Malicious | High evidentiary quality | Low evidentiary quality
[0030] According to embodiments of the invention, methods are
provided to counter the possible exploitation of one or more
attributes by a malicious entity.
[0031] According to embodiments of the present invention, some
malicious individuals may visit a website and use less
sophisticated methods to hide their presence (such as NAT, proxies,
anonymizers and Tor/onion networks). In an embodiment, the
invention provides a method of tracking and identifying the
responsible host--we will call this `GUID fingerprinting`.
[0032] In a specific embodiment, the invention provides a method
for detection and tracking of hosts where openly available
information such as IP addresses and cookies is less effective, and
where a host may be hiding its existence, either
deliberately or through network side-effects such as
corporate or ISP network design.
[0033] In certain embodiments of the invention, methods are provided
that are applicable to TCP or other session oriented transactions
where there is a bi-directional communication between two or more
hosts.
[0034] In an embodiment, the invention provides a method for
tracking machines on a network using attributes available via the
TCP/IP communications infrastructure for some common internet
activities in which malicious behavior is difficult to distinguish
from normal behavior. Sometimes this is referred to as a `low and slow` attack,
where `low` is used as an analogy to a plane flying beneath the
radar so as to avoid detection. In a specific embodiment, a method
is provided that improves tracking of machines by making use of
sampled and derived attributes based on IP (including UDP, TCP/IP
and HTTP and other) such as `stack ticks`, `time-skew`, TCP Window
size, Operating System, Port Number, Port Listening Status, IP
Address, TCP Sequence Number, Maximum Transmission Unit, Connection
Speed, Geolocation and ISP, in combination with attributes
collected using JavaScript, Flash, HTML, Cascading Style Sheets or
other browser-based methods including Local Storage Object, first
party Browser Cookie, third party Browser Cookie, Browser user
agent string, Screen Resolution, Screen DPI, PC Start Time, Local
Time, Clock-Offset, Clock-Drift, PC Time Zone, Browser Plugins,
Enabled and Disabled Browser functions, Browser Document Object
Model, Operating System, and Listening, Open and Closed
Sockets.
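To illustrate how network-level and browser-level attributes might be combined, the sketch below compares two attribute sets and returns the weighted share of attributes that agree. The attribute names, the weights, and the choice of exact equality as the match test are assumptions for the example; the specification lists the attributes but does not prescribe this scoring.

```python
def fingerprint_similarity(a, b, weights):
    """Weighted fraction of shared attributes on which a and b agree.

    a, b: dicts mapping attribute name to sampled value
    weights: dict mapping attribute name to a non-negative weight
    Attributes missing from either side are ignored, not penalized.
    """
    total = 0.0
    matched = 0.0
    for name, weight in weights.items():
        if name in a and name in b:
            total += weight
            if a[name] == b[name]:
                matched += weight
    return matched / total if total else 0.0
```

Using several attributes means that a change in one of them, for example a different IP address after a proxy hop, lowers the score without reducing it to zero.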
[0035] In another embodiment of the invention, a method is provided
for applying a combination of host fingerprinting with IP
information to aid in the countering of zombie, BotNet, malicious
and compromised hosts. This can be applied with increased accuracy
in determining reputation or trust measurements for network hosts
that may be engaging in malicious activity.
[0036] In an embodiment of the invention, methods are provided for
forcing host data collection, data triangulation, correlation
methods and proxy mitigation. In an embodiment, the invention
provides techniques for driving accuracy in the fingerprinting
processes. Embodiments of the invention can be applied to malicious
host tracking, particularly zombie or BotNet computers that are
used without the knowledge of the owner. According to embodiments
of the invention, techniques are provided for increasing the
accuracy of the fingerprinting, including several strategies and
implementation methods that extend the above.
[0037] Depending upon the embodiment, the present invention
includes various features, which may be used. These features
include the following:
[0038] 1. That the application of host Globally Unique Identifier
(GUID) fingerprinting may be used for tracking potentially
malicious hosts that are primarily physically static and remain
bonded to a specific ISP or connection provider for access to a
network (internet).
[0039] 2. That applying a combination of `stack ticks`, `time-skew`,
TCP Window size and IP address is a more effective method of
tracking hosts and formulating a GUID fingerprint.
[0040] 3. That the use of multiple sensors provides a method of
increasing the accuracy of the fingerprint described above. This
can be thought of as a triangulation or correlation method to
converge on a specific GUID fingerprint.
[0041] 4. One method for implementing multiple sensors to increase
accuracy is the use of one or more moderately fixed reference
points for the purposes of removing latency, jitter and other
transient network effects.
[0042] 5. A specific class of algorithms that are applied in
processes 3 and 4 in delivering the converged GUID fingerprint.
[0043] 6. That the application may also be applied to tracking and
detecting hosts that are used for a multitude of fraudulent hosting
sites with varied entity names, DNS names or IP addresses. This is
a method of reusing a single physical host to conduct activity
under many guises or pseudonyms.
[0044] 7. A method for enabling the process 1 and process 2 when
the host resides behind a firewall or network address translation
(NAT) device.
[0045] 8. A method for enabling the process 1 and process 2 when
the host resides behind a HTTP proxy and the communication is
initially HTTP.
[0046] 9. A method where hosts that act as intermediaries or relays in
certain `store-and-forward` communications are able to implement
previous claims. Such hosts may be (but are not limited to) IRC,
Instant Messaging, search, advertising or affiliate network
members, VOIP switches or e-mail based communication.
[0047] 10. A method where network devices that are switching,
routing, bridging, or gateway devices are able to implement
previous claims.
[0048] 11. A method where a passive `stand-aside` network device
may modify the stream of network traffic and is able to implement
previous claims. (man in the middle).
[0049] 12. A method for detection of a "man-in-the-middle" attack
that can be applied by determining that a different host fingerprint is
apparent through the course of a transaction (or group of
transactions).
[0050] 13. A method for identification of users emerging from
Tor/Onion networks or where there is increased sophistication used
by the originator to protect their identity.
[0051] As shown, the above features may be in one or more of the
embodiments to follow. These features are merely examples, which
should not unduly limit the scope of the claims herein. One of
ordinary skill in the art would recognize many variations,
modifications, and alternatives.
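Feature 12 in the list above, detecting a "man-in-the-middle" by noticing that the host fingerprint changes during a transaction, can be sketched in a few lines. The representation of an observation as a (session handle, fingerprint) pair and the function name are assumptions for the example, and a practical system would compare fingerprints with some tolerance rather than by strict equality.

```python
def sessions_with_changed_fingerprint(observations):
    """Return the session handles whose fingerprint changed mid-session.

    observations: iterable of (session_handle, fingerprint) pairs in the
    order they were observed.
    """
    first_seen = {}
    flagged = set()
    for session, fingerprint in observations:
        if session in first_seen and first_seen[session] != fingerprint:
            flagged.add(session)
        first_seen.setdefault(session, fingerprint)
    return flagged
```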
[0052] According to an embodiment of the invention, a method is
provided for tracking machines when both malicious hosts and normal
visitors visit a site and conduct activities. For example, the site
may be provided by an HTTP/Web server. Each host arriving is GUID
fingerprinted so that the identity of the host can be
re-established on return visits. In an embodiment, the method for
tracking a host includes assigning a `reputation` or `trust-rating`
for the host that may be used as a risk-management mechanism in
certain transactions on the website. Such activities as payment
processing, account registration, account login,
entering/publishing of data may be points where a host (and the
Offender controlling the host) can perform fraudulent, malicious or
nuisance transactions.
[0053] FIG. 1 is a simplified diagram for host fingerprint
deployment according to an embodiment of the invention. This
diagram is merely an example, which should not unduly limit the
scope of the claims herein. One of ordinary skill in the art would
recognize other variations, modifications, and alternatives. As
shown, FIG. 1 includes a malicious host 110, Internet 120, a
firewall 130, a website 140 and a fingerprint system 150 such as a
ThreatINDEX Agent.
[0054] FIG. 2 is a simplified diagram of a configuration for a
fingerprint system 200 according to an embodiment of the present
invention. This diagram is merely an example, which should not
unduly limit the scope of the claims herein. One of ordinary skill
in the art would recognize other variations, modifications, and
alternatives. As shown, system 200 includes a platform 210. As an
example, platform 210 can be a THAgent Platform by ThreatMETRIX PTY
LTD. As shown, platform 210 includes one or more reputation
processors 211, a fingerprint correlator 212, and a web server
system 213, such as an Apache web server. The reputation processor
211 is coupled to a main customer website 220. The information
exchange between reputation processor and the customer website
includes Reputation Request, Reputation Response, and Reputation
Assertions. In an embodiment, Reputation Assertions can include,
for example, the following:
[0055] 1. IP Address, Session Handle;
[0056] 2. Session/Policy; and
[0057] 3. Evidence/Local Reputation.
[0058] Although the above has been illustrated in terms of specific
system features, it would be recognized that many variations,
alternatives, and modifications can exist. For example, any of the
system features can be further combined, or even separated. The
features can also be implemented, in part, through software or a
combination of hardware and software. The hardware and software can
be further integrated or less integrated depending upon the
application. Further details of certain methods according to the
present invention can be found throughout the present specification
and more particularly below.
[0059] Referring to FIG. 2, a method for tracking machines on a
network of computers according to an embodiment of the invention can
be briefly described in a flowchart diagram in FIG. 3 and outlined
below. FIG. 3 is a simplified flowchart diagram for a method for
tracking machines on a network according to an embodiment of the
invention. This diagram is merely an example, which should not
unduly limit the scope of the claims herein. One of ordinary skill
in the art would recognize other variations, modifications, and
alternatives. As shown, the method includes the following
processes.
[0060] 1. (Process 310) Determine one or more assertions to be
monitored by a first web site, the first web site being coupled to
a network of computers;
[0061] 2. (Process 320) Monitor traffic flowing to the web site
through the network of computers;
[0062] 3. (Process 330) Identify the one or more assertions from
the traffic coupled to the network of computers to determine a
malicious host coupled to the network of computers;
[0063] 4. (Process 340) Associate a first IP address and first
hardware finger print to the one or more assertions of the
malicious host;
[0064] 5. (Process 350) Store information associated with the IP
address, hardware finger print, and the one or more assertions of
the malicious host in one or more memories of a database;
[0065] 6. (Process 360) Identify an unknown host from a second web
site;
[0066] 7. (Process 360) Determine a second IP address and second
hardware finger print with the unknown host; and
[0067] 8. (Process 370) Determine if the unknown host is the
malicious host.
[0068] In a specific embodiment, the network of computers includes
a world wide network of computers. In an embodiment, the hardware
fingerprint includes information associated with one or more
attributes including `stack ticks`, `time-skew`, TCP Window size,
and IP address. In an embodiment, the fingerprint is formed by a
fingerprinting device associated with a protected host. In another
embodiment, the fingerprint is formed by a fingerprinting device
associated with a stream-based host. In some embodiments, the
connecting host is protected by a proxy. In those embodiments, the
fingerprint may be formed by a fingerprinting device that resides
on a data path between the connecting host and a protected
host.
[0069] The above sequence of steps provides a method for tracking a
machine visiting a website according to an embodiment of the
present invention. As shown, the method uses a combination of steps
including a way of using IP address and fingerprint to track
machines on a network. Other alternatives can also be provided
where steps are added, one or more steps are removed, or one or
more steps are provided in a different sequence without departing
from the scope of the claims herein. Further details of the present
method can be found throughout the present specification and more
particularly below.
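As a minimal sketch of Processes 340 through 370, assuming an in-memory store and string-valued fingerprints, the association, storage and later comparison could look like the following. The class names, field names and return values are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRecord:
    """Hypothetical record of what Process 350 stores for a malicious host."""
    ip_address: str
    fingerprint: str
    assertions: tuple


class HostDatabase:
    """Hypothetical in-memory stand-in for the database of Process 350."""

    def __init__(self):
        self._records = []

    def store(self, record):
        self._records.append(record)

    def match(self, ip_address, fingerprint):
        """Compare an unknown host against stored records (Processes 360-370).

        Returns "fingerprint" when the hardware fingerprint matches (even if
        the IP address differs), "ip" when only the IP address matches, and
        None when neither matches.
        """
        for record in self._records:
            if record.fingerprint == fingerprint:
                return "fingerprint"
        for record in self._records:
            if record.ip_address == ip_address:
                return "ip"
        return None
```

Checking the fingerprint before the IP address reflects the idea that a host may return from a different address; how the two signals are actually weighted is left open here.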
[0070] A method for machine tracking a host visiting a website on a
network according to an embodiment of the invention can be briefly
outlined below.
[0071] Step 1: A host arrives at a website. There is a TCP/IP
connection of the host to the website server ("server"). At the
time of arrival, the host retrieves various components (pages,
images etc.) from the server.
[0072] Step 2: The server or a supporting device (which will be
called the `fingerprinter`) commences to gather information about
the arriving host, assign a "session handle" and develop a GUID
fingerprint. In a specific embodiment, the method includes
instructing the browser to retrieve a `pixel`, Cascading Style Sheet
element, javascript, flash or other HTML element from the
"fingerprinter". An example of the configuration is illustrated in
FIG. 2.
[0073] Step 3: Once the GUID fingerprint is stored in a database,
the host activities are monitored on the site.
[0074] Step 4: If the host conducts malicious activity on the site,
the database is updated to report and retain evidence of this
activity. This activity may affect the host's `reputation`.
[0075] Step 5: Optionally, the GUID fingerprint, reports of
activity and reputation may be shared with other parties or
websites via a shared or `global` database.
[0076] Step 6: This website (or other sites which have received a
trustworthy report of the reputation of this host) may on occasion
of future visits respond differently in accordance with the newly
updated host reputation. For example, if a host is considered
untrustworthy, increased monitoring or rejection of specific
transactions would be a possible response.
[0077] The above sequence of steps provides a method for tracking a
machine visiting a website according to an embodiment of the
present invention. As shown, the method uses a combination of steps
including a way of assigning a "session handle" and developing a
GUID fingerprint. Other alternatives can also be provided where
steps are added, one or more steps are removed, or one or more
steps are provided in a different sequence without departing from
the scope of the claims herein. Further details of the present
method can be found throughout the present specification and more
particularly below.
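The reputation handling of Steps 4 and 6 can be sketched as follows. This is an illustration only: the numeric score between 0 and 1, the fixed penalty, the thresholds, and all names are assumptions, since the specification does not prescribe how a `reputation` is represented.

```python
class ReputationStore:
    """Hypothetical store keyed by GUID fingerprint (Steps 3 and 4)."""

    def __init__(self, initial_score=1.0, penalty=0.4):
        self._initial = initial_score
        self._penalty = penalty
        self._scores = {}
        self._evidence = {}

    def score(self, guid):
        # Hosts that have never been reported keep the initial score.
        return self._scores.get(guid, self._initial)

    def report_malicious(self, guid, evidence):
        # Step 4: retain the evidence and lower the host's reputation.
        self._evidence.setdefault(guid, []).append(evidence)
        self._scores[guid] = max(0.0, self.score(guid) - self._penalty)

    def evidence(self, guid):
        return list(self._evidence.get(guid, []))


def choose_response(score, monitor_below=0.7, reject_below=0.3):
    """Step 6: respond differently according to the current reputation."""
    if score < reject_below:
        return "reject"
    if score < monitor_below:
        return "monitor"
    return "allow"
```

A site would call `choose_response(store.score(guid))` at sensitive points such as payment processing or account registration.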
[0078] A method for GUID fingerprinting according to an embodiment
of the invention includes the following processes.
[0079] In a specific embodiment, the `fingerprinter` ensures or
attempts to force the connecting host into a TCP connection mode,
where timestamps are transmitted with each packet associated with
the connection.
[0080] In an embodiment, for the continuation of the connection, a
connection is assigned a "session handle." Some or all of
subsequent connections that all have the same related "session
handle" may also exchange data in a similar manner.
[0081] In an embodiment, the `fingerprinter` ensures there is
adequate connection activity to establish a significant sample of
the host for the purposes of GUID fingerprint. A method of
achieving this is to extend the longevity of one or more
connections. A method for extending the longevity of a network
connection includes manipulating the communication by a method of
`tar-pitting`.
[0082] FIG. 4 is a simplified view diagram illustrating a method
for extending longevity of network connection according to an
embodiment of the invention. This diagram is merely an example,
which should not unduly limit the scope of the claims herein. One
of ordinary skill in the art would recognize other variations,
modifications, and alternatives. As shown, the method includes a
process of "tar-pitting." In a specific embodiment, the method
includes the following:
[0083] i. Delivering requested payload data in a delayed or
retarded manner.
[0084] ii. Requesting repeated transmission of data and/or requests
by simulating TCP data loss.
Of course, there can be other variations, modifications, and
alternatives.
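Item i above (delayed delivery of the requested payload) can be sketched as a generator that releases the payload in small chunks with a pause between them. The chunk size, the delay, and the injectable `sleep` parameter are assumptions for illustration; simulating TCP data loss (item ii) requires packet-level control and is not shown.

```python
import time


def tarpit_chunks(payload, chunk_size=16, delay_s=0.5, sleep=time.sleep):
    """Yield `payload` in small chunks, pausing before every chunk but the first.

    Stretching the delivery keeps the connection open longer, so that more
    packets (and hence more fingerprint samples) are exchanged.
    """
    for offset in range(0, len(payload), chunk_size):
        if offset:
            sleep(delay_s)
        yield payload[offset:offset + chunk_size]
```

A server would write each yielded chunk to the socket as it is produced; passing a different `sleep` callable allows the pacing to be replaced or observed.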
[0085] According to an embodiment of the invention, a method for
fingerprinting includes sampling certain fingerprint attributes
associated with the communication and queuing the samples, IP
address and "session handle" to a "correlator" process that may be
local or remote.
[0086] In an embodiment, the "correlator" process includes one or
more algorithms to converge results from the sampled fingerprint
attributes to form a fingerprint for the connecting host. Examples
of fingerprint attributes include `stack ticks`, `time-skew`, TCP
Window size, Maximum Transmission Unit, Connection Speed, HTTP
Header fields and IP address, etc. Examples of algorithms and
methods include but are not limited to linear regression,
auto-correlation and support vector machines as well as well known
policy and rule-matching techniques. The specific use of previous
host reputation on IP address and other attributes may be used to
accelerate convergence. This is an example of a more generalized
method of the correlator process incorporating additional sample
data from external sensors. This can be achieved via an
infrastructure that aggregates and shares reputation and
fingerprint data across multiple users of the reputation and
fingerprinting services.
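As one hedged example of how linear regression could converge sampled attributes, the sketch below fits a line through (receive time, TCP timestamp value) samples. The slope approximates the `stack ticks` rate, and its deviation from the nearest nominal rate gives a `time-skew` estimate in parts per million. The candidate nominal rates, the sample layout and the function names are assumptions, not details from the specification.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_ticks_and_skew(samples, nominal_rates=(100, 250, 1000)):
    """Estimate the stack tick rate (Hz) and clock skew (ppm).

    `samples` holds (receive_time_seconds, tcp_timestamp_value) pairs.
    `nominal_rates` lists assumed candidate tick rates in Hz.
    """
    xs = [t for t, _ in samples]
    ys = [v for _, v in samples]
    slope, _ = fit_line(xs, ys)
    nominal = min(nominal_rates, key=lambda rate: abs(rate - slope))
    skew_ppm = (slope - nominal) / nominal * 1e6
    return nominal, skew_ppm
```

At least two samples taken at different times are needed; longer connections (for example those extended by tar-pitting) give a steadier slope.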
[0087] FIG. 5 is a simplified diagram of a method for tracking
machines on a network using multiple fingerprinter hosts according
to an embodiment of the invention. This diagram is merely an
example, which should not unduly limit the scope of the claims
herein. One of ordinary skill in the art would recognize other
variations, modifications, and alternatives. In a specific
embodiment, the eCommerce website may instruct the browser to
retrieve a `pixel`, CSS, HTML, javascript or flash from several
"fingerprinter" hosts. In the example shown in FIG. 5, the
fingerprinter hosts are designated as Remote Fingerprinter 1 (501),
Remote Fingerprinter 2 (502), and Local Fingerprinter (503), etc.
These hosts may be located at specifically selected and
topologically located places on the network (or internet) that
provide (within some tolerance) consistent metrics and
measurements. The correlator process may then incorporate these
measurements as a method of normalizing any localized jitter, load
or latency transients.
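One simple way the correlator might normalize localized transients, offered only as an assumption-laden sketch, is to take the median of the same measurement as reported by each fingerprinter host. The dictionary keys and the choice of the median are illustrative; the specification does not name a particular statistic.

```python
from statistics import median


def combine_measurements(per_fingerprinter):
    """Combine one metric reported by several fingerprinter hosts.

    `per_fingerprinter` maps a fingerprinter name to its measurement.
    The median damps a single outlier caused by local jitter or load.
    """
    return median(per_fingerprinter.values())
```

With three fingerprinters, one reading distorted by a latency spike does not shift the combined value.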
[0088] In a specific embodiment, where multiple hosts are
associated with a single IP address, the fingerprinting methods
discussed above may be correlated. All host fingerprints with the
same IP address can be correlated to identify specific individual
hosts. This may also be combined with, or used independently of,
other methods.
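To illustrate how fingerprints sharing one IP address could be separated into individual hosts, the sketch below groups clock-skew estimates that lie within a tolerance of each other. Using skew alone, the tolerance value, and the function name are assumptions made for this example.

```python
def split_hosts_by_skew(skews_ppm, tolerance_ppm=2.0):
    """Group skew estimates observed behind a single IP address.

    Values are sorted, and a new group starts whenever the gap to the
    previous value exceeds `tolerance_ppm`. Each group is taken to
    correspond to one distinct host.
    """
    clusters = []
    for skew in sorted(skews_ppm):
        if clusters and skew - clusters[-1][-1] <= tolerance_ppm:
            clusters[-1].append(skew)
        else:
            clusters.append([skew])
    return clusters
```

The number of groups returned gives a rough count of distinct hosts behind the address.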
[0089] In another embodiment of the invention, a method is provided
for fingerprinting in the case of connections that are detected or
suspected to be "proxied" (passing through a proxy connection that
is retransmitting packets on behalf of the originator or possibly
even a chain of proxies). Although a proxy fully retransmits all
payload data and can invalidate certain fingerprinting methods, the
present invention outlines methods for by-passing these proxies or
detecting anomalies that would indicate the presence of a
proxy.
[0090] In an embodiment, the invention provides a method for
replacement of a HTTP delivered pixel with an HTTPS delivered pixel
(with a suitably generated SSL certificate) to force the browser
to bypass the HTTP proxy. Using this method, the deployment
scenario illustrated in FIG. 1 or FIG. 3 is still applicable. It
should be noted this method is not guaranteed to work where the
user's machine has been configured for HTTPS proxy.
[0091] As discussed above, it is possible to force a method where
data streams of HTTP and HTTPS can be split. In this situation, the
HTTP host fingerprint and the HTTPS host fingerprint can be
compared and correlated either in a single session or across
multiple sessions emerging from the same initial HTTPS IP
address.
[0092] According to another embodiment of the invention, a method
is provided for fingerprinting a connecting host that is protected
by an anonymous proxy. For example, Tor, Onion, or other
anonymizing services exist on the internet to afford the user a
means of privacy protection in regard to their IP address. The
authors acknowledge the value of these services and also note that
many who operate such services use certain measures to mitigate or
control the number of hacker activities utilizing the service. In
some embodiments, the invention provides methods in order to
provide additional protection from such activities.
[0093] i. In an embodiment, the extrusion points of these networks
are far more static, and far fewer in number, than compromised
hosts and are therefore easily tracked using existing IP reputation
methods described throughout the application. The method for
extrusion point detection is to regularly subscribe to these
services and record the extrusion points.
[0094] ii. In another embodiment, communication may be forced via
server side HTML to use a TCP port not serviced for HTTP or HTTPS.
For example, the use of a random port to retrieve data may by-pass
the proxy. Similarly, FTP may by-pass such proxies. Where proxy-bypass
is successful, the IP Address of the communication will differ from
the IP Address presented by the proxy.
[0095] iii. In another embodiment, communication may be forced via
server side HTML to bypass proxy settings as defined by the
browser. For example, the browser is caused to request a media file
via the Real Time Streaming Protocol supported by many popular
media players of the time. Yet another example of the method is to
cause objects requested by the browser to initiate a connection
back to the `fingerprinter` while ignoring the proxy configuration in
the browser. Where the method is successful, the IP Address of the
communication will differ from the IP Address presented by the
proxy.
[0096] iv. In another embodiment, anomalies are detected between
attributes collected from the client and attributes determined from
the connecting host's protocol stream. One such example is
detecting that the operating system determined from the browser
agent string based on using one or more of javascript, flash, pixel
or Cascading Style Sheet element is inconsistent with the operating
system specific implementation of the TCP protocol of the
connecting host. An example of an operating system specific
implementation of the TCP protocol is the rate of change between
network stack `ticks` as described by the timestamps feature in the
TCP extensions for high performance (RFC 1323). Yet another example is the
detection of inconsistencies between the time-zone measured using
javascript or flash downloaded by the client, and the time-zone of
the connecting host as implied by the geo-location of the
connecting host's IP Address.
[0097] v. In another embodiment, the presence of a proxy is
determined based on anomalies detected in HTML that are, or are
not, successfully requested by the client from the `fingerprinter`.
Many proxies are configured to minimize the amount of information
about the client that can be leaked to a server in order to
preserve anonymity. One such example is that many commercial
proxies or open source CGI proxies will automatically filter 1x1
pixels, commonly known as web-bugs, which are used in a similar
fashion to traditional browser cookies for tracking clients.
Further this example makes use of both an encoded and an un-encoded
pixel, wherein the method of encoding is understood by the browser
but not by the proxy filter and hence one pixel is filtered and the
other is not, indicating the likely presence of an intermediary
filter device. One such example of encoding is to use Cascading
Style Sheet elements to embed the pixel as a transparent background
of an element.
[0098] vi. In another embodiment, the presence of a proxy is
determined based on inconsistencies between the geolocation of the
connecting host IP Address and the geolocation of the client's DNS
server IP Address using the following steps.
[0099] vi.i The connecting host requests a HTML page from the
`fingerprinter` that contains a unique hostname as part of a URL
that has been uniquely generated for that connecting host. The host
name has two characteristics: 1) it contains a unique string that
can be uniquely matched to the session id or session handle; and 2)
it belongs to an authoritative server which is accessible to, or
located on, the `fingerprinter`.
[0100] vi.ii A DNS response is normally cached by a client;
however, because the host name is unique and the server can control
how long the client will cache each response, the client DNS will
need to resend the DNS request each time it wishes to access the IP
Address for the unique host name.
[0101] vi.iii The `fingerprinter`, which may include a DNS server,
accesses the attributes of the client's forwarded DNS request
packet and determines the IP Address of the originating client DNS
server.
[0102] vi.iv The geolocation of the client's DNS server IP Address
is then compared to the geolocation of the connecting host IP
Address to determine if they are in a reasonable vicinity. For
example, if determined not to be from the same country then this
would indicate the use of a proxy.
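The checks described in items iv through vi can be sketched together as follows. This is a simplified illustration under stated assumptions: the zone `fp.example.net`, the dictionary keys, the indicator labels, and the idea that operating system and country values have already been derived elsewhere are all hypothetical. Geolocation lookup and DNS serving are outside the sketch, and the session handle is assumed to contain only characters valid in a DNS label.

```python
import secrets


def make_probe_hostname(session_handle, zone="fp.example.net"):
    """Item vi.i: build a unique host name that embeds the session handle.

    A random suffix keeps the name unique so that resolvers cannot answer
    from cache. `zone` is a placeholder for a zone served by the
    fingerprinter's authoritative DNS server.
    """
    nonce = secrets.token_hex(4)
    return f"{session_handle}-{nonce}.{zone}"


def session_from_hostname(hostname):
    """Recover the session handle from a queried probe host name."""
    first_label = hostname.split(".", 1)[0]
    return first_label.rsplit("-", 1)[0]


def proxy_indicators(obs):
    """Return labels for each inconsistency suggesting an intermediary."""
    indicators = []
    # Item iv: operating system claimed by the browser vs. implied by TCP.
    if obs["browser_os"] != obs["tcp_stack_os"]:
        indicators.append("os-mismatch")
    # Item v: exactly one of the plain and encoded pixels was filtered.
    if obs["plain_pixel_fetched"] != obs["encoded_pixel_fetched"]:
        indicators.append("pixel-filtered")
    # Item vi.iv: connecting host and DNS resolver are in different countries.
    if obs["host_country"] != obs["resolver_country"]:
        indicators.append("dns-geo-mismatch")
    return indicators
```

Each label is only an indication; comparing countries is the example given in item vi.iv, and a finer notion of "reasonable vicinity" could be substituted.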
[0103] According to another embodiment of the invention, a method
is provided for tracking machines on a network including proxy
configuration for stream-based communications. In this scenario
both malicious hosts and normal visitors transfer data via stream.
Protocols that are typical (but not limited to) are: FTP, SMTP,
IRC, Instant Messaging, VOIP communication. The method can be
briefly outlined below.
[0104] Step 1: A host attempts to commence a communication with a
stream-based service. The connection is established via an inbound
proxy host that either transparently bridges or routes the data to
the host delivering the service.
[0105] Step 2: The bridging or routing device (which will be called
the `fingerprinter`) commences to gather information about the
initiating host, assign a "session handle" and develop a GUID
fingerprint.
[0106] Step 3: Once the GUID fingerprint is stored in a database,
the host's transmissions and other activities are monitored.
[0107] Step 4: If the host conducts malicious activity, the
database is updated to report and retain evidence of this activity.
This activity may affect the host's `reputation`.
[0108] Step 5: Optionally, the GUID fingerprint, reports of
activity and reputation may be shared with other parties or service
providers via a shared or `global` database.
[0109] Step 6: This service (or other sites who have received a
trustworthy report of the reputation of this host) may on occasion
of future visits respond differently in accordance with the newly
updated host reputation.
[0110] The above sequence of steps provides a method for tracking a
machine on a network including proxy configuration for stream-based
communications according to an embodiment of the invention. As
shown, the method uses a combination of steps including a way of
using an inbound proxy host that either transparently bridges or
routes the data to the host delivering the service. Other
alternatives can also be provided where steps are added, one or
more steps are removed, or one or more steps are provided in a
different sequence without departing from the scope of the claims
herein. Further details of the present method can be found
throughout the present specification and more particularly
below.
[0111] According to an alternative embodiment of the invention, a
method is provided for tracking machines on a network using bridge
or router techniques. In this scenario both malicious hosts and
normal visitors transfer data via stream. Protocols that are
typical (but not limited to) are: FTP, SMTP, IRC, Instant
Messaging, VOIP communication. The method can be briefly outlined
below. This situation is identical to that described above with
reference to a proxy configuration. However, the device described
as a proxy is replaced with a device that operates transparently at
a lower level in the TCP/IP communication. The host behaves as a
normal bridge or router network device except for the Steps 1-6
described in Method B.
[0112] Step 1: A host attempts to commence a communication with a
stream-based service. The connection is established via an inbound
device that either transparently bridges or routes the data to the
host delivering the service. In an embodiment, the inbound device
operates transparently at a lower level in the TCP/IP
communication. For example, the inbound device can be a bridge or
router network device that is configured to be a fingerprinter.
[0113] Step 2: The `fingerprinter` commences to gather information
about the initiating host, assign a "session handle" and develop a
GUID fingerprint.
[0114] Step 3: Once the GUID fingerprint is stored in a database,
the host's transmissions and other activities are monitored.
[0115] Step 4: If the host conducts malicious activity, the
database is updated to report and retain evidence of this activity.
This activity may affect the host's `reputation`.
[0116] Step 5: Optionally, the GUID fingerprint, reports of
activity and reputation may be shared with other parties or service
providers via a shared or `global` database.
[0117] Step 6: This service (or other sites who have received a
trustworthy report of the reputation of this host) may on occasion
of future visits respond differently in accordance with the newly
updated host reputation.
[0118] The above sequence of steps provides a method for tracking a
machine on a network including proxy configuration for stream-based
communications according to an embodiment of the invention. As
shown, the method uses a combination of steps including a way of
using an inbound device, such as a bridge or a router network
device, that either transparently bridges or routes the data to the
host delivering the service. Other alternatives can also be
provided where steps are added, one or more steps are removed, or
one or more steps are provided in a different sequence without
departing from the scope of the claims herein. Further details of
the present method can be found throughout the present
specification.
[0119] According to yet another embodiment, the invention provides
a "man-in-the-middle" method that includes a "fingerprinting
device" that is able to influence and measure communication
activities for the purposes of fingerprinting. In an embodiment,
the fingerprinting device resides on the data path between the
"visiting host" and the "protected host". The method can be briefly
outlined below.
[0120] Step 1: At specific times during establishment of
communication from the visiting host to the protected host, a
communication is initiated from the "fingerprinting device". In a
specific embodiment, during the TCP session establishment, the
"fingerprinting device" replies in a manner that "spoofs" a
response to the "visiting host" before the "protected host's"
packets arrive. The latter's packets are ignored and the TCP
session is established in the manner required for fingerprinting.
[0121] In Steps 2-6 below, the "fingerprinting device" gathers
the required data by passively sniffing the appropriate traffic
elements.
[0122] Step 2: The `fingerprinting device` commences to gather
information about the initiating host, assign a "session handle"
and develop a GUID fingerprint.
[0123] Step 3: Once the GUID fingerprint is stored in a database,
the host's transmissions and other activities are monitored.
[0124] Step 4: If the host conducts malicious activity, the
database is updated to report and retain evidence of this activity.
This activity may affect the host's `reputation`.
[0125] Step 5: Optionally, the GUID fingerprint, reports of
activity and reputation may be shared with other parties or service
providers via a shared or `global` database.
[0126] Step 6: This service (or other sites who have received a
trustworthy report of the reputation of this host) may on occasion
of future visits respond differently in accordance with the newly
updated host reputation.
[0127] The above sequence of steps provides a method for tracking a
machine on a network including proxy configuration for stream-based
communications according to an embodiment of the invention. As
shown, the method uses a combination of steps including a way of
using a "fingerprinting device" that is able to influence and
measure communication activities for the purposes of
fingerprinting. This specific method is well suited to long running
sessions where TCP session initiation is a small fraction of the
overall communication volume. Other alternatives can also be
provided where steps are added, one or more steps are removed, or
one or more steps are provided in a different sequence without
departing from the scope of the claims herein. Further details of
the present method can be found throughout the present
specification.
[0128] It is also understood that the examples and embodiments
described herein are for illustrative purposes only and that
various modifications or changes in light thereof will be suggested
to persons skilled in the art and are to be included within the
spirit and purview of this application and scope of the appended
claims.
* * * * *