U.S. patent application number 12/773766 was filed with the patent office on 2010-11-04 for demographic analysis using time-based consumer transaction histories.
Invention is credited to Patrick Faith, Zhongxiao Jiang, Kevin P. Siegel.
Application Number | 20100280881 12/773766 |
Document ID | / |
Family ID | 43031090 |
Filed Date | 2010-11-04 |
United States Patent
Application |
20100280881 |
Kind Code |
A1 |
Faith; Patrick ; et
al. |
November 4, 2010 |
DEMOGRAPHIC ANALYSIS USING TIME-BASED CONSUMER TRANSACTION
HISTORIES
Abstract
Systems, apparatus, and methods for determining groups of
similar consumers and for identifying a trend in consumer behavior
are provided. Likelihoods of a transaction being initiated at
various times by one consumer can be calculated based on previous
transactions of the consumer. The likelihoods for different
consumers can be used to determine a group of similar consumers as
a demographic. The likelihoods of a transaction being initiated at
various times by a consumer of a demographic (or other entity) can
be used to forecast trends (such as a demand for a product) and
make business decisions, such as for marketing campaigns, inventory
levels (e.g. at particular stores or for all stores), pricing, and
store locations. Such likelihoods when focused to a particular
category of transactions can provide even greater accuracy.
Inventors: |
Faith; Patrick; (Pleasanton,
CA) ; Siegel; Kevin P.; (Milpitas, CA) ;
Jiang; Zhongxiao; (Cupertino, CA) |
Correspondence
Address: |
TOWNSEND AND TOWNSEND CREW LLP
TWO EMBARCADERO CENTER, 8TH FLOOR
SAN FRANCISCO
CA
94111
US
|
Family ID: |
43031090 |
Appl. No.: |
12/773766 |
Filed: |
May 4, 2010 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
61175381 |
May 4, 2009 |
|
|
|
Current U.S.
Class: |
705/7.34 ;
706/15; 706/52 |
Current CPC
Class: |
G06Q 10/06375 20130101;
G06Q 30/0202 20130101; G06Q 30/0204 20130101; G06Q 20/40 20130101;
G06Q 10/04 20130101; G06Q 30/02 20130101; G06Q 40/00 20130101; G06Q
30/0205 20130101; G06Q 30/0224 20130101; G06Q 40/12 20131203; G06Q
20/4016 20130101; G06Q 30/0269 20130101 |
Class at
Publication: |
705/10 ; 706/52;
706/15 |
International
Class: |
G06Q 10/00 20060101
G06Q010/00; G06N 5/02 20060101 G06N005/02; G06N 3/02 20060101
G06N003/02 |
Claims
1. A method of identifying a consumer as belonging to a particular
demographic, the method comprising: receiving first data associated
with first transactions of a first entity and second data
associated with second transactions of one or more second entities;
a computer system identifying one or more first patterns of the
first previous transactions and identifying one or more second
patterns of the second previous transactions, wherein each pattern
includes a plurality of values, with at least two of the values
respectively including contributions from transactions
corresponding to different time ranges; comparing the one or more
first patterns to the one or more second patterns; and based on the
comparison, determining whether the first entity and the one or
more second entities belong to a same demographic.
2. The method of claim 1, wherein the first entity is a first
consumer and the one or more second entities are one or more second
consumers.
3. The method of claim 2, wherein the first consumer belongs to the
same demographic at less than one hundred percent.
4. The method of claim 3, further comprising: comparing the first
patterns to other patterns associated with other demographics; and
determining whether the first consumer belongs to the other
demographics.
5. The method of claim 4, further comprising: defining the first
consumer as a linear combination of the patterns of the
demographics to which the first consumer belongs.
6. The method of claim 2, further comprising: sending an incentive
for a first transaction to the first consumer, wherein the
incentive is sent at a time correlated to when the first
transaction is likely for the one or more second consumers; and
wherein determining whether the first entity and the one or more
second entities belong to the same demographic includes determining
whether the first consumer uses the incentive when the first
transaction is likely for the one or more second consumers.
7. The method of claim 1, wherein a value of each pattern
corresponds to a likelihood of a transaction occurring within a
specified time range.
8. The method of claim 1, wherein identifying one or more patterns
of transactions includes: associating one or more keys with each
previous transaction; correlating pairs of previous transactions,
each correlated pair associated with a particular pair of keys; and
for each correlated pair, determining time intervals between the
transactions of the correlated pair; and for each key pair:
tracking numbers of occurrences of correlated pairs having time
intervals within specified time ranges, the transactions of the
correlated pairs being associated with corresponding keys of the
key pair.
9. The method of claim 8, wherein comparing the one or more first
patterns to the one or more second patterns includes comparing a
portion of first patterns associated with pairs of keys that match
with pairs of keys associated with a portion of the second
patterns, the method further comprising: identifying first patterns
associated with pairs of keys that do not match with pairs of keys
associated with the second patterns, wherein determining whether
the first entity and the one or more second entities belong to a
same demographic includes analyzing the non-matching patterns.
10. The method of claim 1, wherein comparing the one or more first
patterns to the one or more second patterns includes determining a
difference between corresponding values of matching patterns
between the first entity and the one or more second entities; and
comparing a sum of the differences to a threshold value.
11. The method of claim 1, wherein comparing the one or more first
patterns to the one or more second patterns includes: multiplying
values of a first pattern with corresponding values of a matching
second pattern to obtain a plurality of resulting values; and
analyzing the resulting values to determine whether the first
consumer and the one or more second consumers belong to a same
demographic.
12. The method of claim 11, wherein analyzing the resulting values
is performed with a neural network.
13. A computer program product comprising a tangible computer
readable medium storing a plurality of instructions for controlling
one or more processors to perform the method of claim 1.
14. A computer system comprising: one or more processors; and the
computer program product of claim 13.
15. A method of identifying a trend in consumer behavior, the
method comprising: receiving data associated with previous
transactions of an entity; a computer system determining one or
more patterns of the previous transactions, wherein each pattern
includes a plurality of values, with at least two of the values
respectively including contributions from transactions
corresponding to different time ranges; determining likelihoods for
an occurrence of a transaction according to the one or more
patterns, each likelihood at a respective one of a plurality of
different times; and identifying a trend in occurrences of the
transaction based on the likelihoods at the plurality of different
times.
16. The method of claim 15, wherein the entity is a group of
similar consumers.
17. The method of claim 15, wherein the entity is a merchant or a
group of similar merchants.
18. The method of claim 15, further comprising: selecting one or
more relevant patterns from the determined patterns based on a
product and/or a merchant associated with the transaction, wherein
the likelihoods are determined from the relevant patterns.
19. The method of claim 15, wherein determining the one or more
patterns of the previous transactions includes: associating one or
more keys with each previous transaction; correlating pairs of
previous transactions; and determining time intervals between
correlated pairs of previous transactions; and tracking the
occurrences of pairs of correlated previous transactions having
time intervals within specified time ranges for a plurality of
pairs of keys.
20. The method of claim 15, further comprising: adjusting one or
more characteristics of a product being sold based on the
identified trend.
21. The method of claim 20, wherein the characteristic includes a
number of items of the product and/or a price of the product.
22. The method of claim 21, wherein the number of items is adjusted
by changing inventory levels of the product at a particular
location.
23. The method of claim 15, wherein identifying a trend in
occurrences of the transaction based on the likelihoods at the
plurality of different times includes determining a likelihood
above an upper threshold value or below a lower threshold
value.
24. The method of claim 15, wherein determining likelihoods for an
occurrence of a transaction according to the one or more patterns
includes determining a probability distribution as a function of
time from the one or more patterns, and wherein the trend is
identified from changes in the probability distribution from one
time to another.
25. The method of claim 15, wherein determining likelihoods for an
occurrence of a transaction according to the one or more patterns
includes determining a probability distribution as a function of
time from the one or more patterns, the method further comprising:
determining one or more other probability distributions from one or
more other patterns associated with the entity, wherein the other
probability distributions are determined at calculation times
different than the probability distribution; calculating a
difference between the probability distribution and the one or more
other probability distributions, wherein the trend is identified
based on the difference.
26. A computer program product comprising a tangible computer
readable medium storing a plurality of instructions for controlling
one or more processors to perform the method of claim 15.
27. A computer system comprising: one or more processors; and the
computer program product of claim 26.
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] The present application claims priority from and is a non
provisional application of U.S. Provisional Application No.
61/175,381, entitled "SYSTEMS AND METHODS FOR DETERMINING
AUTHORIZATION, RISK SCORES, AND PREDICTION OF TRANSACTIONS" filed
May 4, 2009, the entire contents of which are herein incorporated
by reference for all purposes.
[0002] This application is related to commonly owned and
concurrently filed U.S. patent applications entitled
"PRE-AUTHORIZATION OF A TRANSACTION USING PREDICTIVE MODELING" by
Faith et al. (attorney docket number 016222-046210US), "DETERMINING
TARGETED INCENTIVES BASED ON CONSUMER TRANSACTION HISTORY" by Faith
et al. (attorney docket number 016222-046220US), "TRANSACTION
AUTHORIZATION USING TIME-DEPENDENT TRANSACTION PATTERNS" by Faith
et al. (attorney docket number 016222-046240US), and
"FREQUENCY-BASED TRANSACTION PREDICTION AND PROCESSING" by Faith et
al. (attorney docket number 016222-046250US), the entire contents
of which are herein incorporated by reference for all purposes.
BACKGROUND
[0003] The present application is generally related to tracking and
processing consumer transactions, and more specifically to using a
history of consumer activity in determining a demographic of
consumers and trends in transactions of a demographic.
[0004] As part of marketing campaigns for a product, companies have
identified a particular group of consumers that are likely to buy
the product. A group of consumer is sometimes referred to as a
demographic. Typically, a demographic is determined by external
features of a consumer, such as age or geographic location.
Although such demographics can be useful, other characteristics of
consumers with a same age or location can vary significantly. Thus,
a marketing campaign or other business decision based on such
demographics can be faulty due to conflicting data since the
consumers have disperse characteristics. Moreover, such
demographics typically provide only very general information, and
thus do not provide information for making specific business
decisions.
[0005] Therefore, it is desirable to provide better methods of
identifying groups of consumers and for providing more detailed
information about the transactions of a group of consumers.
BRIEF SUMMARY
[0006] Embodiments provide systems, apparatus, and methods for
determining groups of similar consumers and for identifying a trend
in consumer behavior. Certain embodiments can use likelihoods of a
transaction being initiated at various times to determine a group
of similar users as a demographic (affinity group). The likelihoods
at a plurality of times can be used to forecast trends (such as a
demand for a product) and make business decisions, such as for
marketing campaigns, inventory levels (e.g. at particular stores or
for all stores), pricing, and store locations. Such likelihoods
when focused to a particular category of transactions (e.g. a
particular product) can provide even greater accuracy.
[0007] According to one embodiment, a method of identifying a
consumer as belonging to a particular demographic is provided. Data
respectively associated with transactions of a first entity and one
or more second entities are received. A computer system identifies
patterns of the first entity and patterns of the second entity.
Each pattern includes a plurality of values, with at least two of
the values respectively including contributions from transactions
corresponding to different time ranges. Whether the first entity
and the one or more second entities belong to a same demographic is
determined based on a comparison of patterns of the first entity
and the second entities.
[0008] According to another embodiment, a method of identifying a
trend in consumer behavior is provided. Data associated with
previous transactions of an entity are received. A computer system
determines one or more patterns of the previous transactions. Each
pattern includes a plurality of values, with at least two of the
values respectively including contributions from transactions
corresponding to different time ranges. Likelihoods for an
occurrence of a transaction are determined according to the one or
more patterns. Each likelihood is for the occurrence of the
transaction at a respective one of a plurality of different times.
A trend in occurrences of the transaction is identified based on
the likelihoods at the plurality of different times.
[0009] Other embodiments of the invention are directed to systems,
apparatuses, portable consumer devices, and computer readable media
associated with methods described herein.
[0010] A better understanding of the nature and advantages of the
present invention may be gained with reference to the following
detailed description and the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] FIG. 1 shows a block diagram of a system according to an
embodiment of the present invention.
[0012] FIG. 2A shows a plot of a transaction history or other
events of a first consumer as analyzed according to embodiments of
the present invention.
[0013] FIG. 2B shows a plot of a transaction history or other
events of a second consumer as analyzed according to embodiments of
the present invention.
[0014] FIG. 3 is a flow chart of a method 300 for identifying a
consumer as belonging to a particular demographic and identifying a
trend in consumer behavior using a demographic according to
embodiments of the present invention.
[0015] FIG. 4 is a plot of a number of transactions at certain
elapsed times between a final transaction (with key KF) and an
initial event (with key KI) of a correlated key pair according to
embodiments of the present invention.
[0016] FIG. 5A shows a table that stores time information for a key
pair <KI:KF> according to embodiments of the present
invention.
[0017] FIG. 5B shows a plot for use in determining the time ranges
for table according to an embodiment of the present invention.
[0018] FIG. 6 shows an example of obtaining indicia of a similarity
of transactions of one entity relative to a transaction pattern of
another entity according to embodiments.
[0019] FIG. 7 shows a calculation of a likelihood that transaction
patterns of a first entity are similar to transaction patterns of a
second entity according to embodiments.
[0020] FIG. 8 is a flowchart of a method for determining a
likelihood of a transaction and using the likelihood to identify a
trend according to embodiments.
[0021] FIG. 9 shows a block diagram of an example computer system
usable with systems and methods according to embodiments of the
present invention.
DETAILED DESCRIPTION
[0022] Information about a group of consumers can be useful in
making business decisions (such as shaping marketing decisions or
inventory levels). However, consumers have typically been organized
by broad external factors like age. Embodiments can provide more
narrowly tailored affinity groups, e.g., ones that have similar
transaction patterns. Such affinity groups can lead to greater
accuracy in determining success of a business decision. Likelihoods
of a transaction at various times can also be used to identify
trends in certain types of transactions, and therefore allow
accurate forecasting.
[0023] I. System Overview
[0024] FIG. 1 shows an exemplary system 20 according to an
embodiment of the invention. Other systems according to other
embodiments of the invention may include more or less components
than are shown in FIG. 1.
[0025] The system 20 shown in FIG. 1 includes a merchant 22 and an
acquirer 24 associated with the merchant 22. In a typical payment
transaction, a consumer 30 may purchase goods or services at the
merchant 22 using a portable consumer device 32. The merchant 22
could be a physical brick and mortar merchant or an e-merchant. The
acquirer 24 can communicate with an issuer 28 via a payment
processing network 26. The merchant 22 could alternatively be
connected directly to the payment processing network 26. The
consumer may interact with the payment processing network 26 and
the merchant through an access device 34.
[0026] As used herein, an "acquirer" is typically a business
entity, e.g., a commercial bank that has a business relationship
with a particular merchant or an ATM. An "issuer" is typically a
business entity (e.g., a bank) which issues a portable consumer
device such as a credit or debit card to a consumer. Some entities
can perform both issuer and acquirer functions. Embodiments of the
invention encompass such single entity issuer-acquirers.
[0027] The consumer 30 may be an individual, or an organization
such as a business that is capable of purchasing goods or services.
In other embodiments, the consumer 30 may simply be a person who
wants to conduct some other type of transaction such as a money
transfer transaction or a transaction at an ATM.
[0028] The portable consumer device 32 may be in any suitable form.
For example, suitable portable consumer devices can be hand-held
and compact so that they can fit into a consumer's wallet and/or
pocket (e.g., pocket-sized). They may include smart cards, ordinary
credit or debit cards (with a magnetic strip and without a
microprocessor), keychain devices (such as the Speedpass.TM.
commercially available from Exxon-Mobil Corp.), etc. Other examples
of portable consumer devices include cellular phones, personal
digital assistants (PDAs), pagers, payment cards, security cards,
access cards, smart media, transponders, and the like. The portable
consumer devices can also be debit devices (e.g., a debit card),
credit devices (e.g., a credit card), or stored value devices
(e.g., a stored value card).
[0029] The merchant 22 may also have, or may receive communications
from, an access device 34 that can interact with the portable
consumer device 32. The access devices according to embodiments of
the invention can be in any suitable form. Examples of access
devices include point of sale (POS) devices, cellular phones, PDAs,
personal computers (PCs), tablet PCs, handheld specialized readers,
set-top boxes, electronic cash registers (ECRs), automated teller
machines (ATMs), virtual cash registers (VCRs), kiosks, security
systems, access systems, and the like.
[0030] If the access device 34 is a point of sale terminal, any
suitable point of sale terminal may be used including card readers.
The card readers may include any suitable contact or contactless
mode of operation. For example, exemplary card readers can include
RF (radio frequency) antennas, magnetic stripe readers, etc. to
interact with the portable consumer devices 32.
[0031] The access device 34 may also be a wireless phone. In one
embodiment, the portable consumer device 32 and the access device
are the same device. For example, a consumer may use a wireless to
phone to select items to buy through a browser.
[0032] When the access device 34 is a personal computer, the
interaction of the portable consumer devices 32 may be achieved via
the consumer 30 or another person entering the credit card
information into an application (e.g. a browser) that was opened to
purchase goods or services and that connects to a server of the
merchant, e.g. through a web site. In one embodiment, the personal
computer may be at a checkout stand of a retail store of the
merchant, and the application may already be connected to the
merchant server.
[0033] The portable consumer device 32 may further include a
contactless element, which is typically implemented in the form of
a semiconductor chip (or other data storage element) with an
associated wireless transfer (e.g., data transmission) element,
such as an antenna. Contactless element is associated with (e.g.,
embedded within) portable consumer device 32 and data or control
instructions transmitted via a cellular network may be applied to
contactless element by means of a contactless element interface
(not shown). The contactless element interface functions to permit
the exchange of data and/or control instructions between the mobile
device circuitry (and hence the cellular network) and an optional
contactless element.
[0034] The portable consumer device 32 may also include a processor
(e.g., a microprocessor) for processing the functions of the
portable consumer device 32 and a display to allow a consumer to
see phone numbers and other information and messages.
[0035] If the portable consumer device is in the form of a debit,
credit, or smartcard, the portable consumer device may also
optionally have features such as magnetic strips. Such devices can
operate in either a contact or contactless mode.
[0036] Referring again to FIG. 1, the payment processing network 26
may include data processing subsystems, networks, and operations
used to support and deliver authorization services, exception file
services, and clearing and settlement services. An exemplary
payment processing network may include VisaNet.TM.. Payment
processing networks such as VisaNet.TM. are able to process credit
card transactions, debit card transactions, and other types of
commercial transactions. VisaNet.TM., in particular, includes a VIP
system (Visa Integrated Payments system) which processes
authorization requests and a Base II system which performs clearing
and settlement services.
[0037] The payment processing network 26 may include a server
computer. A server computer is typically a powerful computer or
cluster of computers. For example, the server computer can be a
large mainframe, a minicomputer cluster, or a group of servers
functioning as a unit. In one example, the server computer may be a
database server coupled to a Web server. The payment processing
network 26 may use any suitable wired or wireless network,
including the Internet.
[0038] As shown in FIG. 1, the payment processing network 26 may
comprise a server 26a, which may be in communication with a
transaction history database 26b. In various embodiments, a
transaction analyzer 26c can determine patterns in transactions
stored in transaction history database 26b to determine certain
actions, such as authorizing a transaction or sending an incentive.
In one embodiment, an incentive system 27 is coupled with or part
of payment processing network 26 and can be used to determine an
incentive based on determined transaction patterns. Each of these
apparatus can be in communication with each other. In one
embodiment, all or parts of transaction analyzer 26c and/or
transaction history database 26b may be part of or share circuitry
with server 26a.
[0039] As used herein, an "incentive" can be any data or
information sent to a consumer to encourage a transaction. For
example, a coupon can be sent to a consumer as an incentive since
the consumer can obtain a better transaction price. As another
example, an advertisement can be sent to a consumer to encourage a
transaction by making the consumer aware of a product or service.
Other example of incentives can include rewards for making a
transaction and preferential treatment when making the
transaction.
[0040] The issuer 28 may be a bank or other organization that may
have an account associated with the consumer 30. The issuer 28 may
operate a server which may be in communication with the payment
processing network 26.
[0041] Embodiments of the invention are not limited to the
above-described embodiments. For example, although separate
functional blocks are shown for an issuer, payment processing
network, and acquirer, some entities perform all or any suitable
combination of these functions and may be included in embodiments
of invention. Additional components may also be included in
embodiments of the invention.
[0042] II. Transaction Patterns and Demographics
[0043] Consumer activity can include transactions, among other
things. Knowledge of a pattern of transactions of a consumer can
allow identification of consumers with similar transaction
patterns. Patterns of transactions can also allow identification of
similar merchants, or other entities. The similar consumers can be
identified as belonging to a demographic, also called an affinity
group. Transaction patterns of a demographic can also be used to
forecast business decisions for a company who has customers from
the determined demographic. However, the identification of a
pattern can be difficult given the enormous amount of data, some of
which might exhibit patterns and some of which may not.
[0044] As used herein, the term "pattern" refers broadly to a
behavior of any set of events (e.g. transactions) that have a
likelihood of repeating. In one aspect, the likelihood can be
greater than a random set of events, e.g., events that are
uncorrelated. The likelihood can be expressed as a probability
(e.g. as a percentage or ratio), a rank (e.g. with numbers or
organized words), or other suitable values or characters. One type
of pattern is a frequency-based pattern in which the events repeats
with one or more frequencies, which may be predefined. To define a
pattern, a reference frame may be used. In various embodiments, the
reference frame may be or include an elapsed time since a last
event (e.g. of a type correlated to the current event), since a
beginning of a fixed time period, such as day, week, month, year, .
. . (which is an example of a starting event), before an end of a
fixed time period, or before occurrence of a scheduled event (an
example of an ending event). Another event can be certain actions
by the consumer, such as traveling to a specific geographic
location or browsing a certain address location on the
Internet.
[0045] FIG. 2A shows a plot 200 of a transaction history or other
events of a first consumer as analyzed according to embodiments.
Plot 200 shows times at which each of a plurality of previous
transactions 210 have occurred. As shown, time is an absolute time
(e.g. date and time) or an elapsed time since an initial event 203.
Herein, the term "time" can refer to either or both a date and a
time of a particular day. These previous transactions 210, which
occur before an end time 205, can be analyzed to determine a
pattern 220, which can be a function that approximates when the
transactions are likely to occur. As an example, an identified
pattern can be used to predict a likelihood of a next transaction,
e.g. transaction 230.
[0046] Transaction patterns can also be used to determine whether
consumers are similar, and thus whether they might be part of a
same demographic. As a simple example, one demographic might be an
pregnant woman or a new mother. Such consumers may have similar
transaction patterns in buying products for a baby. But such a
demographic may not easily obtained from age or other easily
identifiable facts about a person. Also, certain consumers may have
similarities for which a cause may not be so easily identifiable.
In these cases, a demographic may not even be known to exist.
[0047] FIG. 2B shows a plot 250 of a transaction history or other
events of a second consumer as analyzed according to embodiments.
Plot 250 shows times at which each of a plurality of previous
transactions 260 have occurred. These previous transactions 260 can
be analyzed to determine a pattern 270.
[0048] The pattern 220 of the first consumer can then be compared
to the pattern 270 of the second consumer. As shown, the pattern
220 is similar to the pattern 270. In one embodiment, the
similarity can be based on value of the patterns at various times,
e.g., a comparison of the likelihood of a transaction at various
times. In other embodiments, parameters of functions that define or
approximate the pattern can be compared to determine if the
patterns are similar. For example, a function can be a linear
combination of prescribed set of basis functions (e.g. sines and
cosines) times corresponding coefficients, which may be
compared.
[0049] In one embodiment, the second consumer is a demographic, and
the pattern 270 is a combination of patterns of consumers in the
demographic. In such an instance, the number of occurrences of
transaction may be quite larger, but can be normalized to provide
an accurate comparison.
[0050] The identification of a pattern can have many difficulties.
If the previous transactions 210 include all of the transactions of
a consumer and exhibit only one pattern, then the identification of
a pattern may be relatively easy. However, if only certain types of
transactions for a consumer show a pattern, then the identification
can be more difficult. Some embodiments can use keys (K1, K2, . . .
) to facilitate the analysis of certain types of transactions,
where a key can correspond to a type of transaction. A key can also
allow identification of transactions as being relevant for a
current task (e.g. the key being associated with a transaction to
be incentivized).
[0051] Adding to the complexity can be whether the path to a
particular transaction has an impact on the pattern, e.g., a
pattern that exists only when certain transactions precede or
follow a transaction. Embodiments can store transaction data
associated with a specific order of keys (e.g. K1, K3). In this
manner, the data for that specific order can be analyzed to
determine the pattern. The order of keys also allows the further
identification of relevant transactions.
[0052] All of this complexity can be further compounded in
instances where a certain path (sequence of two or more
transactions) can have more than one pattern. Embodiments can use
certain functional forms to help identify different patterns. In
some embodiments, a combination of periodic functions are used,
e.g., e.sup.-iwt, where w is a frequency of a pattern. In one
embodiment, the frequencies are pre-selected thereby allowing an
efficient determination of the patterns. Further, the frequencies
can be identified by an associated wavelength, or wavelength range.
Counters can be used for each wavelength range, thereby allowing a
pattern to be very quickly identified by analyzing the values of
the counters.
[0053] III. Determining and Using Affinity Groups
[0054] FIG. 3 is a flow chart of a method 300 for identifying a
consumer as belonging to a particular demographic and identifying a
trend in consumer behavior using a demographic according to
embodiments. In one embodiment, previous transactions (e.g. 210)
are used to determine whether a first consumer is part of a
demographic. In one implementation, transactions within a specific
time period are analyzed, e.g., last year or all transactions
before an end time. The transactions can also be filtered based on
certain criteria, such that only certain types of transactions are
analyzed. The transaction history can include valid and fraudulent
transactions. All or parts of method 300 or other methods herein
can be performed by a computer system that can include all or parts
of network 26; such a system can include disparate subsystems that
can exchange data, for example, via a network, by reading and
writing to a same memory, or via portable memory devices that are
transferred from one subsystem to another.
[0055] In step 310, data respectively associated with transactions
previously performed by a first consumer and with transactions by
one or more second consumers are received. In one embodiment, the
one or more second consumers are already determined to be part of
an affinity group. In one aspect, the affinity group can be
considered as single entity with transaction patterns being for the
entire group.
[0056] In one embodiment, the data in the transaction history
database 26(b) can be received at a transaction analyzer 26(c) of
system 20 in FIG. 1, which includes a processor that may be
configured with software. Each transaction can have any number of
pieces of data associated with it. For example, the data may
include categories of an account number, amount of the transaction,
a time and date, type or name of product or service involved in the
transaction, merchant name or code (mcc), industry code, terminal
field (whether a card is swiped), and geographic location (country,
zip code, . . . ). In one embodiment, a merchant could be a whole
chain or a particular store of a chain. In some embodiments, the
transaction data can also include video and/or audio data, e.g., to
identify a person or a behavior of a person. The transaction data
can be different for each transaction, including the account
number. For example, the consumer can be identified with the
account number and other account numbers of the consumer can be
included in the analysis of the behavior of the consumer.
[0057] This data can be used to identify a particular type of
transaction. In one embodiment, the data for a transaction is
parsed to identify one or more keys, which are used as identifiers
for a particular transaction. In various embodiments, a key can
includes parts of the transaction data and/or data derived from the
transaction data. A key could also be composed of results from an
analysis of a transaction, e.g., whether the transaction is a
card-present transaction or a card-not-present transaction could be
determined from the transaction data and included in the key. In
one embodiment, a mapping module can perform the mapping of the
transaction data to one or more keys.
[0058] A key can be composed of multiple pieces of data (referred
to herein as a key element). A longer key has more key elements and
may be a more selective identifier of a type of transactions. Each
transaction can be associated with different keys, each with a
different scope of specificity for characterizing the
transaction.
[0059] In step 320, transactions are correlated with other
transactions and events. In one implementation, transactions of the
first consumer are only correlated with each other, and similarly
for the transactions of the second consumers. In this manner,
different transaction patterns can be identified for different
types of transactions, and for different entities. Other events
(e.g. start or end of a day, week, etc.) can be correlated to
transactions as well. An event can also be a movement of the
consumer from one state to another (e.g. from an at-home state to
an on-vacation state). Different events can also be identified with
keys. Herein, examples are used to described how keys are used to
identify transaction types, but other suitable methods can be
used.
[0060] In one embodiment, pairs of correlated keys (e.g. a key pair
<KI:KF>) are determined based on whether transactions
associated with an initial key (KI) are correlated with
transactions with a final a final key (KF). A first (initial) event
can be correlated with a later (final) transaction. The initial key
and the final key may be the same or different from each other. For
example, a transaction at one merchant may be correlated to a later
purchase at another merchant, which might occur if the merchants
are near to each other. In one embodiment, a group of more than two
keys could be correlated together, e.g. a group of three keys can
be correlated.
[0061] Two transactions can be correlated in multiple ways
depending on how many keys are associated with each transaction.
Thus, two transactions can contribute to more than one key pair,
when the transactions are associated with multiple keys. For
example, if an initial transaction is associated with two keys and
the final transaction is also associated with two keys, then there
could be four resulting key pairs. Also, a transaction may be
correlated to another transaction only via certain keys.
[0062] In one embodiment, the transactions of a group of consumers
can all be analyzed together to determine correlations of
transactions having certain keys. Certain key pairs can be
pre-determined for tracking. For example, a store may want to have
transactions at a specific location (or all locations) tracked.
[0063] In step 330, a computer system identifies one or more first
patterns of the transactions by a first consumer and one or more
second patterns of transaction by the one or more second consumers.
The computer system can be transaction analyzer 26(c), which can be
a subsystem or one apparatus. In some embodiments, a pattern can be
defined by a set of indicia, which can be a set of numerical
values. The indicia can convey the likelihood of a transaction as a
function of time. For example, pattern 220 conveys that
transactions are likely when the function has a higher value. In
one embodiment, each pattern includes a plurality of values (e.g.
likelihood values) with at least two of the values respectively
including contributions from transactions corresponding to
different time ranges.
[0064] In one embodiment, pairs of correlated transactions (or
other events) are used to determine a pattern, e.g., as times of
final transactions related to initial events. The times can be
stored as an absolute time and/or date for each transaction (e.g.
in chronological order) or organized as elapsed times for
correlated events of certain key pairs. The elapsed time may be the
time between a transaction with K1 and the next transaction with K2
for the correlated <K1:K2> pair. Other data can be stored as
well, e.g. data not included in the keys, such as an amount of the
transaction. The elapsed time can effectively equal an absolute
time if the initial event is the beginning of a time period.
[0065] In some embodiments, the time information is stored (e.g. in
transaction history database 26b) associated with the corresponding
key pair. For example, a key pair identifier (e.g. a unique ID
number) can be associated with the stored time information. As
examples of an association, a key pair identifier could point to
the time information, the time information could be stored in a
same row as the key pair identifier, and the key pair identifier
could be stored associated with the pointer.
[0066] In other embodiments, the time information for the key pair
<K1:K2> can be stored in a database table that can be
accessed with a query containing K1, K2, or the combination
(potentially in the order of K1:K2). For example, a search for K1
and/or K2 can provide the associated identifiers. In one
embodiment, a hash of each key of a pair is also associated with
the key pair identifier, so that information for each key can be
indexed and found separately. For example, hashes of K1 and K2 can
be stored in a lookup table so the key pair identifiers (and thus
the key pair information) can be easily found.
[0067] In one aspect, storing time information in association with
certain key pairs can allow the time information for specific types
of transactions to be easily accessed. Also, such organization can
provide easier analysis of the data to identify patterns for
specific key pairs. The occurrences of the transaction can then be
analyzed (e.g. Fourier analysis or other functional analysis) to
identify a pattern of the times and dates of these
transactions.
[0068] In step 340, the one or more first patterns are compared to
the one or more second patterns. In one embodiment, a respective
first pattern is compared to a corresponding (matching) second
pattern. A matching second pattern can be a pattern with an exact
same key pair, or a similar key pair. In one implementation,
patterns can be compared by comparing numerical values of certain
indicia of the patterns. In another embodiment, patterns that do
not match are also determined and incorporated into the
comparison.
[0069] In one embodiment, the indicia can be the likelihood of
patterns at respective times. In one implementation, the likelihood
is for any transaction by a consumer, and thus the entire
transaction history can be used. In another implementation, the
likelihood is for a particular key pair. When a particular
transaction is being investigated, the relevant pattern can be
found by querying a database using the key(s) of the particular key
pair. In various embodiments, the indicia for a likelihood may be a
number of transactions in a time range or the probability (or other
measure of likelihood) at a given point in time, e.g., as
calculated from a value of the pattern function at the point in
time.
[0070] In step 350, whether first and second consumers belong to
same demographic is determined based on the comparison. In one
embodiment, corresponding pairs of indicia from two patterns can be
subtracted from each other and compared to a threshold. In another
embodiment, corresponding pairs of indicia from two patterns can be
multiplied times each other and compared to a threshold. As
mentioned above, non-matching patterns can also be used. In one
aspect, more non-matching patterns can correspond to a lower
similarity of the first consumer to the second consumers. In yet
another embodiment, the first consumer can be similar to an
affinity group (i.e. belong to a demographic) with varying degrees
of similarity (e.g. by percentage of similarity).
[0071] In some embodiments, the indicia of the patterns can be
input into a modeling function as part of the determination. In
various implementations, the modeling function can be an
optimization function (e.g. a neural network) or can be a decision
tree (e.g. composed of IF THEN logic that compares the indicia). In
one embodiment, an optimization function can be trained on previous
transactions of multiple entities, and thus can determine how much
a pattern of a particular entity (e.g. a consumer, merchant, or
affinity group) is similar to a patter of another entity. In
another embodiment, the number of keys associated with the
transaction relates to the number of inputs into the modeling
function. The relationship is not necessarily one-to-one as similar
keys (e.g. ones of a same category) may be combined (e.g. same key
elements, but just different values), but there may be a
correspondence between the number of different types of keys and
the number of inputs.
[0072] Other embodiments can determine a demographic in other ways.
For example, a merchant affinity group can be determined by
identifying transaction patterns for a certain set of stores, which
can be grouped by merchant code. At least some of the consumer's
transaction patterns can then be compared to the transaction
patterns for a merchant affinity group. Consumers having patterns
similar to the merchant affinity group can belong to a same
consumer affinity group. As another example, a consumer can be sent
an incentive to buy a new product (e.g. music) from a coffee shop,
when the consumer is next predicted to visit the coffee shop to buy
coffee. Once the consumer performs the transaction, the consumer
can be partially defined by a certain music demographic according
to the fact that the consumer bought music and potentially to a
more narrow demographic of the type of music. As yet another
example, a consumer can then be sent an incentive for a transaction
consistent with a demographic to which the consumer may belong. The
use of incentive can be used to determine whether the person is
actually part of the affinity group. If the person does not use the
incentive, then the consumer is less likely to be part of the
affinity group.
[0073] In step 360, likelihoods of transactions are determined
according to demographic patterns at a plurality of times. For
example, after a demographic has been determined, the transaction
patterns of the demographic can be used to determine the likelihood
of a transaction at a plurality of times. The likelihood can be for
transactions at any time, e.g. in the past and/or in the future. In
one embodiment, the times include at least two different time
ranges.
[0074] In step 370, a trend in the pattern for transactions can be
identified based on likelihoods of transactions at the plurality of
times. A trend can be the change (increase or decrease) of a
likelihood from one time to another. In one embodiment, a trend can
be identified by a likelihood of a transaction being above an upper
threshold or being below a lower threshold. In such instances, a
high or low likelihood can indicate that the likelihood is
different from an average or outside of an expected range, and thus
a trend can exist. In one aspect, a likelihood of a pattern can be
integrated over time to determined an average level or range of
likelihood.
[0075] A trend can also be related to a change in a pattern
calculated at one time to a change in a pattern calculated at a
different time. For example, transaction patterns for a demographic
can be calculated at various times (e.g. periodically, such as
every month). A pattern calculated at one month can be compared to
the same pattern (e.g. same key pair) calculated at another month.
The change in the likelihood of a transaction at various times can
be used to determine a trend. For instance, the likelihood of a
transaction at a particular time may have increased and thus a
trend can be detected. The amount of change can be used to forecast
a likelihood even greater than the predicted likelihood since there
is a trend higher. In one embodiment, three or more patterns
calculated at different times may be used to determine whether a
trend actually exists. For each calculation of a pattern, the
specific transactions being used to determine the pattern can
change as new transactions may have been received and old
transactions can be dropped.
[0076] In one embodiment, a similar process can be performed for
the entities, such as merchants and groups of similar merchants.
Transaction patterns of one merchant can be compared to transaction
patterns of another merchant to determine similar merchants. As an
example, knowledge of a similar merchant can be used to determine
when fraud might occur at a merchant. Trends in transactions of a
merchant can be used to determine inventory levels inventory levels
(e.g. at particular stores or for all stores) and other business
decisions based on forecasting.
[0077] IV. Analysis of a Pattern
[0078] If a pattern of when transactions occur is known, then the
pattern can be used to determine when transactions are likely, and
thus determine demographics and trends. For example, if a pattern
(e.g. a pattern of transactions associated with specific keys) for
one or more previous months is known, embodiments can use this
pattern to determine a pattern for a future month (e.g. for same
month next year or for a next month). The patterns can be analyzed
in numerous ways, and FIG. 4 describes some embodiments.
[0079] FIG. 4 is a plot 400 of a number of transactions at certain
elapsed times between a final transaction (with key KF) and an
initial event (with key KI) of a correlated key pair according to
embodiments. Plot 400 can be considered as a histogram. The X axis
is elapsed time between a final transaction and a correlated
initial event. Any unit of time may be employed, such as minutes,
hours, days, weeks, and even years. The Y axis is proportional to a
number of transactions. Each bar 410 corresponds to the number of
transactions at an elapsed time. Each bar 410 can increase over
time as new transactions are received, where a new transaction
would have an elapsed time relative to a correlated initial event.
Note that more than one transaction-event pair can have the same
elapsed time.
[0080] In one embodiment, the X axis can have discrete times. For
example, only transactions for each day may be tracked. Thus, if
the initial event was the start of a month, then the number of
discrete time periods would have a maximum of 31 days. In such an
embodiment, elapsed time values within a certain range can all
contribute to a same parameter, and bars 410 may be considered as
counters. For example, if the discrete times were by day, any two
transactions that have an elapsed time of 12 days since a
correlated KI event would both cause the same counter to be
increased. In one embodiment, these counters are the time
information that is stored as mentioned above. In some
implementations, the time ranges do not all have the same length.
For example, the time ranges closer to zero can have a smaller
length (e.g. just a few minutes) than the time ranges further from
zero (e.g. days or months).
[0081] A pattern 420 can be discerned from the elapsed times. As
shown, pattern 420 has a higher value at elapsed times where more
transactions have occurred. In one embodiment, pattern 420 could
simply be the counters themselves. However, in cases where the time
intervals are not discrete or have a small range, bars 410 might
have zero or low value at times that happen to lie between many
transactions. In these cases, certain embodiments can account for
transactions at a specific time as well as transactions at times
that are close. For example, as shown, a function representing
pattern 420 begins curving up and plateaus near the cluster 460 of
transactions to form a peak 430. In one embodiment, each time point
of the function can have a value of a moving average of the number
of transaction within a time period before and after (or just one
or the other) the time point. In other embodiments, function can be
determined from interpolation or other fitting method (e.g., a fit
to periodic functions) performed on the counters.
[0082] Indicia of the pattern 420, e.g., the function values, can
be analyzed to determine when a transaction is likely and when one
pattern is similar to another pattern. In one implementation, peaks
of the pattern 420 are identified as corresponding to times when a
transaction is likely and troughs (e.g. 470) are identified as
corresponding to times when a transaction is unlikely, both of
which can correspond to a trend in the pattern. In one embodiment,
a width of the function at specific values or times may then be
used as a time window for identifying a trend. For example, a time
window (e.g. a two day or 1.5 day period) of when transactions
often occur (or generally do not occur) may be determined. The time
window can be used, for example, to determine when and for how long
inventory levels should be increased or decreased.
[0083] In one embodiment, a full width at half maximum may be used,
such as the width of peak 430. In another embodiment, the window
(e.g., 440) above a threshold value 450 is used, or just part of
this window, e.g., starting at the time where pattern 420 is above
the threshold and ending at the top (or other part) of peak 430. In
yet another embodiment, the time window may have a predetermined
width centered or otherwise placed (e.g. starting or ending) around
a maximum or other value above a threshold.
[0084] In embodiments using a threshold, the value of the pattern
function may be required to be above the threshold value before a
transaction is considered likely enough to be related to a trend.
As mentioned above, multiple threshold levels can be used, e.g., a
lower and upper threshold defining a range outside of which a
likelihood value can be identified with a trend. The use of
thresholds encompass using the exact likelihood values, which can
be equivalent to using many threshold levels. The modeling function
mentioned above may be used to perform any of these
determinations.
[0085] In one embodiment, a threshold determination could be
whether a counter has a high enough value (absolute or relative to
one or more other counter). In another embodiment, a threshold
level can be relative (e.g. normalized) compared to a total number
of transactions. A normalization or determination of a threshold
can be performed by adjusting the level depending on the low values
of likelihood of a pattern, e.g., a peak to trough height could be
used. In one aspect, the troughs may be offset to zero.
[0086] Storing time information that includes a number of
transaction at certain elapsed times, one can not only handle paths
(such as initial key to final key), but one can also easily
identify multiple patterns. Each peak can correspond to a different
pattern. For example, each peak can correspond to a different
frequency of occurrence for a transaction associated with the final
key relative to an event (e.g. a transaction) associated with the
initial key. In one embodiment, the time information for the
elapsed times can be stored by storing a time of when both events
occur. In another embodiment, time information can store the
elapsed time as one value. In yet another embodiment, the time of
one event might implicitly include the time of the initial event
(e.g. when the first event is beginning of a month or other fixed
time period).
[0087] From FIG. 4, one can identify one predominant pattern (peak
430) with a long wavelength (short frequency), which does not occur
very often, and three minor peaks with higher frequencies. However,
the determination of a pattern might still take significant
computational effort if the pattern can have any functional
form.
[0088] V. Use of Periodic Functions and Counters
[0089] Some embodiments use certain functional forms to help
identify different patterns. As mentioned above, periodic functions
can be used, e.g., e.sup.iwt, where w is a frequency of the
pattern. For example, each bar (counter) 410 of FIG. 4 can
correspond to a different frequency. The total probability V of a
K2 transaction occurring at a time t after a K1 transaction can be
considered as proportional to
W C w wt , ##EQU00001##
where C.sub.w corresponds to the counter value at the frequency w
and w runs over all of the frequencies. C.sub.w can be considered a
coefficient of the periodic function e.sup.iwt at a particular
frequency. Thus, conceptually, a probability can be calculated
directly from the above formula.
[0090] In one embodiment, the frequencies are pre-selected thereby
allowing an efficient determination of the patterns. Further, the
frequencies can be identified only by the associated wavelength, or
wavelength range. Note that in certain embodiments, the use of
e.sup.iwt is simply a tool and the actual value of the function is
not determined.
[0091] FIG. 5A shows a table 500 that stores time information for a
key pair <KI:KF> according to embodiments of the present
invention. The table 500 stores information for elapsed times
between transactions associated with the particular key pair. Table
500 can also store amount information for the transactions. Table
500 can be viewed as a tabular form of plot 400 along with all the
possible variations for different embodiments described for plot
400.
[0092] In one embodiment, each column 510 corresponds to a
different time range. The time range may correspond to ranges
mentioned above with reference to FIG. 4. As shown table 500 has 6
time ranges, but any number of time ranges may be used. The time
ranges can be considered to correspond to different functions that
approximate the transaction patterns of a consumer or other entity.
For example, each time range can correspond to or be considered a
different frequency w for e.sup.iwt.
[0093] In some embodiments, table 500 only has one row. In other
embodiments, the rows of table 500 correspond to different dollar
amounts (or dollar amount ranges). Thus, each time range may have
subgroups for set ranges of amounts (e.g. dollar amounts). The
organization is similar to a matrix, where a row or a column can be
viewed as a group or subgroup. Although five amount ranges are
shown, table 500 can have any number of dollar amounts. In some
embodiments, there is only one row. i.e. when dollar amounts are
not differentiated. Note that the convention of row and column is
used for ease of illustration, but either time or amount could be
used for either row or column (each an example of an axis). Also,
the data for a table can be stored in any manner, e.g. as a single
array or a two-dimensional array.
[0094] The values for the matrix elements 520 correspond to a
number of KF transactions that have elapsed times relative to a KI
event (e.g. a transaction) that fall within the time range of a
particular column 510. In one embodiment, each newly received K2
transaction can cause a box (element) 520 of the table (matrix) 500
to be increased. The value of the matrix element (an example of a
likelihood value) can be incremented by one for each transaction,
or another integer or non-integer value. The value can also be a
complex number, as described below. In another embodiment, a table
can be required to have a certain total of all values, average of
the values, minimum value in any matrix element, or other measure
of the values in the table. Such a requirement can ensure that
enough data has been received to provide an accurate pattern.
[0095] The values of the matrix elements can be used to determine
the pattern for the key pair <KI:KF>, e.g. as part of step
330 of method 300. For example, matrix elements with high values
relative to the other matrix elements can indicate a pattern of
transactions in the corresponding time range, which can correspond
to a particular frequency w. In another embodiment, one could view
each matrix element in isolation to determine whether a transaction
is likely. For example, if a matrix element exceeds a threshold
value, it may be determined that a transaction is likely to occur
in that time range. The threshold can be determined in various
ways, for example, as a function of a sum of all of the matrix
elements, or equivalently can be fixed with the matrix elements
being normalized before a comparison to a threshold. Thus, step 330
can be accomplished easier based on how the time information is
stored.
[0096] As mentioned above, the time ranges can all be of the same
length (e.g. 24 hours) or be of varying lengths. In one embodiment,
the first column is of very short time length, the second column is
of longer time length, and so on. In this manner, more detail is
obtained for short wavelengths while still allowing data to be
stored for long wavelengths without exhausting storage capacity. In
another embodiment, dollar amount ranges are progressively
structured in a similar manner as the time ranges can be. In one
implementation, the dollar amount range can be used to track the
likelihood of transactions having certain dollar amounts.
[0097] FIG. 5B shows a plot 510 for use in determining the time
ranges for table 500 according to an embodiment of the present
invention. The X axis corresponds to the column numbers. The Y axis
corresponds to the time of a particular column in minutes. For
example, the first column includes times between the first data
point at time domain zero and the data point at time domain 1. Due
to the large scale of the Y axis, the second data point appears to
be at zero, but is simply quite small relative to the maximum
value.
[0098] The wavelength .lamda. of a pattern corresponds to the time
range of a column. For embodiments, using time relative to another
transaction, then the .lamda. is the time between transactions. In
one embodiment, 16 time domains (ranges) are selected as follows:
.lamda..sub.o is under 1 minute, .lamda..sub.I is between 1 minute
and 2.7 minutes, .lamda..sub.2 is between 2.7 minutes and 7.4
minutes, .lamda..sub.3 is between 7.4 minutes and 20 minutes, and
.lamda..sub.15 is over 1.2 million minutes.
[0099] The amount values can also be used to determine patterns for
transactions of certain dollar amounts. If the amount is not of
concern, then the values in a column can be summed to get a total
value for a certain time range. The amounts can also be
incorporated into the mathematical concept introduced above. For
example, in mathematical notation, a value function can be defined
as
V = W C w A wt , ##EQU00002##
where A is an amount of a transaction.
[0100] When a transaction is received, the amount and corresponding
elapsed time for a particular key pair can be used to determine a
corresponding matrix element for the key pair table. The values in
the matrix elements can be normalized across one table and across
multiple tables. For example, a value can be divided by a sum for
all the values of a particular key pair table. Also, a sum can be
calculated for all values across multiple tables, and the values
for each table divided by this sum. As part of a normalization, the
value for a matrix element may be decreased when some of the data
used to determine the value becomes too old. For example, for a
time range that includes short time intervals, counts from
transactions that have occurred more than a year ago may be dropped
as being out of data since short timeframe patterns can change
quickly.
[0101] In various embodiments, tables for different key pairs can
have different time ranges and/or amount ranges. If such
differences do occur, the differences can be accounted when a
summing operation is performed. In one embodiment, the values in
the matrix elements can be smoothed to account for values in nearby
matrix elements, e.g., in a similar fashion as pattern 420.
[0102] In another embodiment, tables for different consumers can be
compared to determine affinity groups. For example, tables with
matching or similar key pairs can be subtracted (lower value more
similarity) or multiplied (higher value more similarity). The
closer the tables are, the more similarity (e.g. as a percentage)
the consumers are, where non-matching tables can be used for
normalization. In one example, one set of tables can correspond to
the affinity group, and the calculation can be used to determine
whether a person is within the affinity group.
[0103] In other embodiments, specific amount ranges or time ranges
can be suppressed. For example, if only certain types of patterns
(e.g. only certain frequencies) are desired to be analyzed, then
one can suppress the data for the other frequencies. In one
embodiment, the suppression is performed with a mask matrix that
has zeros in frequency columns and/or amount rows to be suppressed.
Thus, one can just multiply the matrices to obtain the desired
data. The amount ranges can be similarly suppressed. When
suppressing certain frequencies, these mask matrices can act
similarly to a high pass, low pass, or notch filters. For example,
if one wanted a coupon to be good only for 7 days, and it takes 1
day to create the coupon, the desired time window is any time range
that includes those 6 days. Accordingly, the time information for
transactions outside the time window can be suppressed as not being
of interest.
[0104] Regarding the creation and updating of such tables, after an
event (e.g. a consumer transaction) is received, embodiments can
determine which tracked key pairs have finals keys that match with
the keys resulting from the transaction. As a transaction can be
associated with many keys and key pairs, a transaction may cause
many tables to have a matrix element updated. For example, the
transaction may cause different tables for a specific consumer to
be updated. The updates could be for one table for all transactions
by that consumer (an example of a general table), and more specific
tables for particular zip codes, merchants, and other key elements.
The transaction can also cause updates of tables for the particular
merchant where the transaction occurred.
[0105] As there are different tables that can be updated, each with
a different initial key, the time range (and thus the matrix
element) that is updated may be different for each table. For
example, when elapsed time is used, the last transaction for each
table may be at a different elapsed time since the different
initial transactions. The transaction amount would typically be the
same, thus the exact row for the matrix element to be increased can
be the same, as long as the tables have the same amount ranges. But
the column (i.e. time) could be different for each table.
[0106] Regarding which time column to update, there can also be
more than one column updated for a particular table. For example, a
K2 transaction may have different time patterns relative to K1
transactions (i.e., <K1:K2> pair). Accordingly, when a K2
transaction is received, elapsed times from the last two, three, or
more K1 transactions could be used to update the table.
[0107] In a similar manner, one key pair table could be
<*:K2>, which includes correlations from a plurality of
initial keys to the K2 key in the same table. Effectively, this
table could equal the sum of all tables where K2 is the final key
for a particular consumer or other entity. However, if the
individual key pairs are not significant enough, the <*:K2>
table may be the only table that is tracked. Tables of the type
<K1:*> could also be tracked.
[0108] VI. Impedance (Likelihood of Another Transaction)
[0109] Besides being able to predict when a particular transaction
will occur, embodiments can also predict if another transaction is
going to occur after a current or a predicted transaction, which is
referred to as impedance. In some embodiments, such information can
be tracked by using complex numbers for the matrix elements of the
final event, with the imaginary part corresponding to the
impedance. In other embodiments, the impedance can be tracked
simply using another number for a matrix element or using another
table. In one embodiment, impedance values can be used as indicia
in the comparison of two patterns, e.g., to determine a
demographic. In another embodiment, impedance values can be used to
determine to identify a trend. For example, if impedance values
change from one time range to another that can signal a trend or if
impedance values change from one calculation of a pattern to the
next calculation of a pattern.
[0110] In such embodiments, the imaginary part of a matrix element
can correspond to an impedance that measures how likely it is that
another transaction will occur. Such values can be tracked for
individual consumers and/or groups of similar consumers (affinity
groups). The likelihood can specifically correspond to a future
transaction being correlated to the current transaction having the
time range and dollar amount of the matrix element. The real value
of a matrix element can correspond to the probability that the KF
event will occur, and the imaginary value can relate to the
probability that another event will be correlated to the KF event.
The imaginary part can be updated when another transaction is
correlated to the KF event of the specific time and amount. In one
embodiment, a table can have just one impedance value for the
likelihood of any transaction occurring later. Thus, just one
imaginary part could be stored for an entire table. In another
embodiment, the imaginary parts could be different for each matrix
element.
[0111] In an embodiment, a low impedance (e.g. a large negative
imaginary part) for a matrix element means that there is a high
probability that another transaction is going to occur, and a high
impedance (e.g. high positive value) means that it is unlikely that
another transaction is going to occur, with zero being
indeterminate. The implication of negative and positive values can
be swapped. In another embodiment, a high impedance is provided by
a low number (negative or positive), with larger numbers providing
low impedance, or vice versa. Certain future transactions can be
ignored (e.g. not counted) in determining impedance, for example,
if the dollar amount is too low.
[0112] In one embodiment, the imaginary number can be tracked (e.g.
increased or decreased) in the following way. (1) When a KF event
is received, each of the key pair tables that have the transaction
as the ending event are increased in real part of the appropriate
matrix element, with an elapsed time measured from the respective
starting event KI. (2) For each key pair table, the specific KI
event to which KF was correlated is determined. Then for that KI
event, each K0 event to which KI is correlated as a final event is
determined, and the appropriate matrix elements of specific tables
are determined using an elapsed time between the specific KI and K0
events. (3) The imaginary part of the individual matrix elements
can then be adjusted (e.g. decreased to obtain a reduced impedance)
to reflect a higher likelihood that a another transaction follows
KI, since the KF event did indeed follow. If all of the matrix
elements for a table have the same imaginary part, then the
specific KI event does not need to be known, just the tables that
have the key for an ending KI need to be known, which can be
determined with filters operating on the final keys.
[0113] In another embodiment, the imaginary part could be updated
in a forward manner. (a) A KF event is identified. Each of the key
pair tables that have KF as the ending event are increased for the
real part in the appropriate matrix element, with a KI event being
the starting event. In one aspect, KF might not have just come in,
but could be part of a whole collection of events being processed.
(b) Then specific K2 final events that correlate to the KF event as
an initial event are identified. (c) Depending on the number of K2
transactions, the imaginary part of the appropriate matrix element
can then be adjusted (e.g. increased, decreased, set, or reset). At
this point, the imaginary part for just one matrix element (e.g.
the matrix element from (a)) of tables for KF could be determined.
Or, all of the other matrix elements of the tables could also be
determined as well based on the value for the specific matrix
elements just determined. For example, all of the other matrix
elements of a table can be updated to reflect that the K2
transaction occurred. This can be done when all of the imaginary
parts are the same, or if just one value is stored for an entire
table.
[0114] In one embodiment, the default for the imaginary part can be
set at zero or some average value for a likelihood that a
transaction occurs. If after a certain amount of time, there are no
transactions correlated to it, then the value might increase and
continue to increase. Or the default could be set at a high
impedance, and then lowered as more transactions occur. In another
embodiment, if the future transaction is fraudulent, then the
complex part can also be changed to reflect a higher impedance
since a valid transaction does not occur. In another embodiment, if
a decline occurs after a transaction then the impedance is
increased (e.g. the imaginary part is decreased by one), if an
acceptance occurs after a transaction then the impedance is
decreased (e.g. the imaginary part is increased by one).
[0115] In this way, one can determine the specific instances where
the transaction is a dead end (i.e. not leading to other
transactions), and other instances where the transaction leads to
other transactions. A high impedance would convey that the
transaction is a dead end as no further transactions occur very
often. Conversely, one can determine that a transaction is a
gateway to many other transactions when it has a low impedance. In
one embodiment, an average or sum of all of the imaginary parts of
the matrix elements can be used to determine whether any future
transaction is likely.
[0116] Some embodiments can aggregate the imaginary part over all
KI correlated to a KF to determine a total likelihood that a KF
will provide more transactions. Thus, one can incentivize KF if the
likelihood is high. Each of the tables with KF as the final event
can be used to determine exactly when to send an incentive and what
the incentive might be.
[0117] In one embodiment, one can see a dead end for one affinity
group, but then look at another affinity group that does not show
this dead end. An analysis can then be made as to why the one group
dead ends, and strategy developed for causing the dead end not to
occur (e.g. sending a coupon, pre-authorization, or other
inventive). For example, one can identify stores that the one
affinity group does go to after the transaction, and send coupons
to that store. As another example, one can identify stores
geographically near a merchant that is a dead end and send a coupon
for a nearby store, even potentially for use within a short time
period after a predicted visit to the dead end merchant. After
seeing if a strategy works by sending coupons to a couple people in
an affinity group, coupons can be sent to more people in the
affinity group (including people just partially in the affinity
group).
[0118] Instead of or in addition to the above use of imaginary
values for impedance, greater impedance can also correspond to
fraud. If a fraud transaction K2 is found to correlate to a
transaction K1, then the <KI:K1> matrix elements (or just a
specific element) can have the impedance increased. Thus, the
impedance can reflect the profitability of the present transaction.
For example, certain transactions happening right after buying a
concert ticket can be associated with fraud, which is an example of
where each matrix element may have its own complex part.
[0119] In some embodiments, both real and imaginary parts of a
matrix element can contribute to an overall value, which can be
used to determine whether a trend exists or patterns are similar.
In other embodiments, values for the real or the imaginary
components can be analyzed separately to determine whether a trend
exists or patterns are similar.
[0120] VII. Determination of Demographic
[0121] Once the relevant patterns (e.g. key pair tables resulting
from a filtering process) are obtained for two entities, the
patterns can be compared to determine indicia regarding a
similarity between the patterns. The indicia can be used to
determine whether a person belongs to a particular demographic.
[0122] The calculation of the indicia can be performed in numerous
ways. In one embodiment, the indicia can include specific matrix
elements corresponding to a type of transaction. The matrix
elements can be modified (e.g. normalized) and/or summed to provide
an overall indicia. In another embodiment, the indicia can result
from an operation combining values from patterns of the two
entities. This section describes some embodiments for obtaining
relevant indicia of a likelihood for an event from certain event
patterns.
[0123] A. Determining Similarity of Transactions to Established
Patterns
[0124] Whether a specific transaction or groups of transactions by
one entity is similar to transaction patterns of another entity can
be of interest. Such a similarity can be used to determine a
demographic, which in turn can be used, for example, to predict
transactions or authorize a current transaction. To determine a
similarity, the current transaction can be compared to established
transaction patterns. In an example where one or more recent
transactions are received, these recent transaction(s) can be
compared to established transaction patterns of other entities
(e.g. a specific demographic).
[0125] FIG. 6 shows an example of obtaining indicia of a similarity
of transactions of one entity relative to a transaction pattern of
another entity according to embodiments. In FIG. 6, a table 610
created from first transactions of a first entity is multiplied
(element by element) by table 620 of the other entity to provide
indicia 630. Indicia 630 can provide a measure of how similar the
first entity (via table 1) is to a second entity (via pattern table
620). A matrix element of the retrieved table can provide a
likelihood at a particular time directly or in combination with
other values. For example, a matrix element can be divided by a sum
of matrix elements in a row, all matrix elements in a table, or all
transactions of a person to determine a likelihood for the recent
transaction.
[0126] In this simple example, suppose the first transactions are
associated only with K1, and K1 is correlated only to K2. Then a
table <K2:K1> can be created with a number of transactions in
the proper matrix elements of table 610 relative to previous
correlated K2 transactions. Table 610 shows non-zero values for the
first dollar amount and the fourth time range, second dollar amount
and second time range, and third dollar amount and fourth time
range, with zeros in the other matrix elements. Then, matrix
elements of table 610 can multiply the corresponding matrix
elements of the <K2:K1> key pair table 620 (which has been
matched and retrieved).
[0127] In one embodiment, the resulting indicia 630 is single value
showing an overlap between the tables. As shown, the result is
11*2, 8*7, 15*0, and 6*0 equaling 78, which can be normalized
later. The value of this matrix element 630 can (e.g. when
normalized) can provide a measure of a likelihood of similarity
between the two entities. In this example, the larger the value the
higher the similarity. The more transactions that occur with a
particular dollar and time in table 610 that correspond to non-zero
elements in table 620, the higher the resulting value, and thus the
patterns are more similar. In one aspect, the indicia can be
normalized to account for the total number of transactions in both
tables, and thus a more accurate percentage of similarity can be
determined. In another embodiment, all transactions with any dollar
amount can be summed before multiplying, which would give 17*2,
instead of 11*2 and 6*0. The different embodiments might depend on
the specific demographic being investigated.
[0128] Overall, multiple tables of the first entity might result
for a K1 transaction. Also, multiple tables may be used in the
comparison. The values can be aggregated (e.g. summed) or analyzed
separately to determine a likelihood of similarity.
[0129] B. Multiplying Tables--Alignment
[0130] There may be instances where the key pair for a table of one
entity is not found in the key pairs of tables of another entity.
When this occurs, a first table may be aligned with a second table
to determine a matching table for multiplying. In one embodiment, a
key for the first table can be broadened until a match occurs.
[0131] For example, a first table can have a final key of
:4812,345>, where 4812 is the merchant code and 345 is the first
three digits of a zip code. However, the tables of a second entity
may not contain a table with this final key. This may be because
the consumer has few transaction in zip codes starting with 345.
But the first table may still contain useful information as to a
similarity of the entities. Thus, the key :4812,345> can be
broadened to be :4812,*> so that it matches with a table of the
second entity. The zip code can be broadened in one step or
incrementally to :4812,34>, :4812,3>, and then :4812,*>,
where a match is found.
[0132] Such alignment can be performed between sets of key pair
tables. In a general sense, a set of key pair tables can be viewed
as a key manifold. When the key manifolds are normal (i.e. both
spaces have identical amounts of keys), then one can apply the
operations directly. However, if the key spaces are not normalized,
then an alignment may be performed.
[0133] In one embodiment, each table of one manifold is aligned
with exactly one table of the other manifold. In another
embodiment, there may not be a match found for a table from one
manifold to another. In such a case, the non-matching tables can be
dropped, or distinguished from tables that did match after
alignment. A distinction can also be made between tables that only
match after alignment and tables that match exactly. For example,
it may be useful to know what the entities do that is not the same
(no match), or maybe just similar (match after some alignment).
Also, other operations besides multiplication can be performed,
such as division, subtraction, and addition.
[0134] C. Other Calculations of Likelihood
[0135] With this framework of aligning and multiplying keys, more
complicated calculations of likelihood can be performed. Other
operations, such as division can be used. A purpose of division can
be to normalize a key manifold (i.e. a set of tables).
[0136] FIG. 7 shows a calculation of a likelihood that transaction
patterns of a first entity are similar to transaction patterns of a
second entity according to embodiments. Such a calculation can
provide a likelihood of occurrence of a transaction. In various
embodiments, first entity tables 730 can be recently generated or
previously generated and stored; and second entity tables 710 and
total transaction tables 720 can be updated at set times, e.g.,
once a day, week, etc. The constants table 740 is a table that can
be used for normalizing, e.g., to place the values of a table to be
within a specific range.
[0137] Second tables 710 can be obtained from transactions across
multiple entities, thereby using a combined entity (e.g. an
affinity group). In one embodiment, the specific set of tables have
a common key element, e.g., transactions for a specific merchant or
during a specific month. Other key elements can be used, e.g., zip
code, country, or any other suitable key element.
[0138] Total transactions tables 720 can be obtained from all
transactions across all or many entities. In one embodiment, the
total transactions tables 720 are obtained from the same entities
as tables 710. In another embodiment, the total transactions tables
720 are obtained from the same entities as tables 710 and the first
entity. Similar to second tables 720, total transactions tables 720
can share a same key element, for example, the same key element as
in second tables 710. The total transactions can include fraudulent
transactions and valid transactions, or just valid
transactions.
[0139] Once the tables are aligned, the total transaction tables
720 can be used to normalize the second tables 710 by dividing a
second table by the corresponding total transaction table. After
the division, the normalized tables can be stored in RAM (or any
other memory with faster access than disk). As with FIG. 6, the
division operation divides each matrix element of a second table
with the corresponding matrix element of the total transaction
table. The division can provide a normalization of the counters for
the second tables. For instance, a particular second table may have
high values in many matrix elements, but if there are many total
transactions, the total percentage of transactions for each matrix
element can be low. The division can also account for a
normalization of the first tables, or a separate normalization can
be performed on the first tables.
[0140] In one embodiment, each second table is aligned with exactly
one transaction table. For example, if there are 100 second tables
tracked (i.e. for a given group having a common element, such as
month), then 100 tables result from the alignment and division.
Note that the alignment can be implicit in the notation of a
division operation. In some embodiments, there may not be a match
of a second table to a total transaction table, although this may
happen rarely. In such a case, the second table may be dropped, and
thus there may be fewer resulting tables than second tables. In an
embodiment, one can differentiate second tables that do not have a
match from tables that did match, or between tables that only match
after alignment and tables that match exactly.
[0141] First tables 730 can then aligned with the normalized
tables. Before alignment, some or all of the first tables can be
summed. In one embodiment, two first tables can be summed when the
keys are similar. In effect, the final transactions for each of the
tables can be considered to be of a same type, i.e. have the same
key. For example, if the merchant is the same, but the zip codes
are different, the two tables can be merged and the zip code
dropped or broadened (which can be considered an intersection of
the two key pair tables). This summing may be particularly
appropriate when both tables would be aligned with a same second
table. In such a case, a summing after multiplying the first tables
by the normalized tables provides the same resulting table.
[0142] After alignment, the first tables can be multiplied
element-by-element with the normalized tables, thereby providing a
plurality of resulting tables. In one embodiment, these resulting
tables can be summed to provide one final table, or one final
value. In one aspect, the summing can be due to the second tables
710 being grouped to have a similar key element, and thus the final
table can relate to the one key element. This final table can
provide a measure of an overall similarity of the transaction
patterns, and can be used (e.g. by a modeling function) to
determine a likelihood of similarity. In another embodiment, each
of the resulting tables can independently be final tables that are
used by a modeling function.
[0143] In another embodiment, a mask matrix can be used to remove
certain matrix elements from the resulting tables or from the final
table. For example, the mask matrix can remove low frequency or
high frequency components, or be a notch filter to select
frequencies in the middle. Also, certain dollar amounts can also be
removed. In one implementation, the mask matrix has 1s in matrix
elements that are to be kept and 0s in matrix elements that are not
to be analyzed.
[0144] Although second tables 710 were normalized, the final
table(s) may still have matrix elements with values that can vary
widely. This variation in values can cause instability in a
modeling function, which uses the matrix element as indicia of the
patterns to obtain a total likelihood. Accordingly, in some
embodiments, constants matrix 740 is used to constrain the final
matrix element to be within a certain range of values, e.g.,
between -1 and 1 or 0 and 1. In one embodiment, constants matrix
740 is created from a specified functional form, such as tan h,
log, or sigmoid (generally S shaped) functional form.
[0145] Constants matrix 740 can also constrain matrix elements
values to correspond to a third number within the prescribed range.
For example, a zero output can be mapped to a matrix element value
where similarity may be more difficult to determine and thus
sensitivity needs to be greater. In one embodiment, the functional
form of constants matrix 740 can be kept for an extended period of
time, where inputs of specific matrix element values (e.g. maximum
and minimum values in a specific table) are used to determine the
exact values. Which count corresponds to zero may also have an
input parameter. The functional forms may be constant or vary
across multiple entities.
[0146] The calculation shown in FIG. 7 can done for different
groups of second tables, e.g. one group shares a same merchant, one
group shares a month, etc. In such embodiments, the first tables
used for a particular group can be chosen to correspond with a
particular group. Thus, different first tables can be used for
different groups. In one embodiment, each of these calculations can
then be combined and provided to a model function that uses the
inputs to determine whether a similarity exists, and potentially a
level of similarity.
[0147] In one embodiment, the normalized tables (and potentially
the first tables) can be stored across multiple processors and each
one can perform the corresponding multiplication if there is a
match to an first table. As an alternative, a query can be provided
to each processor and the processor that is storing the desired
second table can return the requested table. The final table(s) can
be provided to a single processor or set of processors that are
configured to run a modeling function.
[0148] D. Subtraction of Matrices
[0149] Besides comparing different entities by multiplying their
respective key manifolds (sets of key pair tables), the manifolds
can also be subtracted from each other. Each table (which can be
normalized) can be aligned and the elements subtracted. If the
difference is large, then the two manifolds are less likely to be
part of the same affinity group. Embodiments can also analyze
tables that do not match exactly, and ones that match only after an
alignment procedure. The values of these matrices can be analyzed
along with the differences for the tables that did match.
[0150] E. Representative Consumers
[0151] Beyond just tracking ones showing a correlation, the number
of tables being tracked for a consumer can be reduced in other
ways. In one embodiment, to reduce the number of tables, different
zip codes of a consumer could be considered the same. For example,
a salesperson might have similar transaction history in cities on
his/her route. All of these zip codes can be considered to be
equivalent, and thus put into one table. A similar treatment can be
made for other key elements. Also, in a similar manner, different
consumers can be treated as being equivalent.
[0152] The knowledge of which affinity groups a consumer is similar
can also be used to reduce the number of tables being tracked for a
consumer. As mentioned above, certain consumers can be grouped into
affinity groups (demographics) in order to obtain more information
about correlated keys. In one embodiment, the affinity groups may
be used to define a consumer. For example, if a group of consumers
had identical or substantially similar transaction histories then
separate tables do not need to be stored. When a consumer of the
affinity group made a purchase, the tables of the affinity group
could be accessed.
[0153] Accordingly, the amount of total storage can be reduced by
defining a consumer by his/her affinity groups. Instead of storing
the redundant tables for each consumer, just one set of tables can
be stored. However, it may be uncommon for a consumer to have the
same transaction patterns as a particular affinity group. In such
cases, the consumer can be defined as being a linear combination of
affinity groups, which are each a combination of key pairs. For
example, there can be 100 affinity groups AG. A consumer can be
defined as 10% AG1, 15% AG 23, 30% AG 41, 20% AG 66, and 25% AG 88.
A consumer can also have individual tables that are used in
combination with the affinity group tables.
[0154] A consumer can thus be viewed as a combination of
representative consumers (affinity groups). Which representative
consumers to use for a particular consumer can be determined by
sampling transactions of the consumer. The sampling can be
relatively small compared to the total transaction history that
would be used if the consumer had his/her own tables. For example,
the sample transaction of a consumer can show that he/she has
similar transactions to a certain demographic group. The percentage
for each affinity group can be determined by a similarity in the
tables for the consumer relative to the affinity group.
[0155] In one embodiment, the similarity can be defined by taking
the difference between each of the matrices (tables) of the
consumer from each of the matrices of the affinity group. Where a
table exists for the consumer, but not for the affinity group, a
table of zeros can be assumed for the affinity group, and vice
versa. As an alternative or in combination, an alignment mechanism
(e.g. method 700) can be used to obtain more matches between the
set of tables for the consumer and the set of tables for the
affinity group.
[0156] In one aspect, tables for a consumer could be created
initially until one or more affinity groups are identified. Certain
tables for a consumer could still be kept to ensure that the
transaction patterns do not change, thereby causing a need to
reevaluate the specific linear combination being employed. Thus,
transaction data can be used to determine which demographic a
person fits into, as opposed to determining what a demographic is
based on similar consumers.
[0157] In one embodiment, if no transaction history is available
for a consumer, the consumer may be approximated using selected
representative consumers. The selected representative consumers can
be based on attributes, such as age or residence. In another
embodiment, a representative consumer can be built for each zip
code. This representative consumer can be used as a default for a
consumer living in that zip code, at least until more information
is obtained.
[0158] In another embodiment, a coupon can be used to probe a
consumer to determine if he/she belongs to a specific group. The
coupon can be for a certain product and/or merchant that is highly
correlated to a certain affinity group. If the customer uses the
coupon, then there is a higher likelihood that the consumer is at
least partially included that affinity group.
[0159] VI. Determining Important Trends or Changes in Trends
[0160] To predict a likelihood of a future event, some embodiments
can obtain the relevant key pair tables for the entity (e.g. a
consumer or a demographic) and then analyze these tables. Which
tables are obtained and how they are analyzed depends on exactly
what events are trying to be predicted, i.e. the question being
answered. In some instances, one may be interested in a particular
transaction (e.g. a particular product or merchant) for marketing
purposes. However, one generally may want to know trends (a
transaction pattern or changes in a transaction pattern) in
consumer activity and changes in trends, but it may be difficult to
identify trends. Thus, automated ways to identify transaction
patterns, potentially having specific features, and to identify
patterns that are changing can be provided.
[0161] FIG. 8 is a flowchart of a method 800 for determining a
likelihood of a transaction and using the likelihood to identify a
trend according to embodiments. For instance, patterns with high or
low likelihoods can be used to determine a trend toward greater or
fewer transactions. Changes between patterns can also be
analyzed.
[0162] In step 810, data for one or more recent and/or upcoming
events is received. In one embodiment, the event data (e.g.
transaction data) is associated with one entity, e.g., a particular
consumer or affinity group. For recent events, whether an event is
"recent" can be relative to other events. For example, if an event
does not occur often, a recent event (e.g. a last event of that
type) can still occur a long time ago in absolute terms. For an
upcoming event, the event has not occurred yet, but can be known to
occur. For example, the start of a month (or other time period) has
a known time of occurrence. As another example, a scheduled event
(such as a sporting event or concert) can be used. Data for these
scheduled events can be obtained before they occur due to the
nature of these events. In one aspect, recent and/or upcoming
events are used to identify patterns that may exist in the future
as opposed to patterns that have existed previously.
[0163] In step 820, the event data is used to map each event to one
or more keys KI. In some embodiments, the mapped keys KI are
specifically keys that are being tracked for an entity. In step
830, tables of patterns that have an initial key of KI are
obtained, thereby providing <KI: tables relevant to the received
event data. In one embodiment, a matching and retrieval function
identifies the relevant tables using methods described herein. The
matching and retrieval function can also match tables that do not
have the exact same key, but similar keys. A similar key can be a
broader version (e.g. first 3 digits of a zip code) of a more
specific key (e.g. 5 digit zip code). Examples of when such
alignment would be performed include: when a specific key for a
current transaction is received, but only a broader version of that
key is being tracked; and when two entities are being compared and
different key pairs are tracked. In embodiments where an event is
an upcoming event, the upcoming event can be a final event (or
effectively the time ranges can be negative with the upcoming event
being an initial event), where transactions before the ending event
are analyzed.
[0164] In step 840, the <KI: tables having matrix elements with
sufficiently high or low counts are identified, e.g., to determine
KF events that can be part of a trend. The trend may be a change in
likelihood, which can be identified when a count is outside of a
band of expected or average values. In one embodiment, to determine
whether a matrix element has a sufficient count, one or more
absolute or relative threshold numbers can be used (e.g. below a
lower and above an upper bound). A relative threshold (e.g. a
percentage) could be determined using a total number of counts for
a table or group of tables. In another embodiment, all tables (i.e.
not just ones with a matching KI for initial key) could be analyzed
to find matrix elements with high or low counts, thereby
eliminating steps 810 to 830. However, using recent or upcoming
events can provide greater timeliness for any result, or action to
be performed based on a result. The identified KF events along with
the specific time ranges for the matrix elements with the high or
low counts can then be analyzed.
[0165] In step 850, other matrix elements not previously identified
are obtained for each likely KF event. For example, a KF event can
be correlated to more initial keys than just the ones identified in
step 820. These previously unanalyzed tables can also have counts
outside of a band for certain matrix elements involving a KF event.
The KF event can be used as a filter to identify unanalyzed tables,
from which other high or low count matrix elements can be obtained.
Thus, this step can be used to obtain a more accurate likelihood
for a specific KF event. Obtaining these other matrix elements may
not be needed, e.g., if KI is starting event, such as a beginning
of a week, month, etc. In this case, since other tables might
include the same data points, these other tables could just include
redundant information.
[0166] Also, other matrix elements for KF events can be important
if high accuracy is desired. For example, as the timeframes of the
different :KF> tables can be different (due to different KI
events), matrix elements having relatively average counts can
correspond to the same timeframe as a high or low count matrix
element. Thus, the number of counts for a likely time range can be
revised.
[0167] In this manner, high probability KF events can be determined
based on a few recent or upcoming KI events, and then a full
analysis of :KF> tables can be performed, as opposed to randomly
selecting KF events to determine when they might be likely to
occur. A KF event could be chosen for analysis, but a selected KF
event might not be highly likely. However, if one were interested
in a specific KF event, then it may be desirable to start method
800 at step 850.
[0168] In step 860, the matrix elements (e.g., just from step 840
or also from step 850) are combined to obtain a probability
distribution vs. time for a :KF> event, which is correlated to
many <KI: events. In one embodiment, each of the matrix elements
for the KF event are combined from a portion or all of the
<KI:KF> tables, where KI runs over the initial events that
are correlated to the KF event. This combination can account for
the fact that the different KI events occur at different times, and
thus the time ranges for each table can be different (e.g.
offset).
[0169] In one implementation, the earliest or latest KI event can
be identified, and offsets for the time ranges of the other tables
can be determined. The corresponding matrix elements can then be
added using the offsets. If a time range of a matrix element of one
table only partially overlaps an offset time range of another
table, then the combination can be broken up into more time ranges
with proportional contributions from each previous rime range. For
example, if two time ranges overlap, then three time sections can
result. The overlap section can receive contributions (i.e. a
percentage of the counts) from the two matrix elements, with the
amount of contribution proportional to the amount of overlap in
time for the respective time ranges.
[0170] To determine a time range of high likelihood, a probability
distribution can be created from the resulting time ranges X after
the combination and the counts Y for each time range. The resulting
time ranges X with the respective counts Y can be analyzed as a
function Y=F(X), which can correspond to pattern 420 of FIG. 4. The
Y values can be normalized so that the counts for time ranges of
different lengths are accounted. The Y values can also be
normalized based on the dollar amount of a transaction.
[0171] In step 870, a trend is confirmed for a particular time or
time window (e.g. time range of the identified matrix element) by
analyzing the probability distribution. In one embodiment, a total
likelihood for a KF event (e.g. across multiple initial events) can
be calculated to confirm that the likelihood is still outside the
average or expected band. Likelihood values near the time range can
also be examined to identify regions of significant change in
probability. A specific time window may correspond to a
predetermined time range of a matrix element, or be another time
range that results from an overlap of multiple time ranges. For
example, if two matrix elements overlap in time (e.g. because the
KI events occur at different times), then the time window may have
the range of the overlap time. In one aspect, the trend can be for
a one or more specific amounts of the transaction, which can be
selected by multiplying with a mask matrix.
[0172] Besides confirming a time window, any new time windows
related to a trend can be identified. To determine a time range of
high or low likelihood, the probability function F can be analyzed.
For example, the function F can be analyzed with a numerical
routine to identify a maximum, minimum, or regions having values
outside of a range. To identify maximum or minimum regions,
techniques such as finite difference, interpolation, finite
element, or other suitable methods, can be used to obtain first and
second derivatives of F. These derivates can then be used in an
optimization algorithm for finding a maximum. Global optimization
methods can also be used, such as simulated annealing.
[0173] In addition to finding a time window when an event is part
of a trend, a total probability over a specific time period can be
obtained. In one embodiment, the function F can be integrated (e.g.
sum counters for time ranges) over the desired time range. In
effect, to obtain a probability that an event will occur within a
prescribed time period, one can integrate contributions over all of
the relevant key pairs during the time period. As an example with
one key pair, a probability that someone will perform a certain
event (e.g. a transaction) once they are visiting San Francisco can
be obtained by integrating the key pair <SF:KF> over all of
the desired time periods. In one aspect, time periods of greater
than one month may not be relevant if a person never stays in San
Francisco for that long (which could be identified from a location
of a person's phone or by locations of transactions). One could
also determine a probability for a transaction to occur in November
in a similar way.
[0174] As an alternative to all of the above steps, one can select
a particular event and a particular time, which can be used to
select the relevant patterns from which the corresponding matrix
element can be analyzed. If the tables indicate a desirable
likelihood (e.g. relative to threshold values), then a trend can be
determined. The probability distribution can also be analyzed,
starting at the particular time, to find a stronger trend (e.g.
increasing or decreasing probability values), or to find a trend if
one did not exist at the particular time.
[0175] In one embodiment, the relevant patterns from which the
corresponding matrix element are selected by creating a set of key
pair tables with 1 or other non-zero values in the appropriate
matrix elements. These tables are then multiplied by the saved
tables (i.e. known patterns) to obtain the likelihood, effectively
filtering out the desired values. Besides a particular time, a time
window can also be specified, which may cause more than one matrix
element in a table to have a non-zero value. In this case, the
non-zero values can be based on a level of overlap of the time
window with the corresponding time ranges of the matrix
elements.
[0176] Referring back to method 800, in step 880, the currently
calculated probability distribution can optionally be compared to
other probability distributions calculated previously. In one
embodiment, the previously calculated distributions are for the
same :KF> event. In this manner, a change in probability at a
particular time in one distribution to another can be determined.
This change can be used to determine a trend. For example, the
change can be plotted and a trend in the change can be determined.
The change can be used to predict a trend even when the likelihood
values fall within an expected or average band.
[0177] In step 890, a course of action can be determined based on a
determined trend. Various example actions and determinations are
now described. Examples of actions include marketing campaigns,
inventory levels inventory levels (e.g. at particular stores or for
all stores), pricing, and store locations. In one embodiment, the
future time of the trend and its relationship to the present time
can impact the action taken. For example, if the time window starts
soon, an action that can be performed soon (e.g. discounting price)
can be initiated. Whereas if the time window does not start for an
extended period of time, an action that takes longer (e.g. moving
inventory or opening a store) can be performed.
[0178] Also, once an event is found to be likely, further analysis
can be performed t o determine whether and how and action is to be
performed. For example, a cost of an action, such as the cost of
moving inventory or discounting the price of a product, can be
determined as part of a cost-benefit analysis. The analysis can
also include situations where inventory levels are high, and thus
the product needs to be sold quickly. In one embodiment, the cost
of an action can include a possible loss due to fraud, which can be
calculated by comparing a transaction pattern of an entity (e.g. a
demographic) to patterns known to be fraudulent (e.g. by
multiplying tables of the entity against tables of a fraud entity).
In another embodiment, a profit of an event can be determined,
e.g., the profit from a transaction in which a trend is identified.
If the profit is high, then a higher cost and lower trend can be
tolerated.
[0179] In one embodiment, calculations for the prediction of an
event can be run in real time (e.g. within several hours after an
event or series of events occur). In another embodiment, the
calculations can be run as batch jobs that are run periodically,
e.g., daily, weekly or monthly. For example, a calculation can run
monthly to determine who is likely to buy a house, and then a
coupon for art, furniture, etc. can be sent to that person. In
various embodiments, prediction of major purchases can generally be
run in larger batches, whereas prediction of small purchases can be
run in real-time (e.g., in reaction to a specific transaction).
[0180] In some embodiments, ending events also can be used
similarly to predict what may happen before the event. Since the
occurrence of an ending event can be known ahead of time (e.g.
scheduled for a particular time), the correlated initial events can
still be predicted. For example, consumer activity prior to a
schedule sporting event can be determined, which may be done, e.g.,
using tables having negative time ranges with the ending event as
an initial key or with positive time ranges with the ending event
as a final key. Trends in such predicted initial events can also be
used to determine actions.
[0181] Any of the computer systems mentioned herein may utilize any
suitable number of subsystems. Examples of such subsystems are
shown in FIG. 9 in computer apparatus 900. In some embodiments, a
computer system includes a single computer apparatus, where the
subsystems can be the components of the computer apparatus. In
other embodiments, a computer system can include multiple computer
apparatuses, each being a subsystem, with internal components.
[0182] The subsystems shown in FIG. 9 are interconnected via a
system bus 975. Additional subsystems such as a printer 974,
keyboard 978, fixed disk 979, monitor 976, which is coupled to
display adapter 982, and others are shown. Peripherals and
input/output (I/O) devices, which couple to I/O controller 971, can
be connected to the computer system by any number of means known in
the art, such as serial port 977. For example, serial port 977 or
external interface 981 can be used to connect computer system 900
to a wide area network such as the Internet, a mouse input device,
or a scanner. The interconnection via system bus 975 allows the
central processor 973 to communicate with each subsystem and to
control the execution of instructions from system memory 972 or the
fixed disk 979, as well as the exchange of information between
subsystems. The system memory 972 and/or the fixed disk 979 may
embody a computer readable medium. Any of the values mentioned
herein can be output from one component to another component and
can be output to the user.
[0183] A computer system can include a plurality of the same
components or subsystems, e.g., connected together by external
interface 981. In some embodiments, computer systems, subsystem, or
apparatuses can communicate over a network. In such instances, one
computer can be considered a client and another computer a server.
A client and a server can each include multiple systems,
subsystems, or components, mentioned herein.
[0184] The specific details of particular embodiments may be
combined in any suitable manner without departing from the spirit
and scope of embodiments of the invention. However, other
embodiments of the invention may be directed to specific
embodiments relating to each individual aspect, or specific
combinations of these individual aspects.
[0185] It should be understood that the present invention as
described above can be implemented in the form of control logic
using hardware and/or using computer software in a modular or
integrated manner. Based on the disclosure and teachings provided
herein, a person of ordinary skill in the art will know and
appreciate other ways and/or methods to implement the present
invention using hardware and a combination of hardware and
software
[0186] Any of the software components or functions described in
this application, may be implemented as software code to be
executed by a processor using any suitable computer language such
as, for example, Java, C++ or Perl using, for example, conventional
or object-oriented techniques. The software code may be stored as a
series of instructions, or commands on a computer readable medium
for storage and/or transmission, suitable media include random
access memory (RAM), a read only memory (ROM), a magnetic medium
such as a hard-drive or a floppy disk, or an optical medium such as
a compact disk (CD) or DVD (digital versatile disk), flash memory,
and the like. The computer readable medium may be any combination
of such storage or transmission devices.
[0187] Such programs may also be encoded and transmitted using
carrier signals adapted for transmission via wired, optical, and/or
wireless networks conforming to a variety of protocols, including
the Internet. As such, a computer readable medium according to an
embodiment of the present invention may be created using a data
signal encoded with such programs. Computer readable media encoded
with the program code may be packaged with a compatible device or
provided separately from other devices (e.g., via Internet
download). Any such computer readable medium may reside on or
within a single computer program product (e.g. a hard drive, a CD,
or an entire computer system), and may be present on or within
different computer program products within a system or network. A
computer system may include a monitor, printer, or other suitable
display for providing any of the results mentioned herein to a
user.
[0188] The above description of exemplary embodiments of the
invention has been presented for the purposes of illustration and
description. It is not intended to be exhaustive or to limit the
invention to the precise form described, and many modifications and
variations are possible in light of the teaching above. The
embodiments were chosen and described in order to best explain the
principles of the invention and its practical applications to
thereby enable others skilled in the art to best utilize the
invention in various embodiments and with various modifications as
are suited to the particular use contemplated.
* * * * *