U.S. patent application number 14/562524 was filed with the patent office on 2014-12-05 and published on 2016-01-28 for system and method for determining a propensity of entity to take a specified action.
The applicant listed for this patent is PALANTIR TECHNOLOGIES INC. Invention is credited to Daniel ERENRICH, Anirvan MUKHERJEE.
Publication Number | 20160026923 |
Application Number | 14/562524 |
Family ID | 55166992 |
Filed Date | 2014-12-05 |
United States Patent Application | 20160026923 |
Kind Code | A1 |
ERENRICH; Daniel; et al. | January 28, 2016 |
SYSTEM AND METHOD FOR DETERMINING A PROPENSITY OF ENTITY TO TAKE A
SPECIFIED ACTION
Abstract
Systems and methods are disclosed for determining a propensity
of an entity to take a specified action. In accordance with one
implementation, a method is provided for determining the
propensity. The method includes, for example, accessing one or more
data sources, the one or more data sources including information
associated with the entity, forming a record associated with the
entity by integrating the information from the one or more data
sources, generating, based on the record, one or more features
associated with the entity, processing the one or more features to
determine the propensity of the entity to take the specified
action, and outputting the propensity.
Inventors: | ERENRICH; Daniel (Mountain View, CA); MUKHERJEE; Anirvan (Mountain View, CA) |
Applicant: |
Name | City | State | Country | Type |
PALANTIR TECHNOLOGIES INC. | Palo Alto | CA | US | |
Family ID: | 55166992 |
Appl. No.: | 14/562524 |
Filed: | December 5, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
62039305 | Aug 19, 2014 | |
62027761 | Jul 22, 2014 | |
Current U.S. Class: | 706/52 |
Current CPC Class: | G06Q 30/01 20130101; G06N 7/005 20130101; G06N 5/048 20130101 |
International Class: | G06N 5/04 20060101 G06N005/04; G06N 7/00 20060101 G06N007/00 |
Claims
1. A system for determining a propensity of an entity to take a
specified action, the system comprising: one or more
computer-readable storage media configured to store instructions;
and one or more processors configured to execute the instructions
to: acquire information associated with the entity from one or more
data sources; form a record associated with the entity by
integrating the information from the one or more data sources;
generate, based on the record, one or more features associated with
the entity; process the one or more features to determine the
propensity of the entity to take the specified action; and output
the propensity.
2. The system of claim 1, wherein the one or more processors are
further configured to filter the record for information associated
with the specified action.
3. The system of claim 1, wherein the one or more processors are
further configured to train a model to predict the propensity of
the entity to take the specified action.
4. The system of claim 3, wherein the one or more processors are
further configured to determine, based on the trained model and the
record, the relative importance of the one or more features.
5. The system of claim 1, wherein the one or more processors are
further configured to: acquire a temporal period; and determine the
propensity of the entity to take the specified action within the
temporal period.
6. The system of claim 1, wherein the one or more processors are
further configured to generate a user interface to display the
propensity of the entity to take the specified action.
7. The system of claim 1, wherein the entity is a household and the
specified action is churn.
8. A method for determining a propensity of an entity to take a
specified action, the method being performed by one or more
processors and comprising: acquiring information associated with
the entity from one or more data sources; forming a record
associated with the entity by integrating the information from the
one or more data sources; generating, based on the record, one or
more features associated with the entity; processing the one or
more features to determine the propensity of the entity to take the
specified action; and outputting the propensity.
9. The method of claim 8, further comprising filtering the record
for information associated with the specified action.
10. The method of claim 8, further comprising training a model to
predict the propensity of the entity to take the specified
action.
11. The method of claim 10, further comprising determining, based
on the trained model and the record, the relative importance of the
one or more features.
12. The method of claim 8, further comprising: acquiring a temporal
period; and determining the propensity of the entity to take the
specified action within the temporal period.
13. The method of claim 8, further comprising generating a user
interface to display the propensity of the entity to take the
specified action.
14. The method of claim 8, wherein the entity is a household and
the specified action is churn.
15. A non-transitory computer-readable medium storing a set of
instructions that are executable by one or more processors to cause
the one or more processors to perform a method for determining a
propensity of an entity to take a specified action, the method
comprising: acquiring information associated with the entity from one or
more data sources; forming a record associated with the entity by
integrating the information from the one or more data sources;
generating, based on the record, one or more features associated
with the entity; processing the one or more features to determine
the propensity of the entity to take the specified action; and
outputting the propensity.
16. The non-transitory computer-readable medium of claim 15,
further comprising instructions executable by the one or more
processors to cause the one or more processors to perform: training
a model to predict the propensity of the entity to take the
specified action.
17. The non-transitory computer-readable medium of claim 16,
further comprising instructions executable by the one or more
processors to cause the one or more processors to perform:
determining, based on the trained model and the record, the
relative importance of the one or more features.
18. The non-transitory computer-readable medium of claim 15,
further comprising instructions executable by the one or more
processors to cause the one or more processors to perform:
acquiring a temporal period; and determining the propensity of the
entity to take the specified action within the temporal period.
19. The non-transitory computer-readable medium of claim 15,
further comprising instructions executable by the one or more
processors to cause the one or more processors to perform:
generating a user interface to display the propensity of the entity
to take the specified action.
20. The non-transitory computer-readable medium of claim 15,
wherein the entity is a household and the specified action is
churn.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent
Application No. 62/027,761, filed on Jul. 22, 2014, and U.S.
Provisional Patent Application No. 62/039,305, filed on Aug. 19,
2014, the disclosures of which are expressly incorporated herein by
reference in their entirety.
BACKGROUND
[0002] The amount of information being processed and stored is
rapidly increasing as technology advances present an
ever-increasing ability to generate and store data. On the one
hand, this vast amount of data allows entities to perform more
detailed analyses than ever. But on the other hand, the vast amount
of data makes it more difficult for entities to quickly sort
through and determine the most relevant features of the data.
Collecting, classifying, and analyzing large sets of data in an
appropriate manner allows these entities to more quickly and
efficiently identify patterns, thereby allowing them to predict
future actions.
BRIEF DESCRIPTION OF THE DRAWINGS
[0003] Reference will now be made to the accompanying drawings,
which illustrate exemplary embodiments of the present disclosure.
In the drawings:
[0004] FIG. 1 is a block diagram of an exemplary computer system,
consistent with embodiments of the present disclosure;
[0005] FIG. 2 is a flowchart of an exemplary method for determining
a propensity of an entity to take a specified action, consistent
with embodiments of the present disclosure;
[0006] FIG. 3 is a flowchart of an exemplary method for creating a
model to determine the propensity of an entity to take a specified
action, consistent with embodiments of the present disclosure;
[0007] FIG. 4 provides an exemplary use case scenario for
determining a propensity of an entity to take a specified action
applied to an exemplary data structure, consistent with embodiments
of the present disclosure;
[0008] FIG. 5 illustrates an exemplary user interface, consistent
with embodiments of the present disclosure; and
[0009] FIG. 6 illustrates another exemplary user interface,
consistent with embodiments of the present disclosure.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
[0010] Reference will now be made in detail to several exemplary
embodiments, including those illustrated in the accompanying
drawings. Whenever possible, the same reference numbers will be
used throughout the drawings to refer to the same or like
parts.
[0011] Embodiments disclosed herein are directed to, among other
things, systems and methods that can determine the propensity of
an entity (e.g., a person, a household, or a company) to take a
specified action. For example, the specified action can be churn,
in which a customer leaves a supplier during a given time
period. Factors that can
affect the churn rate include customer dissatisfaction, cheaper
and/or better offers from the competition, more successful sales
and/or marketing by the competition, or reasons having to do with
the customer life cycle. If a supplier can receive an indication
that a customer is likely to churn, the supplier can take one or
more actions in order to keep the customer. The embodiments
disclosed herein can assist with providing that indication.
[0012] For example, the systems and methods can access one or more
data sources, the one or more data sources including information
associated with the entity, form a record associated with the
entity by integrating the information from the one or more data
sources, generate, based on the record, one or more features
associated with the entity, process the one or more features to
determine the propensity of the entity to take the specified
action, and output the propensity.
[0013] The operations, techniques, and/or components described
herein are implemented by a computer system, which can include one
or more special-purpose computing devices. The special-purpose
computing devices can be hard-wired to perform the operations,
techniques, and/or components described herein. The special-purpose
computing devices can include digital electronic devices such as
one or more application-specific integrated circuits (ASICs) or
field programmable gate arrays (FPGAs) that are persistently
programmed to perform the operations, techniques, and/or components
described herein. The special-purpose computing devices can include
one or more hardware processors programmed to perform such features
of the present disclosure pursuant to program instructions in
firmware, memory, other storage, or a combination. Such
special-purpose computing devices can combine custom hard-wired
logic, ASICs, or FPGAs with custom programming to accomplish the
techniques and other features of the present disclosure. The
special-purpose computing devices can be desktop computer systems,
portable computer systems, handheld devices, networking devices, or
any other device that incorporates hard-wired and/or program logic
to implement the techniques and other features of the present
disclosure.
[0014] The one or more special-purpose computing devices can be
generally controlled and coordinated by operating system software,
such as iOS, Android, Blackberry, Chrome OS, Windows XP, Windows
Vista, Windows 7, Windows 8, Windows Server, Windows CE, Unix,
Linux, SunOS, Solaris, VxWorks, or other compatible operating
systems. In other embodiments, the computing device can be
controlled by a proprietary operating system. Operating systems
control and schedule computer processes for execution, perform
memory management, provide file system, networking, and I/O
services, and provide user interface functionality, such as a
graphical user interface ("GUI"), among other things.
[0015] By way of example, FIG. 1 is a block diagram that
illustrates an implementation of a computer system 100, which, as
described above, can comprise one or more electronic devices.
Computer system 100 includes a bus 102 or other communication
mechanism for communicating information, and one or more hardware
processors 104 (denoted as processor 104 for purposes of
simplicity), coupled with bus 102 for processing information. One
or more hardware processors 104 can be, for example, one or more
microprocessors.
[0016] Computer system 100 also includes a main memory 106, such as
a random access memory (RAM) or other dynamic storage device,
coupled to bus 102 for storing information and instructions to be
executed by one or more processors 104. Main memory 106 also can be
used for storing temporary variables or other intermediate
information during execution of instructions to be executed by
processor 104. Such instructions, when stored in non-transitory
storage media accessible to one or more processors 104, render
computer system 100 into a special-purpose machine that is
customized to perform the operations specified in the
instructions.
[0017] Computer system 100 further includes a read only memory
(ROM) 108 or other static storage device coupled to bus 102 for
storing static information and instructions for processor 104. A
storage device 110, such as a magnetic disk, optical disk, or USB
thumb drive (Flash drive), etc., is provided and coupled to bus 102
for storing information and instructions.
[0018] Computer system 100 can be coupled via bus 102 to a display
112, such as a cathode ray tube (CRT), an LCD display, or a
touchscreen, for displaying information to a computer user. An
input device 114, including alphanumeric and other keys, is coupled
to bus 102 for communicating information and command selections to
one or more processors 104. Another type of user input device is
cursor control 116, such as a mouse, a trackball, or cursor
direction keys for communicating direction information and command
selections to one or more processors 104 and for controlling cursor
movement on display 112. The input device typically has two degrees
of freedom in two axes, a first axis (for example, x) and a second
axis (for example, y), that allows the device to specify positions
in a plane. In some embodiments, the same direction information and
command selections as cursor control may be implemented via
receiving touches on a touch screen without a cursor.
[0019] Computer system 100 can include a user interface module to
implement a GUI that may be stored in a mass storage device as
executable software codes that are executed by the one or more
computing devices. This and other modules may include, by way of
example, components, such as software components, object-oriented
software components, class components and task components,
processes, functions, attributes, procedures, subroutines, segments
of program code, drivers, firmware, microcode, circuitry, data,
databases, data structures, tables, arrays, and variables.
[0020] In general, the word "module," as used herein, refers to
logic embodied in hardware or firmware, or to a collection of
software instructions, possibly having entry and exit points,
written in a programming language, such as, for example, Java, Lua,
C, and C++. A software module can be compiled and linked into an
executable program, installed in a dynamic link library, or written
in an interpreted programming language such as, for example, BASIC,
Perl, Python, or Pig. It will be appreciated that software modules
can be callable from other modules or from themselves, and/or can
be invoked in response to detected events or interrupts. Software
modules configured for execution on computing devices can be
provided on a computer readable medium, such as a compact disc,
digital video disc, flash drive, magnetic disc, or any other
tangible medium, or as a digital download (and can be originally
stored in a compressed or installable format that requires
installation, decompression, or decryption prior to execution).
Such software code can be stored, partially or fully, on a memory
device of the executing computing device, for execution by the
computing device. Software instructions can be embedded in
firmware, such as an EPROM. It will be further appreciated that
hardware modules can be comprised of connected logic units, such as
gates and flip-flops, and/or can be comprised of programmable
units, such as programmable gate arrays or processors. The modules
or computing device functionality described herein are preferably
implemented as software modules, but can be represented in hardware
or firmware. Generally, the modules described herein refer to
logical modules that may be combined with other modules or divided
into sub-modules despite their physical organization or
storage.
[0021] Computer system 100 can implement the techniques and other
features described herein using customized hard-wired logic, one or
more ASICs or FPGAs, firmware and/or program logic which in
combination with the electronic device causes or programs computer
system 100 to be a special-purpose machine. According to some
embodiments, the techniques and other features described herein are
performed by computer system 100 in response to one or more
processors 104 executing one or more sequences of one or more
instructions contained in main memory 106. Such instructions can be
read into main memory 106 from another storage medium, such as
storage device 110. Execution of the sequences of instructions
contained in main memory 106 causes one or more processors 104 to
perform the process steps described herein. In alternative
embodiments, hard-wired circuitry can be used in place of or in
combination with software instructions.
[0022] The term "non-transitory media" as used herein refers to any
media storing data and/or instructions that cause a machine to
operate in a specific fashion. Such non-transitory media can
comprise non-volatile media and/or volatile media. Non-volatile
media includes, for example, optical or magnetic disks, such as
storage device 110. Volatile media includes dynamic memory, such as
main memory 106. Common forms of non-transitory media include, for
example, a floppy disk, a flexible disk, hard disk, solid state
drive, magnetic tape, or any other magnetic data storage medium, a
CD-ROM, any other optical data storage medium, any physical medium
with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM,
NVRAM, any other memory chip or cartridge, a register memory, a
processor cache, and networked versions of the same.
[0023] Non-transitory media is distinct from, but can be used in
conjunction with, transmission media. Transmission media
participates in transferring information between storage media. For
example, transmission media includes coaxial cables, copper wire
and fiber optics, including the wires that comprise bus 102.
Transmission media can also take the form of acoustic or light
waves, such as those generated during radio-wave and infra-red data
communications.
[0024] Various forms of media can be involved in carrying one or
more sequences of one or more instructions to one or more
processors 104 for execution. For example, the instructions can
initially be carried on a magnetic disk or solid state drive of a
remote computer. The remote computer can load the instructions into
its dynamic memory and send the instructions over a telephone line
using a modem. A modem local to computer system 100 can receive the
data on the telephone line and use an infra-red transmitter to
convert the data to an infra-red signal. An infra-red detector can
receive the data carried in the infra-red signal and appropriate
circuitry can place the data on bus 102. Bus 102 carries the data
to main memory 106, from which processor 104 retrieves and executes
the instructions. The instructions received by main memory 106 can
optionally be stored on storage device 110 either before or after
execution by one or more processors 104.
[0025] Computer system 100 can also include a communication
interface 118 coupled to bus 102. Communication interface 118 can
provide a two-way data communication coupling to a network link 120
that is connected to a local network 122. For example,
communication interface 118 can be an integrated services digital
network (ISDN) card, cable modem, satellite modem, or a modem to
provide a data communication connection to a corresponding type of
telephone line. As another example, communication interface 118 can
be a local area network (LAN) card to provide a data communication
connection to a compatible LAN. Wireless links can also be
implemented. In any such implementation, communication interface
118 can send and receive electrical, electromagnetic, or optical
signals that carry digital data streams representing various types
of information.
[0026] Network link 120 can typically provide data communication
through one or more networks to other data devices. For example,
network link 120 can provide a connection through local network 122
to a host computer 124 or to data equipment operated by an Internet
Service Provider (ISP) 126. ISP 126 in turn provides data
communication services through the world wide packet data
communication network now commonly referred to as the "Internet"
128. Local network 122 and Internet 128 both use electrical,
electromagnetic, or optical signals that carry digital data
streams. The signals through the various networks and the signals
on network link 120 and through communication interface 118, which
carry the digital data to and from computer system 100, are
example forms of transmission media.
[0027] Computer system 100 can send messages and receive data,
including program code, through the network(s), network link 120
and communication interface 118. In the Internet example, a server
130 might transmit a requested code for an application program
through Internet 128, ISP 126, local network 122 and communication
interface 118. The received code can be executed by one or more
processors 104 as it is received, and/or stored in storage device
110, or other non-volatile storage for later execution.
[0028] FIG. 2 is a flowchart representing an exemplary method 200
for determining the propensity of an entity to take a specified
action. While the flowchart discloses the following steps in a
particular order, it is appreciated that at least some of the steps
can be moved, modified, or deleted where appropriate, consistent
with embodiments of the present disclosure. In some embodiments,
method 200 can be performed in full or in part by a computer system
(e.g., computer system 100). It is appreciated that some of these
steps can be performed in full or in part by other systems.
[0029] Referring to FIG. 2, at step 210, the computer system can
access one or more data sources that include information associated
with the entity. The one or more data sources can be stored locally
at the computer system and/or at one or more remote servers (e.g.,
such as a remote database), or at one or more other remote devices.
In some embodiments, the information in the data sources can be
stored in one or more multidimensional tables. By way of example,
information of a first type (e.g., bill payment amount) associated
with the entity (e.g., a household) can be stored in a first
multidimensional table and information of a second type (e.g.,
automobile type) associated with the entity can be stored in a
second multidimensional table. In some embodiments a table can
contain information associated with a single entity. In other
embodiments, a table can store information associated with a
plurality of entities. For example, each row in the table can
correspond to a different entity (e.g., Household #1, Household #2,
etc.) and each column in the table can correspond to a payment
amount. In some embodiments, the information stored in the table
can include entries associated with a temporal period. For example,
a table can store a bill payment date for each bill payment amount.
The information can be stored as a continuous value (e.g., $800 as
a bill payment amount), as a categorical value (e.g., "Sedan" or
"Coupe" as an automobile type), as a textual value, or as any other
type of value. In some embodiments, a table can be stored in either
a row-oriented database or a column-oriented database. For example,
a row in a row-oriented table can contain information associated
with an entity (e.g., Household #1) and data in the row can be
stored serially such that information associated with the entity
can be accessed in one operation.
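By way of illustration only, the two storage layouts described above can be sketched as follows. The household names, field names, and helper functions (`lookup_row`, `to_rows`) are hypothetical and are not taken from the disclosure; this is a minimal sketch, not a description of any particular database.

```python
# Hypothetical example data: two households with two fields each.

# Row-oriented layout: one record per entity, with its fields stored together,
# so everything about one household is retrieved in a single access.
row_oriented = [
    {"entity": "Household #1", "bill_payment_amount": 800.0, "automobile_type": "Sedan"},
    {"entity": "Household #2", "bill_payment_amount": 650.0, "automobile_type": "Coupe"},
]

# Column-oriented layout: one list per field, aligned by position,
# so all values of one field are stored together.
column_oriented = {
    "entity": ["Household #1", "Household #2"],
    "bill_payment_amount": [800.0, 650.0],
    "automobile_type": ["Sedan", "Coupe"],
}


def lookup_row(rows, entity):
    """Return the full row for one entity, or None if it is absent."""
    for row in rows:
        if row["entity"] == entity:
            return row
    return None


def to_rows(columns):
    """Convert a column-oriented table into row-oriented records."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]
```

Both layouts hold the same information; the difference is which access pattern (all fields of one entity, or all values of one field) is served by a single contiguous read.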
[0030] In some embodiments the computer system can access the one
or more data sources periodically (e.g., once a week, once a month,
etc.). The computer system can access the one or more data sources
based on the one or more data sources being updated (e.g., a new
entry, such as bill payment amount, is added to a table). In some
embodiments, the computer system can access the one or more data
sources responsive to an input received from the user. The user
input can identify the entity (e.g., Household #5) for which
information is requested. In some embodiments, the user input can
identify a category or class of entities. For example, the user
input can identify a class of entities that are all consumers of a
specified provisioning entity (e.g., insurance company), the user
input can identify entities that are located within a specified
geographic region (e.g., all households within the state of
Illinois), or the user input can identify any other category of
entities (e.g., all households with an income over $100,000). In
response to the user input, the computer system can access the one
or more data sources including information associated with the
entities. In some embodiments, method 200 can be performed
periodically (e.g., once a week, once a month, etc.). In some
embodiments, method 200 can be performed whenever the one or more
data sources are accessed.
[0031] At step 220, the computer system can form a record including
all information from the one or more data sources associated with
the entity. In some embodiments, the record can be formed by
integrating the information that is associated with the entity from
the one or more data sources. The record can contain a multitude of
information related to the entity. For example, the record can
contain all information from the one or more data sources
associated with a household (e.g., number of members in household,
age of each member of the household, number of automobiles, income,
monthly bill amounts for each automobile, types of automobiles,
etc.). In some embodiments, the record can be stored as a cogroup
(e.g., the cogroup shown in FIG. 4). In some embodiments, the
record can be stored in either a row-oriented database or a
column-oriented database. For example, a row in a row-oriented
record can be associated with a data source (e.g., bill payment
amount) and data in the row can be stored serially such that data
associated with that data source can be accessed in one
operation.
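By way of illustration only, forming a record by integrating information from several data sources can be sketched as a grouping of rows by entity, in the spirit of a cogroup. The function name `form_records`, the `(entity, value)` row shape, and the sample data are assumptions made for this sketch and are not specified by the disclosure.

```python
from collections import defaultdict


def form_records(data_sources):
    """Group rows from every data source by entity.

    data_sources maps a source name to a list of (entity, value) rows.
    The result maps each entity to one record holding, per source,
    the list of values found for that entity.
    """
    records = defaultdict(lambda: {name: [] for name in data_sources})
    for name, rows in data_sources.items():
        for entity, value in rows:
            records[entity][name].append(value)
    return dict(records)


# Hypothetical data sources keyed by the type of information they hold.
sources = {
    "bill payment amount": [
        ("Household #1", 800.0),
        ("Household #1", 540.0),
        ("Household #2", 650.0),
    ],
    "automobile type": [
        ("Household #1", "Sedan"),
        ("Household #2", "Coupe"),
    ],
}

records = form_records(sources)
```

Each entity's record keeps one entry per data source, even when a source has no rows for that entity, so downstream steps can rely on a uniform shape.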
[0032] At step 230, the computer system can filter the record for
information associated with the specified action. For example, the
specified action can be churn (e.g., cancellation of a
subscription) and the computer system can filter the record for
information related to churn. In some embodiments, the computer
system can provide context for the specified action. In some
embodiments, the computer system can determine whether the
specified action will likely occur within a specified temporal
period (e.g., one month). The computer system can filter out all
information associated with a time that is outside (e.g., before or
after) the specified temporal period. In some embodiments, the
computer system can determine the propensity for the specified
action based on only recent events. For example, the computer
system can filter out information associated with a time before the
specified time period (e.g., stale or less relevant information).
In some embodiments, each record can be filtered in a slightly
different way. The record can be filtered according to a user input
specifying an activity or temporal period. In some embodiments, the
record can be filtered automatically based on a presetting (e.g.,
the computer can be configured to filter out all information that
is more than one year old).
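By way of illustration only, filtering a record to a temporal period can be sketched as follows. The function name `filter_record`, the `(date, value)` entry shape, the 365-day default, and the sample dates are assumptions made for this sketch.

```python
from datetime import date, timedelta


def filter_record(record, as_of, max_age_days=365):
    """Keep only entries dated within max_age_days before as_of.

    record maps a source name to a list of (date, value) entries.
    Entries older than the cutoff, or dated after as_of, are dropped.
    """
    cutoff = as_of - timedelta(days=max_age_days)
    return {
        source: [(when, value) for when, value in entries if cutoff <= when <= as_of]
        for source, entries in record.items()
    }


# Hypothetical record with one stale entry and two recent entries.
record = {
    "bill payment amount": [
        (date(2013, 1, 15), 900.0),
        (date(2014, 6, 15), 800.0),
        (date(2014, 11, 15), 540.0),
    ]
}

recent = filter_record(record, as_of=date(2014, 12, 5))
```

A user-specified temporal period could be supported in the same way by passing a different `max_age_days` value.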
[0033] At step 240, the computer system can generate, based on the
record, one or more features associated with the entity. A feature
can be any discernable way of sorting or classifying the record
(e.g., average value, most recent value, most common value, etc.).
In some embodiments, the computer system can generate key value
pairs, wherein each key value pair contains a feature and a value.
For example, the computer system can generate features such as
"average bill payment amount", "average income", "average number of
automobiles", etc. and corresponding values such as "$670", "$73K",
"2.3 cars", etc. In some embodiments, features can be associated
with a time value. For example, the computer system can generate
features for a specified temporal period (e.g., features can be
based only on the most recent values). Feature values can be
represented as a continuous value (e.g., $670), as a categorical
value (e.g., "Sedan" or "Coupe"), as a textual value, or as any
other type of value. In some embodiments, feature values can be
classified as weighted values. For example, a household income of
$73,000 can be represented as a weighted value of {0.27 0}, {0.73
100000}.
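By way of illustration only, feature generation can be sketched as the computation of key value pairs from a record. The reading of the weighted value as a pair of weights on two anchor values whose weighted sum equals the original value is an assumption of this sketch (it is consistent with 0.27 x 0 + 0.73 x 100000 = 73000); the function names and sample data are likewise hypothetical.

```python
from collections import Counter
from statistics import mean


def generate_features(record):
    """Build key value pairs (feature name -> value) from a record.

    record maps a source name to a list of values, oldest first.
    Numeric sources yield average and most recent values;
    other sources yield the most common value.
    """
    features = {}
    for source, values in record.items():
        if not values:
            continue  # nothing to summarize for this source
        if all(isinstance(v, (int, float)) for v in values):
            features[f"average {source}"] = mean(values)
            features[f"most recent {source}"] = values[-1]
        else:
            features[f"most common {source}"] = Counter(values).most_common(1)[0][0]
    return features


def weighted_value(value, low, high):
    """Express value as weights on two anchors (assumed interpretation).

    Returns [(weight_low, low), (weight_high, high)] such that
    weight_low * low + weight_high * high equals value.
    """
    w_high = (value - low) / (high - low)
    return [(1.0 - w_high, low), (w_high, high)]


features = generate_features({
    "bill payment amount": [800.0, 540.0],
    "automobile type": ["Sedan", "Sedan", "Coupe"],
})
income = weighted_value(73000, 0, 100000)
```

Here the average bill payment amount is 670.0, and the income of 73000 is split into weights of approximately 0.27 on 0 and 0.73 on 100000.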
[0034] At step 250, the computer system can process the one or more
features to determine the propensity of the entity to take the
specified action. In some embodiments, the propensity can be
determined by applying a trained model, such as the model described
in greater detail in FIG. 3. The input to the model can be key
value pairs of the one or more features associated with the entity
and the specified actions and the output of the model can be the
propensity of the entity to take the specified action. In some
embodiments, processing the one or more features associated with
the entity can result in a multitude of useful insights regarding
the features that influence the propensity of the entity to take
the specified action. Such insights can include, for example, the
features that are most influential on the propensity of the entity
to take the specified action (e.g., change in income, etc.).
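By way of illustration only, applying a trained model to key value pairs of features can be sketched as follows. The disclosure does not specify the form of the model at this point; a logistic model is assumed here purely for illustration, and the feature names, weights, and function names are hypothetical.

```python
import math


def propensity(features, weights, bias=0.0):
    """Map numeric features to a propensity between 0 and 1.

    Assumes a logistic model: a weighted sum of feature values
    passed through the logistic function.
    """
    score = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def most_influential(features, weights):
    """Return the feature whose weighted contribution is largest in magnitude."""
    contributions = {name: weights.get(name, 0.0) * value for name, value in features.items()}
    return max(contributions, key=lambda name: abs(contributions[name]))


# Hypothetical, pre-scaled features and made-up weights.
example_features = {"change in income": -1.0, "late payments": 2.0}
example_weights = {"change in income": -0.8, "late payments": 0.5}

p = propensity(example_features, example_weights)
top = most_influential(example_features, example_weights)
```

In this sketch the per-feature contribution doubles as a simple indication of which feature most influences the output for a given entity; other models would call for other importance measures.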
[0035] At step 260, the computer system can output the propensity.
In some embodiments the computer system can output the propensity
as a continuous value, such as a number or percentage (e.g., 80 or
80%) or as a categorical value (e.g., "low", "medium", or "high").
In some embodiments, the computer system can generate a user
interface, such as the user interfaces described in greater detail
in FIGS. 5 and 6 for displaying the propensity. In some
embodiments, the computer system can output a plurality of
propensities for a plurality of entities. The computer system can
output the plurality of propensities as a separate file (e.g., a
text file or an Excel file) or as a table.
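By way of illustration only, outputting propensities for a plurality of entities as both a percentage and a categorical value can be sketched as follows. The thresholds of one third and two thirds, the column names, and the function names are assumptions of this sketch; comma-separated text is used as a stand-in for the text file mentioned above.

```python
import csv
import io


def categorize(propensity, low=1 / 3, high=2 / 3):
    """Map a propensity in [0, 1] to "low", "medium", or "high".

    The thresholds are arbitrary defaults chosen for this sketch.
    """
    if propensity < low:
        return "low"
    if propensity < high:
        return "medium"
    return "high"


def write_propensities(propensities, out):
    """Write one row per entity: name, percentage, and category."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["entity", "propensity_percent", "category"])
    for entity, p in propensities.items():
        writer.writerow([entity, f"{p:.0%}", categorize(p)])


# Hypothetical propensities for two households.
buffer = io.StringIO()
write_propensities({"Household #1": 0.80, "Household #2": 0.12}, buffer)
output_text = buffer.getvalue()
```

Passing an open file object instead of the in-memory buffer would write the same table to a separate file.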
[0036] FIG. 3 shows a flowchart representing an exemplary method
300 for creating a model to determine the propensity of an entity
to take a specified action, consistent with embodiments of the
present disclosure. While the flowchart discloses the following
steps in a particular order, it is appreciated that at least some
of the steps can be moved, modified, or deleted where appropriate,
consistent with embodiments of the present disclosure. In some
embodiments, method 300 can be performed in full or in part by a
computer system (e.g., computer system 100). It is appreciated that
some of these steps can be performed in full or in part by other
systems.
[0037] Referring to FIG. 3, at step 310, the computer system can
access one or more data sources that include information associated
with the plurality of entities. The one or more data sources can be
stored locally at the computer system and/or at one or more remote
servers (e.g., such as a remote database), or at one or more other
remote devices. In some embodiments, the information in the data
sources can be stored in one or more multidimensional tables. By
way of example, information of a first type (e.g., bill payment
amount) associated with the plurality of entities (e.g.,
households) can be stored in a first multidimensional table and
information of a second type (e.g., automobile type) associated
with the entities can be stored in a second multidimensional table.
In some embodiments, a plurality of tables can contain information
associated with the plurality of entities, wherein each table
contains information associated with each entity. In other
embodiments, a table can store information associated with a
plurality of entities. For example, each row in the table can
correspond to a different entity (e.g., Household #1, Household #2,
etc.) and each column in the table can correspond to a payment
amount. In some embodiments, the information stored in a table can
include entries associated with a temporal period. For example, a
table can store a bill payment date for each bill payment amount.
The information can be stored as a continuous value (e.g., $800 as
a bill payment amount), as a categorical value (e.g., "Sedan" or
"Coupe" as an automobile type), as a textual value, or as any other
type of value. In some embodiments, a table can be stored in either
a row-oriented database or a column-oriented database. For example,
a row in a row-oriented table can contain information associated
with an entity (e.g., Household #1) and data in the row can be
stored serially such that information associated with the entity
can be accessed in one operation.
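The difference between the two layouts can be sketched with plain in-memory structures. This is an illustration only: the field names and values are hypothetical, and an actual row-oriented or column-oriented database would manage storage on disk rather than in dictionaries.

```python
# Row-oriented: one entry per entity, so all of an entity's fields are
# read together in a single lookup.
row_store = {
    "Household #1": {"bill_amount": 800, "automobile_type": "Sedan"},
    "Household #2": {"bill_amount": 650, "automobile_type": "Coupe"},
}

# Column-oriented: one entry per field, so all values of a field are
# read together in a single lookup.
column_store = {
    "entity": ["Household #1", "Household #2"],
    "bill_amount": [800, 650],
    "automobile_type": ["Sedan", "Coupe"],
}


def rows_to_columns(rows):
    """Convert the row layout to the column layout.

    Assumes every row carries the same set of fields.
    """
    columns = {"entity": []}
    for entity, fields in rows.items():
        columns["entity"].append(entity)
        for name, value in fields.items():
            columns.setdefault(name, []).append(value)
    return columns


print(row_store["Household #1"])    # everything about one entity
print(column_store["bill_amount"])  # one field across all entities
```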
[0038] In some embodiments the computer system can access the one
or more data sources periodically (e.g., once a week, once a month,
etc.). In other embodiments, the computer system can access the one
or more data sources based on the one or more data sources being
updated (e.g., a new entry, such as payment bill amount, is added
to a table). In some embodiments, the computer system can access
the one or more data sources responsive to an input received from
the user. In some embodiments, the user input can specifically
identify the plurality of entities (e.g., Household #1-#10,000) for
use in generating the model. In some embodiments, the user input
can identify a category or class of entities. For example, the user
input can identify a class of entities that are all consumers of a
specified provisioning entity (e.g., insurance company), the user
input can identify entities that are located within a specified
geographic region (e.g., all households within the state of
Illinois), or the user input can identify any other category of
entities (e.g., all households with an income over $100,000). In
response to a user input, the computer system can access the one or
more data sources including information associated with the
plurality of entities.
[0039] At step 320, the computer system can form a plurality of
records including information from the one or more data sources
associated with the plurality of entities, each record being
associated with an entity. In some embodiments, a record of the
plurality of records can be formed by integrating information from
the one or more data sources that is associated with an
entity of the plurality of entities. The record can contain a
multitude of information related to the entity. For example, the
record can contain all information from the one or more data
sources associated with a household (e.g., number of members in
household, number of automobiles, income, monthly bill amounts for
each automobile, etc.). In some embodiments, the record can be
stored as a cogroup (e.g., the cogroup shown in FIG. 4). In some
embodiments, the record can be stored in either a row-oriented
database or a column-oriented database. For example, a row in a
record can be associated with a data source (e.g., bill payment
amount) and data in the row can be stored serially such that data
associated with that data source can be accessed in one
operation.
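A minimal sketch of forming one record per entity, with one row per data source, might look like the following. The input shape (a mapping from source name to a list of entity/value rows) and the function name `form_records` are assumptions chosen for illustration, not the structure used by the disclosure.

```python
from collections import defaultdict


def form_records(sources):
    """Group rows from several data sources by entity.

    `sources` maps a source name to a list of (entity, value) rows.
    The result maps each entity to one record with one row per source.
    """
    records = defaultdict(lambda: {name: [] for name in sources})
    for name, rows in sources.items():
        for entity, value in rows:
            records[entity][name].append(value)
    return dict(records)


sources = {
    "bill_amount": [("Household #1", 800), ("Household #2", 450),
                    ("Household #1", 600)],
    "number_of_cars": [("Household #1", 3), ("Household #2", 1)],
}
records = form_records(sources)
print(records["Household #1"])
```

Because each row of a record holds the values from a single source, all values from that source can be read with one lookup, in line with the row-oriented example above.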
[0040] At step 330, the computer system can filter the plurality of
records for information associated with the specified action. For
example, the specified action can be churn (e.g., cancellation or
non-renewal of a subscription) and the computer system can filter
the record for information related to churn. In some embodiments,
the computer system can provide context for (e.g., frame) the
specified action. In some embodiments, the computer system can
determine whether the specified action will occur within a
specified temporal period (e.g., one month). The computer system
can filter out all information associated with a time that is
outside (e.g., before or after) the specified temporal period. In
some embodiments, the computer system can determine the propensity
for the specified action based on only recent information. For
example, the computer system can filter out information associated
with a time before the specified temporal period (e.g., stale or
less relevant information). In some embodiments, each record can be
filtered in a slightly different way. A record can be filtered
according to a user input specifying an activity or temporal
period. In some embodiments, the record can be filtered
automatically based on a presetting (e.g., the computer can be
configured to filter out all information that is more than one year
old).
[0041] The computer system can frame the record by associating a
label with the record. In some embodiments, the label can represent
whether the entity took the specified action within the specified
temporal period. For example, the computer system can associate a
label of "1" or "true" if the entity took the specified action
within the specified temporal period. By way of example, in the
context of the cancellation of a subscription, the computer system
can keep data from time period A to B (e.g., the specified temporal
period) and determine whether the entity cancelled the subscription
within a second time period, T. In this example, if the entity
cancelled the subscription in time period T, the computer system
can associate a label with the record indicating that the entity
took the specified action.
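The filtering and labelling described above can be sketched as follows. The sketch assumes that the second time period T begins immediately after B and ends at a hypothetical `horizon_end`; the disclosure does not state how T relates to the period from A to B, and all dates and amounts below are illustrative.

```python
from datetime import date


def frame_record(events, cancel_date, start, end, horizon_end):
    """Keep events dated in [start, end] and label the record.

    The label is 1 if the cancellation date falls after `end` and on or
    before `horizon_end` (the second period, T), otherwise 0.
    """
    kept = [(day, value) for day, value in events if start <= day <= end]
    took_action = cancel_date is not None and end < cancel_date <= horizon_end
    return kept, int(took_action)


events = [(date(2013, 11, 1), 700), (date(2014, 1, 1), 800),
          (date(2014, 2, 1), 600), (date(2014, 3, 1), 600)]
kept, label = frame_record(
    events,
    cancel_date=date(2014, 3, 20),
    start=date(2014, 1, 1),         # A
    end=date(2014, 2, 28),          # B
    horizon_end=date(2014, 3, 31),  # assumed end of T
)
print(kept, label)
```

Entries outside the period from A to B are dropped before features are generated, so the label is never computed from the same information the model is allowed to see.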
[0042] At step 340, the computer system can create, for each
record, a labelled example by generating one or more features
associated with an entity of the plurality of entities. A feature
can be any discernable way of sorting or classifying the record
(e.g., average value, most recent value, most common value, etc.).
In some embodiments, the computer system can generate key value
pairs, wherein each key value pair contains a feature and a value.
For example, the computer system can generate features such as
"average bill payment amount", "average income", "average number of
automobiles", etc. and corresponding values such as "$670", "$73K",
"2.3 cars", etc. In some embodiments, features can be associated
with a time value. For example, the computer system can generate
features for a specified temporal period (e.g., features can be
based only on the most recent values). Feature values can be
represented as a continuous value (e.g., $670), as a categorical
value (e.g., "Sedan" or "Coupe"), as a textual value, or as any
other type of value. In some embodiments, feature values can be
classified as weighted values. For example, a household income of
$73,000 can be represented as a weighted value of {0.27 0}, {0.73
100000}. In some embodiments, the labelled example can include the
key value feature pairs and the record label (e.g., whether the
entity took the specified action).
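As a rough sketch, a labelled example could be assembled from a record as shown below. The record layout (source name mapped to a non-empty list of values, oldest first), the feature naming scheme, and the function name are assumptions for illustration; the input values are hypothetical.

```python
from collections import Counter
from statistics import mean


def make_labelled_example(record, label):
    """Turn a record (source name -> non-empty list of values, oldest
    first) into key value feature pairs plus the record label."""
    features = {}
    for name, values in record.items():
        if all(isinstance(v, (int, float)) for v in values):
            features[f"average {name}"] = mean(values)
        else:
            # Most common value for categorical entries.
            features[f"most common {name}"] = Counter(values).most_common(1)[0][0]
        features[f"most recent {name}"] = values[-1]
    return {"features": features, "label": label}


record = {
    "bill payment amount": [800, 600, 600],
    "automobile type": ["Sedan", "Coupe", "Sedan"],
}
example = make_labelled_example(record, label=1)
print(example)
```

With these inputs the average bill payment amount is 2000/3, which rounds to the $670 figure used in the example values above.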
[0043] At step 350, the computer system can select a subset of the
plurality of labelled examples to train a model. In some
embodiments, the subset can be created by randomly sampling the
plurality of labelled examples. A random sample can allow for
broader generalization of the model created at step 360. In some
embodiments, the user can select the subset of labelled examples.
For example, the user can select all entities with a particular
feature (e.g., all households with at least 2 cars). In some
embodiments, the subset can be created by sampling labelled
examples with a wide range of values for features that are known to
be more important (e.g., change in income).
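Two of the selection strategies above, random sampling and user-driven selection by feature, can be sketched as follows. The 80/20 split, the fixed seed, and the function names are assumptions for illustration; the remainder of the random split is kept as a holdout set of the kind used for evaluation at step 360.

```python
import random


def split_examples(examples, train_fraction=0.8, seed=0):
    """Randomly sample a training subset; the remainder is a holdout set."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train = [examples[i] for i in indices[:cut]]
    holdout = [examples[i] for i in indices[cut:]]
    return train, holdout


def select_by_feature(examples, name, minimum):
    """User-driven subset: keep examples whose feature is at least `minimum`."""
    return [e for e in examples if e["features"].get(name, 0) >= minimum]


examples = [{"features": {"number of cars": n % 4}, "label": n % 2}
            for n in range(10)]
train, holdout = split_examples(examples)
two_or_more_cars = select_by_feature(examples, "number of cars", 2)
print(len(train), len(holdout), len(two_or_more_cars))
```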
[0044] At step 360, the computer system can train a model using the
subset of labelled examples. For example, the model can be trained
by generalizing a function that maps inputs (e.g., the one or more
features) to outputs (e.g., the label, such as whether the
specified action occurred). In some embodiments, the model can
perform regressions for each feature simultaneously. In some
embodiments, the model can be trained by a hyperparameter
optimization algorithm. In some embodiments, the hyperparameter
optimization algorithm can perform a grid search through a
hyperparameter space for the optimal hyperparameters. In some
embodiments, the hyperparameter optimization algorithm can perform a random
search through the hyperparameter space. The computer system can
evaluate the hyperparameters against a holdout set of labelled
examples. For example, the computer system can apply the model
trained by hyperparameter optimization to the holdout set. In some
embodiments, the computer system can retrain the model with
different hyperparameters if a particular attribute (e.g.,
accuracy, area under the curve, log-likelihood, F1-score, Top N,
etc.) of the model does not exceed a predetermined threshold. In
some embodiments, the computer system can continue to retrain the
model until it obtains hyperparameters that exceed the threshold
value. In some embodiments, the computer system can train the model
a predetermined number of times (e.g., 10). The computer system can
evaluate the trained models against a holdout set and select the
model with the most favorable attributes (e.g., accuracy, area
under the curve, log-likelihood, F1-score, Top N, etc.).
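The grid search and holdout evaluation described above can be illustrated with a deliberately small example. The disclosure does not specify a model type; the sketch below swaps in a one-feature logistic regression trained by gradient descent, and treats the learning rate and an L2 penalty as the hyperparameters. The toy data, the grid values, the accuracy metric, and the threshold are all assumptions.

```python
import itertools
import math


def train(examples, learning_rate, l2, epochs=200):
    """Fit a one-feature logistic model (weight, bias) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in examples:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= learning_rate * (grad_w / n + l2 * w)
        b -= learning_rate * (grad_b / n)
    return w, b


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    w, b = model
    hits = sum((w * x + b > 0) == bool(y) for x, y in examples)
    return hits / len(examples)


def grid_search(train_set, holdout, grid, threshold):
    """Try every hyperparameter combination and keep the best on holdout."""
    best = None
    for learning_rate, l2 in itertools.product(grid["learning_rate"], grid["l2"]):
        model = train(train_set, learning_rate, l2)
        score = accuracy(model, holdout)
        if best is None or score > best["score"]:
            best = {"learning_rate": learning_rate, "l2": l2,
                    "model": model, "score": score}
    best["meets_threshold"] = best["score"] > threshold
    return best


# Toy data: x is a single feature (e.g., a scaled change in income).
train_set = [(-3, 0), (-2, 0), (-1, 0), (1, 1), (2, 1), (3, 1)]
holdout = [(-2.5, 0), (-0.5, 0), (0.5, 1), (2.5, 1)]
grid = {"learning_rate": [0.01, 0.1, 1.0], "l2": [0.0, 0.1]}
best = grid_search(train_set, holdout, grid, threshold=0.9)
print(best["learning_rate"], best["l2"], best["score"])
```

A random search would replace the exhaustive `itertools.product` loop with a fixed number of randomly drawn combinations; the holdout evaluation and threshold check would stay the same. On data this easy every combination separates the holdout set, so the sketch shows the control flow rather than a meaningful comparison between hyperparameters.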
[0045] At step 370, the computer system can output the model. In
some embodiments, the model can be outputted to a user for future
use. For example, a user can use the model to determine the
propensity of an entity to take a specified action. In other
embodiments, the computer system can output the model to be stored
locally or to be transmitted to an external database. In some
embodiments, the computer system can output the model for use in
another method, such as the method described in FIG. 2, to
determine the propensity of an entity to take a specified action.
In some embodiments, the computer system can output confidence
levels for the model. For example, the computer system can output
the particular attribute (e.g., accuracy, area under the curve,
log-likelihood, F1-score, Top N, etc.) of the model with respect to
the examples in the holdout set.
[0046] FIG. 4 provides an exemplary use case scenario for
determining a propensity of an entity to take a specified action
applied to an exemplary data structure. While the flowchart
discloses the following steps in a particular order, it is
appreciated that at least some of the steps can be moved, modified,
or deleted where appropriate, consistent with embodiments of the
present disclosure. In some embodiments, the use case scenario
shown in FIG. 4 can be performed by a computer system (e.g.,
computer system 100). It is appreciated that some of these steps
can be performed in full or in part by other systems.
[0047] Referring to FIG. 4, one or more data tables 410 acquired
from one or more data sources can include information associated
with the entity. The one or more data tables 410 can be stored
locally at the computer system and/or at one or more remote servers
(e.g., such as a remote database), or at one or more other remote
devices. In some embodiments, the information in the data tables
can be stored in one or more multidimensional tables. By way of
example, as shown in FIG. 4, information of a first type (e.g.,
bill payment amount) associated with the entity (e.g., a
household) can be stored in a first multidimensional table 410 and
information of a second type (e.g., income or number of cars)
associated with the entity can be stored in a second
multidimensional table 410. In some embodiments a table can contain
information associated with a single entity. For example, Bill
Amount table 410 shows the most recent bill payment amounts
associated with the entity in this exemplary scenario. In other
embodiments (not shown), a table can store information associated
with a plurality of entities. For example, each row in the table
can correspond to a different entity (e.g., Household #1, Household
#2, etc.) and each column in the table can correspond to a payment
amount. In some embodiments, the information stored in the table
can include entries associated with a temporal period. For example,
a table can store a bill payment date for each bill payment amount.
As shown in FIG. 4, Bill Amount table 410 can store dates in the
first column (e.g., 1/1/14, 2/1/14, and 3/1/14). Each bill payment
date can be associated with the bill payment amount. For example,
Bill Amount table 410 shows that an amount of $800 was billed to
the household on Jan. 1, 2014. The information can be stored as a
continuous value (e.g., $800 as a bill payment amount), as a
categorical value (e.g., "Sedan" or "Coupe" as an automobile
type), as a textual value, or as any other type of value. In some
embodiments, a table can be stored in either a row-oriented
database or a column-oriented database. For example, a row in a
row-oriented table can contain information associated with an
entity (e.g., Household #1) and data in the row can be stored
serially such that information associated with the entity can be
accessed in one operation.
[0048] The computer system can form (420) a record 430 including
some or all information from the one or more data sources
associated with the entity. In some embodiments, record 430 can be
formed (420) by integrating the information from the one or more
data sources that is associated with the entity. Record 430 can
contain a multitude of information related to the entity. For
example, record 430 can contain all information from the one or
more data sources associated with a household (e.g., number of
members in household, number of automobiles, income, monthly bill
amounts for each automobile, etc.). In some embodiments, record 430
can be stored as a cogroup with each row of the cogroup associated
with a different category of information. In some embodiments,
record 430 can be stored in either a row-oriented database or a
column-oriented database. For example, a row in a row-oriented
record can be associated with a data source (e.g., bill payment
amount) and data in the row can be stored serially such that data
associated with that data source can be accessed in one operation.
As shown in FIG. 4, the "Bill Amount" is stored as a row in record
430. Bill amounts $800, $600, and $600 can be stored serially such
that all of the payment amounts can be accessed in one operation.
Similarly, "Income" and "Number of Cars" are stored in separate
rows in record 430, and information from these sources (e.g., {$80K,
$70K, $70K} and {3, 2, 2}) can also be accessed in one
operation.
[0049] In some embodiments, the computer system can filter record
430 for information associated with the specified action (not
shown). For example, the specified action can be churn (e.g.,
cancellation of a subscription) and the computer system can filter
record 430 for information related to churn. In some embodiments,
the computer system can provide context for the specified action.
In some embodiments, the computer system can determine whether the
specified action will occur within a specified temporal period
(e.g., one month). The computer system can filter out all
information associated with a time that is outside (e.g., before or
after) the specified temporal period. In some embodiments, the
computer system can determine the propensity for the specified
action based on only recent events. For example, the computer
system can filter out information associated with a time before the
specified time period (e.g., stale or less relevant information).
In some embodiments, each record can be filtered in a slightly
different way. Record 430 can be filtered according to a user input
specifying an activity or temporal period. In some embodiments,
record 430 can be filtered automatically based on a presetting
(e.g., the computer can be configured to filter out all information
that is more than one year old). For example, the computer system
can determine the propensity of the entity to take the specified
action based on only data from the previous month. In the example
shown in FIG. 4, the computer system can filter out the older
entries of Bill Amount table 410 (e.g., Bill Amounts of $800 and
$600 corresponding to bill dates in January and February). The
computer system can also filter out similar entries in Income and
Number of Cars tables 410 (e.g., incomes of $80K and $70K and 3 and
2 number of cars). Thus, the computer system can use only the most
recent entries to determine the propensity of the household to take
the specified action (e.g., $600 in Bill Amount table 410, $70K in
Income table 410, and 2 in Number of Cars table 410).
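The filtering in this example can be sketched with the values given for FIG. 4. The bill dates come from the description above; applying the same three dates to the Income and Number of Cars rows is an assumption, as are the in-memory layout and the function name `keep_recent`.

```python
from datetime import date


def keep_recent(record, cutoff):
    """Drop every entry dated before `cutoff` from each row of the record."""
    return {
        name: [(day, value) for day, value in entries if day >= cutoff]
        for name, entries in record.items()
    }


# Dates for Income and Number of Cars are assumed to match the bill dates.
dates = [date(2014, 1, 1), date(2014, 2, 1), date(2014, 3, 1)]
record_430 = {
    "Bill Amount": list(zip(dates, [800, 600, 600])),
    "Income": list(zip(dates, [80000, 70000, 70000])),
    "Number of Cars": list(zip(dates, [3, 2, 2])),
}
recent = keep_recent(record_430, cutoff=date(2014, 3, 1))
latest = {name: entries[-1][1] for name, entries in recent.items()}
print(latest)
```

With a cutoff of March 1, 2014, only the most recent entry in each row survives, matching the $600, $70K, and 2 values listed above.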
[0050] The computer system can generate (440), based on record 430,
one or more features 450 associated with the entity. A feature can
be any discernable way of sorting or classifying the record (e.g.,
average value, most recent value, most common value, etc.). In some
embodiments, the computer system can generate key value pairs,
wherein each key value pair contains a feature and a value. For
example, the computer system can generate one or more features 450
such as "average bill payment amount", "average income", "average
number of automobiles", etc. and corresponding values such as
"$670", "$73K", "2.3 cars", etc. In some embodiments, the one or
more features 450 can be associated with a time value. For example,
the computer system can generate features for a specified temporal
period (e.g., features can be based only on the most recent
values). Feature values can be represented as a continuous value
(e.g., $670), as a categorical value (e.g., "Sedan" or "Coupe"), as
a textual value, or as any other type of value. In some
embodiments, the one or more features 450 can be classified as
weighted values. For example, a household income of $73,000 can be
represented as a weighted value of {0.27 0}, {0.73
100000}.
[0051] In some embodiments, the one or more features can be
extrapolated from the information contained in the record. For
example, a feature can be that the entity deactivated online
payments (e.g., customer deactivated EFT payment on 2/20). In some
embodiments, the one or more features can be related to
communications between the providing entity (e.g., insurance
provider) and consuming entity (e.g., household). For example,
computer system 100 can analyze (e.g., tokenize) the transcript of
a call between an agent and a household and assign a topical value
to that call (e.g., "topic 5" corresponding to anger). Computer
system 100 can store this information as a feature pair (not
shown), such as the pair {"Service Call Topic" "5"}. In some
embodiments, the one or more features can be related to whether the
household took a specified action (e.g., filed a claim or called to
change policy).
[0052] In some embodiments, the computer system can process (460)
the one or more features 450 to determine the propensity 470 of the
entity to take the specified action. In some embodiments, the
propensity 470 can be determined by applying a trained model, such
as the model described in greater detail in FIG. 3. The input to
the model can be key value pairs of the one or more features 450
associated with the entity and the specified actions and the output
of the model can be the propensity 470 of the entity to take the
specified action. In some embodiments, processing the one or more
features associated with the entity can result in a multitude of
useful insights regarding the features that influence the
propensity of the entity to take the specified action. Such
insights can include, for example, the features that are most
influential on the propensity of the entity to take the specified
action (e.g., change in income, etc.).
[0053] In some embodiments, the computer system can output the
propensity 470. In some embodiments, the computer system can output
the propensity 470 as a continuous value, such as a number or
percentage (e.g., 80 or 80%) or as a categorical value (e.g.,
"low", "medium", or "high"). In some embodiments, the computer
system can generate a user interface, such as the user interfaces
described in greater detail in FIGS. 5 and 6 for displaying the
propensity 470.
[0054] FIG. 5 illustrates an exemplary user interface 500 provided
by a computer system (e.g., computer system 100) for display (e.g.,
display 112), in accordance with some embodiments. User interface
500 can include a plurality of tiles (e.g., tile 510), each tile
representing an entity (e.g., a household). In some embodiments,
tiles can be arranged according to the propensity of the entity to
take the specified action. For example, entities that are more
likely to take the specified action can be located near the top of
the display, whereas entities that are less likely to take the
specified action can be lower on the display. As shown in FIG. 5,
in some embodiments, the tiles can be arranged by date (e.g., date
520). For example, entities with the most recent activities can be
located near the top of the display. By way of example, tile 510
with the most recent date 520 of Feb. 21, 2014 is located in the
top left corner of the display. The tile to the right of tile 510
has the next most recent date (e.g., Feb. 20, 2014). Subsequent
tiles have dates that are less recent. In other embodiments,
entities with the longest pending outstanding action can be located
near the top of the screen.
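The two tile orderings described above amount to sorting the same list by different keys. The following sketch assumes a simple dictionary per tile and an assumed ranking of the categorical labels; the household names and the "High" and "Low" entries are hypothetical.

```python
from datetime import date

RANK = {"High": 2, "Med": 1, "Low": 0}  # assumed ordering of the labels

tiles = [
    {"entity": "Household A", "propensity": "Med", "date": date(2014, 2, 21)},
    {"entity": "Household B", "propensity": "High", "date": date(2014, 2, 20)},
    {"entity": "Household C", "propensity": "Low", "date": date(2014, 2, 18)},
]


def by_propensity(tiles):
    """Entities most likely to take the specified action come first."""
    return sorted(tiles, key=lambda t: RANK[t["propensity"]], reverse=True)


def by_date(tiles):
    """Entities with the most recent date come first."""
    return sorted(tiles, key=lambda t: t["date"], reverse=True)


print([t["entity"] for t in by_propensity(tiles)])
print([t["entity"] for t in by_date(tiles)])
```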
[0055] In some embodiments, user interface 500 can be updated
periodically (e.g., once a day, once a week, once a month, etc.).
In other embodiments, user interface 500 can be updated when
information associated with any of the entities stored in the one
or more data sources is updated (e.g., a new entry, such as payment
bill amount, is added to a table). In some embodiments, user
interface 500 can update in response to an input received from the
user.
[0056] User interface 500 can automatically determine the entities
for which to generate the display. In some embodiments, user
interface 500 can display entities associated with a particular
user (e.g., John Smith, Triage Agent) once the user accesses user
interface 500. In some embodiments, the user can specifically
identify the entities for which to generate the display. In some
embodiments, the user can identify a category or class of entities
for which to generate the display. For example, the user can
identify a class of entities that are all consumers of a specified
provisioning entity (e.g., insurance company), the user input can
identify entities that are located within a specified geographic
region (e.g., all households within the state of Illinois), or the
user input can identify any other category of entities (e.g., all
households with an income over $100,000).
[0057] In some embodiments, user interface 500 can portray a date
520 (e.g., Feb. 21, 2014) associated with the entity in tile 510.
Date 520 can correspond to the current date, the date that method
200 was last performed for that entity, the date that information
in the one or more data sources associated with that entity was
last updated, or the date that the user last viewed the tile
associated with the entity. In some embodiments, user interface 500
can portray a propensity 540 of the entity to take the specified
action (e.g., "Med") in tile 510. For example, as shown in FIG. 5,
user interface 500 can portray the propensity as a categorical
value, such as "Med" in tile 510. In some embodiments, user
interface 500 can portray tile 510 in a color (e.g., green for
"low", red for "high", etc.) representing the propensity. In some
embodiments, user interface 500 can portray the propensity in tile
510 as a numerical value or as a percentage.
[0058] User interface 500 can portray recent activity 530 in tile
510. In some embodiments, the recent activity 530 can be entered by
a user. By way of example, a recent activity could be that an
"Agent called customer on 2/21 regarding discounts" as shown in
tile 510. In some embodiments, user interface 500 can generate the
recent activity based on the one or more features associated with
the entity. For example, user interface 500 can display, "Customer
registered an additional luxury vehicle on 2/18" in tile 510
responsive to this information being updated in the record
associated with the entity. In some embodiments, tile 510 can
portray important features 540 associated with the entity. For
example, as shown in tile 510 of FIG. 5, these features can be
"vehicle", "discounts", etc. In some embodiments, user interface
500 can recommend an action for the user to take (e.g., service
call). In some embodiments, this recommendation can relate to the
recent activity 530. A user can use this information to take
preemptive action to prevent the entity from taking the specified
action. By way of example, if the propensity of churning for a household
subscribing to an automobile insurance policy was high, the user
could take remedial action (e.g., lower rate, contact customer to
address customer concerns, etc.). In some embodiments user
interface 500 can display a number uniquely identifying the entity
(e.g., a policy number).
[0059] In some embodiments, user interface 500 can allow a user to
click on tile 510 to access additional information associated with
the entity. For example, a user can access user interface 600 shown
in FIG. 6 below by clicking on one of the tiles shown in user
interface 500 of FIG. 5. In some embodiments, user interface 600
can be inlaid over user interface 500. In some embodiments, user
interface 600 can be a distinct user interface.
[0060] User interface 500 can also allow access to additional user
interfaces (not shown) through the "INBOX," "FLAGGED," and "STATS"
links shown at the top of user interface 500. The "INBOX" user
interface can display messages between the user and other agents to
track the remedial actions that were taken. The INBOX user
interface can also be used to notify users of households with a
higher likelihood of cancelling the subscription. The "FLAGGED"
user interface can show customers (e.g., households) that the user
believed were at risk for taking the specified action. For example,
the FLAGGED user interface can contain a list of the households
most likely to cancel their insurance policy. In some embodiments,
these households can be selected manually by the user. In some
embodiments, these households can be automatically populated if the
propensity exceeds a predetermined threshold (e.g., the FLAGGED
interface can be populated with all households with a "High"
propensity). The FLAGGED user interface can allow the user to track
remediation steps (e.g., contacting the household, changing policy,
etc.). Households can remain in the FLAGGED user interface until
their risk of taking the specified action has declined, the user
has decided that the household is no longer at risk, or the
specified action occurred (e.g., the household cancelled its
subscription). The "STATS" interface can display metrics such as,
for example, the rate at which the user was able to prevent the
specified action from occurring categorized by action taken and the
most common and/or trending issues.
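As an illustration, automatic population of the FLAGGED interface and one of the STATS metrics could be computed as sketched below. The data shapes, the function names, and the sample outcomes are hypothetical; the disclosure does not describe how these interfaces are implemented.

```python
def auto_flag(propensities, flag_level="High"):
    """Return the households whose propensity equals the flag level."""
    return sorted(h for h, level in propensities.items() if level == flag_level)


def prevention_rate_by_action(outcomes):
    """For each remedial action, the share of cases in which the specified
    action (e.g., cancellation) did not occur."""
    totals, prevented = {}, {}
    for action, was_prevented in outcomes:
        totals[action] = totals.get(action, 0) + 1
        prevented[action] = prevented.get(action, 0) + int(was_prevented)
    return {action: prevented[action] / totals[action] for action in totals}


flagged = auto_flag({"Household #1": "High", "Household #2": "Med",
                     "Household #3": "High"})
stats = prevention_rate_by_action([
    ("lower rate", True), ("lower rate", False),
    ("contact customer", True), ("contact customer", True),
])
print(flagged, stats)
```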
[0061] FIG. 6 illustrates another exemplary user interface 600
provided by the computer system (e.g., computer system 100) for
display (e.g., display 112) in accordance with some embodiments. In
some embodiments, user interface 600 can be accessed by clicking on
a tile (e.g., entity) in user interface 500. User interface 600 can
portray a date 610 (e.g., Feb. 18, 2014) associated with the
entity. Date 610 can correspond to the current date, the date that
method 200 was last performed for that entity, the date that
information in the one or more data sources associated with that
entity was last updated, or the date that the user last viewed the
tile associated with the entity. In some embodiments, user
interface 600 can portray a propensity 620 of the entity to take
the specified action. For example, as shown in FIG. 6, user
interface 600 can portray propensity 620 as a categorical value,
such as "Med.". In some embodiments, user interface 600 can convey
propensity 620 by shading the top bar in a different color (e.g.,
green for "low", red for "high", etc.) representing propensity 620.
In some embodiments, user interface 600 can portray propensity 620
as a numerical value or as a percentage. In some embodiments, user
interface 600 can display the entity status 630 (e.g., "Active" if
the household is currently subscribing to a policy).
[0062] In some embodiments, user interface 600 can display recent
activities 640 associated with the entity. For example, as shown in
FIG. 6, user interface 600 can display that the "customer
registered an additional luxury vehicle on 2/18". User interface
600 can recommend an action 650 for the user to take (e.g., service
call). In some embodiments, this recommendation 650 can relate to
the recent activity.
[0063] User interface 600 can provide the user with additional
information associated with the entity. As shown in the bottom left
panel of FIG. 6, user interface 600 can display basic biographic
information 660 for the entity. In the automobile insurance
context, for example, user interface 600 can display the policy
number (e.g., 34726182), the entity name (e.g., household/owner of
the policy, David Stark), the policy coverage start date (e.g.,
12/12/2004), any secondary owners associated with the policy (e.g.,
James Watson), information associated with the insured automobile
(e.g., 2013 Cadillac Escalade), and the type of insurance policy
(e.g., Standard).
[0064] In some embodiments, user interface 600 can also display
information for an agent 670 associated with the entity. For
example, the user interface 600 can display the name (e.g., Bruce
Atherton) and contact information (e.g., 583 234-9172) of the
agent. A user can use this information to take preemptive action to
prevent the entity from taking the specified action. By way of
example, if the propensity of churning for a household subscribing
to an automobile insurance policy was high, the user could contact
the agent to take remedial action (e.g., lower rate, address
customer concerns, etc.).
[0065] In some embodiments, the right panel of FIG. 6 can display
recent events 680 associated with the entity. For example, user
interface 600 can display whether the entity status is active
(e.g., whether the entity is currently subscribing to a policy) or
whether the agent has taken any actions (e.g., called the household
or subscriber). In some embodiments, user interface 600 can also
allow the user and agent to converse in the right panel. For
example, the user can click on the "ADD AN UPDATE" button 690 to
remind the agent to contact the entity. The user interface can
display responsive comments 680 from the agent and the agent can
add any actions taken 680 (e.g., calling the household).
[0066] Embodiments of the present disclosure have been described
herein with reference to numerous specific details that can vary
from implementation to implementation. Certain adaptations and
modifications of the described embodiments can be made. Other
embodiments can be apparent to those skilled in the art from
consideration of the specification and practice of the embodiments
disclosed herein. It is intended that the specification and
examples be considered as exemplary only, with a true scope and
spirit of the present disclosure being indicated by the following
claims. It is also intended that the sequence of steps shown in the
figures is only for illustrative purposes and is not intended to
be limited to any particular sequence of steps. As such, it is
appreciated that these steps can be performed in a different order
while implementing the exemplary methods or processes disclosed
herein.
* * * * *