U.S. patent application number 10/154427 was filed with the patent office on 2002-05-21 and published on 2003-05-22 for system and method for analyzing data.
This patent application is currently assigned to Gordonomics Ltd. Invention is credited to Gordon, Goren.
Application Number | 20030097294 10/154427 |
Family ID | 11075869 |
Filed Date | 2002-05-21 |
United States Patent Application | 20030097294 |
Kind Code | A1 |
Gordon, Goren | May 22, 2003 |
System and method for analyzing data
Abstract
The present invention provides a method and system for analyzing
data by characterizing incoming data. The incoming data is analyzed
and processed to provide scores. The resulting scores are further
processed, together with the incoming data record and an internal
database catalog, to provide a final score that is the final
diagnosis of the method and system of the present invention. In a
preferred embodiment of the present invention the incoming data are
information units regarding customer behavior that are analyzed to
characterize customers for retention purposes.
Inventors: | Gordon, Goren; (Rishon Le-Zion, IL) |
Correspondence Address: | LYON & LYON LLP, 633 WEST FIFTH STREET, SUITE 4700, LOS ANGELES, CA 90071, US |
Assignee: | Gordonomics Ltd. |
Family ID: | 11075869 |
Appl. No.: | 10/154427 |
Filed: | May 21, 2002 |
Current U.S. Class: | 705/7.29; 705/1.1; 707/E17.058 |
Current CPC Class: | G06Q 40/00 20130101; G06Q 30/0201 20130101; G06F 16/30 20190101 |
Class at Publication: | 705/10; 705/1 |
International Class: | G06F 017/60 |
Foreign Application Data
Date | Code | Application Number |
Nov 21, 2001 | WO | PCT/IL01/01074 |
Nov 20, 2001 | IL | 146597 |
Claims
What is claimed is:
1. In a computing environment accommodating at least one input
device connectable to at least one server device connectable to at
least one output device, a method of processing at least one
information unit introduced by the at least one input device by the
at least one server device to create at least one information score
based on the at least one information unit, the method comprising
the steps of: creating at least one complexity catalog based on the
at least one information unit; and establishing at least one score
unit based on the at least one complexity catalog; and establishing
scores for analyzing data.
2. The method of claim 1 further comprising the steps of: obtaining
at least one information unit from the at least one input device by
the at least one server device; and displaying at least one scoring
unit.
3. The method of claim 1 wherein the information unit contains
information about customer behavior.
4. In a computing environment accommodating at least one input
device connected to at least one server device having at least one
output device, a system for processing at least one information
unit introduced via the at least one input device by the at least
one server device to process at least one information unit based,
the system comprising the elements of: an infrastructure server
device to create at least one complexity catalog; and a complexity
catalog to hold at least one list of ordered complexity values
associated with the partitioned sub-unit blocks; and an application
server to build at least one information summary unit based on the
at least one information unit and on at least one associated
complexity catalog; and a scoring component to provide scores.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from PCT Application No.
PCT/IL01/01074, filed Nov. 21, 2001, and Israeli Patent Application
No. 146597, filed Nov. 20, 2001, each of which is hereby
incorporated by reference as if fully set forth herein.
BACKGROUND OF THE INVENTION
[0002] The present invention generally relates to a system and
method for detecting and analyzing information units. More
specifically, the present invention relates to the detection and
analysis of data records by selecting particular aspects of the
data record.
[0003] Computerized systematic data analysis of information units
is performed today mainly by using decision trees. Decision trees
within the computerized systems utilize specifically designated
data stored in system databases to categorize information units
received as input data. The specifically designated data used for
categorizing the information units is based on assumptions such as
statistics or specific requirements of the system. One example of
utilizing such systems is the analysis performed for fraud
detection in credit card transactions. The fraud detection analysis
system can detect anomalous transactions according to designated
data. A transaction associated with a purchase for a sum that
substantially exceeds the "normal" designated sum will generate an
alert or a warning, or will provide suitable instructions to
supervising routines of the system or to a specific user.
Nevertheless, the designated stored control data utilized for the
generation of the indication for fraudulent transactions has
limitations. The source of these limitations is the inaccuracy that
derives from the inherent nature of such data, which attempts to
predict the future with knowledge gained in the past, and from the
difficulty of characterizing a credit card holder's "normal"
behavior. This difficulty originates with a wide variety of factors
that influence a person's behavior, such as religion, seasons of
the year, family status, ethnic origin, and the like. Another
difficulty in providing accurate information concerns credit card
holders who do not have a simple pattern of transaction
performance. U.S. Pat.
No. 5,819,226 discloses a prior art system known in the field of
fraudulent behavior detection. The patent provides an automated
system and method for detecting fraudulent transactions using the
neural network method as a predictive model. The neural network
model "learns" a pattern that it can later identify. The learning
process is based on a given number of iterations executed by the
neural network based detection system. However, the ability of a
fraud detection system based upon a neural network is substantially
limited, and such a system all too often provides a false diagnosis
of transactions as fraudulent. The principal reason for the false
diagnosis is related to the manner in which the neural network
method operates. The neural network method's ability within a fraud
detection system is limited because it learns the pattern of a
single customer, credit card holder, or group of customers,
together with their fraudulent behavior pattern, and produces a
score based on the "learned" patterns. Consequently, the neural
network produces a large number of false recognitions, such as
identifying non-fraudulent credit card transactions as fraudulent.
The limitations of neural networks are due to their inability to
deal with "trouble making" customers who have an erratic behavior
pattern and do not have a simple pattern.
[0004] Therefore, there is an urgent need to introduce a method and
system that will provide accurate information regarding the input
information units. There is also an urgent need for improved
customer retention applications. The importance of customer
retention applications within modern commerce is considerable.
Business executives as well as commercial retail outlet owners are
keenly aware that in order to retain their customers a
substantially automatic learning process must be performed. There
is a need for a method and system that will provide useful
information from data received into a system that is operative in
processing and analyzing customer behavior, and that will provide
the characteristics of customers concerning their dealings with
specific businesses. The system and method proposed by the present
invention provide information related to customer behavior by
concurrently analyzing various fields within information units
containing diverse types of data.
SUMMARY OF THE INVENTION
[0005] The present invention relates to a computing environment
accommodating at least one input device connectable to at least one
server device connectable to at least one output device, a method
of processing at least one information unit introduced by the at
least one input device by the at least one server device to create
at least one information score based on the at least one
information unit, the method comprising the steps of: creating at
least one complexity catalog based on the at least one information
unit, and establishing at least one score unit based on the at
least one complexity catalog, and establishing scores for
assessment.
[0006] The above-mentioned method further comprises the steps of:
obtaining at least one information unit from the at least one input
device by the at least one server device, and displaying the at
least one scoring unit. The information unit processed within the
present invention can include data about the behavior of an
individual, such as a customer, or of a group of customers.
[0007] The present invention also provides, in a computing
environment accommodating at least one input device connected to at
least one server device having at least one output device, a system
for processing at least one information unit introduced via the at
least one input
device by the at least one server device to process at least one
information unit based, the system comprising the elements of: an
infrastructure server device to create at least one complexity
catalog, and a complexity catalog to hold at least one list of
ordered complexity values associated with the partitioned sub-unit
blocks, and an application server to build at least one information
summary unit based on the at least one information unit and on at
least one associated complexity catalog, and a scoring component to
provide scores.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The present invention will be understood and appreciated
more fully from the following detailed description taken in
conjunction with the drawings in which:
[0009] FIG. 1 is a schematic block diagram of a system environment
of the preferred embodiment of the present invention; and
[0010] FIG. 2 is a schematic block diagram of the information
detection and analysis system of the preferred embodiment of the
present invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0011] Preferred embodiments will now be described with reference
to the drawings. For clarity of description, any element numeral in
one figure will represent the same element if used in any other
figure.
[0012] The present invention provides a method and system for
analysis of data (SAD) by executing complexity calculations on a
set of data fields, enabling segmentation of the data, providing
scores for the separate segments and providing a final score for
clusters of the analyzed segments. The SAD can be used in a variety
of applications such as customer retention, marketing, analysis of
credit card transactions and the like. In one preferred embodiment
of the present invention the SAD aids a business enterprise in
customer retention. In contrast, the present state of the art is
such that customer retention systems mostly include analysis of
performance measures, such as quantitative metrics and formulas,
for detecting potential performance problems. The
measures used include amongst others: retention rate, loyalty index
and satisfaction index. These measures are analyzed along the time
axis and other customer dimensions such as: customer category,
market geography segment, and the like. The customer retention
method and system proposed by the preferred embodiment of the
present invention includes a complexity calculation tool that
attacks the customer retention problem from a different, yet
complementary angle. It assigns a numerical parameter to the
complexity of the pattern, i.e., whether it is simple and monotonic
or erratic and unpredictable, and alerts whenever this parameter
changes, e.g. when a transaction of a simply behaving account
deviates from its monotonic behavior, or when monotonic and
recurring transactions appear in an erratic account. The customer
retention system and method proposed by the preferred embodiment of
the present invention provides an end result with a final score.
The final score provides analysis that indicates whether a specific
customer is about to be lost to the business.
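As an illustration only, the numerical complexity parameter described above can be sketched in Python. The actual complexity calculation is defined in PCT/IL01/01074 and is not reproduced in this document, so the sketch substitutes normalized Shannon entropy as an assumed stand-in. The function name `complexity` and the sample account data are hypothetical.

```python
import math
from collections import Counter


def complexity(values):
    """Normalized Shannon entropy of a sequence, in the range [0, 1].

    Assumed stand-in for the complexity parameter: 0.0 when every
    observation is identical (simple, monotonic pattern) and 1.0 when
    every observation is different (erratic pattern).
    """
    n = len(values)
    counts = Counter(values)
    if len(counts) <= 1:
        # Empty input or a single repeated value carries no variety.
        return 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # log2(n) is the largest entropy that n observations can have.
    return entropy / math.log2(n)


# Hypothetical transaction types for two accounts.
monotonic_account = ["pay_bill"] * 8
erratic_account = ["pay_bill", "refund", "transfer", "loan",
                   "complaint", "upgrade", "cancel", "inquiry"]

simple_score = complexity(monotonic_account)
erratic_score = complexity(erratic_account)
```

Under this stand-in the monotonic account receives the lowest possible value and the erratic account the highest. Any measure that preserves this ordering would serve the illustration equally well.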
[0013] The new addition to the prior art in business intelligence
is divided into two major components: a parameterization of the
data by calculating the complexity of a given record, and an
analysis method of data records by calculating the complexity of
one or more dimensions of the data in a subject oriented way. The
parameterization of data makes it possible to attach meaning to a
parameter (e.g. a high complexity value means erratic behavior and
an unstable customer, while a low complexity value means monotonic
behavior and a stable customer). By segmenting the data according
to a parameter, a better classification can be made (e.g. all the
high complexity customers are grouped together in an "erratic
behavior group", while the low complexity customers are in the
"monotonic group"). The records are classified according to their
behavior and not according to a predetermined property (e.g.
demographics). After stratifying the records according to their
complexity parameter, further enhanced analysis will be provided by
the utilization of additional tools. For example, analysis by
neural networks
per cluster of records with a similar complexity parameter will
produce a better result and a higher prediction rate than by
analyzing the entire set of records together or by a predefined
property. The similar complexity groups are better suited for the
neural network characteristics than most other types of grouping.
Analysis by additional tools (e.g. decision trees and the like) for
each cluster of similar complexity records will produce better
results for prediction, analysis and understanding of the data.
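The segmentation step described in this paragraph can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes that a complexity value between 0 and 1 has already been computed for each customer, and the function name `segment_by_complexity`, the 0.5 threshold and the sample values are all hypothetical.

```python
def segment_by_complexity(complexities, threshold=0.5):
    """Split subjects into two groups by their complexity parameter.

    `complexities` maps a customer identifier to a complexity value.
    The 0.5 threshold is an arbitrary illustrative choice.
    """
    groups = {"monotonic": [], "erratic": []}
    for customer, value in sorted(complexities.items()):
        key = "erratic" if value >= threshold else "monotonic"
        groups[key].append(customer)
    return groups


# Hypothetical, precomputed complexity values per customer.
customer_complexity = {"cust_a": 0.05, "cust_b": 0.92,
                       "cust_c": 0.10, "cust_d": 0.71}
groups = segment_by_complexity(customer_complexity)
```

Each resulting group could then be handed to its own analysis tool, for example one neural network or decision tree per group, in line with the per-cluster analysis the paragraph describes.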
[0014] The analysis method of data records by calculating the
complexity of the data in a subject oriented way enables the
selection of the fields of interest and the calculation of the
other fields' complexity, thus showing the behavior of the selected
fields. For example, in CRM data records, the Customer field can be
selected and the complexity of the "Time of Call", "Length of Call"
and "Transaction Made" fields can be calculated, thereby analyzing
the behavior of all the customers. This innovative analysis method
will show whether they always make the same transaction in a short
call (low complexity) or call at different times of the day, making
all kinds of transactions (high complexity). The analysis provided
by the present invention makes it possible to detect changes of
patterns within a specific field, or a combination of fields,
within an information unit. Thus, as the preferred embodiment is
within the field of customer retention, the behavior or a change of
behavior of a customer or group of customers over the time axis can
be detected.
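The subject oriented calculation in the CRM example above can be sketched as follows. The sketch groups records by a selected field and computes one complexity value for each remaining field. Normalized Shannon entropy is used as an assumed stand-in for the complexity metric of PCT/IL01/01074, and the function names, field names and sample records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def complexity(values):
    """Assumed stand-in metric: normalized Shannon entropy in [0, 1]."""
    n = len(values)
    counts = Counter(values)
    if len(counts) <= 1:
        return 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(n)


def subject_complexity(records, subject, fields):
    """Return {subject value: {field: complexity}} for the given fields."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        for name in fields:
            grouped[record[subject]][name].append(record[name])
    return {key: {name: complexity(vals) for name, vals in per_field.items()}
            for key, per_field in grouped.items()}


# Customer A always makes the same short call; customer B varies everything.
records = [
    {"customer": "A", "time_of_call": "morning",
     "length_of_call": "short", "transaction_made": "balance"}
    for _ in range(4)
] + [
    {"customer": "B", "time_of_call": time,
     "length_of_call": length, "transaction_made": made}
    for time, length, made in [("morning", "short", "balance"),
                               ("noon", "long", "transfer"),
                               ("evening", "medium", "complaint"),
                               ("night", "very long", "cancel")]
]

profile = subject_complexity(
    records, "customer",
    ["time_of_call", "length_of_call", "transaction_made"])
```

Here `profile["A"]` holds low values for every field and `profile["B"]` holds high values, mirroring the low complexity and high complexity cases in the text.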
[0015] For example, by calculating the complexity of all the
agents, in CRM data records, this analysis can alert when an agent
has deviated from his normal behavior (i.e. when his complexity has
changed from previous complexity calculations). Thus, for example,
the supervisor can be alerted that the specific agent has changed
his behavior.
[0016] By analyzing the complexity of all the fields (e.g. in CRM:
customers, agents, transactions, etc.) at different times, the new
analysis method can detect a change in behavior and generate an
alert concerning the change.
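One possible way to turn complexity values taken at different times into alerts is sketched below. The source does not specify how a change is detected, so the rule used here (compare the latest value with the mean of the earlier ones) is an assumption, as are the function name `complexity_alerts`, the tolerance of 0.2 and the sample history.

```python
from statistics import mean


def complexity_alerts(history, tolerance=0.2):
    """Return (subject, baseline, latest) for each subject whose latest
    complexity differs from the mean of its earlier values by more than
    `tolerance`. Subjects with fewer than two values are skipped.
    """
    alerts = []
    for subject, values in history.items():
        if len(values) < 2:
            continue
        baseline = mean(values[:-1])
        latest = values[-1]
        if abs(latest - baseline) > tolerance:
            alerts.append((subject, baseline, latest))
    return alerts


# Hypothetical complexity values per agent, one per reporting period.
history = {
    "agent_1": [0.10, 0.12, 0.11, 0.65],
    "agent_2": [0.80, 0.78, 0.82, 0.79],
}
alerts = complexity_alerts(history)
```

In this sample only `agent_1` is reported, because its latest value departs sharply from its earlier, stable values. `agent_2` has a high but steady complexity and raises no alert, which reflects the idea that the change in complexity, not its level, triggers the alert.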
[0017] The preferred embodiment will be better understood by
relating to FIG. 1 that illustrates the environment of the SAD 18.
The SAD 18 receives data input from users 10, 12, and 14 via the
data communication network 20 (DCN). Users 10, 12 and 14 can be
individuals forwarding information units regarding particular
customers, businesses forwarding information regarding the behavior
of all their customers, or a transmitting center or agent
transmitting information regarding customer behavior from one or
more locations. The DCN 20 can be the Internet, LAN, WAN, a
satellite
communication network and the like. The most common DCN 20 used is
the standard telephone system (POTS) that enables communication via
ordinary phone connection lines.
[0018] Referring now to FIG. 2 the SAD 18 includes an input device
56, a communication device 54, an output device 58 and an analysis
and evaluation server platform 22. The input device 56 can be a
pointing device, a keyboard device or the like. The output device
58 can be a printer, a screen display or the like. The
communication device 54 can be a modem, a network interface card or
any other suitable communication devices providing transmission and
reception of data via DCN 20 of FIG. 1. According to the preferred
embodiment of the present invention the analysis and evaluation
server platform 22 includes a processor device 24, and a memory
device 26. The processor device 24 is the logic unit designed to
perform arithmetic and logic operations by responding to and
processing the basic instructions driving the computing device. The
processor device 24 can be one of the Intel Pentium series, the
PowerPC series, the K6 series, the Celeron, the Athlon, the Duron,
the Alpha, or the like. The memory device 26 includes a reference
transaction database 28, an operating system 30, a control database
32, a complexity database catalog 36 and an application server 38.
The reference transaction database 28 includes database information
including a list of customers, personal information regarding
customers, history files containing customer behavior and other
relevant information related to credit card holders and agents. The
reference transaction database 28 can be located within the SAD 18
as illustrated in FIG. 2 or in any other separate location. The
operating system 30 is responsible for managing the operation of
the entire set of software programs implemented in the operation of
the SAD 18. The operating system 30 can be any known operating
system such as Windows NT, Windows XP, UNIX, Linux, VMS, OS/400,
AIX, OS X and the like. The complexity database catalog 36 includes
all the complexity values assigned to the records processed by the
complexity engine 52. The complexity values stored within the
complexity database catalog 36 are discussed in detail in the
pending PCT application PCT/IL01/01074 incorporated herein by
reference. The control database 32 controls the input data received
by the input device 56 and the transfer thereof to the application
server 38. The control database 32 also directs the movement of the
data from the reference transaction database 28 to the application
server 38 and to the complexity database catalog 36 from the
application server 38. The application server 38 within the
preferred embodiment includes a complexity catalog handler 40, a
scoring component 42, a learning component 44, a database handler
46, a resource allocation component 48, a user interface component
50 and a complexity engine 52. The complexity catalog handler 40 is
responsible for obtaining the appropriate complexity metrics
records created by the application server 38 from the complexity
database catalog 36. The resource allocation component 48 is
responsible for allocating variable resources to the processing of
the separate records in accordance with the complexity metrics
thereof. The user interface component 50 is a set of specifically
designed and developed front-end programs. The component 50 allows
the user of the system to interact dynamically with the system by
performing a set of predefined procedures operative to the running
of the method. Via the component 50 the user could select an
application, as selected for the SAD customer retention purposes,
activate the selected application, adjust specific processing
parameters, select sets of records for processing according to the
complexity metrics thereof, and the like. The component 50 could be
developed as a plug-in to any of the known user interfaces. The
component 50 will be preferably a Graphical User Interface (GUI)
but any other manner of interfacing with the user could be used
such as a command-driven interface, a menu-driven interface or the
like. The database handler 46 receives the input data records from
the control database 32 and provides the records to the complexity
catalog handler 40. The database handler 46 further receives the
complexity values and scores assigned to data records from the
complexity catalog handler 40 and passes them to the control
database 32, which in turn provides the complexity database catalog
36 and the reference transaction database 28 with the complexity
values and scores for the data records. The learning component 44
provides a mechanism for matching a given input, such as the
complexity vectors for each transaction, to a given output, such as
a customer behavior or a deviation from ordinary customer behavior.
The learning component 44 provides the scoring component 42 with
different scores that are then processed within the scoring
component 42. The complexity
engine 52 provides complexity values to data records received from
the control database 32 within the application server 38 and
handled by the database handler 46.
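The flow of a record through the application server components described above can be sketched roughly as follows. This is a simplified illustration under several assumptions: the internals of each component are not given in this document, so every function body here is a placeholder. In particular, the distinct-value ratio used by `complexity_engine`, the deviation scores in `learning_component` and the use of the maximum in `scoring_component` are hypothetical choices.

```python
from dataclasses import dataclass, field


def complexity_engine(record):
    """Placeholder for complexity engine 52: one value in [0, 1] per field."""
    vector = {}
    for name, values in record.items():
        n = len(values)
        # Share of extra distinct values; 0.0 when all values are the same.
        vector[name] = 0.0 if n < 2 else (len(set(values)) - 1) / (n - 1)
    return vector


@dataclass
class ComplexityCatalog:
    """Placeholder for complexity database catalog 36."""
    vectors: dict = field(default_factory=dict)


def learning_component(previous, current):
    """Placeholder for learning component 44: per-field deviation scores."""
    return {name: abs(current[name] - previous.get(name, current[name]))
            for name in current}


def scoring_component(partial_scores):
    """Placeholder for scoring component 42: reduce to one final score."""
    return max(partial_scores.values(), default=0.0)


def process(catalog, subject, record):
    """Route one record through the components and update the catalog."""
    current = complexity_engine(record)
    previous = catalog.vectors.get(subject, {})
    final_score = scoring_component(learning_component(previous, current))
    catalog.vectors[subject] = current  # store the new complexity vector
    return final_score


catalog = ComplexityCatalog()
first = process(catalog, "cust_1", {"time_of_call": ["am", "am", "am"],
                                    "transaction": ["pay", "pay", "pay"]})
second = process(catalog, "cust_1", {"time_of_call": ["am", "pm", "night"],
                                     "transaction": ["pay", "pay", "pay"]})
```

The first record establishes a baseline and scores 0.0. The second record, in which the customer's call times become varied, yields a higher final score because one field's complexity has moved away from the stored value.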
[0019] A person skilled in the art will appreciate that what has
been shown is not limited to the description above. Those skilled
in the art to which this invention pertains will appreciate many
modifications and other embodiments of the invention. It will be
apparent that the present invention is not limited to the specific
embodiments disclosed and those modifications and other embodiments
are intended to be included within the scope of the invention.
Although specific terms are employed herein, they are used in a
generic and descriptive sense only and not for purposes of
limitation. The invention, therefore, is not to be restricted
except in the spirit of the claims that follow.
* * * * *