U.S. patent application number 16/384474 was filed with the patent office on 2019-12-12 for automatic dynamic reusable data recipes.
The applicant listed for this patent is DOMO, Inc.. Invention is credited to Paul Baker, Jeff Burtenshaw, Joshua G. James, Daren Thayne.
Application Number | 20190377727 16/384474 |
Document ID | / |
Family ID | 66098723 |
Filed Date | 2019-12-12 |
![](/patent/app/20190377727/US20190377727A1-20191212-D00000.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00001.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00002.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00003.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00004.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00005.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00006.png)
![](/patent/app/20190377727/US20190377727A1-20191212-D00007.png)
United States Patent
Application |
20190377727 |
Kind Code |
A1 |
Burtenshaw; Jeff ; et
al. |
December 12, 2019 |
AUTOMATIC DYNAMIC REUSABLE DATA RECIPES
Abstract
A data recipe may be automatically generated to provide
requested information to a user. After the information is
requested, one or more data sources may be interrogated to discover
a plurality of data types of data stored in the data sources. The
data types may be categorized to define a plurality of data recipe
ingredients that are likely to be needed to provide the requested
information. The data recipe ingredients may be compared with a
reference data recipe. Based on the results of the comparison, a
new data recipe that provides the requested information may be made
by either modifying the reference data recipe or by proceeding
independently of the reference data recipe. The new data recipe
may, for example, calculate a key performance indicator used to
measure organizational performance.
Inventors: |
Burtenshaw; Jeff; (West
Jordan, UT) ; Thayne; Daren; (Orem, UT) ;
James; Joshua G.; (Orem, UT) ; Baker; Paul;
(Salt Lake City, UT) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
DOMO, Inc. |
American Fork |
UT |
US |
|
|
Family ID: |
66098723 |
Appl. No.: |
16/384474 |
Filed: |
April 15, 2019 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
14257669 |
Apr 21, 2014 |
10262030 |
|
|
16384474 |
|
|
|
|
61814586 |
Apr 22, 2013 |
|
|
|
Current U.S.
Class: |
1/1 |
Current CPC
Class: |
G06F 16/221 20190101;
G06F 16/93 20190101; G06F 16/2246 20190101; G06F 16/254 20190101;
G06F 16/954 20190101; G06F 16/245 20190101; G06F 16/2458 20190101;
G06F 16/248 20190101; G06F 16/285 20190101 |
International
Class: |
G06F 16/245 20060101
G06F016/245; G06F 16/93 20060101 G06F016/93; G06F 16/22 20060101
G06F016/22; G06F 16/248 20060101 G06F016/248; G06F 16/25 20060101
G06F016/25; G06F 16/28 20060101 G06F016/28; G06F 16/954 20060101
G06F016/954 |
Claims
1. A computer-implemented method for processing structured data,
the method comprising: providing a library on a non-transitory
storage medium, the library defining a plurality of reference data
recipes, each reference data recipe comprising a data process
configured to produce quantitative output data by use of one or
more reference data components; deriving live data components from
data elements managed by a data source, the deriving comprising:
determining attributes of respective data elements of a plurality
of data elements managed by the data source, and applying
pre-determined categorization rules to the attributes determined
for the respective data elements, wherein applying the
pre-determined categorization rules to a live data component
corresponding to a specified data element comprises categorizing
the live data component as one of a measurement component and a
dimension component; selecting a reference data recipe from the
library in response to a request, the selecting comprising:
comparing reference data components of respective reference data
recipes to live data components derived from the data elements
managed by the one or more data sources, and comparing data
processes of the respective reference data recipes to the request;
producing a live data recipe, the producing comprising substituting
reference data components of the selected reference data recipe
with designated live data components; and generating quantitative
output data in response to the request, wherein generating the
quantitative output data comprises applying a first data process to
a measurement component of the live data recipe and a dimension
component of the live data recipe, the measurement component
comprising a first live data component corresponding to a first
data element managed by the data source, and the dimension
component comprising a second live data component corresponding to
a second data element managed by the data source.
2. The method of claim 1, wherein comparing the data processes of
the respective reference data recipes to the request comprises
comparing semantic processing metadata of the request to the
respective reference data recipes.
3. The method of claim 2, wherein the semantic processing metadata
of the request comprises a key performance indicator.
4. The method of claim 1, wherein producing the live data recipe
comprises modifying a data process of the selected reference recipe
to operate on the designated live data components.
5. The method of claim 4, wherein generating the quantitative
output data comprises applying the data process of the live data
recipe to data elements corresponding to the designated live data
components.
6. The method of claim 1, wherein determining the attributes of the
respective data elements comprises parsing a schema of the data
source.
7. The method of claim 6, wherein parsing the schema of the data
source comprises parsing data elements stored within the data
source into a NoSQL tree structure.
8. The method of claim 7, wherein parsing the data elements into
the NoSQL tree structure comprises selecting a schema for the NoSQL
tree structure based on a structure of the data source.
9. The method of claim 1, wherein the pre-determined categorization
rules comprise: a first rule to categorize date data types as
dimension components, a second rule to categorize alpha-numeric
data types as dimension components, and a third rule to categorize
numeric data types as measure components.
10. The method of claim 1, further comprising categorizing the live
data components by use of the pre-determined categorization rules,
the categorizing further comprising inferring relationships among
measure components and dimension components based on a structure of
the one or more data sources.
11. The method of claim 1, wherein comparing the reference data
components of respective reference data recipes to the live data
components comprises matching each of the live data components to a
semantic layer.
12. The method of claim 11, wherein matching each of the live data
components to the semantic layer comprises: locating an identifying
characteristic of the data element corresponding to each live data
component; locating, within the semantic layer, a phrase matching
each identifying characteristic, wherein the identifying
characteristic is selected from the group consisting of: a name of
the data element; a data type of the data element; a data size of
the data element; a data structure of the data element; metadata of
the data element; a sample data set related to the data
element.
13. The method of claim 11, wherein the semantic layer comprises a
plurality of pairings, wherein each pairing comprises a phrase, a
phrase mapping, and a confidence factor that indicates a likelihood
that a data element is related to the phrase, wherein matching each
of the live data components to the semantic layer comprises using
the confidence factor of the pairing with a phrase that matches the
data element of the live data component.
14. The method of claim 1, wherein comparing reference data
components of a reference data recipes to the live data components
comprises using a data map to quantify relationships between the
reference data components and data elements of the data source.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] The Application Data Sheet ("ADS") filed in the present
application is incorporated by reference. Any applications claimed
on the ADS for priority under 35 U.S.C. .sctn..sctn. 119, 120, 121,
or 365(c), and any and all parent, grandparent, great-grandparent,
etc., applications of such applications, are also incorporated by
reference, including any priority claims made in those applications
and any material incorporated by reference, to the extent such
subject matter is not inconsistent herewith. The present
application claims priority to: U.S. application Ser. No.
14/257,669 filed on Apr. 21, 2014 and issued as U.S. Pat. No.
10,262,030 on Apr. 16, 2019, and U.S. Provisional Application Ser.
No. 61/814,586 for "Automatic Dynamic Reusable Data Recipes," filed
Apr. 22, 2013, each of which is incorporated by reference herein in
its entirety.
FIELD OF THE INVENTION
[0002] The present invention relates to systems and methods for
generating user-requested information, and more particularly,
automated creation of data recipes.
DESCRIPTION OF THE RELATED ART
[0003] Many organizations possess vast troves of data, which may be
stored in a variety of locations. Many aspects of organizational
activity are often tracked and recorded. Nevertheless, despite the
immense quantities of data available, many organizations find that
they are unable to locate the information they currently need,
particularly when multiple data sets must be combined to provide
the information.
[0004] One example of this is the generation of key performance
indicators, or KPI's, that provide metrics for assessing
organizational performance. Many businesses use KPI's to make
strategic decisions. Unfortunately, in many instances, when a new
KPI is requested, considerable work must be done in order to obtain
it. For example, a user may have to (1) determine what the
component data of the KPI are, (2) determine where this data
resides among one or more files, databases, and the like, (3)
locate the data, (4) determine how the data should be combined in
order to obtain the KPI, and (5) combine the data in the manner
determined.
[0005] This can be exhaustive, particularly for a large
organization in which the component data for the KPI are managed by
multiple individuals. Hence, what may seem to be a simple request
for information can be surprisingly difficult to fulfill. This
problem may be compounded when, as is often the case, there is no
standard format, nomenclature, or other metadata that can be used
to automate the search for relevant data. The user must then engage
in some analysis to determine where the desired data are likely to
reside, how they are likely to be identified in the associated file
and/or database. Hence, it is often prohibitively time consuming
for an organization to find and use the data needed. Thus, tools
that could be used to enhance the performance and strategic
decision-making of the organization are simply not available.
[0006] Conventionally, data modeling is used to organize and
structure data for efficient query and retrieval in various
contexts. However, data modeling is typically a highly manual and
expert-based task. Conventional data modeling methods may be
expensive and labor-intensive, and may be unavailable to many
organizations. Furthermore, many known data modeling techniques
require significant computational time to resolve.
SUMMARY
[0007] As set forth above, locating, compiling, and processing data
needed to provide a requested piece of information can be very
difficult and time-consuming. The systems and methods of the
present invention may address such difficulty by providing
mechanisms for automatically creating a recipe to provide requested
information. This may be done without the need for the user to
review the associated data source or locate the constituent
data.
[0008] Various embodiments of the present invention may implement
dynamic reusable data recipes that allow a data structure to evolve
automatically and dynamically over time as new content is added.
Such content can be added for later access from a business semantic
layer of a software application.
[0009] In at least one embodiment, the system may operate in
contexts where high volumes of new content may be added and
accessed/queried on a continual basis, so that the ability for a
human data architect or team of data modelers would be overwhelmed
by the volume and variety of structures required to be developed to
support ever-changing needs.
[0010] In at least one embodiment, the system of the present
invention may automatically create and evolve reusable data
structures in real-time with little human intervention. One
application of such a system is to enable and implement a
community-based service that allows members to create new content
and share that content with others. This can include, for example,
extending existing content to cover new attributes and/or to answer
new facets of questions.
[0011] In at least one embodiment, an interrogation engine, a
categorization engine, and a comparison engine may be provided.
These engines may, co-operatively, have the capability to evolve
data recipes so as to optimize the reuse of existing data recipes
and extend them as needed. In this manner, the systems and methods
of the present invention may automate the process of generating
data structures for efficient data storage and retrieval of query
tools. Further details and variations are described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] The accompanying drawings illustrate several embodiments of
the invention. Together with the description, they serve to explain
the principles of the invention according to the embodiments. One
skilled in the art will recognize that the particular embodiments
illustrated in the drawings are merely exemplary, and are not
intended to limit the scope of the present invention.
[0013] FIG. 1A is a block diagram depicting a hardware architecture
for practicing the present invention according to one embodiment of
the present invention.
[0014] FIG. 1B is a block diagram depicting a hardware architecture
for practicing the present invention in a client/server
environment, according to one embodiment of the present
invention.
[0015] FIG. 2 is a block diagram depicting the structure of a data
recipe according to one embodiment of the present invention.
[0016] FIG. 3 is a block diagram depicting the structure of a data
map according to one embodiment of the present invention.
[0017] FIG. 4 is a block diagram depicting the structure of a
semantic layer according to one embodiment of the invention.
[0018] FIG. 5 is a block diagram depicting a system for carrying
out automatic information provision, according to one embodiment of
the present invention.
[0019] FIG. 6 is a flowchart depicting a method of carrying out
automatic data recipe generation, according to one embodiment of
the present invention.
[0020] FIG. 7 is a block diagram depicting a first recipe used to
obtain a first KPI according to one embodiment of the
invention.
[0021] FIG. 8 is a block diagram depicting a second recipe used to
obtain a second KPI according to one embodiment of the
invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0022] For illustrative purposes, the systems and methods described
and depicted herein may refer to automated generation of data
recipes that provide information requested by a user. The data
recipes may, in some embodiments, relate to the operation of an
enterprise. However, one skilled in the art will recognize that the
techniques of the present invention can be applied to many
different types of information, and may apply to many different
situations apart from the exemplary enterprise operation context
mentioned previously.
System Architecture
[0023] According to various embodiments, the present invention can
be implemented on any electronic device(s) equipped to receive,
store, and present information. Such an electronic device(s) may
include, for example, one or more a desktop computers, laptop
computers, smartphones, tablet computers, or the like.
[0024] Although the invention is described herein in connection
with an implementation in a computer, one skilled in the art will
recognize that the techniques of the present invention can be
implemented in other contexts, and indeed in any suitable device
capable of receiving and/or processing user input. Accordingly, the
following description is intended to illustrate various embodiments
of the invention by way of example, rather than to limit the scope
of the claimed invention.
[0025] Referring now to FIG. 1A, there is shown a block diagram
depicting a hardware architecture for practicing the present
invention, according to one embodiment. Such an architecture can be
used, for example, for implementing the techniques of the present
invention in a computer or other device 101. Device 101 may be any
electronic device equipped to receive, store, and/or present
information, and to receive user input in connect with such
information.
[0026] In at least one embodiment, device 101 has a number of
hardware components well known to those skilled in the art. Input
device 102 can be any element that receives input from user 100,
including, for example, a keyboard, mouse, stylus, touch-sensitive
screen (touchscreen), touchpad, trackball, accelerometer, five-way
switch, microphone, or the like. Input can be provided via any
suitable mode, including for example, one or more of: pointing,
tapping, typing, dragging, and/or speech.
[0027] Data store 106 can be any magnetic, optical, or electronic
storage device for data in digital form; examples include flash
memory, magnetic hard drive, CD-ROM, DVD-ROM, or the like. In at
least one embodiment, data store 106 stores information which may
include documents 107 and/or libraries 111 that can be utilized
and/or displayed according to the techniques of the present
invention, as described below. In another embodiment, documents 107
and/or libraries 111 can be stored elsewhere, and retrieved by
device 101 when needed for presentation to user 100. Libraries 111
may include one or more data sets, including a first data set 109,
and optionally, a plurality of additional data sets up to an nth
data set 119.
[0028] Display screen 103 can be any element that graphically
displays documents 107, libraries 111, and/or the results of steps
performed on documents 107 and/or libraries 111 to provide data
output incident to automated provision of data recipes. Such data
output may include, for example, one or more prompts that request
information from the user 100, data, data visualizations, prompts
requesting input to confirm and/or modify data recipes, and the
like. In at least one embodiment where only some of the desired
output is presented at a time, a dynamic control, such as a
scrolling mechanism, may be available via input device 102 to
change which information is currently displayed, and/or to alter
the manner in which the information is displayed.
[0029] Processor 104 can be a conventional microprocessor for
performing operations on data under the direction of software,
according to well-known techniques. Memory 105 can be random-access
memory, having a structure and architecture as are known in the
art, for use by processor 104 in the course of running
software.
[0030] Data store 106 can be local or remote with respect to the
other components of device 101. In at least one embodiment, device
101 is configured to retrieve data from a remote data storage
device when needed. Such communication between device 101 and other
components can take place wirelessly, by Ethernet connection, via a
computing network such as the Internet, or by any other appropriate
means. This communication with other electronic devices is provided
as an example and is not necessary to practice the invention.
[0031] In at least one embodiment, data store 106 is detachable in
the form of a CD-ROM, DVD, flash drive, USB hard drive, or the
like. Documents 107 and/or libraries 111 can be entered from a
source outside of device 101 into a data store 106 that is
detachable, and later displayed after the data store 106 is
connected to device 101. In another embodiment, data store 106 is
fixed within device 101.
[0032] Referring now to FIG. 1B, there is shown a block diagram
depicting a hardware architecture for practicing the present
invention in a client/server environment, according to one
embodiment of the present invention. Such an implementation may use
a "black box" approach, whereby data storage and processing are
done completely independently from user input/output. An example of
such a client/server environment is a web-based implementation,
wherein client device 108 runs a browser that provides a user
interface for interacting with web pages and/or other web-based
resources from server 110. Documents 107 and/or libraries 111 can
be presented as part of such web pages and/or other web-based
resources, using known protocols and languages such as Hypertext
Markup Language (HTML), Java, JavaScript, and the like.
[0033] Client device 108 can be any electronic device incorporating
the input device 102 and/or display screen 103, such as a desktop
computer, laptop computer, personal digital assistant (PDA),
cellular telephone, smartphone, music player, handheld computer,
tablet computer, kiosk, game system, or the like. Any suitable type
of communications network 113, such as the Internet, can be used as
the mechanism for transmitting data between client device 108 and
server 110, according to any suitable protocols and techniques. In
addition to the Internet, other examples include cellular telephone
networks, EDGE, 3G, 4G, long term evolution (LTE), Session
Initiation Protocol (SIP), Short Message Peer-to-Peer protocol
(SMPP), SS7, Wi-Fi, Bluetooth, ZigBee, Hypertext Transfer Protocol
(HTTP), Secure Hypertext Transfer Protocol (SHTTP), Transmission
Control Protocol/Internet Protocol (TCP/IP), and/or the like,
and/or any combination thereof. In at least one embodiment, client
device 108 transmits requests for data via communications network
113, and receives responses from server 110 containing the
requested data.
[0034] In this implementation, server 110 is responsible for data
storage and processing, and incorporates data store 106 for storing
documents 107 and/or libraries 111. Server 110 may include
additional components as needed for retrieving data and/or
libraries 111 from data store 106 in response to requests from
client device 108.
[0035] In at least one embodiment, documents 107 are organized into
one or more well-ordered data sets, with one or more data entries
in each set. Data store 106, however, can have any suitable
structure. Accordingly, the particular organization of documents
107 within data store 106 need not resemble the form in which
documents 107 are displayed to user 100. In at least one
embodiment, an identifying label is also stored along with each
data entry, to be displayed along with each data entry.
[0036] The libraries 111 may include one or more data sources,
which may be stored at one or more locations and in one or more
formats. In at least one embodiment, libraries 111 are organized in
a file system within data store 106. Appropriate indexing can be
provided to associate particular documents with particular
quantitative data elements, reports, other documents, and/or the
like. Libraries 111 may include any of a wide variety of data
structures known in the database arts. As in FIG. 1A, libraries 111
may include one or more data sets, including a first data set 109,
and optionally, a plurality of additional data sets up to an nth
data set 119.
[0037] Documents 107 and/or libraries 111 can be retrieved from
client-based or server-based data store 106, and/or from any other
source. In at least one embodiment, input device 102 is configured
to receive data entries from user 100, to be added to documents 107
and/or libraries 111 held in data store 106. User 100 may provide
such data entries via the hardware and software components
described above according to means that are well known to those
skilled in the art.
[0038] Display screen 103 can be any element that graphically
displays documents 107, libraries 111, and/or the results of steps
performed on documents 107 and/or libraries 111 to provide data
output incident to automated provision of data recipes. Such data
output may include, for example, one or more prompts that request
information from the user 100, data, data visualizations, prompts
requesting input to confirm and/or modify data recipes, and the
like. In at least one embodiment where only some of the desired
output is presented at a time, a dynamic control, such as a
scrolling mechanism, may be available via input device 102 to
change which information is currently displayed, and/or to alter
the manner in which the information is displayed.
[0039] In at least one embodiment, the information displayed on
display screen 103 may include data in text and/or graphical form.
Such data may comprise visual cues, such as height, distance,
and/or area, to convey the value of each data entry. In at least
one embodiment, labels accompany data entries on display screen
103, or can be displayed when user 100 taps on or clicks on a data
entry, or causes an onscreen cursor to hover over a data entry.
[0040] Furthermore, as described in more detail below, display
screen 103 can selectively present a wide variety of data related
to automated data recipe generation. In particular, as described
herein, user 100 can provide input, such as a selection from a menu
containing a variety of options, to determine the various
characteristics of the information presented such as the type,
scope, and/or format of the information to be displayed on display
screen 103.
[0041] In one embodiment, the system can be implemented as software
written in any suitable computer programming language, whether in a
standalone or client/server architecture. Alternatively, it may be
implemented and/or embedded in hardware.
Data Recipe, Data Map, and Semantic Layer Structure
[0042] In general, a "data recipe" may include any instruction set
that enables requested information to be obtained from other data
in one or more data sources. A wide variety of data types and
processes may be included in a data recipe. Each piece of data used
by the data recipe (i.e., each "data recipe ingredient") may be of
any desired length and format. Thus, each piece of data may be a
character string, integer, floating point number, or any other type
of data, and may thus represent any information such as names,
times, dates, currency amounts, percentages, fractions, physical
dimensions, or any other data that may desirably be stored in a
computer.
[0043] Referring to FIG. 2, a block diagram depicts the structure
of a data recipe 200 according to one embodiment of the present
invention. The data recipe 200 may include one or more data recipe
ingredients 210 and one or more data processes 220 that describe
how the data recipe ingredients 210 are to be combined and/or
manipulated to obtain the requested information. Thus, the data
recipe ingredients 210 may include data of any of the types listed
in the preceding paragraph, and the data processes 220 may include
one or more formulas or other combination instructions indicating
how the data recipe ingredients 210 may be used to obtain the
requested information. If the data recipe ingredients 210 include
numbers, the data processes 220 may include mathematical formulas
or the like.
[0044] The data recipe ingredients 210 may include one or more
dimensions 230 and/or one or more measures 240. In general, a
measure 240 is a property on which calculations can be made, while
a dimension 230 is a data set that can be used for structured
labeling of measures. For example, dimensions 230 may represent
data that would likely be used as the scale on a data
visualization, such as the X-axis on a conventional bar chart or
line chart. The measures 240 may represent data that would likely
be used for the measurements on a data visualization, such as the
vertical displacements of the bars or points of a conventional bar
chart or line chart.
[0045] According to one example, a set of rules may be used to
define which of the data recipe ingredients 210 are dimensions 230
and which are measures 240. One example of such a set of rules is
as follows: [0046] Any of the data recipe ingredients 210 that
represent dates may be classified as dimensions 230. [0047] Any of
the data recipe ingredients 210 that do not represent dates but are
alpha-numeric may also be classified as dimensions 230. [0048] Any
of the data recipe ingredients 210 that are numeric and do not have
a name that includes "ID," "type," "group," "category," or a
similar descriptor may be measures 240. [0049] Any of the data
recipe ingredients 210 that are numeric and do not have less than a
number of distinct values, such as twenty distinct values may be
measures 240.
[0050] Such a rule set may grow in sophistication over time. In at
least one embodiment, machine learning techniques and/or other
techniques may be used to automatically grow, refine, and/or
otherwise develop the rule set used to classify the data recipe
ingredients 210 as dimensions 230 or measures 240. In alternative
embodiments, the dimensions 230 and the measures 240 may be defined
according to a wide variety of alternative definitions or rules, or
alternatively, other classifications (or no classifications) may be
applied to the data recipe ingredients 210.
[0051] Referring to FIG. 3, a block diagram depicts the structure
of a data map 300 according to one embodiment of the present
invention. The data map 300 may be designed to identify the data
recipe ingredients 210 within one or more data stores 106.
[0052] As shown, the data map 300 may include metadata 302, which
may include records for one or more reference data recipes that are
to be mapped to data sources such as the data store 106. More
specifically, the metadata 302 may include a first record 310
pertaining to a first reference data recipe and optionally, one or
more additional records pertaining to one or more additional
reference data recipes up to an nth record 320 pertaining to an nth
reference data recipe.
[0053] The first record 310 may include first reference data recipe
information 330 pertaining to the first reference data recipe. The
first reference data recipe information 330 may include a name or
other indicator of the first reference data recipe, the information
the first reference data recipe is designed to provide, data
processes 220 associated with the first reference data recipe, or
the like.
[0054] Additionally, the first record 310 may include the data
recipe ingredients 210 of the first reference data recipe, which
may include a first data recipe ingredient 340 and optionally, one
or more additional data recipe ingredients up to an nth data recipe
ingredient 342. For each of the data recipe ingredients 210 of the
first reference data recipe, the metadata 302 may contain a mapping
indicating where the data of each data recipe ingredient 210 can be
found in one or more data stores 106.
[0055] More specifically, the metadata 302 may also contain, for
each of the data recipe ingredients 210 of the first reference data
recipe, a data mapping including a first data mapping 350 for the
first data recipe ingredient 340 and optionally, one or more
additional data mappings up to an nth data mapping 352 for the nth
data recipe ingredient 342. The first data mapping 350 may indicate
the location of the first data recipe ingredient 340 in one or more
data stores 106. Similarly, the nth data mapping 352 may indicate
the location of the nth data recipe ingredient 342 in one or more
data stores 106.
[0056] Similarly, the nth record 320 may include nth reference data
recipe information 360 pertaining to the nth reference data recipe.
The nth reference data recipe information 360 may include a name or
other indicator of the nth reference data recipe, the information
the nth reference data recipe is designed to provide, data
processes 220 associated with the nth reference data recipe, or the
like.
[0057] Additionally, the nth record 320 may include the data recipe
ingredients 210 of the nth reference data recipe, which may include
a first data recipe ingredient 370 and optionally, one or more
additional data recipe ingredients up to an nth data recipe
ingredient 372. For each of the data recipe ingredients 210 of the
nth reference data recipe, the metadata 302 may contain a mapping
indicating where the data of each data recipe ingredient 210 can be
found in one or more data stores 106.
[0058] More specifically, the metadata 302 may also contain, for
each of the data recipe ingredients 210 of the nth data recipe, a
data mapping including a first data mapping 380 for the first data
recipe ingredient 370 and optionally, one or more additional data
mappings up to an nth data mapping 382 for the nth data recipe
ingredient 372. The first data mapping 380 may indicate the
location of the first data recipe ingredient 370 in one or more
data stores 106. Similarly, the nth data mapping 382 may indicate
the location of the nth data recipe ingredient 372 in one or more
data stores 106.
[0059] Referring to FIG. 4, a block diagram depicts the structure
of a semantic layer 400 according to one embodiment of the
invention. The semantic layer 400 may provide pairings between
terminology and data to facilitate location of data with a semantic
identification, such as a name, description, other semantic
metadata, or the like, within one or more data stores 106. The
semantic layer 400 may link a "phrase," which may include one or
more words or other semantic elements, with the location within one
or more data stores 106, where data corresponding to that phrase
may be found.
[0060] As shown, the semantic layer 400 may have one or more
pairings, which may include a first pairing 410 and optionally, one
or more additional pairings up to an nth pairing 420. The first
pairing 410 may include a first phrase 430 and a first phrase
mapping 440, which may indicate one or more locations within one or
more data stores 106 where the first phrase 430, or data
corresponding to the first phrase 430, may be found.
[0061] The first pairing 410 may further include a first confidence
factor 450 indicating a level of confidence in the link between the
first phrase 430 and the first phrase mapping 440. The first
confidence factor 450 may, for example, be a number such as a
percentage, where 0% indicates no confidence in the link, and 100%
indicates absolute certainty that the first phrase mapping 440 is
the location of the first phrase 430, or data related to the first
phrase 430, within one or more data stores 106. This structure may
facilitate machine learning to allow for improved performance in
generating and developing data recipes over time. The first
confidence factor 450 may be revised over time based on user
feedback, further comparisons with the data store 106, or the
like.
[0062] Similarly, the nth pairing 420 may include an nth phrase 460
and an nth phrase mapping 470, which may indicate one or more
locations within one or more data stores 106 where the nth phrase
460, or data corresponding to the nth phrase 460, may be found.
[0063] The nth pairing 420 may further include an nth confidence
factor 480 indicating a level of confidence in the link between the
nth phrase 460 and the nth phrase mapping 470. Like the first
confidence factor 450, the nth confidence factor 480 may be a
number or other indicator of a confidence level that the first
phrase mapping 440 is the location of the first phrase 430, or data
related to the first phrase 430, within one or more data stores
106. The nth confidence factor 480 may also be revised over time
based on user feedback, further comparisons with the data store
106, or the like.
[0064] If desired, the semantic layer 400 may be incorporated into
the data map 300. Alternatively, the semantic layer 400 may be
independent of the data map 300.
Conceptual Architecture
[0065] In at least one embodiment, the system of the present
invention enables automated provision of data recipes for
generating, collecting, and/or presenting information requested by
users. A data recipe may be formulated by interrogating one or more
data sources, such as the data store 106, to obtain data types
categorizing the data types into data recipe ingredients, and
comparing the data recipe ingredients with one or more reference
data recipes. The new data recipe may then be created
independently, or by modifying one or more reference data recipes
to obtain the new data recipe. The new data recipe may then be used
to generate, collect, and/or present the requested information.
[0066] Referring to FIG. 5, a block diagram depicts a system 500
for carrying out automatic data recipe generation, according to one
embodiment of the present invention. As shown, the system 500 may
have an interrogation engine 510, a categorization engine 520, and
a comparison engine 530 that may cooperate to generate a data
recipe 200. User inputs are shown on the left-hand side of FIG. 5,
and outputs to the user are shown on the right-hand side of FIG.
5.
[0067] As shown, the system 500 may receive an information request
540 from the user 100. The information request 540 may indicate one
or more pieces of information desired by the user 100. The
information request 540 may be for any type of information. The
information request 540 may be provided via the input device 102,
and may include one or more numbers, phrases, natural language
questions, menu selections, and/or a variety of other user input
elements.
[0068] In some embodiments, the system 500 may be incorporated into
a business intelligence system. The information request 540 may
relate to organizational performance, and may more specifically be
a key performance indicator (KPI).
[0069] Key performance indicators are performance measurement
indicators that can be used to evaluate success of an enterprise,
e.g., an entity, activity, organization, or group. KPI reports,
which summarize key performance indicators, can be very useful in
management of an enterprise so that effective decisions can be made
regarding business strategy and resource allocation. KPIs can be
compiled to create a "dashboard," which is a compiled snapshot of
the most important aspects of the operation of the enterprise.
[0070] Generally, KPIs are numerically measurable aspects of the
operation of an enterprise. Some KPIs are well-known and apply to a
wide range of businesses. However, the most important KPIs for a
business are often highly industry-specific, enterprise-specific,
or even department-specific. In management, the challenge is to
know which KPIs to focus on. Often, the process of finding the best
KPIs to use is an iterative one in which one set of KPI's is
utilized, and then refreshed to add and/or remove KPIs to the set
under review. The speed at which this process occurs is often
limited by the ability of an organization to locate and/or process
the data required to calculate the KPIs of interest.
[0071] Thus, if the information request 540 is for a KPI, the
system 500 may beneficially provide the user with a data recipe 200
that can be used to locate and properly process the data necessary
to obtain the KPI. This may be performed in several stages.
[0072] In at least one embodiment, the system 500 of the present
invention may apply a sophisticated set of heuristics within a
software application. The interrogation engine 510, the
categorization engine 520, and/or the comparison engine 530 may
utilize a heuristic algorithm 570 to perform any of a number of
tasks including, but not limited to creation and/or implementation
of the data map 300 and/or the semantic layer 400. FIG. 5
illustrates a connection between the heuristic algorithm 570 and
the categorization engine 520, but the interrogation engine 510
and/or the comparison engine 530 may also utilize the heuristic
algorithm 570, if desired.
[0073] The information request 540 may be received by the
interrogation engine 510. The interrogation engine 510 may
interrogate one or more data sources, which are exemplified by the
data store 106 in FIG. 5. The interrogation engine 510 may identify
data types 550 present within the data store 106. This may
optionally be done with the aid of the semantic layer 400, which
may help to map semantic elements of the information request 540
and/or the data types 550 to corresponding data within the data
store 106. Data interrogation may be done based on semantics so
that the data types 550 will conform to semantic arche-types.
Alternatively, the interrogation engine 510 may function
independently of the semantic layer 400.
[0074] In at least one embodiment, the interrogation engine 510 may
receive data from the data store 106 and parse the data into a
NoSQL tree structure for processing. If desired or necessary, the
schema of the NoSQL tree structure can be approximated based on the
structure of the data store 106.
[0075] The data types 550 may be provided to the categorization
engine 520, which may categorize the data types 550 to determine
which of the data types 550 are data recipe ingredients 210 of the
data recipe 200 to be created. Like the interrogation engine 510,
the categorization engine 520 may operate with the aid of the
semantic layer 400, or independently of the semantic layer 400.
[0076] The categorization engine 520 may function by, for example,
using the heuristic algorithm 570 to identify the dimensions 230
and/or measure 240 of the data. Relationships between data elements
within the schema may be inferred from the structure of the data
store 106, if possible. This may be accomplished, for example, by
interrogating the data store 106 to find ingredients in common
among different entities.
[0077] The data recipe ingredients 210 may be provided to the
comparison engine 530, which may compare the data recipe
ingredients 210 with one or more reference data recipes to
determine whether one or more of the reference data recipes can be
modified to yield the data recipe 200 that provides the requested
information. The reference data recipes may be stored in the data
map 300, for example, in the first reference data recipe
information 330 through the nth reference data recipe information
360. Thus, the comparison engine 530 may compare the data recipe
ingredients 210 with the data map 300, or more precisely, with the
data recipe ingredients stored within the data map 300. This may
entail comparing the data recipe ingredients 210 with the first
data recipe ingredient 340 through the nth data recipe ingredient
342 and the data recipe ingredients of the other data recipes, up
to the first data recipe ingredient 370 through the nth data recipe
ingredient 372.
[0078] The determination of whether to modify one of the reference
data recipes stored in the data map 300 may be made, for example,
based on the degree of similarity between the data recipe
ingredients 210 for the data recipe 200 desired, and the data
recipe ingredients 210 of the corresponding reference data recipe.
Thus, for example, if the data recipe ingredients 210 received by
the comparison engine 530 are very similar to the first data recipe
ingredient 340 through the nth data recipe ingredient 342 of the
first record 310 of the data map 300, it may be easier to modify
the corresponding first reference data recipe to obtain the data
recipe 200 that satisfies the information request 540. Conversely,
if none of the reference data recipes stored in the data map 300
have data recipe ingredients 210 that are similar to the data
recipe ingredients 210 received from the categorization engine 520,
the data recipe 200 may be created independently of any of the
reference data recipes.
[0079] Like the interrogation engine 510 and the categorization
engine 520, the categorization engine 520 may operate with the aid
of the semantic layer 400, or independently of the semantic layer
400. In one example, the comparison engine 530 may use the
heuristic algorithm 570 to attempt to match each element in the
schema to the semantic layer 400 based, for example, on name, data
type, sample data set, and/or any other suitable data elements.
[0080] In at least one embodiment, the comparison engine 530 may
use the data map 300. As set forth previously, the data map 300 may
describe one or more reference data recipes, such as the first
through nth reference data recipes of FIG. 3, and may describe how
each reference data recipe correlates to the data store 106. In
this manner, the comparison engine 530 may determine how a data
recipe can be applied to a different data source, used to power a
data visualization, or otherwise used in a manner different from
that of its reference data recipe.
[0081] If desired, comparison of the data recipe ingredients 210
received from the categorization engine 520 with those of the
reference data recipes of the data map 300 may be made by comparing
attributes of elements of the data recipe ingredients 210 with
corresponding attributes of the reference data recipes. The
following is an exemplary list of attributes that can be associated
with one another in accordance with the techniques of the present
invention: [0082] A name of the element; [0083] A data type of the
element; [0084] A display name of the element; [0085] An alias or
tag name of the element, which may yield more commonalities for
automated data recipe creation than elements such as the source
column name; [0086] A metadata tag associated with the element;
[0087] An average size of data of the element; [0088] A level of
uniformity of the element, which may, for example, be calculated as
the number of distinct data points within the data divided by the
total number of data points within the data; [0089] A cleanliness
level of the element, which may be determined, for example, by:
[0090] A presence and/or prevalence of NULLs in relationship keys;
[0091] A presence and/or prevalence of malformed data based on
assumed data types; and/or [0092] A presence and/or prevalence of
multiple variations of a standard data point; [0093] An anticipated
relationship between the element and other data, which may be
determined by, for example: [0094] Groupings via many-to-many tag
structures (related measures may share related tags) [0095]
Relationships to possible dimensions (which may include percentages
of probable matches or the like)
[0096] The foregoing is merely exemplary; the comparison engine 530
may utilize any of a wide variety of data comparison techniques
known in the art. In at least one embodiment, the system 500 may
gather feedback from the user 100 at one or more points in the
process via direct interaction (prompt and response) and/or by
monitoring manual adjustments to generated content (changes in the
state of the model). In one embodiment, the system 500 may provide
the data recipe 200 to the user for approval or rejection. If the
user 100 rejects the data recipe 200, the data recipe 200 may be
revised, for example, by further iterations with the interrogation
engine 510, the categorization engine 520, and/or the comparison
engine 530.
[0097] Once the data recipe 200 has been provided to the
satisfaction of the user 100, it may be used to adjust the
heuristic algorithm 570 through the use of machine learning or
other techniques. The data recipe 200 may also be added to the data
map 300 as one of the reference data recipes that may be modified
in the process of generating future data recipes.
[0098] Additionally or alternatively, the semantic layer 400 may
also be adjusted as the system 500 operates. These adjustments to
the semantic layer 400 may be made based upon user feedback, or
through the use of machine learning techniques or the like. Such
changes to the semantic layer 400 may then be fed back through the
system 500 to re-evaluate previously applied schema and thereby
improve the performance of the system 500 for future data recipe
generation.
[0099] If desired, the data recipe 200 may be provided to the user
100. The user 100 may then use the data recipe 200 to fulfill the
information request 540. The data recipe 200 may be used repeatedly
to obtain the requested information 580 as circumstances change.
For example, if the information request 540 represents a KPI, the
data recipe 200 may be used to obtain the KPI and then update
and/or review it according to an interval desired by the user
100.
[0100] Additionally or alternatively, the system 500 may also be
designed to apply the data recipe 200 to fulfill the information
request 540 by providing the requested information 580
automatically for the user 100. The system 500 may do this only
once, or at any interval desired by the user 100. If desired, the
system 500 may provide the user 100 with the requested information
580 and then receive feedback based on the requested information
580. For example, returning to the example of a KPI, if the user
deems that the requested information 580 is not the KPI that was
requested, the user 100 may provide feedback that causes the system
500 to revise the data recipe 200 to obtain the requested
information 580 again from the new data recipe.
[0101] Additionally or alternatively, after approval of the new
data recipe 200, the new data recipe 200 may be further refined as
needed. This may be done, for example, by additional iterations
through the interrogation engine 510, the categorization engine
520, the comparison engine 530, and/or the heuristic algorithm
570.
Automatic Data Recipe Generation
[0102] Referring to FIG. 6, a flowchart depicts a method 600 of
carrying out automatic data recipe generation, according to one
embodiment of the present invention. The method 600 may be carried
out, at least in part, by the system 500 as in FIG. 5, or with a
differently-configured data recipe provision system. The method 600
may be performed in connection with input from the user 100; such a
user 100 may be a developer, customer, enterprise leader, sales
representative for business intelligence services, or any other
individual. FIG. 6 illustrates a series of steps in a certain
order, but those of skill in the art will recognize that these
steps may be re-ordered, omitted, replaced with other steps, or
supplemented with additional steps, consistent with the spirit of
the invention.
[0103] The method 600 may utilize any suitable source of data, such
as for example a spreadsheet, database, website, blog, whitepaper,
report, key performance indicator (KPI), dashboard, and/or the
like, which may provide the data store 106 illustrated in FIG. 5.
It may then apply a rules-based algorithm to the data store 106 to
interrogate the data store 106, discover the data types contained,
categorize the data types into logical data recipe ingredients,
compare the data recipe ingredients to existing reference data
recipes, and either extend the existing reference data recipes as
needed or create a new data recipe.
[0104] The method 600 may start 610 with a step 620 in which the
information request 540 is received from the user 100. As mentioned
in connection with FIG. 5, this may be done in many ways and with
any of a wide variety of input devices 102.
[0105] Then in a step 630, the interrogation engine 510 may
interrogate the data store 106, which may represent one or more
data sources and may include any of a variety of data storage
devices and/or schema. The step 630 may include interrogation of
the data store 106 based on the information request 540 to provide
the data types 550 stored within the data store 106. As set forth
in the discussion of FIG. 5, the interrogation engine 510 may
receive data from the data store 106, and may use the semantic
layer 400 to assist with interrogation. The data map 300 and/or the
heuristic algorithm 570 may additionally or alternatively be
referenced by the interrogation engine 510 in the performance of
the step 630.
[0106] Then, in a step 640, the categorization engine 520 may
categorize the data types 550 from the data store 106, as provided
by the interrogation engine 510, to define the data recipe
ingredients 210 that may be components of the data recipe 200 that
is to be generated to satisfy the information request 540. As
mentioned in the description of FIG. 5, performance of the step 640
may entail usage of the semantic layer 400, the heuristic algorithm
570, and/or the data map 300. The data recipe ingredients 210 may
be the actual and only data recipe ingredients 210 of the data
recipe 200 that satisfies the information request 540, or they may
be over-inclusive (i.e., including more data recipe ingredients 210
than the data recipe 200 will need), or may even be under-inclusive
in the event that one or more steps of the method 600, including
the step 640, are to be performed recursively to supply additional
data recipe ingredients 210.
[0107] The method 600 may then proceed to a step 650, in which the
only data recipe ingredients 210 received from the step 640 are
compared by the comparison engine 530 with reference data recipes.
As set forth in the discussion of FIG. 5, this may entail
comparison of the only data recipe ingredients 210 received from
the categorization engine 520 with those stored within the data map
300. As indicated in the description of FIG. 5, performance of the
step 650 may entail usage of the semantic layer 400, the heuristic
algorithm 570, and/or the data map 300.
[0108] Then, in a query 660, the system 500 may determine whether
one of the reference data recipes can be modified to create the
data recipe 200 that will satisfy the information request 540.
Notably, it may be possible to create any data recipe 200 as a
modification of one or more data recipes if enough modification is
done. Thus, the query 660 may compare the likelihood of success
and/or computational time required to modify one or more of the
reference data recipes, with the likelihood of success and/or
computational time required if the data recipe 200 is to be created
independently of the reference data recipes.
[0109] If one or more reference data recipes are to be modified,
the method 600 may progress to a step 662 in which the data recipe
200 that satisfies the information request 540 is created by
modifying the one or more reference data recipes. Conversely, if
the data recipe 200 that satisfies the information request 540 is
to be created "from scratch," the method 600 may progress to a step
664 in which the data recipe 200 is created independently of the
reference data recipes.
[0110] In either case, the result is the creation of a new data
recipe 200. This data recipe 200 may optionally be presented to the
user 100 for approval and/or modification. A query 670 may
determine whether the user 100 approves the new data recipe 200
without modification. If the user 100 does not approve the data
recipe 200, or provides modifications, either via explicit or
implicit user feedback, the data recipe 200 may return to the step
630 and once again query the data store 106 for data types.
[0111] The step 630, the step 640, and/or the step 650 may again be
performed, but if desired, may incorporate the feedback provided by
the user 100. If no user feedback has been obtained, settings
applicable to the step 630, the step 640, and/or the step 650 may
be modified so that the resulting data recipe 200 is different from
that obtained previously. For example, the data map 300, the
semantic layer 400, and/or the heuristic algorithm 570 may operate
on different settings from those used to obtain the data recipe 200
rejected by the user 100.
[0112] If the user 100 approves the data recipe 200 generated by
the step 650 without modification, the method 600 may proceed to a
step 680 in which the data map 300 is updated to include the data
recipe 200. The data recipe 200 may be recorded in the data map 300
as one of the reference data recipes that can be the basis of
comparison for future iterations of the step 650, and may be
modified to obtain a new data recipe 200. If desired, the semantic
layer 400 and/or the heuristic algorithm 570 may also be updated to
reflect the data recipe 200 and/or any adjustments needed. Such
adjustments may be made pursuant to known artificial intelligence
and/or machine learning techniques based on the results of previous
steps and/or queries of the method 600.
[0113] The data recipe 200 may then be provided to the user 100. As
mentioned previously, the user 100 may use the data recipe 200, one
time or repeatedly, to obtain the requested information 580.
Additionally or alternatively, the method 600 may proceed to a step
690 in which the system 500 follows the data recipe 200 to obtain
the requested information 580 and provide the requested information
580 to the user 100. This may also be done one time or repeatedly
as desired by the user 100. The method 600 may then end 699.
Example
[0114] A wide variety of methods may be used to generate a wide
range of data recipes according to the invention. The following
example is presented by way of illustration and not limitation to
indicate some of the ways in which a system, such as the system 500
of FIG. 5, may be used to automatically generate a data recipe that
provides information requested by a user through the use of a
method such as the method 600 of FIG. 6.
[0115] Referring to FIG. 7, a block diagram depicts a first KPI
recipe 700 used to obtain a first KPI according to one embodiment
of the invention. The first KPI may include products sold by a
company broken down into product type and catalog price.
[0116] The first KPI recipe 700 may include first KPI ingredients
710, which may include product, product type, and catalog price
ingredients. More precisely, as illustrated in the breakout of FIG.
7, the first KPI ingredients 710 may include one or more dimensions
230 and/or one or more measures 240, such as a product dimension
720, a product type dimension 730, and a catalog price measure 740.
The determination of whether each of the first KPI ingredients 710
is a dimension or a measure may be made, for example, using the
criteria set forth in the description of FIG. 2.
[0117] As shown, the product dimension 720 may have a product key
element 750. The product type dimension 730 may have a product type
key element 760. Since it is a measure 240, the catalog price
measure 740 may include the product key element 750, the product
type key element 760, and a catalog price element 770.
[0118] The first KPI recipe 700 may be generated by following the
method 600 of FIG. 6, and/or utilizing the system 500 of FIG. 5.
Thus, after the user 100 submits the information request 540 for
the first KPI, the first KPI recipe 700 may be obtained by
receiving and interrogating one or more data sources such as the
data store 106. The interrogation engine 510, the categorization
engine 520, and/or the comparison engine 530 may operate with the
aid of the data map 300, the semantic layer 400, and/or the
heuristic algorithm 570 to provide a proposed model, or a proposed
data recipe, to the user 100.
[0119] The proposed data recipe may have any number of dimensions
230. In the example of FIG. 7, the first KPI ingredients 710
include the product dimension 720 and the product type dimension
730. If desired, the proposed data recipe may be presented to the
user 100, and the user 100 may be prompted to accept or reject the
proposed data recipe. If accepted, the proposed data recipe
provided to the user 100, used to satisfy the information request
540 by providing the requested information 580, and/or further
processed, for example, by the heuristic algorithm 570 for further
refinement as additional data is received.
[0120] Referring to FIG. 8, a block diagram depicts a second KPI
recipe 800 used to obtain a second KPI according to one embodiment
of the invention. As shown, the second KPI may include products
sold by a company broken down into product group, order price, and
order date. The second KPI may relate to the same product as the
first KPI.
[0121] The second KPI recipe 800 may include second KPI ingredients
810, which may include product, order price, product group, and
order date ingredients. The breakout of FIG. 8 illustrates the pool
of data recipe ingredients 210 from which the second KPI
ingredients 810 may be selected. The first KPI ingredients 710 may
be included in the pool and may be used to provide one or more of
the second KPI ingredients 810. Thus, the first KPI recipe 700 may
be modified to facilitate the creation of the second KPI recipe
800.
[0122] Like the first KPI ingredients 710, the second KPI
ingredients 810 may include one or more dimensions 230 and/or one
or more measures 240, one or more of which may be obtained from or
derived from the first KPI ingredients 710 of the first KPI recipe
700. More specifically, the second KPI ingredients 810 may include
the product dimension 720 from the first KPI ingredients 710.
Additionally, the second KPI ingredients 810 may include an order
date dimension 820, a product group dimension 830, and an order
price measure 840.
[0123] As shown, the order date dimension 820 may have an order
date key element 850. The product group dimension 830 may have a
product group key element 860. Since it is a measure 240, the order
price measure 840 may have the product key element 750, the order
date key element 850, the product group key element 860, and an
order price element 870.
[0124] The second KPI recipe 800 may also be obtained through the
use of the system 500 of FIG. 5 and/or the method 600 of FIG. 6.
This may be facilitated via comparison with the second KPI recipe
800. For example, the data definition for the second KPI recipe 800
may be compared with the first KPI recipe 700. One or more
dimensions, such as the order date dimension 820 and the product
group dimension 830, may be added. One or more new measures, such
the order price measure 840, may be created by combining data from
the first KPI ingredients 710, such as the product key element 750,
with data from the other second KPI ingredients 810, such as the
order date key element 850 and the product group key element 860,
and adding the order price element 870. Such a data recipe
modification technique may beneficially allow a data recipe to be
modified at runtime, without changing structures that relied on
previous definitions.
[0125] After the user 100 submits the information request 540 for
the second KPI, the second KPI recipe 800 may be obtained by
receiving and interrogating one or more data sources such as the
data store 106 (or alternatively, one or more data sources
different from those used to generate the first KPI recipe 700).
The interrogation engine 510, the categorization engine 520, and/or
the comparison engine 530 may operate with the aid of the data map
300, the semantic layer 400, and/or the heuristic algorithm 570 to
provide a proposed model, or a proposed data recipe to obtain the
second KPI, to the user 100.
[0126] The comparison engine 530 may operate by comparing data
recipe ingredients 210 obtained from the categorization engine 520
with the first KPI ingredients 710 to obtain the pool of potential
data recipe ingredients shown in FIG. 8. Then, the comparison
engine 530 may modify the first KPI ingredients 710 applicable to
the second KPI, and add new data recipe ingredients from the data
recipe ingredients 210 obtained from the categorization engine 520,
to obtain the second KPI ingredients 810.
[0127] As in the creation of the first KPI recipe 700, the proposed
data recipe for the second KPI may be presented to the user 100,
and the user 100 may be prompted to accept or reject the proposed
data recipe. If accepted, the proposed data recipe provided to the
user 100, used to satisfy the information request 540 for the
second KPI by providing the requested information 580, and/or
further processed, for example, by the heuristic algorithm 570 for
further refinement as additional data is received.
[0128] The use of the heuristic algorithm 570 may beneficially
avoid the need for the user 100 to manually map new data elements
such as those of the first KPI ingredients 710 and the second KPI
ingredients 810. Rather, the system 500 may iteratively present
options to user 100 until the user 100 approves of a proposed data
recipe.
[0129] One skilled in the art will recognize that the examples
depicted and described herein are merely illustrative, and that
other arrangements of user interface elements can be used. In
addition, some of the depicted elements can be omitted or changed,
and additional elements depicted, without departing from the
essential characteristics of the invention.
[0130] The present invention has been described in particular
detail with respect to possible embodiments. Those of skill in the
art will appreciate that the invention may be practiced in other
embodiments. First, the particular naming of the components,
capitalization of terms, the attributes, data structures, or any
other programming or structural aspect is not mandatory or
significant, and the mechanisms that implement the invention or its
features may have different names, formats, or protocols. Further,
the system may be implemented via a combination of hardware and
software, or entirely in hardware elements, or entirely in software
elements. Also, the particular division of functionality between
the various system components described herein is merely exemplary,
and not mandatory; functions performed by a single system component
may instead be performed by multiple components, and functions
performed by multiple components may instead be performed by a
single component.
[0131] Reference in the specification to "one embodiment" or to "an
embodiment" means that a particular feature, structure, or
characteristic described in connection with the embodiments is
included in at least one embodiment of the invention. The
appearances of the phrases "in one embodiment" or "in at least one
embodiment" in various places in the specification are not
necessarily all referring to the same embodiment.
[0132] In various embodiments, the present invention can be
implemented as a system or a method for performing the
above-described techniques, either singly or in any combination. In
another embodiment, the present invention can be implemented as a
computer program product comprising a non-transitory
computer-readable storage medium and computer program code, encoded
on the medium, for causing a processor in a computing device or
other electronic device to perform the above-described
techniques.
[0133] Some portions of the above are presented in terms of
algorithms and symbolic representations of operations on data bits
within a memory of a computing device. These algorithmic
descriptions and representations are the means used by those
skilled in the data processing arts to most effectively convey the
substance of their work to others skilled in the art. An algorithm
is here, and generally, conceived to be a self-consistent sequence
of steps (instructions) leading to a desired result. The steps are
those requiring physical manipulations of physical quantities.
Usually, though not necessarily, these quantities take the form of
electrical, magnetic or optical signals capable of being stored,
transferred, combined, compared and otherwise manipulated. It is
convenient at times, principally for reasons of common usage, to
refer to these signals as bits, values, elements, symbols,
characters, terms, numbers, or the like. Furthermore, it is also
convenient at times, to refer to certain arrangements of steps
requiring physical manipulations of physical quantities as modules
or code devices, without loss of generality.
[0134] It should be borne in mind, however, that all of these and
similar terms are to be associated with the appropriate physical
quantities and are merely convenient labels applied to these
quantities. Unless specifically stated otherwise as apparent from
the following discussion, it is appreciated that throughout the
description, discussions utilizing terms such as "processing" or
"computing" or "calculating" or "displaying" or "determining" or
the like, refer to the action and processes of a computer system,
or similar electronic computing module and/or device, that
manipulates and transforms data represented as physical
(electronic) quantities within the computer system memories or
registers or other such information storage, transmission or
display devices.
[0135] Certain aspects of the present invention include process
steps and instructions described herein in the form of an
algorithm. It should be noted that the process steps and
instructions of the present invention can be embodied in software,
firmware and/or hardware, and when embodied in software, can be
downloaded to reside on and be operated from different platforms
used by a variety of operating systems.
[0136] The present invention also relates to an apparatus for
performing the operations herein. This apparatus may be specially
constructed for the required purposes, or it may comprise a
general-purpose computing device selectively activated or
reconfigured by a computer program stored in the computing device.
Such a computer program may be stored in a computer readable
storage medium, such as, but is not limited to, any type of disk
including floppy disks, optical disks, CD-ROMs, DVD-ROMs,
magnetic-optical disks, read-only memories (ROMs), random access
memories (RAMs), EPROMs, EEPROMs, flash memory, solid state drives,
magnetic or optical cards, application specific integrated circuits
(ASICs), or any type of media suitable for storing electronic
instructions, and each coupled to a computer system bus. Further,
the computing devices referred to herein may include a single
processor or may be architectures employing multiple processor
designs for increased computing capability.
[0137] The algorithms and displays presented herein are not
inherently related to any particular computing device, virtualized
system, or other apparatus. Various general-purpose systems may
also be used with programs in accordance with the teachings herein,
or it may prove convenient to construct more specialized apparatus
to perform the required method steps. The required structure for a
variety of these systems will be apparent from the description
provided herein. In addition, the present invention is not
described with reference to any particular programming language. It
will be appreciated that a variety of programming languages may be
used to implement the teachings of the present invention as
described herein, and any references above to specific languages
are provided for disclosure of enablement and best mode of the
present invention.
[0138] Accordingly, in various embodiments, the present invention
can be implemented as software, hardware, and/or other elements for
controlling a computer system, computing device, or other
electronic device, or any combination or plurality thereof. Such an
electronic device can include, for example, a processor, an input
device (such as a keyboard, mouse, touchpad, track pad, joystick,
trackball, microphone, and/or any combination thereof), an output
device (such as a screen, speaker, and/or the like), memory,
long-term storage (such as magnetic storage, optical storage,
and/or the like), and/or network connectivity, according to
techniques that are well known in the art. Such an electronic
device may be portable or non-portable. Examples of electronic
devices that may be used for implementing the invention include: a
mobile phone, personal digital assistant, smartphone, kiosk, server
computer, enterprise computing device, desktop computer, laptop
computer, tablet computer, consumer electronic device, or the like.
An electronic device for implementing the present invention may use
any operating system such as, for example and without limitation:
Linux; Microsoft Windows, available from Microsoft Corporation of
Redmond, Wash.; Mac OS X, available from Apple Inc. of Cupertino,
Calif.; iOS, available from Apple Inc. of Cupertino, Calif.;
Android, available from Google, Inc. of Mountain View, Calif.;
and/or any other operating system that is adapted for use on the
device.
[0139] While the invention has been described with respect to a
limited number of embodiments, those skilled in the art, having
benefit of the above description, will appreciate that other
embodiments may be devised which do not depart from the scope of
the present invention as described herein. In addition, it should
be noted that the language used in the specification has been
principally selected for readability and instructional purposes,
and may not have been selected to delineate or circumscribe the
inventive subject matter. Accordingly, the disclosure of the
present invention is intended to be illustrative, but not limiting,
of the scope of the invention, which is set forth in the
claims.
* * * * *