U.S. patent application number 14/483784 was filed with the patent office on 2014-09-11 and published on 2014-12-25 for document provenance scoring based on changes between document versions.
The applicant listed for this patent is International Business Machines Corporation. Invention is credited to Kenytt D. Avery, Edward L. Bader, Jean-Marc Costecalde, Chi M. Nguyen, Kevin N. Trinh.
Application Number | 20140379657 14/483784 |
Family ID | 51533184 |
Filed Date | 2014-09-11 |
United States Patent Application | 20140379657 |
Kind Code | A1 |
Avery; Kenytt D.; et al. | December 25, 2014 |
Document Provenance Scoring Based On Changes Between Document
Versions
Abstract
A computer-implemented method, system and computer program
product are provided for optimizing a document change or provenance
scoring system by weighting sections of a document, scoring the
changes for each section, and then combining the change scores for
each section to generate an overall change score. An associated
report may also be generated that catalogs all of the various
scoring elements. The weighted score is stored in a document
management system and provides a human document reviewer a level of
detail to evaluate document changes. Accordingly, the weighted
score reveals whether a document's changes require a brief or
detailed review before the document's changes are approved for a
next document version.
Inventors: |
Avery; Kenytt D.; (Newport
Beach, CA) ; Bader; Edward L.; (Los Angeles, CA)
; Costecalde; Jean-Marc; (Irvine, CA) ; Nguyen;
Chi M.; (Irvine, CA) ; Trinh; Kevin N.;
(Garden Grove, CA) |
Applicant: |
Name | City | State | Country | Type |
International Business Machines Corporation | Armonk | NY | US | |
Family ID: |
51533184 |
Appl. No.: |
14/483784 |
Filed: |
September 11, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13804862 | Mar 14, 2013 | |
14483784 | | |
Current U.S.
Class: |
707/638 ;
707/748 |
Current CPC
Class: |
G06F 16/345 20190101;
G06F 16/313 20190101; G06F 16/148 20190101; G06F 16/93 20190101;
G06F 16/3334 20190101; G06F 16/245 20190101; G06F 16/338
20190101 |
Class at
Publication: |
707/638 ;
707/748 |
International
Class: |
G06F 17/30 20060101
G06F017/30 |
Claims
1. A computer-implemented method of indicating a degree of changes
within a document comprising: partitioning the document into a
plurality of sections and assigning each section a corresponding
weight value; determining a quantity of changes within each section
and applying the corresponding weight values to the quantity of
changes to produce a section value for each section; and
determining a change value for the document indicating the degree
of changes based on the section values.
2. The computer-implemented method of claim 1, further comprising:
identifying the document based on a comparison of the change value
with a threshold.
3. The computer-implemented method of claim 1, further comprising:
determining the change value for a plurality of versions of the
document; and comparing the change values for the plurality of
versions to identify a particular version.
4. The computer-implemented method of claim 1, wherein partitioning
includes partitioning the document based on one or more of defined
section labels and content of the document.
5. The computer-implemented method of claim 1, wherein assigning
includes one or more of manually generating the corresponding
weight for at least one section and automatically generating the
corresponding weight for at least one section based on document
content.
6. The computer-implemented method of claim 1, wherein determining
a change value includes calculating the change value as one of: a
summation of the quantity of changes within each section multiplied
by the corresponding weight values; and a summation of the quantity
of changes within each section multiplied by the corresponding
weight values and divided by a quantity of the plurality of
sections.
7. The computer-implemented method of claim 1, wherein each section
is associated with a provenance value, and the method further
comprising: combining the section values of the document with
provenance values of a previous version of the document to produce
a new provenance value for each section; and determining a current
provenance value for the document based on the new provenance
values for each section.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent
application Ser. No. 13/804,862, entitled "DOCUMENT PROVENANCE
SCORING BASED ON CHANGES BETWEEN DOCUMENT VERSIONS" and filed Mar.
14, 2013, the disclosure of which is incorporated herein by
reference in its entirety.
BACKGROUND
[0002] 1. Technical Field
[0003] Present invention embodiments relate to scoring changes in a
document, and more specifically, to generating a provenance score
for changes between document versions by weighting various document
sections.
[0004] 2. Discussion of the Related Art
[0005] In a document or content management system (CMS), tracking
changes between versions of a document is relevant for a reviewer
in the document approval process. Typically, changes in the
contents of a document are measured based on the number of words
changed, where each word is treated equally, and is generally
expressed as a numeric "provenance" value, e.g., based on a number
of words that changed from one version of the document to the next.
However, in many circumstances, documents may contain various
sections with different levels of importance. For example, changes
in a document's introduction section may not be as significant as
changes in a document's body, summary or abstract sections.
[0006] One current solution is to review a document manually. When
a change is made to the document, the reviewer reviews the
document, and evaluates the significance of the changes. A manual
review process is both costly and time consuming since many changes
are to merely fix spelling and grammar errors. In other cases,
document editing merely changes words in the table of contents or
the introduction, yet these changes still require the reviewer to
read and approve all of these minor document changes.
BRIEF SUMMARY
[0007] According to one embodiment of the present invention, a
system is provided for indicating a degree of changes within a
document and includes at least one processor. The system partitions
the document into a plurality of sections. Each section of the
document is assigned a corresponding weight value. A quantity of
changes within each section is determined and the corresponding
weight values are applied to the quantity of changes to produce a
section value for each section. A change value is determined for
the document indicating the degree of changes based on the section
values. Embodiments of the present invention further include a
method and program product for indicating a degree of changes
substantially in the same manner described above.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0008] Generally, like reference numerals in the various figures
are utilized to designate like components.
[0009] FIG. 1 is a diagrammatic illustration of an example
computing environment for use with an embodiment of the present
invention.
[0010] FIG. 2 is a flow chart diagram illustrating a manner in
which example document sections are weighted and scored based on
changes in insignificant sections according to an embodiment of the
present invention.
[0011] FIG. 3 is a flow chart diagram illustrating a manner in
which example document sections are weighted according to a second
example and scored based on changes in significant sections
according to an embodiment of the present invention.
[0012] FIGS. 4A and 4B are a procedural flow chart illustrating a
manner in which changes in document versions are scored according
to an embodiment of the present invention.
DETAILED DESCRIPTION
[0013] Present invention embodiments provide a document change or
provenance scoring system. The provenance score is determined by
weighting sections of a document, scoring the changes for each
section, and then combining the change scores for each section to
generate an overall change score or weighted change score. An
associated report may also be generated that catalogs all of the
various scoring elements. The change score is forwarded to a
document management system and/or to a human reviewer and provides
a mechanism to evaluate document changes. The change score is
stored in a document management system and provides a human
document reviewer a level of detail to evaluate document changes.
Accordingly, the change score reveals whether a document's changes
require a brief or detailed review before the document's changes
are approved for a next document version.
[0014] An example environment for use with present invention
embodiments is illustrated in FIG. 1. Specifically, the environment
includes one or more server systems 10, and one or more client or
end-user systems 14. Server systems 10 and client systems 14 may be
remote from each other and communicate over a network 12. The
network may be implemented by any number of any suitable
communications media (e.g., wide area network (WAN), local area
network (LAN), Internet, intranet, etc.). Alternatively, server
systems 10 and client systems 14 may be local to each other, and
communicate via any appropriate local communication medium (e.g.,
local area network (LAN), hardwire, wireless link, intranet,
etc.).
[0015] Client systems 14 enable users to provide request
information related to desired documents (e.g., documents, web
sites, news stories, etc.) to server systems 10. In another
example, the information and requests may be provided directly to
the server. The server systems include a weighting module 16 to
generate and assign weights to various sections of a document
(e.g., the introduction, body, abstract, etc.), and a scoring
module 20 to score the document based on changes in the document
from one version of the document to the next. A database system 18
may store various information for weighting and scoring documents
(e.g., collections of documents, document section weight values,
and change scores, etc.). The database system may be implemented by
any conventional or other database or storage unit, may be local to
or remote from server systems 10 and client systems 14, and may
communicate via any appropriate communication medium (e.g., local
area network (LAN), wide area network (WAN), Internet, hardwire,
wireless link, intranet, etc.). The client systems may present a
graphical user (e.g., GUI, etc.) or other interface (e.g., command
line prompts, menu screens, etc.) to solicit information from users
pertaining to document scoring, and may provide reports including
document change scores (e.g., document links, document version
change history, etc.).
[0016] Server systems 10 and client systems 14 may be implemented
by any conventional or other computer systems preferably equipped
with a display or monitor, a base (e.g., including at least one
processor 15, one or more memories 35 and/or internal or external
network interfaces or communications devices 25 (e.g., modem,
network cards, etc.)), optional input devices (e.g., a keyboard,
mouse or other input device), and any commercially available and
custom software (e.g., server/communications software, weighting
module, scoring module, browser/interface software, etc.).
[0017] Alternatively, one or more client systems 14 may perform
document change scoring when operating as a stand-alone unit. In a
stand-alone mode of operation, the client system stores or has
access to the data (e.g., document links, document section weight
values, etc.), and includes weighting module 16 and scoring module
20 to perform document change scoring. The graphical user (e.g.,
GUI, etc.) or other interface (e.g., command line prompts, menu
screens, etc.) solicits information from a corresponding user
pertaining to database searches, and may provide reports including
search results (e.g., document links, document relevance scores,
etc.).
[0018] Weighting module 16 and scoring module 20 may include one or
more modules or units to perform the various functions of present
invention embodiments described below. The various modules (e.g.,
weighting module, scoring module, etc.) may be implemented by any
combination of any quantity of software and/or hardware modules or
units, and may reside within memory 35 of the server and/or client
systems for execution by processor 15.
[0019] A first example of a manner in which weighting module 16 and
scoring module 20 (e.g., via a server system 10 and/or client
system 14) performs document change scoring for a document with
changes to insignificant sections according to an embodiment of the
present invention is illustrated in FIG. 2. As viewed in FIG. 2,
there is a document section 205, a weighting section 240, and a
scoring section 270. A test plan document revised from an original
test plan is depicted in abstract form at 205. The document has
four sections, an introduction 210, a schedule and resources
section 215, a feature tests section 220, and an appendix 225.
[0020] The introduction 210 has 50% changes, schedule and resources
section 215 and feature tests section 220 both have 0% changes,
while appendix 225 has 40% changes. Many techniques are available
for determining changes between document versions. For example,
lines or words from a document may be added, removed, or changed.
Given that an original document starts with 10 lines and then the
original document is edited by removing three lines to produce a
second version, changes can be determined in a number of ways. A
first possible technique is to produce changes relative to the
original document. For example, the percentage of change could be 3
lines out of 10 or 30% ((3 lines/10 lines*100%)=30%). In another
example, the change could be a percentage of original content
remaining in the second version or 7 lines. The percent of the
original content is computed to be 70% ((7 lines/10
lines*100%)=70%).
[0021] A second technique is to produce a change percentage
relative to the second version. For example, the change percentage
could be 3 lines out of 7 remaining lines or approximately 43% ((3
lines/7 lines*100%)≈43%). In another example, the change
percentage could be a percentage of original content remaining in
the second version or 70% ((7 lines/10 lines*100%)=70%). The same
calculations apply when the number of original words and
changed/deleted words are used instead of a number of lines. Since
the techniques described herein contemplate tracking a document
across plural revisions (e.g., original, version 2, version 3,
etc.), scoring based on a previous version or sections thereof, or
across several previous versions, provides a consistent and
intuitive scoring technique.
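By way of illustration only, the change percentages discussed above may be sketched in Python as follows. The function names and the use of a simple count of removed lines are assumptions made for this sketch and do not appear in the specification; the same arithmetic applies when words are counted instead of lines.

```python
def percent_changed_vs_original(original_units: int, removed_units: int) -> float:
    """Removed units as a percentage of the original version's size.

    Assumes the original version is not empty.
    """
    return removed_units * 100.0 / original_units


def percent_remaining_of_original(original_units: int, removed_units: int) -> float:
    """Surviving original units as a percentage of the original version's size."""
    return (original_units - removed_units) * 100.0 / original_units


def percent_changed_vs_new(original_units: int, removed_units: int) -> float:
    """Removed units as a percentage of the size of the second version.

    Assumes at least one unit remains in the second version.
    """
    remaining = original_units - removed_units
    return removed_units * 100.0 / remaining


# The 10-line document with three lines removed, as in the example above.
vs_original = percent_changed_vs_original(10, 3)      # 30% of the original changed
remaining_pct = percent_remaining_of_original(10, 3)  # 70% of the original remains
vs_new = percent_changed_vs_new(10, 3)                # roughly 43% relative to version 2
```

Which denominator is used matters less than using the same one for every version, so that scores remain comparable across revisions.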
[0022] At this point, a simple change score may be calculated or
generated at 230. The change percentages from introduction 210 and
appendix 225 are averaged across all four document sections, i.e.,
introduction 210, schedule and resources section 215, feature tests
section 220, and appendix 225. The 50% and 40% changes from
introduction 210 and appendix 225, respectively, add up to 90%
changes that are divided by four (sections) to yield an average
change of 22.5% per section as indicated at 230. Document changes
and change percentages may be determined at the time of version
storage in advance and stored on one of server systems 10 or client
systems 14. Alternatively, document change percentages may be
generated as part of executing weighting module 16 and/or scoring
module 20 at run time.
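The simple (unweighted) change score indicated at 230 may be sketched, under the assumption that per-section change percentages are already available, as a plain average. The dictionary keys and the function name below are illustrative only.

```python
def simple_change_score(section_changes: dict) -> float:
    """Average the per-section change percentages over all sections."""
    if not section_changes:
        raise ValueError("at least one section is required")
    return sum(section_changes.values()) / len(section_changes)


# Change percentages for the four sections of the FIG. 2 example.
fig2_changes = {
    "introduction": 50.0,
    "schedule and resources": 0.0,
    "feature tests": 0.0,
    "appendix": 40.0,
}

# (50 + 0 + 0 + 40) / 4 = 22.5 percent per section, as indicated at 230.
fig2_simple = simple_change_score(fig2_changes)
```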
[0023] However, in order to provide a more complete score according
to the techniques described herein, the weighting section 240
provides a relative weight for each section of a particular
document, i.e., weights 245, 250, 255, and 260, for document
sections, 210, 215, 220, and 225, respectively. The weights may be
those entered at step 410 or generated automatically (e.g., as
generated by weighting module 16) using known characteristics of
the terms employed within a document (e.g., using linguistic
analysis). In this example, introduction 210 is given a weight 245
of 0.1, scheduling section 215 is given a weight 250 of 1.0,
feature test section 220 is given a weight 255 of 10.0, and
appendix 225 is given a weight 260 of 0.5. Thus, the range of
weights from weight 255 of 10.0 to weight 245 of 0.1 indicates that
changes in the feature test section 220 are considered to be one
hundred times more relevant than changes in introduction 210 (i.e.,
weight 255 of 10.0 divided by a weight 245 of 0.1 is equal to
100 (10.0/0.1)).
[0024] Once changes in the various document sections 205 are
determined or identified, the weights can be applied to each
document section 210, 215, 220, and 225 in document 205 in
weighting section 240. Potential change scores are calculated by
scoring section 270 (e.g., by way of scoring module 20). Scoring
section 270 shows two possible change scoring values. The first is
a raw change score of 25% or 25 as calculated in the lower part of
scoring section 270. The upper score in section 270 is the same
score of 25 divided by the number of document sections, in this
case 4, to yield a score of 6.25% changes or 6.25.
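One possible way to express the two scores of scoring section 270 is sketched below. The sketch assumes that each section value is the change percentage multiplied by the section weight, that the raw score is the sum of the section values, and that the second score divides the raw score by the number of sections. The function and variable names are hypothetical.

```python
def section_values(changes: dict, weights: dict) -> dict:
    """Apply each section's weight to its change percentage."""
    return {name: changes[name] * weights[name] for name in changes}


def change_scores(changes: dict, weights: dict) -> tuple:
    """Return (raw weighted score, raw score divided by the section count)."""
    values = section_values(changes, weights)
    raw = sum(values.values())
    return raw, raw / len(values)


# Weights 245, 250, 255 and 260 and the change percentages of FIG. 2.
fig2_weights = {
    "introduction": 0.1,
    "schedule and resources": 1.0,
    "feature tests": 10.0,
    "appendix": 0.5,
}
fig2_changes = {
    "introduction": 50.0,
    "schedule and resources": 0.0,
    "feature tests": 0.0,
    "appendix": 40.0,
}

# 50*0.1 + 0*1.0 + 0*10.0 + 40*0.5 = 25, and 25 / 4 = 6.25.
fig2_values = section_values(fig2_changes, fig2_weights)
fig2_raw, fig2_per_section = change_scores(fig2_changes, fig2_weights)
```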
[0025] A second example of a manner in which weighting module 16
and scoring module 20 (e.g., via a server system 10 and/or client
system 14) performs document change scoring for a document with
changes to significant sections according to an embodiment of the
present invention is illustrated in FIG. 3. As viewed in FIG. 3,
there is a document section 305, a weighting section 340, and a
scoring section 370. A test plan document revised from an original
test plan (e.g., like that shown in FIG. 2) is depicted in abstract
form at 305. The document has four sections, an introduction 310, a
schedule and resources section 315, a feature tests section 320,
and an appendix 325.
[0026] The introduction 310 has 0% changes, schedule and resources
section 315 has 0% changes, feature tests section 320 has 10%
changes, while appendix 325 has 0% changes. As described above,
many techniques are available for determining changes between
document versions (e.g., number of words or lines changed) and
recorded as a document history or provenance value that is
typically presented to a user as a percentage.
[0027] The provenance percentage value, whether a percent change or
a percent of original content remaining, may be recorded as a
provenance or document lineage metadata attribute. Accordingly,
data differences between an original document and future document
versions, as well as between document versions, may be recorded as
lineage and/or provenance data and any additional ancillary data or
document attributes (e.g., document change metadata) in order to
record the entire change history of a document. In this manner, a
document's complete history can be presented to a user.
[0028] The results of the document change or difference processing
(e.g., as executed by weighting module 16 and scoring module 20)
can also be presented in other ways than a percentage value or
numerical figure. For example, for text documents, there can be
various types of "intelligent reporting" of the changes that
triggered the provenance and lineage changes, such as denoting
which user made which changes (e.g., changes made by an engineer may
be of more import than changes made by a document standards clerk
to ensure a proper document format).
[0029] At this point, a simple change percentage may be calculated
or generated at 330. The change percentage from feature tests
section 320 is averaged across all four document sections, i.e.,
introduction 310, schedule and resources section 315, feature tests
section 320, and appendix 325. The 10% change from feature tests
section 320 is divided by four (sections) to yield an average
change of 2.5% per section as indicated at 330.
[0030] As described in connection with FIG. 2, a more complete
score according to the techniques described herein is provided by
weighting section 340 that provides a relative weight for each
corresponding document section, i.e., weights 345, 350, 355, and
360, for document sections, 310, 315, 320, and 325, respectively
(e.g., as executed by weighting module 16 and scoring module 20).
In this example, the document sections are given the same weights
as those given in FIG. 2. Introduction 310 is given a weight 345 of
0.1, scheduling section 315 is given a weight 350 of 1.0, feature
test section 320 is given a weight 355 of 10.0, and appendix 325 is
given a weight 360 of 0.5.
[0031] Once changes in the various document sections 305 are
determined or identified, the weights can be applied to each
document section 310, 315, 320, and 325 using weights in weighting
section 340. Potential change scores are calculated by scoring
section 370 (e.g., by way of scoring module 20). Scoring section
370 shows two possible change scoring values. The first is a raw
change score of 100% or 100 as calculated in the lower part of
scoring section 370. The upper score in section 370 is the same
score of 100 divided by the number of document sections, in this
case 4, to yield a score of 25% changes or 25.
[0032] Two examples of document change scoring have been described
in connection with FIGS. 2 and 3. Even though the example in FIG. 3
had 10% changes when compared to the 90% changes for the example
shown in FIG. 2, the example in FIG. 3 generated a raw score of
100, which is four times greater than the raw score of 25 generated
for FIG. 2, since the changes occurred in a section having a much
greater significance (as indicated by the corresponding
weight).
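The contrast between the two examples may be illustrated with a short side-by-side computation. This is a sketch only: the names below are hypothetical, and the unweighted figure is taken to be the plain sum of the section change percentages.

```python
# Section weights shared by the FIG. 2 and FIG. 3 examples.
WEIGHTS = {
    "introduction": 0.1,
    "schedule and resources": 1.0,
    "feature tests": 10.0,
    "appendix": 0.5,
}

# Per-section change percentages for each example.
FIG2 = {"introduction": 50.0, "schedule and resources": 0.0,
        "feature tests": 0.0, "appendix": 40.0}
FIG3 = {"introduction": 0.0, "schedule and resources": 0.0,
        "feature tests": 10.0, "appendix": 0.0}


def total_unweighted(changes: dict) -> float:
    """Sum of change percentages with every section treated equally."""
    return sum(changes.values())


def raw_weighted(changes: dict, weights: dict) -> float:
    """Sum of change percentages after applying the section weights."""
    return sum(changes[name] * weights[name] for name in changes)


# Maps each example to (unweighted total, raw weighted score).
summary = {
    label: (total_unweighted(changes), raw_weighted(changes, WEIGHTS))
    for label, changes in (("FIG. 2", FIG2), ("FIG. 3", FIG3))
}

# FIG. 3 has fewer total changes (10 versus 90) but a raw weighted score
# four times larger (100 versus 25).
ratio = summary["FIG. 3"][1] / summary["FIG. 2"][1]
```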
[0033] A more generalized manner in which weighting module 16 and
scoring module 20 (e.g., via a server system 10 and/or client
system 14) performs document change scoring according to an
embodiment of the present invention is further illustrated in FIGS.
4A and 4B. A client device or user saves a new or original document
to a storage device or server at step 405. For example, a newly
created document is stored and catalogued into a content management
system (CMS).
[0034] In a CMS, data or content can be defined as documents,
movies, pictures, phone numbers, scientific data, and the like.
CMSs are frequently used for storing, controlling, revising, and
publishing documents. CMSs may enable a large number of people to
collaborate, contribute to and share stored data. The CMS controls
access to data based on user roles that define which information
users or user groups can view, edit, publish, etc. The CMS manages
storage and retrieval of data, reduces repetitive or duplicate
input, improves communication between users, and assists in report
generation (e.g., document change reports). Serving as a central
repository, the CMS typically increases the version level of new
updates to an already existing file (e.g., the CMS has the ability
to collect and track data for content in the CMS, which may include
authors, change dates and file versions, as well as document change
metrics).
[0035] An administrative user (e.g., by way of client systems 14)
defines both document sections (e.g., sections like those described
for test documents 205 and 305 above) and the weights to be
assigned to the various sections at step 410. The definition of
sections and weights may be used by all subsequent users editing
subsequent versions of the document. Alternatively, the assigned
weights may be automatically assigned by the CMS based on known
characteristics of documents within the CMS. By way of example, in
any given environment such as for a device test environment, design
documents, test documents, and device version release documents may
be part of a CMS for a company's product line. As such, these types
of documents will each have a similar structure and a document
section weight may be automatically assigned by weighting module 16
by parsing any given document based on known or learned document
structures and/or document sections.
[0036] At this point in a document's history, any given
organization may trigger a document review process at step 415. The
document review process determines whether the content in the new
document meets organizational standards, and whether the associated
section and weight definitions are appropriate prior to releasing
the new document as an original document version (or subsequent
document version). A server (e.g., a CMS server as one of server
systems 10) saves the document content and section weight
definitions at step 420.
[0037] When a document revision or update is needed, a user
retrieves the document (e.g., via client systems 14) at step 425.
The server (e.g., server systems 10) returns the document at step
430. The user updates the document content at step 435. The server
receives a document save request from the user at step 440. At this
point, the server uses the section definitions stored at step 420
to split the new document version into sections at step 445 (e.g.,
by weighting module 16).
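The splitting of step 445 is not tied to any particular mechanism. One minimal sketch, assuming that the stored section definitions are heading labels and that each heading occupies its own line in the document text, is given below. The function name and the sample document are hypothetical.

```python
def split_into_sections(text: str, labels: list) -> dict:
    """Group the lines of a document under the defined section labels.

    A line that exactly matches a label (ignoring surrounding whitespace)
    starts that section. Lines before the first label are ignored.
    """
    sections = {label: [] for label in labels}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in sections:
            current = stripped
        elif current is not None:
            sections[current].append(line)
    return sections


# Hypothetical section definitions of the kind saved at step 420.
section_labels = ["Introduction", "Schedule and Resources",
                  "Feature Tests", "Appendix"]

sample_document = "\n".join([
    "Introduction",
    "This plan covers release testing.",
    "Schedule and Resources",
    "Two testers for three weeks.",
    "Feature Tests",
    "Test login.",
    "Test logout.",
    "Appendix",
    "Glossary of terms.",
])

sample_sections = split_into_sections(sample_document, section_labels)
```

Partitioning by paragraph or by keyword concentration would replace the exact-match test with a different boundary rule, leaving the rest of the flow unchanged.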
[0038] Referring to FIG. 4B, the server compares the new document
version (or subsequent document version) sections to the
corresponding sections of the document's previously approved
version at step 450. The server combines differences in each
section with the defined weight for each section to calculate a
weighted provenance value at step 455 (e.g., by scoring module 20).
For example, if a previous document revision had a small change
score that did not trigger a document review, that revision may not
be considered an approved version for the purposes of an
organizational review, but may be a conditionally approved document
for which additional modifications may be based until a review is
warranted (e.g., a formal review). Only the most recent revision
that had gone through a review process and had been approved will
be used to compare against.
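The comparison of step 450 against the most recently approved version may be sketched as follows. The sketch uses the Python standard library module difflib as one of many possible differencing techniques. The Version record, its fields, and the choice to express changes relative to the size of the approved section are assumptions made for illustration.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Version:
    number: int
    sections: dict   # section label -> list of lines
    approved: bool   # True only if the version passed a review


def latest_approved(history: list) -> Version:
    """Pick the most recent version that went through review and was approved."""
    approved = [version for version in history if version.approved]
    if not approved:
        raise ValueError("no approved version to compare against")
    return max(approved, key=lambda version: version.number)


def section_change_percent(old_lines: list, new_lines: list) -> float:
    """Changed lines as a percentage of the approved section's length."""
    if not old_lines:
        return 100.0 if new_lines else 0.0
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Count the larger side of each replaced, inserted or deleted run.
            changed += max(i2 - i1, j2 - j1)
    return changed * 100.0 / len(old_lines)


history = [
    Version(1, {"Feature Tests": ["a", "b", "c", "d"]}, approved=True),
    Version(2, {"Feature Tests": ["a", "b", "c", "e"]}, approved=False),
]
new_sections = {"Feature Tests": ["a", "x", "c", "d"]}

# Version 2 was never approved, so version 1 is the comparison baseline.
baseline = latest_approved(history)
feature_tests_change = section_change_percent(
    baseline.sections["Feature Tests"], new_sections["Feature Tests"]
)
```

The per-section percentages produced in this way are the inputs that step 455 combines with the defined weights.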
[0039] At this point in a document's history, the organization may
trigger a document review process at step 465. The document review
process is triggered based on the weighted provenance calculated at
step 455. If the new provenance value indicates that substantial
changes have been made to the new version (e.g., the change score
or provenance value exceeds a threshold), then the organization's
procedures may indicate that review of the new document version is
in order. The server saves the new document version and the
calculated weighted provenance value at step 470.
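The threshold comparison that may trigger a review can be sketched as below. The threshold value of 50 and the shape of the saved record are purely hypothetical; an organization would choose its own threshold and storage format.

```python
REVIEW_THRESHOLD = 50.0  # hypothetical value chosen for this sketch


def save_decision(version_id: str, weighted_provenance: float,
                  threshold: float = REVIEW_THRESHOLD) -> dict:
    """Build the record saved with a new version, flagging whether a review is due."""
    return {
        "version": version_id,
        "weighted_provenance": weighted_provenance,
        "review_required": weighted_provenance > threshold,
    }


# Raw weighted scores from the FIG. 2 and FIG. 3 examples.
fig2_record = save_decision("test-plan-fig2", 25.0)    # below the threshold
fig3_record = save_decision("test-plan-fig3", 100.0)   # exceeds the threshold
```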
[0040] It will be appreciated that the embodiments described above
and illustrated in the drawings represent only a few of the many
ways of implementing document provenance scoring based on changes
between document versions.
[0041] The environment of the present invention embodiments may
include any number of computer or other processing systems (e.g.,
client or end-user systems, server systems, etc.) and databases or
other repositories arranged in any desired fashion, where the
present invention embodiments may be applied to any desired type of
computing environment (e.g., cloud computing, client-server,
network computing, mainframe, stand-alone systems, etc.). The
computer or other processing systems employed by the present
invention embodiments may be implemented by any number of any
personal or other type of computer or processing system (e.g.,
desktop, laptop, PDA, mobile devices, etc.), and may include any
commercially available operating system and any combination of
commercially available and custom software (e.g., browser software,
communications software, server software, weighting module, scoring
module, etc.). These systems may include any types of monitors and
input devices (e.g., keyboard, mouse, voice recognition, etc.) to
enter and/or view information.
[0042] It is to be understood that the software (e.g., weighting
module, scoring module, etc.) of the present invention embodiments
may be implemented in any desired computer language and could be
developed by one of ordinary skill in the computer arts based on
the functional descriptions contained in the specification and flow
diagrams and charts illustrated in the drawings. Further, any
references herein of software performing various functions
generally refer to computer systems or processors performing those
functions under software control. The computer systems of the
present invention embodiments may alternatively be implemented by
any type of hardware and/or other processing circuitry.
[0043] The various functions of the computer or other processing
systems may be distributed in any manner among any number of
software and/or hardware modules or units, processing or computer
systems and/or circuitry, where the computer or processing systems
may be disposed locally or remotely of each other and communicate
via any suitable communications medium (e.g., LAN, WAN, intranet,
Internet, hardwire, modem connection, wireless, etc.). For example,
the functions of the present invention embodiments may be
distributed in any manner among the various end-user/client and
server systems, and/or any other intermediary processing devices.
The software and/or algorithms described above and illustrated in
the flow diagrams and charts may be modified in any manner that
accomplishes the functions described herein. In addition, the
functions in the flow diagrams and charts or description may be
performed in any order that accomplishes a desired operation.
[0044] The software of the present invention embodiments (e.g.,
weighting module, scoring module, etc.) may be available on a
recordable or computer useable medium (e.g., magnetic or optical
mediums, magneto-optic mediums, floppy diskettes, CD-ROM, DVD,
memory devices, etc.) for use on stand-alone systems or systems
connected by a network or other communications medium.
[0045] The communication network may be implemented by any number
of any type of communications network (e.g., LAN, WAN, Internet,
intranet, VPN, etc.). The computer or other processing systems of
the present invention embodiments may include any conventional or
other communications devices to communicate over the network via
any conventional or other protocols. The computer or other
processing systems may utilize any type of connection (e.g., wired,
wireless, etc.) for access to the network. Local communication
media may be implemented by any suitable communication media (e.g.,
local area network (LAN), hardwire, wireless link, intranet,
etc.).
[0046] The system may employ any number of any conventional or
other databases, data stores or storage structures (e.g., files,
databases, data structures, data or other repositories, etc.) to
store information (e.g., documents, document collections, document
section weight values, provenance values, and change scores, etc.).
The database system may be implemented by any number of any
conventional or other databases, data stores or storage structures
(e.g., files, databases, data structures or tables, data or other
repositories, etc.) to store information (e.g., documents, document
collections, document section weight values, provenance values,
change scores, etc.). The database system may be included within or
coupled to the server and/or client systems. The database systems
and/or storage structures may be remote from or local to the
computer or other processing systems, and may store any desired
data (e.g., documents, document collections, document section
weight values, provenance values, and change scores, etc.).
Further, the various tables (e.g., section lists, weighting tables,
provenance or scoring values, etc.) may be implemented by any
conventional or other data structures (e.g., files, arrays, lists,
stacks, queues, etc.) to store information, and may be stored in
any desired storage unit (e.g., database, data or other
repositories, etc.).
[0047] Present invention embodiments may be utilized for
determining any desired provenance information (e.g., change
determination functions, etc.) from any type of document or other
object (e.g., speech transcript, web or other pages, word
processing files, spreadsheet files, presentation files, electronic
mail, multimedia, etc.). The document may contain text in any
written language (e.g., English, Spanish, French, Japanese, etc.).
The partitioning of documents may be based on a logical sectioning
of the document by the author, by paragraphs, or by a concentration
of known keywords of interest (e.g., according to a technology or
based on an organization's keyword priority). For example, a medical
document may be partitioned based on key medical terms, or a
mechanical design document may be partitioned based on terms of the
mechanical arts.
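By way of illustration only, the following is a minimal sketch of one possible partitioning, assuming paragraphs are separated by blank lines and that a paragraph is treated as a section of interest when its concentration of known keywords meets a threshold. The function names, the labels "key" and "other", and the 0.1 threshold are hypothetical and are not taken from the described embodiments.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs on blank lines (assumed delimiter)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def keyword_density(paragraph, keywords):
    """Return the fraction of words in the paragraph that are keywords."""
    words = re.findall(r"[a-z']+", paragraph.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in keywords)
    return hits / len(words)


def label_paragraphs(text, keywords, threshold=0.1):
    """Label each paragraph 'key' or 'other' by keyword concentration.

    The threshold is an illustrative assumption, not a prescribed value.
    """
    return [
        ("key" if keyword_density(p, keywords) >= threshold else "other", p)
        for p in split_paragraphs(text)
    ]
```

For a medical document, the keyword set might hold terms such as "dosage" or "insulin", so that paragraphs dense in those terms form the sections that receive distinct weights.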
[0048] Change percentages may be developed using any manner of
analysis (e.g., linguistic, semantic, statistical, number of lines
changed, number of words changed, etc.) and may be expressed in any
values or units (e.g., raw score, quantity of change, percentage,
etc.). The weighting values may be computed in any manner. For
example, the weight values may be normalized such that the total of all
weights is 1.0 (or 100%), or may be given any real number such as
those weights described above in connection with FIGS. 2 and 3.
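As a non-limiting sketch of the weighting described above, the following example normalizes section weights so that they total 1.0 and combines per-section change percentages into an overall change score. The word-level comparison using Python's standard difflib module is an assumption chosen for illustration; any other manner of analysis could be substituted. All function and section names are hypothetical.

```python
import difflib


def change_percentage(old, new):
    """Estimate the percent of words changed between two section versions.

    Uses difflib's similarity ratio over word lists (an assumed measure).
    """
    old_words, new_words = old.split(), new.split()
    if not old_words and not new_words:
        return 0.0
    ratio = difflib.SequenceMatcher(None, old_words, new_words).ratio()
    return (1.0 - ratio) * 100.0


def normalize(weights):
    """Scale weights so that they sum to 1.0."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: value / total for name, value in weights.items()}


def overall_change_score(old_sections, new_sections, weights):
    """Combine per-section change percentages using normalized weights."""
    norm = normalize(weights)
    return sum(
        norm[name]
        * change_percentage(old_sections.get(name, ""), new_sections.get(name, ""))
        for name in norm
    )
```

With raw weights of 3 for a summary section and 1 for an appendix, the normalized weights are 0.75 and 0.25, so a fully rewritten summary and an unchanged appendix yield an overall score of 75 under this sketch.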
[0049] The present invention embodiments may employ any number of
any type of user interface (e.g., Graphical User Interface (GUI),
command-line, prompt, etc.) for obtaining or providing information
(e.g., documents, document collections, sections for partitioning,
weight values, change reports, etc.), where the interface may
include any information arranged in any fashion. The interface may
include any number of any types of input or actuation mechanisms
(e.g., buttons, icons, fields, boxes, links, etc.) disposed at any
locations to enter/display information and initiate desired actions
via any suitable input devices (e.g., mouse, keyboard, etc.). The
interface screens may include any suitable actuators (e.g., links,
tabs, etc.) to navigate between the screens in any fashion.
[0050] The present invention embodiments are not limited to the
specific tasks or algorithms described above, but may be utilized
for computing change scores or provenance values based on weights
assigned to various sections of documents or other objects.
[0051] The terminology used herein is for the purpose of describing
particular embodiments only and is not intended to be limiting of
the invention. As used herein, the singular forms "a", "an" and
"the" are intended to include the plural forms as well, unless the
context clearly indicates otherwise. It will be further understood
that the terms "comprises", "comprising", "includes", "including",
"has", "have", "having", "with" and the like, when used in this
specification, specify the presence of stated features, integers,
steps, operations, elements, and/or components, but do not preclude
the presence or addition of one or more other features, integers,
steps, operations, elements, components, and/or groups thereof.
[0052] The corresponding structures, materials, acts, and
equivalents of all means or step plus function elements in the
claims below are intended to include any structure, material, or
act for performing the function in combination with other claimed
elements as specifically claimed. The description of the present
invention has been presented for purposes of illustration and
description, but is not intended to be exhaustive or limited to the
invention in the form disclosed. Many modifications and variations
will be apparent to those of ordinary skill in the art without
departing from the scope and spirit of the invention. The
embodiment was chosen and described in order to best explain the
principles of the invention and the practical application, and to
enable others of ordinary skill in the art to understand the
invention for various embodiments with various modifications as are
suited to the particular use contemplated.
[0053] As will be appreciated by one skilled in the art, aspects of
the present invention may be embodied as a system, method or
computer program product. Accordingly, aspects of the present
invention may take the form of an entirely hardware embodiment, an
entirely software embodiment (including firmware, resident
software, micro-code, etc.) or an embodiment combining software and
hardware aspects that may all generally be referred to herein as a
"circuit," "module" or "system." Furthermore, aspects of the
present invention may take the form of a computer program product
embodied in one or more computer readable medium(s) having computer
readable program code embodied thereon.
[0054] Any combination of one or more computer readable medium(s)
may be utilized. The computer readable medium may be a computer
readable signal medium or a computer readable storage medium. A
computer readable storage medium may be, for example, but not
limited to, an electronic, magnetic, optical, electromagnetic,
infrared, or semiconductor system, apparatus, or device, or any
suitable combination of the foregoing. More specific examples (a
non-exhaustive list) of the computer readable storage medium would
include the following: an electrical connection having one or more
wires, a portable computer diskette, a hard disk, a random access
memory (RAM), a read-only memory (ROM), an erasable programmable
read-only memory (EPROM or Flash memory), an optical fiber, a
portable compact disc read-only memory (CD-ROM), an optical storage
device, a magnetic storage device, or any suitable combination of
the foregoing. In the context of this document, a computer readable
storage medium may be any tangible medium that can contain, or
store a program for use by or in connection with an instruction
execution system, apparatus, or device.
[0055] A computer readable signal medium may include a propagated
data signal with computer readable program code embodied therein,
for example, in baseband or as part of a carrier wave. Such a
propagated signal may take any of a variety of forms, including,
but not limited to, electro-magnetic, optical, or any suitable
combination thereof. A computer readable signal medium may be any
computer readable medium that is not a computer readable storage
medium and that can communicate, propagate, or transport a program
for use by or in connection with an instruction execution system,
apparatus, or device.
[0056] Program code embodied on a computer readable medium may be
transmitted using any appropriate medium, including but not limited
to wireless, wireline, optical fiber cable, RF, etc., or any
suitable combination of the foregoing.
[0057] Computer program code for carrying out operations for
aspects of the present invention may be written in any combination
of one or more programming languages, including an object oriented
programming language such as Java, Smalltalk, C++ or the like and
conventional procedural programming languages, such as the "C"
programming language or similar programming languages. The program
code may execute entirely on the user's computer, partly on the
user's computer, as a stand-alone software package, partly on the
user's computer and partly on a remote computer or entirely on the
remote computer or server. In the latter scenario, the remote
computer may be connected to the user's computer through any type
of network, including a local area network (LAN) or a wide area
network (WAN), or the connection may be made to an external
computer (for example, through the Internet using an Internet
Service Provider).
[0058] Aspects of the present invention are described with
reference to flowchart illustrations and/or block diagrams of
methods, apparatus (systems) and computer program products
according to embodiments of the invention. It will be understood
that each block of the flowchart illustrations and/or block
diagrams, and combinations of blocks in the flowchart illustrations
and/or block diagrams, can be implemented by computer program
instructions. These computer program instructions may be provided
to a processor of a general purpose computer, special purpose
computer, or other programmable data processing apparatus to
produce a machine, such that the instructions, which execute via
the processor of the computer or other programmable data processing
apparatus, create means for implementing the functions/acts
specified in the flowchart and/or block diagram block or
blocks.
[0059] These computer program instructions may also be stored in a
computer readable medium that can direct a computer, other
programmable data processing apparatus, or other devices to
function in a particular manner, such that the instructions stored
in the computer readable medium produce an article of manufacture
including instructions which implement the function/act specified
in the flowchart and/or block diagram block or blocks.
[0060] The computer program instructions may also be loaded onto a
computer, other programmable data processing apparatus, or other
devices to cause a series of operational steps to be performed on
the computer, other programmable apparatus or other devices to
produce a computer implemented process such that the instructions
which execute on the computer or other programmable apparatus
provide processes for implementing the functions/acts specified in
the flowchart and/or block diagram block or blocks.
[0061] The flowchart and block diagrams in the Figures illustrate
the architecture, functionality, and operation of possible
implementations of systems, methods and computer program products
according to various embodiments of the present invention. In this
regard, each block in the flowchart or block diagrams may represent
a module, segment, or portion of code, which comprises one or more
executable instructions for implementing the specified logical
function(s). It should also be noted that, in some alternative
implementations, the functions noted in the block may occur out of
the order noted in the figures. For example, two blocks shown in
succession may, in fact, be executed substantially concurrently, or
the blocks may sometimes be executed in the reverse order,
depending upon the functionality involved. It will also be noted
that each block of the block diagrams and/or flowchart
illustration, and combinations of blocks in the block diagrams
and/or flowchart illustration, can be implemented by special
purpose hardware-based systems that perform the specified functions
or acts, or combinations of special purpose hardware and computer
instructions.
* * * * *