U.S. patent application number 12/718056 was filed with the patent office on 2010-03-05 and published on 2010-07-01 for method and system for augmenting web content.
Invention is credited to Craig Allen Gooding, Douglas Stevenson.
Application Number | 12/718056 |
Publication Number | 20100169366 |
Family ID | 33555743 |
Filed Date | 2010-03-05 |
United States Patent Application | 20100169366 |
Kind Code | A1 |
Stevenson; Douglas; et al. | July 1, 2010 |
METHOD AND SYSTEM FOR AUGMENTING WEB CONTENT
Abstract
A system for augmenting data from a source data file with data
from a reference database to generate an augmented data file and
tracking the augmented data file is provided. The system includes a
reference database including at least one reference datum. A
handler component is configured to retrieve a source data file
including a structured datum. A locator component is configured
to locate the structured datum in the source data file, and an
analyzer component is configured to associate the identified
structured datum with one reference datum to create an association
according to an analyzing strategy. A generating component is
configured to generate a hyperlink based upon the association and
to embed the generated hyperlink and an identification code in the
source file to create and track an augmented data file.
Inventors: | Stevenson; Douglas (San Francisco, CA); Gooding; Craig Allen (London, GB) |
Correspondence Address: | CHOATE, HALL & STEWART LLP, TWO INTERNATIONAL PLACE, BOSTON, MA 02110, US |
Family ID: | 33555743 |
Appl. No.: | 12/718056 |
Filed: | March 5, 2010 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11779254 | Jul 17, 2007 | 7698311 |
12718056 | | |
10645313 | Aug 20, 2003 | 7257585 |
11779254 | | |
60484818 | Jul 2, 2003 | |
Current U.S. Class: | 707/769; 705/14.49; 707/E17.014; 715/207; 715/760 |
Current CPC Class: | Y10S 707/99942 20130101; G06Q 30/0251 20130101; G06F 16/9577 20190101; Y10S 707/966 20130101; G06F 16/9566 20190101; Y10S 707/959 20130101 |
Class at Publication: | 707/769; 715/207; 705/14.49; 707/E17.014; 715/760 |
International Class: | G06F 17/00 20060101 G06F017/00; G06F 17/30 20060101 G06F017/30; G06Q 30/00 20060101 G06Q030/00 |
Claims
1. A method for augmenting a keyword upon loading a file into a
browser, the method comprising: (a) opening, by a browser, a file
received by a client; (b) identifying, by a script of the browser
upon opening the file, a datum in the file corresponds to a keyword
to be augmented with content at a uniform resource locator address;
(c) converting, by the script as the file is opened and responsive
to the identification, the datum to a hyperlink that displays the
content from the uniform resource locator address in an overlay;
and (d) displaying, by the browser responsive to placement of a
cursor over the hyperlink, in an area proximate to the hyperlink,
an overlay displaying the content from the uniform resource locator
address.
2. The method of claim 1, wherein step (a) further comprises
opening, by the browser, the file comprising a web page received
from a server.
3. The method of claim 1, wherein step (b) further comprises
identifying the datum in the file that matches the keyword based on
context of content in the file.
4. The method of claim 1, wherein step (b) further comprises
determining that the datum of a text string in the file matches the
keyword found in a reference database, the reference database
comprising keywords to be augmented.
5. The method of claim 1, wherein step (b) further comprises
analyzing a plurality of datum in text of the file to determine
that the datum matches the keyword of a reference database, the
reference database comprising keywords to be augmented.
6. The method of claim 1, wherein step (c) further comprises
receiving from a server a uniform resource locator corresponding to
the keyword matching the datum.
7. The method of claim 1, wherein step (c) further comprises
generating the hyperlink to include an association between the
datum and the keyword.
8. The method of claim 1, wherein step (c) further comprises
generating by the script the hyperlink to be embedded as part of
the datum in the file.
9. The method of claim 1, wherein step (c) further comprises
generating by the script the hyperlink to include the uniform
resource locator to an advertisement.
10. The method of claim 1, wherein step (d) further comprises
displaying by the browser, the file as an augmented file having a
plurality of datum that correspond to keywords and that are
converted by the script to hyperlinks that display overlays to
advertisements.
11. A system for augmenting a keyword upon loading a file into a
browser, the system comprising: a browser on a client opening a
file; a script of the browser, upon opening the file, identifying a
datum in the file corresponds to a keyword to be augmented with
content at a uniform resource locator address and converting,
responsive to the identification, the datum to a hyperlink that
displays in an overlay the content from the uniform resource
locator address; and wherein the browser responsive to placement of
a cursor over the hyperlink displays in an area proximate to the
hyperlink, an overlay displaying the content from the uniform
resource locator address.
12. The system of claim 11, wherein the browser opens the file
comprising a web page received from a server.
13. The system of claim 11, wherein the script identifies the datum
in the file that matches the keyword based on context of content in
the file.
14. The system of claim 11, wherein the script determines that a
text string in the file matches the keyword found in a reference
database, the reference database comprising keywords to be
augmented.
15. The system of claim 11, wherein the script analyzes a plurality
of datum in text of the file to determine that the datum matches
the keyword of a reference database, the reference database
comprising keywords to be augmented.
16. The system of claim 11, wherein the script receives from a
server a uniform resource locator corresponding to the keyword
matching the datum.
17. The system of claim 11, wherein the script generates the
hyperlink to include an association between the datum and the
keyword.
18. The system of claim 11, wherein the script generates the
hyperlink to be embedded as part of the datum in the file.
19. The system of claim 11, wherein the script generates the
hyperlink to include a uniform resource locator to an
advertisement.
20. The system of claim 11, wherein the browser displays the file
as an augmented file having a plurality of datum that correspond to
keywords and that are converted by the script to hyperlinks that
display overlays to advertisements.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. patent application
Ser. No. 11/779,254, filed on Jul. 17, 2007, which claims priority
to U.S. patent application Ser. No. 10/645,313, filed on Aug. 20,
2003, which claims the benefit of U.S. Provisional Patent
Application Ser. No. 60/484,818, filed on Jul. 2, 2003, all of
which are hereby incorporated by reference in their entirety.
BACKGROUND OF THE INVENTION
[0002] Hypertext is the organization of computer-based text into
connected associations enabling a user to quickly access
information that the user chooses. An instance of such an
association is called a hyperlink or hypertext link. Hypertext was
the main concept that led to the invention of the World Wide Web,
which is nothing more than an enormous amount of information
content connected by an enormous number of hyperlinks.
[0003] While the hyperlink has proven to be a successful means of
relating two pieces of information, the process of generating
hyperlinks has proven to be generally tedious. To create a single
link, an author must define the portion of a structured file
(usually a text document, web page, or other form of document),
typically a text string or photo element, from which the hyperlink
originates, and a destination address at which the hyperlink
terminates. In a closed system such as in a local
network of workstations, the destination might be within the same
file, directory, or computer, or the destination may be a
designated file within a designated directory on the network.
[0004] Even with the required information, the knitting together of
hyperlinks still requires some skill. The would-be author of a
document with suitable links must first identify the content of the
file the author seeks to augment and then must use an appropriate
application to edit the file. Generally, the MIME header embedded
in the file identifies the file type. The embedded header allows a
computer software product to recognize the data by virtue of its
Multi-Purpose Internet Mail Extensions ("MIME") type. MIME is an
extension of the original Internet e-mail protocol that lets people
use the protocol to exchange different kinds of data files on the
Internet including audio, video, images, application programs, and
other kinds, including text generally in the ASCII format. Once
identified, the file is opened for review using the appropriate
application as identified by the MIME header.
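As a minimal sketch of the file-type identification described above, the following Python fragment reads a MIME type from a Content-Type header value and, when no header is available, falls back to a guess based on the file name. The function name `identify_content_type` and the sample values are illustrative assumptions and are not part of the disclosed system.

```python
import mimetypes
from email.message import Message


def identify_content_type(header_value, filename=""):
    """Return the MIME type named by a Content-Type header value.

    Hypothetical helper for illustration only. Falls back to a guess
    from the file extension when no header value is supplied.
    """
    if header_value:
        msg = Message()
        msg["Content-Type"] = header_value
        # get_content_type() drops parameters such as charset.
        return msg.get_content_type()
    guessed, _encoding = mimetypes.guess_type(filename)
    # Generic binary type when nothing better is known.
    return guessed or "application/octet-stream"
```

An application could use the returned type to decide whether a file holds text that is worth scanning for structured data.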
[0005] When the network is broader, such as on the Internet, static
addresses on the Internet may be used as destinations. A Uniform
Resource Locator is the address of a file (resource) accessible on
the Internet. The type of resource depends on the Internet
application protocol. Using the World Wide Web's protocol,
Hypertext Transfer Protocol (HTTP), the resource can be an HTML
page, an image file, a program such as a common gateway interface
application or Java applet, or any other file type supported by
HTTP. The URL contains the name of the protocol required to access
the resource, a domain name that identifies a specific computer on
the Internet, and presents a hierarchical description of a file
location on the computer.
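The three parts of a URL named above (protocol, domain name, and hierarchical file location) can be illustrated with a short Python sketch. The function name `describe_url` and the example address are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts described in the text (illustrative)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,      # e.g. "http"
        "domain": parts.hostname,      # the computer on the Internet
        # Hierarchical description of the file location on that computer.
        "path_segments": [s for s in parts.path.split("/") if s],
    }
```

For the hypothetical address `http://www.example.com/docs/guide/index.html`, the protocol is `http`, the domain is `www.example.com`, and the path segments are `docs`, `guide`, and `index.html`.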
[0006] For this reason, the content of resources may not have all
of the links that would be useful. Old data might direct browsers
to addresses where no data is now stored. "Link rot" describes a
gradual loss of data at URLs linked to documents. This gradual
loss occurs when a destination document is removed while the link
in an originating document to the destination document remains. The
reader receives a "404 message," an arbitrarily assigned code
indicating that the page to which the reader has directed the
browser no longer exists at the designated address. Another form of
link rot occurs when the destination page has been changed in
content and is no longer relevant according to the sending
description.
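The first form of link rot described above can be detected mechanically, as the following Python sketch suggests. It assumes a caller-supplied function, here called `fetch_status`, that maps a URL to an HTTP status code; that function and the name `find_dead_links` are hypothetical. The second form, in which the destination still exists but its content has changed, cannot be detected from a status code alone.

```python
from http import HTTPStatus


def find_dead_links(links, fetch_status):
    """Return the links whose destination reports "404 Not Found".

    `fetch_status` is injected by the caller so that this sketch
    needs no network access of its own.
    """
    return [url for url in links if fetch_status(url) == HTTPStatus.NOT_FOUND]
```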
[0007] Where parties, such as advertisers, wish to inject links
into existing resources in order to direct the reader's browser to
designated resources, it is critical that the links remain current.
Because files are static, links that are old will not complete the
hyperlink transit and therefore will lose the benefit of the
hyperlink. Fixedly embedding links in the file subjects the file to
link rot.
[0008] There is, therefore, an unmet need in the art for a
publishing system and method for augmenting resources and
maintaining suitable current hyperlinks within the resources.
SUMMARY OF THE INVENTION
[0009] A system for augmenting data from a source data file with
data from a reference database to generate an augmented data file
and tracking the augmented data file is provided. The system
includes a reference database including at least one reference
datum. A handler component is configured to retrieve a source data
file including a structured datum. A locator component is
configured to locate the structured datum in the source data file,
and an analyzer component is configured to associate the identified
structured datum with one reference datum to create an association
according to an analyzing strategy. A generating component is
configured to generate a hyperlink based upon the association and
to embed the generated hyperlink and an identification code in the
source file to create and track an augmented data file.
[0010] One presently preferred embodiment includes a system for
augmenting file content, including web content, with hyperlinks to
designated destinations. The system works based on finding a datum
(a data subset of a file) within a file, recognizing the datum
based upon the contents of a reference database, associating the
datum with a designated resource (in a presently preferred
embodiment by means of a uniform resource locator address) and
generating a hyperlink in the data source file.
[0011] The generated hyperlink (in a presently preferred
embodiment) receives a user-friendly name based on the contents of
the resource located at the uniform resource locator address. An
embodiment is an add-on to a browser allowing the browser to
augment files "on the fly," i.e. where the user directs the browser
to a resource located on a network, the method analyzes the file as
it is opened by the browser, augments the file with appropriate
hyperlinks, and displays the augmented file with active hyperlinks.
"Clicking on" the hyperlink will redirect the browser to the
associated uniform resource locator address.
[0012] Another presently preferred embodiment provides a rigorous
procedure for augmenting files that assures that a greater number
of hyperlinks are more uniformly applied than is the rule with
human authoring of hyperlinks. A reference database can check the
content of the file and will always place a hyperlink where
appropriate based upon context. Because the reference database can
be readily updated, the invention assures redirection to
current resources, preventing the dead link, i.e. "Error 404, file
not found."
[0013] A presently preferred embodiment provides a method and a
software product to add advertisements to existing web content by
hyperlinking occurrences of structured data such as text strings to
resources located at a uniform resource locator address.
[0014] As will be readily appreciated from the foregoing summary,
the invention provides a system and a method for rapidly and for
rigorously augmenting files with hyperlinks.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The preferred and alternative embodiments of the present
invention are described in detail below with reference to the
following drawings.
[0016] FIG. 1 is a flowchart of the method for augmenting source
data files;
[0017] FIG. 2 is a block diagram of a system for augmenting source
data files;
[0018] FIG. 3 is a screen shot of a source data file selected for
augmentation;
[0019] FIG. 4 is a block of text selected for augmentation;
[0020] FIG. 5 is the block of text selected for augmentation based
upon the occurrence of a text string;
[0021] FIG. 6 is an excerpt of an exemplary reference database;
[0022] FIG. 7 is a screen shot of the source data file selected for
augmentation showing hyperlink and associated user-friendly
name;
[0023] FIG. 8 is a screen shot of the contents of a destination
file upon redirection of the browser.
DETAILED DESCRIPTION OF THE INVENTION
[0024] By way of overview, a system for augmenting data from a
source data file with data from a reference database to generate an
augmented data file is provided. The system includes a reference
database including at least one reference datum. A handler
component is configured to retrieve a source data file including
a structured datum. A locator component is configured to locate
the structured datum in the source data file, and an analyzer
component is configured to associate the identified structured
datum with one reference datum to create an association according
to an analyzing strategy. A generating component is configured to
generate a hyperlink based upon the association and to embed the
generated hyperlink in the source file to create an augmented data
file.
[0025] FIG. 1 is a flowchart of the method 9 for augmenting source
data files. At an appropriate starting terminus 10, the method
begins by reading a structured datum from a source data file at a
block 13. The source data file may be one designated by an input
uniform resource locator address or by any suitable means to
designate a resource. Upon opening, the method 9 may optionally
identify the type of content on the page with a content identifier
such as a MIME header. In one embodiment of the invention, the
method 9 merely searches for the presence of a reference datum,
either informed by the content identifier or by simply searching an
occurrence of a well-structured datum within a given file. However,
once the file is open, the method has the contents of the file
available for comparison to a reference database.
[0026] At a block 16, the method 9 locates an occurrence of a
reference datum corresponding to the structured datum read in the
source data file. One presently preferred means of discerning the
correspondent relation between the structured datum and the
reference datum is by a JavaScript call made to a web-enabled
database. The JavaScript then compares the contents of the source
data file with reference data stored in a web-enabled reference
database. In one presently preferred embodiment, the reference and
structured data are keywords. The JavaScript code then extracts the
text from the document and converts all of the keywords in the
document to hyperlinks.
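The keyword strategy described above can be sketched as follows. The disclosure names JavaScript as one preferred means; this illustration uses Python, and the names `REFERENCE` and `link_keywords`, the example addresses, and the whole-word, case-insensitive matching rule are all assumptions made for the sketch.

```python
import html
import re

# Hypothetical reference data: keyword -> destination address.
REFERENCE = {
    "security": "http://www.example.com/security",
    "web hosting": "http://www.example.com/hosting",
}


def link_keywords(text, reference):
    """Convert each whole-word keyword occurrence in text to a hyperlink."""
    if not reference:
        return html.escape(text)
    # Longest keywords first, so multi-word keywords win over shorter ones.
    keywords = sorted(reference, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): url for k, url in reference.items()}
    out, last = [], 0
    for match in pattern.finditer(text):
        out.append(html.escape(text[last:match.start()]))
        url = lookup[match.group(0).lower()]
        out.append('<a href="%s">%s</a>' % (
            html.escape(url, quote=True), html.escape(match.group(0))))
        last = match.end()
    out.append(html.escape(text[last:]))
    return "".join(out)
```

Matching on word boundaries keeps a keyword such as "security" from being linked inside a longer word such as "insecurity".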
[0027] Other analyzing strategies are also available, though
keywords are a facile and efficient means of generating hyperlinks.
One presently preferred embodiment uses a "fuzzy expert" or a
neural network analysis of the source data file, such as by a
natural language search of the document, to generate a distinct
identifier for the content in the source data file. One advantage
of a natural language search is the ability to better place content
in context, making links more contextually appropriate. For
instance, "security" might relate to the security of a physical
plant, such as a residence, in one source data file and to the
security of a website in another. Natural language searches,
however, create a large processing overhead, making them less
desirable where processing resources are at a premium.
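The disclosure does not detail the "fuzzy expert" or neural network analysis. As a much simpler stand-in that only illustrates the idea of choosing a destination by context, the following Python sketch counts cue words that co-occur with an ambiguous keyword. The cue-word lists, the addresses, and the names `SENSES` and `choose_destination` are hypothetical.

```python
import re

# Hypothetical senses of one ambiguous keyword, each with cue words.
SENSES = {
    "security": [
        {"cues": {"home", "residence", "alarm", "lock"},
         "url": "http://www.example.com/home-security"},
        {"cues": {"website", "server", "password", "firewall"},
         "url": "http://www.example.com/web-security"},
    ],
}


def choose_destination(keyword, text, senses):
    """Pick the destination whose cue words overlap the text the most.

    Returns None when no cue word is present. This is a plain
    word-overlap heuristic, not a natural language search.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    best, best_score = None, 0
    for sense in senses.get(keyword.lower(), []):
        score = len(sense["cues"] & words)
        if score > best_score:
            best, best_score = sense, score
    return best["url"] if best else None
```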
[0028] At a block 19, the method 9 generates an association based
upon the located reference datum in the reference database. The
reference datum will include not only the identifier, such as a
keyword in one embodiment, but also the associated uniform resource
locator address of the intended destination based upon the
occurrence of the identifier in the source data file. Generating an
association means to hyperlink the located structured datum in the
source data file to the associated uniform resource locator as
found in the reference datum in the reference database. The
generated hyperlink might optionally include a user-friendly
description of the content of the resource found at the associated
uniform resource locator address and additionally might include an
additional identification code such as an "advertiser id." In each
embodiment, the generated hyperlink is added to the original source
data file at a block 22 resulting in an augmented data file. Having
generated the augmented data file, the method 9 then terminates at
a block 25.
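One way to picture the generated hyperlink, with its optional user-friendly description and identification code, is the Python sketch below. The class `ReferenceDatum`, the function `generate_hyperlink`, and the use of the `title` and `data-advertiser-id` attributes to carry the optional fields are assumptions made for illustration; the disclosure does not specify how these values are encoded.

```python
import html
from dataclasses import dataclass


@dataclass
class ReferenceDatum:
    """Hypothetical record: identifier, destination, optional extras."""
    keyword: str
    url: str
    description: str = ""
    advertiser_id: str = ""


def generate_hyperlink(matched_text, datum):
    """Build an anchor element for a located structured datum."""
    attrs = ['href="%s"' % html.escape(datum.url, quote=True)]
    if datum.description:
        # Assumed: the user-friendly description travels as a title.
        attrs.append('title="%s"' % html.escape(datum.description, quote=True))
    if datum.advertiser_id:
        # Assumed: the identification code travels as a data attribute.
        attrs.append('data-advertiser-id="%s"'
                     % html.escape(datum.advertiser_id, quote=True))
    return "<a %s>%s</a>" % (" ".join(attrs), html.escape(matched_text))
```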
[0029] FIG. 2 is a block diagram of a system for augmenting source
data files. In this exemplary system, a network 33 is shown in a
presently preferred embodiment. Those skilled in the relevant art
will readily appreciate that the system may be practiced without
the presence of a network link. Also a reference database 39 is
shown as directly connected to the locator 42 and the analyzer 45.
The system is not compromised by network links to any of the
several components. One presently preferred embodiment has the
reference database 39 set on a web-enabled page for remote calls
through the Internet to the database. Like the presence of the
network 33 described above, the absence of a defined link through
the Internet does not compromise the operation of the method.
[0030] A source data file 30 resides on a server on a network 33. A
handler 36 retrieves the source data file 30 for use by the system.
A locator 42 examines the retrieved source data file 30 for
comparison to the reference database 39 according to an analyzing
strategy. The locator 42 designates found structured data from the
source data file 30 and found reference data from the reference
database 39 and provides the reference data to an analyzer 45.
[0031] The analyzer 45 is used to create associations between each
found structured datum and the uniform resource locator address
within the corresponding reference datum found by the locator 42.
These associations define the nature of a hyperlink a generator 48
generates according to the association created at the analyzer 45.
The generator 48 embeds these hyperlinks in the source data file
30. The resulting augmented data file 50 is returned to the handler
36 to reside at a uniform resource locator address on the network
33.
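The division of labor among the handler 36, locator 42, analyzer 45, and generator 48 can be sketched as four small Python functions. This is an illustration under stated assumptions: a dictionary stands in for the network 33, the reference database is a list of dictionaries, matching is by whole-word keyword, and the source text is assumed to need no markup escaping. None of the function names come from the disclosure.

```python
import re


def handler(source_files, address):
    """Retrieve the source data file; a dict stands in for the network."""
    return source_files[address]


def locator(source_text, reference_db):
    """Return (start, end, reference datum) for each keyword occurrence."""
    found = []
    for datum in reference_db:
        pattern = r"\b" + re.escape(datum["keyword"]) + r"\b"
        for m in re.finditer(pattern, source_text, re.IGNORECASE):
            found.append((m.start(), m.end(), datum))
    return sorted(found, key=lambda item: item[0])


def analyzer(found):
    """Associate each located span with its destination address."""
    return [(start, end, datum["url"]) for start, end, datum in found]


def generator(source_text, associations):
    """Embed a hyperlink at each associated span of the source text."""
    out, last = [], 0
    for start, end, url in associations:
        if start < last:  # skip a match that overlaps an earlier one
            continue
        out.append(source_text[last:start])
        out.append('<a href="%s">%s</a>' % (url, source_text[start:end]))
        last = end
    out.append(source_text[last:])
    return "".join(out)


def augment(source_files, address, reference_db):
    """Run the four components in order to produce the augmented file."""
    text = handler(source_files, address)
    return generator(text, analyzer(locator(text, reference_db)))
```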
[0032] FIG. 3 is a screen shot 101 of contents of a source data
file 30 (FIG. 2) selected for augmentation. For the purposes of
this exemplary discussion, a block of text 104 is selected by the
method for augmentation. The present invention does not require the
opening of the source data file 30 in a browser; however, for the
purposes of illustration, the screen shot 101 is provided
herein.
[0033] FIG. 4 is a block of text 107 selected for augmentation. The
locator 42 (FIG. 2) begins its analysis of the selected block by
isolating the block of text 107 as structured data.
[0034] FIG. 5 is the block of text 107 selected for augmentation
indicating the occurrence of a text string 111. The locator 42
isolates the occurrence of the text string 111 because of its
presence in the reference database 39.
[0035] Referring to FIGS. 2 and 6, while the excerpt 115 of the
exemplary reference database 39 shows it to be a "flat file"
database, any database 39, relational, flat file, or other
configuration will suitably fulfill the basic functions of
associating an identifier such as a text string 118, with a uniform
resource locator address 124, and optionally a user-friendly
description 121 of the contents of the file found at the associated
uniform resource locator address 124. (Not shown is the optional
"advertiser id.")
[0036] The locator 42 (FIG. 2) refers to the database as it reviews
the contents of the source data file 30 (FIG. 2). When, according to
the analyzing strategy, the text string 118 occurs at a reference
datum in the source data file 30, the locator 42 provides the uniform
resource locator 124 associated with the found text string 118,
along with the occurrence of the text string in the reference datum,
to the analyzer 45. Upon receiving the ordered pair from the
locator 42, the analyzer 45 creates the association. With
the association from the analyzer 45, the generator 48 creates a
hyperlink. As a result, the text block receives the appropriate
hyperlink.
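A flat-file reference database of the kind shown in the excerpt, and the lookup that yields the ordered pair passed to the analyzer, might be sketched in Python as follows. The tab-separated layout, the column names, the sample records, and the names `load_reference_database` and `lookup` are assumptions; FIG. 6 is not reproduced here and its actual layout may differ.

```python
import csv
import io

# Hypothetical flat file: one record per line, tab-separated columns.
SAMPLE = (
    "keyword\tdescription\turl\n"
    "security\tHome alarm systems\thttp://www.example.com/alarms\n"
    "web hosting\tAffordable hosting plans\thttp://www.example.com/hosting\n"
)


def load_reference_database(handle):
    """Read the flat file into a dict keyed by lower-cased keyword."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["keyword"].lower(): row for row in reader}


def lookup(database, text_string):
    """Return the ordered pair (text string, address), or None if absent."""
    row = database.get(text_string.lower())
    return (text_string, row["url"]) if row else None
```

A relational database would serve equally well; only the association of identifier, address, and optional description matters.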
[0037] FIG. 7 is a screen shot 101 of the augmented data file 50
showing the selected block of text 104, hyperlink 129, and
associated user-friendly name 133. As will readily be appreciated
by those skilled in the relevant art, the augmented data file 50
need not reside on a network. For instance, where a browser is
enhanced with the inventive method 9 (FIG. 1), the browser might be
directed to content on the Internet. Upon opening the source data
file 30 (FIG. 2), the browser will effect the method 9 (FIG. 1)
such that rather than displaying the contents of the source data
file 30, the browser will display the augmented data file 50. The
resulting augmented files 50 would be current at the time of
viewing. In a publisher embodiment, the advertiser might
continually augment files on the network, replacing augmented files
on the network.
[0038] The augmented data file 50 is displayed as set forth in FIG.
7. When a reader places the mouse cursor over the hyperlink 129, a
new layer for display showing the associated user-friendly name 133
may be optionally evoked. The "advertiser id" may optionally be
embedded but is not necessarily visible. This
advertiser id specifically provides a means for tracking the number
of times the hyperlink 129 is activated and generating a record for
tracking revenue due to advertising. The reader clicks on the
hyperlink 129 to direct the browser to a destination file.
[0039] FIG. 8 includes a screen shot 140 of the browser displaying
the contents of the destination file located at the uniform
resource locator address 137 upon redirection of the browser. When
the reader clicks on the hyperlink 129, a new browser window is
opened and directed through a click-tracking server to the
hyperlink destination.
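Routing a click through a click-tracking server, with the advertiser id used to count activations, can be sketched in Python as follows. The tracker address, the query parameter names `adv` and `dest`, and the names `tracking_url` and `ClickTracker` are hypothetical; the disclosure does not specify how the tracking server is addressed.

```python
from collections import Counter
from urllib.parse import parse_qs, urlencode, urlsplit

TRACKER = "http://tracker.example.com/click"  # hypothetical endpoint


def tracking_url(destination, advertiser_id):
    """Wrap a destination so that the click passes through the tracker."""
    query = urlencode({"adv": advertiser_id, "dest": destination})
    return TRACKER + "?" + query


class ClickTracker:
    """In-memory stand-in for the click-tracking server."""

    def __init__(self):
        self.clicks = Counter()

    def handle(self, url):
        """Record one activation and return the address to redirect to."""
        query = parse_qs(urlsplit(url).query)
        self.clicks[query["adv"][0]] += 1
        return query["dest"][0]
```

The per-advertiser counts provide the kind of record from which advertising revenue could be tallied. A deployed tracker would also need to validate the destination before redirecting, which this sketch omits.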
[0040] While the preferred embodiment of the invention has been
illustrated and described, as noted above, many changes can be made
without departing from the spirit and scope of the invention. For
example, the method 9 will generate static files on the network at
distinct uniform resource locator addresses 124 and 137 in order to
distinguish from the original source data files. Accordingly, the
scope of the invention is not limited by the disclosure of the
preferred embodiment. Instead, the invention should be determined
entirely by reference to the claims that follow.
* * * * *