U.S. patent application number 14/570671 was filed with the patent office on 2014-12-15 and published on 2015-04-09 for harvesting data from page.
This patent application is currently assigned to VCVC III LLC. The applicant listed for this patent is VCVC III LLC. Invention is credited to Christopher W. Jones, Nova Spivack, Lewis W. Tucker.
Application Number | 20150100870 14/570671 |
Family ID | 39082904 |
Filed Date | 2014-12-15 |
United States Patent Application | 20150100870 |
Kind Code | A1 |
Spivack; Nova; et al. | April 9, 2015 |
HARVESTING DATA FROM PAGE
Abstract
Among other disclosure, computer-implemented methods and
computer program products for obtaining data from a page. A method
can include initiating a harvesting process for a page available in
a computer system. The method can include identifying a feed
representation that has been created for the page. The method can
include retrieving and storing, as part of the harvesting process,
at least a portion from the page based on information in the
identified feed representation. The feed representation can include
at least excerpts of content from the page. The feed representation
can include at least one representation selected from: an RSS feed,
an Atom feed, an XML feed, an RDF feed, a serialized data feed
representation, and combinations thereof.
Inventors: | Spivack; Nova (Sherman Oaks, CA); Jones; Christopher W. (San Francisco, CA); Tucker; Lewis W. (San Francisco, CA) |
Applicant: |
Name | City | State | Country | Type |
VCVC III LLC | Seattle | WA | US | |
Assignee: | VCVC III LLC (Seattle, WA) |
Family ID: | 39082904 |
Appl. No.: | 14/570671 |
Filed: | December 15, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
11835079 | Aug 7, 2007 | 8924838 |
14570671 | | |
60821891 | Aug 9, 2006 | |
Current U.S. Class: | 715/205 |
Current CPC Class: | G06F 16/958 20190101; G06F 40/14 20200101 |
Class at Publication: | 715/205 |
International Class: | G06F 17/30 20060101 G06F017/30; G06F 17/22 20060101 G06F017/22 |
Claims
1. A computer-implemented method of automatically mining data,
comprising: identifying a first portion of a page that is included
in a feed representation associated with the page and a second
portion of the page that is not included in the feed
representation; retrieving and storing the second portion and not
the first portion of the page.
2. The computer-implemented method of claim 1, wherein identifying
the first portion includes comparing a part of the feed
representation with the page.
3. The computer-implemented method of claim 2, wherein the
comparing includes using a text recognition technique.
4. The computer-implemented method of claim 1, wherein the feed
representation includes one or more of an RSS feed, an Atom feed,
an XML feed, an RDF feed, and a serialized data feed
representation.
5. A computer-implemented method of automatically mining data,
comprising: identifying a page as a target for content retrieval;
identifying a feed representation for the page, the identified feed
representation including multiple feed entries matching portions of
the page; identifying a first of the portions of the page that
match the multiple feed entries and a second portion of the page
that does not match any of the multiple feed entries; retrieving
and storing the second portion and not the first portion of the
page.
6. The computer-implemented method of claim 5, wherein the feed
representation includes excerpts of content from the page.
7. The computer-implemented method of claim 5, wherein the feed
representation comprises one or more of an RSS feed, an Atom feed,
an XML feed, an RDF feed, and a serialized data feed
representation.
8. The computer-implemented method of claim 5, wherein identifying
the first portion includes performing text recognition.
9. The computer-implemented method of claim 5, further comprising:
identifying a second page linked to the first portion of the page;
selecting a third portion of the second page that matches one of
the feed entries; selecting a fourth portion of the second page
that does not match any of the feed entries; retrieving and storing
the fourth portion and not the third portion of the second
page.
10. The computer-implemented method of claim 5, further comprising:
identifying a second page linked to the second portion of the page;
selecting a third portion of the second page that matches one of
the feed entries; selecting a fourth portion of the second page
that does not match any of the feed entries; retrieving and storing
the fourth portion and not the third portion of the second
page.
11. A system for automatically mining data, comprising: a processor
and memory, cooperating to function as: an identifying unit
configured to identify a first portion of a page that is included
in a feed representation associated with the page and a second
portion of the page that is not included in the feed
representation; a retrieving unit configured to retrieve and
store the second portion and not the first portion of the
page.
12. The system of claim 11, wherein identifying the first portion
includes comparing a part of the feed representation with the
page.
13. The system of claim 12, wherein the comparing includes using a
text recognition technique.
14. The system of claim 11, wherein the feed representation
includes one or more of an RSS feed, an Atom feed, an XML feed, an
RDF feed, and a serialized data feed representation.
15. A machine-readable storage medium having stored thereon a set
of instructions which when executed perform a method, the method
comprising: identifying a page as a target for content retrieval;
identifying a feed representation for the page, the identified feed
representation including multiple feed entries matching portions of
the page; identifying a first of the portions of the page that
match the multiple feed entries and a second portion of the page
that does not match any of the multiple feed entries; retrieving
and storing the second portion and not the first portion of the
page.
16. The machine-readable storage medium of claim 15, wherein the
feed representation includes excerpts of content from the page.
17. The machine-readable storage medium of claim 15, wherein the
feed representation comprises one or more of an RSS feed, an Atom
feed, an XML feed, an RDF feed, and a serialized data feed
representation.
18. The machine-readable storage medium of claim 15, wherein
identifying the first portion includes performing text
recognition.
19. The machine-readable storage medium of claim 15, the method
further comprising: identifying a second page linked to the first
portion of the page; selecting a third portion of the second page
that matches one of the feed entries; selecting a fourth portion of
the second page that does not match any of the feed entries;
retrieving and storing the fourth portion and not the third portion
of the second page.
20. The machine-readable storage medium of claim 15, the method
further comprising: identifying a second page linked to the second
portion of the page; selecting a third portion of the second page
that matches one of the feed entries; selecting a fourth portion of
the second page that does not match any of the feed entries;
retrieving and storing the fourth portion and not the third portion
of the second page.
Description
CROSS-REFERENCE TO RELATED APPLICATION(S)
[0001] This patent application is a Continuation of U.S.
application Ser. No. 11/835,079, entitled "HARVESTING DATA FROM
PAGE", filed Aug. 7, 2007, which claims priority to Provisional
Patent Application No. 60/821,891, filed Aug. 9, 2006 and entitled
"HARVESTING DATA FROM PAGE", the entire contents of which are
incorporated herein by reference.
TECHNICAL FIELD
[0002] This document relates to harvesting data from a page.
BACKGROUND
[0003] Current approaches for harvesting content from web pages
face the situation that each site and page can have a unique layout
comprised of multiple components in various places, such as content
sections, ads, frames, columns, content boxes, page content that is
divided into sub-divisions or sub-objects, web page sub-components,
content articles that run or continue across several sections or
pages, etc. In response to this situation, present tools for
crawling and mining content from such pages have sometimes needed
to be specifically programmed for the unique layout and structure
of each site and/or page so that they know where the content of
interest is located in the layout for that site. In other words,
the programming tells the tool what parts of the page represent
content to keep, and content to discard, and what sections of the
page correspond to various types of content. For example, when
mining a news site, it may be desired to collect and operate on the
body text of the news articles in the site, but to ignore the ads
and other sidebar content, etc. Existing approaches in this regard
have involved programming specific mining agents, or programming a
general agent with specific rules or templates. Moreover, such
programming often needs to be kept up to date for each site and/or
page as the content and structure of the sites and pages change
over time. This can be a labor intensive process that does not
scale well to mining very large numbers of sites/pages with
differing layouts and structure. Another existing solution uses
statistical or natural language processing methods, or machine
learning methods, to try to figure out automatically which parts of
sites and pages should be kept and which should be discarded, and
which parts of sites and pages correspond to various types of
content.
SUMMARY
[0004] The invention relates to harvesting data from a page.
[0005] Some implementations of the present invention relate to
automatically determining which parts of a website and/or web page
comprise the "important" or "targeted" text to mine.
[0006] In a first aspect, a computer-implemented method for
obtaining data from a page includes initiating a harvesting process
for a page available in a computer system. The method includes
identifying a feed representation that has been created for the
page. The method includes retrieving and storing, as part of the
harvesting process, at least a portion from the page based on
information in the identified feed representation.
[0007] Implementations can include any, all or none of the
following features. The method can further include identifying,
before retrieving the portion of the page, at least a part of the
feed representation to be used for retrieving the portion from the
page, the part of the feed representation including the
information. The method can further include using the identified
part of the feed representation to identify the portion of the page
as matching the information in the feed representation. Using the
identified part of the feed representation can include comparing
the identified part of the feed representation with contents of the
page. The feed representation can include at least excerpts of
content from the page. The feed representation can include at least
one representation selected from: an RSS feed, an Atom feed, an XML
feed, an RDF feed, a serialized data feed representation, and
combinations thereof. The method can further include identifying at
least one feed entry in the feed representation, the feed entry
relating to a portion of the page that links to another page;
determining a URL of the other page; retrieving page content from
the other page using the determined URL; and identifying, in the
retrieved page content, the portion of the page as matching content
from the identified feed entry. Retrieving the portion from the
page based on the information can include using a text recognition
technique. The method can further include determining, using the
identified feed representation, which content from the page not to
retrieve in the harvesting process; and identifying the portion
from the page as not being included in the determined content of
the page not to retrieve. The method can further include causing
the determined content of the page not to be retrieved in the
harvesting process.
[0008] In a second aspect, a computer program product is tangibly
embodied in a computer-readable medium and includes instructions
that when executed by a processor perform a method for obtaining
data from a page. The method includes initiating a harvesting
process for a page available in a computer system. The method
includes identifying a feed representation that has been created
for the page. The method includes retrieving and storing, as part
of the harvesting process, at least a portion from the page based
on information in the identified feed representation.
[0009] In a third aspect, a computer-implemented method for
obtaining data from a page includes identifying a page as a target
for content retrieval, the page including multiple content
portions. The method includes identifying a feed representation
that has been created for the identified page, the identified feed
representation including multiple feed entries each corresponding
to at least some of the multiple content portions. The method
includes processing each of the multiple feed entries by: accessing
the identified page; identifying any of the multiple content
portions that match contents of the feed entry being processed; and
retrieving at least one of the multiple content portions based on
the identified content portion. The method includes storing, as a
result of the content retrieval, each retrieved content portion
obtained from the processing of the multiple feed entries.
[0010] Implementations can include any, all or none of the
following features. At least a first content portion of the
multiple content portions can link to another page, and processing
a first feed entry of the multiple feed entries relating to the
first content portion can include: accessing the other page to
which the first content portion links; identifying contents of the
accessed other page that match contents of the first feed entry;
and retrieving the identified contents of the accessed other page.
The feed representation can include at least excerpts of content
from the identified page. The feed representation can include at
least one representation selected from: an RSS feed, an Atom feed,
an XML feed, an RDF feed, a serialized data feed representation,
and combinations thereof. Identifying contents of the accessed page
that match contents of the feed entry being processed can include
using a text recognition technique. The method can further include
determining, using the identified feed representation, which
content from the identified page not to retrieve in the content
retrieval; and identifying the at least one of the multiple content
portions as not being included in the determined content of the page
not to retrieve. The method can further include causing any of the
multiple content portions identified as matching contents of the
feed entry being processed not to be retrieved in the content
retrieval.
[0011] In a fourth aspect, a computer program product is tangibly
embodied in a computer-readable medium and includes instructions
that when executed by a processor perform a method for obtaining
data from a page. The method includes identifying a page as a
target for content retrieval, the page including multiple content
portions. The method includes identifying a feed representation
that has been created for the identified page, the identified
feed representation including multiple feed entries each
corresponding to at least some of the multiple content portions.
The method includes processing each of the multiple feed entries
by: accessing the identified page; identifying any of the multiple
content portions that match contents of the feed entry being
processed; and retrieving at least one of the multiple content
portions based on the identified content portion. The method
includes storing, as a result of the content retrieval, each
retrieved content portion obtained from the processing of the
multiple feed entries.
[0012] Advantages of some implementations include: automatically
targeting the harvesting activity to the desired content within
sites, pages and/or parts of sites or pages; and providing a less
labor intensive or computationally intensive approach to mining and
harvesting the content of sites and pages with varying structure
and content.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 is an example of a portion of a page from which
content can be harvested.
[0014] FIG. 2 shows an example of a system that can retrieve
content using a feed representation.
[0015] FIG. 3 is an example of a flow chart for a method that can
be performed.
[0016] FIG. 4 is another example of a flow chart for a method that
can be performed.
[0017] FIG. 5 is a block diagram of a computing system that can be
used in connection with computer-implemented methods described in
this document.
[0018] Like reference symbols in the various drawings indicate like
elements.
DETAILED DESCRIPTION
[0019] There will now be described an exemplary implementation that
relates to automatically determining which parts of a website
and/or web page comprise the "important" body text to mine. The
description makes reference to FIG. 1, in which a portion of a web
page 100 is shown. For example, the web page 100 is created using a
known markup language and is made publicly available at the address
http://novaspivack.typepad.com/ where it can be viewed using a web
browser.
[0020] The web page 100 can make selected portions of its content
available in a feed representation. For example, the feed
representation of a site or page can contain blurbs comprising
excerpts (or in some cases, full-text) of the content of the site.
A news site's feed, for example, can provide blurbs or full text
for each news article in the site or site section it represents.
Thus, the feed representation contains some or all of the content
that, according to the publisher of the page 100 or another creator
of the feed representation, is considered to be "most important" or
the "main content" of the page (for example, news articles),
compared to other content that is "less important" or "peripheral
content" (for example, ads or sections of the page containing
comments and/or annotations to the page etc.) that the page may
contain, when judged against a standard of relevance. Such a feed
representation can be provided using an RSS feed or Atom feed, or
other extensible markup language (XML) or resource description framework (RDF)
feed formats, to name a few examples, or any other type of
serialized "data feed" representation of the content. When
mentioned herein, RSS refers to protocols or technologies
including, but not limited to: RDF Site Summary (sometimes referred
to as RSS 0.9, RSS 1.0); Rich Site Summary (sometimes referred to
as RSS 0.91, RSS 1.0); and Really Simple Syndication (sometimes
referred to as RSS 2.0). Here, the web page 100 includes a link 102
that provides a feed representation of the page 100. The
information in such a feed can be used to guide and target mining
efforts aimed at page 100.
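As a non-limiting illustration, reading the entries of such a feed can be sketched in Python as follows. The sketch assumes the feed is either RSS 2.0 or Atom and uses only the standard library; the function name parse_feed_entries, the sample feed, and the example.com URL are hypothetical and are not part of this description. RDF-based RSS 1.0 feeds use namespaced elements and are not handled here.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom elements live in this namespace


def parse_feed_entries(feed_xml):
    """Return a list of {'title', 'link', 'excerpt'} dicts from RSS 2.0 or Atom."""
    root = ET.fromstring(feed_xml)
    entries = []
    # RSS 2.0: <item> elements carrying <title>, <link> and <description>.
    for item in root.iter("item"):
        entries.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "excerpt": (item.findtext("description") or "").strip(),
        })
    # Atom: <entry> elements carrying <title>, <link href=...>, <summary>/<content>.
    for entry in root.iter(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        excerpt = (entry.findtext(ATOM + "summary")
                   or entry.findtext(ATOM + "content") or "")
        entries.append({
            "title": (entry.findtext(ATOM + "title") or "").strip(),
            "link": link.get("href", "") if link is not None else "",
            "excerpt": excerpt.strip(),
        })
    return entries


# Hypothetical feed with a single entry, for illustration only.
SAMPLE_RSS = """<rss version="2.0"><channel><title>Example weblog</title>
<item><title>What is Radar Networks up to?</title>
<link>http://example.com/post.html</link>
<description>Shel Israel and I just finished up working together for 10 days.</description>
</item></channel></rss>"""

for e in parse_feed_entries(SAMPLE_RSS):
    print(e["link"], "|", e["excerpt"])
```

Each returned entry pairs a link with an excerpt, which is the information the harvesting examples in this description rely on.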
[0021] For example, the feed for the page 100 may have an entry
that contains the following excerpt:
TABLE 1
What is Radar Networks up to? Shel Israel and I just finished up
working together for 10 days. I needed Shel's perspective on what
we are working on at Radar Networks.
[0022] In one implementation, content is harvested as follows.
First, the URL to which the feed entry (Table 1) points is
determined. In this example, that URL is
http://novaspivack.typepad.com/nova_spivacks_weblog/2006/08/what_am_i_upto.html.
Next, the content from that URL is automatically retrieved.
Next, it is determined which part of this page is the "main
content"--the content that should be mined. This determination is
made by identifying the part of the page that matches the text from
the feed entry. In this example, that is the part of the page
starting with: "What is Radar Networks up to? Shel Israel and I
just finished up working together for 10 days. I needed Shel's
perspective on what we are working on at Radar Networks." The
matching part can be identified using any technique for comparing
content, such as text recognition.
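As one possible sketch of the matching step, the following Python code strips markup from a retrieved page and looks for the feed excerpt in the remaining text. It assumes that an exact match after whitespace normalization is sufficient, which is only one of many ways of comparing content; the helper names page_text and find_main_content are hypothetical. The sketch returns the text from the start of the match onward and does not attempt to decide where the main content ends.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def normalize(text):
    """Collapse runs of whitespace so line breaks do not defeat the match."""
    return re.sub(r"\s+", " ", text).strip()


def page_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return normalize(" ".join(parser.chunks))


def find_main_content(html, excerpt):
    """Return the page text starting at the excerpt, or None if it is absent."""
    text = page_text(html)
    start = text.find(normalize(excerpt))
    return text[start:] if start >= 0 else None
```

Called with the retrieved page and the excerpt of Table 1, find_main_content would return the page text beginning with "What is Radar Networks up to?", or None when the excerpt does not occur on the page.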
[0023] In other words, an implementation of a mining agent can
automatically look for the part of the page that matches this text
and identify it as the part that matters. Rather than the mining
agent having to somehow parse or analyze the page to determine
which content it should mine, the determination of which part(s) of
the page should be mined is made by the content provider, the
publisher of the feed, when they decide which content to publish in
their feed--the agent simply mines the content portions that are
referenced from the feed. In this example, the identified portion
is the part of the page that is mined; the rest is ignored. In
other implementations, more contents can be mined. Thus, without
the agent being specifically programmed for the layout of this
particular weblog, the agent can determine where in the layout to
find the "important" content that it is supposed to mine.
[0024] Such an agent (or methods performed in harvesting
operations), can be configured and used with an aim toward
harvesting only the main content of a specific site or page, and
not the advertisements or other peripheral content. It could also
be used conversely to figure out which content to ignore--for
example, if the goal is to filter out the main content, so that the
peripheral content can be mined, in which case the agent mines
everything except what is provided in the feed for a given
page.
[0025] FIG. 2 shows an example of a system 200 that can retrieve
content using a feed representation. The system 200 here includes a
first device 202 and a second device 204. The first and second
devices are both connected to at least one network 206, such as the
Internet, an intranet or a cellular telephone network, to name a
few examples. In some implementations, the first device 202
comprises a server device and the second device comprises a client
device, the devices being configured to communicate over the World
Wide Web (WWW).
[0026] The first device can make one or more pages available, for
example to the second device and/or other entities on the network
206. In this example, only a few pages 208 and 210 are shown for
clarity, but other implementations can make available dozens,
hundreds or millions of pages or more.
[0027] Here, the page 208 contains links to the other pages 210.
That is, the page 208 includes a portion 208A linking to the page
210A; a portion 208B linking to the page 210B; and a portion 208C
linking to the page 210C. Users visiting the page 208 can access
any of the pages 210 by activating the corresponding link(s) on the
page 208. As other examples, a user can access any of the pages 210
directly by entering its address into a browser or other client
program, or by navigating to the page using a link on another page
(not shown). For example, the page 208 in some implementations can
include the web page 100 (FIG. 1).
[0028] In this example, the pages 210 are shown as residing within
the same device as the page 208 (i.e., the first device 202). In
other implementations, one or more of the pages 210 can be located
on another device or in another system. That is, the pages 210 need
not have been published by the same entity as the page 208, or be
controlled by that entity.
[0029] Here, the first device also makes available a feed
representation 212 associated with the page 208. The feed
representation can have any of a number of types. In some
implementations, the feed representation 212 can be an RSS feed, an
Atom feed, an XML feed, an RDF feed, a serialized data feed
representation, or a combination thereof. In short, the publisher
of the page 208 can provide the feed representation 212 to
complement the provision of information through the page 208, for
example to highlight selected portions of that page's content. For
example, the feed representation 212 in some implementations can
include the feed representation for the web page 100 available
through the link 102 (FIG. 1).
[0030] The feed representation 212 includes one or more
excerpts of content from the page 208. Here, the feed
representation includes a part 212A associated with the portion
208A; a part 212B associated with the portion 208B; and a part 212C
associated with the portion 208C. For example, a feed entry in any
of the parts 212A-C can include the contents shown in the exemplary
Table 1 above.
[0031] The second device 204 can be configured for seeking out
information available through the network 206 that is of interest
according to one or more relevance standards. The second device can
also retrieve identified contents and store them temporarily or
indefinitely for one or more purposes, such as to perform
additional processing on the information, or to forward it to
another entity (not shown), to name just a few examples.
[0032] Here, the second device includes a content harvester 214
that is configured to perform such content retrieval. For example,
the content harvester 214 can initiate a harvesting process for a
page available in the computer system 200. The content harvester
214 can provide for storing and/or other processing of the
retrieved content.
[0033] Here, the second device also includes an information
identifier 216 that can identify the information or other content
to be retrieved, and pass this information on to the content
harvester 214. As one example, the information identifier 216 can
identify a feed representation that has been created for the page.
As another example, the information identifier 216 can identify at
least a part of the feed representation to be used for retrieving a
portion from the page. The content harvester 214 can retrieve and
store at least a portion from the page based on information in the
identified feed representation.
[0034] Assume, for example, that a feed entry in the part 212A of
the feed representation 212 includes the contents in Table 1 above.
The information identifier 216 can use the identified part 212A to
identify the portion 208A as matching the information in the feed
representation. The information identifier 216 can use any of
several techniques in this operation. For example, the information
identifier 216 can compare the identified part 212A with contents
of the page to identify the portion 208A. As another example,
retrieving the portion from the page based on the information can
include using a text recognition technique, such as by parsing text
in the feed entry and in the page content, and using the text
recognition to match the feed entry with the page portion.
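The description does not prescribe a particular text recognition technique. As one hedged illustration, approximate matching can be sketched with the Python standard library's difflib module, which tolerates small differences between the feed excerpt and the page text (for example, "10 days" versus "ten days"). The function name best_matching_block and the 0.6 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def best_matching_block(blocks, excerpt, threshold=0.6):
    """Return (index, score) of the page block most similar to the excerpt.

    Returns (None, score) when no block reaches the threshold.
    """
    best_index, best_score = None, 0.0
    for i, block in enumerate(blocks):
        # Compare against a prefix of the block: feed excerpts are often
        # shorter than the page portion they were taken from.
        candidate = block[: len(excerpt)]
        score = SequenceMatcher(None, excerpt.lower(), candidate.lower()).ratio()
        if score > best_score:
            best_index, best_score = i, score
    if best_score < threshold:
        return None, best_score
    return best_index, best_score
```

Here blocks would be the text of candidate page portions, such as the portions 208A-C, and the returned index identifies the portion to pass on for retrieval.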
[0035] With the information identifier 216 having identified, say,
the portion 208A as corresponding to an entry in the part 212A of
the feed representation 212, the content harvester 214 can in some
implementations retrieve at least that portion 208A from the page
208.
[0036] In contrast, when the feed entries correspond to a page that
links to content on one or more pages, for example like the page
208, the information identifier 216 can determine a page
identifier, such as a uniform resource locator (URL) of the other
page to which the page links. The content harvester 214 can then
retrieve page content from the other page using the determined URL.
The information identifier 216 can identify, in the retrieved page
content, the portion of the page as matching content from an
identified feed entry. Based on the identification, the matching
page portion is retrieved in the harvesting process.
[0037] The above examples focus on retrieving some or all of the
content that has been chosen for inclusion in a feed
representation. Other approaches can be used. For example, the feed
representation can be used for retrieval of content that has not
been chosen for inclusion in a feed representation. This can
provide the advantage of helping to avoid information that
qualifies as the main or central content of a page according to a
relevance standard.
[0038] In some implementations, the information identifier 216 can
be configured to ignore or omit contents that have entries in the
feed representation 212 when identifying portions of the page 208.
That is, the information identifier 216 can use the feed
representation 212 to determine which content from the page 208
is not to be retrieved in the harvesting process. For example,
the information identifier 216 can find one or more portions of the
page 208 that are not included in the feed representation, and
identify such content as being a candidate for retrieval. The
content harvester 214 can be configured so that the determined
content of the page--i.e., content that has entries in the feed
representation--is not retrieved in the harvesting process.
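The converse use of the feed can be sketched in Python as follows. The sketch assumes the page has already been divided into text blocks and that a block counts as included in the feed when it contains a feed excerpt after whitespace and case normalization; the function name split_by_feed is hypothetical.

```python
def split_by_feed(blocks, feed_excerpts):
    """Partition page blocks into (in_feed, not_in_feed) lists."""

    def norm(text):
        return " ".join(text.split()).lower()

    # Ignore empty excerpts, which would otherwise match every block.
    needles = [norm(e) for e in feed_excerpts if e.strip()]
    in_feed, not_in_feed = [], []
    for block in blocks:
        haystack = norm(block)
        if any(needle in haystack for needle in needles):
            in_feed.append(block)       # first portion: not retrieved
        else:
            not_in_feed.append(block)   # second portion: candidate for retrieval
    return in_feed, not_in_feed
```

In this mode only the second list would be retrieved and stored, which corresponds to mining the peripheral content while leaving out what the feed already provides.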
[0039] FIG. 3 is an example of a flow chart for a method 300 that
can be performed. The method 300 can be performed by a processor
executing instructions in a computer-readable medium. For example,
the method 300 can be performed to obtain data from a page in the
system 200 (FIG. 2).
[0040] As shown, the method 300 includes a step 310 of initiating a
harvesting process for a page available in a computer system. For
example, this can involve initiating or launching any or all of the
second device 204, the content harvester 214 or the information
identifier 216. The harvesting process can be directed to the page
208 and/or to any or all of the pages 210.
[0041] The method 300 includes a step 320 of identifying a feed
representation that has been created for the page. For example, the
information identifier 216 can identify the feed representation 212
as having been created for the page 208.
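The description leaves open how the feed representation is identified. One common convention on the web, assumed here and not stated in this description, is for a page to advertise its feed through a link element with rel="alternate" and a feed media type. A minimal Python sketch of that assumption follows; the names FeedLinkFinder and discover_feeds are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Media types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs advertised through <link rel="alternate" ...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if ("alternate" in rels
                and attr.get("type", "").lower() in FEED_TYPES
                and attr.get("href")):
            # Resolve relative hrefs against the address of the page.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.feeds
```

A page that exposes its feed only through a visible hyperlink, like the link 102, would need a different identification strategy.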
[0042] The method 300 includes a step 330 of retrieving and
storing, as part of the harvesting process, at least a portion from
the page based on information in the identified feed
representation. For example, the content harvester 214 can retrieve
any or all of the portions 208A-C from the page 208, and/or any or
all contents from any of the pages 210 and store it in the second
device 204.
[0043] One or more other steps can be performed before, in between,
and/or after the steps of the method 300. For example, the second
device 204 or another device can be configured to process retrieved
content.
[0044] As another example, the following is an outline description
of an implementation of a harvesting method.
[0045] 1. Get the RSS or Atom feed for site or page x
[0046]    a. For each feed entry, k, in the feed for x,
[0047]       i. Get the URL, L, in site x that the entry represents.
[0048]       ii. Get the full-content of the web page or file that L represents
[0049]       iii. Find the part of that page or file that matches the content in k
[0050]          1. Get the content of that part of L
[0051]          2. This content is the main content of that page
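The outline above can be sketched in Python as follows. The sketch assumes the feed has already been reduced to (URL, excerpt) pairs and that a caller-supplied function returns the plain text found at a URL; the names harvest_site and fetch_text are hypothetical, and exact matching after whitespace normalization stands in for whatever comparison technique an implementation actually uses.

```python
def harvest_site(feed_entries, fetch_text):
    """For each feed entry k, fetch its URL L and keep the part matching k.

    feed_entries: iterable of (url, excerpt) pairs for site or page x.
    fetch_text:   hypothetical callable returning the plain text at a URL.
    """
    harvested = {}
    for url, excerpt in feed_entries:                   # a. each feed entry k
        full_text = " ".join(fetch_text(url).split())   # ii. full content of L
        needle = " ".join(excerpt.split())
        start = full_text.find(needle)                  # iii. part matching k
        if start >= 0:
            harvested[url] = full_text[start:]          # 1-2. main content of L
    return harvested
```

Passing the fetch function in as a parameter keeps the sketch independent of any particular network library and allows it to be exercised with canned page text.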
[0052] A related example will now be described with reference to
FIG. 4, which is another example of a flow chart for a method 400
that can be performed. The method 400 can be performed by a
processor executing instructions in a computer-readable medium. For
example, the method 400 can be performed to obtain data from a page
in the system 200 (FIG. 2). Other examples or implementations of
individual steps in the method 400 can also be found in the
description above and/or in the drawings.
[0053] As shown, the method 400 includes a step 410 of identifying
a page as a target for content retrieval, the page including
multiple content portions.
[0054] The method 400 includes a step 420 of identifying a feed
representation that has been created for the identified page, the
identified feed representation including multiple feed entries each
corresponding to at least some of the multiple content
portions.
[0055] The method 400 includes looped steps 430-470 of processing
each of the multiple feed entries.
[0056] The step 440 in the loop includes accessing the identified
page.
[0057] The step 450 in the loop includes identifying any of the
multiple content portions that match contents of the feed entry
being processed.
[0058] The step 460 in the loop includes retrieving at least one of
the multiple content portions based on the identified content
portion.
[0059] The step 470 in the loop indicates that the loop can be
performed for each feed entry.
[0060] The method 400 includes a step 480 of storing, as a result
of the content retrieval, each retrieved content portion obtained
from the processing of the multiple feed entries.
[0061] One or more additional steps can be performed with the
method 400, for example as described above with reference to method
300.
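As one possible illustration of the loop of the method 400, the
following Python sketch processes each feed entry against the content
portions of a single page. It is a sketch under assumptions rather than
the method itself: the function name harvest_portions is hypothetical,
the page is assumed to have already been split into text portions, each
feed entry is assumed to be an excerpt taken from the start of a
portion, and the match is scored with difflib.SequenceMatcher against
an arbitrarily chosen threshold of 0.6.

```python
from difflib import SequenceMatcher
from typing import List


def harvest_portions(portions: List[str], feed_entries: List[str],
                     threshold: float = 0.6) -> List[str]:
    """Return, in feed-entry order, the page portion that best matches
    each feed entry; entries with no match above the threshold are skipped."""
    stored = []
    for entry in feed_entries:                    # looped steps 430-470
        best, best_score = None, 0.0
        for portion in portions:                  # step 450: find a match
            # Compare the excerpt with the same-length start of the portion.
            candidate = portion[:len(entry)]
            score = SequenceMatcher(None, entry, candidate).ratio()
            if score >= threshold and score > best_score:
                best, best_score = portion, score
        if best is not None:                      # step 460: retrieve it
            stored.append(best)
    return stored                                 # step 480: caller stores these
```

Scoring against only the start of each portion is a design choice that
suits feeds whose entries are leading excerpts; a feed that summarizes
rather than excerpts would call for a different comparison.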
[0062] When one or more content portions of the page link to
another page, the processing of a feed entry can include: accessing
the other page to which the content portion links; identifying
contents of the accessed other page that match contents of the feed
entry being processed; and retrieving the identified contents of
the accessed other page.
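A hedged Python sketch of this link-following case is given below. The
names LinkCollector and harvest_linked are hypothetical, the content
portion is assumed to be an HTML fragment, the linked page is assumed
to be plain text with parts separated by blank lines, and matching is
again simplified to substring containment. The fetch argument is a
caller-supplied function that returns the text of a URL.

```python
from html.parser import HTMLParser
from typing import Callable, List


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def harvest_linked(portion_html: str, entry_text: str,
                   fetch: Callable[[str], str]) -> List[str]:
    """Access each page the portion links to and return the parts of
    those pages that contain the feed entry text."""
    collector = LinkCollector()
    collector.feed(portion_html)
    found = []
    for url in collector.links:                   # access the other page
        other_page = fetch(url)
        for block in other_page.split("\n\n"):    # identify matching contents
            if entry_text and entry_text in block:
                found.append(block.strip())       # retrieve them
    return found
```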
[0063] Other approaches in line with one or more aspects of this
description can be used.
[0064] FIG. 5 is a schematic diagram of a generic computer system
1100. The system 1100 can be used for the operations described in
association with any of the computer-implemented methods described
previously, according to one implementation. The system 1100
includes a processor 1110, a memory 1120, a storage device 1130,
and an input/output device 1140. Each of the components 1110, 1120,
1130, and 1140 is interconnected using a system bus 1150. The
processor 1110 is capable of processing instructions for execution
within the system 1100. In one implementation, the processor 1110
is a single-threaded processor. In another implementation, the
processor 1110 is a multi-threaded processor. The processor 1110 is
capable of processing instructions stored in the memory 1120 or on
the storage device 1130 to display graphical information for a user
interface on the input/output device 1140.
[0065] The memory 1120 stores information within the system 1100.
In one implementation, the memory 1120 is a computer-readable
medium. In one implementation, the memory 1120 is a volatile memory
unit. In another implementation, the memory 1120 is a non-volatile
memory unit.
[0066] The storage device 1130 is capable of providing mass storage
for the system 1100. In one implementation, the storage device 1130
is a computer-readable medium. In various different
implementations, the storage device 1130 may be a floppy disk
device, a hard disk device, an optical disk device, or a tape
device.
[0067] The input/output device 1140 provides input/output
operations for the system 1100. In one implementation, the
input/output device 1140 includes a keyboard and/or pointing
device. In another implementation, the input/output device 1140
includes a display unit for displaying graphical user
interfaces.
[0068] The features described can be implemented in digital
electronic circuitry, or in computer hardware, firmware, software,
or in combinations of them. The apparatus can be implemented in a
computer program product tangibly embodied in an information
carrier, e.g., in a machine-readable storage device or in a
propagated signal, for execution by a programmable processor; and
method steps can be performed by a programmable processor executing
a program of instructions to perform functions of the described
implementations by operating on input data and generating output.
The described features can be implemented advantageously in one or
more computer programs that are executable on a programmable system
including at least one programmable processor coupled to receive
data and instructions from, and to transmit data and instructions
to, a data storage system, at least one input device, and at least
one output device. A computer program is a set of instructions that
can be used, directly or indirectly, in a computer to perform a
certain activity or bring about a certain result. A computer
program can be written in any form of programming language,
including compiled or interpreted languages, and it can be deployed
in any form, including as a stand-alone program or as a module,
component, subroutine, or other unit suitable for use in a
computing environment.
[0069] Suitable processors for the execution of a program of
instructions include, by way of example, both general and special
purpose microprocessors, and the sole processor or one of multiple
processors of any kind of computer. Generally, a processor will
receive instructions and data from a read-only memory or a random
access memory or both. The essential elements of a computer are a
processor for executing instructions and one or more memories for
storing instructions and data. Generally, a computer will also
include, or be operatively coupled to communicate with, one or more
mass storage devices for storing data files; such devices include
magnetic disks, such as internal hard disks and removable disks;
magneto-optical disks; and optical disks. Storage devices suitable
for tangibly embodying computer program instructions and data
include all forms of nonvolatile memory, including by way of
example semiconductor memory devices, such as EPROM, EEPROM, and
flash memory devices; magnetic disks such as internal hard disks
and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM
disks. The processor and the memory can be supplemented by, or
incorporated in, ASICs (application-specific integrated
circuits).
[0070] To provide for interaction with a user, the features can be
implemented on a computer having a display device such as a CRT
(cathode ray tube) or LCD (liquid crystal display) monitor for
displaying information to the user and a keyboard and a pointing
device such as a mouse or a trackball by which the user can provide
input to the computer.
[0071] The features can be implemented in a computer system that
includes a back-end component, such as a data server, or that
includes a middleware component, such as an application server or
an Internet server, or that includes a front-end component, such as
a client computer having a graphical user interface or an Internet
browser, or any combination of them. The components of the system
can be connected by any form or medium of digital data
communication such as a communication network. Examples of
communication networks include, e.g., a LAN, a WAN, and the
computers and networks forming the Internet.
[0072] The computer system can include clients and servers. A
client and server are generally remote from each other and
typically interact through a network, such as the described one.
The relationship of client and server arises by virtue of computer
programs running on the respective computers and having a
client-server relationship to each other.
[0073] A number of embodiments of the invention have been
described. Nevertheless, it will be understood that various
modifications may be made without departing from the spirit and
scope of the invention. Accordingly, other embodiments are within
the scope of this description.
[0074] A number of embodiments have been described. Nevertheless,
it will be understood that various modifications may be made
without departing from the spirit and scope of this disclosure.
Accordingly, other embodiments are within the scope of the
following claims.
* * * * *
References