U.S. patent application number 09/726537 was filed with the patent office on 2000-12-01 and published on 2001-04-05 for apparatus and method of implementing fast internet real-time search technology (FIRST).
Invention is credited to Cerna, James J. JR.; Gonzalez, Carlos M.
Application Number | 20010000192 09/726537 |
Family ID | 23619730 |
Filed Date | 2000-12-01 |
United States Patent Application | 20010000192 |
Kind Code | A1 |
Gonzalez, Carlos M.; et al. | April 5, 2001 |
Apparatus and method of implementing fast internet real-time search technology (FIRST)
Abstract
An apparatus for and method of monitoring and retrieving
identifiable statements and other information pertinent to one or
more desired search terms or concepts. In a preferred embodiment, a
fast Internet real-time search technology (FIRST) system provides a
user interface for inputting and editing a list of desired search
terms, and a list of resource locations to be monitored. A server
causes the periodic access of each resource location listed to
determine whether any information on a particular resource location
has been added that is pertinent to the desired search terms since
the previous visit to the location. Any added information is
returned to the server for processing. An alert message can be
generated and forwarded to an intended recipient to alert the
recipient of the added information.
Inventors: | Gonzalez, Carlos M. (Menlo Park, CA); Cerna, James J. JR. (Redwood Shores, CA) |
Correspondence Address: | Eric Oliver, DICKSTEIN SHAPIRO MORIN & OSHINSKY LLP, 2101 L Street NW, Washington, DC 20037-1526, US |
Family ID: | 23619730 |
Appl. No.: | 09/726537 |
Filed: | December 1, 2000 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
09726537 | Dec 1, 2000 | |
09409256 | Sep 30, 1999 | |
Current U.S. Class: | 1/1; 707/999.001; 707/999.003; 707/E17.109 |
Current CPC Class: | Y10S 707/99943 20130101; Y10S 707/99939 20130101; Y10S 707/99936 20130101; Y10S 707/99933 20130101; Y10S 707/99952 20130101; G06F 16/00 20190101; Y10S 707/99934 20130101 |
Class at Publication: | 707/3; 707/1 |
International Class: | G06F 017/30; G06F 007/00 |
Claims
What is claimed as new and desired to be protected by letters
patent of the United States is:
1. An information monitoring system comprising: a processing unit;
and a memory, wherein a computer program is stored in said memory
for execution by said processing unit to locate real-time
information pertinent to a desired search term, to determine if the
located information differs from tracking information corresponding
to a given location, and to generate an alert according to a
difference from the tracking information.
2. The information monitoring system as recited in claim 1, wherein
said memory further stores a resource listing containing a list of
a plurality of locations to be monitored for information pertinent
to a desired search term.
3. The information monitoring system as recited in claim 2, wherein
the resource listing identifies locations using a uniform resource
locator (URL) that identifies web pages, message boards, and
locations of discussion groups located on the Internet.
4. The information monitoring system as recited in claim 3, wherein
the resource listing identifies e-mail recipients using e-mail
addresses, said processing unit executing the computer program so
as to monitor e-mail messages addressed to the identified e-mail
recipients to locate information pertinent to the search terms.
5. The information monitoring system as recited in claim 4, wherein
said processing unit monitors the listed locations for information
pertinent to desired search terms by performing a search task that
attempts to match the text of at least one desired search term with
the text of the resource location identified in the resource
listing.
6. The information monitoring system as recited in claim 4, wherein
said processing unit monitors the listed locations for information
pertinent to desired search terms by performing a search task that
returns data which contain references to messages on the resource
location identified in the resource listing.
7. A method of searching for information pertinent to a desired
list of terms, the method comprising the steps of: periodically
accessing predetermined resources found on a communication network
so as to effect real-time monitoring of the resources; downloading
data from each of the predetermined resources; determining whether
the downloaded data contains information pertinent to at least one
term on the desired list of terms; and identifying any updates to
the pertinent information downloaded from each of the predetermined
resources.
8. The method of searching for information recited in claim 7,
further comprising the step of modifying a client file containing
the list of predetermined resources to include e-mail newsletters,
Usenet newsgroups, and proprietary network chat rooms to be
accessed by said step of periodically accessing.
9. The method of searching for information recited in claim 8,
further comprising the step of sending an e-mail alert message to
at least one alert addressee listed in the client file, the alert
message containing the update to the pertinent information
identified in said step of identifying any updates to the pertinent
information.
10. The method of searching for information recited in claim 9,
further comprising the step of generating an alert message
containing text, graphics, audio or video information pertinent to
at least one term on the desired list of terms contained in the
client file.
11. The method of searching for information recited in claim 7,
wherein said step of determining whether the downloaded data
contains information pertinent to at least one term on the desired
list of terms comprises the substep of performing a concept search
that identifies information as pertinent to at least one desired
term based on conceptual relevancy irrespective of the textual
match between the listed terms and terms found in the data
downloaded from the predetermined resources.
12. A fast Internet real-time search technology (FIRST) system for
use in monitoring information on Web pages, message boards, chat
rooms, discussion groups, e-mail messages, and other communications
over the Internet, the FIRST system comprising: a user interface
which creates client files, each client file including a list of
desired search terms for which the FIRST system is to monitor
communications over the Internet, a list of desired resource
locations on the Internet for which the FIRST system is to monitor,
and a list of desired recipients for whom the FIRST system is to
alert upon uncovering information concerning one or more of the
desired search terms listed in the client file; a Maximillian
server processing requests for monitoring communications over the
Internet, said Maximillian server including a client file reader
parsing client files to coordinate execution of monitoring for
desired search terms in the resource locations identified in the
client files, and a script directory identifying at least one bot
launched to examine data retrieved from at least one resource
location identified in the client files; and an alert generator
communicating an alert from the FIRST system to the desired
recipients listed in the client files based on results produced
after examination by the at least one bot, wherein the alert is an
e-mail message sent to all of the desired recipients listed in a
particular client file corresponding to the results produced by the
at least one bot.
13. The fast Internet real-time search technology (FIRST) system
according to claim 12, wherein the resource locations include
static and dynamic Websites, user postings on Websites, and e-mail
bulletins, and wherein the data examined by the at least one bot
includes textual, pictorial, aural, and visual information.
14. The fast Internet real-time search technology (FIRST) system
according to claim 12, further comprising a plurality of bots
launched by said Maximillian server in accordance with the script
directory, each bot examining a respective one of the resource
locations listed in the client file.
15. The fast Internet real-time search technology (FIRST) system
according to claim 14, wherein at least one of said plurality of
bots examines a resource location and determines whether a textual
match is made between the desired search terms and the data
retrieved from the resource location.
16. The fast Internet real-time search technology (FIRST) system
according to claim 15, wherein the at least one of said plurality
of bots examines a resource location and determines whether a
conceptual match is made between the desired search terms and the
data retrieved from the resource location.
17. An article of manufacture comprising a machine-readable storage
medium having stored therein indicia of a plurality of
machine-executable control program steps, the control program
comprising the steps of: (a) inputting a list of monitored sites
and identified terms, wherein data related to such terms may appear
in the monitored sites; (b) retrieving data from one of the
monitored sites; (c) examining the data to return any portion of
the data related to the identified terms; (d) using a tracking
index to detect a real-time change in the data on the monitored
site retrieved in step (b); (e) retrieving the real-time change in
the data on the monitored site based on the tracking index; (f)
generating an information alert to inform an intended recipient of
the real-time change in data on the monitored site retrieved in
step (b); and (g) repeating steps (b) through (f) for each
monitored site listed in step (a).
18. The article of manufacture of claim 17, wherein the returning
step (c) of the control program comprises the substep of retrieving
messages from message boards dedicated to discussion of at least
one term identified in said inputting step (a).
19. The article of manufacture of claim 17, wherein the tracking
index of step (d) is a message identification number originating on
a monitored site.
20. The article of manufacture of claim 19, wherein the change in
the data retrieved in step (e) includes a subject of a message on a
message board related to at least one term identified in said
inputting step (a) and the identity of the entity posting the
message.
Description
BACKGROUND OF THE INVENTION
1. No single academic, corporate, governmental, or non-profit
entity administers the activities of people on the Internet. The
very existence and operation of the Internet stems from the fact
that hundreds of thousands of separate operators of computers and
computer networks independently use common data transfer protocols
to exchange communications and information with other computers
(which in turn exchange communications and information with still
other computers). There is no centralized storage location, control
point, or communications channel for the Internet, and it would not
be technically feasible for a single entity to control all of the
information conveyed on the Internet.
2. The explosive growth in popularity of the Internet over recent
years is in large part based on the unrestricted communication
medium it provides. The Internet has created a very low cost forum
in which people can freely publish information, views and opinions.
Ironically, it is the Internet's ability to empower its users with
the free flow of information exchange that makes its users the most
vulnerable. The same ease with which a company can generate
"buzz" about its new product or service through strategic postings
in Internet message boards, chat rooms, and discussion forums can
just as readily be exploited by stock-manipulating, rumor-mongering
short-sellers to distribute false or misleading information about
the company and its offerings.
3. Although Internet search engines and directory services such as
AltaVista, Excite, Hotbot, Lycos, and Yahoo may be used to gather
some of the information propagated around the Internet about a
company and its products or services, these search entities
generally maintain static databases that are updated infrequently
relative to the dynamic information exchanges that transpire over
the Internet on a daily basis.
SUMMARY OF THE INVENTION
4. Preferred embodiments of the invention solve the foregoing (and
other) problems, and present significant advantages and benefits by
providing an apparatus for and method of monitoring and retrieving
identifiable statements and other information pertinent to one or
more desired search terms or concepts. In a preferred embodiment, a
fast Internet real-time search technology (FIRST) system provides a
user interface for inputting and editing a list of desired search
terms, and a list of resource locations to be monitored. A server
causes the periodic access of each resource location listed to
determine whether any information on a particular resource location
has been added that is pertinent to the desired search terms since
the previous visit to the location. Any added information is
returned to the server for processing. An alert message can be
generated and forwarded to an intended recipient to alert the
recipient of the added information.
BRIEF DESCRIPTION OF THE DRAWINGS
5. Many advantages, features, and applications of the invention
will be apparent from the following detailed description of the
invention which is provided in connection with the accompanying
drawings in which:
6. FIG. 1 is a block diagram illustrating a preferred embodiment of
the invention that monitors resource locations on the Internet;
7. FIG. 2 is a block diagram of a preferred embodiment of the
Maximillian server depicted in the system illustrated in FIG. 1;
and
8. FIG. 3 is a flow chart illustrating the operational flow of a
preferred embodiment of the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
9. Preferred embodiments and applications of the invention will now
be described with reference to FIGS. 1-3. Other embodiments may be
realized and structural or logical changes may be made to the
disclosed embodiments without departing from the spirit or scope of
the invention. Although the invention is particularly described as
applied to the monitoring and retrieval from resource locations
(e.g., Web pages, message boards, etc.) on the Internet of
information pertinent to a list of desired search terms, it should
be readily apparent that the invention may be embodied in any
searching mechanism or other retrieval service having the same or
similar problems.
10. In a preferred embodiment, a monitoring and retrieval apparatus
(and corresponding method) is embodied in a fast Internet real-time
search technology (FIRST) system, as illustrated in FIG. 1. As
shown in FIG. 1, the FIRST system is composed of a central
processing structure 12 in the form of a server (referred to herein
as "Maximillian server") used to monitor and retrieve information
from a network 18 (such as the Internet in the illustrated
embodiment). Maximillian server 12 receives inputs and delivers
output to users through user interface 10. A series of files are
accessed, generated, and updated by Maximillian server 12 during
operation of the FIRST system, particularly files such as client
file 14 and script directory 16, as will be described in more
detail below.
11. As shown in FIG. 2, Maximillian server 12 is preferably
embodied using one or more components coupled together using bus 28
(although alternative connection schemes known in the art may also
be used). As illustrated in FIG. 2, a central processing unit (CPU)
20 is provided for execution of one or more computer programs
stored in a recording medium such as memory 22. CPU 20 performs,
controls, or at least informs the various processing steps
performed by the FIRST system in monitoring and retrieving data
from network 18. A client reader 24 is provided to access data
stored in one or more client files 14.
12. In the preferred embodiment, "client files" contain a listing
of one or more desired terms or concepts for which a user would
like to have the FIRST system monitor network 18. For example, a
user may wish to have the FIRST system monitor the Internet for any
information related or pertinent to the commonly traded stock of
Oracle Corporation. In the client file therefore the user could
enter the company name "Oracle" or "Oracle Corporation" as its
desired search terms, together with any additional search terms
such as the stock symbol "ORCL". As will be described in detail
below, Maximillian server 12, after accessing client file 14, would
monitor network 18 (in the form of the Internet) for any
information related or pertinent to the listed search terms
"Oracle," "Oracle Corporation," or "ORCL." Any number of client
files may be activated for monitoring by Maximillian server 12.
13. Client files 14 further include a listing of resources that are
to be monitored by the FIRST system. In the preferred embodiment,
the resources may be identified by uniform resource locator (URL)
addresses of resource locations on the Internet. The URLs may
represent Websites (static or dynamic), individual Web pages,
message boards, locations of discussion groups, as well as any
other communication resource, including e-mail messages. When an
e-mail address is listed as a resource, the FIRST system monitors
the content of e-mail messages (e.g., e-mail bulletins, list server
messages, private mail, etc.) sent to the listed e-mail address.
(In the preferred embodiment, the e-mail messages are routed
simultaneously to both the intended recipient and the Maximillian
server, although alternative e-mail routing schemes known in the
art may be implemented.)
14. In the preferred embodiment, Maximillian server 12 accesses
each of the resources listed in the client file 14 currently being
processed. In accessing each resource, one or more "bots" are used
by the FIRST system. The use of the term "bot" herein refers to the
execution of an individual script (or computer program) by one or
more processing devices, including CPU 20. In the preferred
embodiment, a single bot is assigned to perform a single script in
the script directory 16, although other embodiments could permit
multiple scripts being executed by a single bot (or a single script
being executed by multiple bots), as desired. Each bot is
preferably programmed uniquely in accordance with the tasks
required.
15. Script directory 16 contains various scripts executed by one or
more processing devices during the operation of the FIRST system.
The scripts are run automatically, preferably under control of
Maximillian server 12, to perform discrete actions such as
responding to different types of input, generating output, and
carrying out various tasks as dictated by the specific script being
implemented.
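As a rough illustration of how a script directory might associate resources with bots, the following Python sketch maps a host name to the script to be run for it. The patent does not specify how scripts are selected; the host-based lookup, the names `SCRIPT_DIRECTORY`, `select_bot`, `message_board_bot`, and `generic_page_bot`, and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def message_board_bot(data: str) -> str:
    """Placeholder script for message-board resources (hypothetical)."""
    return "message-board bot examined %d characters" % len(data)


def generic_page_bot(data: str) -> str:
    """Placeholder script used when no specific script is registered."""
    return "generic bot examined %d characters" % len(data)


# Hypothetical script directory: host name -> script to run for that host.
SCRIPT_DIRECTORY = {
    "boards.example.com": message_board_bot,
}


def select_bot(resource_url: str):
    """Return the script registered for the resource's host, else a default."""
    host = urlparse(resource_url).hostname or ""
    return SCRIPT_DIRECTORY.get(host, generic_page_bot)


bot = select_bot("http://boards.example.com/orcl")
print(bot("<html>ORCL discussion</html>"))
```

Keying the directory on host name is only one possible design; a real deployment could equally key on resource type or on an explicit entry in the client file.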
16. An alert generator 26 is provided to compose alert messages for
transmission to one or more intended recipients. In the preferred
embodiment, the client file 14 will include one or more intended
recipients who are to receive an alert message once the FIRST
system uncovers information relevant to the desired search terms.
In the preferred embodiment, alert generator 26 composes an e-mail
message concerning the extent to which information related or
pertinent to the desired search terms is found by the FIRST system.
The e-mail message is sent by Maximillian server 12 to one or more
intended recipients listed in the client file 14. Although e-mail
is used in the illustrated embodiment, it should be
readily apparent that any means of communication (e.g., facsimile,
voice mail, etc.), including real-time display on user interface 10
(or remote display over network 18) may be used to provide an alert
message.
17. In operation, the illustrated embodiment described above allows
the FIRST system to monitor and retrieve information from one or
more predetermined resources in accordance with the operational
flow depicted in FIG. 3 (steps S30 through S38). In the
preferred embodiment, one or more client files 14 are read by
client reader 24. For illustration purposes, it is assumed that
only a single client file is operational at any one time. In this
illustration, therefore, the data in the client file 14 is parsed
in step S30 to identify the different search terms or concepts, the
various resources to be monitored, the intended alert recipients,
and any other identification data presented in client file 14.
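The parsing in step S30 can be sketched as follows. The patent does not disclose a client file format, so the line-oriented `key: value` layout, the `ClientFile` class, the `parse_client_file` function, and the sample addresses below are hypothetical; the sketch only shows search terms, resources, and alert recipients being separated into lists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClientFile:
    """In-memory view of one client file (field names are illustrative)."""
    terms: List[str] = field(default_factory=list)
    resources: List[str] = field(default_factory=list)
    recipients: List[str] = field(default_factory=list)


def parse_client_file(text: str) -> ClientFile:
    """Parse a hypothetical 'key: value' client file, one entry per line."""
    client = ClientFile()
    targets = {
        "term": client.terms,
        "resource": client.resources,
        "recipient": client.recipients,
    }
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split at the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line
        bucket = targets.get(key.strip().lower())
        if bucket is not None and value.strip():
            bucket.append(value.strip())
    return client


SAMPLE = """\
# hypothetical client file
term: Oracle
term: Oracle Corporation
term: ORCL
resource: http://boards.example.com/orcl
recipient: analyst@example.com
"""

client = parse_client_file(SAMPLE)
print(client.terms, client.resources, client.recipients)
```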
18. Maximillian server 12 then operates to identify the different
resources listed in client file 14. In the preferred embodiment,
Maximillian server 12 sequentially accesses individual resources as
listed in client file 14, although the embodiment may easily be
modified to access the resources in any order, serially or in
parallel. In this embodiment, the sequential accessing process is
performed at or near the maximum processing speed permitted by the
system so as to effect monitoring of the listed resources in
real-time. For each resource location identified, Maximillian
server 12 attempts to download data (e.g., hypertext mark-up
language (HTML) data) from the listed location, step S31. The
appropriate bot, as listed in script directory 16, is then launched
in step S32 and the downloaded data examined in step S33. In the
preferred embodiment, the launched bot examines the data directly,
although in an alternative embodiment a separate processing device
may be used.
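A minimal sketch of the sequential access described above (steps S31 through S33) is given below. The downloader and the examining bot are passed in as functions so that the sketch makes no assumption about the network library or script used; `monitor_once` and its parameter names are illustrative and do not come from the patent.

```python
from typing import Callable, Dict, Iterable, Optional


def monitor_once(
    resources: Iterable[str],
    download: Callable[[str], Optional[str]],
    examine: Callable[[str], list],
) -> Dict[str, list]:
    """Visit each resource in listed order and collect what the bot finds."""
    findings = {}
    for url in resources:
        data = download(url)           # step S31: attempt the download
        if data is None:
            continue                   # download failed; go to next resource
        findings[url] = examine(data)  # steps S32-S33: launch bot and examine
    return findings


# Stand-in for the network: a dict of canned pages (None = failed download).
pages = {"http://a.example/": "ORCL up", "http://b.example/": None}
result = monitor_once(
    pages, pages.get, lambda d: [w for w in d.split() if w == "ORCL"]
)
print(result)
```

Calling `monitor_once` repeatedly in a loop would approximate the continuous monitoring the paragraph describes; parallel access would require a different structure.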
19. The examination done in step S33 may employ one or more of a
variety of well-known search/retrieval techniques. For example, a
plain full-text search may be employed that looks for the exact
word match of one or more desired search terms. As an alternative
to (or in conjunction with) the full-text search, a conceptual or
relevancy search may be employed that associates each of the
desired search terms with different concepts or topics and attempts
to match the different concepts/topics with those found in the
resource location accessed. In addition, the examination may simply
search for a specific type of information such as data representing
messages on a given topic. Thus, as an illustration, where a
specific Website or Web page is dedicated to presenting a
discussion forum on a subject (e.g., ORCL stock), the examination
simply looks for any information presented, regardless of the exact
match of text with listed search terms. In an alternative
embodiment, the searching function is performed by the
hardware/software resident on the resource location (or on a remote
location).
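The difference between the full-text search and the conceptual search can be illustrated with a short sketch. The word-boundary regular expression and the hand-written `CONCEPTS` table below are simplifying assumptions; the patent does not describe how concepts are associated with search terms, and a practical concept search would be considerably more elaborate.

```python
import re


def full_text_match(term: str, text: str) -> bool:
    """Exact word or phrase match, case-insensitive, on word boundaries."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Hypothetical concept table: search term -> words treated as related.
CONCEPTS = {
    "ORCL": {"oracle", "database", "ellison"},
}


def concept_match(term: str, text: str) -> bool:
    """True if any word associated with the term's concept appears in text."""
    words = set(re.findall(r"\w+", text.lower()))
    return bool(CONCEPTS.get(term, set()) & words)


post = "The database vendor beat estimates"
print(full_text_match("ORCL", post), concept_match("ORCL", post))
```

In this example the post never mentions "ORCL" literally, so only the concept search flags it.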
20. In one preferred embodiment, the information (e.g., text,
pictorial, aural, video, etc.) found in step S33 to be related or
otherwise pertinent to one or more of the desired search terms
listed in client file 14 is culled out or returned from the
resource location using any number of known techniques (e.g., using
a "grep" command). The returned information is assumed to have (or
be given) a unique identifier referred to herein as a "tracking
index." The tracking index may differ from one resource location to
another. For example, in some resources the information returned is
a discussion group message having been assigned a unique message ID
number, in other resources, the subject of the message itself
serves as a unique identifier. The tracking index is thus
determined in step S34.
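One way to picture the tracking index of step S34 is sketched below: the function prefers a message ID assigned by the resource and falls back to the message subject. Representing a message as a dictionary with `id` and `subject` keys, and normalizing the subject's whitespace and case, are assumptions made for this illustration only.

```python
from typing import Optional


def tracking_index(message: dict) -> Optional[str]:
    """Prefer the site's message ID; fall back to the normalized subject."""
    msg_id = message.get("id")
    if msg_id is not None:
        return "id:%s" % msg_id
    subject = message.get("subject")
    if subject:
        # Collapse whitespace and lowercase so trivial edits do not
        # produce a different index (an assumed normalization).
        return "subject:" + " ".join(subject.split()).lower()
    return None  # nothing usable as an identifier


print(tracking_index({"id": 4711, "subject": "ORCL earnings"}))
print(tracking_index({"subject": "  ORCL   earnings  "}))
```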
21. By comparing in step S35 the unique identifier of the returned
information with the identifier previously associated with the same
resource location, the FIRST system can easily determine whether
any changes in the information have been made (e.g., additional
messages added, revised, etc.). Where there is no substantial
difference in the tracking index for a given resource location, as
determined in step S35, the data from another resource location can
be downloaded by repeating process steps S32-S34, for example. If
some change in the information on the resource location is
detected, however, for example, through a change in the tracking
index, the changed information is retrieved in step S36.
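The comparison in step S35 and the retrieval in step S36 might look like the following sketch, which assumes that the tracking indexes seen on the previous visit are kept as a set of strings and that the current visit yields a mapping from tracking index to message text. The function name `detect_changes` and both data shapes are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def detect_changes(
    seen: Set[str], messages: Dict[str, str]
) -> Tuple[List[str], Set[str]]:
    """Compare tracking indexes against those recorded on the previous visit.

    `messages` maps tracking index -> message text for the current visit.
    Returns the texts whose index was not seen before (in sorted index
    order), plus the combined index set for the caller to store.
    """
    new_keys = sorted(k for k in messages if k not in seen)
    return [messages[k] for k in new_keys], seen | set(messages)


seen = {"id:1", "id:2"}
current = {"id:2": "old post", "id:3": "ORCL rumor", "id:4": "ORCL denial"}
new, updated = detect_changes(seen, current)
print(new)
```

A second call with the returned index set and the same messages reports no changes, which corresponds to moving on to the next resource location.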
22. Alert generator 26 is then employed to compose in step S37 an
alert message to inform an intended recipient of the information
uncovered by the FIRST system. In the preferred embodiment, alert
generator 26 composes an e-mail message for transmission to one or
more of the intended recipients listed in client file 14. The
composed message may provide a copy of the information (e.g., text,
audio/video file, graphic image, etc.) found to be new (or revised)
concerning one or more of the desired search terms. Additional
information such as the entity posting the new (or revised)
information, the revision date, posting time, etc. that can be
retrieved from the resource location may also be added to the alert
message, as desired. Alert generator 26 may also utilize
alternative delivery mechanisms such as real-time display of alert
messages on user interface 10, direct real-time feed to users over
network 18, etc.
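Composing the e-mail alert of step S37 can be sketched with Python's standard `email.message.EmailMessage` class. The sender address, subject wording, and body layout below are assumptions, and actual transmission (for example via `smtplib`) is deliberately left out.

```python
from email.message import EmailMessage


def compose_alert(recipients, term, resource_url, new_text, poster=None):
    """Build an alert e-mail carrying the new or revised information."""
    msg = EmailMessage()
    msg["From"] = "first-alerts@example.com"  # hypothetical sender address
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = "FIRST alert: new information on %s" % term
    lines = ["Resource: %s" % resource_url]
    if poster:
        lines.append("Posted by: %s" % poster)  # optional extra detail
    lines += ["", new_text]
    msg.set_content("\n".join(lines))
    return msg


alert = compose_alert(
    ["a@example.com", "b@example.com"],
    "ORCL",
    "http://boards.example.com/orcl",
    "Rumor text",
    poster="user42",
)
print(alert["Subject"])
```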
23. In step S38 the tracking index for the resource location is
then updated based on the changed (or revised) information
detected. To the extent necessary, one or more additional tracking
indexes may be employed for a single resource location to more
particularly identify the variety of information (e.g., multiple
messages, graphic and text messages, etc.) presented in the
location. (The tracking index may be stored in client file 14,
memory 22, one or more other storage devices (not shown), or may
alternatively be derived when needed from other data stored.) While
preferred embodiments of the invention have been described and
illustrated, it should be apparent that many modifications to the
embodiments and implementations of the invention can be made
without departing from the spirit or scope of the invention. For
example, while only a real-time search technology has been
specifically illustrated to monitor and retrieve information from
resources located on the Internet, the invention may easily be
deployed to monitor other communication networks (e.g., intranets,
private bulletin boards, individual local or wide area networks,
proprietary chat rooms, ICQ, IRC channels, instant messaging
systems, ThirdVoice postings, etc.) using real-time or
non-real-time systems in lieu of or in addition to the monitoring
of the Internet resources. The client files 14 (as well as the
resources listed therein) may be processed sequentially or in
parallel by one or more processing devices. Additional modules may
be added to interact with the FIRST system. For example, the alert
messages generated may be forwarded to the user for manual or
automatic annotation (e.g., categorizing alert information) and
returned to a repository module for statistical analysis and
archival storage. Thus, in one implementation, the user reviewing
the information in the alert message may be presented with a series
of input buttons in which to categorize the information (e.g.,
"irrelevant," "significant," "critical," etc.) and automatically
reply or forward the information to the repository.
24. The modules described herein, particularly those illustrated in
FIGS. 1 and 2, may be one or more hardware, software, or hybrid
components residing in (or distributed among) one or more local or
remote computer systems. Although the modules are shown as
physically separated components, it should be readily apparent that
the modules may be combined or further separated into a variety of
different components, sharing different resources (including
processing units, memory, clock devices, software routines, etc.)
as required for the particular implementation of the embodiments
disclosed herein. Indeed, even a single general purpose computer
executing a computer program stored on a recording medium to
produce the functionality and any other memory devices referred to
herein may be utilized to implement the illustrated embodiments.
User interface device 10 may be any device used to input and/or
output information. The interface device 10 may be implemented as a
graphical user interface (GUI) containing a display or the like, or
may be a link to other user input/output devices known in the
art.
25. In addition, memory unit 22 described herein may be any one or
more of the known storage devices (e.g., Random Access Memory
(RAM), Read Only Memory (ROM), hard disk drive (HDD), floppy drive,
zip drive, compact disk-ROM, DVD, bubble memory, etc.), and may
also be one or more memory devices embedded within CPU 20, or
shared with one or more of the other components. The computer
programs or algorithms described herein may easily be configured as
one or more hardware modules, and the hardware modules shown may
easily be configured as one or more software modules without
departing from the invention. Accordingly, the invention is not
limited by the foregoing description or drawings, but is only
limited by the scope of the appended claims.
* * * * *