U.S. patent application number 09/875953 was filed with the patent office on 2001-06-08 and published on 2001-10-18 for apparatus and method of implementing fast internet real-time search technology (FIRST).
Invention is credited to Cerna, James J. JR. and Gonzalez, Carlos M.
Publication Number | 20010032203 |
Application Number | 09/875953 |
Family ID | 23619730 |
Filed Date | 2001-06-08 |
United States Patent Application | 20010032203 |
Kind Code | A1 |
Gonzalez, Carlos M.; et al. | October 18, 2001 |
Apparatus and method of implementing fast internet real-time search
technology (FIRST)
Abstract
An apparatus for and method of monitoring and retrieving
identifiable statements and other information pertinent to one or
more desired search terms or concepts. In a preferred embodiment, a
fast Internet real-time search technology (FIRST) system provides a
user interface for inputting and editing a list of desired search
terms, and a list of resource locations to be monitored. A server
causes the periodic access of each resource location listed to
determine whether any information on a particular resource location
has been added that is pertinent to the desired search terms since
the previous visit to the location. Any added information is
returned to the server for processing. An alert message can be
generated and forwarded to an intended recipient to alert the
recipient of the added information.
Inventors: | Gonzalez, Carlos M. (Menlo Park, CA); Cerna, James J. JR. (Redwood Shores, CA) |
Correspondence Address: |
Eric Oliver
DICKSTEIN SHAPIRO MORIN & OSHINSKY LLP
2101 L Street NW
Washington, DC 20037-1526
US
|
Family ID: | 23619730 |
Appl. No.: | 09/875953 |
Filed: | June 8, 2001 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
09875953 | Jun 8, 2001 | |
09726537 | Dec 1, 2000 | |
09409256 | Sep 30, 1999 | 6260041 |
Current U.S. Class: | 1/1; 707/999.003; 707/999.201; 707/E17.109; 709/223 |
Current CPC Class: | Y10S 707/99934 20130101; G06F 16/00 20190101; Y10S 707/99952 20130101; Y10S 707/99936 20130101; Y10S 707/99933 20130101; Y10S 707/99939 20130101; Y10S 707/99943 20130101 |
Class at Publication: | 707/3; 707/201; 709/223 |
International Class: | G06F 017/30; G06F 015/173 |
Claims
What is claimed as new and desired to be protected by Letters
Patent of the United States is:
1. An information monitoring system comprising: a processing unit;
and a memory, wherein a computer program is stored in said memory
for execution by said processing unit to locate real-time
information pertinent to a desired search term, to determine if the
located information differs from tracking information corresponding
to a given location, and to generate an alert according to a
difference from the tracking information.
2. The information monitoring system as recited in claim 1, wherein
said memory further stores a resource listing containing a list of
a plurality of locations to be monitored for information pertinent
to a desired search term.
3. The information monitoring system as recited in claim 2, wherein
the resource listing identifies locations using a uniform resource
locator (URL) that identifies web pages, message boards, and
locations of discussion groups located on the Internet.
4. The information monitoring system as recited in claim 3, wherein
the resource listing identifies e-mail recipients using e-mail
addresses, said processing unit executing the computer program so
as to monitor e-mail messages addressed to the identified e-mail
recipients to locate information pertinent to the search terms.
5. The information monitoring system as recited in claim 4, wherein
said processing unit monitors the listed locations for information
pertinent to desired search terms by performing a search task that
attempts to match the text of at least one desired search term with
the text of the resource location identified in the resource
listing.
6. The information monitoring system as recited in claim 4, wherein
said processing unit monitors the listed locations for information
pertinent to desired search terms by performing a search task that
returns data which contain references to messages on the resource
location identified in the resource listing.
7. A method of searching for information pertinent to a desired
list of terms, the method comprising the steps of: periodically
accessing predetermined resources found on a communication network
so as to effect real-time monitoring of the resources; downloading
data from each of the predetermined resources; determining whether
the downloaded data contains information pertinent to at least one
term on the desired list of terms; and identifying any updates to
the pertinent information downloaded from each of the predetermined
resources.
8. The method of searching for information recited in claim 7,
further comprising the step of modifying a client file containing
the list of predetermined resources to include e-mail newsletters,
Usenet newsgroups, and proprietary network chat rooms to be
accessed by said step of periodically accessing.
9. The method of searching for information recited in claim 8,
further comprising the step of sending an e-mail alert message to
at least one alert addressee listed in the client file, the alert
message containing the update to the pertinent information
identified in said step of identifying any updates to the pertinent
information.
10. The method of searching for information recited in claim 9,
further comprising the step of generating an alert message
containing text, graphics, audio or video information pertinent to
at least one term on the desired list of terms contained in the
client file.
11. The method of searching for information recited in claim 7,
wherein said step of determining whether the downloaded data
contains information pertinent to at least one term on the desired
list of terms comprises the substep of performing a concept search
that identifies information as pertinent to at least one desired
term based on conceptual relevancy irrespective of the textual
match between the listed terms and terms found in the data
downloaded from the predetermined resources.
12. A fast Internet real-time search technology (FIRST) system for
use in monitoring information on Web pages, message boards, chat
rooms, discussion groups, e-mail messages, and other communications
over the Internet, the FIRST system comprising: a user interface
which creates client files, each client file including a list of
desired search terms for which the FIRST system is to monitor
communications over the Internet, a list of desired resource
locations on the Internet for which the FIRST system is to monitor,
and a list of desired recipients for whom the FIRST system is to
alert upon uncovering information concerning one or more of the
desired search terms listed in the client file; a Maximillian
server processing requests for monitoring communications over the
Internet, said Maximillian server including a client file reader
parsing client files to coordinate execution of monitoring for
desired search terms in the resource locations identified in the
client files, and a script directory identifying at least one bot
launched to examine data retrieved from at least one resource
location identified in the client files; and an alert generator
communicating an alert from the FIRST system to the desired
recipients listed in the client files based on results produced
after examination by the at least one bot, wherein the alert is an
e-mail message sent to all of the desired recipients listed in a
particular client file corresponding to the results produced by the
at least one bot.
13. The fast Internet real-time search technology (FIRST) system
according to claim 12, wherein the resource locations include
static and dynamic Websites, user postings on Websites, and e-mail
bulletins, and wherein the data examined by the at least one bot
includes textual, pictorial, aural, and visual information.
14. The fast Internet real-time search technology (FIRST) system
according to claim 12, further comprising a plurality of bots
launched by said Maximillian server in accordance with the script
directory, each bot examining a respective one of the resource
locations listed in the client file.
15. The fast Internet real-time search technology (FIRST) system
according to claim 14, wherein at least one of said plurality of
bots examines a resource location and determines whether a textual
match is made between the desired search terms and the data
retrieved from the resource location.
16. The fast Internet real-time search technology (FIRST) system
according to claim 15, wherein the at least one of said plurality
of bots examines a resource location and determines whether a
conceptual match is made between the desired search terms and the
data retrieved from the resource location.
17. An article of manufacture comprising a machine-readable storage
medium having stored therein indicia of a plurality of
machine-executable control program steps, the control program
comprising the steps of: (a) inputting a list of monitored sites
and identified terms, wherein data related to such terms may appear
in the monitored sites; (b) retrieving data from one of the
monitored sites; (c) examining the data to return any portion of
the data related to the identified terms; (d) using a tracking
index to detect a real-time change in the data on the monitored
site retrieved in step (b); (e) retrieving the real-time change in
the data on the monitored site based on the tracking index; (f)
generating an information alert to inform an intended recipient of
the real-time change in data on the monitored site retrieved in
step (b); and (g) repeating steps (b) through (f) for each
monitored site listed in step (a).
18. The article of manufacture of claim 17, wherein the returning
step (c) of the control program comprises the substep of retrieving
messages from message boards dedicated to discussion of at least
one term identified in said inputting step (a).
19. The article of manufacture of claim 17, wherein the tracking
index of step (d) is a message identification number originating on
a monitored site.
20. The article of manufacture of claim 19, wherein the change in
the data retrieved in step (e) includes a subject of a message on a
message board related to at least one term identified in said
inputting step (a) and the identity of the entity posting the
message.
Description
BACKGROUND OF THE INVENTION
[0001] No single academic, corporate, governmental, or non-profit
entity administers the activities of people on the Internet. The
very existence and operation of the Internet stems from the fact
that hundreds of thousands of separate operators of computers and
computer networks independently use common data transfer protocols
to exchange communications and information with other computers
(which in turn exchange communications and information with still
other computers). There is no centralized storage location, control
point, or communications channel for the Internet, and it would not
be technically feasible for a single entity to control all of the
information conveyed on the Internet.
[0002] The explosive growth in popularity of the Internet over
recent years is in large part based on the unrestricted
communication medium it provides. The Internet has created a very
low cost forum in which people can freely publish information,
views and opinions. Ironically, it is the Internet's ability to
empower its users with the free flow of information that makes
its users the most vulnerable. The ease, for example, with
which a company can generate "buzz" about its new product or
service through strategic postings in Internet message boards, chat
rooms, and discussion forums can just as easily be exploited by
stock-manipulating, rumor-mongering short-sellers to distribute
false or misleading information about the company and its
offerings.
[0003] Although Internet search-engines and directory services such
as AltaVista, Excite, Hotbot, Lycos, and Yahoo may be used to
gather some of the information propagated around the Internet about
a company and its products or services, these search entities
generally maintain static databases that are updated infrequently
relative to the dynamic information exchanges that transpire over
the Internet on a daily basis.
SUMMARY OF THE INVENTION
[0004] Preferred embodiments of the invention solve the foregoing
(and other) problems, and present significant advantages and
benefits by providing an apparatus for and method of monitoring and
retrieving identifiable statements and other information pertinent
to one or more desired search terms or concepts. In a preferred
embodiment, a fast Internet real-time search technology (FIRST)
system provides a user interface for inputting and editing a list
of desired search terms, and a list of resource locations to be
monitored. A server causes the periodic access of each resource
location listed to determine whether any information on a
particular resource location has been added that is pertinent to
the desired search terms since the previous visit to the location.
Any added information is returned to the server for processing. An
alert message can be generated and forwarded to an intended
recipient to alert the recipient of the added information.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] Many advantages, features, and applications of the invention
will be apparent from the following detailed description of the
invention which is provided in connection with the accompanying
drawings in which:
[0006] FIG. 1 is a block diagram illustrating a preferred
embodiment of the invention that monitors resource locations on the
Internet;
[0007] FIG. 2 is a block diagram of a preferred embodiment of the
Maximillian server depicted in the system illustrated in FIG. 1;
and
[0008] FIG. 3 is a flow chart illustrating the operational flow of
a preferred embodiment of the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0009] Preferred embodiments and applications of the invention will
now be described with reference to FIGS. 1-3. Other embodiments may
be realized and structural or logical changes may be made to the
disclosed embodiments without departing from the spirit or scope of
the invention. Although the invention is particularly described as
applied to the monitoring and retrieval from resource locations
(e.g., Web pages, message boards, etc.) on the Internet of
information pertinent to a list of desired search terms, it should
be readily apparent that the invention may be embodied in any
searching mechanism or other retrieval service having the same or
similar problems.
[0010] In a preferred embodiment, a monitoring and retrieval
apparatus (and corresponding method) is embodied in a fast Internet
real-time search technology (FIRST) system, as illustrated in FIG.
1. As shown in FIG. 1, the FIRST system is composed of a central
processing structure 12 in the form of a server (referred to herein
as "Maximillian server") used to monitor and retrieve information
from a network 18 (such as the Internet in the illustrated
embodiment). Maximillian server 12 receives inputs and delivers
output to users through user interface 10. A series of files are
accessed, generated, and updated by Maximillian server 12 during
operation of the FIRST system, particularly files such as client
file 14 and script directory 16, as will be described in more
detail below.
[0011] As shown in FIG. 2, Maximillian server 12 is preferably
embodied using one or more components coupled together using bus 28
(although alternative connection schemes known in the art may also
be used). As illustrated in FIG. 2, a central processing unit (CPU)
20 is provided for execution of one or more computer programs
stored in a recording medium such as memory 22. CPU 20 performs,
controls, or at least informs the various processing steps
performed by the FIRST system in monitoring and retrieving data
from network 18. A client reader 24 is provided to access data
stored in one or more client files 14.
[0012] In the preferred embodiment, "client files" contain a
listing of one or more desired terms or concepts for which a user
would like to have the FIRST system monitor network 18. For
example, a user may wish to have the FIRST system monitor the
Internet for any information related or pertinent to the commonly
traded stock of Oracle Corporation. In the client file, therefore,
the user could enter the company name "Oracle" or "Oracle
Corporation" as desired search terms, together with any
additional search terms such as the stock symbol "ORCL". As will be
described in detail below, Maximillian server 12, after accessing
client file 14, would monitor network 18 (in the form of the
Internet) for any information related or pertinent to the listed
search terms "Oracle," "Oracle Corporation," or "ORCL." Any number
of client files may be activated for monitoring by Maximillian
server 12.
[0013] Client files 14 further include a listing of resources that
are to be monitored by the FIRST system. In the preferred
embodiment, the resources may be identified by uniform resource
locator (URL) addresses of resource locations on the Internet. The
URLs may represent Websites (static or dynamic), individual Web
pages, message boards, locations of discussion groups, as well as
any other communication resource, including e-mail messages. When
an e-mail address is listed as a resource, the FIRST system
monitors the content of e-mail messages (e.g., e-mail bulletins,
list server messages, private mail, etc.) sent to the listed e-mail
address. (In the preferred embodiment, the e-mail messages are
routed simultaneously to both the intended recipient and the
Maximillian server, although alternative e-mail routing schemes
known in the art may be implemented.)
[0014] In the preferred embodiment, Maximillian server 12 accesses
each of the resources listed in the client file 14 currently being
processed. In accessing each resource, one or more "bots" are used
by the FIRST system. The use of the term "bot" herein refers to the
execution of an individual script (or computer program) by one or
more processing devices, including CPU 20. In the preferred
embodiment, a single bot is assigned to perform a single script in
the script directory 16, although other embodiments could permit
multiple scripts being executed by a single bot (or a single script
being executed by multiple bots), as desired. Each bot is
preferably programmed uniquely in accordance with the tasks
required.
[0015] Script directory 16 contains various scripts executed by one
or more processing devices during the operation of the FIRST
system. The scripts are run automatically, preferably under control
of Maximillian server 12, to perform discrete actions such as
responding to different types of input, generating output, and
carrying out various tasks as dictated by the specific script being
implemented.
[0016] An alert generator 26 is provided to compose alert messages
for transmission to one or more intended recipients. In the
preferred embodiment, the client file 14 will include one or more
intended recipients who are to receive an alert message once the
FIRST system uncovers information relative to the desired search
terms. In the preferred embodiment, alert generator 26 composes an
e-mail message concerning the extent to which information related
or pertinent to the desired search terms is found by the FIRST
system. The e-mail message is sent by Maximillian server 12 to one
or more intended recipients listed in the client file 14. Although
the use of e-mail is made in the illustrated embodiment, it should
be readily apparent that any means of communication (e.g.,
facsimile, voice mail, etc.), including real-time display on user
interface 10 (or remote display over network 18) may be used to
provide an alert message.
[0017] In operation, the illustrated embodiment described above
allows the FIRST system to monitor and retrieve information from
one or more predetermined resources in accordance with the
operational flow depicted in FIG. 3 (steps S30 through S38). In
the preferred embodiment, one or more client files 14 are read by
client reader 24. For illustration purposes, it is assumed that
only a single client file is operational at any one time. In this
illustration, therefore, the data in the client file 14 is parsed
in step S30 to identify the different search terms or concepts, the
various resources to be monitored, the intended alert recipients,
and any other identification data presented in client file 14.
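
As a rough illustration of the parsing performed in step S30, the sketch below assumes a client file stored as JSON text. The application does not specify a file format, so the keys `terms`, `resources`, and `recipients`, the `ClientFile` class, and the example URL and address are all hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ClientFile:
    """Hypothetical in-memory form of a client file (field names are assumptions)."""
    terms: list[str] = field(default_factory=list)       # desired search terms
    resources: list[str] = field(default_factory=list)   # resource locations (URLs)
    recipients: list[str] = field(default_factory=list)  # intended alert recipients


def parse_client_file(text: str) -> ClientFile:
    """Parse JSON text into a ClientFile, dropping blank entries (step S30)."""
    raw = json.loads(text)

    def clean(key: str) -> list[str]:
        # Strip whitespace and skip empty strings; a missing key yields [].
        return [item.strip() for item in raw.get(key, []) if item.strip()]

    return ClientFile(clean("terms"), clean("resources"), clean("recipients"))


example = """
{
  "terms": ["Oracle", "Oracle Corporation", "ORCL"],
  "resources": ["http://boards.example.com/orcl"],
  "recipients": ["analyst@example.com"]
}
"""
client = parse_client_file(example)
print(client.terms)
```

Keeping the three lists in one record mirrors the description of a client file as the single source of search terms, monitored resources, and alert recipients.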
[0018] Maximillian server 12 then operates to identify the
different resources listed in client file 14. In the preferred
embodiment, Maximillian server 12 sequentially accesses individual
resources as listed in client file 14, although the embodiment may
easily be modified to access the resources in any order, serially
or in parallel. In this embodiment, the sequential accessing
process is performed at or near the maximum processing speed
permitted by the system so as to effect monitoring of the listed
resources in real-time. For each resource location identified,
Maximillian server 12 attempts to download data (e.g., hypertext
mark-up language (HTML) data) from the listed location, step S31.
The appropriate bot, as listed in script directory 16, is then
launched in step S32 and the downloaded data examined in step S33.
In the preferred embodiment, the launched bot examines the data
directly, although in an alternative embodiment a separate
processing device may be used.
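
The sequential flow of steps S31 through S33 might be sketched as follows. This is a minimal illustration, not the disclosed implementation: modelling the script directory as a dictionary keyed by host name, the `default_bot` line filter, and the injectable `download` parameter are all assumptions made for the example.

```python
from typing import Callable
from urllib.parse import urlparse
from urllib.request import urlopen

# A "bot" here is any callable that examines downloaded data for the terms.
Bot = Callable[[str, list[str]], list[str]]


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download the raw data (e.g., HTML) from one resource location (step S31)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def default_bot(data: str, terms: list[str]) -> list[str]:
    """Fallback bot: return every line that mentions at least one term."""
    return [line for line in data.splitlines()
            if any(term.lower() in line.lower() for term in terms)]


def select_bot(url: str, script_directory: dict[str, Bot]) -> Bot:
    """Pick the bot registered for the resource's host (hypothetical lookup rule)."""
    return script_directory.get(urlparse(url).netloc, default_bot)


def monitor_once(resources: list[str], terms: list[str],
                 script_directory: dict[str, Bot],
                 download: Callable[[str], str] = fetch) -> dict[str, list[str]]:
    """Visit each listed resource in order and examine its data."""
    results: dict[str, list[str]] = {}
    for url in resources:
        try:
            data = download(url)                   # step S31
        except OSError:
            continue                               # unreachable resource: skip it
        bot = select_bot(url, script_directory)    # step S32
        results[url] = bot(data, terms)            # step S33
    return results
```

Passing a stub as `download` allows the loop to be exercised without network access; calling `monitor_once` repeatedly approximates the periodic access described above.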
[0019] The examination done in step S33 may employ one or more of a
variety of well-known search/retrieval techniques. For example, a
plain full-text search may be employed that looks for the exact
word match of one or more desired search terms. As an alternative
to (or in conjunction with) the full-text search, a conceptual or
relevancy search may be employed that associates each of the
desired search terms with different concepts or topics and attempts
to match the different concepts/topics with those found in the
resource location accessed. In addition, the examination may simply
search for a specific type of information such as data representing
messages on a given topic. Thus, as an illustration, where a
specific Website or Web page is dedicated to presenting a
discussion forum on a subject (e.g., ORCL stock), the examination
simply looks for any information presented, regardless of the exact
match of text with listed search terms. In an alternative
embodiment, the searching function is performed by the
hardware/software resident on the resource location (or on a remote
location).
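
The two examination styles described above, exact-word matching and conceptual matching, can be contrasted in a short sketch. The case-insensitive comparison and the `CONCEPTS` table are assumptions; the application does not say how terms are associated with concepts, and a production system would likely use a far richer relevancy model.

```python
import re


def full_text_match(text: str, terms: list[str]) -> list[str]:
    """Return the terms that appear in the text as whole words."""
    found = []
    for term in terms:
        # \b anchors keep "ORCL" from matching inside a longer token.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


# Hypothetical concept table: each search term maps to related phrases.
CONCEPTS = {
    "ORCL": {"database software", "enterprise software"},
}


def concept_match(text: str, terms: list[str],
                  concepts: dict[str, set[str]]) -> list[str]:
    """Return the terms whose associated concepts appear in the text,
    whether or not the term itself is present."""
    lowered = text.lower()
    return [term for term in terms
            if any(phrase in lowered for phrase in concepts.get(term, ()))]
```

The conceptual variant can flag a posting that never mentions the search term itself, which is the point of the relevancy search described in this paragraph.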
[0020] In one preferred embodiment, the information (e.g., text,
pictorial, aural, video, etc.) found in step S33 to be related or
otherwise pertinent to one or more of the desired search terms
listed in client file 14 is culled out or returned from the
resource location using any number of known techniques (e.g., using
a "grep" command). The returned information is assumed to have (or
be given) a unique identifier referred to herein as a "tracking
index." The tracking index may differ from one resource location to
another. For example, in some resources the information returned is
a discussion group message that has been assigned a unique message
ID number; in other resources, the subject of the message itself
serves as a unique identifier. The tracking index is thus
determined in step S34.
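
One way to picture step S34 is a helper that prefers a message ID and falls back to the subject line. The dictionary keys `id` and `subject` and the `id:`/`subject:` prefixes are hypothetical; they only keep the two kinds of identifier from colliding in this sketch.

```python
from typing import Optional


def tracking_index(message: dict) -> Optional[str]:
    """Derive a tracking index for one returned item (step S34)."""
    if message.get("id") is not None:
        # Resource assigns its own unique message ID number.
        return f"id:{message['id']}"
    subject = message.get("subject")
    if subject:
        # No ID available: use the whitespace-normalized, lowercased subject.
        return "subject:" + " ".join(subject.split()).lower()
    return None  # nothing usable as an identifier
```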
[0021] By comparing in step S35 the unique identifier of the
returned information with the identifier previously associated with
the same resource location, the FIRST system can easily determine
whether any changes in the information have been made (e.g.,
messages added, revised, etc.). Where there is no
substantial difference in the tracking index for a given resource
location, as determined in step S35, the data from another resource
location can be downloaded by repeating process steps S32-S34, for
example. If some change in the information on the resource location
is detected, however, for example, through a change in the tracking
index, the changed information is retrieved in step S36.
[0022] Alert generator 26 is then employed to compose in step S37
an alert message to inform an intended recipient of the information
uncovered by the FIRST system. In the preferred embodiment, alert
generator 26 composes an e-mail message for transmission to one or
more of the intended recipients listed in client file 14. The
composed message may provide a copy of the information (e.g., text,
audio/video file, graphic image, etc.) found to be new (or revised)
concerning one or more of the desired search terms. Additional
information such as the entity posting the new (or revised)
information, the revision date, posting time, etc. that can be
retrieved from the resource location may also be added to the alert
message, as desired. Alert generator 26 may also utilize
alternative delivery mechanisms such as real-time display of alert
messages on user interface 10, direct real-time feed to users over
network 18, etc.
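
Composing the e-mail alert of step S37 could look roughly like the sketch below, which uses Python's standard `email.message.EmailMessage`. The sender address, subject wording, and the `subject`/`author` keys on each item are assumptions for illustration; actual delivery (for example over SMTP) is left out.

```python
from email.message import EmailMessage


def compose_alert(recipients: list[str], resource: str, items: list[dict],
                  sender: str = "first-alerts@example.com") -> EmailMessage:
    """Build an alert message listing new or revised items (step S37)."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = f"FIRST alert: {len(items)} new item(s) at {resource}"

    lines = [f"New or revised information found at {resource}:", ""]
    for item in items:
        subject = item.get("subject", "(no subject)")
        author = item.get("author", "unknown")
        lines.append(f"- {subject} (posted by {author})")
    message.set_content("\n".join(lines))
    return message
```

The returned object can be handed to any delivery mechanism, which fits the paragraph's note that e-mail is only one of several possible alert channels.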
[0023] In step S38 the tracking index for the resource location is
then updated based on the changed (or revised) information
detected. To the extent necessary, one or more additional tracking
indexes may be employed for a single resource location to more
particularly identify the variety of information (e.g., multiple
messages, graphic and text messages, etc.) presented in the
location. (The tracking index may be stored in client file 14,
memory 22, one or more other storage devices (not shown), or may
alternatively be derived when needed from other data stored.)
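
The comparison, retrieval, and update of steps S35, S36, and S38 can be sketched with a per-resource set of previously seen tracking indexes. Storing the indexes in an in-memory dictionary is an assumption; as noted above, the application allows them to live in the client file, memory 22, or other storage.

```python
def detect_changes(resource: str, current: dict[str, str],
                   store: dict[str, set[str]]) -> list[str]:
    """Return items whose tracking index was not seen on the previous visit.

    `current` maps tracking index -> item text for this visit;
    `store` maps resource location -> set of indexes already seen.
    """
    seen = store.setdefault(resource, set())
    # Steps S35/S36: keep only items with an unfamiliar tracking index.
    new_items = [item for index, item in current.items() if index not in seen]
    # Step S38: record everything seen on this visit.
    seen.update(current)
    return new_items


store: dict[str, set[str]] = {}
first_visit = detect_changes("board", {"id:1": "msg one", "id:2": "msg two"}, store)
second_visit = detect_changes("board", {"id:2": "msg two", "id:3": "msg three"}, store)
print(first_visit, second_visit)
```

A scheme keyed only on message IDs will not notice a message revised in place under the same ID; detecting revisions would require an additional tracking index, as this paragraph contemplates.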
[0024] While preferred embodiments of the invention have been
described and illustrated, it should be apparent that many
modifications to the embodiments and implementations of the
invention can be made without departing from the spirit or scope of
the invention. For example, while only a real-time search
technology has been specifically illustrated to monitor and
retrieve information from resources located on the Internet, the
invention may easily be deployed to monitor other communication
networks (e.g., intranets, private bulletin boards, individual
local or wide area networks, proprietary chat rooms, ICQ, IRC
channels, instant messaging systems, ThirdVoice postings, etc.)
using real-time or non-real-time systems in lieu of or in addition
to the monitoring of the Internet resources. The client files 14
(as well as the resources listed therein) may be processed
sequentially or in parallel by one or more processing devices.
Additional modules may be added to interact with the FIRST system.
For example, the alert messages generated may be forwarded to the
user for manual or automatic annotation (e.g., categorizing alert
information) and returned to a repository module for statistical
analysis and archival storage. Thus, in one implementation, the
user reviewing the information in the alert message may be
presented with a series of input buttons in which to categorize the
information (e.g., "irrelevant," "significant," "critical," etc.)
and automatically reply or forward the information to the
repository.
[0025] The modules described herein, particularly those illustrated
in FIGS. 1 and 2, may be one or more hardware, software, or hybrid
components residing in (or distributed among) one or more local or
remote computer systems. Although the modules are shown as
physically separated components, it should be readily apparent that
the modules may be combined or further separated into a variety of
different components, sharing different resources (including
processing units, memory, clock devices, software routines, etc.)
as required for the particular implementation of the embodiments
disclosed herein. Indeed, even a single general purpose computer
executing a computer program stored on a recording medium to
produce the functionality and any other memory devices referred to
herein may be utilized to implement the illustrated embodiments.
User interface device 10 may be any device used to input and/or
output information. The interface device 10 may be implemented as a
graphical user interface (GUI) containing a display or the like, or
may be a link to other user input/output devices known in the
art.
[0026] In addition, memory unit 22 described herein may be any one
or more of the known storage devices (e.g., Random Access Memory
(RAM), Read Only Memory (ROM), hard disk drive (HDD), floppy drive,
zip drive, compact disk ROM (CD-ROM), DVD, bubble memory, etc.), and may also
be one or more memory devices embedded within CPU 20, or shared
with one or more of the other components. The computer programs or
algorithms described herein may easily be configured as one or more
hardware modules, and the hardware modules shown may easily be
configured as one or more software modules without departing from
the invention. Accordingly, the invention is not limited by the
foregoing description or drawings, but is only limited by the scope
of the appended claims.
* * * * *