U.S. patent application number 10/610574 was filed with the patent office on 2003-07-02 and published on 2004-01-08 for systems and methods for providing online event tracking.
Invention is credited to Kubala, Francis G., Shepard, Scott, Srivastava, Amit.
Application Number | 20040006748 10/610574 |
Family ID | 30003990 |
Filed Date | 2003-07-02 |
United States Patent Application | 20040006748 |
Kind Code | A1 |
Srivastava, Amit; et al. | January 8, 2004 |
Systems and methods for providing online event tracking
Abstract
A system notifies a user of the detection of data that is
relevant to an event of interest. The system obtains a user profile
that includes one or more example documents that define the event.
The system receives data that corresponds to multimedia information
and determines the relevance of the data to the event based on the
one or more example documents. The system notifies the user when
the data is determined to be relevant.
Inventors: | Srivastava, Amit (Waltham, MA); Shepard, Scott (Waltham, MA); Kubala, Francis G. (Boston, MA) |
Correspondence
Address: |
Leonard C. Suchyta
c/o Christian Andersen
Verizon Corporate Services Group Inc.
600 Hidden Ridge, HQE03H01
Irving
TX
75038
US
|
Family ID: |
30003990 |
Appl. No.: |
10/610574 |
Filed: |
July 2, 2003 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60394064 | Jul 3, 2002 | |
60394082 | Jul 3, 2002 | |
60419214 | Oct 17, 2002 | |
Current U.S. Class: | 715/201; 715/255 |
Current CPC Class: | H04M 2201/42 20130101; G10L 25/78 20130101; H04M 2201/60 20130101; Y10S 707/99943 20130101; H04M 2203/305 20130101; G10L 15/26 20130101 |
Class at Publication: | 715/530 |
International Class: | G06F 017/21 |
Government Interests
[0003] The U.S. Government may have a paid-up license in this
invention and the right in limited circumstances to require the
patent owner to license others on reasonable terms as provided for
by the terms of Contract No. 2001*S651600*000 awarded by the Office
of Advanced Information Technology.
Claims
What is claimed is:
1. A method for identifying documents that are relevant to an event
of interest, comprising: receiving, from a user, one or more
example documents that define the event; obtaining documents in
real time that correspond to information created in a plurality of
media formats; determining relevance of the documents to the event
based on the one or more example documents; and alerting the user
when one or more of the documents are determined to be
relevant.
2. The method of claim 1, wherein the one or more example documents
include at least one of text documents, audio documents, and video
documents.
3. The method of claim 1, wherein the one or more example documents
include a total of at least approximately two thousand words.
4. The method of claim 1, wherein the information includes at least
two of real time audio broadcasts, real time video broadcasts, and
text streams or files.
5. The method of claim 1, further comprising: building a
statistical language model using the one or more example
documents.
6. The method of claim 5, wherein the determining relevance of the
documents includes: finding similarities between words in the
documents and words in the one or more example documents, and
identifying one of the documents as relevant when the words in the
document are similar to the words in the one or more example
documents.
7. The method of claim 1, wherein the determining relevance of the
documents includes: determining similarities between the documents
and the one or more example documents, and identifying one of the
documents as relevant when the document is similar to at least one
of the one or more example documents.
8. The method of claim 1, wherein the determining relevance of the
documents includes: generating scores for the documents, and
determining that ones of the documents with scores above a
threshold are relevant.
9. The method of claim 8, further comprising: providing the ones of
the documents to the user based on the scores.
10. The method of claim 8, wherein the generating scores includes:
determining scores based on degrees of similarities between the
documents and the one or more example documents.
11. The method of claim 1, wherein the alerting the user includes
at least one of: placing a telephone call to the user, sending an
e-mail to the user, sending a page to the user, sending an instant
message to the user, and sending a facsimile to the user.
12. The method of claim 1, wherein the alerting the user includes:
sending an alert to the user after a predetermined number of the
documents are determined to be relevant.
13. The method of claim 1, further comprising: receiving, from the
user, a request for additional information relating to the event,
and sending the additional information to the user.
14. The method of claim 13, wherein the additional information
includes the one or more documents that are determined to be
relevant.
15. The method of claim 13, wherein the additional information
includes the information, corresponding to the one or more
documents that are determined to be relevant, in one of the media
formats in which the information was created.
16. A system for identifying data that is relevant to an event of
interest, comprising: means for obtaining, from a user, a user
profile that includes one or more example documents that define the
event; means for receiving real-time data that corresponds to
multimedia information; means for determining relevance of the data
to the event based on the one or more example documents; and means
for notifying the user when the data is determined to be
relevant.
17. An event tracking system, comprising: collection logic
configured to: receive data items that include textual
representations of multimedia information; and tracking logic
configured to: obtain a user profile that includes one or more
example documents that define an event for which a user desires
data, determine relevance of the data items received by the
collection logic to the event based on the user profile, and send
an alert to the user when at least one of the data items is
determined to be relevant.
18. The system of claim 17, wherein the one or more example
documents include at least one of text documents, audio documents,
and video documents.
19. The system of claim 17, wherein the one or more example
documents collectively include at least approximately two thousand
words.
20. The system of claim 17, wherein the multimedia information
includes at least one of real time audio broadcasts, real time
video broadcasts, text streams, and text files.
21. The system of claim 17, wherein the tracking logic is further
configured to: build a statistical language model using the one or
more example documents.
22. The system of claim 21, wherein when determining relevance of
the data items, the tracking logic is configured to: determine
similarities between words in the data items and words in the one
or more example documents, and identify one of the data items as
relevant when the words in the data item are similar to the words
in the one or more example documents.
23. The system of claim 17, wherein when determining relevance of
the data items, the tracking logic is configured to: determine
similarities between the data items and the one or more example
documents, and identify one of the data items as relevant when the
data item is similar to at least one of the one or more example
documents.
24. The system of claim 17, wherein when determining relevance of
the data items, the tracking logic is configured to: generate
scores for the data items, and determine that ones of the data
items with scores greater than a threshold are relevant.
25. The system of claim 24, wherein the tracking logic is further
configured to: provide the ones of the data items to the user based
on the scores.
26. The system of claim 24, wherein when generating scores, the
tracking logic is configured to: determine scores based on a degree
of similarity between the data items and the one or more example
documents.
27. The system of claim 17, wherein when sending an alert, the
tracking logic is configured to cause at least one of a telephone
call to be placed to the user, an e-mail to be sent to the user, a
page to be sent to the user, an instant message to be sent to the
user, and a facsimile to be sent to the user.
28. The system of claim 17, wherein when sending an alert, the
tracking logic is configured to: wait until a predetermined number
of the data items are determined to be relevant before sending the
alert to the user.
29. The system of claim 17, wherein the tracking logic is further
configured to: receive, from the user, a request for additional
information relating to the event, and send the additional
information to the user.
30. The system of claim 29, wherein the additional information
includes the at least one data item that is determined to be
relevant.
31. The system of claim 29, wherein the additional information
includes the multimedia information corresponding to the at least
one data item that is determined to be relevant.
32. A computer-readable medium that stores instructions which when
executed by a processor cause the processor to perform a method for
notifying a user of documents that are relevant to an event of
interest, the computer-readable medium comprising: instructions for
obtaining at least one example document that defines the event;
instructions for acquiring real-time documents corresponding to
information created in a plurality of media formats; instructions
for determining relevance of the real-time documents to the event
based on the at least one example document; and instructions for
notifying the user when one or more of the real-time documents are
determined to be relevant.
33. An event tracking system, comprising: one or more indexers
configured to: capture data, the data including at least one of
audio data, video data, and text data, and transcribe the data when
the data is the audio data or the video data to create text data;
and alert logic configured to: receive at least one example
document that defines an event for which a user desires
information, receive the text data from the one or more indexers,
determine relevance of the text data to the event based on the at
least one example document, and alert the user when the text data
is determined to be relevant.
34. A method for notifying a user of documents that are relevant to
an event of interest, comprising: receiving one or more example
documents that define the event; obtaining a plurality of types of
media documents; using a model-based approach to determine
relevance of the media documents to the event based on the one or
more example documents; and alerting the user when one or more of
the media documents are determined to be relevant.
Description
RELATED APPLICATION
[0001] This application claims priority under 35 U.S.C. § 119
based on U.S. Provisional Application Nos. 60/394,064 and
60/394,082, filed Jul. 3, 2002, and Provisional Application No.
60/419,214, filed Oct. 17, 2002, the disclosures of which are
incorporated herein by reference.
[0002] This application is related to U.S. patent application, Ser.
No. ______ (Docket No. 02-4039), entitled, "Systems and Methods for
Providing Real-Time Alerting," filed concurrently herewith and
incorporated herein by reference.
BACKGROUND OF THE INVENTION
[0004] 1. Field of the Invention
[0005] The present invention relates generally to multimedia
environments and, more particularly, to systems and methods for
tracking events and providing notification to users when events of
interest are detected.
[0006] 2. Description of Related Art
[0007] With the ever-increasing number of data producers throughout
the world, such as audio broadcasters, video broadcasters, news
stream sources, etc., it is getting harder to determine when
information relating to events of interest is created. One reason
for this is that the data exists in many different formats and in
many different languages.
[0008] The need for tracking events takes many forms. For example,
government agencies may desire all information relating to a
particular event, such as the World Trade Center attacks. Stock
brokers and fund managers may desire all information regarding a
company scandal or takeover. Lawyers may desire all information
regarding a controversial Supreme Court decision. Concerned parents
might desire all information regarding a particular disaster at
their son's college. These are but a few examples of the need for
event tracking.
[0009] A conventional approach to event tracking requires that the
individual desiring the information, or someone associated with
this individual, spend time searching for it. For example, the
individual might search the Internet and visit different web sites
to read news articles relating to the event. The individual might
also watch the news or news channels on television for information
relating to the event.
[0010] There are several problems with this approach. For example,
it is a time-consuming process to search out and peruse different
types of media to find information relevant to an event of
interest. To consistently monitor a wide range of media sources,
any of which can broadcast information of interest at any time of
the day and any day of the week, would require a rather large work
force.
[0011] As a result, there is a need for an automated event tracking
system that monitors multimedia broadcasts and alerts one or more
users when information relating to an event of interest is
detected.
SUMMARY OF THE INVENTION
[0012] Systems and methods consistent with the present invention
address this and other needs by providing event tracking that
monitors multimedia broadcasts against a user-provided profile to
identify information relating to an event of interest. The systems
and methods alert one or more users using one or more alerting
techniques when information relating to the event is
identified.
[0013] In one aspect consistent with the principles of the
invention, a system notifies a user of the detection of data that
is relevant to an event of interest. The system obtains a user
profile that includes one or more example documents that define the
event. The system receives data that corresponds to multimedia
information and determines the relevance of the data to the event
based on the one or more example documents. The system notifies the
user when the data is determined to be relevant.
[0014] In another aspect of the invention, an event tracking system
includes collection logic and tracking logic. The collection logic
receives data items that include textual representations of
multimedia information. The tracking logic obtains a user profile
that includes one or more example documents that define an event
for which a user desires data and determines the relevance of the
data items received by the collection logic to the event based on
the user profile. The tracking logic sends an alert to the user
when at least one of the data items is determined to be
relevant.
[0015] According to yet another aspect of the invention, an event
tracking system is provided. The event tracking system includes one
or more indexers and alert logic. The one or more indexers are
configured to capture data that includes audio data, video data,
and/or text data, and transcribe the data when the data is the
audio data or the video data to create text data. The alert logic
is configured to receive at least one example document that defines
an event for which a user desires information and receive the text
data from the one or more indexers. The alert logic is further
configured to determine the relevance of the text data to the event
based on the at least one example document and alert the user when
the text data is determined to be relevant.
[0016] According to a further aspect of the invention, a method for
notifying a user of documents that are relevant to an event of
interest is provided. The method includes receiving one or more
example documents that define the event and obtaining different
types of media documents. The method further includes using a
model-based approach to determine the relevance of the media
documents to the event based on the one or more example documents
and alerting the user when one or more of the media documents are
determined to be relevant.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] The accompanying drawings, which are incorporated in and
constitute a part of this specification, illustrate the invention
and, together with the description, explain the invention. In the
drawings,
[0018] FIG. 1 is a diagram of a system in which systems and methods
consistent with the present invention may be implemented;
[0019] FIGS. 2A-2C are exemplary diagrams of the multimedia sources
of FIG. 1 according to an implementation consistent with the
principles of the invention;
[0020] FIG. 3 is an exemplary diagram of an audio indexer of FIG.
1;
[0021] FIG. 4 is a diagram of a possible output of the speech
recognition logic of FIG. 3;
[0022] FIG. 5 is a diagram of a possible output of the story
segmentation logic of FIG. 3;
[0023] FIG. 6 is an exemplary diagram of the alert logic of FIG. 1
according to an implementation consistent with the principles of
the invention;
[0024] FIGS. 7 and 8 are flowcharts of exemplary processing for
providing information relating to an event of interest according to
an implementation consistent with the principles of the
invention;
[0025] FIG. 9 is an exemplary diagram of a graphical user interface
according to an implementation consistent with the principles of
the invention;
[0026] FIG. 10 is an exemplary diagram of a graphical user
interface once example documents have been provided by a user
according to an implementation consistent with the principles of
the invention;
[0027] FIG. 11 is an exemplary diagram of a graphical user
interface that presents relevant documents to a user according to
an implementation consistent with the principles of the invention;
and
[0028] FIG. 12 is an exemplary diagram of an entry in the list of
FIG. 11 according to an implementation consistent with the
principles of the invention.
DETAILED DESCRIPTION
[0029] The following detailed description of the invention refers
to the accompanying drawings. The same reference numbers in
different drawings may identify the same or similar elements. Also,
the following detailed description does not limit the invention.
Instead, the scope of the invention is defined by the appended
claims and equivalents.
[0030] Systems and methods consistent with the present invention
provide mechanisms for monitoring multimedia broadcasts against a
user-provided profile to identify information relating to an event
of interest. The systems and methods may use a model-based approach
to find the relevance of information to the user profile. The
systems and methods may alert one or more users upon detection of
relevant information.
[0031] It may be useful to begin with a definition of an event. An
event is an occurrence that is specific to a time, a place, or a
person. By contrast, a topic or subject is a broad level
description of happenings. Examples of events may include Sonny
Bono being killed in a skiing accident or the crash of American
Airlines flight 587. Examples of topics may include skiing
accidents or airplane disasters.
Exemplary System
[0032] FIG. 1 is a diagram of an exemplary system 100 in which
systems and methods consistent with the present invention may be
implemented. System 100 may include multimedia sources 110,
indexers 120, alert logic 130, database 140, and servers 150 and
160 connected to clients 170 via network 180. Network 180 may
include any type of network, such as a local area network (LAN), a
wide area network (WAN) (e.g., the Internet), a public telephone
network (e.g., the Public Switched Telephone Network (PSTN)), a
virtual private network (VPN), or a combination of networks. The
various connections shown in FIG. 1 may be made via wired,
wireless, and/or optical connections.
[0033] Multimedia sources 110 may include audio sources 112, video
sources 114, and text sources 116. FIGS. 2A-2C are exemplary
diagrams of audio sources 112, video sources 114, and text sources
116, respectively, according to an implementation consistent with
the principles of the invention.
[0034] FIG. 2A illustrates an audio source 112. In practice, there
may be multiple audio sources 112. Audio source 112 may include an
audio server 210 and one or more audio inputs 215. Audio input 215
may include mechanisms for capturing any source of audio data, such
as radio, telephone, and conversations, in any language. There may
be a separate audio input 215 for each source of audio. For
example, one audio input 215 may be dedicated to capturing radio
signals; another audio input 215 may be dedicated to capturing
conversations from a conference; and yet another audio input 215
may be dedicated to capturing telephone conversations. Audio server
210 may process the audio data, as necessary, and provide the audio
data, as an audio stream, to indexers 120. Audio server 210 may
also store the audio data.
[0035] FIG. 2B illustrates a video source 114. In practice, there
may be multiple video sources 114. Video source 114 may include a
video server 220 and one or more video inputs 225. Video input 225
may include mechanisms for capturing any source of video data, with
possibly integrated audio data in any language, such as television,
satellite, and a camcorder. There may be a separate video input 225
for each source of video. For example, one video input 225 may be
dedicated to capturing television signals; another video input 225
may be dedicated to capturing a video conference; and yet another
video input 225 may be dedicated to capturing video streams on the
Internet. Video server 220 may process the video data, as
necessary, and provide the video data, as a video stream, to
indexers 120. Video server 220 may also store the video data.
[0036] FIG. 2C illustrates a text source 116. In practice, there
may be multiple text sources 116. Text source 116 may include a
text server 230 and one or more text inputs 235. Text input 235 may
include mechanisms for capturing any source of text, such as
e-mail, web pages, newspapers, and word processing documents, in
any language. There may be a separate text input 235 for each
source of text. For example, one text input 235 may be dedicated to
capturing news wires; another text input 235 may be dedicated to
capturing web pages; and yet another text input 235 may be
dedicated to capturing e-mail. Text server 230 may process the
text, as necessary, and provide the text, as a text stream or file,
to indexers 120. Text server 230 may also store the text.
[0037] Returning to FIG. 1, indexers 120 may include one or more
audio indexers 122, one or more video indexers 124, and one or more
text indexers 126. Each of indexers 122, 124, and 126 may include
mechanisms that receive data from multimedia sources 110, process
the data, perform feature extraction, and output analyzed, marked
up, and enhanced language metadata. In one implementation
consistent with the principles of the invention, indexers 122-126
include mechanisms, such as the ones described in John Makhoul et
al., "Speech and Language Technologies for Audio Indexing and
Retrieval," Proceedings of the IEEE, Vol. 88, No. 8, August 2000,
pp. 1338-1353, which is incorporated herein by reference.
[0038] Audio indexer 122 may receive an input audio stream from
audio sources 112 and generate metadata therefrom. For example,
indexer 122 may segment the input stream by speaker, cluster audio
segments from the same speaker, identify speakers by name or
gender, and transcribe the spoken words. Indexer 122 may also
segment the input stream based on topic and locate the names of
people, places, and organizations. Indexer 122 may further analyze
the input stream to identify when each word was spoken (possibly
based on a time value). Indexer 122 may include any or all of this
information in the metadata relating to the input audio stream.
[0039] Video indexer 124 may receive an input video stream from
video sources 114 and generate metadata therefrom. For example,
indexer 124 may segment the input stream by speaker, cluster video
segments from the same speaker, identify speakers by name or
gender, identify participants using face recognition, and
transcribe the spoken words. Indexer 124 may also segment the input
stream based on topic and locate the names of people, places, and
organizations. Indexer 124 may further analyze the input stream to
identify when each word was spoken (possibly based on a time
value). Indexer 124 may include any or all of this information in
the metadata relating to the input video stream.
[0040] Text indexer 126 may receive an input text stream or file
from text sources 116 and generate metadata therefrom. For example,
indexer 126 may segment the input stream/file based on topic and
locate the names of people, places, and organizations. Indexer 126
may further analyze the input stream/file to identify when each
word occurs (possibly based on a character offset within the text).
Indexer 126 may also identify the author and/or publisher of the
text. Indexer 126 may include any or all of this information in the
metadata relating to the input text stream/file.
[0041] FIG. 3 is an exemplary diagram of indexer 122. Indexers 124
and 126 may be similarly configured. Indexers 124 and 126 may
include, however, additional and/or alternate components particular
to the media type involved.
[0042] As shown in FIG. 3, indexer 122 may include audio
classification logic 310, speech recognition logic 320, speaker
clustering logic 330, speaker identification logic 340, name
spotting logic 350, topic classification logic 360, and story
segmentation logic 370. Audio classification logic 310 may
distinguish speech from silence, noise, and other audio signals in
an input audio stream. For example, audio classification logic 310
may analyze each thirty second window of the input stream to
determine whether it contains speech. Audio classification logic
310 may also identify boundaries between speakers in the input
stream. Audio classification logic 310 may group speech segments
from the same speaker and send the segments to speech recognition
logic 320.
[0043] Speech recognition logic 320 may perform continuous speech
recognition to recognize the words spoken in the segments that it
receives from audio classification logic 310. Speech recognition
logic 320 may generate a transcription of the speech. FIG. 4 is an
exemplary diagram of a transcription 400 generated by speech
recognition logic 320. Transcription 400 may include an
undifferentiated sequence of words that corresponds to the words
spoken in the segment. Transcription 400 contains no linguistic
data, such as punctuation and capitalization.
[0044] Returning to FIG. 3, speaker clustering logic 330 may
identify all of the segments from the same speaker in a single
document (i.e., a body of media that is contiguous in time (from
beginning to end or from time A to time B)) and group them into
speaker clusters. Speaker clustering logic 330 may then assign each
of the speaker clusters a unique label. Speaker identification
logic 340 may identify the speaker in each speaker cluster by name
or gender.
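The grouping and labeling step described above can be sketched as
follows. This is a minimal illustration only, not the implementation
of speaker clustering logic 330: it assumes that an upstream acoustic
step has already attached a provisional speaker identifier to each
segment, and the names Segment, speaker_id, and cluster_by_speaker
are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    start: float     # seconds from the start of the document
    end: float       # seconds from the start of the document
    speaker_id: int  # hypothetical identifier from an upstream acoustic step


def cluster_by_speaker(segments):
    """Group segments of one document by speaker and label each group."""
    groups = defaultdict(list)
    for seg in segments:
        groups[seg.speaker_id].append(seg)
    # Labels are assigned in order of each speaker's first appearance.
    clusters = {}
    for n, segs in enumerate(groups.values(), start=1):
        clusters[f"speaker_{n}"] = segs
    return clusters
```

Each returned key is a unique cluster label; a later step, such as
speaker identification logic 340, could replace the label with a name
or gender.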
[0045] Name spotting logic 350 may locate the names of people,
places, and organizations in the transcription. Name spotting logic
350 may extract the names and store them in a database. Topic
classification logic 360 may assign topics to the transcription.
Each of the words in the transcription may contribute differently
to each of the topics assigned to the transcription. Topic
classification logic 360 may generate a rank-ordered list of all
possible topics and corresponding scores for the transcription.
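One way to picture a rank-ordered topic list is the sketch below. It
is an assumption-laden illustration rather than the method used by
topic classification logic 360: the per-topic word weights, the
additive scoring rule, and the name rank_topics are all hypothetical.

```python
def rank_topics(transcription, topic_word_weights):
    """Return (topic, score) pairs sorted from highest to lowest score.

    topic_word_weights maps each topic to a dict of word -> weight, so a
    word may contribute differently to each topic (hypothetical format).
    """
    words = transcription.lower().split()
    scores = {}
    for topic, weights in topic_word_weights.items():
        # Words without a weight for this topic contribute nothing.
        scores[topic] = sum(weights.get(w, 0.0) for w in words)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

For example, with weights that favor "flight" for an aviation topic
and "ski" for a skiing topic, a transcription that mentions a flight
would place the aviation topic first in the list.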
[0046] Story segmentation logic 370 may change the continuous
stream of words in the transcription into document-like units with
coherent sets of topic labels and other document features generated
or identified by the components of indexer 122. This information
may constitute metadata corresponding to the input audio stream.
FIG. 5 is a diagram of exemplary text 500 that includes
representations of metadata that may be output from story
segmentation logic 370. Text 500 may include linguistic data, such
as punctuation and capitalization. The metadata text may also
include other information not shown in FIG. 5, such as data that
identifies the type of media involved, data that identifies the
source of the input stream, data that identifies relevant topics,
data that identifies speaker name or gender, data that identifies
names of people, places, or organizations, and data that identifies
the start and duration of each word spoken. Story segmentation
logic 370 may output the metadata in the form of documents to alert
logic 130, where a document corresponds to a body of media that is
contiguous in time (from beginning to end or from time A to time
B).
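The kinds of metadata listed above could be carried in a record such
as the following. The field names and the Word and Document classes
are hypothetical and are offered only as a sketch of one possible
layout; the format actually output by story segmentation logic 370 is
not specified here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Word:
    text: str
    start: float     # start time of the word, in seconds
    duration: float  # duration of the word, in seconds


@dataclass
class Document:
    media_type: str  # e.g. "audio", "video", or "text"
    source: str      # identifies the source of the input stream
    words: List[Word] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)
    speakers: List[str] = field(default_factory=list)  # names or genders
    names: List[str] = field(default_factory=list)     # people, places, organizations

    def text(self) -> str:
        """Return the document text as a single string."""
        return " ".join(w.text for w in self.words)

    def span(self) -> Tuple[float, float]:
        """Return (time A, time B) covered by the document."""
        if not self.words:
            return (0.0, 0.0)
        first, last = self.words[0], self.words[-1]
        return (first.start, last.start + last.duration)
```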
[0047] Returning to FIG. 1, alert logic 130 determines the
relevance of the documents from indexers 120 to one or more user
profiles. In an implementation consistent with the principles of
the invention, a single alert logic 130 corresponds to multiple
indexers 120 of a particular type (e.g., multiple audio indexers
122, multiple video indexers 124, or multiple text indexers 126) or
multiple types of indexers 120 (e.g., audio indexers 122, video
indexers 124, and text indexers 126). In another implementation,
there may be multiple alert logic 130, such as one alert logic 130
per indexer 120.
[0048] FIG. 6 is an exemplary diagram of alert logic 130 according
to an implementation consistent with the principles of the
invention. Alert logic 130 may include collection logic 610 and
tracking logic 620. Collection logic 610 may manage the collection
of documents from indexers 120. Collection logic 610 may store the
documents in database 140. Collection logic 610 may also provide
the documents to tracking logic 620.
[0049] Tracking logic 620 may determine the relevance of the
documents to one or more user profiles. A user profile may include
one or more example documents that define an event of interest. A
user may identify an event of interest by giving the event a
descriptive title and providing example documents, such as four
example documents. The user may also identify the manner by which
the user desires to be notified of documents relevant to the
event.
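A user profile of this kind might be represented as shown below. The
class and field names are hypothetical; the notification options
mirror the alerting techniques recited in the claims (telephone call,
e-mail, page, instant message, and facsimile).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class NotifyBy(Enum):
    TELEPHONE = "telephone"
    EMAIL = "e-mail"
    PAGE = "page"
    INSTANT_MESSAGE = "instant message"
    FACSIMILE = "facsimile"


@dataclass
class UserProfile:
    title: str                    # descriptive title of the event
    example_documents: List[str]  # text of the example documents
    notify_by: NotifyBy           # manner in which the user is notified

    def word_count(self) -> int:
        """Total number of words across all example documents."""
        return sum(len(doc.split()) for doc in self.example_documents)
```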
[0050] Tracking logic 620 may use the example documents to create a
statistical language model. The statistical language model may
assign probabilities to the words in the example documents, as well
as to all of the words in the lexicon. Tracking logic 620 may use the
statistical language model to determine the similarity of documents
collected by collection logic 610 to the example documents. The
statistical language model may identify documents that contain
words that are grammatically similar to the words in the example
documents, but may not necessarily use the same words as the
example documents.
[0051] The statistical language model may build a uni-gram model
for use in determining document similarity. For the uni-gram model
to succeed, the example documents should supply many different types
of words (i.e., words that describe the event in different ways) or
enough words to cover the event in detail. In practice, two thousand to four
thousand words may be sufficient to make a decent statistical
language model. These words may be located within a single example
document or multiple example documents.
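A uni-gram model of the kind described can be sketched as follows.
The sketch assumes whitespace tokenization and add-one smoothing so
that every word in the lexicon receives a non-zero probability; both
choices, and the name build_unigram_model, are assumptions made for
illustration and are not taken from the description.

```python
from collections import Counter


def build_unigram_model(example_documents, lexicon):
    """Estimate a smoothed word probability for every word in the lexicon.

    example_documents is a list of strings; lexicon is an iterable of
    words. Add-one smoothing is an assumption of this sketch.
    """
    counts = Counter()
    for doc in example_documents:
        counts.update(doc.lower().split())
    # Cover the lexicon plus any example words missing from it.
    vocab = set(lexicon) | set(counts)
    total = sum(counts.values()) + len(vocab)
    return {word: (counts[word] + 1) / total for word in vocab}
```

Words that occur often in the example documents receive the highest
probabilities, while lexicon words that never occur still receive a
small probability.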
[0052] Tracking logic 620 may use the statistical language model to
determine the relevance of newly received documents (i.e.,
documents newly received by collection logic 610). Tracking logic
620 may also score the documents based on their determined
relevance. For example, a document may be scored based on its
similarity to the example documents.
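One possible way to score a document against such a model is the
average log-probability of its words. The description does not give
a scoring formula, so this measure, the function name
score_document, the floor value for unknown words, and the toy model
are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def score_document(tokens: List[str], model: Dict[str, float],
                   floor: float = 1e-6) -> float:
    """Return the average log-probability of the tokens under the model.

    Words missing from the model receive the small assumed probability
    `floor`. Higher (less negative) scores mean greater similarity.
    """
    if not tokens:
        return float("-inf")
    return sum(math.log(model.get(tok, floor)) for tok in tokens) / len(tokens)


# Toy model standing in for one built from example documents.
toy_model = {"troops": 0.4, "philippines": 0.3, "forces": 0.2, "weather": 0.1}

on_topic = score_document(["troops", "forces", "philippines"], toy_model)
off_topic = score_document(["weather", "weather", "rain"], toy_model)
```

Averaging over the number of words keeps long and short documents
comparable; it is a design choice of this sketch rather than a
requirement of the description.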
[0053] When a relevant document is detected, tracking logic 620 may
generate an alert notification and send it to notification
server(s) 160. Alternatively, tracking logic 620 may wait until a
predetermined number of relevant documents are detected before
generating the alert notification. The particular number of
relevant documents that tracking logic 620 identifies before
generating the alert notification may be specified in the user
profile.
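The option of waiting for a predetermined number of relevant
documents may be sketched as a small accumulator. The class name
AlertBatcher and the parameter alert_after (standing in for the
number specified in the user profile) are hypothetical.

```python
from typing import List, Optional


class AlertBatcher:
    """Collect relevant document ids until the configured count is reached."""

    def __init__(self, alert_after: int = 1) -> None:
        if alert_after < 1:
            raise ValueError("alert_after must be at least 1")
        self.alert_after = alert_after  # assumed to come from the user profile
        self.pending: List[str] = []

    def add_relevant(self, doc_id: str) -> Optional[List[str]]:
        """Record a relevant document; return the batch when it is time to alert."""
        self.pending.append(doc_id)
        if len(self.pending) >= self.alert_after:
            batch, self.pending = self.pending, []
            return batch
        return None
```

With alert_after set to 1, every relevant document triggers an alert
immediately, which corresponds to the first alternative described
above.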
[0054] Returning to FIG. 1, database 140 may store the documents
received by alert logic 130. Database 140 may, thereby, store a
history of the information seen by alert logic 130. Database 140
may also store some or all of the original media (audio, video, or
text) relating to the documents. In order to maintain adequate
storage space in database 140, it may be practical to expire (i.e.,
delete) documents after a certain time period.
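Expiring documents after a time period may be sketched as a filter
over stored timestamps. This is a minimal sketch; the in-memory
dictionary standing in for database 140, the function name
expire_documents, and the thirty-day period are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict


def expire_documents(store: Dict[str, datetime], now: datetime,
                     max_age: timedelta) -> Dict[str, datetime]:
    """Return only the documents stored within `max_age` of `now`."""
    cutoff = now - max_age
    return {doc_id: stored for doc_id, stored in store.items() if stored >= cutoff}


store = {
    "old-doc": datetime(2003, 1, 1),
    "new-doc": datetime(2003, 6, 25),
}
kept = expire_documents(store, now=datetime(2003, 7, 2), max_age=timedelta(days=30))
```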
[0055] Server 150 may include a computer or another device that is
capable of interacting with alert logic 130 and clients 170 via
network 180. Server 150 may obtain user profiles from clients 170
and provide them to alert logic 130. Server 150 may also gather
information from alert logic 130 and send it to one or more of
clients 170. Clients 170 may include personal computers, laptops,
personal digital assistants, or other types of devices that are
capable of interacting with server 150 to provide user profiles
and, possibly, receive alerts and other information. Clients 170
may present information to users via a graphical user interface,
such as a web browser window.
[0056] Notification server(s) 160 may include one or more servers
that transmit alerts regarding one or more documents that relate to
an event of interest. A notification server 160 may include a
computer or another device that is capable of receiving
notifications from alert logic 130 and notifying users of the
alerts. Notification server 160 may use different techniques to
notify users. For example, notification server 160 may place a
telephone call to a user, send an e-mail, page, instant message, or
facsimile to the user, or use other mechanisms to notify the user.
In an implementation consistent with the principles of the
invention, notification server 160 and server 150 are the same
server. In another implementation, notification server 160 is a
knowledge base system.
[0057] The notification sent to the user may include a message that
indicates that one or more relevant documents have been detected.
Alternatively, the notification may include a portion or all of a
relevant document, possibly in its original format. For example, an
audio or video signal may be streamed to the user or a text
document may be sent to the user.
Exemplary Processing
[0058] FIGS. 7 and 8 are flowcharts of exemplary processing for
providing information relating to an event of interest according to
an implementation consistent with the principles of the invention.
Processing may begin with a user generating a user profile. To do
this, the user may access server 150 in a conventional manner
using, for example, a web browser on client 170. The user may
interact with server 150 to provide a title and one or more example
documents that describe an event for which the user would be
interested in receiving information. In other words, the user
desires to know when future information is created or broadcast
that relates to the event described by the title and the example
document(s).
[0059] As described above, it may be beneficial to provide
sufficient words (e.g., two to four thousand words) that describe
the event. These words may be located in one or more example
documents. The user may obtain the example documents by whatever
means and provide them in electronic form (or links to them) to
server 150. For example, the user may copy a news article from a
newspaper's web site, scan a magazine article, or type in or
otherwise input text. It may also be possible for the user to
provide an audio or video clip to server 150. In this case, server
150 may send the clip through the appropriate indexers 120 to
obtain a transcription of the clip.
[0060] To facilitate the user's interaction with server 150, client
170 may present a graphical user interface (GUI) to the user. FIG.
9 is an exemplary diagram of a GUI 900 according to an
implementation consistent with the principles of the invention. GUI
900 may include buttons 910 associated with creating an event,
adding examples to an event, and removing an event. GUI 900 may
also include information describing existing events 920 (i.e.,
events for which the user has already provided a title and/or
example document(s)). Each of existing events 920 may be identified
by the title provided by the user for the event.
[0061] If the user wants to create an event, the user may select
the create an event button. The user may, thereafter, provide a
title for the event and add example documents that describe the
event. The title may be used for later retrieval of information
relating to the event. If the user wants to remove an event, the
user may select one of existing events 920 and then select the
remove an event button. If the user wants to add example documents
to an already existing event, the user may select one of existing
events 920 and then select the add examples button. The user may,
thereafter, add one or more documents that define the event.
[0062] FIG. 10 is an exemplary diagram of GUI 900 once example
documents have been provided by a user according to an
implementation consistent with the principles of the invention. In
this example, the user provided four example documents 1010-1040
that describe the event, which in this case relates to U.S. Special
Forces in the Philippines. Each of documents 1010-1040 includes
information that identifies the type of document, the name of the
document, and the source of the document. Document 1010, for
example, is a text document, entitled "Opinion Piece," taken from
the Guardian newspaper of the United Kingdom. Document 1020 is an
audio document, entitled "Support of Troop Deployment," taken from
National Public Radio (NPR). Document 1030 is a text document taken
from Reuters and document 1040 is a text document taken from the
Washington Post.
[0063] Alert logic 130 receives the user profile from server 150
(act 710) (FIG. 7). From the example document(s) in the user
profile, alert logic 130 may build a statistical language model
using conventional techniques. The statistical language model may
assign probabilities to the words in the example documents, as well
as other words in the lexicon.
[0064] Alert logic 130 continuously receives documents from
indexers 120 in near real time (i.e., in real time subject to minor
processing delays by indexers 120) (act 720). In the implementation
where there is one alert logic 130 per indexer 120, alert logic
130 may operate upon documents from a single indexer 120. In the
implementation where there is one alert logic 130 for multiple
indexers 120, alert logic 130 may concurrently operate upon
documents from multiple indexers 120. In either case, alert logic
130 may store the documents in database 140.
[0065] Alert logic 130 may also determine the relevance of the
documents to the event defined in the user profile (act 730). For
example, alert logic 130 may use the statistical language model to
find similarities between words in the documents and the words in
the example documents. Because the statistical language model looks
for similarities based on specific words or word synonyms, a
document may be determined relevant even if it does not have the
same words as the example documents.
[0066] If alert logic 130 determines that the documents are not
relevant to the event (act 740), then alert logic 130 awaits
receipt of the next document(s) from indexers 120. If one or more
of the documents are relevant (act 740), however, alert logic 130
may generate a relevance score for the document(s) (act 750). The
score may be based on the degree of similarity between the document
and the example documents.
[0067] Alternatively, a document's relevance may be determined
based on the score. For example, alert logic 130 may generate a
relevance score for each of the documents and determine that
documents with scores above a certain threshold are relevant and
documents with scores below the threshold are not relevant.
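The threshold alternative may be sketched as a simple partition of
scored documents. The function name split_by_threshold, the example
scores, and the threshold value are hypothetical. The description
does not say how a score exactly equal to the threshold is treated;
this sketch assumes it counts as relevant.

```python
from typing import Dict, List, Tuple


def split_by_threshold(scores: Dict[str, float],
                       threshold: float) -> Tuple[List[str], List[str]]:
    """Partition document ids into (relevant, not_relevant) by score."""
    relevant = [doc for doc, score in scores.items() if score >= threshold]
    not_relevant = [doc for doc, score in scores.items() if score < threshold]
    return relevant, not_relevant


relevant, not_relevant = split_by_threshold(
    {"doc-1": 0.82, "doc-2": 0.35, "doc-3": 0.60}, threshold=0.60
)
```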
[0068] In any event, alert logic 130 may then generate an alert
notification (act 760). Alternatively, alert logic 130 may wait
until a sufficient number of relevant documents have been
identified before generating the alert notification. In any event,
the alert notification may identify the relevant document(s)
(audio, video, or text) and/or the event to which the alert
pertains. This permits the user to obtain more information
regarding the document(s) if desired. Alert logic 130 may send the
alert notification to notification server(s) 160. Alert logic 130
may identify the particular notification server 160 to use based on
information in the user profile.
[0069] Notification server 160 may generate a notification based on
the alert notification from alert logic 130 and send the
notification to the user (act 770). For example, notification
server 160 may place a telephone call to the user, send an e-mail,
page, instant message, or facsimile to the user, or otherwise
notify the user. In one implementation, the notification includes a
portion of or the entire relevant document, possibly in its
original format.
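Selecting a notification technique from the user profile may be
sketched as a lookup table of handlers. The handlers below only
build strings; they are placeholders, not real e-mail or paging
calls. The names CHANNELS and notify and the channel labels are
assumptions.

```python
from typing import Callable, Dict


def send_email(user: str, message: str) -> str:
    """Placeholder: a real system would hand the message to a mail service."""
    return f"email to {user}: {message}"


def send_page(user: str, message: str) -> str:
    """Placeholder: a real system would hand the message to a paging service."""
    return f"page to {user}: {message}"


# Assumed mapping from the manner named in the user profile to a handler.
CHANNELS: Dict[str, Callable[[str, str], str]] = {
    "email": send_email,
    "page": send_page,
}


def notify(user: str, channel: str, message: str) -> str:
    """Dispatch the notification using the channel requested in the profile."""
    try:
        handler = CHANNELS[channel]
    except KeyError:
        raise ValueError(f"unsupported notification channel: {channel}") from None
    return handler(user, message)
```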
[0070] At some point, the user may desire additional information
regarding an event. In this case, the user may provide some
indication to client 170 of the desire for additional information.
For example, the user may select a button on a graphical user
interface, such as GUI 900 (FIG. 9), that indicates that
information regarding a particular alert or event is desired. In
the example of FIG. 9, the user may double click or otherwise
select an existing event 920. Client 170 may send this indication
to alert logic 130 via server 150.
[0071] Alert logic 130 may receive the indication that the user
desires additional information regarding an event (act 810) (FIG.
8). In response, alert logic 130 may retrieve the relevant
documents, possibly including metadata relating to the event, from
database 140 (act 820). Alert logic 130 may then provide a list of
the relevant documents and/or the documents themselves to client
170 (act 830). Client 170 may provide a list of the relevant
documents to the user.
[0072] FIG. 11 is an exemplary diagram of a graphical user
interface (GUI) 1100 that presents relevant documents to a user
according to an implementation consistent with the principles of
the invention. GUI 1100 includes a list 1110 of relevant documents
detected by alert logic 130. List 1110 may include a number of
entries corresponding to the number of relevant documents
identified by alert logic 130.
[0073] FIG. 12 is an exemplary diagram of an entry 1200 in list
1110 according to an implementation consistent with the principles
of the invention. Entry 1200 includes a score 1210, a date 1220,
and a title 1230. Score 1210 may include a visual indication of the
score of the document. In this example, score 1210 includes a
scale. In other cases, score 1210 may include a numerical value or
some other type of visual indication of a score. Date 1220 may
include the time and/or date on which the document was created.
Title 1230 may include the title of the document. In some cases,
the document title may be generated by indexers 120.
[0074] Returning to FIG. 11, the relevant documents may be
presented to the user in a number of ways. For example, the
documents may be presented in order of score (e.g., highest score
first) and date (newest document first). In one implementation,
score takes precedence over date. GUI 1100 may also visually
distinguish documents that have already been seen by the user from
documents that have not yet been seen. In the example of FIG. 11,
documents that have not yet been seen by the user are italicized
and bolded.
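The ordering in which score takes precedence over date may be
sketched with a compound sort key. The Entry type and its field
names are hypothetical stand-ins for score 1210, date 1220, and
title 1230.

```python
from datetime import date
from typing import List, NamedTuple


class Entry(NamedTuple):
    score: float   # relevance score of the document
    created: date  # date on which the document was created
    title: str     # title of the document


def order_entries(entries: List[Entry]) -> List[Entry]:
    """Sort by highest score first, breaking ties with the newest date first."""
    return sorted(entries, key=lambda e: (e.score, e.created), reverse=True)


ordered = order_entries([
    Entry(0.7, date(2003, 6, 1), "B"),
    Entry(0.9, date(2003, 5, 1), "A"),
    Entry(0.7, date(2003, 6, 15), "C"),
])
```

Here the entry titled "A" comes first because of its higher score,
even though it is the oldest; the two entries with equal scores are
then ordered newest first.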
[0075] If the user desires additional information regarding one of
the documents, the user may indicate so by selecting (e.g., double
clicking) the document. In this case, client 170 may present the
document to the user within a graphical user interface, such as GUI
1100. If the user desires, the user may retrieve the original media
corresponding to the document. The original media may be stored in
database 140 along with the metadata, stored in a separate database
possibly accessible via network 180, or maintained by one of
servers 210, 220, or 230 (FIG. 2). If the original media is an
audio or video document, the audio or video document may be
streamed to client 170. If the original media is a text document,
the text document may be provided to client 170.
[0076] If the user identifies a document that is of particular
relevance, the user may add the document to the set of example
documents in the user profile. To do this, the user may select the
document by, for example, clicking or double clicking on it. The
user may then select the add examples button (see FIG. 11). The
document would then be added to the set of example documents in the
user profile.
CONCLUSION
[0077] Systems and methods consistent with the present invention
permit a user to define an event of interest by supplying a set of
documents that describe the event in the way that interests the
user. The systems and methods then determine the
relevance of all incoming documents to the event defined by the
user. The systems and methods notify the user when one or more
relevant documents are identified.
[0078] The foregoing description of preferred embodiments of the
present invention provides illustration and description, but is not
intended to be exhaustive or to limit the invention to the precise
form disclosed. Modifications and variations are possible in light
of the above teachings or may be acquired from practice of the
invention. For example, while a series of acts has been described
with regard to the flowcharts of FIGS. 7 and 8, the order of the
acts may differ in other implementations consistent with the
principles of the invention.
[0079] Also, systems and methods have been described as acting upon
newly-created information. In other implementations consistent with
the principles of the invention, the systems and methods may also
identify information that has already been created. In this case,
database 140 may be searched for relevant information.
[0080] In the implementations described above, an event is
identified by a title and defined by one or more example documents.
In another implementation, the title may be used as one of the
example documents defining the event.
[0081] Certain portions of the invention have been described as
"logic" that performs one or more functions. This logic may include
hardware, such as an application specific integrated circuit or a
field programmable gate array, software, or a combination of
hardware and software.
[0082] No element, act, or instruction used in the description of
the present application should be construed as critical or
essential to the invention unless explicitly described as such.
Also, as used herein, the article "a" is intended to include one or
more items. Where only one item is intended, the term "one" or
similar language is used. The scope of the invention is defined by
the claims and their equivalents.
* * * * *