U.S. patent application number 10/464892 was filed with the patent office on 2003-06-19 and published on 2004-12-23 for personalized indexing and searching for information in a distributed data processing system.
This patent application is currently assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Invention is credited to Best, Steven Francis; Brown, Michael Wayne; Cooper, Michael Richard.
Application Number | 10/464892 |
Publication Number | 20040260680 |
Family ID | 33517366 |
Filed Date | 2003-06-19 |
United States Patent Application | 20040260680 |
Kind Code | A1 |
Best, Steven Francis; et al. | December 23, 2004 |
Personalized indexing and searching for information in a
distributed data processing system
Abstract
Personalized searching for information in a distributed data
processing system including providing in a search portal a personal
search term list for a user, the personal search term list
comprising search keywords known to be of interest to the user;
receiving from the user a navigation identification message
comprising a navigation location; and creating a personalized
search index in dependence upon the navigation location and the
contents of the personal search term list, wherein the personalized
search index comprises index records comprising time stamps.
Typical embodiments also comprise receiving in the search portal
from the user a navigation request message comprising a navigation
direction; creating, in dependence upon the personalized search
index, the navigation direction, and a last navigation time stamp,
a response to the navigation request message; transmitting the
response to the user; and updating the last navigation time
stamp.
Inventors: | Best, Steven Francis (Georgetown, TX); Brown, Michael Wayne (Georgetown, TX); Cooper, Michael Richard (Austin, TX) |
Correspondence Address: | IBM CORP (BLF), c/o BIGGERS & OHANIAN, LLP, 504 LAVACA STREET, SUITE 970, AUSTIN, TX 78701-2856, US |
Assignee: | INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY |
Family ID: | 33517366 |
Appl. No.: | 10/464892 |
Filed: | June 19, 2003 |
Current U.S. Class: | 1/1; 707/999.003; 707/E17.109 |
Current CPC Class: | G06Q 30/0623 20130101; G06F 16/9535 20190101 |
Class at Publication: | 707/003 |
International Class: | G06F 007/00 |
Claims
What is claimed is:
1. A method of personalized searching for information in a
distributed data processing system, the method comprising:
providing in a search portal a personal search term list for a
user, the personal search term list comprising search keywords
known to be of interest to the user; receiving from the user a
navigation identification message comprising a navigation location;
and creating a personalized search index in dependence upon the
navigation location and the contents of the personal search term
list, wherein the personalized search index comprises index
records, each index record comprising a time stamp.
2. The method of claim 1 wherein the time stamp comprises an
indication of the date and time of the receiving of the navigation
identification message.
3. The method of claim 1 wherein the personalized search index
comprises a search index further comprising a user identification
for the search portal, the user identification for the search
portal comprising data uniquely identifying the user as among other
users of the search portal.
4. The method of claim 1 wherein the navigation identification
message further comprises a search keyword and creating a
personalized search index further comprises indexing the search
keyword with the navigation location and a time stamp in the
personalized search index.
5. The method of claim 1 wherein creating a personalized search
index further comprises: retrieving a document from the navigation
location; identifying search keywords from the personal search term
list that occur in the retrieved document; and indexing in the
personalized search index the navigation location, a time stamp,
and keywords from the personal search term list that occur in the
retrieved documents.
6. The method of claim 1 further comprising establishing a user
account for the user on the search portal, the user account
comprising user identification unique to the user and a last
navigation time stamp.
7. The method of claim 1 further comprising: receiving in the
search portal from the user a navigation request message comprising
a navigation direction; creating, in dependence upon the
personalized search index, the navigation direction, and a last
navigation time stamp, a response to the navigation request
message; transmitting the response to the user; and updating the
last navigation time stamp.
8. The method of claim 7 wherein creating a response comprises:
retrieving, from user account data, a last navigation time stamp;
retrieving a navigation location from the personalized search index
in dependence upon the last navigation time stamp from the user
account data and the navigation direction; and retrieving the
document identified by the navigation location; wherein
transmitting the response to the user comprises transmitting the
document to the user.
9. The method of claim 8 wherein the navigation request message
further comprises a navigation interval and retrieving a navigation
location from the personalized search index further comprises
retrieving a navigation location from the personalized search index
in dependence upon the last navigation time stamp from the user
account data, the navigation direction, and the navigation
interval.
10. The method of claim 1 further comprising: creating a subset of
the personalized search index; and making the subset available to
users for remote playback.
11. A system for personalized searching for information in a
distributed data processing system, the system comprising: means
for providing in a search portal a personal search term list for a
user, the personal search term list comprising search keywords
known to be of interest to the user; means for receiving from the
user a navigation identification message comprising a navigation
location; and means for creating a personalized search index in
dependence upon the navigation location and the contents of the
personal search term list, wherein the personalized search index
comprises index records, each index record comprising a time
stamp.
12. The system of claim 11 wherein the time stamp comprises an
indication of the date and time of a receiving of the navigation
identification message.
13. The system of claim 11 wherein the personalized search index
comprises a search index further comprising a user identification
for the search portal, the user identification for the search
portal comprising data uniquely identifying the user as among other
users of the search portal.
14. The system of claim 11 wherein the navigation identification
message further comprises a search keyword and means for creating a
personalized search index further comprises means for indexing the
search keyword with the navigation location and a time stamp in the
personalized search index.
15. The system of claim 11 wherein means for creating a
personalized search index further comprises: means for retrieving a
document from the navigation location; means for identifying search
keywords from the personal search term list that occur in the
retrieved document; and means for indexing in the personalized
search index the navigation location, a time stamp, and keywords
from the personal search term list that occur in the retrieved
documents.
16. The system of claim 11 further comprising means for
establishing a user account for the user on the search portal, the
user account comprising user identification unique to the user and
a last navigation time stamp.
17. The system of claim 11 further comprising: means for receiving
in the search portal from the user a navigation request message
comprising a navigation direction; means for creating, in
dependence upon the personalized search index, the navigation
direction, and a last navigation time stamp, a response to the
navigation request message; means for transmitting the response to
the user; and means for updating the last navigation time
stamp.
18. The system of claim 17 wherein means for creating a response
comprises: means for retrieving, from user account data, a last
navigation time stamp; means for retrieving a navigation location
from the personalized search index in dependence upon the last
navigation time stamp from the user account data and the navigation
direction; and means for retrieving the document identified by the
navigation location; wherein means for transmitting the response to
the user comprises means for transmitting the document to the
user.
19. The system of claim 17 wherein the navigation request message
further comprises a navigation interval and means for retrieving a
navigation location from the personalized search index further
comprises means for retrieving a navigation location from the
personalized search index in dependence upon the last navigation
time stamp from the user account data, the navigation direction,
and the navigation interval.
20. The system of claim 11 further comprising: means for creating a
subset of the personalized search index; and means for making the
subset available to users for remote playback.
21. A computer program product for personalized searching for
information in a distributed data processing system, the computer
program product comprising: a recording medium; means, recorded on
the recording medium, for providing in a search portal a personal
search term list for a user, the personal search term list
comprising search keywords known to be of interest to the user;
means, recorded on the recording medium, for receiving from the
user a navigation identification message comprising a navigation
location; and means, recorded on the recording medium, for creating
a personalized search index in dependence upon the navigation
location and the contents of the personal search term list, wherein
the personalized search index comprises index records, each index
record comprising a time stamp.
22. The computer program product of claim 21 wherein the time stamp
comprises an indication of the date and time of a receiving of the
navigation identification message.
23. The computer program product of claim 21 wherein the
personalized search index comprises a search index further
comprising a user identification for the search portal, the user
identification for the search portal comprising data uniquely
identifying the user as among other users of the search portal.
24. The computer program product of claim 21 wherein the navigation
identification message further comprises a search keyword and means
for creating a personalized search index further comprises means,
recorded on the recording medium, for indexing the search keyword
with the navigation location and a time stamp in the personalized
search index.
25. The computer program product of claim 21 wherein means for
creating a personalized search index further comprises: means,
recorded on the recording medium, for retrieving a document from
the navigation location; means, recorded on the recording medium,
for identifying search keywords from the personal search term list
that occur in the retrieved document; and means, recorded on the
recording medium, for indexing in the personalized search index the
navigation location, a time stamp, and keywords from the personal
search term list that occur in the retrieved documents.
26. The computer program product of claim 21 further comprising
means, recorded on the recording medium, for establishing a user
account for the user on the search portal, the user account
comprising user identification unique to the user and a last
navigation time stamp.
27. The computer program product of claim 21 further comprising:
means, recorded on the recording medium, for receiving in the
search portal from the user a navigation request message comprising
a navigation direction; means, recorded on the recording medium,
for creating, in dependence upon the personalized search index, the
navigation direction, and a last navigation time stamp, a response
to the navigation request message; means for transmitting the
response to the user; and means, recorded on the recording medium,
for updating the last navigation time stamp.
28. The computer program product of claim 27 wherein means for
creating a response comprises: means, recorded on the recording
medium, for retrieving, from user account data, a last navigation
time stamp; means, recorded on the recording medium, for retrieving
a navigation location from the personalized search index in
dependence upon the last navigation time stamp from the user
account data and the navigation direction; and means, recorded on
the recording medium, for retrieving the document identified by the
navigation location; wherein means for transmitting the response to
the user comprises means, recorded on the recording medium, for
transmitting the document to the user.
29. The computer program product of claim 27 wherein the navigation
request message further comprises a navigation interval and means
for retrieving a navigation location from the personalized search
index further comprises means, recorded on the recording medium,
for retrieving a navigation location from the personalized search
index in dependence upon the last navigation time stamp from the
user account data, the navigation direction, and the navigation
interval.
30. The computer program product of claim 21 further comprising:
means, recorded on the recording medium, for creating a subset of
the personalized search index; and means, recorded on the recording
medium, for making the subset available to users for remote
playback.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention
[0002] The field of the invention is data processing, or, more
specifically, methods, systems, and products for personalized
indexing and searching for information in a distributed data
processing system.
[0003] 2. Description of Related Art
[0004] An example from current art of a large distributed data
processing system is the World Wide Web. Search engines on the web
are basically massive full-text indexes of millions of web pages.
These search engines are software programs specialized
to receive search query messages from users or from users'
browsers, where the search query messages comprise keywords or
search terms. Search engines formulate, or `parse,` the query
messages into database queries against web search databases
comprising massive search indexes.
[0005] The web includes many web sites comprising many millions of
web pages, each of which is a document specially structured in a
markup language, such as, for example, HTML, WML, HDML, and so on,
to support some hyperlinking in some data communications protocol,
such as, for example, HTTP, WAP, HDTP, and so on. The search
indexes for the search engines are created by software robots
called `spiders` or `crawlers` that survey the web and retrieve
documents for indexing. The indexing itself is often carried out by
another software engine that takes as its input the pages gathered
by spiders, extracts keywords according to some algorithm, and
creates index entries based upon the keywords and URLs identifying
the indexed documents.
[0006] That is, spiders gather documents into a documents database,
identifying the documents to be gathered from a URL list in the
documents database or through hyperlinks in the documents
themselves or through other methods. Spiders take as their inputs
the entire web and produce as outputs documents to be indexed.
Indexing engines take as their inputs documents to be indexed and
produce as their outputs search indexes. Search engines take as
inputs search indexes and search request messages bearing search
terms and produce as their outputs search result messages for
return to requesting users' browsers.
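By way of illustration only, the three stages just described (spider, indexing engine, search engine) can be sketched in a few lines of Python. Everything named here is an assumption made for the sketch: the in-memory TOY_WEB dictionary stands in for the web, the example.com URLs are placeholders, and whitespace splitting stands in for whatever keyword-extraction algorithm a real indexing engine uses.

```python
from collections import defaultdict

# Hypothetical stand-in for the web: URL -> page text.
TOY_WEB = {
    "http://example.com/a": "mining equipment and mining safety",
    "http://example.com/b": "geology of copper mining",
    "http://example.com/c": "gardening tips",
}

def spider(url_list):
    """Gather the documents named in a URL list into a documents store."""
    return {url: TOY_WEB[url] for url in url_list if url in TOY_WEB}

def index_documents(documents):
    """Indexing engine: documents in, search index (keyword -> URLs) out."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in text.lower().split():
            index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}

def search(index, query):
    """Search engine: return the URLs that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set(index.get(terms[0], []))
    for term in terms[1:]:
        hits &= set(index.get(term, []))
    return sorted(hits)

documents = spider(list(TOY_WEB))
index = index_documents(documents)
print(search(index, "copper mining"))
```

In this sketch no stage takes any input that identifies the user who is searching.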
[0007] In current art, spiders gather documents with no regard for
individual users' interests or history of web navigation. In
current art, index engines create search indexes with no regard for
individual users' interests or history of web navigation. In
current art, search engines create responses to search queries from
users with no regard for individual users' interests or history of
web navigation. If searches could be performed with regard for
individual users' interests or history of web navigation, searches
could be better focused and search results could be more pertinent
to users' purposes in searching for information. There are ongoing
needs for improvement, therefore, in searching and indexing
information in large distributed data processing systems like the
web.
SUMMARY OF THE INVENTION
[0008] Methods, systems, and products are disclosed for
personalized searching for information in a distributed data
processing system including providing in a search portal a personal
search term list for a user, the personal search term list
comprising search keywords known to be of interest to the user;
receiving from the user a navigation identification message
comprising a navigation location; and creating a personalized
search index in dependence upon the navigation location and the
contents of the personal search term list, wherein the personalized
search index comprises index records, each index record comprising
a time stamp. A time stamp typically comprises an indication of the
date and time of the receiving of the navigation identification
message. A personalized search index typically comprises a search
index further comprising a user identification for the search
portal, the user identification for the search portal comprising
data uniquely identifying the user as among other users of the
search portal. A navigation identification message typically
includes a search keyword and creating a personalized search index
typically also includes indexing the search keyword with the
navigation location and a time stamp in the personalized search
index.
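As a minimal sketch of the data involved, and not the disclosed implementation, an index record and a navigation identification message might be modeled as follows. The class names, field names, and the choice of UTC datetimes are all assumptions made for illustration; the summary says only that each record comprises a time stamp and that the index typically also carries a user identification, a search keyword, and a navigation location.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional

@dataclass(frozen=True)
class IndexRecord:
    """One row of a personalized search index (field names are hypothetical)."""
    user_id: str          # uniquely identifies the user among portal users
    keyword: str          # search keyword indexed for this location
    location: str         # navigation location, for example a URL
    time_stamp: datetime  # date and time the message was received

@dataclass
class NavigationIdentificationMessage:
    """Message sent by the user's client (shape is hypothetical)."""
    user_id: str
    location: str
    keyword: Optional[str] = None

def index_message(index: List[IndexRecord],
                  message: NavigationIdentificationMessage,
                  received_at: Optional[datetime] = None) -> Optional[IndexRecord]:
    """Index the message's keyword with its location and a time stamp.

    Assumption: a message without a keyword is not indexed by this function.
    """
    if message.keyword is None:
        return None
    stamp = received_at or datetime.now(timezone.utc)
    record = IndexRecord(message.user_id, message.keyword, message.location, stamp)
    index.append(record)
    return record

personal_index: List[IndexRecord] = []
received = datetime(2003, 6, 19, 12, 0, tzinfo=timezone.utc)
message = NavigationIdentificationMessage("user1", "http://example.com/a", "mining")
first = index_message(personal_index, message, received)
print(first)
```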
[0009] In typical embodiments, creating a personalized search index
further comprises retrieving a document from the navigation
location; identifying search keywords from the personal search term
list that occur in the retrieved document; and indexing in the
personalized search index the navigation location, a time stamp,
and keywords from the personal search term list that occur in the
retrieved documents. Typical embodiments often also include
establishing a user account for the user on the search portal, the
user account comprising user identification unique to the user and
a last navigation time stamp.
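The three steps of this paragraph (retrieve the document, identify which terms from the personal search term list occur in it, index those terms with the location and a time stamp) can be sketched as below. This is an illustrative assumption, not the disclosed implementation: the fetch callable, the dictionary record layout, and the restriction to single-word, case-insensitive terms are all choices made for the example.

```python
import re
from datetime import datetime, timezone

def create_index_records(location, personal_terms, fetch, time_stamp):
    """Return one index record per personal search term found in the document.

    location       -- navigation location, for example a URL
    personal_terms -- the user's personal search term list (single words here)
    fetch          -- hypothetical callable that returns document text for a location
    time_stamp     -- time stamp to store in every record created
    """
    text = fetch(location)                                  # retrieve the document
    words = set(re.findall(r"[a-z0-9]+", text.lower()))     # crude tokenizer
    return [
        {"keyword": term, "location": location, "time_stamp": time_stamp}
        for term in personal_terms
        if term.lower() in words                            # term occurs in document
    ]

# Placeholder documents and terms for the example.
pages = {"http://example.com/b": "Geology of copper mining."}
personal_terms = ["mining", "geology", "gardening"]
stamp = datetime(2003, 6, 19, 12, 0, tzinfo=timezone.utc)

records = create_index_records("http://example.com/b", personal_terms,
                               pages.__getitem__, stamp)
print([record["keyword"] for record in records])
```

Passing the fetch function in as a parameter keeps the sketch runnable without a network; a real portal would retrieve the document over a data communications protocol.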
[0010] Typical embodiments also include receiving in the search
portal from the user a navigation request message comprising a
navigation direction; creating, in dependence upon the personalized
search index, the navigation direction, and a last navigation time
stamp, a response to the navigation request message; transmitting
the response to the user; and updating the last navigation time
stamp. In such embodiments, creating a response typically includes
retrieving, from user account data, a last navigation time stamp;
retrieving a navigation location from the personalized search index
in dependence upon the last navigation time stamp from the user
account data and the navigation direction; and retrieving the
document identified by the navigation location. Transmitting the
response to the user typically includes transmitting the document to
the user.
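One way to read this paragraph is as a cursor moving over the time-stamped index. The sketch below is an assumption-laden illustration, not the disclosed implementation: it assumes the index is a list of (time stamp, location) pairs sorted by time stamp, that the navigation direction is encoded as the strings 'back' and 'forward', that the user account is a dictionary, and that documents come from a hypothetical fetch callable.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

def navigate(index, account, direction, fetch):
    """Answer one navigation request and update the user's account.

    index     -- list of (time_stamp, location) tuples sorted by time_stamp
    account   -- dict holding 'last_navigation_time_stamp'
    direction -- 'back' or 'forward' (hypothetical encoding)
    fetch     -- hypothetical callable mapping a location to a document
    """
    stamps = [stamp for stamp, _ in index]
    last = account["last_navigation_time_stamp"]
    if direction == "back":
        position = bisect_left(stamps, last) - 1    # newest record older than last
    elif direction == "forward":
        position = bisect_right(stamps, last)       # oldest record newer than last
    else:
        raise ValueError(f"unknown navigation direction: {direction!r}")
    if position < 0 or position >= len(index):
        return None                                 # nothing further in that direction
    stamp, location = index[position]
    account["last_navigation_time_stamp"] = stamp   # update the last navigation time stamp
    return fetch(location)                          # document to transmit to the user

# Placeholder index, documents, and account for the example.
index = [
    (datetime(2003, 6, 19, 9, 0), "http://example.com/a"),
    (datetime(2003, 6, 19, 10, 0), "http://example.com/b"),
    (datetime(2003, 6, 19, 11, 0), "http://example.com/c"),
]
documents = {
    "http://example.com/a": "page a",
    "http://example.com/b": "page b",
    "http://example.com/c": "page c",
}
account = {"user_id": "user1",
           "last_navigation_time_stamp": datetime(2003, 6, 19, 11, 0)}

print(navigate(index, account, "back", documents.get))
```

Because the last navigation time stamp lives in the account rather than in the client, repeated 'back' requests step through the user's history one record at a time.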
[0011] In many embodiments, a navigation request message further
comprises a navigation interval and retrieving a navigation
location from the personalized search index further comprises
retrieving a navigation location from the personalized search index
in dependence upon the last navigation time stamp from the user
account data, the navigation direction, and the navigation
interval. Many embodiments include creating a subset of the
personalized search index and making the subset available to users
for remote playback.
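This passage does not say how a navigation interval is expressed or how a subset is selected, so the following sketch makes both choices itself and should be read as illustration only. It assumes the interval is a span of time (a timedelta) measured from the last navigation time stamp, and that a subset is the slice of index records falling between two time stamps. The function names and the 'back' and 'forward' strings are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

def locate_with_interval(index, last, direction, interval):
    """Pick an index record one interval away from the last navigation time stamp.

    index -- list of (time_stamp, location) tuples sorted by time_stamp
    """
    stamps = [stamp for stamp, _ in index]
    if direction == "back":
        target = last - interval
        position = bisect_right(stamps, target) - 1   # newest record at or before target
    elif direction == "forward":
        target = last + interval
        position = bisect_left(stamps, target)        # oldest record at or after target
    else:
        raise ValueError(f"unknown navigation direction: {direction!r}")
    if position < 0 or position >= len(index):
        return None
    return index[position]

def subset(index, start, end):
    """Return the records whose time stamps lie in the closed range [start, end]."""
    stamps = [stamp for stamp, _ in index]
    return index[bisect_left(stamps, start):bisect_right(stamps, end)]

def day(hour, minute=0):
    """Helper for the example: a time on a placeholder date."""
    return datetime(2003, 6, 19, hour, minute)

index = [
    (day(9), "http://example.com/a"),
    (day(10), "http://example.com/b"),
    (day(11), "http://example.com/c"),
    (day(12), "http://example.com/d"),
]

print(locate_with_interval(index, day(12), "back", timedelta(minutes=90)))
print(subset(index, day(10), day(11)))
```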
[0012] The foregoing and other objects, features and advantages of
the invention will be apparent from the following more particular
descriptions of exemplary embodiments of the invention as
illustrated in the accompanying drawings wherein like reference
numbers generally represent like parts of exemplary embodiments of
the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 depicts an architecture for a distributed data
processing system in which various embodiments of the present
invention may be implemented.
[0014] FIG. 2 sets forth a block diagram of a computer useful in
systems for indexing and searching for information in distributed
data processing systems according to embodiments of the present
invention.
[0015] FIG. 3 depicts an exemplary software architecture in which
methods, systems, and products may be implemented according to
embodiments of the present invention.
[0016] FIG. 4 depicts a further exemplary software architecture in
which methods, systems, and products may be implemented according
to embodiments of the present invention.
[0017] FIG. 5 shows an exemplary personalized search index.
[0018] FIG. 6 sets forth a flow chart illustrating an exemplary
method of personalized searching for information in a distributed
data processing system.
[0019] FIG. 7 sets forth a flow chart illustrating methods of
providing a personal search term list.
[0020] FIG. 8 sets forth a flow chart illustrating exemplary
methods of inserting index records in a personalized search
index.
[0021] FIG. 9 sets forth a flow chart illustrating an exemplary
method of operating a history navigation engine advantageously in
dependence upon a personalized search index.
[0022] FIG. 10 shows a further exemplary personalized search
index.
[0023] FIG. 11 sets forth a flow chart illustrating a further
method of searching for information in a distributed data
processing system according to personalized navigation history.
[0024] FIG. 12 sets forth a flow chart illustrating an exemplary
method of operating a history navigation engine advantageously in
dependence upon subsets of a personalized search index.
[0025] FIG. 13 depicts an exemplary GUI on a client running a data
communication application.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
Introduction
[0026] The present invention is described to a large extent in this
specification in terms of methods for personalized indexing and
searching for information in a distributed data processing system.
Persons skilled in the art, however, will recognize that any
computer system that includes suitable programming means for
operating in accordance with the disclosed methods also falls well
within the scope of the present invention.
[0027] Suitable programming means include any means for directing a
computer system to execute the steps of the method of the
invention, including for example, systems comprised of processing
units and arithmetic-logic circuits coupled to computer memory,
which systems have the capability of storing in computer memory,
which computer memory includes electronic circuits configured to
store data and program instructions, programmed steps of the method
of the invention for execution by a processing unit. The invention
also may be embodied in a computer program product, such as a
diskette or other recording medium, for use with any suitable data
processing system.
[0028] Embodiments of a computer program product may be implemented
by use of any recording medium for machine-readable information,
including magnetic media, optical media, or other suitable media.
Persons skilled in the art will immediately recognize that any
computer system having suitable programming means will be capable
of executing the steps of the method of the invention as embodied
in a program product. Persons skilled in the art will recognize
immediately that, although most of the exemplary embodiments
described in this specification are oriented to software installed
and executing on computer hardware, nevertheless, alternative
embodiments implemented as firmware or as hardware are well within
the scope of the present invention.
Definitions
[0029] In this specification, the terms "field," "data element,"
and "attribute," unless the context indicates otherwise, generally
are used as synonyms, referring to individual elements of
information, typically represented as digital data. Aggregates of
data elements are referred to as "records" or "data structures."
Aggregates of records are referred to as "tables" or "files."
Aggregates of files or tables are referred to as "databases." In
the context of tables, fields may be referred to as "columns," and
records may be referred to as "rows." Complex data structures that
include member methods, functions, or software routines as well as
data elements are referred to as "classes." Instances of classes
are referred to as "objects" or "class objects."
[0030] "802.11" refers to a family of specifications developed by
the IEEE for wireless LAN technology. 802.11 specifies an
over-the-air interface between a wireless client and a base station
or between two wireless clients. Specification 802.11b, also known
as `802.11 High Rate` or `Wi-Fi,` provides wireless network
functionality similar to Ethernet.
[0031] "Browser" means a web browser, a communications application
for locating and displaying web pages. Browsers typically comprise
a markup language interpreter, web page display routines, and an
HTTP communications client. Typical browsers today can display
text, graphics, audio and video. Browsers are operative in
network-enabled devices, including wireless network-enabled devices
such as network-enabled PDAs and mobile telephones. Browsers in
wireless network-enabled devices often are downsized browsers
called "microbrowsers." Microbrowsers in wireless network-enabled
devices often support markup languages other than HTML, including
for example, WML, the Wireless Markup Language.
[0032] "CGI" means "Common Gateway Interface," a standard
technology for data communications of resources between web servers
and web clients. More specifically, CGI provides a standard
interface between servers and server-side `gateway` programs which
administer actual reads and writes of data to and from file
systems and databases.
[0033] "Client," "client device," or "client computer" refers to
any computer, any automated computing machinery, used according to
embodiments of the present invention to prepare and communicate
search queries or search query messages and, in return, receive and
display search results or responses. Examples of client devices are
personal computers, PDAs, mobile telephones, laptop computers, and
others as will occur to those of skill in the art. Various
embodiments of client devices support wireline communications or
wireless communications. The use as a client device of any
instrument capable of administering search queries and search
results is well within the present invention.
[0034] A "communications application" is any data communications
software capable of operating couplings for data communications to
send and receive search query messages and search responses,
including browsers, microbrowsers, special purpose data
communications systems, and others as will occur to those of skill
in the art. "Coupled for data communications" means any form of
data communications, wireless, 802.11b, Bluetooth, infrared, radio,
internet protocols such as TCP/IP, HTTP protocols, email protocols,
networked, direct connections, dedicated phone lines, dial-ups,
serial connections with RS-232 (EIA232) or Universal Serial Buses,
hard-wired parallel port connections, network connections according
to the Power Line Protocol, and other forms of connection for data
communications as will occur to those of skill in the art.
Couplings for data communications include networked couplings for
data communications. Examples of networks useful with various
embodiments of the invention include cable networks, intranets,
extranets, internets, local area networks, wide area networks, and
other network arrangements as will occur to those of skill in the
art.
[0035] "CPU" means `central processing unit.` The term `CPU` as it
is used in this disclosure includes any form of computer processing
unit, regardless whether single, multiple, central, peripheral, or
remote, in any form of automated computing machinery, including
client devices, servers, and so on.
[0036] A "document" is any resource on any distributed data processing
system containing information amenable to indexing and searching
according to embodiments of the present invention. Documents
include static files in markup languages, such as static HTML
files, as well as dynamically-generated content such as query
results and output from CGI scripts and Java™ servlets, and
output from dynamic server pages such as Active Server Pages, Java
Server Pages, and others as will occur to those of skill in the
art.
[0037] "GUI" means `graphical user interface.`
[0038] "HDML" stands for `Handheld Device Markup Language,` a
markup language used to format content for web-enabled mobile
phones. HDML is proprietary to Openwave Systems, Inc., and can only
be operated on phones that use Openwave browsers. Rather than WAP,
HDML operates over Openwave's Handheld Device Transport Protocol
("HDTP").
[0039] "HTML" stands for `HyperText Markup Language,` a standard
markup language for displaying web pages on browsers.
[0040] "HTTP" stands for `HyperText Transfer Protocol,` the
standard data communications protocol of the World Wide Web.
[0041] A "hyperlink," also referred to as "link" or "web link," is
a reference to a resource name or network address which when
invoked allows the named resource or network address to be
accessed. More particularly in terms of the present invention,
invoking a hyperlink implements a request for access to a resource,
generally a document. Often a hyperlink identifies a network
address at which is stored a resource such as a web page or other
document. Hyperlinks are often implemented as anchor elements in
markup in documents. As the term is used in this specification,
however, hyperlinks include links effected through anchors as well
as URIs invoked through `back` buttons on browsers, which do not
involve anchors. Hyperlinks include URIs typed into address fields
on browsers and invoked by a `Go` button, also not involving
anchors. In addition, although there is a natural tendency to think
of hyperlinks as retrieving web pages, their use is broader than
that. In fact, hyperlinks access "resources" generally available
through hyperlinks including not only web pages but many other
kinds of data as well as dynamically-generated server-side output
from Java servlets, CGI scripts, and other resources as will occur
to those of skill in the art.
[0042] "The Internet" is a global network connecting millions of
computers utilizing the `Internet Protocol` or `IP` as the network
layer of their networking protocol stacks, and, typically, also
using the `Transmission Control Protocol` or `TCP` as the transport
layer of their networking protocol stacks. The Internet is
decentralized by design, a strong example of a distributed data
processing system. An "internet" (uncapitalized) is any network
using IP as the network layer in its network protocol stack.
[0043] "LAN" is an abbreviation for "local area network." A LAN is
a computer network that spans a relatively small area. Many LANs
are confined to a single building or group of buildings. However,
one LAN can be connected to other LANs over any distance via
telephone lines and radio waves. A system of LANs connected in this
way is called a wide-area network ("WAN"). The Internet is an
example of a WAN.
[0044] "Network" is used in this specification to mean any
networked coupling for data communications among computers or
computer systems, clients, servers, and so on. Examples of networks
useful with the invention include intranets, extranets, internets,
local area networks, wide area networks, and other network
arrangements as will occur to those of skill in the art.
[0045] "PDA" refers to a personal digital assistant, a handheld
computer useful as a client according to embodiments of the present
invention.
[0046] "Resource" means any aggregation of information administered
in distributed processing systems according to embodiments of the
present invention. Network communications protocols generally, for
example, HTTP, transmit resources, not just files. A resource is an
aggregation of information capable of being identified by a URI or
URL. In fact, the `R` in `URI` stands for `Resource.` The most
common kind of resource is a file, but resources include
dynamically-generated query results, the output of CGI scripts,
dynamic server pages, and so on. It may sometimes be useful to
think of a resource as similar to a file, but more general in
nature. Files as resources include web pages, graphic image files,
video clip files, audio clip files, files of data having any MIME
type, and so on. As a practical matter, most HTTP resources, WAP
resources, and the like are currently either files or server-side
script output. Server side script output includes output from CGI
programs, Java servlets, Active Server Pages, Java Server Pages,
and so on.
[0047] "Server" in this specification refers to a computer or
device comprising automated computing machinery on a network that
manages resources, including documents, and requests for access to
such resources. A "web server," in particular, is a server that
communicates with client computers through communications
applications, such as browsers or microbrowsers, by means of
hyperlinking protocols such as HTTP, WAP, or HDTP, in order to
manage and make available to networked computers documents, digital
objects, and other resources.
[0048] "SQL" stands for `Structured Query Language,` a standardized
query language for requesting information from a database. Although
there is an ANSI standard for SQL, as a practical matter, most
versions of SQL tend to include many extensions. This specification
provides examples of database queries against semantics-based
search indexes expressed as pseudocode SQL. Such examples are said
to be `pseudocode` because they are not cast in any particular
version of SQL and also because they are presented for purposes of
explanation rather than as actual working models.
[0049] A "Java Servlet" is a program designed to be run from
another program rather than directly from an operating system.
"Servlets" in particular are designed to be run on servers from a
conventional Java interface for servlets. Servlets are modules that
extend request/response oriented servers, such as Java-enabled web
servers. Java servlets are an alternative to CGI programs.
[0050] "TCP/IP" refers to two layers of a standard OSI data
communications protocol stack. The network layer is implemented
with the Internet Protocol, hence the initials `IP.` And the
transport layer is implemented with the Transmission Control Protocol,
referred to as `TCP.` The two protocols are used together so
frequently that they are often referred to as the TCP/IP suite, or,
more simply, just `TCP/IP.` TCP/IP is the standard data transport
suite for the well-known world-wide network of computers called
`the Internet.`
[0051] A "URI" or "Uniform Resource Identifier" is an identifier
of a named object in any namespace accessible through a network.
URIs are functional for any access scheme, including for example,
the File Transfer Protocol or "FTP," Gopher, and the web. A URI as
used in typical embodiments of the present invention usually
includes an internet protocol address, or a domain name that
resolves to an internet protocol address, identifying a location
where a resource, particularly a document, a web page, a CGI
script, or a servlet, is located on a network, often the Internet.
URIs directed to particular resources, such as particular
documents, HTML files, CGI scripts, or servlets, typically include
a path name or file name locating and identifying a particular
resource in a file system coupled through a server to a network. To
the extent that a particular resource, such as a CGI file, a
servlet, or a dynamic web page, is executable, for example to store
or retrieve data, a URI often includes query parameters, or data to
be stored, in the form of data encoded into the URI. Such
parameters or data to be stored are referred to as `URI encoded
data,` or sometimes as `form data.`
[0052] "URI encoded data" or "form data" is data packaged in a URI
for data communications, a useful method for communicating variable
names and values in a distributed data processing system such as
the Internet. Form data is typically communicated in hyperlinking
protocols, such as, for example, HTTP which uses GET and POST
functions to transmit URI encoded data. In this context, it is
useful to remember that URIs do more than merely request file
transfers. URIs identify resources on servers. Such resources may be
files having filenames, but the resources identified by URIs also
may include, for example, queries to databases, including queries
to search engines according to embodiments of the present
invention. Results of such queries do not necessarily reside in
files, but they are nevertheless data resources identified by URIs
and identified by a search engine and query data that produce such
resources. An example of URI encoded data is:
[0053] http://www.foo.com/cgi-bin/MyScript.cgi?field1=value1&field2=value2
[0054] This example shows a URI bearing encoded data. The encoded
data is the string "field1=value1&field2=value2." The encoding
method is to string together field names and field values, joining
each name to its value with `=` and separating successive pairs
with `&.` There are no quote marks or spaces in the string. Because
there are no quote marks, spaces are encoded with `+,` and a
literal `&` inside a value is encoded with an escape sequence, in
this example, `%26.` For example, if an HTML form has a field
called "name" set to "Lucy", and a field called "neighbors" set to
"Fred & Ethel", the data string encoding the form would be:
[0055] name=Lucy&neighbors=Fred+%26+Ethel
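The encoding just described can be reproduced with a short sketch.
Python and its standard urllib.parse module are assumptions made
here for illustration; the specification does not prescribe any
particular language or library.

```python
from urllib.parse import urlencode, parse_qs

# The form fields from the example above.
form_fields = {"name": "Lucy", "neighbors": "Fred & Ethel"}

# urlencode joins name=value pairs with '&', writes spaces as '+',
# and escapes a literal '&' inside a value as %26.
encoded = urlencode(form_fields)
print(encoded)

# parse_qs reverses the encoding; each field name maps to a list of values.
decoded = parse_qs(encoded)
print(decoded)
```

Decoding restores the original value "Fred & Ethel," which shows
why the literal `&` must be escaped: left unescaped, it would be
read as a separator between fields.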
[0056] "URLs" or "Uniform Resource Locators" comprise a kind of
subset of URIs, such that each URL resolves to a network address.
That is, URIs and URLs are distinguished in that URIs identify
named objects in namespaces, where the names may or may not resolve
to addresses, while URLs do resolve to addresses. Although
standards today are written on the basis of URIs, it is still
common to see such web-related identifiers, of the kind used to
associate web data locations with network addresses for data
communications, referred to as "URLs." This specification uses the
terms URI and URL more or less as synonyms.
[0057] "WAN" means `wide area network.` One example of a WAN is the
Internet.
[0058] "WAP" refers to the Wireless Application Protocol, a
protocol for use with handheld wireless devices. Examples of
wireless devices useful with WAP include mobile phones, pagers,
two-way radios, hand-held computers, and PDAs. WAP supports many
wireless networks, and WAP is supported by many operating systems.
WAP supports HTML, XML, and particularly WML (the Wireless Markup
Language), which is a language particularly designed for small
screen and one-hand navigation without a keyboard or mouse.
Operating systems specifically engineered for handheld devices
include PalmOS, EPOC, Windows CE, FLEXOS, OS/9, and JavaOS. WAP
devices that use displays and access the Internet run
"microbrowsers." The microbrowsers use small file sizes that can
accommodate the low memory constraints of handheld devices and the
low-bandwidth constraints of wireless networks.
[0059] "WML" stands for `Wireless Markup Language,` an XML language
used as a markup language for web content intended for wireless
web-enabled devices that implement WAP. There is a WAP forum that
provides a DTD for WML. A DTD is an XML `Document Type
Definition.`
[0060] "World Wide Web," or more simply "the web," refers to a
system of internet protocol ("IP") servers that support specially
formatted, hyperlinking documents, documents formatted in markup
languages such as HTML, XML, WML, and HDML. The term "web" is used
in this specification also to refer to any server or connected
group or interconnected groups of servers that implement a
hyperlinking protocol, such as HTTP, WAP, HDTP, or others, in
support of URIs and documents in markup languages, regardless
whether such servers or groups of servers are coupled to the World
Wide Web as such.
[0061] "XML" stands for `extensible Markup Language,` a language
that supports user-defined markup including user-defined elements,
tags, and attributes. XML's extensibility contrasts with most
web-related markup languages, such as HTML, which are not
extensible, but which instead use a standard defined set of
elements, tags, and attributes. XML's extensibility makes it a good
foundation for defining other languages. WML, the Wireless Markup
Language, for example, is a markup language based on XML. Modern
browsers and other communications clients tend to support markup
languages other than HTML, including, for example, XML.
Personalized Information Indexing
[0062] Exemplary methods, systems, and products for personalized
indexing of information in a distributed data processing system are
now explained with reference to the accompanying drawings,
beginning with FIG. 1. FIG. 1 depicts an architecture for a
distributed data processing system in which various embodiments of
the present invention may be implemented. The distributed data
processing system of FIG. 1 includes a number of computers coupled
for data communications in networks. The distributed data
processing system of FIG. 1 includes networks 102, 104. Networks in
such systems may comprise LANs, WANs, intranets, internets, the
Internet, webs, and the World Wide Web itself. Such networks
comprise media that may be used to provide couplings for data
communications between various devices and computers connected
together within a distributed data processing system. Such networks
may include permanent couplings, such as wire or fiber optic
cables, or temporary couplings made through wireline telephone or
wireless communications.
[0063] In the example of FIG. 1, server 128 and server 104 are
connected to network 102 along with storage unit 132. In addition,
several exemplary client devices including a PDA 106, a workstation
108, and a mobile phone 110 are coupled for data communications to
network 102. Network-enabled mobile phone 110 connects to network
102 through wireless link 116, and PDA 106 connects to network 102
through wireless link 114. In the example of FIG. 1, server 128
couples directly to client workstation 130 and network 104 (which
may be a LAN), which incorporates wireless communication links
supporting a wireless coupling to laptop computer 126 and wireline
protocols supporting a wired coupling to client workstation
112.
[0064] Client devices and servers in such distributed processing
systems may be represented by a variety of computing devices, such
as mainframes, personal computers, personal digital assistants,
web-enabled mobile telephones, and so on. The particular servers
and client devices illustrated in FIG. 1 are for explanation, not
for limitation. Distributed data processing systems may include
additional servers, clients, routers, other devices, and
peer-to-peer architectures, not shown in FIG. 1, as will occur to
those of skill in the art. Networks in such distributed data
processing systems may support many data communications protocols,
TCP/IP, HTTP, WAP, HDTP, and others as will occur to those of skill
in the art. Various embodiments of the present invention may be
implemented on a variety of hardware platforms in addition to those
illustrated in FIG. 1. FIG. 1 is intended as an example of a
heterogeneous distributed computing environment in which various
embodiments of the present invention may be implemented, not as an
architectural limitation of the present invention.
[0065] FIG. 2 sets forth a block diagram of automated computing
machinery comprising a computer 106, such as a client device or
server, useful in systems for personalized indexing of information
in distributed data processing systems according to embodiments of
the present invention. The computer 106 of FIG. 2 includes at least
one computer processor 156 or `CPU` as well as random access memory
168 ("RAM"). Stored in RAM 168 is an application program 152.
Application programs useful in implementing inventive methods of
the present invention include servlets and CGI scripts running on
servers and data communications programs such as browsers or
microbrowsers running on client machines. Also stored in RAM 168 is
an operating system 154. Operating systems useful in computers
according to embodiments of the present invention include Unix,
Linux, Microsoft NT™, and many others as will occur to those of
skill in the art.
[0066] The computer 106 of FIG. 2 includes computer memory 166
coupled through a system bus 160 to the processor 156 and to other
components of the computer. Computer memory 166 may be implemented
as a hard disk drive 170, optical disk drive 172, electrically
erasable programmable read-only memory space (so-called `EEPROM` or
`Flash` memory) 174, RAM drives (not shown), or as any other kind
of computer memory as will occur to those of skill in the art.
[0067] The example computer 106 of FIG. 2 includes communications
adapter 167 implementing couplings for data communications 184 to
other computers 182, servers or clients. Communications adapters
implement the hardware level of couplings for data communications
through which client computers and servers send data communications
directly to one another and through networks. Examples of
communications adapters include modems for wired dial-up
connections, Ethernet (IEEE 802.3) adapters for wired LAN
connections, and 802.11b adapters for wireless LAN connections.
[0068] The example computer of FIG. 2 includes one or more
input/output interface adapters 178. Input/output interface
adapters in computers implement user-oriented input/output through,
for example, software drivers and computer hardware for controlling
output to display devices 180 such as computer display screens, as
well as user input from user input devices 181 such as keyboards
and mice.
[0069] For further explanation, FIG. 3 depicts an exemplary
software architecture in which methods, systems, and products may
be implemented according to embodiments of the present invention
for personalized searching for information in a distributed data
processing system. The example of FIG. 3 provides a personal search
term list 300 in a search portal 334. A `search portal` 334, as the
term is used in this specification, means a data communications
server such as a web server that supports a personalized search
index 500. The search portal 334 in the example of FIG. 3 includes
a search engine 332 operating in dependence upon the personalized
search index 500.
[0070] The personal search term list in the example of FIG. 3
comprises search keywords 302 of interest to a user 310. The
keywords 302 are identified as being of interest to the user by
their inclusion in the personal search term list, and they are
known to be of interest because, as explained in more detail below
in this specification:
[0071] the user invoked them as contents of a hyperlink in
navigating a distributed data processing system,
[0072] or the user selected them from within a document,
[0073] or the user provided them to the search portal as search
criteria in a search query message,
[0074] or the user inserted them directly into the user's personal
search term list through an edit function provided for that
purpose.
[0075] In the example of FIG. 3, a software module for providing
312 a personal search term list 300 operates by inserting into a
table in computer memory records comprising a keyword 302
identified by one of the methods just mentioned, along with a user
identification 305.
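As a minimal sketch of such a table, the following uses Python's
sqlite3 module with an in-memory database. The table name
personalSearchTerms, its column names, and the helper functions
are hypothetical names chosen for illustration and do not appear
in the drawings.

```python
import sqlite3

def provide_term_list(conn):
    # One row per (user, keyword); the composite key prevents duplicates.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS personalSearchTerms ("
        " userID TEXT NOT NULL,"
        " keyword TEXT NOT NULL,"
        " PRIMARY KEY (userID, keyword))"
    )

def add_keyword(conn, user_id, keyword):
    # INSERT OR IGNORE silently skips a keyword the user already has.
    conn.execute(
        "INSERT OR IGNORE INTO personalSearchTerms VALUES (?, ?)",
        (user_id, keyword),
    )

def keywords_for(conn, user_id):
    rows = conn.execute(
        "SELECT keyword FROM personalSearchTerms"
        " WHERE userID = ? ORDER BY keyword",
        (user_id,),
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
provide_term_list(conn)
for kw in ("Java", "IBM", "Java"):  # the repeated keyword is ignored
    add_keyword(conn, "tim", kw)
add_keyword(conn, "ann", "Linux")
print(keywords_for(conn, "tim"))
```

Keying the table on the user identification together with the
keyword keeps each user's list separate, so the same keyword may
appear in the lists of several users.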
[0076] The exemplary software architecture of FIG. 3 includes a
module that receives 316 from a user 310 a navigation
identification message 300 comprising a user identification 304 for
the search portal and a navigation location 314. More particularly,
in the example of FIG. 3, receiving 316 a navigation identification
message 300 is carried out by receiving a navigation identification
message from a user's data communications application 306, the data
communications application, such as a browser or microbrowser,
installed and operating on a client computer 308. In the example of
FIG. 3, the navigation identification message is communicated from
the data communications application to the search portal through a
network, typically utilizing a hyperlinking data communications
protocol such as HTTP, WAP, HDTP, and the like.
[0077] The data communications application is configured to create
and send a navigation identification message to the search portal
every time its user operates the data communications application so
as to navigate within a distributed data processing system.
Navigating within a distributed data processing system means
operating data communications applications so as to request and
receive documents and other resources from computers comprising the
distributed processing system. In the example of the web as a
distributed processing system, navigating within the web means
requesting web pages and other documents from web servers through a
browser or microbrowser operating as a data communications
application in a client machine. Prior art data communications
applications such as browsers typically do not report users'
navigation to search portals and must therefore be configured to do
so. Configuring a data communications application to report users'
navigation to a search portal is carried out by modifying its
programming, either in its source code or through a plug-in, to
store in computer memory a user identification for a user for a
search portal as well as a network address for the search portal,
and to create and transmit a navigation identification message to
the search portal every time its user operates the data
communications application so as to navigate within a distributed
data processing system.
[0078] Such data communications applications may create a
navigation identification message, taking browsers and HTML as
examples, by use of hyperlinks. In HTML, hyperlinks are implemented
with anchor elements that include `href` attributes that identify
documents or other resources requested through a hyperlink. Here is
an example of an anchor element:
[0079] <a href="http://www.ibm.com/index.html">Click Here For
Java Portal Report</a>
[0080] The anchor element tags, start tag and end tag, are
<a> and </a>. The href attribute is an HTML attribute
included within the start tag of the anchor element. The content
of the element is the string "Click Here For Java Portal Report." A
browser renders the hyperlink by displaying on a browser screen the
contents of the anchor element, "Click Here For Java Portal
Report," in an inverse color or highlighted so as to distinguish it
as a hyperlink. When a user invokes the hyperlink by, for example,
mouse-clicking the displayed part on the browser screen, the
browser, in ordinary operation, opens a data communications
connection to the server identified by the domain name in the href
attribute, in this example, "www.ibm.com," and requests the
document identified by "index.html." In browsers configured for use
with embodiments of the present invention, the browser also opens a
data communications connection, such as a TCP connection, to a
search portal and transmits to the search portal the entire URI
"http://www.ibm.com/index.html" along with a user identification
for a user for the search portal, the two together comprising a
navigation identification message, so-called because including the
URI has the effect of identifying to the search portal where on the
web the user is visiting. The following is an example of a
navigation identification message represented as URI encoded data
for transmission to a search portal in an HTTP POST or GET
message:
[0081]
userid=John+Smith&location=http://www.ibm.com/index.html
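A sketch of how a client might build such a message, and how a
search portal might read it, follows. It assumes Python's standard
urllib.parse module; the function names are hypothetical. Note that
a strict encoder also escapes the `:` and `/` characters of the
location, which the example above leaves unescaped; the parser
shown accepts both forms.

```python
from urllib.parse import urlencode, parse_qs

def build_navigation_message(user_id, location):
    # Client side: package the userID and the visited URI as form data.
    return urlencode({"userid": user_id, "location": location})

def parse_navigation_message(body):
    # Portal side: recover the userID and the navigation location.
    fields = parse_qs(body)
    return fields["userid"][0], fields["location"][0]

message = build_navigation_message(
    "John Smith", "http://www.ibm.com/index.html"
)
print(message)

user_id, location = parse_navigation_message(message)
print(user_id, location)

# The unescaped form shown in the example above parses to the same values.
as_written = "userid=John+Smith&location=http://www.ibm.com/index.html"
print(parse_navigation_message(as_written))
```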
[0082] The exemplary software architecture of FIG. 3 includes a
module that inserts 320 index records 318 in a personalized search
index 500 in dependence upon the user identification 304, the
navigation location 314, and the personal search term list 300.
Inserting 320 index records in a personalized search index creates
a personalized search index 500 as illustrated in FIG. 5. The
personalized search index 500 of FIG. 5 is `personalized`
particularly in that it includes a user identification or `userID`
572 for the search portal.
[0083] In typical indexing engines according to embodiments of the
present invention, moreover, inserting 320 index records 318 in a
personalized search index 500 includes inserting time stamps on the
index records as shown at reference 578 on FIG. 5. As used in this
disclosure, the term "time stamp" refers to data encoding both the
date and the time when a navigation identification message 300 is
received 316. The time stamps 578 in the exemplary personalized
search index 500 of FIG. 5 are shown with a precision of 0.1
seconds. That is, the time stamp on record 552, for example, shows
that record 552 is derived from a navigation identification message
that was received on Mar. 1, 2003 at approximately 32.1 seconds
after 7:15 a.m. local time in the time zone in which is located the
search portal on which the personalized search index 500 is
installed. The precision level of 0.1 seconds is chosen for this
example because it provides a resolution typically smaller than
human response time in computer operations. That is, it is unlikely
that a user will cause a search portal to receive navigation
identification messages at intervals of less than 0.1 seconds. The
time stamp precision of 0.1 seconds is chosen, however, purely for
purposes of explanation, not as a limitation of the invention. Any
time stamp precision may be used as will occur to those of skill in
the art for the needs of any particular search portal, and all such
time stamp precisions are well within the scope of the present
invention.
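For illustration only, a time stamp with a precision of 0.1
seconds can be derived from a full-resolution clock reading as
sketched below. The textual layout (year-month-day, then time) is
an assumption and is not taken from FIG. 5; it is chosen because
fixed-width stamps in this layout sort as text in chronological
order. The function name is hypothetical.

```python
from datetime import datetime

def to_time_stamp(received):
    """Render a datetime as text, truncated to tenths of a second."""
    tenths = received.microsecond // 100_000
    return received.strftime("%Y-%m-%d %H:%M:%S") + f".{tenths}"

# A message received about 32.1 seconds after 7:15 a.m. on Mar. 1, 2003,
# expressed in the local time of the search portal.
received = datetime(2003, 3, 1, 7, 15, 32, 149_000)
print(to_time_stamp(received))

# A later message yields a stamp that also compares later as a string.
later = datetime(2003, 3, 1, 7, 15, 32, 950_000)
print(to_time_stamp(received) < to_time_stamp(later))
```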
[0084] User identifications or userIDs generally in this
specification are described as user identifications `for a search
portal.` A user identification for a search portal typically
comprises data uniquely identifying a user to a search portal. User
identifications are user identifications `for a search portal`
because embodiments of the invention advantageously support user
access from any client machine. That is, for example, a user of
browsers configured to operate according to embodiments of the
present invention can install such browsers on a computer at work,
a computer at home, and a wirelessly-coupled laptop, each of which
implements a different domain name and a different user name for
the user. Each such browser, however, stores in its computer memory
and uses in its communications with a search portal the same user
identification for the search portal, which may be the same as one
of the user identifications on one of the user's client machines,
but may be different from all of them. In this way, the search
portal is advised of the user's navigation regardless of the client
machine from which the navigation originates. The search portal
creates a personalized search index pertinent to the user on the
basis of all the user's navigation of the web, even when the
navigation occurs across a multiplicity of client machines. And the
search portal's search engine can provide improved search focus to
the user regardless of the client machine from which search
requests originate.
[0085] The example personalized search index 500 of FIG. 5 includes
keywords 570 indexed with navigation locations, in this example,
URIs, identifying the location in cyberspace where the keywords are
found. More particularly, the keywords are extracted from documents
identified by URIs that match keywords stored in a personal search
term list for a user--and then inserted into records in a
personalized search index along with a userID and a URI. It is in
this sense that a personalized search index 500 is created in
dependence upon user identification 304, navigation location 314,
and a personal search term list 300.
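The following sketch shows one way index records might be derived
from a navigation location and a personal search term list. The
record layout, the function name, and the simple case-insensitive
word matching are assumptions made for illustration; retrieval of
the document from the navigation location is omitted, and the
document text is passed in directly.

```python
import re
from datetime import datetime
from typing import NamedTuple

class IndexRecord(NamedTuple):
    user_id: str
    keyword: str
    location: str
    time_stamp: datetime

def index_document(user_id, location, document_text, term_list, received):
    # Collect the distinct words of the document, ignoring case.
    words = {w.lower() for w in re.findall(r"[A-Za-z0-9]+", document_text)}
    # Emit one record per listed keyword that occurs in the document.
    return [
        IndexRecord(user_id, keyword, location, received)
        for keyword in term_list
        if keyword.lower() in words
    ]

records = index_document(
    user_id="tim",
    location="www.ibm.com",
    document_text="IBM announces new Java tools for Linux",
    term_list=["IBM", "Java", "XML"],
    received=datetime(2003, 3, 1, 7, 15, 32, 100_000),
)
for record in records:
    print(record.keyword, record.location)
```

Keywords absent from the user's list, such as "Linux" here, produce
no records, and listed keywords absent from the document, such as
"XML," produce none either.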
[0086] The exemplary architecture of FIG. 3 includes a module that
receives 324 in the search portal 334 from the user 310 a search
query message 328 comprising search criteria 328 and user
identification 329 for the search portal. A search query message
328 can be implemented, for example, as an HTTP request message or
GET message bearing search criteria 328 as search keywords URI
encoded. Here is an example of URI encoding in a search query
message for search criteria `IBM` and `Java` with userID of
`tim`:
[0087] query=IBM+Java&userID=tim
[0088] The example of FIG. 3 includes a software module that
creates 322, in dependence upon the personalized search index 500,
the search criteria 328, and the user identification 329, a
response 330 to the search query message. Creating a response to a
search query message typically is carried out by parsing search
criteria and user identification from the search query message into
a database query. A database query may be expressed in a database
query language such as, for example, SQL. The example search query
message set forth above, having search criteria `IBM` and `Java`
with userID of `tim,` parsed into SQL may be represented as:
[0089] SELECT * FROM personalizedIndex
[0090] WHERE keyword IN ('IBM', 'Java')
[0091] AND userID = 'tim';
[0092] This SQL query retrieves from a personalized search index
named `personalizedIndex` records having keywords `IBM` or `Java`
and userID of `tim.` If the example index of FIG. 5 is taken as
`personalizedIndex,` for example, this example SQL query would
select records 558 and 568. Both records 558 and 568 identify the
URI "www.ibm.com," which is then combined with a title and
description (not shown) and incorporated into a response 330 to the
search query message.
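As a runnable sketch of this step, the following parses the example
search query message and executes an equivalent query using
Python's sqlite3 module. The rows inserted are hypothetical and are
not copied from FIG. 5; the column names and the search function
are likewise assumptions.

```python
import sqlite3
from urllib.parse import parse_qs

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE personalizedIndex"
    " (userID TEXT, keyword TEXT, location TEXT, timeStamp TEXT)"
)
# Hypothetical index records for illustration.
conn.executemany(
    "INSERT INTO personalizedIndex VALUES (?, ?, ?, ?)",
    [
        ("tim", "IBM", "www.ibm.com", "2003-03-01 07:15:32.1"),
        ("tim", "Java", "www.ibm.com", "2003-03-01 07:15:32.1"),
        ("tim", "Linux", "www.example.org", "2003-03-01 07:20:05.4"),
        ("ann", "Java", "www.example.com", "2003-03-02 09:00:00.0"),
    ],
)

def search(conn, user_id, keywords):
    # One bound parameter per keyword avoids building SQL from user input.
    placeholders = ", ".join("?" for _ in keywords)
    sql = (
        "SELECT DISTINCT location FROM personalizedIndex"
        f" WHERE keyword IN ({placeholders}) AND userID = ?"
        " ORDER BY location"
    )
    return [row[0] for row in conn.execute(sql, (*keywords, user_id))]

# Parse the example search query message into criteria and userID.
fields = parse_qs("query=IBM+Java&userID=tim")
keywords = fields["query"][0].split()
user_id = fields["userID"][0]
print(search(conn, user_id, keywords))
```

Selecting distinct locations collapses the two matching records
into a single result, mirroring the way both matching records in
the example identify the same URI.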
[0093] The example of FIG. 3 includes a software module that
transmits 326 the response 330 to the user 310. Transmitting 326 a
search response 330 to a user 310 is typically carried out by
transmitting a response message in a hyperlinking protocol such as
HTTP, WAP, HDTP, and the like. Such a response message typically
includes the search results expressed in a markup language, such
as, for example, HTML or WML, for display on a browser.
[0094] For further explanation, FIG. 4 depicts an exemplary
software architecture in which methods, systems, and products may
be implemented according to embodiments of the present invention
for personalized searching for information in a distributed data
processing system. More particularly, FIG. 4 illustrates an
architecture useful for implementing searching according to
personalized navigation history. Personalized navigation history is
taken in this disclosure as a personalized search index bearing
time stamps, thereby indicating the order in which locations (URLs,
URIs, and so on) in the index were traversed or navigated by a
user. Even more particularly, the architecture of FIG. 4 includes a
module that receives 402 in a search portal 334 from a user
310 a navigation request message 404 comprising a navigation
direction 406. In typical embodiments, supported navigation
directions include `Back` and `Forward.`
[0095] The architecture of FIG. 4 includes a module that creates
408, in dependence upon a personalized search index 500, the
navigation direction 406, and a last navigation time stamp 307, a
response 410 to the navigation request message 404. The last
navigation time stamp 307 records the time stamp from the last
personalized search index record used to create a previous response
to a previous navigation request message. An initial value or
starting point for the last navigation time stamp 307 may be set by
data entry from a user. In the exemplary browser of FIG. 13, for
example, invoking the menu item `Nav Start Point` 769 prompts the
user to enter a time stamp value, which is then URL encoded and
transmitted in an HTTP message to the search portal for storage
in a last navigation time stamp 307 in a user's user account 610,
such as, for example:
[0096] userid=JohnSmith&lastNavTimeStamp=3/4/03+1010:10.9
[0097] This example describes a modified browser. Alternatively,
the search portal may support HTML forms in web pages through which
a user may enter an initial value or starting point for a last
navigation time stamp directly from an unmodified browser. No doubt
other ways of indicating starting points for history navigation
will occur to those of skill in the art, and all such ways are well
within the scope of the present invention.
[0098] Creating 408 a response to a navigation request message is
carried out by retrieving (802), from user account data (610), the
last navigation time stamp (307) and retrieving (804) a navigation
location (576) from the personalized search index (500) in
dependence upon the last navigation time stamp (307) from the user
account data and the navigation direction (406). Retrieving a
navigation location from the personalized search index is carried
out by retrieving from the personalized search index a navigation
location from the first index record having a time stamp later than
the last navigation time stamp if the navigation direction is
`Forward`. If the navigation direction is `Back,` retrieving a
navigation location from the personalized search index is carried
out by retrieving from the personalized search index a navigation
location from the first index record having a time stamp earlier
than the last navigation time stamp.
[0099] Creating 408 a response to a navigation request message also
includes retrieving (804) the document (806) identified by the
navigation location, incorporating the retrieved document into the
response, and transmitting (412) the response (410), including the
document, to the user. The architecture of FIG. 4 includes a module
that updates 416 the last navigation time stamp 307 by storing in
it the time stamp from the personalized search index record whose
location field value was used to retrieve the document that was
incorporated into the response. If the navigation direction was
`Forward,` therefore, the new value of the last navigation time
stamp 307 in the user account 610 is the next later time stamp
value from the personalized search index records. If the navigation
direction was `Back,` the new value of the last navigation time
stamp 307 in the user account 610 is set to the next earlier time
stamp value from the personalized search index records.
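The retrieval and update steps for `Back` and `Forward` navigation
can be sketched as follows. This is an illustration under stated
assumptions, not a definitive implementation: time stamps are held
as fixed-width text that sorts chronologically, the "first" earlier
record is read as the nearest earlier one, the locations other
than www.ibm.com are hypothetical, and retrieval of the document
itself is omitted. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class UserAccount:
    user_id: str
    last_nav_time_stamp: str

def navigate(index, account, direction):
    """Return the next location in the given direction, or None.

    index is a list of (time_stamp, location) pairs for one user.
    """
    ordered = sorted(index)
    last = account.last_nav_time_stamp
    if direction == "Forward":
        candidates = [rec for rec in ordered if rec[0] > last]
        chosen = candidates[0] if candidates else None
    elif direction == "Back":
        candidates = [rec for rec in ordered if rec[0] < last]
        chosen = candidates[-1] if candidates else None
    else:
        raise ValueError(f"unsupported navigation direction: {direction}")
    if chosen is None:
        return None  # nothing further in that direction; stamp unchanged
    # Update the last navigation time stamp to the record just used.
    account.last_nav_time_stamp = chosen[0]
    return chosen[1]

history = [
    ("2003-03-01 07:15:32.1", "www.ibm.com"),
    ("2003-03-01 07:20:05.4", "www.example.org/a"),
    ("2003-03-01 07:31:47.0", "www.example.org/b"),
]
account = UserAccount("tim", "2003-03-01 07:20:05.4")
print(navigate(history, account, "Back"))
print(navigate(history, account, "Forward"))
```

Because each call stores the time stamp of the record it used,
repeated `Back` or `Forward` requests step through the user's
navigation history one location at a time.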
[0100] For further explanation, FIG. 6 sets forth a flow chart
illustrating an exemplary method of personalized searching for
information in a distributed data processing system that includes
providing 312 in a search portal 334 a personal search term list
300. The method of FIG. 6 also includes receiving 316 from a user
310 a navigation identification message 300 comprising a user
identification 304 for the search portal and a navigation location
314 and inserting 320 index records 318 in a personalized search
index 500 in dependence upon the user identification 304, the
navigation location 314, and the personal search term list 300. In
typical embodiments, the personalized search index comprises index
records comprising time stamps, and inserting index records in the
personalized search index 500 includes inserting time stamps
encoding an indication of the date and time of the receiving of
navigation identification messages.
[0101] The method of FIG. 6 also includes establishing 502 a user
account 610 for the user on the search portal, the user account
comprising user identification 305 for users of the portal, and a
last navigation time stamp 307. A user account 610 is typically
implemented as a database table or other data structure retained in
computer memory. Recall from the discussion of the architecture of
FIG. 4 that responses 410 to navigation request messages 404 are
created 408 in dependence upon time stamps in index records 318 in
personalized search indexes 500. The last navigation time stamp
(307 on FIG. 6) records the time stamp from the last index record
318 in a personalized search index 500 from which was created a
response 410 to a navigation request message 404.
[0102] The userID 305 in the user account, as mentioned above, is
unique to a user within the search portal. Each user may have
multiple user names, logon ids, or other user identifications used
in multiple domains, wirelessly coupled laptops, PDAs, mobile
phones, home PCs, workstations on LANs at work, and so on.
Establishing a single userID for the search portal allows entering
that userID into each data communications application in each
domain and therefore making all navigation within the web available
to the search portal regardless of the domain from which the
navigation originates. User accounts optionally include passwords, retinal
scans, digitally-encoded fingerprints, security tokens, or other
security data as will occur to those of skill in the art. The user
identification advantageously is sufficient to uniquely identify
the user, and user identification can be implemented as
confidential PIN numbers or other relatively secure formats.
Passwords and other security data therefore are said to be
optional, depending on the level of security deemed to be needed by
an operator of any particular search portal according to
embodiments of the present invention.
[0103] The method of FIG. 6 includes authenticating 324 the
navigation identification message 300. Some indexing systems
according to embodiments of the present invention may operate
without authentication. Such systems accept navigation
identification messages from any user. Because users can transmit
navigation identification messages from any client, however, users
may inadvertently transmit navigation identification messages with
the wrong user identification. In systems without authentication,
such navigation identification messages are accepted for indexing,
although the resulting index records may be inserted with the wrong
user identification. As an aid to accuracy and order, that is, to
determine that a navigation identification message is from the user
it purports to be from and that it will effect correct indexing,
many indexing systems according to embodiments of the present
invention authenticate 324 navigation identification messages by
determining whether the user identification 304 in a navigation
identification message exists in a user account record 610. In
systems that use additional security data, such as
passwords, authentication includes comparing a password (not shown)
from a navigation identification message with a password from a
user account 610 for the user identified by the userID 304 in the
navigation identification message.
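For illustration only, the authentication check described above may be sketched in Python as follows. The in-memory dictionary standing in for user account records 610, the sample passwords, and the function name authenticate are assumptions made for this sketch and are not part of the disclosure.

```python
from typing import Optional

# Hypothetical stand-in for user account records 610:
# userID -> password, or None where the portal stores no password.
USER_ACCOUNTS = {"JohnSmith": "s3cret", "tim": None}


def authenticate(user_id: str, password: Optional[str] = None) -> bool:
    """Accept a message only if its userID exists in a user account and,
    where the account carries a password, the supplied password matches."""
    if user_id not in USER_ACCOUNTS:
        return False  # unknown userID: reject the message
    stored = USER_ACCOUNTS[user_id]
    if stored is None:
        return True  # userID alone suffices for this account
    return password == stored
```

A system operating without authentication, as also described above, would simply skip this check and index every message it receives.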
[0104] As an aid to clarity in presentation of search results, the
method of FIG. 6 includes assigning 504 priority to index records
318 in the personalized search index. In some indexing systems
according to embodiments of the present invention, assigning
priority comprises counting the number of times a navigation
location 314 is received in navigation identification messages 300.
Consider the exemplary personalized search index 500 of FIG. 5,
whose data structure contains a field for storing a priority value,
shown as column 574 on FIG. 5. Indexing systems that assign
priority by counting the number of times a navigation location 314
is received in navigation identification messages 300 may do so by
incrementing a priority value 574 in every record bearing a
particular navigation location (represented as URIs 576 in the
example of FIG. 5) every time a navigation identification message
300 is received with that navigation location. In the example of
the web, this procedure has the effect of incrementing the priority
value of index records for a particular web document, resource, or
web site, every time a user visits the web site or requests the
document or resource. The more often a user accesses a particular
web document, resource, or site, the higher its priority value
becomes.
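A minimal Python sketch of this counting scheme follows. The IndexRecord class, its field names, and the restriction of the update to one user's records are assumptions made for illustration; the disclosure specifies only that a priority value 574 is incremented in every record bearing the received navigation location.

```python
from dataclasses import dataclass


@dataclass
class IndexRecord:
    # Hypothetical record layout modeled on the columns of FIG. 5.
    keyword: str
    user_id: str
    uri: str
    priority: int = 0


def bump_priority(index, user_id, uri):
    """Increment the priority of every record of this user that bears the
    given navigation location; return the number of records updated."""
    updated = 0
    for record in index:
        if record.user_id == user_id and record.uri == uri:
            record.priority += 1
            updated += 1
    return updated
```

Calling bump_priority once per received navigation identification message makes frequently visited locations accumulate higher priority values.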
[0105] In other indexing systems according to embodiments of the
present invention, assigning priority comprises counting the number
of times a keyword from the personal search term list occurs in a
document. In other indexing systems according to embodiments of the
present invention, assigning priority comprises determining the
location of search keywords in a navigated document or web site,
assigning higher priority for keywords that occur early in the
document or web site. In these methods of assigning priority, the
priority value is derived from the characteristics of the documents
requested or sites visited rather than the behavior of a user.
Other methods of assigning priority will occur to those of skill in
the art, and all such methods are well within the scope of the
present invention.
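The two document-derived methods of assigning priority may be sketched in Python as follows. The tokenizer and the particular position score (number of words minus the index of the first occurrence) are assumptions chosen for this sketch; the disclosure requires only that earlier occurrences receive higher priority.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def priority_by_count(keyword, document_text):
    """Priority equals the number of times the keyword occurs in the document."""
    return tokenize(document_text).count(keyword.lower())


def priority_by_position(keyword, document_text):
    """Priority is higher the earlier the keyword first occurs; 0 if absent."""
    words = tokenize(document_text)
    try:
        first = words.index(keyword.lower())
    except ValueError:
        return 0
    return len(words) - first
```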
[0106] In the method of FIG. 6, the navigation identification
message 300 also includes a search keyword 315 and providing a
personal search term list further comprises storing 323 the search
keyword 315 in the personal search term list 300. Consider again
the example of an HTML anchor element effecting a hyperlink to a
document described as a `Java Portal Report`:
[0107] <a href="http://www.ibm.com/index.html">Click Here For
Java Portal Report</a>
[0108] In this example, a browser or other data communications
application is configured to transmit a navigation identification
message that includes not only the URI and a userID as described
above, but also the text string from the body of the element,
"Click Here For Java Portal Report." The fact that a user invokes
the hyperlink is taken as an expression of interest in the subject
represented by the words in the body of the hyperlink, and the
words in the body of the hyperlink therefore are transmitted to a
search portal for inclusion in the user's personal search term
list. The following is an example of a navigation identification
message represented as URI encoded data for transmission to a
search portal in an HTTP POST or GET message, including user
identification, navigation location, and search keywords from the
hyperlink:
[0109]
userid=John+Smith&location=http://www.ibm.com/index.html
[0110] &keywords=Click+Here+For+Java+Portal+Report
[0111] In typical embodiments, a personal search term list 300 is
implemented as a database table having two columns, one column for
userIDs and one for keywords. Storing 323 search keywords 315 in
such a personal search term list 300 is carried out by inserting
new records bearing the search terms and a userID. In such a
personal search term list, assuming an indexing engine that inserts
all keywords from navigation identification messages, the
navigation identification message above may result in the insertion
of six new records in a personal search term list:
UserID       Keyword
JohnSmith    Click
JohnSmith    Here
JohnSmith    For
JohnSmith    Java
JohnSmith    Portal
JohnSmith    Report
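As a hedged illustration, a search portal might decode such a navigation identification message and store its keywords along the following lines. The function name store_keywords and the list-of-tuples representation of the two-column personal search term list are assumptions of this sketch; the field names userid, location, and keywords are taken from the example message above.

```python
from urllib.parse import parse_qs


def store_keywords(message, term_list):
    """Decode URI encoded data and append one (userID, keyword) row per
    keyword to the personal search term list; return the navigation location."""
    fields = parse_qs(message)
    user_id = fields["userid"][0]
    # parse_qs decodes '+' to a space, so the keywords split on whitespace.
    keywords = fields.get("keywords", [""])[0].split()
    for keyword in keywords:
        term_list.append((user_id, keyword))
    return fields.get("location", [None])[0]
```

Applied to a message carrying the six words of the hyperlink body, this sketch appends six rows, one per keyword.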
[0112] FIG. 7 sets forth a flow chart illustrating further methods
of providing 312 a personal search term list. One method
illustrated in FIG. 7 comprises receiving 606 in a search portal
from a user a search query message comprising search criteria 604
and user identification 304 and storing 608 the search criteria in
the personal search term list. Here again is an example of URI
encoding in a search query message for search criteria `IBM` and
`Java` with userID of `tim`:
[0113] query=IBM+Java&userID=tim
[0114] In this example, storing 608 the search criteria in the
personal search term list inserts these new records in the personal
search term list:
UserID    Keyword
tim       IBM
tim       Java
[0115] The illustrated example includes authenticating 612 the
search query message. Because this kind of search query message
affects the contents of a personal search term list which in turn
affects the contents of a personalized search index which in turn
affects the search experience of a user, it is an advantage to
reduce the risk that any particular search query message will
affect the contents of a personal search term list for the wrong
user. Many indexing systems according to embodiments of the present
invention therefore authenticate search query messages by checking
the userID from a search query message against the userID in user
account records. In addition to userIDs, some systems also use
other security data such as, for example, passwords, Kerberos
tokens, digital signatures, biometric data representing retinal
scans or fingerprints, and so on as will occur to those of skill in
the art.
[0116] A further method for providing a personal search term list,
also shown on FIG. 7, includes receiving from the user and adding
614 to the personal search term list 300 a keyword selected by the
user 310 from within a document 134. FIG. 13 depicts an exemplary
GUI on a client machine running a data communication application,
more particularly, in the example of FIG. 13, a browser. The
browser of FIG. 13 is an example of a data communications
application in a client machine that is capable of providing
selected keywords to be received in a search portal and added to a
personal search term list for a user. The example browser of FIG.
13 is one that has been programmed, or modified with a plug-in, to
accept and transmit keywords selected by a user. The browser of
FIG. 13, as depicted, has been operated to point to a web site
named "SomeSearchEngine.com," as shown in the title bar of the
browser display 714. The browser of FIG. 13 includes a GUI toolbar
718 with a Back button, a Forward button, and buttons for
refreshing the display, searching, printing, and stopping web page
retrievals. The browser of FIG. 13 also includes a horizontal menu
716 containing the menu items File, Edit, View, Bookmark (sometimes
called `Favorites`), SearchOptions, Tools, and Help.
[0117] The menu entry called SearchOptions 726 is programmed to
display a menu 702 of search options operable in support of
personalized indexing and searching according to embodiments of the
present invention. The search options settable through menu 702
include user identification 750, a search portal location 752, a
priority type 754, a language preference 756, other preferences
758, and other miscellaneous search options 760. Selecting the menu
entry for user identification 750 enables a user to input through a
data entry form and store in computer memory with the browser's
other operating options and parameters a user identification for a
search portal, a user identification that may be the same as or
different from the one the user uses in the local domain or on the
client machine where the browser is running and may be the same
user identification for a search portal used by the particular user
from this browser and from other browsers on other client machines.
Similarly, selecting the menu entry for portal location 752 enables
a user to input through a data entry form and store in computer
memory with the browser's other operating options and parameters a
network address for a search portal to which navigation
identification messages are to be sent. The network address may be
implemented as, for example, a domain name for the search portal, a
URI for the search portal, a dotted decimal internet protocol
address for the search portal, or in other ways as will occur to
those of skill in the art.
[0118] The browser of FIG. 13 displays three exemplary entries 722
from a search result message generated in response to the query,
"mine geology," displayed in a query entry field 732. Each entry in
the search results includes a title 726 for the document described
by the entry, one or two lines of descriptive text 728, and a URI
identifying the document described by the entry.
[0119] The browser of FIG. 13 is configured to transmit for receipt
in a search portal keywords selected by a user from within a
document by use of text selection and GUI controls such as mouse
motions and keyboard manipulations. In particular, a
right-mouse-button-click anywhere on the display portion 724 of the
browser screen presents pull-down menu 762 comprising some of the
usual menu items for browser control, Create Shortcut, Add to
Favorites, View Source, and so on, but also presenting a new menu
item 764 labeled `Transmit Selected Text.` Highlighting text in the
display area 724, right-clicking to gain menu 762, and invoking
Transmit Selected Text 764 with, for example, a mouse-click, causes
the browser to open a TCP connection to a search portal (in this
example, the search portal identified through the `Portal Location`
item 752 on menu 702), concatenate the selected text into URI
encoded data, and transmit the selected text to the search portal
in an HTTP message, where the search portal receives and adds
keywords from the selected text to a personal search term list for
a user.
[0120] For further explanation, consider an example of a user whose
userID for the search portal is `JohnSmith.` JohnSmith selects the
text in the description line 728 on the browser screen of FIG. 13
with a mouse-click-and-drag, right-clicks on the display area 724,
and then selects `Transmit Selected Text` 764 from menu 762. The
browser then transmits to the search portal in an HTTP message the
following URI encoded data:
[0121]
userid=JohnSmith&keywords=geochemistry+geomorphology+and+planetary+sciences
[0122] The search portal receives the URI encoded keywords,
extracts them from the HTTP message, and adds them as entries with
the userID to a personal search term list for a user, resulting in
the following new entries in JohnSmith's personal search term
list:
UserID       Keyword
JohnSmith    geochemistry
JohnSmith    geomorphology
JohnSmith    and
JohnSmith    planetary
JohnSmith    sciences
[0123] Readers of skill in the art will notice that not much search
power is added by including `and` in a personal search term list.
Many indexing systems according to embodiments of the present
invention exclude certain frequently occurring terms both from
personal search term lists and from personalized search indexes,
such as, for example, `the,` `a,` `an,` and the like. For
clarity of explanation, however, and not as a limitation of the
invention, the examples in this disclosure simply include all
identified keywords in indexes and in personal search term
lists.
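The exclusion of frequently occurring terms may be sketched in Python as follows. The contents of STOP_WORDS and the function name filter_keywords are assumptions made for illustration; an actual embodiment would choose its own list of excluded terms.

```python
# Hypothetical exclusion list; the disclosure names `the,` `a,` and `an`
# as examples and discusses `and` above.
STOP_WORDS = frozenset({"the", "a", "an", "and"})


def filter_keywords(keywords):
    """Return the keywords with frequently occurring terms removed."""
    return [kw for kw in keywords if kw.lower() not in STOP_WORDS]
```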
[0124] A further method for providing a personal search term list,
also shown on FIG. 7, includes making the personal search term list
300 available to the user for editing 614. Making a personal search
term list available for editing may be carried out by any means of
editing data in tables as will occur to those of skill in the art,
including, for example, presenting the contents of a personal
search term list through a CGI script or servlet in a <FORM>
element in an HTML document for editing directly through the screen
of a user's browser, where the user can then directly insert new
keywords, delete keywords no longer of interest, or edit existing
keywords in the user's personal search term list.
[0125] For further explanation, FIG. 8 sets forth a flow chart
illustrating an exemplary method of inserting (320 on FIG. 6) index
records in a personalized search index that includes retrieving 772
a document from a navigation location and indexing 774, in the
personalized search index, a navigation location, keywords from the
personal search term list that occur in the retrieved document, and
a time stamp indicating when the navigation identification message
was received. Indexing 774 a navigation location, keywords, and a
time stamp advantageously includes reading 794 from a system clock
the date and time when a navigation identification message is
received. Indexing 774 a navigation location, keywords, and a time
stamp advantageously is carried out when a navigation
identification message is received from a user (316 on FIG. 6). In
the example of the world wide web as a distributed processing
system, a navigation location 314 in a navigation identification
message 300 is typically implemented as a URI identifying a web
document such as an HTML document, a web page, or a CGI script or
servlet that will dynamically assemble and deliver a web page or
document. The exemplary method of FIG. 8 then includes retrieving a
web document identified by the location URI in a navigation
identification message and, to the extent that the web document
includes keywords that are also in the personal search term list
300 for the user identified by the userID 304 in the navigation
identification message, inserting into a personalized search index
500 a new record for each such keyword. The new records have
structure, for example, like that shown in FIG. 5, including the
keywords 570, the userID 572, the URI where is found the document
containing each keyword, and optionally a priority rating 574. If
an index record already exists for a particular combination of
keyword, userID, and URI, then the method optionally includes
taking other action, such as, for example, incrementing a priority
value.
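A minimal Python sketch of this indexing step follows. It assumes that the document has already been retrieved and is passed in as text, that index records are plain dictionaries, and that the caller supplies the time read from the system clock; these choices, and the function name index_document, are illustrative assumptions rather than the disclosed implementation.

```python
import re
from datetime import datetime


def index_document(index, user_id, uri, document_text, personal_terms, received_at):
    """Insert one index record for each personal search term that occurs in
    the document; if a record already exists for the same keyword, userID,
    and URI, increment its priority instead of inserting a duplicate."""
    words = set(re.findall(r"[a-z0-9]+", document_text.lower()))
    for term in personal_terms:
        if term.lower() not in words:
            continue  # the term does not occur in this document
        existing = next(
            (r for r in index
             if r["keyword"] == term and r["user_id"] == user_id and r["uri"] == uri),
            None,
        )
        if existing is not None:
            existing["priority"] += 1
        else:
            index.append({
                "keyword": term,
                "user_id": user_id,
                "uri": uri,
                "timestamp": received_at,  # when the message was received
                "priority": 0,
            })
```

In use, received_at would typically be datetime.now() taken at the moment the navigation identification message arrives.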
[0126] FIG. 8 illustrates a further method for inserting 320 index
records 318 in a personalized search index 500. In this example,
the navigation identification message 300 contains a search keyword
315 and inserting 320 index records 318 in a personalized search
index 500 further comprises indexing 776 the search keyword 315
with the navigation location 314 and a time stamp in the
personalized search index. The time stamp records the date and time
when the corresponding navigation identification message 300 is
received. The time stamp has, for example, the format shown at
reference 578 in FIG. 5. In this example, in the method of FIG. 8,
indexing the keyword with the navigation location and a time stamp
includes reading 777 the date and time for the time stamp from a
system clock 779.
[0127] Consider again the example of an HTML anchor element
effecting a hyperlink to a document described as a `Java Portal
Report`:
[0128] <a href="http://www.ibm.com/index.html">Click Here For
Java Portal Report</a>
[0129] In this example, a browser or other data communications
application is configured to transmit a navigation identification
message that includes the URI, a userID, and the text string from
the body of the hyperlink: "Click Here For Java Portal Report." The
fact that a user invokes the hyperlink is taken as an expression of
interest in the subject represented by the words in the body of the
hyperlink, and the words in the body of the hyperlink therefore are
transmitted to a search portal for inclusion in the user's
personalized search index. The following is an example of a
navigation identification message represented as URI encoded data
for transmission to a search portal in an HTTP POST or GET message,
including user identification, navigation location, and search
keywords from the hyperlink:
[0130]
userid=John+Smith&location=http://www.ibm.com/index.html
[0131] &keywords=Click+Here+For+Java+Portal+Report
[0132] In typical embodiments, a personalized search index 500 is
implemented as a database table having columns such as those
illustrated in FIG. 5 for keywords 570, userIDs 572, URIs 576, and
time stamps 578; other columns may include priority values, titles
of documents, descriptive text, and so on as will occur to those of
skill in the art. According to the illustrated method from FIG. 8,
therefore, indexing 776 the search keyword 315 with the navigation
location 314 and a time stamp in the personalized search index may
be carried out, for example, by extracting the keywords from their
URI encoding in an HTTP message and adding them in new records,
along with userID, URI, time stamp derived from system clock time,
and so on, to a personalized search index, one new record for each
new keyword.
Personalized Navigation
[0133] FIG. 9 sets forth a flow chart illustrating an exemplary
method of operating a history navigation engine 335 advantageously
in dependence upon a personalized search index. More particularly,
FIG. 9 illustrates an exemplary method of searching for information
in a distributed data processing system according to personalized
navigation history. As mentioned earlier, personalized navigation
history is taken in this disclosure as a personalized search index
bearing time stamps, thereby indicating the order in which
locations (URLs, URIs, and so on) in the index were traversed or
navigated by a user.
[0134] The method of FIG. 9 includes receiving 402 in a search
portal from a user 310 a navigation request message 404 comprising
a navigation direction 406. In the example of the web as a
distributed data processing system, a navigation request message is
typically implemented as an HTTP request message, such as an HTTP
`GET` message. In the example of the web, receiving 402 a
navigation request message 404 in a search portal comprises
operating the search portal as a web server and receiving an HTTP
request message from a user 310 through a browser communicating
across the web. Exemplary values of navigation direction 406 are
`Forward` and `Back.` `Forward` encodes a request for a document
identified by a location 576 (such as a URL or URI) in an index
record in a personalized search index 500 having a time stamp value
later than the current value of a last navigation time stamp 307 in
a corresponding user account record 610. `Back` encodes a request
for a document identified by a location 576 (such as a URL or URI)
in an index record in a personalized search index 500 having a time
stamp value earlier than the current value of a last navigation
time stamp 307 in a corresponding user account record 610.
[0135] The browser of FIG. 13, for example, is modified, either in
its source code or by way of a plug-in, to support navigation
direction against a personalized search index or a subset of a
personalized search index in a search portal according to
embodiments of the present invention by sending navigation request
messages 404 in response to invocations of its Back button 761 and
its Forward button 763. Alternatively, a search portal may support
navigation controls implemented as hyperlinks, labeled, for
example, `Back` and `Forward,` in a web page through which a user
may navigate against a personalized search index or a subset of
one. No doubt other ways of directing navigation against a
personalized search index will occur to those of skill in the art,
and all such ways are well within the scope of the present
invention.
[0136] In this example, the capability of sending navigation
request messages is modal. That is, the mode in which navigation
request messages are transmitted in response to operation of the
Back and Forward buttons is invoked by selecting the menu item `Nav
Personal` 768. Selecting `Nav Browser` 766 returns the browser's
Back and Forward buttons to normal operations against the browser's
own local memory. When the browser is in the mode for transmitting
navigation request messages and its Forward button 763 is invoked,
the browser URI encodes navigation direction as `Forward` and
transmits it in a navigation request message as, for example:
[0137] userid=JohnSmith&navDir=Forward
[0138] Similarly, when in that mode and the Back button 761 is
invoked, the browser URI encodes navigation direction as `Back` and
transmits it in a navigation request message as, for example:
[0139] userid=JohnSmith&navDir=Back
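By way of illustration, the URI encoding of a navigation request message might be produced and decoded as in the following Python sketch. The function names are assumptions; the field names userid and navDir follow the examples above.

```python
from urllib.parse import parse_qs, urlencode

VALID_DIRECTIONS = ("Forward", "Back")


def encode_navigation_request(user_id, direction):
    """Build the URI encoded body of a navigation request message."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError("navigation direction must be 'Forward' or 'Back'")
    return urlencode({"userid": user_id, "navDir": direction})


def decode_navigation_request(message):
    """Recover (userID, navigation direction) on the search portal side."""
    fields = parse_qs(message)
    return fields["userid"][0], fields["navDir"][0]
```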
[0140] The method of FIG. 9 includes creating 408, in dependence
upon the personalized search index 500, the navigation direction
406, and a last navigation time stamp 307, a response 410 to the
navigation request message 404. In the example of the web as a
distributed data processing system, a response 410 to a navigation
request message 404 is typically implemented as an HTTP response
message. Creating 408 a response 410 to the navigation request
message 404 is carried out in dependence upon the personalized
search index 500, the navigation direction 406, and a last
navigation time stamp 307 in this sense: If the navigation
direction is `Forward,` the method comprises searching through the
index records in the personalized search index for the first index
record having a time stamp value later than the value of the last
navigation time stamp 307 in the corresponding user account 610.
`Corresponding user account` in this context is the user account
610 bearing the same userID 305 as the userID 329 in the navigation
request message 404. If the navigation direction is `Back,` the
method comprises searching through the index records in the
personalized search index for the first index record having a time
stamp value earlier than the value of the last navigation time
stamp 307 in the corresponding user account 610. In the method of
FIG. 9, therefore, creating 408 a response includes retrieving 802,
from user account data 610, a last navigation time stamp 307.
[0141] In the method of FIG. 9, creating (408) a response includes
retrieving (804) a navigation location (576) from the personalized
search index (500) in dependence upon the last navigation time
stamp (307) from the user account data and the navigation direction
(406). Having found an index record according to the navigation
direction 406 and the last navigation time stamp 307, the method
includes retrieving from that record the location of a document
from which the keyword in that record was indexed, retrieving 804
the document itself, and including the document itself in the
response message 410. That is, in the method of FIG. 9, creating
408 a response includes retrieving 804 the document 806 identified
by the navigation location. In the example of the web, the location
of the document is typically a URL or URI, and the document itself
typically is an HTML document. In retrieving 804 the document, the
search portal effectively switches hats for a moment and operates
as a browser, transmitting an HTTP `GET` message across an internet
102 to a web server 128 and waiting for a returning HTTP response
message containing the document. The search portal in this example
then switches back to web server operation, in effect, and
transmits 412 the response to the user 310. That is, the search
portal in this example incorporates the document 806 into the
response 410 and transmits to the user the entire response
including the document.
[0142] The method of FIG. 9 also includes updating 416 the last
navigation time stamp 307. Updating 416 the last navigation time
stamp is carried out by storing in the last navigation time stamp
307 field in the user account record 610 the value of the time
stamp on the index record in the personalized search index 500 from
which was retrieved the location 576 of the document 806 returned
to the user 310 in the response 410.
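The selection of an index record by navigation direction and the updating of the last navigation time stamp may be sketched in Python as follows. The dictionary layouts for index records and user accounts and the function name navigate are assumptions of this sketch, and retrieval of the document itself is omitted; the function returns only the navigation location.

```python
from datetime import datetime


def navigate(index, account, direction):
    """Return the location from the first record later ('Forward') or
    earlier ('Back') than the account's last navigation time stamp, and
    store that record's time stamp in the account. Return None, leaving
    the account unchanged, when no such record exists."""
    last = account["last_nav_ts"]
    own = [r for r in index if r["user_id"] == account["user_id"]]
    if direction == "Forward":
        later = [r for r in own if r["timestamp"] > last]
        record = min(later, key=lambda r: r["timestamp"], default=None)
    elif direction == "Back":
        earlier = [r for r in own if r["timestamp"] < last]
        record = max(earlier, key=lambda r: r["timestamp"], default=None)
    else:
        raise ValueError("navigation direction must be 'Forward' or 'Back'")
    if record is None:
        return None
    account["last_nav_ts"] = record["timestamp"]  # update the last navigation time stamp
    return record["uri"]
```

Because the account is updated on every successful call, repeated `Back` requests walk progressively further into the user's navigation history.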
[0143] For further explanation, consider an example of the
exemplary personalized search index in FIG. 10 and a navigation
request message received in a search portal from a user having
userID `mike,` where the navigation direction in the navigation
request message is `Forward,` and the value of the last navigation
time stamp in mike's user account is "3/2/03 2105:16.3." In this
example, the search portal retrieves a document location (taken as
a URI) from the first index record having a time stamp later than
"3/2/03 2105:16.3." That is, in this example, the search portal
retrieves a URI from record 660. The URI is shown for purposes of
explanation merely as a domain name `www.new.com.` As a practical
matter, readers of skill in the art will realize that the URI may
include a particular filename or pathname as well as an Internet
service identification, such as, for example:
"http://www.new.com/index.html." The search portal then retrieves
an HTML document identified by the URI, concatenates the document
into an HTTP response message, and transmits the HTTP response
message to the requesting user, `mike.`
[0144] In some embodiments of the method of FIG. 9, the navigation
request message 404 further comprises a navigation interval 414,
and retrieving 804 a navigation location from the personalized
search index further comprises retrieving a navigation location
from the personalized search index in dependence upon the last
navigation time stamp from the user account data, the navigation
direction, and the navigation interval. The navigation interval 414
may be implemented, for
example, as an integer representation of a number of time stamp
values to skip in retrieving a search index record from which a
document location is to be taken. In the browser of FIG. 13, for
example, a navigation interval is set in the browser by invoking
menu item `Nav Interval` 770 which prompts the user to enter a
navigation interval. The browser then inserts the navigation
interval so entered into each navigation request message sent by
the browser.
[0145] For further explanation of the use of navigation intervals
414, consider an example of the exemplary personalized search index
in FIG. 10 and a navigation request message received in a search
portal from a user having userID `mike,` where the navigation
direction in the navigation request message is `Back,` the value of
the last navigation time stamp in mike's user account is "3/2/03
2105:16.3," and the value of the navigation interval in the
navigation request message is `3.` In this example, the search
portal retrieves a URI from the third index record having a time
stamp earlier than "3/2/03 2105:16.3." That is, in this example,
the search portal retrieves a URI from record 652. The search
portal then retrieves an HTML document identified by the URI (taken
in this example as "http://www.ibm.com/index.html."), concatenates
the document into an HTTP response message, and transmits the HTTP
response message to the requesting user, `mike.`
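One possible reading of the navigation interval may be sketched in Python as follows. The sketch assumes that the interval counts distinct time stamp values, so that several index records created by one navigation count once; the disclosure does not settle this point, and the function name select_timestamp is hypothetical. Any mutually comparable values may serve as time stamps.

```python
def select_timestamp(timestamps, last, direction, interval=1):
    """Return the interval-th distinct time stamp later ('Forward') or
    earlier ('Back') than the last navigation time stamp, or None if the
    index does not reach that far."""
    distinct = sorted(set(timestamps))
    if direction == "Forward":
        candidates = [t for t in distinct if t > last]
    elif direction == "Back":
        candidates = [t for t in reversed(distinct) if t < last]
    else:
        raise ValueError("navigation direction must be 'Forward' or 'Back'")
    if interval < 1 or interval > len(candidates):
        return None
    return candidates[interval - 1]
```

The navigation location is then taken from an index record bearing the selected time stamp, and the last navigation time stamp is updated to that value.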
Subset Playback
[0146] FIG. 11 sets forth a flow chart illustrating a further
method of searching for information in a distributed data
processing system according to personalized navigation history. The
method of FIG. 11 includes creating 930 a subset of the
personalized search index and making the subset available to users
for remote playback 932. More particularly, in the example of FIG.
11, creating a subset of a personalized search index includes
identifying start and end points 906 and establishing a subset
identification for the new subset. In typical embodiments, start
and end points are time stamp values identifying a range of time
stamp values on index records to be extracted from a personalized
search index to form a subset. Identifying start and end points may
be carried out by prompting a user to enter them or select them
from a display.
[0147] The exemplary browser of FIG. 13, for example, supports a
pull-down menu 762 comprising an entry labeled `New Subset` 790.
The software function invoked by the menu item `New Subset` 790 is
added to the browser at the source code level or through a plug-in.
The software function invoked by the menu item `New Subset` 790,
when invoked with a mouse-click, for example, may prompt a user for
a start point for a new subset, an end point for a new subset, and
a display name for a new subset, concatenate the start and end
points and the display name into URI encoded data, and transmit the
URI encoded data to a search portal along with the user's userID to
establish a new subset for the user. Such URI encoding of start and
end points for a new subset may be implemented, for example,
according to the following example:
[0148] userid=JohnSmith&startPoint=06/1/03+0700:15.1
[0149]
&endPoint=3/3/03+0910:07.8&displayName=My+New+Subset
[0150] The URI encoded data may be transmitted to a URI identifying
a CGI script or a servlet in a search portal designated for
creating new subsets. Alternatively, rather than altering a
browser, an HTML form from the search portal itself, logged onto as
a web page, may prompt a user to enter start and end points and a
display name for a new subset. Other ways of establishing start and
end points and display names for subsets will occur to those of
skill in the art, and all such ways are well within the scope of
the present invention.
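By way of illustration only, the concatenation of start and end
points and a display name into URI encoded data may be sketched as
follows. The function name encode_new_subset_request is
hypothetical and is not part of the disclosed embodiments; the
field names are taken from the example above. The sketch assumes
that slashes and colons in time stamps are left unescaped, as in
that example.

```python
from urllib.parse import urlencode


def encode_new_subset_request(user_id, start_point, end_point, display_name):
    """Concatenate the fields of a 'New Subset' request into URI encoded data.

    Hypothetical helper; the field names follow the example in the text.
    """
    fields = {
        "userid": user_id,
        "startPoint": start_point,
        "endPoint": end_point,
        "displayName": display_name,
    }
    # safe="/:" keeps the slashes and colons of the time stamps unescaped
    # (an assumption made to match the example); spaces are encoded as '+'.
    return urlencode(fields, safe="/:")
```

A browser function or plug-in of the kind described above could
append the returned string to the URI of the CGI script or servlet
designated for creating new subsets.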
[0151] In the example of FIG. 11, creating 930 a subset of a
personalized search index includes establishing 908 a subset
identification for a new subset. A display name 915, described
above, may be taken as a subset identification, and establishing a
subset identification does typically include prompting for, or
otherwise establishing a display name for a new subset, so as to
have some readable text identifying the subset in a way that is
meaningful to human beings for use in menu displays and the like.
Establishing a subset identification also typically includes,
however, creating a record in a subset identification table to
represent the new subset, encoding a unique identifier for the new
subset, and storing the identifier in the subset identification
record as shown, for example, at references 901 and 910 in FIG. 11.
More particularly, subset identification table 901 comprises subset
identification records, each of which represents a subset of a
personalized search index, and each of which comprises, for a
particular subset, a userID 909, a subsetID 910, a start point 912,
an end point 914, and a display name 915.
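A subset identification record of the kind just described may be
sketched, for purposes of illustration only, as follows. The class
and function names are hypothetical, the use of a random UUID as
the unique identifier is an assumption (the disclosure does not
specify an encoding), and time stamps are represented here as
Python datetime values for convenience.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SubsetIdentificationRecord:
    """One row of a subset identification table (names are illustrative)."""
    user_id: str            # userID 909
    subset_id: str          # subsetID 910
    start_point: datetime   # start point 912
    end_point: datetime     # end point 914
    display_name: str       # display name 915


def establish_subset_identification(table, user_id, start_point, end_point,
                                    display_name):
    """Create a record for a new subset and store it in the table (a list)."""
    if start_point > end_point:
        raise ValueError("start point must not be later than end point")
    # Assumption: a random UUID serves as the unique subset identifier.
    record = SubsetIdentificationRecord(
        user_id, uuid.uuid4().hex, start_point, end_point, display_name
    )
    table.append(record)
    return record
```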
[0152] In the example of FIG. 11, creating 930 a subset of a
personalized search index includes copying 916 index records from a
personalized search index and inserting them into a subset index
table 900. The structure of the subset index table 900, in this
example, is the same as the structure of the personalized search
index from which its contents are copied, that is, the same
structure comprising keywords 570, userIDs 572, document locations
576, and time stamps 578 as illustrated in FIGS. 5 and 10 and
discussed in detail above in this disclosure. The extraction of
records from the personalized search index for insertion into a
subset index table is characterized as `copying` 916 to denote that
the original records, according to typical embodiments of the
present invention, are not removed or deleted from the personalized
search index, so that operations against the personalized search
index may continue normally.
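The copying step may be sketched, by way of example and not
limitation, as follows. The names IndexRecord and copy_into_subset
are hypothetical. The sketch assumes that the start and end points
are inclusive and that each subset is held in its own list; the
disclosure does not specify either detail.

```python
import copy
from dataclasses import dataclass
from datetime import datetime


@dataclass
class IndexRecord:
    """Same structure as a personalized search index record (illustrative)."""
    keyword: str          # keyword 570
    user_id: str          # userID 572
    location: str         # document location 576
    time_stamp: datetime  # time stamp 578


def copy_into_subset(personalized_index, user_id, start_point, end_point):
    """Copy a user's records whose time stamps fall within the given range.

    The originals are copied, not removed, so that operations against the
    personalized search index may continue normally.
    """
    selected = [
        copy.copy(record)
        for record in personalized_index
        if record.user_id == user_id
        and start_point <= record.time_stamp <= end_point  # inclusive: assumption
    ]
    selected.sort(key=lambda record: record.time_stamp)
    return selected
```

Because each selected record is copied, a later update to a time
stamp in the personalized search index leaves the corresponding
record in the subset unchanged.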
[0153] Copying 916 personalized search index records into a subset
index table 900 has the effect of `freezing` them, rendering their
overall status static. Records residing in a personalized search
index have a dynamic quality in that the values of their time
stamps vary. More particularly, it is typical for indexing engines,
when inserting index records into a personalized search index in
response to navigation identification messages, to find that a
record for a user for a location already exists and therefore
simply to update the time stamp on the record rather than creating
a new index record. In this way, the current time stamp on the
record reflects the last time the location identified in the record
was visited by the user, but the position of the record in an index
or sort according to time stamp changes every time the location
identified in the record is visited by the user identified in the
record.
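The dynamic quality of records in a personalized search index may
be illustrated by the following sketch, which is offered as an
assumption-laden example rather than as the disclosed indexing
engine. The function names are hypothetical, and the index is
modeled as a dictionary keyed on keyword, userID, and location.

```python
from datetime import datetime


def record_navigation(index, keyword, user_id, location, visited_at):
    """Insert an index record, or update the time stamp of an existing one.

    `index` maps (keyword, user_id, location) to a time stamp. Returns True
    if a new record was created and False if only the time stamp changed.
    """
    key = (keyword, user_id, location)
    is_new = key not in index
    index[key] = visited_at
    return is_new


def locations_by_time(index, user_id):
    """List a user's locations in a sort according to time stamp."""
    ordered = sorted(index.items(), key=lambda item: item[1])
    return [loc for (_, uid, loc), _ in ordered if uid == user_id]
```

Revisiting a location does not add a record, but it does move the
existing record to the end of a sort on time stamp.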
[0154] Time stamps in subset index tables 900, according to
embodiments of the present invention, typically are not updated as
a user navigates through a distributed data processing system. When
a subset of records is copied from a personalized search index into
a subset index table, therefore, their relations among one another,
in terms of a sort on time stamp, are frozen. User navigation no
longer affects them, so that they can be accessed with `Back` and
`Forward` navigation instructions in the same sequence at any time
and as often as desired. Again using the web as an example, users
wishing to do so, therefore, may identify an interesting or useful
series of web sites or web documents accessible to browsers, note
the start time, traverse the sites in a desired sequence on a
browser that sends navigation identification messages to a search
portal according to embodiments of the present invention, and
create a subset using the noted start time and end time as start
and end points for the subset, and thereby record the series of web
sites or web documents in a fixed sequence that can be accessed in
that sequence repeatedly at any time in the future by the user or,
with the user's permission, depending on the security arrangements
of a particular search portal, by other users also.
[0155] In the example of FIG. 11, making 932 a subset available to
users for remote playback includes prompting for a subset
identification 918. Again the prompt may take the form of an HTML
form on a web page from a search portal that supports subsetting.
Or the prompt may be had from a browser's menu item like the one
labeled `Nav subset` 792 on FIG. 13. Invoking `Nav subset` 792
presents a list (not shown) of display names 915 of subsets
previously created by the user, each display name associated with a
subsetID 910. When the user selects one of the display names, the
method of FIG. 11 continues by setting 920 a current subsetID field
902 in the user account 610 with the value of the associated
subsetID 910.
[0156] FIG. 12 sets forth a flow chart illustrating an exemplary
method of operating a history navigation engine advantageously in
dependence upon subsets of a personalized search index. The
example of FIG. 12 particularly illustrates the effect of setting a
non-null value in a current subsetID field 902 in a user account
record 610. In the example of FIG. 12, when a navigation request
message 404 is received 402 from a user, all as described in detail
above in this specification, the method includes determining
whether the user account record 610 for the user identified in the
navigation request message 329 possesses a null current subsetID
902. If the current subsetID 902 is null, then processing of the
navigation request message continues against the personalized
search index as shown and described in connection with the method
illustrated in FIG. 9.
[0157] If the current subsetID 902 is non-null, however, meaning
that the user has selected a subset and caused a subsetID to be set
in the current subsetID 902, then processing in the example of FIG.
12 continues with retrieving a last navigation time stamp 307 from
the user account 610, and, if the navigation direction 406 is
`Forward,` retrieving 804 and transmitting 412 in a response 410
the document 806 identified in the first index record in the subset
index table 900 having a time stamp value later than the value of
the last navigation time stamp 307. The last navigation time stamp
is then updated 416 with the time stamp value from the first index
record in the subset index table 900 having a time stamp value
later than the value of the last navigation time stamp 307. If the
navigation direction 406 is `Back,` processing in this example
includes retrieving 804 and transmitting 412 in a response 410 the
document 806 identified in the first index record in the subset
index table 900 having a time stamp value earlier than the value of
the last navigation time stamp 307, and the last navigation time
stamp is updated 416 with the time stamp value from the first index
record in the subset index table 900 having a time stamp value
earlier than the value of the last navigation time stamp 307. In
this way, when a user has selected a subset, navigation directions
effect navigation among documents identified by index records in
the subset index table rather than a personalized search index.
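The `Forward` and `Back` processing described above may be
sketched as follows, for purposes of illustration only. All names
are hypothetical. The sketch assumes that the `first` record
earlier or later than the last navigation time stamp is the one
nearest to it in time, returns a document location rather than
retrieving and transmitting the document, and, for brevity, applies
the same stepping logic to the personalized search index when the
current subsetID is null.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class IndexRecord:
    location: str         # document location
    time_stamp: datetime  # time stamp


@dataclass
class UserAccount:
    user_id: str
    last_navigation_time_stamp: datetime
    current_subset_id: Optional[str] = None  # None models a null subsetID


def navigate(account, direction, personalized_index, subset_tables):
    """Return the next location for a navigation direction, or None."""
    # A non-null current subsetID selects the subset index table.
    if account.current_subset_id is None:
        records = personalized_index
    else:
        records = subset_tables[account.current_subset_id]

    last = account.last_navigation_time_stamp
    if direction == "Forward":
        later = [r for r in records if r.time_stamp > last]
        target = min(later, key=lambda r: r.time_stamp, default=None)
    elif direction == "Back":
        earlier = [r for r in records if r.time_stamp < last]
        target = max(earlier, key=lambda r: r.time_stamp, default=None)
    else:
        raise ValueError("navigation direction must be 'Forward' or 'Back'")

    if target is None:
        return None  # no further record in that direction
    # Update the last navigation time stamp from the record just used.
    account.last_navigation_time_stamp = target.time_stamp
    return target.location
```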
[0158] To return to navigation against a personalized search index
rather than against a subset, the user returns to null the value of
the current subsetID 902 in the user account 610. The user may null
the current subsetID 902 by invoking, for example, a browser menu
item such as the one labeled `Nav Personal` 768, which among other
things, formulates and transmits to the search portal an HTTP
message bearing a URI encoded instruction to null the current
subsetID field 902 in the user account 610, such as, for
example:
[0159] userid=JohnSmith&currentSubsetID=null
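How a search portal might act on such an instruction may be
sketched as follows. This is an illustrative assumption, not the
disclosed implementation: the function name is hypothetical, user
accounts are modeled as a dictionary of dictionaries, and the
literal string `null` in the currentSubsetID field is taken to mean
that the field is to be nulled.

```python
from urllib.parse import parse_qs


def apply_subset_instruction(accounts, uri_encoded_data):
    """Set or null the current subsetID field of the identified user account."""
    fields = parse_qs(uri_encoded_data)
    user_id = fields["userid"][0]
    value = fields["currentSubsetID"][0]
    # Assumption: the literal text "null" requests a null current subsetID.
    accounts[user_id]["currentSubsetID"] = None if value == "null" else value
    return accounts[user_id]
```

After the field is nulled, subsequent navigation request messages
from the user are again processed against the personalized search
index.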
[0160] It will be understood from the foregoing description that
modifications and changes may be made in various embodiments of the
present invention without departing from its true spirit. The
descriptions in this specification are for purposes of illustration
only and are not to be construed in a limiting sense. The scope of
the present invention is limited only by the language of the
following claims.
* * * * *
References