U.S. patent application number 10/224471, for an apparatus for retrieving and presenting digital data, was filed with the patent office on 2002-08-21 and published on 2003-03-06.
This patent application is currently assigned to Communications Res. Lab., Ind Admin. Inst. Invention is credited to Yamazaki, Tatsuya.
Application Number | 20030046269 10/224471 |
Family ID | 19085297 |
Filed Date | 2002-08-21 |
United States Patent
Application |
20030046269 |
Kind Code |
A1 |
Yamazaki, Tatsuya |
March 6, 2003 |
Apparatus for retrieving and presenting digital data
Abstract
An apparatus for retrieving and presenting digital data
includes a network that includes data archives containing digital
data, a terminal that can be connected to the network, a retrieval
device that retrieves digital data, using a database of digital
data retrieval information including information added to each
group of digital data that can be provided over the network, and a
communication quality determination device that determines a
quality of communication between the terminal and a data archive
containing digital data extracted by the retrieval device based on
search conditions specified by a user via the terminal. Digital
data sorted into an order in accordance with a priority specified
by the user is downloaded to a user terminal.
Inventors: |
Yamazaki, Tatsuya;
(Koganei-shi, JP) |
Correspondence
Address: |
OBLON SPIVAK MCCLELLAND MAIER & NEUSTADT PC
FOURTH FLOOR
1755 JEFFERSON DAVIS HIGHWAY
ARLINGTON
VA
22202
US
|
Assignee: |
Communications Res. Lab., Ind
Admin. Inst
Tokyo
JP
|
Family ID: |
19085297 |
Appl. No.: |
10/224471 |
Filed: |
August 21, 2002 |
Current U.S.
Class: |
1/1 ;
707/999.001; 707/E17.108 |
Current CPC
Class: |
Y10S 707/99931 20130101;
Y10S 707/99933 20130101; Y10S 707/99934 20130101; G06F 16/951
20190101 |
Class at
Publication: |
707/1 |
International
Class: |
G06F 007/00 |
Foreign Application Data
Date |
Code |
Application Number |
Aug 28, 2001 |
JP |
2001-257674 |
Claims
What is claimed is:
1. An apparatus for retrieving and presenting digital data,
comprising: a network that includes a data archive containing a
plurality of digital data; a terminal that can be connected to the
network; retrieval means that retrieves digital data, using a
database of digital data retrieval information comprising
predetermined item information added to each digital data item that
can be presented over the network; communication quality
determination means that determines a quality of communication
between the terminal and a data archive containing digital data
extracted by the retrieval means based on search conditions
specified by a user via the terminal; and information presentation
means that presents digital data sorted into an order in accordance
with a priority specified by the user, based on item information
and communication quality relating to each digital data group
extracted by the retrieval means.
2. The apparatus according to claim 1, wherein information on
classes of digital data that can be handled by the terminal is
stored in the information presentation means, whereby digital data
extracted by retrieval of the retrieval means that cannot be
handled by the terminal are excluded from information that is
presented.
Description
BACKGROUND OF THE INVENTION
[0001] 1. Field of the Invention:
[0002] The present invention relates to an apparatus for retrieving
and presenting digital data in response to a search for desired
digital data by a user using a terminal connected to a network that
has digital archives of multimedia data including text, voice,
still images and video.
[0003] 2. Description of the Prior Art:
[0004] A search engine provided on a network, such as the Internet,
is usually employed to retrieve digital data provided on the
network, using terminals connected to the network. While search
engines use various search and retrieval techniques, basically a
search engine searches for information that exactly matches, or
partially matches, keywords that a user inputs via the terminal.
The search engine extracts the uniform resource locators (URLs) of
content items that match the search criteria and presents the
results to the user, organized into a certain order.
[0005] The above type of retrieved information presentation
apparatus therefore only retrieves information based on matching of
keywords input by the user, and does not take into account the
volume of the extracted content, the quality of the network between
the terminal and a digital archive including the content, and
whether the terminal performance can handle the presentation or
playback of the content concerned. It is therefore possible that
the quality of the retrieved information obtained by the user may
be low. Thus, users are not always satisfied with such
apparatuses.
[0006] An object of the present invention is to provide an
apparatus for retrieving and presenting digital data that takes
communication quality into consideration and presents the digital
data retrieval results promptly, in response to a user request.
SUMMARY OF THE INVENTION
[0007] To attain the above object, the present invention provides
an apparatus for retrieving and presenting digital data,
comprising:
[0008] a network that includes a data archive containing a
plurality of digital data;
[0009] a terminal that can be connected to the network;
[0010] retrieval means that retrieves digital data, using a
database of digital data retrieval information comprising
predetermined item information added to each digital data item that
can be presented over the network;
[0011] communication quality determination means that determines a
quality of communication between the terminal and a data archive
containing digital data extracted by the retrieval means based on
search conditions specified by a user via the terminal; and
[0012] information presentation means that presents digital data
sorted into an order in accordance with a priority specified by the
user, based on item information and communication quality relating
to a group of digital data extracted by the retrieval means.
[0013] The above apparatus can also be one in which
information on classes of digital data that can be handled by the
terminal is stored in the information presentation means, whereby
digital data extracted by retrieval of the retrieval means that
cannot be handled by the terminal are excluded from the presented
information.
[0014] Providing the apparatus according to the present invention
with the means of determining the communication quality between a
digital data archive and the terminal enables the apparatus to
promptly present digital data search results in response to a user
request.
[0015] Further features of the invention, its nature and various
advantages will be more apparent from the accompanying drawings and
following detailed description of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] FIG. 1 shows the general configuration of an apparatus for
retrieving and presenting digital data according to the present
invention.
[0017] FIG. 2 shows information attached to the respective
media.
[0018] FIG. 3 is a user interface window image.
[0019] FIG. 4 shows the quality of service (QoS) scenario
derivation process.
DESCRIPTION OF THE PREFERRED EMBODIMENT
[0020] FIG. 1 shows the configuration of an apparatus for
retrieving and presenting digital data according to the present
invention. A network 1, such as the Internet, includes a plurality
of data archives, for example first data archive 21, second data
archive 22, . . . and N-th data archive 2N. The first to N-th data
archives 21 to 2N store digital data. For example, first data
archive 21 stores media files $m_{11}$, $m_{12}$, $m_{13}$, . . . ,
second data archive 22 stores media files $m_{21}$, $m_{22}$,
$m_{23}$, . . . , N-th data archive 2N stores media files $m_{N1}$,
$m_{N2}$, $m_{N3}$, . . . , and so on.
[0021] A user terminal 3, which is a terminal device that can be
connected to the network 1, has search functions in the form of an
application. Based on search conditions specified by a terminal
user, digital data are extracted from the first to N-th data
archives 21 to 2N and the information is presented in the order
requested by the user. A content search section 3a, QoS measurement
section 3b and QoS scenario derivation section 3c provided by the
application on the terminal 3 will now be described.
[0022] The content search section 3a functions as a retrieval means
that retrieves digital data, using a database of digital data
retrieval information comprising predetermined item information
added to each digital data item that can be presented over the
network, and extracting data that correspond to the search criteria
from the digital data media files stored in the data archives 21 to
2N. The media files $m_{jk}$ (j and k are natural numbers; $m_{jk}$
denotes the k-th digital data item in the j-th archive 2j) are tagged
with keywords, volume, type and format as attached information.
[0023] "Keywords" are natural-language words that express the
features of each media file $m_{jk}$, "volume" is the size of each
file $m_{jk}$ (in bits), "type" is the type of media, such as
video, audio or text, and "format" is the method of formatting each
media file when it is encoded. Thus, by building a database in
which these attributes are tagged to the media files as item
information, data searches can be focused using various search
criteria, and the attribute information can also be used to
rearrange the order of the search results.
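By way of illustration only, the item information described above could be held in a record such as the following. This is a minimal sketch, not the database of the embodiment: the class name `MediaItem`, its field names, the sample catalog and the exact-match `search` function are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaItem:
    """Item information attached to one media file m_jk (hypothetical field names)."""
    archive: int      # j: index of the archive that stores the file
    index: int        # k: index of the file within that archive
    keywords: tuple   # natural-language words describing the file
    volume_bits: int  # size of the file in bits
    media_type: str   # for example "video", "audio" or "text"
    fmt: str          # encoding format of the file


def search(items, query_words):
    """Return the items that share at least one keyword with the query.

    Exact, case-insensitive matching is an assumption; the text places no
    particular limitation on the keyword matching method.
    """
    wanted = {w.lower() for w in query_words}
    return [it for it in items if wanted & {k.lower() for k in it.keywords}]


# Hypothetical sample data, for illustration only.
catalog = [
    MediaItem(1, 1, ("temple", "kyoto"), 8_000_000, "video", "mpeg"),
    MediaItem(1, 2, ("garden",), 40_000, "text", "txt"),
    MediaItem(2, 1, ("kyoto", "map"), 900_000, "image", "jpeg"),
]
hits = search(catalog, ["Kyoto"])
```

Keeping volume, type and format alongside the keywords is what allows the later steps to reorder or filter the hits without contacting the archives again.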
[0024] The item information that can be used to tag data in a
database for use in searches by the content search section 3a is
not limited to the above four types, but may be arbitrarily set.
Also, a database used for the searching of data by the content
search section 3a does not have to be provided for each user
terminal 3, but may instead be placed on the network 1, along with
the content search function itself, with search results being sent
to a terminal 3 in response to a request from the terminal 3
concerned.
[0025] The QoS measurement section 3b functions as a communication
quality determination means that determines a quality of
communication between the terminal and a data archive containing
digital data extracted by the retrieval means based on search
conditions specified by a user via the terminal. For this,
the QoS measurement section 3b measures the speed of communication
to rank the QoS based on network quality and terminal performance.
Specifically, the QoS measurement section 3b measures the network
throughput, in bits per second, from the user terminal 3 on which
the application is running to the j-th archive 2j that stores a media
file $m_{jk}$ for which the search scores a hit. Measured
throughputs are denoted by $\mathit{Th}_j$ (measured throughput
$\mathit{Th}_1$ from terminal 3 to the first data archive 21, measured
throughput $\mathit{Th}_2$ from terminal 3 to the second data archive
22, . . . , and measured throughput $\mathit{Th}_N$ from terminal 3 to
the N-th archive 2N).
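The text does not state how the throughput is measured. One possible approach, sketched below purely as an assumption, is to time the transfer of a probe payload from the archive and divide the number of bits received by the elapsed time. The function name `measure_throughput` and the `fetch_probe` callable are hypothetical.

```python
import time


def measure_throughput(fetch_probe, clock=time.perf_counter):
    """Estimate throughput in bits per second from one probe transfer.

    fetch_probe is a hypothetical callable that downloads a probe payload
    from an archive and returns it as bytes. clock can be replaced to make
    the measurement deterministic in tests.
    """
    start = clock()
    payload = fetch_probe()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(payload) * 8 / elapsed
```

A single probe gives only a rough estimate; repeating the measurement and averaging would be a natural refinement, though nothing in the text requires it.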
[0026] The QoS scenario derivation section 3c functions as an
information presentation means that presents digital data sorted
into an order in accordance with a priority specified by the user,
based on item information and communication quality relating to
each digital data group extracted by the retrieval means. The QoS
scenario derivation section 3c uses the communication quality
measured by the QoS measurement section 3b and the ranking based on
the user request input via the terminal to determine the final
order in which the plurality of media file hits are
presented to the user. The QoS scenario derivation section 3c can
present the information in a media retrieval order that reflects
the user's preference. This final order of media files is called
"the QoS scenario." Thus, in this embodiment, the QoS scenario
derivation section 3c presents to the user the derived QoS
scenario, that is, the order in which media hits are retrieved.
[0027] FIG. 3 shows an example of a user interface window for
specifying various conditions for deriving a QoS scenario. The user
inputs keywords that express the required information; the keywords
are used for searches by the content search section 3a. For
presentation, the QoS scenario derivation section 3c ranks the
information using an importance weighting parameter that specifies
whether keyword or QoS is given a relatively higher weighting. If,
for example, a weighting $W_k$ is specified for a keyword and a
weighting $W_q$ is specified for the QoS (where $0 \le W_k \le 100$,
$0 \le W_q \le 100$ and $W_k + W_q = 100$), the media hits are ranked
based on the percentage values of $W_k$ and $W_q$. The
apparatus can be arranged so that even if the user, in inputting
$W_k$ and $W_q$, inputs numerical values that violate these
constraints, the input is automatically normalized, or so that when
one percentage is specified, the other percentage is determined
automatically.
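The two input-handling options mentioned above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, proportional rescaling is only one possible meaning of "normalized", and the equal split used when both inputs are zero is a choice made here, not taken from the text.

```python
def normalize_weights(w_k, w_q):
    """Rescale two non-negative weights so that they sum to 100.

    Proportional rescaling is an assumption; the text only says that the
    input is automatically normalized.
    """
    if w_k < 0 or w_q < 0:
        raise ValueError("weights must be non-negative")
    total = w_k + w_q
    if total == 0:
        return 50.0, 50.0  # assumption: fall back to equal weighting
    return 100.0 * w_k / total, 100.0 * w_q / total


def complete_weights(w_k):
    """Given the keyword percentage only, derive the QoS percentage."""
    if not 0 <= w_k <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    return float(w_k), 100.0 - w_k
```

For example, `normalize_weights(150, 50)` yields 75 and 25, and `complete_weights(30)` yields 30 and 70.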
[0028] "Type" in the user interface window of FIG. 3 is used to
specify the media type that has first priority, such as text, for
example. The QoS scenario derivation section 3c raises the priority
level of this media type in the ranking. Specifying the media type
is optional. Whether or not the required type is specified is a
decision that can be left to the user.
[0029] Checking the "format filter" checkbox will cause the
apparatus to filter out media files m that are in a format that
cannot be decoded by the terminal 3. This corresponds to the
above-described function provided in the QoS scenario derivation
section 3c whereby the classes of digital data that can be handled
by the terminal are stored in the information presentation means,
and digital data files that are extracted by the retrieval means
but cannot be handled by the terminal are excluded from the
presented information. Information on the format types that can be
decoded by the terminal 3 can be stored in the QoS scenario
derivation section 3c at the time of application installation, or
can be stored at some subsequent time by the user.
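The format filter described above can be pictured with the following minimal sketch. The dictionary key `"format"`, the function name `format_filter` and the sample formats are assumptions introduced for illustration.

```python
def format_filter(hits, decodable_formats, enabled=True):
    """Drop hits whose format the terminal cannot decode.

    hits is a list of dictionaries with a hypothetical "format" key;
    decodable_formats is the set of formats stored for the terminal.
    When the checkbox is not checked (enabled=False), nothing is removed.
    """
    if not enabled:
        return list(hits)
    allowed = {f.lower() for f in decodable_formats}
    return [h for h in hits if h["format"].lower() in allowed]


# Hypothetical sample data, for illustration only.
sample_hits = [
    {"name": "a", "format": "mpeg"},
    {"name": "b", "format": "txt"},
    {"name": "c", "format": "wav"},
]
playable = format_filter(sample_hits, {"MPEG", "txt"})
```

The list of decodable formats is passed in rather than hard-coded, reflecting the statement that it may be stored at installation time or later by the user.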
[0030] An example of the QoS scenario derivation process will now
be described with reference to FIG. 4. The process starts when the
user uses the interface window to input a search request. When the
user request is received, keyword matching is used to search a
plurality of archive media files (first to N-th data archives 21 to
2N). Assuming there are n media hits (where n is a natural number),
the retrieved media files are each given a score that goes from n
down to 1, based on the similarity between the words input by the
user and the reference keywords in the apparatus, generating a
media retrieval score $n_k$. A higher $n_k$ score (that is, one
closer to n) indicates a higher degree of keyword matching. There is no
particular limitation on the keyword matching method used. For
example, a thesaurus can be used to determine the degree of
similarity to the keywords, or used together with fuzzy logic
techniques to enable keyword matching that includes degrees of
ambiguity.
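Assigning scores from n down to 1 according to keyword similarity can be sketched as below. The similarity values are assumed to come from whatever matching method is used; the function name `keyword_scores` and the rule that ties keep their input order are assumptions made here.

```python
def keyword_scores(similarities):
    """Assign each hit a score from n down to 1 by keyword similarity.

    similarities[i] is a hypothetical similarity value for hit i (larger
    means a closer match). The closest match receives n, the weakest 1.
    Ties keep their input order, which is an assumption of this sketch.
    """
    n = len(similarities)
    order = sorted(range(n), key=lambda i: similarities[i], reverse=True)
    scores = [0] * n
    for position, i in enumerate(order):
        scores[i] = n - position
    return scores
```

With similarities of 0.2, 0.9 and 0.5 for three hits, the scores would be 1, 3 and 2 respectively.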
[0031] Throughputs to digital archives containing media that
generate search hits are measured, and for each media file $m_{jk}$,
the volume of $m_{jk}$ is divided by the throughput $\mathit{Th}_j$ to
the archive concerned. This value is an indication of the time it
will take to download each media file $m_{jk}$ from the archive
to the terminal 3, and is used to generate scores $n_q$ from n down
to 1, with the shortest estimated download time receiving n. Thus, a
larger $n_q$ score (one that is closer to n) signifies easier retrieval.
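The download-time estimate and the resulting QoS scores can be illustrated as follows. This is a sketch under assumptions: the dictionary keys `"archive"` and `"volume_bits"`, the function names and the sample figures are introduced here for illustration.

```python
def download_times(hits, throughput_bps):
    """Estimate download time in seconds: volume in bits / throughput in bit/s.

    hits holds dictionaries with hypothetical "archive" and "volume_bits"
    keys; throughput_bps maps an archive index j to its measured throughput.
    """
    return [h["volume_bits"] / throughput_bps[h["archive"]] for h in hits]


def qos_scores(times):
    """Assign scores from n down to 1; the shortest download time receives n."""
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    scores = [0] * n
    for position, i in enumerate(order):
        scores[i] = n - position
    return scores


# Hypothetical sample data, for illustration only.
sample = [
    {"archive": 1, "volume_bits": 8_000_000},
    {"archive": 1, "volume_bits": 40_000},
    {"archive": 2, "volume_bits": 900_000},
]
times = download_times(sample, {1: 1_000_000, 2: 100_000})
scores = qos_scores(times)
```

In this example the third file is smaller than the first but sits behind a slower link, so it receives the lowest score.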
[0032] For each of the n media files,
$W_k \times n_k(m_{jk}) + W_q \times n_q(m_{jk})$ is
calculated: the larger this value, the higher the media file is
placed in the order, which runs from 1st to n-th in descending order
of the value. This corresponds to the weighted ordering shown in FIG. 4.
Here $n_k(m_{jk})$ and $n_q(m_{jk})$ denote the scores of $m_{jk}$
in $n_k$ and $n_q$, respectively.
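The weighted ordering can be sketched as follows, taking the keyword scores and QoS scores of the hits as given. The function name `weighted_order` and the rule that ties keep their input order are assumptions of this illustration.

```python
def weighted_order(n_k, n_q, w_k, w_q):
    """Order hits by w_k * n_k + w_q * n_q, largest value first.

    n_k and n_q are lists of keyword and QoS scores for the same hits;
    w_k and w_q are the percentage weightings. Returns the hit indices in
    presentation order together with the weighted values. Ties keep their
    input order, which is an assumption of this sketch.
    """
    if len(n_k) != len(n_q):
        raise ValueError("score lists must have the same length")
    totals = [w_k * a + w_q * b for a, b in zip(n_k, n_q)]
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    return order, totals
```

For instance, with keyword scores 1, 3, 2, QoS scores 2, 3, 1 and weightings of 70 and 30, the weighted values are 130, 300 and 170, so the second hit is presented first, then the third, then the first.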
[0033] Next, the data is sorted by media type, giving precedence to
the type of media specified by the user, which is moved up to a
higher level than media that has not been thus specified.
Specifying the media type is optional, so the data is not thus
sorted unless the user specifically specifies the "Type"
option.
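Giving precedence to a user-specified media type while otherwise preserving the existing order can be done with a stable sort, as in the minimal sketch below. The dictionary key `"type"` and the function name `prioritize_type` are assumptions.

```python
def prioritize_type(ordered_hits, preferred_type=None):
    """Move hits of the preferred media type ahead of all other hits.

    ordered_hits is a list of dictionaries with a hypothetical "type" key,
    already in weighted order. Python's sort is stable, so the relative
    order inside each group is preserved. When no type is specified, the
    order is returned unchanged.
    """
    if preferred_type is None:
        return list(ordered_hits)
    return sorted(ordered_hits, key=lambda h: h["type"] != preferred_type)


# Hypothetical sample data, for illustration only.
ranked = [
    {"name": "a", "type": "video"},
    {"name": "b", "type": "text"},
    {"name": "c", "type": "audio"},
    {"name": "d", "type": "text"},
]
by_type = prioritize_type(ranked, "text")
```

The sort key is `False` for the preferred type and `True` otherwise, so preferred items sort first without disturbing the weighted order within either group.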
[0034] Finally, if the "Format filter" checkbox has been checked,
media that cannot be decoded by the terminal are filtered out,
leaving I media files out of the n media files with search hits
($I \le n$). This is the format-based
filtering shown in FIG. 4. When the QoS scenario for the media assigned
an order from 1 to I is determined, media collection proceeds in
accordance with that order. The QoS scenarios thus determined are
ideally suited for building a digital museum that can present
exhibits in response to a user request by gathering information
distributed on the network.
[0035] A user who makes such a request can first be shown
multimedia data or the like that can be quickly downloaded, and the
remaining multimedia data can then be collected while the user is
viewing the initial portion, thus reducing the user response
time, which is the time it takes for the requested data to be
downloaded to the user's terminal. The apparatus for retrieving and
presenting digital data according to the invention can also exclude
data that cannot be handled by the user terminal, reducing time
that would otherwise be wasted.
* * * * *