U.S. patent application number 11/286268 was filed with the patent office on 2005-11-22 and published on 2006-09-28 for method and apparatus for a ranking engine.
This patent application is currently assigned to TRUVEO, INC. Invention is credited to Adam L. Beguelin, Peter F. Kocks, Timothy D. Tuttle.
Application Number | 20060218141 11/286268 |
Document ID | / |
Family ID | 36407892 |
Filed Date | 2005-11-22 |
United States Patent
Application |
20060218141 |
Kind Code |
A1 |
Tuttle; Timothy D. ; et
al. |
September 28, 2006 |
Method and apparatus for a ranking engine
Abstract
A computer-implemented method is provided for ranking files from
an Internet search. In one embodiment, the method comprises
assigning a score to each file based on at least one of the
following factors: recency, editorial popularity, clickthru
popularity, favorites metadata, or favorites collaborative
filtering. The files may be organized based on the assigned scores
to provide users with more accurate search results.
Inventors: |
Tuttle; Timothy D.; (San
Francisco, CA) ; Beguelin; Adam L.; (San Carlos,
CA) ; Kocks; Peter F.; (San Francisco, CA) |
Correspondence
Address: |
WORKMAN NYDEGGER;(F/K/A WORKMAN NYDEGGER & SEELEY)
60 EAST SOUTH TEMPLE
1000 EAGLE GATE TOWER
SALT LAKE CITY
UT
84111
US
|
Assignee: |
TRUVEO, INC.
Burlingame
CA
|
Family ID: |
36407892 |
Appl. No.: |
11/286268 |
Filed: |
November 22, 2005 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60630552 | Nov 22, 2004 | |
Current U.S.
Class: |
1/1 ;
707/999.005; 707/E17.028; 707/E17.108 |
Current CPC
Class: |
G06F 16/9535 20190101;
Y10S 707/99933 20130101; G06F 16/338 20190101; Y10S 707/914
20130101; G06F 16/284 20190101; G06F 16/783 20190101; G06F 16/438
20190101; G06F 16/738 20190101; Y10S 707/99931 20130101; G06F
16/735 20190101; G06F 16/951 20190101; G06F 16/24578 20190101; Y10S
707/99937 20130101 |
Class at
Publication: |
707/005 |
International
Class: |
G06F 17/30 20060101
G06F017/30 |
Claims
1. A computer-implemented method for a ranking engine, the method
comprising: assigning a score to each file or record based on at
least the following factors: recency, editorial popularity, and
clickthru popularity; and organizing the files based on the
assigned scores.
2. The method of claim 1 wherein the score is $R_T$ and is determined
using the following formula:
$$R_T = \underbrace{W_r R_r}_{\text{Term 1}} + \underbrace{W_e R_e}_{\text{Term 2}} + \underbrace{W_c R_c}_{\text{Term 3}}$$
where $0 < R_i < 1$, $1 = W_r + W_e + W_c$, and $0 < R_T < 1$.
3. The method of claim 2 wherein recency is weighted based on the
following formula for $R_r$:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
where $t_e$ = expiration time (perhaps ~30 days), $d_c$ = current date,
and $d_F$ = date found.
4. The method of claim 2 wherein editorial popularity is weighted
between 1 and 0 and is based on at least one of the following:
Nielsen ratings, known brand names, website popularity (e.g. Alexa
ranking), or the judgment of a professional or corporation with
expertise in online media.
5. The method of claim 2 wherein clickthru popularity is weighted
based on the following formula for $R_c$:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and $1 = W_{cpm} + W_{cph} + W_{cpd}$.
6. The method of claim 1 wherein the file is a video file.
7. A computer-implemented method for organizing a collection of
files from an Internet search, the method comprising: assigning a
score to each file based on at least the following factors:
recency, editorial popularity, clickthru popularity, favorites
metadata, and favorites collaborative filtering; and organizing the
files based on the assigned scores.
8. The method of claim 7 wherein the file is a media file.
9. The method of claim 7 wherein the file is a video file.
10. The method of claim 7 wherein the score is $R_T$ and is
determined using the following formula:
$$R_T = \underbrace{W_r R_r}_{\text{Term 1}} + \underbrace{W_e R_e}_{\text{Term 2}} + \underbrace{W_c R_c}_{\text{Term 3}} + \underbrace{W_{md} R_{md}}_{\text{Term 4}} + \underbrace{W_{cF} R_{cF}}_{\text{Term 5}}$$
where $0 < R_i < 1$, $1 = W_r + W_e + W_c + W_{md} + W_{cF}$, and $0 < R_T < 1$.
11. The method of claim 10 wherein recency is weighted based on the
following formula for $R_r$:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
where $t_e$ = expiration time (perhaps ~30 days), $d_c$ = current date,
and $d_F$ = date found.
12. The method of claim 10 wherein editorial popularity is weighted
between 1 and 0 and is based on at least one of the following:
Nielsen ratings, known brand names, website popularity (e.g. Alexa
ranking), or the judgment of a professional or corporation with
expertise in online media.
13. The method of claim 10 wherein clickthru popularity is weighted
based on the following formula for $R_c$:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and $1 = W_{cpm} + W_{cph} + W_{cpd}$.
14. The method of claim 7 wherein weighting of favorites metadata
is $R_{md}=0$ if no matches are found or 1 if a keyword field in
the metadata of the file matches any favorite titles in a user's
favorite titles file, any favorite people in a user's favorite
people file, or any keyword in a user's favorite keywords
file.
15. The method of claim 7 wherein weighting of collaborative
filtering favorites metadata is $R_{cf}$:
$$R_{cf,l} = W_{sim}\,(S_{max,l}) + (1 - W_{sim})\,P_l$$
where:
$$W_{sim} = \text{similarity weighting factor} = C_{max\,sim}\left(1 - \frac{1}{1 + n_i}\right)$$
where $0 \le C_{max\,sim} \le 1$.
16. The method of claim 7 wherein $R_{cf}$ is a weighted sum of the
maximum user similarity for item l and the popularity of item l
among KNN such that $0 \le R_{cf} \le 1$.
17. A computer system comprising: a ranking engine having
programming code for displaying results of a search query based on
scores, wherein the scores for files found in the search are based
on at least the following factors: recency, editorial popularity,
and clickthru popularity.
18. The system of claim 17 wherein the files are media files.
19. The system of claim 17 wherein the files are video files.
20. The system of claim 17 wherein each of the scores is $R_T$
and is determined using the following formula:
$$R_T = \underbrace{W_r R_r}_{\text{Term 1}} + \underbrace{W_e R_e}_{\text{Term 2}} + \underbrace{W_c R_c}_{\text{Term 3}}$$
where $0 < R_i < 1$, $1 = W_r + W_e + W_c$, and $0 < R_T < 1$.
21. The system of claim 20 wherein recency is weighted based on the
following formula for $R_r$:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
where $t_e$ = expiration time (perhaps ~30 days), $d_c$ = current date,
and $d_F$ = date found.
22. The system of claim 20 wherein editorial popularity is weighted
between 1 and 0 and is based on at least one of the following:
Nielsen ratings, known brand names, website popularity (e.g. Alexa
ranking), or the judgment of a professional or corporation with
expertise in online media.
23. The system of claim 20 wherein clickthru popularity is weighted
based on the following formula for $R_c$:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and $1 = W_{cpm} + W_{cph} + W_{cpd}$.
24. A computer system comprising: a ranking engine having
programming code for displaying results of a search query based on
scores, wherein the scores for files found in the search are based
on at least the following factors: recency, editorial popularity,
clickthru popularity, favorites metadata, and favorites
collaborative filtering.
25. The system of claim 24 wherein the files are media files.
26. The system of claim 24 wherein the files are video files.
27. The system of claim 24 wherein each of the scores is $R_T$ and is
determined using the following formula:
$$R_T = \underbrace{W_r R_r}_{\text{Term 1}} + \underbrace{W_e R_e}_{\text{Term 2}} + \underbrace{W_c R_c}_{\text{Term 3}} + \underbrace{W_{md} R_{md}}_{\text{Term 4}} + \underbrace{W_{cF} R_{cF}}_{\text{Term 5}}$$
where $0 < R_i < 1$, $1 = W_r + W_e + W_c + W_{md} + W_{cF}$, and $0 < R_T < 1$.
28. The system of claim 27 wherein recency is weighted based on the
following formula for $R_r$:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
where $t_e$ = expiration time (perhaps ~30 days), $d_c$ = current date,
and $d_F$ = date found.
29. The system of claim 27 wherein editorial popularity is weighted
between 1 and 0 and is based on at least one of the following:
Nielsen ratings, known brand names, website popularity (e.g. Alexa
ranking), or the judgment of a professional or corporation with
expertise in online media.
30. The system of claim 27 wherein clickthru popularity is weighted
based on the following formula for $R_c$:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and $1 = W_{cpm} + W_{cph} + W_{cpd}$.
31. The system of claim 27 wherein weighting of favorites metadata
is $R_{md}=0$ if no matches are found or 1 if a keyword field in
the metadata of the file matches any favorite titles in a user's
favorite titles file, any favorite people in a user's favorite
people file, or any keyword in a user's favorite keywords
file.
32. The system of claim 27 wherein weighting of collaborative
filtering favorites metadata is $R_{cf}$:
$$R_{cf,l} = W_{sim}\,(S_{max,l}) + (1 - W_{sim})\,P_l$$
where:
$$W_{sim} = \text{similarity weighting factor} = C_{max\,sim}\left(1 - \frac{1}{1 + n_i}\right)$$
where $0 \le C_{max\,sim} \le 1$.
33. The system of claim 27 wherein $R_{cf}$ is a weighted sum of
the maximum user similarity for item l and the popularity of item l
among KNN such that $0 \le R_{cf} \le 1$.
34. A computer-implemented method for organizing a collection of
files from an Internet search, the method comprising: assigning a
score to each file based on favorites collaborative filtering
$W_{cF}R_{cF}$ and at least one of the following factors: recency
$W_rR_r$, editorial popularity $W_eR_e$, clickthru
popularity $W_cR_c$, and favorites metadata $W_{md}R_{md}$;
and organizing the files based on the assigned scores.
35. The method of claim 34 wherein the file is a media file.
36. The method of claim 34 wherein the file is a video file.
37. The method of claim 34 wherein the score is $R_T$ and is
determined using the following formula: where: $0 < R_i < 1$
and: $1 = W_r + W_e + W_c + W_{md} + W_{cF}$
$0 < R_T < 1$
38. The method of claim 37 wherein recency is weighted based on the
following formula for $R_r$:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
where $t_e$ = expiration time (perhaps ~30 days), $d_c$ = current date,
and $d_F$ = date found.
39. The method of claim 37 wherein editorial popularity is weighted
between 1 and 0 and is based on at least one of the following:
Nielsen ratings, known brand names, website popularity (e.g. Alexa
ranking), or the judgment of a professional or corporation with
expertise in online media.
40. The method of claim 37 wherein clickthru popularity is weighted
based on the following formula for $R_c$:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and $1 = W_{cpm} + W_{cph} + W_{cpd}$.
41. The method of claim 34 wherein weighting of favorites metadata
is $R_{md}=0$ if no matches are found or 1 if a keyword field in
the metadata of the file matches any favorite titles in a user's
favorite titles file, any favorite people in a user's favorite
people file, or any keyword in a user's favorite keywords
file.
42. The method of claim 34 wherein weighting of collaborative
filtering favorites metadata is $R_{cf}$:
$$R_{cf,l} = W_{sim}\,(S_{max,l}) + (1 - W_{sim})\,P_l$$
where:
$$W_{sim} = \text{similarity weighting factor} = C_{max\,sim}\left(1 - \frac{1}{1 + n_i}\right)$$
where $0 \le C_{max\,sim} \le 1$.
43. The method of claim 34 wherein $R_{cf}$ is a weighted sum of
the maximum user similarity for item l and the popularity of item l
among KNN such that $0 \le R_{cf} \le 1$.
44. A computer system comprising: a ranking engine having
programming code for displaying results of a search query based on
scores, wherein the scores for files found in the search are based
on favorites collaborative filtering $W_{cF}R_{cF}$ and at least
one of the following factors: recency $W_rR_r$, editorial
popularity $W_eR_e$, clickthru popularity $W_cR_c$, and
favorites metadata $W_{md}R_{md}$.
45. The system of claim 44 wherein the files are media files.
46. The system of claim 44 wherein the files are video files.
47. The system of claim 44 wherein each of the scores is $R_T$ and is
determined using the following formula:
$$R_T = \underbrace{W_r R_r}_{\text{Term 1}} + \underbrace{W_e R_e}_{\text{Term 2}} + \underbrace{W_c R_c}_{\text{Term 3}} + \underbrace{W_{md} R_{md}}_{\text{Term 4}} + \underbrace{W_{cF} R_{cF}}_{\text{Term 5}}$$
where $0 < R_i < 1$, $1 = W_r + W_e + W_c + W_{md} + W_{cF}$, and $0 < R_T < 1$.
48. The system of claim 47 wherein recency is weighted based on the
following formula for $R_r$:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
where $t_e$ = expiration time (perhaps ~30 days), $d_c$ = current date,
and $d_F$ = date found.
49. The system of claim 47 wherein editorial popularity is weighted
between 1 and 0 and is based on at least one of the following:
Nielsen ratings, known brand names, website popularity (e.g. Alexa
ranking), or the judgment of a professional or corporation with
expertise in online media.
50. The system of claim 47 wherein clickthru popularity is weighted
based on the following formula for $R_c$:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and $1 = W_{cpm} + W_{cph} + W_{cpd}$.
51. The system of claim 44 wherein weighting of favorites metadata
is $R_{md}=0$ if no matches are found or 1 if a keyword field in
the metadata of the file matches any favorite titles in a user's
favorite titles file, any favorite people in a user's favorite
people file, or any keyword in a user's favorite keywords
file.
52. The system of claim 44 wherein weighting of collaborative
filtering favorites metadata is $R_{cf}$:
$$R_{cf,l} = W_{sim}\,(S_{max,l}) + (1 - W_{sim})\,P_l$$
where:
$$W_{sim} = \text{similarity weighting factor} = C_{max\,sim}\left(1 - \frac{1}{1 + n_i}\right)$$
where $0 \le C_{max\,sim} \le 1$.
53. The system of claim 44 wherein $R_{cf}$ is a weighted sum of
the maximum user similarity for item l and the popularity of item l
among KNN such that $0 \le R_{cf} \le 1$.
Description
BACKGROUND OF THE INVENTION
[0001] The present application claims the benefit of priority of
U.S. Provisional Application Ser. No. 60/630,552 (Attorney Docket
Number 41702-1002) filed on Nov. 22, 2004 and fully incorporated
herein by reference for all purposes.
[0002] 1. Technical Field
[0003] The technical field relates to a scheme for ranking results,
and more specifically, to a rating scheme to rank video search
results by a number of factors.
[0004] 2. Background Art
[0005] Standard web crawlers were originally designed for web pages
where the bulk of useful information about the page was contained
in an HTML text file. In web pages today, it is increasingly common
for the useful information about the page to be contained in a
variety of different files, which are all assembled in the browser
to create the complete application. Because of this, standard web
crawlers are unable to find much of the multimedia and video
content available on modern web pages.
[0006] Even for the video content that is found by standard web
crawlers, the result of the search often provides video content
that may be out-of-date, poor quality, or not relevant to a search
query from a user. Traditional search engines lack the ability to
efficiently and more accurately organize these search results.
There is a need for improved techniques for organizing the results
from such searches to provide higher accuracy and greater ease of
use for the user.
SUMMARY OF THE INVENTION
[0007] The present invention provides solutions for at least some
of the drawbacks discussed above. Specifically, some embodiments of
the present invention provide a Ranking Engine that is a rating
scheme used in the Truveo Search Engine to rank video search
results by factors such as, but not limited to, popularity,
timeliness and/or user preferences. It enables the Truveo Search
Engine to provide highly targeted search results to users. It is
designed to operate effectively in the absence of any user input;
however, it uses any provided user input to improve the accuracy of
the search results. In one aspect, the present invention provides
memory-based reasoning algorithms that ensure highly accurate search
results with minimal user input. Extensive metadata enables
advanced parametric search when desired. At least some of these and
other objectives described herein will be met by embodiments of the
present invention.
[0008] In one embodiment of the present invention, a
computer-implemented method is provided for a ranking engine. The
method comprises assigning a score to each file or record based on
at least the following factors: recency, editorial popularity, and
clickthru popularity. The files are organized based on the assigned
scores.
[0009] In another embodiment of the present invention, a
computer-implemented method is provided for a ranking engine. The
method comprises assigning a score to each file or record based on
at least the following factors: recency, editorial popularity,
clickthru popularity, favorites metadata, and favorites
collaborative filtering. The files are organized based on the
assigned scores.
[0010] In yet another embodiment of the present invention, a
computer system is provided that comprises a ranking engine
having programming code for displaying results of a search query
based on scores, wherein the scores for files found in the search
are based on at least the following factors: recency, editorial
popularity, and clickthru popularity.
[0011] In a still further embodiment of the present invention, a
computer system is provided that comprises a ranking engine
having programming code for displaying results of a search query
based on scores, wherein the scores for files found in the search
are based on at least the following factors: recency, editorial
popularity, clickthru popularity, favorites metadata, and favorites
collaborative filtering.
[0012] The files may be media files, video files, video streams, or
the like. The editorial popularity may be weighted between 1 and 0
and is based on at least one of the following: Nielsen ratings,
known brand names, website popularity (e.g. Alexa ranking), or the
judgment of a professional or corporation with expertise in online
media. In one embodiment, the weighting of favorites metadata is
$R_{md}=0$ if no matches are found or 1 if a keyword field in the
metadata of the file matches any favorite titles in a user's
favorite titles file, any favorite people in a user's favorite
people file, or any keyword in a user's favorite keywords
file.
[0013] In yet another embodiment of the present invention, a
computer-implemented method is provided for organizing a collection
of files from an Internet search. The method comprises assigning a
score to each file based on favorites collaborative filtering
$W_{cF}R_{cF}$ and at least one of the following factors: recency
$W_rR_r$, editorial popularity $W_eR_e$, clickthru
popularity $W_cR_c$, and favorites metadata $W_{md}R_{md}$.
The files are organized based on the assigned scores.
[0014] In yet another embodiment of the present invention, a
computer system is provided that comprises a ranking engine
having programming code for displaying results of a search query
based on scores, wherein the scores for files found in the search
are based on favorites collaborative filtering $W_{cF}R_{cF}$ and
at least one of the following factors: recency $W_rR_r$,
editorial popularity $W_eR_e$, clickthru popularity
$W_cR_c$, and favorites metadata $W_{md}R_{md}$.
[0015] For any of the embodiments herein, the files may be media
files, video files, video streams, or the like. Optionally, the
editorial popularity may be weighted between 1 and 0 and is based
on at least one of the following: Nielsen ratings, known brand
names, website popularity (e.g. Alexa ranking), or the judgment of
a professional or corporation with expertise in online media. In
one embodiment, the weighting of favorites metadata is $R_{md}=0$ if no
matches are found or 1 if a keyword field in the metadata of the
file matches any favorite titles in a user's favorite titles file,
any favorite people in a user's favorite people file, or any
keyword in a user's favorite keywords file.
[0016] A further understanding of the nature and advantages of the
invention will become apparent by reference to the remaining
portions of the specification and drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] FIG. 1 shows a schematic of one embodiment of the present
invention.
[0018] FIG. 2 is a graph showing variables plotted for recency
ranking according to the present invention.
[0019] FIG. 3 is a graph showing the relationship of similarity and
popularity weighting according to the present invention.
[0020] FIG. 4 shows one embodiment of a display showing results
from a search query.
[0021] FIG. 5 shows one embodiment of a user interface according to
the present invention.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS
[0022] It is to be understood that both the foregoing general
description and the following detailed description are exemplary
and explanatory only and are not restrictive of the invention, as
claimed. It may be noted that, as used in the specification and the
appended claims, the singular forms "a", "an" and "the" include
plural referents unless the context clearly dictates otherwise.
Thus, for example, reference to "a crawler" may include
multiple crawlers, and the like. References cited herein are hereby
incorporated by reference in their entirety, except to the extent
that they conflict with teachings explicitly set forth in this
specification.
[0023] Referring now to FIG. 1, a schematic is shown of the Truveo
Search Engine which is configured for use with the present ranking
scheme. As seen in FIG. 1, the search engine may include a
recommendation engine 10. The engine 10 may use reasoning
algorithms to provide highly accurate search results with minimal
user input. In one embodiment, the recommendation engine may use a
ranking scheme as set forth below.
Truveo Ranking Scheme:
$$R_T = \underbrace{W_r R_r}_{\text{Term 1}} + \underbrace{W_e R_e}_{\text{Term 2}} + \underbrace{W_c R_c}_{\text{Term 3}} + \underbrace{W_{md} R_{md}}_{\text{Term 4}} + \underbrace{W_{cF} R_{cF}}_{\text{Term 5}}$$
where $0 < R_i < 1$, $1 = W_r + W_e + W_c + W_{md} + W_{cF}$, and $0 < R_T < 1$.
Term 1: Recency Ranking:
$$R_r = \begin{cases} 1 - \dfrac{1}{t_e}(d_c - d_F), & \text{for } (d_c - d_F) < t_e \\ 0, & \text{for } (d_c - d_F) > t_e \end{cases}$$
[0024] where:
[0025] $t_e$ = expiration time (perhaps ~30 days)
[0026] $d_c$ = current date
[0027] $d_F$ = date found
[0028] This yields the relationship as shown in FIG. 2.
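The total score and the recency term above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Truveo implementation: the function names, the dictionary keys, and the choice to return 0 when the age equals the expiration time exactly (a case the formula leaves undefined) are assumptions made for the example.

```python
from datetime import date


def recency_rank(date_found: date, current_date: date,
                 expiration_days: float = 30.0) -> float:
    """R_r: falls linearly from 1 to 0 as the item ages toward t_e."""
    age_days = (current_date - date_found).days  # d_c - d_F
    if age_days < expiration_days:
        return 1.0 - age_days / expiration_days
    # Assumption: age == t_e is treated like age > t_e (the linear
    # branch would also give 0 at that point).
    return 0.0


def total_rank(ranks: dict, weights: dict) -> float:
    """R_T: weighted sum of the individual rankings; weights sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * ranks[name] for name in weights)


# Hypothetical three-term example (recency, editorial, clickthru).
r_r = recency_rank(date(2005, 11, 1), date(2005, 11, 16))
score = total_rank({"r": r_r, "e": 1.0, "c": 0.0},
                   {"r": 0.5, "e": 0.25, "c": 0.25})
```

Because every ranking lies between 0 and 1 and the weights sum to 1, the total score stays between 0 and 1 as well.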
Term 2: Editorial Popularity Ranking:
[0029] Each database entry (e.g., item) is assigned a value for
`EDITORIAL_RANK`, based on how popular the content is expected to
be. This could be based on expected viewership for known brand
names, previous Nielsen ratings, etc. The most popular content
should approach $R_e=1$. Unknown or unpopular content should
approach $R_e=0$. Optionally, the editorial popularity rank may
also have a time decay component to give weight or more weight to
more recent popularity information.
Term 3: Clickthru Popularity Ranking:
$$R_c = W_{cpm} R_{cpm} + W_{cph} R_{cph} + W_{cpd} R_{cpd}$$
[0030] where:
$$R_{cpm} = \text{clicks per minute ranking} = \frac{\mathrm{CPM}}{\max(\mathrm{cpm})\ \text{over all items}}, \quad (0 < R_{cpm} < 1)$$
$$R_{cph} = \text{clicks per hour ranking} = \frac{\mathrm{CPH}}{\max(\mathrm{cph})\ \text{over all items}}, \quad (0 < R_{cph} < 1)$$
$$R_{cpd} = \text{clicks per day ranking} = \frac{\mathrm{CPD}}{\max(\mathrm{cpd})\ \text{over all items}}, \quad (0 < R_{cpd} < 1)$$
and
$$1 = W_{cpm} + W_{cph} + W_{cpd}$$
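The normalization in Term 3 can be illustrated with a short Python sketch. The function name, the input layout (a dictionary of per-item rate tuples), and the default weights are assumptions chosen for the example. The source does not say what to do when no item has any clicks, so the sketch returns a ranking of 0 in that case.

```python
def clickthru_rank(rates, w_cpm=0.5, w_cph=0.25, w_cpd=0.25):
    """Return {item_id: R_c} from {item_id: (cpm, cph, cpd)}.

    Each rate is divided by the maximum of that rate over all items,
    then the three normalized rankings are combined with weights that
    sum to 1. The default weights are hypothetical.
    """
    if abs(w_cpm + w_cph + w_cpd - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if not rates:
        return {}

    def normalize(index):
        peak = max(r[index] for r in rates.values())
        # Assumption: with no clicks anywhere, every ranking is 0.
        return {item: (r[index] / peak if peak > 0 else 0.0)
                for item, r in rates.items()}

    r_cpm, r_cph, r_cpd = normalize(0), normalize(1), normalize(2)
    return {item: w_cpm * r_cpm[item] + w_cph * r_cph[item]
                  + w_cpd * r_cpd[item]
            for item in rates}


# Hypothetical rates for two items: (cpm, cph, cpd).
example = clickthru_rank({"a": (2.0, 10.0, 100.0), "b": (1.0, 10.0, 50.0)})
```

In this sketch the item with the highest rate in a window receives a ranking of exactly 1 for that window, which is the upper limit of the range stated above.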
[0031] To implement the clickthru popularity rating, the following
fields need to be added to the video data table:
[0032] TOTAL_CLICKS=the running tally of clicks that this item has seen
since DATE_FOUND
[0033] CPM=clicks per minute
[0034] CPM_COUNT_BUFFER=running tally of clicks on this item since
CPM_LAST_CALC
[0035] CPM_LAST_CALC=the time when CPM was last calculated and
CPM_COUNT_BUFFER was flushed
[0036] Similarly:
[0037] CPH, CPH_COUNT_BUFFER, CPH_LAST_CALC for clicks-per-hour,
and
[0038] CPD, CPD_COUNT_BUFFER, CPD_LAST_CALC for clicks-per-day.
[0039] These fields can be calculated and updated as follows:
[0040] For every user with cookies enabled, each clicked item is
stored anonymously in a cookie. Upon a subsequent request to the
Truveo search engine (during that same session), the clickthru data
in the cookie is processed as follows:
[0041] For every item clicked, increment TOTAL_CLICKS,
CPM_COUNT_BUFFER, CPH_COUNT_BUFFER, and CPD_COUNT_BUFFER by 1.
[0042] For CPM, if CURRENT_TIME-CPM_LAST_CALC>1 minute:
[0043] CPM=CPM_COUNT_BUFFER/(CURRENT_TIME-CPM_LAST_CALC)
[0044] reset CPM_COUNT_BUFFER to 0.
[0045] set CPM_LAST_CALC to CURRENT_TIME
[0046] Similarly for CPD and CPH
[0047] Once this is complete, the user's browser cookie may be
flushed to eliminate all cached clickthrus.
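The buffer-and-flush procedure above can be sketched as a small Python class. This is a hedged illustration: the class and attribute names are hypothetical stand-ins for the CPM, CPM_COUNT_BUFFER, and CPM_LAST_CALC fields, times are plain seconds, and the elapsed time is expressed in units of the window so that the result reads as clicks per minute, hour, or day.

```python
from dataclasses import dataclass


@dataclass
class RateCounter:
    window_seconds: float   # 60 for CPM, 3600 for CPH, 86400 for CPD
    last_calc: float        # *_LAST_CALC, in seconds
    count_buffer: int = 0   # *_COUNT_BUFFER
    rate: float = 0.0       # CPM, CPH, or CPD

    def record_click(self) -> None:
        """Increment the buffer for one clicked item."""
        self.count_buffer += 1

    def maybe_recalculate(self, now: float) -> None:
        """Recompute the rate once more than one window has elapsed."""
        elapsed = now - self.last_calc
        if elapsed > self.window_seconds:
            # Assumption: elapsed time is measured in windows.
            self.rate = self.count_buffer / (elapsed / self.window_seconds)
            self.count_buffer = 0      # reset the buffer
            self.last_calc = now       # remember when we last calculated


cpm = RateCounter(window_seconds=60.0, last_calc=0.0)
for _ in range(6):
    cpm.record_click()
cpm.maybe_recalculate(now=30.0)    # less than a minute: nothing changes
rate_before = cpm.rate
cpm.maybe_recalculate(now=120.0)   # two minutes elapsed: 6 clicks / 2
```

One counter per window (minute, hour, day) would be kept for each item, alongside a TOTAL_CLICKS tally that is never reset.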
[0048] Term 4: Favorites Metadata Ranking:
[0049] Note that if the user has not registered for an account,
this ranking, $R_{md}$, is zero.
[0050] If the user does have a valid account, $R_{md}$ will be
determined as follows:
[0051] User FAVORITES METADATA is stored in 3 database tables:
FAVORITE_TITLES, FAVORITE_PEOPLE, FAVORITE_KEYWORDS.
[0052] For a given video data item:
[0053] If any entry in FAVORITE_TITLES matches any part of the
TITLE field or the KEYWORDS field, $R_{md}=1$.
[0054] --OR--
[0055] If any entry in the FAVORITE_PEOPLE table matches any part
of any of the fields: ACTOR, DIRECTOR, KEYWORDS, PRODUCER, WRITER,
LONG_DESCRIPTION, SHORT_DESCRIPTION, $R_{md}=1$.
[0056] --OR--
[0057] If any entry in the FAVORITE_KEYWORDS table matches any part
of any of the fields: ACTOR, CATEGORY, DIRECTOR, GENRE,
HOST_SITE_NAME, HOST_SITE_URL, KEYWORDS, LONG_DESCRIPTION,
SHORT_DESCRIPTION, PRODUCER, TITLE, WRITER, $R_{md}=1$.
[0058] Otherwise, $R_{md}=0$.
[0059] Therefore:
$$R_{md} = \begin{cases} 0, & \text{if no metadata match} \\ 1, & \text{if metadata match} \end{cases}$$
[0060] Note: Be sure to filter out matches on trivial metadata entries
like single characters, articles or whitespace characters.
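The matching rules for Term 4 can be sketched as follows. This is one possible reading, not the patented implementation: "matches any part of" is interpreted here as a case-insensitive substring test, the set of articles is limited to English "a", "an", and "the", and the function and constant names are hypothetical. The field lists are taken from the text above.

```python
TITLE_FIELDS = ("TITLE", "KEYWORDS")
PEOPLE_FIELDS = ("ACTOR", "DIRECTOR", "KEYWORDS", "PRODUCER", "WRITER",
                 "LONG_DESCRIPTION", "SHORT_DESCRIPTION")
KEYWORD_FIELDS = ("ACTOR", "CATEGORY", "DIRECTOR", "GENRE",
                  "HOST_SITE_NAME", "HOST_SITE_URL", "KEYWORDS",
                  "LONG_DESCRIPTION", "SHORT_DESCRIPTION", "PRODUCER",
                  "TITLE", "WRITER")
ARTICLES = {"a", "an", "the"}  # assumption: English articles only


def is_trivial(entry: str) -> bool:
    """True for whitespace, single characters, and articles."""
    text = entry.strip().lower()
    return len(text) <= 1 or text in ARTICLES


def matches_any(entries, record, fields) -> bool:
    """True if any non-trivial entry is a substring of any listed field."""
    for entry in entries:
        if is_trivial(entry):
            continue
        needle = entry.strip().lower()
        if any(needle in str(record.get(f, "")).lower() for f in fields):
            return True
    return False


def favorites_metadata_rank(record, titles, people, keywords) -> int:
    """R_md: 1 on any favorites match, otherwise 0."""
    if (matches_any(titles, record, TITLE_FIELDS)
            or matches_any(people, record, PEOPLE_FIELDS)
            or matches_any(keywords, record, KEYWORD_FIELDS)):
        return 1
    return 0


# Hypothetical video record.
video = {"TITLE": "The Daily Show", "ACTOR": "Jon Stewart"}
```

Filtering trivial entries before matching matters because a favorite such as "the" would otherwise match almost every record and push $R_{md}$ to 1 across the board.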
[0061] A user's favorites may be determined by, but not limited to,
providing a mechanism for the user to indicate their favorite
videos, recording the video items they select to view (e.g. through
the use of cookies), or by recording the video items they choose to
forward via e-mail to other people. The FAVORITE_TITLES,
FAVORITE_PEOPLE, and FAVORITE_KEYWORDS tables are populated for the
user by extracting the appropriate metadata from the video record
of the indicated favorite video.
[0062] Optionally, embodiments of the present application may also
include the use of a unique cookie to identify an anonymous user as
a substitute for a user account.
Term 5: Favorites Collaborative Filtering Ranking:
[0063] A listing of the Favorite Items (video data records) for
each user is stored in the database table FAVORITE_ITEMS.
[0064] Note that, if the user has not registered for an account,
this ranking, R.sub.cf, is zero.
[0065] If the user does have a valid account, R.sub.cf is
determined as follows:
[0066] First, calculate the distance between user i and all other
users, j:
$$D_{i,j} = \text{distance between users } i \text{ and } j = \frac{n_i - n_{i,j}}{n_i} = 1 - \frac{n_{i,j}}{n_i}$$
[0067] where n.sub.i is the number of Favorite items user i has
stored, and n.sub.i,j is the number of user i's Favorites that
match Favorites of user j
[0068] Note that if all of user i's Favorites match a Favorite of
user j, then D.sub.i,j=0. If none match, D.sub.i,j=1.
[0069] Similarly, a measure of the similarity between users i and j
can be calculated as follows:
$$S_{i,j} = \text{similarity between users } i \text{ and } j = 1 - D_{i,j} = \frac{n_{i,j}}{n_i}$$
[0070] Note: S.sub.i,j=1 when the users are completely similar, and
0 when there are no similar Favorites between users.
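As a minimal sketch of the distance and similarity definitions above, the following Python functions treat each user's Favorites as a set of item IDs. The function names and the sample item IDs are illustrative assumptions, not part of the application.

```python
def distance(favorites_i, favorites_j):
    """D_ij = 1 - n_ij / n_i, normalized by the size of user i's Favorites."""
    fav_i, fav_j = set(favorites_i), set(favorites_j)
    n_i = len(fav_i)
    if n_i == 0:
        raise ValueError("user i has no Favorites, so D_ij is undefined")
    n_ij = len(fav_i & fav_j)  # user i's Favorites that user j also has
    return 1 - n_ij / n_i


def similarity(favorites_i, favorites_j):
    """S_ij = 1 - D_ij."""
    return 1 - distance(favorites_i, favorites_j)


# Illustrative item IDs: user i has four Favorites, two of them shared with j.
fav_i = {1, 2, 3, 4}
fav_j = {3, 4, 5}
print(distance(fav_i, fav_j), similarity(fav_i, fav_j))
```

Because the measure is normalized by n.sub.i, it is not symmetric in general: swapping the two users in the example divides by 3 instead of 4.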
[0071] We can now select the K-Nearest Neighbors to user i based on
the similarity ranking. For example, assuming user i has three
Favorite items:
[0072] For: User i
[0073] Favorites: ITEMID=103 ITEMID=107 ITEMID=112 n.sub.i=3
[0074] K-Nearest Neighbors can be selected as follows:
TABLE-US-00001
User ID (j)  n.sub.i,j  D.sub.i,j  S.sub.i,j  Favorite Item IDs
1            1          0.66       0.33       101, 102, 103, 110
2            2          0.33       0.66       103, 104, 105, 106, 107
3            0          1          0          101
4            3          0          1          103, 104, 107, 112
5            2          0.33       0.66       106, 107, 109, 110, 111, 112
6            1          0.66       0.33       103, 104
[0075] Reranking the users by decreasing similarity:
TABLE-US-00002
User ID  S.sub.i,j  Favorite Items Not Already Stored by User i
4        1          104
2        0.66       104, 105, 106
5        0.66       106, 109, 110, 111
1        0.33       101, 102, 110
6        0.33       104
3        0          101
The first four rows (users 4, 2, 5, and 1) are the K-Nearest
Neighbors, where K = 4.
[0076] From this ordered list, the K-Nearest Neighbors are the
first K users.
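The selection step can be sketched in Python using the example data above (user i's Favorites 103, 107, 112 and the six other users). This is an illustration under stated assumptions: the text does not say how ties in similarity are broken, so the sketch keeps tied users in their input order, and the function names are hypothetical.

```python
user_i = {103, 107, 112}
others = {
    1: {101, 102, 103, 110},
    2: {103, 104, 105, 106, 107},
    3: {101},
    4: {103, 104, 107, 112},
    5: {106, 107, 109, 110, 111, 112},
    6: {103, 104},
}


def similarity(fav_i, fav_j):
    """S_ij = n_ij / n_i."""
    return len(fav_i & fav_j) / len(fav_i)


def k_nearest_neighbors(fav_i, others, k):
    """Return (user_id, similarity, new_items) for the k most similar users."""
    # sorted() is stable, so users with equal similarity keep their input order
    # (an assumed tie-breaking rule).
    ranked = sorted(others.items(), key=lambda kv: -similarity(fav_i, kv[1]))
    return [(uid, similarity(fav_i, favs), sorted(favs - fav_i))
            for uid, favs in ranked[:k]]


neighbors = k_nearest_neighbors(user_i, others, k=4)
for uid, sim, new_items in neighbors:
    print(uid, round(sim, 2), new_items)
```

With K = 4 this selects users 4, 2, 5, and 1, each paired with the items they hold that user i has not already stored.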
[0077] From the K-Nearest Neighbors, we can also determine a
popularity rating for each new Favorite item. This can be
calculated from the fraction of the K neighbors that have item l in
their Favorites list.
[0078] Specifically:
TABLE-US-00003
User ID  Similarity to User i  New Favorite Items
4        1                     104
2        0.66                  104, 105, 106
5        0.66                  106, 109, 110, 111
1        0.33                  101, 102, 110
$$P_l = \text{popularity of item } l \text{ among the K-Nearest Neighbors to user } i = \frac{\text{number of occurrences of item } l}{K}$$
[0079] Therefore,
TABLE-US-00004
Item ID (l)  Users with This Item  P.sub.l  S.sub.max,l
104          4, 2, 1               0.75     1
106          2, 5                  0.5      0.66
110          5, 1                  0.5      0.66
105          2                     0.25     0.66
109          5                     0.25     0.66
111          5                     0.25     0.66
101          1                     0.25     0.33
102          1                     0.25     0.33
[0080] Where: S.sub.max,l=maximum similarity across all users with
item l in their Favorites list.
[0081] Note: Popularity P.sub.l=1 when all KNN contain item l, and
P.sub.l=0 when no KNN contain item l.
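The two per-item quantities can be computed as in the following Python sketch. It takes the "Users with This Item" listing and the neighbor similarities from the tables above as given inputs; the variable and function names are illustrative assumptions.

```python
K = 4

# Similarity of each K-Nearest Neighbor to user i, from the tables above.
similarity_to_i = {4: 1.0, 2: 0.66, 5: 0.66, 1: 0.33}

# "Users with This Item" listing, copied from the table above.
users_with_item = {
    104: [4, 2, 1], 106: [2, 5], 110: [5, 1], 105: [2],
    109: [5], 111: [5], 101: [1], 102: [1],
}


def popularity(item):
    """P_l: fraction of the K neighbors that list item l."""
    return len(users_with_item[item]) / K


def max_similarity(item):
    """S_max,l: highest similarity among the neighbors that list item l."""
    return max(similarity_to_i[user] for user in users_with_item[item])


for item in users_with_item:
    print(item, popularity(item), max_similarity(item))
```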
[0082] Now, we can determine a ranking for every new item in the
K-Nearest Neighbors list:
[0083] For a given item l: R.sub.cf,l=W.sub.sim
(S.sub.max,l)+(1-W.sub.sim) P.sub.l,
[0084] where:
$$W_{sim} = \text{similarity weighting factor} = C_{max\,sim} \left(1 - \frac{1}{1 + n_i}\right),$$
[0085] where: 0 ≤ C.sub.max sim ≤ 1
[0086] In other words, R.sub.cf is a weighted sum of the maximum
user similarity for item l and the popularity of item l among KNN
such that 0 ≤ R.sub.cf ≤ 1.
[0087] The weighting factor is calculated as a function of n.sub.i
since the relative importance of user similarity, as compared to
popularity, increases with the number of specified Favorite items.
In other words, if a user has only specified one Favorite item,
n.sub.i=1, then the similarity will be either 0 or 1, and therefore
it does not have much meaning. Therefore, when n.sub.i is small,
similarity should be weighed less than popularity.
[0088] C.sub.max sim should be set to the value that the similarity
weighting factor should approach as n.sub.i becomes large. A good
range is probably 0.3 ≤ C.sub.max sim ≤ 0.8.
[0089] More specifically, the relationship of the similarity and
popularity weighting coefficients can be plotted as shown in FIG.
3.
[0090] Now, for each new item in KNN, we can calculate the rank
R.sub.cf:
TABLE-US-00005
Item ID  P.sub.l  S.sub.max,l  R.sub.cf,l
104      0.75     1            0.86
106      0.5      0.66         0.57
110      0.5      0.66         0.57
105      0.25     0.66         0.43
109      0.25     0.66         0.43
111      0.25     0.66         0.43
101      0.25     0.33         0.29
102      0.25     0.33         0.29
Assume C.sub.max sim = 0.6. For n.sub.i = 3: W.sub.sim = 0.45
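The ranks in the table above can be reproduced with a short Python sketch of the two formulas, assuming C.sub.max sim = 0.6 and n.sub.i = 3 as stated. The function names are illustrative assumptions; the P.sub.l and S.sub.max,l inputs are taken from the table for four representative items.

```python
def similarity_weight(n_i, c_max_sim):
    """W_sim = C_max_sim * (1 - 1 / (1 + n_i))."""
    if not 0 <= c_max_sim <= 1:
        raise ValueError("C_max_sim must lie in [0, 1]")
    return c_max_sim * (1 - 1 / (1 + n_i))


def cf_rank(s_max, p, w_sim):
    """R_cf,l = W_sim * S_max,l + (1 - W_sim) * P_l."""
    return w_sim * s_max + (1 - w_sim) * p


w_sim = similarity_weight(n_i=3, c_max_sim=0.6)

# (P_l, S_max,l) pairs for four items from the table above.
rows = {104: (0.75, 1.0), 106: (0.5, 0.66), 105: (0.25, 0.66), 101: (0.25, 0.33)}
ranks = {item: round(cf_rank(s_max, p, w_sim), 2)
         for item, (p, s_max) in rows.items()}
print(round(w_sim, 2), ranks)
```

For a user with a single Favorite item (n.sub.i = 1), the same function gives a weight of half of C.sub.max sim, which is how the formula shifts emphasis toward popularity when few Favorites are known.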
[0091] Note:
[0092] R.sub.cf is always between 0 and 1
[0093] If the maximum similarity to user i for item l is 1, and
item l is a Favorite of all KNN users, R.sub.cf=1.
[0094] The popularity will never be below 1/K, but the similarity
can be zero. As a result, R.sub.cf will never be 0 unless C.sub.max
sim=1 and n.sub.i→∞.
[0095] Optionally, embodiments of the present invention may also
include a factor for crawl quality in the ranking of search
results. By way of nonlimiting example, Application Crawler results
are ranked higher than RSS feed results and RSS feed results higher
than results from a generic web crawler.
[0096] Referring now to FIG. 4, one embodiment of a user interface
for presenting the search results is shown. As seen in FIG. 4, the
results may display a description of the video content, length of
video, time the video was posted, title, website origin, video
type, and/or video quality.
[0097] Referring now to FIG. 5, another embodiment of a user
interface is shown. This intuitive Media Center user interface may
be used to bring web video to a television and other non-PC video
devices. In one embodiment, the present invention provides
TiVo-style recommendations as well as keyword queries. As seen in
FIG. 1, the television interface (or Media Center interface) shown
in FIG. 5 may access the results from the ranking engine and
application crawler. Again, video quality, bit rate, description,
and other information may be displayed. Videos may also be
categorized based on categories such as, but not limited to, news,
sports, movies, and other subjects.
[0098] While the invention has been described and illustrated with
reference to certain particular embodiments thereof, those skilled
in the art will appreciate that various adaptations, changes,
modifications, substitutions, deletions, or additions of procedures
and protocols may be made without departing from the spirit and
scope of the invention. For example, with any of the above
embodiments, the recommendation may use a ranking scheme having
only a subset of the ranking terms set forth in the formula. By way
of example and not limitation, some embodiments may not include
Term 5, the Favorites Collaborative Filtering Ranking. In other
embodiments, variations may be made to the present embodiment such
as but not limited to computing the ranking terms in a different
order or the like. It should be understood that the present ranking
scheme is not limited to video files and may be used to rank or
organize other types of files. It should be understood that the
term "files" as in "video files" may include the delivery of the
content of the file in the form of a stream from a server (i.e. a
media server).
[0099] The publications discussed or cited herein are provided
solely for their disclosure prior to the filing date of the present
application. Nothing herein is to be construed as an admission that
the present invention is not entitled to antedate such publication
by virtue of prior invention. Further, the dates of publication
provided may be different from the actual publication dates which
may need to be independently confirmed. U.S. Provisional
Application Ser. No. 60/630,552 (Attorney Docket Number 41702-1002)
filed Nov. 22, 2004 and U.S. Provisional Application Ser. No.
60/630,423 (Attorney Docket Number 41702-1001) filed Nov. 22, 2004,
are fully incorporated herein by reference for all purposes. All
publications mentioned herein are incorporated herein by reference
to disclose and describe the structures and/or methods in
connection with which the publications are cited.
[0100] Expected variations or differences in the results are
contemplated in accordance with the objects and practices of the
present invention. It is intended, therefore, that the invention be
defined by the scope of the claims which follow and that such
claims be interpreted as broadly as is reasonable.
* * * * *