U.S. patent application number 13/552579 was filed with the patent office on 2012-07-18 and published on 2014-01-23 for determining user interest through detected physical indicia.
The applicant listed for this patent is David Deephanphongs. Invention is credited to David Deephanphongs.
Application Number | 13/552579 |
Family ID | 48917690 |
Filed Date | 2012-07-18 |
United States Patent Application | 20140026156 |
Kind Code | A1 |
Inventor | Deephanphongs; David |
Publication Date | January 23, 2014 |
Determining User Interest Through Detected Physical Indicia
Abstract
In accordance with some implementations, a method for
determining viewer interest is disclosed. The method is performed
on a client system having one or more processors, a camera, and
memory storing programs for execution. The electronic device
captures visual data and analyzes the captured visual data to
detect physical indicia of interest associated with a user of the
client system. The electronic device then determines a level of
interest of the user with respect to media content being displayed
in the proximity of the user based on the detected physical indicia
of interest. The electronic device then sends the determined level
of interest to a server system, the server system including an
interest profile for the user of the client system. The electronic
device then receives, from the server system, recommendations for
additional media content for the user based, at least in part, on
the determined level of interest.
Inventors: Deephanphongs; David (San Francisco, CA)
Applicant:
Name | City | State | Country | Type
Deephanphongs; David | San Francisco | CA | US |
Family ID: 48917690
Appl. No.: 13/552579
Filed: July 18, 2012
Current U.S. Class: 725/12
Current CPC Class: H04N 21/4223 20130101; H04N 21/44218 20130101;
H04N 21/6582 20130101; H04N 21/251 20130101; H04N 21/25866 20130101;
G06K 9/00228 20130101; G06Q 30/0201 20130101; G06K 9/00362 20130101
Class at Publication: 725/12
International Class: H04N 21/258 20110101 H04N021/258
Claims
1. A method for determining viewer interest, comprising: on a
client system having one or more processors, a camera, and memory
storing one or more programs for execution by the one or more
processors: capturing visual data of a user of the client system
with the camera; analyzing the captured visual data to detect
physical indicia of interest associated with a user of the client
system; based on the detected physical indicia of interest,
determining a level of interest of the user with respect to media
content being displayed in proximity to the user; generating an
interest score based on the determined level of interest, wherein
the interest score represents the level of interest the user has in
the media content being displayed in proximity to the user; sending
the interest score to a server system; the server system including
an interest profile for the user of the client system; and
receiving, from the server system, recommendations for additional
media content for the user based, at least in part, on the interest
score.
2. The method of claim 1, wherein detecting physical indicia of
interest includes: determining a first gaze point for a first eye
relative to a display; determining a second gaze point for a second
eye relative to a display; and measuring the distance between the
first gaze point and the second gaze point.
3. The method of claim 2, wherein detecting physical indicia of
interest further includes: determining a focus area of the user
based on the position of the first gaze point, the second gaze
point, and the distance between them.
4. The method of claim 1, wherein detecting physical indicia of
interest includes: determining an orientation of the user's
head.
5. The method of claim 1, further including: receiving, from the
server system, a list of events associated with the media being
displayed in proximity to the user of the client system.
6. The method of claim 5, wherein detecting physical indicia of
interest further includes: detecting a user's physical response to
the list of events received from the user.
7. The method of claim 5, wherein the list of events includes audio
events and visual events.
8. The method of claim 3, wherein detecting physical indicia of
interest further includes: receiving a stream of media content for
display in proximity to the user of a client system; analyzing the
stream of media content to determine a plurality of objects
currently being displayed, each object in the plurality of objects
having an associated location; determining, at a first time, a
first object intersecting with the focus area; determining, at a
second time, a second object intersecting with the focus area; and
determining whether the focus area intersects the same object at
both the first time and the second time.
9. An electronic device for determining viewer interest,
comprising: one or more processors; a camera, memory storing one or
more programs to be executed by the one or more processors; the one
or more programs comprising instructions for: capturing visual data
of a user of the client system with a camera; analyzing the
captured visual data to detect physical indicia of interest
associated with a user of the client system; based on the detected
physical indicia of interest, determining a level of interest of
the user with respect to media content being displayed in proximity
to the user; generating an interest score based on the determined
level of interest, wherein the interest score represents the level
of interest the user has in the media content being displayed in
proximity to the user; sending the interest score to a server
system; the server system including an interest profile for the
user of the client system; and receiving, from the server system,
recommendations for additional media content for the user based, at
least in part, on the interest score.
10. The electronic device of claim 9, wherein the instructions for
detecting physical indicia of interest further include instructions
for: determining a first gaze point for a first eye relative to a
display; determining a second gaze point for a second eye relative
to a display; and measuring the distance between the first gaze
point and the second gaze point.
11. The electronic device of claim 10, wherein the instructions for
detecting physical indicia of interest further include instructions
for: determining a focus area of the user based on the position of
the first gaze point, the second gaze point, and the distance
between them.
12. The electronic device of claim 9, further including
instructions for: receiving, from the server system, a list of
events associated with the media being displayed in proximity to
the user of the client system.
13. The electronic device of claim 12, wherein the instructions for
detecting physical indicia of interest further include instructions
for: detecting a user's physical response to the list of events
received from the user.
14. The electronic device of claim 11, wherein the instructions for
detecting physical indicia of interest further include instructions
for: receiving a stream of media content for display in proximity
to the user of a client system; analyzing the stream of media
content to determine a plurality of objects currently being
displayed, each object in the plurality of objects having an
associated location; determining, at a first time, a first object
intersecting with the focus area; determining, at a second time, a
second object intersecting with the focus area; and determining
whether the focus area intersects the same object at both the first
time and the second time.
15. A non-transitory computer readable storage medium storing one
or more programs configured for execution by an electronic device
with a camera, the one or more programs comprising instructions
for: capturing visual data of a user of the client system;
analyzing the captured visual data to detect physical indicia of
interest associated with a user of the client system; based on the
detected physical indicia of interest, determining a level of
interest of the user with respect to media content being displayed
in proximity to the user; generating an interest score based on the
determined level of interest, wherein the interest score represents
the level of interest the user has in the media content being
displayed in proximity to the user; sending the interest score to a
server system; the server system including an interest profile for
the user of the client system; and receiving, from the server
system, recommendations for additional media content for the user
based, at least in part, on the interest score.
16. The computer readable storage medium of claim 15, wherein the
instructions for detecting physical indicia of interest further
include instructions for: determining a first gaze point for a
first eye relative to a display; determining a second gaze point
for a second eye relative to a display; and measuring the distance
between the first gaze point and the second gaze point.
17. The computer readable storage medium of claim 16, wherein the
instructions for detecting physical indicia of interest further
include instructions for: determining a focus area of the user
based on the position of the first gaze point, the second gaze
point, and the distance between them.
18. The computer readable storage medium of claim 15 further
including instructions for: receiving, from the server system, a
list of events associated with the media being displayed in
proximity to the user of the client system.
19. The computer readable storage medium of claim 18, wherein the
instructions for detecting physical indicia of interest further
include instructions for: detecting a user's physical response to
the list of events received from the user.
20. The computer readable storage medium of claim 17, wherein the
instructions for detecting physical indicia of interest further
include instructions for: receiving a stream of media content for
display in proximity to the user of a client system; analyzing the
stream of media content to determine a plurality of objects
currently being displayed, each object in the plurality of objects
having an associated set of coordinates; determining, at a first
time, a first object intersecting with the focus area, determining,
at a second time, a second object intersecting with the focus area;
and determining whether the first object intersecting with the
focus area at the first time is the same as the second object
intersecting with the focus area at the second time.
Description
TECHNICAL FIELD
[0001] The disclosed implementations relate to the field of
displaying media content generally and in particular to
determining a user's interest in displayed media.
BACKGROUND
[0002] There are currently many avenues for users to consume media
content. In addition to traditional, non-interactive avenues such
as traditional television, radio, or projection screens in movie
theatres, new electronic devices provide additional avenues to
consume media content, such as streaming content over the Internet
via computers, smart phones, or tablets. Some of these additional
avenues are interactive and allow users to interact with the
distributors of media content. This increased interaction allows
distributors or producers of media content to provide more
personalized services to the consumers of the media content.
[0003] One option for producers or distributors of media content to
provide personalized services is through a recommendation engine.
Such engines select new media content to recommend to the user
based on information known about a user. Increasing the amount of
information that a recommendation engine has concerning a specific
user increases the accuracy with which the recommendation engine
recommends media content that the user will find interesting. As a
result, gathering information concerning what media content a user
finds interesting and what media content a user does not find
interesting is important to providing a good user experience.
[0004] The new avenues for viewing media content allow additional
interaction that allows media content distributors to more
efficiently gather information relating to a user's interest.
Generally, the user indicates interest in a piece of media content
by selecting a level of interest or otherwise rating the media
content. Many recommendation systems are integrated directly into
media content display platforms and allow users to indicate whether
or not they found a particular piece of media content
interesting.
SUMMARY
[0005] In accordance with some implementations, a method for
determining viewer interest is disclosed. The method is performed
on a client system having one or more processors, a camera, and
memory storing one or more programs for execution by the one or
more processors. The client system captures visual data of a user
of the client system with the camera. The client system analyzes
the captured visual data to detect physical indicia of interest
associated with a user of the client system. The client system then
determines a level of interest of the user with respect to media
content being displayed in the proximity of the user based on the
detected physical indicia of interest. The client system then sends
the determined level of interest to a server system which maintains
an interest profile for the user of the client system. The client
system then receives, from the server system, recommendations for
additional media content for the user based, at least in part, on
the determined level of interest.
[0006] In accordance with some implementations, a client system for
determining viewer interest is disclosed. The client system has one
or more processors, a camera, and memory storing one or more
programs to be executed by the one or more processors. The one or
more programs include instructions for capturing visual data of a
user of the client system with the camera. In some implementations,
the client system includes instructions for analyzing the captured
visual data to detect physical indicia of interest associated with
a user of the client system. The client system in some
implementations may also include instructions for determining a
level of interest of the user with respect to media content being
displayed in the proximity of the user based on the detected
physical indicia of interest. In some implementations, the client
system also includes instructions for sending the determined level
of interest to a server system, the server system including an
interest profile for the user of the client system. In some
implementations, the client system further includes instructions
for receiving, from the server system, recommendations for
additional media content for the user based, at least in part, on
the determined level of interest.
[0007] In accordance with some implementations, a non-transitory
computer readable storage medium storing one or more programs
configured for execution by a client system with an associated
camera is disclosed. The one or more programs also include
instructions for capturing visual data of a user of the client
system. The one or more programs further include instructions for
analyzing the captured visual data to detect physical indicia of
interest associated with a user of the client system. The one or
more programs also include instructions for determining a level of
interest of the user with respect to media content being displayed
in the proximity of the user based on the detected physical
indicia of interest. The one or more programs may also include
instructions for sending the determined level of interest to a
server system; the server system including an interest profile for
the user of the client system. The one or more programs further
include instructions for receiving, from the server system,
recommendations for additional media content for the user based, at
least in part, on the determined level of interest.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] FIG. 1 is a block diagram illustrating a client/server
environment including a client system with a display in accordance
with some implementations.
[0009] FIG. 2A is a block diagram illustrating a client system in
accordance with some implementations.
[0010] FIG. 2B is a block diagram of an event list received from a
server system in accordance with some implementations.
[0011] FIG. 3 is a block diagram illustrating a server system in
accordance with some implementations.
[0012] FIG. 4 is a flow diagram illustrating the process of using
detected physical indicia of a user to determine the interest a
user has in media being displayed on a display associated with a
client system in accordance with some implementations.
[0013] FIG. 5A depicts an example of determining user interest
through physical indicia in accordance with some
implementations.
[0014] FIG. 5B depicts an example of determining user interest
through physical indicia in accordance with some
implementations.
[0015] FIG. 5C depicts an example of determining user interest
through physical indicia in accordance with some
implementations.
[0016] FIG. 6A depicts an example of determining user interest
through tracking displayed objects and determining the user focus
area in accordance with some implementations.
[0017] FIG. 6B depicts an example of determining user interest
through tracking displayed objects and determining the user focus
area in accordance with some implementations.
[0018] FIG. 6C depicts an example of determining user interest
through tracking displayed objects and determining the user focus
area in accordance with some implementations.
[0019] FIG. 7 is a flow diagram illustrating the process of
detecting user interest based on physical indicia in accordance
with some implementations.
[0020] FIG. 8 is a flow diagram illustrating the process of
detecting user interest based on physical indicia in accordance
with some implementations.
[0021] FIG. 9 is a flow diagram illustrating the process of
detecting user interest based on physical indicia in accordance
with some implementations.
[0022] Like reference numerals refer to corresponding parts
throughout the drawings.
DESCRIPTION OF IMPLEMENTATIONS
[0023] In some implementations, a user of a client system views
media content via the client system on either a display integrated
into the client system or associated with the client system.
Providers of the media content find great value in determining the
user's attentiveness to the displayed media content as knowing the
user's interest in media content can help media providers tailor
future content or recommendations more closely to the user's
interests. Accordingly, in some implementations, a user's interest
in displayed media is determined by analyzing visual data of the
user (such as visual data from photographs or video) for physical
indicia of user interest. An advantage of such an implementation is
that the user does not have to actively indicate their interest to
the system.
[0024] In some implementations the client system includes the
ability to detect physical indicia associated with a user. For
example, the client system has access to an associated camera or a
microphone. The client system then uses the camera to capture and
store visual information about the user. The client system then
analyzes the captured visual information for any physical indicia
of interest in media content.
[0025] In some embodiments, determining physical indicia of
interest includes determining the position of the eyes of the user
using gaze tracking techniques. For example, the client system uses
the position and orientation of each eye to determine where the
user is looking relative to the display. By determining where the
user is looking the client system is able to determine whether the
user is focusing on the display. If the user is determined to be
focusing on the display associated with the client system, the
client system determines on what portion of the screen the user is
focusing. In some
implementations, the client system then uses this information to
determine a level of interest for the user associated with the
media currently being displayed.
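For illustration only, the gaze check described above might be
sketched as follows in Python. The sketch assumes that a gaze
tracker has already produced a gaze point in display pixel
coordinates. The names Display, is_on_display, and screen_region,
and the 3x3 grid of screen portions, are hypothetical and are not
part of the disclosed implementations.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Display:
    """Display size in pixels (hypothetical representation)."""
    width: int
    height: int


def is_on_display(display: Display, gaze: Point) -> bool:
    """Return True if the estimated gaze point falls inside the display."""
    x, y = gaze
    return 0 <= x < display.width and 0 <= y < display.height


def screen_region(display: Display, gaze: Point,
                  cols: int = 3, rows: int = 3) -> Optional[Tuple[int, int]]:
    """Return the (column, row) grid cell the user is looking at, or None
    when the gaze point is off the display. The grid size is an assumption."""
    if not is_on_display(display, gaze):
        return None
    x, y = gaze
    col = int(x * cols / display.width)
    row = int(y * rows / display.height)
    return col, row
```

A result of None could be treated as the user not focusing on the
display, while a grid cell identifies the portion of the screen on
which the user is focusing.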
[0026] In some implementations, the physical indicia of interest
determined from the visual information includes the position of a
user's head. By analyzing the position of the user's head, the
client system is able to estimate where the user is looking and
consequently, determine whether the user is looking at the display.
The client system then estimates user interest in the currently
displayed media. In other implementations, the determined physical
indicia of interest include the user's body lean. In other
implementations, the determined physical indicia of interest include
a user's reaction to a visual or audio event which occurs in the
media being displayed. For example, a user who physically reacts to
a surprising visual or startling loud sound in a movie (e.g. by
jumping or screaming) is likely more interested in the movie they
are watching than a user who does not react to a loud sound in a
movie.
[0027] In some implementations, an audio event includes information
about a song currently playing. The information includes the beats
per minute for a song (or the frequency or periodicity). The client
system 102 then analyzes captured visual information to determine
whether the user is moving with a periodicity (or frequency or
beats per minute) that matches the periodicity of the detected
song. A user moving (dancing, for example) at the same frequency
as a song indicates positive user engagement with the presented
audio event. For example, if a song is playing alone or as part of
the soundtrack of a movie, users who are very engaged with the
currently presented media are more likely to move in time (dance)
with the music.
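One possible way to compare the periodicity of a user's movement
with the beats per minute of a song is sketched below in Python.
The sketch assumes that the image analysis has already reduced the
captured visual data to a one-dimensional motion signal (for
example, vertical head position per video frame). The function
names, the tempo search range, and the tolerance are hypothetical.
The tempo scan over Fourier magnitudes is one technique among many
and is not stated in the disclosure.

```python
import math
from typing import Sequence


def dominant_bpm(motion: Sequence[float], sample_rate_hz: float,
                 min_bpm: int = 40, max_bpm: int = 200) -> int:
    """Estimate the dominant periodicity of a motion signal in beats per
    minute by scanning candidate tempos and keeping the one with the
    largest Fourier power."""
    if not motion:
        raise ValueError("motion signal is empty")
    mean = sum(motion) / len(motion)
    centered = [m - mean for m in motion]  # remove the constant offset
    best_bpm, best_power = min_bpm, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        step = 2 * math.pi * (bpm / 60.0) / sample_rate_hz
        re = sum(v * math.cos(step * i) for i, v in enumerate(centered))
        im = sum(v * math.sin(step * i) for i, v in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


def moves_in_time(motion: Sequence[float], sample_rate_hz: float,
                  song_bpm: float, tolerance_bpm: float = 5.0) -> bool:
    """Return True if the user's movement tempo is within the tolerance
    of the song tempo (tolerance value is an assumption)."""
    return abs(dominant_bpm(motion, sample_rate_hz) - song_bpm) <= tolerance_bpm
```

A longer observation window gives a sharper tempo estimate, so a
practical system would likely evaluate several seconds of movement
before treating a match as positive engagement.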
[0028] In some implementations, the client system sends the
determined interest level to a server system for further
processing, storage, and use (in a recommendation system, for
example). In some implementations, the client system removes
personally identifiable information before sending the interest
information to the server system. In some implementations the user
is able to log onto a service that tracks interest information over
time and keeps an interest profile for the user.
[0029] In some implementations, the server system uses the
determined interest received from the client system to increase the
accuracy of recommendation systems. For example, the determined
interest can be used to select specific genres, performers, or
topics that the user finds interesting. In some implementations
these recommendations can be presented to the user for selection.
In some implementations, the client system automatically begins
displaying the most highly recommended media without user
interaction. In some implementations the user must select the
specific media to be displayed.
[0030] FIG. 1 is a block diagram illustrating a client-server
environment 100, in accordance with some implementations. The
client-server environment 100 includes a client system 102 which is
part of a client environment 108 and a server system 120. In some
implementations, the client system 102-1 includes a display 104-1
and a camera 106-1. In some implementations, the client environment
108-2 includes a camera 106-2 and a display 104-2 associated with
the client system 102-2 but not integrated into the client system
102-2. The server system 120 includes a recommendation engine 122
and a media information database 130. The communication network
interface 112 may connect to any of a variety of networks,
including local area networks (LAN), wide area networks (WAN),
wireless networks, wired networks, the Internet, or a combination
of such networks.
[0031] In accordance with some implementations, the client
environment 108-1 includes a client system 102. In some
implementations, the client system 102-1 includes an incorporated
camera 106-1 and an incorporated display 104-1. The incorporated
camera 106-1 is a camera which is included in the client system
102-1 and is able to record visual information. The incorporated
display 104-1 is also included in the client system 102-1 and
displays media in the vicinity of the user.
[0032] In other implementations the client environment 108-2
includes a client system 102-2, a display 104-2, which is
associated with the client system 102-2 but is not integrated into
the client system 102-2, and a camera 106-2, which is associated
with the client system 102-2 but is not integrated into the client
system 102-2. The camera 106-2 is able to capture visual data of a
user in the vicinity of the media being displayed on the display
104-2 associated with client system 102-2. The associated display
104-2 is configured to display media in the vicinity of the user of
the client system 102-2.
[0033] In accordance with some implementations, the client system
102 receives a list of events 114 from the server system 120. The
list of events 114 received from the server system includes a list
of visual or auditory events which occur during a specific piece of
media. In some implementations each event in the list of events
includes a reference time that indicates the time at which the event
occurs, a duration time for the event, and, in the case of visual
events, an approximate location on the display on which the event
occurs. For example, a list of events for a movie may include the
following events: at 11 minutes and 37 seconds a loud
scream occurs and lasts for 3 seconds, at 38 minutes and 27 seconds
a large explosion takes place on the left half of the screen and
lasts for 15 seconds, and at 61 minutes and 10 seconds a kung fu
fight occurs between two characters and lasts for 2 minutes and 17
seconds.
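As a minimal sketch, the example list of events above could be
represented as follows in Python. The MediaEvent type, its field
names, the hms helper, and the classification of each example event
as audio or visual are assumptions made for illustration; the
disclosure does not specify a data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MediaEvent:
    kind: str                       # "audio" or "visual" (assumed labels)
    description: str
    start_s: int                    # reference time, seconds from media start
    duration_s: int                 # duration time, in seconds
    location: Optional[str] = None  # approximate display location, visual only

    @property
    def end_s(self) -> int:
        return self.start_s + self.duration_s


def hms(minutes: int, seconds: int) -> int:
    """Convert minutes and seconds to a number of seconds."""
    return minutes * 60 + seconds


# The three events from the example above.
EVENTS = [
    MediaEvent("audio", "loud scream", hms(11, 37), 3),
    MediaEvent("visual", "large explosion", hms(38, 27), 15, "left half"),
    MediaEvent("visual", "kung fu fight", hms(61, 10), hms(2, 17)),
]


def active_events(events: List[MediaEvent], t_s: int) -> List[MediaEvent]:
    """Return the events in progress at playback time t_s (in seconds)."""
    return [e for e in events if e.start_s <= t_s < e.end_s]
```

A client could call active_events with the current playback time to
decide which events a detected physical reaction should be compared
against.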
[0034] In accordance with some implementations, the client system
102 sends the determined interest 112 to the server system 120. The
determined interest represents the client system 102's estimate,
based on physical indicia, of the level of interest a user has
in the media currently or most recently displayed in the vicinity
of the user. This determined interest information may be recorded
in any format suitable for gauging interest. For example, the
determined interest may be represented by a numerical value between
0 and 1, where 0 represents no determined interest and 1 represents
full or maximum interest. Alternatively, interest may be
represented by choosing one of several distinct states. For
example, interest may be represented by assigning one of three
possible interest values (high interest, medium interest, or low
interest) to a user and reporting this value back to the server
system 120. In some implementations any variation or combination of
these interest scoring systems may be used.
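The two representations described above can be related by a simple
mapping, sketched here in Python. The InterestLevel type, the
function names, and the thresholds of one third and two thirds are
hypothetical choices; the disclosure does not specify how a
numerical value would be converted into distinct states.

```python
from enum import Enum


class InterestLevel(Enum):
    LOW = "low interest"
    MEDIUM = "medium interest"
    HIGH = "high interest"


def clamp_score(raw: float) -> float:
    """Limit a raw score to the range 0 (no interest) to 1 (maximum)."""
    return max(0.0, min(1.0, raw))


def to_level(score: float, medium_at: float = 1 / 3,
             high_at: float = 2 / 3) -> InterestLevel:
    """Map a numerical interest score to one of three distinct states.
    The threshold values are assumptions."""
    score = clamp_score(score)
    if score >= high_at:
        return InterestLevel.HIGH
    if score >= medium_at:
        return InterestLevel.MEDIUM
    return InterestLevel.LOW
```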
[0035] In accordance with some implementations, the server system
120 includes a recommendation engine 122 and a media information
database 130. The recommendation engine 122 is configured to
collect information concerning the interests of specific users. In
some implementations, this information is collected from a
plurality of sources. For example, user information can be
collected by aggregating user search history data, user web
navigation data, user media purchases, detected user physical
indicia of interest, user self-reported interest in specific media,
and any other source of user interest information. Based on the
collected user interest data the recommendation engine determines
specific media to recommend to the user. In some implementations,
the media determined by the recommendation engine 122 automatically
begins displaying on the display 104 associated with the client
system 102 without waiting for user selection. In other
implementations, the selected media does not begin displaying until
selected by a user.
[0036] In accordance with some implementations, the media
information database 130 includes specific details about specific
pieces of media. For example, the media information database 130
includes the genre information, cast information, director
information, event information, and other information related to
specific media. The server system 120 uses this information to
facilitate evaluation of potential recommendations by the
recommendation engine 122. The server system 120 also uses the
media information database 130 to generate a list of events 114 for
a specific piece of media content being displayed on a display 104
associated with a client system 102.
[0037] FIG. 2A is a block diagram illustrating a client system 102,
in accordance with some implementations. The client system 102
typically includes one or more processing units (CPU's) 202, one or
more network interfaces 210, memory 212, an associated camera 106,
and one or more communication buses 214 for interconnecting these
components. The client system 102 includes a user interface 204.
The user interface 204 includes an associated display device 104
and optionally includes an input means such as a keyboard, mouse, a
touch sensitive display, or other input buttons 208. Optionally,
the display device 104 includes an audio device or other
information delivery device. Furthermore, some client systems use a
microphone and voice recognition to supplement or replace the
keyboard.
[0038] Memory 212 includes high-speed random access memory, such as
DRAM, SRAM, DDR RAM or other random access solid state memory
devices; and may include non-volatile memory, such as one or more
magnetic disk storage devices, optical disk storage devices, flash
memory devices, or other non-volatile solid state storage devices.
Memory 212 may optionally include one or more storage devices
remotely located from the CPU(s) 202. Memory 212, or alternately
the non-volatile memory device(s) within memory 212, includes a
non-transitory computer readable storage medium. In some
implementations, memory 212 or the computer readable storage medium
of memory 212 stores the following programs, modules and data
structures, or a subset thereof: [0039] an operating system 216
that includes procedures for handling various basic system services
and for performing hardware dependent tasks; [0040] a network
communication module 218 that is used for connecting the client
system 102 to other computers via the one or more communication
network interfaces 210 (wired or wireless) and one or more
communication networks, such as the Internet, other wide area
networks, local area networks, metropolitan area networks, and so
on; [0041] a display module 220 for enabling display of media on a
display 104 associated with the client system 102; [0042] one or
more client system 102 applications module(s) 222 for enabling the
client system 102 to perform the functions offered by the client
system 102, including but not limited to: [0043] an image capture
module 224 for using the associated camera 106 to capture visual
data of a user in the vicinity of the client system 102; [0044] an
image analysis module 230 for analyzing the visual data captured by
the camera 106 to detect physical indicia of interest of a user in
the proximity of the displayed media content, including but not
limited to the position of the user's eyes, the position of the
user's head, the position of the user's body, and any movements
made by the user; [0045] an event tracking module 232 for receiving
a list of events from the server system (FIG. 1, 120) and comparing
the detected physical indicia of interest against the list of
events received from the server system (FIG. 1, 120) to more
accurately gauge the interest of the user by comparing the physical
reactions of a user specific events which occur during the media;
[0046] an object tracking module 234 for determining the position
of specific objects on the display 104 associated with the client
system 102, determining the gaze position of the user by analyzing
the head and eye positions of the user, determining whether, at a
first time, the gaze position of the user intersects with a
determined object, determining whether, at a second time, the gaze
position of the user intersects a determined object, and
determining whether the gaze position of the user intersects with
the same object at both the first and second times; and [0047] an
interest determination module 236 for determining the interest of a
user in the vicinity of the client system 102 in media currently
being displayed on the display 104 associated with the client
system 102 by gathering visual information to determine physical
indicia of interest and comparing the determined physical indicia
of interest to a list of events received from the server system
(FIG. 1, 120) or objects displayed on the display 104 associated
with client system 102; and [0048] a data module 240 for storing
data related to the client system 102, including but not limited
to: [0049] visual display data 242 including data to be displayed
on the display 104 associated with the client system 102, including
data necessary for media to be displayed, data necessary to display
a user interface to allow the user to effectively control the
client system 102, and any other data needed to effectively use the
associated display 104; [0050] user data 244 including information
concerning users of the client system 102 such as a user profile,
user preferences and interests, and other information relevant to
effectively providing services to the user; [0051] event data 246
including data received from the server system (FIG. 1, 120) that
lists audio or visual events in media which is currently displayed
or will be displayed in the future on the display 104 associated
with the client system 102; and [0052] media data 248 including
data associated with the media that is currently displayed or will
soon be displayed on the display 104 associated with the client
system 102.
[0053] FIG. 2B is a block diagram of an event list 246 received
from a server system (FIG. 1, 120) in accordance with some
implementations. Each event list includes one or more events 250.
Each event represents a specific audio or visual event that occurs
during the display of a specific piece of media content.
[0054] In some implementations, an event 250 includes additional
information concerning the event. In some implementations each
event includes one or more of: an event ID 252, a time 254, a
duration 256, an on screen location 258, and additional description
260. The time 254 included in each event 250 describes at what
point relative to the beginning of the piece of media the event
occurs. The time data 254 allows the client system (FIG. 1, 102) to
correlate specific user indicia of interest to specific events 250.
In some implementations each event 250 includes a duration 256 that
describes how long the event lasts from its start time 254. For
example, a scream or surprising visual would only last a few
seconds at most, while a car chase or martial arts fight scene
might have a duration of a few minutes or more.
[0055] In some implementations the event data 246 further includes
an on screen location 258 for visual events (such information is
not necessary for audio events). The on screen location data
includes coordinates indicating where on a display (FIG. 1, 104)
the visual event 250 is being displayed. The client system (FIG. 1,
102) uses this information to determine whether the user is
focusing on the displayed event 250. In some implementations the
event data 246 further includes description information 260 that
describes the event 250. In some implementations this information
consists of a list of categories or descriptors which describe the
event. For example, a car chase event might include categories such
as car chase, BMW, high speed driving, vehicle stunts, and urban
driving.
[0056] In some implementations the description information 260
includes a brief textual description of the event 250. For example
the description may be "Police officers chase a suspect at high
speeds through downtown Paris." In some implementations the client
system (FIG. 1, 102) uses this description information, together
with gathered physical indicia information, to analyze the interest
of a user more specifically. For example, the client system (FIG.
1, 102) is able to determine if a specific type or category of
event is of particular interest to a user. This interest
information may then be transmitted to a server system (FIG. 1,
120).
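By way of non-limiting illustration, an event 250 of FIG. 2B might be represented as in the following sketch. The field names, the use of seconds as the time unit, the normalized rectangle used for the on screen location 258, and the sample time values are assumptions made for illustration only and are not specified by the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Event:
    """Illustrative record for one event 250 (all field names are hypothetical)."""
    event_id: str        # event ID 252
    time_s: float        # time 254, in seconds from the beginning of the media
    duration_s: float    # duration 256, in seconds
    # On screen location 258 as (x, y, width, height) in normalized display
    # coordinates. None for audio events, which need no location.
    location: Optional[Tuple[float, float, float, float]] = None
    categories: List[str] = field(default_factory=list)  # descriptors (260)
    description: str = ""                                 # brief text (260)

    def is_active(self, playback_s: float) -> bool:
        """True while the event is occurring at the given playback time."""
        return self.time_s <= playback_s <= self.time_s + self.duration_s


def active_events(events: List[Event], playback_s: float) -> List[Event]:
    """Return the events occurring at the given playback time."""
    return [e for e in events if e.is_active(playback_s)]


# The car chase example from the text; the time and duration are made up.
car_chase = Event(
    event_id="ev-1",
    time_s=600.0,
    duration_s=120.0,
    location=(0.0, 0.0, 1.0, 1.0),
    categories=["car chase", "BMW", "high speed driving",
                "vehicle stunts", "urban driving"],
    description="Police officers chase a suspect at high speeds "
                "through downtown Paris.",
)
```

A record of this kind lets the client system look up which events are occurring at the moment a physical reaction is detected.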
[0057] FIG. 3 is a block diagram illustrating a server system 120,
in accordance with some implementations. The server system 120
typically includes one or more processing units (CPU's) 302, one or
more network interfaces 304, memory 306, and one or more
communication buses 308 for interconnecting these components.
[0058] Memory 306 includes high-speed random access memory, such as
DRAM, SRAM, DDR RAM or other random access solid state memory
devices; and may include non-volatile memory, such as one or more
magnetic disk storage devices, optical disk storage devices, flash
memory devices, or other non-volatile solid state storage devices.
Memory 306 may optionally include one or more storage devices
remotely located from the CPU(s) 302. Memory 306, or alternately
the non-volatile memory device(s) within memory 306, includes a
non-transitory computer readable storage medium. In some
implementations, memory 306 or the computer readable storage medium
of memory 306 stores the following programs, modules and data
structures, or a subset thereof: [0059] an operating system 310
that includes procedures for handling various basic system services
and for performing hardware dependent tasks; [0060] a network
communication module 312 that is used for connecting the server
system 120 to other computers via the one or more communication
network interfaces 304 (wired or wireless) and one or more
communication networks, such as the Internet, other wide area
networks, local area networks, metropolitan area networks, and so
on; [0061] one or more server application module(s) 314 for
enabling the server system 120 to perform the functions offered by
the server system 120, including but not limited to: [0062] a
recommendation engine 122 for using collected user information 324
and media information database 130 to determine media of interest
to a user of the client system (FIG. 2, 102) and to send a
determined recommendation to the user of the client system (FIG. 2,
102); [0063] a media determination module 316 for determining the
media being displayed at a client system (FIG. 1, 102), wherein the
media being displayed at a client system (FIG. 1, 102) is
determined by receiving the identification of the media from the
client system (FIG. 1, 102), analyzing the data being displayed at
the display (FIG. 1, 104) associated with the client system (FIG.
1, 102), or, in the case where the media displayed at the client
system (FIG. 1, 102) is being provided by the server system 120,
determining the media being transmitted to the client system (FIG.
1, 102); [0064] an event selection module 318 for determining a
list of events to send to the client system (FIG. 1, 102) based on
the media determined to be displayed on the display (FIG. 1, 104)
associated with the client system (FIG. 1, 102) and the
information stored in the media information database 130; and
[0065] a data reception module 320 for receiving data from the
client system (FIG. 1, 102) including interest information 326
determined by analyzing physical indicia from the user of the
client system (FIG. 1, 102); and [0066] one or more server data
module(s) 322 for storing data related to the server system 120,
including but not limited to: [0067] media information database 130
including specific details about particular pieces of media,
including, for example, the genre information, cast information,
director information, event information, and other information
related to specific media; [0068] user data 324 including
information concerning users of the client system (FIG. 1, 102)
such as a user profile, user preferences and interests, and other
information relevant to effectively providing services to the user;
[0069] interest data 326 including data received from the client
system (FIG. 1, 102) that indicates the level of interest a user
has for one or more pieces of media; and [0070] media display data
328 including data for, when the server system 120 provides media
data to the client system (FIG. 1, 102), displaying media content
on a display.
[0071] FIG. 4 is a flow diagram illustrating the process of using
detected physical indicia of a user to determine the interest a
user has in media being displayed on a display (FIG. 1, 104)
associated with a client system 102, in accordance with some
implementations. In some implementations, the server system 120
initially sends an event list 412 to the client system 102. The
event data list 246 includes information concerning visual or
auditory events which occur during a specific piece of media. In
some implementations each event in the list of events includes A) a
reference time that indicates the time at which the event occurs,
B) a duration time for the event, and, in the case of visual
events, C) an approximate location on the display on which the
event occurs. For example, a list of events for a movie may
include the following: at 11 minutes and 37 seconds
a loud scream occurs and lasts for 3 seconds, at 38 minutes and 27
seconds a large explosion takes place on the left half of the
screen and lasts for 15 seconds, and at 61 minutes and 10 seconds a
kung fu fight occurs between two characters and lasts for 2 minutes
and 17 seconds.
[0072] In accordance with some implementations, the client system
102 receives the list of events 412 and displays media on the
display (FIG. 1, 104) associated with the client system 102. The
client system 102 receives visual information data 406 from an
associated camera 106. In some implementations the client
system 102 analyzes the visual information data 406
received from the camera 106 to determine whether there are any
physical indicia of interest in the visual information data 406 of
the user of the client system 102.
[0073] In some implementations the client system 102 also receives
audio data 408 from a microphone associated with the client system
102. This audio data can then be analyzed to determine whether
there are any audio indicia of interest from a user. For example,
the list of events 412 received from the server 120 may include an
event which is likely to produce an auditory reaction, such as a
startling or surprising character suddenly jumping onto the
screen at a tense moment. A user who is very interested in the
media currently being displayed is more likely to react audibly to
startling or surprisingly scary events in the media being
displayed.
[0074] In some implementations the client system 102 analyzes the
data received from the camera 106 and the microphone 404 to
determine physical indicia of interest. For example, the client
system 102 analyzes the visual data received from the camera 106 to
determine the position of the user's eyes and, from that
information, determines the sight lines of each eye and then
determines where, relative to the display, the user's gaze is
focused. Based on the determined
user's gaze point the client system 102 is able to estimate a
user's interest in the media currently being displayed. The client
system 102 is also able to estimate interest by analyzing the
position of the user's head to determine generally where the user
is looking, the body lean of the user, and the user's reactions to
the media currently being displayed.
[0075] In some implementations, the client system 102 uses the list
of events 412 received from the server system 120 to help determine
a user's level of interest. The client system 102 correlates the
list of events 412 with the visual data 406 to improve the ability
of the client system 102 to accurately determine the user's
interest in the media currently being displayed. For example, if
the list of events 412 describes a large explosion at a particular
point in the media, the client system 102 can specifically see
whether the user has a physical reaction to the noted explosion. A
user who physically reacts to specific events will be determined to
be more interested in the currently displayed media than a user who
does not physically react to specific events.
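As a hedged sketch of the correlation described above, the following code counts how many listed events coincide with a detected physical reaction. The event times are taken from the example list of events given earlier; the function name, the grace period after an event ends, and the use of a simple ratio as the result are assumptions for illustration and are not part of the described method.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes-and-seconds offset into seconds."""
    return minutes * 60 + seconds


# (start_s, duration_s, label), taken from the example list of events.
EVENTS = [
    (to_seconds(11, 37), 3, "loud scream"),
    (to_seconds(38, 27), 15, "large explosion"),
    (to_seconds(61, 10), to_seconds(2, 17), "kung fu fight"),
]


def reaction_ratio(events, reaction_times, grace_s=2.0):
    """Fraction of events during which at least one reaction was detected.

    reaction_times holds playback times (in seconds) at which the client
    detected a physical reaction. grace_s is an assumed allowance for
    reactions that arrive shortly after an event ends.
    """
    if not events:
        return 0.0
    hits = 0
    for start_s, duration_s, _label in events:
        end_s = start_s + duration_s + grace_s
        if any(start_s <= t <= end_s for t in reaction_times):
            hits += 1
    return hits / len(events)


# The user reacted to the scream and to the explosion, but not to the fight.
ratio = reaction_ratio(EVENTS, [698.5, 2310.0])
```

Under these assumptions, a higher ratio would suggest a higher level of interest in the currently displayed media.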
[0076] In accordance with some implementations, the client system
transmits the determined user interest data 410 to the server
system 120. The user interest data 410 includes a score or ranking
representing the degree to which the user is interested in a
particular piece of media. The user interest data 410 includes data
identifying the media to which the interest score or ranking
applies.
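One possible encoding of the user interest data 410 is sketched below. The description only requires a score or ranking together with data identifying the media; the JSON format, the field names, and the 0.0 to 1.0 score range are assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UserInterestData:
    """Illustrative payload for user interest data 410 (names are hypothetical)."""
    media_id: str          # identifies the media to which the score applies
    interest_score: float  # assumed range: 0.0 (no interest) to 1.0 (high)


def encode_interest(data: UserInterestData) -> str:
    """Serialize the interest data for transmission to the server system."""
    if not 0.0 <= data.interest_score <= 1.0:
        raise ValueError("interest_score must be between 0.0 and 1.0")
    return json.dumps(asdict(data), sort_keys=True)


payload = encode_interest(UserInterestData(media_id="movie-123",
                                           interest_score=0.8))
```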
[0077] In accordance with some implementations the server system
120 receives the user interest data 410 and stores it for further
use. In some implementations, the server system 120 uses this user
interest data 410 as data for the recommendation engine (FIG. 1,
122) to more accurately predict additional media that would be of
interest to a user. The user interest data 410 received from the
client system 102 is obtained without having to require interaction
from the user. In addition, physical indicia may indicate user
interest in media of which a user is not aware or which a user
would not volunteer to a recommendation engine if the information
were not automatically collected. In some implementations, the
received user interest data 410 is combined with other information
the server system has collected about the user to make a more
accurate determination regarding future recommendations. In some
implementations the user is able to log into a service which has a
user profile for the user already constructed. The user profile
includes a more extensive record of the user's previously indicated
interests and other information relevant to making
recommendations.
[0078] FIG. 5A depicts an example of determining user interest
through physical indicia, in accordance with some implementations.
In this example, the client system (FIG. 1, 102) analyzes captured
visual data to determine the position and rotation of a user's eyes.
Based on the determined position and rotation of a user's eyes, the
client system (FIG. 1, 102) determines the sight line of each eye and
where that sight line intersects with a display 522 that is
currently displaying media. The client system (FIG. 1, 102) maps
each eye independently. In accordance with some implementations the
client system (FIG. 1, 102) determines where the left eye's sight
line intersects the display 522 and records the left eye gaze point
(A) 504. The client system (FIG. 1, 102) determines where the right
eye's sight line intersects the display 522 and records the right
eye gaze point (B) 506.
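The determination of a gaze point from an eye's sight line can be illustrated with a simple ray and plane intersection. The coordinate convention below (the display lying in the plane z = 0 with its lower left corner at the origin, and the user located at positive z) and the function names are assumptions made for this sketch; the description does not prescribe a particular geometric model.

```python
def gaze_point_on_plane(eye_pos, direction):
    """Intersect one eye's sight line with the display plane z = 0.

    eye_pos is the (x, y, z) position of the eye, with z > 0 in front of
    the display. direction is the (dx, dy, dz) sight line direction.
    Returns the (x, y) gaze point, or None if the sight line never
    reaches the display plane.
    """
    ex, ey, ez = eye_pos
    dx, dy, dz = direction
    if dz >= 0:
        return None  # looking away from, or parallel to, the display plane
    t = -ez / dz     # distance along the sight line to the plane z = 0
    return (ex + t * dx, ey + t * dy)


def on_display(point, width, height):
    """True if a gaze point lies within the bounds of the display."""
    if point is None:
        return False
    x, y = point
    return 0 <= x <= width and 0 <= y <= height


# Each eye is mapped independently, as in the text (positions are made up).
left_gaze = gaze_point_on_plane((0.47, 0.3, 2.0), (0.0, 0.0, -1.0))
right_gaze = gaze_point_on_plane((0.53, 0.3, 2.0), (0.0, 0.0, -1.0))
```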
[0079] In accordance with some implementations the client system
(FIG. 1, 102) measures the distance between the left eye gaze point
(A) 504 and the right eye gaze point (B) 506. The client system
(FIG. 1, 102) uses the measured distance 502 between the left and
right gaze points to determine where the user's focus is located.
In some implementations the client system (FIG. 1, 102) determines
that the user is not focused on the display associated with the
client system (FIG. 1, 102). For example, when the measured
distance 502 between the left gaze point (504) and the right gaze
point (506) is greater than a predetermined value, the client
system (FIG. 1, 102) is able to determine that the user's focus
is behind the display 522. Determining that the user's focus is
behind the display 522 indicates that the user does not have high
interest in the currently displayed media. In some implementations,
the client system (FIG. 1, 102) determines that the user's left
gaze point (504) and the right gaze point (506) do not intersect
with the display (FIG. 1, 104) associated with the client system
(FIG. 1, 102) and thus determines that the user is not focusing on
the display (FIG. 1, 104).
[0080] FIG. 5B depicts an example of determining user interest
through physical indicia, in accordance with some implementations.
In this example, the client system (FIG. 1, 102) determines the
viewer's left gaze point (A) 514 and the right gaze point (B)
512. In accordance with some implementations the distance between
the right and left gaze points is less than a predetermined
distance. When the determined distance 510 is less than a
predetermined distance the client system (FIG. 1, 102) is able to
determine that the user is focusing on the display 524 and to
determine a focus area 508 on the display 524. The focus area 508
represents the area on the display 524 that the user is focusing
on. In some implementations when the distance 510 between the left
gaze point 514 and the right gaze point 512 is less than a
predetermined value the client system (FIG. 1, 102) determines that
the user's interest in the currently displayed media is relatively
high.
[0081] FIG. 5C depicts an example of determining user interest
through physical indicia, in accordance with some implementations.
In this example, the client system (FIG. 1, 102) determines the
left gaze point (A) 520 and the right gaze point (B) 518. In some
implementations the left gaze point (A) 520 is on the right side of
the right gaze point (B) 518. In this case, the client system (FIG.
1, 102) can determine that the user's focus is on something in front of
the screen, regardless of the distance 516 between the left gaze
point 520 and the right gaze point 518. Based on this
determination, the client system (FIG. 1, 102) determines that the
user has relatively low interest in the currently displayed
media.
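The three cases of FIGS. 5A, 5B and 5C can be summarized in a single classification routine, sketched below. The function name, the returned labels, and the convention that x increases toward the viewer's right are assumptions for illustration; the predetermined distance is passed in as a parameter because no value is given in the description.

```python
import math


def classify_focus(left_gaze, right_gaze, max_separation):
    """Classify where the user's focus lies relative to the display.

    left_gaze and right_gaze are (x, y) gaze points on the display plane,
    or None when a sight line does not intersect the display.
    max_separation stands in for the predetermined distance.
    """
    if left_gaze is None or right_gaze is None:
        return "off_display"           # a sight line misses the display
    if left_gaze[0] > right_gaze[0]:
        # FIG. 5C: the left gaze point is to the right of the right gaze
        # point, so the sight lines cross in front of the screen. This
        # holds regardless of the distance between the two points.
        return "in_front_of_display"
    separation = math.hypot(right_gaze[0] - left_gaze[0],
                            right_gaze[1] - left_gaze[1])
    if separation > max_separation:
        return "behind_display"        # FIG. 5A: focus is behind the display
    return "on_display"                # FIG. 5B: focus is on the display


def focus_center(left_gaze, right_gaze):
    """Midpoint of the two gaze points, used here as the focus area center."""
    return ((left_gaze[0] + right_gaze[0]) / 2,
            (left_gaze[1] + right_gaze[1]) / 2)
```

In this sketch only the "on_display" result corresponds to relatively high interest; the other results correspond to the low interest or not focused determinations described above. A practical implementation would likely add a tolerance to the crossing test to absorb measurement noise.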
[0082] In some implementations more than one user is in the
vicinity of the client system (FIG. 1, 102) which is displaying
media content on its associated display. In some implementations,
the client system (FIG. 1, 102) has a profile associated with
each user and measures each user's interest individually. This is
accomplished by identifying each user, via facial recognition for
example, and then tracking each individual's physical indicia of
interest. In other implementations, the client system (FIG. 1, 102)
does not have profiles associated with all the users. In
this circumstance the client system (FIG. 1, 102) will identify the
primary user of the client system (FIG. 1, 102) and determine the
primary user's interest. The primary user may be identified by
facial recognition, proximity to the client system (FIG. 1, 102),
or proximity to a remote control associated with the client system
(FIG. 1, 102).
[0083] In some implementations, the client system (FIG. 1, 102)
does not have individual profiles for each user and cannot or has
not identified a primary user. In these circumstances the client
system (FIG. 1, 102) tracks the interest level for all available
users and then compares the levels of interest. In accordance with
a determination that all available users have comparable levels of
interest, the interest levels are averaged together. In accordance
with a determination that all the available users have sufficiently
different levels of interest, such that no real consensus is
reached, the various different levels of interest are all discarded
and no level of interest is sent to the server system (FIG. 1,
120).
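The averaging and discarding behavior described above might be sketched as follows. The numeric interest scale and the max_spread threshold used to decide whether levels are "comparable" are assumptions; the description does not define how comparability is measured.

```python
def combine_interest(levels, max_spread=0.2):
    """Combine interest levels from several unidentified users.

    levels is a list of numeric interest levels, one per available user.
    Returns the average when the levels are comparable (assumed here to
    mean that they differ by no more than max_spread). Returns None when
    no consensus is reached, in which case nothing is sent to the server.
    """
    if not levels:
        return None
    if max(levels) - min(levels) <= max_spread:
        return sum(levels) / len(levels)
    return None
```

For example, levels of 0.7, 0.8 and 0.75 would be averaged, while levels of 0.9 and 0.1 would be discarded.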
[0084] FIG. 6A depicts an example of determining user interest
through tracking displayed objects and determining the user focus
area at a first point in time, in accordance with some
implementations. In this example, the client system (FIG. 1, 102)
determines a list of objects that are currently displayed on the
display 608-1 (objects A 604-1, B 606-1, and C 610-1). The client
system (FIG. 1, 102) tracks the position of each object on the
display 608-1 and determines the focus area 602-1 of a user at
multiple different times. By tracking the movement of objects on
the display 608-1 through time and also tracking the user's focus
area through time, the client system (FIG. 1, 102) can determine
whether the user's focus area is following a specific object. In
some implementations determining that the user's focus area 602-1
is following a specific object through different times indicates
that the user's interest in the media is high.
[0085] In accordance with some implementations, the client system
(FIG. 1, 102) determines the focus area 602-1 of the user. The
client system (FIG. 1, 102) then determines whether the focus area
602-1 intersects with any of the objects currently displayed on the
display 608-1. In this example, the focus area 602-1
intersects with object A 604-1. The client system (FIG. 1, 102)
stores this information for future use.
[0086] FIG. 6B depicts an example of determining user interest
through tracking displayed objects and determining the user focus
area at a second point in time, in accordance with some
implementations. In this example, the objects are the same as those
depicted in FIG. 6A, but have moved between the first time and the
second time. The client system (FIG. 1, 102) determines the
positions of the objects on the display 608-2 and the user's focus
area 602-2 at a second time. As can be seen, relative to the
display at time one in FIG. 6A, object A 604-2 and object B 606-2
have moved position on the display and object C 610-1 has left the
display 608-2 entirely. Further, object D
612-2 has entered the display 608-2. The client system (FIG. 1,
102) determines the position of the user focus area 602-2. In this
example the user focus area has moved relative to its position at
the first time as seen in FIG. 6A.
[0087] In accordance with some implementations, the client system
(FIG. 1, 102) determines the position of the user focus area 602-2
and whether it intersects with any objects currently displayed. In
this example the user's focus area 602-2 intersects with object A.
In some implementations the client system (FIG. 1, 102) compares
the focus area intersect data from the first time with the focus
area intersect data from the second time to determine whether the
user's focus area 602-2 has followed a specific object from the
first time to the second time. In this example, the user's focus
area 602-2 intersects with object A at both the first and the
second time. In some implementations, the client system (FIG. 1,
102) determines that the user's interest in the displayed media is
relatively high based on determining that the user's focus area has
followed a specific object from the first time to the second
time.
[0088] FIG. 6C depicts an example of determining user interest
through tracking displayed objects and determining the user focus
area at a third point in time, in accordance with some
implementations. In this example, the objects are the same as those
depicted in FIG. 6A but the objects have moved between the first
time and the third time. The client system (FIG. 1, 102) determines
the position of objects on the display 608-3 and the position of
the user focus area 602-3. In this example the objects A 604-3 and
B 606-3 have moved from the original positions from the first time
as depicted in FIG. 6A. Object C 610-1 has left the display 608-3
and object D 612-2 has entered the display 608-3. In contrast to
the example depicted in FIG. 6B, the user's focus area 602-3 has
not moved relative to its position at the first time depicted in
FIG. 6A. Thus, the user's focus area has not moved from the first
time to the third time. In some implementations, the client system
(FIG. 1, 102) determines that the user interest in the displayed
media is relatively low based on the fact that the user's focus
area has not changed despite movement of the displayed objects.
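The comparisons illustrated in FIGS. 6A through 6C can be sketched with axis-aligned rectangles standing in for the displayed objects and the focus area. The rectangle representation, the coordinates, the function names, and the three result labels are assumptions made for illustration only.

```python
def intersects(a, b):
    """True if two (x, y, width, height) rectangles overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def objects_in_focus(focus_area, objects):
    """Names of the displayed objects that the focus area intersects."""
    return {name for name, rect in objects.items()
            if intersects(focus_area, rect)}


def estimate_interest(focus_1, objects_1, focus_2, objects_2):
    """Compare focus and object positions at two times."""
    followed = (objects_in_focus(focus_1, objects_1)
                & objects_in_focus(focus_2, objects_2))
    if followed:
        return "high"          # the focus followed the same object
    if focus_1 == focus_2 and objects_1 != objects_2:
        return "low"           # objects moved but the focus area did not
    return "undetermined"


# Made-up positions loosely following FIGS. 6A to 6C.
first = {"A": (10, 10, 20, 20), "B": (60, 10, 20, 20), "C": (40, 60, 20, 20)}
later = {"A": (50, 40, 20, 20), "B": (10, 60, 20, 20), "D": (80, 80, 15, 15)}
focus_first = (15, 15, 10, 10)     # on object A at the first time
focus_following = (55, 45, 10, 10)  # moved with object A
focus_static = (15, 15, 10, 10)     # unchanged from the first time
```

With these values, comparing the first time against focus_following yields "high", while comparing it against focus_static yields "low".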
[0089] FIG. 7 is a flow diagram illustrating the process of
detecting user interest based on physical indicia, in accordance
with some implementations. Each of the operations shown in FIG. 7
may correspond to instructions stored in a computer memory or
computer readable storage medium. Optional operations are indicated
by dashed lines (e.g., boxes with dashed-line borders). In some
implementations, the method described in FIG. 7 is performed by the
client system (FIG. 1, 102).
[0090] In accordance with some implementations, the client system
(FIG. 1, 102) receives, from the server system (FIG. 1, 120), a
list of events (FIG. 2, 246) associated with the media being
displayed in the proximity of the user of the client system (702).
In some implementations, a camera (FIG. 1, 106) captures visual
data of a user of a client system (FIG. 1, 102) and transmits the
visual data to the client system (FIG. 1, 102). In some
implementations, the client system (FIG. 1, 102) analyzes the
captured visual data to detect physical indicia of interest
associated with a user of the client system (706). In some
implementations, analyzing the captured visual data includes
determining an orientation of the user's head (708). In some
implementations analyzing the captured visual data includes
detecting a user's physical response to the list of events received
from the server system (710).
[0091] In accordance with some implementations, the client system
(FIG. 1, 102) analyzing the captured visual data includes
determining a first gaze point for a first eye relative to a
display (712). The client system (FIG. 1, 102) further determines a
second gaze point for a second eye relative to a display (714). The
client system (FIG. 1, 102) further measures the distance between
the first gaze point and the second gaze point (716). The client
system (FIG. 1, 102) further determines a focus area of the user
based on the position of the first gaze point, the second gaze
point, and the distance between them (718).
[0092] FIG. 8 is a flow diagram illustrating the process of
detecting user interest based on physical indicia, in accordance
with some implementations. Each of the operations shown in FIG. 8
may correspond to instructions stored in a computer memory or
computer readable storage medium. Optional operations are indicated
by dashed lines (e.g., boxes with dashed-line borders). In some
implementations, the method described in FIG. 8 is performed by the
client system (FIG. 1, 102).
[0093] In accordance with some implementations, the client system
(FIG. 1, 102) analyzing the captured visual data includes receiving
a stream of media content for display in proximity to the user of a
client system (804). The client system (FIG. 1, 102) further
analyzes the stream of media content to determine a plurality of
objects currently being displayed, each object in the plurality of
objects having an associated (806). The client system (FIG. 1, 102)
further determines, at a first time, a first object intersecting
with the user's focus area (808). The client system (FIG. 1, 102)
further determines, at a second time, a second object intersecting
with the user's focus area (810). The client system (FIG. 1, 102)
further determines whether the focus area intersects the
same object at both the first time and the second time (812).
[0094] For example, the client system (FIG. 1, 102) identifies
three objects on a screen, a main character, a vehicle, and a
chandelier. The client system (FIG. 1, 102) tracks the location of
each object while the media is being displayed. The client system
(FIG. 1, 102) also tracks the visual focus area of the user. So, if
the client system (FIG. 1, 102) determines that, at a first time,
the user's focus area intersects with the main character object
and, at a second time, the user's focus area still intersects with
the main character object despite the object having moved, the
client system (FIG. 1, 102) determines that the user's interest
level in this media is relatively high.
Conversely if the user's focus area remains unchanged despite the
displayed objects changing position, this indicates that the user's
interest level is relatively low.
[0095] FIG. 9 is a flow diagram illustrating the process of
detecting user interest based on physical indicia, in accordance
with some implementations. Each of the operations shown in FIG. 9
may correspond to instructions stored in a computer memory or
computer readable storage medium. Optional operations are indicated
by dashed lines (e.g., boxes with dashed-line borders). In some
implementations, the method described in FIG. 9 is performed by the
client system (FIG. 1, 102).
[0096] In accordance with some implementations, the client system
(FIG. 1, 102) determines a level of interest of the user with
respect to media being displayed in the proximity of a user based
on the detected physical indicia of interest (902). The client
system (FIG. 1, 102) sends the determined level of interest to a
server system (FIG. 1, 120) including an interest profile for the
user of the client system (904). The client system (FIG. 1, 102)
receives, from the server system (FIG. 1, 120), recommendations for
additional media content for the user based, at least in part, on
the determined level of interest (906).
[0097] The foregoing description, for purpose of explanation, has
been described with reference to specific implementations. However,
the illustrative discussions above are not intended to be
exhaustive or to limit the invention to the precise forms
disclosed. Many modifications and variations are possible in view
of the above teachings. The implementations were chosen and
described in order to best explain the principles of the invention
and its practical applications, to thereby enable others skilled in
the art to best utilize the invention and various implementations
with various modifications as are suited to the particular use
contemplated.
[0098] It will also be understood that, although the terms first,
second, etc. may be used herein to describe various elements, these
elements should not be limited by these terms. These terms are only
used to distinguish one element from another. For example, a first
contact could be termed a second contact, and, similarly, a second
contact could be termed a first contact, without departing from the
scope of the present implementations. The first contact and the
second contact are both contacts, but they are not the same
contact.
[0099] The terminology used in the description of the
implementations herein is for the purpose of describing particular
implementations only and is not intended to be limiting. As used in
the description of the implementations and the appended claims, the
singular forms "a," "an," and "the" are intended to include the
plural forms as well, unless the context clearly indicates
otherwise. It will also be understood that the term "and/or" as
used herein refers to and encompasses any and all possible
combinations of one or more of the associated listed items. It will
be further understood that the terms "comprises" and/or
"comprising," when used in this specification, specify the presence
of stated features, integers, steps, operations, elements, and/or
components, but do not preclude the presence or addition of one or
more other features, integers, steps, operations, elements,
components, and/or groups thereof.
[0100] As used herein, the term "if" may be construed to mean
"when" or "upon" or "in response to determining" or "in response to
detecting," depending on the context. Similarly, the phrase "if it
is determined" or "if (a stated condition or event) is detected"
may be construed to mean "upon determining" or "in response to
determining" or "upon detecting (the stated condition or event)" or
"in response to detecting (the stated condition or event),"
depending on the context.
* * * * *