U.S. patent application number 12/195259 was filed with the patent office on 2008-08-20 and published on 2009-02-26 for system and method for providing real time targeted rating to enable content placement for video audiences.
This patent application is currently assigned to Ads-Vantage, Ltd. Invention is credited to Reuven Cohen, Raviv Knoller, Anna Litvak-Hinenzon, Alex Paker.
Application Number | 20090055862 12/195259 |
Document ID | / |
Family ID | 40378757 |
Filed Date | 2008-08-20 |
United States Patent
Application |
20090055862 |
Kind Code |
A1 |
Knoller; Raviv ; et al. |
February 26, 2009 |
SYSTEM AND METHOD FOR PROVIDING REAL TIME TARGETED RATING TO ENABLE
CONTENT PLACEMENT FOR VIDEO AUDIENCES
Abstract
A system and method is provided for providing real time targeted
rating to enable content placement for video audiences. The method
includes determining if at least one set top box, located within a
network having at least one set top box, is on or off, wherein
being on is defined as a set top box having a zapping event occur
within a predefined time period; determining what one or more
viewer profiles are currently consuming content provided by a set
top box within the network, wherein currently consuming refers to
consuming within the predefined period; and determining targeted
rating per a viewer profile that had been identified as currently
consuming content via at least one of the set top boxes within the
network.
Inventors: |
Knoller; Raviv; (Shoham,
IL) ; Paker; Alex; (Modiin, IL) ;
Litvak-Hinenzon; Anna; (Hod-Ha Sharon, IL) ; Cohen;
Reuven; (Rehovot, IL) |
Correspondence
Address: |
SHEEHAN PHINNEY BASS & GREEN, PA;c/o PETER NIEVES
1000 ELM STREET
MANCHESTER
NH
03105-3701
US
|
Assignee: |
Ads-Vantage, Ltd.
Shoham
IL
|
Family ID: |
40378757 |
Appl. No.: |
12/195259 |
Filed: |
August 20, 2008 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60956728 | Aug 20, 2007 | |
Current U.S.
Class: |
725/34 |
Current CPC
Class: |
H04N 21/23424 20130101;
H04N 21/2665 20130101; G06Q 30/04 20130101; H04N 21/6405 20130101;
H04N 21/44222 20130101; G06Q 30/0264 20130101; G06Q 30/02 20130101;
G06Q 30/0601 20130101; H04N 21/812 20130101; H04N 21/252 20130101;
H04N 7/17318 20130101 |
Class at
Publication: |
725/34 |
International
Class: |
H04N 7/10 20060101
H04N007/10 |
Claims
1. A method of providing real time targeted rating to enable
content placement for video audiences, comprising the steps of:
determining if at least one set top box, located within a network
having at least one set top box, is on or off, wherein being on is
defined as a set top box having a zapping event occur within a
predefined time period; determining what one or more viewer
profiles are currently consuming content provided by a set top box
within the network, wherein currently consuming refers to consuming
within the predefined period; and determining targeted rating per a
viewer profile that had been identified as currently consuming
content via at least one of the set top boxes within the
network.
2. The method of claim 1, wherein the step of determining if at
least one set top box is on or off further comprises the steps of:
receiving a broadcast schedule for a set top box, wherein the
broadcast schedule contains a timetable of content to be provided
by the set top box; determining when content currently being
provided by the set top box is complete; and determining if a
zapping event has occurred between the completion of content
currently being provided by the set top box and the ending of a
predefined period.
3. The method of claim 1, wherein the step of determining if at
least one set top box is on or off further comprises the step of
determining if a zapping event has occurred before a predefined
time period has elapsed.
4. The method of claim 2, wherein the step of determining when
content currently being provided by the set top box is complete is
performed by processing the received broadcast schedule.
5. The method of claim 1, wherein the step of determining what one
or more viewer profiles are currently consuming content provided by
a set top box within the network, further comprises the step of
determining if a supervised or an unsupervised learning process was
performed to derive a list of one or more viewer profiles that are
associated with which set top boxes.
6. The method of claim 5, wherein if a supervised learning process
is performed, the following steps are performed to determine what
one or more viewer profiles are currently consuming content
provided by a set top box within the network, wherein currently
consuming refers to consuming within a predefined period: receiving
data providing an association of consumer profiles and set top
boxes to households within a network; recording zapping events
created by consumers, also referred to as zapping patterns of the
consumers; associating the zapping patterns of the consumers with
households.
7. The method of claim 6, wherein the data is provided by
performing the steps of providing questionnaires to the consumers
and receiving at least some of the questionnaires filled out by
consumers.
8. The method of claim 6, wherein the step of associating the zapping
patterns of the consumers with households further comprises the
steps of: converting zapping logs into different data models that
can be used to provide set top box signatures; providing the set
top box signatures; using the set top box signatures with a list of
set top boxes and profiles to provide an association rule; and
applying the association rule to the set top box signatures to
determine a list of profiles of the consumer profiles associated
with a specific set top box of the set top boxes.
9. The method of claim 5, wherein if an unsupervised learning
process is performed, the following steps are performed to
determine what one or more viewer profiles are currently consuming
content provided by a set top box within the network, wherein
currently consuming refers to consuming within a predefined period:
receiving a zapping log and a broadcast schedule, wherein the
zapping log includes records of set top box zapping signatures for
at least a portion of the set top boxes of the network; deriving
set top box signatures from the zapping log and broadcast schedule;
clustering viewer profiles into groups of viewer profiles using the
set top box signatures; and associating at least one set top box
within the network with at least one viewer profile.
10. The method of claim 1, wherein the step of determining what one
or more viewer profiles are currently consuming content provided by
a set top box within the network, further comprises the steps of:
obtaining a first input set containing data of at least one set top
box signature, wherein the data of the at least one set top box
signature further comprises a processed zapping log containing
information summarizing viewing habits of at least one set top box
within the network; obtaining a second input set, wherein the
second input set contains data showing which of one or more viewer
profiles are associated with which one or more set top boxes within
the network; and processing the first input set and the second
input set by performing at least one operation on the at least one
set top box signature and the association of one or more viewer
profile to one or more set top box, wherein the operation performs
the steps of: using set top boxes that are consistent to both the
first input set and the second input set, wherein these set top
boxes are referred to as consistent set top boxes; and performing
calculations to identify which of the profiles associated with each
of the consistent set top boxes consumed each content that is
included in the set top box signatures of the first set that are
associated with the consistent set top boxes.
11. The method of claim 5, wherein if a supervised learning process
was performed to derive a list of one or more viewer profiles that
are associated with which set top boxes, a select set top box of
the set top boxes is chosen and one or more viewer profiles of the
list of one or more viewer profiles associated with the select set
top box is applied to a newly obtained set top box signature for
the select set top box.
12. The method of claim 5, wherein if an unsupervised learning
process was performed to derive a list of one or more viewer
profiles that are associated with which set top boxes a select set
top box of the set top boxes is chosen and one or more viewer
profiles of the list of one or more viewer profiles associated with
the select set top box is applied to a newly obtained set top box
signature for the select set top box.
13. A system for providing real time targeted rating to enable
content placement for video audiences, wherein the system comprises
a head end having a computer and means for communicating therein,
wherein the computer has a management application stored therein,
and wherein the management application further comprises: logic
configured to determine if at least one set top box, located within
a network having at least one set top box, is on or off, wherein
being on is defined as a set top box having a zapping event occur
within a predefined time period; logic configured to determine what
one or more viewer profiles are currently consuming content
provided by a set top box within the network, wherein currently
consuming refers to consuming within the predefined period; and
logic configured to determine targeted rating per a viewer profile
that had been identified as currently consuming content via at
least one of the set top boxes within the network.
14. The system of claim 13, wherein the logic configured to
determine if at least one set top box is on or off further
comprises: logic configured to receive a broadcast schedule for a
set top box, wherein the broadcast schedule contains a timetable of
content to be provided by the set top box; logic configured to
determine when content currently being provided by the set top box
is complete; and logic configured to determine if a zapping event
has occurred between the completion of content currently being
provided by the set top box and the ending of a predefined
period.
15. The system of claim 13, wherein determining if at least one set
top box is on or off further comprises determining if a zapping
event has occurred before a predefined time period has elapsed.
16. The system of claim 14, wherein determining when content
currently being provided by the set top box is complete is
performed by processing the received broadcast schedule.
17. The system of claim 13, wherein the logic configured to determine what
one or more viewer profiles are currently consuming content
provided by a set top box within the network, further performs the
step of determining if a supervised or an unsupervised learning
process was performed to derive a list of one or more viewer
profiles that are associated with which set top boxes.
18. The system of claim 17, wherein if a supervised learning
process is performed, the following steps are performed to
determine what one or more viewer profiles are currently consuming
content provided by a set top box within the network, wherein
currently consuming refers to consuming within a predefined period:
receiving data providing an association of consumer profiles and
set top boxes to households within a network; recording zapping
events created by consumers, also referred to as zapping patterns
of the consumers; associating the zapping patterns of the consumers
with households.
19. The system of claim 18, wherein the data is provided by
performing the steps of providing questionnaires to the consumers
and receiving at least some of the questionnaires filled out by
consumers.
20. The system of claim 18, wherein associating the zapping
patterns of the consumers with households is performed by: logic
configured to convert zapping logs into different data models that
can be used to provide set top box signatures; logic configured to
provide the set top box signatures; logic configured to use the
set top box signatures with a list of set top boxes and profiles to
provide an association rule; and logic configured to apply the
association rule to the set top box signatures to determine a list
of profiles of the consumer profiles associated with a specific set
top box of the set top boxes.
21. The system of claim 17, wherein if an unsupervised learning
process is performed, the following steps are performed to
determine what one or more viewer profiles are currently consuming
content provided by a set top box within the network, wherein
currently consuming refers to consuming within a predefined period:
receiving a zapping log and a broadcast schedule, wherein the
zapping log includes records of set top box zapping signatures for
at least a portion of the set top boxes of the network; deriving
set top box signatures from the zapping log and broadcast schedule;
clustering viewer profiles into groups of viewer profiles using the
set top box signatures; and associating at least one set top box
within the network with at least one viewer profile.
22. The system of claim 17, wherein the logic configured to
determine what one or more viewer profiles are currently consuming
content provided by a set top box within the network, further
comprises: logic configured to obtain a first input set containing
data of at least one set top box signature, wherein the data of the
at least one set top box signature further comprises a processed
zapping log containing information summarizing viewing habits of at
least one set top box within the network; logic configured to
obtain a second input set, wherein the second input set contains
data showing which of one or more viewer profiles are associated
with which one or more set top boxes within the network; and logic
configured to process the first input set and the second input set
by performing at least one operation on the at least one set top
box signature and the association of one or more viewer profile to
one or more set top box, wherein the operation performs the steps
of: using set top boxes that are consistent to both the first input
set and the second input set, wherein these set top boxes are
referred to as consistent set top boxes; and performing
calculations to identify which of the profiles associated with each
of the consistent set top boxes consumed each content that is
included in the set top box signatures of the first set that are
associated with the consistent set top boxes.
23. The system of claim 17, wherein if a supervised learning
process was performed to derive a list of one or more viewer
profiles that are associated with which set top boxes, a select set
top box of the set top boxes is chosen and one or more viewer
profiles of the list of one or more viewer profiles associated with
the select set top box is determined after an association rule
derived from the supervised learning is applied to a newly obtained
set top box signature for the select set top box.
24. The system of claim 17, wherein if an unsupervised learning
process was performed to derive a list of one or more viewer
profiles that are associated with which set top boxes a select set
top box of the set top boxes is chosen and one or more viewer
profiles of the list of one or more viewer profiles associated with
the select set top box is determined after an association rule
derived from the unsupervised learning is applied to a newly
obtained set top box signature for the select set top box.
25. A system for providing real time targeted rating to enable
content placement for video audiences, wherein the system comprises
a head end having a computer and means for communicating therein,
wherein the computer has a first management application stored
therein, and the system has a second computer having a second
management application, wherein the second management application
further comprises: logic configured to determine if at least one
set top box, located within a network having at least one set top
box, is on or off, wherein being on is defined as a set top box
having a zapping event occur within a predefined time period; logic
configured to determine what one or more viewer profiles are
currently consuming content provided by a set top box within the
network, wherein currently consuming refers to consuming within the
predefined period; and logic configured to determine targeted
rating per a viewer profile that had been identified as currently
consuming content via at least one of the set top boxes within the
network.
26. The system of claim 25, wherein determining what one or more
viewer profiles are currently consuming content provided by a set
top box within the network, further comprises determining if a
supervised or an unsupervised learning process was performed to
derive a list of what one or more viewer profiles are associated
with which set top boxes.
27. The system of claim 26, further comprising logic located within
at least one specific set top box of set top boxes within the
network, wherein the logic is configured to apply an association
rule and a list of at least one viewer profile associated with the
specific set top box to a newly obtained set top box signature of
the specific set top box, wherein the association rule is a set of
parameters derived from performing a learning process, applied to a
combination of set top box signatures of set top boxes within the
network, with a list of set top boxes within the network, and a
list of predefined viewer profiles.
28. The system of claim 25, wherein the logic configured to
determine what one or more viewer profiles are currently consuming
content provided by a set top box within the network, is located
within a set top box, and wherein currently is a predefined period.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to copending U.S.
Provisional Application entitled, "SYSTEM AND METHOD FOR PROVIDING
PERSONAL ADVERTISEMENTS FOR AN ACCESS NETWORK," having Ser. No.
60/956,728, filed Aug. 20, 2007, which is entirely incorporated
herein by reference.
FIELD OF THE INVENTION
[0002] The present invention relates to advertising, and more
particularly is related to providing personal advertisement to
video services.
BACKGROUND OF THE INVENTION
[0003] Owners of products and services, also referred to herein as
advertisers, spend significant funds advertising on television. In
addition, advertisers seek to maximize return from their investment
in advertising on television by using different techniques. As an
example, owners may pay to have an advertisement run at a specific
time on a specific channel. Such an advertisement may not only be
for products and services, but for any content, such as, but not
limited to, video on demand, gaming, and any other content or
service. In addition, owners may pay a premium price to have their
advertisement run during the showing of popular television
programming.
[0004] Unfortunately, advertisers do not have control over who may
be watching television at a time that an advertisement is run. As a
result, funds associated with television advertising are not
maximized. Instead, after receiving ratings associated with an
aired television show, advertisers pay based upon a previously
desired audience and an agreed upon percentage. Funds would be
better allocated if a larger number of a specific desired audience
could be selected for viewing of targeted advertisements.
[0005] Different techniques have been used in an attempt to
maximize television advertising investments. Examples of known
techniques include attempting to obtain demographic and
psychographic profiles, and using information about rating.
Unfortunately, information about rating, demographic and
psychographic profiles, and targeted rating is obtained using
surveys and/or people meters, which are based on small sample
audiences and are inaccurate in the collection process.
Advertisers, network management, and cable/satellite decision
makers would like to use more accurate information for placement
and pricing of television advertisements.
[0006] Currently, the process of creating television viewer
profiles has not made use of the actual actions of the television
viewers while watching television. Utilizing information associated
with viewer actions while watching television would be very useful
in creating television viewer profiles. In addition, it
would be beneficial to be able to determine which viewer profiles
consumed content.
[0007] Thus, a heretofore unaddressed need exists in the industry
to address the aforementioned deficiencies and inadequacies.
SUMMARY OF THE INVENTION
[0008] Embodiments of the present invention provide a system and
method for providing real time targeted rating to enable content
placement for video audiences. Briefly described, in architecture,
one embodiment of the system, among others, can be implemented as
follows. The system contains a head end having a computer and means
for communicating therein, wherein the computer has a management
application stored therein, and wherein the management application
further comprises: logic configured to determine if at least one
set top box, located within a network having at least one set top
box, is on or off, wherein being on is defined as a set top box
having a zapping event occur within a predefined time period; logic
configured to determine what one or more viewer profiles are
currently consuming content provided by a set top box within the
network, wherein currently consuming refers to consuming within the
predefined period; and logic configured to determine targeted
rating per a viewer profile that had been identified as currently
consuming content via at least one of the set top boxes within the
network.
[0009] The present invention can also be viewed as providing
methods for providing real time targeted rating to enable content
placement for video audiences by associating content with at least
one viewer profile in video audiences. In this regard, one embodiment
of such a method, among others, can be broadly summarized by the
following steps: determining if at least one set top box, located
within a network having at least one set top box, is on or off,
wherein being on is defined as a set top box having a zapping event
occur within a predefined time period; determining what one or more
viewer profiles are currently consuming content provided by a set
top box within the network, wherein currently consuming refers to
consuming within the predefined period; and determining targeted
rating per a viewer profile that had been identified as currently
consuming content via at least one of the set top boxes within the
network.
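By way of illustration only, the on/off determination summarized
above can be sketched in Python as follows. The function names, the
30-minute period, and the timestamps are hypothetical and are not
taken from this application; the second function is one possible
reading of the variant in which the period is measured from the
scheduled completion of the current content.

```python
from datetime import datetime, timedelta

def is_on(zap_times, now, window):
    """True if at least one zapping event occurred within `window` before `now`."""
    return any(timedelta(0) <= now - t <= window for t in zap_times)

def is_on_after_content(zap_times, content_end, window):
    """True if a zapping event falls between the scheduled end of the current
    content and the end of the predefined period that follows it."""
    return any(content_end <= t <= content_end + window for t in zap_times)

# Hypothetical values for illustration only.
now = datetime(2008, 8, 20, 21, 0)
window = timedelta(minutes=30)          # assumed predefined time period
recent = [datetime(2008, 8, 20, 20, 45)]
stale = [datetime(2008, 8, 20, 19, 0)]
print(is_on(recent, now, window), is_on(stale, now, window))
```

In this sketch a set top box with no recorded zapping events is
treated as off, which follows from the definition of "on" but is
otherwise an assumption.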
[0010] Other systems, methods, features, and advantages of the
present invention will be or become apparent to one with skill in
the art upon examination of the following drawings and detailed
description. It is intended that all such additional systems,
methods, features, and advantages be included within this
description, be within the scope of the present invention, and be
protected by the accompanying claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] Many aspects of the invention can be better understood with
reference to the following drawings. The components in the drawings
are not necessarily to scale, emphasis instead being placed upon
clearly illustrating the principles of the present invention.
Moreover, in the drawings, like reference numerals designate
corresponding parts throughout the several views.
[0012] FIG. 1 is a schematic diagram illustrating an example of an
IPTV network in which the present system may be provided.
[0013] FIG. 2 is a flow chart further illustrating the process of
personalizing advertisements, in accordance with one exemplary
embodiment of the invention.
[0014] FIG. 3 is a flow chart further illustrating the process of
identifying and associating consumer profiles to set top boxes
within a supervised learning scenario.
[0015] FIG. 4 is a schematic diagram illustrating an example of a
cable network in which the present system may be provided.
[0016] FIG. 5 is a schematic diagram illustrating an example of a
satellite network in which the present system may be provided.
[0017] FIG. 6 is a schematic diagram illustrating an example of a
terrestrial network in which the present system may be
provided.
[0018] FIG. 7 is a flow chart further illustrating the steps of the
supervised learning process.
[0019] FIG. 8 is a flow chart further illustrating the process of
identifying and associating consumer profiles to set top boxes
within an unsupervised learning scenario.
[0020] FIG. 9 is a block diagram further illustrating functionality
of the management application as blocks of logic.
[0021] FIG. 10 is a detailed logical flow diagram illustrating a
sequence of events performed during unsupervised learning.
[0022] FIG. 11 is a flow chart further illustrating a process for
determining targeted rating.
[0023] FIG. 12 is a flow chart illustrating a process for obtaining
a content to viewer profile assignment.
[0024] FIG. 13 is a flowchart illustrating functions performed by
the present system and method during execution of the real time
targeted rating process.
[0025] FIG. 14 is a flow chart further illustrating the process of
determining if a set top box is on or off.
DETAILED DESCRIPTION
[0026] The present system is capable of learning the viewing habits
of video viewers by collecting zapping events and other events
performed by the viewer. Such videos may be viewed via a
television, hand held device, computer, or any device capable of
displaying video. The events may be collected at a set top box,
computer, or other device. Alternatively, the events may be
collected at a different location, such as, but not limited to, at
an access multiplexer located in a head end, or in a device located
separate from the head end. The system learns the viewing habits
and zapping habits of different population profiles by identifying
the viewing profile of a household.
[0027] The system uses supervised or unsupervised learning
functionality for identifying different population profiles, and
provides a representation of the probability (or another form of
representation) of each population profile to watch any given
program and to present a zapping pattern. The probabilities can be
utilized as a tool for advertisers searching for the demographic
profile of the audience of a television program, or, using
inference functionality described herein, to identify the home
audience at each household, and the specific viewers of a
television program. Thereafter, the system is capable of supplying
personalized content, such as, but not limited to, advertisements,
video selections, and other content, to the viewers. It should be
noted that the following description provides an example in which
the content is an advertisement; however, the invention is not
intended to be limited to advertisements, but instead extends to any
content that may be personalized.
[0028] The present system collects the operations performed by
viewers at service decoders, such as, but not limited to, set top
boxes (the term set top box is used hereafter). The system then
employs unsupervised or supervised learning functionality, as
described herein, to interpret the operations at each set top box
as the sum of operations of all viewers associated with this set
top box. The system learns to identify different viewer profiles in
the population and associates with each set top box and profile a
probabilistic model of the viewing and zapping habits of
viewers.
[0029] It should be noted that the present system and method may be
provided within different infrastructures. As an example, the
following description provides examples of using the present system
and method in an Internet protocol television (IPTV)
infrastructure, in a cable infrastructure, and in a satellite
infrastructure. While these infrastructures are described herein,
the present system and method is not intended to be limited to
these infrastructures.
[0030] Before the present system and method are described in
detail, it is beneficial to provide certain definitions.
[0031] Set top box (STB) or service decoder: A set top box or
service decoder is a device responsible for converting digital (or
analog) content received into viewable content that may be fed into
a television set or other monitor. The set top box or service
decoder may be located at a household or another location.
[0032] Platform: A network of service decoders (e.g., set top
boxes) of a specific television service provider.
[0033] Passive audience identification: Identification of viewer
profiles without requiring any specific action by the viewer.
[0034] Zapping event: A zapping event is an event where there is
switching from a current service to another service, where the
switching is performed by, for example, but not limited to, use of
a remote control, pushing buttons on the set top box, or any action
that causes switching, including, but not limited to, voice
commands, or even consumer motions without pressing buttons. In
addition, a zapping event may be other means for communicating with
a set top box, such as, but not limited to, pressing an electronic
program guide, pressing a volume button, and other actions
involving the set top box.
[0035] Zapping pattern: A zapping pattern is the behavior of a
viewing individual in terms of zapping, such as, but not limited
to, programs watched, frequency of zapping events, and variance of
zapping frequency.
[0036] Set top box (STB) zapping signature: Records of zapping
events of a particular set top box.
[0037] Set top box (STB) signature: Data model providing
characteristics of a set top box including: an association between
a set top box and content available to the set top box, where the
content is either provided or not provided via the set top box
during a time period; and/or, at least one zapping pattern
associated with the set top box. It should be noted that herein
when referring to set top box signatures, one or more set top box
signature is included. In addition, content availability refers to
content that the set top box has access to and can provide.
[0038] Zapping log: Records of the set top box zapping signatures
for an entire set top box network (Platform) or for part of the
network.
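As a non-limiting illustration of the terms defined above, a zapping
log may be modeled as a flat collection of zapping events that is
grouped into per-set-top-box zapping signatures. The record fields
(`stb_id`, `timestamp`, `from_channel`, `to_channel`) and the function
name are assumptions made for this sketch and are not specified in
this application.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class ZappingEvent:
    # All field names are hypothetical.
    stb_id: str
    timestamp: datetime
    from_channel: str
    to_channel: str

def zapping_signatures(zapping_log):
    """Group a platform-wide zapping log into per-STB zapping signatures,
    each ordered by event time."""
    by_stb = defaultdict(list)
    for event in sorted(zapping_log, key=lambda e: e.timestamp):
        by_stb[event.stb_id].append(event)
    return dict(by_stb)

# Hypothetical log covering two set top boxes.
log = [
    ZappingEvent("stb-2", datetime(2008, 8, 20, 20, 5), "1", "2"),
    ZappingEvent("stb-1", datetime(2008, 8, 20, 20, 1), "3", "4"),
    ZappingEvent("stb-2", datetime(2008, 8, 20, 20, 0), "5", "1"),
]
signatures = zapping_signatures(log)
```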
[0039] Channel: A stream of programs broadcasted consecutively from
a content source.
[0040] Program: Content that was broadcasted on a specific channel
at a specific date and time, whether on demand or generally
broadcasted.
[0041] Program rating: Percent of viewers that watched the
program.
[0042] Targeted program rating: Percent of viewers of a specific
profile that watched the program.
[0043] Channel rating: Percent of viewers that watched the channel
during the specified time period.
[0044] Targeted channel rating: Percent of viewers of a specific
profile that watched the channel during the specified time
period.
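The rating definitions above can be illustrated with a short Python
sketch. The sketch assumes that the denominator of a rating is the
full set of known viewers, and that the denominator of a targeted
rating is the set of known viewers having the specified profile; the
definitions above do not state the denominator explicitly. The
function names, viewer identifiers, and profile labels are
hypothetical.

```python
def rating(watchers, population):
    """Percent of `population` that appears in `watchers`."""
    if not population:
        return 0.0
    return 100.0 * len(watchers & population) / len(population)

def targeted_rating(watchers, profile_of, profile):
    """Percent of viewers having `profile` that appear in `watchers`."""
    target = {viewer for viewer, p in profile_of.items() if p == profile}
    return rating(watchers, target)

# Hypothetical viewers, profiles, and audience of one program.
profile_of = {"a": "adult", "b": "adult", "c": "teen", "d": "teen"}
watchers = {"a", "c", "d"}

program_rating = rating(watchers, set(profile_of))
teen_rating = targeted_rating(watchers, profile_of, "teen")
adult_rating = targeted_rating(watchers, profile_of, "adult")
```

The same two functions apply to channel rating and targeted channel
rating when `watchers` is the set of viewers that watched the channel
during the specified time period.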
[0045] Profile: The classification of an individual into one of
several population groups that is targeted. Such profiles may be,
for example, but not limited to, psychographic (for example,
behavioral) or demographic profiles. Examples of such groups
include, but are not limited to, gender, age, income, marital
status, and possibly also by interests in different fields.
[0046] Learning functionality: Functionality used to reduce a large
set of observed data, together with its classification into groups,
to a set of parameters that allows reconstruction of the
classification of the majority of the original data, classification
of similar, unlearned data, or production of a new type of
classification. Different
relevant learning methods may be utilized to provide the learning
functionality such as, but not limited to, artificial neural
networks, decision trees, k-Nearest Neighbor, Quadratic classifier,
support vector machine, direct probability estimate using Bayesian
inference, Bayesian networks, Gaussian estimators, least squares
optimization methods, and other optimization methods.
[0047] Supervised learning: Supervised learning is learning in
which the classification of the observed data is inferred from a
sample of the data supplied by an outside source. The learning
functionality searches for a parameter set allowing reconstruction
of the classification from the input that later can be used for
classification of new unlearned data.
[0048] Unsupervised learning: Unsupervised learning is learning in
which no classification of observed data is given (i.e., no sample
is provided), and the functionality attempts to classify the data
into different classes under some constraints. The functionality
may use a method, such as, but not limited to, vector quantization,
and various learning methods and various optimization methods, to
find a reduction of the data into representative classes.
[0049] FIG. 1 is a schematic diagram illustrating an example of an
IPTV network 10 in which the present system may be provided.
Specifically, FIG. 1 is specific to video on demand or personalized
advertisements for an IPTV infrastructure. As shown by FIG. 1, an
IPTV head end 20 is provided, portions of which communicate with at
least one customer premises 100A-100D. As is known by those having
ordinary skill in the art, a head end is the physical location in
an area where a video signal is received by a provider, stored,
processed, and transmitted to local customers of the provider. One
having ordinary skill in the art would also appreciate that more
than one head end may be provided within a network. In addition, a
network may have more than one type of head end, such as, but not
limited to, a cable head end, a satellite head end, an IPTV head
end, and a terrestrial head end.
[0050] The head end 20 contains at least a video service splicer
30, an advertisements video server 40, a management application 50,
and an access network multiplexer 60. One having ordinary skill in
the art would appreciate that the head end 20 may have portions in
addition to those mentioned herein. In addition, while the present
description refers to a management application, it should be noted
that the management application is stored on a computer.
[0051] The video service splicer 30 receives video and audio
services from a satellite dish 70. It should, however, be noted
that video and audio services may be received by devices other than
a satellite dish 70, such as, but not limited to, a cable network
or any device capable of providing video to the head end 20.
[0052] The video service splicer 30 is capable of splicing personal
advertisements into a video service stream, as instructed by the
management application 50 and as is further described in detail
hereinbelow. The video service splicer 30 also receives
advertisements from the advertisements video server 40. In
addition, actions of the video service splicer 30 are controlled by
the management application 50. It should be noted that, for the
example of an IPTV network, the video packets received by the video
service splicer 30 may carry an Internet protocol (IP) address and
a User Datagram Protocol (UDP) port number. It should also be noted
that the video service splicer 30 may instead receive video and
audio services from a cable fiber.
[0053] The access network multiplexer 60 is responsible for routing
video services to transmission units 120A-120D that are video
services decoders, as explained hereinbelow. The transmission units
120 are each located within a customer premises 100A-100D. The
access network multiplexer 60 is connected to both the management
application 50 and the video service splicer 30. Specifically, the
access network multiplexer 60 may perform, for example, IP and UDP
port manipulation. It should be noted that the access network
multiplexer 60 may be, for example, but not limited to, an optic
multiplexer or a digital subscriber line access multiplexer
(DSLAM). From a multicast point of view, as described hereinbelow,
connection between the access network multiplexer 60 and a set top
box 110 may be a shared media connection, or any other type of
connection, and there may or may not be a multicast hierarchy
between the access network multiplexer 60 and the set top box
110.
[0054] The management application 50 communicates with the video
service splicer 30, the advertisements video server 40, and the
access network multiplexer 60. In addition, the management
application 50 provides the functionality required to learn
unsupervised profiles in television audiences, as is described in
detail hereinbelow. It should be noted that in accordance with an
alternative embodiment of the invention, the management application
50 may instead be located within a set top box 110 located within
the customer premises 100A-100D.
[0055] Each customer premises 100A-100D at least contains a set top
box 110A-110D and a transmission unit 120A-120D. While for
exemplary purposes four customer premises 100A-100D are
illustrated, one having ordinary skill in the art would appreciate
that additional or fewer customer premises 100A-100D may be
provided. The transmission unit 120 is capable of receiving
advertisement streams and video streams and forwarding the streams
to an appropriate set top box 110. For exemplary purposes, the
customer premises 100A-100D is illustrated as also containing a
computer 130A-130D, although a computer 130 is not integral to the
invention. It should be noted that while a single set top box is
shown as being located within a customer premises 100, more than
one set top box 110 may be located within the customer premises
100. In addition, in accordance with an alternative embodiment of
the invention, the set top box may be a computer or any device that
can decode a service. For the present example of an IPTV network,
the set top box 110 receives a video service with certain TCP/IP
parameters, such as, but not limited to, IP address and UDP port.
It should be noted, however, that in a cable network or a satellite
network, the set top box 110 may or may not receive TCP/IP
parameters.
[0056] The present system enables editing of online personal video
so as to provide personalized television advertisements directed
toward a viewer presently watching the television. As is described
in detail below, the present invention is capable of categorizing a
viewer into an advertising profile, an example of which is, but is
not limited to, a demographic profile. Within a single customer
premises, different television viewers may have different profiles.
The different television viewers may view the same television
during the day. Each different viewer may be associated with a
different advertising profile, such as, but not limited to a
demographic profile, thus preferably receiving different
advertising messages. As an example, a family structure may be
described as having an adult male of age 45, an adult female of age
42, a male teenager of age 17, a female teenager of age 14, and a
male child of age 7. It should be noted that while the present
description refers to a demographic profile, other types of
profiles may be provided for.
[0057] During the time that a television viewer consumes service
transmissions, the management application 50 identifies the profile
of the viewer. After identifying the profile, the application 50
performs personalized advertisements editing for that particular
profile. When there is a different viewer with a different
advertising profile that is using the same video decoder, the
management application 50 identifies the profile that the viewer
belongs to and performs online personalization editing for the
advertisements, as described below.
[0058] In accordance with the present invention, for both
supervised and unsupervised learning, the television consumers,
also referred to herein as viewers, are not individually
identifying themselves to the system. As a result, the system is
required to identify consumer profiles and to associate the
profiles with a specific set top box. This process is described in
detail hereinbelow. Prior to describing this process, a general
process of IPTV advertisement insertion in a broadcast environment
is described in detail.
[0059] A typical advertisement projection works as follows. During
content consumption the access network multiplexer 60 receives a
video signal and sends the video signal to the customer premises
100A-100D using an IP protocol. During an advertisement break the
video transmissions continue to be transmitted in multicast, thus
there is no personalization of advertisements. To instead
personalize advertisements, the following is performed.
[0060] FIG. 2 is a flow chart 200 further illustrating the process
of personalizing advertisements, in accordance with one exemplary
embodiment of the invention. Any process descriptions or blocks in
flow charts should be understood as representing modules, segments,
or portions of code that include one or more executable
instructions for implementing specific logical functions or steps
in the process, and alternative implementations are included within
the scope of the embodiment of the present invention in which
functions may be executed out of order from that shown or
discussed, including substantially concurrently or in reverse
order, depending on the functionality involved, as would be
understood by those reasonably skilled in the art of the present
invention.
[0061] As shown by block 202, content is transmitted from the head
end 20, via the access network multiplexer 60, to the set top box
110. An example of a protocol that may be used for the transmission
is the Internet group management protocol (IGMP), which is used by
IP hosts to manage their dynamic multicast group membership. Of
course, other protocols may be used.
[0062] In accordance with the present example, a subset, or
complete set, of the customers that are connected to the access
network multiplexer 60 are viewing the same video and/or audio
service (i.e., content). The management application 50 also
continuously identifies the consumers (block 204). It should be
noted that the management application 50 can utilize either online
processing or offline processing to determine a relationship
between viewed content (e.g., videos) and viewer profiles.
Regarding offline processing to identify consumers, associate the
consumers with content, and produce reports, in accordance with a
predefined schedule, or when prompted to do so, the management
application 50 reviews zapping patterns, processes the patterns,
and associates each program viewed from a set top box 110 with a
viewer profile. Alternatively, for online processing, during an
advertising break, the management application 50 reviews only
recent zapping events to determine which viewer is presently
viewing content. Further description of consumer identification is
provided with regard to FIG. 3, FIG. 8, and FIG. 10. It should be
noted that the information received by the management application
50 may be received from a source other than a set top box.
[0063] Returning to the flowchart 200 of FIG. 2, the management
application 50 decides which advertisements of the advertisement
set each consumer should receive (block 206). It should be noted
that the process of selecting advertisements is described in detail
herein.
[0064] As shown by block 208, the video splicer 30 then splices the
advertisements according to the decision of block 206. Since one
having ordinary skill in the art would know how a video splicer
splices advertisements, further description of the splicing process
is not provided herein. As shown by block 210, when the
advertisement break is over, the access network multiplexer 60 continues to
transmit the multicast transmission as it did prior to the
advertisement break.
[0065] It should be noted that if during an advertisement break the
consumer changes the consumed video service, the management
application 50 supplies the new service in the same manner.
Specifically, if the service transmits content, the management
application 50 continues to transmit the content with the multicast
protocol. In addition, if there is an advertisement break, the
management application 50 may splice different advertisements.
[0066] As previously mentioned, the present system provides a
consumer specific advertising environment. This environment is
provided in part by establishing online multilayer multicast
groups between the access network multiplexer 60 and the set top
boxes 110A-110D. The access network multiplexer 60 transmits
broadcast transmissions with multicast protocol to a subset A of
the set that is connected to the access network multiplexer 60. In
the subset A there are different subsets B of consumers watching
the same channel at a given moment that are connected to the access
network multiplexer 60. Within a single subset B, consumers are
associated by their profile for advertising. When there is an
advertisement break, the access network multiplexer 60 transmits
an additional layer of multicast, in which each different
subset Bi receives different advertisements according to the
advertisement profile associated with subset Bi. Finally, when the
advertisement break is over, subset A consumers continue to watch
the same service.
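For illustration only, the grouping of consumers into channel-level
subsets and, within each channel, into profile-level subsets Bi may be
sketched as follows. The function name build_ad_groups, the tuple
layout, and the profile labels are hypothetical and are not part of the
described system; the sketch merely shows one way in which the
membership of each advertisement multicast group could be derived.

```python
from collections import defaultdict


def build_ad_groups(viewers):
    """Group set top boxes by channel, then by advertising profile.

    viewers: iterable of (stb_id, channel, profile) tuples
             (hypothetical layout, assumed for this sketch).
    Returns {channel: {profile: [stb_id, ...]}}.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for stb_id, channel, profile in viewers:
        groups[channel][profile].append(stb_id)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {ch: dict(by_profile) for ch, by_profile in groups.items()}


# Toy data: four set top boxes currently consuming content.
viewers = [
    ("stb-1", "news", "adult_male"),
    ("stb-2", "news", "adult_female"),
    ("stb-3", "news", "adult_male"),
    ("stb-4", "sports", "teen_male"),
]
ad_groups = build_ad_groups(viewers)
```

In this sketch, each inner list corresponds to one subset Bi that would
receive the same advertisements during an advertisement break, while
all set top boxes listed under one channel return to the same multicast
transmission when the break is over.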
[0067] While the abovementioned provides an example of an IPTV
network 10, a different infrastructure in which the present system
and method may be provided includes a cable network 400. FIG. 4 is
a schematic diagram illustrating an example of a cable network 400
in which the present system may be provided. While there are
similarities between the IPTV network of FIG. 1 and the cable
network 400 of FIG. 4, there are also differences, which are
described herein.
[0068] Referring to FIG. 4, a cable head end 410 of the cable
network 400 is very similar to the IPTV head end 20 of the IPTV
network 10. It should be noted, however, that instead of an access
network multiplexer 60, the cable network 400 contains an RF
interface 410, which may be, for example, but not limited to, a
quadrature amplitude modulation (QAM) modulator and/or a radio
frequency (RF) combiner. The cable network 400 provides for
individual coaxial cables to provide communication capability from
the cable head end 410 to individual set top boxes 430A-430H, where
each set top box is located within a customer premises (CP)
440A-440H, such as, but not limited to, a home.
[0069] Another example of a network in which the present system and
method may be provided is a satellite network. FIG. 5 is a
schematic diagram illustrating an example of a satellite network
500 in which the present system may be provided. The satellite
network 500 contains a satellite head end 510 that is similar to
the IPTV head end 20, except that the satellite head end 510
contains an RF modulation interface 520. The RF modulation
interface 520 is capable of formatting and amplifying received data
for transmission to a satellite 550.
[0070] The satellite 550 is capable of reflecting received data to
satellite dishes 560A-560N capable of receiving data signals from
the satellite 550. Each satellite dish 560A-560N is associated with
a customer premises 570A-570N, such as, for example, a home. In
addition, each customer premises 570A-570N has at least one set top
box 580A-580N located therein.
[0071] Still a further example of a network in which the present
system and method may be provided is a terrestrial network. FIG. 6
is a schematic diagram illustrating an example of a terrestrial
network 600 in which the present system may be provided. The
terrestrial network 600 contains a terrestrial head end 610 that is
similar to the IPTV head end 20, except that the terrestrial head
end 610 contains an RF modulation interface 620. The RF modulation
interface 620 is capable of formatting and amplifying received data
for transmission to a radio tower 650.
[0072] The radio tower 650 is capable of reflecting received data
to antennas 660A-660N capable of receiving data signals from the
radio tower 650. Each antenna 660A-660N is associated with a
customer premises 670A-670N, such as, for example, a home. In
addition, each customer premises 670A-670N has at least one set top
box 680A-680N located therein.
[0073] In accordance with the present invention, the management
application 50 identifies the consumer profiles that are using
video/audio decoders (i.e., set top boxes) in the network 10. For
exemplary purposes the example of a single household having two
television sets is provided. Each television is connected to a
different set top box. A first television A is located in the
living room and a second television B resides in a room for
children.
[0074] In accordance with the present example, there are three
consumer demographic profiles in the household, namely:
[0075] 1. Profile 1: Male adult of age 37
[0076] 2. Profile 2: Female adult of age 34
[0077] 3. Profile 3: Male child of age 8 and male child of age
10
The consumer profiles are associated with the television sets as
follows:
[0078] Television A--profiles 1, 2, and 3 (all the household
residents are consuming content via television A).
[0079] Television B--profile 3 (only the children are using
television B)
[0080] The process of identifying and associating consumer profiles
to set top boxes may be separated in accordance with whether a
supervised learning process is used or an unsupervised learning
process. These two scenarios are described separately hereinbelow,
although it will be noted that certain steps in the processes are
similar.
[0081] In accordance with the present example, for both the
supervised and unsupervised scenarios, service providers have no
knowledge of the profiles existing in the household, the location
of the television sets in the household, and/or associations
between the television sets and the profiles. Instead, the
management application 50 identifies and associates the consumer
profiles with the set top boxes.
Supervised Learning
[0082] Reference is now made to the flowchart 300 of FIG. 3. The
flowchart 300 of FIG. 3 further illustrates the process of
identifying and associating consumer profiles to set top boxes
110A-110D within a supervised learning scenario. As shown by block
302, to acquire a sample, the service provider may send a
questionnaire to the consumers. Alternatively, the service provider
may use any other method of obtaining data, such as, but not
limited to, having a telephone conversation. The questionnaire may
refer to the household demographic details, video decoders (i.e.,
set top boxes), and association between the usage of each person in
the household and the video decoders in the household. As shown by
block 304, consumers fill out the questionnaire and return the same
to the service provider. With the return of the consumer
questionnaire, it is known which individual profiles and set top
boxes are associated with a household.
[0083] As shown by block 306, set top boxes 110 in the network 10
record all of the zapping events that the consumers are creating.
In accordance with the present description, and as is known by
those having ordinary skill in the art, zapping refers to the
switching from the current service to another service via use of,
for example, but not limited to, a remote control or pushing
buttons on the video decoder. It should be noted that this use of
remote controls is provided for exemplary purposes. Instead,
zapping may be associated with switching initiated by voice
commands, or even consumer motions without pressing buttons.
[0084] As shown by block 308, the set top boxes 110 send the
zapping events to the management application 50. The management
application 50 then associates behavior of consumers and their
zapping pattern with the households that either did not return the
questionnaire or that never received a questionnaire (block
310).
[0085] The association process is a learning process, also referred
to as a business process, which is the process of passive platform
audience learning and identification, and targeted platform rating
calculation and analysis. The learning process is divided into
multiple steps, including data collection, modeling, learning,
identification, analysis, and post processing. FIG. 7 is a flow
chart 700 further illustrating the steps of the supervised learning
process.
[0086] Data Collection
[0087] Referring to FIG. 7 and the step of data collection, in
order to perform audience learning, audience identification, and
targeted rating calculation, certain external data is collected and
converted into an internal format (block 702). This external data
includes the zapping log, the broadcast schedule, set top box
information, and sample information. The zapping log includes the
actions that were performed by the set top box user using a remote
control, directly using set top box control buttons, or performing
a different action that caused changing from a current service to
another service, or from a current state of the set top box to
another state of the set top box (e.g., switching on or off). The
broadcast schedule (or AsRun) includes, for example, a timetable
for the platform channels/programs during the zapping gathering
period. It should be noted that the broadcast schedule may also
include a schedule of video on demand programs, or a schedule of
any interactive service. The broadcast schedule should be
reconciled with the zapping log in terms of times and channel
identifications. The set top box information includes the relevant
information (e.g., unique set top box identifier and address) for
every set top box for which zapping was collected. The set top box
information should also be reconciled with the zapping log in terms
of set top box identifications.
[0088] Modeling
[0089] Modeling is the process of converting the zapping log into
different data models that could be used by different learning and
identification algorithms, thereby providing a set top box
signature (block 704). In accordance with the present system and
method, at least the following data models are recognized. A first
data model that is recognized is a set top box viewing signature.
Regarding the set top box viewing signature, for each set top box,
the list of "watched" programs could be created based on the
zapping log and reconciled broadcast schedule. For each watched
program, an aggregated watching percentage is given. As an example,
"STB1 watched program number 56, 30%" means that STB1 watched 30% of
the program overall (including leaving the program and returning
to it) during the whole time of broadcast of program number
56. A second data model that is recognized is a set top box time
signature. The set top box time signature is, for each set top box,
the list of percentages of viewing of every channel during a
specific time slot, aggregated by day of the week. As an example, "set
top box 1 (STB1) watched CNN on Sundays between 12:00 and 13:00, 25%" means
that during the learning period, the average time that this
particular set top box watched CNN between 12:00 and 13:00 on
Sundays was fifteen minutes.
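As a minimal sketch of how a set top box viewing signature might be
computed from a zapping log and a reconciled broadcast schedule,
consider the following. The data layout (times expressed in minutes, a
channel of None meaning that the set top box is off) and the function
name viewing_signature are assumptions made for illustration only. The
toy data reproduce the 30% figure of the viewing signature example
given above.

```python
def viewing_signature(zaps, schedule, log_end):
    """Compute the percentage of each program watched by one set top box.

    zaps:     time-sorted list of (time, channel); channel None = off.
    schedule: list of (program_id, channel, start, end).
    log_end:  time at which the zapping log ends.
    Returns {program_id: percent_watched} for programs watched at all.
    """
    # Turn zapping events into viewing intervals (channel, start, end).
    intervals = []
    for (t, ch), nxt in zip(zaps, zaps[1:] + [(log_end, None)]):
        if ch is not None and nxt[0] > t:
            intervals.append((ch, t, nxt[0]))

    signature = {}
    for program_id, p_ch, p_start, p_end in schedule:
        # Sum the overlap between viewing intervals and the program slot.
        watched = sum(
            max(0, min(end, p_end) - max(start, p_start))
            for ch, start, end in intervals
            if ch == p_ch
        )
        if watched > 0:
            signature[program_id] = 100.0 * watched / (p_end - p_start)
    return signature


# Program 56 is broadcast on CNN from minute 0 to minute 60.
schedule = [(56, "CNN", 0, 60)]
# STB1 watches CNN, leaves for BBC, returns to CNN, then switches off.
zaps = [(0, "CNN"), (10, "BBC"), (40, "CNN"), (48, None)]
sig = viewing_signature(zaps, schedule, log_end=60)
```

Here STB1 is tuned to CNN for 10 + 8 = 18 of the 60 minutes of program
number 56, so the aggregated watching percentage is 30%, including the
period after leaving the program and returning to it.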
[0090] A third data model that is recognized is a set top box
zapping frequency signature. Specifically, each profile zaps
with a different frequency. Calculating the zapping frequencies
of every set top box during the predefined time periods provides a
Zapping Frequency Signature.
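One possible way of calculating a zapping frequency signature is
sketched below. The choice of one-hour time slots, the averaging over
the number of days in the log, and the function name
zapping_frequency_signature are assumptions made for this
illustration; the predefined time periods may be chosen differently.

```python
def zapping_frequency_signature(zap_minutes, num_days, slot_minutes=60):
    """Average number of zapping events per time-of-day slot.

    zap_minutes:  zapping times in minutes since midnight of the first
                  logged day (hypothetical layout).
    num_days:     number of days covered by the zapping log.
    slot_minutes: length of each time-of-day slot.
    Returns a list with one average count per slot of the day.
    """
    slots_per_day = 24 * 60 // slot_minutes
    counts = [0] * slots_per_day
    for m in zap_minutes:
        # Fold every day onto a single 24-hour axis, then pick the slot.
        counts[(m % 1440) // slot_minutes] += 1
    return [c / num_days for c in counts]


# Three zaps on day one and one zap on day two, all between 12:00 and 13:00.
zap_minutes = [725, 730, 750, 1440 + 735]
freq_sig = zapping_frequency_signature(zap_minutes, num_days=2)
```

In this toy data, the slot between 12:00 and 13:00 has an average of two
zapping events per day and every other slot has none.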
[0091] Unfortunately, the zapping log is not noise free. Most of
the viewers use the remote control in the same fashion, but there
is a small minority of users that would use the remote control
differently. This affects the general zapping frequency, surfing
periods (when the viewer changes the channels with high frequency
in order to find something interesting), etc. In order to handle
these irregular behaviors, a set of data filters should be applied
to the zapping log prior to modeling.
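The specific data filters are not enumerated here. Purely as an
assumed example of one such filter, the following sketch removes
zapping events belonging to a surfing period, that is, events after
which the channel was held for less than a minimum dwell time. The
function name drop_surfing and the threshold value are hypothetical.

```python
def drop_surfing(zaps, min_dwell=60):
    """Remove zapping events whose channel was held only briefly.

    zaps:      time-sorted list of (time_in_seconds, channel).
    min_dwell: minimum number of seconds a channel must be held for
               the event to be kept (assumed threshold).
    The last event is always kept, since its dwell time is unknown.
    """
    kept = []
    for i, (t, ch) in enumerate(zaps):
        is_last = i == len(zaps) - 1
        if is_last or zaps[i + 1][0] - t >= min_dwell:
            kept.append((t, ch))
    return kept


# The viewer surfs through channels A and B before settling on C.
raw = [(0, "A"), (10, "B"), (20, "C"), (300, "D")]
filtered = drop_surfing(raw)
```

Other filters, for example filters that handle unusually long periods
without any zapping event, could be applied in the same manner before
the signatures are calculated.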
[0092] Learning
[0093] For supervised learning, learning is a process in which the
set top box signatures (viewing, time, and/or zapping frequency),
created at the data modeling stage, are used with a list of set top
boxes and profiles to provide an Association Rule (block 706). The
Association Rule provides knowledge of how to associate a list of
profiles within a network to a set top box within the network. The
Association Rule is needed because filled-out questionnaires are not
received from all parties, so the relationships between profiles and
the remaining set top boxes are unknown and must be determined.
[0094] It should be noted that during supervised learning, it is
not determined which profiles are associated with which set top
boxes. Instead, as mentioned above, an Association Rule is
determined to provide knowledge of how to associate a list of
profiles to each set top box.
[0095] As mentioned above, during supervised learning there is an
association of set top box signatures (e.g., viewing) for each set
top box in the data model to a predefined list of profiles, based
on a sample, for further use in the identification functionality. A
sample is a partial list of set top boxes for which both the
zapping log and the list of profiles associated with each set top
box are provided. The sample may be provided by an operator of the
set top box collection. Predefined profiles can be, for example,
but not limited to, demographic profiles that define gender, age,
marital status, income level, or psychographic (behavioral)
profiles.
[0096] The Association Rule can be applied to any set top box in
the same network, as is performed during identification. An example
of a process that may be used to derive the Association Rule
follows. The management application 50 contains knowledge of the
current consumed service for a specific decoder, the profiles
(demographic, or behavioral) associated with a specific decoder and
household, and previously consumed content for a specific decoder.
In accordance with the present invention, the management
application 50 uses inference functionality to determine the
current viewer/listener profile. The inference functionality
defines the current profile(s) that is/are consuming the
service.
[0097] An example of inference functionality follows, where the
learning functionality uses Bayes rule. At this point, the
management application 50 contains knowledge of the current
consumed service for a specific decoder (set top box). In addition,
the management application 50 knows the demographic profiles
associated with a specific decoder and household. Further, the
management application 50 knows previously consumed content for a
specific decoder, specifically, the short-term history. The
management application 50 may then use the inference functionality
to determine the current viewer/listener profile.
[0098] An example for the inference functionality using Bayes rule
is provided hereinafter. In the learning algorithm, data collection
determines the distribution of the consumed content as a function
of the classification of the viewers/listeners at the household. In
addition, using the data in conjunction with the Bayes rule, the
probability that the household contains a viewer/listener belonging
to each demographic profile is estimated. Data utilized to perform
this process includes probabilities of each consumed service for
households containing each of the demographic profiles, as well as
probabilities of each consumed service for households not
containing each of the demographic profiles.
[0099] Bayes rule reads as shown by equation one below.
P(C|F1 ... Fn) = P(F1 ... Fn|C)*P(C) / (P(F1 ... Fn|C)*P(C) +
P(F1 ... Fn|~C)*P(~C)) (Eq. 1)
In equation one, P(F1 ... Fn|C) is the probability that a
household containing a certain profile (C) consumes the list of
services F1 ... Fn and does not consume any other service. In
addition, P(F1 ... Fn|~C) is the probability that a
household not containing a certain profile (C) consumes the list of
services F1 ... Fn and does not consume any other service.
Further, P(C) is the probability that a household contains profile
C, regardless of the services consumed, and P(~C) is the
probability that a household does not contain profile C, regardless
of the services consumed.
[0100] P(F1 ... Fn|C) and P(F1 ... Fn|~C) may be
approximated as the products P(F1|C)* ... *P(Fn|C) and
P(F1|~C)* ... *P(Fn|~C) respectively, which may be
calculated directly from the statistics gathered for the sample
population. Better approximations may be obtained by considering
correlations between services and between profiles in a household.
The result of the above calculation is the probability P(C|F1 ... Fn)
that a household contains profile C, given the list of services
consumed by the household. The collection of all values
P(C|F1 ... Fn), calculated over all of the sample set top boxes,
represents the Association Rule used for the identification step,
which is applied to each set top box in the network that was not part
of the sample set top boxes. In addition, from this calculation, the
result is the probability that a certain individual viewer from a
specific profile used the set top box.
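The calculation of equation one, using the product approximation
described above, may be sketched as follows. The function name
posterior_profile, the dictionary layout, and all numerical values are
hypothetical and are chosen only to make the arithmetic easy to follow.
The sketch multiplies only the probabilities of the consumed services
and does not model correlations between services or between profiles.

```python
from math import prod


def posterior_profile(consumed, p_given_c, p_given_not_c, prior_c):
    """Estimate P(C | F1 ... Fn) with the product approximation.

    consumed:      list of consumed services F1 ... Fn.
    p_given_c:     {service: P(service | C)} from the sample.
    p_given_not_c: {service: P(service | not C)} from the sample.
    prior_c:       P(C), the probability that a household contains C.
    """
    like_c = prod(p_given_c[f] for f in consumed)
    like_not_c = prod(p_given_not_c[f] for f in consumed)
    numerator = like_c * prior_c
    denominator = numerator + like_not_c * (1.0 - prior_c)
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical sample statistics for a "child" profile C.
p_given_c = {"cartoons": 0.8, "sports": 0.5}
p_given_not_c = {"cartoons": 0.1, "sports": 0.4}
posterior = posterior_profile(
    ["cartoons", "sports"], p_given_c, p_given_not_c, prior_c=0.3
)
```

With these assumed values, the numerator is 0.8 * 0.5 * 0.3 = 0.12 and
the denominator is 0.12 + 0.1 * 0.4 * 0.7 = 0.148, giving a
probability of approximately 0.81 that the household contains profile C.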
[0101] In accordance with an alternative embodiment of the
invention, a sample may be provided, and post processing may be
provided to associate content with profiles. Specifically, a sample
may include at least one profile, a set top box associated with the
profile, and zapping information associated with the set top box.
Post processing may then be performed on the sample to determine
which content (e.g., advertisement) is most appropriate for
providing to the consumer associated with the profile. As a result,
in accordance with this alternative embodiment of the invention,
the learning process is not required.
[0102] Identification
[0103] Identification is a process of recognition of a list of
profiles as being associated with a certain set top box (STB),
based on the learning results. Every set top box in the network
should be assigned at least one profile (demographic or
behavioral). It is reasonable to assume that most set top boxes are
used by more than one active profile, and there are cases where the
same profile should be associated more than once with the same set
top box. Thus, one or more profiles should be assigned to each set
top box. For example, a young couple (male
& female) between the ages of 20-30 that are living together
would produce 2 profiles, specifically, one for the female and the
other for the male. As another example, if a specific household has
two boys of the ages seven and fourteen, the boys may both be
assigned to an appropriate set top box as the same profile, "Male
6-18."
[0104] To determine the list of profiles associated with a set top
box, the Association Rule is mathematically applied to the list of
set top box signatures (block 708).
[0105] Analysis
[0106] Analysis is the process of breaking down and studying the
results of learning and identification in order to estimate
possible identification errors, provide a set of different factors
and amendments for post processing, associate the signature-based
definition of profiles with a third party definition, and provide any
other functionality resulting from studying the learning and
identification results.
[0107] The identification error analysis may be performed via
mathematical modeling means and/or via simulation (empirical)
means. For example, estimation of expected identification errors
may be achieved via applying the learned results to a part of the
sample and simulating the identification results.
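The simulation approach may be sketched as a hold-out evaluation, in
which part of the sample is withheld from learning and then used to
check the identification results. The function names holdout_error,
learn, and identify, the toy lookup-table learner, and the 25% hold-out
fraction are all assumptions made for illustration; any learning
functionality could be substituted.

```python
def holdout_error(sample, learn, identify, holdout_fraction=0.25):
    """Estimate the identification error on a withheld part of the sample.

    sample:   list of (signature, profiles) pairs with known profiles.
    learn:    callable that builds a rule from the training part.
    identify: callable (rule, signature) -> predicted profiles.
    Returns the fraction of withheld set top boxes identified wrongly.
    """
    split = int(len(sample) * (1 - holdout_fraction))
    train, test = sample[:split], sample[split:]
    if not test:
        return 0.0
    rule = learn(train)
    wrong = sum(
        1 for signature, profiles in test
        if identify(rule, signature) != profiles
    )
    return wrong / len(test)


def learn(train):
    # Toy learner: remember the profiles seen for each signature.
    return {signature: profiles for signature, profiles in train}


def identify(rule, signature):
    # Toy identifier: look the signature up, or return no profiles.
    return rule.get(signature, frozenset())


sample = [
    ("kids", frozenset({"child"})),
    ("news", frozenset({"adult"})),
    ("kids", frozenset({"child"})),
    ("news", frozenset({"adult"})),
    ("kids", frozenset({"child"})),
    ("news", frozenset({"adult"})),
    ("kids", frozenset({"child"})),
    ("sports", frozenset({"adult"})),
]
error = holdout_error(sample, learn, identify)
```

In this toy run, the last two entries of the sample are withheld; one
is identified correctly and one is not, so the estimated identification
error is 0.5.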
[0108] Post Processing
[0109] Post Processing is the process of calculating the data
required for presentation to potential customers, such as targeted
rating. Post processing also includes reporting and analyzing based
on results of identification. The aforementioned list of results is
obtained via post processing functionality described hereafter.
Such functionality may be provided by, for example, algorithms.
Post processing may be utilized to calculate the following data,
although post processing calculation is not intended to be limited
to calculating only this data; rather, by post processing any
calculation done with the use of the results obtained from the
learner and/or identifier is referred to as a post processed
calculation/algorithm.
Targeted Rating
[0110] Targeted rating may include a percentage of viewers of a
specific profile that consumed content, a percentage of viewers of
a specific profile that consumed content from a channel during a
specified time period, or a percentage of viewers of a specific
profile that consumed content provided within the network during a
specified time period. It should be noted that the term "consumed"
is used herein instead of the term "watched" since content consumed
by a viewer profile not only includes content that is watched by a
viewer profile, but also content that is not watched, but that is
provided to a set top box associated with a viewer profile, such
as, but not limited to, audio content.
[0111] Herein, content may be, for example, but not limited to, a
program. It should also be noted that, for exemplary purposes, the
following provides the example of consuming content comprising
watching content; however, one having ordinary skill in the art
will appreciate that consuming of content need not be limited to
watching content, but instead may include other functions such as,
but not limited to, listening to content received from a
channel.
[0112] More specifically, targeted rating functionality calculates
the targeted rating of a content per profile (e.g., using
optimization algorithms, see examples herein below) of the learned
and identified data, or of any independent data (e.g., obtained
from the sample) as long as the data contains information about the
set top box signatures (e.g., viewing signatures) and the
profile(s) associated to each set top box in the input. As an
example, the targeted rating functionality may be used on data
resulting from the supervised learning functionality, unsupervised
learning functionality, or independent data. It should be noted
that herein the term "set top box signatures" includes one or more
set top box signatures.
[0113] Targeted rating may include targeted program rating,
targeted channel rating, and targeted time interval rating.
Targeted program rating is a percentage of viewers of a specific
profile that watched a program. In addition, targeted channel
rating is a percentage of viewers of a specific profile that
watched a channel during a specified time period. Further, targeted
time interval rating is a percentage of viewers of a specific
profile that watched content broadcast within the network during
a specified time period.
[0114] Targeted rating determination may be provided in general or
regionally. Specifically, a regional targeted rating is a targeted
rating for one region, where a region may be limited to, for
example, a specific geographical location. Alternatively, general
targeted rating is a targeted rating for an entire network, or a
part of a network, which is region independent (for example, it may
include one or several combined regions).
[0115] FIG. 11 is a flow chart 950 illustrating the process of
determining a targeted rating. As shown by block 952, data
representing relationships between viewer profiles and set top
boxes is received, or obtained. Specifically, data showing which
profiles are associated with which set top boxes is received. The
data may either be obtained after performing learning and
identification processes, as described herein, or received from an
external source.
[0116] As shown by block 954, set top box signatures are also
received, or obtained, for use in determining targeted rating. Such
set top box signatures may be, for example, but not limited to,
viewing signatures, time signatures, high-resolution time
signatures, or zapping frequency signatures. It should be noted
that other set top box signatures may also be provided for by the
present system and method.
[0117] The type of set top box signature used in targeted rating
determination dictates which kind of targeted rating will result.
As an example, when viewing set top box signatures are used,
targeted program rating results. In addition, when time set top box
signatures are used, targeted time interval rating, or targeted
channel per a time interval rating, results.
[0118] As shown by block 956, a first input set is derived showing
the probability that each profile is associated with each set top
box. It should be noted that the first input set is derived by
performing the learning and identification processes, or is
received from an external source. A second input set is derived
containing data of set top box signatures (block 958). It should be
noted that the second input set is derived by performing the
modeling functionality on the collected/received zapping log. As an
example, for a viewing signature, the zapping log may contain
information showing whether a certain set top box consumed certain
content (for example, a program), or not. For purposes of deriving
the desired output set, namely, the set of targeted ratings, it is
assumed that the data of the set top box signatures can be
approximated by certain operations involving data associating
profiles to set top boxes and targeted rating.
[0119] As is shown by block 960, certain operations are applied on
the set of data associating profiles to set top boxes and the set
of data containing set top box signatures (the input sets),
resulting in a targeted rating (the output set). Different forms of
data sets and different operations may be used to provide the
targeted rating. As an example, matrices may be used to derive the
targeted rating, where it is assumed that multiplying a matrix A
(matrix A shows the probability that each profile is associated
with each set top box) by a matrix B (matrix B is the targeted
rating) would result in a matrix C (matrix C is the set top box
signature data). Of course, other examples of operations may be
used. Two examples of operations that may be used to determine
targeted rating are provided below.
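The assumed relationship between the three data sets can be illustrated with a minimal sketch. The matrix values below are invented for illustration only, and the helper name `matmul` is hypothetical; rows of A are set top boxes, columns of A are profiles, rows of B are profiles, and columns of B and C are programs.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must agree"
    return [[sum(row[k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for row in a]

# A: probability that each profile (column) is associated with each
# set top box (row). Illustrative values only.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# B: targeted rating of each program (column) per profile (row),
# expressed as a fraction. Illustrative values only.
B = [[0.5, 0.0],
     [0.0, 0.25]]

# C: modelled set top box signature data, one row per set top box and
# one column per program.
C = matmul(A, B)
```

In practice A and C are the known inputs and B is the unknown; the sketch only shows the direction of the assumed product, not how B is recovered.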
[0120] If the network covers more than one region and information
on the regions in which the different set-top boxes in the network
reside is available, a regional targeted rating (RTR) may be
calculated using similar methods to those described below. In
addition, regional targeted rating of high-resolution time steps,
where a time step may be for example, but not limited to, per each
thirty seconds, may be calculated for each specific channel and
profile.
[0121] Input to the regional targeted rating functionality includes
the region in which each of the set top boxes is stationed, the set
top box signatures for set top boxes within that region, such as,
but not limited to, viewing signatures, time signatures, zapping
frequency signatures, and high-resolution time signatures, and
lists of profiles associated with each of the set top boxes within
the region, from any source. It should be noted that a region may
have one or more set top boxes therein. In addition, a set top box
may be located within more than one region.
[0122] The output of the regional targeted rating functionality is
the percentage of viewers of each predefined profile, within a
specific region, that watched each of the contents, for example,
programs, in the case of when viewing signatures are the input, or
of each channel at a certain time interval, in the case of when
time signatures are the input.
[0123] Two examples of methods that may be used to calculate
targeted rating are provided herein below. It should be noted that
the present invention is not intended to be limited to the
following examples, but instead that the following examples are
merely provided for exemplary purposes and are not intended to
limit the present invention.
EXAMPLE 1
[0124] An example of a method to calculate targeted rating, given a
list of set top boxes with viewing signatures and profile(s)
associated to each set top box, can be given via the use of a
linear regression optimization algorithm. In calculating the
targeted rating, it is assumed that multiplying the set of
parameters representing the association of profile(s) to set top
boxes (let us call it A) by the aggregation of targeted rating
values of each of the profiles per each program watched by at least
a portion of the set top boxes of the network for which the zapping
log contains records of set top box zapping signatures (the yet
unknown and desired output, let us call it B) corresponds to the
parameters representing the aggregation of the set top box viewing
signatures (part of the input, let us call it C).
[0125] For purposes of this example, it is assumed that the sets of
parameters A, B, and C are utilized to provide matrices A, B, and
C. A minimization algorithm on the squared norm of the matrix
(AB-C) may then be performed (a random initial guess is provided to
the algorithm for the values of B). In other words, given A and C,
the output of applying this algorithm is the set of probabilities,
B, representing the probability of each profile to watch each of
the programs broadcast to the collection of set top boxes. An
example table for such an output is presented below after example 2
is described.
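One possible reading of the minimization described in example 1 is plain gradient descent on the squared norm of (AB-C), starting from a random guess for B. The following is a sketch under that assumption, not the specific optimization algorithm of the embodiment; the function name, step size, iteration count and sample values are all hypothetical, and the step size must be small enough for the iteration to converge.

```python
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def fit_targeted_rating(A, C, steps=500, lr=0.1, seed=0):
    """Minimise the squared norm of (A*B - C) over B by gradient descent."""
    rng = random.Random(seed)
    n_profiles, n_programs = len(A[0]), len(C[0])
    # Random initial guess for B, as mentioned in the text.
    B = [[rng.random() for _ in range(n_programs)]
         for _ in range(n_profiles)]
    At = transpose(A)
    for _ in range(steps):
        residual = [[p - c for p, c in zip(pr, cr)]
                    for pr, cr in zip(matmul(A, B), C)]
        grad = matmul(At, residual)  # full gradient is 2 * A^T (A*B - C)
        B = [[b - lr * 2.0 * g for b, g in zip(br, gr)]
             for br, gr in zip(B, grad)]
    return B

# Illustrative inputs: three set top boxes, two profiles, two programs.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
C = [[0.5, 0.0], [0.0, 0.25], [0.5, 0.25]]
B_est = fit_targeted_rating(A, C)
```

For these illustrative inputs the iteration settles near B = [[0.5, 0], [0, 0.25]], which reproduces C exactly when multiplied by A.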
EXAMPLE 2
[0126] As a second example of a method to calculate targeted
rating, the matrices A, B and C are as in example one, where A is a
matrix containing list(s) of demographic, or psychographic,
profiles that is (are) associated to each set top box (of the whole
network, a part of the network, a specific region within the
network, or statistically representing any of those), which is
obtained from any source, either via local identification, via
receiving an external sample, or via another means.
[0127] The matrix C is a matrix that contains, per each of the set
top boxes, a list of set top box signatures per a channel, or a
program. Examples of forms of set top box signatures include, but
are not limited to, viewing signatures, time signatures,
high-resolution time signatures or any other form of set top box
signatures that associates knowledge of some viewing habits in a
certain period per each set top box. The unknown set of
probabilities per each of the pre-defined profiles, represented by
the matrix B, may then be obtained by the use of solving equation
two (Eq. 2):
$B \approx A^{+} C$ (Eq. 2)
In equation two, $A^{+}$ is the pseudo-inverse of the matrix A,
which is unique in mathematical terms, thereby ensuring that the
targeted rating matrix B computed in equation two is well-defined.
An example of a pseudo-inverse is the Moore-Penrose pseudo-inverse.
Calculating $A^{+}$ and multiplying it by the matrix C gives a good
approximation to the matrix B of the targeted ratings.
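As a minimal sketch of equation two, the product of the pseudo-inverse of A with C can be computed without any external library in the special case where A has full column rank, because the Moore-Penrose pseudo-inverse then equals (A^T A)^{-1} A^T. The function name and sample values are hypothetical, and the full-column-rank restriction is an assumption of this sketch rather than of the described method.

```python
def pinv_times(A, C):
    """Return (A^T A)^{-1} A^T C, equal to A^+ C for full-column-rank A."""
    n = len(A[0])
    At = [list(col) for col in zip(*A)]
    # Gram matrix A^T A and right-hand side A^T C.
    gram = [[sum(x * y for x, y in zip(r, c)) for c in At] for r in At]
    rhs = [[sum(x * y for x, y in zip(r, col)) for col in zip(*C)]
           for r in At]
    aug = [g + r for g, r in zip(gram, rhs)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("A does not have full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Illustrative inputs: three set top boxes, two profiles, two programs.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
C = [[0.5, 0.0], [0.0, 0.25], [0.5, 0.25]]
B = pinv_times(A, C)
```

All columns of C are solved in one pass, which mirrors the point that every targeted rating element is obtained at once. For rank-deficient A, a general pseudo-inverse routine such as NumPy's `numpy.linalg.pinv` would be needed instead.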
[0128] The algorithm of equation two is extremely accurate and
allows targeted rating calculations to be performed on very
large amounts of data (on the order of millions of entries or more)
in an extremely short computing time. By contrast, when performing
linear regression, for example, in accordance with one exemplary
embodiment of the invention, a separate optimization process must
be performed for each targeted rating element, thereby requiring a
long computation period. A targeted
rating element may be, for example, but not limited to, a program,
a time interval, or a channel.
[0129] Alternatively, in accordance with another exemplary
embodiment of the invention, if a pseudo-inverse is utilized,
performing a matrix multiplication, instead of multiple
optimization processes, is very fast and is performed for all the
targeted rating elements at once, even if there are tens of
thousands of targeted rating elements.
An Example of Data and Targeted Rating Output Follows
[0130] If the pre-defined profiles are:
[0131] 1. Female of age 30-55 with high income.
[0132] 2. Male of age 18-40 with average income.
[0133] 3. Male child of age 6-16 with low income.
[0134] 4. Female child of age 6-16 with average income.
[0135] And the list of programs (as specified in the viewing
signatures) is:
[0136] 1. Saturday Night Live.
[0137] 2. Lost.
[0138] 3. 24.
Then the targeted rating (TR) output would be the following
table:
TABLE-US-00001
Program ID   Profile ID   Rating (in % of each profile)
1            1            0.5%
             2            1%
             3            0.01%
             4            0.04%
2            1            3%
             2            1.54%
             3            0.01%
             4            0
3            1            2.31%
             2            2.11%
             3            0
             4            0
Content to Profile Assignment
[0139] In addition to a targeted rating of a content (for example,
program) per profile, a content to viewer profile assignment (C2P)
may be determined so as to provide an identification of what
content is being consumed by what viewer profile. For exemplary
purposes, it should be noted that content may be, for example, but
not limited to, a program. Specifically, a content to profile
assignment is beneficial to calculate for those set top boxes
within the network to which more than one viewer profile has been
associated so as to enable determination of which viewer profile of
the list of viewer profiles associated with the set top box
actually consumed a specific content.
[0140] The present description provides examples of how to
determine content to profile assignment for illustration purposes
only and is not intended to limit the invention to these examples.
Specifically, as previously shown above, the learning and
identification processes result in an association of at least one
viewer profile to a set top box for which a set top box signature
is provided. In addition, determining a targeted rating results in
a percentage of viewer profiles that consumed content, wherein the
content may be, for example, a program. Having the learning and
identification process result and the targeted rating result, it is
beneficial to determine what content is being consumed by what
viewer profile. Similarly an assignment of any content in a
specific time slot to a specific viewer profile in the household
that consumed this content may be made.
[0141] As previously mentioned, obtaining a content to profile
assignment involves determining for each content that was consumed
by a certain set top box, which is the specific viewer profile, or
viewer profiles, of the profiles associated to this set top box,
that consumed the content. Alternatively, if more than one viewer
profile has a probability of consuming the content, a list of
viewer profiles associated to this set top box that consumed the
content with certain probabilities may be calculated. This
calculation can be done, for example, via use of algorithms
applying algebraic manipulations to the sets of parameters
representing the aggregation of viewing (or other) set top box
signatures (denoted by C, as above), the parameters representing
the association of viewer profile(s) to set top boxes (denoted by
A, as above), and parameters representing targeted rating values
(denoted by B, as above).
[0142] Once the association of profile(s) lists to set top boxes is
obtained (the input set A), either by performing a
supervised/unsupervised learning and identification process, or
obtained from an external source, it is possible to utilize
statistical, algebraic, or other methods on input set A, together
with the set top box signatures of the set top boxes (input set C),
and the set of targeted ratings B, to infer the specific viewer
profile that watched each specific content via any given set top
box. The targeted ratings may be obtained either by one of the
methods described above, or by other methods, or received from an
external source.
[0143] For exemplary purposes, FIG. 12 is a flow chart 1000
illustrating a process for obtaining a content to viewer profile
assignment. As shown by block 1002, at least one set top box
signature is received, wherein each received set top box signature
provides an association of content consumed via an associated set
top box. Such set top box signatures may be, for example, but not
limited to, viewing signatures, time signatures, or high-resolution
time signatures. It should be noted that other set top box
signatures may also be provided for by the present system and
method.
[0144] Data representing relationships between viewer profiles and
set top boxes is received (block 1004), wherein the data may either
be obtained after performing learning and identification processes,
as described herein, or received from an external source. Such data
includes an association between at least one viewer profile and at
least one set top box. Preferably, the data is provided as a list
of viewer profiles that are associated with a specific set top
box.
[0145] Combining the functionality of block 1002 and block 1004
yields an association between content consumed via a
specific set top box and a list of at least one viewer profile
associated with the specific set top box. These results may be
obtained for one or more set top boxes within the network, wherein
the content to profile assignment may be determined for each such
set top box.
[0146] As shown by block 1006, operations are performed on the set
top box signatures and the association of viewer profiles to set
top boxes to obtain content to profile assignment. It should be
noted that many different examples of operations may be provided.
The following provides two examples of operations that may be used
to obtain content to profile assignment.
EXAMPLE 1
[0147] Using the targeted rating of viewer profiles, or other data
describing the viewing habits of each viewer profile associated
with the network of set top boxes, or associated with a part of the
network of set top boxes; and further having the association of
viewer profile lists to the set top boxes, as obtained from
supervised or unsupervised learning and identification methods, or
by an algorithm, or obtained from an external source; then the
probabilities for any viewer profile to watch given content are
deduced using statistical analysis or any algebraic, or other
method. Assuming, for illustration purposes, that in example 1
content is a program, let us denote by $P_j(f)$ the probability
that a specific program, denoted by j, was consumed via a certain
set top box, denoted as $STB_i$, within the network, by a certain
viewer profile, f, identified to be in the list of profiles using
this specific $STB_i$. Then, for example, $P_j(f)$ may be
calculated as:
$$P_j(f) = \frac{TR_j(f)}{\sum_{f'} TR_j(f')},$$
where $TR_j(f)$ denotes the targeted rating of the specific
program j (where program j is a program for which we are
determining the viewer profile(s) that watched program j) for profile
f, and f' ranges over the list of profile(s) that had been
associated with this $STB_i$, via which program j had been
consumed.
[0148] The association of the list of profiles to this specific
$STB_i$ may be obtained by learning and identification processes,
or any other method, or received from an external source (or,
alternatively, assuming all profiles are associated with this
$STB_i$ with some probability if no other information is given).
In the case that association of the profile f to the $STB_i$, via
which program j had been consumed, is given with a certain
probability, it is possible to get the probability that the profile
f watched the program j in the $STB_i$ in a more accurate way,
for example by multiplying each targeted rating by the appropriate
probability.
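The calculation of the probability that profile f consumed program j, including the optional weighting by association probability, may be sketched as follows. The function name and data layout are hypothetical, and renormalizing after the weighting is an assumption of this sketch; the text only states that each targeted rating is multiplied by the appropriate probability. The sample ratings are those of program 2 for profiles 1 and 2 in the table above.

```python
def profile_probabilities(ratings, association=None):
    """Normalise targeted ratings over the profiles listed for one STB.

    ratings:     {profile: targeted rating of program j for that profile}
    association: optional {profile: probability that the profile is
                 associated with the STB}; defaults to 1.0 for each profile.
    """
    if association is None:
        association = {f: 1.0 for f in ratings}
    weighted = {f: ratings[f] * association.get(f, 0.0) for f in ratings}
    total = sum(weighted.values())
    if total == 0:
        return {f: 0.0 for f in ratings}
    return {f: w / total for f, w in weighted.items()}

# Program 2 ("Lost") ratings for profiles 1 and 2, in percent.
ratings = {1: 3.0, 2: 1.54}
p_plain = profile_probabilities(ratings)

# Same ratings, but profile 1 is associated with this STB only with
# probability 0.5 (illustrative value).
p_weighted = profile_probabilities(ratings, {1: 0.5, 2: 1.0})
```

Without weighting, profile 1 receives 3 / 4.54 of the probability mass; with the illustrative weighting, profile 2 becomes the more likely viewer.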
[0149] Let us note that the accuracy of $P_j(f)$ gets higher if
the watching correlations between the different profiles, f',
associated with the $STB_i$ that watched program j, are as low as
possible. It should be noted that zero correlation means that only
one profile, f, out of the list of profiles, f', which are
associated with the $STB_i$, would usually watch the program
j.
[0150] Applying a maximization on all probabilities $P_j(f)$,
obtained for each of the viewer profiles, f, that are associated
with the $STB_i$, would then result in obtaining the content to
viewer profile assignment, where the profile having the highest
probability as determined is the viewer profile that watched the
program. It should be noted that if more than one profile has the
same highest probability as determined, then all such viewer
profiles are taken to have watched the program.
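The maximization step, including the tie case, may be sketched as follows. The function name, the tolerance used to decide that two probabilities are equal, and the sample values are hypothetical.

```python
def assign_profiles(probabilities, tolerance=1e-9):
    """Return the profile(s) with the highest probability for one program.

    probabilities: {profile: probability that the profile consumed the
                    program via this set top box}
    More than one profile is returned when several share the maximum.
    """
    if not probabilities:
        return []
    best = max(probabilities.values())
    return sorted(f for f, p in probabilities.items()
                  if best - p <= tolerance)

single = assign_profiles({"Female 30-55": 0.66, "Male 18-40": 0.34})
tied = assign_profiles({"Female 30-55": 0.5, "Male 18-40": 0.5})
```

Here `single` contains only the first profile, while `tied` contains both, reflecting the case in which more than one profile shares the highest probability.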
EXAMPLE 2
[0151] While example 1 usually provides accurate results, it might
take a long computation time, in case it needs to be computed for
each content, for example for each program, and each set top box
that consumed this content (for example, watched the program),
separately. Moreover, example 1 depends upon the input set B of
targeted ratings.
[0152] An alternative example, as shown by example 2, would just
apply algebraic manipulations on the sets A and C, described above,
where set A is either obtained from the processes of learning and
identification, or is received from an external source. It should
be noted that in accordance with the second example, there is no
requirement for calculating or receiving the targeted rating (set B
above).
[0153] Assuming for illustration purposes of this example that the
sets A and C are matrices and that a content is a program, the
following method of C2P may be considered. For each program j that
had been watched via $STB_i$, $P_j(f)$, the probability that a
profile f (of a list of profiles associated with $STB_i$) is the
one that watched program j, is obtained via algebraic manipulations
on the matrices A and C and statistical inference:
[0154] $P_j(f)$ is calculated as the number of set top boxes that
were associated with profile f via which program j had been
watched, divided by the number of set top boxes via which program j
had been watched. Then, the quantity $P_{ij}(f)$ is obtained as the
probability that $STB_i$ contains profile f, and via it program j
had been consumed. Then, as in example 1, a maximization on all
probabilities $P_{ij}(f)$ may be applied for each of the viewer
profiles, f, that are associated with the $STB_i$, thereby
resulting in obtaining the content to viewer profile assignment,
where the profile having the highest probability as determined is
the viewer profile that watched the program. Again, it should be
noted that if more than one profile has the same highest probability
as determined, then all such viewer profiles are taken to have
watched the program.
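The count-based calculation of example 2 may be sketched as follows for binary matrices. The function name and sample matrices are hypothetical. The text does not spell out how the per-set-top-box quantity is formed from the per-program quantity; multiplying the association entry A[i][f] by the per-program ratio is an assumption made here for illustration.

```python
def c2p_counts(A, C, i, j):
    """Count-based content to profile probabilities for STB i, program j.

    A[s][f] is 1 if profile f is associated with set top box s, else 0.
    C[s][j] is 1 if set top box s watched program j, else 0.
    Returns {profile index: probability} for profiles associated with STB i.
    """
    watchers = [s for s in range(len(C)) if C[s][j]]
    if not watchers or not C[i][j]:
        return {}
    n_profiles = len(A[0])
    # Share of the set top boxes that watched j and are associated with f.
    p_j = [sum(1 for s in watchers if A[s][f]) / len(watchers)
           for f in range(n_profiles)]
    # Assumed combination: association entry times the per-program share.
    return {f: A[i][f] * p_j[f] for f in range(n_profiles) if A[i][f]}

# Four set top boxes, two profiles, one program (illustrative values).
A = [[1, 1], [1, 0], [0, 1], [1, 0]]
C = [[1], [1], [0], [1]]
p_i0 = c2p_counts(A, C, i=0, j=0)
best_profile = max(p_i0, key=p_i0.get)
```

In this illustration all three set top boxes that watched the program are associated with profile 0 and only one with profile 1, so profile 0 is assigned to the program for set top box 0. No targeted rating matrix B is needed.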
[0155] It should be noted that other methods may be used to
associate content to viewer profiles and such methods are intended
to be included within the present description.
Total Viewership
[0156] Further, a total viewership may be calculated (using, e.g.,
a program-to-time-slot map and applying to it a calculation algorithm
which utilizes data obtained in the previous steps described here),
which is the calculation of total aggregated viewing activities for
each of the pre-defined profiles (these may be demographic or
behavioral), during a twenty-four hour period for each
weekday.
[0157] For example, having the association of profile(s) with each
set top box, represented as a set of probabilities (either obtained
as an output from the learning and identification steps or given
from an outside source), and given the set top box signatures
(e.g., as an output from the data modeling stage), given in
addition the broadcasting time table (showing for a pre-defined
period of time at which time and date and for which duration each
program was broadcast), the following calculation is
performed.
[0158] The data is aggregated and modulated in such a form that for
each day of the week (24 hours) it is calculated how many of each
of the pre-defined profiles watched any content during each of the
pre-defined time intervals. For example, if the period decided upon
is three months and there were 12 Sundays during this period, the
24-hour period is divided into intervals of 15 minutes and for each
such interval it is calculated (using the set top box signatures
and the data mentioned above) how many times each of the
pre-defined profiles watched any content during each of the 15
minute intervals, aggregated over all 12 Sundays across a 24-hour
span. Then this information is presented in a graph showing the
viewing peaks during a 24-hour Sunday divided into 15-minute slots
per each profile. This is done for each day of the week (aggregated
over the number of times this weekday occurred during the
three-month period).
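The aggregation into 15-minute slots per weekday may be sketched as follows. The input format, a list of (timestamp, profile) viewing observations already attributed to profiles, is hypothetical; how such observations are derived from the signatures and association probabilities is outside this sketch.

```python
from collections import defaultdict
from datetime import datetime

SLOT_MINUTES = 15  # 96 slots per 24-hour day

def total_viewership(events):
    """Count viewing observations per (weekday, slot, profile).

    events: iterable of (timestamp, profile) pairs.
    weekday follows datetime.weekday(): Monday is 0 and Sunday is 6.
    """
    totals = defaultdict(int)
    for ts, profile in events:
        slot = (ts.hour * 60 + ts.minute) // SLOT_MINUTES
        totals[(ts.weekday(), slot, profile)] += 1
    return dict(totals)

# Two different Sundays in August 2008 (illustrative observations).
events = [
    (datetime(2008, 8, 17, 20, 5), "Male 18-40"),
    (datetime(2008, 8, 24, 20, 14), "Male 18-40"),
    (datetime(2008, 8, 24, 20, 20), "Male 18-40"),
]
totals = total_viewership(events)
```

Observations from different Sundays that fall in the same 15-minute slot are added together, which gives the per-weekday viewing peaks that the graph would display.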
[0159] In addition to the abovementioned, a targeted rating
distribution may be determined, which involves, for every channel,
for every profile, calculating the rating of the channel for every
brief period of time (e.g., thirty seconds), for every minimally
defined region. Further, a viewership flow may be determined, which
includes, for every channel, calculating the number (or percentage)
of viewers of every profile that join and leave the channel during
every short period of time (e.g., thirty seconds), for every
minimally defined region. Still further, creative reports may be
determined such as, for example, during an advertisement break, for
each second, calculating the rating and viewership flow. All the
aforementioned are merely examples of the post processing
possibilities.
[0160] In the supervised case, with the knowledge gained by the
functionality of block 310, for any households that did not fill
out the questionnaire, the management application 50 uses
identification functionality to associate the rest of the set top
boxes 110 with the profiles that are using the set top boxes 110
(block 312). An example of the functionality, which is used as a
basis for such an identification functionality, is provided herein
below. It should be noted that different relevant learning methods
may be used to perform the identification functionality. Examples
of such learning methods may include the use of any one of the
following, or other learning methods: Bayesian learning, various
statistical methods, artificial neural networks; decision trees;
k-nearest neighbor; quadratic classifier; support vector machine;
various optimization methods, and direct calculation of
probabilities. Of course, other learning methods may be used and
are intended to be included within the present description.
Viewership Flow
[0161] Using the identified profiles data and high-resolution time
signatures, a viewership flow may be calculated. It should be noted
that a high-resolution time signature is a representation of which
channel each set top box watched during each time step of a
specific time interval, such as, but not limited to, thirty
seconds. In addition, a viewership flow is the number of viewers of
each profile that left or joined watching a specific channel during
each time interval (e.g., 30 seconds), during a day or any
pre-defined time interval. Viewership flow may be calculated using,
for example, but not limited to, a high-resolution regional
targeted rating, in addition to the data of signatures and lists of
profiles associated with each set top box.
[0162] Calculation of viewership flow is performed in a few steps.
It should be noted that the following is an example of steps that
may be used to calculate viewership flow; however, the following
example is not the only way to calculate viewership flow and this
example is not intended to be limiting. As a first step, the
high-resolution regional targeted rating is calculated. Calculation
of the high-resolution regional targeted rating provides, per each
channel and per each viewer profile, the percentage of viewers of
this viewer profile that watched this channel per each time
interval (for example, 30 seconds) during each day of a specified
period. Such targeted rating may be calculated, for example, but
not limited to, using a method similar to the method described in
the targeted rating section of the present description, where the
word program is replaced by channel per time interval.
[0163] To calculate viewership flow, the differences between the
targeted ratings of same viewer profiles, per different time
intervals, may be calculated to record the change in number of
viewers of each profile between successive time intervals.
Moreover, using for example, but not limited to, the method
described above as content to profile assignment, the number of
viewers that left or joined the viewers of each channel at each
time interval may be calculated. To summarize: the viewership flow
application may contain various descriptions of changes in viewers
per channel per time interval. Examples of the abovementioned
include, but are not limited to, targeted rating and the changes in
targeted rating per time interval, and number of viewers of each
profile who left or joined the viewers of the channel at each time
interval.
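The difference step of the viewership flow calculation may be sketched as follows. The function name and the sample rating series are hypothetical. The sketch yields only the net change between successive time intervals for one channel and one viewer profile; separating viewers who joined from viewers who left would additionally require a content to profile assignment, as noted in the text.

```python
def rating_changes(ratings):
    """Net change in targeted rating between successive time intervals.

    ratings: targeted rating of one channel for one viewer profile,
             one value per successive time interval (e.g., 30 seconds).
    """
    return [later - earlier for earlier, later in zip(ratings, ratings[1:])]

# Targeted rating (in %) over three successive intervals (illustrative).
changes = rating_changes([2.0, 2.5, 1.5])
```

A positive entry indicates a net gain of viewers of that profile on the channel during the interval, and a negative entry indicates a net loss.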
Unsupervised Learning
[0164] Reference is now made to the flowchart 800 of FIG. 8. The
flowchart 800 of FIG. 8 further illustrates the process of
identifying and associating consumer profiles to set top boxes
100A-100D within an unsupervised learning scenario. It should be
noted, that unlike with supervised learning, with unsupervised
learning no sample relating viewer profiles to set top boxes is
provided. Moreover, the type of viewer profiles might be unknown at
the stage of the learning. As a result, the viewer profiles must be
determined. It should be noted that different types of viewer
profiles may exist, including, but not limited to, demographic and
psychographic types of viewer profiles. For example, for the
psychographic type of viewer profile, the profile may contain
multiple categories, such as, but not limited to, watching habits,
purchasing behavior, social class, lifestyle, opinions, and
values.
[0165] To determine viewer profiles one of many methods may be
used, such as, but not limited to, using clustering algorithms to
find common denominators within a population in association with
viewing habits of the population. An example of a method that may
be used for profile learning and determination is provided
below.
[0166] As shown by block 802, set top boxes 110 in the network 10
record all zapping events created by the consumers. The set top
boxes 110 send the zapping events to the management application 50
(block 804). It should be noted that the zapping events include an
identification of the set top box from which the zapping events
were derived. The management application 50 then associates
behavior of consumers and their zapping patterns (block 806).
[0167] FIG. 9 is a block diagram further illustrating functionality
of the management application 50 as blocks of logic. As shown by
FIG. 9, the management application 50 contains modeling logic 902,
learning logic 904, identification logic 906, analyzer logic 908,
profiles determination logic 910, post processor logic 912, and
reporting logic 914. The logic of the management application 50 is
further described in detail with regard to the logical flow diagram
of FIG. 10.
[0168] FIG. 10 is a detailed logical flow diagram illustrating a
sequence of events performed during unsupervised learning. The
zapping log and the broadcast schedule (arrows 1) are the inputs to
modeling functionality of the management application 50, the output
of which is a collection of set top box signatures (arrow 2),
wherein the collection of set top box signatures includes a
signature for each set top box in the network. The set top box
signatures may be one of multiple classes of signatures, wherein
the classes of signatures include viewing signatures, time
signatures, and zapping frequency signatures. Each set top box in
the network may have multiple signatures, wherein the signatures
for a single set top box are selected from the classes of
signatures. In fact, for example, a single set top box may even
have one or more of each class of signature. Each such set top box
also has a unique identification (ID). Viewing signatures are
vectors of all the programs watched during a specified period by
each of the set top boxes in the network.
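The modeling step described above can be sketched as follows. This is only a minimal illustration: it assumes a zapping log that has already been joined against the broadcast schedule into (set top box ID, program ID, seconds watched) records. The record layout, the function name, and the choice of "portion watched" as the vector entry are assumptions made for this example.

```python
from collections import defaultdict

def build_viewing_signatures(records, program_lengths):
    """Return (programs, signatures).

    records: iterable of (stb_id, program_id, seconds_watched) tuples
             (hypothetical layout, assumed already matched to the schedule).
    program_lengths: {program_id: length in seconds}.
    signatures: {stb_id: [portion of each program watched, in `programs` order]}.
    """
    programs = sorted(program_lengths)
    watched = defaultdict(lambda: defaultdict(float))
    for stb_id, program_id, seconds in records:
        watched[stb_id][program_id] += seconds
    signatures = {}
    for stb_id, per_program in watched.items():
        # One vector entry per program, capped at 1.0 (the whole program).
        signatures[stb_id] = [
            min(1.0, per_program.get(p, 0.0) / program_lengths[p])
            for p in programs
        ]
    return programs, signatures
```

Stacking the resulting vectors row by row gives one possible form of the signature collection used as the input to the learning functionality.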
[0169] The set top box signatures are the input used by learning
functionality (arrow 3) of the management application 50. The
learning functionality clusters profiles into groups of profiles
that are yet unresolved. It should be noted that an unresolved
profile is a profile for which a type is not yet known.
Specifically, the learning functionality, which is further described
in detail below under the section entitled "learning", is capable
of using the set top box signatures and determining relationships
between profiles to derive clusters of profiles, where a type of a
profile is not yet known. As an example, an optimization algorithm
may be used to cluster the profiles into groups of unresolved
profiles, an example of which is illustrated below. The learning
step may be performed a few times, to determine the number of
existing profile groups available for identification from viewing
signature data. This may be done by, for example, but not limited
to, throwing out, after each iteration, the profile groups that
have similarity to each other, which is greater than a pre-defined
threshold.
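One possible reading of the pruning step above is sketched below. The use of cosine similarity, the default threshold, and the policy of keeping the first group of each similar set are all assumptions made for this illustration; the text does not fix a similarity measure.

```python
import numpy as np

def prune_similar_profiles(profile_vectors, threshold=0.95):
    """Return the indices of the profile groups (rows) to keep.

    A row is dropped when its cosine similarity to an already kept row
    exceeds `threshold` (hypothetical similarity measure and policy).
    """
    P = np.asarray(profile_vectors, dtype=float)
    norms = np.linalg.norm(P, axis=1)
    kept = []
    for i in range(P.shape[0]):
        if norms[i] == 0.0:
            continue  # an all-zero profile group carries no information
        is_duplicate = False
        for j in kept:
            cosine = float(P[i] @ P[j]) / (norms[i] * norms[j])
            if cosine > threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(i)
    return kept
```

The number of indices returned after an iteration could then serve as the number of profile groups requested in the next learning pass.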
[0170] As previously mentioned, the output of the learning
functionality of the management application 50 is clusters of yet
unresolved profiles (arrow 4). The clusters of the yet unresolved
profiles, together with a profile description (arrows 5), are the
input to the profiles determination functionality of the management
application 50.
[0171] The profiles description is a classification, or definition,
of profiles of viewers by groups that associates between, for
example, viewing habits and purchasing habits of individuals. The
profiles description is provided by an external source, such as,
but not limited to, a single source researcher. It should be noted
that the profile description input is some external definition of
profiles that is fed to the system.
[0172] The profiles determination functionality performs a match
between the profiles found by the learning functionality
(unresolved profiles) and the profiles description from the
external source, which determines whether to match the profiles to
demographic clustering or to a specific psychographic clustering,
for example, by consuming habits. The profile determination with
respect to a given profile description may be done, for example, by
performing a standard best match procedure on each of the profiles
in both groups (unresolved and pre-defined) and by finding the best
possible match to each profile from the unresolved group from the
defined profiles. It should be noted that sometimes one unresolved
profile might fit two described profiles and, vice versa, two or
more unresolved profiles might match one profile from the described
profiles group.
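A best match procedure of the kind mentioned above might look like the following sketch. It assumes that both the unresolved and the described profiles are represented as vectors over the same set of programs and that Pearson correlation is the matching score; both are assumptions of this example, as are the function and variable names.

```python
import numpy as np

def best_match(unresolved, described):
    """Match each unresolved profile to the best-correlated described profile.

    unresolved: (n_unresolved x n_programs) array.
    described:  (n_described x n_programs) array.
    Returns (match_index_per_unresolved_profile, correlation_matrix).
    """
    u = np.asarray(unresolved, dtype=float)
    d = np.asarray(described, dtype=float)
    u = u - u.mean(axis=1, keepdims=True)
    d = d - d.mean(axis=1, keepdims=True)
    u_norm = np.linalg.norm(u, axis=1, keepdims=True)
    d_norm = np.linalg.norm(d, axis=1, keepdims=True)
    # Constant rows have undefined correlation; treat it as zero.
    u_norm[u_norm == 0.0] = 1.0
    d_norm[d_norm == 0.0] = 1.0
    corr = (u / u_norm) @ (d / d_norm).T
    return corr.argmax(axis=1), corr
```

Because each unresolved profile is matched independently, two unresolved profiles may map to the same described profile, which is the many-to-one situation noted in the text.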
[0173] The output of the profiles determination functionality is
the resolved profiles (arrow 6), which are the input, together with
the set top box signatures, to an identification functionality
(arrows 7).
[0174] In accordance with an alternative embodiment of the
invention, the learning and the profiles determination
functionalities may be performed simultaneously by combining these
two functionalities (learning and profile determination) of the
management application 50 into one. In accordance with this
embodiment, the profiles description and the set top box signatures
are both fed as inputs to the learning and profiles determination
functionalities (arrows 3 and 5). In this case, the learning and
profiles determination functionalities are performed together. The
output of the learning and profiles determination functionalities
is resolved profiles (arrow 6). In the case of combining these two
functionalities, directing the learning process toward the input
profiles description may be done by, for example, but not limited
to, feeding the described profiles as an initial guess to the
optimization process and using the number of the defined profiles
as the number of profiles to be found.
[0175] The resolved profiles are sometimes used together with the
set top box signatures as an input to the identification
functionality of the management application 50 (arrows 7), to
associate each set top box in the network with at least one
profile. During this association, a quantization process may, for
example, be performed.
[0176] A quantization process is a process during which, rather
than having a continuous range of probabilities of having each of
the profiles associated with some set top box, some profiles would
be decided as not associated to that set top box (due to having a
too small probability of being associated), while other profiles
would be decided as being associated (with some higher probability,
or 1). A quantization process may be performed by, for example,
calculating a statistical constant related to the association of
profiles to set top boxes (see detailed explanation below) and
performing rounding steps. A quantization procedure may be
performed at various steps of the learning and identification
process.
[0177] The identification of lists of profiles associated with each
set top box in the network may be performed by, for example, but
not limited to, combining the association rule between unresolved
profiles to set top boxes and the association rule between resolved
and unresolved profiles to create an association rule associating
lists of resolved profiles to set top boxes. For example, the
association rules may be matrices of parameters and the application
of the association rules may be performed, by using matrix
multiplication.
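The combination of association rules by matrix multiplication can be illustrated with a small example. The matrix names, the toy values, and the final thresholding step are assumptions for illustration only.

```python
import numpy as np

# Rows: set top boxes; columns: unresolved profiles (1 = associated).
stb_to_unresolved = np.array([[1, 0, 1],
                              [0, 1, 0]])

# Rows: unresolved profiles; columns: resolved profiles (1 = matched
# during profile determination). Hypothetical values.
unresolved_to_resolved = np.array([[1, 0],
                                   [0, 1],
                                   [0, 1]])

# Matrix multiplication chains the two rules; any positive entry means the
# set top box reaches that resolved profile through some unresolved profile.
counts = stb_to_unresolved @ unresolved_to_resolved
stb_to_resolved = (counts > 0).astype(int)
```

In this example the first set top box ends up associated with both resolved profiles and the second set top box with the second resolved profile only.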
[0178] The output of the identification functionality (arrow 8) is
the identification of which profile(s) uses each of the set top
boxes in the network. In other words, the output is an
identification of at least one profile associated with each set top
box in the network.
[0179] The profiles description, set top box signatures, and
profiles associated with each set top box (arrows 9) are fed to
analyzer functionality of the management application 50, the output
of which is an estimation of identification quality and error
estimation (arrow 11). Specifically, the analyzer is a
self-assessment tool of the management application. The analysis in
the case of unsupervised learning is performed with respect to the
profiles definition input. The output of the analyzer may be, for
example, the quality of the ability of the system to classify the
profiles into groups according to the given profile definition,
ranking the quality of the input data in view of desired output
versus the actual output, and error estimation regarding the
accuracy of the identification process.
[0180] The estimated errors may be, for example, the expected
deviation from the actual situation, and false positive and false
negative identification rates. Moreover, correlations between the
different profiles groups may be calculated, thereby providing
information regarding identification possibilities of certain
profiles with respect to their correlations with other profiles.
This may be done, for example, by performing comparison of results
with known statistics, or by comparing results obtained for all of
the network with results obtained from a well representing subgroup
of the network.
[0181] The identified profiles associated with a set top box are
fed as an input, together with the set top box signatures (either
the same ones used for the learning and identification
functionalities, or others, such as time signatures or
high-resolution time signatures) and additional set top box data,
if required, to post processor functionality of the management
application 50 (arrows 12). The post processing functionality
computes various data, such as: regional targeted rating (RTR),
content to profile assignment (C2P), total viewership and
viewership flow. A description of these functionalities was
presented above. Note that the computation of the functionalities
of the post processor may remain the same for data (associating
lists of profiles to set top boxes) obtained via supervised
learning, unsupervised learning, or an external source.
[0182] Reporting functionality of the management application 50
uses the computed data to produce business and other reports (arrow
13). As with the supervised scenario, the association process, also
referred to as the learning and identification process, is divided
into multiple steps. The steps in the association process include
data collection, modeling, learning, profiles determination,
identification, analysis, and post processing. Of the multiple
steps, usually the data collection, modeling, analysis and post
processing remain the same for both the supervised and unsupervised
processes. The main difference in the supervised and unsupervised
processes is in the learning step, which may also include a profile
determination step, and which may introduce some differences in the
identification steps. Note that the steps of learning, profile
determination, and identification are sometimes referred to herein,
for short, as "unsupervised learning". The unsupervised learning process
is further defined herein below.
[0183] Learning
[0184] For unsupervised learning, each set top box signature is
learned to be associated with a certain list of unresolved profiles
defined solely using the set top box signatures. Examples of such
set top box signatures include, but are not limited to, viewing
signatures, time signatures, high-resolution time signatures, and
zapping frequency signatures. It should be noted that the main
difference from the supervised learning process is that no sample
is provided in this case. An unsupervised learning algorithm
receives the set top box signatures only as an input, resulting in
a classification of profiles into, for example, a certain type of
psychographic (for example, behavioral) or demographic profile
groups. After the first step (unless the steps of learning and
profile resolving are combined) the resulting learned profiles are
usually yet unresolved, meaning that their nature is yet to be
resolved.
[0185] Examples of unsupervised learning algorithms include, but
are not limited to, least squares algorithms and algorithms that
provide minimization via steepest descent. Other outputs from the
learning algorithms include an association of profiles to set top
boxes and obtaining a targeted rating of the defined profiles at
the same time, thereby providing a probability that a profile is
associated with a set top box.
[0186] The following is provided as an example of an unsupervised
learning algorithm. An input to the unsupervised learning process
is the collection of set top box signatures, which is the output of
the data modeling process. Assume as an example that these are
viewing signatures (although these might be time signatures, etc.),
where we denote their parametrical representation by a matrix C.
For example, each row of the matrix C may refer to one set top box,
and each column of the matrix C may refer to, for example, but not
limited to, one program, where the entries of matrix C may be, for
example, the portions of the programs that each set top box
watched, or, for example, the probabilities with which each of the
set top boxes represented in matrix C watched each of the programs
represented in matrix C. Let us denote by a matrix A the collection
of probabilities, representing viewer profiles association to the
set top boxes, where the entries of the matrix A are the
probabilities of each of the viewer profiles to be associated with
each of the set top boxes. Note that the viewer profiles might be
yet unresolved viewer profiles at this stage. Let us denote by the
matrix B, targeted rating probabilities. Both A and B are unknown
in the case of unsupervised learning. To obtain the desired outputs
A and B, we use, for example, but not limited to, the following
method. We minimize the squared norm of the difference (AB-C) (see
equation three), to obtain the approximation of the matrix C as the
product AB. For this, we are using, for example, but not limited
to, a convex optimization algorithm (or, for example, some other
nonlinear minimization algorithm) under various constraints, such
as, but not limited to, that each quantity in A is greater than
zero and smaller than one, and each quantity in B is greater than
zero and smaller than, for example, 0.5. The following description
further describes this process.
[0187] Following this example, to determine a possible algorithm
for achieving the minimization of the squared norm of the matrix
(AB-C), (see equation three), considered above, it is assumed that
the population consists of viewers that can be divided into several
groups of different profiles, where each viewer may belong to one
or more group of viewers profiles. Each such group of profiles is
associated, for example, with a behavior pattern in terms of
watching habits, where the pattern consists of, for example, but
not limited to, the viewing signatures and the targeted rating per
content and per each profile, where the targeted rating for the
profile is the probability of a viewer of this profile watching
each program, or some other definition of content.
[0188] Since the number of all possible profile groups is usually
low compared to the number of programs and set top boxes in the
network, one is actually looking for a low rank approximation of
the matrix C. The term low rank (of matrices A and B) refers in
this case to the fact that the number of different profile groups
is smaller than the dimensions of C, which represent, for example, the
number of programs and the number of set top boxes in the network.
Due to this low rank, the matrices A and B may be obtained
using this approximation. One approach to obtaining a low rank
approximation of the matrix C is to search for the matrices A and B
that minimize the squared norm of the matrix (AB-C). This can be
done using, for example, a convex optimization method on the
quantity of equation three, which reads:
$$n = \|AB - C\|^2 = \sum_{i,j}\Bigl(\sum_{k} A_{ik}B_{kj} - C_{ij}\Bigr)^2 = \operatorname{Trace}\bigl((AB - C)^{T}(AB - C)\bigr) \qquad \text{(Eq. 3)}$$
where n denotes the squared norm of (AB-C), and trace is a known
operation on a matrix providing the sum of the diagonal. In order
to minimize this efficiently, one may use the derivatives of
equation three, described in equations four and five, each of which
read as follows:
$$\frac{\partial n}{\partial A_{ab}} = 2\sum_{j}\Bigl(\sum_{i} A_{ai}B_{ij} - C_{aj}\Bigr)B_{bj}, \qquad \frac{\partial n}{\partial A} = 2(AB - C)B^{T} \qquad \text{(Eq. 4)}$$
and correspondingly,
$$\frac{\partial n}{\partial B} = 2A^{T}(AB - C) \qquad \text{(Eq. 5)}$$
[0189] The second derivatives may also be calculated in order to
perform this minimization and they are given by the combination of
equations six, seven, and eight below:
$$\frac{\partial^2 n}{\partial A_{ab}\,\partial A_{cd}} = 2\,\delta_{ac}\,(BB^{T})_{bd} \qquad \text{(Eq. 6)}$$
$$\frac{\partial^2 n}{\partial B_{ab}\,\partial B_{cd}} = 2\,\delta_{bd}\,(A^{T}A)_{ac} \qquad \text{(Eq. 7)}$$
$$\frac{\partial^2 n}{\partial A_{ab}\,\partial B_{cd}} = 2A_{ac}B_{bd} + 2\,\delta_{bc}\,(AB - C)_{ad} \qquad \text{(Eq. 8)}$$
Using any standard convex optimization technique and the
derivatives above with the (convex) constraints $0 \le A_{ij} \le 1$ and
$0 \le B_{ij} \le 1$, a solution of the optimization problem may be
found, where the joint dimension of the matrices A and B is chosen
as the desired, or expected, number of profiles.
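As one concrete, hedged illustration of the minimization described above, the sketch below alternates projected gradient steps on A and B, using the first derivatives of equations four and five and clipping to the box constraints. Projected gradient descent is a simple technique substituted here for illustration and is not necessarily the optimization routine contemplated in the text; the function names, step-size rule, default iteration count, and default upper bound of 0.5 on B (taken from the example bound mentioned earlier) are assumptions. Such a scheme may converge only to a local minimum.

```python
import numpy as np

def squared_norm(A, B, C):
    """The quantity n of equation three."""
    return float(np.sum((A @ B - C) ** 2))

def factorize(C, n_profiles, steps=2000, seed=0, b_max=0.5):
    """Approximate C (set top boxes x programs) by A @ B.

    A: set top boxes x profiles, entries kept in [0, 1].
    B: profiles x programs, entries kept in [0, b_max].
    """
    rng = np.random.default_rng(seed)
    n_boxes, n_programs = C.shape
    # Random initial guess for the probabilistic quantities.
    A = rng.uniform(0.0, 1.0, size=(n_boxes, n_profiles))
    B = rng.uniform(0.0, b_max, size=(n_profiles, n_programs))
    for _ in range(steps):
        # Gradient step in A (Eq. 4), then projection onto [0, 1].
        grad_A = 2.0 * (A @ B - C) @ B.T
        step_A = 1.0 / (2.0 * np.linalg.norm(B @ B.T, 2) + 1e-12)
        A = np.clip(A - step_A * grad_A, 0.0, 1.0)
        # Gradient step in B (Eq. 5), then projection onto [0, b_max].
        grad_B = 2.0 * A.T @ (A @ B - C)
        step_B = 1.0 / (2.0 * np.linalg.norm(A.T @ A, 2) + 1e-12)
        B = np.clip(B - step_B * grad_B, 0.0, b_max)
    return A, B
```

The step sizes are the reciprocals of the largest eigenvalues of the second-derivative blocks in equations six and seven, which keeps each projected step from increasing n.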
[0190] The matrix A is to be understood as the set of probabilities
of association of each of the profiles per each of the set top
boxes and the matrix B is the targeted rating matrix. Since the
matrix A is expected to contain binary quantities (either a profile
exists in a household or not), and since the optimal solution is
defined up to a multiplicative constant for each profile, it is
desirable to find a good quantization criterion for A.
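The multiplicative ambiguity mentioned above can be made explicit. For any diagonal matrix D with positive entries (D and its entries d_i are symbols introduced here only for illustration, with K denoting the number of profiles), rescaling the columns of A and the rows of B in opposite directions leaves the product, and therefore the quantity n, unchanged, provided the rescaled matrices still satisfy the constraints:

```latex
AB = (AD)(D^{-1}B), \qquad D = \operatorname{diag}(d_1, \dots, d_K), \quad d_i > 0
```

This is why the entries of a column of A may cluster around 0 and some profile-specific constant rather than around 0 and 1, and why a per-profile quantization constant is sought.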
[0191] Instead of the above-described example, for the unsupervised
learning algorithm, one may consider the slightly more complex
example described below. Moreover, these alternative ways may be
used to address specific different cases and the present invention
is not limited to these examples. An example of an alternative way
is, instead of minimizing the squared norm of the matrix (AB-C),
minimizing the squared norm of $(B - A^{+}C)$, denoted herein by m:
$$m = \|B - A^{+}C\|^2 \qquad \text{(Eq. 9)}$$
In addition, it is also possible to minimize the squared norm of
$(A - CB^{+})$, denoted by v:
$$v = \|A - CB^{+}\|^2, \qquad \text{(Eq. 10)}$$
where $A^{+}$ denotes the pseudo-inverse of the matrix A, and
$B^{+}$ denotes the pseudo-inverse of the matrix B. For example,
the Moore-Penrose pseudo-inverse may be used. This enables a
reduction of the dimensionality of the problem, as the dimensions of
the latter matrices are usually much smaller than those of the matrix
(AB-C). Further, this approach creates a sharper distinction
between the probabilities in A (desired to be binary) and of B
(usually small probabilities representing targeted rating) in the
minimization process. The pseudo-inverse of a matrix is unique in
mathematical terms, hence minimizing equations nine or ten is well
defined. In the case of minimizing, for example, the quantity m,
one would need to use the derivatives
$$\frac{\partial m}{\partial A} \quad \text{and} \quad \frac{\partial m}{\partial B},$$
which involves calculating derivatives of the form
$$\frac{\partial A^{+}}{\partial A_{ab}},$$
where:
$$\frac{\partial A^{+}_{ij}}{\partial A_{ab}} = (A^{+}A^{+T})_{ib}\,\delta_{ja} - A^{+}_{ia}A^{+}_{bj} - (A^{+}A^{+T})_{ib}\,(A^{+T}A^{T})_{aj} \qquad \text{(Eq. 11)}$$
Applying the derivative in equation eleven to obtain the derivatives
$$\frac{\partial m}{\partial A} \quad \text{and} \quad \frac{\partial m}{\partial B},$$
as well as the second derivatives of the quantity m, results in
slightly longer expressions than the derivatives presented above
in equations 4-8, but similar in nature.
[0192] Moreover, instead of using convex minimization routines, we
may use various nonlinear minimizations with slightly altered
constraints to minimize the squared norms of the differences
above.
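For illustration, the alternative objectives of equations nine and ten can be evaluated with a Moore-Penrose pseudo-inverse as sketched below. Only the evaluation of m and v is shown, not their minimization; the function names are assumptions of this example.

```python
import numpy as np

def objective_m(A, B, C):
    """Squared norm of B - A^+ C (equation nine)."""
    return float(np.linalg.norm(B - np.linalg.pinv(A) @ C) ** 2)

def objective_v(A, B, C):
    """Squared norm of A - C B^+ (equation ten)."""
    return float(np.linalg.norm(A - C @ np.linalg.pinv(B)) ** 2)
```

Either function could be handed to a general-purpose nonlinear minimizer in place of the quantity n, with the constraints on A and B adjusted as needed.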
[0193] An initial guess, for example, but not limited to, a random
guess, is given to the algorithm for any of the probabilistic
quantities in A and B. Additional constraints may be given to the
algorithm to increase its accuracy. Of course, other optimization
(or learning) algorithms may be used. The output is a set of
probabilities, A, associating groups of profiles to the set top
boxes, which later may be quantized and/or resolved (using, when
needed, a profile resolving procedure and quantization), and a set
of probabilities, B, providing the targeted rating for each (for
example) program and each profile (also to be used in the profile
resolving scheme when needed). It should be noted that the targeted
rating may be re-calculated during the post-processing to increase
the accuracy.
[0194] It should be noted that the abovementioned examples,
equations, and functionalities are based upon the general premise
that matrix C can be approximated by matrix A multiplied by matrix
B. Of course, further examples for achieving such approximation may
be provided and such examples are intended to be included within
the present invention.
[0195] Quantization
[0196] The quantization step is typically, but not necessarily, to
be used after the learning and profile determination stage, in the
identification functionality, or a few times during the steps of
learning, profile determination, and identification.
[0197] One approach to finding the quantizing constants (a set of
constants that each of the probabilities relating each of the found
profiles to set top boxes should be divided by to determine whether
a certain profile should indeed be associated with a certain set
top box or not) is to assume that A is approximately a binary
matrix with a constant multiplicative factor per column, $s_i$
($1 \le i \le$ number of profile groups), or in other words,
assume that each of the i profile groups has its own quantization
constant. Since the entries are supposed to be binary quantities,
one expects the following from calculating the mean and variance
using the binomial distribution, as shown by equations 12 and
13.
$$\sum_{a} A_{ai} = s_i N p \qquad \text{(Eq. 12)}$$
$$\frac{\sum_{a} A_{ai}^2}{N} - \frac{\bigl(\sum_{a} A_{ai}\bigr)^2}{N^2} = s_i^2\, p q \qquad \text{(Eq. 13)}$$
where N is the number of set top boxes in the network, p is the
probability that a profile is associated to a set top box, and
q=1-p. Solving equation twelve and equation thirteen for $s_i$,
dividing $A_{ai}$ by $s_i$, and rounding to a pre-defined threshold,
leads to an association rule, associating each of the profiles
(resolved or yet unresolved) to each of the set top boxes.
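The quantization rule above can be worked through as follows. Writing the column mean as s_i p and substituting q = 1 - p into equation thirteen gives s_i equal to the sum of the squared entries of column i divided by the sum of its entries, after which p follows from equation twelve. The sketch below applies that result; the function name, the default threshold of 0.5, and the handling of all-zero columns are assumptions of this example.

```python
import numpy as np

def quantize(A, threshold=0.5):
    """Quantize association probabilities column by column.

    A: (set top boxes x profiles) array of non-negative values.
    Returns (s, p, associated), where `associated` is a 0/1 matrix.
    """
    A = np.asarray(A, dtype=float)
    n_boxes = A.shape[0]
    col_sum = A.sum(axis=0)
    col_sq_sum = (A ** 2).sum(axis=0)
    # s_i = sum_a A_ai^2 / sum_a A_ai, from equations twelve and thirteen.
    # All-zero columns get s_i = 1 so that the division below is defined.
    s = np.divide(col_sq_sum, col_sum, out=np.ones_like(col_sum), where=col_sum > 0)
    # p from equation twelve: sum_a A_ai = s_i * N * p.
    p = col_sum / (s * n_boxes)
    # Divide by the quantizing constant and round against the threshold.
    associated = (A / s >= threshold).astype(int)
    return s, p, associated
```

For a column that is exactly a binary vector scaled by a constant, the recovered s_i equals that constant and p equals the fraction of set top boxes associated with the profile.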
[0198] Profile Determination
[0199] Profile determination, or resolving, is a process that
defines the nature of identified profiles. During profile
resolving, profiles definition, for example from a single source
research results, such as, but not limited to, viewing habits and
behavior, may be used as inputs. In addition, the profile list and
targeted rating of defined profiles may be used as inputs. The
inputs are provided to a resolving algorithm resulting in profile
descriptions that describe each profile in the list.
[0200] The single source research addresses a focus group that
answers a questionnaire. There are two groups of questions in this
questionnaire, namely, a first group and a second group. The first
group refers to identity of a person, examples including behavior
(e.g., purchasing behavior, rest and relaxation preferences, etc.)
and demographic profile of the answering person. The second group
refers to media consumption, for example, about the time a person
would watch television each day of the week and his preferred
shows.
[0201] The single source research associates the media consumption
habits with other habits, such as, but not limited to, purchasing
habits and preferred vacation habits. The output of the single
source research is a set of profiles and their habits, while each
profile is associated with its media consumption habits. The
resolving algorithm finds the best correlation between two sets of
data, namely, for example, the media consumption habits of the
focus group; and, for example, the targeted rating of the defined
profiles (the output of the unsupervised learning algorithm).
Therefore, the resolving algorithm has the capability of defining
the traits of the learned profile in the unsupervised
algorithm.
[0202] In accordance with the present invention, after the learning
and identification are performed, the management application 50
knows online, or offline, the current psychographic or demographic
profiles that are consuming content for at least a portion of the
set top boxes of the network for which the zapping log contains
records of set top box zapping signatures. The information
regarding the current demographic/psychographic profiles that are
consuming content for set top boxes within the network for which
sufficient input was received, may be the basis for personalized
advertisements deployment in accordance with the present
invention.
Real Time Targeted Rating (RTTR)
[0203] The present system and method provides the capability of
determining whether a set top box within a network is on or off. In
addition, if the set top box is on, the present system and method
provides the capability of identifying in real time, or near real
time, which viewer profile is currently consuming content provided
by the set top box, what is the targeted rating of the viewer
profile, or profiles, currently consuming content provided by the
set top box, and a targeted rating of all viewer profiles that
consumed content of the set top boxes within the network, which are
part of the real time targeted rating system, for a predefined time
interval. This real time process is referred to herein as the real
time targeted rating. As previously mentioned, the content may be,
for example, but not limited to, video, audio, data, or any
combination of these. As will be described in additional detail
herein, the real time targeted rating functionality uses the
functionalities and methodologies mentioned above with regard to
supervised learning, unsupervised learning, identification, content
to profile assignment, and targeted rating. The real time targeted
rating process is described in detail hereafter.
[0204] Functionality performed in real time targeted rating may be
performed by the same management application of the present system
and method or by a separate management application, located in a
head end or in a different location, as described hereinabove. In
addition, the functionality
may be performed by a separate computer and/or server (not shown).
These embodiments are intended to be covered by the present
description. It should further be noted that certain functions of
the real time targeted rating process may instead be performed by
the set top box itself.
[0205] In accordance with the present invention, queries can be
made by users of the present system and method for execution of the
real time targeted rating functionality for each set top box within
the network that is covered by the present system and method. These
queries may be made, for example, but not limited to, through a
remote web client. For example, multiple web based clients may
subscribe to the system described herein for retrieving
pre-configured reports, reports which are created automatically by
the system periodically every pre-defined time interval (for
example, 5 minutes), or per query. Such queries may be made, for
example, regarding each of the pre-defined viewer profiles to find
out in real time, or near real time, whether a set top box in
question is on or off, to identify what viewer profile, or viewer
profiles are currently consuming (or consumed for the last
predefined time interval) content via the set top box in question,
to determine a targeted rating for a specific viewer profile that
consumed content from this set top box, and to determine targeted
ratings of all viewer profiles that consumed content of the set top
box, or set top boxes, within a predefined time interval. An
example of determining targeted ratings of all viewer profiles that
consumed content of the set top box, or set top boxes, within a
predefined time interval includes calculating targeted ratings for
viewer profiles that were determined to be consuming content
provided by the set top box, or several set top boxes, within the
last five minutes. Such a determination uses, per each set top box,
at least one set top box signature summarizing activities of the
set top box for the last five minutes. Further description is
provided herein.
[0206] The real time targeted rating system may include a part, or
all, of the following capabilities: data collection, modeling,
learning, identification, content to profile assignment, targeted
rating (and/or regional targeted rating), and a reporting
capability, which can be utilized, for example, via a web
interface, or other interface, to produce, for example, business
reports, system reports, or any other reports involving the
produced data. These reports may be generated automatically,
periodically (for example, every 5 minutes), per a query, or by any
combination of these. A query may be initiated, for example, by a user of the
system, by a web client, or by any other interface interacting with
the system and having the capability of making a query. Such a
query may be, for example, automatic, or manual, or provided by
another method.
[0207] The real time targeted rating system may be beneficial for
content placement, for example, at a certain time and a certain
channel, or a certain time and a certain set top box; where
content, may refer, for example, to an advertisement. Other
examples of content may be a program, or an audio content, or any
other example of content that may be consumed via a set top
box.
[0208] FIG. 13 is a flowchart 1100 illustrating functions performed
by the present system and method during execution of the real time
targeted rating process. As shown by block 1110, a determination is
made regarding whether a set top box is on or off. The process of
determining whether a set top box is on or off is described in
detail with regard to the flowchart of FIG. 14, as provided
hereafter.
[0209] Returning to FIG. 13, as shown by block 1120 if the set top
box is on, the real time targeted rating functionality determines
what viewer profile or profiles are currently consuming content
provided by the set top box. Determining which viewer profile or
profiles are currently consuming content provided by the set top
box is described in detail hereafter.
[0210] After determining which viewer profiles are currently
consuming content, a targeted rating or targeted ratings may be
determined (block 1130). The targeted rating may either be a
targeted rating for a viewer profile currently consuming content
provided by the set top box or the targeted ratings may be targeted
ratings of all viewer profiles that consumed content of the set top
box, or of several set top boxes, within a predefined time
interval. It should be noted that the real time targeted rating
process may be repeated after the passing of a predefined time
period, per user query, or both. By repeating this process, after
the passing of the predefined time period, the real time, or near
real time, determination of which viewer profiles are consuming
content from the set top box, may change, and is maintained current
(in the sense that after each predefined time period, a new
determination of what viewer profile(s) are consuming content of
the set top box is achieved, maintaining always the most current
identification). The time interval is small enough to be considered
`now` and big enough to allow for the accumulation of enough
data.
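A windowed targeted rating computation of the kind described above might be sketched as follows. The sketch assumes that, for the current predefined time interval, each set top box already carries an on/off decision, the channel it is tuned to, and the set of viewer profiles identified as consuming content. It further assumes that the targeted rating of a profile for a channel is estimated as the fraction of "on" set top boxes associated with that profile that are tuned to the channel. The data layout, names, and this particular estimator are assumptions for illustration.

```python
from collections import defaultdict

def real_time_targeted_rating(boxes, channel):
    """Estimate a targeted rating per viewer profile for one channel.

    boxes: list of dicts with keys 'on' (bool), 'channel', and
           'profiles' (set of profile names); hypothetical layout.
    Returns {profile: fraction of 'on' boxes with that profile tuned to channel}.
    """
    tuned = defaultdict(int)
    total = defaultdict(int)
    for box in boxes:
        if not box["on"]:
            continue  # set top boxes considered off are excluded
        for profile in box["profiles"]:
            total[profile] += 1
            if box["channel"] == channel:
                tuned[profile] += 1
    return {profile: tuned[profile] / total[profile] for profile in total}
```

Re-running such a computation after each predefined time period, or per query, would keep the result current in the sense described above.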
[0211] Generating reports (for example, business reports and/or
system reports), as shown by block 1140, whether automated,
periodic, per a user query, or a combination of these, may also be provided.
[0212] On/Off Set Top Box Determination With Real Time Targeted
Rating
[0213] In accordance with the present invention, the present system
and method provides the capability of determining whether content
provided by a set top box within the network is being consumed by a
viewer profile. Determining whether content is currently being
consumed allows the present system and method to determine if a set
top box is currently on or off.
[0214] One method that is used by the present system and method to
determine if content is currently being consumed is to continuously
update a set top box zapping signature with the occurrence of each
new zapping event associated with a set top box. By continuously
updating the set top box zapping signature of the set top box, the
set top box zapping signature remains current and may be considered
for determining if a set top box is currently on or off. It should
be noted herein that, in accordance with the present invention, a
set top box is considered to be off not only if no power is being
received by the set top box, but also if content provided by the
set top box is not being consumed by a viewer profile within a
predefined period (for example, if no zapping event occurred during
a predefined time period, with or without association to a
schedule).
[0215] FIG. 14 is a flow chart 1200 further illustrating the
process of determining if a set top box is on or off, in accordance
with an exemplary embodiment of the invention. As shown by block
1210 a broadcast schedule for a set top box within a network is
received. As previously mentioned a broadcast schedule includes,
for example, a timetable for the platform channels/programs during
the zapping gathering period. It should be noted that the broadcast
schedule may also include a schedule of video on demand programs, a
schedule of audio programs, or a schedule of any interactive
services.
[0216] As shown by block 1220, a determination is made regarding
when content provided by the set top box is complete. As an
example, a determination may be made regarding when a video program
or audio program is complete. After completion of content provided
by the set top box a predefined time period is allowed to pass
(block 1230). As shown by block 1240 a determination is then made
regarding whether a zapping event has occurred prior to the
expiration of the predefined time period after the completion of
content provided by the set top box. If the predefined time period
expired and no new zapping event occurred, the set top box is
considered to be off. Alternatively, if a zapping event occurred
within the predefined time period, the set top box is considered to
be on. Alternatively, if for example a schedule is unavailable, the
determination whether a set top box is on or off may be achieved by
checking if a zapping event occurred during an elapsed predefined
time period.
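The on/off rule of blocks 1230 and 1240 can be sketched as follows. This is only an illustrative sketch: the function name, its parameters, and the choice to keep treating a set top box as on while the predefined period has not yet expired are assumptions made for the example, not part of the described system.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def is_set_top_box_on(
    zapping_times: Sequence[datetime],
    now: datetime,
    predefined_period: timedelta,
    content_end: Optional[datetime] = None,
) -> bool:
    """Return True if the set top box is considered to be on."""
    if content_end is None:
        # No schedule available: check the elapsed predefined period before `now`.
        window_start, window_end = now - predefined_period, now
    else:
        # Schedule available: the period starts when the content completes.
        window_start, window_end = content_end, content_end + predefined_period
        if now < window_end:
            # Assumption of this sketch: until the period expires,
            # the box is still treated as on.
            return True
    # On if at least one zapping event falls inside the window.
    return any(window_start <= t <= window_end for t in zapping_times)
```

With a five-minute period, a zapping event two minutes after the content ends yields on, while a box whose last zapping event predates the end of the content yields off once the period has expired.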
[0217] Determining whether the set top box is on is important for
multiple reasons. One such reason is that content provided by or to
a set top box when the set top box is considered to be off should
not be considered when determining whether a viewer profile, and
what viewer profile, is currently consuming content from the set
top box. Such determination provides for a more accurate
determination of current content consumed by a viewer profile. This
determination is important when determining if and what
advertisement, or other content, to send to a set top box for
consumption by a viewer profile. Specifically, if no one is
consuming content provided by a set top box, resulting in the set
top box being considered to be off, there is no benefit in
forwarding advertisements to the set top box. In fact, determining
whether a set top box is on or off is important for other
calculations performed by the present system and method. If a set
top box is considered to be off, perhaps due to a lack of zapping
events occurring, content being provided by the set top box, or
that is available to the set top box, should not be considered for
calculation purposes, such as in determining which viewer profile
is associated with which set top box. Specifically, for example,
when a set top box is off, the input set C, which has been used in
many calculations described hereinabove, may have a value of zero,
in a place representing content transmitted during a time when the
set top box was considered by the application to be off.
Alternatively, depending on data representation, the set C may
contain no entries corresponding to time intervals, or contents
available to the set top box, during which the set top box was
determined by the system to be off.
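The two data representations mentioned for the set C can be illustrated with a small sketch. The per-interval list layout, the function names, and the use of a boolean flag per interval are hypothetical choices for this example only; the actual representation of the set C is not prescribed here.

```python
from typing import Dict, List


def zero_off_intervals(consumption: List[float], is_on: List[bool]) -> List[float]:
    """Keep one entry per time interval, with zero where the box was off."""
    if len(consumption) != len(is_on):
        raise ValueError("one on/off flag is required per time interval")
    return [value if on else 0.0 for value, on in zip(consumption, is_on)]


def drop_off_intervals(consumption: List[float], is_on: List[bool]) -> Dict[int, float]:
    """Alternative representation: keep only entries for intervals when the box was on."""
    if len(consumption) != len(is_on):
        raise ValueError("one on/off flag is required per time interval")
    return {
        index: value
        for index, (value, on) in enumerate(zip(consumption, is_on))
        if on
    }
```

Either form keeps content that was transmitted while the set top box was considered off out of later calculations; which one is convenient depends on whether downstream steps expect a fixed-length input.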
[0218] It should be noted that the set top box zapping signature
may instead be updated in accordance with a
predefined schedule so as to alleviate the need for acquiring and
processing a schedule, for updating the set top box zapping
signature with each new zapping event. As an example, if the real
time targeted rating process is being performed every five minutes,
it would be beneficial to have the set top box zapping signature
updated at least every five minutes. Of course, the timing in which
the set top box signature is updated may have many different
values.
[0219] It should also be noted that, in accordance with alternative
embodiments of the invention, as mentioned above, a broadcast
schedule may not be necessary for determination of whether a set
top box is on or off. Specifically, a time gap between zapping
events may be considered to determine if a set top box is on or
off. As an example, if a predefined time period passes without any
zapping events occurring, a set top box may be considered to be
off. Other methods of determining whether a set top box is on or
off may also be used, and such methods are intended to be included
within the present description.
[0220] As previously mentioned, with determination that a set top
box is on, the real time targeted rating functionality determines
what viewer profile or profiles are currently consuming content
provided by the set top box. The real time targeted rating
functionality applied depends upon whether supervised or
unsupervised learning was performed by the present system and
method for determining what viewer profiles are usually associated
with which set top boxes. Herein, the term usually is used to
distinguish between currently (i.e. in real time, or nearly real
time), and during a `relatively long` period of time during which
data was collected. The data regarding the `usual` (rather than
current) association of viewer profiles to a certain set top box,
may also be periodically updated, for example every three months
(or any other time interval, which is longer than the time interval
defined as current).
[0221] The following describes the real time targeted rating
process used for determining what viewer profile or profiles are
currently consuming content provided by the set top box, in
accordance with the present system and method, when a determination
has been made that a set top box is on. As mentioned above, after
determining that a set top box is on, the real time targeted rating
process then depends upon whether supervised or unsupervised
learning was performed by the present system and method for
determining what viewer profiles are associated with which set top
boxes in real time, or nearly real time.
[0222] The following first illustrates steps taken in real time
targeted rating when supervised learning was performed to determine
what viewer profiles were associated with which set top boxes.
Thereafter, illustration is provided of the steps taken in real
time targeted rating when unsupervised learning was performed to
determine what viewer profiles were associated with which set top
boxes. It should be noted that the following provides examples of
processes that may be performed during the real time targeted
rating process and the invention is not intended to be limited to
the same.
[0223] Real Time Targeted Rating with Supervised Learning
[0224] Referring to the supervised learning scenario, the
Association Rule derived after performing supervised learning, as
previously described, is gathered. As previously mentioned, the
Association Rule provides knowledge of how to associate a list of
profiles within a network to a set top box within the network. A
list of one or more of the viewer profiles that are determined to
be associated with the set top box, as determined after performing
the identification process, are gathered. The identification
process is not repeated here since it has been described in detail
hereinabove.
[0225] To determine which of the list of one or more viewer
profiles that were determined to be associated with the set top box
are currently consuming content provided by the set top box, the
previously obtained association rule, together with the list of one
or more of the viewer profiles that were determined to be
associated with the set top box, through the identification
functionality previously described, are applied to a newly obtained
set top box signature for the set top box in question.
Specifically, the set top box signature used is one that is
current, or one that has been updated at least within a predefined
period.
[0226] Alternatively, instead of using the association rule, but
still using the list of one or more viewer profiles determined to
be associated with the set top box, the present system and method
may provide real time, or near real time, determination of the one
or more viewer profiles that are currently associated with a set
top box by applying the content to profile procedure, as previously
described, to currently consumed content, as identified by a set
top box signature, and to the list of the one or more viewer
profiles determined to be associated with the set top box.
[0227] To apply the content to profile procedure the process used
by the content to profile functionality is performed. Specifically,
for example, a set A and a set C are provided, where the set A is a
list of one or more viewer profiles associated with a set top box
within a network, and set C is a summary of which set top boxes
within the network consumed content. To determine which profiles
consumed content within the last predefined time period, via use of
the content to profile functionality, we start with a summary of
which set top boxes within the network provided content, which was
consumed, within the predefined period. This summary of which set
top boxes within the network provided content, which was consumed,
within the predefined period can be obtained by reviewing, and/or
processing, the set top box signatures of each set top box in the
network.
[0228] Having the list of set top boxes that provided content
within the last predefined time period, a determination is made as
to which profiles are associated with the set top boxes that
provided content within the predefined period. As an example,
profiles f1 and f3 may be associated with set top box 1 (STB1), and
profiles f1 and f2 may be associated with the set top box 2 (STB2).
This example may be represented as STB1 has (f1, f3), and STB2 has
(f1, f2).
[0229] A determination is then made as to the probability that,
within the predefined time period, a specific profile consumed
content provided by a set top box that provided content within the
predefined time period. An example of a method that may be used to
determine the probability follows. If there are ten set top boxes
in a network that provided content within the predefined period,
and five of these set top boxes are associated with profile f1,
while four of these set top boxes are associated with the profile
f2, the probability that a profile f1 consumed content within the
predefined period, wherein the content was provided by a set top
box that provided content within the predefined period, can be
represented as P(f1)= 5/10. In addition, the probability that a
profile f2 consumed content within the predefined period, wherein
the content was provided by a set top box that provided content
within the predefined period, can be represented as P(f2)=
4/10.
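The counting in the example above can be sketched as follows. The function name and the mapping from set top box identifiers to associated profiles are assumptions made for illustration; exact fractions are used so that 5/10 and 4/10 are reproduced without rounding.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, Iterable, Mapping


def profile_probabilities(
    active_boxes: Mapping[str, Iterable[str]],
) -> Dict[str, Fraction]:
    """P(f) = (active boxes associated with f) / (all active boxes).

    `active_boxes` maps each set top box that provided content within the
    predefined period to the profiles associated with it.
    """
    total = len(active_boxes)
    if total == 0:
        return {}
    # Count each profile at most once per set top box.
    counts = Counter(
        profile for profiles in active_boxes.values() for profile in set(profiles)
    )
    return {profile: Fraction(count, total) for profile, count in counts.items()}
```

For ten active set top boxes, five associated with f1 and four with f2, this returns P(f1) = 5/10 and P(f2) = 4/10, which `Fraction` stores in lowest terms as 1/2 and 2/5.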
[0230] The probability that one or more viewer profiles associated
with a specific set top box consumed content within the predefined
period, from the specific set top box, is then considered by
selecting probabilities having values closest to one (for example,
P(f3)=0.93), where the probability is for a profile known to be
associated with the specific set top box, and the specific set top
box provided content, which was consumed, within the predefined
time period. Profiles associated with the probability having a
value closest to one (those with the maximal probability per the
selected set top box) are selected as the profiles that consumed
content from the set top box within the predefined period. It
should be noted that this example may be made more accurate if, to
the calculation, the probabilities of association of each of the
profiles to a specific set top box, and/or the probabilities of the
presence of each of the viewer profiles within the network, are
added.
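The selection step above can be sketched as follows, under the assumption that the per-profile probabilities are already available as a mapping. The function name and the handling of ties (all profiles sharing the maximal probability are returned) are choices made for this example.

```python
from typing import Iterable, List, Mapping


def current_profiles(
    associated: Iterable[str],
    probabilities: Mapping[str, float],
) -> List[str]:
    """Select, among the profiles associated with one set top box,
    those with the maximal probability (the value closest to one)."""
    # Profiles with no known probability are scored as 0.0 (assumption).
    scored = {profile: probabilities.get(profile, 0.0) for profile in associated}
    if not scored:
        return []
    best = max(scored.values())
    return sorted(profile for profile, value in scored.items() if value == best)
```

Using the earlier example, for STB1 with (f1, f3) and probabilities P(f1)=0.5, P(f3)=0.93, the profile f3 is selected. Weighting by the probability of association of each profile to the set top box, as suggested above, would amount to multiplying the scores before taking the maximum.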
[0231] It should be noted that the above is merely an example, and
any other method of calculating content to profile assignment, as
described herein above, or in any other form, may be used.
[0232] Real Time Targeted Rating with Unsupervised Learning
[0233] For the unsupervised learning scenario, completion of the
unsupervised learning process and the identification process
results in a list of one or more viewer profiles associated with a
set top box in question. The list of one or more viewer profiles
that are determined to be associated with the set top box, are
gathered.
[0234] To determine which of the list of one or more viewer
profiles that were determined to be associated with the set top box
are currently consuming content provided by the set top box, the
list of one or more of the viewer profiles determined to be
associated with the set top box is applied, and possibly together
with the obtained and gathered association rule (as described in
the unsupervised learning portion herein above), to a newly
obtained set top box signature for the set top box in question.
Specifically, the set top box signature used is one that is
current, or one that has been updated at least within a predefined
period. For example, such application may include performing all
steps described in the unsupervised learning process, but with the
input of only the at most few resolved viewer profiles that were
previously determined to be usually associated with the specific
set top box.
[0235] Alternatively, instead of applying the list of one or more
of the viewer profiles determined to be associated with the set top
box in question, and possibly the previously obtained association
rule, to a newly obtained set top box signature for the set top box
in question, the present system and method may provide real time,
or near real time, determination of the one or more viewer profiles
that are currently associated with a set top box by applying the
content to profile procedure, as described above with regard to
the supervised process.
[0236] Further Illustrations and Examples of RTTR Method
[0237] With the supervised and unsupervised scenarios described
above, it should be noted that the present system and method is
also capable of determining what viewer profiles, possibly from a
pre-defined list, are currently consuming content provided by the
set top box, even if there is no previous data or knowledge
regarding viewer profiles associated with set top boxes, or
previous learning or association rule data. In such a situation,
the set A is missing, where the set A is a list of one or more
viewer profiles associated with a set top box within a network. The
set C can then be obtained for a predefined period, such as, but
not limited to, the last five minutes, where the set C is a summary
of set top box signature(s); meaning a summary of viewing habits,
containing for example a summary of which set top box(es) within
the network consumed content, per different identifications of
contents (for example, which set top box(es) consumed which
programs). With there being a sample, the supervised process
mentioned above may be performed, resulting in a viewer profile or
profiles that are currently consuming content provided by the set
top box. Alternatively, if there is no sample, the unsupervised
learning process described above may be performed, resulting in a
viewer profile or profiles that are currently consuming content
provided by the set top box. As has been previously mentioned,
herein, the term currently consuming is intended to be the same as
consuming within a predefined period.
[0238] It should be noted that, with regard to set top box
signatures, the method of real time targeted rating includes the
steps of data collection (for example, zapping log and schedule),
and modeling, periodically, every pre-defined time interval (for
example, every five minutes), to obtain the set top box signatures.
Alternatively, to obtain a set top box signature, the collection
and the modeling may be performed per each event occurring at the
set top box, for example, any interaction of a viewer profile with
the set top box, such as pressing the info button.
[0239] By performing the real time targeted rating functionality,
the present system and method contains the following data, or is
ready to obtain the same upon request: either an association of at
least one viewer profile to at least one set top box within the
network, where the viewer profile had been consuming content using
this set top box during the last short pre-defined time interval
(for example, five minutes), or a report that the set top box in
question was shut off during this time interval. The targeted
rating of each of the viewer profiles pre-defined in the system,
for the last time interval (for example, five minutes), may then be
provided.
[0240] It should be noted that while examples of time intervals for
updating set top box signatures and other content are exemplified
as being five minutes, the time interval is not limited to five
minutes, but instead may be any other time interval.
[0241] The following provides an example of a way to calculate and
operate the real time targeted rating functionality.
[0242] As one example, the real time targeted rating application
may receive as an input the results of learning (supervised or
unsupervised) from the management application, performed for any
period of data collection, in the form of `learned sets`, which are
sets of parameters providing an association rule between set top
boxes within the network and at least one viewer profile (for
example demographic or psychographic). In addition, the real time
targeted rating application may receive as an input any set top box
information, such as the identification of viewer profiles that had
been associated to this set top box, via learning and
identification procedures, performed, for example, by a management
application at any location, or at the real time targeted rating
application server, or received from another external source.
Other set top box information, may include, for example, the region
in which the set top box is located, and/or other status
information regarding the set top box.
[0243] Any additional inputs, obtained for the set top boxes within
the network, or for viewer profiles, at an earlier time, such as
the description of viewing habits of profiles or set top boxes
within the network, or any other relevant information may be used
as well.
[0244] The output of the real time targeted rating functionality is
the identity of the viewer profile(s) (out of a pre-defined list,
for example, a list containing a few demographic profile types or a
list containing a few psychographic types or any mixture of those,
that had been associated to a specific set top box) that is
currently watching each of the set top boxes within the network,
that are part of the real time targeted rating system, that data
was received for, and that are known not to be switched off; and a
targeted rating, or a regional targeted rating for each of the
identified profiles, per content provided at the pre-defined time
interval (for example, 5 minutes), per which the identification of
current viewer profiles was performed. The latter identification is
referred to as real time, or nearly real time, or online
identification and it may take place in real time, or nearly real
time, with a pre-defined time delay needed to receive and process
the data, or gather sufficient amount of data. One way to obtain
these outputs may be, for example, using the identification
functionality (either for supervised, or unsupervised learning),
described above. The use of the identification functionality in
this example, may be performed via applying the currently obtained
(for example, for the last 5 minutes interval) set top box
signatures (for example, viewing signatures) to the `learned sets`,
which provide an association rule of list(s) of profile(s) with set
top box(es) within the network. This can be done, for example, by
using mathematical, or other operations, such as multiplication
(for example, multiplication of a vector and a matrix). The latter
may be done either using the whole `learned matrix` obtained for a
`relatively long` pre-defined previous time period (for example, a
month), or using just the part of the `learned matrix`, which is
narrowed, for each specific set top box, only to the list
containing at least one viewer profile, which is associated to each
specific set top box, for which at least one set top box signature
was obtained. Due to the fact that the identification is done on
the basis of viewing behavior that occurred in a very short time
period (for example, 5 minutes), the identified viewer profile, or
profiles, would usually be those consuming content at the specific
set top box, in real time, or nearly real time.
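The vector and matrix multiplication mentioned above can be sketched as follows. The layout is an assumption made for illustration: the set top box signature is taken to be a vector with one weight per content category, and the `learned matrix` to have one row per content category and one column per viewer profile. The optional narrowing to the profiles associated with the specific set top box corresponds to using only part of the learned matrix.

```python
from typing import Dict, Optional, Sequence, Set


def score_profiles(
    signature: Sequence[float],
    learned_matrix: Sequence[Sequence[float]],
    profile_names: Sequence[str],
    allowed: Optional[Set[str]] = None,
) -> Dict[str, float]:
    """Multiply a signature vector by a learned matrix, one score per profile."""
    if len(signature) != len(learned_matrix):
        raise ValueError("signature length must match the number of matrix rows")
    scores: Dict[str, float] = {}
    for column, name in enumerate(profile_names):
        if allowed is not None and name not in allowed:
            # Narrowed use: skip profiles not associated with this set top box.
            continue
        scores[name] = sum(
            weight * row[column] for weight, row in zip(signature, learned_matrix)
        )
    return scores
```

The profile with the highest score for a signature collected over the last short interval would then be taken as the one currently consuming content. How scores map to probabilities is left open in this sketch.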
[0245] The `learned sets`, together with the list(s) of profile(s)
associated with the set top boxes within the network, may be stored
within the real time targeted rating server, or downloaded to the
set top boxes themselves (where each set top box would contain only
the part of the learned data associated with it). In addition, the
set top box signatures may be inferred at the set top box level,
or, if the `learned sets` are stored at the server level, set top
box zapping signature(s) may be uploaded (per a time interval, or
per a zapping event) to the real time targeted rating server, and
the set top box signature(s) may be updated if during the
pre-defined time interval (for example, 5 minutes) a new zapping
event occurred, which was not yet included in the set top box
zapping signature. In such a case, the identification may be
applied once, or again and again after each zapping event within
the predefined short time period (for example, 5 minutes); where
the newly obtained signature (the set top box signature obtained,
for example for the last 5 minutes period) will usually contain the
information regarding the latest occurring zapping event(s).
[0246] In the case that during the short predefined time interval
(for example, 5 minutes) the set top box signature is updated with
each occurring zapping event and the identification process is
applied repeatedly with each such zapping event, the identification
of the current viewer profile consuming content via a specific set
top box within the network is expected to be of high accuracy, as
the identification accuracy would increase with each such
iteration.
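Updating a signature with each zapping event inside a short window can be sketched as follows. The class name, the representation of a signature as a count of zapping events per channel, and the assumption that events arrive in time order are all hypothetical choices for this example; the actual signature model is described elsewhere in this application.

```python
from collections import Counter, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Tuple


class RollingZappingSignature:
    """Keeps only the zapping events of the last predefined time interval."""

    def __init__(self, window: timedelta) -> None:
        self.window = window
        self._events: Deque[Tuple[datetime, str]] = deque()

    def add_event(self, when: datetime, channel: str) -> None:
        # Assumes events are added in chronological order.
        self._events.append((when, channel))
        self._expire(when)

    def _expire(self, now: datetime) -> None:
        cutoff = now - self.window
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def signature(self, now: datetime) -> Dict[str, int]:
        """Return the current signature: zapping events per channel in the window."""
        self._expire(now)
        return dict(Counter(channel for _, channel in self._events))
```

After each call to `add_event`, the identification step can be applied again to the value returned by `signature`, so that the latest zapping events are always reflected.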
[0247] It should be noted that if no previous data including
supervised/unsupervised `learned sets` is available to the real
time targeted rating system, the system may perform
supervised/unsupervised learning and identification per each
obtained set top box signature.
[0248] To summarize, the real time targeted rating system is
capable of receiving previously processed data (such as previous
results of learning and identification); of continuously collecting
data in real time, or nearly real time (set top box zapping
signatures, and possibly a schedule), for any pre-defined time
interval prior to the desired identification, for example, 5
minutes, and in some cases per each zapping event (such as turning
the set top box on/off); and of processing/modeling the
continuously collected data. After each such short predefined time
interval (for example, 5 minutes), the real time targeted rating
system outputs a snapshot of the set top boxes within the network,
where for each such set top box, the one or more viewer profiles
currently consuming content via the set top box are identified and
the targeted rating of the identified viewer profiles may be
calculated. Reports (business and/or system) may be generated
automatically and periodically, or per user query. Queries
may be submitted by users of the real time targeted rating system
via a Web interface, or any other interface with the real time
targeted rating server.
[0249] All collected and calculated data may be stored, for
example, within the real time targeted rating server, or other
location, and may be made available for use for future
identification(s)/calculation(s), for any required time period.
[0250] The real time targeted rating server operates so that at
each given moment a query might be posed to it regarding what are
the current viewer profile(s) using set top boxes within the
network, which are part of the real time targeted rating system,
and which are inferred by the real time targeted rating system to
be currently on. In response to such a query, an output report
regarding the identification of a certain viewer profile using
these set top boxes by the last identification, or of a few viewer
profiles with the probabilities of each of them using these set top
boxes, with respect to the last identification performed, is
prepared. In addition, a targeted rating, or a regional targeted
rating, per each of these viewer profiles may be calculated. If no
queries are made, periodic automated output reports may be generated
by the real time targeted rating system.
[0251] The online, real time, or nearly real time identification
and the targeted rating calculation may be performed, for example,
at the real time targeted rating server, located, for example, at
the head end, where the real time targeted rating server
continuously receives inputs both from the management application
and from the set top boxes, for example those connected to the head
end, and automatically performs, per each pre-defined time step
and/or per each zapping event occurring at any of the set top
boxes, the steps of labeling each of the set top boxes within the
real time targeted rating system as being switched on or off, and,
for those that are on, determining the viewer profile(s) using each
set top box in the current time interval, with or without assigned
probabilities, the (regional) targeted rating associated with each
of the identified profiles, and the last time interval for which
identification took place.
[0252] Alternatively, for example, the `learned sets` may be sent
by the real time targeted rating server to each of the set top
boxes and stored there, the collection of the last occurring
zapping events may be performed at the set top box level and the
identification of each viewer profile using each of the set top
boxes may be performed at the level of each set top box, where the
result is sent back to the real time targeted rating server and the
(possibly regional) targeted rating is calculated at the real time
targeted rating server. In this example, in addition, the list of
profiles associated with set top boxes within the network may be
sent to be stored at the set top boxes themselves. Then, the
identification of which viewer profile is currently consuming
content via the set top box in question, for set top boxes within
the network, may be performed out of the short profile list, that
is, only out of those fewer viewer profiles associated to the set
top box in question, or from the whole list of viewer profiles, if
such a short list is not provided.
[0253] The identification of the viewer profiles may be performed
in a more accurate way, where the time interval, referring to
`current identification`, may be narrowed, so that as few profiles
as possible are identified as current viewer profile(s) associated
with a set top box within the network, that is part of the real
time targeted rating system.
[0254] Any of the methods described above, or a combination of them,
or other methods, may be used to address different specific
situations.
[0255] It should be emphasized that the above-described embodiments
of the present invention are merely possible examples of
implementations, merely set forth for a clear understanding of the
principles of the invention. Many variations and modifications may
be made to the above-described embodiments of the invention without
departing substantially from the spirit and principles of the
invention. All such modifications and variations are intended to be
included herein within the scope of this disclosure and the present
invention and protected by the following claims.
* * * * *