U.S. patent application number 12/194236 was filed with the patent office on 2008-08-19 and published on 2009-02-26 for system and method for providing targeted rating of profiles in video audiences.
This patent application is currently assigned to Ads-Vantage, Ltd. Invention is credited to Reuven Cohen, Raviv Knoller, Anna Litvak-Hinenzon, Alex Paker.
Application Number | 20090055860 12/194236 |
Document ID | / |
Family ID | 40378757 |
Filed Date | 2008-08-19 |
United States Patent
Application |
20090055860 |
Kind Code |
A1 |
Knoller; Raviv ; et al. |
February 26, 2009 |
SYSTEM AND METHOD FOR PROVIDING TARGETED RATING OF PROFILES IN
VIDEO AUDIENCES
Abstract
A system and method for providing targeted rating of profiles in
video audiences is provided. The method includes the steps of
deriving a first input set, wherein the first input set contains
data showing which viewer profiles are associated with which set
top boxes within the network, wherein the data may also include an
association between a single viewer profile and a single set top
box within the network; deriving a second input set containing data
of at least one set top box signature, wherein the data of the at
least one set top box signature further comprises a processed
zapping log containing information summarizing viewing habits of at
least one set top box within the network; and processing the first
input set and the second input set assuming that the second input
set can be derived by operations, wherein the operations involve
data associating the viewer profiles to set top boxes within the
network and to the targeted rating of profiles.
Inventors: |
Knoller; Raviv; (Shoham, IL) ;
Paker; Alex; (Modiin, IL) ;
Litvak-Hinenzon; Anna; (Hod-HaSharon, IL) ;
Cohen; Reuven; (Rehovot, IL) |
Correspondence
Address: |
SHEEHAN PHINNEY BASS & GREEN, PA;c/o PETER NIEVES
1000 ELM STREET
MANCHESTER
NH
03105-3701
US
|
Assignee: |
Ads-Vantage, Ltd.
Shoham
IL
|
Family ID: |
40378757 |
Appl. No.: |
12/194236 |
Filed: |
August 19, 2008 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60956728 | Aug 20, 2007 | |
Current U.S.
Class: |
725/34 ; 725/131;
725/32 |
Current CPC
Class: |
G06Q 30/02 20130101;
H04N 21/23424 20130101; G06Q 30/04 20130101; G06Q 30/0601 20130101;
G06Q 30/0264 20130101; H04N 21/812 20130101; H04N 21/44222
20130101; H04N 7/17318 20130101; H04N 21/252 20130101; H04N 21/2665
20130101; H04N 21/6405 20130101 |
Class at
Publication: |
725/34 ; 725/32;
725/131 |
International
Class: |
H04N 7/025 20060101
H04N007/025; H04N 7/173 20060101 H04N007/173 |
Claims
1. A method of providing targeted rating of profiles in video
audiences of a network, comprising the steps of: obtaining a first
input set, wherein the first input set contains data showing which
of one or more viewer profiles are associated with which one or
more set top boxes within the network; obtaining a second input set
containing data of at least one set top box signature, wherein the
data of the at least one set top box signature further comprises a
processed zapping log containing information summarizing viewing
habits of at least one set top box within the network; and
processing the first input set and the second input set assuming
that the second input set can be derived by operations, wherein the
operations involve data associating the viewer profiles to set top
boxes within the network and to the targeted rating of
profiles.
2. The method of claim 1, wherein the set top box signatures are
selected from the group consisting of viewing signatures, time
signatures, high-resolution time signatures, and zapping frequency
signatures.
3. The method of claim 1, wherein the operations are mathematical
operations.
4. The method of claim 1, wherein the first input set is received
from an external source.
5. The method of claim 1, wherein the first input set is derived by
performing a learning step and an identification step, the learning
step comprising using set top box signatures with a list of set top
boxes and viewer profiles to provide knowledge of how to associate
at least one viewer profile to a set top box of the list of set top
boxes within the network, and the identification step comprising
recognizing a list of viewer profiles as being associated with a
set top box, of the list of set top boxes, based on results of the
learning step.
6. The method of claim 5, wherein the learning step further
comprises the steps of: receiving a zapping log and a broadcast
schedule, wherein the zapping log includes records of set top box
zapping signatures for at least a portion of the set top boxes of
the network; deriving set top box signatures from the zapping log
and broadcast schedule; clustering viewer profiles into groups of
viewer profiles using the set top box signatures; and associating
at least one set top box within the network with at least one
viewer profile.
7. The method of claim 6, wherein an optimization algorithm is used
to perform the step of clustering.
8. The method of claim 5, wherein the learning step further
comprises the steps of: receiving a data sample, wherein the data
sample provides an association of viewer profiles to a sample of
the set top boxes within the network, wherein a sample of the set
top boxes includes one or more of the set top boxes within the
network; receiving a zapping log and a broadcast schedule, wherein
the zapping log includes records of set top box zapping signatures
for at least a portion of the set top boxes of the network;
deriving set top box signatures from the zapping log and broadcast
schedule; using the set top box signatures with the sample of the
set top boxes to derive an association rule of viewer profiles to
set top boxes within the network; and applying the association rule
to the set top box signatures to determine a subset of viewer
profiles of the viewer profiles associated with a specific set top
box of the set top boxes within the network.
9. The method of claim 1, wherein the operations involving data
associating the viewer profiles to set top boxes and targeted
rating comprises providing a relationship between a set A, a set B,
and a set C, where the set C is derived by associating set A to set
B, where A represents a set of parameters representing the
association of at least one viewer profile to at least one set top
box, B represents an aggregation of targeted rating values of each
of the viewer profiles per each content watched by at least a
portion of the set top boxes of the network for which the zapping
log contains records of set top box zapping signatures, and C is an
aggregation of the set top box signatures.
10. The method of claim 1, wherein the operations involving data
associating the viewer profiles to set top boxes and targeted
rating comprises providing a relationship between a matrix A, a
matrix B, and a matrix C, where matrix A multiplied by matrix B
approximates matrix C, where A represents a set of parameters
representing the association of at least one viewer profile to set
top boxes, B represents an aggregation of targeted rating values of
each of the viewer profiles per each content watched by at least a
portion of the set top boxes of the network for which the zapping
log contains records of set top box zapping signatures, and C is an
aggregation of the set top box signatures.
11. The method of claim 1, wherein the set top box signatures
comprise at least one signature for each set top box in the
network.
12. The method of claim 1, wherein if the network covers more than
one region and information on the regions in which different set
top boxes in the network reside is available, the method further
comprises the step of calculating a regional targeted rating.
13. The method of claim 1, wherein the data of the at least one set
top box signature further comprises a processed broadcast schedule
containing content that the set top box is capable of receiving.
14. A system for providing targeted rating of profiles in video
audiences of a network, wherein the system comprises a head end
having a computer and means for communicating therein, wherein the
computer has a management application stored therein, and wherein
the management application further comprises: logic configured to
obtain a first input set, wherein the first input set contains data
showing which viewer profiles are associated with which set top
boxes within the network, wherein the data may also include an
association between a single viewer profile and a single set top
box within the network; logic configured to obtain a second input
set containing data of at least one set top box signature, wherein
the data of the at least one set top box signature further
comprises a processed zapping log containing information
summarizing viewing habits of at least one set top box within the
network; and logic configured to process the first input set and
the second input set assuming that the second input set can be
derived by operations, wherein the operations involve data
associating the viewer profiles to set top boxes within the network
and to the targeted rating of the profiles.
15. The system of claim 14, wherein the set top box signatures are
selected from the group consisting of viewing signatures, time
signatures, high-resolution time signatures, and zapping frequency
signatures.
16. The system of claim 14, wherein the operations are mathematical
operations.
17. The system of claim 14, wherein the first input set is received
from an external source.
18. The system of claim 14, wherein the first input set is derived
by performing a learning step and an identification step, the
learning step comprising using set top box signatures with a list
of set top boxes and viewer profiles to provide knowledge of how to
associate at least one viewer profile to a set top box of the list
of set top boxes within the network, and the identification step
comprising recognizing a list of viewer profiles as being
associated with a set top box, of the list of set top boxes, based
on results of the learning step.
19. The system of claim 18, wherein the learning step further
comprises the steps of: receiving a zapping log and a broadcast
schedule, wherein the zapping log includes records of set top box
zapping signatures for at least a portion of the set top boxes of
the network; deriving set top box signatures from the zapping log
and broadcast schedule; clustering viewer profiles into groups of
viewer profiles using the set top box signatures; and associating
at least one set top box within the network with at least one
viewer profile.
20. The system of claim 19, wherein an optimization algorithm is
used to perform the step of clustering.
21. The system of claim 18, wherein the learning step further
comprises the steps of: receiving a data sample, wherein the data
sample provides an association of viewer profiles to a sample of
the set top boxes within the network, wherein a sample of the set
top boxes includes one or more of the set top boxes within the
network; receiving a zapping log and a broadcast schedule, wherein
the zapping log includes records of set top box zapping signatures
for at least a portion of the set top boxes of the network;
deriving set top box signatures from the zapping log and broadcast
schedule; using the set top box signatures with the sample of the
set top boxes to derive an association rule of viewer profiles to
set top boxes within the network; and applying the association rule
to the set top box signatures to determine a subset of viewer
profiles of the viewer profiles associated with a specific set top
box of the set top boxes within the network.
22. The system of claim 14, wherein the operations involving data
associating the viewer profiles to set top boxes and targeted
rating comprises providing a relationship between a set A, a set B,
and a set C, where the set C is derived by associating set A to set
B, where A represents a set of parameters representing the
association of at least one viewer profile to set top boxes, B
represents an aggregation of targeted rating values of each of the
viewer profiles per each content watched by at least a portion of
the set top boxes of the network for which the zapping log contains
records of set top box zapping signatures, and C is an aggregation
of the set top box signatures.
23. The system of claim 14, wherein the set top box signatures
comprise at least one signature for each set top box in the
network.
24. The system of claim 14, wherein if the network covers more than
one region and information on the regions in which different set
top boxes in the network reside is available, the method further
comprises the step of calculating a regional targeted rating.
25. The system of claim 14, wherein the data of the at least one
set top box signature further comprises a processed broadcast
schedule containing content that the set top box is capable of
receiving.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to copending U.S.
Provisional Application entitled, "SYSTEM AND METHOD FOR PROVIDING
PERSONAL ADVERTISEMENTS FOR AN ACCESS NETWORK," having Ser. No.
60/956,728, filed Aug. 20, 2007, which is entirely incorporated
herein by reference.
FIELD OF THE INVENTION
[0002] The present invention relates to advertising, and more
particularly is related to providing personal advertisement to
video services.
BACKGROUND OF THE INVENTION
[0003] Owners of products and services, also referred to herein as
advertisers, spend significant funds advertising on television. In
addition, advertisers seek to maximize return from their investment
in advertising on television by using different techniques. As an
example, owners may pay to have an advertisement run at a specific
time on a specific channel. Such an advertisement may not only be
for products and services, but for any content, such as, but not
limited to, video on demand, gaming, and any other content or
service. In addition, owners may pay a premium price to have their
advertisement run during the showing of popular television
programming.
[0004] Unfortunately, advertisers do not have control over who may
be watching television at a time that an advertisement is run. As a
result, funds associated with television advertising are not
maximized. Instead, after receiving ratings associated with an
aired television show, advertisers pay based upon a previously
desired audience and an agreed upon percentage. Funds would be
better allocated if a larger number of a specific desired audience
could be selected for viewing of targeted advertisements.
[0005] Different techniques have been used in an attempt to
maximize television advertising investments. Examples of known
techniques include attempting to obtain demographic and
psychographic profiles, and using information about rating.
Unfortunately, information about rating, demographic and
psychographic profiles, and targeted rating is obtained using
surveys and/or people meters, which are based on small sample
audiences and are inaccurate in the collection process.
Advertisers, network management, and cable/satellite decision
makers would like to use more accurate information for placement
and pricing of television advertisements.
[0006] Currently, the process of creating television viewer
profiles has not made use of the actual actions of the television
viewers while watching television. Utilizing information associated
with viewer actions while watching television would be very useful
in the creating of television viewer profiles. In addition, it
would be beneficial to be able to determine a percentage of viewers
of a specific profile that viewed content.
[0007] Thus, a heretofore unaddressed need exists in the industry
to address the aforementioned deficiencies and inadequacies.
SUMMARY OF THE INVENTION
[0008] Embodiments of the present invention provide a system and
method for providing targeted rating of profiles in video audiences
of a network. Briefly described, in architecture, one embodiment of
the system, among others, can be implemented as follows. The system
contains a head end having a computer and means for communicating
therein, wherein the computer has a management application stored
therein, and wherein the management application further comprises:
logic configured to derive a first input set, wherein the first
input set contains data showing which viewer profiles are
associated with which set top boxes within the network, wherein the
data may also include an association between a single viewer
profile and a single set top box within the network; logic
configured to derive a second input set containing data of at least
one set top box signature, wherein the data of the at least one set
top box signature further comprises a processed zapping log
containing information summarizing viewing habits of at least one
set top box within the network; and logic configured to process the
first input set and the second input set assuming that the second
input set can be derived by operations, wherein the operations
involve data associating the viewer profiles to set top boxes
within the network and to the targeted rating of the profiles.
[0009] The present invention can also be viewed as providing
methods for providing targeted rating of profiles in video
audiences of a network. In this regard, one embodiment of such a
method, among others, can be broadly summarized by the following
steps: deriving a first input set, wherein the first input set
contains data showing which viewer profiles are associated with
which set top boxes within the network, wherein the data may also
include an association between a single viewer profile and a single
set top box within the network; deriving a second input set
containing data of at least one set top box signature, wherein the
data of the at least one set top box signature further comprises a
processed zapping log containing information summarizing viewing
habits of at least one set top box within the network; and
processing the first input set and the second input set assuming
that the second input set can be derived by operations, wherein the
operations involve data associating the viewer profiles to set top
boxes within the network and to the targeted rating of
profiles.
[0010] Other systems, methods, features, and advantages of the
present invention will be or become apparent to one with skill in
the art upon examination of the following drawings and detailed
description. It is intended that all such additional systems,
methods, features, and advantages be included within this
description, be within the scope of the present invention, and be
protected by the accompanying claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] Many aspects of the invention can be better understood with
reference to the following drawings. The components in the drawings
are not necessarily to scale, emphasis instead being placed upon
clearly illustrating the principles of the present invention.
Moreover, in the drawings, like reference numerals designate
corresponding parts throughout the several views.
[0012] FIG. 1 is a schematic diagram illustrating an example of an
IPTV network in which the present system may be provided.
[0013] FIG. 2 is a flow chart further illustrating the process of
personalizing advertisements, in accordance with one exemplary
embodiment of the invention.
[0014] FIG. 3 is a flow chart further illustrating the process of
identifying and associating consumer profiles to set top boxes
within a supervised learning scenario.
[0015] FIG. 4 is a schematic diagram illustrating an example of a
cable network in which the present system may be provided.
[0016] FIG. 5 is a schematic diagram illustrating an example of a
satellite network in which the present system may be provided.
[0017] FIG. 6 is a schematic diagram illustrating an example of a
terrestrial network in which the present system may be
provided.
[0018] FIG. 7 is a flow chart further illustrating the steps of the
supervised learning process.
[0019] FIG. 8 is a flow chart further illustrating the process of
identifying and associating consumer profiles to set top boxes
within an unsupervised learning scenario.
[0020] FIG. 9 is a block diagram further illustrating functionality
of the management application as blocks of logic.
[0021] FIG. 10 is a detailed logical flow diagram illustrating a
sequence of events performed during unsupervised learning.
[0022] FIG. 11 is a flow chart further illustrating a process for
determining targeted rating.
DETAILED DESCRIPTION
[0023] The present system is capable of learning the viewing habits
of video viewers by collecting zapping events and other events
performed by the viewer. Such videos may be viewed via a
television, hand held device, computer, or any device capable of
displaying video. The events may be collected at a set top box,
computer, or other device. Alternatively, the events may be
collected at a different location, such as, but not limited to, at
an access multiplexer located in a head end, or in a device located
separate from the head end. The system learns the viewing habits
and zapping habits of different population profiles by identifying
the viewing profile of a household.
[0024] The system uses supervised or unsupervised learning
functionality for identifying different population profiles, and
provides a representation of the probability (or another form of
representation) of each population profile to watch any given
program and to present a zapping pattern. The probabilities can be
utilized as a tool for advertisers searching for the demographic
profile of the audience of a television program, or, using
inference functionality described herein, to identify the home
audience at each household, and the specific viewers of a
television program. Thereafter, the system is capable of supplying
personalized content, such as, but not limited to, advertisements,
video selections, and other content, to the viewers. It should be
noted that the following description provides an example in which
the content is an advertisement; however, the invention is not
intended to be limited to advertisements, but instead extends to any
content that may be personalized.
[0025] The present system collects the operations performed by
viewers at service decoders, such as, but not limited to, set top
boxes (the term set top box is used hereafter). The system then
employs unsupervised or supervised learning functionality, as
described herein, to interpret the operations at each set top box
as the sum of operations of all viewers associated with this set
top box. The system learns to identify different viewer profiles in
the population and associates with each set top box and profile a
probabilistic model of the viewing and zapping habits of
viewers.
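By way of non-limiting illustration, the interpretation of set top box operations as the sum of the operations of the associated viewers may be sketched as a small linear model, in which an association matrix A multiplied by a targeted rating matrix B approximates a signature matrix C. The following Python sketch is an assumption made for illustration and is not the disclosed implementation: the function names, the choice of an ordinary least-squares fit, and the sample values are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def transpose(x: Matrix) -> Matrix:
    return [list(col) for col in zip(*x)]


def solve(m: Matrix, rhs: Matrix) -> Matrix:
    """Solve m @ X = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: profiles cannot be separated")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def estimate_targeted_rating(a: Matrix, c: Matrix) -> Matrix:
    """Least-squares estimate of B in A @ B ~= C via the normal equations."""
    at = transpose(a)
    return solve(matmul(at, a), matmul(at, c))


if __name__ == "__main__":
    # Hypothetical data: rows of A are set top boxes, columns are profiles,
    # and each entry is the number of viewers of that profile at that box.
    a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
    # Rows of C are set top boxes, columns are programs (viewing signature).
    c = [[0.5, 0.25, 0.0], [0.0, 0.5, 1.0], [0.5, 0.75, 1.0], [1.0, 0.5, 0.0]]
    for row in estimate_targeted_rating(a, c):
        print([round(v, 3) for v in row])
```

In this sketch each row of the estimated B holds one profile's rating per program. A practical system would likely constrain the values to a valid range and use a more robust optimization method.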
[0026] It should be noted that the present system and method may be
provided within different infrastructures. As an example, the
following description provides examples of using the present system
and method in an Internet protocol television (IPTV)
infrastructure, in a cable infrastructure, and in a satellite
infrastructure. While these infrastructures are described herein,
the present system and method is not intended to be limited to
these infrastructures.
[0027] While the following describes the present system and method
in detail, it is beneficial to provide certain definitions.
[0028] Set top box (STB) or service decoder: A set top box or
service decoder is a device responsible for converting digital (or
analog) content received into viewable content that may be fed into
a television set or other monitor. The set top box or service
decoder may be located at a household or another location.
[0029] Platform: A network of service decoders (e.g., set top
boxes) of a specific television service provider.
[0030] Passive audience identification: Identification of the
viewer's profiles without any specific actions performed by the
viewer.
[0031] Zapping event: A zapping event is an event where there is
switching from a current service to another service, where the
switching is performed by, for example, but not limited to, use of
a remote control, pushing buttons on the set top box, or any action
that causes switching, including, but not limited to, voice
commands, or even consumer motions without pressing buttons. In
addition, a zapping event may be other means for communicating with
a set top box, such as, but not limited to, pressing an electronic
program guide, pressing a volume button, and other actions
involving the set top box.
[0032] Zapping pattern: A zapping pattern is the behavior of a
viewing individual in terms of zapping, such as, but not limited
to, programs watched, frequency of zapping events, and variance of
zapping frequency.
[0033] Set top box (STB) zapping signature: Records of zapping
events of a particular set top box.
[0034] Set top box (STB) signature: Data model providing
characteristics of a set top box including: an association between
a set top box and content available to the set top box, where the
content is either provided or not provided via the set top box
during a time period; and/or, at least one zapping pattern
associated with the set top box. It should be noted that herein
when referring to set top box signatures, one or more set top box
signature is included. In addition, content availability refers to
content that the set top box has access to and can provide.
[0035] Zapping log: Records of the set top box zapping signatures
for an entire set top box network (Platform) or for part of the
network.
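By way of non-limiting illustration, the relationship between a zapping log, a set top box zapping signature, and a zapping pattern as defined above may be sketched in Python as follows. The record fields, the identifiers, and the use of the variance of the intervals between zapping events as a stand-in for the variance of zapping frequency are assumptions made for this example only.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import pvariance
from typing import Dict, List


@dataclass(frozen=True)
class ZappingEvent:
    stb_id: str       # hypothetical set top box identifier
    timestamp: float  # seconds since the start of the observation window
    channel: str      # service that was switched to


def zapping_signatures(log: List[ZappingEvent]) -> Dict[str, List[ZappingEvent]]:
    """Split a zapping log into per-box zapping signatures, ordered by time."""
    by_box: Dict[str, List[ZappingEvent]] = defaultdict(list)
    for event in log:
        by_box[event.stb_id].append(event)
    return {stb: sorted(evts, key=lambda e: e.timestamp) for stb, evts in by_box.items()}


def zapping_pattern(signature: List[ZappingEvent], window_seconds: float) -> Dict[str, float]:
    """Summarize one zapping signature as a simple zapping pattern."""
    gaps = [b.timestamp - a.timestamp for a, b in zip(signature, signature[1:])]
    return {
        "zaps_per_hour": len(signature) * 3600.0 / window_seconds,
        # Population variance of the gaps between consecutive zapping events.
        "gap_variance": pvariance(gaps) if gaps else 0.0,
    }


if __name__ == "__main__":
    log = [
        ZappingEvent("stb-1", 1800.0, "news"),
        ZappingEvent("stb-2", 900.0, "kids"),
        ZappingEvent("stb-1", 0.0, "sports"),
        ZappingEvent("stb-1", 600.0, "movies"),
    ]
    for stb, signature in sorted(zapping_signatures(log).items()):
        print(stb, zapping_pattern(signature, window_seconds=3600.0))
```

A set top box signature as defined in paragraph [0034] would combine such a pattern with content availability information, which this sketch omits.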
[0036] Channel: A stream of programs broadcast consecutively from
a content source.
[0037] Program: Content that was broadcast on a specific channel
at a specific date and time, whether on demand or generally
broadcast.
[0038] Program rating: Percent of viewers that watched the
program.
[0039] Targeted program rating: Percent of viewers of a specific
profile that watched the program.
[0040] Channel rating: Percent of viewers that watched the channel
during the specified time period.
[0041] Targeted channel rating: Percent of viewers of a specific
profile that watched the channel during the specified time
period.
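By way of non-limiting illustration, the difference between a program rating and a targeted program rating as defined above may be shown with a short Python sketch. The viewer names, the profile labels, and the function names are hypothetical, and the sketch assumes that the profile of every viewer is already known.

```python
from typing import Dict, Set


def program_rating(profiles: Dict[str, str], watched: Set[str]) -> float:
    """Percent of all viewers that watched the program."""
    if not profiles:
        return 0.0
    return 100.0 * sum(1 for viewer in profiles if viewer in watched) / len(profiles)


def targeted_program_rating(profiles: Dict[str, str], watched: Set[str], profile: str) -> float:
    """Percent of the viewers of one profile that watched the program."""
    members = [viewer for viewer, p in profiles.items() if p == profile]
    if not members:
        return 0.0
    return 100.0 * sum(1 for viewer in members if viewer in watched) / len(members)


if __name__ == "__main__":
    # Hypothetical audience: viewer -> profile.
    profiles = {
        "ann": "women 25-34",
        "bob": "men 25-34",
        "cat": "women 25-34",
        "dan": "men 25-34",
        "eve": "women 25-34",
    }
    watched = {"ann", "cat", "dan"}
    print(program_rating(profiles, watched))
    print(targeted_program_rating(profiles, watched, "women 25-34"))
    print(targeted_program_rating(profiles, watched, "men 25-34"))
```

The channel rating and the targeted channel rating follow the same arithmetic, with the set of viewers who watched taken over the specified time period rather than over a single program.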
[0042] Profile: The classification of an individual into one of
several population groups that is targeted. Such profiles may be,
for example, but not limited to, psychographic (for example,
behavioral) or demographic profiles. Examples of such groups
include, but are not limited to, gender, age, income, marital
status, and possibly also by interests in different fields.
[0043] Learning functionality: Functionality used to reduce a large
set of observed data and its classification into groups to a set of
parameters, allowing reconstruction of the classification of the
majority of the original data and classification of similar, unlearned
data, or production of a new type of classification. Different
relevant learning methods may be utilized to provide the learning
functionality such as, but not limited to, artificial neural
networks, decision trees, k-Nearest Neighbor, Quadratic classifier,
support vector machine, direct probability estimate using Bayesian
inference, Bayesian networks, Gaussian estimators, least squares
optimization methods, and other optimization methods.
[0044] Supervised learning: Supervised learning is learning in
which the classification of the observed data is inferred from a
sample of the data supplied by an outside source. The learning
functionality searches for a parameter set allowing reconstruction
of the classification from the input that later can be used for
classification of new unlearned data.
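By way of non-limiting illustration, supervised learning as defined above may be sketched with a k-Nearest Neighbor classifier, one of the learning methods listed in paragraph [0043]. In this Python sketch the two-component signature vectors (share of viewing time on sports and on children's programming), the profile labels, and the function name are assumptions made for the example and are not taken from the disclosure.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

LabeledSample = List[Tuple[Sequence[float], str]]


def knn_classify(sample: LabeledSample, signature: Sequence[float], k: int = 3) -> str:
    """Assign the majority profile among the k labeled signatures nearest to `signature`."""
    nearest = sorted(sample, key=lambda item: dist(item[0], signature))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical labeled sample supplied by an outside source:
    # (share of time on sports, share of time on children's programming) -> profile.
    sample: LabeledSample = [
        ((0.8, 0.1), "sports viewer"),
        ((0.7, 0.2), "sports viewer"),
        ((0.9, 0.0), "sports viewer"),
        ((0.1, 0.8), "family with children"),
        ((0.2, 0.7), "family with children"),
        ((0.0, 0.9), "family with children"),
    ]
    # Classify signatures of set top boxes that were not part of the sample.
    print(knn_classify(sample, (0.75, 0.15)))
    print(knn_classify(sample, (0.10, 0.60)))
```

Here the labeled sample itself plays the role of the learned parameter set. Other listed methods, such as decision trees or support vector machines, would instead fit a compact model to the sample.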
[0045] Unsupervised learning: Unsupervised learning is learning in
which no classification of observed data is given (i.e., no sample
is provided), and the functionality attempts to classify the data
into different classes under some constraints. The functionality
may use a method, such as, but not limited to, vector quantization,
and various learning methods and various optimization methods, to
find a reduction of the data into representative classes.
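By way of non-limiting illustration, unsupervised learning as defined above may be sketched with a basic vector quantization procedure (Lloyd's k-means iteration), in which unlabeled set top box signatures are reduced to a small number of representative classes. The signature vectors, the choice of initial centroids, and the function name in this Python sketch are assumptions made for the example only.

```python
from math import dist
from typing import List, Sequence, Tuple


def kmeans(
    points: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
    iterations: int = 20,
) -> Tuple[List[List[float]], List[int]]:
    """Cluster points around the given initial centroids using Lloyd's iteration."""
    centers = [list(c) for c in centroids]
    assignment: List[int] = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j])) for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment


if __name__ == "__main__":
    # Hypothetical unlabeled signatures:
    # (share of time on sports, share of time on children's programming).
    signatures = [(0.8, 0.1), (0.7, 0.2), (0.9, 0.0), (0.1, 0.8), (0.2, 0.7), (0.0, 0.9)]
    centers, assignment = kmeans(signatures, centroids=[signatures[0], signatures[3]])
    print(centers)
    print(assignment)
```

The resulting classes carry no profile names. Associating a class with a demographic or psychographic profile would require additional constraints or outside information.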
[0046] FIG. 1 is a schematic diagram illustrating an example of an
IPTV network 10 in which the present system may be provided.
Specifically, FIG. 1 is specific to video on demand or personalized
advertisements for an IPTV infrastructure. As shown by FIG. 1, an
IPTV head end 20 is provided, portions of which communicate with at
least one customer premises 100A-100D. As is known by those having
ordinary skill in the art, a head end is the physical location in
an area where a video signal is received by a provider, stored,
processed, and transmitted to local customers of the provider. One
having ordinary skill in the art would also appreciate that more
than one head end may be provided within a network. In addition, a
network may have more than one type of head end, such as, but not
limited to, a cable head end, a satellite head end, an IPTV head
end, and a terrestrial head end.
[0047] The head end 20 contains at least a video service splicer
30, an advertisements video server 40, a management application 50,
and an access network multiplexer 60. One having ordinary skill in
the art would appreciate that the head end 20 may have portions in
addition to those mentioned herein. In addition, while the present
description refers to a management application, it should be noted
that the management application is stored on a computer.
[0048] The video service splicer 30 receives video and audio
services from a satellite dish 70. It should, however, be noted
that video and audio services may be received by devices other than
a satellite dish 70, such as, but not limited to, a cable network
or any device capable of providing video to the head end 20.
[0049] The video service splicer 30 is capable of splicing personal
advertisements into a video service stream, as instructed by the
management application 50 and as is further described in detail
hereinbelow. The video service splicer 30 also receives
advertisements from the advertisements video server 40. In
addition, actions of the video service splicer 30 are controlled by
the management application 50. It should be noted that, for the
example of an IPTV network, the video packets received by the video
service splicer 30 may carry an Internet protocol (IP) address and
a User Datagram Protocol (UDP) port number. It should also be noted
that the video service splicer 30 may instead receive video and
audio services from a cable fiber.
[0050] The access network multiplexer 60 is responsible for routing
video services to transmission units 120A-120D that are video
services decoders, as explained hereinbelow. The transmission units
120 are each located within a customer premises 100A-100D. The
access network multiplexer 60 is connected to both the management
application 50 and the video service splicer 30. Specifically, the
access network multiplexer 60 may perform, for example, IP and UDP
port manipulation. It should be noted that the access network
multiplexer 60 may be, for example, but not limited to, an optic
multiplexer or a digital subscriber line access multiplexer
(DSLAM). From a multicast point of view, as described hereinbelow,
connection between the access network multiplexer 60 and a set top
box 110 may be a shared media connection, or any other type of
connection, and there may or may not be a multicast hierarchy
between the access network multiplexer 60 and the set top box
110.
[0051] The management application 50 communicates with the video
service splicer 30, the advertisements video server 40, and the
access network multiplexer 60. In addition, the management
application 50 provides the functionality required to learn
unsupervised profiles in television audiences, as is described in
detail hereinbelow. It should be noted that in accordance with an
alternative embodiment of the invention, the management application
50 may instead be located within a set top box 110 located within
the customer premises 100A-100D.
[0052] Each customer premises 100A-100D at least contains a set top
box 110A-110D and a transmission unit 120A-120D. While for
exemplary purposes four customer premises 100A-100D are
illustrated, one having ordinary skill in the art would appreciate
that additional or fewer customer premises 100A-100D may be
provided. The transmission unit 120 is capable of receiving
advertisement streams and video streams and forwarding the streams
to an appropriate set top box 110. For exemplary purposes, the
customer premises 100A-100D is illustrated as also containing a
computer 130A-130D, although a computer 130 is not integral to the
invention. It should be noted that while a single set top box is
shown as being located within a customer premises 100, more than
one set top box 110 may be located within the customer premises
100. In addition, in accordance with an alternative embodiment of
the invention, the set top box may be a computer or any device that
can decode a service. For the present example of an IPTV network,
the set top box 110 receives a video service with certain TCP/IP
parameters, such as, but not limited to, IP address and UDP port.
It should be noted, however, that in a cable network or a satellite
network, the set top box 110 may or may not receive TCP/IP
parameters.
[0053] The present system enables editing of online personal video
so as to provide personalized television advertisements directed
toward a viewer presently watching the television. As is described
in detail below, the present invention is capable of categorizing a
viewer into an advertising profile, an example of which is, but is
not limited to, a demographic profile. Within a single customer
premises, different television viewers may have different profiles.
The different television viewers may view the same television
during the day. Each different viewer may be associated with a
different advertising profile, such as, but not limited to a
demographic profile, thus preferably receiving different
advertising messages. As an example, a family structure may be
described as having an adult male of age 45, an adult female of age
42, a male teenager of age 17, a female teenager of age 14, and a
male child of age 7. It should be noted that while the present
description refers to a demographic profile, other types of
profiles may be provided for.
[0054] During the time that a television viewer consumes service
transmissions, the management application 50 identifies the profile
of the viewer. After identifying the profile, the application 50
performs personalized advertisements editing for that particular
profile. When there is a different viewer with a different
advertising profile that is using the same video decoder, the
management application 50 identifies the profile that the viewer
belongs to and performs online personalization editing for the
advertisements, as described below.
[0055] In accordance with the present invention, for both
supervised and unsupervised learning, the television consumers,
also referred to herein as viewers, are not individually
identifying themselves to the system. As a result, the system is
required to identify consumer profiles and to associate the
profiles with a specific set top box. This process is described in
detail hereinbelow. Prior to describing this process, a general
process of IPTV advertisement insertion in a broadcast environment
is described in detail.
[0056] A typical advertisement projection works as follows. During
content consumption the access network multiplexer 60 receives a
video signal and sends the video signal to the customer premises
100A-100D using an IP protocol. During an advertisement break the
video transmissions continue to be transmitted in multicast, thus
there is no personalization of advertisements. To instead
personalize advertisements, the following is performed.
[0057] FIG. 2 is a flow chart 200 further illustrating the process
of personalizing advertisements, in accordance with one exemplary
embodiment of the invention. Any process descriptions or blocks in
flow charts should be understood as representing modules, segments,
or portions of code that include one or more executable
instructions for implementing specific logical functions or steps
in the process, and alternative implementations are included within
the scope of the embodiment of the present invention in which
functions may be executed out of order from that shown or
discussed, including substantially concurrently or in reverse
order, depending on the functionality involved, as would be
understood by those reasonably skilled in the art of the present
invention.
[0058] As shown by block 202, content is transmitted from the head
end 20, via the access network multiplexer 60, to the set top box
110. An example of a protocol that may be used for the transmission
is the Internet group management protocol (IGMP), which is used by
IP hosts to manage their dynamic multicast group membership. Of
course, other protocols may be used.
[0059] In accordance with the present example, a subset, or
complete set, of the customers that are connected to the access
network multiplexer 60 are viewing the same video and/or audio
service (i.e., content). The management application 50 also
continuously identifies the consumers (block 204). It should be
noted that the management application 50 can utilize either online
processing or offline processing to determine a relationship
between viewed content (e.g., videos) and viewer profiles.
Regarding offline processing to identify consumers, associate the
consumers with content, and produce reports, in accordance with a
predefined schedule, or when prompted to do so, the management
application 50 reviews zapping patterns, processes the patterns,
and associates each program viewed from a set top box 110 with a
viewer profile. Alternatively, for online processing, during an
advertising break, the management application 50 reviews only
recent zapping events to determine which viewer is presently
viewing content. Further description of consumer identification is
provided with regard to FIG. 3, FIG. 8, and FIG. 10. It should be
noted that the information received by the management application
50 may be received from a source other than a set top box.
[0060] Returning to the flowchart 200 of FIG. 2, the management
application 50 decides which advertisements of the advertisement
set each consumer should receive (block 206). It should be noted
that the process of selecting advertisements is described in detail
herein.
[0061] As shown by block 208, the video service splicer 30 then splices the
advertisements according to the decision of block 206. Since one
having ordinary skill in the art would know how a video splicer
splices advertisements, further description of the splicing process
is not provided herein. As shown by block 210, when the
advertisement break is over, the access network multiplexer 60 continues to
transmit the multicast transmission as it did prior to the
advertisement break.
[0062] It should be noted that if during an advertisement break the
consumer changes the consumed video service, the management
application 50 supplies the new service in the same manner.
Specifically, if the service transmits content, the management
application 50 continues to transmit the content with the multicast
protocol. In addition, if there is an advertisement break, the
management application 50 may splice different advertisements.
[0063] As previously mentioned, the present system provides a
consumer specific advertising environment. This environment is
provided in part by the providing of online multilayer multicast
groups between the access network multiplexer 60 and the set top
boxes 110A-110D. The access network multiplexer 60 transmits
broadcast transmissions with multicast protocol to a subset A of
the set that is connected to the access network multiplexer 60. In
the subset A there are different subsets B of consumers watching
the same channel at a given moment that are connected to the access
network multiplexer 60. Within a single subset B, consumers are
associated by their profile for advertising. When there is an
advertisement break, the access network multiplexer 60 is
transmitting an additional layer of multicast, where each different
subset Bi is receiving different advertisements according to the
advertisement profile associated with subset Bi. Finally, when the
advertisement break is over, subset A consumers continue to watch
the same service.
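
The layered grouping described above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the implementation of the access network multiplexer 60: the function name `build_ad_groups`, the tuple layout, and the channel and profile labels are hypothetical and chosen only for illustration. It groups the set top boxes of subset A first by channel (a subset B) and then by advertising profile (a subset Bi).

```python
from collections import defaultdict

def build_ad_groups(boxes):
    """Group set top boxes by channel (subset B), then by profile (subset Bi).

    boxes: iterable of (box_id, channel, profile) tuples (hypothetical layout).
    """
    groups = defaultdict(lambda: defaultdict(list))
    for box_id, channel, profile in boxes:
        groups[channel][profile].append(box_id)
    # Convert to plain dicts so the result is easy to inspect.
    return {ch: dict(by_profile) for ch, by_profile in groups.items()}

# Hypothetical snapshot: (set top box, current channel, identified profile).
snapshot = [
    ("110A", "news", "adult_male"),
    ("110B", "news", "child"),
    ("110C", "news", "adult_male"),
    ("110D", "sports", "adult_female"),
]
ad_groups = build_ad_groups(snapshot)
```

During an advertisement break, each inner list would correspond to one additional multicast group receiving the advertisements selected for that profile.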
[0064] While the abovementioned provides an example of an IPTV
network 10, a different infrastructure in which the present system
and method may be provided includes a cable network 400. FIG. 4 is
a schematic diagram illustrating an example of a cable network 400
in which the present system may be provided. While there are
similarities between the IPTV network of FIG. 1 and the cable
network 400 of FIG. 4, there are also differences, which are
described herein.
[0065] Referring to FIG. 4, a cable head end 410 of the cable
network 400 is very similar to the IPTV head end 20 of the IPTV
network 10. It should be noted, however, that instead of an access
network multiplexer 60, the cable network 400 contains an RF
interface 410, which may be, for example, but not limited to, a
quadrature amplitude modulation (QAM) modulator and/or a radio
frequency (RF) combiner. The cable network 400 provides for
individual coaxial cables to provide communication capability from
the cable head end 410 to individual set top boxes 430A-430H, where
each set top box is located within a customer premises (CP)
440A-440H, such as, but not limited to, a home.
[0066] Another example of a network in which the present system and
method may be provided is a satellite network. FIG. 5 is a
schematic diagram illustrating an example of a satellite network
500 in which the present system may be provided. The satellite
network 500 contains a satellite head end 510 that is similar to
the IPTV head end 20, except that the satellite head end 510
contains an RF modulation interface 520. The RF modulation
interface 520 is capable of formatting and amplifying received data
for transmission to a satellite 550.
[0067] The satellite 550 is capable of reflecting received data to
satellite dishes 560A-560N capable of receiving data signals from
the satellite 550. Each satellite dish 560A-560N is associated with
a customer premises 570A-570N, such as, for example, a home. In
addition, each customer premises 570A-570N has at least one set top
box 580A-580N located therein.
[0068] Still a further example of a network in which the present
system and method may be provided is a terrestrial network. FIG. 6
is a schematic diagram illustrating an example of a terrestrial
network 600 in which the present system may be provided. The
terrestrial network 600 contains a terrestrial head end 610 that is
similar to the IPTV head end 20, except that the terrestrial head
end 610 contains an RF modulation interface 620. The RF modulation
interface 620 is capable of formatting and amplifying received data
for transmission to a radio tower 650.
[0069] The radio tower 650 is capable of reflecting received data
to antennas 660A-660N capable of receiving data signals from the
radio tower 650. Each antenna 660A-660N is associated with a
customer premises 670A-670N, such as, for example, a home. In
addition, each customer premises 670A-670N has at least one set top
box 680A-680N located therein.
[0070] In accordance with the present invention, the management
application 50 identifies the consumer profiles that are using
video/audio decoders (i.e., set top boxes) in the network 10. For
exemplary purposes the example of a single household having two
television sets is provided. Each television is connected to a
different set top box. A first television A is located in the
living room and a second television B resides in a room for
children.
[0071] In accordance with the present example, there are three
consumer demographic profiles in the household, namely:
[0072] 1. Profile 1: Male adult of age 37
[0073] 2. Profile 2: Female adult of age 34
[0074] 3. Profile 3: Male child of age 8 and male child of age 10
[0075] The consumer profiles are associated with the television
sets as follows:
[0076] Television A--profiles 1, 2, and 3 (all the household
residents are consuming content via television A).
[0077] Television B--profile 3 (only the children are using
television B)
[0078] The process of identifying and associating consumer profiles
to set top boxes may be separated in accordance with whether a
supervised learning process is used or an unsupervised learning
process. These two scenarios are described separately hereinbelow,
although it will be noted that certain steps in the processes are
similar.
[0079] In accordance with the present example, for both the
supervised and unsupervised scenarios, service providers have no
knowledge of the profiles existing in the household, the location
of the television sets in the household, and/or associations
between the television sets and the profiles. Instead, the
management application 50 identifies and associates the consumer
profiles with the set top boxes.
Supervised Learning
[0080] Reference is now made to the flowchart 300 of FIG. 3. The
flowchart 300 of FIG. 3 further illustrates the process of
identifying and associating consumer profiles to set top boxes
110A-110D within a supervised learning scenario. As shown by block
302, to acquire a sample, the service provider may send a
questionnaire to the consumers. Alternatively, the service provider
may use any other method of obtaining data, such as, but not
limited to, having a telephone conversation. The questionnaire may
refer to the household demographic details, video decoders (i.e.,
set top boxes), and association between the usage of each person in
the household and the video decoders in the household. As shown by
block 304, consumers fill out the questionnaire and return the same
to the service provider. With the return of the consumer
questionnaire, it is known which individual profiles and set top
boxes are associated with a household.
[0081] As shown by block 306, set top boxes 110 in the network 10
record all of the zapping events that the consumers are creating.
In accordance with the present description, and as is known by
those having ordinary skill in the art, zapping refers to the
switching from the current service to another service via use of,
for example, but not limited to, a remote control or pushing
buttons on the video decoder. It should be noted that this use of
remote controls is provided for exemplary purposes. Instead,
zapping may be associated with switching initiated by voice
commands, or even consumer motions without pressing buttons.
[0082] As shown by block 308, the set top boxes 110 send the
zapping events to the management application 50. The management
application 50 then associates behavior of consumers and their
zapping pattern with the households that either did not return the
questionnaire or that never received a questionnaire (block
310).
[0083] The association process is a learning process, also referred
to as a business process, which is the process of passive platform
audience learning and identification, and targeted platform rating
calculation and analysis. The learning process is divided into
multiple steps, including data collection, modeling, learning,
identification, analysis, and post processing. FIG. 7 is a flow
chart 700 further illustrating the steps of the supervised learning
process.
[0084] Data Collection
[0085] Referring to FIG. 7 and the step of data collection, in
order to perform audience learning, audience identification, and
targeted rating calculation, certain external data is collected and
converted into an internal format (block 702). This external data
includes the zapping log, the broadcast schedule, set top box
information, and sample information. The zapping log includes the
actions that were performed by the set top box user using a remote
control, directly using set top box control buttons, or performing
a different action that caused changing from a current service to
another service, or from a current state of the set top box to
another state of the set top box (e.g., switching on or off). The
broadcast schedule (or AsRun) includes, for example, a timetable
for the platform channels/programs during the zapping gathering
period. It should be noted that the broadcast schedule may also
include a schedule of video on demand programs, or a schedule of
any interactive service. The broadcast schedule should be
reconciled with the zapping log in terms of times and channels
identifications. The set top box information includes the relevant
information, for every set top box for which zapping was collected,
(e.g., unique set top box identifier and address). The set top box
information should also be reconciled with the zapping log in terms
of set top box identifications.
[0086] Modeling
[0087] Modeling is the process of converting the zapping log into
different data models that could be used by different learning and
identification algorithms, thereby providing a set top box
signature (block 704). In accordance with the present system and
method, at least the following data models are recognized. A first
data model that is recognized is a set top box viewing signature.
Regarding the set top box viewing signature, for each set top box,
the list of "watched" programs could be created based on the
zapping log and reconciled broadcast schedule. For each watched
program, an aggregated watching percentage is given. As an example,
STB 1 watched program number 56, 30%, means that STB 1 watched 30%
of the program overall (including leaving the program and
returning to it) during the whole time of broadcast of program
number 56. A second data model that is recognized is a set top box
time signature. The set top box time signature is, for each set top
box, the list of percentages of time spent viewing each channel during a
specific time slot, aggregated by day of the week. As an example, set top box 1
(STB1) watched CNN on Sundays between 12:00 and 13:00, 25%, means
that during the learning period, the average time that this
particular set top box watched CNN between 12:00 and 13:00 on
Sundays was fifteen minutes.
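
As an illustration of the set top box viewing signature, the following Python sketch computes the aggregated watching percentage of each program from a zapping log and a reconciled broadcast schedule. It is a minimal sketch, not the modeling functionality itself: the event layout `(time, channel)`, the schedule layout `(program_id, channel, start, end)`, the use of minutes as the time unit, and the function name `viewing_signature` are all assumptions made for the example.

```python
def viewing_signature(zap_events, schedule, log_end):
    """Return {program_id: fraction watched} for one set top box.

    zap_events: time-sorted (time, channel) pairs; channel None means off.
    schedule: (program_id, channel, start, end) tuples, times in minutes.
    log_end: time at which the zapping log stops.
    """
    # Turn successive tune events into viewing intervals (channel, start, end).
    intervals = []
    following = zap_events[1:] + [(log_end, None)]
    for (t, ch), (t_next, _) in zip(zap_events, following):
        if ch is not None and t_next > t:
            intervals.append((ch, t, t_next))

    signature = {}
    for program_id, channel, start, end in schedule:
        # Sum the overlap between each viewing interval and the program slot.
        watched = sum(
            max(0, min(end, i_end) - max(start, i_start))
            for i_ch, i_start, i_end in intervals
            if i_ch == channel
        )
        if watched > 0:
            signature[program_id] = watched / (end - start)
    return signature

# STB 1 tunes to channel 5 at minute 0, leaves at minute 10,
# returns at minute 50, and is switched off at minute 58.
events = [(0, 5), (10, 7), (50, 5), (58, None)]
schedule = [(56, 5, 0, 60)]  # program 56 on channel 5, minutes 0 to 60
sig = viewing_signature(events, schedule, log_end=60)
```

In this hypothetical log, STB 1 watches 18 of the 60 minutes of program number 56, including the time after returning to the program, which corresponds to the 30% figure in the example above.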
[0088] A third data model that is recognized is a set top box
zapping frequency signature. Specifically, every profile does
zapping with different frequencies. Calculating zapping frequencies
of every set top box during the predefined time periods provides a
Zapping Frequency Signature.
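
One possible form of the zapping frequency signature is sketched below in Python. The choice of fixed, consecutive periods, the use of minutes as the time unit, and the name `zapping_frequency_signature` are assumptions; the description above does not fix how the predefined time periods are chosen.

```python
from collections import Counter

def zapping_frequency_signature(zap_times, period=60, n_periods=None):
    """Return the number of zapping events in each consecutive period.

    zap_times: event times in minutes since the start of the log.
    period: length of each predefined period in minutes (assumed value).
    """
    counts = Counter(int(t // period) for t in zap_times)
    if n_periods is None:
        n_periods = max(counts) + 1 if counts else 0
    # Counter returns 0 for periods with no events.
    return [counts[p] for p in range(n_periods)]

# Hypothetical zapping times for one set top box.
freq = zapping_frequency_signature([1, 2, 3, 30, 65, 130, 131])
```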
[0089] Unfortunately, the zapping log is not noise free. Most of
the viewers use the remote control in the same fashion, but there
is a small minority of users that would use the remote control
differently. This affects the general zapping frequency, surfing
periods (when the viewer changes the channels with high frequency
in order to find something interesting), etc. In order to handle
these irregular behaviors, a set of data filters should be applied
to the zapping log prior to modeling.
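
The description does not specify which data filters are applied. As one hypothetical example only, the Python sketch below removes surfing events, that is, tune events whose channel was held for less than a minimum dwell time. The threshold of half a minute and the name `drop_surfing_events` are assumptions.

```python
def drop_surfing_events(zap_events, min_dwell=0.5):
    """Keep only tune events whose channel was held for at least min_dwell.

    zap_events: time-sorted (time, channel) pairs, times in minutes.
    The last event is always kept because its dwell time is unknown.
    Note: once an event is dropped, the preceding kept channel is treated
    as having been watched until the next kept event.
    """
    kept = []
    last_index = len(zap_events) - 1
    for i, (t, ch) in enumerate(zap_events):
        if i == last_index or zap_events[i + 1][0] - t >= min_dwell:
            kept.append((t, ch))
    return kept

# Hypothetical log: a burst of quick channel changes around minute 10.
raw = [(0.0, 5), (10.0, 7), (10.1, 8), (10.2, 9), (10.3, 2), (40.0, 5)]
clean = drop_surfing_events(raw)
```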
[0090] Learning
[0091] For supervised learning, learning is a process in which the
set top box signatures (viewing, time, and/or zapping frequency),
created at the data modeling stage, are used with a list of set top
boxes and profiles to provide an Association Rule (block 706). The
Association Rule provides knowledge of how to associate a list of
profiles within a network to a set top box within the network. The
Association Rule is needed because filled out questionnaires are not
received from all parties, leaving unknown relationships between
profiles and set top boxes that must be determined.
[0092] It should be noted that during supervised learning, it is
not determined which profiles are associated with which set top
boxes. Instead, as mentioned above, an Association Rule is
determined to provide knowledge of how to associate a list of
profiles to each set top box.
[0093] As mentioned above, during supervised learning there is an
association of set top box signatures (e.g., viewing) for each set
top box in the data model to a predefined list of profiles, based
on a sample, for further use in the identification functionality. A
sample is a partial list of set top boxes for which both the
zapping log and the list of profiles associated with each set top
box are provided. The sample may be provided by an operator of the
set top box collection. Predefined profiles can be, for example,
but not limited to, demographic profiles that define gender, age,
marital status, income level, or psychographic (behavioral)
profiles.
[0094] The Association Rule can be applied to any set top box in
the same network, as is performed during identification. An example
of a process that may be used to derive the Association Rule
follows. The management application 50 contains knowledge of the
current consumed service for a specific decoder, the profiles
(demographic, or behavioral) associated with a specific decoder and
household, and previously consumed content for a specific decoder.
In accordance with the present invention, the management
application 50 uses inference functionality to determine the
current viewer/listener profile. The inference functionality
defines the current profile(s) that is/are consuming the
service.
[0095] An example of inference functionality follows, where the
learning functionality uses Bayes rule. At this point, the
management application 50 contains knowledge of the current
consumed service for a specific decoder (set top box). In addition,
the management application 50 knows the demographic profiles
associated with a specific decoder and household. Further, the
management application 50 knows previously consumed content for a
specific decoder, specifically, the short-term history. The
management application 50 may then use the inference functionality
to determine the current viewer/listener profile.
[0096] An example for the inference functionality using Bayes rule
is provided hereinafter. In the learning algorithm, data collection
determines the distribution of the consumed content as a function
of the classification of the viewers/listeners at the household. In
addition, using the data in conjunction with the Bayes rule, the
probability that the household contains a viewer/listener belonging
to each demographic profile is estimated. Data utilized to perform
this process includes probabilities of each consumed service for
households containing each of the demographic profiles, as well as
probabilities of each consumed service for households not
containing each of the demographic profiles.
[0097] Bayes rule reads as shown by equation one below.
$$P(C \mid F_1 \ldots F_n) = \frac{P(F_1 \ldots F_n \mid C)\,P(C)}{P(F_1 \ldots F_n \mid C)\,P(C) + P(F_1 \ldots F_n \mid {\sim}C)\,P({\sim}C)} \qquad \text{(Eq. 1)}$$
In equation one, P(F1 . . . Fn|C) is the probability that a
household containing a certain profile (C) consumes the list of
services F1 . . . Fn and does not consume any other service. In
addition, P(F1 . . . Fn|~C) is the probability that a
household not containing a certain profile (C) consumes the list of
services F1 . . . Fn and does not consume any other service.
Further, P(C) is the probability that a household contains profile
C, regardless of the services consumed and P(~C) is the
probability that a household does not contain profile C, regardless
of the services consumed.
[0098] P(F1 . . . Fn|C) and P(F1 . . . Fn|~C) may be
approximated as the products P(F1|C)* . . . *P(Fn|C) and
P(F1|~C)* . . . *P(Fn|~C) respectively, which may be
calculated directly from the statistics gathered for the sample
population. Better approximations may be obtained by considering
correlations between services and between profiles in a household.
From the above calculation, the result is the probability, P(C|F1 .
. . Fn) that a household contains profile C, given the list of the
household consumed services. The collection of all values P(C|F1 .
. . Fn), calculated over all of the sample set top boxes,
represents the Association Rule used for the identification step,
which is applied to each set top box in the network that was not part of
the sample set top boxes. In addition, from this calculation, the
result is the probability that a certain individual viewer from a
specific profile used the set top box.
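
The calculation of equation one with the product approximation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `profile_posterior`, the dictionary layout, and all probability values are hypothetical and are not taken from any sample population; correlations between services and between profiles are ignored, as in the product approximation above.

```python
def profile_posterior(consumed, p_service_given_c, p_service_given_not_c, prior_c):
    """Estimate P(C | F1..Fn) using Bayes rule with the product approximation.

    consumed: list of consumed services F1..Fn.
    p_service_given_c: {service: P(F|C)} from sample statistics.
    p_service_given_not_c: {service: P(F|~C)} from sample statistics.
    prior_c: P(C); P(~C) is taken as 1 - P(C).
    """
    like_c = 1.0
    like_not_c = 1.0
    for service in consumed:
        like_c *= p_service_given_c[service]
        like_not_c *= p_service_given_not_c[service]
    numerator = like_c * prior_c
    denominator = numerator + like_not_c * (1.0 - prior_c)
    # A zero denominator means the observed services are impossible under
    # both hypotheses given these statistics; that case is not handled here.
    return numerator / denominator

# Hypothetical sample statistics for a profile C.
p_given_c = {"cartoons": 0.8, "news": 0.2}
p_given_not_c = {"cartoons": 0.1, "news": 0.6}
posterior = profile_posterior(
    ["cartoons", "news"], p_given_c, p_given_not_c, prior_c=0.3
)
```

With these illustrative numbers, the numerator is 0.16 * 0.3 = 0.048 and the denominator is 0.048 + 0.06 * 0.7 = 0.09, so the estimated probability that the household contains profile C is approximately 0.53.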
[0099] In accordance with an alternative embodiment of the
invention, a sample may be provided, and post processing may be
provided to associate content with profiles. Specifically, a sample
may include at least one profile, a set top box associated with the
profile, and zapping information associated with the set top box.
Post processing may then be performed on the sample to determine
which content (e.g., advertisement) is most appropriate for
providing to the consumer associated with the profile. As a result,
in accordance with this alternative embodiment of the invention,
the learning process is not required.
[0100] Identification
[0101] Identification is a process of recognition of a list of
profiles as being associated with a certain set top box, based on
the learning results. Every set top box in the network should be
assigned at least one profile (demographic or behavioral). It
is reasonable to assume that a set top box usually has
more than one active profile, and there are cases where the
same profile should be associated more than once with the same set top
box. Thus, for each set top box there should be assigned one or
more profiles. For example, a young couple (male & female)
between the ages of 20-30 that are living together would produce 2
profiles, specifically, one for the female and the other for the
male. As another example, if a specific household has two boys of
the ages seven and fourteen, the boys may both be assigned to an
appropriate set top box as the same profile, "Male 6-18."
[0102] To determine the list of profiles associated with a set top
box, the Association Rule is mathematically applied to the list of
set top box signatures (block 708).
[0103] Analysis
[0104] Analysis is the process of breaking down and studying the
results of learning and identification in order to estimate
possible identification errors, provide a set of different factors
and amendments for post processing, associate the definition of
profiles by signatures with a third party definition, and perform any
other functionality resulting from studying the learning and
identification results.
[0105] The identification error analysis may be performed via
mathematical modeling means and/or via simulation (empirical)
means. For example, estimation of expected identification errors
may be achieved via applying the learned results to a part of the
sample and simulating the identification results.
[0106] Post Processing
[0107] Post Processing is the process of calculating the data
required for presentation to potential customers, such as, targeted
rating. Post processing also includes reporting and analyzing based
on results of identification. The aforementioned list of results is
obtained via post processing functionality described hereafter.
Such functionality may be provided by, for example, algorithms.
Post processing may be utilized to calculate the following data,
although post processing calculation is not intended to be limited
to calculating only this data; rather, by post processing any
calculation done with the use of the results obtained from the
learner and/or identifier is referred to as a post processed
calculation/algorithm.
Targeted Rating
[0108] Targeted rating may include a percentage of viewers of a
specific profile that consumed content, a percentage of viewers of
a specific profile that consumed content from a channel during a
specified time period, or a percentage of viewers of a specific
profile that consumed content provided within the network during a
specified time period. It should be noted that the term "consumed"
is used herein instead of the term "watched" since content consumed
by a viewer profile not only includes content that is watched by a
viewer profile, but also content that is not watched, but that is
provided to a set top box associated with a viewer profile, such
as, but not limited to, audio content.
[0109] Herein, content may be, for example, but not limited to, a
program. It should also be noted, that for exemplary purposes, the
following provides the example of consuming content comprising
watching content, however, one having ordinary skill in the art
will appreciate that consuming of content need not be limited to
watching content, but instead may include other functions such as,
but not limited to, listening to content received from a
channel.
[0110] More specifically, targeted rating functionality calculates
the targeted rating of a content per profile (e.g., using
optimization algorithms, see examples herein below) of the learned
and identified data, or of any independent data (e.g., obtained
from the sample) as long as the data contains information about the
set top box signatures (e.g., viewing signatures) and the
profile(s) associated to each set top box in the input. As an
example, the targeted rating functionality may be used on data
resulting from the supervised learning functionality, unsupervised
learning functionality, or independent data. It should be noted
that herein set top box signatures includes one or more set top box
signature.
[0111] Targeted rating may include targeted program rating,
targeted channel rating, and targeted time interval rating.
Targeted program rating is a percentage of viewers of a specific
profile that watched a program. In addition, targeted channel
rating is a percentage of viewers of a specific profile that
watched a channel during a specified time period. Further, targeted
time interval rating is a percentage of viewers of a specific
profile that watched content broadcast within the network during
a specified time period.
[0112] Targeted rating determination may be provided in general or
regionally. Specifically, a regional targeted rating is a targeted
rating for one region, where a region may be limited to, for
example, a specific geographical location. Alternatively, general
targeted rating is a targeted rating for an entire network, or a
part of a network, which is region independent (for example, it may
include one or several combined regions).
[0113] FIG. 11 is a flow chart 950 illustrating the process of
determining a targeted rating. As shown by block 952, data
representing relationships between viewer profiles and set top
boxes is received, or obtained. Specifically, data showing which
profiles are associated with which set top boxes is received. The
data may either be obtained after performing learning and
identification processes, as described herein, or received from an
external source.
[0114] As shown by block 954, set top box signatures are also
received, or obtained, for use in determining targeted rating. Such
set top box signatures may be, for example, but not limited to,
viewing signatures, time signatures, high-resolution time
signatures, or zapping frequency signatures. It should be noted
that other set top box signatures may also be provided for by the
present system and method.
[0115] The type of set top box signature used in targeted rating
determination dictates which kind of targeted rating will result.
As an example, when viewing set top box signatures are used,
targeted program rating results. In addition, when time set top box
signatures are used, targeted time interval rating, or targeted
channel per a time interval rating, results.
[0116] As shown by block 956, a first input set is derived showing
the probability that each profile is associated with each set top
box. It should be noted that the first input set is derived by
performing the learning and identification processes, or is
received from an external source. A second input set is derived
containing data of set top box signatures (block 958). It should be
noted that the second input set is derived by performing the
modeling functionality on the collected/received zapping log. As an
example, for a viewing signature, the zapping log may contain
information showing whether a certain set top box consumed certain
content (for example, a program), or not. For purposes of deriving
the desired output set, namely, the set of targeted ratings, it is
assumed that the data of the set top box signatures can be
approximated by certain operations involving data associating
profiles to set top boxes and targeted rating.
[0117] As is shown by block 960, certain operations are applied on
the set of data associating profiles to set top boxes and the set
of data containing set top box signatures (the input sets),
resulting in a targeted rating (the output set). Different forms of
data sets and different operations may be used to provide the
targeted rating. As an example, matrices may be used to derive the
targeted rating, where it is assumed that multiplying a matrix A
(matrix A shows the probability that each profile is associated
with each set top box) by a matrix B (matrix B is the targeted
rating) would result in a matrix C (matrix C is the set top box
signature data). Of course, other examples of operations may be
used. Two examples of operations that may be used to determine
targeted rating are provided below.
[0118] If the network covers more than one region and information
on the regions in which the different set-top boxes in the network
reside is available, a regional targeted rating (RTR) may be
calculated using similar methods to those described below. In
addition, regional targeted rating of high-resolution time steps,
where a time step may be for example, but not limited to, per each
thirty seconds, may be calculated for each specific channel and
profile.
[0119] Input to the regional targeted rating functionality includes
the region in which each of the set top boxes is stationed, the set
top box signatures for set top boxes within that region, such as,
but not limited to, viewing signatures, time signatures, zapping
frequency signatures, and high-resolution time signatures, and
lists of profiles associated with each of the set top boxes within
the region, from any source. It should be noted that a region may
have one or more set top boxes therein. In addition, a set top box
may be located within more than one region.
[0120] The output of the regional targeted rating functionality is
the percentage of viewers of each predefined profile, within a
specific region, that watched each of the contents, for example,
programs, in the case of when viewing signatures are the input, or
of each channel at a certain time interval, in the case of when
time signatures are the input.
[0121] Two examples of methods that may be used to calculate
targeted rating are provided herein below. It should be noted that
the present invention is not intended to be limited to the
following examples, but instead that the following examples are
merely provided for exemplary purposes and are not intended to
limit the present invention.
EXAMPLE 1
[0122] An example of a method to calculate targeted rating, given a
list of set top boxes with viewing signatures and profile(s)
associated to each set top box, can be given via the use of a
linear regression optimization algorithm. In calculating the
targeted rating, it is assumed that multiplying the set of
parameters representing the association of profile(s) to set top
boxes (let us call it A) by the aggregation of targeted rating
values of each of the profiles per each program watched by at least
a portion of the set top boxes of the network for which the zapping
log contains records of set top box zapping signatures (the yet
unknown and desired output, let us call it B) corresponds to the
parameters representing the aggregation of the set top box viewing
signatures (part of the input, let us call it C).
[0123] For purposes of this example, it is assumed that the sets of
parameters A, B, and C are utilized to provide matrices A, B, and
C. A minimization algorithm on the squared norm of the matrix
(AB-C) may then be performed (a random initial guess is provided to
the algorithm for the values of B). In other words, given A and C,
the output of applying this algorithm is the set of probabilities,
B, representing the probability of each profile to watch each of
the programs broadcasted to the collection of set top boxes. An
example table for such an output is presented below after example 2
is described.
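As a minimal sketch of Example 1, and not the applicant's implementation, the following Python code estimates B by plain gradient descent on the squared norm of (AB-C), given A and C. The function names, step size, iteration count, and toy data are assumptions made for illustration; the initial guess for B is random, as described above.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def fit_targeted_rating(A, C, steps=500, lr=0.05, seed=0):
    """Estimate B by gradient descent on the squared norm of (AB - C)."""
    rng = random.Random(seed)
    n_profiles, n_programs = len(A[0]), len(C[0])
    # Random initial guess for B, as in the text.
    B = [[rng.random() for _ in range(n_programs)] for _ in range(n_profiles)]
    At = transpose(A)
    for _ in range(steps):
        AB = matmul(A, B)
        residual = [[ab - c for ab, c in zip(r_ab, r_c)] for r_ab, r_c in zip(AB, C)]
        half_grad = matmul(At, residual)  # A^T (AB - C); the gradient is twice this
        B = [[b - lr * 2.0 * g for b, g in zip(r_b, r_g)]
             for r_b, r_g in zip(B, half_grad)]
    return B

# Hypothetical toy data: 4 set top boxes, 2 profiles, 3 programs.
A = [[1, 0], [0, 1], [1, 1], [1, 0]]          # profile-to-set-top-box association
B_true = [[0.5, 0.1, 0.0], [0.2, 0.4, 0.3]]   # targeted ratings used to build C
C = matmul(A, B_true)                          # viewing signatures
B_est = fit_targeted_rating(A, C)              # recovers B_true on this toy data
```

The fixed step size is only suitable for this small example; a practical implementation would likely choose the step adaptively or use a dedicated least squares solver.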
EXAMPLE 2
[0124] As a second example of a method to calculate targeted
rating, the matrices A, B and C are as in example one, where A is a
matrix containing list(s) of demographic, or psychographic,
profiles that is (are) associated to each set top box (of the whole
network, a part of the network, a specific region within the
network, or statistically representing any of those), which is
obtained from any source, either via local identification, via
receiving an external sample, or via another means.
[0125] The matrix C is a matrix that contains, per each of the set
top boxes, a list of set top box signatures per a channel, or a
program. Examples of forms of set top box signatures include, but
are not limited to, viewing signatures, time signatures,
high-resolution time signatures or any other form of set top box
signatures that associates knowledge of some viewing habits in a
certain period per each set top box. The unknown set of
probabilities per each of the pre-defined profiles, represented by
the matrix B, may then be obtained by the use of solving equation
two (Eq. 2):
$$ B = A^{+} C \qquad \text{(Eq. 2)} $$
In equation two, $A^{+}$ is the pseudo-inverse of the matrix A,
which is unique in mathematical terms, thereby ensuring that the
targeted rating matrix B computed in equation two is well-defined.
An example of a pseudo-inverse is the Moore-Penrose pseudo-inverse.
Calculating $A^{+}$ and multiplying it by the matrix C gives a good
approximation to the matrix B, of the targeted ratings.
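The pseudo-inverse computation of equation two can be sketched as follows. This is an illustrative sketch under a stated assumption: A is taken to have full column rank, in which case the Moore-Penrose pseudo-inverse equals (A^T A)^-1 A^T. The helper names and toy data are hypothetical and are not part of the described system.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def solve(M, R):
    """Solve M X = R by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("A^T A is singular; A lacks full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pseudo_inverse(A):
    """Pseudo-inverse of a full-column-rank matrix: (A^T A)^-1 A^T."""
    At = transpose(A)
    return solve(matmul(At, A), At)

# Hypothetical toy data: 4 set top boxes, 2 profiles, 3 programs.
A = [[1, 0], [0, 1], [1, 1], [1, 0]]
B_true = [[0.5, 0.1, 0.0], [0.2, 0.4, 0.3]]
C = matmul(A, B_true)

A_plus = pseudo_inverse(A)
B = matmul(A_plus, C)   # all targeted rating elements are obtained in one product
```

When A may be rank deficient, a general-purpose routine such as NumPy's `numpy.linalg.pinv`, which is based on the singular value decomposition, would be a more robust choice than the normal-equations shortcut used here.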
[0126] The algorithm of equation two is extremely accurate and
allows for the performance of targeted rating calculations on very
large amounts of data (on the order of millions of entries or more)
in an extremely short computing time. Specifically, when performing
linear regression, for example, in accordance with one exemplary
embodiment of the invention, there is a requirement that for each
targeted rating element a separate optimization process is
performed, thereby requiring a long computation period. A targeted
rating element may be, for example, but not limited to, a program,
a time interval, or a channel.
[0127] Alternatively, in accordance with another exemplary
embodiment of the invention, if a pseudo-inverse is utilized,
performing a matrix multiplication, instead of multiple
optimization processes, is very fast and is performed for all the
targeted rating elements at once, even if there are tens of
thousands of targeted rating elements.
[0128] An example of data and targeted rating output follows.
If the pre-defined profiles are:
[0129] 1. Female of age 30-55 with high income.
[0130] 2. Male of age 18-40 with average income.
[0131] 3. Male child of age 6-16 with low income.
[0132] 4. Female child of age 6-16 with average income.
[0133] And the list of programs (as specified in the viewing
signatures) is:
[0134] 1. Saturday Night Live.
[0135] 2. Lost.
[0136] 3. 24.
Then the targeted rating (TR) output would be the following table:
TABLE-US-00001
Program ID | Profile ID | Rating (in % of each profile) |
1 | 1 | 0.5% |
1 | 2 | 1% |
1 | 3 | 0.01% |
1 | 4 | 0.04% |
2 | 1 | 3% |
2 | 2 | 1.54% |
2 | 3 | 0.01% |
2 | 4 | 0 |
3 | 1 | 2.31% |
3 | 2 | 2.11% |
3 | 3 | 0 |
3 | 4 | 0 |
Content to Profile Assignment
[0137] In addition to a targeted rating of a content (for example,
program) per profile, a content to profile assignment (C2P) may be
determined. Content may be, for example, but not limited to, a
program. The present description provides an example for
illustration purposes only. Similarly an assignment of any content
in a specific time slot to a specific profile in the household that
consumed this content may be made. Obtaining a content to profile
assignment involves determining for each program that was watched
by a certain set top box, which is the specific profile, of the
profiles associated to this set top box, that watched the program.
This can be done, for example, via use of algorithms applying
algebraic manipulations to the sets of parameters representing the
aggregation of viewing (or other) signatures of the set top boxes
(such as C above), the parameters representing the association of
profile(s) to set top boxes (e.g., A above) and parameters
representing targeted rating values (e.g., B above).
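The description does not specify which algebraic manipulation is applied, so the following is only one possible sketch: for each program that a set top box watched, the profile k that maximizes the product A[s][k] * B[k][p] is assigned. The scoring rule, the function name, the watched threshold, and the toy data are all assumptions made for illustration.

```python
def assign_content_to_profile(A, B, C, watched_threshold=0.5):
    """Map (set_top_box, program) to the index of the most likely watching profile.

    A[s][k]: probability that profile k is associated with set top box s.
    B[k][p]: targeted rating of program p for profile k.
    C[s][p]: viewing signature entry of set top box s for program p.
    """
    assignments = {}
    for s, row in enumerate(C):
        for p, watched in enumerate(row):
            if watched < watched_threshold:
                continue  # this set top box did not watch this program
            scores = [A[s][k] * B[k][p] for k in range(len(B))]
            assignments[(s, p)] = max(range(len(scores)), key=scores.__getitem__)
    return assignments

# Hypothetical toy data: 2 set top boxes, 2 profiles, 2 programs.
A = [[1.0, 0.0], [0.9, 0.8]]
B = [[0.5, 0.1], [0.2, 0.4]]
C = [[1, 0], [1, 1]]
c2p = assign_content_to_profile(A, B, C)
```

In this toy case, set top box 1 has both profiles associated with it, and the second program is assigned to the second profile because that profile has the higher targeted rating for it.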
Total Viewership
[0138] Further, a total viewership may be calculated (using, e.g.,
a program--time slot map and applying to it a calculation algorithm
which utilizes data obtained in the previous steps described here),
which is the calculation of total aggregated viewing activities for
each of the pre-defined profiles (these may be demographic or
behavioral), during a twenty-four hour period for each day of the
week.
[0139] For example, having the association of profile(s) with each
set top box, represented as a set of probabilities (either obtained
as an output from the learning and identification steps or given
from an outside source), and given the set top box signatures
(e.g., as an output from the data modeling stage), given in
addition the broadcasting time table (showing for a pre-defined
period of time at which time and date and for which duration each
program was broadcasted), the following calculation is
performed.
[0140] The data is aggregated and organized so that, for each day
of the week (24 hours), it is calculated how many viewers of each of
the pre-defined profiles watched any content during each of the
pre-defined time intervals. For example, if the period decided upon
is three months and there were 12 Sundays during this period, the
24-hour period is divided into intervals of 15 minutes, and for each
such interval it is calculated (using the set top box signatures
and the data mentioned above) how many times each of the
pre-defined profiles watched any content during that 15-minute
interval, aggregated over all 12 Sundays. This information is then
presented in a graph showing the viewing peaks during a 24-hour
Sunday, divided into 15-minute slots, per each profile. This is done
for each day of the week (aggregated over the number of times that
weekday occurred during the three-month period).
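The aggregation into weekday and 15-minute slots can be sketched as follows. The input format, a list of (timestamp, profile) viewing observations, is an assumption made for illustration, as are the names `total_viewership` and `SLOT_MINUTES`; the description does not prescribe a data layout.

```python
from collections import Counter
from datetime import datetime

SLOT_MINUTES = 15  # length of each time interval within the 24-hour day

def total_viewership(records):
    """Count viewing observations per (weekday, slot index, profile).

    records: iterable of (datetime, profile_id) pairs, one per observation.
    weekday follows datetime.weekday(): Monday is 0 and Sunday is 6.
    """
    counts = Counter()
    for timestamp, profile in records:
        slot = (timestamp.hour * 60 + timestamp.minute) // SLOT_MINUTES
        counts[(timestamp.weekday(), slot, profile)] += 1
    return counts

# Hypothetical observations on two different Sundays.
records = [
    (datetime(2008, 8, 17, 20, 5), "P1"),
    (datetime(2008, 8, 24, 20, 10), "P1"),
    (datetime(2008, 8, 24, 20, 20), "P2"),
]
counts = total_viewership(records)
```

Observations from both Sundays that fall in the same 15-minute slot are summed into one entry, which mirrors the aggregation over all occurrences of a weekday described above.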
[0141] In addition to the abovementioned, a targeted rating
distribution may be determined, which involves, for every channel,
for every profile, calculating the rating of the channel for every
brief period of time (e.g., thirty seconds), for every minimally
defined region. Further, a viewership flow may be determined, which
includes, for every channel, calculating the number (or percentage)
of viewers of every profile that join and leave the channel during
every short period of time (e.g., thirty seconds), for every
minimally defined region. Still further, creative reports may be
determined such as, for example, during an advertisement break, for
each second, calculating the rating and viewership flow. All the
aforementioned are merely examples of the post processing
possibilities.
[0142] In the supervised case, with the knowledge gained by the
functionality of block 310, for any households that did not fill
out the questionnaire, the management application 50 uses
identification functionality to associate the rest of the set top
boxes 10 with the profiles that are using the set top boxes 10
(block 312). An example of the functionality, which is used as a
basis for such an identification functionality, is provided herein
below. It should be noted that different relevant learning methods
may be used to perform the identification functionality. Examples
of such learning methods include any of the following, or other
learning methods: Bayesian learning, various statistical methods,
artificial neural networks, decision trees, k-nearest neighbor,
quadratic classifiers, support vector machines, various
optimization methods, and direct calculation of probabilities. Of
course, other learning methods may be used and are intended to be
included within the present description.
Viewership Flow
[0143] Using the identified profiles data and high-resolution time
signatures, a viewership flow may be calculated. It should be noted
that a high-resolution time signature is a representation of which
channel each set top box watched during each time step of a
specific time interval, such as, but not limited to, thirty
seconds. In addition, a viewership flow is the number of viewers of
each profile that left or joined watching a specific channel during
each time interval (e.g., 30 seconds), during a day or any
pre-defined time interval. Viewership flow may be calculated using,
for example, but not limited to, a high-resolution regional
targeted rating, in addition to the data of signatures and lists of
profiles associated with each set top box.
[0144] Calculation of viewership flow is performed in a few steps.
It should be noted that the following is an example of steps that
may be used to calculate viewership flow, however, the following
example is not the only way to calculate viewership flow and this
example is not intended to be limiting. As a first step, the
high-resolution regional targeted rating is calculated. Calculation
of the high-resolution regional targeted rating provides, per each
channel and per each viewer profile, the percentage of viewers of
this viewer profile that watched this channel per each time
interval (for example, 30 seconds) during each day of a specified
period. Such targeted rating may be calculated, for example, but
not limited to, using a method similar to the method described in
the targeted rating section of the present description, where the
word program is replaced by channel per time interval.
[0145] To calculate viewership flow, the differences between the
targeted ratings of same viewer profiles, per different time
intervals, may be calculated to record the change in number of
viewers of each profile between successive time intervals.
Moreover, using for example, but not limited to, the method
described above as content to profile assignment, the number of
viewers that left or joined the viewers of each channel at each
time interval may be calculated. To summarize: the viewership flow
application may contain various descriptions of changes in viewers
per channel per time interval. Examples of the abovementioned
include, but are not limited to, targeted rating and the changes in
targeted rating per time interval, and number of viewers of each
profile who left or joined the viewers of the channel at each time
interval.
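The two quantities summarized above can be sketched as follows, for one channel and one profile. The first function takes the difference of viewer counts between successive time intervals; the second assumes, hypothetically, that a content to profile assignment has already produced the set of viewer identifiers present in each interval. Function names and data are illustrative only.

```python
def net_flow(viewers_per_interval):
    """Change in the number of viewers between successive time intervals."""
    return [cur - prev for prev, cur in
            zip(viewers_per_interval, viewers_per_interval[1:])]

def joined_and_left(audience_per_interval):
    """For successive intervals, return (number joined, number left).

    audience_per_interval: list of sets of viewer identifiers, one set per
    time interval (for example, per 30 seconds).
    """
    return [(len(cur - prev), len(prev - cur)) for prev, cur in
            zip(audience_per_interval, audience_per_interval[1:])]

# Hypothetical viewer counts and audiences for three successive intervals.
flow = net_flow([120, 150, 90])
moves = joined_and_left([{"a", "b"}, {"b", "c", "d"}, {"d"}])
```

The net change alone cannot separate joiners from leavers, which is why the second function works on per-interval audiences rather than on counts.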
Unsupervised Learning
[0146] Reference is now made to the flowchart 800 of FIG. 8. The
flowchart 800 of FIG. 8 further illustrates the process of
identifying and associating consumer profiles to set top boxes
100A-100D within an unsupervised learning scenario. It should be
noted, that unlike with supervised learning, with unsupervised
learning no sample relating viewer profiles to set top boxes is
provided. Moreover, the type of viewer profiles might be unknown at
the stage of the learning. As a result, the viewer profiles must be
determined. It should be noted that different types of viewer
profiles may exist, including, but not limited to, demographic and
psychographic types of viewer profiles. For example, for the
psychographic type of viewer profile, the profile may contain
multiple categories, such as, but not limited to, watching habits,
purchasing behavior, social class, lifestyle, opinions, and
values.
[0147] To determine viewer profiles one of many methods may be
used, such as, but not limited to, using clustering algorithms to
find common denominators within a population in association with
viewing habits of the population. An example of a method that may
be used for profile learning and determination is provided
below.
[0148] As shown by block 802, set top boxes 110 in the network 10
record all zapping events created by the consumers. The set top
boxes 110 send the zapping events to the management application 50
(block 804). It should be noted that the zapping events include an
identification of the set top box from which the zapping events
were derived. The management application 50 then associates
behavior of consumers and their zapping patterns (block 806).
[0149] FIG. 9 is a block diagram further illustrating functionality
of the management application 50 as blocks of logic. As shown by
FIG. 9, the management application 50 contains modeling logic 902,
learning logic 904, identification logic 906, analyzer logic 908,
profiles determination logic 910, post processor logic 912, and
reporting logic 914. The logic of the management application 50 is
further described in detail with regard to the logical flow diagram
of FIG. 10.
[0150] FIG. 10 is a detailed logical flow diagram illustrating a
sequence of events performed during unsupervised learning. The
zapping log and the broadcast schedule (arrows 1) are the inputs to
modeling functionality of the management application 50, the output
of which is a collection of set top box signatures (arrow 2),
wherein the collection of set top box signatures includes a
signature for each set top box in the network. The set top box
signatures may be one of multiple classes of signatures, wherein
the classes of signatures include viewing signatures, time
signatures, and zapping frequency signatures. Each set top box in
the network may have multiple signatures, wherein the signatures
for a single set top box are selected from the classes of
signatures. In fact, for example, a single set top box may even
have one or more of each class of signature. Each such set top box
also has a unique identification (ID). Viewing signatures are
vectors of all the programs watched during a specified period by
each of the set top boxes in the network.
[0151] The set top box signatures are the input used by learning
functionality (arrow 3) of the management application 50. The
learning functionality clusters profiles into groups of profiles
that are yet unresolved. It should be noted that an unresolved
profile is a profile for which a type is not yet known.
Specifically, the learning functionality, which is further described
in detail below under the section entitled "learning", is capable
of using the set top box signatures and determining relationships
between profiles to derive clusters of profiles, where a type of a
profile is not yet known. As an example, an optimization algorithm
may be used to cluster the profiles into groups of unresolved
profiles, an example of which is illustrated below. The learning
step may be performed a few times, to determine the number of
existing profile groups available for identification from viewing
signature data. This may be done by, for example, but not limited
to, throwing out, after each iteration, the profile groups that
have similarity to each other, which is greater than a pre-defined
threshold.
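The pruning of mutually similar profile groups can be sketched as follows. The description does not name a similarity measure, so cosine similarity between profile group vectors is an assumption here, as are the function names, the threshold value, and the data.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def drop_similar_groups(groups, threshold=0.95):
    """Keep a profile group only if it is not too similar to one already kept."""
    kept = []
    for group in groups:
        if all(cosine(group, other) <= threshold for other in kept):
            kept.append(group)
    return kept

# Hypothetical profile groups, each a vector of ratings over three programs.
groups = [[0.9, 0.1, 0.0], [0.88, 0.12, 0.01], [0.1, 0.8, 0.3]]
kept = drop_similar_groups(groups)
```

Repeating the learning step with the reduced number of groups, until no group is dropped, is one way the number of identifiable profile groups might be settled.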
[0152] As previously mentioned, the output of the learning
functionality of the management application 50 is clusters of yet
unresolved profiles (arrow 4). The clusters of the yet unresolved
profiles, together with a profile description (arrows 5), are the
input to the profiles determination functionality of the management
application 50.
[0153] The profiles description is a classification, or definition,
of profiles of viewers by groups that associates between, for
example, viewing habits and purchasing habits of individuals. The
profiles description is provided by an external source, such as,
but not limited to, a single source researcher. It should be noted
that the profile description input is some external definition of
profiles that is fed to the system.
[0154] The profiles determination functionality performs a match
between the profiles found by the learning functionality
(unresolved profiles) and the profiles description from the
external source, which determines whether to match the profiles to
demographic clustering or to a specific psychographic clustering,
for example, by consuming habits. The profile determination with
respect to a given profile description may be done, for example, by
performing a standard best match procedure on each of the profiles
in both groups (unresolved and pre-defined) and by finding the best
possible match to each profile from the unresolved group from the
defined profiles. It should be noted that sometimes one unresolved
profile might fit two described profiles and, vice versa, two or
more unresolved profiles can match one profile from the described
profiles group.
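A best match procedure of this kind can be sketched as follows. The description leaves the matching criterion open, so the use of cosine similarity, the representation of each profile as a vector of ratings, and all names and values below are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def best_matches(unresolved, described):
    """Map each unresolved profile to the most similar described profile."""
    return {
        name: max(described, key=lambda d: cosine(vector, described[d]))
        for name, vector in unresolved.items()
    }

# Hypothetical profile vectors over three programs.
described = {"sports fan": [0.9, 0.1, 0.0], "news viewer": [0.1, 0.8, 0.2]}
unresolved = {"cluster_0": [0.2, 0.7, 0.1], "cluster_1": [0.8, 0.2, 0.1]}
matches = best_matches(unresolved, described)
```

Because each unresolved profile is matched independently, two unresolved profiles may map to the same described profile, which is consistent with the many-to-one case noted above.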
[0155] The output of the profiles determination functionality is
the resolved profiles (arrow 6), which are the input, together with
the set top box signatures, to an identification functionality
(arrows 7).
[0156] In accordance with an alternative embodiment of the
invention, the learning and the profiles determination
functionalities may be performed simultaneously by combining these
two functionalities (learning and profile determination) of the
management application 50 into one. In accordance with this
embodiment, the profiles description and the set top box signatures
are both fed as inputs to the learning and profiles determination
functionalities (arrows 3 and 5). In this case, the learning and
profiles determination functionalities are performed together. The
output of the learning and profiles determination functionalities
is resolved profiles (arrow 6). In the case of combining these two
functionalities, directing the learning process toward the input
profiles description may be done by, for example, but not limited
to, feeding the described profiles as an initial guess to the
optimization process and using the number of the defined profiles
as the number of profiles to be found.
[0157] The resolved profiles are sometimes used together with the
set top box signatures as an input to the identification
functionality of the management application 50 (arrows 7), which
associates each set top box in the network with at least one
profile. During identification, a quantization process may, for
example, be performed.
[0158] A quantization process is a process during which, rather
than having a continuous range of probabilities of having each of
the profiles associated with some set top box, some profiles would
be decided as not associated to that set top box (due to having a
too small probability of being associated), while other profiles
would be decided as being associated (with some higher probability,
or 1). A quantization process may be performed by, for example,
calculating a statistical constant related to the association of
profiles to set top boxes (see detailed explanation below) and
performing rounding steps. A quantization procedure may be
performed at various steps of the learning and identification
process.
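A quantization step of this kind can be sketched as follows. The statistical constant is not specified at this point in the description, so the mean of all association probabilities is used here purely as an assumed default threshold; the function name and data are hypothetical.

```python
def quantize(A, threshold=None):
    """Round association probabilities to 0 (not associated) or 1 (associated).

    A[s][k]: probability that profile k is associated with set top box s.
    threshold: cut-off; defaults to the mean of all entries (an assumption).
    """
    if threshold is None:
        values = [p for row in A for p in row]
        threshold = sum(values) / len(values)
    return [[1 if p >= threshold else 0 for p in row] for row in A]

# Hypothetical probabilities for 2 set top boxes and 3 profiles.
A = [[0.92, 0.05, 0.40], [0.10, 0.85, 0.75]]
A_quantized = quantize(A)
```

Passing an explicit `threshold` allows the same routine to be reused at the different steps of the learning and identification process where quantization may be performed.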
[0159] The identification of lists of profiles associated with each
set top box in the network may be performed by, for example, but
not limited to, combining the association rule between unresolved
profiles to set top boxes and the association rule between resolved
and unresolved profiles to create an association rule associating
lists of resolved profiles to set top boxes. For example, the
association rules may be matrices of parameters and the application
of the association rules may be performed, by using matrix
multiplication.
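The combination of the two association rules by matrix multiplication can be sketched as follows. The matrix names and the binary toy entries are assumptions made for illustration; in practice the entries may be probabilities, and a quantization step may follow.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Rows: 3 set top boxes. Columns: 2 unresolved profiles.
stb_to_unresolved = [[1, 0],
                     [0, 1],
                     [1, 1]]

# Rows: 2 unresolved profiles. Columns: 3 resolved profiles.
# The second unresolved profile matches two resolved profiles.
unresolved_to_resolved = [[1, 0, 0],
                          [0, 1, 1]]

# Rows: 3 set top boxes. Columns: 3 resolved profiles.
stb_to_resolved = matmul(stb_to_unresolved, unresolved_to_resolved)
```

Each row of the product lists which resolved profiles are associated with the corresponding set top box.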
[0160] The output of the identification functionality (arrow 8) is
the identification of which profile(s) uses each of the set top
boxes in the network. In other words, the output is an
identification of at least one profile associated with each set top
box in the network.
[0161] The profiles description, set top box signatures, and
profiles associated with each set top box (arrows 9) are fed to
analyzer functionality of the management application 50, the output
of which is an estimation of identification quality and error
estimation (arrow 11). Specifically, the analyzer is a
self-assessment tool of the management application. The analysis in
the case of unsupervised learning is performed with respect to the
profiles definition input. The output of the analyzer may be, for
example, the quality of the ability of the system to classify the
profiles into groups according to the given profile definition,
ranking the quality of the input data in view of desired output
versus the actual output, and error estimation regarding the
accuracy of the identification process.
[0162] The estimated errors may be, for example, the expected
deviation from the actual situation, and false positive and false
negative identification rates. Moreover, correlations between the
different profiles groups may be calculated, thereby providing
information regarding identification possibilities of certain
profiles with respect to their correlations with other profiles.
This may be done, for example, by performing comparison of results
with known statistics, or by comparing results obtained for all of
the network with results obtained from a well representing subgroup
of the network.
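The false positive and false negative identification rates can be sketched as follows, assuming that a reference association is available for comparison, for example from a well-representing subgroup of the network. The function name and the binary toy matrices are hypothetical.

```python
def error_rates(predicted, actual):
    """Return (false positive rate, false negative rate) over all entries.

    predicted[s][k], actual[s][k]: 1 if profile k is associated with
    set top box s, otherwise 0.
    """
    fp = fn = positives = negatives = 0
    for pred_row, act_row in zip(predicted, actual):
        for p, a in zip(pred_row, act_row):
            if a:
                positives += 1
                fn += 0 if p else 1
            else:
                negatives += 1
                fp += 1 if p else 0
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

# Hypothetical identified and reference associations (2 boxes, 3 profiles).
predicted = [[1, 0, 1], [0, 1, 0]]
actual = [[1, 0, 0], [1, 1, 0]]
fpr, fnr = error_rates(predicted, actual)
```

Here one of three true non-associations is wrongly identified and one of three true associations is missed, so both rates are one third.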
[0163] The identified profiles associated with a set top box are
fed as an input, together with the set top box signatures (either
the same ones used for the learning and identification
functionalities, or others, such as time signatures or
high-resolution time signatures) and additional set top box data,
if required, to post processor functionality of the management
application 50 (arrows 12). The post processing functionality
computes various data, such as: regional targeted rating (RTR),
content to profile assignment (C2P), total viewership and
viewership flow. A description of these functionalities was
presented above. Note that the computation of the functionalities
of the post processor may remain the same for data (associating
lists of profiles to set top boxes) obtained via supervised
learning, unsupervised learning, or an external source.
[0164] Reporting functionality of the management application 50
uses the computed data to produce business and other reports (arrow
13). As with the supervised scenario, the association process, also
referred to as the learning and identification process, is divided
into multiple steps. The steps in the association process include
data collection, modeling, learning, profiles determination,
identification, analysis, and post processing. Of the multiple
steps, usually the data collection, modeling, analysis and post
processing remain the same for both the supervised and unsupervised
processes. The main difference between the supervised and unsupervised
processes is in the learning step, which may also include a profile
determination step, and which may introduce some differences in the
identification steps. Note that the steps of learning, profile
determination, and identification are sometimes called here for
short, "unsupervised learning". The unsupervised learning process
is further defined herein below.
[0165] Learning
[0166] For unsupervised learning, each set top box signature is
learned to be associated with a certain list of unresolved profiles
defined solely using the set top box signatures. Examples of such
set top box signatures include, but are not limited to, viewing
signatures, time signatures, high-resolution time signatures, and
zapping frequency signatures. It should be noted that the main
difference from the supervised learning process is that no sample
is provided in this case. An unsupervised learning algorithm
receives the set top box signatures only as an input, resulting in
a classification of profiles into, for example, a certain type of
psychographic (for example, behavioral) or demographic profile
groups. After the first step (unless the steps of learning and
profile resolving are combined) the resulting learned profiles are
usually yet unresolved, meaning that their nature is yet to be
resolved.
[0167] Examples of unsupervised learning algorithms include, but
are not limited to, least squares algorithms and algorithms that
provide minimization via steepest descent. Other outputs from the
learning algorithms include an association of profiles to set top
boxes and obtaining a targeted rating of the defined profiles at
the same time, thereby providing a probability that a profile is
associated with a set top box.
[0168] The following is provided as an example of an unsupervised
learning algorithm. An input to the unsupervised learning process
is the collection of set top box signatures, which is the output of
the data modeling process. Assume as an example that these are
viewing signatures (although these might be time signatures, etc.),
where we denote their parametrical representation by a matrix C.
For example, each row of the matrix C may refer to one set top box,
and each column of the matrix C may refer to, for example, but not
limited to, one program, where the entries of matrix C may be, for
example, the portions of the programs that each set top box
watched, or, for example, the probabilities with which each of the
set top boxes represented in matrix C watched each of the programs
represented in matrix C. Let us denote by a matrix A the collection
of probabilities, representing viewer profiles association to the
set top boxes, where the entries of the matrix A are the
probabilities of each of the viewer profiles to be associated with
each of the set top boxes. Note that the viewer profiles might be
yet unresolved viewer profiles at this stage. Let us denote by the
matrix B, targeted rating values. Both A and B are unknown in the
case of unsupervised learning. To obtain the desired outputs A and
B, we use, for example, but not limited to, the following method.
We minimize the squared norm of the difference (AB-C) (see equation
three), to obtain the approximation of the matrix C as the product
AB. For this, we are using, for example, but not limited to, a
convex optimization algorithm (or, for example, some other
nonlinear minimization algorithm) under various constraints, such
as, but not limited to, that each quantity in A is greater than
zero and smaller than one, and each quantity in B is greater than
zero and smaller than, for example, 0.5. The following description
further describes this process.
[0169] Following this example, to determine a possible algorithm
for achieving the minimization of the squared norm of the matrix
(AB-C), (see equation three), considered above, it is assumed that
the population consists of viewers that can be divided into several
groups of different profiles, where each viewer may belong to one
or more group of viewers profiles. Each such group of profiles is
associated, for example, with a behavior pattern in terms of
watching habits, where the pattern consists of, for example, but
not limited to, the viewing signatures and the targeted rating per
content and per each profile, where the targeted rating for the
profile is the probability of a viewer of this profile watching
each program, or some other definition of content.
[0170] Since the number of all possible profile groups is usually
low compared to the number of programs and set top boxes in the
network, one is actually looking for a low rank approximation of
the matrix C. The term low rank (of matrices A and B) refers in
this case to the fact that the number of different profile groups
is smaller than the dimensions of C, which represent, for example,
the number of programs and the number of set top boxes in the
network. Due to this low rank, the matrices A and B may be obtained
using this approximation. One approach to obtaining a low rank
approximation of the matrix C is to search for the matrices A and B
that minimize the squared norm of the matrix (AB-C). This can be
done using, for example, a convex optimization method on the
quantity of equation three, which reads:
$$n = \|AB - C\|^{2} = \sum_{i,j}\Big(\sum_{k} A_{ik}B_{kj} - C_{ij}\Big)^{2} = \operatorname{Trace}\big((AB - C)^{T}(AB - C)\big) \qquad \text{(Eq. 3)}$$
where n denotes the squared norm of (AB-C), and trace is a known
operation on a matrix providing the sum of its diagonal entries. In
order to minimize this efficiently, one may use the derivatives of
equation three, described in equations four and five, which read as
follows:
$$\frac{\partial n}{\partial A_{ab}} = 2\sum_{j}\Big(\sum_{i} A_{ai}B_{ij} - C_{aj}\Big)B_{bj}, \qquad \frac{\partial n}{\partial A} = 2\,(AB - C)\,B^{T} \qquad \text{(Eq. 4)}$$
and correspondingly,
$$\frac{\partial n}{\partial B} = 2\,A^{T}(AB - C) \qquad \text{(Eq. 5)}$$
[0171] The second derivatives may also be calculated in order to
perform this minimization and they are given by the combination of
equations six, seven, and eight below:
$$\frac{\partial^{2} n}{\partial A_{ab}\,\partial A_{cd}} = 2\,\delta_{ac}\,(BB^{T})_{bd} \qquad \text{(Eq. 6)}$$
$$\frac{\partial^{2} n}{\partial B_{ab}\,\partial B_{cd}} = 2\,\delta_{bd}\,(A^{T}A)_{ac} \qquad \text{(Eq. 7)}$$
$$\frac{\partial^{2} n}{\partial A_{ab}\,\partial B_{cd}} = 2\,A_{ac}B_{bd} + 2\,\delta_{bc}\,(AB - C)_{ad} \qquad \text{(Eq. 8)}$$
Using any standard convex optimization technique and the
derivatives above with the (convex) constraints
$0 \le A_{ij}, B_{ij} \le 1$, a solution of the optimization
problem may be found, where the joint dimension of the matrices A
and B is chosen as the desired, or expected, number of profiles.
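By way of illustration only, the minimization of equation three under the constraints above may be sketched as a projected gradient descent that uses the first derivatives of equations four and five. The sketch below is a minimal example and not the only possible implementation; the function names (`factorize`, `residual`, `squared_norm`, and the other helpers), the step size `lr`, the number of steps, and the random initial guess are assumptions made for this example and do not appear in the description above.

```python
import random

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def residual(A, B, C):
    """Return the matrix AB - C."""
    AB = matmul(A, B)
    return [[ab - c for ab, c in zip(r_ab, r_c)] for r_ab, r_c in zip(AB, C)]

def squared_norm(R):
    """Sum of squared entries; equals n of Eq. 3 when R = AB - C."""
    return sum(v * v for row in R for v in row)

def step_and_clip(X, grad, lr, lo, hi):
    """Take one gradient step, then project each entry back into [lo, hi]."""
    return [[min(hi, max(lo, x - lr * g)) for x, g in zip(rx, rg)]
            for rx, rg in zip(X, grad)]

def factorize(C, rank, steps=2000, lr=0.01, b_max=1.0, seed=0):
    """Approximate C (set top boxes x programs) by A B with box constraints.

    rank is the desired, or expected, number of profile groups.
    lr, steps, b_max and seed are illustrative choices (assumptions).
    """
    rng = random.Random(seed)
    n_boxes, n_programs = len(C), len(C[0])
    # Random initial guess for the probabilistic quantities in A and B.
    A = [[rng.random() for _ in range(rank)] for _ in range(n_boxes)]
    B = [[rng.random() * b_max for _ in range(n_programs)] for _ in range(rank)]
    for _ in range(steps):
        R = residual(A, B, C)
        grad_A = [[2.0 * g for g in row] for row in matmul(R, transpose(B))]  # Eq. 4
        grad_B = [[2.0 * g for g in row] for row in matmul(transpose(A), R)]  # Eq. 5
        A = step_and_clip(A, grad_A, lr, 0.0, 1.0)
        B = step_and_clip(B, grad_B, lr, 0.0, b_max)
    return A, B
```

For example, calling `factorize(C, rank=2)` on a small matrix C returns candidate matrices A and B whose entries respect the box constraints. The second derivatives of equations six through eight could be used in place of the fixed step size, for example in a Newton-type method, which this sketch does not attempt.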
[0172] The matrix A is to be understood as the set of probabilities
of association of each of the profiles per each of the set top
boxes and the matrix B is the targeted rating matrix. Since the
matrix A is expected to contain binary quantities (either a profile
exists in a household or not), and since the optimal solution is
defined up to a multiplicative constant for each profile, it is
desirable to find a good quantization criterion for A.
[0173] Instead of the above-described example, for the unsupervised
learning algorithm, one may consider the slightly more complex
example described below. Moreover, these alternative ways may be
used to address specific different cases and the present invention
is not limited to these examples. An example of an alternative way
is, instead of minimizing the squared norm of the matrix (AB-C),
minimizing the squared norm of $(B - A^{+}C)$, denoted herein by
m:
$$m = \|B - A^{+}C\|^{2} \qquad \text{(Eq. 9)}$$
In addition, it is also possible to minimize the squared norm of
$(A - CB^{+})$, denoted by v:
$$v = \|A - CB^{+}\|^{2} \qquad \text{(Eq. 10)}$$
where $A^{+}$ denotes the pseudo-inverse of the matrix A, and
$B^{+}$ denotes the pseudo-inverse of the matrix B. For example,
the Moore-Penrose pseudo-inverse may be used. This enables a
reduction of the dimensionality of the problem, as the dimensions of
the latter matrices are usually much smaller than those of the
matrix (AB-C). Further, this approach creates a sharper distinction
between the probabilities in A (desired to be binary) and of B
(usually small probabilities representing targeted rating) in the
minimization process. The pseudo-inverse of a matrix is unique in
mathematical terms, hence minimizing equations nine or ten is well
defined. In the case of minimizing, for example, the quantity m,
one would need to use the derivatives $\partial m/\partial A$ and
$\partial m/\partial B$, which involves calculating derivatives of
the form $\partial A^{+}/\partial A_{ab}$, where:
$$\frac{\partial A^{+}_{ij}}{\partial A_{ab}} = (A^{+}A^{+T})_{ib}\,\delta_{ja} - A^{+}_{ia}A^{+}_{bj} - (A^{+}A^{+T})_{ib}\,(A^{+T}A^{T})_{aj} \qquad \text{(Eq. 11)}$$
Applying the derivative in equation eleven to obtain the
derivatives $\partial m/\partial A$ and $\partial m/\partial B$,
as well as the second derivatives of the quantity m, results in
slightly longer expressions than the derivatives presented above
in equations 4-8, but similar in nature.
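As a non-limiting illustration of the quantity m of equation nine, the sketch below evaluates m for given matrices A, B, and C. It assumes, for simplicity, that A has full column rank, in which case the Moore-Penrose pseudo-inverse reduces to the closed form $(A^{T}A)^{-1}A^{T}$; in the general case a singular value decomposition would typically be used instead. The function names (`inverse`, `pinv_full_column_rank`, `objective_m`) are hypothetical and chosen for this example only.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: A lacks full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pinv_full_column_rank(A):
    """Pseudo-inverse (A^T A)^-1 A^T, valid only when A has full column rank."""
    At = transpose(A)
    return matmul(inverse(matmul(At, A)), At)

def objective_m(A, B, C):
    """m = ||B - A+ C||^2 as in Eq. 9."""
    target = matmul(pinv_full_column_rank(A), C)
    return sum((b - t) ** 2 for rb, rt in zip(B, target) for b, t in zip(rb, rt))
```

When C is exactly equal to the product AB and A has full column rank, `objective_m(A, B, C)` is zero up to rounding, which is consistent with the premise that B may be recovered from A and C.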
[0174] Moreover, instead of using convex minimization routines, we
may use various nonlinear minimizations with slightly altered
constraints to minimize the squared norms of the differences
above.
[0175] An initial guess, for example, but not limited to, a random
guess, is given to the algorithm for any of the probabilistic
quantities in A and B. Additional constraints may be given to the
algorithm to increase its accuracy. Of course, other optimization
(or learning) algorithms may be used. The output is a set of
probabilities, A, associating groups of profiles to the set top
boxes, which later may be quantized and/or resolved (using, when
needed, a profile resolving procedure and quantization), and a set
of probabilities, B, providing the targeted rating for each (for
example) program and each profile (also to be used in the profile
resolving scheme when needed). It should be noted that the targeted
rating may be re-calculated during the post-processing to increase
the accuracy.
[0176] It should be noted that the abovementioned examples,
equations, and functionalities are based upon the general premise
that matrix C can be approximated by matrix A multiplied by matrix
B. Of course, further examples for achieving such approximation may
be provided and such examples are intended to be included within
the present invention.
[0177] Quantization
[0178] The quantization step is typically, but not necessarily, to
be used after the learning and profile determination stage, in the
identification functionality, or a few times during the steps of
learning, profile determination, and identification.
[0179] One approach to finding the quantizing constants (a set of
constants that each of the probabilities relating each of the found
profiles to set top boxes should be divided by to determine whether
a certain profile should indeed be associated with a certain set
top box or not) is to assume that A is approximately a binary
matrix with a constant multiplicative factor per column, $s_i$
($1 \le i \le$ number of profile groups), or in other words,
assume that each of the i profile groups has its own quantization
constant. Since the entries are supposed to be binary quantities,
one expects the following from calculating the mean and variance
using the binomial distribution, as shown by equations 12 and
13.
$$\sum_{a} A_{ai} = s_{i}\,N\,p \qquad \text{(Eq. 12)}$$
$$\frac{1}{N}\sum_{a} A_{ai}^{2} - \frac{1}{N^{2}}\Big(\sum_{a} A_{ai}\Big)^{2} = s_{i}^{2}\,p\,q \qquad \text{(Eq. 13)}$$
where N is the number of set top boxes in the network, p is the
probability that a profile is associated to a set top box, and
q=1-p. Solving equation twelve and equation thirteen for $s_i$,
dividing $A_{ai}/s_i$ and rounding to a pre-defined threshold,
leads to an association rule, associating each of the profiles
(resolved or yet unresolved) to each of the set top boxes.
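As an illustrative, non-limiting sketch of the quantization step: eliminating p between equations twelve and thirteen (with q=1-p) gives $s_i = \sum_a A_{ai}^2 / \sum_a A_{ai}$, so each quantization constant may be computed directly from the corresponding column of A. The function names, the default threshold of 0.5, and the fallback used for an all-zero column are assumptions made for this example only.

```python
def quantization_constants(A):
    """Return one constant s_i per column of A (per profile group).

    Uses s_i = sum_a A_ai^2 / sum_a A_ai, obtained by solving
    Eq. 12 and Eq. 13 for s_i with q = 1 - p.
    """
    constants = []
    for column in zip(*A):
        total = sum(column)
        if total > 0:
            constants.append(sum(v * v for v in column) / total)
        else:
            # Assumption: an all-zero column carries no association at all,
            # so any positive constant yields the same (all-zero) result.
            constants.append(1.0)
    return constants

def quantize(A, threshold=0.5):
    """Divide each column by its constant and round against a threshold.

    Returns a binary matrix: entry (a, i) is 1 when profile group i is
    associated with set top box a, and 0 otherwise.
    """
    s = quantization_constants(A)
    return [[1 if value / s_i >= threshold else 0 for value, s_i in zip(row, s)]
            for row in A]
```

For a column whose nonzero entries all equal the same value, the constant recovered by `quantization_constants` is that value, so the scaled entries are approximately one or zero before thresholding.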
[0180] Profile Determination
[0181] Profile determination, or resolving, is a process that
defines the nature of identified profiles. During profile
resolving, profile definitions, for example from single source
research results, such as, but not limited to, viewing habits and
behavior, may be used as inputs. In addition, the profile list and
targeted rating of defined profiles may be used as inputs. The
inputs are provided to a resolving algorithm resulting in profile
descriptions that describe each profile in the list.
[0182] The single source research addresses a focus group that
answers a questionnaire. There are two groups of questions in this
questionnaire, namely, a first group and a second group. The first
group refers to the identity of a person, examples including behavior
(e.g., purchasing behavior, rest and relaxation preferences, etc.)
and demographic profile of the answering person. The second group
refers to media consumption, for example, about the time a person
would watch television each day of the week and his preferred
shows.
[0183] The single source research associates the media consumption
habits with other habits, such as, but not limited to, purchasing
habits and preferred vacation habits. The output of the single
source research is a set of profiles and their habits, while each
profile is associated with its media consumption habits. The
resolving algorithm finds the best correlation between two sets of
data, namely, for example, the media consumption habits of the
focus group; and, for example, the targeted rating of the defined
profiles (the output of the unsupervised learning algorithm).
Therefore, the resolving algorithm has the capability of defining
the traits of the learned profile in the unsupervised
algorithm.
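One possible, non-limiting way to sketch such a resolving algorithm is to compare each learned profile's targeted rating vector (a row of the matrix B) with the per-program media consumption of each profile reported by the single source research, and to select the best-correlated one. The use of the Pearson correlation coefficient, the function names, and the example profile labels are assumptions made for this illustration and are not specified in the description above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y) if dev_x > 0 and dev_y > 0 else 0.0

def resolve_profiles(learned_ratings, survey_profiles):
    """Label each learned profile with the best-correlated survey profile.

    learned_ratings: rows of B, one targeted rating per program.
    survey_profiles: hypothetical mapping from a profile description to
        that profile's media consumption per program (same program order).
    """
    return [
        max(survey_profiles, key=lambda name: pearson(row, survey_profiles[name]))
        for row in learned_ratings
    ]
```

In this sketch, two learned profiles may resolve to the same survey profile; enforcing a one-to-one assignment would require an additional matching step that is outside the scope of this example.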
[0184] In accordance with the present invention, after the learning
and identification are performed, the management application 50
knows online, or offline, the current psychographic or demographic
profiles that are consuming content for at least a portion of the
set top boxes of the network for which the zapping log contains
records of set top box zapping signatures. The information
regarding the current demographic/psychographic profiles that are
consuming content for set top boxes within the network for which
sufficient input was received, may be the basis for personalized
advertisements deployment in accordance with the present
invention.
[0185] It should be emphasized that the above-described embodiments
of the present invention are merely possible examples of
implementations, merely set forth for a clear understanding of the
principles of the invention. Many variations and modifications may
be made to the above-described embodiments of the invention without
departing substantially from the spirit and principles of the
invention. All such modifications and variations are intended to be
included herein within the scope of this disclosure and the present
invention and protected by the following claims.
* * * * *