U.S. patent number 7,020,336 [Application Number 10/014,190] was granted by the patent office on 2006-03-28 for identification and evaluation of audience exposure to logos in a broadcast event.
This patent grant is currently assigned to Koninklijke Philips Electronics N.V. Invention is credited to Eric Cohen-Solal and Vasanth Philomin.
United States Patent 7,020,336
Cohen-Solal, et al.
March 28, 2006
Identification and evaluation of audience exposure to logos in a broadcast event
Abstract
Method and system of detecting and analyzing the presence of a
logo in one or more datastreams. In the method, at least one video
datastream of an event is first received. Next, one or more regions
of interest (ROIs) for the logo in one or more images comprising
the at least one datastream are identified. The one or more ROIs
are analyzed to detect if the logo is present in the ROI. If so,
the detection of the presence of the logo is used in making either
a broadcasting decision or an advertising decision.
Inventors: Cohen-Solal; Eric (Ossining, NY), Philomin; Vasanth (Hopewell Junction, NY)
Assignee: Koninklijke Philips Electronics N.V. (Eindhoven, NL)
Family ID: 21764027
Appl. No.: 10/014,190
Filed: November 13, 2001
Prior Publication Data
Document Identifier: US 20030091237 A1
Publication Date: May 15, 2003
Current U.S. Class: 382/204; 348/157; 348/169; 348/700; 348/E5.022; 382/291; 702/92; 702/94
Current CPC Class: G06K 9/3266 (20130101); H04N 5/222 (20130101)
Current International Class: G06K 9/46 (20060101)
Field of Search: 382/196,199,224,225,226,227,284,287,294,180,204; 702/85,92,93,94; 348/61,135,157,164,169,170,171,172,578,587,589,700,722
References Cited
Other References
McKenna, Stephen, et al., "Tracking Faces," Proceedings of the Second Int'l Conference on Automatic Face and Gesture Recognition, Oct. 14-16, 1996, Killington, VT, pp. 271-276.
U.S. Appl. No. 09/794,443, "Classification Of Objects Through Model Ensembles," Srinivas Gutta and Vasanth Philomin, filed Feb. 27, 2001.
Gutta, S., et al., "Mixture Of Experts For Classification of Gender, Ethnic Origin and Pose of Human Faces," IEEE Transactions On Neural Networks, vol. 11, no. 4, Jul. 2000, pp. 948-960.
Franke, U., et al., "Autonomous Driving Approaches Downtown," IEEE Intelligent Systems, vol. 13, no. 6, 1998.
Gavrila, D. M., and V. Philomin, "Real-Time Object Detection For 'Smart' Vehicles," Proceedings of the IEEE International Conference on Computer Vision, Kerkyra, Greece, 1999.
Harwood, David, et al., "Texture Classification By Center Symmetric Auto-Correlation, Using Kullback Discrimination Of Distributions," Pattern Recognition Letters, 16 (1995), pp. 1-10.
Gutta, Srinivas, et al., "Hand Gesture Recognition Using Ensembles Of Radial Basis Function (RBF) Networks And Decision Trees," International Journal Of Pattern Recognition And Artificial Intelligence, vol. 11, no. 6, 1997, pp. 845-872.
Primary Examiner: Mehta; Bhavesh M.
Assistant Examiner: Desire; Gregory
Attorney, Agent or Firm: Liberchuk; Larry
Claims
What is claimed is:
1. A method for insuring the video broadcast of a logo for a period
of time, comprising the steps of: receiving at least one video
datastream of an event; identifying one or more regions of interest
(ROIs) having characteristics associated with a logo of interest in
one or more images comprising the at least one datastream;
analyzing the one or more ROIs to detect if the logo is present in
at least one of the ROIs; responding to the detection of the
presence of the logo for selectively broadcasting the associated
ROI for at least a minimum period of time; and tracking in real
time the total time the logo is present during the period of time
the event is broadcast, to permit associated advertisers to
independently confirm acceptable broadcast of paid for
advertising.
2. The method of claim 1, wherein the at least one video datastream
comprises a single broadcast datastream.
3. The method of claim 1, wherein the at least one video datastream
comprises two or more separate video datastreams, the two or more
datastreams being individually selectable for broadcasting the
event via the one's of said datastreams showing the logo until the
total time the logo has been broadcasted during the event is at
least equivalent to the associated paid for advertising time.
4. The method of claim 1, wherein the step of identifying one or
more ROIs for the logo is based on at least one of a color, shape,
and texture of the logo.
5. The method of claim 4, wherein identifying one or more ROIs for
the logo comprises identifying a number of adjacent pixels having
the same color as the logo.
6. The method of claim 4, wherein identifying one or more ROIs for
the logo comprises identifying measures of texture in samples of a
location within the image that correspond to like measures of
texture of the logo.
7. The method of claim 4, wherein identifying one or more ROIs for
the logo comprises using template matching that identifies shapes
within the image that correspond with the shape of the logo.
8. The method of claim 1, wherein analyzing the one or more ROIs to
detect if the logo is present in at least one of the ROIs comprises
using radial basis function (RBF) classification modeling.
9. The method of claim 8, wherein the RBF classification modeling
includes training using images of the logo having a multiplicity of
perspectives and scales.
10. The method of claim 1, wherein analyzing the one or more ROIs
to detect if the logo is present in at least one of the ROIs
comprises using template matching.
11. A system for detecting and analyzing the presence of a logo,
the system comprising a processor having input that receives at
least one video datastream of an event, identifies one or more
regions of interest (ROIs) for the logo in one or more images
comprising the at least one datastream, analyzes the one or more
ROIs to detect if the logo is present in at least one of the ROIs,
and insures that an ROI having the logo is broadcast during the
event for at least a total period of time corresponding to an
associated advertiser's prepaid advertising.
12. The system of claim 11, wherein the processor receives one
video datastream of the event, the one video datastream comprising
a single broadcast datastream.
13. The system of claim 11, wherein the processor receives two or
more separate video datastreams, the two or more datastreams being
individually selectable for broadcasting the event via the one's of
said datastreams showing the logo until the total time the logo has
been broadcasted during the event is at least equivalent to the
associated paid for advertising time.
14. Software stored on a computer readable medium for detecting and
analyzing the presence of a logo, the software receiving as input
digital representations of images that comprise at least one video
datastream of an event, the software identifying one or more
regions of interest (ROIs) for the logo in one or more images
comprising the at least one datastream, analyzing the one or more
ROIs to detect if the logo is present in at least one of the ROIs,
and monitoring the presence of the logo in the image when so
detected, wherein the software provides an output regarding
detection of the presence of the logo that is usable in insuring
the broadcast of the logo during the event for a total accumulated
time corresponding to paid for advertising of an advertiser.
15. A system for detecting and analyzing the presence of a logo,
the system comprising a processor having input that receives at
least one video datastream of an event, analyzes the image to
determine if the logo is present in at least one portion of the
image and monitors the presence of the logo in the image when so
detected, wherein the detection of the presence of the logo is used
in insuring the broadcast of the logo during the event for a total
accumulated time corresponding to paid for advertising of an
advertiser.
Description
FIELD OF THE INVENTION
This invention relates to content based video analysis for
identification of logos in a video datastream.
BACKGROUND OF THE INVENTION
Advertising is an essential means by which to introduce, promote
and maintain the purchasing public's familiarity with new and/or
extant brands. Advertising efforts typically include selecting a
unique logo, displaying the logo to the public in a manner that
associates positive product attributes to the logo, and maximizing
the purchasing public's exposure to that logo. The more successful
logos are typically unique in shape, size, or other features, such as color or texture (e.g., McDonald's™ golden arches, Nike™ swoosh symbol, etc.).
Advertising during broadcast events, such as sporting events, is
one of the most effective ways to expose products and brand logos
to a broad and diverse audience. The success of broadcast advertising has resulted in an escalation in its cost. Advertisers typically have difficulty in evaluating the efficacy of money spent on advertising during broadcast events. Advertisers are similarly interested in conducting comparative surveys of the exposure obtained by other advertisements.
Recently acquired habits and capabilities of television viewers, as
well as new forms of advertising, complicate the efforts of
advertisers and heighten the need for advertisers to independently
monitor broadcast advertisements. Although the total number of
television viewers continues to grow, viewers are becoming
increasingly more adept at bypassing the traditional commercial
breaks that typically appear at 15 minute intervals. The methods of
bypass include changing the channel at the start of a commercial
break or using a VCR or TiVo™ recorder to tape a show and then fast-forwarding past the commercial break. To overcome this new-found ability of modern audiences to filter their viewing by
bypassing standard commercial break segments, advertisers are
placing their logos and/or products within the show itself, either
by logo placement or product placement type advertisements.
Logo placement is seen at practically every sporting event, where
logos emblazon the walls of baseball stadiums, race car tracks and
football stadiums, as well as on basketball court floors. Similar
to logo placement, product placement not only thwarts the filtering
efforts of the viewers, but has the additional benefit of more
closely correlating the advertiser's brand to the attributes of a
particular show (e.g., placing a bottle of Coppertone™ suntanning lotion on the set of the Baywatch show).
A significant drawback to product and logo placement is that
advertisers must relinquish the control over the creation and
airing of a standard 30-second commercial, leaving it principally
to the show's producer to see that the logo appears within the
broadcast in the correct manner, and for the appropriate amount of
time. A producer may not always share the advertiser's focus on
whether the brand logo is displayed in a manner that is visible to
the audience. This relinquishment of control of logo exposure
elevates the need for advertisers to independently confirm
acceptable broadcast of paid-for advertising.
Verifying audience exposure to virtual logos is an additional
challenge, since virtual logos can be inserted into a broadcast at
any stage of production. Virtual ads are digital enhancements of
event broadcasts, and although they are typically added to blank
spaces on stadiums walls, they can also be used to replace another
advertisement of similar size. Despite the fact that they do not
exist in real life, virtual advertisements appear real to the
audience. The advent of virtual advertisements increases the need
to independently monitor whether a logo actually makes it into the
final event, as broadcast to the audience. Utilization of virtual advertisements is in its nascent stage, and its use is expected to grow significantly.
Ad-hoc manual scanning of a broadcast by an individual may be used
to attempt to identify the appearance of a logo. Such a task is
subject to human error, including distractions, subjective nature
of the identification, etc., among other difficulties. The task of
tracking virtual advertisements in a broadcast event is even more
uncertain due to the possibility of sudden and random placement of
a substitute virtual image either by the event producer or by
broadcast personnel who may subsequently manipulate the broadcast
image. Further compounding the task is the proliferation of
television stations and broadcast events, multiplying the
voluminous amount of information to be surveyed. People lack the
wherewithal for minute analysis of large volumes of data, as
necessary to survey an event broadcast, and tabulate the duration
of every appearance of a logo.
Aside from the difficulties in relying on human subjectivity in
locating the advertisements, the prior art provides no mechanism to
tally the logo exposure time in a reproducible and non-subjective
manner. Any statistical analysis based on human-generated data will suffer from the inherent subjectivity of the evaluators. Moreover,
alternating between the multiple cameras that typically cover
sporting events will shift the perspective and position of the
displayed logo, potentially causing an observer to overlook a
portion of the logo exposure time.
There are numerous processes for detecting the presence of objects
in a videostream. For example, U.S. Pat. No. 5,969,755 to Courtney,
the contents of which are hereby incorporated by reference,
describes a particular motion based technique for automatic
detection of removal of an object in video received from a
surveillance camera. Courtney divides the video image into segments
and video-objects, which are identified in the segments by
comparing a reference image with the current image, identifying
change regions and tracking them between received video frames to
provide updated position and velocity estimation of the object.
Courtney and like techniques, however, are limited to detection of
removal of an object. They do not provide a mechanism for detecting
and analyzing logo appearance or placement.
SUMMARY OF THE INVENTION
An objective of the invention is to provide a system and method to positively identify, in a real-time and reproducible manner, and to track the quality and frequency of appearances of two- or three-dimensional logos in one or more datastreams, whether viewed frontally or from a varying perspective. It is an objective to identify and track a logo in a datastream in an automatic and objective manner. Further objectives are to use the detection and tracking of the logo for further analysis and/or decision-making, including analysis of exposure time of the logo during an event (which may then be used in marketing and advertising decisions) and broadcast decisions (for example, determining which of a number of cameras to use in broadcasting the event).
Another objective of the invention is to automate the logo
recognition process, providing detailed and non-subjective analysis
of the frequency, duration and degree of prominence of display of
target logos. The invention provides the advantage of performing
reproducible analysis based on the broadcast image, eliminating the
need for manual viewing of the program and searching for each
occurrence of the target logos. Elimination of human operator
subjectivity and digitization of the process allows greater
certainty of the search results and allows more definitive and
reproducible analysis to be performed. Additionally, the invention
can assign a value to each appearance of the logo, with the value
varying in accordance with the clarity and/or size of the display
of the logo, thereby informing the advertiser of the logo's viewability.
Still another objective of the invention is to provide a real-time
notification to the broadcast producer of the appearance of one or
more target logos among the plurality of cameras typically used to
film a broadcast event. The producer may, for example, use the
information to select which camera is used to broadcast the
event.
In accordance with these objectives, the invention comprises a
method of detecting and analyzing the presence of a logo in one or
more datastreams. First, at least one video datastream of an event
is received. Next, one or more regions of interest (ROIs) for the
logo in one or more images are identified, where the one or more
images comprise the at least one datastream. The one or more ROIs
are analyzed to detect if the logo is present in the ROI. If so,
the detection of the presence of the logo is used in making either
a broadcasting decision or an advertising decision.
In one embodiment, the at least one video datastream comprises a
single broadcast datastream, the time the logo is detected during
the event is compiled, and the time of detection is used to make an
advertising decision. In another embodiment, the at least one video
datastream comprises two or more separate video datastreams, the
two or more datastreams selectable for broadcasting the event. In
that case, detection of the logo in one or more of the datastreams
is used in a broadcast decision, for example, one of the
datastreams in which the logo is detected is selected and used to
broadcast the event.
The step of identifying one or more ROIs for the logo is, for
example, based on the color, shape, and/or texture of the logo. If,
for example, color is used, a number of adjacent pixels having the
same color as the logo may be used to identify an ROI. If, for
example, texture is used, center-symmetric covariance measures in a
proximate location within the image that correspond to the borders
of the logo may be used in identifying an ROI. If, for example,
shape is used, template matching that identifies shapes within the
image that correspond with the shape of the logo may be used in
identifying an ROI.
The step of analyzing the one or more ROIs to detect if the logo is
present in the ROI comprises, for example, using radial basis
function (RBF) classification modeling. The RBF classification
modeling may include training using images of the logo having a
multiplicity of perspectives and scales. Alternatively, for
example, the step of analyzing the one or more ROIs to detect if
the logo is present in the ROI comprises use of template
matching.
The invention also comprises a system for detecting and analyzing
the presence of a logo. The system comprises, for example, a
processor having input that receives at least one video datastream
of an event. The processor includes processing software (or other
digitally formatted algorithms) that identifies one or more ROIs
for the logo in one or more images comprising the at least one
datastream. The processor analyzes the one or more ROIs to detect
if the logo is present in the ROI and monitors the presence of a
detected logo. Detection of the presence of the logo is used in
making either a broadcasting decision or an advertising decision.
In one case, for example, the processor receives one video
datastream of the event, where the one video datastream comprises a
single broadcast datastream. In another exemplary case, the processor receives two or more separate video datastreams, where the two or more datastreams may each be selected for broadcasting the event.
The invention also comprises software for detecting and analyzing
the presence of a logo. The software receives as input digital
representations of images that comprise at least one video
datastream of an event. The software identifies one or more ROIs
for the logo in one or more images comprising the at least one
datastream. The software then analyzes the one or more ROIs to
detect if the logo is present in the ROI. If so, the software
monitors the presence of the detected logo and provides an output
regarding detection of the presence of the logo that is usable in
making either a broadcasting decision or an advertising
decision.
The invention also comprises a system for detecting and analyzing
the presence of a logo. The system comprises a processor having
input that receives at least one video datastream of an event. The
processor analyzes the image to determine if the logo is present in
at least one portion of the image and monitors the presence of the
logo in the image when so detected. Detection of the presence of
the logo is used in making one of a broadcasting decision and an
advertising decision.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other objects, features and advantages of the present invention will become more apparent from the following detailed
description when taken in conjunction with the accompanying
drawings, in which:
FIG. 1a is a schematic of a typical positioning of cameras used to
film and generate a number of video datastreams used to broadcast
an event that includes a logo as a backdrop;
FIG. 1b is a representation of the image of the logo captured by
the cameras of FIG. 1a;
FIG. 2 depicts a system in accordance with one embodiment of the
invention;
FIG. 3 depicts details of one of the components of FIG. 2; and
FIG. 4 depicts a system in accordance with a second embodiment of
the invention.
DETAILED DESCRIPTION
FIG. 1a is a representation of several cameras 101-103 in a frame
of reference that are used to capture images of an event and each
provide a number of video datastreams, which may be selected to
broadcast the event, such as a sporting event. The different
cameras 101-103 are trained on portions of the event (which may
overlap) from different perspectives and angles. The camera chosen
for broadcasting the event changes throughout the event depending
on broadcast decisions and considerations, which in the past were
typically related to the event itself. The various positions,
angles, etc. of the cameras themselves may be moved or
adjusted.
FIG. 1a also shows a representative logo, shown in the form of a
large "M", that may be captured with the images of some or all of
the cameras 101-103. It is also noted that a camera that is
broadcasting the event may also capture the logo within its field
of view for a time, and then pan to another portion of the event
and exclude the logo from the broadcast.
Where the target logo 100 is in a position so that all three of
cameras 101-103 capture the logo in their fields of view, examples
of the different perspectives of the target logo 100 that may be
included in the images from the various cameras are shown in FIG.
1b. (The logos are focused on in FIG. 1b, but will, of course, be
part of a broad image of the event.) Thus, the logo may have
various perspectives and scales in the different datastreams, may
be partially or entirely shown, or not shown at all, for example,
depending on the camera being used at the time to broadcast the
event.
An embodiment of the invention is set out in block diagram format
in FIG. 2. Central to the system 120 is a digital processor 124,
associated memory 126 and attendant software that will be described
in more detail below. In operation, that is, after programming in
accordance with the description below, processor 124 receives a
broadcast datastream via an input interface 128 and processes the
datastream to detect a particular logo, among other things, as also
further described below. In this embodiment, the determination of
which video datastream to broadcast has been made elsewhere
upstream from system 120, and the system analyzes the datastream
that is broadcast. Thus, in general, the video broadcast includes
images from the datastream of one camera covering an event (such as
camera 101 in FIG. 1a), followed by images from the datastreams of
other cameras covering the event (such as cameras 102 or 103 in
FIG. 1a), etc. The video broadcast is, for example, a digital video
datastream. If the broadcast is an analog signal, then processor
124 or another component of the system 120 may include an A/D
converter.
An external interface 130 allows a user to initiate operation of
the system 120 for detection of a logo (among other things) in the
video broadcast. It also allows the user to program the system 120,
as also described in more detail below. Among other things, programming of the system will typically include loading baseline data related to the logo image, which is then used by the system 120 in performing detection of the logo in the broadcast.
FIG. 3 is a block diagram that represents the processing stages
performed by processor 124 on the video broadcast. The processor
first performs an ROI analysis 124a on the received video broadcast
(also referred to as the video datastream). Generally speaking, the
ROI analysis 124a rapidly identifies certain sub-regions of the
image that are more likely to contain the logo, in order to focus
the subsequent logo identification analysis on those portions of
the image. As described further below, the ROI analysis 124a may
focus on identifying one or more portions of the image that contain
one or more general features of the logo, such as shape, color
and/or texture of the logo. Any such identified region is an ROI
that is considered for further analysis.
The processor 124 then uses the ROIs identified in the images in
the datastream in a logo identification analysis 124b. As also
described further below, the processing in this analysis determines
whether the logo is present in any of the ROIs. Once a logo (or
logos) is positively identified in the image datastream, the logo
is tracked in the image using tracking processing 124d. (If more
than one logo is identified, they may all be tracked simultaneously
in the image.) Once a logo is identified in the image, the
processing will generally track the logo, and suspend the ROI and
logo identification steps. If, however, the logo moves out of the
image (out of the field of view of the camera providing the
datastream), for example, because the camera is pivoted or the
broadcast is switched to another camera covering the event, then
the ROI and logo identification processing 124a, 124b are
re-initiated.
As also shown in FIG. 3, the processor 124 also uses the tracking
data to perform output processing 124c. This includes, for example,
compiling the amount of time the logo appears in the datastream,
among other analysis. The output is provided to the user interface
130, such as a graphic display.
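By way of a non-limiting sketch, the per-frame exposure tally performed in output processing 124c can be illustrated as follows. The function name, frame rate, and the per-frame boolean detection results are hypothetical illustrations, not elements of the patent:

```python
# Sketch of output processing 124c: accumulate the total on-screen time
# of a detected logo across broadcast frames. The frame rate and the
# per-frame detection results are hypothetical inputs.

def tally_exposure(detections, fps=30.0):
    """Given per-frame booleans (logo present or not), return total
    exposure time in seconds."""
    frame_duration = 1.0 / fps
    return sum(frame_duration for present in detections if present)

# Example: logo visible in 90 of 300 frames at 30 fps
# -> approximately 3.0 seconds of exposure.
exposure = tally_exposure([True] * 90 + [False] * 210, fps=30.0)
```

Such a tally, kept per logo and per advertiser, is what would let the broadcaster compare accumulated exposure against paid-for advertising time.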
ROI analysis 124a is performed in at least one of a number of ways
by processor 124. The received image may be scanned, for example,
to identify ROI regions within the image based on color, shape
and/or texture of the logo, among other things. Referring to the
"M" logo shown in FIGS. 1a and 1b, if the "M" is a continuous red
color, then the ROI analysis 124a may analyze the incoming images
for sub-regions having a certain number of adjacent red picture
elements. The analysis may require a certain number of adjacent red
pixels before it is identified as an ROI (thus eliminating, for
example, features in the image that cannot be resolved).
Similarly, the ROI analysis may consider texture of the sub-regions
within the image that broadly match the parameters of the logo
under consideration. For example, sub-regions of the image are
sequentially considered as possible ROIs that potentially contain the logo. For each sub-region, samples of the image are considered
and local center-symmetric auto-correlation measures are generated
for the sample. These include linear and rank-order versions, along
with a related covariance measure and a variance ratio. Apart from
the related covariance measure, all such measures generated for the
samples are rotation invariant robust measures and are locally
gray-scale invariant. Thus, the various possible perspectives of
the logo (such as the "M" shown in FIG. 1b) will not affect these
measures. The measures are all abstract measures of texture pattern and scale, thus providing highly discriminating information
about the level of local texture in the samples. By comparing the
sampled texture measures within the ROI with known texture measures
of the logo, and determining that the correlation between the
sampled measures and the known measures meets a threshold level, a
determination is made that the sub-region is an ROI. Such texture
analysis of an image is further described in "Texture
Classification By Center Symmetric Auto-Correlation, Using Kullback
Discrimination Of Distributions" by David Harwood, et al., Pattern
Recognition Letters, 16 (1995), pp. 1-10, the contents of which are
hereby incorporated by reference herein.
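The linear measures from Harwood et al. can be sketched for a single 3x3 sample as follows. The four center-symmetric neighbor pairs, the local mean, and the variance normalization follow the paper's general scheme, but the exact normalizations here are a simplification, and the function name is an illustrative assumption:

```python
# Sketch of the texture test in ROI analysis 124a, using linear
# center-symmetric auto-correlation measures after Harwood et al.
# (1995). A 3x3 neighborhood has four center-symmetric neighbor pairs;
# SCOV is their mean cross-product about the local mean, and SAC
# normalizes SCOV by the local variance, making it locally gray-scale
# invariant.

def sac_measure(patch):
    """patch: 3x3 list of intensities. Returns (SCOV, SAC)."""
    pixels = [patch[r][c] for r in range(3) for c in range(3)
              if not (r == 1 and c == 1)]          # the 8 neighbors
    mu = sum(pixels) / 8.0
    pairs = [((0, 0), (2, 2)), ((0, 1), (2, 1)),
             ((0, 2), (2, 0)), ((1, 0), (1, 2))]   # center-symmetric
    scov = sum((patch[a][b] - mu) * (patch[c][d] - mu)
               for (a, b), (c, d) in pairs) / 4.0
    var = sum((p - mu) ** 2 for p in pixels) / 8.0
    sac = scov / var if var > 0 else 0.0           # flat patch
    return scov, sac
```

Comparing SAC values sampled inside a candidate sub-region against the known values for the logo, with a correlation threshold, yields the ROI decision described above.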
In addition, the ROI analysis may consider the shapes found in
sub-regions within the image that broadly match the parameters of
the logo under consideration. For example, if the logo has a
circular border, then in various perspectives it may appear
circular or oval. The image may thus be analyzed to determine
sub-regions having ovals of a certain threshold size. To determine
matching shapes, the ROI processing may use template matching in
analyzing the sub-regions of the image, and/or a gradient analysis,
for example. Such a processing technique may be adapted from the
hierarchical template matching approach described in D. M. Gavrila
and V. Philomin, "Real-time Object Detection for 'Smart' Vehicles",
Proceedings of the IEEE International Conference on Computer
Vision, Kerkyra, Greece (1999), the contents of which are hereby
incorporated by reference herein. (The document is available at
www.gavrila.net). Analyzing an image gradient is also further
described in U.S. patent application Ser. No. 09/794,443, entitled
"Classification Of Objects Through Model Ensembles" for Srinivas
Gutta and Vasanth Philomin, filed Feb. 27, 2001, which is hereby
incorporated by reference herein and referred to as the "'443
application".
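As a hedged illustration of shape-based ROI identification by template matching, the sketch below slides a small binary edge template over the image and scores each placement by the fraction of agreeing pixels. This simple overlap score merely stands in for the hierarchical, distance-transform-based matching of Gavrila and Philomin; the names and threshold are illustrative assumptions:

```python
# Sketch of shape-based ROI analysis 124a via template matching: slide
# a binary template of the logo border over the image and score each
# placement by pixel agreement. A simplification of chamfer-style
# matching on edge maps.

def match_template(image, template, threshold):
    """Return (row, col, score) placements whose agreement score meets
    threshold. image/template: 2D lists of 0/1 edge pixels."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            agree = sum(1 for y in range(th) for x in range(tw)
                        if image[r + y][c + x] == template[y][x])
            score = agree / float(th * tw)
            if score >= threshold:
                hits.append((r, c, score))
    return hits
```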
Thus, using at least one such technique, processor 124 is
programmed to conduct the image processing to identify ROIs for the
particular logo under consideration. Alternatively, processor 124
may be programmed to receive a frontal image of the logo, for
example, and to generate parameters pertaining to the logo that
correspond to various ROI analysis techniques (such as color,
texture, shape, etc.), as described above. A series of images of
the logo that represent different scales, perspectives and
illumination may also be generated. Processor 124 may use the
parameters developed to test each different ROI analysis technique,
using the series of images of the logo in a background image, for
example. The technique that identifies the greatest number of the
series of logos as ROIs in the background image is used as the ROI
analysis 124a by processor 124 for the particular logo.
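The technique-selection step described above, in which each candidate ROI analysis is tested against a series of generated logo images and the best performer is retained, might be sketched as follows. The detector callables are hypothetical stand-ins for the color, texture and shape analyses:

```python
# Sketch of ROI technique selection: run each candidate detector over
# a series of test images containing the logo and keep the detector
# that recovers the most instances. Detectors are callables returning
# True when they find the logo in an image.

def select_roi_technique(detectors, test_images):
    """detectors: dict name -> callable(image) -> bool.
    Returns the name of the detector with the most successes."""
    def hits(detector):
        return sum(1 for img in test_images if detector(img))
    return max(detectors, key=lambda name: hits(detectors[name]))
```

Usage with toy "images" (strings) and toy detectors illustrates the selection: the detector that fires on the most test images wins.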
Once one or more ROIs in an image are identified, they are each
further analyzed to determine whether the logo is actually found
therein. As noted above, this processing falls under the rubric of
logo ID analysis 124b of FIG. 3. Although one ROI will be referred
to for convenience in the ensuing description, the same logo ID
processing is applied to all ROIs identified in the image. To
conduct the logo ID analysis 124b, processor 124 is programmed with
one of a number of various types of classification models, such as
a Radial Basis Function (RBF) classifier, which is a particularly
reliable classification model. The '443 application describes an
RBF classification technique for identification of objects in an
image, which may be a logo, and is thus used in the preferred
embodiment for programming the processor 124 to identify whether or
not a feature in an ROI is the logo. It is noted that the '443
application also treats classification of an object that is moving
in the image. Thus, the RBF classification technique may be used
for logo ID where the object moves in the ROI for a succession of
images (for example, where the camera providing the image is
pivoted), as well as an object that is stationary (i.e., has zero
motion) in the ROI of the image.
In short, the RBF classifier technique described extracts two or
more features from each object in the ROI. Preferably, the
x-gradient, y-gradient and combined x-y-gradient are extracted from
each detected object. Each gradient is computed over an array of
samples of the image intensity given in the video datastream for
the object. The x-gradient, y-gradient and x-y-gradient images are
each used by one of three separate RBF classifiers, each of which
gives a separate classification. As described further below, this
ensemble of RBF (ERBF) classification for the object improves the
identification.
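A minimal sketch of the gradient extraction, assuming NumPy's `np.gradient` as the differencing operator (the patent does not specify one):

```python
import numpy as np

def gradient_features(patch):
    # patch: 2-D array of image-intensity samples for the object in
    # the ROI.  np.gradient returns derivatives along axis 0 (rows,
    # i.e., y) and axis 1 (columns, i.e., x).
    gy, gx = np.gradient(patch.astype(float))
    gxy = np.hypot(gx, gy)  # combined x-y gradient magnitude
    # Each gradient image is flattened into the 1-D vector fed to one
    # of the three RBF classifiers in the ensemble.
    return gx.ravel(), gy.ravel(), gxy.ravel()
```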
Each RBF classifier is a network comprised of three layers. A first
input layer is comprised of source nodes or sensory units, a second
(hidden) layer comprised of basis function (BF) nodes and a third
output layer comprised of output nodes. The gradient image of the
moving object is fed to the input layer as a one-dimensional
vector. Transformation from the input layer to the hidden layer is
non-linear. In general, each BF node of the hidden layer, after
proper training using images for the class, is a functional
representation of a characteristic common across the shape
space of the object classification (such as the logo). The training
may include inputting a large number of images of the logo, from
different perspectives, different scales, different illumination,
etc. (As described above, training images may be supplied via the
user interface 130 and data input 132 shown in FIG. 2.)
Alternatively, the processor 124 may include software that receives
a front perspective of the logo, generates a mathematical model of
the logo, and internally rotates, re-scales, etc. the mathematical
model of the logo, thereby generating various views of the logo
having different perspectives, scalings, etc. Each BF node of the
hidden layer, after proper training using images for the logo
class, transforms the input vector into a scalar value reflecting
the activation of the BF by the input vector, which quantifies the
amount the characteristic represented by the BF is found in the
vector for the object (in this case, the logo) in the image under
consideration.
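The hidden-layer transformation can be illustrated with Gaussian basis functions, a common choice for RBF networks (the patent defers the details to the '443 application); the `centers` and `widths` here stand in for hypothetical trained parameters:

```python
import numpy as np

def bf_activations(x, centers, widths):
    # x: 1-D input vector (a flattened gradient image).
    # centers: (n_bf, len(x)) array, one trained prototype per BF node.
    # widths: per-node Gaussian widths (sigma).
    # Each activation is a scalar quantifying how much the
    # characteristic represented by that BF node is present in x.
    d2 = ((centers - x) ** 2).sum(axis=1)
    return np.exp(-d2 / (2.0 * widths ** 2))
```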
The output nodes map the values of the characteristics along the
shape space for the object to one or more identification
classes for an object type and determine corresponding weighting
coefficients for the object in the image. The RBF classifier
determines that an object belongs to the class having the maximum
value of the weighting coefficients. Preferably, the RBF classifier
outputs a value indicating the probability that the object
belongs to the identified class of objects.
Thus, the RBF classifier that receives, for example, the x-gradient
vector of an object in the ROI as input will output a probability
that it is the logo. Because the RBF programming comprising the
logo ID analysis includes training the RBF classifier with various
perspectives (including rotation), lighting and scaling of the
logo, the probability is provided for a logo in the image that may
have various perspectives, lighting and size, such as those shown
for the logo "M" in FIG. 1b. The other RBF classifiers that
comprise the ensemble of RBF classifiers (that is, the RBF
classifiers for the y-gradient and the x-y-gradient) will also
provide a classification output and probability for the input
vectors for the object. The classes identified by the three RBF
classifiers and the related probability are used in a scoring
scheme to conclude whether or not the object in the ROI is the
logo.
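The patent does not spell out the scoring scheme; one simple possibility, averaging the "logo" probabilities reported by the three classifiers, is sketched below as an assumption:

```python
def ensemble_decision(outputs, threshold=0.5):
    # outputs: (predicted_class, probability) pairs from the three RBF
    # classifiers (x-, y-, and x-y-gradient).  Sum the probabilities
    # assigned to "logo", average over all three classifiers, and
    # declare a logo when the average exceeds the threshold.
    probs = [p for cls, p in outputs if cls == "logo"]
    score = sum(probs) / len(outputs)
    return score > threshold

votes = [("logo", 0.9), ("logo", 0.7), ("background", 0.6)]
is_logo = ensemble_decision(votes)  # two of three classifiers agree
```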
The objects in each ROI are analyzed by the logo ID analysis in
this manner to determine if the object is the logo. If an object in
the ROI is determined to be the logo, then the logo is tracked in
the received image, as shown in FIG. 3 by tracking analysis block
124d. As noted, this processing keeps track
of the logo if it moves within the frame of the image because the
camera transmitting the image is rotated or otherwise moved, for
example. It is noted that, when the camera is panned or otherwise
moved, the logo does not move with respect to the background
objects in the image and thus is not an object that is itself "in
motion" in the video image. (By moving the camera, the position of
the logo may change slightly with respect to other features in the
image, but this is not a substantial movement that may be tracked
using techniques directed to an object that is itself actually in
motion.)
Such tracking of the position of the logo within the frame of the
image is done, for example, by creating a template of the logo from
a frame of the datastream once it is identified as the logo in the
logo ID analysis processing 124b described above. The template may
be based, for example, on the x-y gradient extracted for the
logo in the RBF processing of the logo ID analysis 124b. An x-y
gradient may then be generated at the same location in subsequent
images at regular intervals (for example, every tenth image frame
received in the datastream) and compared to the template to
determine if there is a match. If so, then it is
concluded that the image of the logo has been sustained in the
datastream at the same position over the interval. (The interval
will generally be of a sufficiently small duration so that a camera
could not be moved and then returned to the same position in the
interval.)
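The fixed-position check can be sketched as below, assuming a NumPy gradient as the template feature and a normalized mean absolute difference as the match criterion (the patent does not name a specific comparison measure):

```python
import numpy as np

def same_position(template, frame, top, left, tol=0.1):
    # Compare the stored x-y gradient template against a gradient
    # computed at the logo's last known position in a later frame.
    h, w = template.shape
    patch = frame[top:top + h, left:left + w]
    gy, gx = np.gradient(patch.astype(float))
    grad = np.hypot(gx, gy)
    # Match when the normalized mean absolute difference is small.
    diff = np.abs(grad - template).mean()
    return diff <= tol * (template.mean() + 1e-9)
```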
If not, then an x-y gradient is generated for the sub-region of the
image surrounding the last known position of the logo in the image.
The extent of the sub-region is defined by how far the logo could
possibly move within the frame of the image during the interval by
an operator panning or otherwise moving the camera. (As noted, the
interval is generally sufficiently small so that an operator would
be unable to physically move the camera enough to significantly
change the position of the logo within the frame of the subsequent
image.) If the extracted gradient matches the gradient of an object
within the sub-region, then it is again concluded that the image of
the logo has been sustained in the datastream (although the camera
has been moved), and the position of the logo in the image is
updated in processor 124.
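The sub-region search can be sketched as a bounded scan around the last known position; `match` stands for any per-position gradient-comparison test, and the helper names and radius bound are illustrative assumptions:

```python
def find_in_subregion(template, frame, last_top, last_left, radius, match):
    # Scan offsets around the last known position.  The radius is
    # bounded by how far an operator could pan the camera during one
    # interval between checks.
    for dt in range(-radius, radius + 1):
        for dl in range(-radius, radius + 1):
            top, left = last_top + dt, last_left + dl
            if match(template, frame, top, left):
                return top, left  # logo sustained; update its position
    return None  # logo no longer found in the sub-region
```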
In this manner, the logo identified in the datastream is tracked
until it is no longer detected in the image. Typically, the logo
will vanish from the datastream for one of a number of reasons,
including, 1) the camera supplying the image is panned or otherwise
moved so that the logo lies outside its field of view, 2) the logo
is completely or substantially obscured in the field of view of the
camera supplying the image, 3) the camera supplying the video
datastream of the event is changed, and the logo lies outside the
field of view of the new camera, or 4) there is a break in
transmission of the event, for example, for a commercial. In that
case, the processing conducted by processor 124 returns to the ROI
analysis 124a of the datastream, followed by logo ID analysis 124b,
as described above. If and when the logo is again identified in the
datastream, tracking 124d and output processing 124c are also
performed, again as described above. The sequence of processing is
repeated as the logo vanishes from the datastream and then
subsequently re-appears.
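The overall alternation between searching (ROI analysis plus logo ID) and tracking can be sketched as a per-frame loop; the two callables are hypothetical stand-ins for the 124a/124b and 124d processing:

```python
def process_stream(frames, find_logo, still_tracked):
    # find_logo: ROI analysis 124a + logo ID analysis 124b on a frame.
    # still_tracked: tracking analysis 124d on a frame.
    # Returns the number of frames in which the logo was visible.
    visible = 0
    tracking = False
    for frame in frames:
        if not tracking:
            tracking = find_logo(frame)      # search for the logo
        else:
            tracking = still_tracked(frame)  # follow it until lost
        if tracking:
            visible += 1
    return visible
```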
Under certain circumstances, the logo may become obscured within
the image. For example, in a baseball game, a player may step
between the logo and the camera, thus blocking the logo from the
transmitted image. If the logo is wholly or substantially obscured,
then the logo effectively vanishes from the datastream, and the
processing performed returns to the ROI analysis 124a and logo ID
analysis 124b as described directly above. When the logo becomes
visible in the image, it is detected in the ROI analysis 124a and
logo ID analysis 124b and again tracked.
There may be circumstances where the logo can become partially
obscured in the image, for example, if a player blocks part of the
logo from the camera. If necessary, the processing that determines
an ROI in the ROI analysis 124a is adjusted so that it has a lower
threshold of identifying an ROI. For example, if texture is used to
identify an ROI in the image as in the "M" example given above, a
sub-region having only two (or one) linear feature may be found to
be an ROI. Thus, if there are still a minimum number of defining
features visible for a blocked logo, an ROI is found by the ROI
analysis 124a. Likewise, the RBF classifier used in the logo ID
analysis 124b may be trained with partial images of the logo and/or
the threshold probabilities used to determine whether the object is
the logo may be adjusted to accommodate partial images. In this
manner, the processor 124 is programmed to identify a partially
obscured logo in the datastream. Similarly, the match required
between the extracted gradient and the gradient generated from a
subsequent image in the tracking processing 124d may be lessened so
that a logo that is partially obscured during tracking will
continue to be tracked.
Referring back to FIG. 3, the tracking processing 124d provides
data for output processing 124c conducted by the processor 124. The
output processing 124c may compute, for example, the total time
that the logo is visible in the datastream for the event, the
percentage of time the logo is visible, etc. Thus, the tracking
processing 124d
may provide an ongoing indication of when the logo is detected in
the datastream and when it is subsequently not detected in the
datastream. The output processing may keep track of the total time
the logo is detected in the datastream (the "detected time"), as
well as the total time of the event, for example. The detected time
gives how long the logo is visible in absolute terms; the detected
time may be used with the total event time to give the percentage
of time the logo is visible over the course of the event. Data
generated by the logo ID processing 124b and the tracking 124d
relating to the size, perspective, illumination, etc. of the logo
in the image can also be transmitted for output processing 124c.
Using that data, the processor 124 can generate not only the amount
of time the logo is visible, but also keep track of the quality of
the logo's visibility during the event.
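The detected-time bookkeeping reduces to simple arithmetic; a sketch, assuming per-frame detection flags and a nominal frame rate (neither is specified in the patent):

```python
def exposure_stats(detections, frame_rate=30.0):
    # detections: per-frame booleans from the tracking indication.
    # Returns the absolute detected time in seconds and the percentage
    # of the event during which the logo was visible.
    detected_frames = sum(detections)
    detected_time = detected_frames / frame_rate
    pct = 100.0 * detected_frames / len(detections)
    return detected_time, pct
```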
It is also noted that other statistical analysis may be performed.
For example, the system may be used to compare the time of exposure
of a company's logo with the time of exposure of other logos during
an event. The processor may be programmed in the manner described
above to simultaneously identify and track the company's logo, as
well as the others, in the image datastream. The times that each
logo is visible may thus be compiled for the event and compared. In
addition, the amount charged to a company may be based on the
amount of time its logo is visible during the event.
Other variations of the above-described embodiment may be used to
provide output determinations other than the analysis of exposure
time of the logo. As noted, the above-described embodiment is used
to analyze the logo's exposure in the broadcast datastream. Thus,
the determination of which camera (each of which generates a
separate video datastream of the event) is selected to broadcast
the event at any given point during the event has been made
upstream (that is, prior
to receipt by digital processor 124 in FIG. 2). In an alternative
embodiment, FIG. 4 shows the digital processor 124 receiving a
number of data streams from a number of cameras (shown to be the
three cameras 101-103 of FIG. 1a) covering an event. Each
datastream is simultaneously processed by the processor 124 to
detect a logo therein. That is, each datastream received is
separately processed by the processor 124 as shown in FIG. 3 and
described above. Detection of the logo in some, but not all of the
datastreams can be used by an event producer in deciding which
camera to use to broadcast the event over a particular interval of
time. For example, if a sponsor's logo is visible in the datastream
for one of the three cameras, then, if the producer has the ability
to choose between cameras that can broadcast the event at that
point in time, the producer may decide to use the camera that
shows the logo.
Although illustrative embodiments of the present invention have
been described herein with reference to the accompanying drawings,
it is to be understood that the invention is not limited to those
precise embodiments. For example, the RBF classifier used in the
above embodiments for the logo ID processing 124b conducted by
processor 124 may be replaced with a template matching software
technique. A series of templates for the logo having different
perspectives, scales, etc. can be input to the processor 124, or
internally generated based on an input frontal view of the logo.
After ROIs are identified in an image as previously described,
gradients may be generated for the objects in the image. The
template may be compared with the objects and a percentage of
pixels for each object that falls within the template is generated.
If the percentage exceeds a threshold amount, the logo ID
analysis concludes that the object is the logo. Accordingly, it is
to be understood that it is intended that the scope of the
invention is as defined by the scope of the appended claims.
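The template-matching variant described above can be sketched as an overlap test on binary masks; the mask representation and the threshold value are assumptions for illustration, not taken from the patent:

```python
import numpy as np

def template_match(template_mask, object_mask, threshold=0.8):
    # Fraction of the object's pixels that fall within the template's
    # footprint; above the threshold, the object is taken to be the
    # logo.
    inside = np.logical_and(object_mask, template_mask).sum()
    total = object_mask.sum()
    return total > 0 and inside / total >= threshold
```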
* * * * *