U.S. patent application number 14/168811 was filed with the patent office on 2014-01-30 and published on 2014-05-29 for system and method for visual analysis of on-image gestures.
This patent application is currently assigned to CORTICA LTD. The applicant listed for this patent is CORTICA LTD. Invention is credited to Karina Odinaev, Igal Raichelgauz, Yehoshua Y. Zeevi.
Publication Number: 20140149893
Application Number: 14/168811
Family ID: 54065620
Filed: 2014-01-30
Published: 2014-05-29
United States Patent Application 20140149893
Kind Code: A1
Raichelgauz, Igal; et al.
May 29, 2014
SYSTEM AND METHOD FOR VISUAL ANALYSIS OF ON-IMAGE GESTURES
Abstract
A method and system for providing at least a link to a content
item related to a multimedia content element respective of an
on-image gesture. The method comprises receiving, from a user
device, at least one on-image gesture and the multimedia content
element; analyzing the at least one on-image gesture to determine at
least one portion of the multimedia content element that a user is
interested in; generating at least one signature for each of the at
least one portion; determining a content item corresponding to the at
least one identified portion of multimedia content, wherein the
determination is based in part on a type of the at least one on-image
gesture; and modifying the received multimedia content element to
include at least a link to an informative resource containing the
content item.
Inventors: Raichelgauz, Igal (Ramat Gan, IL); Odinaev, Karina (Ramat Gan, IL); Zeevi, Yehoshua Y. (Haifa, IL)
Applicant: CORTICA LTD., Ramat Gan, IL
Assignee: CORTICA LTD., Ramat Gan, IL
Family ID: 54065620
Appl. No.: 14/168811
Filed: January 30, 2014
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number | Continuity
13685182 | Nov 26, 2012 | | parent of 14168811
13624397 | Sep 21, 2012 | | parent of 13685182
13344400 | Jan 5, 2012 | | parent of 13624397
12434221 | May 1, 2009 | 8112376 | parent of 13344400
12084150 | Apr 7, 2009 | 8655801 | national stage of PCT/IL2006/001235 (filed Oct 26, 2006); parent of 13685182
12195863 | Aug 21, 2008 | 8326775 | parent of 13685182
12084150 | Apr 7, 2009 | 8655801 | parent of 12195863
61763505 | Feb 12, 2013 | | provisional
Current U.S. Class: 715/760
Current CPC Class:
G06F 16/172 20190101;
H04L 67/327 20130101; G06F 16/951 20190101; H04N 21/25891 20130101;
G06F 16/152 20190101; G06N 3/0454 20130101; H04H 20/103 20130101;
G09B 19/0092 20130101; H04N 21/466 20130101; G06F 16/904 20190101;
G06F 3/0484 20130101; G06F 16/284 20190101; G06F 16/285 20190101;
G06F 16/433 20190101; G06N 20/00 20190101; G06T 19/006 20130101;
G06N 5/022 20130101; H04H 60/66 20130101; H04N 21/23418 20130101;
G06F 16/7847 20190101; H04H 60/46 20130101; H04H 60/71 20130101;
G06N 3/0481 20130101; G06N 5/04 20130101; H04N 21/2668 20130101;
Y10S 707/99943 20130101; G06F 17/16 20130101; G06F 40/134 20200101;
G06N 5/02 20130101; G06Q 30/0246 20130101; G10L 15/32 20130101;
G10L 25/51 20130101; H04H 20/93 20130101; H04H 60/49 20130101; H04L
65/605 20130101; G06F 16/7844 20190101; G06N 7/005 20130101; G06F
16/1748 20190101; G06F 16/783 20190101; G06K 2209/27 20130101; G06Q
30/0261 20130101; G06F 3/0488 20130101; G06F 16/683 20190101; G06F
16/7834 20190101; H04L 67/22 20130101; G06K 9/00758 20130101; H04N
7/17318 20130101; H04N 21/8106 20130101; Y10S 707/99948 20130101;
H04L 65/601 20130101; G06F 16/685 20190101; H04L 67/306 20130101;
G06K 9/00744 20130101; H04N 21/26603 20130101; G06F 16/2228
20190101; G06F 16/435 20190101; H04H 60/56 20130101; G06F 16/4393
20190101; G06F 16/487 20190101; G06K 9/00711 20130101; G06F 3/048
20130101; G10L 15/26 20130101; H04H 60/33 20130101; G06K 9/00281
20130101; H04L 67/10 20130101; G06F 16/51 20190101; G06F 16/739
20190101; G06F 16/9558 20190101; G06F 16/35 20190101; G06F 16/40
20190101; G06N 5/025 20130101; H04H 20/26 20130101; G06F 16/14
20190101; G06F 16/41 20190101; G06K 9/6267 20130101; H04H 60/37
20130101; H04H 60/59 20130101; H04N 21/278 20130101; G06F 16/438
20190101; G06N 3/063 20130101; H04H 60/58 20130101; G06N 3/088
20130101; G06Q 30/0201 20130101; G06F 16/43 20190101; G06F 16/434
20190101; H04H 2201/90 20130101; G06F 16/48 20190101 |
Class at Publication: 715/760
International Class: G06F 3/0488 20060101 G06F003/0488; G06F 3/0484 20060101 G06F003/0484
Foreign Application Data

Date | Code | Application Number
Oct 26, 2005 | IL | 171577
Jan 29, 2006 | IL | 173409
Aug 21, 2007 | IL | 185414
Claims
1. A method for providing at least a link to a content item related
to a multimedia content element respective of an on-image gesture,
comprising: receiving, from a user device, at least one on-image
gesture and the multimedia content element; analyzing the at least
one on-image gesture to determine at least one portion of the
multimedia content element in which a user is interested;
generating at least one signature for each of the at least one
portion; determining a content item corresponding to the at least
one identified portion of multimedia content, wherein the
determination is based in part on a type of the at least one on-image
gesture; and modifying the received multimedia content element to
include at least a link to an informative resource containing the
content item.
2. The method of claim 1, further comprising: receiving at least
one event related to the received multimedia content element; and
determining the content item corresponding to the at least one
identified portion of multimedia content using the generated
signatures, the at least one event, and the on-image gesture.
3. The method of claim 1, wherein the on-image gesture is any one of: a
touch gesture, a scroll-over, and a mouse click, wherein the touch
gesture is detected on a user device having a touch screen
display.
4. The method of claim 1, wherein the event is at least one of
viewing the multimedia content element for a specified period of
time and interacting with the multimedia content element for a
specified period of time.
5. The method of claim 2, further comprising: determining a type of
the on-image gesture and a type of the at least one event; and
determining a type of the content item based on at least one of:
the type of the on-image gesture and the type of the at least one
event.
6. The method of claim 1, further comprising: determining the
context of the multimedia content element respective of the
generated signature; and determining the content item based on the
context of the multimedia content element respective of the
generated signature.
7. The method of claim 1, wherein any one of the multimedia content
element and the content item is at least one of: an image,
graphics, a video stream, a video clip, an audio stream, an audio
clip, a video frame, a photograph, images of signals, combinations
thereof, and portions thereof.
8. The method of claim 1, wherein the at least one link is added to the
multimedia content element as an overlay object, wherein the
modified multimedia content is embedded in a web page displayed in
a web-browser of the user device.
9. The method of claim 8, wherein the overlay object comprises a
vocabulary of at least one on-image gesture determined as corresponding
to the at least one identified portion of the received multimedia
content element.
10. A non-transitory computer readable medium having stored thereon
instructions for causing one or more processing units to execute
the method according to claim 1.
11. A system for providing at least a link to a content item
related to a multimedia content element respective of a user
gesture, comprising: an interface to a network for receiving a
uniform resource locator (URL) of a web-page containing a
multimedia content element and at least one on-image gesture related to
the multimedia content element; a processor; and a memory coupled
to the processor, the memory containing instructions that, when
executed by the processor, cause the system to: receive, from a user
device, the at least one on-image gesture and the multimedia content
element; analyze the at least one on-image gesture to determine at
least one portion of the multimedia content element in which a user
is interested; generate at least one signature for each of the at
least one portion; determine a content item corresponding to the at
least one identified portion of multimedia content, wherein the
determination is based in part on a type of the at least one on-image
gesture; and modify the received multimedia content element to
include at least a link to an informative resource containing the
content item.
12. The system of claim 11, wherein the system is further
configured to: receive at least one event related to the received
multimedia content element; and determine the content item
corresponding to the at least one identified portion of multimedia
content using the generated signatures, the at least one event, and
the on-image gesture.
13. The system of claim 12, wherein the on-image gesture is any one of: a
touch gesture, a scroll-over, and a mouse click, wherein the touch
gesture is detected on a user device having a touch screen
display.
14. The system of claim 12, wherein the event is at least one of
viewing the multimedia content element for a specified period of
time and interacting with the multimedia content element for a
specified period of time.
15. The system of claim 12, wherein the system is further
configured to: determine a type of the on-image gesture and a type
of the at least one event; and determine a type of the content item
based on at least one of: the type of the on-image gesture and the
type of the at least one event.
16. The system of claim 12, wherein the system is further
configured to: determine the context of the multimedia content
element respective of the generated signature; and determine the
content item based on the context of the multimedia content element
respective of the generated signature.
17. The system of claim 11, wherein any one of the multimedia
content elements and the content item is at least one of: an image,
graphics, a video stream, a video clip, an audio stream, an audio
clip, a video frame, a photograph, images of signals, combinations
thereof, and portions thereof.
18. The system of claim 11, wherein the at least one link is added to
the multimedia content element as an overlay object, wherein the
modified multimedia content is embedded in a web page displayed in
a web-browser of the user device.
19. The system of claim 18, wherein the overlay object comprises a
vocabulary of the at least one on-image gesture determined as
corresponding to the at least one identified portion of the received
multimedia content element.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Application No. 61/763,505 filed on Feb. 12, 2013, the contents of
which are hereby incorporated by reference. This application is a
continuation-in-part (CIP) of U.S. patent application Ser. No.
13/685,182 filed on Nov. 26, 2012, now pending, which is a CIP
of:
[0002] (a) U.S. patent application Ser. No. 13/624,397 filed on
Sep. 21, 2012, now pending;
[0003] (b) U.S. patent application Ser. No. 13/344,400 filed on
Jan. 5, 2012, now pending, which is a continuation (CON) of U.S.
patent application Ser. No. 12/434,221, filed May 1, 2009, now U.S.
Pat. No. 8,112,376;
[0004] (c) U.S. patent application Ser. No. 12/084,150 having a
filing date of Apr. 7, 2009, now allowed, which is the National
Stage of International Application No. PCT/IL2006/001235, filed on
Oct. 26, 2006, which claims foreign priority from Israeli
Application No. 171577 filed on Oct. 26, 2005 and Israeli
Application No. 173409 filed on Jan. 29, 2006; and
[0005] (d) U.S. patent application Ser. No. 12/195,863, filed Aug.
21, 2008, now U.S. Pat. No. 8,326,775, which claims priority under
35 USC 119 from Israeli Application No. 185414, filed on Aug. 21,
2007, and which is also a continuation-in-part (CIP) of the
above-referenced U.S. patent application Ser. No. 12/084,150.
[0006] All of the applications referenced above are herein
incorporated by reference for all that they contain.
TECHNICAL FIELD
[0007] The present invention relates generally to the analysis of
multimedia content, and more specifically to a system for providing
content and links to content displayed as part of a web-page.
BACKGROUND
[0008] Web-pages are information resources that are suitable for
the World Wide Web (WWW) and can be accessed through a web browser.
Web-pages typically contain text and multimedia content elements
that are intended for display on a user's display device.
Multimedia content elements are generally displayed using portions
of code written in, for example, hyper-text mark-up language (HTML)
or JavaScript that is inserted into, or otherwise called up by
documents also written in HTML and which are sent to a user node
for display.
[0009] Multimedia content elements displayed in such web-pages are
usually non-interactive, thereby allowing users to view the
multimedia content elements, but not to connect with such
multimedia content. At most, the user is enabled to leave some
feedback regarding the multimedia content within the web-page.
Therefore, if a user wishes to receive information regarding an
item viewed in, for example, a video, further search efforts are
required.
SUMMARY
[0010] Certain embodiments disclosed herein include a method and
system for providing at least a link to a content item related to a
multimedia content element respective of an on-image gesture. The
method comprises receiving, from a user device, at least one on-image
gesture and the multimedia content element; analyzing the at least
one on-image gesture to determine at least one portion of the multimedia
content element in which the user is interested; generating at least
one signature for each of the at least one portion; determining a
content item corresponding to the at least one identified portion
of multimedia content, wherein the determination is based in part
on a type of the at least one on-image gesture; and modifying the
received multimedia content element to include at least a link to
an informative resource containing the content item.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The subject matter disclosed herein is particularly pointed
out and distinctly claimed in the claims at the conclusion of the
specification. The foregoing and other objects, features, and
advantages of the invention will be apparent from the following
detailed description taken in conjunction with the accompanying
drawings.
[0012] FIG. 1 is a schematic block diagram of a network system
utilized to describe the various embodiments.
[0013] FIG. 2 is a flowchart describing a process of matching an
advertisement to multimedia content displayed on a web-page
according to an embodiment.
[0014] FIG. 3 is a block diagram depicting the basic flow of
information in the signature generator system.
[0015] FIG. 4 is a diagram showing the flow of patches generation,
response vector generation, and signature generation in a
large-scale speech-to-text system.
[0016] FIG. 5 is a flowchart describing a process for adding a link
to multimedia content displayed on a web-page.
[0017] FIG. 6 is a flowchart describing a process for analyzing an
on-image gesture received by a user according to an embodiment.
DETAILED DESCRIPTION
[0018] It is important to note that the embodiments disclosed
herein are only examples of the many advantageous uses of the
innovative teachings herein. In general, statements made in the
specification of the present application do not necessarily limit
any of the various claimed inventions. Moreover, some statements
may apply to some inventive features but not to others. In general,
unless otherwise indicated, singular elements may be in plural and
vice versa with no loss of generality. In the drawings, like
numerals refer to like parts throughout the several views.
[0019] FIG. 1 shows an exemplary and non-limiting schematic diagram
of a network system 100 utilized to describe the disclosed
embodiments. A network 110 is used to communicate between different
parts of the system. The network 110 may be the Internet, the
world-wide-web (WWW), a local area network (LAN), a wide area
network (WAN), a metro area network (MAN), and other networks
capable of enabling communication between the elements of the
system 100.
[0020] Further connected to the network 110 are one or more client
applications, such as web browsers (WB) 120-1 through 120-n
(collectively referred to hereinafter as web browsers 120 or
individually as a web browser 120, merely for simplicity purposes).
A web browser 120 is executed over a computing device including,
for example, a personal computer (PC), a personal digital assistant
(PDA), a mobile phone, a smart phone, a tablet computer, a wearable
computing device, and other kinds of wired and mobile appliances,
equipped with browsing, viewing, listening, filtering, and managing
capabilities, etc., that are enabled as further discussed herein
below. Each of the web browsers 120 may be implemented as an
independent or plug-in application.
[0021] A server 130 is further connected to the network 110 and is
configured to perform, in part, the embodiments disclosed herein. A
request for the server 130 to analyze a multimedia content item
can be sent by a script, executed by a web-browser 120 within the
web-page, in response to the uploading of one or more multimedia
content items to the web-page. Such a request may include a URL of
the web-page or a copy of the web-page. The system 100 also
includes a signature generator system (SGS) 140. In one embodiment,
the SGS 140 is connected to the server 130. The server 130 is
enabled to receive and serve multimedia content and causes the SGS
140 to generate a signature respective of the multimedia content.
The process for generating the signatures for multimedia content is
explained in more detail herein below with respect to FIGS. 3 and
4.
[0022] It should be noted that each of the server 130 and the SGS
140 typically comprises a processing unit (not shown) such as a
processor, a CPU, and the like, that is coupled to a memory. The
memory contains instructions that can be executed by the processing
unit. The server 130 also includes an interface (not shown) to the
network 110. In one embodiment the server 130 is communicatively
connected or includes an array of Computational Cores configured as
discussed in more detail below.
[0023] A plurality of web servers 150-1 through 150-m are also
connected to the network 110, each of which is configured to
generate and send multimedia content items to the server 130. The
web servers 150-1 through 150-m typically, but not necessarily
exclusively, are resources for information that can be associated
with a multimedia content sent from a web browser 120. For example,
a web server 150-1 may host the Wikipedia website.
[0024] The system 100 may be configured to generate customized
channels of multimedia content. Accordingly, a web browser 120 or a
client channel manager application (not shown), available on either
the server 130, or the web browser 120, or as an independent or
plug-in application, may enable a user to create customized
channels of multimedia content by receiving selections made by a
user as inputs. Such customized channels of multimedia content are
personalized content channels that are generated in response to
selections made by a user of the web browser 120 or the client
channel manager application. The system 100, and in particular the
server 130 in conjunction with the SGS 140, determines which
multimedia content is more suitable to be viewed, played, or
otherwise utilized by the user with respect to a given channel,
based on the signatures of selected multimedia content. These
channels may optionally be shared with other users, used and/or
further developed cooperatively, and/or sold to other users or
providers, and so on. The process for defining, generating, and
customizing the channels of multimedia content are described in
greater detail in the co-pending Ser. No. 13/344,400 application
referenced above.
[0025] According to the embodiments disclosed herein, the server
130 is configured to carry out a process for providing a content
item or a link thereto to an information resource associated with
an input multimedia content element respective of an on-image gesture,
event, or combination thereof. The on-image gesture and/or event
are received from a web-browser 120. In response, the server 130
returns a modified web page including the multimedia content
element with the determined content item or a link thereto.
[0026] The on-image gesture or combination of gestures may include,
but are not limited to: one or more touch gestures, one or more
scrolls over the at least a portion of the multimedia content
element, one or more clicks over the at least a portion of the
multimedia content, one or more responses to the at least a portion
of the multimedia content, a combination thereof, a portion
thereof, and so on. The touch gestures may be related to computing
devices with a touch screen display and such gestures include, but
are not limited to, tapping on a content element, resizing a
content element, swiping over a content element, changing the
display orientation, and so on. In an embodiment, gestures detected
by the web-browser can be sent in combination with one or more
events. The event or combination of events may include, but are not
limited to, a predetermined period of time in which a user views or
interacts with the multimedia content element.
[0027] The server 130 is further configured to analyze the received
on-image gestures and/or events to determine at least one portion
of the received multimedia content element that is of particular
interest to the user. Then, the server 130 by means of the SGS 140
is configured to generate a signature for each identified portion.
Using the generated signatures and the type of the received
on-image gesture and/or event, a search for content items relevant
to the identified portion is performed. Thereafter, relevant
content items, or links thereto, can be added as an overlay to the
received multimedia content element displayed on a web-page.
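The server-side flow just described can be sketched as follows. This is only an illustrative outline; all names (`process_on_image_gesture`, the `gesture` fields, the `sign`/`match` callables) are hypothetical stand-ins for the server 130 and SGS 140, not part of the disclosed system:

```python
def process_on_image_gesture(element, gesture, sign, match):
    """Sketch of the flow above: the gesture localizes a portion of
    interest, a signature is generated for that portion, and content
    items matched by signature and gesture type come back as links."""
    x, y, w, h = gesture["region"]
    # Crop the portion of the element the user gestured over
    portion = [row[x:x + w] for row in element[y:y + h]]
    signature = sign(portion)
    links = match(signature, gesture["type"])
    # The relevant links would be overlaid on the original element
    return {"element": element, "overlay_links": links}

# Toy run: an 8x8 "image" whose pixel (x, y) has value x + y
grid = [[x + y for x in range(8)] for y in range(8)]
tap = {"region": (1, 1, 3, 3), "type": "tap"}
out = process_on_image_gesture(grid, tap,
                               sign=lambda p: sum(map(sum, p)),
                               match=lambda s, t: [f"item-for-{t}-{s}"])
```

Here `element` is a 2-D pixel grid, and the lambdas stand in for signature generation and the content-item search.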
[0028] A multimedia content element and content item may include,
for example, an image, a graphic, a video stream, a video clip, an
audio stream, an audio clip, a video frame, a photograph, and an
image of signals (e.g., spectrograms, phasograms, scalograms,
etc.), and/or combinations thereof and portions thereof.
[0029] It should be noted that the server 130 may analyze all or a
sub-set of the multimedia content elements contained in the
web-page. The SGS 140 generates at least one signature for portions
of each multimedia content element provided by the server 130. The
generated signature(s) may be robust to noise and distortion as
discussed below. Then, using the generated signature(s), the server
130 is capable of matching the signature of a web-page accessible
by a link to the multimedia content and providing the matched link.
Such links may be extracted from the data warehouse 160. For
example, if the signature of an image indicates the city of New
York, then a link to the municipal website of the city of New York
may be determined.
[0030] For instance, in order to provide a matching content item
for a sports car it may be desirable to locate a car of a
particular model. However, in most cases, the model of the car
would not be part of the metadata associated with the multimedia
content (image). Moreover, the car shown in an image may be
displayed at an angle that differs from the angle of a specific
photograph of the car that is available for use as a search item.
The signature generated for that image would enable accurate
recognition of the model of the car because the signatures
generated for the multimedia content elements, according to the
disclosed embodiments, allow for recognition and classification of
multimedia elements, such as by content-tracking, video filtering,
multimedia taxonomy generation, video fingerprinting,
speech-to-text, audio classification, element recognition,
video/image search and any other application requiring
content-based signatures generation and matching for large content
volumes such as web and other large-scale databases.
[0031] In one embodiment, the signatures generated for more than
one multimedia content element are clustered. The clustered
signatures are used to search for matching content items and to
select one or more of the matching content items. The one or more
selected matching content items are retrieved from the data
warehouse 160 and uploaded to the web-page on the web browser 120
by means of one of the web servers 150.
[0032] FIG. 2 depicts an exemplary and non-limiting flowchart 200
describing the process of matching an advertisement to a multimedia
content element displayed on a web-page. In S205, the method starts
when a web-page is uploaded to one of the web-browsers (e.g.,
web-browser 120-1). In S210, a request to match at least one
multimedia content element contained in the uploaded web-page to an
appropriate content item is received. The request can be received
from a web server (e.g., a server 150-1), a script running on the
uploaded web-page, or an agent (e.g., an add-on) installed in the
web-browser. S210 can also include extracting the multimedia
content elements and requesting that respective signatures be
generated.
[0033] In S220, a signature of the multimedia content element is
generated. The signature for the multimedia content element
generated by a signature generator is described below. In S230, an
advertisement item is matched to the multimedia content element
respective of its generated signature. In one embodiment, the
matching process includes searching for at least one advertisement
item with a matching signature respective of the signature of the
multimedia content and displaying the at least one advertisement
item within the display area of the web-page. In one embodiment,
the matching of an advertisement to a multimedia content element
can be performed by the computational cores that are part of a
large scale matching discussed in detail below.
[0034] In S240, upon a user's gesture, the matched advertisement
item is uploaded to the web-page and displayed therein. The user's
gesture may be a scroll over the multimedia content element, a tap
on the multimedia content element, and/or a response to the
multimedia content. This ensures that the user's attention is given
to the content item by providing the advertised content only when
the user has become interested in the multimedia content element.
In S250 it is checked whether there are additional requests to
analyze multimedia content elements and, if so, execution continues
with S210; otherwise, execution terminates.
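The S210 through S250 loop of FIG. 2 might be sketched as below; the callables are placeholders for the signature generator and the advertisement matcher, not the actual implementation:

```python
def serve_match_requests(requests, generate_signature, match_ad, display):
    """Sketch of FIG. 2: for each request (S210), sign the element
    (S220), match an advertisement by signature (S230), and display
    it only once a user gesture arrives (S240); continue until no
    requests remain (S250)."""
    shown = []
    for request in requests:
        signature = generate_signature(request["element"])   # S220
        ad = match_ad(signature)                             # S230
        if request.get("gesture"):                           # S240: gesture detected
            display(request["element"], ad)
            shown.append(ad)
    return shown

requests = [{"element": "sea-shore image", "gesture": "scroll"},
            {"element": "car image"}]                        # no gesture yet
shown = serve_match_requests(requests,
                             generate_signature=len,          # stand-in signer
                             match_ad=lambda s: f"ad-{s}",
                             display=lambda e, ad: None)
```

Note how the second request produces no display: the matched advertisement is withheld until the user gestures over the element, matching the behavior described in S240.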
[0035] As a non-limiting example, a user uploads a web-page that
contains an image of a sea shore. The image is then analyzed and a
signature is generated respective thereto. Respective of the image
signature, an advertisement item (e.g., a banner) is matched to the
image, for example, a swimsuit advertisement. Upon detection of a
user's gesture, for example, a mouse scrolling over the sea shore
image, the swimsuit ad is displayed.
[0036] The web-page may contain a number of multimedia content
elements; however, in some instances only a few advertisement items
may be displayed in the web-page. Accordingly, in one embodiment,
the signatures generated for the multimedia content elements are
clustered and the cluster of signatures is matched to one or more
advertisement items.
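One plausible way to cluster the generated signatures before matching is a greedy grouping like the sketch below. The clustering criterion is an assumption for illustration; the disclosure does not specify one:

```python
def cluster_signatures(signatures, same_cluster):
    """Greedy sketch of the clustering step: group signatures so one
    matching query can be issued per cluster instead of per element.
    `same_cluster` is a stand-in similarity predicate."""
    clusters = []
    for sig in signatures:
        for cluster in clusters:
            if same_cluster(cluster[0], sig):
                cluster.append(sig)
                break
        else:
            clusters.append([sig])
    return clusters

sigs = [0b1010, 0b1011, 0b0100, 0b1110]
# Toy predicate: signatures agreeing in their top two bits share a cluster
clusters = cluster_signatures(sigs, lambda a, b: (a >> 2) == (b >> 2))
```

Each resulting cluster would then be matched against advertisement items as a unit.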
[0037] FIGS. 3 and 4 illustrate the generation of signatures for
the multimedia content elements by the SGS 140 according to one
embodiment. An exemplary high-level description of the process for
large scale matching is depicted in FIG. 3. In this example, the
matching is for video content.
[0038] Video content segments 2 from a Master database (DB) 6 and a
Target DB 1 are processed in parallel by a large number of
independent computational Cores 3 that constitute an architecture
for generating the Signatures (hereinafter the "Architecture").
Further details on the computational Cores generation are provided
below. The independent Cores 3 generate a database of Robust
Signatures and Signatures 4 for Target content-segments 5 and a
database of Robust Signatures and Signatures 7 for Master
content-segments 8. An exemplary and non-limiting process of
signature generation for an audio component is shown in detail in
FIG. 4. Finally, Target Robust Signatures and/or Signatures are
effectively matched, by a matching algorithm 9, to Master Robust
Signatures and/or Signatures database to find all matches between
the two databases.
[0039] To demonstrate an example of the signature generation process,
it is assumed, merely for the sake of simplicity and without
limitation on the generality of the disclosed embodiments, that the
signatures are based on a single frame, leading to certain
simplification of the computational cores generation. The Matching
System is extensible for signatures generation capturing the
dynamics in-between the frames.
[0040] The Signatures' generation process will now be described
with reference to FIG. 4. The first step in the process of
signatures generation from a given speech-segment is to break down
the speech-segment into K patches 14 of random length P and random
position within the speech segment 12. The breakdown is performed
by the patch generator component 21. The value of the number of
patches K, random length P and random position parameters is
determined based on optimization, considering the tradeoff between
accuracy rate and the number of fast matches required in the flow
process of the server 130 and SGS 140. Thereafter, all the K
patches are injected in parallel into all computational Cores 3 to
generate K response vectors 22, which are fed into a signature
generator system 23 to produce a database of Robust Signatures and
Signatures 4.
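The patch-breakdown step performed by the patch generator 21 can be sketched as follows; the segment length and the K, P ranges are arbitrary illustration values, since the disclosure leaves them to optimization:

```python
import random

def generate_patches(segment, k, p_min, p_max, seed=None):
    """Break a segment into k patches of random length (between p_min
    and p_max) and random position, as described for the patch
    generator above."""
    rng = random.Random(seed)
    patches = []
    for _ in range(k):
        length = rng.randint(p_min, min(p_max, len(segment)))
        start = rng.randint(0, len(segment) - length)
        patches.append(segment[start:start + length])
    return patches

speech = list(range(1000))            # stand-in for a speech segment's samples
patches = generate_patches(speech, k=16, p_min=50, p_max=200, seed=7)
```

In the described system, each of these K patches would then be injected in parallel into the computational cores.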
[0041] In order to generate Robust Signatures, i.e., Signatures
that are robust to additive noise L (where L is an integer equal to
or greater than 1) by the Computational Cores 3, a frame `i` is
injected into all the Cores 3. Then, Cores 3 generate two binary
response vectors: {right arrow over (S)} which is a Signature
vector, and {right arrow over (RS)} which is a Robust Signature
vector.
[0042] For generation of signatures robust to additive noise, such
as White-Gaussian-Noise, scratch, etc., but not robust to
distortions, such as crop, shift and rotation, etc., a core
C.sub.i={n.sub.i} (1.ltoreq.i.ltoreq.L) may consist of a single
leaky integrate-to-threshold unit (LTU) node or more nodes. The
node n.sub.i equations are:
V_i = Σ_j w_ij k_j
n_i = θ(V_i − Th_x)

[0043] where θ is a Heaviside step function; w_ij is a coupling
node unit (CNU) between node i and image component j; k_j is an
image component j (for example, the grayscale value of a certain
pixel j); Th_x is a constant threshold value, where x is `S` for
Signature and `RS` for Robust Signature; and V_i is a Coupling Node
Value.
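The node equations can be illustrated numerically as below. The weights, image components, and threshold values are arbitrary toy numbers, not parameters from the disclosure; the only property carried over is that Th_RS is set above Th_S:

```python
def node_responses(weights, components, th_s, th_rs):
    """Compute V_i = sum_j w_ij * k_j for each node i, then threshold
    twice: Th_S gives the Signature bit n_i, and the higher Th_RS
    gives the Robust Signature bit."""
    s, rs = [], []
    for row in weights:
        v = sum(w * k for w, k in zip(row, components))   # V_i
        s.append(1 if v > th_s else 0)                    # theta(V_i - Th_S)
        rs.append(1 if v > th_rs else 0)                  # theta(V_i - Th_RS)
    return s, rs

# Toy core: 4 nodes coupled to 3 image components
W = [[0.9, 0.1, 0.3],
     [-0.5, 0.2, 0.1],
     [1.2, 1.1, 0.8],
     [0.05, 0.02, 0.01]]
k = [0.6, 0.4, 0.9]          # e.g., grayscale values
S, RS = node_responses(W, k, th_s=0.5, th_rs=1.5)
```

Because Th_RS > Th_S, every node that fires in the Robust Signature necessarily also fires in the Signature.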
[0044] The Threshold values Thx are set differently for Signature
generation and for Robust Signature generation. For example, for a
certain distribution of Vi values (for the set of nodes), the
thresholds for Signature (Th.sub.S) and Robust Signature
(Th.sub.RS) are set apart, after optimization, according to at
least one or more of the following criteria:
[0045] 1: For: V.sub.i>Th.sub.RS
1-p(V>Th.sub.S)-1-(1-.epsilon.).sup.l<<1
[0046] i.e., given that/nodes (cores) constitute a Robust Signature
of a certain image I, the probability that not all of these I nodes
will belong to the Signature of same, but noisy image, - is
sufficiently low (according to a system's specified accuracy).
[0047] 2: $p(V_i > Th_{RS}) \approx l/L$

i.e., approximately $l$ out of the total $L$ nodes can be found to
generate a Robust Signature according to the above definition.
[0048] 3: Both Robust Signature and Signature are generated for a
certain frame i.
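Criterion 2 (approximately $l$ of the $L$ node values exceed $Th_{RS}$) suggests a simple way to place the robust threshold, sketched below under the assumption that the $V_i$ values are available; the helper name is hypothetical.

```python
def pick_robust_threshold(node_values, l):
    """Return a threshold Th_RS exceeded by roughly the l largest of
    the L node values, so that p(V_i > Th_RS) is approximately l/L."""
    ordered = sorted(node_values, reverse=True)
    # Place the threshold just below the l-th largest value.
    return ordered[l - 1] - 1e-9
```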
[0049] It should be understood that the generation of a signature
is unidirectional, and typically yields lossless compression, where
the characteristics of the compressed data are maintained but the
uncompressed data cannot be reconstructed. Therefore, a signature
can be used for the purpose of comparison to another signature
without the need of comparison to the original data. A detailed
description of the Signature generation can be found in U.S. Pat.
Nos. 8,326,775 and 8,312,031, assigned to the common assignee, which are
hereby incorporated by reference for all the useful information
they contain.
[0050] A Computational Core generation is a process of definition,
selection, and tuning of the parameters of the cores for a certain
realization in a specific system and application. The process is
based on several design considerations, such as:
[0051] (a) The Cores should be designed so as to obtain maximal
independence, i.e., the projection from a signal space should
generate a maximal pair-wise distance between any two cores'
projections into a high-dimensional space.
[0052] (b) The Cores should be optimally designed for the type of
signals, i.e., the Cores should be maximally sensitive to the
spatio-temporal structure of the injected signal, for example, and
in particular, sensitive to local correlations in time and space.
Thus, in some cases a core represents a dynamic system, such as in
state space, phase space, edge of chaos, etc., which is uniquely
used herein to exploit its maximal computational power.
[0053] (c) The Cores should be optimally designed with regard to
invariance to a set of signal distortions, of interest in relevant
applications. Detailed description of the Computational Core
generation, the computational architecture, and the process for
configuring such cores is discussed in more detail in the
co-pending U.S. patent application Ser. No. 12/084,150 referenced
above.
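Design consideration (a) can be checked numerically: a candidate core set is "more independent" when the minimum pair-wise distance between its projections is larger. The sketch below assumes Euclidean distance, which the text does not mandate.

```python
import math

def min_pairwise_distance(projections):
    """Minimum Euclidean distance between any two cores' projections;
    core selection would favor parameter sets maximizing this value."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(projections)
        for q in projections[i + 1:]
    )
```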
[0054] FIG. 5 depicts an exemplary and non-limiting flowchart 500
describing the process of adding an overlay to multimedia content
displayed on a web-page. In S510, the method starts when a web-page
is uploaded to a web-browser (e.g., web-browser 120-1). In another
embodiment, the method starts when a web-server (e.g., web-server
150-1) receives a request to host the requested web-page. In S515,
the server 130 receives the uniform resource locator (URL) of the
uploaded web-page. In another embodiment, the uploaded web-page
includes an embedded script. The script extracts the URL of the
web-page, and sends the URL to the server 130. In another
embodiment, an add-on installed in the web-browser 120 extracts the
URL of the uploaded web-page, and sends the URL to the server 130.
In yet another embodiment, an agent is installed on a user device
executing the web browser 120. The agent is configured to monitor
web-pages as they are uploaded, extract their URLs, and send the
URLs to the server 130. In another embodiment, a web-server (e.g.,
server 150) hosting the requested web-page provides the server 130
with the URL of the requested web-page. It should be noted that only
URLs of selected web sites can be sent to the server 130, for
example, URLs related to web-sites that paid for the additional
information.
[0055] In S520, the server downloads the web-page respective of
each received URL. In S525, the server 130 analyzes the web-page in
order to identify the existence of at least one or more multimedia
content elements in the uploaded web-page. It should be understood
that a multimedia content, such as an image or a video, may include
a plurality of multimedia content elements. In S530, the SGS 140
generates at least one signature for each multimedia content
element identified by the server 130. The signatures for the
multimedia elements are generated as described in greater detail
above.
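The scan of S525 for multimedia content elements can be sketched with the standard library's HTML parser. Treating only `img` and `video` tags as multimedia content elements is a simplifying assumption for illustration, not the server 130's actual analysis.

```python
from html.parser import HTMLParser

class MultimediaFinder(HTMLParser):
    """Collect (tag, src) pairs for multimedia elements in a page."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag in ("img", "video"):
            self.elements.append((tag, dict(attrs).get("src")))

finder = MultimediaFinder()
finder.feed('<img src="player.jpg"><video src="clip.mp4"></video>')
```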
[0056] In S540, respective of each signature, the server 130
determines one or more links to content that exists on a web
server, for example, each of the web servers 150-1 through 150-m
that can be associated with the multimedia element. A link may be a
hyperlink, a URL, and the like. The content accessed through the
link may be, for example, informative web-pages such as a
Wikipedia.RTM. article. The determination of the link may be made
by identification of the context of the signatures generated by the
server 130. For example, if a multimedia content element was
identified as a football player, a signature is generated
respective thereto, and a link to a sport website that contains
information about the football player is determined. In S550, the
determined link to the content is added as an overlay to the
web-page by the server 130, respective of the corresponding
multimedia content element. According to one embodiment, a link
that contains the overlay may be provided to a web browser
respective of a user's gesture. A user's gesture may be, for
example, a click on the multimedia content element through, for
example, a computer mouse, a touch pad, or a touch screen; and/or a
response to the multimedia content (e.g., movement detected by a
motion sensor, noise detected by a microphone, etc.).
[0057] The modified web-page that includes at least one multimedia
element with the added link can be sent directly to the web browser
(e.g., browser 120-1) requesting the web-page. This requires
establishing a data session between the server 130 and the web
browsers 120. In another embodiment, the multimedia element
including the added link is returned to a web server (e.g., server
150-1) hosting the requested web-page. The web server (e.g., server
150-1) subsequently returns the requested web-page with the
multimedia element containing the added link to the web browser
(e.g., browser 120-1) requesting the web-page. Once the "modified"
web page is displayed over the web browser, a detected event or
user's gesture with respect to the multimedia content element would
cause the browser to upload the content (e.g., a Wikipedia.RTM.
article web page) addressed by the link added to the multimedia
element.
[0058] In S560, it is checked whether the one or more multimedia
content elements contained in the web-page has changed, and if so,
execution continues with S525; otherwise, execution terminates.
[0059] Different portions of the multimedia content element may be
associated with different server content or links to server
content. As a non-limiting example, a web-page related to cinema is
uploaded and an image of the movie "Pretty Woman" showing actor
Richard Gere and actress Julia Roberts is identified within the
web-page by the server 130. A signature is generated by the SGS 140
respective of the actor Richard Gere and the actress Julia Roberts,
both shown as portions of the image. A link to Richard Gere's
biography on the Wikipedia.RTM. website and a link to Julia
Roberts' biography on the Wikipedia.RTM. website are then
determined respective of the signatures and the context of the
signatures as further described herein above. The context of the
signatures according to this example may be "American Movie
Actors."
[0060] An overlay containing the links to Richard Gere's biography
on the Wikipedia.RTM. website and Julia Roberts' biography on the
Wikipedia.RTM. website is added over the image such that upon
detection of a specified event or a user's gesture, for example, a
gesture wherein a mouse clicks on the part of the image where
Richard Gere is shown, the link to Richard Gere's biography on
Wikipedia.RTM. is provided to the user.
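The per-portion overlay of this example can be modeled as a list of image regions, each mapped to its link, with a click resolved by hit-testing. The region coordinates and helper name below are invented for illustration.

```python
def link_for_click(regions, x, y):
    """regions: list of ((x0, y0, x1, y1), url) pairs.
    Return the url whose region contains the click, else None."""
    for (x0, y0, x1, y1), url in regions:
        if x0 <= x < x1 and y0 <= y < y1:
            return url
    return None

# Hypothetical regions for the "Pretty Woman" image example.
movie_image_regions = [
    ((0, 0, 200, 400), "https://en.wikipedia.org/wiki/Richard_Gere"),
    ((200, 0, 400, 400), "https://en.wikipedia.org/wiki/Julia_Roberts"),
]
```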
[0061] According to another embodiment, a request for a URL of a
web-page that contains an embedded video clip is received. The
video content within the requested web-page is analyzed and a
signature is generated respective of the entertainer Madonna that
is shown in the video content. A link to Madonna's official
web-page hosted on a web-server 150-n is then determined respective
of the signature as further described herein above. An overlay
containing the link to Madonna's official web-page is then added
over the video content. The web-page together with the link to
Madonna's official web-page is then sent to the web server 150-1.
Then, the requested web-page with the modified video element is
uploaded to the web-browser 120-1.
[0062] The web-page may contain a number of multimedia content
elements; however, in some instances only a few links may be
displayed in the web-page. Accordingly, in one embodiment, the
signatures generated for the multimedia content elements are
clustered and the cluster of signatures is matched to one or more
content items.
[0063] FIG. 6 depicts an exemplary and non-limiting flowchart 600
describing a method of analyzing an on-image gesture received by a
user device and providing a content item respective thereof
according to an embodiment. The method can be performed by the
server 130 using the SGS 140.
[0064] In S610, the method starts when at least a portion of a
multimedia content element from a web-page as well as at least one
gesture, event, or combination thereof, is received. The on-image
gestures and the multimedia content are captured and sent by a
web-browser (e.g., WB 120-1) executed over a user device. In an
embodiment, a URL of the web-page and an identifier of the
multimedia content associated with the detected gesture and/or
event is provided. On-image gestures may include, but are not
limited to: one or more touch gestures, one or more scrolls over
the at least a portion of the multimedia content element, one or
more clicks over the at least a portion of the multimedia content,
one or more responses to the at least a portion of the multimedia
content, a combination thereof, a portion thereof, and so on. The
touch gestures may be related to computing devices with a touch
screen display and such gestures may include, but are not limited
to, tapping on a content element, resizing a content element,
swiping over a content element, changing the display orientation,
and so on. In an embodiment gestures detected by the web-browser
can be sent together with one or more events. Alternatively, the
web-browser 120 can send only events related to the interaction of
a user with the content element. Events may include, but are not
limited to, a predetermined period of time in which a user views or
interacts with the multimedia content element.
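The data received in S610 might be serialized by the web-browser along the following lines; every field name here is an assumption, since no wire format is defined at this level of detail.

```python
import json

# Hypothetical payload sent by the browser in S610.
payload = {
    "url": "http://example.com/cinema",                 # web-page URL
    "element_id": "img-17",                             # element identifier
    "gestures": ["swipe"],                              # on-image gestures
    "events": [{"type": "view_time", "seconds": 42}],   # detected events
}
encoded = json.dumps(payload)
```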
[0065] In S620, the received gestures and/or events are analyzed to
determine at least one portion of the received multimedia content
element that is of particular interest to the user. As a
non-limiting example, if a multimedia content element is an image
featuring a man and a boat, and the user zooms in on the boat (an
event of expanding a part of a screen that demonstrates an interest
in the particular portion of the image that is expanded), the boat
is determined to be the portion of the multimedia content element
that is of particular interest to the user.
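The zoom example of S620 can be sketched as clipping the zoomed viewport to the image bounds: the clipped rectangle (the boat, in the example) is taken as the portion of interest. Names and coordinates are illustrative only.

```python
def portion_of_interest(image_size, zoom_rect):
    """Clip a zoomed viewport rectangle to the image bounds; the
    result is the image portion presumed to interest the user."""
    w, h = image_size
    x0, y0, x1, y1 = zoom_rect
    return (max(0, x0), max(0, y0), min(w, x1), min(h, y1))
```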
[0066] In S630, at least one signature is generated for each
portion of the multimedia content element identified in S620. The
signatures for the multimedia content elements are generated as
described in greater detail above.
[0067] In S640, respective of the at least one signature of each
portion of the multimedia content element, the received on-image
gestures and/or events corresponding to the at least a portion of
the multimedia content element are determined. Each different
gesture, event, set of gestures, set of events, and combinations
thereof, received from a user can be differentiated and associated
with different links or content from a server. As an example, a
click on the at least a portion of the multimedia content may be
determined as a first gesture associated with, e.g., a link to a
Wikipedia.RTM. article, and a double click on the at least a
portion of the multimedia content may be determined as a different
gesture associated with push data being delivered to the user. In
an embodiment, a preconfigured table providing a mapping between a
type of gesture, event, and a combination of gesture and event to
the type of content item and its delivery method is saved in the
data warehouse 160 and is accessible by the server 130.
Furthermore, one of ordinary skill should appreciate that a
multimedia content element can be a graphic, a video stream, a video
clip, an audio stream, an audio clip, a video frame, or a photograph.
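The preconfigured mapping table of S640 can be modeled as a dictionary keyed by (gesture, event); the entries below are invented examples of the kind of mapping the data warehouse 160 might hold, not the patent's own table.

```python
# Hypothetical mapping: (gesture, event) -> (content item type, delivery).
GESTURE_CONTENT_MAP = {
    ("click", None): ("informative_link", "overlay"),
    ("double_click", None): ("push_data", "direct"),
    ("swipe", "viewed_30s"): ("purchase_link", "overlay"),
}

def content_for(gesture, event=None):
    """Look up the content item type and delivery method, or None."""
    return GESTURE_CONTENT_MAP.get((gesture, event))
```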
[0068] In S650, respective of each signature for the portion of the
multimedia content element and corresponding gestures and/or
events, a search is performed for content items that can be
associated with the multimedia element respective of the gestures
and/or events. This determination may be performed by matching
signatures generated for the portion of the multimedia content
element with potential content items. The search for such content
items is performed using a data warehouse 160 by the web servers
150. A content item is determined to be related to the portion of
multimedia content element when their respective signatures (as
generated by the SGS 140) match. The signature matching process is
described in more detail above. In an exemplary embodiment, when
two signatures overlap by more than a predetermined threshold level,
for example when 60% of the signature matches, these signatures may
be considered as matching.
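The 60% example can be sketched as an overlap fraction between equal-length binary signatures; the bitwise-agreement measure below is an assumption, as the overlap computation is not specified here.

```python
def signatures_match(sig_a, sig_b, threshold=0.6):
    """Match two equal-length binary signatures when the fraction of
    agreeing positions exceeds the threshold (0.6 = the 60% example)."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have equal length")
    overlap = sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
    return overlap > threshold
```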
[0069] In an embodiment, the search for relevant content items is
not limited to the data warehouse. The search can be performed
using signatures generated by the SGS 140 and the identified
context in data sources that index searchable content including,
but not limited to, multimedia content items using signatures and
concepts. A context is determined as the correlation between a
plurality of concepts. An example for such indexing techniques
using signatures is disclosed in a co-pending U.S. patent
application Ser. No. 13/766,463, filed Feb. 13, 2013, entitled "A
SYSTEM AND METHODS FOR GENERATION OF A CONCEPT BASED DATABASE",
assigned to common assignee, and is hereby incorporated by
reference for all the useful information it contains.
[0070] In one embodiment, the signatures generated for more than
one unstructured data element are clustered. The clustered
signatures are used to search for a common concept. The concept is
a collection of signatures representing elements of the
unstructured data and metadata describing the concept. As a
non-limiting example, a `Superman concept` is a signature-reduced
cluster of signatures describing elements (such as multimedia
elements) related to, e.g., a Superman cartoon, together with a set
of metadata providing a textual representation of the Superman
concept. Techniques for generating concepts and concept structures
are also described in the co-pending U.S. patent application Ser.
No. 12/603,123 (hereinafter the '123 Application) to Raichelgauz et
al., which is assigned to common assignee, and is incorporated
hereby by reference for all that it contains.
[0071] In S660, the determined related content or a link to the
determined content is added as an overlay to the web-page
respective of the corresponding multimedia content element and the
corresponding gestures and/or events. According to one embodiment
(not shown), a vocabulary of the determined gestures and/or events
may be provided as part of the overlay. Such vocabulary may
include, but is not limited to, one or more gestures and/or events,
and a description of the corresponding server content or links to
server content that will be provided upon occurrence of the one or
more gestures and/or events.
[0072] In an embodiment, the modified web-page that includes at
least one multimedia element with the added link can be sent
directly to the web browser (e.g., browser 120-1) requesting the
web-page. This requires establishing a data session between the
server 130 and the web browsers 120. In another embodiment, the
multimedia element including the added link is returned to a web
server (e.g., server 150-1) hosting the requested web-page. The web
server (e.g., server 150-1) returns the requested web-page with the
multimedia element containing the added link to the web browser
(e.g., browser 120-1) requesting the web-page. Once the "modified"
web page is displayed over the web browser, a detected user's
gesture and/or the occurrence of an event over the multimedia
element would cause the browser to upload the content addressed by
the link added to the multimedia element.
[0073] In S670, it is checked whether one or more gestures and/or
events have occurred and, if so, execution continues with S610;
otherwise, execution terminates.
[0074] As another non-limiting example, a touch gesture associated
with a question mark as per the vocabulary may provide an
informative link, and a touch gesture associated with an
exclamation mark as per the vocabulary may provide a link in which
the user will be able to respond to the image by, e.g., leaving a
written comment regarding the image.
[0075] As a further non-limiting example, the multimedia content
element may be a video clip of a music video of a particular song.
Additionally, the video clip may have content item related to
purchasing the song, and the link to this server content may be
related to the combination of the event that a user views the video
for at least 30 seconds and the gesture of swiping on a touch
screen. If a user proceeds to view the clip for one minute then
swipes the touch screen, the user will be provided with a link to a
website that would allow the user to purchase the song.
[0076] The various embodiments disclosed herein can be implemented
as hardware, firmware, software, or any combination thereof.
Moreover, the software is preferably implemented as an application
program tangibly embodied on a program storage unit or computer
readable medium consisting of parts, or of certain devices and/or a
combination of devices. The application program may be uploaded to,
and executed by, a machine comprising any suitable architecture.
Preferably, the machine is implemented on a computer platform
having hardware such as one or more central processing units
("CPUs"), a memory, and input/output interfaces. The computer
platform may also include an operating system and microinstruction
code. The various processes and functions described herein may be
either part of the microinstruction code or part of the application
program, or any combination thereof, which may be executed by a
CPU, whether or not such a computer or processor is explicitly
shown. In addition, various other peripheral units may be connected
to the computer platform such as an additional data storage unit
and a printing unit. Furthermore, a non-transitory computer
readable medium is any computer readable medium except for a
transitory propagating signal.
[0077] All examples and conditional language recited herein are
intended for pedagogical purposes to aid the reader in
understanding the principles of the invention and the concepts
contributed by the inventor to furthering the art, and are to be
construed as being without limitation to such specifically recited
examples and conditions. Moreover, all statements herein reciting
principles, aspects, and embodiments of the invention, as well as
specific examples thereof, are intended to encompass both
structural and functional equivalents thereof. Additionally, it is
intended that such equivalents include both currently known
equivalents as well as equivalents developed in the future, i.e.,
any elements developed that perform the same function, regardless
of structure.
* * * * *