U.S. patent application number 14/638176 was filed with the patent office on 2015-03-04 and published on 2015-07-16 for system and method for identifying a correct orientation of a multimedia content item.
This patent application is currently assigned to CORTICA, LTD. The applicant listed for this patent is CORTICA, LTD. Invention is credited to Karina Odinaev, Igal Raichelgauz, Yehoshua Y. Zeevi.
Application Number | 20150199355 14/638176 |
Document ID | / |
Family ID | 54065620 |
Filed Date | 2015-03-04 |
United States Patent Application | 20150199355 |
Kind Code | A1 |
Raichelgauz; Igal ; et al. | July 16, 2015 |
SYSTEM AND METHOD FOR IDENTIFYING A CORRECT ORIENTATION OF A
MULTIMEDIA CONTENT ITEM
Abstract
A method and system for identifying a correct orientation of a
multimedia content item are presented. The method includes
receiving from a user device the multimedia content item;
identifying at least one object shown in the multimedia content
item; generating by a signature generator system (SGS) at least one
signature for the at least one object shown in the multimedia
content item; querying, using the at least one generated signature,
a deep-content-classification (DCC) system to find at least one
concept that matches the at least one object, wherein the querying
of the DCC system is performed using the at least one signature
generated for each object shown in the multimedia content item;
determining a correct orientation of the at least one matching
concept; and comparing an orientation of the at least one object to
the determined correct orientation to determine if the at least one
object is correctly oriented.
Inventors: | Raichelgauz; Igal (New York, NY); Odinaev; Karina (New York, NY); Zeevi; Yehoshua Y. (Haifa, IL) |
Applicant: | Name | City | State | Country | Type |
 | CORTICA, LTD. | TEL AVIV | | IL | |
Assignee: | CORTICA, LTD., TEL AVIV, IL |
Family ID: | 54065620 |
Appl. No.: | 14/638176 |
Filed: | March 4, 2015 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14096865 | Dec 4, 2013 | |
14638176 | | |
13624397 | Sep 21, 2012 | |
14096865 | | |
13344400 | Jan 5, 2012 | 8959037 |
13624397 | | |
12434221 | May 1, 2009 | 8112376 |
13344400 | | |
12195863 | Aug 21, 2008 | 8326775 |
13624397 | | |
12084150 | Apr 7, 2009 | 8655801 |
12195863 | | |
12084150 | Apr 7, 2009 | 8655801 |
PCT/IL2006/001235 | Oct 26, 2006 | |
13624397 | | |
62030086 | Jul 29, 2014 | |
61890251 | Oct 13, 2013 | |
Current U.S.
Class: |
707/770 |
Current CPC
Class: |
G06N 5/04 20130101; H04N
21/26603 20130101; H04N 21/278 20130101; Y10S 707/99943 20130101;
G06F 16/51 20190101; G06K 9/6267 20130101; G06K 9/00758 20130101;
G06T 19/006 20130101; H04H 60/66 20130101; H04L 67/306 20130101;
G06F 16/14 20190101; G06F 16/2228 20190101; G06F 16/40 20190101;
H04H 60/46 20130101; H04L 67/327 20130101; G06F 16/48 20190101;
G06N 7/005 20130101; H04H 60/33 20130101; G06F 16/434 20190101;
G06F 16/7844 20190101; G06F 16/783 20190101; G06F 16/435 20190101;
G06F 16/438 20190101; G06F 16/4393 20190101; G06F 16/7834 20190101;
H04L 65/601 20130101; G06F 16/7847 20190101; G06Q 30/0201 20130101;
H04H 60/58 20130101; H04N 21/23418 20130101; G06F 16/152 20190101;
H04H 20/103 20130101; H04H 20/93 20130101; H04H 2201/90 20130101;
G06F 16/685 20190101; G09B 19/0092 20130101; H04N 21/466 20130101;
G06N 5/025 20130101; G06Q 30/0246 20130101; H04N 7/17318 20130101;
G06K 2209/27 20130101; G06F 16/433 20190101; G06F 16/487 20190101;
G06N 3/088 20130101; H04N 21/25891 20130101; H04H 60/56 20130101;
G06F 16/285 20190101; G06F 16/951 20190101; G10L 25/51 20130101;
Y10S 707/99948 20130101; G06N 5/022 20130101; G10L 15/26 20130101;
H04N 21/2668 20130101; G06F 16/904 20190101; G06K 9/00744 20130101;
G06Q 30/0261 20130101; G06F 40/134 20200101; H04L 67/22 20130101;
G06F 16/284 20190101; H04H 20/26 20130101; H04H 60/71 20130101;
G06F 16/9558 20190101; H04L 65/605 20130101; G06F 16/41 20190101;
H04H 60/49 20130101; G06F 3/0484 20130101; G06F 3/0488 20130101;
G06F 16/1748 20190101; G06F 16/35 20190101; G06K 9/00281 20130101;
G10L 15/32 20130101; H04H 60/59 20130101; G06F 16/43 20190101; G06F
16/683 20190101; G06F 16/172 20190101; G06K 9/00711 20130101; H04N
21/8106 20130101; G06F 3/048 20130101; G06N 3/0481 20130101; H04H
60/37 20130101; G06N 20/00 20190101; H04L 67/10 20130101; G06N
3/0454 20130101; G06N 5/02 20130101 |
International
Class: |
G06F 17/30 20060101
G06F017/30 |
Foreign Application Data
Date | Code | Application Number |
Oct 26, 2005 | IL | 171577 |
Jan 29, 2006 | IL | 173409 |
Aug 21, 2007 | IL | 185414 |
Claims
1. A method for identifying a correct orientation of a multimedia
content item, the method comprising: receiving from a user device
the multimedia content item; identifying at least one object shown
in the multimedia content item; generating by a signature generator
system (SGS) at least one signature for the at least one object
shown in the multimedia content item; querying, using the at least
one generated signature, a deep-content-classification (DCC) system
to find at least one concept that matches the at least one object,
wherein the querying of the DCC system is performed using the at
least one signature generated for each object shown in the
multimedia content item; determining a correct orientation of the
at least one matching concept; and comparing an orientation of the
at least one object to the determined correct orientation to
determine if the at least one object is correctly oriented.
2. The method of claim 1, wherein the at least one signature
generated for each object shown in the multimedia content item is
robust to noise and distortion.
3. The method of claim 1, wherein the received multimedia content
item is any of: an image, a graphic, and a photograph.
4. The method of claim 1, wherein the orientation of the at least
one object is determined respective of the spatial location of the
at least one signature generated for the at least one object.
5. The method of claim 1, further comprising: upon identifying that
the at least one object is incorrectly oriented, rotating the
multimedia content item until the at least one object is in the
correct orientation; and displaying on the user device the correct
multimedia content item.
6. The method of claim 1, wherein the at least one concept is a
collection of signatures representing the at least one object and
metadata describing the at least one concept, the collection is of
a signature reduced cluster generated by inter-matching signatures
generated for the plurality of objects shown in the multimedia
content item, and the at least one matching concept is represented
using at least one signature.
7. The method of claim 6, wherein the at least one concept is
determined to match the at least one object when a signature of the
at least one concept matches the at least one generated signature
for the at least one object over a predefined threshold.
8. The method of claim 1, wherein upon identification of at least
one matching concept, the at least one signature of the at least
one matching concept is returned.
9. A non-transitory computer readable medium having stored thereon
instructions for causing one or more processing units to execute
the method according to claim 1.
10. A system for identifying a correct orientation of a multimedia
content item, the system comprises: an interface to a network for
receiving the multimedia content item; a processing unit; and a
memory communicatively connected to the processing unit, wherein
the memory contains instructions that, when executed by the
processing unit, configure the system to: receive from a user
device the multimedia content item; identify at least one object
shown in the multimedia content item; generate by a signature
generator system (SGS) at least one signature for the at least one
object shown in the multimedia content item; query, using the at
least one generated signature, a deep-content-classification (DCC)
system to find at least one concept that matches the at least one
object, wherein the querying of the DCC system is performed using
the at least one signature generated for each object shown in the
multimedia content item; determine a correct orientation of the at
least one matching concept; and compare an orientation of the at
least one object to the determined correct orientation to determine
if the at least one object is correctly oriented.
11. The system of claim 10, wherein the at least one signature
generated for each object shown in the multimedia content item is
robust to noise and distortion.
12. The system of claim 10, wherein the received multimedia content
item is any of: an image, a graphic, and a photograph.
13. The system of claim 10, wherein the orientation of the at least
one object is determined respective of the spatial location of the
at least one signature generated for the at least one object.
14. The system of claim 10, further configured to: upon identifying
that the at least one object is incorrectly oriented, rotate the
multimedia content item until the at least one object is in the
correct orientation; and display on the user device the correct
multimedia content item.
15. The system of claim 10, wherein the at least one concept is a
collection of signatures representing the at least one object and
metadata describing the at least one concept, the collection is of
a signature reduced cluster generated by inter-matching signatures
generated for the plurality of objects shown in the multimedia
content item, and the at least one matching concept is represented
using at least one signature.
16. The system of claim 15, wherein the at least one concept is
determined to match the at least one object when a signature of the
at least one concept matches the at least one generated signature
for the at least one object over a predefined threshold.
17. The system of claim 10, wherein upon identification of at least
one matching concept, the at least one signature of the at least
one matching concept is returned.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. provisional
application No. 62/030,086 filed on Jul. 29, 2014. This application
is also a continuation-in-part (CIP) of U.S. patent application
Ser. No. 14/096,865 filed Dec. 4, 2013, now pending, which claims
the benefit of U.S. provisional application No. 61/890,251 filed
Oct. 13, 2013. The Ser. No. 14/096,865 application is a
continuation-in-part (CIP) of U.S. patent application Ser. No.
13/624,397 filed on Sep. 21, 2012, now pending. The Ser. No.
13/624,397 application is a CIP of:
[0002] (a) U.S. patent application Ser. No. 13/344,400 filed on
Jan. 5, 2012, now U.S. Pat. No. 8,959,037, which is a continuation
of U.S. patent application Ser. No. 12/434,221, filed May 1, 2009,
now U.S. Pat. No. 8,112,376;
[0003] (b) U.S. patent application Ser. No. 12/195,863, filed
Aug. 21, 2008, now U.S. Pat. No. 8,326,775, which claims priority
under 35 USC 119 from Israeli Application No. 185414, filed on
Aug. 21, 2007, and which is also a continuation-in-part of the
below-referenced U.S. patent application Ser. No. 12/084,150; and
[0004] (c) U.S. patent application Ser. No. 12/084,150 having a
filing date of Apr. 7, 2009, now U.S. Pat. No. 8,655,801, which is
the National Stage of International Application No.
PCT/IL2006/001235, filed on Oct. 26, 2006, which claims foreign
priority from Israeli Application No. 171577 filed on Oct. 26, 2005
and Israeli Application No. 173409 filed on Jan. 29, 2006.
[0005] All of the applications referenced above are herein
incorporated by reference for all that they contain.
TECHNICAL FIELD
[0006] The present invention relates generally to the analysis of
multimedia content items, and more specifically to techniques for
identifying a correct orientation of a multimedia content item.
BACKGROUND
[0007] Computing devices, such as mobile devices, tablets,
smartphones, and the like, frequently include an orientation
sensor that indicates the orientation of the computing devices with
respect to a reference point, such as gravitational pull or other
orientation references. Current applications executed on these
computing devices use the orientation information of the computing
devices to adjust functions of each computing device. For example,
such applications are configured to rotate a multimedia content
item displayed on a user interface of a mobile device based on the
orientation of the mobile device.
[0008] The problem with such applications is that the multimedia
content item is not analyzed before it is displayed on the user
interface. Thus, in a case where the orientation of the multimedia
content item (e.g., an image), or a portion of it (e.g., an object
shown in the image), is incorrect in the first place, the image
will be displayed in an incorrect orientation on the user interface
despite the orientation sensor with which the mobile device is
equipped. For example, in a case where an image is captured by a
mobile device at a tilted camera angle, the view of the image (as
captured) is neither vertical nor horizontal relative to the
ground; hence, using the existing orientation sensor to rotate the
image will not solve the problem.
[0009] It would therefore be advantageous to provide an efficient
solution to analyze multimedia content items. It would be further
advantageous if such a solution would enable identification of a
correct orientation of an object shown in the multimedia content
item.
SUMMARY
[0010] A summary of several example embodiments of the disclosure
follows. This summary is provided for the convenience of the reader
to provide a basic understanding of such embodiments and does not
wholly define the breadth of the disclosure. This summary is not an
extensive overview of all contemplated embodiments, and is intended
to neither identify key or critical elements of all aspects nor
delineate the scope of any or all embodiments. Its sole purpose is
to present some concepts of one or more embodiments in a simplified
form as a prelude to the more detailed description that is
presented later. For convenience, the term "some embodiments" may be
used herein to refer to a single embodiment or multiple embodiments
of the disclosure.
[0011] Certain embodiments include a method for identifying a
correct orientation of a multimedia content item. The method
comprises receiving from a user device the multimedia content item;
identifying at least one object shown in the multimedia content
item; generating by a signature generator system (SGS) at least one
signature for the at least one object shown in the multimedia
content item; querying, using the at least one generated signature,
a deep-content-classification (DCC) system to find at least one
concept that matches the at least one object, wherein the querying
of the DCC system is performed using the at least one signature
generated for each object shown in the multimedia content item;
determining a correct orientation of the at least one matching
concept; and comparing an orientation of the at least one object to
the determined correct orientation to determine if the at least one
object is correctly oriented.
[0012] Certain embodiments include a system for identifying a
correct orientation of a multimedia content item. The system
comprises an interface to a network for receiving the multimedia
content item; a processing unit; and a memory communicatively
connected to the processing unit, wherein the memory contains
instructions that, when executed by the processing unit, configure
the system to: receive from a user device the multimedia content
item; identify at least one object shown in the multimedia content
item; generate by a signature generator system (SGS) at least one
signature for the at least one object shown in the multimedia
content item; query, using the at least one generated signature, a
deep-content-classification (DCC) system to find at least one
concept that matches the at least one object, wherein the querying
of the DCC system is performed using the at least one signature
generated for each object shown in the multimedia content item;
determine a correct orientation of the at least one matching
concept; and compare an orientation of the at least one object to
the determined correct orientation to determine if the at least one
object is correctly oriented.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] The subject matter disclosed herein is particularly pointed
out and distinctly claimed in the claims at the conclusion of the
specification. The foregoing and other objects, features, and
advantages of the disclosed embodiments will be apparent from the
following detailed description taken in conjunction with the
accompanying drawings.
[0014] FIG. 1 is a schematic block diagram of a network system
utilized to describe the various embodiments disclosed herein.
[0015] FIG. 2 is a flowchart describing the process of identifying
a correct orientation of a multimedia content item according to an
embodiment.
[0016] FIG. 3 is a schematic block diagram of a drawing utilized to
describe the correction of an incorrect orientation according to an
embodiment.
[0017] FIG. 4 is a block diagram depicting the basic flow of
information in a signature generator system.
[0018] FIG. 5 is a diagram showing the flow of patches generation,
response vector generation, and signature generation in a
large-scale speech-to-text system.
DETAILED DESCRIPTION
[0019] It is important to note that the embodiments disclosed
herein are only examples of the many advantageous uses of the
innovative teachings herein. In general, statements made in the
specification of the present application do not necessarily limit
any of the various claimed embodiments. Moreover, some statements
may apply to some inventive features but not to others. In general,
unless otherwise indicated, singular elements may be in plural and
vice versa with no loss of generality. In the drawings, like
numerals refer to like parts through several views.
[0020] Certain exemplary embodiments disclosed herein include a
method for analyzing the orientation of objects shown in a
multimedia content item for detecting an incorrect orientation of
the multimedia content item. In an embodiment, the multimedia
content item is received from a user device. At least one signature
is generated for at least one object shown in the multimedia
content item. The signatures generated for the at least one object
are matched to signatures generated for at least one concept. The
signatures generated for each concept are retrieved from a data
warehouse. Upon identifying a match between at least one object and
at least one concept, the correct orientation of the concept is
retrieved from the data warehouse. The orientation of the concept
is correlated to the orientation of the object shown in the
multimedia content item to determine whether the orientation of the
object is the correct orientation.
[0021] Upon identification of an incorrect orientation of the
object, the multimedia content item is rotated until the object is
in the correct orientation. According to an embodiment, the correct
multimedia content item is then sent to the user device for
display.
[0022] FIG. 1 shows an exemplary and non-limiting schematic diagram
of a network system 100 utilized to describe the various
embodiments disclosed herein. A network 110 is used to communicate
between different parts of the network system 100. The network 110
may be the Internet, the world-wide-web (WWW), a local area network
(LAN), a wide area network (WAN), a metro area network (MAN), and
the like.
[0023] Further connected to the network 110 is a user device 120
configured to execute at least one application 125. The application
125 may be, for example, a web browser, a script, an add-on, a
mobile application ("app"), or any application programmed to
interact with a server 130. The user device 120 may be, but is not
limited to, a personal computer (PC), a personal digital assistant
(PDA), a mobile phone, a smart phone, a tablet computer, a laptop,
a wearable computing device, or another kind of computing device
equipped with browsing, viewing, listening, filtering, and managing
capabilities that is enabled as further discussed herein below. It
should be noted that one user device 120 and one application 125
are illustrated in FIG. 1 only for the sake of simplicity and
without limitation on the generality of the disclosed
embodiments.
[0024] The network system 100 also includes a data warehouse 160
configured to store multimedia content items, previously generated
signatures for concepts or concept structures, information
respective of the concepts' orientation in space, and the like. The
data warehouse 160 may be further connected to the network 110. In
the embodiment illustrated in FIG. 1, the server 130, further
connected to the network 110, communicates with the data warehouse
160 through the network 110. In other non-limiting configurations,
the server 130 is directly connected to the data warehouse 160.
[0025] The various embodiments disclosed herein are realized using
the server 130, a signature generator system (SGS) 140 and a
deep-content-classification (DCC) system 150. The SGS 140 may be
connected to the server 130 directly or through the network 110.
The server 130 is configured to receive and serve the at least one
multimedia content item in which objects are shown and cause the
SGS 140 to generate at least one signature respective thereof and
query the DCC system 150. To this end, the server 130 is
communicatively connected to the SGS 140 and the DCC system 150.
The DCC system 150 may be further connected to the network 110.
[0026] The DCC system 150 is configured to generate concept
structures (or concepts) and to identify concepts that match the
objects. A concept is a collection of signatures representing an
object and metadata describing the concept. The collection is a
signature reduced cluster generated by inter-matching the
signatures generated for the many objects, clustering the
inter-matched signatures, and providing a reduced cluster set of
such clusters. As a non-limiting example, a `Superman concept` is a
signature reduced cluster of signatures describing elements (such
as objects) related to, e.g., a Superman cartoon, and a set of
metadata including textual representations of the Superman concept.
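As an illustration only, a concept as described above could be held
in a small record that pairs a signature reduced cluster with its
metadata. The sketch below is hypothetical: the names Concept, src,
and metadata, and the modeling of a signature as a set of integers,
are assumptions made for this example and are not taken from the
disclosure or from the '185 Patent.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical record: a signature reduced cluster (SRC) plus metadata."""
    name: str
    src: frozenset                      # SRC modeled as a set of integer signature elements
    metadata: dict = field(default_factory=dict)


# Illustrative values only; real signatures would come from the SGS.
superman = Concept(
    name="Superman",
    src=frozenset({3, 17, 42, 108}),
    metadata={"text": ["Superman", "cartoon"]},
)
```

Keeping the metadata separate from the signature set mirrors the
description of a concept as signatures plus descriptive metadata.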
[0027] Techniques for generating concepts and concept structures
are also described in the U.S. Pat. No. 8,266,185 (hereinafter the
'185 Patent) to Raichelgauz, et al., which is assigned to a common
assignee, and is incorporated by reference herein for all that it
contains. In an embodiment, the DCC system 150 is configured and
operates as the DCC system discussed in the '185 patent. The
process of generating the signatures in the SGS 140 is explained in
more detail below with respect to FIGS. 4 and 5.
[0028] It should be noted that each of the server 130, the SGS 140,
and the DCC system 150 typically comprises a processing unit, such
as a processor (not shown) or an array of processors coupled to a
memory. In one embodiment, the processing unit may be realized
through an architecture of computational cores described in detail
below. The memory contains instructions that can be executed by the
processing unit. The instructions, when executed by the processing
unit, cause the processing unit to perform the various functions
described herein. The one or more processors may be implemented
with any combination of general-purpose microprocessors, multi-core
processors, microcontrollers, digital signal processors (DSPs),
field programmable gate arrays (FPGAs), programmable logic devices
(PLDs), controllers, state machines, gated logic, discrete hardware
components, dedicated hardware finite state machines, or any other
suitable entities that can perform calculations or other
manipulations of information. The server 130 also includes an
interface (not shown) to the network 110.
[0029] According to the disclosed embodiments, the server 130 is
configured to receive a multimedia content item showing objects
from the user device 120. The multimedia content item may be, but
is not limited to, an image, a graphic, a photograph, and/or
combinations thereof and portions thereof. An object may be any
element shown in the multimedia content item, for example, a tree,
a car, a person, a table, and the like. In one embodiment, the
server 130 is configured to receive a URL of a webpage viewed by
the user device 120 and accessed by the application 125. The
webpage is processed to extract the multimedia content item
contained therein.
[0030] The request to analyze the multimedia content item can be
sent by a script executed in the webpage, such as the application
125 (e.g., a web server or a publisher server), when requested to
upload one or more multimedia content items to the webpage. Such a
request may include a URL of the webpage or a copy of the webpage.
The application 125 can also send a picture taken by a user of the
user device 120 to the server 130.
[0031] Responsive to receiving the multimedia content item, the
server 130 is configured to rotate the multimedia content item
until the multimedia content item is in the correct orientation and
to return the correctly oriented multimedia content item. To this
end, the server 130 is configured to analyze the multimedia content
item to identify portions or objects in the multimedia content
item. As an example, an image showing Central Park in New York is
analyzed to identify the objects of a carriage way, a car, a
streetlight, and a person. At least one signature is generated for
each object using the SGS 140. The generated signatures may be
robust to noise and distortion as discussed below.
[0032] In one embodiment, using the generated signatures, the DCC
system 150 is queried to determine if there is a match to at least
one concept maintained in the data warehouse 160. The DCC system
150 returns for each matching concept a concept's signature
(signature reduced cluster (SRC)) and optionally the concept's
metadata. Using the SRC of the matching concept and the signatures
generated for the at least one object, the server 130 is configured
to determine if there is a difference between the orientation of
the object in the multimedia content item and that of the matching
concept.
According to an embodiment, parameters such as orientation of an
object and/or a concept in space respective of a reference point
may be taken into account.
[0033] Specifically, when one match is identified, the server 130
is configured to retrieve from the data warehouse 160 information
respective of the typical orientation of a concept in space. The
information contained in the data warehouse 160 may have been
entered by users, collected from external web sources connected to
the network, saved from previous calculations of the disclosed
method, and the like. In another embodiment, the information
respective of the typical orientation of a concept in space may be
determined respective of the relation between at least two elements
consistently shown together in multimedia elements stored in the
database. As a non-limiting example, a tree is always perpendicular
to grass. The correct orientation of such a concept is determined
respective
thereof. As an example, when a match to the concept "tree" is
identified, the server 130 is configured to determine that the tree
should be perpendicular to the ground (the ground in such case can
be used as a reference point).
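One way to picture how a typical orientation might be derived from
the relation between two elements is to average the relative angle
observed across stored examples. The sketch below is an assumption
made for illustration: the function name typical_relative_angle, the
use of angles in degrees, and the choice of a circular mean are not
specified by the disclosure.

```python
import math


def typical_relative_angle(pairs):
    """Circular mean (degrees, in [0, 360)) of element-minus-reference angles.

    `pairs` is a list of (element_angle, reference_angle) tuples in degrees,
    e.g. (tree axis, ground line) as measured in stored multimedia elements.
    """
    if not pairs:
        raise ValueError("at least one observation is required")
    relative = [math.radians(element - reference) for element, reference in pairs]
    sin_sum = sum(math.sin(r) for r in relative)
    cos_sum = sum(math.cos(r) for r in relative)
    # atan2 of the summed unit vectors gives the mean direction.
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
```

A circular mean is used here rather than an arithmetic mean so that
angles near the 0/360 degree wrap-around average sensibly.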
[0034] The server 130 is further configured to correlate the
orientation of an object shown in the multimedia content item and
the correct concept's orientation. This is performed by correlating
the signatures generated for the object and the signatures of the
concept retrieved from the data warehouse 160. Here it should be
noted that the signatures generated for each object are generated
respective of the spatial location of an object shown in the
multimedia content item. Upon identification of an incorrect
orientation of the object, the multimedia content item is rotated
until the object is in the correct orientation. According to an
embodiment, the correct multimedia content item is then sent to the
user device 120 for display.
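The comparison of an object's orientation with the correct
orientation of its matching concept can be sketched as a signed
angular difference. This is a minimal sketch under stated
assumptions: orientations are expressed as angles in degrees, and
the function name rotation_needed and the tolerance parameter are
hypothetical. The disclosure itself correlates signatures and does
not prescribe this arithmetic.

```python
def rotation_needed(object_angle, correct_angle, tolerance=5.0):
    """Signed rotation in degrees that brings the object to the correct orientation.

    Returns 0.0 when the object is already within `tolerance` degrees.
    The result lies in [-180, 180), so the shorter direction is chosen.
    """
    diff = (correct_angle - object_angle + 180.0) % 360.0 - 180.0
    return 0.0 if abs(diff) <= tolerance else diff
```

For example, an object measured at 60 degrees whose concept is
expected at 90 degrees would call for a rotation of 30 degrees,
while an object at 88 degrees would be treated as correctly
oriented under the default tolerance.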
[0035] In another embodiment, the SGS 140 is configured to generate
signatures for the objects shown in the received multimedia content
item. The generated signatures are matched by the server 130 to
previously generated signatures of concepts maintained in the data
warehouse 160 to identify at least one object that matches to at
least one concept. When such a match is identified, the server 130
is configured to correlate the orientation of the concept and the
orientation of the object shown in the multimedia content item as
noted above. Upon identification of an incorrect orientation of the
object, the multimedia content item is rotated until the object is
in the correct orientation. According to an embodiment, the correct
multimedia content item is then sent to the user device for
display.
[0036] FIG. 2 depicts an exemplary and non-limiting flowchart 200
describing a method for detecting an incorrect orientation of a
multimedia content item. The method may be performed by the server
130.
[0037] In S210, a multimedia content item in which objects are
shown is received. In an embodiment, the multimedia content item is
received together with a request to analyze the orientation of the
multimedia content item. Optionally, in S215, the received
multimedia content item is analyzed to identify at least one object
shown within.
[0038] In S220 at least one signature is generated for at least one
object. The signatures are generated respective of the spatial
location of the object shown in the multimedia content item. The
signatures are generated by the SGS 140 as described in greater
detail below with respect to FIGS. 4 and 5.
[0039] In S230, a DCC system (e.g., DCC system 150) is queried to
find a match between at least one concept and the object using
their respective signatures. In an embodiment, at least one
signature generated for an object is matched against the signature
(signature reduced cluster (SRC)) of each concept maintained by the
DCC system 150. If the signature of the concept overlaps with the
signature of the multimedia element more than a predetermined
threshold level, a match exists. Various techniques for determining
matching concepts are discussed in the '185 Patent. For each
matching concept the respective multimedia element is determined to
be identified and at least the concept signature (SRC) is
returned.
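The threshold test described for S230 can be illustrated with a
simple overlap measure. In this sketch, signatures are modeled as
sets of integers and overlap is the fraction of the concept's
signature elements also found in the object's signature. Both
modeling choices, the function names, and the 0.6 threshold are
assumptions for illustration; the matching techniques actually
used are those discussed in the '185 Patent.

```python
def overlap(object_signature, concept_signature):
    """Fraction of the concept signature's elements present in the object signature."""
    if not concept_signature:
        return 0.0
    return len(object_signature & concept_signature) / len(concept_signature)


def find_matching_concepts(object_signature, concepts, threshold=0.6):
    """Return the names of concepts whose overlap exceeds the threshold.

    `concepts` maps a concept name to its signature (SRC), modeled as a set.
    """
    return [name for name, src in concepts.items()
            if overlap(object_signature, src) > threshold]
```

With concepts {"tree": {1, 2, 3, 4}, "house": {7, 8, 9}} and an
object signature {1, 2, 3, 10}, only "tree" is returned, since three
of its four elements overlap.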
[0040] In S240, the correct/typical orientation of the matching
concept is determined respective of information related to the
concept maintained in a database, such as the data warehouse 160.
The information contained in the data warehouse 160 may have been
entered by users, collected from external web sources connected to
the network, saved from previous calculations of the disclosed
method, and the like. In another embodiment, the correct/typical
orientation of the matching concept is determined respective of the
relation between at least two elements consistently shown together
in multimedia elements stored in the database, for example, a tree
is always perpendicular to grass, etc.
[0041] In S250 the orientation of the object is correlated to the
orientation of the concept. The correlation includes analyzing the
signatures generated for the object and the signatures of the
concept. Such correlation is performed based on the spatial
location of the object and respective of information related to the
concept maintained in the data warehouse 160. In another
embodiment, if matching concepts are not found, the signatures
generated in S220 are utilized to search the data warehouse
160.
[0042] In S260, it is checked whether the orientation of the object
shown in the multimedia content item is the correct orientation,
and if so, execution continues with S280; otherwise, execution
continues with S270. In S270, the multimedia content item is rotated
until the object is in the correct orientation. According to an
embodiment, the corrected multimedia content item is sent to the
user device 120 for display. In another embodiment, the object is
rotated until the object is in the correct orientation. In S280, it
is checked whether additional multimedia content items are
received, and if so, execution continues with S215; otherwise,
execution terminates.
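The check-and-rotate loop of S260 and S270 can be sketched as follows. This is only an illustration under stated assumptions: the content item is modeled as a small grid of labels, rotation proceeds in 90-degree steps, and the `is_correct` callable stands in for the signature-based orientation comparison of S250. None of these details are specified by the disclosure.

```python
from typing import Callable, List, Tuple

Image = List[List[str]]


def rotate_90_clockwise(image: Image) -> Image:
    """Rotate a row-major grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def correct_orientation(image: Image,
                        is_correct: Callable[[Image], bool],
                        max_turns: int = 4) -> Tuple[Image, int]:
    """Rotate until is_correct(image) holds (S260/S270).

    Returns the corrected image and the total clockwise rotation applied,
    in degrees. The 90-degree step size is an assumption of this sketch.
    """
    for turns in range(max_turns):
        if is_correct(image):
            return image, turns * 90
        image = rotate_90_clockwise(image)
    raise ValueError("no 90-degree rotation yields the correct orientation")


# Hypothetical example: a "house" whose roof must be in the top row.
upright = [["roof", "roof"], ["wall", "wall"]]
tilted = rotate_90_clockwise(upright)
roof_on_top = lambda img: all(cell == "roof" for cell in img[0])
fixed, degrees = correct_orientation(tilted, roof_on_top)
```

Here the item was tilted by 90 degrees clockwise, so a further 270 degrees of clockwise rotation restores the upright orientation.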
[0043] FIG. 3 shows an exemplary and non-limiting schematic diagram
of a drawing 300 utilized to describe the correction of an
incorrect orientation of a multimedia content item according to an
embodiment. The process may be performed by the server 130.
[0044] The objects of a house 310, the moon 320 and a tree 330 are
identified in the drawing 300 and signatures are generated for each
such object. The generated signatures are then used to search for
matching concepts. The "house" concept is identified. Information
related to the "house" concept is retrieved and the typical
orientation of a house in space is determined (i.e., the house
should be perpendicular to the ground). Specifically, the server
130 is configured to query a DCC system 150 to search a data
warehouse 160 for information related to the "house" concept. This
information includes a typical orientation of a house relative to
the ground. In another embodiment, the typical orientation of the
house is determined respective of the relation between at least two
elements shown constantly in multimedia elements stored in the
database, for example, a tree is always perpendicular to grass,
etc. The orientation of the concept "house" is correlated to the
orientation of the house 310 shown in the drawing 300, and it is
determined that the orientation of the house 310 is incorrect, and
therefore the orientation of the drawing is incorrect. According to
this embodiment, the drawing 300 is rotated 340 until the house 310
is in the correct orientation.
[0045] FIGS. 4 and 5 illustrate the generation of signatures for
the multimedia content elements by the SGS 140 according to one
embodiment. An exemplary high-level description of the process for
large scale matching is depicted in FIG. 4. In this non-limiting
example, the matching is conducted based on video content.
[0046] Video content segments 2 from a Master database (DB) 6 and a
Target DB 1 are processed in parallel by a large number of
independent computational cores 3 that constitute an architecture
for generating the signatures (hereinafter the "Architecture").
Further details on the generation of computational cores are
provided below. The independent cores 3 generate a database of
Robust Signatures and Signatures 4 for Target content-segments 5
and a database of Robust Signatures and Signatures 7 for Master
content-segments 8. An exemplary and non-limiting process of
signature generation for an audio component is shown in detail in
FIG. 5. Finally, Target Robust Signatures and/or Signatures are
effectively matched, by a matching algorithm 9, to Master Robust
Signatures and/or Signatures database to find all matches between
the two databases.
[0047] To demonstrate an example of the signature generation
process, it is assumed, merely for the sake of simplicity and
without limitation on the generality of the disclosed embodiments,
that the signatures are based on a single frame, leading to certain
simplification of the computational cores generation. The Matching
System is extensible for signatures generation capturing dynamics
in-between the frames.
[0048] The Signatures' generation process is now described with
reference to FIG. 5. The first step in the process of signature
generation from a given speech-segment is to break down the
speech-segment into K patches 14 of random length P and random
position within the speech segment 12. The breakdown is performed
by the patch generator component 21. The values of the number of
patches K, the random length P, and the random position parameters
are determined based on optimization, considering the tradeoff between
accuracy rate and the number of fast matches required in the flow
process of the server 130 and SGS 140. Thereafter, all the K
patches are injected in parallel into all computational cores 3 to
generate K response vectors 22, which are fed into a signature
generator system 23 to produce a database of Robust Signatures and
Signatures 4.
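As a rough illustration of the patch generation step, the sketch below cuts K patches of random length and random position out of a segment represented as a list of samples. The function name `generate_patches`, the length bounds, and the uniform sampling are assumptions; the disclosure states only that these parameters are determined by optimization.

```python
import random
from typing import List, Optional, Sequence


def generate_patches(segment: Sequence[float], k: int,
                     min_len: int, max_len: int,
                     seed: Optional[int] = None) -> List[List[float]]:
    """Cut k patches of random length and random position from a segment.

    Lengths are drawn uniformly from [min_len, max_len], and start
    positions are drawn uniformly so that each patch fits in the segment.
    """
    if not 1 <= min_len <= max_len <= len(segment):
        raise ValueError("need 1 <= min_len <= max_len <= len(segment)")
    rng = random.Random(seed)
    patches = []
    for _ in range(k):
        length = rng.randint(min_len, max_len)            # random length P
        start = rng.randint(0, len(segment) - length)     # random position
        patches.append(list(segment[start:start + length]))
    return patches


# Hypothetical speech segment of 100 samples, broken into K = 8 patches.
speech_segment = list(range(100))
patches = generate_patches(speech_segment, k=8, min_len=5, max_len=20, seed=0)
```

Each patch would then be injected into all computational cores; that stage is outside the scope of this sketch.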
[0049] In order to generate Robust Signatures, i.e., Signatures
that are robust to additive noise, by L computational cores 3 (where
L is an integer equal to or greater than 1), a frame `i` is injected
into all the cores 3. The cores 3 then generate two binary response
vectors: $\vec{S}$, which is a Signature vector, and $\vec{RS}$,
which is a Robust Signature vector.
[0050] For generation of signatures robust to additive noise, such
as White-Gaussian-Noise, scratch, etc., but not robust to
distortions, such as crop, shift and rotation, etc., a core
$C_i = \{n_i\}$ ($1 \le i \le L$) may consist of a single leaky
integrate-to-threshold unit (LTU) node or more nodes. The node $n_i$
equations are:
$$V_i = \sum_j w_{ij} k_j$$
$$n_i = \theta(V_i - Th_x)$$
[0051] where $\theta$ is a Heaviside step function; $w_{ij}$ is a
coupling node unit (CNU) between node $i$ and image component $j$ (for
example, grayscale value of a certain pixel $j$); $k_j$ is an image
component `j` (for example, grayscale value of a certain pixel $j$);
$Th_x$ is a constant Threshold value, where `x` is `S` for Signature
and `RS` for Robust Signature; and $V_i$ is a Coupling Node Value.
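The node equations can be made concrete with a small numerical sketch. It computes each Coupling Node Value as a weighted sum of image components and applies a Heaviside step against two thresholds to obtain the Signature and Robust Signature bits. The weights, component values, and thresholds below are invented for illustration, and treating the step function as 0 at exactly zero is an assumption of this sketch.

```python
from typing import List, Sequence, Tuple


def coupling_value(weights: Sequence[float],
                   components: Sequence[float]) -> float:
    """Coupling Node Value: weighted sum of the image components."""
    return sum(w * k for w, k in zip(weights, components))


def heaviside(x: float) -> int:
    """Heaviside step; returning 0 at x == 0 is an assumed convention."""
    return 1 if x > 0 else 0


def core_responses(weight_rows: Sequence[Sequence[float]],
                   components: Sequence[float],
                   th_s: float, th_rs: float) -> Tuple[List[int], List[int]]:
    """Return the (Signature, Robust Signature) bit vectors, one bit per node."""
    values = [coupling_value(row, components) for row in weight_rows]
    signature = [heaviside(v - th_s) for v in values]
    robust_signature = [heaviside(v - th_rs) for v in values]
    return signature, robust_signature


# Hypothetical setup: 3 nodes coupled to 3 grayscale components.
weights = [[1.0, 0.0, 1.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.1]]
pixels = [2.0, 4.0, 6.0]
S, RS = core_responses(weights, pixels, th_s=1.0, th_rs=5.0)
```

With these values the coupling values are roughly 8, 3, and 0.6, so two nodes pass the lower Signature threshold and only the first passes the higher Robust Signature threshold.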
[0052] The Threshold values $Th_x$ are set differently for Signature
generation than for Robust Signature generation. For example, for a
certain distribution of $V_i$ values (for the set of nodes), the
thresholds for Signature ($Th_S$) and Robust Signature ($Th_{RS}$) are
set apart, after optimization, according to one or more of the
following criteria:
[0053] 1: For $V_i > Th_{RS}$:
$$1 - p(V > Th_S) - 1 - (1 - \epsilon)^{l} \ll 1$$
[0054] i.e., given that $l$ nodes (cores) constitute a Robust
Signature of a certain image $I$, the probability that not all of
these $l$ nodes will belong to the Signature of the same, but noisy,
image is sufficiently low (according to a system's specified
accuracy).
2: $p(V_i > Th_{RS}) \approx l/L$
[0055] i.e., approximately $l$ out of the total $L$ nodes can be found
to generate a Robust Signature according to the above definition.
[0056] 3: Both Robust Signature and Signature are generated for a
certain frame $i$.
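Criterion 2 can be illustrated by choosing a Robust Signature threshold that roughly l of the L coupling values exceed. The sketch below simply places the threshold midway between the l-th and (l+1)-th largest values. This is a stand-in chosen for illustration, not the optimization procedure referred to in the text, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def pick_robust_threshold(values: Sequence[float], l: int) -> float:
    """Return a threshold exceeded by exactly l of the coupling values,
    assuming the l-th and (l+1)-th largest values are distinct."""
    if not 0 < l < len(values):
        raise ValueError("l must satisfy 0 < l < number of nodes")
    ordered = sorted(values, reverse=True)
    return (ordered[l - 1] + ordered[l]) / 2


# Hypothetical coupling values for L = 5 nodes, with l = 2 robust nodes.
coupling_values = [0.9, 0.1, 0.5, 0.7, 0.3]
th_rs = pick_robust_threshold(coupling_values, l=2)
robust_nodes = sum(1 for v in coupling_values if v > th_rs)
```

Any Signature threshold set below this value keeps every Robust Signature node inside the Signature as well, which is consistent with the two thresholds being set apart.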
[0057] It should be understood that the generation of a signature
is unidirectional, and typically yields lossy compression, where
the characteristics of the compressed data are maintained but the
uncompressed data cannot be reconstructed. Therefore, a signature
can be used for the purpose of comparison to another signature
without the need for comparison to the original data. The detailed
description of the signature generation can be found in U.S. Pat.
Nos. 8,326,775 and 8,312,031, assigned to common assignee, which
are hereby incorporated by reference for all the useful information
they contain.
[0058] A computational core generation is a process of definition,
selection, and tuning of the parameters of the cores for a certain
realization in a specific system and application. The process is
based on several design considerations, such as:
[0059] (a) The cores should be designed so as to obtain maximal
independence, i.e., the projection from a signal space should
generate a maximal pair-wise distance between any two cores'
projections into a high-dimensional space.
[0060] (b) The cores should be optimally designed for the type of
signals, i.e., the cores should be maximally sensitive to the
spatio-temporal structure of the injected signal, for example, and
in particular, sensitive to local correlations in time and space.
Thus, in some cases, a core represents a dynamic system, such as in
state space, phase space, edge of chaos, etc., which is uniquely
used herein to exploit its maximal computational power.
[0061] (c) The cores should be optimally designed with regard to
invariance to a set of signal distortions, of interest in relevant
applications.
[0062] A detailed description of the computational core generation
and the process for configuring such cores is provided in U.S. Pat.
No. 8,655,801 referenced above.
[0063] The various embodiments disclosed herein can be implemented
as hardware, firmware, software, or any combination thereof.
Moreover, the software is preferably implemented as an application
program tangibly embodied on a program storage unit or computer
readable medium consisting of parts, or of certain devices and/or a
combination of devices. The application program may be uploaded to,
and executed by, a machine comprising any suitable architecture.
Preferably, the machine is implemented on a computer platform
having hardware such as one or more central processing units
("CPUs"), a memory, and input/output interfaces. The computer
platform may also include an operating system and microinstruction
code. The various processes and functions described herein may be
either part of the microinstruction code or part of the application
program, or any combination thereof, which may be executed by a
CPU, whether or not such a computer or processor is explicitly
shown. In addition, various other peripheral units may be connected
to the computer platform such as an additional data storage unit
and a printing unit. Furthermore, a non-transitory computer
readable medium is any computer readable medium except for a
transitory propagating signal.
[0064] All examples and conditional language recited herein are
intended for pedagogical purposes to aid the reader in
understanding the principles of the disclosed embodiments and the
concepts contributed by the inventor to furthering the art, and are
to be construed as being without limitation to such specifically
recited examples and conditions. Moreover, all statements herein
reciting principles, aspects, and embodiments of the
disclosure, as well as specific examples thereof, are intended to
encompass both structural and functional equivalents thereof.
Additionally, it is intended that such equivalents include both
currently known equivalents as well as equivalents developed in the
future, i.e., any elements developed that perform the same
function, regardless of structure.
* * * * *