U.S. patent application number 15/827311 was filed with the patent office on 2017-11-30 and published on 2018-06-07 for system and method for determining a location based on multimedia content.
This patent application is currently assigned to Cortica, Ltd. The applicant listed for this patent is Cortica, Ltd. Invention is credited to Karina ODINAEV, Igal RAICHELGAUZ, Yehoshua Y ZEEVI.
Application Number | 20180157652 15/827311 |
Document ID | / |
Family ID | 54065620 |
Filed Date | 2017-11-30 |
United States Patent Application | 20180157652 |
Kind Code | A1 |
RAICHELGAUZ; Igal; et al. | June 7, 2018 |
SYSTEM AND METHOD FOR DETERMINING A LOCATION BASED ON MULTIMEDIA
CONTENT
Abstract
A system and method for determining a precise location based on
multimedia content. The method includes: analyzing a multimedia
content element (MMCE), wherein the analysis further includes
generating at least one signature to the MMCE; matching the
generated at least one signature to at least one reference concept
stored in a database, wherein each of the at least one stored
concept is associated with a predetermined precise location; and
identifying, based on the matching, a precise location depicted in
the MMCE.
Inventors: | RAICHELGAUZ; Igal; (Tel Aviv, IL) ; ODINAEV; Karina; (Tel Aviv, IL) ; ZEEVI; Yehoshua Y; (Haifa, IL) |
Applicant: |
Name | City | State | Country | Type |
Cortica, Ltd. | Tel Aviv | | IL | |
Assignee: | Cortica, Ltd. (Tel Aviv, IL) |
Family ID: | 54065620 |
Appl. No.: | 15/827311 |
Filed: | November 30, 2017 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14597324 | Jan 15, 2015 | |
15827311 | | |
13766463 | Feb 13, 2013 | 9031999 |
14597324 | | |
13602858 | Sep 4, 2012 | 8868619 |
13766463 | | |
12603123 | Oct 21, 2009 | 8266185 |
13602858 | | |
12084150 | Apr 7, 2009 | 8655801 |
PCT/IL2006/001235 | Oct 26, 2006 | |
12603123 | | |
12195863 | Aug 21, 2008 | 8326775 |
12603123 | | |
12084150 | Apr 7, 2009 | 8655801 |
12195863 | | |
12348888 | Jan 5, 2009 | 9798795 |
12603123 | | |
12084150 | Apr 7, 2009 | 8655801 |
12348888 | | |
12195863 | Aug 21, 2008 | 8326775 |
12084150 | | |
12538495 | Aug 10, 2009 | 8312031 |
12603123 | | |
12084150 | Apr 7, 2009 | 8655801 |
12538495 | | |
12195863 | Aug 21, 2008 | 8326775 |
12084150 | | |
12348888 | Jan 5, 2009 | 9798795 |
12195863 | | |
62428557 | Dec 1, 2016 | |
61928468 | Jan 17, 2014 | |
Current U.S. Class: | 1/1 |
Current CPC Class: |
G06N 7/005 20130101;
H04H 60/33 20130101; H04H 60/37 20130101; H04L 67/327 20130101;
H04H 60/46 20130101; H04N 7/17318 20130101; G06F 16/904 20190101;
G06F 16/685 20190101; H04H 60/59 20130101; G06Q 30/0246 20130101;
G10L 25/51 20130101; Y10S 707/99943 20130101; G06Q 30/0201
20130101; G09B 19/0092 20130101; H04H 20/93 20130101; G06F 16/435
20190101; G06F 16/4393 20190101; G06F 16/51 20190101; G06F 16/7844
20190101; G06F 16/35 20190101; G06F 16/14 20190101; G06F 16/2228
20190101; G06F 16/438 20190101; H04N 21/466 20130101; G06F 3/0484
20130101; G06F 16/40 20190101; G06N 5/025 20130101; G06F 16/1748
20190101; G06F 16/951 20190101; H04L 67/10 20130101; G06F 16/487
20190101; Y10S 707/99948 20130101; G06N 5/02 20130101; H04H 60/58
20130101; G06F 16/48 20190101; H04H 60/49 20130101; H04L 67/22
20130101; G06F 3/0488 20130101; H04H 60/71 20130101; G06F 16/152
20190101; G06F 16/9558 20190101; G06N 5/04 20130101; G06N 20/00
20190101; H04H 60/56 20130101; H04L 65/601 20130101; G06F 16/434
20190101; G06K 9/00711 20130101; G06F 3/048 20130101; G06F 16/41
20190101; G06F 16/7847 20190101; H04L 67/306 20130101; G06F 16/783
20190101; G06F 40/134 20200101; G06K 9/00744 20130101; G06K 9/00758
20130101; H04N 21/25891 20130101; G10L 15/26 20130101; G06F 16/285
20190101; G06F 16/683 20190101; G06K 9/00281 20130101; H04H 20/26
20130101; G06F 16/172 20190101; G06Q 30/0261 20130101; G06F 16/43
20190101; G06K 9/6267 20130101; G06K 2209/27 20130101; G06F 16/433
20190101; G06T 19/006 20130101; G10L 15/32 20130101; H04H 20/103
20130101; H04H 60/66 20130101; G06F 16/284 20190101; G06F 16/7834
20190101; G06N 5/022 20130101; H04H 2201/90 20130101; H04N 21/2668
20130101; H04N 21/8106 20130101 |
International Class: | G06F 17/30 20060101 G06F017/30 |
Foreign Application Data
Date | Code | Application Number |
Oct 26, 2005 | IL | 171577 |
Jan 29, 2006 | IL | 173409 |
Aug 21, 2007 | IL | 185414 |
Claims
1. A method for determining a precise location based on multimedia
content, comprising: analyzing a multimedia content element (MMCE),
wherein the analysis further comprises generating at least one
signature to the MMCE; matching the generated at least one
signature to at least one concept stored in a database, wherein
each of the at least one stored concept is associated with a
predetermined precise location; and identifying, based on the
matching, a precise location depicted in the MMCE.
2. The method of claim 1, further comprising: generating location
coordinates based on the identified precise location.
3. The method of claim 1, further comprising: determining at least
one concept based on the generated at least one signature; and
matching the determined at least one concept to the at least one
concept stored in the database.
4. The method of claim 3, wherein each concept is a collection of
signatures and metadata describing the concept.
5. The method of claim 3, wherein the at least one concept is
determined by querying a concept-based database using the at least
one signature.
6. The method of claim 1, wherein the at least one signature is
robust to noise and distortion.
7. The method of claim 1, wherein each signature is generated by a
signature generator system including a plurality of at least
partially statistically independent computational cores, wherein
the properties of each core are set independently of the properties
of each other core.
8. The method of claim 1, wherein each of the at least one
signature is generated based on at least one of: the MMCE, and
metadata associated with the MMCE.
9. The method of claim 8, wherein the metadata includes at least
one of: a time stamp of the MMCE, a device used to capture the
MMCE, a location pointer, tags, comments, and Global Positioning
System (GPS) coordinates associated with the MMCE.
10. A non-transitory computer readable medium having stored thereon
instructions for causing one or more processing units to execute a
process for determining a precise location based on multimedia
content, the process comprising: analyzing a multimedia content
element (MMCE), wherein the analysis further comprises generating
at least one signature to the MMCE; matching the generated at least
one signature to at least one reference concept stored in a
database, wherein each of the at least one reference concepts are
associated with a predetermined precise location; and identifying,
based on the matching, a precise location depicted in the MMCE.
11. A system for determining a precise location based on multimedia
content, comprising: a processing circuitry; and a memory, the
memory containing instructions that, when executed by the
processing circuitry, configure the system to: analyze a multimedia
content element (MMCE), wherein the analysis includes generating at
least one signature to the MMCE; match the generated at least one
signature to at least one concept stored in a database, wherein
each of the at least one stored concept is associated with a
predetermined precise location; and identify, based on the
matching, a precise location depicted in the MMCE.
12. The system of claim 11, wherein the system is further
configured to: generate location coordinates based on the
identified precise location.
13. The system of claim 11, wherein the system is further
configured to: determine at least one concept based on the
generated at least one signature; and match the determined at least
one concept to the at least one concept stored in the database.
14. The system of claim 13, wherein each concept is a collection of
signatures and metadata describing the concept.
15. The system of claim 13, wherein the at least one concept is
determined by querying a concept-based database using the at least
one signature.
16. The system of claim 11, wherein the at least one signature is
robust to noise and distortion.
17. The system of claim 11, wherein each signature is generated by
a signature generator system including a plurality of at least
partially statistically independent computational cores, wherein
the properties of each core are set independently of the properties
of each other core.
18. The system of claim 11, wherein each of the at least one
signature is generated based on at least one of: the MMCE, and
metadata associated with the MMCE.
19. The system of claim 18, wherein the metadata includes at least
one of: a time stamp of the MMCE, a device used to capture the
MMCE, a location pointer, tags, comments, and Global Positioning
System (GPS) coordinates associated with the MMCE.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional
Application No. 62/428,557 filed on Dec. 1, 2016. This application
is also a continuation-in-part of U.S. patent application Ser. No.
14/597,324 filed on Jan. 15, 2015, now pending, which claims the
benefit of U.S. Provisional Application No. 61/928,468, filed on
Jan. 17, 2014. The Ser. No. 14/597,324 application is a
continuation-in-part of U.S. patent application Ser. No. 13/766,463
filed on Feb. 13, 2013, now U.S. Pat. No. 9,031,999. The Ser. No.
13/766,463 application is a continuation-in-part of U.S. patent
application Ser. No. 13/602,858 filed on Sep. 4, 2012, now U.S.
Pat. No. 8,868,619. The Ser. No. 13/602,858 application is a
continuation of U.S. patent application Ser. No. 12/603,123 filed
on Oct. 21, 2009, now U.S. Pat. No. 8,266,185, which is a
continuation-in-part of:
[0002] (1) U.S. patent application Ser. No. 12/084,150 having a
filing date of Apr. 7, 2009, now U.S. Pat. No. 8,655,801, which is
the National Stage of International Application No.
PCT/IL2006/001235 filed on Oct. 26, 2006, which claims foreign
priority from Israeli Application No. 171577 filed on Oct. 26, 2005,
and Israeli Application No. 173409 filed on Jan. 29, 2006;
[0003] (2) U.S. patent application Ser. No. 12/195,863 filed on Aug.
21, 2008, now U.S. Pat. No. 8,326,775, which claims priority under
35 USC 119 from Israeli Application No. 185414, filed on Aug. 21,
2007, and which is also a continuation-in-part of the
above-referenced U.S. patent application Ser. No. 12/084,150;
[0004] (3) U.S. patent application Ser. No. 12/348,888, filed on
Jan. 5, 2009, now pending, which is a continuation-in-part of the
above-referenced U.S. patent application Ser. No. 12/084,150, and
the above-referenced U.S. patent application Ser. No. 12/195,863;
and
[0005] (4) U.S. patent application Ser. No. 12/538,495, filed on
Aug. 10, 2009, now U.S. Pat. No. 8,312,031, which is a
continuation-in-part of the above-referenced U.S. patent
application Ser. No. 12/084,150, the above-referenced U.S. patent
application Ser. No. 12/195,863, and the above-referenced U.S.
patent application Ser. No. 12/348,888.
[0006] All of the applications referenced above are hereby
incorporated by reference.
TECHNICAL FIELD
[0007] The present disclosure relates generally to the analysis of
multimedia content, and more specifically to determining a location
based on an analysis of multimedia content.
BACKGROUND
[0008] Maps and location-based driving directions are readily
available on many user devices, such as smartphones, through
websites and dedicated applications. One process that is
integral to the use of such systems is "geocoding." Geocoding is
the process of converting a description of a location from a format
that is meaningful to humans, such as a street address, an
intersection, or name of a point of interest, to a location on the
earth's surface in a format usable by computers, typically
represented by numerical longitude and latitude values such as
those used in a geographical coordinate system.
[0009] To request the display of a map using such an application, a
user describes the location at which the map should be centered. To
request directions within a mapping application, the user describes
both the desired origin and destination. The application will then
convert the requested locations to coordinates using a geocoding
process.
[0010] A shortcoming of the geocoding process is that often users
themselves are not sure of the exact location that they seek. In
certain circumstances, accurately setting a location is critical,
especially, for example, when mapping tools are used for ride
hiring services in hectic urban areas. Additionally, even when a
user is aware of the desired exact location, it can be difficult to
input such information within an application. For example, if a
user is waiting for a hired car service to pick them up at an
intersection, it can be difficult and cumbersome to input the exact
location within the intersection on a map, such as which side of
the street, which corner of the intersection, how far from the
actual intersection, and the like.
[0011] It would therefore be advantageous to provide a solution
that would overcome the challenges noted above.
SUMMARY
[0012] A summary of several example embodiments of the disclosure
follows. This summary is provided for the convenience of the reader
to provide a basic understanding of such embodiments and does not
wholly define the breadth of the disclosure. This summary is not an
extensive overview of all contemplated embodiments, and is intended
to neither identify key or critical elements of all embodiments nor
to delineate the scope of any or all aspects. Its sole purpose is
to present some concepts of one or more embodiments in a simplified
form as a prelude to the more detailed description that is
presented later. For convenience, the term "an embodiment" may be
used herein to refer to a single embodiment or multiple embodiments
of the disclosure.
[0013] Certain embodiments disclosed herein include a method for
determining a precise location based on multimedia content, the
method including: analyzing a multimedia content element (MMCE),
wherein the analysis further comprises generating at least one
signature to the MMCE; matching the generated at least one
signature to at least one concept stored in a database, wherein
each of the at least one stored concept is associated with a
predetermined precise location; and identifying, based on the
matching, a precise location depicted in the MMCE.
[0014] Certain embodiments disclosed herein also include a
non-transitory computer readable medium having stored thereon
instructions for causing one or more processing units to execute a
process for determining a precise location based on multimedia
content, the process comprising: analyzing a multimedia content
element (MMCE), wherein the analysis includes generating at least
one signature to the MMCE; matching the generated at least one
signature to at least one concept stored in a database, wherein
each of the at least one stored concept is associated with a
predetermined precise location; and identifying, based on the
matching, a precise location depicted in the MMCE.
[0015] Certain embodiments disclosed herein also include a system
for determining a precise location based on multimedia content,
comprising: a processing circuitry; and a memory, the memory
containing instructions that, when executed by the processing
circuitry, configure the system to: analyze a multimedia content
element (MMCE), wherein the analysis includes generating at least
one signature to the MMCE; match the generated at least one
signature to at least one concept stored in a database, wherein
each of the at least one stored concept is associated with a
predetermined location; and identify, based on the matching, a
precise location depicted in the MMCE.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] The subject matter disclosed herein is particularly pointed
out and distinctly claimed in the claims at the conclusion of the
specification. The foregoing and other objects, features, and
advantages of the disclosed embodiments will be apparent from the
following detailed description taken in conjunction with the
accompanying drawings.
[0017] FIG. 1 is a network diagram utilized to describe the various
disclosed embodiments.
[0018] FIG. 2 is an example diagram of a Deep Content
Classification system for creating concepts according to an
embodiment.
[0019] FIG. 3 is a flowchart illustrating a method for determining
a location based on a multimedia content element according to an
embodiment.
[0020] FIG. 4 is a block diagram depicting the basic flow of
information in the signature generator system.
[0021] FIG. 5 is a diagram showing the flow of patches generation,
response vector generation, and signature generation in a
large-scale speech-to-text system.
DETAILED DESCRIPTION
[0022] It is important to note that the embodiments disclosed
herein are only examples of the many advantageous uses of the
innovative teachings herein. In general, statements made in the
specification of the present application do not necessarily limit
any of the various claimed embodiments. Moreover, some statements
may apply to some inventive features but not to others. In general,
unless otherwise indicated, singular elements may be in plural and
vice versa with no loss of generality. In the drawings, like
numerals refer to like parts through several views.
[0023] The various disclosed embodiments include a method and
system for determining a location based on an analysis of
multimedia content. In some embodiments, the method includes
receiving an input multimedia content element (MMCE), analyzing the
input MMCE and metadata associated with the MMCE to generate
signatures based on the content depicted within the MMCE, generating
concepts based on the signatures, comparing the generated signatures
and concepts to previously generated signatures and concepts that
are associated with predetermined precise locations, e.g., from a
database, and determining the precise location of the input MMCE.
The determined location can be provided to a user device in the
form of location coordinates.
[0024] FIG. 1 shows a network diagram 100 utilized to describe the
various disclosed embodiments. A user device 120, a server 130, a
signature generator system (SGS) 140, a database 150, and a deep
content classifier (DCC) system 160 are communicatively connected
via a network 110. The network 110 may include the Internet, the
world-wide-web (WWW), a local area network (LAN), a wide area
network (WAN), a metro area network (MAN), and other networks
capable of enabling communication between elements of a system
100.
[0025] The user device 120 may be, but is not limited to, a mobile
phone, a smart phone, a personal computer (PC), a tablet computer,
a wearable computing device, and other kinds of wired and mobile
devices capable of capturing, uploading, browsing, viewing,
listening, filtering, and managing MMCEs as further discussed
herein below. The user device 120 may have installed thereon an
application 125. The application 125 may be downloaded from an
application repository, such as the Apple® AppStore®,
Google Play®, or any repository hosting software applications
for download.
[0026] The user device 120 includes a storage (not shown)
containing one or more MMCEs, such as, but not limited to, an
image, a photograph, a graphic, a screenshot, a video stream, a
video clip, a video frame, an audio stream, an audio clip,
combinations thereof, portions thereof, and the like. Additionally,
the user device 120 may include an MMCE capturing mechanism, such
as an image camera, a video camera, a microphone, and the like.
[0027] A server 130 is connected to the network 110 and is
configured to communicate with the user device 120. The server 130
may include a processing circuitry (PC) 135 and a memory 137. The
processing circuitry 135 may be realized as one or more hardware
logic components and circuits. For example, and without limitation,
illustrative types of hardware logic components that can be used
include field programmable gate arrays (FPGAs),
application-specific integrated circuits (ASICs),
application-specific standard products (ASSPs), system-on-a-chip
systems (SOCs), general-purpose microprocessors, microcontrollers,
digital signal processors (DSPs), and the like, or any other
hardware logic components that can perform calculations or other
manipulations of information.
[0028] In an embodiment, the memory 137 is configured to store
software. Software shall be construed broadly to mean any type of
instructions, whether referred to as software, firmware,
middleware, microcode, hardware description language, or otherwise.
Instructions may include code (e.g., in source code format, binary
code format, executable code format, or any other suitable format
of code). The instructions, when executed by the processing
circuitry 135, cause the processing circuitry 135 to perform the
various processes described herein. Specifically, the instructions,
when executed, cause the processing circuitry 135 to determine a
location based on analyzing multimedia content, as discussed
further herein below.
[0029] In an embodiment, the server 130 may further be configured
to identify metadata associated with each of the MMCEs. The
metadata may include, for example, a time stamp of the capturing of
the MMCE, the device used for the capturing, a location pointer,
tags, comments, Global Positioning System (GPS) coordinates
associated with the MMCE, and the like. The server 130 may further
be configured to access location data directly from the user device
120, such as from a GPS sensor of a smart phone.
[0030] The database 150 is configured to store previously
generated signatures, concepts that have been previously generated
based on signatures, or a combination thereof. The database 150 is
accessible by the server 130, either via the network 110 (as shown
in FIG. 1) or directly (not shown).
[0031] The SGS 140 and the DCC system 160 are utilized by the
server 130 to perform the various disclosed embodiments. The SGS
140 and the DCC system 160 may be connected to the server 130
directly (not shown) or through the network 110 (as shown in FIG.
1). In certain configurations, the DCC system 160 and the SGS 140
may be embedded in the server 130. In an embodiment, the server 130
is connected to or includes an array of computational cores
configured as discussed in more detail below.
[0032] In an embodiment, the server 130 is configured to access an
input MMCE from the user device 120 and to send the input MMCE to
the SGS 140, the DCC system 160, or both. The decision of which to
use (the SGS 140, the DCC system 160, or both) may be a default
configuration or may depend on the circumstances of the particular
MMCE being analyzed, e.g., the file type, the file size of the
MMCE, the clarity of the content within the MMCE, and the like. In
an embodiment, the SGS 140 receives the input MMCE and returns
signatures generated thereto. The generated signature(s) may be
robust to noise and distortion as discussed regarding FIGS. 4 and 5
below.
[0033] According to another embodiment, the analysis of the input
MMCE may further be based on a concept structure (hereinafter
referred to as a "concept") determined for the input MMCE. A
concept is a collection of signatures representing elements of the
unstructured data and metadata describing the concept. As a
non-limiting example, a `Superman concept` is a signature-reduced
cluster of signatures describing elements (such as MMCEs) related
to, e.g., a Superman cartoon, and a set of metadata providing a
textual representation of the Superman concept. Techniques for
generating concept structures are also described in the
above-referenced U.S. Pat. No. 8,266,185 to Raichelgauz et al., the
contents of which are hereby incorporated by reference.
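As a rough illustration of the concept structure described above, the following Python sketch models a concept as a cluster of signatures plus descriptive metadata. The class name `Concept`, its fields, and the use of integer sets as stand-in signatures are assumptions made for illustration only; the actual signature format and reduction technique are those of the incorporated references, not this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical container: a cluster of signatures plus describing metadata."""
    signatures: list  # each signature is a frozenset of ints (stand-in format)
    metadata: dict = field(default_factory=dict)  # textual description of the concept

    def add_signature(self, signature: frozenset) -> None:
        # Skip exact duplicates; a crude stand-in for signature reduction.
        if signature not in self.signatures:
            self.signatures.append(signature)


superman = Concept(
    signatures=[frozenset({1, 4, 9}), frozenset({1, 4, 12})],
    metadata={"label": "Superman"},
)
superman.add_signature(frozenset({1, 4, 9}))  # duplicate, ignored
superman.add_signature(frozenset({2, 4, 9}))  # new, appended
```

Keeping the metadata separate from the signatures mirrors the text: the signatures describe content, while the metadata gives the textual representation.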
[0034] According to this embodiment, a query is sent to the DCC
system 160 to match the input MMCE to at least one concept. The
identification of a concept matching the input MMCE includes
comparing signatures generated for the input MMCE (such signatures
may be produced by either the SGS 140 or the DCC system 160) to
reference signatures representing predetermined concepts. The
reference signatures to which the input MMCE is compared may be
stored in and accessed from the database 150. The matching can be
performed across all concepts maintained by the DCC system 160.
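The matching described in this paragraph can be sketched as follows, assuming signatures are sets of integers and similarity is measured by Jaccard overlap. Both choices, along with the function names and the 0.5 threshold, are placeholders; the disclosure does not specify the comparison metric at this point.

```python
def jaccard(a: frozenset, b: frozenset) -> float:
    """Overlap between two stand-in signatures, in the range 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_concepts(input_signatures, reference_concepts, threshold=0.5):
    """Return the name of every concept that has a reference signature
    close enough to at least one input signature.

    reference_concepts maps a concept name to its list of reference signatures.
    """
    matched = []
    for name, reference_signatures in reference_concepts.items():
        best = max(
            (jaccard(s, r) for s in input_signatures for r in reference_signatures),
            default=0.0,
        )
        if best >= threshold:
            matched.append(name)
    return matched


references = {
    "street sign": [frozenset({1, 2, 3, 4})],
    "storefront": [frozenset({10, 11, 12})],
}
result = match_concepts([frozenset({1, 2, 3, 5})], references)
```

Because the loop visits every entry in `references`, an input can match several concepts at once, which corresponds to matching across all maintained concepts.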
[0035] Based on the generated signatures, concepts, or both, the
server 130 is configured to identify a precise location indicated,
mentioned, shown, or otherwise represented by at least a portion of
the input MMCE. The location is identified by comparing signatures,
concepts, or both, to reference concepts that are associated with
predetermined precise locations, as further detailed below in FIG.
3. The precise location can further include latitudinal and
longitudinal coordinates representing a place that can be displayed
on a map. The identified precise location can be sent to the user
device 120 for display thereon, such as through an application 125,
e.g., a mapping application.
[0036] It should be appreciated that generating signatures allows
for more accurate analysis of MMCEs in comparison to, for example,
relying on metadata alone. The signatures generated for the MMCEs
allow for recognition and classification of MMCEs such as
content-tracking, video filtering, multimedia taxonomy generation,
video fingerprinting, speech-to-text, audio classification, element
recognition, video/image search, and any other application requiring
content-based signature generation and matching for large content
volumes, such as the web and other large-scale databases. For example,
a signature generated by the SGS 140 for a picture showing a car
enables accurate recognition of the model of the car from any angle
at which the picture was taken.
[0037] It should be noted that only one user device 120 and one
application 125 are discussed with reference to FIG. 1 merely for
the sake of simplicity. However, the embodiments disclosed herein
are applicable to a plurality of user devices that can communicate
with the server 130 via the network 110, where each user device
includes at least one application.
[0038] FIG. 2 shows an example diagram of a DCC system 160 for
creating concepts. The DCC system 160 is configured to receive an
MMCE, for example from the server 130, database 150, or user device
120, via a network interface 260.
[0039] The MMCE is processed by a patch attention processor (PAP)
210, resulting in a plurality of patches that are of specific
interest, or otherwise of higher interest than other patches. A
more general pattern extraction, such as an attention processor
(AP) (not shown) may also be used in lieu of patches. The AP
receives the MMCE that is partitioned into items; an item may be an
extracted pattern or a patch, or any other applicable partition
depending on the type of the MMCE. The functions of the PAP 210 are
described herein below in more detail.
[0040] The patches that are of higher interest are then used by a
signature generator, e.g., the SGS 140 of FIG. 1, to generate
signatures based on the patches. It should be noted that, in some
implementations, the DCC system 160 may include the signature
generator. A clustering processor (CP) 230 inter-matches the
generated signatures once it determines that the number of patches
is above a predefined threshold. The threshold may be defined to be
large enough to enable proper and meaningful clustering. With a
plurality of clusters, a process of cluster reduction takes place
so as to extract the most useful data about each cluster and keep it
at an optimal size to produce meaningful results. The process of
cluster reduction is continuous. When new
signatures are provided after the initial phase of the operation of
the CP 230, the new signatures may be immediately checked against
the reduced clusters to save on the operation of the CP 230. A more
detailed description of the operation of the CP 230 is provided
herein below.
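A minimal sketch of the clustering flow described above, under several assumptions: signatures are integer sets, inter-matching is a greedy single pass using Jaccard overlap, and a cluster is "reduced" to the elements shared by all of its members. The names `MIN_PATCHES`, `MATCH_THRESHOLD`, `cluster_signatures`, `reduce_cluster`, and `matches_reduced` are hypothetical, and the actual algorithm of the CP 230 is not specified in this passage.

```python
MIN_PATCHES = 3        # hypothetical threshold for "enough patches to cluster"
MATCH_THRESHOLD = 0.5  # hypothetical similarity needed to join a cluster


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_signatures(signatures):
    """Greedy inter-matching: a signature joins the first cluster whose seed it matches."""
    if len(signatures) < MIN_PATCHES:
        return []  # too few patches for meaningful clustering
    clusters = []
    for sig in signatures:
        for cluster in clusters:
            if jaccard(sig, cluster[0]) >= MATCH_THRESHOLD:
                cluster.append(sig)
                break
        else:
            clusters.append([sig])  # no match found: start a new cluster
    return clusters


def reduce_cluster(cluster):
    """Keep only the elements common to every member (one possible reduction)."""
    return frozenset.intersection(*cluster)


def matches_reduced(signature, reduced):
    """A new signature matches a reduced cluster if it contains all of its elements."""
    return reduced <= signature


sigs = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({20, 21, 22}),
    frozenset({1, 2, 3, 6}),
]
reduced = [reduce_cluster(c) for c in cluster_signatures(sigs)]
```

Checking a new signature with `matches_reduced` touches only the reduced clusters, which reflects the point that later signatures can be tested without re-running the full clustering.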
[0041] A concept generator (CG) 240 is configured to create concept
structures (hereinafter referred to as concepts) from the reduced
clusters provided by the CP 230. Each concept comprises a plurality
of metadata associated with the reduced clusters. The result is a
compact representation of a concept that can now be easily compared
against an MMCE to determine if the received MMCE matches a concept
stored, for example, in the database 150 of FIG. 1. This can be
done, for example and without limitation, by providing a query to
the DCC system 160 for finding a match between a concept and an
MMCE.
[0042] It should be appreciated that the DCC system 160 can
generate a number of concepts significantly smaller than the number
of MMCEs. For example, if one billion ($10^{9}$) MMCEs need to be
checked for a match against another one billion MMCEs, typically
the result is that no fewer than $10^{9} \times 10^{9} = 10^{18}$
matches have to take place. The DCC system 160 would typically have
around 10 million concepts or fewer, and therefore at most only
$2 \times 10^{6} \times 10^{9} = 2 \times 10^{15}$ comparisons need
to take place, a mere 0.2% of the number of matches that would have
to be made by other solutions. As the number of concepts grows
significantly more slowly than the number of MMCEs, the advantages
of the DCC system 160 would be apparent to one with ordinary skill
in the art.
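The comparison counts above can be checked with a few lines of Python. The sketch uses the figures exactly as they appear in the formula, which multiplies by 2 × 10^6 even though the prose mentions around 10 million concepts; the variable names are illustrative.

```python
mmce_count = 10**9                      # one billion MMCEs on each side
pairwise = mmce_count * mmce_count      # direct MMCE-to-MMCE matching
concept_based = 2 * 10**6 * mmce_count  # concept comparisons, per the formula in the text

# Share of the pairwise workload that the concept-based approach requires.
ratio_percent = 100 * concept_based / pairwise
```

With these inputs, `pairwise` is 10^18, `concept_based` is 2 × 10^15, and `ratio_percent` evaluates to 0.2, in line with the percentage stated in the paragraph.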
[0043] FIG. 3 is a flowchart illustrating a method 300 for
determining a precise location based on an analysis of multimedia
content elements according to an embodiment.
[0044] At optional S310, a request for a precise location is
received. The request may include user input, such as a textual
query including a street address, an intersection of two or more
streets, a name of a place of interest, e.g., a business name, and
the like. In an embodiment, the request includes a selection of an
area on a map, such as dropping a pin on a location within a
mapping application. According to an embodiment, the request may be
received from a user via a user device. The user device may be, for
example, a mobile phone, a smart phone, a personal computer (PC), a
tablet computer, an electronic wearable device (e.g., glasses, a
watch, etc.), and the like.
[0045] At S320, one or more input MMCEs are received for analysis.
In an embodiment, the input MMCEs are received from the user
device. The input MMCEs may include an image, a graphic, a video
stream, a video clip, an audio stream, an audio clip, a video
frame, a photograph, combinations thereof and portions thereof. In
an embodiment, the input MMCEs are captured by the user device at a
desired location (e.g., a desired location for pickup), and include
one or more images of the desired location surroundings, such as
buildings, street signs, natural or artificial landmarks,
infrastructure, and the like.
[0046] At S330, each input MMCE is analyzed in order to identify a
precise location associated with the input MMCE. The analysis
includes generation of at least one signature based on each input
MMCE. In an embodiment, the signatures are generated by a signature
generation system or a deep-content classification system, as
discussed herein, which may generate a signature for an MMCE via a
large number of at least partially statistically independent
computational cores. The signatures may be generated for one or
more elements depicted within an MMCE. For example, if an MMCE is a
photograph of a street corner, where the photograph includes an
image of various elements, such as a storefront, a street sign, and
a tree, a signature may be generated for each of the various
elements.
[0047] The analysis may further include analysis of metadata
associated with the MMCE. Metadata may include a time stamp of the
capturing of the MMCE, the device used for the capturing, a
location pointer, tags, comments, Global Positioning System
coordinates associated with the MMCE, and the like. The metadata
analysis may further include generating signatures to the
metadata.
[0048] In an embodiment, the analysis further includes the
determination of a concept based on each generated signature. The
concepts are generated by a process of inter-matching of the
signatures once it is determined that there is a number of elements
therein above a predefined threshold. That threshold needs to be
large enough to enable proper and meaningful clustering.
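As a rough illustration of the inter-matching described above, the
following Python sketch groups signatures that match one another and
keeps only the groups whose element count is above a threshold. The
representation of a signature as a set of active core indices, the
Jaccard overlap measure, the greedy grouping, and the names
`form_concepts`, `match_threshold`, and `element_threshold` are
assumptions made for illustration only and are not taken from the
disclosure.

```python
from typing import FrozenSet, Iterable, List

# Assumed representation: a signature is the set of core indices that fired.
Signature = FrozenSet[int]


def overlap(a: Signature, b: Signature) -> float:
    """Jaccard overlap of two signatures (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def form_concepts(
    signatures: Iterable[Signature],
    match_threshold: float = 0.5,
    element_threshold: int = 2,
) -> List[List[Signature]]:
    """Greedily group inter-matching signatures and keep only groups whose
    element count is above element_threshold."""
    clusters: List[List[Signature]] = []
    for sig in signatures:
        for cluster in clusters:
            # Join the first cluster that contains a sufficiently similar member.
            if any(overlap(sig, member) >= match_threshold for member in cluster):
                cluster.append(sig)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([sig])
    return [c for c in clusters if len(c) > element_threshold]


example = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({1, 2, 3, 4}),
    frozenset({9, 10}),
]
concepts = form_concepts(example)
```

In this sketch the first three signatures form one candidate concept,
while the isolated signature is discarded because its cluster is too
small for meaningful clustering.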
[0049] Each concept is a collection of signatures representing
MMCEs and metadata describing the concept, and acts as an abstract
description of the content to which the signature was generated. As
a non-limiting example, a `Superman concept` is a signature-reduced
cluster of signatures representing elements (such as MMCEs) related
to, e.g., a Superman cartoon, and a set of metadata including a
textual representation of the Superman concept. As another example,
metadata of a concept represented by the signature generated for a
picture showing a bouquet of red roses is "flowers." As yet another
example, metadata of a concept represented by the signature
generated for a picture showing a bouquet of wilted roses is
"wilted flowers".
[0050] At S340, based on the analysis, the input MMCE is matched to
a database. The matching includes comparing the generated
signatures, determined concepts, or both, to one or more previously
generated reference concepts. Each reference concept is associated
with a reference location. The reference concepts may be stored in
and accessed from a database, such as the database 150 of FIG.
1.
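The matching of S340 can be sketched as a nearest-reference lookup.
The Python example below is a minimal sketch under stated
assumptions: signatures are modeled as sets of active core indices,
similarity is a Jaccard overlap, and `ReferenceConcept`,
`match_reference`, and `min_score` are hypothetical names. The
reference entries are placeholder data, not contents of the database
150.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class ReferenceConcept:
    """Hypothetical record: a reference concept and its reference location."""
    label: str
    signature: FrozenSet[int]
    location: str


def similarity(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard overlap (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_reference(
    query: FrozenSet[int],
    references: Sequence[ReferenceConcept],
    min_score: float = 0.6,
) -> Optional[ReferenceConcept]:
    """Return the best-scoring reference concept, or None if no reference
    reaches min_score."""
    best: Optional[ReferenceConcept] = None
    best_score = min_score
    for ref in references:
        score = similarity(query, ref.signature)
        if score >= best_score:
            best, best_score = ref, score
    return best


references = [
    ReferenceConcept("bull statue", frozenset({1, 2, 3, 4, 5}), "Broadway, lower Manhattan"),
    ReferenceConcept("fountain", frozenset({10, 11, 12}), "placeholder plaza"),
]
match = match_reference(frozenset({1, 2, 3, 4}), references)
```

Here `match.location` carries the reference location associated with
the matched concept; a query that resembles no reference yields
`None`.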
[0051] As a non-limiting example, the determined concept of the
input MMCE may indicate the left side of a particular statue of a
bull, and the reference concepts may associate that exact statue
with a particular location, e.g., the Charging Bull statue located
in the Wall Street district of lower Manhattan on Broadway, where
the Charging Bull is facing north. In this case, it is determined
that the MMCE was captured from the western side of the Charging
Bull statue.
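The reasoning in this example (an object facing north, seen from its
left side, implies a capture position to the west) can be sketched in
Python as follows. The function name `capture_side`, the side labels,
and the convention of measuring headings clockwise from north in
degrees are assumptions made for illustration.

```python
# Clockwise offset, in degrees, from the direction the object faces to the
# outward direction of each of its sides (assumed convention).
SIDE_OFFSET = {"front": 0, "right": 90, "back": 180, "left": 270}
COMPASS = ["north", "east", "south", "west"]


def capture_side(facing_deg: float, visible_side: str) -> str:
    """Return the compass side of the object from which the image was taken.

    facing_deg is the heading the object faces (0 = north, 90 = east).
    visible_side is the side of the object that appears in the image.
    """
    bearing = (facing_deg + SIDE_OFFSET[visible_side]) % 360
    # Snap the bearing to the nearest of the four cardinal directions.
    return COMPASS[round(bearing / 90) % 4]


side = capture_side(0, "left")
```

With the statue facing north (`facing_deg = 0`) and its left side
visible, the sketch reports a capture position on the western side.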
[0052] At S350, location coordinates associated with the determined
location are determined. The location coordinates may be generated
based on a predetermined list of reference coordinates associated
with the matching reference location. The reference coordinates may
be accessed from a database, or queried from a web source over a
network. The coordinates may be generated in a numerical format
using a geographic coordinate system, such as latitudinal and
longitudinal values including degrees, minutes, and seconds.
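As an illustration of the numerical format mentioned above, the
following Python sketch converts a decimal-degree value into degrees,
minutes, and seconds with a hemisphere letter. The function name
`to_dms` and the rounding to whole seconds are assumptions; the
disclosure does not specify a particular conversion.

```python
from typing import Tuple


def to_dms(decimal_degrees: float, is_latitude: bool) -> Tuple[int, int, int, str]:
    """Convert decimal degrees to (degrees, minutes, seconds, hemisphere)."""
    if is_latitude:
        hemisphere = "N" if decimal_degrees >= 0 else "S"
    else:
        hemisphere = "E" if decimal_degrees >= 0 else "W"
    # Work in whole seconds so that rounding never produces 60 seconds.
    total_seconds = round(abs(decimal_degrees) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds, hemisphere


latitude = to_dms(40.5, is_latitude=True)
longitude = to_dms(-74.25, is_latitude=False)
```

The input values 40.5 and -74.25 are arbitrary test values and are
not coordinates taken from the disclosure.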
[0053] At optional S360, the determined location coordinates are
provided to a user device, e.g., through a mapping application. In
an embodiment, the location coordinates are provided to a second
user. For example, if a first user sends a location request from a
first user device using a ride-hiring application, the determined
location coordinates may be provided to a second user on an
application running on a second user device, where the second user
is offering ride-hiring services. In some implementations, the
first user may provide additional information, such as an image of
themselves, to allow the second user to identify the first user
more easily. The additional information may include the input
MMCEs.
[0054] At S370, it is checked if additional MMCEs have been
received, and if so, execution continues with S330; otherwise,
execution terminates.
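The flow of S320 through S370 can be summarized as a loop over the
received MMCEs. The Python sketch below is only a schematic outline:
`process_mmces` and its three callable parameters are hypothetical
stand-ins for the analysis (S330), matching (S340), and coordinate
determination (S350) steps, whose actual implementations are not
specified here.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

Coordinates = Tuple[float, float]
M = TypeVar("M")  # an input MMCE
S = TypeVar("S")  # the result of analyzing an MMCE
P = TypeVar("P")  # a matched reference location


def process_mmces(
    mmces: Iterable[M],
    analyze: Callable[[M], S],
    match: Callable[[S], Optional[P]],
    lookup_coordinates: Callable[[P], Coordinates],
) -> List[Optional[Coordinates]]:
    """Run S330 to S350 for every received MMCE (S320, S370)."""
    results: List[Optional[Coordinates]] = []
    for mmce in mmces:                  # S370: continue while MMCEs remain
        analysis = analyze(mmce)        # S330: signatures and/or concepts
        location = match(analysis)      # S340: compare to reference concepts
        if location is None:
            results.append(None)        # no reference concept matched
        else:
            results.append(lookup_coordinates(location))  # S350
    return results
```

Passing the steps in as callables keeps the loop independent of how
signatures are generated or where reference coordinates are stored.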
[0055] FIGS. 4 and 5 illustrate the generation of signatures for
the multimedia content elements by the SGS 140 according to one
embodiment. An exemplary high-level description of the process for
large scale matching is depicted in FIG. 4. In this example, the
matching is for a video content.
[0056] Video content segments 2 from a Master database (DB) 6 and a
Target DB 1 are processed in parallel by a large number of
independent computational Cores 3 that constitute an architecture
for generating the Signatures (hereinafter the "Architecture").
Further details on the computational Cores generation are provided
below.
[0057] The independent Cores 3 generate a database of Robust
Signatures and Signatures 4 for Target content-segments 5 and a
database of Robust Signatures and Signatures 7 for Master
content-segments 8. An exemplary and non-limiting process of
signature generation for an audio component is shown in detail in
FIG. 5. Finally, Target Robust Signatures and/or Signatures are
effectively matched, by a matching algorithm 9, to Master Robust
Signatures and/or Signatures database to find all matches between
the two databases.
[0058] To demonstrate an example of the signature generation
process, it is assumed, merely for the sake of simplicity and
without limitation on the generality of the disclosed embodiments,
that the signatures are based on a single frame, leading to certain
simplification of the computational cores generation. The Matching
System is extensible for signatures generation capturing the
dynamics in-between the frames. In an embodiment, the signature
generator 140 is configured with a plurality of computational cores
to perform matching between signatures.
[0059] The Signatures' generation process is now described with
reference to FIG. 5. The first step in the process of signatures
generation from a given speech-segment is to break down the
speech-segment into K patches 14 of random length P and random
position within the speech segment 12. The breakdown is performed
by the patch generator component 21. The values of the number of
patches K, the random length P, and the random position parameters
are determined based on optimization, considering the tradeoff
between accuracy rate and the number of fast matches required in
the flow process of the server 130 and SGS 140. Thereafter, all the
K patches are injected in parallel into all computational Cores 3
to generate K response vectors 22, which are fed into a signature
generator system 23 to produce a database of Robust Signatures and
Signatures 4.
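The patch breakdown described above can be sketched in Python as
follows. This is a minimal illustration, not the patch generator
component 21 itself: the function name `generate_patches`, the
uniform sampling of length and position, and the `min_len`,
`max_len`, and `seed` parameters are assumptions. In the described
process the values of K and the length and position parameters result
from optimization, which is not modeled here.

```python
import random
from typing import List, Optional, Sequence


def generate_patches(
    segment: Sequence[float],
    k: int,
    min_len: int,
    max_len: int,
    seed: Optional[int] = None,
) -> List[List[float]]:
    """Cut k contiguous patches of random length and random position."""
    if not 1 <= min_len <= max_len <= len(segment):
        raise ValueError("require 1 <= min_len <= max_len <= len(segment)")
    rng = random.Random(seed)  # a seed makes the sketch reproducible
    patches: List[List[float]] = []
    for _ in range(k):
        length = rng.randint(min_len, max_len)           # random length P
        start = rng.randint(0, len(segment) - length)    # random position
        patches.append(list(segment[start:start + length]))
    return patches


segment = [float(i) for i in range(100)]  # stand-in for a speech segment
patches = generate_patches(segment, k=8, min_len=5, max_len=20, seed=0)
```

Each of the K patches would then be injected into all computational
cores to obtain one response vector per patch.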
[0060] In order to generate Robust Signatures, i.e., Signatures
that are robust to additive noise L (where L is an integer equal to
or greater than 1) by the Computational Cores 3, a frame `i` is
injected into all the Cores 3. Then, Cores 3 generate two binary
response vectors: one which is a Signature vector, and one which is
a Robust Signature vector.
[0061] For generation of signatures robust to additive noise, such
as White-Gaussian-Noise, scratch, etc., but not robust to
distortions, such as crop, shift and rotation, etc., a core
$C_i = \{n_i\}$ ($1 \le i \le L$) may consist of a single leaky
integrate-to-threshold unit (LTU) node or more nodes. The node
$n_i$ equations are:

$$V_i = \sum_j w_{ij} k_j$$

$$n_i = \theta(V_i - Th_x)$$

[0062] where $\theta$ is a Heaviside step function; $w_{ij}$ is a
coupling node unit (CNU) between node i and image component j (for
example, grayscale value of a certain pixel j); $k_j$ is an image
component `j` (for example, grayscale value of a certain pixel j);
$Th_x$ is a constant Threshold value, where `x` is `S` for
Signature and `RS` for Robust Signature; and $V_i$ is a Coupling
Node Value.
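The node equations above can be sketched in Python as a weighted sum
followed by a step function. This is a minimal sketch under stated
assumptions: the weights, the image components, and both threshold
values are arbitrary illustrative numbers, the Heaviside function is
taken to be 0 at exactly 0, and the Robust Signature threshold is
assumed to be the higher of the two.

```python
from typing import List, Sequence


def heaviside(x: float) -> int:
    """Heaviside step; the value at exactly 0 is taken as 0 (an assumption)."""
    return 1 if x > 0 else 0


def response_vector(
    weights: Sequence[Sequence[float]],
    components: Sequence[float],
    threshold: float,
) -> List[int]:
    """Binary response of every node i for one frame.

    weights[i][j] couples node i to image component j, components[j] is the
    value of image component j, and threshold is the constant threshold.
    """
    bits: List[int] = []
    for row in weights:
        v = sum(w * k for w, k in zip(row, components))  # coupling node value
        bits.append(heaviside(v - threshold))
    return bits


weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # illustrative values only
components = [0.2, 0.9]                          # e.g. two grayscale values
signature = response_vector(weights, components, threshold=0.5)
robust_signature = response_vector(weights, components, threshold=1.0)
```

With these numbers the Signature is `[0, 1, 1]` and the Robust
Signature is `[0, 0, 1]`: every node that fires for the Robust
Signature also fires for the Signature, because it cleared the higher
threshold.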
[0063] The Threshold values $Th_x$ are set differently for
Signature generation and for Robust Signature generation. For
example, for a certain distribution of $V_i$ values (for the set
of nodes), the thresholds for Signature ($Th_S$) and Robust
Signature ($Th_{RS}$) are set apart, after optimization, according
to one or more of the following criteria:
[0064] 1: For $V_i > Th_{RS}$,

$$1 - p(V > Th_S) = 1 - (1 - \epsilon)^{l} \ll 1$$

i.e., given that l nodes (cores) constitute a Robust Signature of a
certain image I, the probability that not all of these l nodes will
belong to the Signature of the same, but noisy, image is
sufficiently low (according to a system's specified accuracy).
[0065] 2:

$$p(V_i > Th_{RS}) \approx l/L$$

i.e., approximately l out of the total L nodes can be found to
generate a Robust Signature according to the above definition.
[0066] 3: Both Robust Signature and Signature are generated for a
certain frame i.
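The second criterion, under which approximately l of the L nodes
exceed the Robust Signature threshold, can be illustrated by picking
the threshold from the ranked node values. The Python sketch below is
one assumed way to do this and is not the optimization referred to
above; the name `pick_robust_threshold` and the midpoint rule are
hypothetical.

```python
from typing import List, Sequence


def pick_robust_threshold(values: Sequence[float], l_nodes: int) -> float:
    """Return a threshold that about l_nodes of the given values exceed.

    The threshold is placed midway between the l-th and (l+1)-th largest
    values. Ties at that boundary can leave fewer than l_nodes above it.
    """
    if not 1 <= l_nodes < len(values):
        raise ValueError("require 1 <= l_nodes < len(values)")
    ranked = sorted(values, reverse=True)
    return (ranked[l_nodes - 1] + ranked[l_nodes]) / 2.0


node_values = [0.1, 0.9, 0.4, 0.7, 0.3]  # illustrative values for L = 5 nodes
th_rs = pick_robust_threshold(node_values, l_nodes=2)
robust_nodes: List[int] = [i for i, v in enumerate(node_values) if v > th_rs]
```

Here two of the five nodes, those at indices 1 and 3, exceed the
chosen threshold, so the fraction of nodes above it is l/L = 2/5.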
[0067] It should be understood that the generation of a signature
is unidirectional, and typically yields lossy compression, where
the characteristics of the compressed data are maintained but the
uncompressed data cannot be reconstructed. Therefore, a signature
can be used for the purpose of comparison to another signature
without the need of comparison to the original data. The detailed
description of the Signature generation can be found in U.S. Pat.
Nos. 8,326,775 and 8,312,031, assigned to the common assignee, which
are hereby incorporated by reference for all the useful information
they contain.
[0068] A Computational Core generation is a process of definition,
selection, and tuning of the parameters of the cores for a certain
realization in a specific system and application. The process is
based on several design considerations, such as:
[0069] (a) The Cores should be designed so as to obtain maximal
independence, i.e., the projection from a signal space should
generate a maximal pair-wise distance between any two cores'
projections into a high-dimensional space.
[0070] (b) The Cores should be optimally designed for the type of
signals, i.e., the Cores should be maximally sensitive to the
spatio-temporal structure of the injected signal, for example, and
in particular, sensitive to local correlations in time and space.
Thus, in some cases a core represents a dynamic system, such as in
state space, phase space, edge of chaos, etc., which is uniquely
used herein to exploit their maximal computational power.
[0071] (c) The Cores should be optimally designed with regard to
invariance to a set of signal distortions, of interest in relevant
applications.
[0072] The Computational Core generation and the process for
configuring such cores are discussed in more detail in the
above-referenced U.S. Pat. No. 8,655,801, the contents of which are
hereby incorporated by reference.
[0073] As used herein, the phrase "at least one of" followed by a
listing of items means that any of the listed items can be utilized
individually, or any combination of two or more of the listed items
can be utilized. For example, if a system is described as including
"at least one of A, B, and C," the system can include A alone; B
alone; C alone; A and B in combination; B and C in combination; A
and C in combination; or A, B, and C in combination.
[0074] The various embodiments disclosed herein can be implemented
as hardware, firmware, software, or any combination thereof.
Moreover, the software is preferably implemented as an application
program tangibly embodied on a program storage unit or computer
readable medium consisting of parts, or of certain devices and/or a
combination of devices. The application program may be uploaded to,
and executed by, a machine comprising any suitable architecture.
Preferably, the machine is implemented on a computer platform
having hardware such as one or more central processing units
("CPUs"), a memory, and input/output interfaces. The computer
platform may also include an operating system and microinstruction
code. The various processes and functions described herein may be
either part of the microinstruction code or part of the application
program, or any combination thereof, which may be executed by a
CPU, whether or not such a computer or processor is explicitly
shown. In addition, various other peripheral units may be connected
to the computer platform such as an additional data storage unit
and a printing unit. Furthermore, a non-transitory computer
readable medium is any computer readable medium except for a
transitory propagating signal.
[0075] All examples and conditional language recited herein are
intended for pedagogical purposes to aid the reader in
understanding the principles of the disclosed embodiment and the
concepts contributed by the inventor to furthering the art, and are
to be construed as being without limitation to such specifically
recited examples and conditions. Moreover, all statements herein
reciting principles, aspects, and embodiments of the disclosed
embodiments, as well as specific examples thereof, are intended to
encompass both structural and functional equivalents thereof.
Additionally, it is intended that such equivalents include both
currently known equivalents as well as equivalents developed in the
future, i.e., any elements developed that perform the same
function, regardless of structure.
* * * * *