U.S. patent application number 14/616197 was filed with the patent office on 2015-02-06 and published on 2016-07-14 as publication number 20160203137 for imputing knowledge graph attributes to digital multimedia based on image and video metadata.
The applicant listed for this patent is INSNAP, INC. Invention is credited to Shafaq ABDULLAH, Mohammad SABAH, and Mohammad Iman SADREDDIN.
Application Number: 14/616197
Publication Number: 20160203137
Document ID: /
Family ID: 56129838
Publication Date: 2016-07-14
United States Patent Application: 20160203137
Kind Code: A1
SABAH; Mohammad; et al.
July 14, 2016
IMPUTING KNOWLEDGE GRAPH ATTRIBUTES TO DIGITAL MULTIMEDIA BASED ON
IMAGE AND VIDEO METADATA
Abstract
Techniques are disclosed herein for imputing attributes to
multimedia content (e.g., an image or a video) based on metadata of
the multimedia content. An analysis tool of a multimedia service
platform evaluates content metadata to identify at least a time and
a first location at which the content was captured. The time and
first location are correlated with a knowledge graph. The knowledge
graph stores information describing a plurality of locations,
including the first location. The knowledge graph also stores
information describing events scheduled to occur at one or more of
the plurality of locations. The analysis tool identifies, from the
correlated information, one or more of the attributes of the first
location to impute to the content. Thereafter, the analysis tool
imputes the identified attributes to the content.
Inventors: SABAH; Mohammad (San Jose, CA); SADREDDIN; Mohammad Iman (Santa Clara, CA); ABDULLAH; Shafaq (Belmont, CA)

Applicant:
    Name          City         State  Country  Type
    INSNAP, INC.  Santa Clara  CA     US

Family ID: 56129838
Appl. No.: 14/616197
Filed: February 6, 2015
Related U.S. Patent Documents

    Application Number  Filing Date   Patent Number
    62093372            Dec 17, 2014
Current U.S. Class: 707/738
Current CPC Class: G06F 16/583 20190101; G06N 5/02 20130101; G06F 16/24578 20190101; G06F 16/51 20190101; G06N 5/048 20130101; G06F 16/48 20190101; G06F 16/5838 20190101; G06Q 30/0269 20130101; G06F 16/285 20190101; G06F 16/9024 20190101; G06F 16/9038 20190101; G06F 16/532 20190101; G06F 16/2379 20190101
International Class: G06F 17/30 20060101 G06F017/30
Claims
1. A method for imputing attributes to multimedia content based on
metadata of the content, the method comprising: evaluating content
metadata to identify at least a time and a first location at which
the content was captured; correlating the time and the first
location with a knowledge graph, wherein the knowledge graph stores
information describing a plurality of locations, including the
first location, and describes events scheduled to occur at one or
more of the plurality of locations; identifying, from the
correlated information, one or more of the attributes of the first
location to impute to the content; and imputing the identified
attributes to the content.
2. The method of claim 1, wherein the identified attributes include a name of the first location and an event scheduled to occur at the first location during the time.
3. The method of claim 2, wherein the identified attributes further
include at least one of a name of the event, a start time of the
event, and a price range of the event.
4. The method of claim 1, further comprising: for each attribute
imputed to the content, updating a user-attribute matrix to reflect
the attribute imputed to the content.
5. The method of claim 4, further comprising: evaluating the
user-attribute matrix to determine at least a first concept,
wherein the first concept is determined based on a co-occurrence
between one or more attributes in the user-attribute matrix.
6. The method of claim 1, wherein the data sources include at
least one of an events source, an online encyclopedia source, and a
weather source.
7. The method of claim 1, wherein the time and first location are
correlated with the knowledge graph within a specified
spatiotemporal range.
8. The method of claim 1, wherein the multimedia content is one of
an image or a video.
9. A non-transitory computer-readable storage medium storing
instructions which, when executed on a processor, perform an operation for imputing attributes to multimedia content based on metadata of the content, the operation comprising: evaluating content
metadata to identify at least a time and a first location at which
the content was captured; correlating the time and the first
location with a knowledge graph, wherein the knowledge graph stores
information describing a plurality of locations, including the
first location, and describes events scheduled to occur at one or
more of the plurality of locations; identifying, from the
correlated information, one or more of the attributes of the first
location to impute to the content; and imputing the identified
attributes to the content.
10. The computer-readable storage medium of claim 9, wherein
the identified attributes include a name of the first location and an event scheduled to occur at the first location during the time.
11. The computer-readable storage medium of claim 10, wherein the
identified attributes further include at least one of a name of the
event, a start time of the event, and a price range of the
event.
12. The computer-readable storage medium of claim 9, wherein the
operation further comprises: for each attribute imputed to the
content, updating a user-attribute matrix to reflect the attribute
imputed to the content.
13. The computer-readable storage medium of claim 12, wherein the
operation further comprises: evaluating the user-attribute matrix
to determine at least a first concept, wherein the first concept is
determined based on a co-occurrence between one or more attributes
in the user-attribute matrix.
14. The computer-readable storage medium of claim 9, wherein the
data sources include at least one of an events source, an online
encyclopedia source, and a weather source.
15. The computer-readable storage medium of claim 9, wherein the
time and first location are correlated with the knowledge graph
within a specified spatiotemporal range.
16. The computer-readable storage medium of claim 9, wherein the
multimedia content is one of an image or a video.
17. A system, comprising: a processor; and a memory storing one or
more application programs configured to perform an operation for
imputing attributes to multimedia content based on metadata of the
content, the operation comprising: evaluating content metadata to
identify at least a time and a first location at which the content
was captured; correlating the time and the first location with a
knowledge graph, wherein the knowledge graph stores information
describing a plurality of locations, including the first location,
and describes events scheduled to occur at one or more of the
plurality of locations; identifying, from the correlated
information, one or more of the attributes of the first location to
impute to the content; and imputing the identified attributes to
the content.
18. The system of claim 17, wherein the identified attributes include a name of the first location and an event scheduled to occur at the first location during the time.
19. The system of claim 18, wherein the identified attributes
further include at least one of a name of the event, a start time
of the event, and a price range of the event.
20. The system of claim 17, wherein the operation further
comprises: for each attribute imputed to the content, updating a
user-attribute matrix to reflect the attribute imputed to the
content.
21. The system of claim 17, wherein the data sources include at
least one of an events source, an online encyclopedia source, and a
weather source.
22. The system of claim 17, wherein the time and first location are
correlated with the knowledge graph within a specified
spatiotemporal range.
23. The system of claim 17, wherein the multimedia content is one
of an image or a video.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional
Application No. 62/093,372, filed Dec. 17, 2014. The content of the
aforementioned application is incorporated by reference in its
entirety.
BACKGROUND
[0002] 1. Field
[0003] Embodiments of the present disclosure generally relate to
data analytics. More specifically, embodiments relate to imputing information from a knowledge graph to an image based on image metadata.
[0004] 2. Description of the Related Art
[0005] Individuals take images to capture personal experiences and
events. The images can represent mementos of various times and
places experienced in an individual's life.
[0006] In addition, mobile devices (e.g., smart phones, tablets,
etc.) allow individuals to easily capture both digital images as
well as record video. For instance, cameras in mobile devices have
steadily improved in quality and can capture high-resolution images. Further, mobile devices now commonly have enough storage capacity to hold thousands of images. And because
individuals carry smart phones around with them, they can capture
images and videos virtually anywhere.
[0007] This has resulted in an explosion of multimedia content, as
virtually anyone can capture and share digital images and videos
via text message, image services, social media, video services, and
the like. This volume of digital multimedia, now readily available,
provides a variety of information.
SUMMARY
[0008] One embodiment presented herein describes a method for
imputing attributes to multimedia content based on the metadata of
the content. The method generally includes evaluating content
metadata to identify at least a time and a first location at which
the content was captured. The method also includes correlating the
time and the first location with a knowledge graph. The knowledge
graph stores information describing a plurality of locations,
including the first location. The knowledge graph also describes
events scheduled to occur at one or more of the plurality of
locations. The method also includes identifying, from the
correlated information, one or more of the attributes of the first
location to impute to the content. The identified attributes are
imputed to the content.
[0009] Other embodiments include, without limitation, a
computer-readable medium that includes instructions that enable a
processing unit to implement one or more aspects of the disclosed
methods as well as a system having a processor, memory, and
application programs configured to implement one or more aspects of
the disclosed methods.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] So that the manner in which the above recited features of
the present disclosure can be understood in detail, a more
particular description of the disclosure, briefly summarized above,
may be had by reference to embodiments, some of which are
illustrated in the appended drawings. It is to be noted, however,
that the appended drawings illustrate only exemplary embodiments and are therefore not to be considered limiting of its scope, for the disclosure may admit to other equally effective embodiments.
[0011] FIG. 1 illustrates an example computing environment,
according to one embodiment.
[0012] FIG. 2 further illustrates the mobile application described
relative to FIG. 1, according to one embodiment.
[0013] FIG. 3 further illustrates the analysis tool described
relative to FIG. 1, according to one embodiment.
[0014] FIG. 4 illustrates a method for building a knowledge graph
used to impute attributes onto image metadata, according to one
embodiment.
[0015] FIG. 5 illustrates a method for imputing knowledge graph
attributes onto image metadata, according to one embodiment.
[0016] FIG. 6 illustrates an application server computing system
configured to impute knowledge graph attributes onto image
metadata, according to one embodiment.
[0017] To facilitate understanding, identical reference numerals
have been used, where possible, to designate identical elements
that are common to the figures. It is contemplated that elements
and features of one embodiment may be beneficially incorporated in
other embodiments without further recitation.
DETAILED DESCRIPTION
[0018] Embodiments presented herein describe techniques for
inferring user interests from metadata associated with digital
multimedia (e.g., images and video). Digital multimedia provides a
wealth of information valuable to third parties (e.g., advertisers,
marketers, and the like). For example, assume an individual takes
pictures at a golf course using a mobile device (e.g., a smart
phone, tablet, etc.). Further, assume that the pictures are the
only indication the individual was at the golf course (e.g.,
because the individual made only cash purchases and signed no
registers). Metadata associated with this image can place the
individual at the golf course at a specific time. Further, event
data could be used to determine what was going on at that
time (e.g., a specific tournament). Such information may be useful
to third parties, e.g., for targeted advertising and
recommendations.
[0019] However, an advertiser might not be able to identify an
effective audience for targeting a given product or service based
on such information alone. Even if image metadata places an
individual at a golf course at a particular point of time, the
advertiser might draw inaccurate inferences about the individual.
For example, the advertiser might assume that because the metadata
places the individual at a high-end golf course, the individual is
interested in high-end golf equipment. The advertiser might then
recommend other high-end equipment or other golf courses to that
individual. If the individual rarely plays golf or does not usually spend money at high-end locations, such recommendations may lead to low conversion rates for the advertiser. Historically, advertisers have generally been forced to accept low conversion rates, as
techniques for identifying individuals likely to be receptive to or
interested in a given product or service are often ineffective.
[0020] Embodiments presented herein describe techniques for
inferring user interests based on metadata of digital multimedia.
Specifically, embodiments describe techniques for imputing
knowledge graph attributes to digital images and videos using
metadata. In one embodiment, a multimedia service platform provides
a software development kit (SDK) that third parties (e.g., retailers, marketers, etc.) may use in their mobile applications to access an application programming interface (API) of the
platform. The mobile application can use the API to upload images
and videos to the platform from a mobile device of a user. Further,
the multimedia service platform may identify patterns from metadata
extracted from images and videos. The metadata may describe where
and when a given image or video was taken. Further, in many cases,
embodiments presented herein can identify latent relationships
between user interests from collections of image metadata from
multiple users. For example, if many users who take pictures at
golf courses also take pictures at an unrelated event (e.g., a traveling museum exhibit), then the system disclosed
herein can discover a relationship between the interests.
Thereafter, advertising related to golfing products and services
could be targeted to individuals who publish pictures of the
traveling museum exhibit, regardless of any other known interest
in golf.
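The co-occurrence signal described above can be computed directly from per-user attribute counts. The following Python sketch assumes a small dense user-attribute count matrix; the attribute names and counts are illustrative, not data from the disclosure.

    # Minimal sketch: surfacing latent attribute relationships via co-occurrence.
    # The matrix layout mirrors the user-attribute matrix described later; all
    # names here are illustrative, not the platform's actual data model.
    import numpy as np

    attributes = ["golf_course", "museum_exhibit", "ski_resort"]
    # Rows: users; columns: counts of images imputed with each attribute.
    user_attr = np.array([
        [5, 3, 0],
        [2, 4, 0],
        [0, 0, 7],
    ])

    # Binarize (did the user ever exhibit the attribute?), then count how
    # often each pair of attributes co-occurs across users.
    present = (user_attr > 0).astype(int)
    cooccurrence = present.T @ present

    i, j = attributes.index("golf_course"), attributes.index("museum_exhibit")
    print(f"users with both golf and museum images: {cooccurrence[i, j]}")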
[0021] In one embodiment, the multimedia service platform evaluates
metadata corresponding to each image or video submitted to the
platform against a knowledge graph. The knowledge graph provides a
variety of information about events, places, dates, times, etc.
that may be compared with metadata. For example, the knowledge
graph may include weather data, location data, event data, and
online encyclopedia data. For instance, attributes associated with
an event may include a name, location, start time, end time, price
range, etc. The multimedia service platform correlates
spatiotemporal metadata from a digital image with a specific event
in the knowledge graph. That is, the knowledge graph is used to
impute attributes related to events, places, dates, times, etc., to
a given digital multimedia file based on the metadata provided with
that file.
[0022] To build the knowledge graph, a tool extracts unstructured
and structured text data from external sources, such as local news
and media websites, online event schedules for performance venues,
calendars published by schools, government, or private enterprises,
online schedules and ticket sales, etc. The tool applies pattern
matching and Natural Language Processing (NLP) techniques to
recognize relevant content from the text. For example, pattern
matching and NLP techniques may be used to identify location,
performer, ticket price range, genre, from a calendar of events at
a particular venue. In other cases, NLP techniques may be used to
derive information about a given location, e.g., that a given
physical location is a park, a school, or a ski resort,
irrespective of any particular event.
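As a rough illustration of the pattern-matching step, the Python sketch below pulls event fields out of a labeled listing with regular expressions. The field labels and listing format are assumptions for illustration; a production scraper would need source-specific patterns and fuller NLP.

    # Sketch of extracting event attributes from raw scraped text, assuming
    # listings with labeled fields such as "Name:" and "Date:".
    import re

    LISTING = """
    Name: Summer Jazz Night
    Date: 2015-07-18
    Venue: Riverside Amphitheater
    Price: $25-$60
    """

    FIELD_PATTERNS = {
        "name":  re.compile(r"^\s*Name:\s*(.+)$", re.MULTILINE),
        "date":  re.compile(r"^\s*Date:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE),
        "venue": re.compile(r"^\s*Venue:\s*(.+)$", re.MULTILINE),
        "price": re.compile(r"^\s*Price:\s*(\$\d+-\$\d+)\s*$", re.MULTILINE),
    }

    def extract_event(text):
        """Return a dict of attributes recognized in one raw listing."""
        event = {}
        for field, pattern in FIELD_PATTERNS.items():
            match = pattern.search(text)
            if match:
                event[field] = match.group(1).strip()
        return event

    print(extract_event(LISTING))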
[0023] In one embodiment, the multimedia service platform imputes
attributes related to events, locations, etc., to images captured by
users. That is, the multimedia service platform associates
attributes of the knowledge graph with images uploaded to the
platform by a user. To do so, an analysis tool may retrieve
metadata associated with an image or video published to the
platform. The analysis tool compares this metadata to identify
attributes to associate with the image or video. For example, the
analysis tool may determine (from the knowledge graph) that an
image was taken during a performance at a concert hall based on the time
and location metadata of the image. The knowledge graph may impute
a variety of attributes associated with the event to the image,
e.g., a price range for the performance, weather conditions, an
audience size, a description of the music or other genre related to
the performance (e.g., a rock concert or a Broadway musical),
number of users in attendance, etc. In a simpler case, location metadata of an image may indicate a given activity apart from any
scheduled event (e.g., a picture taken on the grounds of a golf
course or ski resort). Even in such cases, other inferences (and
therefore attributes from the knowledge graph) may be imputed to an
image. For example, consider an image with location metadata
matching the location of a ski resort. In such a case, different
attributes could be imputed depending on the date an image was
taken. Thus, in the winter, attributes related to snow sports could
be imputed to this image. In contrast, in the summer, attributes
related to hiking or mountain biking could be imputed.
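A minimal sketch of this correlation step might look as follows: match an image's GPS point and timestamp against knowledge-graph entries within a distance radius and time window, and fall back to season-dependent location attributes when no event matches. The radius, window, records, and season cutoff are illustrative assumptions, not values from the disclosure.

    # Sketch of spatiotemporal correlation against a toy knowledge graph.
    from datetime import datetime, timedelta
    from math import radians, sin, cos, asin, sqrt

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two GPS points, in kilometers."""
        dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
        a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
        return 2 * 6371 * asin(sqrt(a))

    EVENTS = [
        {"name": "Dodgers Game", "lat": 34.0739, "lon": -118.2400,
         "start": datetime(2015, 6, 2, 19, 0), "attrs": ["baseball", "Dodgers"]},
    ]
    LOCATIONS = [
        {"name": "Alpine Resort", "lat": 39.197, "lon": -120.235,
         "winter_attrs": ["skiing"], "summer_attrs": ["hiking", "mountain biking"]},
    ]

    def impute(lat, lon, taken, radius_km=1.0, window=timedelta(hours=3)):
        # Prefer a scheduled event within the spatiotemporal range.
        for ev in EVENTS:
            if haversine_km(lat, lon, ev["lat"], ev["lon"]) <= radius_km \
                    and abs(taken - ev["start"]) <= window:
                return [ev["name"]] + ev["attrs"]
        # Otherwise fall back to location attributes, picked by season.
        for loc in LOCATIONS:
            if haversine_km(lat, lon, loc["lat"], loc["lon"]) <= radius_km:
                season = "winter_attrs" if taken.month in (12, 1, 2, 3) else "summer_attrs"
                return [loc["name"]] + loc[season]
        return []

    print(impute(34.074, -118.240, datetime(2015, 6, 2, 19, 14)))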
[0024] In one embodiment, the analysis tool represents attributes
imputed to images and videos from a user base in a user-attribute
matrix, where each row of the matrix represents a distinct user and
each column represents an attribute from the knowledge graph that
can be imputed to an image or video. The analysis tool may add
columns to the user-attribute matrix as additional attributes are
identified. The cells of a given row indicate how many times a
given attribute has been imputed to an image or video published by
a user corresponding to that row. Accordingly, when the analysis
tool imputes an attribute to an image or video (based on the
metadata), a value for that attribute is incremented in the
user-attribute matrix. Imputing information from the knowledge
graph to a collection of images of a given user allows the
multimedia service platform to identify useful information about
that user. For instance, the analysis tool may identify that a user
often attends sporting events, movies, participates in a particular
recreational event (e.g., skiing or golf), etc. In addition, the
analysis tool may identify information about events that the user
attends, such as whether the events are related to a given sports
team, whether the events are related to flights from an airport, a
range specifying how much the event may cost, etc.
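A sparse representation makes the "add columns as attributes appear" behavior natural. The sketch below assumes a dict-of-counters layout rather than any actual platform data model, and shows the increment performed when an attribute is imputed.

    # Sketch of the user-attribute matrix update; names are illustrative.
    from collections import Counter, defaultdict

    user_attribute_matrix = defaultdict(Counter)  # row: user id -> attribute counts

    def impute_attributes(user_id, attributes):
        """Increment each imputed attribute for the user's row; unseen
        attributes become new 'columns' automatically."""
        for attr in attributes:
            user_attribute_matrix[user_id][attr] += 1

    impute_attributes("user42", ["Dodgers Game", "baseball", "sunny"])
    impute_attributes("user42", ["baseball"])
    print(user_attribute_matrix["user42"]["baseball"])  # -> 2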
[0025] In one embodiment, the analysis tool may identify concepts
to associate with a user based on the attributes (or attribute
counts) reflected in the user-attribute matrix. A concept may be
associated with a set of attributes in the knowledge graph. For
example, attributes of "Wrigley Field," "Chicago Cubs," "game," and
"Sammy Sosa" may be associated with a "baseball" concept. The
analysis tool may determine what concepts to associate with a user.
That is, the analysis tool may identify that a user has an interest
in a given concept from the attributes in the user-attribute
matrix.
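One simple way to realize this association, sketched below under the assumption that a concept is a named set of attributes and that a fixed count threshold decides membership, is to total a user's imputation counts over the concept's attribute set.

    # Sketch: scoring a user against a concept, using the "baseball" example
    # above. The concept definition and threshold are illustrative assumptions.
    CONCEPTS = {
        "baseball": {"Wrigley Field", "Chicago Cubs", "game", "Sammy Sosa"},
    }

    def concept_score(user_counts, concept_attrs):
        """Total imputations for attributes that belong to the concept."""
        return sum(count for attr, count in user_counts.items() if attr in concept_attrs)

    user_counts = {"Wrigley Field": 4, "Chicago Cubs": 6, "sunny": 9}
    score = concept_score(user_counts, CONCEPTS["baseball"])
    print("associate user with baseball:", score >= 5)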
[0026] Further, the analysis tool may generate an interest taxonomy
based on the user-attribute matrix. In one embodiment, an interest
taxonomy is a hierarchical representation of user interests based
on the concepts. For example, the interest taxonomy can identify
general groups (e.g., sports, music, and travel) and sub-groups
(e.g., basketball, rock music, and discount airlines) of interest
identified from the concepts.
[0027] The multimedia service platform may use the interest
taxonomy to discover latent relationships between concepts. For
example, the multimedia service platform may build a predictive
learning model using the interest taxonomy. The multimedia service
platform could train the predictive learning model using existing
user-to-concept associations. Doing so would allow the multimedia service platform to use the model to predict associations between users and other concepts with which they are not currently associated.
Further, the multimedia service platform may map product feeds of
third party systems to the user interest taxonomy to identify
products to recommend to a given user. Doing so allows advertisers
to improve recommendations presented to a given user.
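The disclosure does not name a specific learner; as one hedged illustration, a low-rank reconstruction of the user-concept matrix can score concepts a user has never been observed with. The component count and toy data below are assumptions, not the platform's actual model.

    # Sketch of predicting unseen user-to-concept associations by low-rank
    # reconstruction; truncated SVD stands in for whatever predictive learner
    # the platform actually uses.
    import numpy as np
    from sklearn.decomposition import TruncatedSVD

    concepts = ["golf", "museums", "skiing", "baseball"]
    user_concept = np.array([
        [5, 3, 0, 0],
        [4, 2, 0, 1],
        [0, 0, 6, 2],
    ], dtype=float)

    svd = TruncatedSVD(n_components=2, random_state=0)
    low_rank = svd.inverse_transform(svd.fit_transform(user_concept))

    # High reconstructed scores where the observed count was zero suggest
    # latent interests worth targeting.
    user = 0
    for j, concept in enumerate(concepts):
        if user_concept[user, j] == 0:
            print(f"user {user} predicted affinity for {concept}: {low_rank[user, j]:.2f}")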
[0028] Note, the following description relies on digital images
captured by a user and metadata as a reference example of imputing
knowledge graph attributes onto those images based on the metadata.
However, one of skill in the art will recognize that the
embodiments presented herein may be adapted to other digital
multimedia that include time and location metadata, such as digital
video recorded on a mobile device. Further, an analysis tool may be
able to extract additional metadata features from such videos, such
as the length of the video, which can be used relative to the
techniques described herein.
[0029] FIG. 1 illustrates an example computing environment 100,
according to one embodiment. As shown, the computing environment
100 includes one or more mobile devices 105, an extract, transform,
and load (ETL) server 110, an application server 115, and a third
party system 120, connected to a network 125 (e.g., the
Internet).
[0030] In one embodiment, the mobile devices 105 include a mobile
application 106 which allows users to interact with a multimedia
service platform (represented by the ETL server 110 and the
application server 115). In one embodiment, the mobile application
106 is developed by a third-party organization (e.g., a retailer,
social network provider, fitness tracker developer, etc.). The
mobile application 106 may send images 108 and associated metadata
to the multimedia service platform. In one embodiment, the mobile
application 106 may access APIs exposed by a software development
kit (SDK) distinct to the platform.
[0031] In another embodiment, the mobile application 106 may access
a social media service (application service 116) provided by the
multimedia service platform. The social media service allows users
to capture, share, and comment on images 108 as part of (or in conjunction with) existing social networks. For
example, a user can link a social network account to the multimedia
service platform through application 106. Thereafter, the user may
capture a number of images and submit the images 108 to the social
network. In turn, the application 106 retrieves the metadata from
the submitted images. Further, the mobile application 106 can send
images 108 and metadata to the multimedia service platform, which uses the metadata to infer latent interests
of the user.
[0032] In any case, the mobile application 106 extracts
Exchangeable Image File Format (EXIF) metadata from each image 108. The
mobile application 106 can also extract other metadata (e.g.,
PHAsset metadata in Apple iOS devices) describing additional
information, such as GPS data. In addition, the mobile application
106 may perform extract, transform, and load (ETL) operations on
the metadata to format the metadata for use by components of the
multimedia service platform. For example, the mobile application 106 may
determine additional information based on the metadata, such as
whether a given image was taken during daytime or nighttime,
whether the image was taken indoors or outdoors, whether the image
is a "selfie," etc. Further, the mobile application 106 also
retrieves metadata describing application use. Such metadata
includes activity by the user on the mobile application 106, such
as image views, tagging, etc. Further, as described below, the
mobile application 106 provides functionality that allows a user to
search through a collection of images by the additional metadata,
e.g., searching a collection of images that are "selfies" and taken
in the morning.
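The extraction itself runs on the device (e.g., reading EXIF and PHAsset data on iOS), but the idea can be sketched in Python with Pillow: read the capture timestamp from EXIF and derive one of the inferred fields, a daytime flag. The file name, the presence of the DateTime tag, and the 6-to-18 hour heuristic are all assumptions for illustration.

    # Sketch of EXIF extraction plus one derived metadata field.
    from datetime import datetime
    from PIL import Image, ExifTags

    def extract_basic_metadata(path):
        exif = Image.open(path).getexif()
        # Map numeric EXIF tag ids to names, e.g. 306 -> "DateTime".
        named = {ExifTags.TAGS.get(tag, tag): value for tag, value in exif.items()}
        # Assumes the DateTime tag is present ("YYYY:MM:DD HH:MM:SS").
        taken = datetime.strptime(named["DateTime"], "%Y:%m:%d %H:%M:%S")
        return {
            "taken": taken,
            # Crude heuristic standing in for the app's daytime inference.
            "daytime": 6 <= taken.hour < 18,
        }

    print(extract_basic_metadata("photo.jpg"))  # hypothetical file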
[0033] In one embodiment, the ETL server 110 includes an ETL
application 112. The ETL application 112 receives streams of image
metadata 114 (e.g., the EXIF metadata, PHAsset metadata, and
additional metadata) from mobile devices 105. Further, the ETL
application 112 cleans, stores, and indexes the image metadata 114
for use by the application server 115. Once processed, the ETL
application 112 may store the image metadata 114 in a data store
(e.g., such as in a database) for access by the application server
115.
[0034] In one embodiment, an application service 116 communicates
with the mobile application 106. In one embodiment, the application
server 115 may be a physical computing system or a virtual machine
computing instance in the cloud. Although depicted as a single
server, the application server 115 may comprise multiple servers
configured as a cluster (e.g., via the Apache Spark framework).
This architecture allows the application servers 115 to process
large amounts of images and image metadata sent from mobile
applications 106.
[0035] As shown, the application server 115 includes an analysis
tool 117, a knowledge graph 118, and a user interest taxonomy 119.
In one embodiment, the analysis tool 117 generates the user
interest taxonomy 119 based on image metadata 114 from image
collections of multiple users. As described below, the user
interest taxonomy 119 represents interests inferred from image
attributes identified from the knowledge graph 118.
[0036] In one embodiment, the knowledge graph 118 includes a
collection of attributes which may be imputed to an image. Examples
of attributes include time and location information, event
information, genres, price ranges, weather, subject matter, and the
like. The analysis tool 117 builds the knowledge graph 118 using
weather data, location data, events data, encyclopedia data, and
the like from a variety of data sources.
[0037] In one embodiment, the analysis tool 117 imputes attributes
from the knowledge graph 118 to the images 108 based on the
metadata 114. That is, the analysis tool 117 may correlate time and
location information in image metadata 114 to attributes in the
knowledge graph 118. For example, assume that a user captures an
image 108 of a baseball game. Metadata 114 for that image 108 may
include GPS coordinates, a date, and a time when the image 108 was captured.
The analysis tool 117 can correlate this information to attributes
such as weather conditions at that time and location (e.g.,
"sunny"), an event name (e.g., "Dodgers Game"), teams playing at
that game (e.g., "Dodgers" and "Cardinals"), etc. The analysis tool
117 associates the imputed attributes with the user who took the
image. As noted, a row in the user-attribute matrix may be updated to reflect the imputed attributes of each new image taken
by that user. Further, the analysis tool 117 may perform machine
learning techniques, such as latent Dirichlet allocation (LDA), to
decompose the user-attribute matrix into sub-matrices. Doing so
allows the analysis tool 117 to identify concepts, i.e., clusters
of attributes. The analysis tool 117 may use the user interest
taxonomy 119 to generate product recommendations. The analysis tool
117 may also use the interest taxonomy 119 to identify one or more
users that may be interested in a product or service. For example,
the analysis tool 117 may extract information from a product feed
121 of a third party system 120. In one embodiment, the product
feed 121 is a listing of products or services of a third party,
such as a retailer. The analysis tool 117 may identify, from the
product feed 121, one or more attributes describing each product.
For example, a product of a shoe retailer may have attributes such
as "shoe," "running," "menswear," and so on. The analysis tool 117
can map the attributes of the product feed 121 with the interest
taxonomy 119. Doing so allows the analysis tool 117 to identify
products and services from the feed 121 that align with interests
in the interest taxonomy. In turn, third parties can target users
who may be interested in the identified products and services.
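A hedged sketch of that mapping step appears below: products match a taxonomy interest when their attribute sets overlap past a threshold. The feed entries, taxonomy leaves, and threshold are all illustrative, not the platform's actual formats.

    # Sketch of mapping a third-party product feed onto the interest taxonomy.
    PRODUCT_FEED = [
        {"sku": "SHOE-01", "attrs": {"shoe", "running", "menswear"}},
        {"sku": "CLUB-07", "attrs": {"golf", "driver", "high-end"}},
    ]
    # Interest taxonomy leaves mapped to the attribute sets that define them.
    TAXONOMY = {
        "sports/golf": {"golf", "driver", "putter"},
        "sports/running": {"running", "marathon", "shoe"},
    }

    def match_products(feed, taxonomy, min_overlap=2):
        """Yield (sku, interest) pairs whose attribute overlap clears a threshold."""
        for product in feed:
            for interest, attrs in taxonomy.items():
                if len(product["attrs"] & attrs) >= min_overlap:
                    yield product["sku"], interest

    print(list(match_products(PRODUCT_FEED, TAXONOMY)))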
[0038] FIG. 2 illustrates the mobile application 106, according to one embodiment. As shown, the mobile application 106 includes an SDK
component 200 with APIs configured to send image and metadata
information to the multimedia service platform. The SDK component
200 further includes an extraction component 205, a search and
similarity component 210, and a log component 215. In one
embodiment, the extraction component 205 extracts metadata (e.g.,
EXIF metadata, PHAsset metadata, and the like) from images captured
using a mobile device 105. Further, the extraction component 205
may perform ETL preprocessing operations on the metadata. For
example, the extraction component 205 may format the metadata for
the search and similarity component 210 and the log component
215.
[0039] In one embodiment, the search and similarity component 210
infers additional metadata from an image based on the metadata
(e.g., spatiotemporal metadata) retrieved by the extraction
component 205. Examples of additional metadata include whether a
given image was captured at daytime or nighttime, whether the image
was captured indoors or outdoors, whether the image was edited,
weather conditions when the image was captured, etc. Further, the
search and similarity component 210 generates a two-dimensional
image feature map from a collection of images captured on a given
mobile device 105, where each row represents an image and columns
represent metadata attributes. Cells of the map indicate whether an
image has a particular attribute. The image feature map allows the
search and similarity component 210 to provide analytics and search
features for the collection of images captured by a mobile device.
For example, a user of the mobile application 106 may search for
images on their mobile device which have a given attribute, such as
images taken during daytime or taken from a particular location. In
turn, the search and similarity component 210 may evaluate the image feature map to identify photos having such an attribute.
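The feature map and the search over it can be sketched as a boolean matrix query; the column names and data below are illustrative, not the SDK's actual representation.

    # Sketch of the two-dimensional image feature map and a search over it.
    import numpy as np

    columns = ["daytime", "outdoor", "selfie"]
    # Rows: images on the device; cells: whether the image has the attribute.
    feature_map = np.array([
        [1, 1, 0],
        [1, 0, 1],
        [0, 0, 1],
    ], dtype=bool)

    def search(required):
        """Return indices of images that have every required attribute."""
        mask = np.ones(len(feature_map), dtype=bool)
        for attr in required:
            mask &= feature_map[:, columns.index(attr)]
        return np.flatnonzero(mask)

    print(search(["daytime", "selfie"]))  # -> [1]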
[0040] In one embodiment, the log component 215 evaluates the image
metadata. For example, the log component 215 records metadata sent
to the ETL server 110. Once received, the application 112 performs
ETL operations, e.g., loading the metadata into a data store (such
as a database). The metadata is accessible by the analysis tool
117.
[0041] FIG. 3 illustrates an example of components of the analysis
tool, according to one embodiment. As shown, the analysis tool 117
includes an aggregation component 305, a scraping component 310, a
correlation component 315, and a taxonomy component 320.
[0042] In one embodiment, the aggregation component 305 receives
streams of image metadata, corresponding to images captured by users of the application 106, from the ETL server 110. Once received,
the aggregation component 305 organizes images and metadata by
user. The metadata may include both raw image metadata (e.g., time
and GPS information) and inferred metadata (e.g., daytime or
nighttime image, indoor or outdoor image, "selfie" image, etc.). To
organize metadata by user, the aggregation component 305 evaluates
log data from the ETL server 110 to identify image metadata from
different devices (and presumably different users) and metadata
type (e.g., whether the metadata corresponds to image metadata or
application usage data).
[0043] In one embodiment, the scraping component 310 builds (and
later maintains) the knowledge graph 118 using any suitable data
source, such as local news and media websites, online event
schedules for performance venues, calendars published by schools,
government, or private enterprises, online schedules and ticket
sales. For example, the scraping component 310 may evaluate event
websites (e.g., such as calendar and event pages on news and media
sites) to identify events that are scheduled to occur in a given
area or location.
[0044] The scraping component 310 also identifies information about
the event, such as event headliner (e.g., name of person, team, or
group), time of the event, location of the event, price range of
the event, and the like. In one embodiment, the scraping component
310 uses NLP techniques to tokenize raw text from the data sources
and perform pattern matching on the tokenized data. For example,
pattern matching techniques may be used to identify words and
phrases that correspond to event information, e.g., by determining
patterns that likely correspond to fields in an event listing
(e.g., "Name," "Date," etc.). The scraping component 310 determines
a set of attributes related to each event to store in the knowledge
graph 118.
[0045] Further, the scraping component 310 can evaluate sources
relating to a location, e.g., public parks, landmarks, museums, and
the like. For example, the scraping component 310 may evaluate an
art museum website to identify information about the art museum,
such as a location, schedule of events, genres of art on display,
and the like. In addition, the scraping component 310 can evaluate
sources relating to travel, such as websites of airlines and
tourist agencies. The scraping component 310 may identify
information such as arrival and departure times, schedules, airport
names, train station names, and the like.
[0046] In one embodiment, the correlation component 315 receives
new images taken by a given user. The correlation component 315
compares the image metadata associated with the user with
information in the knowledge graph 118. In particular, the
component 315 imputes attributes from the knowledge graph 118 to an
image based on the time and location information metadata of that
image. The correlation component 315 may update the user-attribute matrix to reflect which attributes have been imputed to the image.
As noted, each row in the user-attribute matrix represents a user
and columns represent attributes of the knowledge graph 118. Cells
in a row reflect how many times an attribute has been imputed to
images taken by the corresponding user.
[0047] In one embodiment, to impute attributes from the knowledge
graph 118 to a given image, the correlation component 315 evaluates
time and location metadata of the image against the knowledge graph
118. The correlation component 315 determines whether the image
metadata matches a location and/or event in the knowledge graph.
The information may be matched using a specified spatiotemporal
range, e.g., within a time period of the event, within a GPS coordinate range, etc. In one embodiment, the component 315 may
further match the information based on a similarity of metadata of
other user photos that have been matched to that event.
[0048] For example, assume image metadata indicates the image was
taken at 7:14 PM at GPS coordinates generally matching a baseball
stadium. The correlation component 315 may evaluate such metadata
against the knowledge graph 118 to identify a sporting event
occurring at that time and place. The knowledge graph 118 may
specify attributes associated with that event, such as names, date,
time, price range, weather, venue, and the like. Further, the
correlation component 315 may correlate event information based on
similarity of images captured by other users.
[0049] In one embodiment, if the metadata does not match an event
in the knowledge graph 118, the correlation component 315 may
identify a location where the image was captured (if listed in the
knowledge graph 118) and information related to that location. For
example, the correlation component 315 may determine that a given
image was captured at an art museum. The correlation component 315
may impute the name of the museum and the genres of art exhibited at the museum as attributes for that image.
[0050] In one embodiment, the correlation component 315 maintains a
user-attribute matrix. The correlation component 315 populates rows
of the matrix with users and columns of the matrix with attributes
imputed from the knowledge graph 118 to images captured by that
user. As noted, each row represents a user, and each column
represents an attribute. Cells of the matrix include a value
representing the number of times a given attribute has been
observed in the images. When the correlation component 315
identifies an attribute in a given image, the correlation component
315 increments the cell value.
[0051] In one embodiment, the taxonomy component 320 evaluates the
user-attribute matrix to determine concepts to associate with a
given user. As stated, a concept is a cluster of related
attributes. The taxonomy component 320 may perform machine learning
techniques, such as latent Dirichlet allocation (LDA), to decompose
the user-attribute matrix into sub-matrices and derive concepts
from user attributes. Such techniques also allow the taxonomy
component 320 to identify concept hierarchies and build a user
interest taxonomy.
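As a toy-scale sketch of that decomposition, scikit-learn's LDA can factor a user-attribute count matrix into user-to-concept and concept-to-attribute matrices; the data, component count, and top-attribute readout below are illustrative assumptions.

    # Sketch of deriving concepts (clusters of attributes) with LDA.
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation

    attributes = ["golf", "driver", "ski", "snowboard"]
    user_attribute = np.array([
        [6, 4, 0, 0],
        [5, 2, 1, 0],
        [0, 0, 7, 5],
    ])

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    user_concepts = lda.fit_transform(user_attribute)   # users x concepts
    concept_attrs = lda.components_                     # concepts x attributes

    for k, row in enumerate(concept_attrs):
        top = [attributes[i] for i in row.argsort()[::-1][:2]]
        print(f"concept {k}: {top}")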
[0052] FIG. 4 illustrates a method 400 for building a knowledge
graph used to impute attributes onto images, according to one
embodiment. As shown, method 400 begins at step 405, where the
scraping component 310 extracts raw text from a collection of data
sources to identify events and other external information provided
by the data sources. As stated, the external sources may include
local news and media websites, online event schedules for
performance venues, calendars published by schools, government, or
private enterprises, online schedules and ticket sales, and the
like. At step 410, the scraping component 310 tokenizes the raw
text obtained from the external sources to generate word and phrase
tokens. To do so, the scraping component 310 may use any
suitable NLP or pattern matching techniques.
[0053] At step 415, the scraping component 310 identifies relevant
terms in the tokenized text. For example, relevant terms for an
event may include an event name, a date, time, location, key words
in a description, and the like. As another example, relevant terms
for an event related to a flight schedule would include a departing
airport, arrival airport, departure time, arrival time, and the
like. The scraping component 310 may perform NLP pattern matching techniques to do so. The scraping component 310 then builds the knowledge graph 118 from the attributes and event
data.
[0054] FIG. 5 illustrates a method 500 for imputing knowledge graph
attributes onto a given image, according to one embodiment. As
noted, the aggregation component 305 maintains collections of image metadata organized by user. Method 500 begins at step 505, where the
correlation component 315 accesses image metadata corresponding to
an image captured by a user. In particular, the correlation
component 315 evaluates time and location metadata of the
image.
[0055] At step 510, the correlation component 315 compares time and
location metadata with the event information stored in the
knowledge graph 118. That is, the correlation component 315
correlates the time and location metadata to a given location in
the knowledge graph 118. Doing so allows the correlation component
315 to identify information associated with that location. For
example, for an image captured at a golf course, the correlation
component may identify information such as the name of the golf
course, type of offerings provided by the golf course, and any
information about events taking place at the time the image was
taken. The correlation component 315 then determines which
attributes to impute to the image based on the information. Once
determined, at step 515, the correlation component 315 performs the
following for each attribute. At step 520, the correlation
component 315 determines whether the attribute is present in the
user-attribute matrix. As stated, the correlation component 315
maintains a two-dimensional user-attribute matrix, where each row
represents a user and each column represents an attribute. If a
column corresponding to an attribute imputed to an image is not
present, then the correlation component 315 adds a column for that
attribute to the matrix (step 525). If the corresponding column is
present, then at step 530, the correlation component 315 increments
the value of that attribute in the row of the given user. As
stated, the analysis tool 117 may learn one or more concepts by
evaluating the user-attribute matrix, e.g., by performing machine
learning techniques (e.g., latent Dirichlet allocation,
non-negative matrix factorization, etc.) over the user-attribute
matrix. The learned concepts form an interest taxonomy, which the
analysis tool may use to identify latent interests of a given
user.
[0056] FIG. 6 illustrates an application server computing system
600 configured to impute knowledge graph attributes onto image
metadata, according to one embodiment. As shown, the computing
system 600 includes, without limitation, a central processing unit
(CPU) 605, a network interface 615, a memory 620, and storage 630,
each connected to a bus 617. The computing system 600 may also
include an I/O device interface 610 connecting I/O devices 612
(e.g., keyboard, mouse, and display devices) to the computing
system 600. Further, in context of this disclosure, the computing
elements shown in computing system 600 may correspond to a physical
computing system (e.g., a system in a data center) or may be a
virtual computing instance executing within a computing cloud.
[0057] The CPU 605 retrieves and executes programming instructions
stored in the memory 620 as well as stores and retrieves
application data residing in the memory 620. The bus 617 is used to transmit programming instructions and application data between the CPU 605, I/O device interface 610, storage 630,
network interface 615, and memory 620. Note, CPU 605 is included to
be representative of a single CPU, multiple CPUs, a single CPU
having multiple processing cores, and the like. And the memory 620
is generally included to be representative of a random access
memory. The storage 630 may be a disk drive storage device.
Although shown as a single unit, the storage 630 may be a
combination of fixed and/or removable storage devices, such as fixed disc drives, removable memory cards, optical storage, network attached storage (NAS), or a storage area network (SAN).
[0058] Illustratively, the memory 620 includes an application
service 622 and an analysis tool 624. The storage 630 includes a knowledge graph 634 and one or more user interest taxonomies 636.
The application service 622 provides access to various services of
a multimedia service platform to mobile devices. The analysis tool
624 generates a user interest taxonomy 636 based on image metadata
of images taken by users.
[0059] Further, the analysis tool 624 builds the knowledge graph
634 from external data sources. To do so, the analysis tool 624
performs NLP techniques on the raw text obtained from the data
sources to identify relevant terms related to events, moments,
weather, etc. Further, the analysis tool 624 may impute information from the knowledge graph 634 to images submitted to the multimedia service platform, i.e., associate one or more attributes of the knowledge graph 634 with each image. To do so, the analysis tool 624
may correlate image metadata, e.g., metadata describing
spatiotemporal features of the image, with attributes provided in
the knowledge graph 634, such as event information, weather, and
the like. In addition, the analysis tool 624 generates a user
interest taxonomy 636 inferred from the attributes.
[0060] The preceding discussion presents a variety of embodiments.
However, the present disclosure is not limited to the specifically
described embodiments. Instead, any combination of the following
features and elements, whether related to different embodiments or
not, is contemplated to implement and practice the techniques
described herein. Furthermore, although embodiments of the present
disclosure may achieve advantages over other possible solutions
and/or over the prior art, whether or not a particular advantage is
achieved by a given embodiment is not limiting of the present
disclosure. Thus, the following aspects, features, embodiments and
advantages are merely illustrative and are not considered elements
or limitations of the appended claims except where explicitly
recited in a claim(s).
[0061] Aspects may be embodied as a system, method or computer
program product. Accordingly, embodiments may take the form of an
entirely hardware embodiment, an entirely software embodiment
(including firmware, resident software, micro-code, etc.) or an
embodiment combining software and hardware aspects that may all
generally be referred to herein as a "circuit," "module" or
"system." Furthermore, embodiments may take the form of a computer
program product embodied in one or more computer readable medium(s)
having computer readable program code embodied thereon.
[0062] Any combination of one or more computer readable medium(s)
may be utilized. The computer readable medium may be a computer
readable signal medium or a computer readable storage medium. A
computer readable storage medium may be, for example, but not
limited to, an electronic, magnetic, optical, electromagnetic,
infrared, or semiconductor system, apparatus, or device, or any
suitable combination of the foregoing. More specific examples (a
non-exhaustive list) of the computer readable storage medium would
include the following: an electrical connection having one or more
wires, a portable computer diskette, a hard disk, a random access
memory (RAM), a read-only memory (ROM), an erasable programmable
read-only memory (EPROM or Flash memory), an optical fiber, a
portable compact disc read-only memory (CD-ROM), an optical storage
device, a magnetic storage device, or any suitable combination of
the foregoing. In the context of this document, a computer readable
storage medium may be any tangible medium that can contain, or
store a program for use by or in connection with an instruction
execution system, apparatus or device.
[0063] The flowchart and block diagrams in the figures illustrate
the architecture, functionality and operation of possible
implementations of systems, methods and computer program products
according to various embodiments presented herein. In this regard,
each block in the flowchart or block diagrams may represent a
module, segment or portion of code, which comprises one or more
executable instructions for implementing the specified logical
function(s). In some alternative implementations the functions
noted in the block may occur out of the order noted in the figures.
For example, two blocks shown in succession may, in fact, be
executed substantially concurrently, or the blocks may sometimes be
executed in the reverse order, depending upon the functionality
involved. Each block of the block diagrams and/or flowchart
illustrations, and combinations of blocks in the block diagrams
and/or flowchart illustrations can be implemented by
special-purpose hardware-based systems that perform the specified
functions or acts, or combinations of special purpose hardware and
computer instructions.
[0064] While the foregoing is directed to embodiments of the
present disclosure, other and further embodiments of the disclosure
may be devised without departing from the basic scope thereof, and
the scope thereof is determined by the claims that follow.
* * * * *