U.S. patent application number 15/233790 was filed with the patent office on 2016-08-10 and published on 2017-02-09 for NLP-based content recommender.
The applicant listed for this patent is VCVC III LLC. Invention is credited to Navdeep S. Dhillon, Jose Hernando, Krzysztof Koperski, Jisheng Liang, Neil Roseman, Diana Schwend, Korina J. Stark.
Publication Number | 20170039272 |
Application Number | 15/233790 |
Family ID | 40567784 |
Filed Date | 2016-08-10 |
Publication Date | 2017-02-09 |
United States Patent Application | 20170039272 |
Kind Code | A1 |
Roseman; Neil; et al. | February 9, 2017 |
NLP-BASED CONTENT RECOMMENDER
Abstract
Methods, techniques, and systems for using natural language
processing to recommend related content to an associated text
segment or document. Example embodiments provide an NLP-based
content recommender ("NCR") which uses NLP-based search techniques,
potentially in conjunction with context or other related
information, to locate and provide content related to entities that
are recognized in the associated material. NCRs may be embedded as
widgets, for example on Web pages to assist users in their perusal
and search for information, provided by means of browser plug-ins
or other application plug-ins, provided in libraries or in
standalone environments, or otherwise integrated into other code,
programs, or devices. This abstract is provided to comply with
rules requiring an abstract, and it is submitted with the intention
that it will not be used to interpret or limit the scope or meaning
of the claims.
Inventors: | Roseman; Neil (Seattle, WA); Liang; Jisheng (Bellevue, WA); Koperski; Krzysztof (Seattle, WA); Stark; Korina J. (Bellevue, WA); Dhillon; Navdeep S. (Seattle, WA); Schwend; Diana (Seattle, WA); Hernando; Jose (Seattle, WA) |
Applicant: |
Name | City | State | Country | Type |
VCVC III LLC | Seattle | WA | US | |
Family ID: | 40567784 |
Appl. No.: | 15/233790 |
Filed: | August 10, 2016 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
14181591 | Feb 14, 2014 | 9471670 |
15233790 | | |
12288349 | Oct 16, 2008 | 8700604 |
14181591 | | |
60999559 | Oct 17, 2007 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | G06F 16/3344 20190101; G06F 40/295 20200101; G06F 16/3338 20190101; G06F 40/205 20200101; G06F 3/0482 20130101; G06F 16/367 20190101; G06F 16/338 20190101 |
International Class: | G06F 17/30 20060101 G06F017/30; G06F 17/27 20060101 G06F017/27 |
Claims
1-13. (canceled)
14. A computer-implemented NLP-based content recommendation system,
comprising: a memory; and a content recommender module, stored in
the memory, and having instructions that are configured, when
executed by a computer processor, to: receive a text segment for
processing; identify one or more named entities to which the
received text segment refers based, at least in part, upon a
natural language processing (NLP) parsing and linguistic analysis
of the text segment; derive related content based at least in part
upon a natural language processing parsing and linguistic analysis
of entity based information of the identified one or more named
entities and based upon context information associated with the
named entities or from the received text segment, wherein the
related content includes at least one named entity that is
connected to at least one of the one or more named entities; and
cause the derived related content to be presented.
15. The system of claim 14, wherein the module is further
configured, when executed, to display one or more indicators for
navigating to the related content.
16. The system of claim 15 wherein the module is further
configured, when executed, to present the related content in
response to detecting selection of at least one of the navigation
indicators.
17. The system of claim 15 wherein the indicators are at least one
of links, graphical symbols, icons, shapes, logos, or
trademarks.
18. (canceled)
19. The system of claim 15 wherein the natural language processing
parsing and linguistic analysis is initiated using a natural language
query.
20. The system of claim 19 wherein the natural language query is a
relationship search query.
21. The system of claim 14 wherein the content recommender module
is embedded into third party software instructions as a code module
separate from the third party software instructions.
22. The system of claim 14 wherein the content recommender module
is embedded into a browser page, is installed as a plug-in module,
or is installed as a pop-up window.
23. The system of claim 14 wherein the content recommender module
is displayed adjacent content controlled by the third party.
24. The system of claim 14 wherein the content recommender module
has associated representations that are displayed on a display
screen and selectable to invoke the functionality of the content
recommender module.
25. The system of claim 24 wherein the associated representations
are at least one of icons, images, or graphical symbols.
26. The system of claim 14 wherein the content recommender module
is customized by presentation of a different user interface, color
scheme, or capability.
27. The system of claim 26 wherein the content recommender module
is customized based upon the context within which it is
integrated.
28. The system of claim 14 wherein the content recommender is
invoked using a scripting language.
29. A non-transitory computer-readable medium containing content
that, when executed, causes a computing system to perform a method
comprising: receive a text segment for processing; identify one or
more named entities to which the received text segment refers
based, at least in part, upon a natural language processing (NLP)
parsing and linguistic analysis of the text segment; derive related
content based at least in part upon a natural language processing
parsing and linguistic analysis of entity based information and
based upon context information associated with the named entities
or from the received text segment, wherein the related content
includes at least one named entity that is connected to at least
one of the one or more named entities; and cause the derived
related content to be presented.
30. The non-transitory computer-readable medium of claim 29 wherein
the method further comprises causing display of one or more
indicators for navigating to the related content.
31. The non-transitory computer-readable medium of claim 30 wherein
the method further comprises causing the related content to be
presented in response to detecting selection of at least one of the
navigation indicators.
32. The non-transitory computer-readable medium of claim 31 wherein
the indicators are links.
33. A method in a computer system for providing additional content
comprising: receiving a text segment for processing; identifying
one or more named entities to which the received text segment
refers based, at least in part, upon a natural language processing
(NLP) parsing and linguistic analysis of the text segment; deriving
related content based at least in part upon a natural language
processing parsing and linguistic analysis of entity based
information and based upon context information associated with the
named entities or from the received text, wherein the related
content includes at least one named entity that is connected to at
least one of the one or more named entities; and returning the
related content.
34. The method of claim 33, further comprising: providing a script
that defines a user interface widget that is configured, when
executed, to send a request for related content; enabling the
provided script to be embedded in a Web page; and responsive to a
request to provide the Web page, serving the Web page with the
embedded script such that, when the embedded script is executed,
the user interface widget is presented to provide the related
content.
35. The method of claim 33 wherein the acts are provided responsive
to a request from a user interface widget executing in a client
application.
35. The method of claim 33 wherein the method further comprises
causing display of one or more indicators for navigating to the
related content.
36. The method of claim 35 wherein the method further comprises
causing the related content to be presented in response to
detecting selection of at least one of the navigation indicators.
Description
TECHNICAL FIELD
[0001] The present disclosure relates to methods, techniques, and
systems for presenting content using natural language processing
and, in particular, to methods, techniques, and systems for
recognizing named entities using natural language processing and
presenting content related thereto.
BACKGROUND
[0002] With more than 15 billion documents on the World Wide Web
(the Web) today, it has become very difficult for users to find
desired information or to discover relevant information. Typically,
a user engages a keyword (Boolean) based search engine to enter
terms that s/he thinks relate to the topic of interest.
Unfortunately, there could be hundreds of thousands of documents
with similar keywords requiring readers to sort out what is
relevant. Moreover, once a user has followed links (e.g.,
hyperlinks, hypertext, indicators, etc.) to more than a few web
pages, it is highly likely that the user has navigated to a point
that makes it difficult to retrace steps.
[0003] Thus, although the volume of documents on the Web
potentially makes a lot more information available to the average
person, it takes a fair bit of time to actually find documents that
are useful.
BRIEF DESCRIPTION OF THE DRAWINGS
[0004] The patent or application file contains at least one drawing
executed in color. Copies of this patent or patent application
publication with color drawings will be provided by the Office upon
request and payment of the necessary fee.
[0005] FIG. 1A is an example screen display of an example mechanism
for invoking an NLP-Based Content Recommender from a web page
displayed in a web browser.
[0006] FIG. 1B is an example screen display of an example NLP-Based
Content Recommender presented to recommend content relating to
underlying text.
[0007] FIG. 1C is an example screen display illustrating the result
of selection of one of the named entities in the underlying
content.
[0008] FIG. 1D is an example screen display of example refinements
of recommendations of an example NLP-Based Content Recommender
based upon selection of a node in a connections map.
[0009] FIG. 1E is an example screen display of an NLP-Based Content
Recommender playing a selected video.
[0010] FIG. 2 is an example screen display of another type of
NLP-Based Content Recommender widget presented adjacent to
content.
[0012] FIGS. 3A-3E are example screen displays of a named entity
profile presented by an example embodiment of an NLP-Based Content
Recommender.
[0013] FIGS. 4A-4C are example screen displays of another type of
NLP-Based Content Recommender widget presented adjacent to
content.
[0014] FIG. 5 is an example code for installing an example
embodiment of an NLP-Based Content Recommender in a content
creator's Web page.
[0015] FIGS. 6A-6D illustrate example screen displays for an
example embodiment of an NLP-Based Content Recommender in the form
of links to further information.
[0016] FIGS. 7A and 7B are example screen displays that illustrate
use of the widgets shown in FIGS. 1A-1D integrated into an
application.
[0017] FIGS. 8A and 8B illustrate example screen displays for an
example embodiment of an NLP-Based Content Recommender in the form
of graphical links that can be used to navigate to further
information.
[0018] FIG. 9 is another example screen display of a graphical
representation of connections.
[0019] FIGS. 10A and 10B illustrate another interface for
presenting related content to an underlying named entity.
[0020] FIGS. 11A-11C illustrate another example NCR widget that
combines some of the previously described textual and graphical
presentations to present related and/or auxiliary information.
[0021] FIGS. 12A-12C illustrate another example NCR widget
integrated into a website that provides links to news and blog
information.
[0022] FIG. 13 is an example block diagram of an example computing
system that may be used to practice embodiments of an NLP-Based
Content Recommender.
DETAILED DESCRIPTION
[0023] Embodiments described herein provide enhanced computer- and
network-assisted methods, techniques, and systems for using natural
language processing techniques, potentially in conjunction with
context or other related information, to locate and provide content
related to entities that are recognized in associated material.
Example embodiments provide one or more NLP-based content
recommenders ("NCRs") that each, based upon a natural language
analysis of an underlying text segment, determine which entities
are being referred to in the text segment and recommend additional
content relating to such entities.
[0024] NCRs may be useful, for example, to support a user
browsing pages of content on the Web. One or more NCRs may be
embedded as widgets on such pages to assist users in their perusal
and search for information, provided by means of browser plug-ins
or other application plug-ins, provided in libraries or in
standalone environments, or otherwise integrated into other code,
programs, or devices.
[0025] For example, when a news article is being displayed in a Web
browser, an NCR may be invoked to suggest additional relevant
content by recognizing the entities referred to in the article and
determining relevant additional content, organized by a number of
factors, for example, by frequency of appearance of other
information relating to one of the recognized entities in the
article, by knowledge of the browse patterns of the reader, etc. An
NCR might also be invoked to allow the reader to explore the top
entities "connected" to one of the entities selected from the
entities recognized in the news article. Connectedness in this
sense refers to entities which are related to the selected
recognized entity typically through one or more actions (verbs). Or
an NCR might be invoked to "filter" or otherwise rank or order the
content presented to the user.
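By way of illustration only, the notion of connectedness described above can be sketched as follows. The sketch assumes that relationships have already been extracted from text as (subject, action, object) triples by some NLP pipeline that is not shown; the `Relationship` type, the `top_connections` helper, and the sample data are hypothetical and are not taken from the described embodiments.

```python
from collections import Counter
from typing import NamedTuple


class Relationship(NamedTuple):
    """Hypothetical triple: a source entity linked to a target entity by an action (verb)."""
    subject: str
    action: str
    obj: str


def top_connections(relationships, entity, n=3):
    """Return up to n entities most often connected to `entity` through any action."""
    counts = Counter()
    for rel in relationships:
        if rel.subject == entity:
            counts[rel.obj] += 1
        elif rel.obj == entity:
            counts[rel.subject] += 1
    return [name for name, _ in counts.most_common(n)]


# Illustrative data only; not drawn from any real corpus.
SAMPLE = [
    Relationship("Barack Obama", "visit", "Ohio"),
    Relationship("Barack Obama", "campaign in", "Ohio"),
    Relationship("Jennifer Brunner", "endorse", "Barack Obama"),
    Relationship("Jennifer Brunner", "govern", "Ohio"),
]

print(top_connections(SAMPLE, "Barack Obama"))
```

Counting both directions treats an entity as connected whether it appears as the source or the target of an action, which is one possible reading of "connected" among others.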
[0026] FIG. 1A is an example screen display of an example mechanism
for invoking an NLP-Based Content Recommender from a web page
displayed in a web browser. In FIG. 1A, web browser 100 is shown
displaying news article 104. An icon 150 labeled "Evri" is displayed
for invoking the NCR.
[0027] FIG. 1B is an example screen display of an example NCR
presented to recommend content relating to underlying text. The
news article web page 104 is shown presented using web browser 100
as described with reference to FIG. 1A. An example NLP-based
content recommender 101 is displayed as a pop-up window 101
accessible from the icon 150 in FIG. 1A. The example embodiment of
NCR 101 shows a section 103 of "Top Related Articles" and a filter
section 102 of focus terms that may be used to filter the top
related articles shown in section 103.
[0028] In at least some embodiments, the NCR may use context
information relating to source information that was used to
establish and identify the entities (e.g., verbs, related entities,
entities within close proximity in the underlying text or in other
text, or other clues) in the recommendations. In some embodiments,
algorithms are employed for natural language-based entity
recognition and disambiguation to determine which entities are
present in the underlying text. For example, these algorithms may
be incorporated to display an ordered list of all, or the most
important, or the top "n" entities present on a Web page in
conjunction with the underlying page. The items on the list can
then be used to navigate to additional (related) content, for
example, as "links" or other references to the content. The example
NCR illustrated in FIG. 1B performs extensive NLP-based searching
and processing in the background to identify the entities in the
underlying article 104 and then to find and order the top related
articles that are displayed in section 103.
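As one hedged illustration of producing an ordered list of the top "n" entities, the sketch below ranks entities by how often they are mentioned. It assumes that entity recognition and disambiguation have already run and produced one resolved entity name per mention; that step is not shown, and the `top_entities` function and sample mentions are hypothetical. Mention frequency is only one of the possible notions of importance.

```python
from collections import Counter


def top_entities(mentions, n=None):
    """Order recognized entities by mention count, most frequent first.

    `mentions` is a list of already-disambiguated entity names, one per mention.
    If n is None, all entities are returned; otherwise only the top n.
    """
    counts = Counter(mentions)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    names = [name for name, _ in ordered]
    return names if n is None else names[:n]


# Illustrative mentions only.
mentions = [
    "Barack Obama", "Ohio", "Barack Obama",
    "White House", "Ohio", "Barack Obama",
]
print(top_entities(mentions, n=2))
```

Each returned name could then be rendered as a link or other reference to related content.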
[0029] An example system that supports the generation of an ordered
list of entities is described in co-pending U.S. patent application
Ser. No. 12/288,158 titled "NLP-Based Entity Recognition and
Disambiguation," which is incorporated by reference in its
entirety. In addition, in at least some embodiments, an NLP-Based
search mechanism can be incorporated by an NCR to find related
(e.g., auxiliary or supplemental) information to recommend.
Contextual and other information, such as information from ontology
knowledge base lookups or from other knowledge repositories may
also be incorporated in establishing information to recommend. One
such system and methods for generating related content using
relationship searching is encompassed in the InFact®
relationship search technology (now the Evri relationship search
technology), described in more detail in U.S. patent application
Ser. No. 11/012,089, filed Dec. 13, 2004, which published on Dec.
1, 2005 as U.S. Patent Publication No. 2005/0267871A1, and which is
hereby incorporated by reference in its entirety. In this system,
NLP-based processing is used to locate entities and the connections
(relationships) between them based upon actions that link a source
entity to a target entity, or vice versa (i.e., queries that
specify a subject and/or an object, and zero or more verbs that may
relate them).
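The shape of such a query (a subject and/or an object, plus zero or more verbs) can be sketched in plain Python as shown below. This is not IQL/EQL syntax and does not represent the InFact/Evri implementation; it is a minimal stand-in that assumes relationships are available as (subject, verb, object) tuples. The `relationship_search` function and sample triples are hypothetical.

```python
def relationship_search(triples, subject=None, obj=None, verbs=()):
    """Return triples matching a query with an optional subject, object, and verbs.

    A value of None for `subject` or `obj` acts as a wildcard, and an empty
    `verbs` collection means "any action". At least one of subject or obj
    must be given.
    """
    if subject is None and obj is None:
        raise ValueError("specify a subject and/or an object")
    allowed = set(verbs)
    return [
        (s, v, o)
        for (s, v, o) in triples
        if (subject is None or s == subject)
        and (obj is None or o == obj)
        and (not allowed or v in allowed)
    ]


# Illustrative data only.
TRIPLES = [
    ("Barack Obama", "visit", "Ohio"),
    ("Barack Obama", "criticize", "White House"),
    ("Jennifer Brunner", "govern", "Ohio"),
]

print(relationship_search(TRIPLES, obj="Ohio"))
print(relationship_search(TRIPLES, subject="Barack Obama", verbs=["visit"]))
```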
[0030] In addition, the InFact®/Evri technology provides a
query language called "IQL" (now "EQL") and a navigation tip system
with query templates for generating relationship queries with or
without a graphical user interface. Query templates and the
navigation tip system may be incorporated by other code to
automatically generate generalized searches of content that utilize
sophisticated linguistics and/or knowledge-based analysis. The
InFact®/Evri tip system not only performs the NLP-based search,
but can order the results as desired. In addition, the tip system
can dynamically evolve the searches, and hence the related entities,
as the underlying text is changed, for example by filtering it using
focus terms 102 in FIG. 1B. Additional information on the
InFact®/Evri navigation tip system is found in U.S. patent
application Ser. No. 11/601,602, filed Nov. 16, 2006, which
published on Jul. 5, 2007 as U.S. Patent Publication No.
2007/0156669A1, and in U.S. patent application Ser. No. 12/049,184,
filed Mar. 14, 2008, which are herein incorporated by reference in
their entireties. Other and/or different NLP-based processing may
similarly be incorporated by example embodiments of an NCR.
[0031] In at least some embodiments, NCRs are provided by means of
a user interface control displayed adjacent to, proximate to, on
or near other displayed content such as illustrated in FIG. 1B.
Such an interface control can be implemented in the form of a
"widget" (e.g., a code module, excerpt, script, etc.), which can be
made available to third parties and other content providers to
associate with content they control. In addition, a user or other
widget consumer (such as a content creator or distributor) can
download a widget provided via a URI or URL (uniform resource
identifier or locator), web portal, server, etc. For example, a
content creator may download an NCR widget for installing it as a
plug-in in the creator's blogging platform. The NCR widget may have
one or more associated representations, i.e., icons, images, or
graphical symbols, which may take many different forms, and which
can be displayed on a display screen and used to invoke the
functionality of the widget. In some embodiments, customizations,
such as different UI renderings, color schemes, capabilities, etc.
may also be available when the widget is installed. Also, in some
embodiments, NCR widget end users (those using the widgets to
display related content) may also be provided with
customizations.
[0032] FIG. 5 is an example code for installing an example
embodiment of an NCR in a content creator's Web page. In
particular, the script 501 may be integrated to provide a pop-up
window NCR widget, such as that illustrated in FIG. 1B. In this
example, the script 501 and installation notes 502 are provided on
a Web page controlled by the widget creator. Although the
particular script 501 is written in HTML (which includes
JavaScript), appropriate other scripting languages (e.g., Ruby,
Perl, and Python) can be used in other environments to include an
NCR widget. For example, a VisualBasic script may be used to
provide a similar NCR widget in a Microsoft Office environment.
[0033] Such widgets need not be limited to displaying related
content accessible via a Web browser. Indeed, NCR widgets also may
be useful in a variety of other contexts and platforms, such as to
create other mechanisms for finding sought after data in large
repositories of information (e.g., corporate intelligence data
bases, product information, etc.), to perform research or other
discovery, to provide learning tools in educational environments,
to navigate newsletters and archived articles for a company, etc.
NCRs are intended to aid in conveying meaningful information to end
users from among a morass of data without them necessarily knowing
how to search for that information. They are intended to do a
better job at emulating "understanding" the underlying text than a
keyword search engine would, so that users can search less and
understand more, or discover more with less work.
[0034] NCR widgets present user interfaces that may vary depending
upon the context in which they are integrated, their use, etc.
FIGS. 1B-13 illustrate several different example embodiments of
forms for such widgets that contain content summary information,
and controls for navigating to related or other
contextually-significant information. In at least some embodiments,
an indicator (such as a hypertext link, or hyperlink) is displayed
proximate to a respective entity if more or recommended related
information is available. Also, in some embodiments user interface
controls are provided to navigate to and among the various
supplemental information. For example, one or more indicators for
navigating to the supplemental information may be presented. These
indicators may be presented in the form of links, graphical
symbols, icons, shapes, logos, trademarks, or the like. Many
variations for presenting widgets/user interface controls are
possible, and the ones presented in FIGS. 1B-13 are merely
illustrative and not intended to be exhaustive.
[0035] In at least some of the NCRs, the name of an entity (e.g.,
Barack Obama) is provided along with an indication of the type of
entity and/or its roles (e.g., categories or facets, such as
senator, democrat, presidential candidate). Then, for some NCRs, a
list of facts about the entity and/or an overview of further
content is displayed. In at least some embodiments, an image
associated with the named entity is also displayed. Importantly, if
more information (as determined by the NCR) is available, then a
link (also referred to as a hyperlink, hypertext, or other
indicator) may be displayed. The link may be operated (e.g.,
selected or navigated to) by a user to navigate to recommended
content. Other features, including more or different features may
be provided or combined in an embodiment of an NCR as helpful in
the context.
[0036] For example, as described earlier, the example NCR 101 in
FIG. 1B is provided in a pop-up window on top of underlying news
article content 104. The "Focus On" list in filter section 102 is
created using the natural language processing methods described
above. In particular, section 102 lists the "most important" named
entities found in the underlying content as determined by NLP-based
relationship searching (such as that provided by the
InFact®/Evri relationship search technology). Different
definitions of "most important" may be used in the NCR, including
but not limited to frequency of use in the article, popularity
among a set of documents searched, etc.
[0037] FIG. 1C is an example screen display illustrating the result
of selection of one of the named entities in the underlying
content. In particular, when the user selects, from the filter
section 102, the link "Barack Obama" 105, which is a named entity
found in the underlying content, the top related articles section
108 changes to reflect new recommendations. In at least some
embodiments, the NCR executes a natural language based relationship
query, such as an EQL query, in the background against some body of
documents. The resulting information can be used to populate
various fields in the user interface of the NCR and to find and
suggest the recommended content that is displayed to the end user
when, for example, the user navigates to such content via a
displayed link. Accordingly, the related articles section 109 shows
the result of executing a query of Barack Obama relating (in one or
more ways described by actions/verbs) to one or more of the named
entities in the underlying content (the news article 104).
[0038] The illustrated NCR 110 also includes a "Connections"
section 106, which provides a graphical map of the entities related
to the selected named entity 105. The entities included in the
graphical map 106 may be selected by the NCR 110 as the most
popular entities, the most frequently described in the top related
articles, or using other rules. In one embodiment, as shown, the
entities in the connections map 106 are color-coded based upon
their base type: for example, whether they are persons, places, or
things (which may include organizations, products, etc.). An end
user may select one of the nodes 107 on the map 106, to further
change the recommendations by refining what is considered
"related."
[0039] FIG. 1D is an example screen display of example refinements
of recommendations of an example NCR based upon selection of a node
in a connections map. In FIG. 1D, the user has selected the node
"Ohio" 106 in the illustrated NCR 120, which has caused the NCR to
change its background searches to focus the recommendations on
articles in which "Barack Obama" is connected (related by
action/verb) to the entity "Ohio." This changed focus is reflected
in field 122. The articles now displayed in the recommended
articles section 121 reflect the top articles that describe
something about Barack Obama's interactions with Ohio. Full profiles
(descriptions of useful, related information) are obtainable by
selection of the links for the named entities in the related
articles section 121; that is for the entities 124.
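The refinement described in this paragraph, focusing recommendations on articles in which one entity is connected to another, might be sketched as follows. The sketch assumes each candidate article already carries a list of extracted (subject, action, object) relationships; the article titles, the `refine_recommendations` function, and the scoring rule (a count of linking relationships) are hypothetical choices made for illustration only.

```python
def refine_recommendations(articles, focus, connected, limit=5):
    """Rank articles by how many extracted relationships link `focus` and `connected`.

    Each article is a dict with a "title" and a list of "relationships", where a
    relationship is a (subject, action, object) tuple. Articles with no linking
    relationship are dropped.
    """
    pair = {focus, connected}
    scored = []
    for article in articles:
        hits = sum(1 for (s, _, o) in article["relationships"] if {s, o} == pair)
        if hits:
            scored.append((hits, article["title"]))
    # Highest number of linking relationships first; ties broken by title.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in scored[:limit]]


# Illustrative articles only; titles and relationships are made up.
ARTICLES = [
    {"title": "Campaign trail report",
     "relationships": [("Barack Obama", "visit", "Ohio"),
                       ("Barack Obama", "address", "Ohio")]},
    {"title": "Election officials prepare",
     "relationships": [("Jennifer Brunner", "govern", "Ohio")]},
    {"title": "Candidate returns to swing state",
     "relationships": [("Ohio", "host", "Barack Obama")]},
]

print(refine_recommendations(ARTICLES, "Barack Obama", "Ohio"))
```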
[0040] Example NCRs also may include still and/or video images. By
selecting link 123, the user can navigate to recommended videos
that relate to the relationship between "Barack Obama" and "Ohio."
Note that these recommendations may also be ordered and/or ranked.
FIG. 1E is an example screen display of an NCR playing a selected
video. A video 132 of Barack Obama is played in response to user
selection of the video link 122. Images, when available, may be
displayed similarly.
[0041] Note that FIGS. 1B-1E provide an example of one type of NCR.
Many other examples, including ones with very different appearing
user interfaces, may be implemented.
[0042] FIG. 2 is an example screen display of another type of NCR
widget presented adjacent to content. The NCR 201 is provided next
to news article 200 and comprises an entity information section
202, a related articles section 203, and a connections map 205. The
related articles section 203 and connections map 205 operate
similarly to those described with reference to FIGS. 1B-1E. As is
observable, in this particular NCR 201, persons (e.g., Jennifer
Brunner node 208) are color coded in green, places (e.g., Ohio 207)
in blue, and things, such as organizations (e.g., social security
administration 206), in red. The entity information section 202
includes named entities from the article 200, ordered. In the
embodiment shown, they are ordered in importance. Other orderings
can be similarly incorporated. The NCR 201 also displays a link 204
to the profile (description of the named entity) of the most
relevant named entity "Jennifer Brunner."
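The color coding described above for NCR 201 can be illustrated with a short sketch that maps each entity's base type to a node color. The green/blue/red assignment follows the description of FIG. 2; the `BaseType` enum, the `build_map_nodes` helper, and the node record layout are assumptions made for illustration and are not part of the described embodiment.

```python
from enum import Enum


class BaseType(Enum):
    PERSON = "person"
    PLACE = "place"
    THING = "thing"  # may include organizations, products, etc.


# Colors follow the FIG. 2 description: persons green, places blue, things red.
NODE_COLORS = {
    BaseType.PERSON: "green",
    BaseType.PLACE: "blue",
    BaseType.THING: "red",
}


def build_map_nodes(entities):
    """Turn (name, base_type) pairs into node records for a connections map."""
    return [
        {"label": name, "type": base_type.value, "color": NODE_COLORS[base_type]}
        for name, base_type in entities
    ]


nodes = build_map_nodes([
    ("Jennifer Brunner", BaseType.PERSON),
    ("Ohio", BaseType.PLACE),
    ("social security administration", BaseType.THING),
])
for node in nodes:
    print(node["label"], "->", node["color"])
```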
[0043] FIGS. 3A-3E are example screen displays of a named entity
profile presented by an example embodiment of an NCR. As can be
observed, the user interface and controls are different than those
provided in FIGS. 1B and 2; however, many of the same capabilities
of an NCR are present. In particular, the example NCR of FIG. 3A
provides a connection map 301 and a top articles section 303 that
recommends the "top" articles relating to the named entity
"Jennifer Brunner." Again, these articles may be ordered based upon
recency and/or frequency of mentions of Ms. Brunner, popularity of
access to the articles, the number of relationships with entities
connected to Ms. Brunner, or other definitions of
topmost. The NCR also provides a user interface control 302 for
modifying (by filtering based upon action) the articles 303
displayed. In addition, the NCR includes a recommended images area
307 with links to one or more images; a recommended videos area 308
with links to one or more videos; a section reserved for
advertisements 306 (which may also be targeted to the profile being
displayed); top connections links 304 to explore profiles of the
entities most current and relevant to the displayed profile and to
filter the top articles section 303; and an about section 305,
which contains a brief description and fast facts regarding the
named entity whose profile is being displayed.
[0044] FIG. 3B illustrates details of the connections map shown in
FIG. 3A. In particular, in connections map 320, the large central
circle (or node) (e.g., node 311) represents the profiled person,
place, or thing. The smaller nodes (e.g., node 312) are its top
connections. The lines between the nodes (e.g., line/dot 310)
represent the actual connection, which may be presented, for
example, when the user hovers an input device over the dot on the
line. When a user selects the action (e.g., dot 310), the top
articles section is updated to reflect that connection.
[0045] FIG. 3C illustrates the modifications to the articles 332
displayed when the user interface control 330 is selected to cause
filtering based upon a selected action. Here, the user has selected
the action (i.e., verb) "governing" as reflected in field 331. As a
result, the NCR displays the top articles 332 that show the current
entity "Jennifer Brunner" in a governing relation with other
entities. The top recommended images section 333 and videos section
334 have been updated as well. In some embodiments the user
interface control 330 also includes modifiers of the various named
entities, so that the user may follow leads and find more
information on, for example, the roles of the various entities.
[0046] The powerful NLP-based search processing identifies the
topmost entities in the relationship displayed by the articles
recommended in section 322. That is, these are the entities
involved in a "governing" relationship with "Jennifer Brunner."
FIG. 3D lists these related entities in section 340, which can be
selected to further filter the top articles display. For example,
when the user selects the "Oberlin College" link 346, the filtering
(an abbreviated EQL) is shown in area 341, the articles are changed
to reflect the selection in top articles 342, and the recommended
images links 343 and videos links 344 are also updated. By
selecting the icon 350, the user is able to navigate to the profile
page for that entity when one is available. FIG. 3E is an example
of the profile page 351 for Oberlin College displayed when the icon
350 is selected for the Oberlin College link 346. The user
interface control 352 shows the actions for "Oberlin College" that
can be selected for further filtering. The top articles 353 and
images 354 are updated for the entity "Oberlin College."
[0047] FIGS. 4A-4C are example screen displays of another type of
NCR widget presented adjacent to content. In this case, the NCR 402
is displayed below the news article 401. The behavior of this NCR
widget is similar to that described with reference to FIGS. 1A-1E.
FIG. 4A illustrates what the NCR looks like when it is invoked.
FIG. 4B illustrates the results when a user selects the connection
node "White House" 404 (in relation to Barack Obama 403). FIG. 4C
illustrates example results when the user selects a related named
entity in NCR widget 420. In particular, when the user selects one
of the named entities in the recommended articles, here "New York
Times" link 421, the connection map 423 and the related top
articles 422 are changed to reflect that entity as the focus. Other
behaviors are of course possible.
[0049] FIGS. 6-13 provide a variety of additional forms for the
user interfaces of example embodiments of an NCR.
[0050] FIGS. 6A-6D illustrate example screen displays for an
example embodiment of an NLP-Based Content Recommender in the form
of (hypertext) links to further information. The link can be used
to navigate to the information, which is based upon the entities
recognized in the underlying content. For example, in FIGS. 6A and
6D, several recommendation user interface controls and "tips" are
illustrated (and presumed to be based upon the underlying content
shown, or resultant from a relationship search). In particular, tip
609 displays information relating to Al Qaeda and tip 601 displays
information relating to Barack Obama. For each of these NCR tips,
other forms/presentations are displayed beneath them.
[0051] As described above, the layout of an NCR tip or user
interface control (UI control) may depend upon the information
available. Generally, in the example illustrated in FIGS. 6A-6D,
the name of the entity 602 (e.g., Barack Obama) is presented,
followed by the entity types and roles relating to the entity 603
(e.g., senator, democrat, presidential candidate). Then, for some
tips and/or UI controls, a list of facts about the entity 604 or
608, with or without tags, and/or an overview 607 of further
content is displayed. In at least some embodiments, an image 606
associated with the named entity is also displayed. Importantly, if
more information (as determined by the NCR) is available, then a
link 605 (also referred to as a hyperlink, hypertext, or other
indicator) may be displayed. The link 605 may be further navigated
by a user to display recommended content.
[0052] For example, as shown in larger images in FIGS. 6B and 6C,
the link 605 may be used to navigate to an NCR widget provided, for
example, on a designated website, or transparently. In FIGS. 6A and
6C, a list 660 is displayed of the recognized entities in an
underlying text segment. This list 660 presents an indicator of the
name of the entity, optionally followed by a symbol 611 of some
sort, when further content is available. For example, when "Barack
Obama" is selected, one of the tips 601 is displayed as previously
described. Similarly, when the "United States of America" is
selected, a UI control such as tip 620 is presented. In addition to
the (ordered) list of named entities 660, the NCR widget presents a
set of actions 612, and, when an action is selected, a list of the
relationships 613. In NLP terminology, selecting the action (or
verb) will generate a representation of the subjects or objects
related to the selected entity via that verb. In at least some
embodiments, a list of the most relevant articles 614 to the
currently displayed article is also presented. This list can be
implemented using the InFact.RTM./Evri search technology described
in detail elsewhere. For example, the summary sentence that is
displayed for each article may indicate where the specific
relationship was found.
[0053] According to one example embodiment, to populate the fields
of the tip or UI control, such as action list 612 and connections
list (relationships list) 613, an IQL/EQL query may be performed
against the last "W" weeks of news content to return related
information. In the illustrated case, "N" results are returned for
actions performed by the entity, in this case United States of
America, sorted by action (verb) frequency. The top "V" verbs are
then displayed, as seen in action list 612. In other embodiments,
actions could be derived from an NLP-based relationship extraction
of the context (trigger) text or a set of documents related to the
context text, or from other sources.
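The population of the action list described above can be sketched as
follows. This is a minimal illustration only, not the IQL/EQL query
itself: the Relationship record, the top_actions function, and the
sample data are all hypothetical, and the parameters weeks,
max_results, and top_verbs merely stand in for "W", "N", and "V".

```python
from collections import Counter
from datetime import date, timedelta
from typing import NamedTuple


class Relationship(NamedTuple):
    """One subject-action-object result (hypothetical record layout)."""
    subject: str
    action: str
    obj: str
    published: date


def top_actions(results, entity, today, weeks, max_results, top_verbs):
    """Return the most frequent actions performed by `entity`.

    weeks, max_results, and top_verbs stand in for the "W", "N", and
    "V" parameters described in the text.
    """
    cutoff = today - timedelta(weeks=weeks)
    recent = [r for r in results
              if r.subject == entity and r.published >= cutoff]
    recent = recent[:max_results]  # keep at most N results
    counts = Counter(r.action for r in recent)
    # most_common sorts by descending frequency
    return [verb for verb, _ in counts.most_common(top_verbs)]


# Made-up sample data for illustration only.
today = date(2008, 10, 16)
usa = "United States of America"
sample = [
    Relationship(usa, "sign", "treaty", date(2008, 10, 1)),
    Relationship(usa, "sign", "accord", date(2008, 10, 3)),
    Relationship(usa, "impose", "sanctions", date(2008, 10, 5)),
    Relationship(usa, "sign", "pact", date(2008, 1, 1)),  # outside window
    Relationship("Barack Obama", "visit", "Ohio", date(2008, 10, 6)),
]
actions = top_actions(sample, usa, today,
                      weeks=4, max_results=100, top_verbs=2)
print(actions)  # ['sign', 'impose']
```

In a real embodiment the filtering and counting would presumably be
done by the relationship search engine rather than in client code.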
[0054] FIGS. 7A and 7B are example screen displays that illustrate
use of the widgets shown in FIGS. 6A-6D integrated into an
application, such as a news content provider site. In FIGS. 7A and
7B, underlying content 700, such as a news article about Barack
Obama, is presented, for example, on a web page. Either
automatically, or when explicitly or implicitly indicated by a user
(depending upon the news platform implementation), an information
widget such as widget 701 is displayed. This widget 701 has similar
fields to those described with reference to FIGS. 1B-1D above.
[0055] The progression from FIG. 7A to 7B shows how the illustrated
NCR widget can be dynamically updated as information is found or
computed. For example, the widget can populate the relationships
field 704 based upon the content shown in the most relevant
articles field 710, which in turn is based upon the selected entity
from entity list 702 and the selected action from action list 703.
In at least some embodiments, the content of these fields is
periodically updated, potentially automatically (and transparently)
by rerunning the appropriate NLP queries on a periodic or defined
schedule.
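The dependency among the fields described above (selected entity and
action, then relevant articles, then relationships) can be sketched
as shown below. The dictionary layout, the function names, and the
sample data are assumptions made for illustration; rerunning
relationships_field on fresh articles would correspond to a periodic
update.

```python
def relevant_articles(articles, entity, action):
    """Keep articles containing a relationship with this subject and verb."""
    return [
        a for a in articles
        if any(s == entity and v == action
               for s, v, _ in a["relationships"])
    ]


def relationships_field(articles, entity, action):
    """List the distinct objects tied to `entity` by `action`, in order seen."""
    related = []
    for article in relevant_articles(articles, entity, action):
        for subject, verb, obj in article["relationships"]:
            if subject == entity and verb == action and obj not in related:
                related.append(obj)
    return related


# Made-up sample data for illustration only.
articles = [
    {"title": "Article 1",
     "relationships": [("Barack Obama", "visit", "Ohio"),
                       ("Barack Obama", "meet", "voters")]},
    {"title": "Article 2",
     "relationships": [("Barack Obama", "visit", "Iowa")]},
    {"title": "Article 3",
     "relationships": [("Jennifer Brunner", "visit", "Ohio")]},
]

field = relationships_field(articles, "Barack Obama", "visit")
print(field)  # ['Ohio', 'Iowa']
```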
[0056] FIGS. 8A and 8B illustrate example screen displays for an
example embodiment of an NLP-Based Content Recommender in the form
of graphical links that can be used to navigate to further
information. In this user interface paradigm, relationships are
represented as connected nodes, and recommended content is used as
"annotations" to the nodes and/or the connectors. For example, in
FIG. 8A, several entities 801, 802, 804, and 805 are shown linked
through their relationships. Entities 801 and 802 are person
entities; whereas entities 804 and 805 are an organization entity
and an event entity, respectively.
[0057] When the user hovers over or otherwise selects the named
entity "Keala Kennelly" 801, a tip 850 is displayed with initial
information similar to that described with reference to FIGS.
6A-6D. Again, part of the displayed tip is a link (here labeled
"(read, more)" 852) to further information. When a user navigates
through the link, a detailed entity page 860 is displayed, which
can be populated not just with static information, but with further
content accessible via an NCR widget.
[0058] As shown in a larger image in FIG. 8B, the relationship of
entity Keala Kennelly 801 to the ASP Women's World Tour 2006 event
entity 829 is represented in summarized form in tip 831. When
expanded by selecting a "more" graphical indicator 832, a more
extended form of related content page 830 is displayed. The
extended form 830 shows a list of categories of related content
834, for example news & blogs, pictures and video, and a
related website. An embodiment of an NCR widget can be used to
present and drive the content and/or the links displayed in the
extended page 830. The user can return to the summary form by
selecting a "less" graphical indicator 832.
[0059] FIGS. 9, 10A, 10B, 11A-11C, and 12A-12C illustrate
additional alternatives for providing user interfaces and/or tips
via an NCR widget used to provide related or recommended
content.
[0060] FIG. 9 is another example screen display of a graphical
representation of connections. A graphical representation is shown
of the connections between a subject entity, here "Keala Kennelly,"
and all of the entities she interacts with. Entities having more
distant connections, for example, as determined by the frequency of
the relationships encountered, are displayed as nodes that appear
further from the node that represents Keala.
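One way to place less frequently related entities further from the
subject node is a linear mapping from relationship frequency to a
layout radius. The sketch below assumes such a mapping; the function
name, the radius bounds, and the sample counts are hypothetical and
are not taken from the described embodiments.

```python
def node_distances(frequencies, min_radius=1.0, max_radius=5.0):
    """Map relationship frequency to a layout radius: frequent = close."""
    if not frequencies:
        return {}
    highest = max(frequencies.values())
    lowest = min(frequencies.values())
    span = highest - lowest
    distances = {}
    for entity, freq in frequencies.items():
        # closeness is 1.0 for the most frequent entity, 0.0 for the least
        closeness = 1.0 if span == 0 else (freq - lowest) / span
        distances[entity] = max_radius - closeness * (max_radius - min_radius)
    return distances


# Made-up relationship counts for illustration only.
distances = node_distances({"entity_a": 10, "entity_b": 5, "entity_c": 0})
print(distances)  # {'entity_a': 1.0, 'entity_b': 3.0, 'entity_c': 5.0}
```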
[0061] FIGS. 10A and 10B illustrate another interface for
presenting related content to an underlying named entity, for
example, one either selected by a user directly, or perhaps even
indirectly via entity recognition of entities presented on an
underlying web page. In the illustrated example, content relating
to a named entity 1001 "Arnold Schwarzenegger" is presented. Fast
facts area 1004 displays a number of tidbits of quick information
regarding the named entity 1001, which may be available as
determined by the frequency of information gleaned during the
natural language based analysis of related information or other
contextual information. Roles list 1002 contains a list of all of
the roles (facets or categories) found for the named entity
1001. A detailed entity description 1005 is shown followed by a
graphical representation of his roles, which display shows a
"weighting" associated with such roles. Questions area 1006
illustrates the use of query templates and navigation tips for
finding and presenting related information without the user needing
to type in a query via a query language such as IQL/EQL. Related
entities area 1008, also supported by comprehensive NLP based
searching and indexing, allows the user to navigate to other
related information.
[0062] FIGS. 11A-11C illustrate another example NCR widget that
combines some of the previously described textual and graphical
presentations to present related and/or auxiliary information. For
example, in FIG. 11A, the user is presented with an NCR widget 1110
displayed in the foreground of the underlying (news) content 1100.
The widget presents a list 1102 with quick summaries of the most
relevant similar articles to the underlying content 1100 along with
a graphical representation of the "connections" (relationships)
1107 to entities that appear in the article 1105 selected from the
related articles list 1102. FIG. 11B shows an alternative graphical
representation of the connections 1112 derived from a selected
article 1111 of articles list 1102. FIG. 11C is an illustration of
an image 1120 rendered in response to user selection of image 1115
from a display of images. Text 1113 shows story highlights from
selected article 1111.
[0063] FIGS. 12A-12C illustrate another example NCR widget
integrated into a website that provides links to news and blog
information. The presentations of this NCR widget focus on
timeliness and frequency concepts, and thus the various displays
may be organized differently than might be presented elsewhere. For
example, the article summary list 1200 displayed under the "Related
News and Blogs" tab may be beneficial in social networking and/or
blogging venues in that its summaries are brief and list the source
of the content and the time when posted. In addition, in FIG. 12B, under
the "Most Popular Content" tab, the entity names that appear in the
most frequent news and blog postings are displayed with graphical
indications according to their importance to and frequency found
within the documents being searched (for example, in real time).
For example, some entities in list 1201 are presented in different
size fonts, different colors, etc. FIG. 12C illustrates, under the
"Connections" tab, a representation of the connections
(relationships) 1202 that may be explored in the articles
summarized in article summary list 1200. These connection nodes are
the result of relationship queries on the underlying documents
summarized in article summary list 1200.
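The frequency-based presentation of entity names under the "Most
Popular Content" tab could, for example, be approximated by ranking
entities by mention count and assigning each rank to one of a few
font sizes. The following is a sketch under that assumption; the
function name, the size values, and the counts are hypothetical.

```python
def font_sizes(mention_counts, sizes=(12, 16, 20, 28)):
    """Assign each entity a font size from `sizes` by frequency rank.

    Entities are ranked from least to most mentioned and spread evenly
    across the available sizes; tied entities keep their input order.
    """
    ranked = sorted(mention_counts, key=mention_counts.get)
    total = len(ranked)
    result = {}
    for rank, entity in enumerate(ranked):
        bucket = rank * len(sizes) // total  # 0 .. len(sizes) - 1
        result[entity] = sizes[bucket]
    return result


# Made-up mention counts for illustration only.
sized = font_sizes({"entity_a": 1, "entity_b": 20,
                    "entity_c": 9, "entity_d": 5})
print(sized)
```

Color or weight could be assigned the same way by substituting a
different tuple for sizes.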
[0064] Other representations for presenting recommended content by
means of NLP-Based Content Recommenders are also contemplated.
It is notable that many such representations hide the power of the
underlying relationship indexing and searching technology by giving
the user simple navigation tools and hints for getting more
information. Moreover, the information is determined, calculated,
and presented in substantially real-time or near real-time, and may
be dynamically updated periodically, or at specified intervals, or
according to different schedules.
[0065] An NCR widget may be implemented using standard programming
techniques that leverage the capabilities of a NLP-based processing
engine that can perform indexing and relationship searching. It is
to be understood that, although the interfaces illustrated in FIGS.
1B-12 are described as incorporating the powerful capabilities of
NLP processing, less sophisticated searching techniques can also
take advantage of the user interface designs of such widgets, tips,
and user interface controls to the extent they are able to generate
a portion of the content. For example, using a standard keyword
search that pattern matches terms, some number of the entities
referred to in an underlying article may be uncovered using
frequency counts; however, to the extent the text is complex (and,
for example, contains aliases, coreferences, pronouns, ambiguous
nouns, etc.) it is not possible to confidently discover and
subsequently list all of the named entities in the underlying
document. To do this, the document must be "understood."
Accordingly, the sophisticated and powerful natural language
technology supporting the content recommenders described herein
can be used to achieve far improved results.
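The limitation of keyword pattern matching noted above can be seen in
a small sketch. The code below is the less sophisticated technique
(exact-match frequency counting), not the NLP-based approach; the
function name and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def keyword_mentions(text, names):
    """Count exact, case-insensitive occurrences of each name in text."""
    counts = Counter()
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        counts[name] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Made-up sample text for illustration only.
text = ("Barack Obama spoke in Ohio. The senator said he would return. "
        "Obama later met supporters.")
counts = keyword_mentions(text, ["Barack Obama", "Ohio"])
print(counts["Barack Obama"], counts["Ohio"])  # 1 1
```

The sample text refers to the same person four times ("Barack
Obama," "The senator," "he," and "Obama"), yet the pattern match
finds only the one exact occurrence. Resolving the alias, the role
reference, and the pronoun requires the kind of linguistic analysis
the paragraph above describes.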
[0066] Also, although certain terms are used primarily herein,
other terms could be used interchangeably to yield equivalent
embodiments and examples. In addition, terms may have alternate
spellings which may or may not be explicitly mentioned, and all
such variations of terms are intended to be included. In addition,
in the following description, numerous specific details are set
forth, such as data formats and code sequences, etc., in order to
provide a thorough understanding of the described techniques. The
embodiments described also can be practiced without some of the
specific details described herein, or with other specific details,
such as changes with respect to the ordering of the code flow,
different code flows, etc. Thus, the scope of the techniques and/or
functions described are not limited by the particular order,
selection, or decomposition of steps described with reference to
any particular routine.
[0067] FIG. 13 is an example block diagram of an example computing
system that may be used to practice embodiments of a NLP-Based
Content Recommender. Note that a general purpose or a special
purpose computing system may be used to implement an NCR. Further,
the NCR may be implemented in software, hardware, firmware, or in
some combination to achieve the capabilities described herein.
[0068] Computing system 1300 may comprise one or more server and/or
client computing systems and may span distributed locations. In
addition, each block shown may represent one or more such blocks as
appropriate to a specific embodiment or may be combined with other
blocks. Moreover, the various blocks of the NCR 1310 may physically
reside on one or more machines, which use standard (e.g., TCP/IP)
or proprietary interprocess communication mechanisms to communicate
with each other.
[0069] In the embodiment shown, computer system 1300 comprises a
computer memory ("memory") 1301, a display 1302, one or more
Central Processing Units ("CPU") 1303, Input/Output devices 1304
(e.g., keyboard, mouse, CRT or LCD display, etc.), other
computer-readable media 1305, and network connections 1306. The NCR
1310 is shown residing in memory 1301. In other embodiments, some
portion of the contents, some of, or all of the components of the
NCR 1310 may be stored on and/or transmitted over the other
computer-readable media 1305. The components of the NCR 1310
preferably execute on one or more CPUs 1303 and perform entity
identification and present content recommendations, as described
herein. Other code or programs 1330 and potentially other data
repositories, such as data repository 1320, also reside in the
memory 1301, and preferably execute on one or more CPUs 1303. Of
note, one or more of the components in FIG. 13 may not be present
in any particular implementation. For example, some embodiments
embedded in other software may not provide means for other user
input or display.
[0070] In one embodiment, the NCR 1310 includes an entity
identification engine 1311, a knowledge analysis engine 1312, an
NCR user interface support module 1313, an NLP parsing engine or
preprocessor 1314, an NCR API 1317, a data repository (or interface
thereto) for storing document NLP data 1316, and a knowledge data
repository 1315, for example, an ontology index, for storing
information from a multitude of internal and/or external sources.
In at least some embodiments, one or more of the NLP parsing
engine/preprocessor 1314, the entity identification engine 1311,
and the knowledge analysis engine 1312 are provided external to the
NCR and are available, potentially, over one or more networks 1380.
Other and/or different modules may be implemented. In addition, the
NCR 1310 may interact via a network 1380 with applications or
client code 1355 that uses results computed by the NCR 1310, one or
more client computing systems 1360, and/or one or more third-party
information provider systems 1365, such as purveyors of information
used in knowledge data repository 1315. Also, of note, the
knowledge data 1315 and the document data 1316 may be provided
external to the NCR as well, for example, and be accessible over
one or more networks 1380 to the NCR.
[0071] In an example embodiment, components/modules of the NCR 1310
are implemented using standard programming techniques. However, a
range of programming languages known in the art may be employed for
implementing such example embodiments, including representative
implementations of various programming language paradigms,
including but not limited to, object-oriented (e.g.,
Java, C++, C#, Smalltalk), functional (e.g., ML, Lisp,
Scheme, etc.), procedural (e.g., C, Pascal, Ada, Modula, etc.),
scripting (e.g., Perl, Ruby, Python, JavaScript, VBScript, etc.),
declarative (e.g., SQL, Prolog, etc.), etc.
[0073] The embodiments described use well-known or proprietary
synchronous or asynchronous client-server computing techniques.
However, the various components may be implemented using more
monolithic programming techniques as well, for example, as an
executable running on a single CPU computer system, or alternately
decomposed using a variety of structuring techniques known in the
art, including but not limited to, multiprogramming,
multithreading, client-server, or peer-to-peer, running on one or
more computer systems each having one or more CPUs.
[0074] Some embodiments are illustrated as executing concurrently
and asynchronously and communicating using message passing
techniques. Equivalent synchronous embodiments are also supported
by an NCR implementation.
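As one illustration of concurrent, asynchronous modules that
communicate by message passing, the sketch below connects a stand-in
entity identification task to a collector through queues. It uses
Python's standard asyncio library; the function names, the sentinel,
and the capitalized-word heuristic are assumptions made for
illustration and do not reflect how an actual entity identification
engine works.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the message stream


async def identify_entities(inbox, outbox):
    """Stand-in for an entity identification module."""
    while True:
        text = await inbox.get()
        if text is _DONE:
            await outbox.put(_DONE)
            return
        # Placeholder logic: treat capitalized words as entities.
        words = [w.strip(".,") for w in text.split()]
        await outbox.put([w for w in words if w[:1].isupper()])


async def collect(outbox):
    """Gather result messages until the sentinel arrives."""
    results = []
    while True:
        item = await outbox.get()
        if item is _DONE:
            return results
        results.append(item)


async def run_pipeline(texts):
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(identify_entities(inbox, outbox))
    for text in texts:
        await inbox.put(text)
    await inbox.put(_DONE)
    results = await collect(outbox)
    await worker
    return results


results = asyncio.run(run_pipeline(["Obama visited Ohio."]))
print(results)  # [['Obama', 'Ohio']]
```

An equivalent synchronous embodiment would simply call the
identification logic directly and return its result.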
[0075] In addition, programming interfaces to the data stored as
part of the NCR 1310 (e.g., in the data repositories 1315 and 1316)
can be made available by standard means such as through C, C++, C#,
and Java APIs; libraries for accessing files, databases, or other
data repositories; through markup languages such as XML; or
through Web servers, FTP servers, or other types of servers
providing access to stored data. The data repositories 1315 and
1316 may be implemented as one or more database systems, file
systems, or any other method known in the art for storing such
information, or any combination of the above, including
implementation using distributed computing techniques.
[0076] Also, the example NCR 1310 may be implemented in a
distributed environment comprising multiple, even heterogeneous,
computer systems and networks. For example, in one embodiment, the
modules 1311-1314, and 1317, and the data repositories 1315 and 1316
are all located in physically different computer systems. In
another embodiment, various modules of the NCR 1310 are hosted each
on a separate server machine and may be remotely located from the
tables which are stored in the data repositories 1315 and 1316.
Also, one or more of the modules may themselves be distributed,
pooled or otherwise grouped, such as for load balancing,
reliability or security reasons. Different configurations and
locations of programs and data are contemplated for use with
the techniques described herein. A variety of distributed computing
techniques are appropriate for implementing the components of the
illustrated embodiments in a distributed manner including but not
limited to TCP/IP sockets, RPC, RMI, HTTP, Web Services (XML-RPC,
JAX-RPC, SOAP, etc.). Other variations are possible. Also, other
functionality could be provided by each component/module, or
existing functionality could be distributed amongst the
components/modules in different ways, yet still achieve the
functions of an NCR.
[0077] Furthermore, in some embodiments, some or all of the
components of the NCR may be implemented or provided in other
manners, such as at least partially in firmware and/or hardware,
including, but not limited to, one or more application-specific
integrated circuits (ASICs), standard integrated circuits,
controllers (e.g., by executing appropriate instructions, and
including microcontrollers and/or embedded controllers),
field-programmable gate arrays (FPGAs), complex programmable logic
devices (CPLDs), etc. Some or all of the system components and/or
data structures may also be stored as contents (e.g., as executable
or other machine-readable software instructions or structured data)
on a computer-readable medium (e.g., as a hard disk; a memory; a
computer network or cellular wireless network or other data
transmission medium; or a portable media article to be read by an
appropriate drive or via an appropriate connection, such as a DVD
or flash memory device) so as to enable or configure the
computer-readable medium and/or one or more associated computing
systems or devices to execute or otherwise use or provide the
contents to perform at least some of the described techniques. Some
or all of the system components and data structures may also be
transmitted as contents of generated data signals (e.g., by being
encoded as part of a carrier wave or otherwise included as part of
an analog or digital propagated signal) on a variety of
computer-readable transmission mediums, including wireless-based
and wired/cable-based mediums, and may take a variety of forms
(e.g., as part of a single or multiplexed analog signal, or as
multiple discrete digital packets or frames). Such computer program
products may also take other forms in other embodiments.
Accordingly, embodiments of the present disclosure may be practiced
with other computer system configurations.
[0078] All of the above U.S. patents, U.S. patent application
publications, U.S. patent applications, foreign patents, foreign
patent applications and non-patent publications referred to in this
specification and/or listed in the Application Data Sheet,
including but not limited to U.S. Provisional Patent Application
No. 60/999,559, entitled "NLP-BASED CONTENT RECOMMENDER," filed
Oct. 17, 2007, and U.S. application Ser. No. 12/288,347, entitled
"NLP-BASED CONTENT RECOMMENDER," filed Oct. 16, 2008, are
incorporated herein by reference, in their entireties.
[0079] From the foregoing it will be appreciated that, although
specific embodiments have been described herein for purposes of
illustration, various modifications may be made without deviating
from the spirit and scope of this disclosure. For example, the
methods, techniques, and systems for entity recognition and
disambiguation are applicable to architectures other than a
Web-based architecture. For example, other systems that are
programmed to perform natural language processing can be employed.
Also, the methods, techniques, and systems discussed herein are
applicable to differing query languages, protocols, communication
media (optical, wireless, cable, etc.) and devices (such as
wireless handsets, electronic organizers, personal digital
assistants, portable email machines, game machines, pagers,
navigation devices such as GPS receivers, etc.).
* * * * *