Methods And Systems For Determining Media Value ARTZT; Greg ; et al. [General Sentiment, Inc.;]

Methods And Systems For Determining Media Value

ARTZT; Greg ; et al.

Patent Application Summary

U.S. patent application number 13/794385 was filed with the patent office on 2013-03-11 and published on 2013-07-25 for methods and systems for determining media value. This patent application is currently assigned to General Sentiment, Inc. The applicant listed for this patent is General Sentiment, Inc. Invention is credited to Greg ARTZT, Mark Fasciano, Levon Lloyd, Steve Skiena.

Application Number	20130191380 13/794385
Document ID	/
Family ID	44560918
Filed Date	2013-03-11

United States Patent Application	20130191380
Kind Code	A1
ARTZT; Greg ; et al.	July 25, 2013

METHODS AND SYSTEMS FOR DETERMINING MEDIA VALUE

Abstract

Exemplary embodiments are directed to determining a media value associated with mentions of an entity in one or more documents based on a sentiment attributed to the mentions of the entity and/or a frequency with which the entity is mentioned. Exemplary embodiments can include a media value engine that can identify mentions of an entity in documents, attribute sentiment to the mentions of the entity, determine a polarity of the sentiment, and calculate a media value attributed to the entity based on the sentiment.

Inventors:

ARTZT; Greg; (Weddington, NC) ; Fasciano; Mark; (Port Washington, NY) ; Skiena; Steve; (Setauket, NY) ; Lloyd; Levon; (Patchogue, NY)

Applicant:

Name	City	State	Country	Type
General Sentiment, Inc.;	Hicksville	NY	US

Assignee:

General Sentiment, Inc.
Hicksville
NY

Family ID:

44560918

Appl. No.:

13/794385

Filed:

March 11, 2013

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
13047527	Mar 14, 2011	8402035
13794385
61313342	Mar 12, 2010

Current U.S. Class:	707/727
Current CPC Class:	G06F 16/24578 20190101; G06Q 30/02 20130101
Class at Publication:	707/727
International Class:	G06F 17/30 20060101 G06F017/30

Claims

1. A method of determining media value of an entity of interest comprising: calculating a media value based on a frequency of instances of the entity included in one or more documents, wherein calculating the media value is based on a sentiment associated with the instances of the entity included in the one or more documents and wherein calculating the media value comprises: determining a weighted value attributed to the entity; determining a total number of references to the entity in the one or more documents; and multiplying the weighted value attributed to the entity by the total number of references to the entity in the one or more documents to generate a weighted entity reference value.

2-10. (canceled)

11. The method of claim 1, further comprising: identifying instances of the entity included in one or more computer documents; and associating a sentiment with the instances of the entity.

12. The method of claim 1, further comprising: generating a media value report using the media value.

13. The method of claim 1, wherein the media value is represented in terms of a financial value.

14-15. (canceled)

16. A system for determining media value of an entity of interest comprising: a computing system having one or more computers, the computing system being configured to calculate a media value based on a frequency of instances of the entity included in one or more computer documents wherein calculating the media value is based on sentiment associated with instances of the entity included in one or more computer documents and wherein calculating the media value comprises: determining a weighted value attributed to the entity; determining a total number of references to the entity in the one or more computer documents; and multiplying the weighted value attributed to the entity by the total number of references to the entity in the one or more computer documents to generate a weighted entity reference value.

17. The method of claim 1, wherein the one or more documents are published within a specified time range.

18. The method of claim 1, wherein calculating the media value further comprises: determining an exposure number representing a number of people to which the one or more documents are distributed; determining an economic value per person attributed to exposure of the one or more documents; and multiplying the exposure number by the economic value to generate a media value multiplier.

19. The system of claim 16, wherein the one or more computer documents are published within a specified time range.

20. The system of claim 16, wherein calculating the media value further comprises: determining an exposure number representing a number of people to which the one or more computer documents are distributed; determining an economic value per person attributed to exposure of the one or more computer documents; and multiplying the exposure number by the economic value to generate a media value multiplier.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application No. 61/313,342, filed Mar. 12, 2010, the contents of which are hereby incorporated by reference in their entirety.

BACKGROUND

[0002] 1. Technical Field

[0003] Embodiments disclosed herein are directed to determining media value associated with entities of interest mentioned in one or more documents.

[0004] 2. Brief Discussion of Related Art

[0005] Typically entities, such as corporations, are willing to pay a fee to advertise to gain exposure to a target recipient. For example, entities may pay to include an advertisement in a magazine, newspaper, webpage, and the like. Today, entities are being mentioned across the Internet, in news, blogs, tweets, and other social media. This "buzz" can be created by product launches, ad campaigns, PR events, earnings reports, a single consumer's product experience, and many other triggers, even scandals. Many times, this buzz is unsolicited by the entity and/or occurs without requiring the entity to pay a fee. For example, a product manufactured by an entity can be included in a product review article, an article can discuss financial statements of the entity, and the like. Such mentions or occurrences can have advertising or marketing value. For example, if a product review is negative, the value of the product review to the entity may be negative or in some instances may be considered positive. Likewise, if the product review is positive, the value of the product review to the entity may be positive. Taking this value into account can aid in optimizing marketing strategies.

[0006] As such, it would be desirable to attribute a media value to the mentions or occurrences of entities in documents based on whether the mentions or occurrences reflect negative or positive sentiment.

SUMMARY

[0007] In one aspect, a method of determining media value of an entity of interest is disclosed. The method includes calculating a media value based on a frequency of instances of the entity included in the one or more computer documents.

[0008] In another aspect, a non-transitory computer readable medium storing instructions executable by a computing system including at least one computing device is disclosed. Execution of the instructions implements a method for determining media value of an entity of interest that includes calculating a media value based on the sentiment associated with the instances of the entity of interest included in the one or more computer documents.

[0009] In yet another aspect, a system for determining media value of an entity of interest is disclosed. The system can include a computing system having one or more computers. The computing system is configured to calculate a media value based on the sentiment associated with the instances of the entity of interest included in the one or more computer documents.

[0010] In still another aspect, a method of determining media value of an entity of interest is disclosed. The method includes identifying mentions of an entity in one or more documents, attributing a sentiment to the mentions, determining a polarity of the sentiment, the polarity being negative or positive, and calculating a media value based on the sentiment attributed to the entity included in the one or more computer documents.

[0011] Other objects and features will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed as an illustration only.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIG. 1 depicts a block diagram illustrating an exemplary embodiment of a media value (MV) engine.

[0013] FIG. 2 depicts an exemplary computing device for implementing embodiments of the MV engine.

[0014] FIG. 3 depicts an exemplary computing system for implementing embodiments of the MV engine in a networked environment.

[0015] FIG. 4 is a flowchart illustrating a process of determining media value for entities of interest and generating a media value report using the media value.

[0016] FIG. 5 is a flowchart illustrating calculating media value for an entity based on sentiment attributed to the entity in one or more documents.

[0017] FIG. 6 is an exemplary document that has been marked-up by an embodiment of the MV engine.

[0018] FIG. 7 is an exemplary graph that can be generated for outputting a media value generated by embodiments of the MV engine.

DETAILED DESCRIPTION

[0019] Exemplary embodiments are directed to determining media value corresponding to an entity based on a frequency with which the entity is mentioned in one or more documents and/or based on sentiment attributed to the entity in one or more documents. Embodiments can scour and analyze sources of content, "listening" in real-time to the mentions of brands, products, politicians, celebrities, companies, and the like, and can calculate media value based on these mentions.

[0020] Exemplary embodiments can include a media value engine, which can process the one or more documents to identify entities of interest, determine sentiment associated with occurrences of the identified entities in the documents, determine a polarity associated with the sentiment, and calculate a media value for the identified entities using, at least in part, the frequency with which the entity is mentioned in the documents, the sentiment, and/or a polarity of the sentiment identified in the documents. Embodiments of the media value engine can generate an output for an entity, such as a media value report, a dashboard display in a web-based user interface, or other suitable output, which includes the calculated media value for the entity.

[0021] As used herein, "media value" refers to an economic value or financial value, such as an amount of currency including, for example, dollars, Euros, pounds, Yen, and the like, attributed to the mentions or occurrences of an entity in one or more documents. As one example, media value can represent the advertising purchase-equivalent value of an entity's media exposure across the web. Using a frequency with which an entity is mentioned and/or a sentiment associated with the mentions as weighting metrics, embodiments of the media value engine can estimate what it would have cost to attract the same media exposure through traditional advertising channels.

[0022] FIG. 1 depicts a block diagram of a media value (MV) engine 100. The MV engine 100 collects a corpus of computer documents and identifies mentions (e.g., occurrences) of entities in the corpus. An entity can be a name for a given person, place, or thing. For example, an entity can be a name of a person, a company, a consumer brand, a product, a service, a university, a city, a state, a country, and the like. A corpus can refer to classes of document sources from which documents can be collected. A corpus can include one or more document sources, such as social media postings in a social networking site, microblogging websites, news media articles published by news media websites, press releases published by entities on their website or through other channels, and the like. Each document source can include one or more documents. Mentions or occurrences of an entity in a document can refer to instances of the entity included in the document, such as the name of the entity, where each instance of the entity in the document is a mention or occurrence of the entity in the document.

[0023] The MV engine 100 identifies sentiment and a polarity of the sentiment expressed in the documents and can determine a number of people to whom the document is distributed or exposed. Sentiment can refer to a manifestation of an opinion, fact, emotion, attitude, bias, and the like, in a document, which may solicit an interpretation, reaction, response, and the like, from a viewer/observer of the sentiment. The sentiment can have an associated polarity that can be indicative of the likely or anticipated interpretation, reaction, response, and the like, of a viewer of the sentiment. For example, sentiment can have a positive polarity indicating that the sentiment is favorable for the entity associated with the sentiment, a negative polarity indicating that the sentiment is unfavorable for the entity associated with the sentiment, or a neutral polarity indicating that the sentiment is not positive or negative. A value can be assigned to the sentiment based on whether the polarity of the sentiment is positive or negative. In some embodiments, the value assigned to the sentiment can be weighted to give more or less value to sentiment based on a degree of the polarity. In some instances, while the polarity of the sentiment can be negative, the negative sentiment can have a positive, neutral, or negative effect on the media value. Thus, the MV engine 100 can be configured based on the notion that any exposure is valuable exposure. For example, an entity may consider any mention, negative or positive, as having some positive media value.

[0024] For each entity identified in the corpus of documents, the MV engine 100 stores an amount of sentiment expressed, a polarity for the sentiment, a date of publication for each document, an amount of exposure of a document source from which the documents were obtained, and the like. The MV engine 100 can maintain a media value (MV) database in which the sentiment that has been attributed to one or more entities is stored for each document source in the corpus and each day that has been processed. The media value for an entity identified in the documents can be calculated based on a total sentiment expressed towards the entity from each source in the corpus and/or a frequency with which the entity is mentioned in the documents (e.g., a number of times the entity is mentioned in a document), as well as a total exposure of each source. Media value can refer to an economic value or financial value, such as an amount of currency including, for example, dollars, Euros, pounds, Yen, and the like, attributed to the mentions or occurrences of the entity in the document sources. The media value can be expressed as a cumulative value for the documents in the corpus, per document source, per document, and the like. The MV engine 100 includes an entity identifier 110, a sentiment analyzer 120, a media value (MV) calculator 130, and an output generator 140.

[0025] The entity identifier 110 identifies mentions, occurrences, or instances of entities in a document. An entity can be a member of a category of interest (e.g., "John Smith" is a member of the category of interest "person"; "General Sentiment" is a member of the category of interest "corporation"). Categories of interest can include people, geographic locations, consumer brands, products, services, companies, universities, and the like. The entity identifier 110 can receive a document from the corpus of documents as an input and can produce an output identifying occurrences of entities that are found in the document. The output of the entity identifier can be a marked-up version of the document in which the occurrences of the entity can be highlighted using tagging, changes in the color of the text, changes in the size of the text, changes in the font of the text, and the like. The entity identifier 110 can include a part-of-speech tagger 112, a natural-language rules analyzer 114 (hereinafter "rules analyzer 114"), and a white-list applier 116.

[0026] The part-of-speech tagger 112 can identify a part-of-speech for the words in the document received by the entity identifier 110. For example, based on historical usage patterns (e.g., "dog" is usually a noun, while "fast" can be a noun, verb, or adjective) and common patterns of part-of-speech usage, the part-of-speech tagger 112 outputs a part-of-speech for each word in the document. The part-of-speech tagger 112 can generate a marked-up version of the document in which the part-of-speech tagger 112 can append the part-of-speech to the end of each word in the document. The part-of-speech can be appended to each word as a tag, such as a mark-up language tag.

[0027] Once the parts-of-speech for the words in the document have been determined, the rules analyzer 114 can group words of a document together to identify entities based on a set of pre-determined patterns. The rules analyzer 114 can include a set of rules that can be used by the rules analyzer 114 to identify entities having a name composed of more than one word. The set of rules can be based on parts-of-speech identified by the part-of-speech tagger 112. As one example, the rules analyzer can include a rule such that when the word "University" appears in a document, followed by the words "of" or "at", followed by a sequence of proper nouns, such as "Southern California", the rules analyzer combines the words as a single entity (e.g., University of Southern California) and identifies the words as a mention or occurrence of the single entity. The rules analyzer 114 applies the rules to each sentence in the document to identify entities. The rules analyzer 114 can mark up the version of the document received from the part-of-speech tagger 112. The occurrences of the entity identified using the rules analyzer 114 can be highlighted using tagging, changes in the color of the text, changes in the size of the text, changes in the font of the text, and the like.
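The "University of ..." rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the application's implementation: the function name `group_university_entities`, the `(word, tag)` token layout, and the use of the tag `NNP` for proper nouns are assumptions made for the example.

```python
from typing import List, Tuple

# (word, part-of-speech tag). The tag names are an assumption for this sketch;
# "NNP" stands in for whatever label the tagger uses for proper nouns.
Token = Tuple[str, str]

def group_university_entities(tokens: List[Token]) -> List[str]:
    """Apply one rule: 'University' + ('of' | 'at') + one or more proper nouns."""
    entities = []
    i = 0
    while i < len(tokens):
        word, _ = tokens[i]
        if (word == "University"
                and i + 2 < len(tokens)
                and tokens[i + 1][0] in ("of", "at")
                and tokens[i + 2][1] == "NNP"):
            # Extend the match over the whole run of proper nouns.
            j = i + 2
            while j < len(tokens) and tokens[j][1] == "NNP":
                j += 1
            entities.append(" ".join(w for w, _ in tokens[i:j]))
            i = j
        else:
            i += 1
    return entities
```

A fuller rules analyzer would presumably hold many such patterns and apply each of them sentence by sentence; a single hard-coded rule is shown here only to make the grouping step concrete.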

[0028] The white-list applier 116 of the entity identifier 110 can facilitate automatic recognition of entities in the documents. The white-list applier 116 can include a set of words and/or phrases representing the names of entities to be automatically recognized in the documents. The white-list applier 116 can ensure that occurrences of specific entities in documents are identified and can facilitate identification of entities included in the list without requiring the part-of-speech tagger 112 and/or the rules analyzer 114 to detect the entities. Thus, the white-list applier can be used in combination with the part-of-speech tagger 112 and the rules analyzer 114 to identify some or all of the entities mentioned in the documents. The white-list applier 116 can scan the document for instances of the entries in the list and can compare the words and/or phrases in the list to the words and/or phrases in the document, and when a word or phrase in the document matches a word and/or phrase in the list, the white-list applier 116 can identify the word and/or phrase in the document as a name of an entity. The white-list applier 116 can generate a marked-up version of the document, or can mark-up the version of the document output by the part-of-speech tagger 112 and/or rules analyzer 114, in which the entities identified by the white-list applier 116 can be highlighted using tagging, changes in the color of the text, changes in the size of the text, changes in the font of the text, and the like.
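As a rough illustration of the white-list scan, the sketch below matches listed names against document text with a regular expression. The function name `find_whitelisted` and the choice to prefer the longest listed name at a given position are assumptions for the example; the application does not specify how overlapping entries are resolved.

```python
import re
from typing import Iterable, List, Tuple

def find_whitelisted(text: str, whitelist: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (character offset, name) for each whole-word match of a listed name."""
    # Longest names first, so "Acme Corp" is preferred over "Acme" at the same spot.
    names = sorted(set(whitelist), key=len, reverse=True)
    if not names:
        return []
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]
```

The returned offsets could then drive whatever mark-up (tags, color, font) the document output uses.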

[0029] The sentiment analyzer 120 can identify sentiment expressed in a document, a polarity of the sentiment, and entities to which the sentiment is directed. The sentiment analyzer 120 can use natural language processing to identify the sentiment expressed in a document and can determine an amount of sentiment attributed to each entity identified in a document. For example, the sentiment analyzer 120 can identify a cumulative amount of sentiment having a positive polarity and a negative polarity in a document. The sentiment analyzer 120 can receive the marked-up version of the document output by the entity identifier 110 as an input and can output the sentiment expressed towards each entity identified in the document. The sentiment analyzer 120 includes a sentiment lexicon generator 122, a sentiment word identifier 124, a sentiment attribution analyzer 126, and a sentiment aggregator 128. Those skilled in the art will recognize that sentiment in a document can be identified using other techniques and that sentiment identification is not limited to the illustrative embodiments described herein.

[0030] The sentiment lexicon generator 122 can generate a lexicon of sentiment words and/or phrases using a computer dictionary of synonym/antonym relationships between words and/or phrases. In some embodiments, a small seed set of positive and negative sentiment words can be used to derive the lexicon of sentiment words. In some embodiments, sentiment lexicon generation by the sentiment lexicon generator 122 can use path analysis. Expanding seed lists into lexicons can be performed using recursive querying for synonyms using a computer dictionary. The sentiment lexicon generator can expand a set of seed words using synonym and antonym queries. A polarity (positive or negative) can be associated with the words and/or phrases in the sentiment lexicon and synonyms and antonyms of the words and/or phrases can be identified. Synonyms of a word and/or phrase inherit the polarity from the parent, whereas antonyms of the word and/or phrase inherit the opposite polarity.
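The seed-expansion idea can be sketched as a breadth-first walk over synonym and antonym links. In this sketch the dictionaries `synonyms` and `antonyms` stand in for the computer dictionary, polarity is encoded as +1 or -1, and the `max_depth` cutoff is an assumption added to keep the expansion bounded; none of these names come from the application.

```python
from collections import deque
from typing import Dict, List

def expand_lexicon(seeds: Dict[str, int],
                   synonyms: Dict[str, List[str]],
                   antonyms: Dict[str, List[str]],
                   max_depth: int = 2) -> Dict[str, int]:
    """Expand seed words: synonyms inherit polarity, antonyms get the opposite."""
    lexicon = dict(seeds)
    queue = deque((word, 0) for word in seeds)
    while queue:
        word, depth = queue.popleft()
        if depth >= max_depth:
            continue
        related = [(s, lexicon[word]) for s in synonyms.get(word, [])]
        related += [(a, -lexicon[word]) for a in antonyms.get(word, [])]
        for other, polarity in related:
            if other not in lexicon:  # the first (shortest) path to a word wins
                lexicon[other] = polarity
                queue.append((other, depth + 1))
    return lexicon
```

Keeping only the first path found is one simple way to resolve conflicting polarities; a path-analysis approach such as the one the paragraph mentions could instead weigh several paths against each other.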

[0031] The sentiment word identifier 124 receives the document from the entity identifier 110 and the sentiment lexicon generated by a sentiment lexicon generator 122 as an input and outputs the identified sentiment words and/or phrases based on the sentiment lexicon along with any associated modifiers, such as, for example, "not", "very", and the like, which can modify the sentiment (e.g. "not" reverses polarity, "very" magnifies sentiment). For example, the sentiment word identifier 124 can compare words and/or phrases in the sentiment lexicon to words and/or phrases in the document. The words indicating sentiment can be identified by marking-up the document.
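A minimal sketch of sentiment-word identification with modifiers follows. It assumes a lexicon mapping lowercase words to +1 or -1, looks only at the single word immediately before each sentiment word, and uses an illustrative multiplier of 2.0 for "very"; the function name and these choices are assumptions, not details from the application.

```python
from typing import Dict, List, Tuple

NEGATORS = {"not"}              # reverse the polarity
INTENSIFIERS = {"very": 2.0}    # magnify the sentiment; the factor is illustrative

def score_sentiment_words(words: List[str],
                          lexicon: Dict[str, int]) -> List[Tuple[int, str, float]]:
    """Return (token index, word, score) for each lexicon word in the token list."""
    results = []
    for i, word in enumerate(words):
        lowered = word.lower()
        if lowered not in lexicon:
            continue
        score = float(lexicon[lowered])
        previous = words[i - 1].lower() if i > 0 else ""
        if previous in NEGATORS:
            score = -score
        elif previous in INTENSIFIERS:
            score *= INTENSIFIERS[previous]
        results.append((i, lowered, score))
    return results
```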

[0032] The sentiment attribution analyzer 126 receives the document with the identified entities and sentiment in the marked-up document and attributes the identified sentiment to the entities. In some embodiments, the sentiment attribution analyzer 126 attributes sentiment in a sentence to all entities identified in the sentence. In some embodiments, sentiment can be attributed to an instance of an entity occurring closest to the sentiment (e.g., the entity with the least number of words between the sentiment and the instance of the entity).
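The closest-entity variant of attribution can be sketched as a nearest-neighbor search over token positions. The function name `attribute_to_nearest` and the input layout (entity name mapped to the token indices of its instances) are assumptions for the example; ties go to whichever entity is examined first.

```python
from typing import Dict, List, Optional

def attribute_to_nearest(entity_positions: Dict[str, List[int]],
                         sentiment_index: int) -> Optional[str]:
    """Return the entity with an instance closest (in tokens) to the sentiment word."""
    best_name, best_distance = None, None
    for name, positions in entity_positions.items():
        for position in positions:
            distance = abs(position - sentiment_index)
            if best_distance is None or distance < best_distance:
                best_name, best_distance = name, distance
    return best_name
```

The other embodiment, attributing the sentiment to every entity in the sentence, would simply return all keys of `entity_positions` for that sentence.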

[0033] The sentiment aggregator 128 enters an entry in the MV database representing an amount of sentiment towards entities encountered in the document, along with a date the document was published and the source that published the document. In some embodiments, the sentiment aggregator 128 can sum the number of positive sentiment words attributed to an entity in the document and can subtract the number of negative sentiment words attributed to the entity in the document from the sum. In some embodiments, negative sentiment words that have a negative effect on the media value can be subtracted from the sum of the number of positive sentiment words. In some embodiments, negative sentiment words that have a positive effect on the media value can be added to the sum of the positive sentiment words.
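The aggregation arithmetic can be illustrated as follows. The record type `SentimentEntry`, its field names, and the flag `negative_counts_as_positive` are hypothetical; they only mirror the fields the paragraph says are stored (entity, source, publication date, amount of sentiment) and the two treatments of negative words it describes.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class SentimentEntry:
    """One hypothetical MV database row: sentiment toward an entity in one document."""
    entity: str
    source: str
    published: date
    net_sentiment: int

def aggregate(entity: str, source: str, published: date,
              positive_words: int, negative_words: int,
              negative_counts_as_positive: bool = False) -> SentimentEntry:
    """Combine positive and negative sentiment word counts into one entry."""
    if negative_counts_as_positive:
        # Embodiment in which negative mentions still add value.
        net = positive_words + negative_words
    else:
        net = positive_words - negative_words
    return SentimentEntry(entity, source, published, net)
```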

[0034] The MV calculator 130 calculates media value associated with exposure of an entity based on results of the entity identifier 110 and sentiment analyzer 120 including, but not limited to, occurrences of the entity in the document, a sentiment (and polarity of the sentiment) attributed to the occurrences, an amount of exposure or distribution the documents have, and the like. The MV calculator 130 can query the MV database for the results of entity identification and sentiment analysis. Using these results, the MV calculator 130 can produce a total or cumulative media value for the entity and can calculate the media value associated with the corpus of documents, each of the document sources, each of the documents, and the like. The MV calculator 130 can include an exposure weighting unit 132 and a calculation unit 134.

[0035] The exposure weighting unit 132 can determine a number of people to which a document has been distributed or exposed. A document is distributed or exposed to a person when it is e-mailed to the person, tweeted to the person, accessed by a person via a browser, downloaded, or otherwise made available to the person. Distribution or exposure of classes of sources can be measured in different ways to determine the number of people to which a particular document from that source is exposed. For example, traditional news media sources measure physical circulation; web-based sources can be measured using a number of hits a website receives, an Alexa rank, and the like; and micro-blog sources, such as Twitter, can be measured by the number of followers a source has. The exposure weighting unit 132 examines one or more of these types of measures and produces as an output, for each particular document in a class of sources, an approximation of the number of people to which a document of a particular source is distributed or exposed.
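One simple way to picture the exposure weighting step is a lookup that selects the measure appropriate to each class of source. The class labels, metric keys, and function name below are assumptions for illustration, and the sketch takes the raw measure as the approximation of people reached, which a real system would likely refine.

```python
from typing import Dict

# Which measure to use for each class of source (labels are hypothetical).
EXPOSURE_METRIC = {
    "print": "circulation",    # traditional news media
    "web": "hits",             # web-based sources
    "microblog": "followers",  # micro-blog sources
}

def estimate_exposure(source_class: str, metrics: Dict[str, float]) -> int:
    """Approximate the number of people a document from this source reaches."""
    metric = EXPOSURE_METRIC[source_class]
    return int(metrics[metric])
```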

[0036] To generate the media value associated with a specified entity, during a specified time period, in a specified corpus of documents, the number of mentions, the sentiment polarity of those mentions, and the exposure weighting of those sources, during the specified time period, in the specified corpus, can be extracted from the MV database. For each source in the corpus, the media value for the specified entity, in the specified date range is calculated according to the following mathematical expression:

media value=((rw*ref(entity))+(nw*neg_ref(entity))+(pw*pos_ref(entity)))*(exposure(source)*dollars/eyeball),

where: rw refers to the weight assigned by a user to an entity identified in the corpus of documents; nw refers to the weight assigned by the user to a negative polarity for the entity identified in the corpus; pw refers to the weight assigned by the user to a positive polarity for the entity identified in the corpus; ref(entity) refers to a total number of references to the specified entity in the given source during the given date range, which is extracted from the database and calculated in the sentiment analyzer; neg_ref(entity) refers to a total number of references with negative sentiment polarity to the specified entity in the given source during the given date range, which is extracted from the database and calculated in the sentiment analyzer; pos_ref(entity) refers to a total number of references with positive sentiment polarity to the specified entity in the given source during the given date range, which is extracted from the database and calculated in the sentiment analyzer; exposure(source) refers to the number of people that a document published in the given source is exposed to, which is extracted from the database and calculated in the exposure weighting unit; dollars/eyeball refers to the amount of money, specified by the user, that the user values for the specified entity being exposed to one person, from the specified corpus; the expression rw*ref(entity) can be referred to as a weighted reference value; the expression nw*neg_ref(entity) can be referred to as a negative entity reference value; the expression pw*pos_ref(entity) can be referred to as a positive entity reference value; and the expression exposure(source)*dollars/eyeball can be referred to as a media value multiplier.
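The per-source expression can be written directly as a small function. The parameter names follow the symbols defined above; the function name, the default weights, and the numbers in the usage note are illustrative assumptions only.

```python
def media_value(ref: int, neg_ref: int, pos_ref: int,
                exposure: float, dollars_per_eyeball: float,
                rw: float = 1.0, nw: float = -1.0, pw: float = 1.0) -> float:
    """Media value for one entity in one source over one date range."""
    weighted_reference = rw * ref    # weighted reference value
    negative_value = nw * neg_ref    # negative entity reference value
    positive_value = pw * pos_ref    # positive entity reference value
    multiplier = exposure * dollars_per_eyeball  # media value multiplier
    return (weighted_reference + negative_value + positive_value) * multiplier
```

For example, with 10 references, 2 of them negative and 5 positive, weights rw=1.0, nw=-0.5, pw=0.5, an exposure of 1000 people, and 0.01 dollars per eyeball, the weighted sum is 10 - 1 + 2.5 = 11.5 and the multiplier is 10, giving a media value of 115 dollars. Setting nw to a positive number would model the "any exposure is valuable exposure" configuration.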

[0037] The cost of advertisements can be a function of the number of readers on the given media channel (a specific newspaper, website, blog, etc.), which can be expressed in terms of "cost per thousand" or similar quantities. The dollars/eyeball value can be determined based on published and estimated advertising rates for published or online advertisements (e.g. per thousand hits/impressions). Thus, different document sources may have different advertising rates.
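Since advertising rates are commonly quoted per thousand impressions, a dollars/eyeball figure can be derived from such a rate by dividing by 1000. The sketch below shows that conversion; the function name and the example rate are assumptions, not published figures.

```python
def dollars_per_eyeball(cost_per_thousand: float) -> float:
    """Convert a cost-per-thousand (CPM) advertising rate to a per-person value."""
    if cost_per_thousand < 0:
        raise ValueError("cost_per_thousand must be non-negative")
    return cost_per_thousand / 1000.0
```

A hypothetical rate of 25 dollars per thousand impressions corresponds to 0.025 dollars per eyeball; each document source could carry its own rate.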

[0038] The output generator 140 generates and outputs media value to users. For example, the output generator 140 can include a media value report generator 142 that generates media value reports (MVRs) based on the result of media value calculations, in response to user queries, a user interface or dashboard 144 that can be accessed by a user to view media value as well as other information attributed to one or more entities, and/or one or more application program interfaces (APIs) 146 that allows users to interface with the output generator using one or more applications. The output generator 140 takes as input the desired entities and time frame from the user and outputs media value attributed to the desired entities. The time frame can be specified as a range of dates, such as Oct. 15, 2009 to Jan. 10, 2010, or can be specified relative to the current date, such as yesterday, last week, last year, month-to-date, year-to-date, and the like. In some embodiments, the output can contain a total amount of media value for the entity during the specified time range; a time series showing the amount of media value for each day during the specified time range; a list of sources that contributed to the media value for the specified entity, during the specified time range, ordered by the amount of media value contributed; and the like.
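The three outputs listed above (a total, a daily time series, and a ranked list of sources) can be sketched as one pass over per-document media values. The input layout, a sequence of `(day, source, value)` tuples, and the function name `summarize` are assumptions for the example.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

def summarize(records: Iterable[Tuple[date, str, float]],
              start: date, end: date) -> Dict[str, object]:
    """Summarize media value for one entity within an inclusive date range."""
    per_day = defaultdict(float)
    per_source = defaultdict(float)
    for day, source, value in records:
        if start <= day <= end:
            per_day[day] += value
            per_source[source] += value
    return {
        "total": sum(per_day.values()),
        "time_series": sorted(per_day.items()),  # chronological order
        "sources": sorted(per_source.items(),    # largest contributor first
                          key=lambda item: item[1], reverse=True),
    }
```

Relative time frames such as "last week" or "year-to-date" would be resolved to concrete `start` and `end` dates before calling a function like this.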

[0039] FIG. 2 depicts an exemplary computing device 200 for determining media value for entities of interest using the MV engine 100. The computing device 200 can be a mainframe; personal computer (PC); laptop computer; workstation; handheld device, such as a PDA and/or smart phone; and the like. In the illustrated embodiment, the computing device 200 includes a central processing unit (CPU) 202 and can include storage 204 for storing data and instructions. The storage 204 can include computer readable medium technologies, such as a floppy drive, hard drive, compact disc, tape drive, Flash drive, optical drive, read only memory (ROM), random access memory (RAM), and the like. The computing device can include a display device 206 that enables the computing device 200 to communicate with a user through a visual display and can include data entry device(s) 208, such as a keyboard, touch screen, microphone, and/or mouse.

[0040] Applications 210, including the MV engine 100, can be resident in the storage 204. The applications 210 can include instructions for implementing the MV engine 100. The instructions can be implemented using, for example, C, C++, Java, JavaScript, Basic, Perl, Python, assembly language, machine code, and the like. The storage 204 can be local or remote to the computing device 200. The computing device 200 includes a network interface 212 for communicating with a network. The CPU 202 operates to run the applications 210 in storage 204 by executing instructions therein and storing data resulting from the performed instructions, which may be output via a display 206 or by other mechanisms known to those skilled in the art, such as a printout from a printer.

[0041] FIG. 3 is an exemplary networked computing system 300 for implementing embodiments of the MV engine. The computing system 300 includes one or more servers 310 and 320 coupled to clients 330 and 340, via a communication network 350, which can be any network over which information can be transmitted between devices communicatively coupled to the network. The system 300 can also include repositories or database devices 360, which can be coupled to the servers 310/320 and clients 330/340 via the communications network 350. The servers 310/320, clients 330/340, and database devices 360 can be implemented using a computing device.

[0042] The servers 310/320, clients 330/340, and/or databases 360 can store information, such as sentiment attributed to one or more entities mentioned in a corpus of documents; media value associated with one or more entities mentioned in the corpus of documents; a list of entities to be automatically identified in the corpus of documents; a sentiment lexicon; and the like. In some embodiments, the MV engine 100 can be distributed among the servers 310/320, clients 330/340, and database devices 360 such that one or more components of the MV engine 100 and/or portions of one or more components of the MV engine 100 can be implemented by different devices (e.g., clients, servers, databases) in the communication network 350. For example, in some embodiments, the entity identifier 110 and the sentiment analyzer can be resident on the server 310, the MV calculator 130 can be resident on the server 320, and the output generator 140 can be resident on the clients 330 and 340. One or more of the databases 360 can serve as the MV database to store entity information, sentiment and polarity information, media value information, media value reports, a corpus of documents, and the like. Those skilled in the art will recognize that the distribution of components of the MV engine is illustrative and that different distributions of the components of the MV engine can be implemented.

[0043] FIG. 4 is a flowchart illustrating an exemplary process for determining media value and generating a media value output. To begin, the MV engine can collect documents from one or more document sources (400). Documents are collected in a manner that is appropriate for their domain. For example, Twitter documents are collected using Twitter's published application program interfaces (APIs), and newspaper articles are collected by programs that download and clean up webpages from the newspaper's website. Once the MV engine has collected the documents, the MV engine performs entity identification on the documents (402) and performs sentiment analysis of the documents (404). Entity identification identifies occurrences of entities of interest (names of people, companies, consumer brands, etc.) in the documents. Sentiment analysis identifies the expression of opinion and the polarity of that opinion in the documents and attributes that sentiment to the identified entities.
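By way of illustration only, steps 402 and 404 can be sketched in Python as follows. The sketch substitutes a deliberately naive technique, exact-name matching against a list of entities and word counting against a sentiment lexicon within each sentence, for whatever entity identification and sentiment analysis an embodiment actually uses. The names ENTITIES, LEXICON, and analyze, and the contents of the entity list and lexicon, are hypothetical and do not appear in the source.

```python
import re

# Hypothetical inputs: this entity list and sentiment lexicon are
# illustrative assumptions, not data from the described embodiments.
ENTITIES = ["Microsoft", "Acme"]
LEXICON = {"great": 1, "strong": 1, "poor": -1, "weak": -1}


def analyze(document):
    """Return one (entity, polarity) pair per entity mention in the document."""
    mentions = []
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        # Score the sentence by summing lexicon values of its words.
        words = re.findall(r"[a-z']+", sentence.lower())
        score = sum(LEXICON.get(word, 0) for word in words)
        if score > 0:
            polarity = "positive"
        elif score < 0:
            polarity = "negative"
        else:
            polarity = "neutral"
        # Attribute the sentence polarity to every entity mention in it.
        for entity in ENTITIES:
            pattern = r"\b" + re.escape(entity) + r"\b"
            for _ in re.finditer(pattern, sentence):
                mentions.append((entity, polarity))
    return mentions


text = "Microsoft posted strong results. Analysts called the Acme launch weak."
mentions = analyze(text)
```

Attributing the polarity of a whole sentence to each entity named in it is a simplifying assumption of this sketch; an embodiment can attribute sentiment to entities at a finer granularity.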

[0044] The MV engine looks at each publication source in the corpus and determines, based on source-specific information, the number of people to which a document published by that source is exposed (406). For example, for a newspaper article, the circulation of the newspaper in which the article was published is approximately the number of people to which the article was exposed. The MV engine takes the exposure of the document sources and the attributed sentiment of the documents and calculates the media value attributed to the entity in each document source (408). The media value for each document source on each day in a specified time frame can be used to produce an output, such as an MVR or a display on a dashboard (e.g., a web-based user interface), containing a time series of media value over the time period, document sources ranked by value created, a total amount of media value generated by the entity over the date range, and the like (410).
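By way of illustration only, the output of step 410 can be sketched in Python as follows, assuming the media value for each document source on each day has already been calculated. The function name summarize, the record layout of (source, day, media value) tuples, and the sample figures are hypothetical and are not taken from the source.

```python
from collections import defaultdict
from datetime import date


def summarize(records):
    """Aggregate (source, day, media_value) records into report data.

    Returns a time series of daily media value, the document sources
    ranked by total value created, and the total media value.
    """
    by_day = defaultdict(float)
    by_source = defaultdict(float)
    for source, day, value in records:
        by_day[day] += value
        by_source[source] += value
    # Time series: (day, value) pairs in chronological order.
    time_series = sorted(by_day.items())
    # Sources ranked from highest to lowest total media value.
    ranked_sources = sorted(by_source.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(by_source.values())
    return time_series, ranked_sources, total


# Made-up sample records for illustration.
records = [
    ("news", date(2013, 3, 2), 10.0),
    ("twitter", date(2013, 3, 1), 4.0),
    ("news", date(2013, 3, 1), 2.5),
]
time_series, ranked_sources, total = summarize(records)
```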

[0045] FIG. 5 is a flowchart illustrating an exemplary calculation of the media value for an entity based on sentiment attributed to the entity in one or more documents. The MV engine can determine a weighted value attributed to the entity (500) and can determine a total number of mentions or occurrences of the entity in one or more of the documents (502). The weighted value attributed to the entity can be multiplied by the total number of references to the entity in one or more of the documents to generate a weighted entity reference value (504).

[0046] The MV engine can determine a weighted value attributed to a negative polarity for the entity (506) and can determine a total number of references to the entity having a negative sentiment polarity (508). The weighted value attributed to the negative polarity can be multiplied by the total number of references to the entity having the negative sentiment polarity to generate a negative entity reference value (510). The total number of mentions or occurrences of the entity that have a negative sentiment polarity can be determined with respect to a specified document source during a specified date of publication range.

[0047] The MV engine can determine a weighted value attributed to a positive polarity for instances of the entity in the one or more documents (512) and can determine a total number of references to the entity having a positive sentiment polarity. The weighted value attributed to the positive polarity can be multiplied by the total number of references to the entity having the positive sentiment polarity to generate a positive entity reference value (514). The total number of mentions or occurrences of the entity that have a positive sentiment polarity can be determined with respect to a specified source during a specified date of publication range.

[0048] The MV engine can determine an exposure number representing a number of people to which the one or more documents are distributed (516) and can determine an economic value attributed to exposure of the one or more documents to one person (518). The exposure number can be multiplied by the economic value to generate a media value multiplier (520).

[0049] A sentiment activity sum can be calculated by adding the weighted entity reference value, the negative entity reference value, and the positive entity reference value (522). The media value attributed to mentions or occurrences of the entity in one or more documents can be generated by multiplying the sentiment activity sum by the media value multiplier (524). Those skilled in the art will recognize that the order in which the media value is calculated can vary and that the order described with respect to FIG. 5 is illustrative of an exemplary media value calculation process. Further, those skilled in the art will recognize that the media value can be calculated using various techniques and that the techniques described herein are illustrative of an exemplary media value calculation process.
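By way of illustration only, the calculation described with respect to FIG. 5 can be sketched in Python as follows. The names MentionCounts and media_value, the parameter names, and every numeric value in the usage example are hypothetical; the source does not specify the magnitudes or signs of the weights, and the negative weight used for negative polarity below is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MentionCounts:
    """Reference counts for one entity (hypothetical container)."""
    total: int     # all references to the entity (502)
    negative: int  # references having a negative sentiment polarity (508)
    positive: int  # references having a positive sentiment polarity


def media_value(counts, entity_weight, negative_weight, positive_weight,
                exposure, value_per_person):
    # Weighted entity reference value (504).
    weighted_entity_ref = entity_weight * counts.total
    # Negative entity reference value (510).
    negative_ref = negative_weight * counts.negative
    # Positive entity reference value (514).
    positive_ref = positive_weight * counts.positive
    # Media value multiplier: exposure times value per person (520).
    multiplier = exposure * value_per_person
    # Sentiment activity sum (522).
    sentiment_activity_sum = weighted_entity_ref + negative_ref + positive_ref
    # Media value (524).
    return sentiment_activity_sum * multiplier


# Assumed example values: 10 references, 2 negative and 5 positive,
# a source reaching 1000 people, each exposure valued at 0.25.
counts = MentionCounts(total=10, negative=2, positive=5)
value = media_value(counts, entity_weight=1.0, negative_weight=-0.5,
                    positive_weight=0.5, exposure=1000, value_per_person=0.25)
# (10*1.0 + 2*(-0.5) + 5*0.5) * (1000 * 0.25) = 11.5 * 250 = 2875.0
```

With these assumed values, the sentiment activity sum is 11.5 and the media value multiplier is 250, giving a media value of 2875.0.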

[0050] FIG. 6 is an exemplary screenshot of a document 600 that has been marked up by the MV engine. The document can include tags for identifying entities, sentiment attributed to the entities, a polarity of the sentiment attributed to the entities, and the like. For example, the MV engine can identify the entity "Microsoft" as being mentioned in the document and can associate a sentiment and a polarity of the sentiment with the mention of "Microsoft". The MV engine can also associate the entity with a category of interest and can associate a sentiment value with the mention of the entity. For example, the MV engine can classify the entity "Microsoft" as a corporation.

[0051] FIG. 7 is an exemplary graph 700 illustrating media value generated by the MV engine that can be included in a media value report (MVR), a dashboard (e.g., a web-based user interface), or in any other suitable format. The graph 700 can have a y-axis 710 representing a media value and an x-axis representing time over which a media value has been generated. A plot 730 of the media value versus time can be displayed in the graph 700. The plot 730 can identify a total media value over time for a corpus of documents, one or more document sources, specified documents, and the like. In the present example, the plot identifies the contribution to the media value from three document sources: news media, social media, and Twitter.

* * * * *