What is Sentiment?

Sentiment is a Luminoso Daylight feature that examines a set of your documents and reports the presence of sentiment – positive, negative, or neutral – separately for each concept within those documents.

Where other sentiment solutions only analyze at the document level, Luminoso analyzes text data at the concept level so that you don’t miss any nuance in your customers’ feedback.

Sentiment works in all 15 languages that Luminoso supports.

Using Sentiment in Luminoso Daylight

Sentiment model: Deep learning

To assemble the various views of sentiment in a project, Luminoso searches for and analyzes sentiment words, or words that convey feeling and indicate sentiment around topics. These words are often adjectives. Consider the following document:

"The food was superb, but my waiter was slow"

In this example, “superb” and “slow” are considered sentiment words, as they indicate feeling around the food and the waitstaff, respectively.

Sentiment score: Document-level analysis

When a project is created, Daylight calculates a sentiment score for every document by assigning each sentiment-laden word an integer rating between -5 and 5. The result is shown in the app as a number from -1.0 to 1.0, scored as “negative” for anything below 0 and “positive” for anything above 0. The closer a document’s sentiment score is to -1, the more likely the document is negative; the closer it is to 1, the more likely the document is positive.

To calculate each document’s sentiment score, Luminoso sums individual scores for each sentiment word at the concept level, then converts that sum into a percentage.

"The food was superb, but my waiter was slow."

In the above example, “superb” received an individual sentiment score of 4, and “slow” a score of -1. As the positive score is much stronger than the negative, the entire document will have a positive sentiment of 3, expressed as the percentage 30%. The system interprets this overall document as having a 30% chance of being positive.
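
The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration, not Luminoso's implementation: the `WORD_SCORES` table and both function names are hypothetical, and the multiply-by-10 conversion to a percentage is an assumption inferred only from the example, where a sum of 3 is reported as 30%.

```python
# Hypothetical word scores copied from the example above; not Luminoso's lexicon.
WORD_SCORES = {"superb": 4, "slow": -1}


def document_score(text, word_scores=WORD_SCORES):
    """Sum the integer scores of the known sentiment words in a document."""
    # Split on whitespace, then strip surrounding punctuation and lowercase.
    words = [w.strip('.,!?"“”').lower() for w in text.split()]
    return sum(word_scores.get(w, 0) for w in words)


def as_percentage(total):
    """Convert a summed score to a percentage.

    Assumption: a sum of 3 maps to 30%, as in the example, so scale by 10.
    """
    return total * 10


total = document_score("The food was superb, but my waiter was slow.")
print(total, as_percentage(total))
```

Words missing from the table contribute 0, so only “superb” and “slow” affect the result here.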

Luminoso analyzes concepts and their contexts with a multilingual deep learning model. Unlike lexicon-based models, which only look at each word in isolation, deep learning models attend to the text at numerous levels to create a complex representation of the concept and its context. These representations are then passed to our supervised sentiment classifier, which predicts a sentiment label for each concept-document pair. The model was trained on reviews of products and services and works best on such datasets.

Sentiment distribution: Concept-level analysis

Sentiment distribution is a measure of the share of all positive, negative, and neutral sentiment about concepts in a dataset, described using three percentage values. Expressed as a combination, this provides a unique view into the mix of feelings around a particular word or phrase. Understanding a concept’s sentiment distribution is extremely valuable when analyzing datasets that contain no ratings or have a statistically insignificant number of rating responses.

Sentiment distribution is calculated by collecting a concept’s sentiment label from every document it appears in, adding together the number of positive, negative, and neutral labels, and then calculating the percentage of each sentiment type over the concept’s total number of labels in the dataset.

Consider the phrase “work-life balance” in a sample group of HR survey documents:

“world wide reputation, industry respect, competent people all around, different opportunities, strong technical people, good senior management, good work life balance”

“They have fantastic, brilliant colleagues who are very supportive and fun to work with! Poor work-life balance.”

“Flexibility of schedule is good. Educational opportunities are good during times of less expense restrictions. Work/life balance is a joke as the expectations are to keep doing more with fewer resources. Total compensation compared to others is terrible.”

“This company can provide an environment supportive of work life balance especially in relation to those family events which occur from time to time. Extremely poor compensation. HR's sole focus is on reducing $ per head and this is reflected in the treatment of employees with respect to salary, pay increases and bonus payments.”

When documents are uploaded, the application determines a sentiment label for each concept within those documents. In the subset of documents in which “work-life balance” appears, the concept has 2 negative, 2 positive, and 0 neutral labels. Therefore, the sentiment distribution for this concept is 50% negative, 50% positive, and 0% neutral.
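
As a rough sketch of that calculation, the Python below tallies one label per document in which the concept appears and converts the tallies to percentages. The function name and the hard-coded label list are hypothetical; in Daylight the labels come from the sentiment classifier, which is not reproduced here.

```python
from collections import Counter

SENTIMENT_TYPES = ("negative", "positive", "neutral")


def sentiment_distribution(labels):
    """Return each sentiment type's share of a concept's labels, as a percentage."""
    if not labels:
        raise ValueError("the concept has no sentiment labels")
    counts = Counter(labels)
    # Counter returns 0 for sentiment types that never occur.
    return {name: 100 * counts[name] / len(labels) for name in SENTIMENT_TYPES}


# One label per document mentioning "work-life balance" (2 positive, 2 negative).
labels = ["positive", "negative", "negative", "positive"]
print(sentiment_distribution(labels))
```

For these four labels, the result is 50% negative, 50% positive, and 0% neutral, matching the distribution described above.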

Sentiment suggestions

In the Sentiment feature pane, Luminoso displays a list of up to 50 concepts most significantly correlated with positive and negative sentiment in each project. This list contains words and phrases that are associated with strong sentiment, not actual sentiment words like “great”, “amazing”, or “awful”, which are inherently descriptors and don’t provide insights.

The number of suggested sentiment concepts may differ from project to project and from filter to filter since only concepts with statistically significant association with positive or negative sentiment are returned.

Frequently asked questions

Why is neutral sentiment’s percentage not displayed by default?

As Sentiment is designed to show concepts with the most positive and negative sentiment, neutral results are less interesting by comparison. Neutral sentiment can be viewed in the application by either hovering over a concept result or viewing a concept’s sentiment distribution in the results export.

You can also view a neutral example of that concept by clicking on the relevant concept. 

What are the current limitations of Sentiment?

Very large documents. Luminoso analyzes projects at the document level, meaning that if documents contain multiple sentiment words, some sentiment terms may get buried under others. For example, consider the previous document:

"The food was superb, but my waiter was slow."

With “superb” scored at 4 and “slow” at -1, this document has an overall positive sentiment. If multiple documents pair a highly positive word such as “superb” with negative feedback about the waitstaff, the sentiment surrounding the waitstaff can get buried. This type of issue appears in large datasets such as those examining Voice of the Employee, where respondents wish to convey a general feeling of positivity around their work environments and only a bit of criticism. The result? Positive terms mask much less frequent negative terms. This problem is usually mitigated by feature- or aspect-based sentiment, which assigns a sentiment score to each individual feature or aspect rather than to each document. Feature-based sentiment is currently available in Luminoso as a solution engagement.

Wishes and hypothetical situations. Sentiment cannot detect such subtlety, where only positive words are used to set a contrast with the not-so-positive reality. For example, consider the following document:

“If only the company offered a generous tuition reimbursement or a student loan assistance benefit. It would be sooo great to have assistance on student loan repayment. This would increase the level of talent and attract amazing candidates.”

Concepts such as tuition reimbursement and candidates are assigned positive sentiment, even though it’s clear to a human reader that the company lacks these benefits. Sentiment classification is currently unable to differentiate this tone from a direct answer.

Sarcasm. Sentiment cannot detect sarcasm, where vocal tone is needed to detect the actual meaning behind what is said. For example, consider this document:

“Way to go, another excellent choice made by our management team.”

Social nuances. Sentiment has no awareness of social nuances or norms. Consider this mobile gaming review document:

“Men seem to think this game is a dating site.”

Black box. As is the case with many deep learning models, it’s difficult to diagnose why the model is making a specific decision. 

Can I tune the sentiment model? 

No. Luminoso’s Concept-Level Sentiment model works at an extremely high quality compared to industry standards. In English, Luminoso’s model outperformed or nearly matched the gold-standard benchmarks on industry standard datasets, and achieved a similar level of quality in all 15 supported languages. 

Additionally, the model considers the context in which a term is used to determine its sentiment, helping ensure accurate, nuanced results in datasets across industries.

And the best part? Customers don’t need to provide any data, code, or tuning to make Luminoso’s sentiment analysis work. 

Are there industry-specific sentiment models available?

Luminoso’s sentiment model is broadly trained to produce high-quality results across any industry, including those with lots of specialized language, like pharmaceuticals or banking. Luminoso’s training and testing process determined that the model works at a high level of accuracy for any dataset. The model’s use of word context helps with accurate sentiment detection across industries. 

How do I use Concept-Level Sentiment? 

To take advantage of the new concept-level sentiment analysis, either create a new Daylight project, create a project from a subset of documents in an existing project, or upload new documents to an existing project.

Can I still see Document-Level Sentiment? 

Starting on July 25, 2020 for cloud users and in the next on-site release for on-site users, Luminoso offers only concept-level sentiment analysis. This model performs better on every level than the current lexicon-based document-level model. 

For the future, stay tuned – Luminoso is conducting active research to re-introduce document-level results using a deep learning approach.