Science Explained: Drivers in Luminoso

Drivers is a Luminoso feature that examines a set of documents, each of which includes a numeric customer score, at the level of individual concepts (words or phrases). It reports when the presence of a concept is correlated with higher or lower customer scores. A numeric customer score may represent a CSAT (Customer Satisfaction) score, an NPS (Net Promoter Score), or a product rating.

To illustrate how a concept’s importance score is calculated, we’ll use the example “free wi-fi”, found in a set of documents that contain hotel reviews and rating scores. The importance score is derived from the following values:

  • Impact: Reviews that mentioned “free wi-fi” (and related concepts) scored, on average, 0.3 stars lower than the average numeric score.

  • Confidence: The statistical t-value of this comparison is 1.1, which means it's likely that there's a real correlation there, but we would need more data to confidently distinguish this correlation from random variance.

  • Relevance: Our Luminoso domain model believes that "free wi-fi" is a concept that matters to your project.

  • Weight: We examined 24.5 documents about "free wi-fi" to produce these results. (The .5 is because we can give partial credit to non-exact matches.)

The four values above are then combined into one overall internal value called "importance":

  • Importance: On a scale from 0 to 1, where 0 is something that has no correlation with customer scores and 1 is the most important driver in the project, this driver’s importance is 0.45.

A deeper dive into the science

What’s the scale of the "Impact" score?

The impact score is on the same scale as the customer score, which might be measured in "stars" or "percent" or something else. You might see an impact of -0.4 on a scale from 1 to 5 stars, meaning that the documents matching the concept get reviews that are an average of 0.4 stars lower. If the same reviews were on a scale from 0 to 100 instead of 1 to 5, you would see an impact of -10.

The range of impact is always with respect to the customer’s score range for that set of documents.
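
The conversion in the example above can be sketched in a few lines of Python. This is an illustration only, not part of the Drivers tool: the function name rescale_impact is hypothetical, and it assumes the two scales are related by a simple linear mapping.

```python
def rescale_impact(impact, old_range, new_range):
    """Convert an impact from one score scale to another.

    An impact is a difference between two scores, so only the width of
    each scale matters, not where the scale starts.
    """
    old_low, old_high = old_range
    new_low, new_high = new_range
    return impact * (new_high - new_low) / (old_high - old_low)


# -0.4 stars on a 1-to-5 scale, expressed on a 0-to-100 scale
print(rescale_impact(-0.4, (1, 5), (0, 100)))
```

A 1-to-5 scale is 4 units wide and a 0-to-100 scale is 100 units wide, so every impact is multiplied by 25.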

What’s the scale of the "Confidence" value?

The confidence value is the t value, which comes from Student's t-test. This is a two-sided t-test. The possible situations we're distinguishing are:

  • Documents containing the concept have higher scores (t is positive)

  • Documents containing the concept have lower scores (t is negative)

  • Documents containing the concept have about the same scores (t is near 0)
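
To make the sign of t concrete, here is a minimal sketch of a two-sample Student's t statistic in Python. It is not the Drivers implementation. It assumes the comparison is between documents that match the concept and documents that do not, it uses pooled variance, and it ignores the partial credit given to non-exact matches. The function name impact_and_t and the ratings are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def impact_and_t(matching_scores, other_scores):
    """Return (impact, t) for one concept.

    impact: mean score of matching documents minus mean score of the rest.
    t: two-sample Student's t statistic with pooled variance.
    """
    n1, n2 = len(matching_scores), len(other_scores)
    m1, m2 = mean(matching_scores), mean(other_scores)
    # Pooled variance assumes both groups share the same spread.
    pooled = ((n1 - 1) * variance(matching_scores)
              + (n2 - 1) * variance(other_scores)) / (n1 + n2 - 2)
    t = (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))
    return m1 - m2, t


# Made-up star ratings, for illustration only
with_concept = [3, 4, 2, 3, 4]
without_concept = [4, 5, 3, 4, 5, 4, 3]
impact, t = impact_and_t(with_concept, without_concept)
print(f"impact = {impact:+.2f} stars, t = {t:+.2f}")
```

Because the matching documents score lower on average in this made-up data, both the impact and t come out negative.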

The Drivers code currently outputs t as is, but the absolute value of t makes a better "confidence" value.

We've been asked what qualifies as "near zero", meaning how low a confidence is too low. Currently, -0.6 < t < 0.6 qualifies as too low, because the null hypothesis is very plausible in that range. The Drivers tool does not present impact scores for concepts whose confidence value is too low. The confidence value returned via the score_driver API is |t| (the absolute value of t).

So a score driver with t = 0.7 could be considered spurious, but some interesting drivers start to appear around there.
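
The cutoff described above can be sketched as a simple filter. This is a hedged illustration rather than the tool's actual code: the dictionary keys, the function name confident_drivers, and the sample values are all hypothetical. Only the 0.6 threshold and the use of |t| come from the description above.

```python
MIN_ABS_T = 0.6  # from the text: -0.6 < t < 0.6 counts as too low


def confident_drivers(drivers):
    """Keep drivers whose |t| is at least MIN_ABS_T and record |t| as 'confidence'."""
    kept = []
    for driver in drivers:
        confidence = abs(driver["t"])
        if confidence >= MIN_ABS_T:
            kept.append({**driver, "confidence": confidence})
    return kept


# Made-up drivers, for illustration only
drivers = [
    {"concept": "free wi-fi", "impact": -0.3, "t": -1.1},
    {"concept": "parking", "impact": 0.1, "t": 0.4},
    {"concept": "fast check-in", "impact": 0.5, "t": 0.7},
]
for driver in confident_drivers(drivers):
    print(driver["concept"], driver["confidence"])
```

Here "parking" is dropped because its |t| of 0.4 falls inside the too-low range, while "fast check-in" at 0.7 narrowly passes.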

What’s the scale of "Importance"?

The scale of importance is arbitrary and specific to a set of documents. As a result, importance is rescaled to run from 0 to 1 each time drivers are calculated. The numbers that go into calculating importance vary according to the customer scores, the number of documents overall, the number of conceptual matches for each concept, and the Luminoso relevance scale (which is also arbitrary).

The importance value is for internal purposes only. It is used only to sort the drivers within a single run, and it cannot be used to compare scores from one run of Drivers to another.
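
A small sketch shows why importance values from different runs can't be compared. The formula that combines impact, confidence, relevance, and weight is not described here, so the example takes hypothetical raw scores as given and assumes that the final step simply divides by the largest raw score, so that the top driver gets 1. The function name normalize_importance is also an assumption.

```python
def normalize_importance(raw_scores):
    """Scale non-negative raw scores so the largest one becomes 1.0."""
    top = max(raw_scores.values())
    if top == 0:
        return {name: 0.0 for name in raw_scores}
    return {name: score / top for name, score in raw_scores.items()}


# Same raw score for "free wi-fi" in both runs, but a different top driver
run_a = normalize_importance({"free wi-fi": 2.0, "fast check-in": 4.0})
run_b = normalize_importance({"free wi-fi": 2.0, "fast check-in": 8.0})
print(run_a["free wi-fi"], run_b["free wi-fi"])  # prints "0.5 0.25"
```

Nothing about "free wi-fi" changed between the two runs, yet its importance halved because the top driver changed. The ordering within each run is still meaningful, which is all the value is used for.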


What happens when you change the set of documents?

When a user filters a set of documents and creates a new project (effectively removing concepts and documents), or adds new documents to an existing set of documents, the list of concepts and documents changes in the project. When the project is rebuilt and the drivers are recalculated, these changes will cause:

  • Different conceptual matches to be found in the project

  • The “Importance” scale to change, because a different concept may now be the most important driver

Can we calculate score drivers separately for each subset?

Yes. Driver subset analysis can be done directly in the UI by using the metadata filters provided in the filter panel on the left. Driver impact scores will automatically be recalculated based on the subset selected.

Are Drivers a causal model?

No. We can't say that the 0.3-star difference is due to the document mentioning "free wi-fi". If we had a causal model, we would have to subtract out the values of all other drivers in that document so we could assign credit to the terms that were really the "cause". But causation doesn't work that way anyway. The cause of a high review score isn't that the reviewer mentioned "fast check-in". Causality flows in the opposite direction: the cause is that the customer liked the establishment they're reviewing. Providing fast check-in may have contributed to that. The customer's positive experience caused them to write a positive review, and the need to describe what was positive about their experience caused them to mention "fast check-in".

Drivers are correlations only. Many of these correlations may be helpful to understand the domain.