Integrating Daylight API with Tableau endpoints

Tables exported from Luminoso

Table: doc_subset 

Field

Description

doc_id

 Unique identifier for each document/verbatim uploaded in a given file - surrogate key of all tables

subset

Special field for Tableau Export containing the subset ID - this field is generated from the export tool, not from Daylight
Example: subset 0, subset 1, subset 2

subset_name

The name of the subset category, not the subset
Example: Location, product ID, Category, Age

value

 The value of the subset

Example: New York, or 25 years old
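Because doc_subset holds one row per document and subset value, it can help to gather each document's values into a single record before building views. The following is a minimal sketch, assuming the rows have already been loaded as dictionaries; the sample rows, the integer subset IDs, and the helper name pivot_subsets are illustrative and not part of the export tool.

```python
from collections import defaultdict

# Hypothetical rows shaped like the doc_subset table (all values are made up).
doc_subset_rows = [
    {"doc_id": "d1", "subset": 0, "subset_name": "Location", "value": "New York"},
    {"doc_id": "d1", "subset": 1, "subset_name": "Age", "value": "25"},
    {"doc_id": "d2", "subset": 2, "subset_name": "Location", "value": "Boston"},
]


def pivot_subsets(rows):
    """Collect each document's subset values into one dict keyed by subset_name."""
    per_doc = defaultdict(dict)
    for row in rows:
        per_doc[row["doc_id"]][row["subset_name"]] = row["value"]
    return dict(per_doc)


metadata_by_doc = pivot_subsets(doc_subset_rows)
print(metadata_by_doc["d1"])
```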

Table: doc_terms

Field

Description

doc_id

 Unique identifier for each document/verbatim uploaded in a given file - surrogate key of all tables

term

 The smallest units within database knowledge to create concepts
Example: “Best Buy” is one concept with two terms

Example 2: “Leaky pen battery” is one concept with three terms

exact_match

Boolean value (0 or 1) indicating whether the document contains an exact match to a specific term

Example: “The battery leaked” either has (1) or does not have (0) an exact match to “Battery”

association

The scored relationship between document and term as generated in Daylight
(Dot product between vectors)
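The association field is described above as a dot product between vectors. As a rough illustration of that arithmetic only, the sketch below multiplies two small made-up vectors. The vector values and the helper name dot are assumptions; the export tables described here list only the resulting score, not the vectors themselves.

```python
def dot(u, v):
    """Return the dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 3-dimensional vectors, chosen only to make the arithmetic easy to follow.
doc_vector = [0.5, 0.5, 0.0]
term_vector = [1.0, 0.0, 0.0]

# 0.5 * 1.0 + 0.5 * 0.0 + 0.0 * 0.0
association = dot(doc_vector, term_vector)
print(association)
```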

 Table: doc_topic

Field

Description

association

The scored relationship between document and topic as generated in Daylight
(Dot product between vectors)

doc_id

 Unique identifier for each document/verbatim uploaded in a given file 

topic

 Now identified in the current version of Daylight as “Saved Concept” which can be a combination of concepts or a single concept
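Since doc_id appears in both doc_subset and doc_topic, the two tables can be joined on it. The sketch below shows one possible join, averaging association per topic and subset value. It assumes the rows are already loaded as dictionaries; the sample data and the function name mean_association_by_value are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows shaped like doc_topic and doc_subset (all values are made up).
doc_topic_rows = [
    {"doc_id": "d1", "topic": "leaky", "association": 0.8},
    {"doc_id": "d2", "topic": "leaky", "association": 0.2},
    {"doc_id": "d3", "topic": "leaky", "association": 0.6},
]
doc_subset_rows = [
    {"doc_id": "d1", "subset_name": "Location", "value": "New York"},
    {"doc_id": "d2", "subset_name": "Location", "value": "Boston"},
    {"doc_id": "d3", "subset_name": "Location", "value": "New York"},
]


def mean_association_by_value(topic_rows, subset_rows, subset_name):
    """Join the two tables on doc_id and average association per (topic, value)."""
    value_by_doc = {
        row["doc_id"]: row["value"]
        for row in subset_rows
        if row["subset_name"] == subset_name
    }
    grouped = defaultdict(list)
    for row in topic_rows:
        value = value_by_doc.get(row["doc_id"])
        if value is not None:  # skip documents without this subset category
            grouped[(row["topic"], value)].append(row["association"])
    return {key: mean(values) for key, values in grouped.items()}


averages = mean_association_by_value(doc_topic_rows, doc_subset_rows, "Location")
print(averages)
```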

Table: drivers

Field

Description

doc_count

 Number of documents in a project

driver

 The concept that is correlated with a difference in score from the average
Example: “great smell” is associated with a higher score, while “leaky” is associated with a lower score

example_doc

First sample document that contains a concept that is driving the score

example_doc2

Second sample document that contains a concept that is driving the score

example_doc3

Third sample document that contains a concept that is driving the score

impact

The numeric value of the driver’s score difference from the average
Example: documents that include “great smell” have an impact of 1.2 higher than the average
Documents that include “leaky” have an impact of 2 lower than the average

related_terms

 Terms related to the driver
Example: “not stinky” is related to “great smell”
“Leaky” is related to “bad packaging”

subset

 A special field for Tableau Export containing the subset ID - this field is generated from the export tool, not from Daylight
Example: subset 0, subset 1, subset 2

type

 user_defined: same as a saved concept
Example: “leaky”

auto_found: system-identified drivers
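One way to read the impact field is as the average score of documents that mention a driver minus the average score of all documents. The sketch below follows that reading with made-up scores. It is an interpretation of the description above, not Daylight's actual computation, and the function name impact is chosen only for illustration.

```python
from statistics import mean


def impact(scores, has_driver):
    """Mean score of documents containing the driver minus the overall mean score."""
    with_driver = [s for s, flag in zip(scores, has_driver) if flag]
    if not with_driver:
        raise ValueError("no documents contain the driver")
    return mean(with_driver) - mean(scores)


# Hypothetical scores for four documents; the first two mention the driver.
scores = [5.0, 4.0, 2.0, 1.0]
mentions = [True, True, False, False]

# Mean with driver = 4.5, overall mean = 3.0, so the driver sits above the average.
driver_impact = impact(scores, mentions)
print(driver_impact)
```

A positive result corresponds to a driver such as “great smell”, and a negative result to one such as “leaky”.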

 Table: skt

Field

Description

subset

 A special field for Tableau Export containing the subset ID - this field is generated from the export tool, not from Daylight
Example: subset 0, subset 1, subset 2

term

 The smallest units within database knowledge to create concepts
Example: “Best Buy” is one concept with two terms

Example 2: “Leaky pen battery” is one concept with three terms

text_1

First sample document that contains a concept belonging to the subset of the documents 

text_2

Second sample document that contains a concept belonging to the subset of the documents 

text_3

Third sample document that contains a concept belonging to the subset of the documents 

impact

 The numeric value of the driver’s score difference from the average
Example: documents that include “great smell” have an impact of 1.2 higher than the average
Documents that include “leaky” have an impact of 2 lower than the average

conceptual_matches

Number of documents with the highest association scores to a specific concept

exact_matches

 Number of documents with the specific concept

odds_ratio

 An odds ratio (OR) is a statistic that quantifies the strength of the association between two events, A and B. The odds ratio is defined as the ratio of the odds of A in the presence of B and the odds of A in the absence of B, or equivalently (due to symmetry), the ratio of the odds of B in the presence of A and the odds of B in the absence of A.

p_value

 In statistical hypothesis testing, the p-value or probability value or significance is, for a given statistical model, the probability that, when the null hypothesis is true, the statistical summary (such as the absolute value of the sample mean difference between two compared groups) would be greater than or equal to the actual observed results.

total_matches

 Sum of conceptual and exact matches 

value

 The value of the subset

Example: New York, or 25 years old
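The odds ratio definition above can be worked through with a 2x2 table of counts. The sketch below takes event A to be "the document contains the term" and event B to be "the document belongs to the subset". That mapping, the counts, and the function name odds_ratio are assumptions for illustration; the export does not state how Daylight computes the value.

```python
def odds_ratio(a_and_b, a_not_b, b_not_a, neither):
    """Odds of A when B is present divided by odds of A when B is absent.

    a_and_b: count with both A and B
    a_not_b: count with A but not B
    b_not_a: count with B but not A
    neither: count with neither A nor B
    """
    if b_not_a == 0 or a_not_b == 0 or neither == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    odds_with_b = a_and_b / b_not_a
    odds_without_b = a_not_b / neither
    return odds_with_b / odds_without_b


# Hypothetical counts: 100 documents inside the subset and 100 outside it.
ratio = odds_ratio(a_and_b=30, a_not_b=10, b_not_a=70, neither=90)
print(round(ratio, 2))
```

A ratio above 1 suggests the term is more common inside the subset than outside it; a ratio near 1 suggests little difference.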



Table: doc_topic 

Field

Description

term

 The smallest units within database knowledge to create concepts
Example: “Best Buy” is one concept with two terms

Example 2: “Leaky pen battery” is one concept with three terms

exact_matches

 Number of documents with the specific concept

related_matches

 Number of documents with the specific concept that are not exact matches
Total Matches - Exact Matches = Related Matches

 Table: themes

Field

Description

cluster_label

 List of terms which represent a theme

Example: If “Leaky Battery” is a dominant theme, then the cluster label would represent that cluster of concepts
Example: term|language
Leaky|en
Battery|en

docs

 Number of documents within the theme cluster

id

 Theme cluster identifier (like the subset ID)
Example: Theme 0, Theme 1, Theme 2...

name

 List of concepts which describes the details of a theme
Example: Cluster label “Leaky battery” could also contain “pen battery” or “bad packaging” or “leaky pen”
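The cluster_label example above shows entries in a term|language form. A small sketch of splitting such entries follows. It assumes the entries have already been separated into individual strings, since the export description does not say how multiple entries are delimited; the function name parse_label_entry is hypothetical.

```python
def parse_label_entry(entry):
    """Split one 'term|language' entry into a (term, language) pair."""
    term, separator, language = entry.rpartition("|")
    if not separator:
        raise ValueError(f"expected 'term|language', got {entry!r}")
    return term, language


# Entries as shown in the cluster_label example.
entries = ["Leaky|en", "Battery|en"]
parsed = [parse_label_entry(entry) for entry in entries]
print(parsed)
```

Splitting on the last "|" keeps any "|" characters inside the term itself intact.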

Visualizations available in Luminoso’s Tableau template

Data Source

This table allows you to view the fields and tables exported from Luminoso. 

Auto Themes

Themes detected in the project are displayed in clearly labeled bubbles, along with text examples of each theme.

Themes (2)

The same themes as in Auto Themes, displayed as horizontal bars instead of bubbles so the viewer can easily see the number of qualifying documents.

Drivers

Similar to the Drivers tool in Daylight, this tab is a scatter plot of terms and their average impact on the scores.

Example: The term "scent" is located at +4.8 average impact and 6k sum of doc count. This can be interpreted as "Documents that reference scent have a score 4.8 higher than those that do not reference scent."

Example 2: The term "berry" is located at -14.6 with a volume of about 11k sum of doc count. This can be interpreted as "Documents that reference berry have a score 14.6 lower than those that do not reference berry."

Driver terms on the left sort the drivers by the calculated impact, which factors in both the intensity of the driver and the volume of the documents referencing that term. 

Subset Key-Terms

A horizontal bar chart allows viewers to identify and compare key terms by subset. 

Example: In the 65-year-old age bracket, we see "haven't smoked a cigarette" as a key term, while in the 23-year-old age bracket, we see "pods" and "refreshing" as key terms.

Clicking on key terms populates examples below. 

SKT (Subset Key Term) Bubbles 

A horizontal bar chart shows the volume of subsets

Example: Spot check to see which age demographics are most represented in the dataset

Drivers (2)

Functional duplicate of Drivers above with variable visualization options. 

Example: Use the 'Marks' option on the left to adjust the size of the terms based on impact

Low VOCTerms

The trend of average score and document count is shown at the top. Selecting months lets viewers review the individual terms associated with low scores by volume and average score.

High VOCTerms

The trend of average score and document count is shown at the top. Selecting months lets viewers review the individual terms associated with high scores by volume and average score.

DocCount over Time (2)

Functional duplicate of VOC Terms that drills into the volume of data over time, along with the scores.

Tags w/verbatim

Tags with the highest average scores can be broken down by age, score, and votes. Examples of verbatims including those topics are listed below. 

Topics over Time

View topic scores over time, and select individual topics to see examples. 

© 2020 Luminoso Technologies. All rights reserved.