Integrating Daylight API with Tableau endpoints
Table of contents
- 1 Tables exported from Luminoso
- 1.1 Table: doc_subset
- 1.2 Table: doc_terms
- 1.3 Table: doc_topic
- 1.4 Table: drivers
- 1.5 Table: skt
- 1.6 Table: doc_topic
- 1.7 Table: themes
- 2 Visualizations available in Luminoso’s Tableau template
- 2.1 Data Source
- 2.2 Auto Themes
- 2.3 Themes (2)
- 2.4 Drivers
- 2.5 Subset Key-Terms
- 2.6 SKT (Subset Key Term) Bubbles
- 2.7 Drivers (2)
- 2.8 Low VOCTerms
- 2.9 High VOCTerms
- 2.10 DocCount over Time (2)
- 2.11 Tags w/verbatim
- 2.12 Topics over Time
Tables exported from Luminoso
Table: doc_subset
Field | Description |
doc_id | Unique identifier for each document/verbatim uploaded in a given file - surrogate key of all tables |
subset | Special field for Tableau Export containing the subset ID - this field is generated from the export tool, not from Daylight |
subset_name | The name of the subset category, not the subset |
value | The value of the subset. Example: New York, or 25 years old |
Table: doc_terms
Field | Description |
doc_id | Unique identifier for each document/verbatim uploaded in a given file - surrogate key of all tables |
term | The smallest units from which concepts are built. Example: “Leaky pen battery” is one concept with three terms |
exact_match | Boolean value (0 or 1) indicating whether the document contains an exact match for a specific term. Example: “The battery leaked” either has (1) or does not have (0) an exact match to “Battery” |
association | The scored relationship between document and term as generated in Daylight |
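Because doc_id appears in every document-level table, the exported tables can be combined on that key. The following is a minimal sketch in plain Python, assuming the export has already been loaded into lists of dictionaries; the sample rows and the helper name join_on_doc_id are hypothetical and are not part of the export tool.

```python
from collections import defaultdict

# Hypothetical rows shaped like the doc_subset and doc_terms tables.
doc_subset = [
    {"doc_id": "d1", "subset_name": "City", "value": "New York"},
    {"doc_id": "d2", "subset_name": "City", "value": "Boston"},
]
doc_terms = [
    {"doc_id": "d1", "term": "battery", "exact_match": 1, "association": 0.91},
    {"doc_id": "d1", "term": "pen", "exact_match": 0, "association": 0.40},
    {"doc_id": "d2", "term": "battery", "exact_match": 0, "association": 0.35},
]


def join_on_doc_id(subset_rows, term_rows):
    """Inner-join the two tables on their shared doc_id key."""
    subsets_by_doc = defaultdict(list)
    for row in subset_rows:
        subsets_by_doc[row["doc_id"]].append(row)

    joined = []
    for term_row in term_rows:
        # One output row per (term row, subset row) pair for the same document.
        for subset_row in subsets_by_doc.get(term_row["doc_id"], []):
            joined.append({**subset_row, **term_row})
    return joined


joined = join_on_doc_id(doc_subset, doc_terms)

# Subset values of documents that contain an exact match for "battery".
exact_battery = [
    row["value"]
    for row in joined
    if row["term"] == "battery" and row["exact_match"] == 1
]
print(exact_battery)
```

In Tableau itself the same relationship would normally be defined in the Data Source tab rather than in code; the sketch only illustrates how the key ties the tables together.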
Table: doc_topic
Field | Description |
association | The scored relationship between document and topic as generated in Daylight |
doc_id | Unique identifier for each document/verbatim uploaded in a given file |
topic | Called a “Saved Concept” in the current version of Daylight; it can be a single concept or a combination of concepts |
Table: drivers
Field | Description |
doc_count | Number of documents in a project |
driver | The concept that is correlated with a difference in score from the average |
example_doc | First sample document that contains a concept that is driving the score |
example_doc2 | Second sample document that contains a concept that is driving the score |
example_doc3 | Third sample document that contains a concept that is driving the score |
impact | The numeric value of the driver’s score difference from the average |
related_terms | Terms related to the driver |
subset | A special field for Tableau Export containing the subset ID - this field is generated from the export tool, not from Daylight |
type | user_defined: same as a saved concept; auto_found: system-identified drivers |
Table: skt
Field | Description |
subset | A special field for Tableau Export containing the subset ID - this field is generated from the export tool, not from Daylight |
term | The smallest units from which concepts are built. Example: “Leaky pen battery” is one concept with three terms |
text_1 | First sample document that contains a concept belonging to the subset of the documents |
text_2 | Second sample document that contains a concept belonging to the subset of the documents |
text_3 | Third sample document that contains a concept belonging to the subset of the documents |
impact | The numeric value of the driver’s score difference from the average |
conceptual_matches | Number of documents with the highest association scores to a specific concept |
exact_matches | Number of documents with the specific concept |
odds_ratio | An odds ratio (OR) is a statistic that quantifies the strength of the association between two events, A and B. The odds ratio is defined as the ratio of the odds of A in the presence of B and the odds of A in the absence of B, or equivalently (due to symmetry), the ratio of the odds of B in the presence of A and the odds of B in the absence of A. |
p_value | In statistical hypothesis testing, the p-value or probability value or significance is, for a given statistical model, the probability that, when the null hypothesis is true, the statistical summary (such as the absolute value of the sample mean difference between two compared groups) would be greater than or equal to the actual observed results. |
total_matches | Sum of conceptual and exact matches |
value | The value of the subset. Example: New York, or 25 years old |
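The odds_ratio and total_matches definitions above can be illustrated with a small worked example. This is a sketch of the textbook odds ratio for a 2x2 table only; the document does not state which counts Daylight feeds into its own odds_ratio field, so the counts and the function below are assumptions chosen for illustration.

```python
def odds_ratio(a, b, c, d):
    """Textbook odds ratio for a 2x2 table.

    a: documents in the subset that match the term
    b: documents in the subset that do not match the term
    c: documents outside the subset that match the term
    d: documents outside the subset that do not match the term
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    # (a / b) is the odds of a match inside the subset,
    # (c / d) is the odds of a match outside it.
    return (a * d) / (b * c)


# Hypothetical counts, not taken from a real export.
conceptual_matches = 30
exact_matches = 20
total_matches = conceptual_matches + exact_matches  # sum of both match types

ratio = odds_ratio(a=total_matches, b=150, c=100, d=700)
print(total_matches, round(ratio, 2))
```

A ratio above 1 means the term is more likely to appear inside the subset than outside it; a ratio below 1 means the opposite.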
Table: doc_topic
Field | Description |
term | The smallest units from which concepts are built. Example: “Leaky pen battery” is one concept with three terms |
exact_matches | Number of documents with the specific concept |
related_matches | Number of documents with the specific concept that are not exact matches |
Table: themes
Field | Description |
cluster_label | List of terms which represent a theme. Example: If “Leaky Battery” is a dominant theme, then the cluster label would represent that cluster of concepts |
docs | Number of documents within the theme cluster |
id | Theme cluster identifier (like the subset ID) |
name | List of concepts which describes the details of a theme |
Visualizations available in Luminoso’s Tableau template
Data Source
This table allows you to view the fields and tables exported from Luminoso.
Auto Themes
Themes detected in the project, displayed in clearly labeled bubbles along with text examples of each theme.
Themes (2)
The same themes as in Auto Themes, displayed as horizontal bars instead of bubbles so the viewer can easily see the number of qualifying documents.
Drivers
Similar to the Drivers tool in Daylight, this tab is a scatter plot of terms and their average impact on the scores.
Example: The term "scent" is located at +4.8 average impact and 6k sum of doc count. This can be interpreted as "Documents that reference scent have a score 4.8 higher than those that do not reference scent."
Example 2: The term "berry" is located at -14.6 with a volume of about 11k sum of doc count. This can be interpreted as "Documents that reference berry have a score 14.6 lower than those that do not reference berry."
Driver terms on the left sort the drivers by the calculated impact, which factors in both the intensity of the driver and the volume of the documents referencing that term.
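The interpretation in the examples above, a difference in average score between documents that reference a term and those that do not, can be sketched as follows. This is an illustration only: the sample documents are hypothetical, plain substring matching stands in for Daylight's association scoring, and the calculated impact used to sort the driver terms on the left is not reproduced because its formula is not given here.

```python
from statistics import mean

# Hypothetical (score, text) pairs; not real data.
docs = [
    (90, "love the scent"),
    (80, "scent is great"),
    (70, "nice packaging"),
    (60, "too expensive"),
]


def average_score_difference(docs, term):
    """Mean score of documents mentioning `term` minus mean score of the rest."""
    with_term = [score for score, text in docs if term in text.lower()]
    without_term = [score for score, text in docs if term not in text.lower()]
    if not with_term or not without_term:
        raise ValueError("need documents both with and without the term")
    return mean(with_term) - mean(without_term)


diff = average_score_difference(docs, "scent")
print(diff)
```

A positive result reads like the "scent" example (documents mentioning the term score higher), and a negative result reads like the "berry" example.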
Subset Key-Terms
A horizontal bar chart allows viewers to identify and compare key terms by subset.
Example: In the 65 year old age bracket, we see "haven't smoked a cigarette" as a key term, while in the 23 year old age bracket, we see "pods" and "refreshing" as key terms.
Clicking on key terms populates examples below.
SKT (Subset Key Term) Bubbles
A horizontal bar chart shows the volume of subsets.
Example: Spot check to see which age demographics are most represented in the dataset
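The same spot check can be approximated outside Tableau by counting documents per subset value. The sketch below assumes rows shaped like the doc_subset table; the sample rows and the helper name subset_volumes are hypothetical.

```python
from collections import Counter

# Hypothetical rows shaped like the doc_subset table.
doc_subset = [
    {"doc_id": "d1", "subset_name": "Age", "value": "23"},
    {"doc_id": "d2", "subset_name": "Age", "value": "65"},
    {"doc_id": "d3", "subset_name": "Age", "value": "23"},
    {"doc_id": "d3", "subset_name": "City", "value": "New York"},
]


def subset_volumes(rows, subset_name):
    """Count documents per value within one subset category."""
    return Counter(
        row["value"] for row in rows if row["subset_name"] == subset_name
    )


age_volumes = subset_volumes(doc_subset, "Age")
print(age_volumes.most_common())
```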
Drivers (2)
Functional duplicate of Drivers above with variable visualization options.
Example: Use the 'Marks' option on the left to adjust the size of the terms based on impact.
Low VOCTerms
The trend of average score and document count is shown at the top. Selecting months lets viewers review the individual terms associated with low scores, by volume and average score.
High VOCTerms
The trend of average score and document count is shown at the top. Selecting months lets viewers review the individual terms associated with high scores, by volume and average score.
DocCount over Time (2)
Functional duplicate of the VOC Terms views that drills into the volume of data over time, along with the scores.
Tags w/verbatim
Tags with the highest average scores can be broken down by age, score, and votes. Examples of verbatims including those topics are listed below.
Topics over Time
View topic scores over time, and select individual topics to see examples.
© 2020 Luminoso Technologies. All rights reserved.