Luminoso Daylight is an AI-powered text analytics application that automatically analyzes conversational text like product reviews, open-ended survey responses, and support tickets. Daylight learns from the text data you upload and adapts to your specific data. To produce accurate and valuable results, Daylight needs data that is compatible with its system. Before uploading your data to Daylight, consider which data sources best meet your organization’s needs.

If your organization uses external data sources, you’re responsible for ensuring that the acquisition and use of that data comply with your license agreement with Luminoso.

Key vocabulary

What data do I have?

First, consider all of the ways that potential customers or employees might generate natural language text data. This data could take the form of:

This article discusses the most common data sources among Daylight users, but don’t limit yourself. Use these examples to think through the challenges and opportunities of using any appropriate data source available to your organization. Depending on your needs, any of these data sources could produce valuable results, but some require additional work to optimize for Daylight, and the opportunity each one offers varies.

Every example here is generalized. Use these suggestions to guide your analytical strategy and build a dataset that gets the best results from Daylight.

Which data should I choose?

Consider these essential factors when you select natural language text data to process with Luminoso Daylight. 

  1. Use natural language text — Not all text is natural language. For instance, structured fields or forms offer very little variation from one verbatim to the next.

  2. Consider how much data you have and how often you’ll use it — Think about how many rows of text data you have to analyze and how often you’ll need to take action on it. Also consider whether you’ll need to upload more documents to your project later.

  3. Prioritize unique unstructured text — Daylight identifies relationships between terms, so if your verbatims include recurring information that comes from within your organization, like ticket IDs or greetings, Daylight will probably identify that information as a top concept. Remove this information to focus on text that’s unique.

  4. Include structured data — Adding structured data, or metadata, related to your verbatims allows you to use all features in Daylight. When you add metadata, you can filter your documents within a project and create new projects based on matching filter criteria. After you upload satisfaction scores alongside verbatims, use the Drivers feature in Daylight to analyze concepts that impact satisfaction scores. If you include dates, you can filter your dataset to display specific periods of time. 

  5. Provide context — Select verbatims where the writer has a focused goal when they are providing feedback. Avoid analyzing scattered data that isn’t relevant to your brand. This is especially crucial when considering data sources like forums or tweets, where the prompt may not be brand-generated, and the forum may be less moderated. 

  6. Plan your extraction process — When planning your approach to data and analysis, consider how much effort it will take to format data into a comma-separated values (CSV) file for Daylight. Some sources may require more effort to make Daylight-compatible than others.
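
As a rough illustration of points 4 and 6, the sketch below writes verbatims and their metadata to CSV text with Python's standard csv module. The column names (text, star_rating, review_date, product), the helper name rows_to_csv, and the sample rows are all hypothetical and chosen only for illustration. The headers Daylight actually expects are not covered in this article, so check the Getting started with Daylight guide before uploading.

```python
import csv
import io

# Hypothetical column names; Daylight's required headers may differ.
FIELDNAMES = ["text", "star_rating", "review_date", "product"]


def rows_to_csv(rows):
    """Serialize a list of dicts to CSV text, one verbatim per row."""
    buffer = io.StringIO(newline="")
    # extrasaction="ignore" silently drops keys that are not in FIELDNAMES.
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        # Skip rows without natural language text: metadata alone
        # gives a text analytics tool nothing to analyze.
        if not (row.get("text") or "").strip():
            continue
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample rows for illustration only.
sample = [
    {"text": "Soothes ear pain quickly.", "star_rating": 5,
     "review_date": "2018-10-05", "product": "Example Plush",
     "internal_note": "not exported"},
    {"text": "   ", "star_rating": 3,
     "review_date": "2018-10-09", "product": "Example Plush"},
    {"text": "Arrived late, but works well.", "star_rating": 4,
     "review_date": "2018-11-02", "product": "Example Plush"},
]

csv_text = rows_to_csv(sample)
print(csv_text)
```

The csv module quotes any field that contains a comma, so verbatims with punctuation stay in a single column. Keeping one verbatim per row, with its metadata in the same row, is what makes filtering by that metadata possible later.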

Reviews

Reviews collect customer thoughts about a product or service. Customers share thoughts and opinions about the parts of a product or service that were most emotionally impactful to them, which makes reviews one of the best data sources for Daylight. Before Luminoso, companies could struggle to extract value from reviews at scale, since it’s impossible to predict the words a customer will use to describe a product or service. Daylight automatically identifies the words and phrases in a dataset, even those it has never seen before, so you don’t have to guess the different ways customers might describe their experience.

Find insights like

Best practices for reviews

Keep and label data associated with your text to use as metadata

Example

User: dsrice
Text: My 9 year old gets lots of ear infections, usually in the middle of the night. This warm fox and Tylenol are what soothes the pain until we can see a doctor and get medicine. It's a must have at our house.
Star Rating: 5
Title: Great for ear infection pain!
Review Date: October 5, 2018
Product: Warmies Microwavable French Lavender Scented Plush
Style: Fox
Verified Purchase: Yes

Rationale: Keeping your data clean and labeled lets you dig deeper by creating focused projects from a master project. The more metadata you include, the more questions you can address. For example, Marketing may want to know what’s popular with two different generations of consumers, while Manufacturing may need to differentiate problems between models.

Surveys

Surveys are requests for feedback that may be administered once, periodically, or on a rolling basis. To work in Daylight, a survey must include open-ended questions that elicit natural language text. Surveys often include rich structured data, like numeric satisfaction ratings. Including this structured data alongside the natural language text enables deeper research within Daylight through filters. The more open-ended a survey prompt, the stronger the analysis will be in Daylight. Targeted questions can skew the way a respondent answers, making results more difficult to interpret.

Find insights like

Best practices for surveys

Focus on responses to open-ended prompts or elaboration on an initial answer

Example

Text: I loved the free coffee and the room was very clean, but it smelled strongly of cigarette smoke.
Gold Member Since: 2015
Recent Stay: January 5, 2019
Overall Experience Score: 7
Check-In: 3
Room Cleanliness: 8
Room Service: 5
Check-Out: 4

Rationale: Responses like this offer unique insight into what the respondent was thinking or feeling. Including score data lets Daylight’s Drivers feature relate your natural language text to those scores, helping you make informed decisions.

Support tickets

Support tickets give insight into the parts of a product or service where users encounter difficulty. Daylight is an excellent match for this data source, since support tickets contain natural language descriptions of problems. A benefit of analyzing support tickets is identifying patterns and quantifying frequent issues. Some successful Daylight users conduct analysis alongside events like marketing campaigns or changes in service, or after a problem is reported.

Find insights like

Best practices for support tickets

Focus on the customer's side of the conversation and remove canned responses from data

Remove ticket IDs from your data
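
The two practices above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a Daylight feature: the (author, text) message structure, the ticket ID pattern, and the list of canned lines are all hypothetical and would need to match your own ticketing system.

```python
import re

# Hypothetical pattern: matches strings like "Ticket #48213:" or "case 991".
# It can also match ordinary phrases such as "case 5", so tune it to your data.
TICKET_ID = re.compile(r"\b(?:ticket|case)\s*#?\s*\d+\b:?", re.IGNORECASE)

# Hypothetical canned responses, stored in lowercase for comparison.
CANNED_LINES = {
    "thank you for contacting support.",
    "is there anything else i can help you with?",
}


def clean_ticket(messages):
    """Return the customer's side of a ticket without IDs or canned lines.

    `messages` is a list of (author, text) pairs, where author is
    "customer" or "agent".
    """
    kept = []
    for author, text in messages:
        if author != "customer":
            continue  # keep only the customer's side of the conversation
        text = TICKET_ID.sub("", text)
        lines = [line.strip() for line in text.splitlines()]
        lines = [ln for ln in lines if ln and ln.lower() not in CANNED_LINES]
        cleaned = " ".join(" ".join(lines).split())  # collapse whitespace
        if cleaned:
            kept.append(cleaned)
    return " ".join(kept)


# Made-up conversation for illustration only.
ticket = [
    ("agent", "Thank you for contacting support."),
    ("customer", "Ticket #48213: my router drops the connection every evening."),
    ("agent", "Is there anything else I can help you with?"),
    ("customer", "Yes, the app also logs me out.\nThank you for contacting support."),
]

print(clean_ticket(ticket))
```

Filtering by author removes most canned text on its own. The line-level check catches canned sentences that are quoted back inside a customer reply.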

Forum posts

Forum posts are an information source typically generated by users, not organizations. Users create and respond to threads, which usually start with a question or observation. Responses to the initial post are unique and open-ended, and vary based on the initial post and the forum’s moderation rules. Forum posts might not always address your business questions, but could provide major and unexpected insight. Forum data can also require work to clean and to isolate individual posts from a running thread. Be careful when combining multiple threads, since the context of one conversation may not carry over to another.

Find insights like

Best practices for forum posts

Choose high-volume threads with a clear focus
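
One way to apply this, and to isolate individual posts from a running thread, is sketched below in Python. The thread structure (a dict with "title" and "posts"), the output field names, the helper name posts_from_threads, and the min_posts threshold are hypothetical. What counts as a high-volume thread depends on your forum.

```python
def posts_from_threads(threads, min_posts=3):
    """Flatten threads into one row per post, skipping low-volume threads.

    Each row keeps the thread title as metadata so that posts from
    different conversations can still be told apart after they are
    combined into one dataset.
    """
    rows = []
    for thread in threads:
        posts = [post.strip() for post in thread["posts"] if post.strip()]
        if len(posts) < min_posts:
            continue  # too few posts to count as high volume
        for position, post in enumerate(posts, start=1):
            rows.append({
                "text": post,
                "thread_title": thread["title"],
                "position": position,  # 1 is the initial post
            })
    return rows


# Made-up threads for illustration only.
threads = [
    {"title": "Battery drains overnight",
     "posts": ["Mine loses 40% while idle.", "Same here since the update.", "",
               "Turning off sync helped me."]},
    {"title": "Off-topic chat", "posts": ["Anyone watching the game?"]},
]

forum_rows = posts_from_threads(threads)
for row in forum_rows:
    print(row)
```

Carrying the thread title with each post is a design choice that lets you filter by conversation later, which helps when context differs across threads.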

Social media posts

Social media posts, like those on Twitter or Instagram, are a unique source of natural language data. They are often brief but may be high in volume. Because posts are brief but creative, you need many samples to create a helpful Daylight analysis. Before starting, consider your methodology for de-duplicating re-posts and capturing posts that are relevant to your topic of research. Some companies measure the success of marketing campaigns through social media. Daylight can improve that process by summarizing the main topics in the response.

Find insights like

Best practices for social media posts

Focus on direct mentions of company support

Study data associated with specific hashtags

Ask open-ended questions to spark conversational, natural responses

Remove redundancies before you upload documents

Know the limitations of sarcasm and AI
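
For the practice of removing redundancies before upload, the Python sketch below shows one possible way to de-duplicate re-posts. It is an assumption-laden illustration, not a Daylight feature: the "RT @user:" prefix convention, the normalization rules, and the sample posts with the made-up #AcmeBrew hashtag are all hypothetical.

```python
import re


def normalize(post):
    """Build a comparison key: no repost prefix, no links, lowercase, single spaces."""
    text = re.sub(r"^\s*RT\s+@\w+:\s*", "", post)  # hypothetical repost prefix
    text = re.sub(r"https?://\S+", "", text)       # drop links
    return " ".join(text.lower().split())


def dedupe(posts):
    """Keep the first occurrence of each post, comparing normalized text."""
    seen = set()
    unique = []
    for post in posts:
        key = normalize(post)
        if key and key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


# Made-up posts for illustration only.
posts = [
    "Loving the new #AcmeBrew flavors!",
    "RT @coffeefan: Loving the new #AcmeBrew flavors!",
    "loving the new #acmebrew flavors!  https://example.com/x",
    "Is the #AcmeBrew app down for anyone else?",
]

unique_posts = dedupe(posts)
print(unique_posts)
```

Exact matching on normalized text only catches near-verbatim copies. Posts that paraphrase each other are kept, which is usually what you want, since they are distinct verbatims.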

How do I prepare my data?

Use the Preparing a dataset for upload section in our Getting started with Daylight guide to help you format and upload data into Daylight.