A guide to QuickLearn 2.0

Introduction

QuickLearn is Luminoso’s proprietary natural language modeling system that automatically learns domain-specific terminology.

QuickLearn uses word embeddings, which represent words as vectors. Vectors that point in similar directions represent words with similar meanings. For example, the vector for the word car points in a similar direction to the vectors for auto, vehicle, and transportation. QuickLearn represents each word as a list of numbers, and each list has hundreds of dimensions that capture meaning and nuance. Luminoso’s word embeddings don’t require supervised or labelled data, and Luminoso requires less data than off-the-shelf competitors to create useful word embeddings.
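As a rough illustration of "pointing in similar directions", the sketch below compares toy vectors with cosine similarity. The measure, the four-dimensional vectors, and the word list are all assumptions made for this example; they are not Luminoso's actual embeddings or scoring method.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors, invented for illustration only.
# Real word embeddings have hundreds of dimensions.
toy_embeddings = {
    "car": [0.9, 0.8, 0.1, 0.0],
    "vehicle": [0.8, 0.9, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9, 0.8],
}

car_vehicle = cosine_similarity(toy_embeddings["car"], toy_embeddings["vehicle"])
car_banana = cosine_similarity(toy_embeddings["car"], toy_embeddings["banana"])

print(f"car vs vehicle: {car_vehicle:.2f}")
print(f"car vs banana:  {car_banana:.2f}")
```

In this toy data, car and vehicle point in nearly the same direction and score close to 1, while car and banana point in different directions and score much lower.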

QuickLearn draws on a background space, a collection of word embeddings that reflect what words mean across the scope of a language. Unlike maintained, industry-specific vocabulary lists, background spaces aim to represent concepts through common sense and inherent definitions of words. QuickLearn uses a background space that is based on ConceptNet and is combined with several other word embedding spaces. The resulting background space is a set of 34 million relationships between concepts, mathematically represented in a general domain model and based on knowledge of how the world works.

Table of contents

1 Introduction
2 What is QuickLearn 2.0?
3 What improvements are part of QuickLearn 2.0?
- 3.1 Less bias
- 3.2 More relevant conceptual matches
- 3.3 Improved language support
4 QuickLearn 2.0 frequently asked questions

What is QuickLearn 2.0?

QuickLearn 2.0 is the most significant advancement Luminoso has made to its background science to date. It expands Luminoso’s background knowledge by adding 150 dimensions to the existing 150 (for a total of 300), consuming more knowledge from ConceptNet, increasing Luminoso’s overall conceptual clarity across all of our language models, and promoting debiasing.

Beginning in December 2019, all Luminoso products will use QuickLearn 2.0.

What improvements are part of QuickLearn 2.0?

Less bias

Through updates that remove bias from words in Luminoso’s background space, QuickLearn 2.0 decreases risk of bias in its results, especially for categories that are protected from discrimination. Debiasing prevents projects from taking on biased and harmful associations learned from the internet. This essential change makes Luminoso’s data processing as accurate as possible.

More relevant conceptual matches

We expanded QuickLearn’s background space to incorporate new advancements in natural language understanding (NLU) research. We also doubled the dimensionality of our representation of concepts, allowing for more specific conceptual matches. In QuickLearn 2.0, conceptual matches are now more intuitive and higher quality.

Improved language support

Thanks to improvements in ConceptNet and other background spaces, Luminoso now uses a larger starting vocabulary for most languages. The larger vocabulary strengthens analysis across the 15 languages Luminoso supports and enhances our conceptual matching ability. This change especially improves classification and conceptual matches in Luminoso’s classification offering, Compass.

QuickLearn 2.0 frequently asked questions

When will QuickLearn 2.0 be released?

Cloud: QuickLearn 2.0 will be released to all cloud users on December 7, 2019.
On-site: QuickLearn 2.0 will be released to on-site users in February 2020 as part of the on-site update package.

How do I start using QuickLearn 2.0?

Any new projects or classifiers you create automatically use QuickLearn 2.0.

How can I rebuild an existing Daylight project with QuickLearn 2.0?

Cloud: Contact your Customer Success Manager to have existing projects rebuilt. If you add new documents to your project, the project will rebuild using QuickLearn 2.0.
On-site: Use the build project API endpoint to rebuild projects, or add new documents to your project, which rebuilds the project using QuickLearn 2.0.

How can I rebuild existing Compass classifiers with QuickLearn 2.0?

Cloud: Use the dedicated rebuild endpoint to rebuild your classifiers. Classifiers are rebuilt using QuickLearn 2.0.
On-site: Use the dedicated rebuild endpoint to rebuild your classifiers. Classifiers are rebuilt using QuickLearn 2.0.

Will my old Daylight projects update to QuickLearn 2.0 automatically?

Cloud: Yes. If a project was built six or more months ago and has not been opened in the last seven days, Luminoso Daylight automatically rebuilds it to keep it current with the latest releases.
On-site: No, Luminoso Daylight does not automatically rebuild older projects.
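
The rule above can be read as a simple check. The following is only a sketch of that reading: the function name, its parameters, and the approximation of "six months" as 182 days are assumptions for illustration, not Luminoso's implementation.

```python
from datetime import datetime, timedelta

# Assumption: "six months" is approximated as 182 days for this sketch.
SIX_MONTHS = timedelta(days=182)
SEVEN_DAYS = timedelta(days=7)


def is_due_for_auto_rebuild(last_built, last_opened, now, is_cloud=True):
    """Return True if a project meets the automatic rebuild conditions.

    Hypothetical helper: on-site projects never rebuild automatically;
    cloud projects qualify when they were built six or more months ago
    and have not been opened in the last seven days.
    """
    if not is_cloud:
        return False
    built_long_ago = now - last_built >= SIX_MONTHS
    not_recently_opened = now - last_opened > SEVEN_DAYS
    return built_long_ago and not_recently_opened


now = datetime(2019, 12, 7)
stale_project = is_due_for_auto_rebuild(
    last_built=datetime(2019, 1, 1),
    last_opened=datetime(2019, 11, 1),
    now=now,
)
active_project = is_due_for_auto_rebuild(
    last_built=datetime(2019, 1, 1),
    last_opened=datetime(2019, 12, 5),
    now=now,
)
print(stale_project, active_project)
```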

What changes can I expect to see in my new Daylight projects?

You should see more intuitive top matching concepts for any concept you select. Matches that previously had a low relevance should display a lower score than before.

Will QuickLearn 2.0 change conceptual matches?

Yes. Improving the background space results in more accurate conceptual matches and association scores. Since the number of concepts counted as conceptual matches is higher in QuickLearn 2.0, the number of documents that contain a conceptual match for any concept you select is higher.

Here’s a comparison of conceptual matches for the concept “upgrade” in two identical datasets uploaded to QuickLearn 1.0 and QuickLearn 2.0 instances:

In QuickLearn 2.0, there’s a major increase in conceptual matches for “upgrade”. When you compare the top related matches, you can see that the results are more precise and closer to synonyms of the selected concept.

Why are my concept relation scores different?

The numbers you see in the Top related concepts pane represent how similar concepts are based on their proximity in QuickLearn 2.0’s 300-dimension space. Since QuickLearn 2.0 has more directions within the semantic space, concepts may be farther apart. Additionally, a concept’s meaning must match in more ways to display a high relatedness score.
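
To see why more dimensions can lower a score, consider the sketch below. It assumes relatedness behaves like cosine similarity, which this guide does not state, and it uses invented vectors: two hypothetical concepts agree on every one of the first 150 dimensions but disagree on half of the 150 added ones.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented data: both concepts are identical in the original 150 dimensions.
shared = [1.0] * 150
concept_a_150 = list(shared)
concept_b_150 = list(shared)

# In the 150 added dimensions, concept B disagrees with concept A on every
# odd-numbered dimension.
extra_a = [1.0] * 150
extra_b = [1.0 if i % 2 == 0 else -1.0 for i in range(150)]
concept_a_300 = shared + extra_a
concept_b_300 = shared + extra_b

score_150 = cosine_similarity(concept_a_150, concept_b_150)
score_300 = cosine_similarity(concept_a_300, concept_b_300)

print(f"150 dimensions: {score_150:.2f}")
print(f"300 dimensions: {score_300:.2f}")
```

The pair that looked identical in 150 dimensions scores lower in 300, because the added dimensions expose differences the smaller space could not represent.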

How do I find related matches?

The top matches in the Top related concepts section may now reflect concepts that are highly synonymous with the concept you have selected. You may now need to scroll down to find concepts that are related to the selected concept, or close to it in the vector space that Daylight creates, but not synonymous with it.

When QuickLearn 2.0 is introduced, will my saved concepts change?

QuickLearn 2.0 does not affect the list of concepts that Daylight identifies, and Daylight maintains your list of saved concepts. However, rebuilding a project will improve concept association scores and related concepts, which updates several features. You may notice changes in:

Drivers feature: Changes to conceptual matches will affect the rank of suggested drivers and a concept’s average score calculations. Overall analysis will be more accurate.
Sentiment feature: Changes in conceptual matches may subtly affect the ordering of sentiment suggestions.
Match counts: The number of total matches will change as a direct result of QuickLearn 2.0 improvements to conceptual matching.

Will QuickLearn 2.0 affect my compound concepts?

Any compound concepts and the concepts you saved as part of them remain saved in QuickLearn 2.0. QuickLearn 2.0 doesn’t change the concepts in your project, but it does affect concept association scores. The way your compound concepts relate to the rest of your project will change as a result of QuickLearn 2.0.

Why does the shape of my project’s Galaxy look different?

QuickLearn 2.0 changes the way Luminoso learns, which affects the way the Galaxy portrays semantic space. QuickLearn 2.0 produces many small, loosely related Galaxy clusters, unlike QuickLearn 1.0, which generated a few large clusters of terms. In QuickLearn 2.0, visualizations still arrange related concepts together in a two-dimensional view of a multi-dimensional space. QuickLearn 2.0 changes the position of each concept’s vector, and calculates closer relationships between all concepts, so the visualization no longer displays several large clusters with nothing in between.

Do I need to change anything in my API set-up to accommodate the QuickLearn update?

No, QuickLearn 2.0 is automatically included in the API.

Will projects take more time to build?

Because QuickLearn 2.0 is more intelligent in identifying and plotting out relationships between relevant concepts, projects may take longer to build.

Will classifiers take longer to build?

QuickLearn 2.0 does not take longer to build Luminoso Compass classifiers.

Will debiasing affect my projects?

We updated the background space Luminoso uses so that social biases in the analysis are reduced. This helps prevent harmful associations learned from the internet from affecting conceptual matches. The effects of debiasing in a project are very subtle, so you may not notice a difference. Associations that Luminoso draws specifically from your project’s documents don’t rely on knowledge in the background space and are much stronger. These associations should be relatively unaffected.