QuickLearn is Luminoso’s proprietary natural language modeling system that automatically learns domain-specific terminology.
QuickLearn uses word embeddings, which represent words as vectors. Vectors that point in similar directions represent words with similar meanings. For example, the vector for the word car points in a similar direction to the vectors for auto, vehicle, and transportation. Each vector is a list of numbers, and QuickLearn understands language by working with these lists. Each word embedding has hundreds of dimensions that capture meaning and nuance. Luminoso’s word embeddings don’t require supervised or labelled data, and they take less data to create than off-the-shelf competitors need to produce useful word embeddings.
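To make the idea of "vectors that point in similar directions" concrete, here is a minimal sketch that compares toy vectors using cosine similarity, a standard way to measure how closely two vectors align. The four-dimensional vectors, the word list, and the function names are invented for illustration. They are not QuickLearn’s actual embeddings or code.

```python
import math

# Toy 4-dimensional vectors, invented for illustration only.
# Real word embeddings have hundreds of dimensions.
EMBEDDINGS = {
    "car":     [0.9, 0.8, 0.1, 0.0],
    "auto":    [0.8, 0.9, 0.2, 0.1],
    "vehicle": [0.7, 0.6, 0.3, 0.1],
    "banana":  [0.0, 0.1, 0.9, 0.8],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Rank every other word by cosine similarity to `word`, best first."""
    target = embeddings[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("car", EMBEDDINGS):
        print(f"car ~ {other}: {score:.2f}")
```

In this toy data, auto and vehicle rank above banana for car because their vectors point in nearly the same direction as the vector for car.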
QuickLearn draws on a background space, a collection of word embeddings that reflect what words mean across the scope of a language. Unlike maintained, industry-specific vocabulary lists, background spaces aim to represent concepts through common sense and inherent definitions of words. QuickLearn uses a background space that is based on ConceptNet and is combined with several other word embedding spaces. The resulting background space is a set of 34 million relationships between concepts, mathematically represented in a general domain model and based on knowledge of how the world works.
QuickLearn 2.0 is the most significant advancement Luminoso has made to its background science to date. It expands Luminoso’s background knowledge by adding 150 dimensions to the existing 150 for a total of 300, consuming more knowledge from ConceptNet, increasing Luminoso’s overall conceptual clarity across all of our language models, and promoting debiasing.
Beginning in December 2019, all Luminoso products will use QuickLearn 2.0.
Through updates that remove bias from words in Luminoso’s background space, QuickLearn 2.0 decreases risk of bias in its results, especially for categories that are protected from discrimination. Debiasing prevents projects from taking on biased and harmful associations learned from the internet. This essential change makes Luminoso’s data processing as accurate as possible.
We expanded QuickLearn’s background space to incorporate new advancements in natural language understanding (NLU) research. We also doubled the dimensionality of our representation of concepts, allowing for more specific conceptual matches. In QuickLearn 2.0, conceptual matches are now more intuitive and higher quality.
Luminoso now uses a larger starting vocabulary for most languages, thanks to improvements in ConceptNet and other background spaces. This strengthens analysis across the 15 languages Luminoso supports and enhances our conceptual matching ability. The change especially improves classification and conceptual matches in Compass, Luminoso’s classification offering.
Cloud: QuickLearn 2.0 will be released to all cloud users on December 7, 2019.
On-site: QuickLearn 2.0 will be released to on-site users in February 2020 as part of the on-site update package.
Any new projects or classifiers you create automatically use QuickLearn 2.0.
Cloud: Contact your Customer Success Manager to have existing projects rebuilt. If you add new documents to your project, the project will rebuild using QuickLearn 2.0.
On-site: Use the build project API endpoint to rebuild projects or add new documents to your project to rebuild your project using QuickLearn 2.0.
Cloud: Use the dedicated rebuild endpoint to rebuild your classifiers. Classifiers are rebuilt using QuickLearn 2.0.
On-site: Use the dedicated rebuild endpoint to rebuild your classifiers. Classifiers are rebuilt using QuickLearn 2.0.
Cloud: Yes. If a project was built six or more months ago and has not been opened in the last seven days, Luminoso Daylight automatically rebuilds it to keep it current with the latest releases.
On-site: No, Luminoso Daylight does not automatically rebuild older projects.
You should see more intuitive top matching concepts for any concept you select. Matches that previously had a low relevance should display a lower score than before.
Yes. Improving the background space results in more accurate conceptual matches and association scores. Since the number of concepts counted as conceptual matches is higher in QuickLearn 2.0, the number of documents that contain a conceptual match for any concept you select is higher.
Here’s a comparison of conceptual matches for the concept “upgrade” in two identical datasets uploaded to QuickLearn 1.0 and QuickLearn 2.0 instances:
In QuickLearn 2.0, there’s a major increase in conceptual matches for “upgrade”. When you compare the top related matches, you can see that the results are more precise and closer in meaning to “upgrade”.
The numbers you see in the Top related concepts pane represent how similar concepts are based on their proximity in QuickLearn 2.0’s 300-dimension space. Since QuickLearn 2.0 has more directions within the semantic space, concepts may be farther apart. Additionally, a concept’s meaning must match in more ways to display a high relatedness score.
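The geometric intuition behind "more directions, so concepts may be farther apart" can be sketched with random vectors. The example below compares the average absolute cosine similarity of unrelated random vector pairs in 150 and 300 dimensions. The random Gaussian vectors, the pair count, and the function names are assumptions made for illustration. Real embeddings are not random, so this only shows the general tendency and does not reproduce QuickLearn’s scores.

```python
import math
import random


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_abs_cosine(dimensions, pairs=500, seed=0):
    """Average |cosine| over random Gaussian vector pairs of a given size."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(pairs):
        a = [rng.gauss(0.0, 1.0) for _ in range(dimensions)]
        b = [rng.gauss(0.0, 1.0) for _ in range(dimensions)]
        total += abs(cosine_similarity(a, b))
    return total / pairs


if __name__ == "__main__":
    for dims in (150, 300):
        print(f"{dims} dimensions: mean |cosine| = {mean_abs_cosine(dims):.3f}")
```

Unrelated vectors tend to score closer to zero in the larger space, which is consistent with lower relatedness numbers for weak matches after a rebuild.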
The top matches in the Top related concepts section may now be concepts that are highly synonymous with the concept you have selected. You may now need to scroll down to find concepts that are related to the selected concept, or close to it in the vector space that Daylight creates, rather than synonymous with it.
QuickLearn 2.0 does not affect the list of concepts that Daylight identifies, and Daylight maintains your list of saved concepts. However, rebuilding a project will improve concept association scores and related concepts, which updates several features. You may notice changes in:
Drivers feature: Changes to conceptual matches will affect the rank of suggested drivers and a concept’s average score calculations. Overall analysis will be more accurate.
Sentiment feature: Changes in conceptual matches may subtly affect the ordering of sentiment suggestions.
Match counts: The number of total matches will change as a direct result of QuickLearn 2.0 improvements to conceptual matching.
Any compound concepts and the concepts you saved as part of them remain saved in QuickLearn 2.0. QuickLearn 2.0 doesn’t change the concepts in your project, but it does affect concept association scores. The way your compound concepts relate to the rest of your project will change as a result of QuickLearn 2.0.
QuickLearn 2.0 changes the way Luminoso learns, which affects the way the Galaxy portrays semantic space. QuickLearn 2.0 produces many small, loosely related Galaxy clusters, unlike QuickLearn 1.0, which generated a few large clusters of terms. In QuickLearn 2.0, visualizations still arrange related concepts together in a two-dimensional view of a multi-dimensional space. QuickLearn 2.0 changes the position of each concept’s vector, and calculates closer relationships between all concepts, so the visualization no longer displays several large clusters with nothing in between.
No, QuickLearn 2.0 is automatically included in the API.
Because QuickLearn 2.0 is more intelligent in identifying and plotting out relationships between relevant concepts, projects may take longer to build.
QuickLearn 2.0 does not take longer to build Luminoso Compass classifiers.
We updated the background space Luminoso uses so that social biases in the analysis are reduced. This helps prevent harmful associations learned from the internet from affecting conceptual matches. The effects of debiasing in a project are very subtle, so you may not notice a difference. Associations that Luminoso draws specifically from your project’s documents don’t rely on knowledge in the background space and are much stronger. These associations should be relatively unaffected.
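This article does not describe how the debiasing is implemented. As a purely illustrative sketch, one common approach in word-embedding research is to identify a direction in the vector space that encodes an unwanted association and subtract each word’s component along that direction. The vectors, the bias direction, and the function names below are hypothetical, and this is not presented as Luminoso’s actual method.

```python
def dot(a, b):
    """Return the dot product of vectors a and b."""
    return sum(x * y for x, y in zip(a, b))


def remove_direction(vector, direction):
    """Subtract the component of `vector` that lies along `direction`."""
    scale = dot(vector, direction) / dot(direction, direction)
    return [v - scale * d for v, d in zip(vector, direction)]


if __name__ == "__main__":
    # Hypothetical 3-dimensional example: the first axis stands in for
    # an unwanted association that should not influence matches.
    bias_direction = [1.0, 0.0, 0.0]
    word_vector = [0.6, 0.5, 0.4]

    cleaned = remove_direction(word_vector, bias_direction)
    print("before:", word_vector)
    print("after: ", cleaned)
    print("component along bias direction:", dot(cleaned, bias_direction))
```

After the projection, the word’s vector has no component along the chosen direction, while its other dimensions are unchanged. This matches the intuition that debiasing has a subtle effect on most matches.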