Last Updated: Jan 24, 2019

Input and output

Requests

Note: Ensure that all calls to the Compass API include a trailing slash in the endpoint URL.

For GET/DELETE requests, parameters should go in the URL as query parameters. For PUT/POST requests, parameters should almost always go in the request body (exceptions for particular endpoints are noted in their documentation); they can be formatted as HTML forms (in which case the Content-Type header should be set to "application/x-www-form-urlencoded") or as JSON (in which case the Content-Type header should be set to "application/json").
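The two body encodings described above can be sketched with the Python standard library. This is only an illustration: the helper names `encode_body` and `query_url` are hypothetical and are not part of the Compass API.

```python
import json
from urllib.parse import urlencode


def encode_body(params, as_json=True):
    """Return (body_bytes, headers) for a PUT/POST request body."""
    if as_json:
        body = json.dumps(params).encode("utf-8")
        content_type = "application/json"
    else:
        # HTML form encoding, e.g. name=demo&status=active
        body = urlencode(params).encode("utf-8")
        content_type = "application/x-www-form-urlencoded"
    return body, {"Content-Type": content_type}


def query_url(url, params):
    """Append parameters to a GET/DELETE URL as a query string."""
    return f"{url}?{urlencode(params)}" if params else url
```

JSON is usually the more convenient choice because it can carry nested values and lists, which form encoding cannot express directly.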

Responses

Response bodies are JSON-encoded (except for certain 50X server errors).

Successful responses use the following HTTP codes:

Some API methods return a "paginated" list of objects (for example, getting the list of messages in a topic). A paginated list is a JSON object with four keys:

Errors

Error responses use the following HTTP codes:

In cases where there is an error, in addition to the HTTP code the response will include some information about the error. This information will usually be a JSON object but could also be a JSON string.

Below are some typical examples of error responses.

For a 401 (Unauthorized), when the request is not authenticated:

{
   "detail": "Authentication credentials were not provided."
} 

For a 400 (Bad Request), when trying to create a project while specifying no name and a nonexistent account:

{
    "name": [
         "This field is required."
    ],
    "account": [
         "Invalid pk 'xxx' - object does not exist."
   ]
} 
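Because an error body may be a JSON object, a JSON string, or (for some 50X errors) not JSON at all, client code has to handle all three shapes. The following is a minimal sketch; the function name `describe_error` and the output format are illustrative choices, not something the API defines.

```python
import json


def describe_error(status, body_text):
    """Flatten an error response body into a list of readable messages."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        # Certain 50X server errors are not JSON; keep the raw text.
        return [f"{status}: {body_text.strip()}"]
    if isinstance(payload, str):
        return [f"{status}: {payload}"]
    if isinstance(payload, dict):
        messages = []
        for field, detail in payload.items():
            # A field may map to one message or to a list of messages.
            items = detail if isinstance(detail, list) else [detail]
            messages.extend(f"{status}: {field}: {item}" for item in items)
        return messages
    return [f"{status}: {payload!r}"]
```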

Data types

The documentation refers to several data types for parameters:

Authentication and access

Permissions

A permission allows a particular user to do certain things with respect to a particular account.

The permission levels are:

In addition to permissions on accounts, there is a special status called "site admin". A user who is a site admin is allowed to do everything. Site admins are the only users who are allowed to create accounts and users.

Tokens and authentication

Each user can obtain a token that can be used to authenticate to the API. Currently, each user can have at most one token. All tokens currently expire two weeks after their creation, or they can be deleted manually. A user's token is also reset if the user's password is reset or changed.

For an API request to be authenticated, it should include an "Authorization" header, whose value should be the string "Token " followed by the user's token. For example, if the token is "74e6d14dbde4303fe9864cb77306cc36394de460", then the value of the Authorization header should be "Token 74e6d14dbde4303fe9864cb77306cc36394de460".
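As a minimal sketch of the header format described above, the following builds an authenticated request object with the Python standard library without sending it. The helper names and the example URL are placeholders, not part of the API.

```python
from urllib.request import Request


def auth_headers(token):
    """Build the Authorization header: the string "Token " plus the token."""
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    return {"Authorization": f"Token {token}"}


def authenticated_request(url, token, method="GET"):
    """Create (but do not send) a request carrying the Authorization header."""
    return Request(url, headers=auth_headers(token), method=method)
```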

To obtain your token, log in with your username and password. If you already have a token, the login response will include your existing token; otherwise, it will include a newly generated token.

Compass API Endpoints

The following endpoints/methods are provided by the Compass API.

Note: Ensure that all calls to the Compass API include a trailing slash in the endpoint URL.

To make API requests, the endpoints listed here should be prefixed with "https://<compass-url.tld>/api".
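One way to apply both the prefix and the trailing-slash rule consistently is a small URL helper. This is a sketch only; `endpoint_url` is a hypothetical name and `compass.example.com` stands in for your own Compass host.

```python
def endpoint_url(base_url, path):
    """Join the Compass host, the /api prefix, and an endpoint path.

    The result always ends with a trailing slash.
    """
    base = base_url.rstrip("/")
    cleaned = path.strip("/")
    return f"{base}/api/{cleaned}/" if cleaned else f"{base}/api/"
```

For example, `endpoint_url("https://compass.example.com", "/projects/42")` yields `https://compass.example.com/api/projects/42/`.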

Tokens

A token authenticates a particular user to the API. Currently, each user has at most one token at a time.

A token object in the API has the following fields:

POST /login/

Log in with a username and password to get a token.

Response: JSON object with three keys:

GET /tokens/

Get a list of tokens. This will only include your tokens, unless you are a site admin.

Response: paginated list of token objects

GET /tokens/<token>/

Get an existing token.

Response: token object 

DELETE /tokens/<token>/

Delete a token.

Response: (empty)

Projects

A project is one set of data (classifiers, documents, messages and their associated information and analysis).

A project object in the API has the following fields:

GET /projects/

Get the list of all the projects the user has access to.

POST /projects/

Create a project.

Response: new project object

GET /projects/<project_id>/

Get a project.

Response: project object

PUT /projects/<project_id>/

Update a project.

Response: updated project object

DELETE /projects/<project_id>/

Delete a project.

Response: (empty)

Documents

A document is an individual unit of text that can be used in the classification workflow.  A document object in the API has the following fields:

Typically, labeled documents are put into a dataset and used to train supervised classifiers in Compass. Unlabeled documents can also be added, either to define the vector space used when building supervised classifiers or to build other types of classifiers.

GET /projects/<project_id>/p/documents/

Get the list of documents from the first page (20 documents).

Response: paginated list of document objects

POST /projects/<project_id>/p/documents/

Add a document or multiple documents. To add multiple documents at a time, specify a list of JSON objects instead of a single JSON object.

Response: paginated list of project objects

Classifiers

Compass provides three types of classifiers: voting, topic-based, and sentiment. The voting classifier is a supervised classifier trained on a collection of labeled documents. The topic-based classifier is a semi-supervised classifier; it relies on topics defined by the user and does not require training. A collection of domain documents is still needed by the topic-based classifier to create the semantic space on which to operate. The sentiment classifier relies on the semantic space only for domain expansion, and domain documents are necessary only for that case.

Depending on the type of the classifier, a classifier object is defined as follows.

A classifier object for the voting classifier:

A classifier object for the topic-based classifier:

A classifier object for the sentiment classifier:

GET /projects/<project_id>/p/classifiers/

Get the list of classifiers in the project.

Response: paginated list of classifier objects.

POST /projects/<project_id>/p/classifiers/

Create a new classifier.

Permission required: write permission

Required parameters:

Optional parameters vary slightly by the type of the classifier.

Optional body parameters for the voting classifier:

{
    "name": "my first voting classifier",
    "status": "active",
    "type": "voting",
    "dataset": "restaurant-reviews-training-set",
    "info": {
        "num_topics": 2,
        "threshold": 0.5
    }
}

Optional body parameters for the topic-based classifier:

Sample parameters for topic-based classifier

{
    "name": "food and drink",
    "status": "active",
    "type": "topic_based",
    "info": {
        "num_topics": 2,
        "threshold": 0.76,
        "dataset": "restaurant reviews",
        "topics": [
            {"title": "breakfast", "info": "bagel AND (schmear OR cream cheese)"},
            {"title": "coffee", "info": "drink AND coffee AND NOT latte"},
            {"title": "lunch", "info": "falafel OR chickpea fritter sandwich"},
            {"title": "dessert", "info": "cake OR pie OR ice cream"}
        ]
    }
}

Optional body parameters for the sentiment classifier:

Sample parameters for sentiment classifier of type "sentiment_combined"

{
    "name": "Sentiment Classifier - combined",
    "type": "sentiment_combined",
    "status": "active",
    "info": {
        "combined_threshold": 0.5,
        "domain_expansion": true
    }
}

Sample parameters for sentiment classifier of type "sentiment_split"

{
    "name": "Sentiment Classifier - split",
    "type": "sentiment_split",
    "status": "active",
    "info": {
        "negative_threshold": -0.3,
        "positive_threshold": 0.5,
        "domain_expansion": true
    }
}

Sample parameters for sentiment classifier of type "sentiment_custom"

{
    "name": "My Custom Sentiment List",
    "type": "sentiment_custom",
    "status": "active",
    "info": {
        "topics": {
            "bad": {"min": -1, "max": 0},
            "good": {"min": 0, "max": 1}
        },
        "wordlist": {
            "superb": 5,
            "excellent": 4,
            "mighty fine": 3,
            "mediocre": 2,
            "meh": 1,
            "poor": -1,
            "sucky": -2,
            "deplorable": -3,
            "miserable": -4,
            "execrable": -5
         }
    }
}

Response: a classifier object. While the classifier is being built or rebuilt, its building_state is set to building and its status to inactive. Check the classifier's URL periodically until building_state has changed to ready and status to active. If something went wrong during construction of the classifier, building_state is error and the info field contains information on the error.
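The periodic check described above can be sketched as a polling loop. To keep the example independent of any HTTP library, the caller passes in `fetch_classifier`, a hypothetical callable that performs the GET on the classifier's URL and returns the classifier object as a dict. The interval and attempt limits are arbitrary defaults, not values recommended by the API.

```python
import time


def wait_until_built(fetch_classifier, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll a classifier until it is ready, failed, or attempts run out.

    fetch_classifier: hypothetical callable returning the classifier dict.
    """
    for attempt in range(max_attempts):
        classifier = fetch_classifier()
        state = classifier.get("building_state")
        if state == "ready" and classifier.get("status") == "active":
            return classifier
        if state == "error":
            # The info field carries details about what went wrong.
            raise RuntimeError(f"classifier build failed: {classifier.get('info')}")
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("classifier was still building after the last attempt")
```

Passing `sleep` as a parameter makes the loop easy to exercise in tests without real delays.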

GET /projects/<project_id>/p/classifiers/<classifier_id>/

Retrieve the specific classifier’s information.

Response: a classifier object.

PUT /projects/<project_id>/p/classifiers/<classifier_id>/

Once the classifier has been created, some of its information can be updated.

Response: an updated classifier object.

DELETE /projects/<project_id>/p/classifiers/<classifier_id>/

Delete a classifier.

Response: none. 

POST /projects/<project_id>/p/classifiers/<classifier_id>/rebuild/

Rebuild a classifier to take advantage of a new or updated set of domain documents or training documents (if applicable), a new or changed set of labels (for the voting classifier), or changed configuration parameters for sentiment classifiers. The following classifiers allow rebuilding:

  1. Voting Classifier

  2. Topic-based Classifier

  3. Sentiment-Combined Classifier with Domain Expansion flag on

  4. Sentiment-Split Classifier with Domain Expansion flag on

  5. Sentiment-Custom Classifier with Domain Expansion flag on

While the classifier is being rebuilt, its building_state is set to 'building'; the current version of the classifier still accepts messages for classification until rebuild is complete. 
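The list of rebuildable classifiers above can be expressed as a small client-side check, useful before issuing a rebuild request. This sketch assumes the classifier object carries the same "type" and "info" keys shown in the sample creation parameters; the function name is hypothetical.

```python
ALWAYS_REBUILDABLE = {"voting", "topic_based"}
SENTIMENT_TYPES = {"sentiment_combined", "sentiment_split", "sentiment_custom"}


def allows_rebuild(classifier):
    """Return True if the classifier is of a kind that can be rebuilt."""
    kind = classifier.get("type")
    if kind in ALWAYS_REBUILDABLE:
        return True
    if kind in SENTIMENT_TYPES:
        # Sentiment classifiers qualify only with domain expansion on.
        info = classifier.get("info") or {}
        return bool(info.get("domain_expansion"))
    return False
```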

Response: an updated classifier object.

POST /projects/<project_id>/p/classifiers/<classifier_id>/test/ 

Once created, a classifier can be tested for accuracy. The input is a set of text-label pairs, where the label values represent the "truth". The output is a summary accuracy figure and a list of classification values for each of the classifier's topics. Accuracy is defined as the ratio of text items classified into the correct topic to the total number of text items submitted, expressed as a percentage.

As a best practice, submit the text items to be classified in lists of around 1,000.
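The accuracy definition and the batching advice can be illustrated locally. The sketch below does not call the API; it only shows the arithmetic and one way to split a large test set into lists of about 1,000 items. Both function names are hypothetical.

```python
def accuracy_percent(predicted, truth):
    """Percentage of items whose predicted topic equals the true label."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("at least one item is required")
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return 100.0 * correct / len(truth)


def batches(items, size=1000):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For instance, three correct predictions out of four submitted items give an accuracy of 75.0.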

Permission required: write permission

Required parameters:

Optional parameters:

Response: a dictionary with two elements:

Domain Expansion for Sentiment Classifiers

This section applies to sentiment classifiers only. With sentiment classifiers, users have the option of turning on a domain_expansion flag. Domain expansion is a powerful option: it expands the standard (generic) sentiment into sentiment that is informed by the project's domain-specific words.

All three sentiment classifiers - Sentiment_combined, Sentiment_split, Sentiment_custom - have the option of turning the domain_expansion flag on.  By default the domain_expansion flag is off.

Domain expansion leverages the domain documents in the project to find additional sentiment-bearing terms, so the project must contain documents. If 'dataset' is specified, the system uses that dataset to build domain-specific sentiment. If 'dataset' is not specified, it uses all documents in the project. If there are no documents, the user receives a warning that domain_expansion cannot be turned on.

Domain expansion can be turned on at the classifier creation, or it can be turned on for a classifier that had domain expansion turned off.

Sample code to create a sentiment_combined with domain expansion on

{
    "name": "Sentiment Combined, expansion on",
    "type": "sentiment_combined",
    "info": {
        "domain_expansion": true,
        "dataset": "restaurant reviews"
    }
}

Sample output that shows terms in the domain-expanded output

{
    "info": {
        "domain_expansion": true,
        "domain_terms": {
            "challenge": -1.74,
            "more": 1.87,
            "takeout": -2.35,
            "tip": -2.11,
            "happy hour": 3.2,
            "oysters": -1.43
        ...
    }
}

If desired, the domain_terms can be extracted by using a GET endpoint, and once edited the list can be updated by issuing a PUT.

It is possible for the system to generate words that carry both negative and positive sentiment. Those are put in the term_conflicts element, and are not considered by the classifier. Conflicted terms may be reviewed and moved into the domain_terms element with an appropriate sentiment score.

Domain expansion can be turned off, by setting domain_expansion to 'false'. 
Note: once domain_expansion is turned off, the list of domain terms is deleted. 

Classifier Topics for topic-based classifier

This section applies to topic-based classifiers only. Topics for voting classifiers are derived directly from the labels on the training documents and cannot be modified once the classifier has been created. 

POST /projects/<project_id>/p/classifiers/<classifier_id>/topics/

Create a new topic for an existing topic-based classifier.

Response: the newly created topic object.

PUT /projects/<project_id>/p/topics/<topic_id>/

Modify an existing topic associated with a topic-based classifier.

Response: the modified topic object. 

DELETE /projects/<project_id>/p/topics/<topic_id>/

Topic Syntax

This section describes how to construct topics for the topic-based classifier. A topic specification is a string composed of terms (operands) linked by Boolean operators. There are three operators: AND, OR, and NOT, with their usual meaning; they must appear in all-uppercase. Operands may be single words or multi-word phrases. The string must not contain any punctuation (including quotation marks). 

Specifications may range from the very simple (two terms linked by a single operator) to the arbitrarily complex (with parentheses demarcating embedded clauses). Here are some examples:

A simple rule of thumb is: use parentheses whenever you change operators. These two expressions are perfectly unambiguous and do not require parentheses:

The following expression, on the other hand, is incorrect because it is ambiguous: 

Parentheses are required to distinguish the two possible interpretations:

The operands (words or phrases) always match concepts or semantic classes rather than literal words. It is not possible to restrict operands to match words in a document literally.
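Some of the syntax rules above can be checked on the client before a topic is submitted. The following sketch is not the parser Compass uses; it only flags punctuation, unbalanced parentheses, and AND mixed with OR at the same parenthesis level. It assumes NOT acts as a prefix on an operand (as in "drink AND coffee AND NOT latte") and therefore does not count as a change of operator. The function name and messages are hypothetical.

```python
import re

BINARY_OPERATORS = {"AND", "OR"}
MIXED = "AND and OR are mixed at the same level; add parentheses"
PUNCTUATION = "punctuation is not allowed"


def check_topic_expression(expression):
    """Return a list of problems found in a topic specification."""
    problems = []
    # Anything other than word characters, whitespace, and parentheses.
    if re.search(r"[^\w\s()]", expression):
        problems.append(PUNCTUATION)
    # One set of binary operators seen per open parenthesis level.
    levels = [set()]
    for token in re.findall(r"[()]|[^\s()]+", expression):
        if token == "(":
            levels.append(set())
        elif token == ")":
            if len(levels) == 1:
                problems.append("unbalanced closing parenthesis")
            else:
                levels.pop()
        elif token in BINARY_OPERATORS:
            levels[-1].add(token)
            if len(levels[-1]) == 2 and MIXED not in problems:
                problems.append(MIXED)
    if len(levels) > 1:
        problems.append("unbalanced opening parenthesis")
    return problems
```

With this check, "bagel AND (schmear OR cream cheese)" passes, while an illustrative expression such as "cake OR pie AND ice cream" is reported as ambiguous.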

Classifiers/Chain

With multiple classifiers in the project, the default behavior is for the incoming messages to get classified by all of the classifiers independently. But if desired, it is possible to classify messages conditionally by chaining two or more classifiers together. A classifier chain defines a path of classifiers that will process a message depending on its classifications into particular topics. After a message has been processed by one classifier, it may be sent to another classifier.

A classifier chain is defined per classifier: each classifier entry maps topic(s) to their target classifier(s). If a classifier is the target for a topic anywhere in the configuration, it will only process messages that are chained to it.

Classifiers and topics are specified by name. Classifiers must only be specified once, or not at all. A classifier whose name does not appear in the chain specification will classify all messages. Topics not present in the topic list do not receive any chaining, but are still active. The same classifier may be the target for more than one topic. The 'UNCLASSIFIED' topic works as any other topic and may chain to other classifiers.
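To make the routing rules concrete, the sketch below interprets a chain specification on the client side. It assumes the specification has the same shape as the paths parameter of the chain endpoint (classifier name, then topic name, then a single target classifier name); the function names are hypothetical and the server's actual implementation is not documented here.

```python
def chained_targets(paths):
    """Names of classifiers that only process messages chained to them."""
    return {target for topics in paths.values() for target in topics.values()}


def next_classifier(paths, classifier, topic):
    """Return the classifier a message is sent to next, or None."""
    return paths.get(classifier, {}).get(topic)


def entry_classifiers(all_classifiers, paths):
    """Classifiers that see every incoming message (not a target anywhere)."""
    targets = chained_targets(paths)
    return [name for name in all_classifiers if name not in targets]


# Example specification using illustrative classifier and topic names.
PATHS = {
    "Features Classifier": {
        "Display": "Display Sentiment Classifier",
        "Keyboard": "Keyboard Sentiment Classifier",
    },
    "Risk Classifier": {"UNCLASSIFIED": "Scoring Classifier"},
}
```

Here a message classified as "Display" by "Features Classifier" moves on to "Display Sentiment Classifier", while a topic that is absent from the specification receives no chaining.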

POST /projects/<project_id>/p/classifiers/chain/

Create (or update) the chain specification for the classifiers in the project. Only one chain specification is allowed per project.

Here's a code snippet for the paths parameter:

{
    "Features Classifier": {
        "Display": "Display Sentiment Classifier",
        "Keyboard": "Keyboard Sentiment Classifier"
    },
    "Risk Classifier": {
         "UNCLASSIFIED": "Scoring Classifier"
    }
}
 

Response: a created or updated classification chain object with these attributes:

Classify

Classification is accomplished by posting to one of two endpoints - /projects/<project_id>/p/classify or /projects/<project_id>/p/messages. Posting to the /classify endpoint does not save the message presented, whereas posting to /messages does. Both endpoints return in near-real time a response object for each message that will contain a topics element with a list of the topics predicted for the message. Additionally, posting to /messages invokes periodic unsupervised clustering.

POST /projects/<project_id>/p/classify/

Get classification(s) for a message without saving it to the database.

Response: a message object with the topics list filled in; if none of the classifier’s predictions met the threshold, the special label UNCLASSIFIED will be returned. Each entry in the topics list has these four attributes:

Also, if classifier chaining is active, a topic will have a next element that holds the topic assigned by the next classifier in the chain.

For details about submitting messages via the /projects/<project_id>/p/messages/ endpoint, see the Project Messages section.

Project Topics

A topic is a category into which messages are classified. A topic is defined by zero or more clusters. A topic object in the API has the following fields:

GET /projects/<project_id>/p/topics/

Get the list of topics.

Permission required: read

Required parameters: none

Optional query parameters:

Response: paginated list of topic objects 

GET /projects/<project_id>/p/topics/<id>/

Get a topic.

Response: topic object 

[DEPRECATED] GET /projects/<project_id>/p/topics/<id>/messages/

Get the list of messages in a topic.

Response: paginated list of message objects

PUT /projects/<project_id>/p/topics/<id>/

Optional body parameters:

Response: updated topic object

DELETE /projects/<project_id>/p/topics/<id>/

[DEPRECATED] Historical Topics

A read-only list of topics from the project's history, regardless of their status. Objects returned have the same fields as a regular topic.

[DEPRECATED] GET /projects/<project_id>/p/topics_history/

Get a list of topics from the project's history. Without query parameters, all topics are returned in reverse chronological order.

Optional query parameters:

Four query parameters are available in addition to the query parameters for regular topics. They all take a timestamp argument in ISO 8601 format: YYYY-MM-DD(THH:mm:ss)*. All times are UTC.
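A timestamp in this format can be produced from a Python datetime as sketched below. The helper name `api_timestamp` is hypothetical, and treating naive datetimes as already being in UTC is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone


def api_timestamp(moment, include_time=True):
    """Format a datetime as UTC: YYYY-MM-DD or YYYY-MM-DDTHH:mm:ss."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S" if include_time else "%Y-%m-%d")


# Example: 20:30 at UTC-5 is 01:30 on the following day in UTC.
utc_minus_5 = timezone(timedelta(hours=-5))
example = api_timestamp(datetime(2019, 1, 24, 20, 30, tzinfo=utc_minus_5))
```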

Response: paginated list of topic objects

[DEPRECATED] Project Messages

A message is a single unit of text, along with some associated data. Messages are submitted via the API.

A message object in the API has the following fields:

[DEPRECATED] POST /projects/<project_id>/p/messages/

Submit a message or multiple messages for classification, while saving them to the database. To add multiple messages at a time, specify a list of JSON objects instead of a single JSON object (adding multiple messages at a time is probably not possible when using HTML forms instead of JSON).

Required body parameters (on each message):

Optional body parameters (on each message):

Optional query parameters:

Response: message object or list of message objects. If at least one classifier is present, each message object will have its topics list filled in, with one entry for each predicted topic. Each entry in the topics list has these four attributes:

[DEPRECATED] GET /projects/<project_id>/p/messages/

Get the list of messages.

[DEPRECATED] GET /projects/<project_id>/p/messages/<id>/

Response: message object 

[DEPRECATED] DELETE /projects/<project_id>/p/messages/<id>/

Response: (empty)

[DEPRECATED] Project Statistics

[DEPRECATED] GET /projects/<project_id>/p/stats/

Get statistics about the number of messages in the project.

Optional query parameters:

Response: JSON object with two keys:

If the number of buckets requested is greater than the number of buckets that exist, some of the buckets will be null.

Project Message Usage

Message usage objects keep track of the number of messages used by the project.

A message usage object in the API has the following fields:

GET /projects/<project_id>/p/usage/

Optional query parameters:

Response: list of JSON objects, one per month, each with four keys:

GET /projects/<project_id>/p/usage/<id>/

Response: message usage object 

[DEPRECATED] Project Configuration

Note: This area of the API is deprecated. A configuration is a user-modifiable aspect of a project's settings - for example, the number of clusters to make in each partition. Usually the defaults are good and you will not need to change them.

A configuration object in the API has the following fields:

The current configurable options are:

[DEPRECATED] GET /projects/<project_id>/p/config/

Response: JSON object with two keys:

[DEPRECATED] GET /projects/<project_id>/p/config/<key>/

Response: configuration object 

[DEPRECATED] PUT /projects/<project_id>/p/config/<key>/

Response: updated configuration object 

[DEPRECATED] DELETE /projects/<project_id>/p/config/<key>/ 

Response: (empty) 

Accounts

An account is a billable entity and can have account-wide settings (none exist yet).

An account object in the API has the following fields:

GET /accounts/

Get the list of accounts. If you are a site admin, this will include all accounts; otherwise, it will only include accounts that you have permission on.

Response: paginated list of account objects

POST /accounts/

Create a new account.

GET /accounts/<account_id>/

Get an account.

Response: account object

PUT /accounts/<account_id>/

Update an account.

Response: updated account object

DELETE /accounts/<account_id>/

Delete an account.

Response: (empty)

Users

A user represents one person using the system.

A user object in the API has the following fields:

GET /users/

Get the list of users. If you are a site admin, this will include all users; otherwise, it will include yourself and any users who have permission on any account that you have manage permission on.

Response: paginated list of user objects

POST /users/

Create a user.

Response: new user object

GET /users/<email>/

Get a user.

Response: user object

PUT /users/<email>/

Update a user.

Response: updated user object

DELETE /users/<email>/

Delete a user.

Response: (empty) 

PUT /users/<email>/password/

Change a user's password. This also resets the user's API token.

Response: JSON object with three keys:

POST /users/<email>/password/reset/

Reset a user's password to a temporary password. This also deletes the user's API token. When the user next logs in, most actions will not be permitted until the password is changed to something else.

Response: JSON object with one key:

Permissions

Permissions allow users to do certain things to certain accounts. A user can have at most one permission on each account.

A permission object in the API has the following fields:

GET /permissions/

Get the list of permissions on accounts that you have manage permission on.

POST /permissions/

Response: new permission object 

GET /permissions/<permission_id>/

Get a permission object.

Response: permission object

DELETE /permissions/<permission_id>/

Remove a permission.

Response: (empty)