
Creating Mix.nlu models

Use Mix.nlu to build a highly accurate, high quality custom natural language understanding (NLU) system quickly and easily, even if you have never worked with NLU before.

About Mix.nlu

Mix.nlu enables you to author Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) models for your application. Mix.nlu models are deployed from the Mix Project Dashboard. Your application accesses these models with the ASR as a Service gRPC API and the NLU as a Service gRPC API.

Model development workflow

The following steps summarize the workflow to develop, deploy, and iterate on an NLU model and optionally a recognition-only domain language model (DLM):

  1. Create a project: The first step is to create a project in Mix.dashboard. This project contains all the data necessary for building your models.
  2. Develop your model: You then develop your model in Mix.nlu by creating your ontology and adding training samples.
  3. Train your model: Training is the process of having the model learn model parameters based on the training data that you have provided.
  4. Test your model: After you train your model, use the Try panel to test it interactively on sample sentences and tune it.
  5. Build your model: When you make a build, you create a model version, which is a snapshot of your model as it exists now.
  6. Create your application configuration: To use your model in an application, you create your application configuration, which is the combination of the model versions that you want to use in your application (for example, Mix.asr model v2 with Mix.nlu model v3 for project CoffeeMaker).
  7. Deploy your application configuration to an environment that is accessible by your application.
  8. Discover what your users say: Collect feedback on how well your model is performing by viewing how the model handled actual user utterances in the deployed application configuration.
  9. Circle back to step 2, refining the model based on insight from user data.

Open the project in Mix.nlu

To open a project in Mix.nlu:

  1. From Mix.dashboard, select your project in the Projects list.
  2. Click the .nlu icon.

Mix.nlu UI overview

The Mix.nlu UI is divided into three tabs, each providing functionality to help you develop, optimize, and refine your NLU model.

  1. Develop tab: Define the types of things your users will say, create and annotate examples of these sentences, and use these to train and test your model. The Develop tab offers a simpler interface intended for novice users working on smaller projects.
  2. Optimize tab: Offers the same functionality as the Develop tab, plus more advanced tooling to optimize development. The Optimize tab is intended for more advanced users working on larger projects.
  3. Discover tab: For projects with a deployed application configuration, this tab shows recent data on what real users said, with information on how well your model understood them. This gives useful feedback to further refine the model.

Click on one of the icons to enter the tab.

About the Mix.nlu Develop tab

You use the Mix.nlu Develop tab to create intents and entities, add samples, try your model, and then train it.


Multiple language support

Mix.nlu supports multiple languages (or locales) per project. Sample phrases of what your users may say differ from one language to another, so your samples will be different for each language/locale.

To filter the list of samples, select the language code from the menu near the name of your project. (If your project includes a single language, no menu appears.)

For example, this project supports three locales, with en_US currently selected:

multi-lang-select

Mix.nlu also allows you to define different literals for list-type entity values per language/locale. This allows you to support the various languages in which your users might ask for an item, such as "coffee", "café", or "kaffee" for a "drip" coffee. More information on how to do this is provided in the sections that follow.

Develop your model

To develop your model, you:

  1. Add intents to your model. An intent defines and identifies an intended action. An utterance or query spoken by a user will express an intent, for example, to order a drink. As you develop an NLU model, you define intents based on what you expect your users to do in your application.
  2. Add entities to your model. Entities identify details or categories of information relevant to your application. While the intent is the overall meaning of a sentence, entities and values capture the meaning of individual words and phrases in that sentence.
  3. Link your entities to your intents. Intents are almost always associated with entities that serve to further specify particulars about the intended action.
  4. Add samples. Samples are typical sentences that your users might say. They teach Mix how your users will interact with your application.
  5. Annotate your samples. Once you define entities in an ontology, you need to annotate the tokens within the samples so that the machine learns.
  6. Modify intents and annotations. Make any required modifications to your intents and annotations.
  7. Verify samples before training. As a final step, review the verification status of each sample phrase or sentence. This is an essential step that has a direct impact on the accuracy of the data used to create your model(s).

Add intents to your model

An intent is something a user might want to do in your application. You might think of intents as actions (verbs); for example, to order. For more information about intents, see Intents.

To add intents to your model:

  1. In Mix.nlu, click the Develop tab.
  2. On the Intents bar, click the plus (+) icon to add an intent.
    add_project_icon
  3. Type the name of your intent (for example, ORDER_COFFEE) and press Enter.

The intent name is added to the list of intents.

Edit an intent name

To edit an intent name:

  1. In the Develop tab intents list, open the menu for the intent.
  2. Select Edit intent name. You can now edit the text of the intent name.
  3. Make the edits to the intent name.
  4. Press Enter or click the check icon to make the change. If you instead want to cancel the edit and go back to the existing name, press Escape or click the x icon.

Add entities to your model

Entities collect additional important information related to your intent. You might think of entities as analogous to slots or parameters. For example, if the user's intent is to order an espresso drink, entities might include COFFEE_TYPE, COFFEE_SIZE, FLAVOR, and so on.

This procedure describes how to create custom entities, which are the entities that are specific to your Intent. You can also use one of the existing predefined entities, which are entities that have already been defined to save you the trouble of creating them from scratch. Examples of predefined entities include monetary amounts, Boolean values, calendar items (dates, times, or both), cardinal and ordinal numbers, and so on. For more information, see Predefined entities.

To simplify your model, avoid adding a unique entity for each instance of a similar item. Instead, add a single entity that describes a general type of item. For example, instead of defining entities for Cappuccino, Espresso, and Americano, define a single entity named COFFEE_TYPE.

When you add an entity, you specify the following information:

Field Description
Type Specifies the type of entity. Valid values are:
  • List: A list entity has possible values that can be enumerated in a list. For example, if you have defined an intent called ORDER_COFFEE, the entity COFFEE_TYPE would have a list of the types of drinks that can be ordered. See List entities.
  • Relationship: A relationship entity has a specific relationship to an existing entity, either the "isA" or "hasA" relationship. See Relationship entities.
  • Freeform: A freeform entity is used to capture user input that you cannot enumerate in a list. For example, a text message body could be any sequence of words of any length. In the query "send a message to Adam hey I'm going to be ten minutes late", the phrase "hey I'm going to be ten minutes late" becomes associated with the freeform entity MESSAGE_BODY. See Freeform entities.
  • Regex-based: A regex-based entity defines a set of values using a regular expression; for example, to match account numbers, postal (zip) codes, order numbers, and other pattern-based formats (a hypothetical pattern is sketched after this table). See Regex-based.
  • Rule-based: A rule-based entity defines a set of values based on a GrXML grammar file. While regular expressions are useful for matching patterns in text-based input, grammars are useful for matching patterns in spoken user inputs. This type is only available for some users. See Rule-based entities.
Referenced as Defines how the entity can be referred to; for example, whether it is referring to a person (Contact, "him") or a place (City, "there"). These are used for handling anaphoras in dialogs.
Sensitive Indicates whether the entity contains sensitive personally identifiable information. Values assigned to any entity marked as Sensitive at runtime will appear in call logs as a masked value.
Note: This only applies to call logs, not diagnostic logs.
Dynamic (Applies to list entities only) Indicates if the entity is dynamic or not. Dynamic list entities allow you to upload data dynamically at runtime. See Dynamic list entities.
Literals (Applies to list entities only) Lets you enter literals and values. A set of literals is the range of tokens in a user's query that corresponds to a certain entity. With literals, you can specify misspellings and synonyms for an entity's value. For example, in the queries "I'd like a large t-shirt" and "I'd like t-shirt, size L", the literals corresponding to the entity SHIRT_SIZE are "large" and "L", respectively. In both cases, the value is the same. Literals can be paired with values, which are then returned in the NLU interpretation result. For example, "small", "medium", and "large" can be paired with the values "S", "M", and "L". For projects that include multiple languages, you can specify variations per language/locale for an entity value.
See List entities for details.
Note: There is a limit to the number of literals that you can enter. See Limits for more information.
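For illustration only, here is a hypothetical regex pattern (not taken from the product documentation) that a regex-based entity named ORDER_NUMBER might use to match order codes such as "ORD-123456":

ORD-\d{6}

See Regex-based entities for the regular expression syntax that is actually supported.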

To add entities to your model:

  1. On the Entities bar, click the plus (+) icon.
    add_entity_icon
  2. Type the name of the entity (for example, COFFEE_SIZE) and press Enter.
  3. Click the entity to see its details.
  4. Define the entity (see the table above for a description of the fields).

The next step is to link your entities to your intents so that they can be interpreted.

For example, if you have an intent called ORDER_COFFEE that uses the COFFEE_SIZE and COFFEE_TYPE entities, you need to link these entities with the ORDER_COFFEE intent. You also need to link any predefined entities that you want to use.

To link entities to your intents:

  1. On the Intents bar, select the intent.
  2. Click the link entity plus (+) icon and select the entity to link.
    link_entity_to_intent
  3. Repeat for each entity that you want to link to the intent.

Add samples

Samples are typical phrases or sentences that your users might say. They teach Mix how your users think (their mental models) when interacting with your application.

If your project includes multiple languages, be sure to select the appropriate language before you start to enter samples.

multi-lang-select

You can enter a maximum of 500 characters per sample.

In the Develop tab, you can add samples in two ways:

  • Add samples one at a time under a selected intent
  • Import multiple samples at once using file import

Samples can also be added in the Optimize tab.

Add samples one at a time under a selected intent

To add samples:

  1. (As required) Select the language from the menu near the name of the project.
  2. In the Intents area, click the name of the intent.
  3. In the "The user says" field, type a sample utterance and press Enter. For example, "I want a double espresso."
  4. Repeat this procedure as needed to add samples.

Import multiple samples at once using file import

To add multiple samples at once via a .txt file upload:

  1. (As required) Select the language from the menu near the name of the project.
  2. In the Intents bar, click the upload icon. A Data Upload dialog will open.
  3. Use the file picker to select a .txt file containing samples (a hypothetical example file is sketched after these steps). You will then be given two options on how to handle the file:
    • Upload the samples to a specific intent
    • Auto-Intent: Import a set of samples and apply Auto-intent to suggest, for each sample, either an existing intent or a newly detected intent
  4. Select the desired option, and if uploading to a specific intent, select an intent as well.
  5. Click Upload to initiate the upload.
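For reference, the upload file is plain text. Assuming one sample sentence per line (a minimal hypothetical example, not taken from a real project), a file for an ORDER_COFFEE intent might contain:

I want a double espresso
Can I get a large cappuccino
One small latte with no sugar please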

Samples uploaded to a specific intent are attached to that intent. You will want to go in and add annotations after uploading.

Samples uploaded with Auto-intent applied are initially added as UNASSIGNED_SAMPLES, with the identified intents provided only as suggestions. Go to the Optimize tab to view the suggested intents and accept or discard them. See Apply automation for more details.

The more samples you include for each intent, the better your model will become at interpreting.

For optimal machine learning, samples should be based on data of real-world usage. For additional details on importing samples, see Import data. Also see Creating and Annotating Datasets for Optimal Accuracy.

Note on samples and contractions

Contractions are common in a number of languages, in particular in many European languages like English, French, and Italian. A contraction is a shortened version of a word or group of words combined together by dropping letters and joining with an apostrophe. For example, he's and didn't in English, c'est and l'argent in French, and c'è and l'estratto in Italian.

When sample sentences are added to Mix, whether via import or by typing the sentences in the Develop tab under an intent, the sample sentence is tokenized — broken up into individual tokens (individual units of meaning, usually words) that can be marked up with annotations.

For some languages, the tokenization may work differently than you might expect when it encounters contractions that use an apostrophe. Sometimes the tokenization splits a contraction at the apostrophe, so that the first part, the apostrophe, and the second part appear as separate tokens.

There is not currently a workaround for this, but be aware that you may see this behavior in some cases.

Edit the sample text

To edit the text of a sample:

  1. Open the menu for the sample.
  2. Select Edit sample.
  3. Make the edits to the sample text.
  4. Press Enter or click the check icon to make the change. If you instead want to cancel the edit and go back to the existing text, press Escape or click the x icon.

Annotate your samples

The final step in developing your training set is to annotate the literals in your samples with entities and tag modifiers.

This will help your model learn to not only interpret intents, but also the entities related to the intents.

Annotated sentence example

As a simple example, consider the following sentence for an intent ORDER_COFFEE:

I want a large cappuccino.

Suppose that this intent has two linked entities, COFFEE_SIZE and COFFEE_TYPE. You can annotate this sample sentence to indicate which entities correspond to what literals. You could annotate the sample as follows:

I want a [COFFEE_SIZE]large[/] [COFFEE_TYPE]cappuccino[/]

Here, the word large is annotated with the COFFEE_SIZE entity and cappuccino is annotated with the COFFEE_TYPE entity.

Annotation use cases

Be aware that some of the details of annotation will depend on whether you are:

  • Annotating tokens with no previous annotations
  • Annotating previously annotated tokens

More details are available in the sections below.

Selecting tokens

To annotate a sample, you first need to select the relevant tokens in the sample that you want to annotate. Note that a literal can potentially span multiple consecutive tokens, for example, "United States of America". Click on the first and last words for the literal. This highlights and brackets the span of words you want to label. It also opens an entity selection menu to select an entity label.

If you make a mistake and need to deselect and start again, simply click anywhere on the screen. Once you have finished selecting the relevant tokens, select the appropriate entity from the menu to apply the annotation.

Annotating tokens with no previous annotations

If you are annotating a previously un-annotated span of tokens, you can choose an entity from one of two sources in the entity selection menu:

  1. From a list of entities that have already been linked to the present intent. If any entities have already been linked, these will appear at the top of the list in the menu.
  2. From one of the other user-defined or predefined entities available in your project, using Link Entity.
    1. Select Link Entity from the menu.
    2. Select Custom Entities to browse the list of user-defined entities, or Predefined Entities to browse the list of predefined entities.
    3. Select an entity to complete the annotation. This entity will also be linked to the present intent.

Annotating previously annotated tokens

If you try to annotate a span of text that has already been annotated with an entity, the Link Entity option will be unavailable.

Generally, you will also not be able to annotate that span of text with any of the other entities linked to the intent. The exception to this is if a hierarchical relationship (hasA) entity has already been linked to the intent, and the entity for the annotated text is either the inner or outer part of that relationship. In that case the other entity will be available in the list of entities and you will be able to annotate over or within the same text.

For example, suppose your intent has a linked entity FULL_NAME, which is a hasA relationship entity containing two inner entities GIVEN_NAME and FAMILY_NAME. Suppose you have a sample with the following partial annotation:

Notify [FULL_NAME]John Anderson[/].

You will still be able to annotate within this span of text to annotate John with GIVEN_NAME and Anderson with FAMILY_NAME.
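For illustration, using the same bracket notation as the other examples in this section (a sketch of the resulting annotation, not a screenshot from the product), the fully annotated sample might then look like this:

Notify [FULL_NAME][GIVEN_NAME]John[/] [FAMILY_NAME]Anderson[/][/].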

You can also still apply tag modifiers, as applicable.

Tag modifiers

A tag modifier modifies or combines entities using a logical operator AND, OR, or NOT.

AND and OR modify two instances of the same entity type to represent one entity value and/or the other. NOT modifies one entity to represent not selecting that entity.

To add AND, OR, or NOT tag modifiers to your annotation, first annotate the entities you want to modify. Then select the entities to modify by clicking the first annotation and then clicking the last annotation. Select Tag Modifier and the appropriate modifier from the entity selection menu.

For example, consider the following partially annotated sentence:

I want a [COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/]

To annotate with the AND modifier, click the annotation for cappuccino and then the annotation for latte to select both as well as any tokens in between. With the span encompassing both COFFEE_TYPE annotations selected, choose the AND modifier in the Tag modifier sub-menu. The AND modifier is added, wrapping the two COFFEE_TYPE annotations:

I want a [AND][COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/][/]

Annotating with an OR modifier is similar.

To understand how to annotate with a NOT modifier, consider the following partially annotated sentence:

I would like a [COFFEE_SIZE]large[/] [COFFEE_TYPE]coffee[/] with no [SWEETENER]sugar[/].

Here you want to add a NOT annotation to the sample to help your model distinguish between asking for a sweetener and asking specifically not to include one. Click the word no and the SWEETENER annotation to select both, and then choose NOT from the Tag modifier sub-menu. The NOT modifier is added:

I would like a [COFFEE_SIZE]large[/] [COFFEE_TYPE]coffee[/] with [NOT]no [SWEETENER]sugar[/][/].

For information on verifying the status of samples, see Verify samples.

Modify intents and annotations

Mix.nlu provides various ways to modify the intents and annotations that you have added.

Fix incorrect samples

If you make typos while adding samples, or if some samples were not transcribed correctly, you should fix them to make sure that they correspond to what users actually said. This builds a better model.

To fix an incorrect sample:

  1. Click the ellipsis icon beside the sample that you want to edit and click Edit.
  2. Correct the text as appropriate.
  3. Click the checkmark to save your changes.

Edit or remove annotations

To change an entity that annotates a sample:

  1. Click the entity annotation in the sample, then click Remove.
  2. Click the literal and choose the new entity.

Change intent

To assign a sample to a different intent, use the Move selected Samples dialog. When moving sample sentences, you can choose to also move or delete any annotations that you've made.

To assign a sample sentence to a different intent:

  1. Click the ellipsis icon beside the sample and click Change Intent.
  2. In the Move selected Samples dialog, select an option for moving your selected sample: to use an existing intent, or to create a new one.
  3. Click Next.
  4. Either:
    • Choose an existing Intent: Choose another intent, NO_INTENT, or UNASSIGNED_SAMPLES.
    • Create a new Intent: Enter a name for the new intent.
  5. Click Next.
  6. Choose to import or remove annotated entities. (This step is not available when moving samples to UNASSIGNED_SAMPLES.)
  7. Click Next.
  8. To confirm, click Finish.

Use the check boxes (or the select-all check box above the list of samples) to move multiple samples.

Assign NO_INTENT

Sometimes an entity applies to more than one intent or, to look at it another way, an entity can mean different things depending on the dialog state. Rather than add this entity to multiple intents, it's best to use NO_INTENT.

Consider these two example interactions. The first one is in the context of booking a meeting.

User: Create a meeting
System: For when?
User: Tomorrow at 2

This second example is in the context of booking a flight.

User: Book flight to Paris
System: For when?
User: Tomorrow at 2

In each of these interactions, there is a clear intent in the user's first statement, but the second utterance on its own has no clear intent.

In this case, it's best to tag "Tomorrow at 2" as [nuance_CALENDARX]Tomorrow at 2[/] to cover both scenarios (and not as [MEETING_TIME]Tomorrow at 2[/] or [FLIGHT_DEPARTURE_TIME]Tomorrow at 2[/]).

As shown in the examples, often these words or phrases are fragments and are used in a dialog as follow-up statements or queries.

NO_INTENT can also be used to support the recognition of global commands like "goodbye," "agent" / "operator," and "main menu" in dialogs. For more information, see configure global commands in the Mix.dialog documentation.

Verify samples before training

Before generating models, verify your training sample data. This step involves reviewing each sample phrase or sentence for intents and entities and ensuring that they have been assigned the correct status. It also involves confirming which samples to include in the training set for the model, and which to exclude.

This process improves your model's accuracy.

Verification of the sample data needs to be carried out for each language in the model, and for each intent.

Open and view samples by language and intent

To get started, open up the set of sample sentences for the language and intent.

  1. Open the Develop tab.
  2. (For multi-locale projects) Select the locale from the menu near the name of the project.
  3. Click an intent to view the samples.

Overview of verification states

Samples can be in the following verification states:

Icon State Description
intent-assigned Intent-assigned A half-filled circle icon indicates that the sample has been assigned an intent.
For example, via .txt or TRSX file upload, by adding a sample using Try, or by manually adding a sample phrase or sentence to an intent in the Mix.nlu UI.
Sample may or may not be annotated.
Impact of this state on the model: Samples assigned this state will only be used to detect the intent. The data provided by this sample will not be used to detect the presence of Entities.
annotation-assigned Annotation-assigned A filled-circle icon indicates that the sample has been assigned an intent and annotation is complete.
Sample can be annotation-assigned via TRSX file upload or in the Mix.nlu UI.
Sample may or may not be annotated.
Impact of this state on the model: Samples assigned this state are used to detect the intent as well as any annotated entities. If such a sample contains a literal that appears in an entity but is not annotated, it will be used as a "counter example" for that entity; that is, it will lower the chance of such entity literals being detected.
excluded Excluded A "pause" icon indicates that the sample, although assigned an intent, is to be Excluded from the model.
Sample can be Excluded in the UI or via TRSX file upload.
Sample may or may not be annotated.
Impact of this state on the model: Samples assigned this state are Excluded.

Samples assigned to UNASSIGNED_SAMPLES, either via .txt or TRSX file upload or manually in the UI, do not have a status icon. These samples contain no annotations and are excluded from the model.

Display status information

By default, status information is not displayed. To see the status information, click the Status visibility toggle above the samples on the right.

verify_status_toggle

Status icons will then appear to the left of the sample items (or on the right for samples in right-to-left scripts).

Additional display toggles appear in the same area as the Status visibility toggle.

Exclude or include samples

You can exclude a sample from your model without having to delete and then add it again. By default, new samples are included in the next model that you build. By excluding a sample, you specify that you do not want it to be used for training a new model. For example, you might want to exclude a sample from the model that does not yet fit the business requirements of your app.

To exclude a sample, click the ellipsis icon beside the sample and then choose Exclude.


An excluded sample appears with gray diagonal bars and the status icon changes to indicate it is excluded.

You can still modify the excluded sample. Any annotations that were attached to the sample before it was excluded are saved in case you want to re-include it later.

To include a previously excluded sample, either use the ellipsis icon menu or click on the status icon. The sample is restored to its previous state with any previous intent and annotations restored.

Change the status of a sample

When you start annotating a sample assigned to an intent, its state automatically changes from Intent-assigned to Annotation-assigned. This signals to Mix.nlu that you intend to add the sample to your model(s). You can always choose to assign a different state to the sample; for example, to exclude it (change the state to Excluded) or to use it to detect intent only (change to Intent-assigned).

To change the status of a sample, hover over the status icon and click. This will allow you to change the state from Intent-assigned to Annotation-assigned or vice-versa.

Filter displayed samples by status

When there are a lot of samples for an intent, you may want to filter the displayed samples by status. To do this, open the drop-down menu next to the status visibility toggle to choose the status to display.


Bulk operations

To change the verification state of multiple samples at once, use the check boxes (or the select-all check box above the list of samples) to choose multiple samples. Select the appropriate icon from the row above the samples to include or exclude samples, assign them as Intent-assigned, or assign them as Annotation-assigned. You can also choose to remove the selected samples or move them to another intent.
The general idea is that bulk operations apply to all selected samples, but there are operation-specific particularities you should be aware of.

Operation Notes on behavior
Exclude Already excluded samples will stay as-is. Intent-assigned and Annotation-assigned samples will be excluded, but the previous state, including any assigned intent and annotations, will be remembered in case you want to re-include the sample.
Include Already included samples will stay as-is. Previously excluded samples will be re-included with the same verification state as they had before being excluded.
Intent-assigned Excluded samples are not impacted and stay excluded.
Annotation-assigned Excluded samples are not impacted and stay excluded.

Only visible samples can be selected for mass status change, that is, samples that have not been filtered from the view.


Train your model

Training is the process of building a model based on the data that you have provided.

If your project (or locale) contains no samples, you cannot train a model. You need at least one sample sentence that is either intent-assigned or annotation-assigned. Be sure to verify samples.

Developing a model is an iterative process that includes multiple training passes. For example, you can retrain your model when you add or remove sample sentences, annotate samples, verify samples, include or exclude certain samples, and so on. When you change the training data, your model no longer reflects the most up-to-date data. As this happens, the model must be retrained to enable testing the changes, exposing errors and inconsistencies, and so on.

Training a model

To train your model:

  1. In Mix.nlu, click the Develop tab.
  2. (As required) Select the locale from the menu near the name of the project.
  3. Click Train Model.

Mix.nlu trains your model. This may take some time if you have a large training set. A status message is displayed when your model is trained.

To view all status messages (notifications), open the Console panel by clicking its icon.

Training a model that includes prebuilt domains

If you have imported one or more prebuilt domains, the Train Model button lets you choose whether to include your own data, the prebuilt domains, or both. Since some prebuilt domains are quite large and complex, you may not want to include them every time you train your model.

To train your model to include one or more domains:

  1. Click the arrow beside Train Model.
    The list of prebuilt domains is displayed in addition to your own data.
    In the example below, the Nuance TV and Nuance Weather prebuilt domains have been imported into the project:
  2. Check the domains you want to include.
  3. Check My data to include your data.
  4. Click Train Model.

Training error log

Training error log example

Sometimes training will result in errors that cause the training to fail. In this case, Mix.nlu provides information on the errors as a downloadable CSV log file.

A download link appears next to the Train Model button. Click to download the CSV file. The file includes one line for each error produced during the attempt to train the model.

Download error log

Test it

After you train your model, you can test it interactively in the Try panel. Use testing to tune your model so that your client application can better understand its users.

The Try panel is available in both the Develop and Optimize tabs.

Try to interpret a new sentence

To test your model:

  1. In Mix.nlu, click the Develop tab.
  2. (As required) Select the language from the menu near the name of the project.
  3. Click Try. The Try panel appears.
  4. Enter a sentence your users might say and press Enter.

Read and understand the results

The Try panel presents the response from the NLU engine.

The Results area shows the model's highest-confidence interpretation of the sentence. In the image here, the Results area displays the orderCoffee intent with a confidence score of 1.00. The Results area also shows any entity annotations the model has been able to identify.

Note that the Results area will not reflect any of the changes you have made to intents and entities since the last time you trained the model.

No annotations appear in the Results area if the NLU engine cannot interpret the entities in your sample using your model. Also, there is no annotation for dynamic list entities. Only your client application can provide this information at runtime.

Full information from the NLU engine, including all interpretations, appears formatted as a JSON object. For more information on the fields in an interpretation, see the reference section: Interpretation.
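As a rough illustration only, an interpretation for "I want a large cappuccino" might resemble the JSON below. The field names and structure shown here are simplified assumptions; see the Interpretation reference for the authoritative schema returned by the NLU engine.

{
  "literal": "I want a large cappuccino",
  "intent": "ORDER_COFFEE",
  "confidence": 1.0,
  "entities": {
    "COFFEE_SIZE": { "literal": "large" },
    "COFFEE_TYPE": { "literal": "cappuccino" }
  }
}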

Add the sentence to the training set

If you are unsatisfied with the result in Try, you can add the sentence to your project as a new sample and then manually correct the intent or annotations. Realistic sentences that the model understands poorly are excellent candidates to add to the training set. Adding correctly annotated versions of such sentences helps the model learn, improving your model in the next round of training.

To add a sentence you have just tested, click Add Sample. The sample will be added to the training set for the intent identified by the model, along with any entity annotations the model recognized.

If Try recognized an intent, but no entities, the new sample will be added as Intent-assigned.

If Try also recognized entities, the new sample will be added as Annotation-assigned.

If the same sentence is already in the training set with the same annotations, the count will be updated for that sentence. If the same sentence is already in the training set, but with different annotations, then to maintain consistency in the training set you will not be able to add the sample from Try.

Correct errors in the interpretation

Once the sample is added into the training set, make corrections to the intent and annotation labels to help the model better recognize such sentences in the future.

If the recognized intent was incorrect, change the intent.

If the annotated entities were incorrect, edit the annotation.

Roll out your model

Now that you have developed, trained, and tested out your model, you are ready to roll out the model and the project. This way, users can interact with it via an application and you can see how well your application works "in the wild".

To do this, you need to:

  1. Build your model.
  2. Create your application configuration.
  3. Deploy your application configuration.

Discover what your users say

Now that your model is ready, and rolled out in an application, you can look at what people say or type while using your application. These samples from users appear in the Discover tab, along with information about the origin of the samples and how your model interpreted each sample. You’ll review them there, then add the ones you want directly into your intents in your training set to improve and grow your model.

To open the Discover tab for a project:

  1. From the Mix Dashboard, select a project with a deployed application configuration.
  2. Click the .nlu icon to open Mix.nlu.
  3. Select the Discover tab.

View Discover data

When you first open the Discover tab, there will be no data displayed, and you will be prompted to select a source of data to display.

discover-load-samples

To access data for an application configuration within the Discover tab:

  1. Use the selectors at the top of the tab to identify the source and time range from which to pull data. Select the application, associated context tag, environment, and date range using the selectors. By default, the date range covers the past seven days, but you can choose a custom date range using either a start and end date, a number of days, or one of the available preset range options.
    NOTE: The start date can be no more than 28 days prior to the current date.
    select-source-and-date-range
  2. Click Load Samples.

Mix.nlu will look for user sample data from the specified source and time frame. If there is data from the application in the selected time frame available to retrieve, it will be displayed.

If there is no applicable data, you will see a no-samples screen instead.

no-samples

Refresh Discover data

To refresh the displayed data for the same application configuration and date range, click the Load Samples button again.

Discover tab contents

Within the Discover tab, you can view information on speech or text input from application users. The information is presented in tabular format, with one row for each sample.

Here is more detail about the contents for each column in the table.

Column Description
Intent The intent identified by the model for the user input.
If the model determines that the sample does not seem to fit any of the expected intents, it will show NO_MATCH. NO_MATCH cases can help you identify intents that were not considered before but which are important to users. These can be added to refine and improve the model.
Samples The content of the user input, as text. The sample may include annotations attached by the model if (1) the model identified an intent, (2) the identified intent has entities defined, and (3) the model confidently identified entity values in the sample.
Note: For entities marked as sensitive in the model underlying the application, the information will show up as ****redacted****.
Score The model’s level of confidence in the inferred intent, as a decimal between 0.00 and 1.00.
Collected on Date and time the input was collected in your time zone.
Region Deployment region where the user interaction occurred.

If there is a lot of user data, the data is presented in pages.

You can sort the rows by the values of the Intent, Score, Collected on, or Region columns. Click on the column title to sort. By default, the data is sorted on the "Collected on" column to show the data in chronological order. Clicking on a column header a second time will sort on that column in the opposite order.

Invalid intents and entities

If you have changed the model ontology since last deploying your application configuration, and these changes impact the intents and/or entities interpreted for the samples, this is flagged in the table contents to remind you that the interpreted results are based on an outdated version of the model.

Intents and entities within the table will be visibly flagged with an orange marker if the intent or entity inferred by the application is no longer in the model ontology in Mix.nlu.

discover-intent-invalid

discover-entity-invalid

Filtering displayed data

As the usage of your application ramps up, and you get multiple pages of user data, the amount of recent data displayed in Discover can become difficult to make sense of.

The Discover tab provides filters to help reduce the displayed samples down to a smaller subset of samples. To do this, use the filter panel beside the table.

discover-filters

You can filter the samples on several dimensions, including Intents and Entities.

For Intents and Entities, you can select multiple items to include in each filter by clicking the available checkboxes. Click once on a checkbox to select and a second time to deselect.

Filters for which at least one selection has been made are marked with a blue dot. When you select the first item, the filter value is displayed on the filter label. If you select more than one item, a simple count of how many are selected out of the total number of options is displayed.

Within the Intents and Entities filters, you can click Select All to check all the checkboxes; this makes it easy to include everything except a few items by selecting all and then deselecting the specific items you don't want to see. Clear All unchecks all the checkboxes for a filter.

Once you have chosen the filters you want to apply, click Apply in the filters header. The data displayed in the table will update to show only data corresponding to the filter values.

Clicking Clear all in the filters header resets the selections in the filters to their original defaults and displays all samples.

You can hide the filter panel to free up space as needed and open it again to go back.

Change the intent for a sample

You can change the intent for a sample to one of the intents that are currently in the model ontology. This is useful if the model version used in the application interpreted the sample as an intent that is no longer in the model. This could happen, for example, if you have recently refactored your ontology.

To change the intent for a sample, open the intent menu and select the desired intent.

You can choose either one of the existing intents, or UNASSIGNED_SAMPLES.

Change intent

The sample will be labeled with the updated valid intent, and the intent column will be marked with a blue dot to indicate that the intent has been updated.

Hovering over the dot will reveal a tooltip indicating the originally inferred intent.

Original intent rollover

Add samples to the training set

From the Discover tab, you can add selected samples for valid intents directly to the training set.

There are two options available for this:

  • Add an individual sample
  • Add multiple samples with bulk add

Samples can be added to the training set under one of three verification states: Intent-assigned, Annotation-assigned, or Excluded.

Whether you add samples individually or in bulk, note that once a sample has been imported to the training set, the sample also remains in Discover.

Add an individual sample

To add a sample with a valid intent to the training set:

  1. Click the add sample icon to open the add menu.
  2. Select one of the verification state options from the menu to add the sample to the training set with the chosen verification state.

Add single sample

Add multiple samples with bulk add

To save time adding multiple samples from Discover to your training set, you can select multiple samples at once for bulk import, and then add the samples to the training set in a chosen verification state.

Checkboxes are provided beside each sample to select the samples. A checkbox in the header above the samples allows you to select all selectable samples on the current page.

A bulk add samples button in the header allows you to choose the target verification state for the selected samples.

To bulk add a selection of samples:

  1. Use the checkboxes to select samples.
  2. Select the desired state for the samples in the bulk actions bar above the samples.

Add multiple samples

Download bulk add errors data

When bulk adding multiple samples, errors and warnings may be produced. A pop-up appears when a bulk add is completed, summarizing the results of the operation, including any errors and warnings. To read the detailed error logs, you can download an error log file in CSV format: click the Download Logs button displayed in the pop-up.

Bulk add download error logs

Download Discover data

You can download the currently selected data from the Discover tab as a CSV file. This includes, for each sample, any entity annotations identified by the model and displayed in Discover.

If filters are currently applied, only the filtered data will be downloaded.

To download the sample data as CSV, click on the download icon download-data above the table. You can then process the CSV data externally into a format that can be imported into Mix.nlu. For more information about importing data into a model, see Importing and exporting data.

Iterating your model

Using the insights gained from the Discover tab, you can refine your training data set, build and redeploy your updated model, and finally view the data from your refined model on the Discover tab. Rinse and repeat! You can improve your model (and your application) over time using an iterative feedback loop.

Optimize model development

The Optimize tab is a feature intended for advanced power users.

It provides advanced automation tools to help make it more efficient to develop larger or more complex projects and perform more sophisticated work on your NLU models.

For users new to Mix.nlu, the Develop tab is the best place to start developing models. The Develop tab is more appropriate for smaller DIY projects.

Optimize tab overview

Visible at the top of the screen are the Train Model button and the Try panel.

The Train Model button initiates training using the training data samples.

The Try panel, as in the Develop tab, allows you to interactively test the model by typing in a new sentence.

Sample Sentences panel

The Sample Sentences panel gives a unified view of all samples in the project for the currently selected language, of all intent types and all verification statuses.

The Optimize tab also gives a unified set of controls to perform operations on samples, whether for a single sample, or a chosen set of samples.

The data is displayed in a table, with one row for each sample and with data displayed for the following columns:

Column Description
Intent Intent type for the sample. This can have one of the following values:
  • A user-defined Intent type
  • NO_INTENT
  • UNASSIGNED_SAMPLES
  • A new intent suggestion coming from an Auto-intent run
An intent menu in this column for each sample allows you to change the intent for the sample, either to an existing intent or a new intent created on the fly.
Status Indicates the sample status with an icon. This includes the same values used in the Develop tab.
  • Excluded
  • Intent-suggested: Sample with intent in a suggested state pending acceptance of Auto-intent suggestion by user
  • Intent-Assigned
  • Annotation-assigned
Note: UNASSIGNED_SAMPLES do not have a verification status, and appear in this column with a dash.
Sample The text of the sample, along with any already assigned entity annotations, as well as:
  • Checkbox selector to select multiple samples for bulk operations.
  • Ellipsis menu to perform actions on the individual sample.
  • Count indicator (optional) showing the number of times the exact sample appears in the corpus. You can also increase or decrease the number of appearances.
Note: Counts and annotations can be toggled on and off using the controls in the sample column header.

The data in the table can be sorted by column values: click a column header to sort the samples by that column, and click again to sort in the opposite order.

Sample status progress bar

A progress bar above the data table gives a visual sense of what proportion of the sample data has been processed through to Annotation-assigned, and is thereby ready to use for training a model.

Visibility toggles

The header bar above the Sample contents column has toggles to control the visibility of:

  • Annotations: show or hide entity annotations in the displayed samples
  • Counts: show or hide the count indicators for duplicate samples
  • Personal Data: show or hide personally identifying information (PII) in the displayed samples

sentence visibility toggles

Filter displayed samples

By default, the Optimize tab displays all samples.

To filter the samples down to a smaller subset, use the filter panel beside the table. You can filter the samples on several dimensions, including Intents and Entities.

filters

Multiple items to include can be selected in the Intents and Entities filters by clicking the available checkboxes. Click once on a checkbox to select and a second time to deselect.

Filters for which at least one selection has been made are marked with a blue dot. When you select the first item, the filter value is displayed on the filter label. If you select more than one item, a simple count of how many are selected out of the total number of options is displayed.

filters

Within the Intents and Entities filters, you can click Select All to check all the checkboxes; this makes it easy to include everything except a few items by selecting all and then deselecting the specific items you don't want to see. Clear All unchecks all the checkboxes for a filter.

Once you have chosen the filters you want to apply, click Apply in the filters header. The data displayed in the table will update to show only data corresponding to the filter values.

Clicking Clear all in the filters header resets the selections in the filters to their original defaults and displays all samples.

You can hide the filter panel to free up space as needed and open it again to go back.

Apply automation

The Select Automation menu appears in the actions bar above the samples. It provides options for automating basic tasks of grouping and annotating samples: Auto-intent (applies to unassigned samples) and Auto-annotation (applies to samples that are Intent-assigned but not yet Annotation-assigned). Additional automations may be added in future releases.

Auto-intent

Auto-intent performs an analysis of UNASSIGNED_SAMPLES, suggesting intents for these samples.

Each previously unassigned sample is tentatively labeled with one of a small number of auto-detected intents present within the set of unassigned samples.

If a sample is recognized as fitting the pattern of an already defined intent, Auto-intent suggests this existing intent.

Groups of samples that appear related, but which do not appear to fit the pattern of an existing intent are labeled generically as AUTO_INTENT_01, AUTO_INTENT_02, and so on.

Run Auto-intent on UNASSIGNED_SAMPLES

To run Auto-intent:

  1. Choose Select Automation from the actions bar above the table.
  2. Select the Auto-Intent checkbox.
  3. Click Run Automation.

auto-intent

This initiates the Auto-intent process. When the run is finished, it returns a suggested intent classification for each previously unassigned sample.

Accept or discard Auto-intent suggestions

When the Auto-intent operation completes, you can view the suggestions. Initially, these suggestions are tentative, and from a verification perspective, they are in the status Intent-suggested. No intent is yet assigned. You can next choose to accept or discard the suggestions.

auto-intent

Clicking the checkmark icon accepts the suggestion, while clicking the x icon discards the suggestion. For a sample with a suggestion for an existing intent, accepting the suggestion assigns the sample to that intent and moves the sample from Intent-suggested to Intent-assigned. Discarding the suggestion moves the sample back to UNASSIGNED_SAMPLES. A toast icon will be displayed to confirm your choice has been applied.

Rename a newly identified intent

For a sample assigned to a newly detected intent (AUTO_INTENT_01, AUTO_INTENT_02, and so on), you are prompted to rename the intent to a meaningful name when you accept the suggestion.

auto-intent

Enter a new name in the text field provided and press Enter.

Three things happen when you do this:

auto-intent

Auto-annotation

Auto-annotation is a feature that works on un-annotated samples (Intent-assigned but not Annotation-assigned) for a specified intent. Working within the selected intent, Auto-annotation attempts to identify any instances of entities associated with the intent and labels them accordingly.

Run Auto-annotation

To run Auto-annotation:

  1. Choose Select Automation from the action bar above the table.
  2. Select the Auto-Annotation checkbox.
  3. Choose an intent from the list of existing intents on which to apply Auto-annotation.
  4. Click Run Automation.

This initiates the Auto-annotation process. When the run is finished, it returns a suggested entity annotation for each previously un-annotated sample in the selected intent.

Accept or discard Auto-annotation suggestions

When the Auto-annotation operation completes, you can view the suggestions. Initially, these suggestions are tentative, and from a verification perspective, they are in the status Annotations-suggested. No annotations are yet assigned. You can next choose to accept or discard the suggestions.

Clicking the checkmark icon accepts the suggestion, while clicking the trash icon discards the suggestion. A toast icon will be displayed to confirm your choice has been applied.

Add multiple samples to an intent

A Samples editor provides an interface to create and add multiple new samples in one shot. This serves as a faster way to create new samples.

Samples are added as plain text without annotations. Individual samples can have up to a maximum of 500 characters. You can add up to 100 samples at one time using this editor.

To add samples:

  1. Select Sample from the actions bar above the table. An editor will launch with multiple lines to type in samples.
    add samples
  2. Use the Select Intent dropdown to choose the intent to which you want to add new samples.
    choose intent
    You can also select instead to apply Auto-intent to the new samples.
  3. Enter samples in the editor. There are a few ways to do this:
    • Type in a sample and press the Tab or Enter key or click the next line to enter another sample.
    • Copy-paste a list of samples from a word processor or other text editor. The samples need to be separated with hard or soft returns in the source for the editor in Mix to correctly divide them into separate samples. The samples will appear in the editor on separate lines.
  4. Repeat as needed until you have entered all the samples you want to add.
  5. Once you have added your samples, click Submit to add the samples.

If you chose an intent for the samples, the new samples should now appear in Optimize and in Develop under the intent. You can annotate the samples in either of these tabs.

If you chose to apply Auto-intent to the samples, the samples will appear in the table of samples with intent suggestions. You can then proceed to rename any newly detected intents, accept or discard the suggested intents, and annotate the samples.

Find and replace

Find and Replace fields in the samples actions bar above the table allow you to do a substring search, or a search and replace, on the entire training set. Regex patterns can also be used for the search.

Perform find and replace

To find samples matching a string or pattern (and, if desired, replace the matches):

  1. Click the Find field and type a search string or a regex pattern.
  2. If you want to do a replace on samples that match, click the Replace field and type in replacement text.
  3. Press Enter.

Samples containing the search substring or matching the regex pattern will be displayed, and if replacement text was provided, the matches will be replaced with it.
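As a hypothetical example, to correct a recurring misspelling across the training set, you could enter:

Find: \bexpresso\b
Replace: espresso

Samples containing "expresso" would then be displayed, with the matches replaced by "espresso".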

Update individual samples

You can perform several actions on individual samples, such as editing the sample text, changing the intent, excluding or including the sample, adjusting its count, and removing it.

The controls and behavior for individual sample operations are mostly the same as those in the Develop tab.

Change sample intent in intent menu

An intent menu available in the Intent column of each sample allows an alternate means to change the intent for a sample.

To change the sample intent to an existing intent:

  1. Click to open the intent menu.
    change-intent-dropdown
  2. Select a new intent for the sample. There are multiple ways to do this:

    • Scroll through the list of existing intents, and find the intent you want.
    • If there are a lot of intents in your project, you can also use the search field to track down the intent you want more quickly.
  3. Click on the intent name to select the intent.

Sometimes, you may realize that the sample does not fit any of the existing intents. In this case, you can create a new intent directly in the menu. With the intent menu open:

  1. Type in a new intent name in the search field. You will see no results in the search field and will be prompted to add the intent.
    name-new-intent
  2. Click the add icon in the intent menu.

In both cases, the Move Samples menu will open to allow you to move the sample to the new intent and decide how you want to deal with any entities in the sample.

Perform bulk operations

As in the Develop tab, you can perform bulk operations on a selected subset of multiple samples at the same time. The behavior for bulk operations in the Optimize tab is the same as for bulk operations in the Develop tab, as described in verify samples before training. The only difference is that in the Optimize tab, operations can be carried out on samples from more than one intent at once.

bulk-operations

Select a subset of samples using the checkbox selectors on each row, or choose all samples in the current filter view using the checkbox selector in the columns header.

Once you have selected the subset of samples, click an icon on the header bar to apply one of the available operations: include or exclude the samples, assign them as Intent-assigned or Annotation-assigned, move them to another intent, remove them, or accept or discard pending Auto-intent suggestions.

The icons for accepting and discarding selected samples will only be active if at least one of the selected samples has a pending auto-intent suggestion.

Clicking the bulk accept icon opens a window summarizing the selected samples with samples grouped by suggested intent. For newly identified intents, you need to choose a global rename for the intent. Only once all newly identified intents have been renamed can you click to accept the suggestions.

  • Auto-intent (applies to unassigned samples)
  • Auto-annotate (applies to samples that are Intent-assigned but not Annotation-assigned)

Ontology

In natural language understanding, an ontology is a formal definition of entities, ideas, events, and the relationships between them, for some knowledge area or domain. The existence of an ontology enables mapping natural language utterances to precise intended meanings within that domain.

In the context of Mix.nlu, an ontology refers to the schema of intents, entities, and their relationships that you specify and that are used when annotating your samples, and interpreting user queries.

Intents

An intent identifies an intended action. For example, an utterance or query spoken by a user expresses an intent to order a drink. As you develop an NLU model, you define intents based on what you want your users to be able to do in your application. You then link intents to functions or methods in your client application logic.

Here are some examples of intents you might define:

Intents are often associated with entities to further specify particulars about the intended action.

Entities

An entity is a language construct for a property, or particular detail, related to the user's intent. For example, if the user's intent is to order an espresso drink, entities might include COFFEE_TYPE, FLAVOR, TEMPERATURE, and so on. You can link entities and their values to the parameters of the functions and methods in your client application logic.

If an entity applies to a particular intent, it is referred to as a relevant entity for that intent. The idea of relevant entities is important:

Mix.nlu supports the following user-defined entity types: list (including dynamic list), relationship (isA and hasA), regex-based, rule-based, and freeform. Each of these is described in the sections below.

Mix.nlu also supports two classes of predefined types: general predefined entities (prefixed with nuance_) and dialog predefined entities.

Mix.nlu also provides some mechanisms to modify, combine, and refer to the existing types: tag modifiers and anaphoras.

List entities

A list entity has possible values that can be enumerated in a list. For example, if you have defined an intent called ORDER_COFFEE, the entity COFFEE_TYPE would have a list of drink types that can be ordered. Other list-type entities might include song titles, states of a light bulb (on or off), names of people, names of cities, and so on.

A literal is the range of tokens in a user's utterance or query that corresponds to a certain entity. The literal is the exact spoken text. For example, in the query "I'd like a large t-shirt", the literal corresponding to the entity SHIRT_SIZE is "large". Other literals might be "small", "medium", "large", "big", and "extra large". When you annotate samples, you select a range of text to tag with an entity. For list-type entities, you can then add the text to the list for the entity. Lists of literals can also be uploaded in .list or .nmlist files. For more information, see Importing entity literals.

Literals can be paired with values. For example, "small", "medium", and "large" can be paired with the values "S", "M", and "L", respectively. Multiple literals can have the same value, which makes it easy to map different ways a user might say an entity into a single common form. For example, "large", "big", "very big" could all be given the same value "L".
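To see how literal-value pairs collapse many ways of saying the same thing into one canonical value, here is a small sketch in Python based on the shirt-size example above (the dictionary is only an illustration of the mapping, not how Mix stores or resolves entities):

from typing import Optional

# Literal (what the user says) -> value (the canonical form), per the example above.
SHIRT_SIZE_LITERALS = {
    "small": "S",
    "medium": "M",
    "large": "L",
    "big": "L",
    "very big": "L",
}

def resolve_shirt_size(literal: str) -> Optional[str]:
    """Return the canonical value for a spoken literal, if it is in the list."""
    return SHIRT_SIZE_LITERALS.get(literal.lower())

print(resolve_shirt_size("very big"))  # -> "L"
print(resolve_shirt_size("Large"))     # -> "L"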

Defining literal-value pairs per language

If your project includes multiple languages, you will want to support the various ways that users might ask for an item in their language of choice. List-based entities created in a project are shared across languages. The values and associated literals connected to the entity, however, are created and managed separately by language. This gives flexibility to handle situations where the value options vary by language and location.

When you add a value-literal pair, this pair will apply to the entity only in the currently selected language. The same value name can be used in multiple languages for the same list-based entity, but the value and its literals need to be added separately in each language.

To add a new value and a literal for a list-based entity within the currently selected language, enter the literal and value in the Entity list pane where indicated and then click the plus (+) icon. The new value appears in the list along with its first literal. You can then add more literals that map to the same entity value. Again, the literal-value pairs added will not be automatically added to the other languages in the project.

To remove a literal, click the delete icon close-icon next to the literal. You are asked to confirm the deletion. This removes the literal from the currently selected language.

entity-edit-literal

Dynamic list entities

It is not always feasible to know all possible literals when you create a model, and you may need the ability to interpret values at runtime. For example, each user will have a different set of contacts on their phone. It is not practical (or even possible) to add every possible set of contact names to your entity when you are building your model in Mix.nlu.

Dynamic list entities allow you to upload data dynamically in a client application at runtime. The data is uploaded in the form of a wordset using the Mix NLUaaS or ASRaaS API. The client application can then use this data to provide personalization and to improve spoken language recognition and natural language understanding accuracy.
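As a rough illustration, an inline wordset is essentially a set of literal-value pairs grouped under the name of the dynamic list entity. The sketch below builds such a payload in Python; the CONTACTS entity name and the exact field names are assumptions for illustration only, so consult the NLUaaS and ASRaaS API documentation for the precise wordset format your runtime expects:

import json

# Hypothetical per-user contact list collected by the client application.
contacts = ["Aurelie Brown", "Jane Doe", "Jean-Guy Dubois"]

# Group literal-value pairs under the (assumed) dynamic entity name CONTACTS.
wordset = {
    "CONTACTS": [{"literal": name, "value": name} for name in contacts]
}

print(json.dumps(wordset, ensure_ascii=False, indent=2))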

Defining dynamic entities

To define a list entity as dynamic, check the Dynamic box for this entity.

While dynamic data is uploaded at runtime, it is still important to define a representative subset of literal and value pairs for dynamic list entities. This ensures that the model is trained properly and improves the accuracy of the ASR. Using our contact example, this means that you should include a representative subset of what you expect contact names to look like, and ensure that you have samples with the proper annotation.

When naming your dynamic entities in each model, keep in mind that they are global per application ID (across languages and deployed model versions).

Relationship entities: isA and hasA

A relationship entity has a specific relationship to an existing entity, either an "isA" or a "hasA" relationship.

isA relationship entities

An isA relationship states that ENTITY_X is a type of ENTITY_Y. The definition of Y is inherited by X, such as Y's list of literals, as well as any applicable grammars and relationships. Note that while the definition of the child entity is the same as the parent entity, the child entity picks up differences because of its different role in your samples.

For example, say you have a train schedule app and you want to accept queries such as "When is the next train from Boston to New York?" Both "Boston" and "New York" are instances of the STATION entity. If you annotated the query using STATION for both cases, then you would have no way of determining which is the origin and which is the destination. To resolve this, you could define two list-type entities, FROM_STATION and TO_STATION, and associate each with the same list of literals. This would, of course, be time-consuming and difficult to manage. The better solution is to define one list-type entity STATION with an associated list of cities/stations, and then define FROM_STATION isA STATION, and TO_STATION isA STATION. Now, you only have one list of stations to manage. The model interprets queries and returns FROM_STATION or TO_STATION as appropriate for the roles they play in the query, and returns literals and values from the list associated with the STATION entity.
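For illustration, an annotated version of this query might look as follows (a plausible rendering using the entities just described, not taken from an actual project):

When is the next train from [FROM_STATION]Boston[/] to [TO_STATION]New York[/]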

You can also make isA relationships to predefined entities. For example, AGE is a nuance_CARDINAL_NUMBER.

hasA relationship entities

A hasA relationship states that ENTITY_Y is a property or a part of ENTITY_X. That is, ENTITY_X has an ENTITY_Y. For example, the entity FULL_NAME might have the sub-entities GIVEN_NAME and FAMILY_NAME as part of it. The entity DRINK might have COFFEE_TYPE and SIZE as part of it. Note that unlike an isA relationship, an entity can have multiple hasA relationships.

You would use hasA relationships if the entities in your queries have structure. However, Nuance recommends that you use hasA relationships only if you have a definite need, since they can be tricky to work with, and the complexity means the NLU models may be less accurate than desired. An example of a definite need is to be able to interpret a query like "put the red block into the green box". In this case you need a way to associate the color red with the block and the color green with the box. Without using hasA relationships, the JSON object returned would be flat and you would not know which color went with which object. Using hasA, you would define an OBJECT that has a COLOR and a SHAPE. Then the following annotation becomes possible: "put the [OBJECT][COLOR]red[/][SHAPE]block[/][/] into the [OBJECT][COLOR]green[/][SHAPE]box[/][/]".

Create a new relationship entity

  1. Click the + icon in the Entities panel to create a new entity and give it a name.
  2. Click on the new entity in the Entities panel to open the editor.
  3. Set Type to Relationships.
    select relationships entity type
    A relationships definition editor appears underneath.
    relationship editor
  4. Click the + icon for the type of relationship entity you want to create, isA or hasA. A dropdown will open allowing you to pick from the existing custom and predefined entities.
    choose related entity
  5. Select one of the sub-entities to which your new entity is related.
  6. Repeat steps 4 and 5 for any other sub-entities in the relationship definition.

The relationship is now defined.

relationship-defined

To annotate a sample that contains a hasA relationship entity:

  1. Go to the Develop tab and open the intent containing the sentence.
  2. Click to select the portion of the sentence containing the (outer) hasA entity.
    select-outer-entity
    In the entity selection menu that appears, you can see both the outer, hasA entity, as well as the sub-entities to which it is related by a hasA relationship.
  3. Select the hasA entity from the menu. The outer entity will be annotated.
    annotate-outer-entity
  4. For each of the inner sub-entities, select the portion of the sentence containing the entity, and select the entity from the menu.
    annotate-inner-entities

The sentence is now fully annotated.

annotate-inner-entities

Regex-based

A regex-based entity defines a set of values using regular expressions. For example, product or order values are typically alphanumeric sequences with a regular format, such as gro-456 or ABC 967. Both of these examples, and many more codes with the same general pattern, can be described with the regex pattern:
[A-Za-z]{3}\s?-?\s?[0-9]{3}

Similarly, you might use regex-based entities to match account numbers, postal (zip) codes, confirmation codes, PINs, driver's license numbers, and other pattern-based formats.

Creating regex-based entities

To use a regular expression to define the values of an entity (for example, an order number as shown below), enter the expression using valid JavaScript regular expression syntax.

In this example, the user is creating a regex-based entity called ORDER_NUMBER, which will match order numbers in the form gro-456, COF-123, sla 889, and so on (three letters, followed by an optional space and/or hyphen, followed by three digits).
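To sanity-check such a pattern before saving it, you can test it with any standard regex engine. Here is a minimal sketch using Python's re module with the sample order numbers mentioned above (note that JavaScript and Python regex syntax differ in some advanced features, though not for a simple pattern like this one):

import re

# The ORDER_NUMBER pattern described above: three letters,
# an optional space and/or hyphen, then three digits.
pattern = re.compile(r"[A-Za-z]{3}\s?-?\s?[0-9]{3}")

for text in ["gro-456", "COF-123", "sla 889", "ABC 967"]:
    print(text, bool(pattern.fullmatch(text)))  # all print True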

To save the pattern, click Download project and save regex-based entity.

Before the entity-type is created (or modified), Mix.nlu exports your existing NLU model to a ZIP file containing a TRSX file so that you have a backup. Creating (or modifying) a regex-based entity requires your NLU model to be re-tokenized, which may take some time and impact your existing annotations. You receive a message when the entity is saved successfully.

Mix.nlu validates the search pattern as you enter it and alerts you if it is invalid. Invalid expressions (including empty values) are not saved.

Notes and cautions

Note the following points when creating regular expressions in regex-based entities:

Capture groups

Be careful when using parentheses in a regular expression, for example to quantify a sub-pattern with +, *, ?, or {m,n}. Enclosing in parentheses creates a capture group. In general programming, matching a regex pattern with capture groups on a string returns both the full pattern, and the individual capture groups, in order, packaged as an array.

With Mix.nlu specifically, however, an entity expects a single value. When you use a regex with capture groups, Mix.nlu will return the result from the first capture group only, rather than the full pattern. This is to allow extra flexibility for developers; for example, if you want to recognize a date pattern but only need the month to fulfill the user's intent. If you need to use a parenthetical group, but want the full pattern match as the value returned for the entity, there are two options:

  • Use a non-capturing group, (?: ... ), so that the parentheses group the sub-pattern without creating a capture group.
  • Enclose the entire expression in parentheses, so that the first capture group is the full pattern.

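The difference is easy to see with an ordinary regex engine. The following sketch uses Python's re module; the date-like pattern and sample text are assumptions for illustration only:

import re

# A pattern with a capture group around the month part (illustrative only).
m = re.search(r"([A-Za-z]+) [0-9]{1,2}", "remind me on March 14")
print(m.group(0))   # full match: "March 14"
print(m.group(1))   # first capture group: "March"

# Option 1: a non-capturing group leaves no capture groups behind.
m = re.search(r"(?:[A-Za-z]+) [0-9]{1,2}", "remind me on March 14")
print(m.group(0))   # "March 14"
print(m.groups())   # () - no capture groups

# Option 2: wrap the whole expression so the first capture group is the full match.
m = re.search(r"((?:[A-Za-z]+) [0-9]{1,2})", "remind me on March 14")
print(m.group(1))   # "March 14"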
Anchors

Avoid using a caret (^) to denote the beginning of a regular expression, or a dollar sign ($) to denote the end, as doing so will cause the NLU engine to expect the expression at the beginning, or end, of a sentence. Consider, for example, a phone number regex-based entity that matches any phone number of the format 123-456-7890.

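As a rough analogy in ordinary regex terms, anchoring a pattern prevents it from matching when the value appears in the middle of a sentence. The sketch below uses Python's re module; the phone-number pattern is an assumption for illustration, not the pattern saved in a Mix project:

import re

anchored   = r"^[0-9]{3}-[0-9]{3}-[0-9]{4}$"   # pinned to start and end
unanchored = r"[0-9]{3}-[0-9]{3}-[0-9]{4}"

utterance = "call me back at 123-456-7890 please"

print(re.search(anchored, utterance))    # None - the anchors prevent a match mid-sentence
print(re.search(unanchored, utterance))  # matches '123-456-7890'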
Annotating with regex-based entities

Annotating with regex-based entities means identifying the tokens to be captured by the regex-defined value. At runtime the model tries to match user words with the regular expression.

For example:

What's the status of order [ORDER_NUMBER]COF-123[/]

Rule-based

A rule-based entity defines a set of values based on a GrXML grammar file.

While regular expressions can be useful for matching patterns in text-based input, grammars are useful for matching patterns in spoken user inputs.

Note: Rule-based entities do not have specific values defined directly in Mix.nlu. Instead, their values are defined in a GrXML grammar file.

Creating rule-based entities

To create a rule-based entity:

  1. Prepare the grammar file. See Understanding grammar files and GrXML file rules below for more details on filename conventions and the required format of the file.
  2. (As required) In Mix.nlu select the language from the menu near the name of the project. (GrXML files are language-specific.)
  3. Create a new entity and name it appropriately, keeping in mind the requirements described in the link above.
  4. Choose as the Type: Rule-based.
  5. Browse to upload the grammar file that you have prepared.
  6. Click Download project and save rule-based entity.
    create_grxml_entity
  7. If your project includes multiple languages, upload separate grammar files, one for each language. See the note below.

Before the new entity is saved (or modified), Mix.nlu exports your existing NLU model to a ZIP file (one ZIP file per language) so that you have a backup of your NLU model. Creating (or modifying) a rule-based entity requires your NLU model to be retokenized, which may take some time and impact your existing annotations. You receive a message when the entity is saved successfully.

At any time you can use the download button to view the contents of the GrXML file.

download_grxml_

Note the following additional points when creating rule-based entities:

Annotating with rule-based entities

Annotating with rule-based entities means identifying the words to be captured using a rule grammar (GrXML file). At runtime, the model tries to match user words with the grammar file. For example:

I'd like to pay with my [CARD_TYPE]Visa[/]

Understanding grammar files

Example GrXML file:

<?xml version='1.0' encoding='utf-8'?>
<grammar xml:lang="en-US" version="1.0" root="DP_NUMBER" xmlns="http://www.w3.org/2001/06/grammar">
   <meta name="swirec_normalize_to_probabilities" content="1"/>
   <meta name="swirec_enable_robust_compile" content="1"/>

   <rule id="DP_NUMBER" scope="public">
      <one-of>
         <item>
            <ruleref uri="#S"/>
            <tag>DP_NUMBER = S.V</tag>
         </item>
         <item>
            <ruleref uri="#EMIR"/>
            <tag>DP_NUMBER = EMIR.V</tag>
         </item>
      </one-of>
   </rule>

   <rule id="S">
      <item repeat="1-16">
        <one-of>
            <item>
              <ruleref uri="#DIGIT"/>
              <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
            </item>
            <item> <ruleref uri="#dash"/> </item>
        </one-of>
      </item>
   </rule>

   <rule id="EMIR">
         seven eight four <tag><![CDATA[V = "784"]]> </tag>
         <item repeat="0-1"> <ruleref uri="#dash"/> </item>
         <one-of>
            <item> nineteen <tag><![CDATA[V=V+"19"]]></tag> </item>
            <item> twenty <tag><![CDATA[V=V+"20"]]></tag>  </item>
         </one-of>
         <one-of>
            <item>
               <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
               <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
            </item>
            <item> eighty <tag><![CDATA[V=V+"80"]]></tag> </item>
            <item> eighty one <tag><![CDATA[V=V+"81"]]></tag>  </item>
            <item> eighty two <tag><![CDATA[V=V+"82"]]></tag>  </item>
            <item> eighty three <tag><![CDATA[V=V+"83"]]></tag>  </item>
            <item> eighty four <tag><![CDATA[V=V+"84"]]></tag>  </item>
            <item> eighty five <tag><![CDATA[V=V+"85"]]></tag>  </item>
            <item> eighty six <tag><![CDATA[V=V+"86"]]></tag>  </item>
            <item> eighty seven <tag><![CDATA[V=V+"87"]]></tag>  </item>
            <item> eighty eight <tag><![CDATA[V=V+"88"]]></tag>  </item>
            <item> eighty nine <tag><![CDATA[V=V+"89"]]></tag>  </item>
         </one-of>
         <item repeat="0-1"> <ruleref uri="#dash"/> </item>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <item repeat="0-1"> <ruleref uri="#dash"/> </item>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
   </rule>

   <rule id="DIGIT" scope="private">
      <one-of>
         <item> <ruleref uri="#zero"/> <tag><![CDATA[V="0"]]></tag> </item>
         <item> <item>one</item>  <tag><![CDATA[V="1"]]></tag> </item>
         <item> <item>two</item>  <tag><![CDATA[V="2"]]></tag> </item>
         <item> <item>three</item>  <tag><![CDATA[V="3"]]></tag> </item>
         <item> <item>four</item>  <tag><![CDATA[V="4"]]></tag> </item>
         <item> <item>five</item>  <tag><![CDATA[V="5"]]></tag> </item>
         <item> <item>six</item>  <tag><![CDATA[V="6"]]></tag> </item>
         <item> <item>seven</item>  <tag><![CDATA[V="7"]]></tag> </item>
         <item> <item>eight</item>  <tag><![CDATA[V="8"]]></tag> </item>
         <item> <item>nine</item>  <tag><![CDATA[V="9"]]></tag> </item>
         <item> double <ruleref uri="#zero"/> <tag><![CDATA[V="00"]]></tag> </item>
         <item> double <item>one</item>  <tag><![CDATA[V="11"]]></tag> </item>
         <item> double <item>two</item>  <tag><![CDATA[V="22"]]></tag> </item>
         <item> double <item>three</item>  <tag><![CDATA[V="33"]]></tag> </item>
         <item> double <item>four</item>  <tag><![CDATA[V="44"]]></tag> </item>
         <item> double <item>five</item>  <tag><![CDATA[V="55"]]></tag> </item>
         <item> double <item>six</item>  <tag><![CDATA[V="66"]]></tag> </item>
         <item> double <item>seven</item>  <tag><![CDATA[V="77"]]></tag> </item>
         <item> double <item>eight</item>  <tag><![CDATA[V="88"]]></tag> </item>
         <item> double <item>nine</item>  <tag><![CDATA[V="99"]]></tag> </item>
         <item> triple <ruleref uri="#zero"/> <tag><![CDATA[V="000"]]></tag> </item>
         <item> triple <item>one</item>  <tag><![CDATA[V="111"]]></tag> </item>
         <item> triple <item>two</item>  <tag><![CDATA[V="222"]]></tag> </item>
         <item> triple <item>three</item>  <tag><![CDATA[V="333"]]></tag> </item>
         <item> triple <item>four</item>  <tag><![CDATA[V="444"]]></tag> </item>
         <item> triple <item>five</item>  <tag><![CDATA[V="555"]]></tag> </item>
         <item> triple <item>six</item>  <tag><![CDATA[V="666"]]></tag> </item>
         <item> triple <item>seven</item>  <tag><![CDATA[V="777"]]></tag> </item>
         <item> triple <item>eight</item>  <tag><![CDATA[V="888"]]></tag> </item>
         <item> triple <item>nine</item>  <tag><![CDATA[V="999"]]></tag> </item>
      </one-of>
   </rule>

   <rule id="dash" scope="private">
      <one-of>
         <item> dash </item>
         <item> minus </item>
      </one-of>
   </rule>

   <rule id="zero" scope="private">
      <one-of>
         <item> zero </item>
         <item> null </item>
         <item> oh </item>
      </one-of>
   </rule>

</grammar>

Shown here is an example GrXML file. This grammar file is designed to recognize a specific account number type in conjunction with a rule-based entity called DP_NUMBER.

From the attributes of the grammar element, we know the language for the grammar is United States English (xml:lang="en-US").

Notice that the header of the file identifies "DP_NUMBER" (the same name as the rule-based entity) as the root rule (root="DP_NUMBER").

Below this, we see the root rule definition (<rule id="DP_NUMBER" scope="public">).

This rule itself consists of a one-of list with two options representing two possible formats for the account number. Each of these options refers to a sub-rule appearing further on in the file via a ruleref element. The first option refers to a rule entitled "S" (<ruleref uri="#S"/>). The second option refers to another rule entitled "EMIR" (<ruleref uri="#EMIR"/>). These sub-rules themselves reference additional rules "DIGIT", "dash", and "zero" used by both.

At runtime, Mix.nlu compares what the user says with the patterns defined in the different sub-rule branches. If the user utterance matches a pattern, this activates that branch. The code in the tag element of the branch assigns the appropriate value to the DP_NUMBER variable and returns this value.

If the user utterance doesn’t match an option from any of the rules with reasonable accuracy, the rule-based entity and any intents using the entity will not match with significant confidence.

A rule includes some number of items, which represent parts of possible matches for the rule. Using a one-of element, a rule can match any one of a set of alternative items. For example, this rule looks for different ways of saying the digit zero:

<rule id="zero" scope="private">
  <one-of>
   <item> zero </item>
   <item> null </item>
   <item> oh </item>
  </one-of>
</rule>
A rule or item can also look for a specified number or range of repetitions of some pattern. For example, the following looks for zero or one matches of a rule that recognizes a dash:

<item repeat="0-1"> <ruleref uri="#dash"/> </item>

These can also be combined. For example, the following rule looks for a sequence of between one and sixteen digits and dashes.

<rule id="S">
  <item repeat="1-16">
    <one-of>
      <item>
        <ruleref uri="#DIGIT"/>
        <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
      </item>
      <item><ruleref uri="#dash"/></item>
    </one-of>
  </item>
</rule>

For more information on GrXML, refer to the W3C Speech Recognition Grammar Specification standard.

GrXML file rules

The filename for the GrXML file must have from 1-128 characters, and may include upper and lowercase letters, 0-9, - (hyphen), and _ (underscore).

A rule grammar file has this format:

Troubleshooting GrXML errors

Here are some notes that may help if you encounter problems creating rule-based entities.

| Issue | Description |
| --- | --- |
| Invalid file extension | The file is not a GrXML file. If you are creating a rule-based entity, you must upload a GrXML file with the `*.grxml` extension. |
| Invalid file name | The filename must not exceed 128 characters and is limited to upper and lowercase letters, 0-9, - (hyphen), and _ (underscore). |
| Grammar root value | The grammar root in the GrXML file must be the entity name. For example: `<grammar ... root="DP_NUMBER" ...>` |
| File contains GrXML errors | There are format errors in the file's GrXML markup. For example, check that the grammar root, the rule ID, and the return tag all use the entity name: `<grammar ... root="DP_NUMBER" ...>`, `<rule id="DP_NUMBER" ...>`, `<tag>DP_NUMBER = S.V</tag>` |
| Grammars may not reference other files | The grammar file may not include references to other files; for example, `<ruleref uri="acct_num.grxml#emir"/>` is not supported. Any related rules required by the grammar must be included in the file being uploaded. |

Freeform entities

A freeform entity is used to capture, as a single block, user input that you cannot:

Take the example of an intent for sending a text message to a specified user. A text message body could be any sequence of words of any length. In the query "send a message to Adam hey I'm going to be ten minutes late", the phrase "hey I'm going to be ten minutes late" becomes associated with a freeform entity MESSAGE_BODY.

An important aspect of a freeform entity is that the meaning of the literal corresponding to the entity should not be important or necessary for fulfilling the intent. In the example of sending a text message, the application does not need to understand the meaning of the message; it just needs to send the literal text as a string to the intended recipient. Note that since you are not concerned with the meaning of the contents of a freeform entity, the values for a freeform entity are only vaguely defined. Only the literal is important.

Having difficulty determining which type to use? See the examples below.

Example sports application – List type

Consider a sports application, where your samples would include many ways of referring to one sports team, for example, the Montreal Canadiens:

Since you could enumerate each option, you would make this a list type and annotate it accordingly. Additionally, the NLU engine would learn about the entity from these different ways of referring to the Canadiens. You would not have to enumerate every possible sports team or every possible way to refer to the Canadiens.

Example SMS app message recipient – regex or rule-based type

Consider an SMS messaging application, where samples include the destination phone number. There are billions of possible phone number combinations, so clearly you could not enumerate all the possibilities, nor would it really make sense to try. However, phone numbers would not be considered freeform input, since there is a fixed, systematic structure to phone numbers that falls under a small set of pattern formats. These patterns can be recognized either with a regex pattern (for typed in phone numbers) or a grammar (for spoken numbers). Another problem with handling a phone number as a freeform entity is that understanding the phone number contents will be necessary to properly direct the message.

Example SMS app message contents – Freeform type

When your sample entity includes text that does not have well-defined many-to-one relationships and that cannot be fully enumerated or described with rules or patterns, use the freeform entity type. Consider an SMS app, where it is impossible to list or specify every way that a user may say something to your app. The body of an SMS message could be literally anything. Here is an example of what such an annotation might look like:

send a message to Adam [MESSAGE_BODY]hey I'm going to be ten minutes late[/]

MESSAGE_BODY would be a freeform entity because the contents of a message are unpredictable and cannot be fully enumerated. Moreover, understanding the contents is not necessary to send the message to its destination.

Notes on freeform entity annotation

Some important points to remember about annotating freeform entities:

Notes on freeform entity recognition

Some important points to remember about recognition of freeform entities:

Best practice

Be careful not to overuse freeform entities, especially when a large base grammar already exists for the information you want to collect, such as SONGS or CITIES. Avoid using a freeform entity to collect this type of information—the NLU engine has already been trained on a huge number of values, and you won't benefit from this if you use a freeform entity.

Predefined entities

Mix.nlu includes a set of predefined entities that can be useful as you develop your own NLU models. Predefined entities save you the trouble of defining entities that are generally useful in a number of different applications, such as monetary amounts, Boolean values, calendar items (dates, times, or both), cardinal and ordinal numbers, and so on.

A predefined entity is not limited to a flat list of values, but instead can contain a complete grammar that defines the various ways that values for that entity can be expressed. A grammar is a compact way of expressing a vast range of possible constructions.

For example, within the nuance_DURATION entity, there is a grammar that defines expressions such as "3.5 hours", "25 mins", "for 33 minutes and 19 seconds", and so on. It would simply not make sense to try to capture the possible expressions for this entity in a list.

Some notes:

For more information, including on specific predefined entities, see Predefined entities.

Dialog predefined entities

Mix.nlu adds a default set of entities to simplify your Mix.dialog applications. These dialog entities are isA entities that refer to predefined entities. Dialog entities have shorter, more descriptive names than predefined entities. This can make it easier to develop and maintain your Mix.dialog application while taking advantage of the convenience of predefined entities.

For example, DATE is a dialog predefined entity that is defined as an isA entity for nuance_CALENDARX. If your Mix.dialog application processes dates, use the DATE entity instead of nuance_CALENDARX.

Like the predefined entities prefaced with nuance_, you cannot rename dialog predefined entities, delete them, or edit them.

Dialog entities appear in the Predefined Entities section of the Entities area. Mix adds them when you create your project.

This table briefly describes the purpose of each dialog predefined entity.

| Dialog entity | isA predefined entity | Description |
| --- | --- | --- |
| DATE | nuance_CALENDARX | Calendar date |
| TIME | nuance_CALENDARX | Time of day |
| YES_NO | nuance_BOOLEAN | Yes or no |

Note: The following dialog entities are deprecated and, therefore, may appear in the Custom Entities list. These dialog entities can be edited, renamed, and deleted.

| Dialog entity | isA predefined entity | Description |
| --- | --- | --- |
| CC_EXP_DATE | nuance_EXPIRY_DATE | Credit card expiry date |
| CREDIT_CARD | nuance_CARDINAL_NUMBER | Credit card number |
| CURRENCY | nuance_AMOUNT | Monetary amount |
| DIGITS | nuance_CARDINAL_NUMBER | String of digits |
| NATURAL_NUMBER | nuance_CARDINAL_NUMBER | Round number with no decimal point |
| PHONE | nuance_CARDINAL_NUMBER | Telephone number |
| SSN | nuance_CARDINAL_NUMBER | Social Security Number |
| ZIP_CODE | nuance_CARDINAL_NUMBER | Postal zip code |

Tag modifiers

A tag modifier modifies or combines entities in a sample by adding a logical operator: AND, OR, or NOT. You specify tag modifiers by annotating samples.

Your Mix.nlu model can use the AND and OR modifiers to connect multiple entities. It can use the NOT modifier to negate the meaning of a single entity.

For example, "a cappuccino and a latte" would be annotated as [AND][COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/][/]. The AND modifier applies to the two COFFEE_TYPE annotations.

The literal "no cinnamon" would be annotated as no[NOT][SPRINKLE_TYPE]cinnamon[/][/]. The NOT modifier applies to the SPRINKLE_TYPE annotation.

Note how the literals "and" and "no" are not annotated as an entity or tag modifier. Instead, tag modifiers are the parents of the annotations that they connect or negate.

Anaphoras

An anaphora is defined as "the use of a word referring back to a word used earlier in a text or conversation, to avoid repetition" (from Lexico/Oxford dictionary).

An anaphora often occurs in dialogs and makes it difficult to understand what the user means. For example, consider the following phrases:

In this example, "there" is an anaphora for "Montreal".

In this example, "him" is an anaphora for "Bob".

An ellipsis (intent anaphora) occurs when a user references an intent that was identified in a previous request. The dialog recognizes when the wording of the new request refers to the intent of the previous request, including its entities. For example:

  • User: "What is the weather in Boston this weekend?"
  • System: "This weekend in Boston the weather will be …"
  • User: "What about Montreal?"
  • The system understands that the intent is to find the weather and includes the entity weekend: "This weekend in Montreal, the weather will be …"

Note: Ellipses are supported in the context of the most recent intent only; the system cannot recognize earlier intents.

Tagging anaphoras

In Mix.nlu, you can:

  • Identify an entity as referable
  • Annotate samples that contain an anaphora reference to that entity

This will help your dialog application determine to which entity the anaphora refers, based on the data it has, and internally replace the anaphora with the value to which it refers. For example, "Drive there" would be interpreted as "Drive to Montreal".

The four types of anaphora entities are:

Identify an entity as referable

First, you want to identify the entity as referable.

  1. In the Entities area of the Develop tab, select the entity.
  2. In the Referenced as field, select the correct anaphora type for this entity.
    For example, for a location, select REF_PLACE:

Annotate a sample containing an anaphora

Once the entity has been identified as referable, you can annotate a sample containing an anaphora reference to that entity.

  1. In the Develop tab, open the intent containing the sample.
  2. Locate the sample containing an anaphora reference to the referable entity, and click the reference word.
  3. An entity selector menu will open. You should see as options both the referable entity, as well as the corresponding anaphora entity type (REF_xxxx) to which the entity is referable. Select the anaphora entity type from the menu.

The sentence is now annotated as containing an anaphora reference.
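For illustration, using the "Drive there" example from earlier, the annotated sample might look like this (REF_PLACE being the anaphora entity type selected for the location entity):

Drive [REF_PLACE]there[/]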

Language support

The Nuance Mix Platform offers a growing number of languages. To determine the languages (locales) available to your project, go to the Mix.dashboard, select your project, and click the Targets tab. For more information, see Build resources.

For the complete list of supported languages, see Languages.

Change log

2021-09-15

2021-08-25

2021-08-04

2021-06-09

2021-04-21

2021-03-31

2021-03-03

Updates to Optimize tab.

2021-02-03

2021-01-27

2020-12-14

2020-12-02

2020-11-25

2020-10-14

Update to Discover tab enabling export of data as .csv.

2020-09-03

Update to Verify samples to enable bulk operations changing the verification state of multiple samples at the same time.

2020-09-02

Adding new Discover tab. The Mix.nlu Discover tab allows you to see what users are saying to your deployed application, giving you the opportunity to refine your NLU models based on actual data. For now the data is read-only; additional functionality will be added in future releases, such as ability to export data, assign intents, annotate the data, and add selected samples to your training set.

2020-08-30

Update and refactoring of Modify samples and Verify samples sections to reflect updates to the UI of the Develop tab samples view and changes in functionality.

2020-08-11

2020-07-17

Added additional information to Verify samples to explain the impact of the new "intent verified" and "fully verified" states.

Note that action is required to approve (fully verify) entity annotations. This crucial step ensures that models are built with the correct data.

2020-07-14

2020-06-11

2020-05-04

Updated screenshots.

2020-03-31

2020-02-19

2020-01-22

Updated predefined entities section.

2019-12-18

2019-12-02

Updated occurrences of the term "concept" with "entity."

2019-11-15

Below are changes made to the Mix.nlu documentation since the initial Beta release: