Dataset creation

The better a bot understands its users, the better it performs its tasks. Train it on a dataset that resembles what the bot will encounter in real conversations.

JAICP provides several tools for creating a training dataset directly from the platform interface, including data labeling and intent fine-tuning.

caution
Data labeling and intent fine-tuning are disabled by default. To enable them, send a request to our customer support.

To start creating a dataset:

  1. Navigate to a project and select CAILA → Data labeling in the dashboard.

  2. Choose a tool appropriate for your needs: data labeling or intent fine-tuning.

    tip
    Use data labeling if you have your own training data. Intent fine-tuning is useful if the bot has already been in operation for some time and has accumulated dialog data.

  3. Depending on your choice, upload a file with the data or select Import from analytics. Now you can start working on the dataset.

    tip
    The article on How to train intents contains practical recommendations for creating CAILA classifiers. Keep them in mind as you work on your dataset.
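
If you upload your own data in step 3, it can help to tidy the phrases first. The following is a minimal Python sketch that trims whitespace, drops empty lines, and removes case-insensitive duplicates before writing the result to a file. The function names `prepare_phrases` and `write_phrases`, the file name `phrases.txt`, and the one-phrase-per-line UTF-8 layout are assumptions made for illustration. They are not taken from the JAICP documentation, so check which file formats the platform accepts before uploading.

```python
from pathlib import Path


def prepare_phrases(raw_phrases):
    """Return phrases with extra whitespace, blanks, and duplicates removed."""
    seen = set()
    cleaned = []
    for phrase in raw_phrases:
        # Collapse runs of whitespace and trim both ends.
        normalized = " ".join(phrase.split())
        if not normalized:
            continue
        # Compare case-insensitively so "Hello" and "hello" count as one phrase.
        key = normalized.casefold()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalized)
    return cleaned


def write_phrases(phrases, path):
    """Write one phrase per line in UTF-8 (this file layout is an assumption)."""
    Path(path).write_text("\n".join(phrases) + "\n", encoding="utf-8")


if __name__ == "__main__":
    sample = [
        "  I want to order a pizza ",
        "i want to order a pizza",
        "",
        "Where is my  order?",
    ]
    write_phrases(prepare_phrases(sample), "phrases.txt")
```

The first occurrence of each phrase is kept with its original casing, so the order of the source data is preserved.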

If you have already used data labeling or intent fine-tuning in this project and want to process a new dataset, navigate to CAILA → Data labeling and select New set of phrases.