Bot Configurations Introduction

All NVIDIA ACE Agent components are driven by configuration. A bot is defined by a set of configuration files in a directory. This section describes the configurations available to you and outlines the common files you need to assemble to build your bot.

The supported configurations are an extension of the configurations supported by NeMo Guardrails.

  • General Configurations - Which language models to use, their parameters, general instructions (similar to system prompts), and sample conversation.

  • Slot Configurations [Optional] - A YAML file that defines the rules used to detect, store, and maintain slots in the dialog history. A slot is a key-value pair in which the Chat Engine can store any information as part of its memory. ACE Agent retrieves the relevant slots from memory whenever they are needed to understand and answer user queries.

  • Plugin Configurations [Optional] - To integrate a third-party application written in Python, such as LangChain, provide plugins in a directory named plugins within the bot directory.

  • Model Configurations [Optional] - A YAML file that defines the configuration of any on-prem or remotely deployed models served by different servers. These models can be used to carry out a set of supported standard Natural Language Processing tasks.

  • Chat Engine Configurations - These configurations define the paths that your bot should follow when a request comes in. They are authored in Colang, NVIDIA’s proprietary dialog modeling language.

These files are typically placed in a folder (called bot_config here), which you reference when starting the bot with the aceagent tool.

aceagent chat cli -c ./bot_config

sample_bot_directory
├── bot_config
│   ├── bot_config.yaml
│   ├── file_1.co
│   ├── file_2.co
│   ├── ...
│   ├──
│   ├── config.py
│   └── slots.yaml
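
As a rough illustration of the layout above, the following Python sketch reports which of the common files are present in a bot configuration directory before you pass it to aceagent. The function name summarize_bot_config and the returned keys are hypothetical and not part of ACE Agent. The sketch only assumes what this section states: Colang files use the .co extension, slots are defined in slots.yaml, and plugins live in a directory named plugins.

```python
from pathlib import Path


def summarize_bot_config(bot_dir):
    """Report which common bot files are present in bot_dir.

    Hypothetical helper for illustration only; it is not part of ACE Agent
    and does not validate the contents of any file.
    """
    root = Path(bot_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")

    return {
        # Top-level YAML files other than slots.yaml (for example bot_config.yaml)
        "yaml_files": sorted(
            p.name for p in root.glob("*.yaml") if p.name != "slots.yaml"
        ),
        # Colang files that hold the Chat Engine configurations
        "colang_files": sorted(p.name for p in root.glob("*.co")),
        # Optional slot configuration file
        "has_slots": (root / "slots.yaml").is_file(),
        # Optional plugins directory
        "has_plugins_dir": (root / "plugins").is_dir(),
    }
```

For example, calling summarize_bot_config("./bot_config") returns a dictionary you can inspect or print to confirm that the expected files are in place.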
  • Speech Configurations - A set of configurations that control the speech AI capabilities of your bot, mainly automatic speech recognition (speech to text) and text to speech.