General Configurations#
The bot configuration file is the entrypoint for your bot. This section lists the options you can configure in the bot_config.yaml
file. It contains all the key configurations needed for building different use cases. The file is an extension of the configurations supported by NeMo Guardrails; all configurations supported by NeMo Guardrails are also supported by NVIDIA ACE Agent.
In addition, the following configurations are supported exclusively by NVIDIA ACE Agent.
Bot Name and Version#
Attention
It is mandatory to specify a bot name in the bot configuration file.
Specifying a bot version in the bot configuration file is optional. The bot name and version combination must be unique; it is used to uniquely identify your bot when multiple bots are deployed together.
```yaml
bot: llm_bot
version: 1
```
Configuring the NLP and Plugin Server Endpoints#
These configurations can be specified under the `configs` field in the `bot_config.yaml` file. The available configurations are listed below. All of them have default values, which are used unless specified otherwise.
```yaml
configs:
  # The endpoint URL of the NLP server service
  nlp_server_url: http://localhost:9000
  # The endpoint URL of the plugin service
  fm_server_url: http://localhost:9002
  # The interval in seconds used by Chat Engine to poll the NLP server and
  # plugin server for availability. This polling ensures a smooth fallback
  # if either of these services goes down in the middle of a conversation.
  health_check_interval: 5
```
Configuring the NLP Server Model Parameters#
There are different parameters associated with different deployed models. This section illustrates the common configurations used to control the model behavior.
```yaml
- task_name: str                        # preset task name for which you want to configure
  model_name: str                       # model name
  confidence_threshold: Optional[float] # The minimum confidence score that should be
                                        # returned by the specified model for the specified
                                        # task. The model result is disregarded otherwise.
```
Some examples are provided below for supported tasks.
Intent Slot Models#
```yaml
nlp_models:
  - task_name: generate_user_intent
    model_name: riva_intent_weather
    confidence_threshold: 0.5
```
Named Entity Recognizer Models#
```yaml
nlp_models:
  - task_name: generate_user_slots
    model_name: riva_ner
    confidence_threshold: 0.5
```
Text Classifier Models#
```yaml
nlp_models:
  - task_name: text_classifier
    model_name: riva_text_classification
    confidence_threshold: 0.5
```
Configuring Bot Voices#
NVIDIA ACE Agent supports Riva TTS models out of the box and provides the capability to plug in any custom model. To set a unique voice name for one or more bots, use the `voice_name` field to provide the values.
This configuration is useful for supporting a distinct voice per bot when multiple bots are deployed. An example usage of this field can be seen in the sample NPC bots provided.
A list of available voices for Riva TTS can be found in the Riva TTS documentation. The default value of the `voice_name` field is picked up from the Speech Configurations.
```yaml
voice_name: English-US.Female-Fearful  # Example voice name
```
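As a sketch of the multi-bot, multi-voice setup described above, each bot's own `bot_config.yaml` can set a different voice. The bot names, file paths, and the first voice name below are illustrative assumptions; check the Riva TTS documentation for the voices actually available in your deployment:

```yaml
# bots/npc_guard/bot_config.yaml (illustrative path and bot name)
bot: npc_guard
voice_name: English-US.Male-Calm       # illustrative voice name

# bots/npc_merchant/bot_config.yaml (illustrative path and bot name)
bot: npc_merchant
voice_name: English-US.Female-Fearful
```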
Multilingual Configurations#
NVIDIA ACE Agent supports multi-language input and multi-language output.
If you are using Riva’s Neural Machine Translation model to perform translation at either the query level or the response level, you need to provide the `request_language` and `response_language` codes as shown below. The default value is `en-US` for all three fields.
```yaml
language: Optional[str] = en-US
request_language: Optional[str] = en-US
response_language: Optional[str] = en-US
```
If the `request_language` is not equal to `language`, then the NMT model (if deployed) will be called to translate the query, with the source language set to `request_language` and the target language set to `language`. For example, you need to set the following configs to convert all incoming queries from Spanish to English:
```yaml
language: en-US
request_language: es-US
```
If the `response_language` is not equal to `language`, then the NMT model (if deployed) will be called to translate the response text, with the source language set to `language` and the target language set to `response_language`. For example, you need to set the following configs to convert all outgoing responses from English to Spanish:
```yaml
language: en-US
response_language: es-US
```

Tip
Refer to Spanish Weather Bot to see the above configurations in action.
The request and response language can also be provided at runtime with every query. Refer to the `/chat` endpoint schema.
LLM Model Configurations#
To configure the backbone LLM model that will be used by the guardrails configuration, set the `models` key:
```yaml
models:
  - type: main
    engine: openai
    model: text-davinci-003
```
The meanings of the attributes are as follows:

- `type` is set to `main`, indicating the main LLM model.
- `engine` is the LLM provider.
- `model` is the name of the model. The recommended option is `text-davinci-003`.
- `parameters` are any additional parameters, for example, `temperature`, `top_k`, and so on.
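As a sketch of how additional parameters are passed, a `models` entry might look like the following; the parameter values shown are illustrative, not recommendations:

```yaml
models:
  - type: main
    engine: openai
    model: text-davinci-003
    parameters:
      temperature: 0.2  # illustrative value
      top_k: 40         # illustrative value
```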
Only one model should be specified in the `models` key.
All NeMo Guardrails supported LLM models and their configurations are also supported by NVIDIA ACE Agent.
LangChain Models#
You can use any LLM provider from the cloud that is supported by LangChain by mentioning its name in the `engine` field. For example:
- `ai21`
- `aleph_alpha`
- `anthropic`
- `anyscale`
- `azure`
- `cohere`
- `huggingface_endpoint`
- `huggingface_hub`
- `openai`
- `self_hosted`
- `self_hosted_hugging_face`
Refer to the LangChain documentation for the complete list.
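For instance, assuming the `anthropic` provider (and its corresponding LangChain package) is installed, the `models` entry might be sketched as follows; the model name shown is an assumption and should be replaced with one offered by your provider:

```yaml
models:
  - type: main
    engine: anthropic
    model: claude-2  # hypothetical model name; use one available from your provider
```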
NVIDIA AI Endpoints#
You can use any available NVIDIA API Catalog LLM as part of the NVIDIA AI Endpoints model type. Set the `engine` field to `nvidia-ai-endpoints` to plug it into your bot, and provide the model name along with any parameters needed.
```yaml
- type: main
  engine: nvidia-ai-endpoints
  model: ai-mixtral-8x7b-instruct
```
Visit the NVIDIA API Catalog website to browse the LLMs and try them out.
To use any of these providers, you need to install additional packages. When you first try a configuration with a new provider, you will typically receive an error from LangChain telling you which package to install.
Important
You can instantiate any of the LLM providers above; depending on the capabilities of the model, some will work better than others with the NVIDIA ACE Agent toolkit. The toolkit includes prompts that have been optimized for certain types of models (for example, `openai`). For others, you can optimize the prompts yourself.
Custom LLM Models using the LangChain Wrapper#
To register a custom LLM provider using the LangChain wrapper, you need to create a class that inherits from the `BaseLanguageModel` class provided by LangChain and register it using the `register_llm_provider` utility method provided by NeMo Guardrails. This code can be provided in the initialization file named `config.py`; NVIDIA ACE Agent will automatically execute it before starting the bot.
```python
from langchain.base_language import BaseLanguageModel

from nemoguardrails.llm.providers import register_llm_provider


class CustomLLM(BaseLanguageModel):
    """A custom LLM."""


register_llm_provider("custom_llm", CustomLLM)
```
You can then use the custom LLM provider in your bot configuration file:
```yaml
models:
  - type: main
    engine: custom_llm
```
LLM Prompts#
You can customize the prompts that are used for the various LLM tasks (for example, generate user intent, generate next step, generate bot message) using the `prompts` key. For example, to override the prompt used for the `generate_user_intent` task for the OpenAI `gpt-3.5-turbo` model:
```yaml
prompts:
  - task: generate_user_intent
    models:
      - gpt-3.5-turbo
    content: |-
      <<This is a placeholder for a custom prompt for generating the user intent>>
```
The full list of predefined tasks used by the NVIDIA ACE Agent toolkit includes the following:

- `general` generates the next bot message, when no canonical forms are used.
- `generate_user_intent` generates the canonical user message.
- `generate_next_steps` generates the next thing the bot should do or say.
- `generate_bot_message` generates the next bot message.
- `generate_value` generates the value for a context variable (in other words, extracts the user-provided values).
- `fact_checking` checks the facts from the bot response against the provided evidence.
- `jailbreak_check` checks if there is an attempt to break moderation policies.
- `output_moderation` checks if the bot response is harmful, unethical, or illegal.
- `check_hallucination` checks if the bot response is a hallucination.
You can check the default prompts in the prompts folder of the NeMo Guardrails repository.