General Configurations#
The bot configuration file is the entrypoint for your bot. This section lists the options you can configure in the bot_config.yaml
file. It contains all the key configurations needed for building different use cases. The file is an extension of the configurations supported by NeMo Guardrails; all configurations supported by NeMo Guardrails are also supported by NVIDIA ACE Agent.
In addition, the following configurations are supported exclusively by NVIDIA ACE Agent.
Bot Name and Version#
Attention
It is mandatory to specify a bot name in the bot configuration file.
Specifying a bot version in the bot configuration file is optional. The bot name and version combination must be unique; it is used to uniquely identify your bot when multiple bots are deployed together.
```yaml
bot: llm_bot
version: 1
```
Configuring the NLP and Plugin Server Endpoints#
These configurations can be specified under the `configs` field in the `bot_config.yaml` file. The available configurations are listed below. All of them have default values, which are used unless specified otherwise.
```yaml
configs:
  # The endpoint URL of the NLP server service
  nlp_server_url: http://localhost:9000
  # The endpoint URL of the plugin service
  fm_server_url: http://localhost:9002
  # The interval in seconds used by Chat Engine to poll the NLP server and
  # plugin server for availability. This polling ensures a smooth fallback
  # if either of these services goes down in the middle of a conversation.
  health_check_interval: 5
```
Configuring the NLP Server Model Parameters#
There are different parameters associated with different deployed models. This section illustrates the common configurations used to control the model behavior.
```yaml
- task_name: str                        # preset task name for which you want to configure
  model_name: str                       # model name
  confidence_threshold: Optional[float] # The minimum confidence score that should be
                                        # returned by the specified model for the specified
                                        # task. The model result is disregarded otherwise.
```
Some examples are provided below for supported tasks.
Intent Slot Models#
```yaml
nlp_models:
  - task_name: generate_user_intent
    model_name: riva_intent_weather
    confidence_threshold: 0.5
```
Named Entity Recognizer Models#
```yaml
nlp_models:
  - task_name: generate_user_slots
    model_name: riva_ner
    confidence_threshold: 0.5
```
Text Classifier Models#
```yaml
nlp_models:
  - task_name: text_classifier
    model_name: riva_text_classification
    confidence_threshold: 0.5
```
Configuring Bot Voices#
NVIDIA ACE Agent supports Riva TTS models out of the box and provides the capability to plug in any custom model. To set a unique voice name for one or more bots, use the `voice_name` field to provide the values.
This configuration is useful for supporting a distinct voice per bot when multiple bots are deployed. An example usage of this field can be seen in the sample NPC bots provided.
A list of available voices for Riva TTS can be found in the Riva TTS documentation. The default value of the `voice_name` field is picked up from the Speech Configurations.
```yaml
voice_name: English-US.Female-Fearful  # Example voice name
```
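As a sketch of the multi-bot, multi-voice setup described above, each bot's own `bot_config.yaml` can set a different voice. The bot names, file paths, and the first voice name below are illustrative assumptions; check the Riva TTS documentation for the voices actually available in your deployment:

```yaml
# bots/npc_guard/bot_config.yaml (illustrative path and bot name)
bot: npc_guard
voice_name: English-US.Male-Calm       # illustrative voice name

# bots/npc_merchant/bot_config.yaml (illustrative path and bot name)
bot: npc_merchant
voice_name: English-US.Female-Fearful
```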
Multilingual Configurations#
NVIDIA ACE Agent supports multi-language input and multi-language output.
If you are using Riva’s Neural Machine Translation model to perform translation at either the query level or the response level, you need to provide the `request_language` and `response_language` codes as shown below. The default value is `en-US` for all three fields.
```yaml
language: Optional[str] = en-US
request_language: Optional[str] = en-US
response_language: Optional[str] = en-US
```
If the `request_language` is not equal to `language`, then the NMT model (if deployed) will be called to translate the query, with the source language set to `request_language` and the target language set to `language`. For example, you need to set the following configs to convert all incoming queries from Spanish to English:
```yaml
language: en-US
request_language: es-US
```
If the `response_language` is not equal to `language`, then the NMT model (if deployed) will be called to translate the response text, with the source language set to `language` and the target language set to `response_language`. For example, you need to set the following configs to convert all outgoing responses from English to Spanish:
```yaml
language: en-US
response_language: es-US
```

Tip
Refer to Spanish Weather Bot to see the above configurations in action.
The request and response language can also be provided at runtime with every query. Refer to the `/chat` endpoint schema.
LLM Model Configurations#
To configure the backbone LLM model that will be used by the guardrails configuration, set the `models` key:
```yaml
models:
  - type: main
    engine: openai
    model: text-davinci-003
```
The meanings of the attributes are as follows:

- `type` is set to `main`, indicating the main LLM model.
- `engine` is the LLM provider.
- `model` is the name of the model. The recommended option is `text-davinci-003`.
- `parameters` are any additional parameters, for example, `temperature`, `top_k`, and so on.
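As a sketch of how additional parameters are passed, a `models` entry might look like the following; the parameter values shown are illustrative, not recommendations:

```yaml
models:
  - type: main
    engine: openai
    model: text-davinci-003
    parameters:
      temperature: 0.2  # illustrative value
      top_k: 40         # illustrative value
```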
Only one model should be specified in the `models` key.
All NeMo Guardrails supported LLM models and their configurations are also supported by NVIDIA ACE Agent.
LangChain Models#
You can use any LLM provider from the cloud that is supported by LangChain by mentioning its name in the `engine` field. For example:
- `ai21`
- `aleph_alpha`
- `anthropic`
- `anyscale`
- `azure`
- `cohere`
- `huggingface_endpoint`
- `huggingface_hub`
- `openai`
- `self_hosted`
- `self_hosted_hugging_face`
Refer to the LangChain documentation for the complete list.
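For instance, assuming the `anthropic` provider (and its corresponding LangChain package) is installed, the `models` entry might be sketched as follows; the model name shown is an assumption and should be replaced with one offered by your provider:

```yaml
models:
  - type: main
    engine: anthropic
    model: claude-2  # hypothetical model name; use one available from your provider
```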
NVIDIA AI Endpoints#
You can use any available NVIDIA API Catalog LLM as part of the NVIDIA AI Endpoints model type. Set the `engine` field to `nvidia-ai-endpoints` to plug it into your bot, and provide the model name along with any parameters needed.
```yaml
- type: main
  engine: nvidia-ai-endpoints
  model: ai-mixtral-8x7b-instruct
```
Visit the NVIDIA API Catalog website to browse the LLMs and try them out.
To use any of these providers, you need to install additional packages. When you first try a configuration with a new provider, you will typically receive an error from LangChain telling you which package to install.
Important
You can instantiate any of the LLM providers above; depending on the capabilities of the model, some will work better than others with the NVIDIA ACE Agent toolkit. The toolkit includes prompts that have been optimized for certain types of models (for example, `openai`). For others, you can optimize the prompts yourself.
Custom LLM Models using the LangChain Wrapper#
To register a custom LLM provider using the LangChain wrapper, you need to create a class that inherits from the `BaseLanguageModel` class provided by LangChain and register it using the `register_llm_provider` utility method provided by NeMo Guardrails. This code can be provided in the initialization file named `config.py`; NVIDIA ACE Agent will automatically execute it before starting the bot.
```python
from langchain.base_language import BaseLanguageModel

from nemoguardrails.llm.providers import register_llm_provider


class CustomLLM(BaseLanguageModel):
    """A custom LLM."""


register_llm_provider("custom_llm", CustomLLM)
```
You can then use the custom LLM provider in your bot configuration file:
```yaml
models:
  - type: main
    engine: custom_llm
```
LLM Prompts#
You can customize the prompts that are used for the various LLM tasks (for example, generate user intent, generate next step, generate bot message) using the `prompts` key. For example, to override the prompt used for the `generate_user_intent` task for the OpenAI `gpt-3.5-turbo` model:
```yaml
prompts:
  - task: generate_user_intent
    models:
      - gpt-3.5-turbo
    content: |-
      <<This is a placeholder for a custom prompt for generating the user intent>>
```
The full list of predefined tasks used by the NVIDIA ACE Agent toolkit includes the following:

- `general` generates the next bot message, when no canonical forms are used.
- `generate_user_intent` generates the canonical user message.
- `generate_next_steps` generates the next thing the bot should do or say.
- `generate_bot_message` generates the next bot message.
- `generate_value` generates the value for a context variable (in other words, extracts the user-provided values).
- `fact_checking` checks the facts from the bot response against the provided evidence.
- `jailbreak_check` checks if there is an attempt to break moderation policies.
- `output_moderation` checks if the bot response is harmful, unethical, or illegal.
- `check_hallucination` checks if the bot response is a hallucination.
You can check the default prompts in the prompts folder of the NeMo Guardrails repository.