General Configurations#
The bot configuration file is the entrypoint for your bot. This section lists the options you can configure in the `bot_config.yaml` file. It contains all the key configurations needed for building different use cases. It is an extension of the configurations supported by NeMo Guardrails: all configurations supported by NeMo Guardrails are also supported by NVIDIA ACE Agent.
In addition, the following configurations are supported exclusively by NVIDIA ACE Agent.
Bot Name and Version#
Attention
It is mandatory to specify a bot name in the bot configuration file.
The bot name and version combination must be unique; it is used to identify your bot when multiple bots are deployed together. The bot name with the version should look like `{bot_name}_v{bot_version}`. For example:

```yaml
bot: llm_bot_v1
```
It is optional to specify a bot version as part of the bot configuration file. Without a version, the bot name should look like `{bot_name}`. For example:

```yaml
bot: llm_bot
```
Configuring the NLP and Plugin Server Endpoints#
These configurations can be specified under the `configs` field in the `bot_config.yaml` file. The available configurations are listed below. All of them have default values, which are used unless overridden.

```yaml
configs:
  # The endpoint URL of the NLP server service
  nlp_server_url: http://localhost:9000
  # The endpoint URL of the plugin service
  plugin_server_url: http://localhost:9002
  # The interval in seconds used by the Chat Engine to poll the NLP server and
  # fm server for availability. This polling ensures a smooth fallback if either
  # of these services goes down in the middle of a conversation.
  health_check_interval: 5
```
Multilingual Configurations#
NVIDIA ACE Agent supports multi-language input and multi-language output.
If you are using Riva’s Neural Machine Translation model to do translation either at the query level or the response level, then you need to provide the `request_language` and `response_language` codes as shown below. The default value for all three fields (`language`, `request_language`, and `response_language`) is `en-US`.

```yaml
language: Optional[str] = en-US
request_language: Optional[str] = en-US
response_language: Optional[str] = en-US
```
If the `request_language` is not equal to `language`, then the NMT model (if deployed) will be called to translate the query, with the source language set to `request_language` and the target language set to `language`. For example, set the following configs to convert all incoming queries from Spanish to English:

```yaml
language: Optional[str] = en-US
request_language: Optional[str] = es-US
```
If the `response_language` is not equal to `language`, then the NMT model (if deployed) will be called to translate the response text, with the source language set to `language` and the target language set to `response_language`. For example, set the following configs to convert all outgoing responses from English to Spanish:

```yaml
language: Optional[str] = en-US
response_language: Optional[str] = es-US
```

Tip
Refer to Spanish Weather Bot to see the above configurations in action.
The request and response languages can also be provided at runtime with every query. Refer to the `/chat` endpoint schema.
LLM Model Configurations#
To configure the backbone LLM model that will be used by the guardrails configuration, set the `models` key:

```yaml
models:
  - type: main
    engine: openai
    model: gpt-4-turbo
```
The meaning of the attributes is as follows:
- `type` is set to `main`, indicating the main LLM model.
- `engine` is the LLM provider.
- `model` is the name of the model. The recommended option is `gpt-4-turbo`.
- `parameters` are any additional parameters, for example, `temperature`, `top_k`, and so on (see the example after this list).
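As a minimal sketch, a `models` entry that passes a sampling parameter through the `parameters` key might look like the following (the value here is illustrative, not a recommendation):

```yaml
models:
  - type: main
    engine: openai
    model: gpt-4-turbo
    parameters:
      # Illustrative value; tune for your use case.
      temperature: 0.2
```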
Only one LLM model should be specified in the `models` key.
All NeMo Guardrails supported LLM models and their configurations are also supported by NVIDIA ACE Agent.
LangChain Models#
You can use any LLM provider from the cloud that is supported by LangChain by mentioning its name in the `engine` field. For example:
- `ai21`
- `aleph_alpha`
- `anthropic`
- `anyscale`
- `azure`
- `cohere`
- `huggingface_endpoint`
- `huggingface_hub`
- `openai`
- `self_hosted`
- `self_hosted_hugging_face`
Refer to the LangChain documentation for the complete list.
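As a sketch, switching to one of these providers only requires swapping the `engine` value; substitute a model name offered by your chosen provider for the placeholder:

```yaml
models:
  - type: main
    engine: anthropic
    model: <MODEL_NAME>
```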
NVIDIA AI Endpoints#
You can use any available NVIDIA API Catalog LLM as part of the NVIDIA AI Endpoints model type. Set the `engine` field to `nim` to plug it in with your bot, and provide the model name along with any parameters needed.

```yaml
models:
  - type: main
    engine: nim
    model: <MODEL_NAME>
```
Visit the NVIDIA API Catalog website to browse the LLMs and try them out.
To use a local NIM deployment:

```yaml
models:
  - type: main
    engine: nim
    model: <MODEL_NAME>
    parameters:
      base_url: <NIM_ENDPOINT_URL>
```
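For example, assuming a NIM container serving `meta/llama3-8b-instruct` locally on the OpenAI-compatible route NIM containers typically expose (the URL below is an assumption; substitute the endpoint your deployment actually exposes):

```yaml
models:
  - type: main
    engine: nim
    model: meta/llama3-8b-instruct
    parameters:
      # Assumed local NIM endpoint; adjust to your deployment.
      base_url: http://localhost:8000/v1
```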
To use any of these providers, you will need to install additional packages. When you first try a configuration with a new provider, you will typically receive an error from LangChain telling you which package to install.
Important
You can instantiate any of the LLM providers above; depending on the capabilities of the model, some will work better than others with the NVIDIA ACE Agent toolkit. The toolkit includes prompts that have been optimized for certain types of models (for example, `openai`). For other models, you can optimize the prompts yourself.
The following models have been tested with the Colang 2.0-beta version.
OpenAI models:
- `gpt-3.5-turbo-instruct`
- `gpt-3.5-turbo`
- `gpt-4-turbo`
- `gpt-4o`
- `gpt-4o-mini`
NIM models:
- `meta/llama3-8b-instruct`
- `meta/llama3-70b-instruct`
- `meta/llama-3.1-8b-instruct`
- `meta/llama-3.1-70b-instruct`
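For example, to point the bot at one of the tested NIM models above via the NVIDIA API Catalog (a minimal sketch; any model from the tested list works the same way):

```yaml
models:
  - type: main
    engine: nim
    model: meta/llama3-70b-instruct
```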
Custom LLM Models using the LangChain Wrapper#
To register a custom LLM provider using the LangChain wrapper, you need to create a class that inherits from the `BaseLanguageModel` class provided by LangChain and register it using the `register_llm_provider` utility method provided by NeMo Guardrails. This code can be provided in the initialization file named `config.py`. NVIDIA ACE Agent will automatically execute it before starting the bot.
```python
from langchain.base_language import BaseLanguageModel

from nemoguardrails.llm.providers import register_llm_provider


class CustomLLM(BaseLanguageModel):
    """A custom LLM."""

    # Implement the abstract methods required by BaseLanguageModel
    # (for example, the generation and prediction methods) here.


register_llm_provider("custom_llm", CustomLLM)
```
You can then use the custom LLM provider in your bot configuration file:
```yaml
models:
  - type: main
    engine: custom_llm
```
LLM Prompts#
You can customize the prompts that are used for the various LLM tasks (for example, generate user intent, generate next step, generate bot message) using the `prompts` key. For example, to override the prompt used for the `generate_user_intent` task for the OpenAI `gpt-4-turbo` model:

```yaml
prompts:
  - task: generate_user_intent
    models:
      - gpt-4-turbo
    content: |-
      <<This is a placeholder for a custom prompt for generating the user intent>>
```
The full list of predefined tasks used by the NVIDIA ACE Agent toolkit includes the following:
- `general` generates the next bot message, when no canonical forms are used.
- `generate_user_intent` generates the canonical user message.
- `generate_next_steps` generates the next thing the bot should do or say.
- `generate_bot_message` generates the next bot message.
- `generate_value` generates the value for a context variable (in other words, extracts the user-provided values).
- `fact_checking` checks the facts from the bot response against the provided evidence.
- `jailbreak_check` checks if there is an attempt to break moderation policies.
- `output_moderation` checks if the bot response is harmful, unethical, or illegal.
- `check_hallucination` checks if the bot response is a hallucination.
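The same override pattern shown earlier applies to any task in this list. As a sketch, a custom prompt for the `general` task scoped to one of the tested NIM models might look like this (the content is a placeholder):

```yaml
prompts:
  - task: general
    models:
      - meta/llama3-8b-instruct
    content: |-
      <<This is a placeholder for a custom general prompt>>
```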
You can check the default prompts in the prompts folder of the NeMo Guardrails repository.