# Configuration Reference
The AI-Q blueprint is configured through a single YAML file that defines LLMs, tools, agents, and the workflow. The NeMo Agent Toolkit reads this file at startup and wires everything together.
## Config File Structure

Every config file has four top-level sections:

```yaml
general:    # Telemetry, logging, front-end settings
llms:       # LLM definitions (model, endpoint, parameters)
functions:  # Tools and agents (search tools, classifiers, research agents)
workflow:   # Top-level orchestrator configuration
```
## Environment Variable Substitution

You can reference environment variables anywhere in the YAML using shell-style syntax:

```yaml
# Required variable (fails if not set)
api_key: ${NVIDIA_API_KEY}

# Variable with a default value
checkpoint_db: ${AIQ_CHECKPOINT_DB:-./checkpoints.db}

# Nested in a URL
collection_name: ${COLLECTION_NAME:-test_collection}
```

The syntax `${VAR_NAME}` substitutes the value of the environment variable. The syntax `${VAR_NAME:-default}` provides a fallback value if the variable is not set. Environment variables are typically defined in `deploy/.env` or `.env` at the project root.
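The substitution rules above can be sketched in a few lines of Python. This is an illustration of the `${VAR}` / `${VAR:-default}` semantics only, not the toolkit's actual parser:

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}; the default may be any text up to "}".
_VAR_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)

def substitute_env(text: str) -> str:
    """Resolve shell-style environment references in a config string."""
    def _resolve(match: re.Match) -> str:
        name = match.group("name")
        default = match.group("default")
        value = os.environ.get(name)
        if value is not None:
            return value          # environment wins over the default
        if default is not None:
            return default        # ${VAR:-default} form: fall back
        raise KeyError(f"required environment variable {name} is not set")
    return _VAR_PATTERN.sub(_resolve, text)

os.environ["COLLECTION_NAME"] = "my_docs"
os.environ.pop("AIQ_CHECKPOINT_DB", None)
print(substitute_env("collection_name: ${COLLECTION_NAME:-test_collection}"))
# collection_name: my_docs
print(substitute_env("checkpoint_db: ${AIQ_CHECKPOINT_DB:-./checkpoints.db}"))
# checkpoint_db: ./checkpoints.db
```

Note that a `${VAR}` reference with no `:-default` raises an error when the variable is unset, matching the "fails if not set" behavior described above.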
## `general` Section

Controls telemetry, logging, and the application front-end.

```yaml
general:
  use_uvloop: true            # Use uvloop for better async performance (web mode)
  telemetry:
    logging:
      console:
        _type: console
        level: INFO           # DEBUG, INFO, WARNING, ERROR
    tracing:
      phoenix:                # Optional: Phoenix observability
        _type: phoenix
        endpoint: http://localhost:6006/v1/traces
        project: dev
  front_end:                  # Only for web/API mode
    _type: aiq_api
    runner_class: aiq_api.plugin.AIQAPIWorker
    db_url: ${NAT_JOB_STORE_DB_URL:-sqlite+aiosqlite:///./jobs.db}
    expiry_seconds: 86400
    cors:
      allow_origin_regex: 'http://localhost(:\d+)?|http://127.0.0.1(:\d+)?'
      allow_methods: [GET, POST, DELETE, OPTIONS]
      allow_headers: ["*"]
      allow_credentials: true
      expose_headers: ["*"]
```
| Parameter | Description |
|---|---|
| `use_uvloop` | Enable uvloop for improved async I/O performance. Recommended for web mode. |
| `telemetry.logging.console._type` | Logging backend type. |
| `telemetry.logging.console.level` | Log level: `DEBUG`, `INFO`, `WARNING`, or `ERROR`. |
| `telemetry.tracing` | Optional tracing configuration (Phoenix, OpenTelemetry). |
| `front_end._type` | Front-end type. Use `aiq_api` for web/API mode. |
| `front_end.db_url` | Database URL for async job persistence. |
| `front_end.expiry_seconds` | How long completed jobs remain in the database (seconds). |
| `front_end.cors` | CORS settings for the API server. |
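The `allow_origin_regex` in the example config is matched against the request's `Origin` header. The sketch below illustrates the full-string matching that CORS middleware such as Starlette typically applies; `origin_allowed` is an illustrative helper, not a toolkit function:

```python
import re

# The allow_origin_regex from the example config above.
origin_pattern = re.compile(r"http://localhost(:\d+)?|http://127.0.0.1(:\d+)?")

def origin_allowed(origin: str) -> bool:
    # Full-string match, so "http://localhost.evil.com" is rejected.
    return origin_pattern.fullmatch(origin) is not None

print(origin_allowed("http://localhost:3000"))      # True
print(origin_allowed("http://127.0.0.1"))           # True
print(origin_allowed("https://localhost:3000"))     # False (wrong scheme)
print(origin_allowed("http://localhost.evil.com"))  # False (suffix attack)
```

The optional `(:\d+)?` group accepts any port, which is convenient for local front-end dev servers.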
## `llms` Section

Defines named LLM instances. Each entry gets a user-chosen key (for example, `nemotron_llm`) that agents reference.

```yaml
llms:
  nemotron_llm:
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b
    base_url: "https://integrate.api.nvidia.com/v1"
    temperature: 0.1
    top_p: 0.3
    max_tokens: 16384
    num_retries: 5
    chat_template_kwargs:
      enable_thinking: true
```
| Parameter | Description |
|---|---|
| `_type` | (required) LLM provider type. Use `nim` for NVIDIA NIM endpoints. |
| `model_name` | (required) Model identifier (for example, `nvidia/nemotron-3-nano-30b-a3b`). |
| `base_url` | API endpoint URL. Should always be set explicitly for NVIDIA NIM endpoints. |
| `api_key` | API key. If omitted, uses the `NVIDIA_API_KEY` environment variable. |
| `temperature` | Sampling temperature. Lower values produce more deterministic output. |
| `top_p` | Nucleus sampling threshold. |
| `max_tokens` | Maximum tokens in the response. Set higher values (for example, `128000`) for long-form deep research output. |
| `num_retries` | Number of retry attempts on API failure. |
| `chat_template_kwargs` | Extra arguments passed to the chat template. Use `enable_thinking: true` to enable the model's reasoning mode. |
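Because NVIDIA NIM endpoints expose an OpenAI-compatible API, an `llms` entry maps almost directly onto a `/chat/completions` request body. The toolkit performs this mapping internally; `to_chat_payload` below is a minimal illustrative sketch, not a toolkit function:

```python
import json

# One entry from the `llms` section, as a YAML loader would produce it.
llm_config = {
    "_type": "nim",
    "model_name": "nvidia/nemotron-3-nano-30b-a3b",
    "base_url": "https://integrate.api.nvidia.com/v1",
    "temperature": 0.1,
    "top_p": 0.3,
    "max_tokens": 16384,
}

def to_chat_payload(config: dict, messages: list) -> dict:
    """Map config fields onto an OpenAI-compatible /chat/completions body."""
    return {
        "model": config["model_name"],
        "messages": messages,
        "temperature": config["temperature"],
        "top_p": config["top_p"],
        "max_tokens": config["max_tokens"],
    }

payload = to_chat_payload(llm_config, [{"role": "user", "content": "Hello"}])
print(json.dumps(payload, indent=2))
```

The request would be POSTed to `{base_url}/chat/completions` with the API key in an `Authorization: Bearer` header.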
### Common LLM Configurations
Different agents benefit from different parameter profiles:
| Role | Temperature | Top-p | Max Tokens | Notes |
|---|---|---|---|---|
| Intent classifier | 0.5 | 0.9 | 4096 | Moderate creativity for classification |
| Shallow researcher | 0.1 | 0.3 | 16384 | Low temperature for factual accuracy |
| Deep research orchestrator | 1.0 | 1.0 | 128000 | High temperature with thinking enabled for deep reasoning |
| Summary LLM | – | – | – | Conservative, short output for document summaries |
## `functions` Section

Defines tools and agents. Each entry has a `_type` field that maps to a registered NeMo Agent Toolkit plugin. The key you assign (for example, `web_search_tool`) becomes the name used in `tools` lists.
### `tavily_web_search`

Web search powered by the Tavily API.

```yaml
functions:
  web_search_tool:
    _type: tavily_web_search
    max_results: 5
    max_content_length: 1000

  advanced_web_search_tool:
    _type: tavily_web_search
    max_results: 2
    advanced_search: true
```
| Parameter | Description |
|---|---|
| `max_results` | Maximum number of search results to return. |
| `include_answer` | Whether to include a synthesized answer alongside search results. Tavily returns a direct answer in addition to individual result documents. |
| `api_key` | Tavily API key. Falls back to the `TAVILY_API_KEY` environment variable. |
| `num_retries` | Number of retry attempts on search failure. |
| `advanced_search` | Use Tavily’s advanced search mode for deeper, more thorough results. |
| `max_content_length` | Truncate each result’s content to this many characters. Reduces token usage. |
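`max_content_length` simply clips each result's text before it reaches the LLM, trading completeness for prompt-token savings. An illustrative sketch (not the tool's actual code):

```python
def truncate_results(results: list, max_content_length: int) -> list:
    """Clip each result's content field, mirroring max_content_length."""
    return [
        {**r, "content": r["content"][:max_content_length]}
        for r in results
    ]

results = [{"title": "Doc", "content": "x" * 5000}]
clipped = truncate_results(results, max_content_length=1000)
print(len(clipped[0]["content"]))  # 1000
```

With `max_results: 5` and `max_content_length: 1000`, a single search contributes at most about 5,000 characters of result content to the prompt.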
### `paper_search`

Academic paper search through Google Scholar (using the Serper API).

```yaml
functions:
  paper_search_tool:
    _type: paper_search
    max_results: 5
    serper_api_key: ${SERPER_API_KEY}
```
| Parameter | Description |
|---|---|
| `max_results` | Maximum number of paper results. |
| `serper_api_key` | Serper API key. Falls back to the `SERPER_API_KEY` environment variable. |
| `timeout` | Timeout in seconds for search requests. |
### `knowledge_retrieval`

Semantic search over ingested documents. Supports two backends: LlamaIndex (local ChromaDB) and Foundational RAG (hosted NVIDIA RAG Blueprint).

```yaml
functions:
  # LlamaIndex backend
  knowledge_search:
    _type: knowledge_retrieval
    backend: llamaindex
    collection_name: ${COLLECTION_NAME:-test_collection}
    top_k: 5
    chroma_dir: ${AIQ_CHROMA_DIR:-/tmp/chroma_data}
    generate_summary: true
    summary_model: summary_llm
    summary_db: ${AIQ_SUMMARY_DB:-sqlite+aiosqlite:///./summaries.db}
```

```yaml
functions:
  # Foundational RAG backend
  knowledge_search:
    _type: knowledge_retrieval
    backend: foundational_rag
    collection_name: ${COLLECTION_NAME:-test_collection}
    top_k: 5
    rag_url: ${RAG_SERVER_URL:-http://localhost:8081/v1}
    ingest_url: ${RAG_INGEST_URL:-http://localhost:8082/v1}
    timeout: 300
    # verify_ssl: false  # Only set to false for self-signed certs
```
| Parameter | Description |
|---|---|
| `backend` | Backend type: `llamaindex` or `foundational_rag`. |
| `collection_name` | Name of the document collection/index. |
| `top_k` | Number of results to return per query. |
| `generate_summary` | Generate one-sentence summaries for ingested documents. |
| `summary_model` | LLM reference from the `llms` section used to generate summaries. |
| `summary_db` | Database URL for document summaries (SQLite or PostgreSQL). |
| `chroma_dir` | ChromaDB persistence directory. LlamaIndex backend only. |
| `rag_url` | RAG query server URL. Foundational RAG backend only. |
| `ingest_url` | RAG ingestion server URL. Foundational RAG backend only. |
| `timeout` | Request timeout in seconds. Foundational RAG backend only. |
| `verify_ssl` | Verify SSL certificates. Set to `false` only for self-signed certificates. |
### `intent_classifier`

Classifies user queries as meta (conversational) or research, and determines research depth (shallow vs. deep).

```yaml
functions:
  intent_classifier:
    _type: intent_classifier
    llm: nemotron_llm_intent
    tools:
      - web_search_tool
      - paper_search_tool
    verbose: true
    llm_timeout: 90
```
| Parameter | Description |
|---|---|
| `llm` | (required) Reference to an LLM defined in the `llms` section. |
| `tools` | Tool references passed to the intent prompt for tool-awareness. |
| `verbose` | Enable verbose logging with trace callbacks. |
| `llm_timeout` | Timeout in seconds for the intent classification LLM call. |
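The classifier's output drives the workflow's dispatch. The sketch below illustrates how the `meta`/`research` and `shallow`/`deep` labels select an agent; `Intent` and `route` are hypothetical stand-ins for the workflow's internal logic, not toolkit APIs:

```python
from dataclasses import dataclass

@dataclass
class Intent:
    kind: str   # "meta" (conversational) or "research"
    depth: str  # "shallow" or "deep" (research only)

def route(intent: Intent, enable_escalation: bool = True) -> str:
    """Pick the handler for a classified query."""
    if intent.kind == "meta":
        return "chat_reply"                # answer conversationally
    if intent.depth == "deep" and enable_escalation:
        return "deep_research_agent"       # escalate to deep research
    return "shallow_research_agent"        # default: fast single-pass research

print(route(Intent("research", "deep")))                            # deep_research_agent
print(route(Intent("research", "deep"), enable_escalation=False))   # shallow_research_agent
print(route(Intent("meta", "")))                                    # chat_reply
```

Note how the workflow-level `enable_escalation` flag (see the `workflow` section) gates the deep-research path even when the classifier requests it.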
### `clarifier_agent`

Interactive clarification dialog for deep research queries. Asks follow-up questions to refine scope before research begins.

```yaml
functions:
  clarifier_agent:
    _type: clarifier_agent
    llm: nemotron_llm
    planner_llm: nemotron_llm
    tools:
      - web_search_tool
    max_turns: 3
    enable_plan_approval: true
    max_plan_iterations: 10
    log_response_max_chars: 2000
    verbose: true
```
| Parameter | Description |
|---|---|
| `llm` | (required) LLM for generating clarification questions. |
| `planner_llm` | LLM for plan generation. Falls back to `llm` if not set. |
| `tools` | Tools available for gathering context during clarification. |
| `max_turns` | Maximum number of clarification Q&A turns before auto-completing. |
| `enable_plan_approval` | Show the research plan to the user for approval after clarification. |
| `max_plan_iterations` | Maximum plan feedback iterations before auto-approving. |
| `log_response_max_chars` | Maximum characters to log from LLM responses. |
| `verbose` | Enable verbose logging. |
### `shallow_research_agent`

Fast, single-pass research agent that produces citation-backed answers in one tool-calling loop.

```yaml
functions:
  shallow_research_agent:
    _type: shallow_research_agent
    llm: nemotron_llm
    tools:
      - web_search_tool
      - knowledge_search
    max_llm_turns: 10
    max_tool_iterations: 5
    verbose: true
```
| Parameter | Description |
|---|---|
| `llm` | (required) LLM for research and synthesis. |
| `tools` | Search tools available to the agent. |
| `max_llm_turns` | Maximum number of LLM turns (includes both reasoning and tool-calling steps). |
| `max_tool_iterations` | Maximum tool-calling iterations before forcing synthesis. |
| `verbose` | Enable verbose logging. |
### `deep_research_agent`

Multi-phase research agent that produces long-form reports, built from separate orchestrator, planner, and researcher sub-agents.

```yaml
functions:
  deep_research_agent:
    _type: deep_research_agent
    orchestrator_llm: nemotron_llm_deep
    researcher_llm: nemotron_llm_deep
    planner_llm: nemotron_llm_deep
    tools:
      - paper_search_tool
      - advanced_web_search_tool
      - knowledge_search
    max_loops: 2
    verbose: true
```
| Parameter | Description |
|---|---|
| `orchestrator_llm` | (required) LLM for the orchestrator that coordinates the research workflow. |
| `researcher_llm` | LLM for the researcher sub-agent. Falls back to `orchestrator_llm` if not set. |
| `planner_llm` | LLM for the planner sub-agent. Falls back to `orchestrator_llm` if not set. |
| `tools` | Search tools available to the researcher sub-agent. |
| `max_loops` | Maximum number of orchestrator planning/research loops. |
| `verbose` | Enable verbose logging. |
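How `max_loops` bounds the agent can be pictured schematically. In this sketch, `plan`, `research`, and `is_sufficient` are hypothetical placeholders for the planner, researcher, and orchestrator steps, not toolkit APIs; the stub implementations exist only to make the control flow runnable:

```python
# --- hypothetical stubs for the three sub-agent roles ---
def plan(query, findings):
    # Planner: propose the next research topics given what is known so far.
    return [f"{query} (round {len(findings) + 1})"]

def research(topic):
    # Researcher: gather findings for one topic (would call search tools).
    return f"notes on {topic}"

def is_sufficient(query, findings):
    # Orchestrator: decide whether the findings cover the query.
    return len(findings) >= 2

def deep_research(query: str, max_loops: int = 2) -> list:
    """Plan/research loop bounded by max_loops."""
    findings = []
    for _ in range(max_loops):
        topics = plan(query, findings)              # planner sub-agent
        findings += [research(t) for t in topics]   # researcher sub-agent
        if is_sufficient(query, findings):          # orchestrator decides
            break
    return findings

print(deep_research("quantum batteries"))
```

Raising `max_loops` allows more plan/research rounds at the cost of more LLM and tool calls per report.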
## `workflow` Section

Defines the top-level orchestrator that wires together all agents.

```yaml
workflow:
  _type: chat_deepresearcher_agent
  enable_escalation: true
  enable_clarifier: true
  use_async_deep_research: true
  max_history: 20
  verbose: true
  checkpoint_db: ${AIQ_CHECKPOINT_DB:-./checkpoints.db}
```
| Parameter | Description |
|---|---|
| `_type` | (required) Workflow type. Use `chat_deepresearcher_agent`. |
| `enable_escalation` | Allow the intent classifier to route queries to deep research. |
| `enable_clarifier` | Run the clarifier agent before deep research to gather user requirements. |
| `use_async_deep_research` | Submit deep research as an async background job (requires Dask scheduler). |
| `max_history` | Maximum number of messages to keep in conversation history before trimming. |
| `verbose` | Enable verbose logging. |
| `checkpoint_db` | SQLite path or PostgreSQL DSN for persistent conversation checkpoints. |
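`max_history` trimming amounts to keeping a sliding window over the conversation. A minimal sketch, assuming simple oldest-first eviction (a real implementation may additionally preserve system messages):

```python
def trim_history(messages: list, max_history: int = 20) -> list:
    """Keep only the most recent max_history messages."""
    if len(messages) <= max_history:
        return messages
    return messages[-max_history:]   # drop the oldest messages

history = [{"role": "user", "content": f"msg {i}"} for i in range(25)]
print(len(trim_history(history)))           # 20
print(trim_history(history)[0]["content"])  # msg 5
```

Smaller values keep prompts short and cheap; larger values preserve more conversational context across turns.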
**Note:** `interactive_auth` is a YAML-level field consumed by the CLI entry point (`start_cli.sh` / `aiq-research`), not a Pydantic field on `ChatDeepResearcherConfig`. It can be set in YAML config files but is not part of the workflow config class.
## Complete Annotated Example

Below is a complete configuration for CLI mode with web search, paper search, and clarification enabled:

```yaml
# General settings
general:
  telemetry:
    logging:
      console:
        _type: console
        level: INFO  # Set to DEBUG for troubleshooting

# LLM definitions
llms:
  intent_llm:  # Used by intent classifier
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b
    base_url: "https://integrate.api.nvidia.com/v1"
    temperature: 0.5
    top_p: 0.9
    max_tokens: 4096
    num_retries: 5
    chat_template_kwargs:
      enable_thinking: true

  research_llm:  # Used by shallow researcher + clarifier
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b
    base_url: "https://integrate.api.nvidia.com/v1"
    temperature: 0.1
    top_p: 0.3
    max_tokens: 16384
    num_retries: 5
    chat_template_kwargs:
      enable_thinking: true

  deep_llm:  # Used by deep research orchestrator
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b
    base_url: "https://integrate.api.nvidia.com/v1"
    temperature: 1.0
    top_p: 1.0
    max_tokens: 128000
    num_retries: 5
    chat_template_kwargs:
      enable_thinking: true

# Tools and agents
functions:
  web_search_tool:  # Standard web search
    _type: tavily_web_search
    max_results: 5
    max_content_length: 1000

  advanced_web_search_tool:  # Deep search (fewer results, more depth)
    _type: tavily_web_search
    max_results: 2
    advanced_search: true

  paper_search_tool:  # Academic paper search
    _type: paper_search
    max_results: 5
    serper_api_key: ${SERPER_API_KEY}

  intent_classifier:  # Classifies queries, routes depth
    _type: intent_classifier
    llm: intent_llm
    tools:
      - web_search_tool
      - paper_search_tool

  clarifier_agent:  # Asks clarifying questions for deep research
    _type: clarifier_agent
    llm: research_llm
    planner_llm: research_llm
    tools:
      - web_search_tool
    max_turns: 3
    enable_plan_approval: true
    verbose: true

  shallow_research_agent:  # Fast single-pass research
    _type: shallow_research_agent
    llm: research_llm
    tools:
      - web_search_tool
    max_llm_turns: 10
    max_tool_iterations: 5

  deep_research_agent:  # Multi-phase deep research
    _type: deep_research_agent
    orchestrator_llm: deep_llm
    tools:
      - paper_search_tool
      - advanced_web_search_tool
    max_loops: 2

# Top-level orchestrator
workflow:
  _type: chat_deepresearcher_agent
  enable_escalation: true  # Allow deep research routing
  enable_clarifier: true   # Ask clarifying questions first
  checkpoint_db: ${AIQ_CHECKPOINT_DB:-./checkpoints.db}
```
## Provided Config Files

The repository includes several pre-built configurations:

| Mode | Features |
|---|---|
| CLI | Web search, paper search, clarifier with plan approval |
| Web API | LlamaIndex knowledge retrieval, web search, paper search |
| Web API | Foundational RAG knowledge retrieval, web search, paper search |