Is this page helpful?

Hybrid Frontier Model#

This example uses NVIDIA NIM models for intent classification and shallow research (fast, low-cost) and a frontier model for deep research (higher quality reports).

Prerequisites#

NVIDIA_API_KEY for NIM models
LLM_API_KEY_FOR_FRONTIER_MODEL for frontier model (state-of-the-art models for high-quality outputs; examples include GPT-5.2 from OpenAI, OPUS-4.6 from Anthropic, etc.)
TAVILY_API_KEY for web search

Configuration#

llms:
  # NIM models for fast paths (intent + shallow)
  nemotron_llm_intent:
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b
    base_url: "https://integrate.api.nvidia.com/v1"
    temperature: 0.5
    max_tokens: 4096

  nemotron_nano_llm:
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b
    base_url: "https://integrate.api.nvidia.com/v1"
    temperature: 0.1
    max_tokens: 16384

  # Frontier model for deep research (higher quality)
  frontier_llm:
    _type: openai
    model_name: ${FRONTIER_MODEL_NAME}
    api_key: ${LLM_API_KEY_FOR_FRONTIER_MODEL}
    base_url: ${API_URL_FOR_FRONTIER_MODEL}
    temperature: 1.0
    max_tokens: 128000

functions:
  web_search_tool:
    _type: tavily_web_search
    max_results: 5

  advanced_web_search_tool:
    _type: tavily_web_search
    max_results: 2
    advanced_search: true

  knowledge_search:
    _type: knowledge_retrieval
    backend: llamaindex
    collection_name: ${COLLECTION_NAME:-test_collection}
    top_k: 5
    chroma_dir: ${AIQ_CHROMA_DIR:-/tmp/chroma_data}

  intent_classifier:
    _type: intent_classifier
    llm: nemotron_llm_intent
    tools:
      - web_search_tool
      - knowledge_search

  clarifier_agent:
    _type: clarifier_agent
    llm: nemotron_nano_llm
    planner_llm: nemotron_nano_llm
    tools:
      - web_search_tool
      - knowledge_search
    max_turns: 3
    enable_plan_approval: true

  shallow_research_agent:
    _type: shallow_research_agent
    llm: nemotron_nano_llm
    tools:
      - web_search_tool
      - knowledge_search
    max_llm_turns: 10
    max_tool_iterations: 5

  deep_research_agent:
    _type: deep_research_agent
    orchestrator_llm: frontier_llm    # Frontier model here
    max_loops: 2
    tools:
      - advanced_web_search_tool
      - knowledge_search

workflow:
  _type: chat_deepresearcher_agent
  enable_escalation: true
  enable_clarifier: true

How to Run#

Save the configuration above to configs/config_hybrid_frontier.yml, then run:

# Web mode (recommended)
dotenv -f deploy/.env run .venv/bin/nat serve \
  --config_file configs/config_hybrid_frontier.yml

# CLI interactive mode
./scripts/start_cli.sh \
  --config_file configs/config_hybrid_frontier.yml

How It Works#

The hybrid setup keeps shallow queries fast and cheap (NIM models respond in seconds) while routing complex deep research to a frontier model that produces higher-quality reports.