Quick Start#

Run your first research query in 5 minutes. This guide assumes you have already completed Installation.

Step 1: Set API Keys#

If you have not already, create the environment file and add your keys:

cp deploy/.env.example deploy/.env

Edit deploy/.env and set at minimum:

NVIDIA_API_KEY=nvapi-...
TAVILY_API_KEY=tvly-...
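Before launching, you can sanity-check the file with a short Python snippet. This is a minimal sketch, not part of the blueprint: the expected key prefixes (`nvapi-`, `tvly-`) come from the placeholders above, and the `parse_env_file` helper ignores quoting, `export` syntax, and other dotenv features a full parser would handle.

```python
REQUIRED_KEYS = {
    "NVIDIA_API_KEY": "nvapi-",
    "TAVILY_API_KEY": "tvly-",
}

def missing_keys(env: dict) -> list:
    """Return required keys that are absent or lack the expected prefix."""
    return [
        name for name, prefix in REQUIRED_KEYS.items()
        if not env.get(name, "").startswith(prefix)
    ]

def parse_env_file(path: str) -> dict:
    """Naively parse KEY=VALUE lines, skipping blanks and comments."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                env[key.strip()] = value.strip()
    return env

if __name__ == "__main__":
    problems = missing_keys(parse_env_file("deploy/.env"))
    if problems:
        print("Missing or malformed:", ", ".join(problems))
    else:
        print("deploy/.env looks good")
```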

Step 2: Choose a Mode#

The AI-Q blueprint supports two primary modes for interactive use: a terminal-based CLI and a browser-based web UI.

Option A: CLI Mode#

The CLI provides an interactive research assistant in your terminal.

source .venv/bin/activate

# Using the convenience script
./scripts/start_cli.sh

# Or run directly with the custom CLI entry point
dotenv -f deploy/.env run .venv/bin/aiq-research --config_file configs/config_cli_default.yml

Note: start_cli.sh runs .venv/bin/aiq-research (a custom CLI entry point registered by this project), not nat run. The custom entry point adds interactive features like conversation history that are not part of the standard NeMo Agent Toolkit CLI.

The agent starts up and presents an input prompt where you can type questions.

Option B: Web UI Mode#

For a browser-based experience with a chat interface:

source .venv/bin/activate

./scripts/start_e2e.sh

This starts:

  • Backend API at http://localhost:8000

  • Frontend UI at http://localhost:3000

Open http://localhost:3000 in your browser.
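If the page does not load, you can confirm both ports are reachable with a small Python check. This is a generic TCP probe and an illustrative sketch only; it does not assume any particular API endpoint exists on either service.

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in [("backend", 8000), ("frontend", 3000)]:
        status = "up" if port_open("localhost", port) else "DOWN"
        print(f"{name} (port {port}): {status}")
```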

Note

The web UI requires Node.js 22+ and npm. If these were available during ./scripts/setup.sh, UI dependencies are already installed. Otherwise, run cd frontends/ui && npm ci first.

Tip

Running on a remote VM? If you access the VM via SSH, forward ports 3000 and 8000 to your local machine:

ssh -L 3000:localhost:3000 -L 8000:localhost:8000 user@your-vm-host

See Troubleshooting — VM / Remote Development for details.

Step 3: Ask a Question#

Try one of these example queries to see the system in action:

Shallow research (fast, concise answers):

What is CUDA and how does it relate to GPU programming?

What are the key differences between TCP and UDP?

Deep research (detailed, report-style output):

Compare the transformer architectures used in GPT-4 and Gemini, including their training approaches, parameter counts, and benchmark performance.

What are the current approaches to solving the protein folding problem, and how do AlphaFold and RoseTTAFold compare?

What to Expect#

  • Shallow queries produce a concise answer with inline citations and source links within a few seconds.

  • Deep queries trigger a multi-phase research process (planning, research, synthesis) that produces a structured report with a table of contents, inline citations, and a references section. This takes longer (typically 1–3 minutes depending on complexity).

The system automatically routes queries to the appropriate depth based on complexity. You do not need to specify shallow vs. deep – the orchestration node decides for you.
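The routing itself is internal to the blueprint, but as a purely illustrative sketch (not the project's actual logic, which lives in the orchestration node), a depth router has roughly this shape:

```python
# Illustrative only: the real orchestration node decides depth itself;
# this toy heuristic just shows the shape of the decision.
DEEP_HINTS = ("compare", "analyze", "report", "approaches", "trade-offs")

def choose_depth(query: str) -> str:
    """Classify a query as 'shallow' or 'deep' by length and keywords."""
    q = query.lower()
    if len(q.split()) > 15 or any(hint in q for hint in DEEP_HINTS):
        return "deep"
    return "shallow"
```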

Example CLI Session#

$ ./scripts/start_cli.sh
============================================
  AI-Q Blueprint - CLI Mode
============================================

Config: config_cli_default.yml
Verbose: OFF (use -v to enable)

Type 'exit' or 'quit' to exit
--------------------------------------------

   NVIDIA AI-Q Blueprint
   Research Assistant powered by NVIDIA NeMo Agent Toolkit

AI-Q initialized!
Type 'exit', 'quit', or 'q' to quit.

You: What is the NVIDIA NeMo Agent Toolkit?

  Answer
  The NVIDIA NeMo Agent Toolkit (NAT) is a framework for building and deploying
  AI agents and multi-agent systems...

  Sources:
  [1] https://docs.nvidia.com/nemo/agent-toolkit/latest/ - NeMo Agent Toolkit Documentation
  [2] https://developer.nvidia.com/nemo - NVIDIA NeMo Overview

Verbose Logging#

To view detailed agent execution logs (tool calls, routing decisions, LLM interactions):

./scripts/start_cli.sh --verbose

What’s Next#

Now that you have the system running, explore these topics:

  • Architecture Overview – Understand how the orchestrator, shallow researcher, and deep researcher work together

  • Customization – Swap models, configure tools, adjust prompts, and tune agent behavior

  • Deployment – Run with Docker Compose

  • Evaluation – Measure research quality with built-in benchmarks