Prerequisites#

This section covers the prerequisites for deploying and using the VSS Blueprint.

Hardware Requirements#

The VSS Blueprint has been validated and tested on the following NVIDIA GPUs:

  • NVIDIA H100

  • NVIDIA RTX PRO 6000 Blackwell

  • NVIDIA L40S

  • NVIDIA DGX SPARK

  • NVIDIA IGX Thor

  • NVIDIA AGX Thor

For GPUs that are not validated and tested, the OTHER configuration is available (experimental). See the deployment steps in each workflow guide for the GPU tab options.

Note

If you’re looking for a pre-configured environment to try out the VSS Blueprint, please refer to the Brev Launchable section.

Software Requirements#

The following software must be installed on your system:

  • OS:

    • x86 hosts: Ubuntu 22.04 or Ubuntu 24.04

    • DGX-SPARK: DGX OS 7.4.0

    • IGX-THOR: Jetson Linux BSP (Rel 38.5)

    • AGX-THOR: Jetson Linux BSP (Rel 38.4)

  • NVIDIA Driver:

    • 580.105.08 (x86 hosts with Ubuntu 24.04)

    • 580.65.06 (x86 hosts with Ubuntu 22.04)

    • 580.95.05 (DGX-SPARK)

    • 580.00 (IGX-THOR and AGX-THOR)

  • NVIDIA Container Toolkit: 1.17.8+

  • Docker: 27.2.0+

  • Docker Compose: v2.29.0+

  • NGC CLI: 4.10.0+
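
After installing these components, the minimum versions can be checked with a small helper. The sketch below uses `sort -V` for version comparison; the Docker check is only an example (the helper name is illustrative), and the same pattern applies to the other tools.

```shell
# Sketch: compare an installed version against a required minimum.
# version_ge A B succeeds if version A >= version B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: verify the Docker requirement (27.2.0+), if Docker is installed.
if command -v docker > /dev/null; then
    installed="$(docker --version | sed -E 's/.*version ([0-9.]+).*/\1/')"
    if version_ge "$installed" "27.2.0"; then
        echo "Docker $installed meets the 27.2.0+ requirement"
    else
        echo "Docker $installed is older than 27.2.0"
    fi
fi
```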

Note

For detailed installation instructions for the software requirements listed above, see the Appendix: Software Prerequisites Installation Guide.

Runtime Environment Settings#

In addition to the above prerequisites, the following runtime environment settings are required:

Linux Kernel Settings#

# Ensure the sysctl.d directory exists
sudo mkdir -p /etc/sysctl.d

# Write VSS kernel settings to 99-vss.conf (persistent across reboots)
sudo bash -c "printf '%s\n' \
  'net.ipv6.conf.all.disable_ipv6 = 1' \
  'net.ipv6.conf.default.disable_ipv6 = 1' \
  'net.ipv6.conf.lo.disable_ipv6 = 1' \
  'net.core.rmem_max = 5242880' \
  'net.core.wmem_max = 5242880' \
  'net.ipv4.tcp_rmem = 4096 87380 16777216' \
  'net.ipv4.tcp_wmem = 4096 65536 16777216' \
  > /etc/sysctl.d/99-vss.conf"

# Reload sysctl to apply the new settings
sudo sysctl --system
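
To confirm the file was written with all expected keys, a quick check along these lines can help (a sketch; the helper name is illustrative):

```shell
# check_vss_sysctl FILE prints ok/missing for each expected VSS kernel setting.
check_vss_sysctl() {
    local conf="$1" key
    for key in net.ipv6.conf.all.disable_ipv6 net.ipv6.conf.default.disable_ipv6 \
               net.ipv6.conf.lo.disable_ipv6 net.core.rmem_max net.core.wmem_max \
               net.ipv4.tcp_rmem net.ipv4.tcp_wmem; do
        if grep -q "^$key" "$conf" 2> /dev/null; then
            echo "ok: $key"
        else
            echo "missing: $key"
        fi
    done
}

check_vss_sysctl /etc/sysctl.d/99-vss.conf
```

The live kernel values can also be inspected directly, for example with `sysctl net.core.rmem_max`.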

Power Mode (only on IGX-THOR and AGX-THOR)#

# Set max power mode
# NOTE: the below command requires a reboot to take effect
sudo nvpmodel -m 0

After reboot:

# Set max clocks
sudo jetson_clocks

Note

The jetson_clocks settings are reset to defaults when the system is rebooted, so rerun jetson_clocks after every boot.

Cache cleaner (only on DGX-SPARK, IGX-THOR and AGX-THOR)#

Run the cache cleaner script for DGX SPARK and Jetson Thor.

Create the cache cleaner script at /usr/local/bin/sys-cache-cleaner.sh:

sudo tee /usr/local/bin/sys-cache-cleaner.sh << 'EOF'
#!/bin/bash
# Exit immediately if any command fails
set -e

# Disable hugepages
echo "disabling vm/nr_hugepages"
echo 0 | tee /proc/sys/vm/nr_hugepages

# Notify that the cache cleaner is running
echo "Starting cache cleaner - Running"
echo "Press Ctrl + C to stop"
# Repeatedly sync and drop caches every 3 seconds
while true; do
     sync && echo 3 | tee /proc/sys/vm/drop_caches > /dev/null
     sleep 3
done
EOF

sudo chmod +x /usr/local/bin/sys-cache-cleaner.sh

Running in the background#

sudo -b /usr/local/bin/sys-cache-cleaner.sh

Note

The above runs the cache cleaner in the current session only; it does not persist across reboots. To have the cache cleaner run across reboots, create a systemd service instead.
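
A minimal systemd service for this could look like the following sketch; the unit name and options are illustrative, not part of the blueprint:

```shell
# Sketch: install the cache cleaner as a systemd service (unit name is illustrative)
sudo tee /etc/systemd/system/sys-cache-cleaner.service << 'EOF'
[Unit]
Description=VSS cache cleaner

[Service]
ExecStart=/usr/local/bin/sys-cache-cleaner.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Enable the service now and at every boot
sudo systemctl daemon-reload
sudo systemctl enable --now sys-cache-cleaner.service
```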

Minimum System Requirements#

  • 18-core CPU (x86 systems)

  • 128 GB RAM

  • 1 TB SSD

  • 1 x 1 Gbps network interface

  • 1 or 2 NVIDIA GPUs, depending on the development profile

NGC API Key Access#

To deploy the VSS Blueprint, you need an NGC API key with access to the required resources.

Create an NVIDIA NGC Account#

  1. Go to https://ngc.nvidia.com

  2. Click Sign up if you don’t have an account, or Sign in if you do

  3. Complete the registration process with your email address

Generate an API Key#

  1. Once logged in, click on your username in the top right corner

  2. Select Setup from the dropdown menu

  3. Navigate to API Keys under the Keys/Secrets section

  4. Click Generate API Key

  5. Click Generate Personal Key (available at both the top-center and top-right of the page)

  6. Provide a descriptive name for your API key (e.g., “VSS Blueprint Development”)

  7. Select NGC Catalog for Key Permissions

  8. Click Generate Personal Key

  9. Copy the generated API key immediately; you won’t be able to see it again

  10. Store it securely in a password manager or encrypted file

Note

Keep your NGC API key secure and never commit it to version control systems.

Verify Your NGC Access#

To verify your NGC key has the correct permissions:

# Set your NGC API key
export NGC_CLI_API_KEY='your_ngc_api_key'

# Test access to the required resources
ngc registry resource list "nvidia/vss-developer/*"
ngc registry resource list "nvidia/vss-core/*"

# Test access to the required images
ngc registry image list "nvidia/vss-core/*"

# Test access to the required models
ngc registry model list "nvidia/tao/*"

If you encounter permission errors, contact NVIDIA support for assistance.
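
To run all of the checks above in one pass and see which ones fail, a small wrapper can help (a sketch; the helper name is illustrative):

```shell
# check_access LABEL CMD... runs CMD quietly and reports PASS/FAIL without aborting.
check_access() {
    local label="$1"; shift
    if "$@" > /dev/null 2>&1; then
        echo "PASS: $label"
    else
        echo "FAIL: $label (check key permissions and org)"
    fi
}

# Example usage with the checks from above:
# check_access "vss-developer resources" ngc registry resource list "nvidia/vss-developer/*"
# check_access "vss-core resources"      ngc registry resource list "nvidia/vss-core/*"
# check_access "vss-core images"         ngc registry image list "nvidia/vss-core/*"
# check_access "tao models"              ngc registry model list "nvidia/tao/*"
```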

Development Profile GPU Requirements#

The following tables show the number of GPUs needed for each development profile:

| Profile | Shared GPU | Dedicated GPU | Remote LLM | Remote VLM | Remote LLM + VLM |
|---|---|---|---|---|---|
| dev-profile-base | 1 | 2 | 1 | 1 | 0 |
| dev-profile-lvs | 1 | 2 | 1 | 1 | 0 |
| dev-profile-alerts (Alert verification with VLM) [1] | 2 | 3 | 2 | 2 | 1 |
| dev-profile-alerts (Real-Time Alerts with VLM) [1] | 2 | 3 | 2 | | |
| dev-profile-search [2] | 2 | 3 | 2 | 2 | 2 |


| Profile | Dedicated GPU | Remote LLM | Remote VLM | Remote LLM + VLM |
|---|---|---|---|---|
| dev-profile-base | 2 | 1 | 1 | 0 |
| dev-profile-lvs | 2 | 1 | 1 | 0 |
| dev-profile-alerts (Alert verification with VLM) [1] | 3 | 2 | 2 | 1 |
| dev-profile-alerts (Real-Time Alerts with VLM) [1] | 3 | 2 | | |
| dev-profile-search [2] | 3 | 2 | 2 | 2 |

Note

AGX/IGX Thor and DGX Spark platforms are partially supported in early access; fully local deployment of all agent workflows (base, summarization, alerts, and search) will be available in upcoming releases.

References#

[1] GPU device ‘0’ is needed to run alerts.

[2] GPU devices ‘0’ and ‘1’ are needed to run search.

VLM Custom Weights#

The VSS Blueprint supports using VLM custom weights for specialized use cases. This section explains how to download custom weights from NGC or Hugging Face.

Download Custom Weights from NGC#

If VLM custom weights are available on NGC, you can download them using the NGC CLI:

# Set your NGC API key if not already set
export NGC_CLI_API_KEY='your_ngc_api_key'

# Download custom weights from NGC
ngc registry model download-version <org>/<team>/<model>:<version>

# Move downloaded custom weights to desired path/folder
mv </downloaded/path/in/ngc/output> </path/to/custom/weights>

The weights will be downloaded to a local directory. Note the path to this directory, as you’ll need to specify it when deploying with the --vlm-custom-weights flag.

Download Custom Weights from Hugging Face#

To download custom VLM weights from Hugging Face, you can use the huggingface-cli tool or Git LFS:

Using huggingface-cli:

# Install Hugging Face CLI if not already installed
pip install -U "huggingface_hub[cli]"

# Login to Hugging Face (required for gated models)
hf auth login

# Create a base directory where the custom weights can be downloaded
mkdir -p </path/to/custom/weights>

# Download a model
hf download <model-id> --local-dir </path/to/custom/weights>

Using Git LFS:

# Install Git LFS if not already installed
sudo apt install git-lfs
git lfs install

# Clone the model repository
git clone https://huggingface.co/<model-id> </path/to/custom/weights>

Note

Some Hugging Face models may be gated and require access approval. Visit the model page on Hugging Face and request access if needed. You must also authenticate using hf auth login before downloading gated models.

Verify Downloaded Weights#

After downloading custom weights, verify the directory structure and contents:

# List the contents of the weights directory
ls -lh </path/to/custom/weights>

Typical VLM weight directories contain:

  • Model configuration files (e.g., config.json)

  • Model weights (e.g., pytorch_model.bin, model.safetensors, or sharded weights)

  • Tokenizer files (e.g., tokenizer.json, tokenizer_config.json)

  • Other metadata files
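
The presence of these files can be sanity-checked with a short helper (a sketch; the helper name is illustrative, and the file names follow common Hugging Face conventions rather than a strict VSS requirement):

```shell
# check_weights_dir DIR reports whether typical VLM weight files are present.
check_weights_dir() {
    local dir="$1"
    if [ -f "$dir/config.json" ]; then
        echo "found: config.json"
    else
        echo "missing: config.json"
    fi
    # Accept either safetensors or legacy PyTorch bin files (possibly sharded)
    if ls "$dir"/*.safetensors > /dev/null 2>&1 || ls "$dir"/pytorch_model*.bin > /dev/null 2>&1; then
        echo "found: model weights"
    else
        echo "missing: model weights"
    fi
}

# Example (placeholder path as used above):
# check_weights_dir </path/to/custom/weights>
```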

Common Installation Issues#

When pulling resources from NGC, you may encounter the following issues:

Error: Missing org - If Authenticated, org is also required.

Steps to resolve:

  1. Generate an NGC API key with your desired org selected.

  2. Ensure that the org you selected in ngc config set is the same as the org you selected in the NGC API key generation.

Appendix: Software Prerequisites Installation Guide#

Install NVIDIA Driver#

Ubuntu 24.04#

NVIDIA Driver version 580.105.08 is required with Ubuntu 24.04.

You can download the driver directly from: https://www.nvidia.com/en-us/drivers/details/257738/

Additional resources#

For detailed installation instructions, refer to the official NVIDIA Driver Installation Guide: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/index.html

You can also browse drivers for your specific GPU and platform from the NVIDIA Driver Downloads page: https://www.nvidia.com/Download/index.aspx

Note

After installation, verify the driver is correctly installed by running nvidia-smi to confirm the driver version and GPU detection.

Note

NVIDIA Fabric Manager Requirement

NVIDIA Fabric Manager is required on systems with multiple GPUs that are connected using NVLink or NVSwitch technology. This typically applies to:

  • NVIDIA DGX systems — e.g., DGX A100, DGX H100, DGX-2, DGX Station (multi-GPU)

  • HGX platforms — e.g., H100 4-GPU/8-GPU baseboards

  • NVSwitch-based systems — servers with NVSwitch fabric (e.g., 8-way HGX B200/H100 with NVSwitch)

  • Multi-GPU servers with NVLink — OEM systems that use NVLink to connect GPUs rather than PCIe only

  • Datacenter GPUs with NVLink — hosts with NVIDIA H100, B200, or similar GPUs in an NVLink topology

Fabric Manager is not required for:

  • Single GPU systems

  • Multi-GPU systems without NVLink/NVSwitch (PCIe-only configurations)

For full details, refer to the official NVIDIA Fabric Manager documentation.

Example setup on Ubuntu 24.04. Use Fabric Manager version 580.105.08 (see Install NVIDIA Driver for driver requirements):

# Install Fabric Manager 580.105.08 (version pin may vary by repository; use the package version that matches 580.105.08)
sudo apt-get update
sudo apt-get install -y nvidia-fabricmanager-580=580.105.08-1

# Enable and start the service
sudo systemctl enable nvidia-fabricmanager
sudo systemctl start nvidia-fabricmanager

# Verify the service is running
sudo systemctl status nvidia-fabricmanager

If that package version is not available from your apt repository, install from the NVIDIA Fabric Manager archive for 580.105.08: https://developer.download.nvidia.com/compute/nvidia-driver/redist/fabricmanager/linux-x86_64/fabricmanager-linux-x86_64-580.105.08-archive.tar.xz

Typically the package version will be the same as the driver version.

Install Docker#

Docker version 27.2.0+ is recommended. Follow the guide here for installing Docker: https://docs.docker.com/engine/install/ubuntu/

Run docker without sudo#

After installation, complete the Linux post-installation steps so that Docker can run without sudo. See Linux post-installation steps for Docker Engine.

Configure Docker#

After installing Docker, you must configure it to use the cgroupfs cgroup driver.

Edit the Docker daemon configuration file:

Add or verify the following entry in /etc/docker/daemon.json:

"exec-opts": ["native.cgroupdriver=cgroupfs"]

This configuration must be included within the JSON object. For example:

{
    "exec-opts": ["native.cgroupdriver=cgroupfs"]
}

If your /etc/docker/daemon.json already contains other settings, ensure this entry is added to the existing configuration.
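
One way to merge the entry into an existing file without losing other settings is with jq (a sketch; assumes jq is installed, and the helper name is illustrative):

```shell
# add_cgroupfs_opt FILE adds the exec-opts entry to FILE, preserving other settings.
add_cgroupfs_opt() {
    local f="$1"
    [ -s "$f" ] || echo '{}' > "$f"   # start from an empty object if missing
    jq '."exec-opts" = ["native.cgroupdriver=cgroupfs"]' "$f" > "$f.new" && mv "$f.new" "$f"
}

# Example (run as root on the target host):
# add_cgroupfs_opt /etc/docker/daemon.json
```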

Apply the configuration:

After editing the daemon configuration, restart Docker to apply the changes:

sudo systemctl daemon-reload
sudo systemctl restart docker

Note

Restarting Docker will temporarily stop all running containers. Plan accordingly if you have containers running.

Install Docker Compose#

Docker Compose v2.29.0 or later is required. Docker Compose v2 is typically installed as a plugin with modern Docker installations.

Verify if Docker Compose is already installed:

docker compose version

If not installed, install Docker Compose plugin:

sudo apt update
sudo apt install docker-compose-plugin

Verify installation:

docker compose version

Note

Docker Compose v2 uses the command docker compose (without hyphen) instead of the older docker-compose command.

Install NVIDIA Container Toolkit#

The NVIDIA Container Toolkit is required to run the NVIDIA containers. Follow the guide here for installing the NVIDIA Container Toolkit: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

Install NGC CLI#

NGC CLI version 4.10.0 or later is required to download blueprints and authenticate with NGC resources.

Download and install NGC CLI:

For ARM64 Linux:

curl -sLo "/tmp/ngccli.zip" https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.10.0/files/ngccli_arm64.zip

For AMD64 Linux:

curl -sLo "/tmp/ngccli.zip" https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.10.0/files/ngccli_linux.zip
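
The two downloads above can be combined into an architecture-aware sketch using uname -m (the helper name is illustrative):

```shell
# ngc_cli_url prints the NGC CLI 4.10.0 download URL for the current architecture.
ngc_cli_url() {
    local base="https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.10.0/files"
    case "$(uname -m)" in
        x86_64)        echo "$base/ngccli_linux.zip" ;;
        aarch64|arm64) echo "$base/ngccli_arm64.zip" ;;
        *) echo "unsupported architecture: $(uname -m)" >&2; return 1 ;;
    esac
}

# Then download with:
# curl -sLo "/tmp/ngccli.zip" "$(ngc_cli_url)"
```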

After downloading, install NGC CLI:

sudo mkdir -p /usr/local/bin
sudo unzip -qo /tmp/ngccli.zip -d /usr/local/lib
sudo chmod +x /usr/local/lib/ngc-cli/ngc
sudo ln -sfn /usr/local/lib/ngc-cli/ngc /usr/local/bin/ngc

Verify installation:

ngc --version

Configure NGC CLI with your API key:

ngc config set

When prompted, enter your NGC API key. For information on obtaining an NGC API key, see the NGC API Key Access section.

Setup DGX-SPARK#

For setup instructions, see the DGX Spark User Guide.

Setup installs DGX OS (including the NVIDIA driver), Docker with NVIDIA Container Toolkit integration, and NGC access.

Setup AGX-THOR#

Set up Rel 38.4 of the Jetson Linux BSP by following the instructions in the Jetson AGX Thor Developer Kit User Guide.

  • Setup installs Jetson Linux BSP (including the NVIDIA driver).

  • Install JetPack after the BSP is in place (for CUDA and other components), as described in the user guide.

Setup IGX-THOR#

For IGX Thor, the IGX 2.0 GA ISO and Jetson BSP Rel 38.5 are required. Instructions for setting up Jetson BSP Rel 38.5 are coming soon.

  • Setup installs Jetson Linux BSP (including the NVIDIA driver).

  • Install JetPack after the BSP is in place (for CUDA and other components), as described in the user guide.

NIM configuration settings#

The deployment ships with NIM (NVIDIA Inference Microservices) configuration, supplied as environment-variable files for the supported NIMs and GPUs. You can use the default env files, rely on NIM defaults on unsupported GPUs, or supply your own env files at install time.

Default configuration#

For each supported NIM and GPU, the deployment includes a corresponding env file under ./deployments/nim/<nim-name>/. For example, for the cosmos-reason2-8b NIM on an RTX 6000 Pro:

./deployments/nim/cosmos-reason2-8b/hw-RTXPRO6000BW.env

Unsupported GPU hardware#

For GPU hardware that is not in the default set but is supported by the OTHER option in dev-profile.sh, the deployment uses a default empty env file. The NIM then falls back to its own defaults.

./deployments/nim/cosmos-reason2-8b/hw-OTHER.env

Custom env files at install time#

You can define your own env variants for a NIM and pass them when you run the installer. Use --llm-env-file and/or --vlm-env-file with dev-profile.sh. Paths can be absolute or relative to the current directory.

scripts/dev-profile.sh up -p base --llm-env-file /path/to/llm.env --vlm-env-file /path/to/vlm.env