Prerequisites#
This section covers the prerequisites for deploying and using the VSS Blueprint.
Hardware Requirements#
The VSS Blueprint has been validated and tested on the following NVIDIA GPUs:
NVIDIA H100
NVIDIA L40S
NVIDIA RTX PRO 6000 Blackwell
Software Requirements#
The following software must be installed on your system:
OS: Ubuntu 24.04 or Ubuntu 22.04
NVIDIA Driver: 580.105.08 (Ubuntu 24.04) or 580.65.06 (Ubuntu 22.04)
NVIDIA Container Toolkit: 1.17.8
Docker: 27.2.0 or later
Docker Compose: v2.29.0 or later
NGC CLI: 4.10.0 or later
Note
For detailed installation instructions for the software requirements listed above, see the Appendix: Software Prerequisites Installation Guide.
Runtime Environment Settings#
In addition to the above prerequisites, the following runtime environment settings are required:
Linux Kernel Settings#
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
sudo sysctl -w net.core.rmem_max=5242880
sudo sysctl -w net.core.wmem_max=5242880
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
Note
The above settings are not persistent across reboots. To make them persistent, add the configurations to /etc/sysctl.conf.
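For example, the persistent equivalent of the commands above, appended to /etc/sysctl.conf and reloaded with sudo sysctl -p, would be:

```ini
# Persistent versions of the runtime kernel settings above
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.core.rmem_max = 5242880
net.core.wmem_max = 5242880
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```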
Minimum System Requirements#
x86-64 architecture
18-32 core CPU
128 GB RAM
1 TB SSD
1 x 1 Gbps network interface
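As a quick sanity check against the list above, the following sketch reports the host's CPU, RAM, and root-filesystem capacity (it assumes a Linux host with GNU coreutils/procps, and checks the 1 TB SSD against the root filesystem):

```shell
# Sketch: compare the host against the minimum system requirements above.
cpu_cores=$(nproc)
ram_gb=$(free -g | awk '/^Mem:/ {print $2}')
disk_gb=$(df -BG --output=size / | tail -1 | tr -dc '0-9')

echo "CPU cores: $cpu_cores (need 18-32)"
echo "RAM:       ${ram_gb} GB (need >= 128)"
echo "Disk (/):  ${disk_gb} GB (need >= 1000)"
```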
NGC API Key Access#
To deploy the VSS Blueprint, you need an NGC API key with access to the required resources.
Create an NVIDIA NGC Account#
Go to https://ngc.nvidia.com
Click Sign up if you don’t have an account, or Sign in if you do
Complete the registration process with your email address
Generate an API Key#
Once logged in, click on your username in the top right corner
Select Setup from the dropdown menu
Navigate to API Keys under the Keys/Secrets section
Click Generate Personal Key (available at both the top-center and top-right of the page)
Provide a descriptive name for your API key (e.g., “VSS Blueprint Development”)
Select NGC Catalog for Key Permissions
Click Generate Personal Key
Copy the generated API key immediately - you won’t be able to see it again
Store it securely in a password manager or encrypted file
Note
Keep your NGC API key secure and never commit it to version control systems.
Verify Your NGC Access#
To verify your NGC key has the correct permissions:
# Set your NGC API key
export NGC_CLI_API_KEY='your_ngc_api_key'
# Test access to the required resources
ngc registry resource info nvidia/vss-developer/dev-profile-compose:3.0.0
ngc registry resource info nvidia/vss-core/vss-agent-3rd-party-oss:3.0.0
# Test access to the required images
ngc registry image info nvidia/vss-core/vss-agent:3.0.0
If you encounter permission errors, contact NVIDIA support for assistance.
Development Profile GPU Requirements#
The following table shows the GPU requirements for each development profile:
| Profile | Local VLM+LLM (Shared GPU) | Local VLM+LLM (Dedicated GPU) | Remote VLM+LLM |
|---|---|---|---|
| Default | 1 x GPU | 2 x GPU | 0 x GPU |
| | 1 x GPU | 2 x GPU | 0 x GPU |
| | 2 x GPU | 3 x GPU | 1 x GPU |
| | 2 x GPU | 3 x GPU | 1 x GPU |
| | 1 x GPU | 1 x GPU | 1 x GPU |
Note
* GPU device ‘0’ should be available to run these profiles.
LLM + VLM Deployment Support Matrix#
The following table shows the supported deployment configurations for different LLM and VLM model combinations:
| LLM | VLM | Local Shared GPU (H100) | Local Shared GPU (RTX PRO 6000 Blackwell) | Local Shared GPU (L40S) | Local Dedicated GPU | Remote Endpoint |
|---|---|---|---|---|---|---|
| nvidia-nemotron-nano-9b-v2 | cosmos-reason2-8b | ✓ | ✓ | ✗ | ✓ 1 | ✓ 3 |
| Custom LLM | cosmos-reason2-8b | ✗ | ✗ | ✗ | ✓ 2 | ✓ 3 |
| nvidia-nemotron-nano-9b-v2 | Custom VLM | ✗ | ✗ | ✗ | ✓ 2 | ✓ 3 |
| Custom LLM | Custom VLM | ✗ | ✗ | ✗ | ✓ 2 | ✓ 3 |
References:
1. For dev-profile-search on L40S, the llm-device-id should be updated to a non-'0' device ID when using the dedicated GPU configuration.
2. Verify that GPU memory meets the model requirements; refer to the model documentation for specifications. LLM NIM documentation: https://docs.nvidia.com/nim/large-language-models/latest/supported-models.html VLM NIM documentation: https://docs.nvidia.com/nim/vision-language-models/1.6.0/support-matrix.html
3. Remote endpoint support: the LLM supports both self-hosted and build.nvidia.com endpoints; the VLM supports only self-hosted endpoints.
Notes:
Custom LLM options:
nemotron-3-nano
llama-3.3-nemotron-super-49b-v1.5
gpt-oss-20b
Custom VLM options:
cosmos-reason1-7b
qwen3-vl-8b-instruct
VLM Custom Weights#
The VSS Blueprint supports using VLM custom weights for specialized use cases. This section explains how to download custom weights from NGC or Hugging Face.
Download Custom Weights from NGC#
If VLM custom weights are available on NGC, you can download them using the NGC CLI:
# Set your NGC API key if not already set
export NGC_CLI_API_KEY='your_ngc_api_key'
# Download custom weights from NGC
ngc registry model download-version <org>/<team>/<model>:<version>
# Move downloaded custom weights to desired path/folder
mv </downloaded/path/in/ngc/output> </path/to/custom/weights>
The weights will be downloaded to a local directory. Note the path to this directory, as you’ll need to specify it when deploying with the --vlm-custom-weights flag.
Download Custom Weights from Hugging Face#
To download custom VLM weights from Hugging Face, you can use the Hugging Face CLI (hf) or Git LFS:
Using the Hugging Face CLI:
# Install Hugging Face CLI if not already installed
pip install -U "huggingface_hub[cli]"
# Login to Hugging Face (required for gated models)
hf auth login
# Create a base directory where the custom weights can be downloaded
mkdir -p </path/to/custom/weights>
# Download a model
hf download <model-id> --local-dir </path/to/custom/weights>
Using Git LFS:
# Install Git LFS if not already installed
sudo apt install git-lfs
git lfs install
# Clone the model repository
git clone https://huggingface.co/<model-id> </path/to/custom/weights>
Note
Some Hugging Face models may be gated and require access approval. Visit the model page on Hugging Face and request access if needed. You must also authenticate with hf auth login before downloading gated models.
Verify Downloaded Weights#
After downloading custom weights, verify the directory structure and contents:
# List the contents of the weights directory
ls -lh </path/to/custom/weights>
Typical VLM weight directories contain:
Model configuration files (e.g., config.json)
Model weights (e.g., pytorch_model.bin, model.safetensors, or sharded weights)
Tokenizer files (e.g., tokenizer.json, tokenizer_config.json)
Other metadata files
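The check can be automated with a small sketch like the one below. The file names follow common Hugging Face conventions; some models use different names, so treat a warning as a prompt to inspect the directory, not a hard failure:

```shell
# Sketch: check a weights directory for the typical files listed above.
check_vlm_weights() {
  dir="$1"
  status=0
  [ -e "$dir/config.json" ] || { echo "missing: config.json"; status=1; }
  # Weights may be safetensors (possibly sharded) or PyTorch .bin files
  if ! ls "$dir"/*.safetensors >/dev/null 2>&1 \
     && ! ls "$dir"/pytorch_model*.bin >/dev/null 2>&1; then
    echo "missing: model weights"; status=1
  fi
  [ -e "$dir/tokenizer.json" ] || [ -e "$dir/tokenizer_config.json" ] \
    || { echo "missing: tokenizer files"; status=1; }
  [ "$status" -eq 0 ] && echo "weights directory looks complete"
  return "$status"
}
```

For example, check_vlm_weights </path/to/custom/weights> prints any missing pieces and returns non-zero if something expected is absent.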
Common Installation Issues#
When pulling resources from NGC, you may encounter the following issues:
Error: Missing org - If Authenticated, org is also required.
Steps to resolve:
Generate an NGC API key with your desired org selected.
Ensure that the org you selected in ngc config set is the same as the org you selected during NGC API key generation.
Appendix: Software Prerequisites Installation Guide#
Install NVIDIA Driver#
Ubuntu 24.04#
NVIDIA Driver version 580.105.08 is required with Ubuntu 24.04.
You can download the driver directly from: https://www.nvidia.com/en-us/drivers/details/257738/
Ubuntu 22.04#
NVIDIA Driver version 580.65.06 is required with Ubuntu 22.04.
You can download the driver directly from: https://www.nvidia.com/en-us/drivers/details/251363/
For either Ubuntu version#
For detailed installation instructions, refer to the official NVIDIA Driver Installation Guide: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/index.html
You can also browse drivers for your specific GPU and platform from the NVIDIA Driver Downloads page: https://www.nvidia.com/Download/index.aspx
Note
After installation, verify the driver is correctly installed by running nvidia-smi to confirm the driver version and GPU detection.
Note
NVIDIA Fabric Manager Requirement
NVIDIA Fabric Manager is required on systems with multiple GPUs that are connected using NVLink or NVSwitch technology. This typically applies to:
Multi-GPU systems with NVLink bridges (e.g., DGX systems, HGX platforms)
Systems with NVSwitch fabric interconnects
Hosts running NVIDIA H100, A100, V100, or other datacenter GPUs with NVLink
Fabric Manager is not required for:
Single GPU systems
Multi-GPU systems without NVLink/NVSwitch (PCIe-only configurations)
For installation instructions, refer to the official NVIDIA Fabric Manager documentation:
Installation Guide: https://docs.nvidia.com/datacenter/tesla/fabric-manager-user-guide/index.html
Verify Fabric Manager status after installation:
sudo systemctl status nvidia-fabricmanager
Install Docker#
Docker version 27.2.0+ is recommended. Follow the guide here for installing Docker: https://docs.docker.com/engine/install/ubuntu/
Configure Docker#
After installing Docker, you must configure it to use the cgroupfs cgroup driver.
Edit the Docker daemon configuration file:
Add or verify the following entry in /etc/docker/daemon.json:
"exec-opts": ["native.cgroupdriver=cgroupfs"]
This configuration must be included within the JSON object. For example:
{
"exec-opts": ["native.cgroupdriver=cgroupfs"]
}
If your /etc/docker/daemon.json already contains other settings, ensure this entry is added to the existing configuration.
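For example, on a host where the NVIDIA Container Toolkit has already registered its runtime (a common case; your existing keys may differ), the merged file might look like:

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
```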
Apply the configuration:
After editing the daemon configuration, restart Docker to apply the changes:
sudo systemctl daemon-reload
sudo systemctl restart docker
Note
Restarting Docker will temporarily stop all running containers. Plan accordingly if you have containers running.
Install Docker Compose#
Docker Compose v2.29.0 or later is required. Docker Compose v2 is typically installed as a plugin with modern Docker installations.
Verify if Docker Compose is already installed:
docker compose version
If not installed, install Docker Compose plugin:
sudo apt update
sudo apt install docker-compose-plugin
Verify installation:
docker compose version
Note
Docker Compose v2 uses the command docker compose (without hyphen) instead of the older docker-compose command.
Install NVIDIA Container Toolkit#
The NVIDIA Container Toolkit is required to run the NVIDIA containers. Follow the guide here for installing the NVIDIA Container Toolkit: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
Install NGC CLI#
NGC CLI version 4.10.0 or later is required to download blueprints and authenticate with NGC resources.
Download and install NGC CLI:
For ARM64 Linux:
curl -sLo "/tmp/ngccli.zip" https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.10.0/files/ngccli_arm64.zip
For AMD64 Linux:
curl -sLo "/tmp/ngccli.zip" https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.10.0/files/ngccli_linux.zip
After downloading, install NGC CLI:
sudo mkdir -p /usr/local/bin
sudo unzip -qo /tmp/ngccli.zip -d /usr/local/lib
sudo chmod +x /usr/local/lib/ngc-cli/ngc
sudo ln -sfn /usr/local/lib/ngc-cli/ngc /usr/local/bin/ngc
Verify installation:
ngc --version
Configure NGC CLI with your API key:
ngc config set
When prompted, enter your NGC API key. For information on obtaining an NGC API key, see the NGC API Key Access section.
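If you prefer a non-interactive setup (for scripted or CI environments, for example), the NGC CLI also reads its settings from environment variables instead of the interactive prompt. A sketch, with placeholder values:

```shell
# Non-interactive alternative to `ngc config set`; the NGC CLI reads
# these environment variables. The values below are placeholders.
export NGC_CLI_API_KEY='your_ngc_api_key'
export NGC_CLI_ORG='your_org'    # the org selected when the key was generated
export NGC_CLI_TEAM='your_team'  # optional; omit if not using a team
```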
Note
For the latest version of NGC CLI, visit the NGC CLI downloads page: https://ngc.nvidia.com/setup/installers/cli