Supported Platforms#
The platform requirements vary depending on the configuration and deployment topology used for VSS and its dependencies, such as the VLM and LLM. Each section below covers a specific GPU and breaks down how VSS can be deployed for a given number of GPUs.
All GPUs listed here support remote deployments, where all the models run elsewhere and VSS connects to them through API endpoints.
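For example, before pointing VSS at a remote LLM, you can sanity-check that the endpoint is reachable. The sketch below assumes an OpenAI-compatible chat completions API, which is what LLM NIMs expose; the endpoint URL, model name, and API key shown are placeholders to substitute with your own deployment's values.

```sh
# Sanity-check a remote OpenAI-compatible LLM endpoint before wiring it into VSS.
# URL, model name, and $NVIDIA_API_KEY are placeholders; substitute your own.
curl -s https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Authorization: Bearer $NVIDIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta/llama-3.1-70b-instruct",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 16
      }'
```

A successful JSON response confirms the endpoint can serve requests; the same base URL can then be used in the remote-endpoint deployment options listed below.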
The configurations listed here have been validated; however, this is not an exhaustive list of the VLMs and LLMs that can be deployed.
The minimum number of GPUs required for the LLM NIMs is provided here.
To customize the Helm deployment for various configurations, refer to Configuring GPU Allocation. For Docker Compose deployments, refer to Configuring GPU Allocation.
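As an illustration, a Helm deployment can be retargeted to a different GPU split with a values override file. The top-level keys below (`vss`, `llm`) are hypothetical placeholders; the actual value keys used by the chart are listed in Configuring GPU Allocation.

```sh
# Hypothetical overrides file for GPU allocation; the real value keys
# are documented in Configuring GPU Allocation.
cat > overrides.yaml <<'EOF'
vss:
  resources:
    limits:
      nvidia.com/gpu: 2
llm:
  resources:
    limits:
      nvidia.com/gpu: 4
EOF
helm install vss <vss-chart> -f overrides.yaml
```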
Note
The default Helm chart deployment topology is configured for:
- 8 x H100 (80 GB)
- 8 x H200 (141 GB)
- 8 x A100 (80 GB)
- 8 x L40S (48 GB)
A full local deployment recipe on a single GPU, using non-default low-memory modes and smaller LLMs, is available; see the sketch of the default deployment below. For the single GPU recipes, refer to Helm Single GPU Deployment and Docker Compose Single GPU Deployment.
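For reference, launching the default 8-GPU topology is a single `helm install` against the published chart. The chart filename and image pull secret name below are placeholders; use the ones from your NGC fetch instructions.

```sh
# Minimal sketch of the default Helm deployment on an 8-GPU node.
# Chart filename and image pull secret name are placeholders.
helm install vss-blueprint nvidia-blueprint-vss-<version>.tgz \
  --set global.ngcImagePullSecretName=ngc-docker-reg-secret
```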
H200#
| # of GPUs | Supported Deployment Option |
|---|---|
| 1 | Single GPU deployment (Llama 3.1 8b low-memory mode, NVILA 15b) |
| 1+ | Docker Compose/Helm with remote endpoints (VLM and LLM) |
| 1+ | Docker Compose/Helm with remote LLM |
| 2+ | Local deployment (Llama 3.1 8b, NVILA 15b) |
| 4+ | Local deployment (Llama 3.1 70b, NVILA 15b) |
| 8 | Default Helm deployment (Llama 3.1 70b, VILA 1.5) |
H100#
| # of GPUs | Supported Deployment Option |
|---|---|
| 1 | Single GPU deployment (Llama 3.1 8b low-memory mode, NVILA 15b) |
| 1+ | Docker Compose/Helm with remote endpoints (VLM and LLM) |
| 1+ | Docker Compose/Helm with remote LLM |
| 2+ | Local deployment (Llama 3.1 8b, NVILA 15b) |
| 4+ | Local deployment (Llama 3.1 70b, NVILA 15b) |
| 8 | Default Helm deployment (Llama 3.1 70b, VILA 1.5) |
A100 (80 GB)#
| # of GPUs | Supported Deployment Option |
|---|---|
| 1 | Single GPU deployment (Llama 3.1 8b low-memory mode, NVILA 15b) |
| 1+ | Docker Compose/Helm with remote endpoints (VLM and LLM) |
| 2+ | Docker Compose/Helm with remote LLM |
| 2+ | Local deployment (Llama 3.1 8b, NVILA 15b) |
| 4+ | Local deployment (Llama 3.1 70b, NVILA 15b) |
| 8 | Default Helm deployment (Llama 3.1 70b, VILA 1.5) |
A100 (40 GB)#
| # of GPUs | Supported Deployment Option |
|---|---|
| 1+ | Docker Compose/Helm with remote endpoints (VLM and LLM) |
| 4+ | Docker Compose/Helm with remote LLM |
| 6+ | Local deployment (Llama 3.1 8b, NVILA 15b) |
L40S#
| # of GPUs | Supported Deployment Option |
|---|---|
| 1+ | Docker Compose/Helm with remote endpoints (VLM and LLM) |
| 2+ | Docker Compose/Helm with remote LLM |
| 4+ | Local deployment (Llama 3.1 8b, NVILA 15b) |
| 6+ | Local deployment (Llama 3.1 70b, NVILA 15b) |
| 8 | Default Helm deployment (Llama 3.1 70b, VILA 1.5) |
A6000#
| # of GPUs | Supported Deployment Option |
|---|---|
| 1+ | Docker Compose/Helm with remote endpoints (VLM and LLM) |