Prerequisites#

Verify that your environment meets the following requirements before deploying NVIDIA Speech NIM microservices.

License and Hardware#

| Requirement | Details |
| --- | --- |
| NVIDIA AI Enterprise (NVAIE) | Required license for self-hosting Speech NIMs. |
| NVIDIA GPU | Model-specific GPU and memory requirements vary by service. Refer to the support matrix for supported GPU and model combinations. |
| CPU Architecture | x86_64 only. |
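The CPU architecture requirement can be checked from a shell before going further. The following is a minimal sketch, not an official tool: the function name `is_supported_arch` is hypothetical, and the GPU listing only runs if `nvidia-smi` is already installed on the host.

```shell
# Return 0 if the given architecture string is x86_64, non-zero otherwise.
# With no argument, the host architecture reported by `uname -m` is used.
# (is_supported_arch is a hypothetical helper name, not part of any NVIDIA tool.)
is_supported_arch() {
    arch="${1:-$(uname -m)}"
    [ "$arch" = "x86_64" ]
}

if is_supported_arch; then
    echo "CPU architecture OK: $(uname -m)"
else
    echo "Unsupported CPU architecture: $(uname -m) (x86_64 required)" >&2
fi

# If a driver is already installed, list each GPU and its total memory so the
# values can be compared against the support matrix by hand.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader || true
fi
```

The script only reports what it finds. Whether a given GPU is sufficient still depends on the support matrix for the service you plan to deploy.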

Operating System#

Use a Linux distribution that meets the following requirements:

CUDA Drivers#

Install CUDA drivers by following the CUDA installation guide for Linux.

Supported Driver Versions#

| Major Version | EOL | Data Center and RTX/Quadro | GeForce |
| --- | --- | --- | --- |
| > 550 | TBD | Yes | Yes |
| 550 | Feb 2025 | Yes | Yes |
| 545 | Oct 2023 | Yes | Yes |
| 535 | June 2026 | Yes | |
| 525 | Nov 2023 | Yes | |
| 470 | Sept 2024 | Yes | |
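To compare an installed driver against the branches listed above, one option is to read the version from `nvidia-smi` and test its major number. This is a rough sketch under stated assumptions: the helper names `driver_major` and `is_listed_driver` are hypothetical, and the check covers branch membership only. It does not account for EOL dates or for the GPU class columns.

```shell
# Extract the major version from a driver version string such as "550.54.14".
# (driver_major and is_listed_driver are hypothetical helper names.)
driver_major() {
    echo "${1%%.*}"
}

# Return 0 if the major version is one of the listed branches
# (470, 525, 535, 545, 550) or is newer than 550.
is_listed_driver() {
    major="$(driver_major "$1")"
    case "$major" in
        470|525|535|545|550) return 0 ;;
    esac
    # Non-numeric input makes the test fail, which is treated as "not listed".
    [ "$major" -gt 550 ] 2>/dev/null
}

# Only query the host when nvidia-smi is actually available.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if is_listed_driver "$installed"; then
        echo "Driver $installed is in a listed branch"
    else
        echo "Driver $installed is not in a listed branch" >&2
    fi
fi
```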

Docker#

Install Docker Engine for your Linux distribution by following the Docker Engine installation guide.

After installation, verify that the Docker daemon is running and that your user can execute docker commands without sudo. Add your user to the docker group if needed:

sudo usermod -aG docker $USER

Log out and back in for the group change to take effect.
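After logging back in, both conditions (group membership and a reachable daemon) can be confirmed with a short check. The sketch below is illustrative only. `in_docker_group` is a hypothetical helper, and the daemon test relies on `docker info` exiting with a non-zero status when the daemon cannot be reached.

```shell
# Return 0 if "docker" appears as a whole word in a space-separated group list.
# With no argument, the current user's groups from `id -nG` are used.
# (in_docker_group is a hypothetical helper name.)
in_docker_group() {
    groups_list="${1:-$(id -nG)}"
    case " $groups_list " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only run the checks when the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
    if ! in_docker_group; then
        echo "Current user is not in the docker group" >&2
    fi
    if docker info >/dev/null 2>&1; then
        echo "Docker daemon reachable without sudo"
    else
        echo "Docker daemon not reachable as this user" >&2
    fi
fi
```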

NVIDIA Container Toolkit#

The NVIDIA Container Toolkit enables Docker containers to access the host GPU.

  1. Install the toolkit by following the NVIDIA Container Toolkit installation guide.

  2. Configure Docker to use the NVIDIA runtime by following the Docker configuration steps.

  3. Restart the Docker daemon after configuration:

    sudo systemctl restart docker
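The configuration step in the NVIDIA Container Toolkit guide uses `nvidia-ctk runtime configure --runtime=docker`, which registers an `nvidia` runtime in the Docker daemon configuration. The following sketch checks for that entry. It assumes the default config path `/etc/docker/daemon.json`, uses a hypothetical helper named `has_nvidia_runtime`, and does a plain text match rather than parsing the JSON, so treat a positive result as a hint rather than proof.

```shell
# Return 0 if the given Docker daemon config mentions a runtime named "nvidia".
# Defaults to /etc/docker/daemon.json (assumed default location).
# (has_nvidia_runtime is a hypothetical helper name; this is a crude text match.)
has_nvidia_runtime() {
    config_file="${1:-/etc/docker/daemon.json}"
    [ -r "$config_file" ] && grep -q '"nvidia"' "$config_file"
}

# Typical sequence on the host (requires root, so it is shown as comments only):
#   sudo nvidia-ctk runtime configure --runtime=docker
#   sudo systemctl restart docker

if has_nvidia_runtime; then
    echo "nvidia runtime registered in daemon.json"
else
    echo "nvidia runtime not found in daemon.json" >&2
fi
```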
    

Verify GPU Access#

Confirm that containers can access the GPU:

docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

The output should display the driver version, CUDA version, and available GPU(s):

| NVIDIA-SMI 550.54.14   Driver Version: 550.54.14   CUDA Version: 12.4     |
| GPU  Name                 ...

If this succeeds, your environment is ready to run Speech NIM containers.
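For scripted checks, the driver and CUDA versions can be pulled out of the header line shown above. This is a small sketch that assumes the header keeps the `Label: value` layout from the sample output. The helper name `smi_field` is hypothetical.

```shell
# Print the numeric value that follows a label such as "Driver Version"
# in the nvidia-smi header line.
#   $1 = label, $2 = header line
# (smi_field is a hypothetical helper name.)
smi_field() {
    printf '%s\n' "$2" | sed -n "s/.*$1: *\([0-9][0-9.]*\).*/\1/p"
}

# Header line copied from the sample output in this section.
sample='| NVIDIA-SMI 550.54.14   Driver Version: 550.54.14   CUDA Version: 12.4     |'
echo "driver: $(smi_field 'Driver Version' "$sample")"
echo "cuda:   $(smi_field 'CUDA Version' "$sample")"
```

To use it against a live system, capture the container's output, select the line containing `Driver Version` with `grep`, and pass that line as the second argument.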

WSL2 (Windows)#

For Windows deployments with WSL2, refer to NVIDIA NIM on WSL2. Check the support matrix for your service to confirm WSL2-compatible models. You might need to adjust the WSL memory allocation in .wslconfig and use podman instead of docker.
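As an illustration of the memory adjustment, the sketch below prints a minimal `.wslconfig` with a memory limit under the `[wsl2]` section. The `16GB` value is a placeholder, not a recommendation, and `render_wslconfig` is a hypothetical helper. Choose a size that fits the model you plan to run.

```shell
# Print a minimal .wslconfig that sets the WSL2 memory limit to the given size.
# (render_wslconfig is a hypothetical helper; "16GB" below is a placeholder value.)
render_wslconfig() {
    printf '[wsl2]\nmemory=%s\n' "$1"
}

render_wslconfig 16GB
```

Save the output as `.wslconfig` in your Windows user profile directory, then run `wsl --shutdown` from Windows so the new limit applies the next time WSL starts.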