Deployment Guide for NVIDIA NIM for LLMs#
You can deploy and run models powered by NIM on your preferred NVIDIA-accelerated infrastructure. This section provides deployment instructions for each supported target environment:
Deploy on Your Own Compute#
- Using Docker, as described in Get Started
- On Kubernetes, as described in NIM Operator for Kubernetes, Kubernetes Installation, and Deploy with Helm
- Across multiple nodes, as described in Multi-Node Deployment
- Air-gapped, as described in Air Gap Deployment
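As a quick orientation for the Docker path, a typical single-node launch looks like the following sketch. The image name and tag are illustrative placeholders for whichever NIM you pull from NGC; the exact flags and prerequisites are covered in Get Started.

```shell
# Sketch of a single-node NIM launch with Docker (image tag is illustrative).
# Assumes the NVIDIA Container Toolkit is installed and NGC_API_KEY is set.
export NGC_API_KEY=<your-ngc-api-key>

docker run -it --rm \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY \
  -v "$HOME/.cache/nim:/opt/nim/.cache" \
  -u "$(id -u)" \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
```

The cache volume mount avoids re-downloading model weights on subsequent runs, and port 8000 exposes the OpenAI-compatible API on the host.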
Additional Deployment Guides#
You can also deploy on other platforms:
NVIDIA NIM on WSL2 provides instructions for setting up, configuring, and deploying NIM on Windows PCs using Windows Subsystem for Linux (WSL). We recommend that you set NIM_RELAX_MEM_CONSTRAINTS=1 when you deploy with Docker on RTX GPUs to avoid high memory usage.
The NIM on KServe deployment guide provides step-by-step instructions for deploying on KServe.
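On WSL2 with RTX GPUs, the recommendation above amounts to adding one environment variable to the Docker invocation. A minimal sketch (the image name is an illustrative placeholder):

```shell
# Same launch as on Linux, with NIM_RELAX_MEM_CONSTRAINTS=1 added to
# relax memory checks on consumer RTX GPUs (image tag is illustrative).
docker run -it --rm \
  --gpus all \
  -e NGC_API_KEY \
  -e NIM_RELAX_MEM_CONSTRAINTS=1 \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
```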
The NIM on AWS Elastic Kubernetes Service (EKS) deployment guide provides step-by-step instructions for deploying on AWS EKS.
The NIM on Azure Kubernetes Service (AKS) deployment guide provides step-by-step instructions for deploying on AKS.
The NIM on Azure Machine Learning (AzureML) deployment guide provides step-by-step instructions for deploying on AzureML using the Azure CLI and Jupyter Notebook.
The NIM on AWS SageMaker deployment guide provides step-by-step instructions for deploying on AWS SageMaker using Jupyter Notebooks, Python CLI, and the shell.
The deployment guides for Google Cloud Platform (GCP) provide step-by-step instructions for deploying the multi-LLM compatible NIM container on the supported GCP services.
The End-to-End LLM App Development with Azure AI Studio, Prompt Flow, and NIMs deployment guide walks through building an end-to-end LLM application with Azure AI Studio, Prompt Flow, and NIM.
Deploy as a Managed Endpoint Service#
For developers and organizations who prefer a fully managed, hosted solution, NIM can be deployed on hosting integration partners including Baseten, Fireworks AI, and Together AI.
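However you deploy, self-hosted or through a managed partner, a running NIM exposes an OpenAI-compatible HTTP API. The following is a minimal client sketch using only the Python standard library; the base URL and model name are placeholders you would replace with your deployment's endpoint and served model.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000/v1"   # placeholder: your NIM endpoint
MODEL = "meta/llama-3.1-8b-instruct"    # placeholder: model served by the NIM

def build_chat_request(prompt: str, max_tokens: int = 64) -> Request:
    """Build an OpenAI-compatible chat completion request for a NIM endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("What is NVIDIA NIM?")
# To send against a live endpoint:
#   with urlopen(req) as response:
#       print(json.load(response))
```

Because the API is OpenAI-compatible, the same request shape works against a local Docker deployment, a Kubernetes service, or a managed partner endpoint; only `BASE_URL` and any required authorization header change.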
