# About NVIDIA NIM for LLMs

NVIDIA NIM for Large Language Models Documentation

- Overview
- Enterprise-Grade Inference Software Stack
- Release Notes
- Get Started
  - About
  - Prerequisites
  - Installation
  - Configuration
  - Quickstart
- Deployment
  - Model Profiles and Selection
  - Model Download
  - Model-Free NIM
  - Kubernetes Deployment
  - Cloud Service Provider (CSP) Deployment
  - Air-Gap Deployment
  - Multi-Node Deployment
  - vGPU Deployment
- Advanced Use Cases
  - Fine-Tuning with LoRA
  - Custom Logits Processing
  - Prompt Embeddings
- Reference
  - Architecture
  - Environment Variables
  - API Reference
  - CLI Reference
  - Advanced Configuration
  - Logging and Observability
  - 1.x Migration Guide
  - Support Matrix
- Resources
  - Support and FAQ
  - Related Products
  - Legal