NVIDIA NIM for Large Language Models (LLMs)



EULA

By using this NIM, you acknowledge that you have read and agreed to the NVIDIA AI PRODUCT AGREEMENT.


Copyright © 2024-2025, NVIDIA Corporation.

Last updated on May 09, 2025.