NVIDIA NIM for Large Language Models

  • Documentation Home
Table of Contents

About NVIDIA NIM for LLMs

  • Overview
  • Enterprise-Grade Inference Software Stack
  • Release Notes

Get Started

  • About Get Started
  • Prerequisites
  • Installation
  • Configuration
  • Quickstart

Deployment

  • Model Profiles and Selection
  • Model Download
  • Model-Free NIM
  • Kubernetes Deployment
    • Helm and Kubernetes
    • KServe
    • OpenShift
    • Run:ai
    • NIM Operator Deployment
  • Cloud Service Provider (CSP) Deployment
    • Google Cloud
    • AWS
    • Azure
    • Oracle
  • Air-Gap Deployment
  • Multi-Node Deployment
  • vGPU Deployment

Advanced Use Cases

  • Fine-Tuning with LoRA
  • Custom Logits Processing
  • Prompt Embeddings

Reference

  • Architecture
  • Environment Variables
  • API Reference
  • CLI Reference
  • Advanced Configuration
  • Logging and Observability
  • 1.x Migration Guide
  • Support Matrix
  • Archived Versions

Troubleshooting

  • GPU Memory (OOM) Errors

Resources

  • Support and FAQ
  • Related Products
  • Legal
  • Troubleshooting

Troubleshooting

  • Troubleshooting GPU Out-of-Memory Errors: diagnosing and resolving GPU out-of-memory (OOM) errors during startup


Copyright © 2024-2026, NVIDIA Corporation.

Last updated on Mar 25, 2026.