Date: 2025-01-13
NVIDIA NIM for Large Language Models
- Introduction
- Release Notes
- Getting Started
- Prerequisites
- Launch NVIDIA NIM for LLMs
- Docker Run Parameters
- Run Inference
- Stopping the container
- Kubernetes Installation
- Serving models from local assets
- Tutorials
- Multi-node Deployment
- Deploying with Helm
- Prerequisites
- Configuring Helm
- Storage
- Multi-node Models
- Launching NIM in Kubernetes
- Running inference
- Troubleshooting FAQ
- Additional information
- Parameters
- Configuring a NIM
- Model Profiles
- Benchmarking
- Models
- Support Matrix
- Hardware
- Software
- GPUs
- General Guidelines
- Supported Models
- Llama 3 Swallow 70B Instruct V0.1
- Llama 3 Taiwan 70B Instruct
- Llama 3.1 8B Base
- Llama 3.1 8B Instruct
- Llama 3.1 70B Instruct
- Llama 3.1 405B Instruct
- Meta-Llama-3-8B-Instruct
- Meta-Llama-3-70B-Instruct
- Mistral-7B-Instruct-v0.3
- Mixtral-8x7B-v0.1
- Mistral-NeMo-12B-Instruct
- Mixtral-8x22B-v0.1
- Nemotron 4 340B Instruct
- API Reference
- Function Calling
- Llama Stack API (Experimental)
- Utilities
- Observability
- Structured Generation
- Parameter-Efficient Fine-Tuning
- LoRA Setup Overview
- LoRA Adapters
- LoRA Model Directory Structure
- Obtaining LoRA models
- PEFT Environment Variables
- Launch NIM for LLMs with PEFT
- Acknowledgements
