NVIDIA NIM for Vision Language Models (VLMs)

Table of Contents

About NVIDIA NIM for VLMs

  • Overview
  • Release Notes

Get Started

  • Get Started with NIM
  • Query the Llama 3.1 Nemotron Nano VL 8B v1 API
  • Query the Llama 4 API
  • Query the Llama 3.2 Vision API
  • Query the nemoretriever-parse API

Deploy NIM

  • Deploy with Helm
  • Air Gap Deployment

Work with Models

  • Support Matrix
  • Llama Nemotron Nano VL Model Card
  • Llama 4 Model Card on GitHub
  • Llama 3.2 Vision Model Card
  • nemoretriever-parse Overview
  • Model Profiles
  • Fine-Tune a Model

Use Key Features

  • Observability
  • Structured Generation

Configure Your NIM

  • Configure Your NIM
  • Benchmarking
  • KV Cache Reuse (a.k.a. prefix caching)

Reference

  • API Reference
  • Utilities
  • Sampling Control

Notices

  • Acknowledgements
  • EULA

EULA

By using this NIM, you acknowledge that you have read and agreed to the NVIDIA AI PRODUCT AGREEMENT.



Copyright © 2024-2025, NVIDIA Corporation.

Last updated on Jul 03, 2025.