
NVIDIA NIM for Vision Language Models (VLMs)


Table of Contents

User Guide

  • Overview
  • Release Notes
  • Getting Started
  • Sampling Control
  • Configuring a NIM
  • Support Matrix
  • Performance
  • API Reference
  • Observability
  • Utilities
  • Model Profiles
  • KV Cache Reuse (a.k.a. prefix caching)
  • Structured Generation
  • Deploying with Helm

Examples

  • Llama 3.2 Vision
    • Llama 3.2 Vision Overview
    • Llama 3.2 Vision API
  • nemoretriever-parse
    • nemoretriever-parse Overview
    • nemoretriever-parse API

Notices

  • Acknowledgements
  • EULA



Copyright © 2024-2025, NVIDIA Corporation.

Last updated on May 09, 2025.