Milvus

NeMo Evaluator uses Milvus as the vector database for the Retrieval Evaluation Flow and the RAG Evaluation Flow.

To learn how to configure the NeMo Evaluator microservice to use Milvus, see Milvus in the NeMo Evaluator setup guide.
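The snippet below is a minimal connectivity check, not part of the official setup steps: it assumes the pymilvus client is installed and that you can reach the Milvus endpoint backing NeMo Evaluator. The host and port are placeholders for your own deployment.

    # Sanity-check the Milvus instance used by NeMo Evaluator (pymilvus client).
    # Replace host and port with the values from your Milvus deployment.
    from pymilvus import connections, utility

    connections.connect(alias="default", host="milvus.example.internal", port="19530")

    # Confirm the server responds and list any existing collections.
    print("Milvus server version:", utility.get_server_version())
    print("Collections:", utility.list_collections())

    connections.disconnect("default")

If the connection succeeds and collections are listed, the Milvus backend is reachable and ready for retrieval and RAG evaluation jobs.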
