Key Features#

The NVIDIA NeMo microservices platform delivers a comprehensive suite of features that help you build, evaluate, and serve custom Large Language Models (LLMs).


Data Management Features#

Manage your data assets with NeMo data management features:

  • Streamline your data operations with centralized entity management.

  • Add custom metadata to enhance organization and simplify retrieval.

  • Use Hugging Face Hub (HfApi) conventions to manage asset files.
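
Because NeMo Data Store follows Hugging Face Hub conventions, you can manage asset files with the `huggingface_hub` client pointed at the Data Store endpoint. The sketch below is illustrative: the service URL, token, and repo names are placeholder assumptions for your own deployment.

```python
"""Upload a dataset to NeMo Data Store using Hugging Face Hub conventions."""

# Assumed in-cluster address of the Data Store's HF-compatible API;
# substitute the URL for your deployment.
NEMO_DATA_STORE = "http://nemo-data-store:3000/v1/hf"


def dataset_repo_id(namespace: str, name: str) -> str:
    """Hugging Face Hub convention: repos are addressed as '<namespace>/<name>'."""
    return f"{namespace}/{name}"


def upload_dataset(repo_id: str, folder_path: str) -> None:
    """Create the dataset repo (if needed) and upload a local folder.

    Requires `pip install huggingface_hub`; the import is deferred so the
    pure helper above stays dependency-free.
    """
    from huggingface_hub import HfApi

    api = HfApi(endpoint=NEMO_DATA_STORE, token="placeholder-token")
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    api.upload_folder(repo_id=repo_id, repo_type="dataset", folder_path=folder_path)
```

For example, `upload_dataset(dataset_repo_id("default", "support-tickets"), "./data")` would push the contents of `./data` as the dataset `default/support-tickets`.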

Manage Entities

Iterate on your AI models by managing your projects, datasets, and models.

About Managing Entities
Deploy NeMo Data Store

Deploy NeMo Data Store.

NeMo Data Store Microservice Deployment and Setup Guide
Deploy NeMo Entity Store

Deploy NeMo Entity Store.

NeMo Entity Store Values Setup

Customization Features#

Transform base models into specialized solutions for your unique needs:

  • Create custom models by fine-tuning a base model on your own data.

  • Leverage state-of-the-art customization techniques, including full supervised fine-tuning and parameter-efficient fine-tuning.

  • Implement model customization with a single API call.

  • Work with leading model families including Llama and Phi.

  • Deploy anywhere, on-premises or in the cloud, with Kubernetes support.
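
The "single API call" amounts to posting a job description to the Customizer API. A minimal sketch, assuming an in-cluster service URL and a request schema along these lines (field names and defaults vary by release, so treat them as placeholders):

```python
import json
from urllib import request

# Assumed in-cluster address of the NeMo Customizer service.
CUSTOMIZER_URL = "http://nemo-customizer:8000"


def build_customization_job(base_config: str, dataset: str, peft: bool = True) -> dict:
    """Assemble a fine-tuning job request.

    `finetuning_type` selects parameter-efficient fine-tuning (LoRA) or
    full supervised fine-tuning; exact field names may differ by release.
    """
    return {
        "config": base_config,
        "dataset": {"name": dataset},
        "hyperparameters": {
            "training_type": "sft",
            "finetuning_type": "lora" if peft else "all_weights",
            "epochs": 3,
            "batch_size": 8,
        },
    }


def submit_job(job: dict) -> dict:
    """POST the job to the customization endpoint and return the response."""
    req = request.Request(
        f"{CUSTOMIZER_URL}/v1/customization/jobs",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

Toggling the `peft` flag is all it takes to switch between the two customization techniques listed above; everything else about the job stays the same.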

Fine-Tune

Fine-tune your models using customization jobs to improve their performance.

About Fine-Tuning
Deploy NeMo Customizer

Deploy NeMo Customizer to your Kubernetes cluster.

NeMo Customizer Microservice Deployment Guide

Evaluation Features#

Ensure your models meet quality and performance standards:

  • Run evaluations of LLMs and AI pipelines against both custom and standard benchmarks.

  • Scale evaluations with a single API call while maintaining full data control.

  • Maintain consistency across teams through versioned benchmark configurations.

  • Future-proof your applications with continuous benchmark additions.

  • Access enterprise-grade support with regularly updated security patches.
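
An evaluation job pairs a target (the model under test) with a versioned benchmark configuration. The payload builder below is a hedged sketch of that target/config split; the concrete field names and benchmark identifiers are assumptions, so check the Evaluator API reference for your release.

```python
def build_evaluation_job(model: str, benchmark: str, tasks: list[str]) -> dict:
    """Assemble an evaluation job that points a benchmark config at a model.

    Keeping the config as its own object is what lets teams version and
    share benchmark definitions; task names here are placeholders.
    """
    return {
        "target": {"type": "model", "model": model},
        "config": {
            "type": benchmark,
            "tasks": {task: {} for task in tasks},
        },
    }
```

Submitting the job is then a single POST of this payload to the Evaluator's jobs endpoint, mirroring the "single API call" pattern above.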

Evaluate

Set targets, define evaluation configurations, and run an evaluation job to measure your model’s performance.

About Evaluating
Deploy NeMo Evaluator

Deploy NeMo Evaluator to your Kubernetes cluster.

NeMo Evaluator Deployment Guide

Inference Features#

Deploy and manage your models for inference as NIM for LLMs:

  • Model Deployment. You can deploy models as NIMs using the NeMo Deployment Management microservice by specifying deployment configurations and submitting deployment requests.

  • Model Discovery. The NIM Proxy microservice auto-detects deployed models and lists them through a unified endpoint.

  • Inference Requests. You can send inference requests to the NIM Proxy endpoint, which routes the requests to the appropriate deployed model.

  • Model Management. You can manage the lifecycle of deployed models through the NeMo Deployment Management microservice, ensuring models stay up to date.

  • Model Access Management. You can manage access to the deployed models through the NeMo Deployment Management microservice.
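
The flow above is two requests: create a deployment through NeMo Deployment Management, then send OpenAI-compatible inference requests through NIM Proxy once the model appears in its model list. The service addresses and field names below are illustrative assumptions for a typical in-cluster setup:

```python
# Assumed in-cluster addresses; substitute the URLs for your deployment.
DEPLOYMENT_MGMT_URL = "http://nemo-deployment-management:8000"
NIM_PROXY_URL = "http://nemo-nim-proxy:8000"


def build_deployment(name: str, namespace: str, model: str, gpus: int = 1) -> dict:
    """Deployment request for the NeMo Deployment Management microservice.

    Field names are illustrative; consult your release's API reference.
    """
    return {
        "name": name,
        "namespace": namespace,
        "config": {"model": model, "nim_deployment": {"gpu": gpus}},
    }


def build_chat_request(model: str, prompt: str) -> dict:
    """OpenAI-compatible chat payload for NIM Proxy, which routes it to
    whichever deployed NIM serves `model`."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
```

Because NIM Proxy auto-detects deployments, client code only ever talks to the one proxy endpoint; adding or replacing a model requires no client-side changes.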

Deploy and Proxy NIM for LLMs for Inference

Deploy NIM for LLMs to your Kubernetes cluster.

About Deploying and Proxying NIM for LLMs
Install NeMo Deployment Management

Install NeMo Deployment Management to your Kubernetes cluster.

NeMo Deployment Management Setup Guide
Install NIM Proxy

Install NIM Proxy to your Kubernetes cluster.

NeMo NIM Proxy Helm Chart Values Setup

Guardrail Features#

Protect your AI applications with comprehensive safety features:

  • Guard against hallucinations, harmful content, and security vulnerabilities.

  • Implement customizable checks for specific business, language, or geographical requirements.

  • Optimize performance with Parallel Rails technology.

  • Integrate seamlessly with third-party APIs including OpenAI, ActiveFence, and TruEra (Snowflake).

  • Connect with popular Gen AI development tools like LangChain and LlamaIndex.
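
In practice, adding guardrails means routing an otherwise ordinary chat request through the Guardrails microservice and naming the rail configuration to apply. The sketch below assumes a `guardrails` field carrying a previously created config ID; the exact field layout and config name are placeholders, not a guaranteed schema.

```python
def build_guarded_chat(model: str, prompt: str, config_id: str) -> dict:
    """OpenAI-style chat request with a guardrail configuration attached.

    `config_id` names a rail configuration created beforehand (for example,
    one that runs input/output moderation checks); the field layout here
    is illustrative.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "guardrails": {"config_id": config_id},
    }
```

The same request shape works for both user-input and model-response checks, since the configured rails run on each side of the model call.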

Add Guardrails

Add checks to moderate user input and model responses.

About Guardrails
Deploy NeMo Guardrails

Deploy NeMo Guardrails as a standalone service.

NeMo Guardrails Microservice Deployment Guide

Flexible Deployment on Kubernetes#

You can deploy the NeMo microservices as an integrated platform to create an end-to-end data flywheel, or deploy only the specific microservices that complement your existing workflows. The following guides are for cluster administrators who want to deploy the NeMo microservices.

About Admin Setup

Use the admin setup guide to learn about deploying the NeMo microservices to Kubernetes.

About Admin Setup
Deployment Scenarios

Learn about the different deployment scenarios for setting up the NeMo microservices on Kubernetes.

Installation Scenarios
Deploy as a Platform

Deploy all NeMo microservices together as a platform using a single Helm chart.

Install NeMo Microservices as a Platform
Deploy Individually

Deploy any one or more of the NeMo microservices individually.

Install NeMo Microservices Individually