Release Notes for NeMo Platform

Check out the latest release notes for the NeMo Platform.

Tip

If you installed a previous release of the NeMo Platform with Helm and want to move to this release, choose one of the following options:

To upgrade to the latest release, follow the steps at Upgrade Helm Chart.
To uninstall and reinstall, follow the steps at Uninstall and Install.

Release 26.3.0

This release is a comprehensive revamp of the NeMo Platform. The architecture, deployment experience, and security model have all been redesigned to improve modularity, ease of use, and enterprise readiness.

New Quickstart Experience

Getting started with NeMo Platform is now faster and simpler. A new Python-based CLI and SDK (nemo-platform) replaces the previous setup flow:

Install with a single command: pip install nemo-platform
Launch the full platform locally with nmp quickstart up; no Kubernetes is required
By default, quickstart uses remote NVIDIA inference endpoints, so no GPU is needed on your machine
Optionally configure local GPU inference with nmp quickstart configure for full on-device model serving
A built-in chat command (nmp chat <model-name>) lets you interact with models immediately after startup
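
The quickstart steps above can be scripted, for example in a setup script. The following is a minimal sketch, not an official workflow: it only assembles the commands quoted in these notes and prints them in dry-run mode. The helper names (`QUICKSTART_STEPS`, `chat_command`, `run_steps`) and the model name used in the usage note are hypothetical.

```python
import shlex
import shutil
import subprocess

# Commands exactly as quoted in the release notes.
QUICKSTART_STEPS = [
    ["pip", "install", "nemo-platform"],
    ["nmp", "quickstart", "up"],
]


def chat_command(model_name: str) -> list[str]:
    """Build the `nmp chat <model-name>` invocation for a given model."""
    if not model_name:
        raise ValueError("model_name must be non-empty")
    return ["nmp", "chat", model_name]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns the rendered command lines so callers can log or inspect them.
    """
    rendered = []
    for cmd in steps:
        line = shlex.join(cmd)
        rendered.append(line)
        print(f"$ {line}")
        if not dry_run:
            # Fail early with a clear message if the executable is missing.
            if shutil.which(cmd[0]) is None:
                raise FileNotFoundError(f"{cmd[0]} not found on PATH")
            subprocess.run(cmd, check=True)
    return rendered
```

Calling `run_steps(QUICKSTART_STEPS)` prints the two commands without running them, and `run_steps([chat_command("my-model")])` does the same for the chat step, where `my-model` is a placeholder for a model name available in your installation. Pass `dry_run=False` to execute the commands.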

Redesigned Helm Chart

The Kubernetes deployment has been consolidated into a single all-in-one Helm chart (nemo-platform), available from the NVIDIA NGC Helm registry:

One chart installs the entire NeMo Platform, replacing the previous multi-chart setup
Supports on-premises and cloud Kubernetes clusters
Configurable add-ons include external databases, persistent volumes, ingress, multi-node networking, and OpenShift compatibility
Upgrade and rollback follow standard Helm workflows (helm upgrade, helm rollback)
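
As an illustration of the standard Helm workflow mentioned above, the sketch below builds `helm upgrade` and `helm rollback` argument lists. It relies only on general Helm CLI syntax (`helm upgrade RELEASE CHART`, `helm rollback RELEASE [REVISION]`, `--namespace`, `--values`). The release name, namespace, repository alias, and values file in the usage note are hypothetical; check the chart documentation for the actual chart reference.

```python
from typing import Optional


def helm_upgrade(release: str, chart: str, namespace: str,
                 values_file: Optional[str] = None) -> list[str]:
    """Build a `helm upgrade` command for an existing release."""
    cmd = ["helm", "upgrade", release, chart, "--namespace", namespace]
    if values_file:
        # Optional overrides file, passed through Helm's --values flag.
        cmd += ["--values", values_file]
    return cmd


def helm_rollback(release: str, namespace: str,
                  revision: Optional[int] = None) -> list[str]:
    """Build a `helm rollback` command.

    When revision is omitted, Helm rolls back to the previous revision.
    """
    cmd = ["helm", "rollback", release]
    if revision is not None:
        if revision < 1:
            raise ValueError("revision must be a positive integer")
        cmd.append(str(revision))
    cmd += ["--namespace", namespace]
    return cmd
```

For example, `helm_upgrade("nemo", "my-repo/nemo-platform", "nemo", "values.yaml")` yields the arguments for upgrading a release named `nemo`, and `helm_rollback("nemo", "nemo", 3)` yields the arguments for returning it to revision 3. The resulting lists can be passed to `subprocess.run(cmd, check=True)`.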

Authorization Overhaul

The authorization system has been rebuilt around an embedded policy engine:

OPA (Open Policy Agent) policies now run as WebAssembly inside each service process, so no external auth sidecar is required
All API endpoints are workspace-scoped (/v2/workspaces/{workspace}/...), making multi-tenant access control straightforward
Role-based access control policies are compiled to policy.wasm at build time and evaluated at ~5,000 decisions per second
Auth can be toggled on or off via configuration, simplifying local development and testing
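
To illustrate the workspace-scoped URL layout described above, here is a minimal sketch of how a client might build and check such paths. Only the `/v2/workspaces/{workspace}/` prefix comes from these notes. The helper names and the resource segments in the usage note are hypothetical, and this is not how the platform itself enforces access.

```python
from urllib.parse import quote

API_PREFIX = "/v2/workspaces"  # prefix taken from the release notes


def workspace_path(workspace: str, *segments: str) -> str:
    """Build a workspace-scoped path; segments are illustrative resource names."""
    if not workspace:
        raise ValueError("workspace must be non-empty")
    # Percent-encode every part so a "/" inside a name cannot change the scope.
    parts = [quote(workspace, safe="")] + [quote(s, safe="") for s in segments]
    return f"{API_PREFIX}/" + "/".join(parts)


def in_workspace(path: str, workspace: str) -> bool:
    """Return True if the path falls under the given workspace's prefix."""
    prefix = workspace_path(workspace)
    # Require an exact match or a "/" boundary so "team-a" does not match "team-ab".
    return path == prefix or path.startswith(prefix + "/")
```

For example, `workspace_path("team-a", "jobs", "42")` returns `/v2/workspaces/team-a/jobs/42`, where `jobs` and `42` stand in for whatever resources a service exposes. Because every path carries its workspace in the same position, a per-tenant check reduces to a prefix comparison like `in_workspace`.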