Release 26.3.0#

This release is a comprehensive revamp of the NeMo Platform. The architecture, deployment experience, and security model have all been redesigned to improve modularity, ease of use, and enterprise readiness.

New Quickstart Experience#

Getting started with NeMo Platform is now faster and simpler. A new Python-based CLI and SDK, packaged together as nemo-platform, replace the previous setup flow:

  • Install with a single command: pip install nemo-platform

  • Launch the full platform locally with nmp quickstart up — no Kubernetes required

  • By default, quickstart uses remote NVIDIA inference endpoints, so no GPU is needed on your machine

  • Optionally configure local GPU inference with nmp quickstart configure for full on-device model serving

  • A built-in chat command (nmp chat <model-name>) lets you interact with models immediately after startup
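The steps above can be run end to end in a shell. Only the commands named in this section are shown; `<model-name>` is a placeholder for a model available in your deployment:

```shell
# Install the CLI and SDK from PyPI
pip install nemo-platform

# Launch the full platform locally (no Kubernetes required).
# By default this uses remote NVIDIA inference endpoints,
# so no local GPU is needed.
nmp quickstart up

# Optional: switch to local GPU inference for on-device model serving
nmp quickstart configure

# Chat with a model immediately after startup
nmp chat <model-name>
```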

Redesigned Helm Chart#

The Kubernetes deployment has been consolidated into a single all-in-one Helm chart (nemo-platform), available from the NVIDIA NGC Helm registry:

  • One chart installs the entire NeMo Platform, replacing the previous multi-chart setup

  • Supports on-premises and cloud Kubernetes clusters

  • Configurable add-ons include external databases, persistent volumes, ingress, multi-node networking, and OpenShift compatibility

  • Upgrade and rollback follow standard Helm workflows (helm upgrade, helm rollback)
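A minimal install-and-lifecycle flow might look like the following sketch. The repository alias, chart URL, namespace, and values file are assumptions for illustration; only the chart name (nemo-platform) and the standard Helm commands come from these notes:

```shell
# Add the NGC Helm repository (URL and alias are assumptions;
# consult the NGC registry for the exact repository location)
helm repo add ngc https://helm.ngc.nvidia.com/nvidia
helm repo update

# One chart installs the entire platform
helm install nemo-platform ngc/nemo-platform \
  --namespace nemo --create-namespace \
  --values my-values.yaml   # optional add-ons: external DBs, PVs, ingress

# Standard Helm workflows for upgrade and rollback
helm upgrade nemo-platform ngc/nemo-platform --namespace nemo
helm rollback nemo-platform 1 --namespace nemo
```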

Authorization Overhaul#

The authorization system has been rebuilt around an embedded policy engine:

  • OPA (Open Policy Agent) policies now run as WebAssembly inside each service process — no external auth sidecar is required

  • All API endpoints are workspace-scoped (/v2/workspaces/{workspace}/...), making multi-tenant access control straightforward

  • Role-based access control policies are compiled to policy.wasm at build time and evaluated at ~5,000 decisions per second

  • Auth can be toggled on or off via configuration, simplifying local development and testing
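To show what workspace-scoped access looks like in practice, here is an illustrative request. Only the `/v2/workspaces/{workspace}/...` path prefix comes from these notes; the host, port, resource suffix, and token variable are assumptions:

```shell
# Hypothetical call against a workspace-scoped endpoint.
# The embedded OPA policy (compiled to policy.wasm) evaluates the
# caller's access to the named workspace inside the service process.
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8000/v2/workspaces/my-workspace/models"
```

Because every endpoint carries the workspace in its path, a single RBAC policy can scope decisions per tenant without an external auth sidecar.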