Release Notes for NeMo Microservices#
Check out the latest release notes for the NeMo microservices.
Tip
If you’ve installed one of the previous releases of the NeMo microservices using Helm and want to upgrade, choose one of the following options:
- To upgrade to the latest release, follow the steps at Upgrade NeMo Microservices Helm Chart. 
- To uninstall and reinstall, follow the steps at Uninstall NeMo Microservices Helm Chart and Install NeMo Microservices Helm Chart. 
Release 25.7.0#
The following are the key features and known issues for the NeMo microservices 25.7.0 release.
Key Features#
The following features are added in this release.
Platform#
- Released the NeMo Microservices Python SDK. To install and get started, refer to the latest beginner tutorials and the Python SDK documentation.
- Added support for NVIDIA B200 GPUs. 
Customizer#
- Changes to targets and configurations in the ConfigMap now propagate without an application restart.
- Enabled passing Weights & Biases project details such as entities, tags, and descriptions at runtime using the new integrations configuration in the Customization jobs endpoint. 
- Added support for NVIDIA B200 GPUs for customization jobs. 
Evaluator#
- Removed MT Bench LLM-as-a-Judge; use the Custom evaluation type with the LLM-Judge metric. 
- Added support for controlling reasoning for OpenAI-compatible models and Nemotron models used as evaluation targets, as well as for LLM-as-a-Judge across all benchmarks.
- Enabled log downloads for BigCode and LMEvalHarness evaluations via the /v1/evaluation/jobs/{job_id}/logs API endpoint.
- Improved benchmark accuracy with updated Evaluator APIs. 
- Enabled authentication for Custom LLM-Judge and Agentic Judge. 
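As a rough illustration of the log-download endpoint listed above, the following Python sketch builds the request URL and streams the response to a file. Only the path /v1/evaluation/jobs/{job_id}/logs comes from these notes; the base URL, the job ID, the helper names, and the use of a plain unauthenticated GET are assumptions, and the response format is not specified here, so the body is saved as raw bytes.

```python
import shutil
import urllib.request
from urllib.parse import quote


def logs_url(base_url: str, job_id: str) -> str:
    """Build the log-download URL for an evaluation job (hypothetical helper)."""
    # Only the path template is taken from the release notes.
    return f"{base_url.rstrip('/')}/v1/evaluation/jobs/{quote(job_id, safe='')}/logs"


def download_logs(base_url: str, job_id: str, dest_path: str, timeout: float = 30.0) -> str:
    """Stream the response body to dest_path without assuming its format."""
    # Assumption: a plain GET with no auth headers is enough in your deployment.
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(logs_url(base_url, job_id), timeout=timeout) as resp:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(resp, out)
    return dest_path


if __name__ == "__main__":
    # Hypothetical values: replace with your Evaluator base URL and a real job ID.
    print(logs_url("http://nemo.test", "eval-1234"))
```

To fetch the logs, call download_logs with the base URL of your Evaluator service, the job ID returned when the evaluation job was created, and a destination file path.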
Fixed Issues#
The following issues are fixed in this release.
Guardrails#
- Fixed the OpenAPI spec for the /v1/guardrail/chat/completions, /v1/guardrail/checks, and /v1/guardrail/completions endpoints to correctly specify that only 200 status codes are returned (removing the 201 option). Also fixed the OpenAPI spec to correctly include the text/event-stream response type for streaming mode.
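For clients that consume the streaming mode mentioned above, a text/event-stream body is a sequence of server-sent events. The following is a minimal, generic parser sketch for that format. It is not the Guardrails client, and it makes no assumption about what each event payload contains. The function name iter_sse_data is hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in a text/event-stream body."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive; ignore it.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event not terminated by a blank line is discarded at end of stream.


if __name__ == "__main__":
    sample = ["data: first\n", "\n", ": keep-alive\n", "data: second\n", "\n"]
    for payload in iter_sse_data(sample):
        print(payload)
```

The parser accepts any iterable of decoded lines, so it can be fed from an HTTP response iterated line by line after confirming a 200 status code.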
Known Issues#
- Llama Nemotron Nano and Super models:
  - Do not support LoRA adapters.
  - Will not properly deploy with Deployment Manager. You can alternatively deploy these models as NIMs using Helm.