Release Notes for NeMo Microservices

Check out the latest release notes for the NeMo microservices.

Tip

If you’ve installed one of the previous releases of the NeMo microservices using Helm and want to upgrade, choose one of the following options:

Release 25.9.0

The following are the key features and known issues for the NeMo microservices 25.9.0 release.

Key Features

The following features are added in this release.

Platform

NeMo Auditor

  • Added support for historical versioning of configs and targets. The purpose of the versioning is to create a record of additions, changes, and deletions. Refer to Auditor API, View Configuration History, and Viewing Target History.

  • Removed the report.logs.tgz job result artifact. The logs that were stored in the artifact are now available from the job logs.

  • Enhanced the response for job status. The response now includes a progress field that specifies the total number of probes to run and the number of probes that completed. The response also includes a message field that can assist with troubleshooting.

  • The microservice continues with early access availability for a second release and is subject to limited support and potential API changes in future releases.
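
The enhanced job status response described above can be consumed along these lines. This is a minimal sketch, not the Auditor API: the notes only state that the response has a progress field (total and completed probe counts) and a message field, so the nested key names `probes_total` and `probes_complete` are hypothetical placeholders. Check the Auditor API reference for the actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeProgress:
    """Local summary of an audit job status response (illustrative only)."""
    total: int
    completed: int
    message: Optional[str] = None

    @property
    def percent(self) -> float:
        # Guard against division by zero before any probes are scheduled.
        return 100.0 * self.completed / self.total if self.total else 0.0


def parse_progress(status: dict) -> ProbeProgress:
    """Extract progress and message from a status payload.

    The keys under "progress" are assumptions made for this sketch.
    """
    progress = status.get("progress") or {}
    return ProbeProgress(
        total=int(progress.get("probes_total", 0)),        # hypothetical key
        completed=int(progress.get("probes_complete", 0)),  # hypothetical key
        message=status.get("message"),
    )


if __name__ == "__main__":
    # Hand-written example payload, not captured from a real job.
    sample = {
        "progress": {"probes_total": 4, "probes_complete": 1},
        "message": "running",
    }
    p = parse_progress(sample)
    print(f"{p.completed}/{p.total} probes ({p.percent:.0f}%), message: {p.message}")
```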

NeMo Customizer

  • Added comprehensive support for Hugging Face model import and customization for models under 70B parameters.

  • Added support for OpenAI GPT-OSS models including GPT-OSS 20B and GPT-OSS 120B variants.

  • Exposed additional hyperparameters for fine-tuning jobs. Users can now configure optimizer algorithm, warmup steps, max steps, and the random seed. Consult the hyperparameter documentation to review all available hyperparameters.

  • Added validated support for multi-node training on GCP GKE with high-performance networking.

  • Enhanced GPU support with optimized configurations for H100 and B200 benchmarking for SFT workloads.

  • Updated transformers library support to maintain compatibility with the latest versions, including transformers 4.52.0.

NeMo Data Designer

  • Simplified ModelConfig schema by removing the nested ApiEndpoint structure. The model field now accepts a string model identifier, and an optional provider field specifies which configured provider to use.

  • Enhanced model provider configuration with secure API key management and timeout parameter support. Model providers are now configured at deploy time, with support for sensitive API keys defined as environment variables or JSON secrets files. Users can now configure timeout values for LLM requests in their Model Configs to control request duration and improve reliability across endpoints with different scaling characteristics.

  • Added multi-modal context support for incorporating images into synthetic data generation pipelines.

  • Added custom validation support through remote endpoint integration. Users can now integrate domain-specific validation logic into synthetic data generation workflows using ValidationWithRemoteEndpointColumn.
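
The flattened ModelConfig shape described above can be pictured with a small stand-in. This is a sketch under stated assumptions, not the Data Designer SDK: `ModelConfigSketch` is a hypothetical local class, the `model` and `provider` field names come from the notes, and placing `timeout` at the top level with a unit of seconds is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in for the simplified ModelConfig shape."""
    model: str                       # plain string model identifier
    provider: Optional[str] = None   # name of a provider configured at deploy time
    timeout: Optional[float] = None  # assumed: seconds, top-level placement

    def __post_init__(self) -> None:
        # The nested ApiEndpoint structure is gone, so reject non-string values.
        if not isinstance(self.model, str) or not self.model:
            raise TypeError("model must be a non-empty string identifier")
        if self.timeout is not None and self.timeout <= 0:
            raise ValueError("timeout must be positive")

    def to_payload(self) -> dict:
        # Omit unset optional fields so the payload stays minimal.
        return {k: v for k, v in asdict(self).items() if v is not None}


if __name__ == "__main__":
    # "example/model" and "example-provider" are placeholder names.
    cfg = ModelConfigSketch(model="example/model", provider="example-provider", timeout=60.0)
    print(cfg.to_payload())
```

Because API keys live with the deploy-time provider configuration, nothing secret needs to appear in a config like this one.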

NeMo Evaluator

  • Added a prompt optimization task that uses DSPy MIPROv2 to evaluate and optimize LLM-as-a-Judge prompts.

  • Added a Safety Harness evaluation type for model safety and alignment that uses Nemotron Content Safety V2 (Aegis V2) and WildGuard.

  • Added a Simple Evals evaluation type that includes the GPQA, MMLU, Math Test 500, and AIME academic benchmarks.

  • Added support for all retriever metrics available in the pytrec_eval library.

Other Changes

The following changes are made in this release.

NeMo Guardrails

  • NemoGuard Jailbreak Detection NIM v1.10.1 with B200 support has been released. You can use it with NeMo Guardrails to detect jailbreak attempts on the services you run on B200 GPUs.

  • (Documentation) Revamped the NeMo Guardrails tutorials to use the latest NeMo microservices APIs and Python SDK.

  • (Documentation) Added a new NeMo Guardrails end-to-end workflow tutorial, Integrate NeMo Guardrails with NemoGuard NIM Microservices. It shows how to set up NeMo Guardrails with NemoGuard NIM and LLM NIM microservices. With this tutorial, you can achieve the following:

    • Install NeMo Guardrails with Helm on Kubernetes.

    • Use LLM-based NemoGuard NIM microservices and LLM NIM microservices with the key-value cache reuse feature enabled.

    • Enable parallel guardrail execution to reduce inference response time.

  • (Documentation) Revamped the NeMo Guardrails Helm installation guide.

NeMo Evaluator

  • Added generation parameters to RAG evaluations to speed up evaluation: generation_max_tokens, generation_max_workers, and generation_temperature.

  • Fixed an incorrect samples_processed value reported for running custom evaluations.

  • Fixed an issue where jobs that completed with no metrics were not marked as failed.

  • Fixed cleanup of failed jobs due to retry.

  • Improved job validation.

  • Improved job error handling for issues with dataset download.
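
The three RAG generation parameters named above could be assembled and sanity-checked as follows. This is only a sketch: the parameter names come from the notes, but the helper `rag_generation_params`, its default values, its validation rules, and where the resulting dictionary belongs inside an evaluation config are all assumptions.

```python
def rag_generation_params(
    max_tokens: int = 256,
    max_workers: int = 4,
    temperature: float = 0.0,
) -> dict:
    """Build the generation parameters for a RAG evaluation (hypothetical helper).

    Defaults are arbitrary illustrative values, not the service defaults.
    """
    if max_tokens <= 0:
        raise ValueError("generation_max_tokens must be positive")
    if max_workers <= 0:
        raise ValueError("generation_max_workers must be positive")
    if temperature < 0:
        raise ValueError("generation_temperature must be non-negative")
    return {
        "generation_max_tokens": max_tokens,    # caps answer length per sample
        "generation_max_workers": max_workers,  # parallel generation requests
        "generation_temperature": temperature,
    }


if __name__ == "__main__":
    print(rag_generation_params(max_tokens=128, max_workers=8))
```

Shorter answers and more workers are the two settings most likely to reduce wall-clock time, at the cost of truncated generations or higher load on the endpoint.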

Known Issues

NeMo Auditor

  • When adding an audit job configuration, the following limitations are known:

    • Specifying the reporting.taxonomy field has no effect.

    • Specifying run.probe_tags results in an error state for an audit job.
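
A client-side check can catch both limitations before a configuration is submitted. The sketch below assumes the configuration is available as a nested dictionary in which the dotted paths reporting.taxonomy and run.probe_tags map to nested keys. The function name `find_known_issue_fields` is hypothetical and is not part of any SDK.

```python
from typing import List


def find_known_issue_fields(config: dict) -> List[str]:
    """Return warnings for fields affected by the 25.9.0 known issues."""
    warnings: List[str] = []
    reporting = config.get("reporting") or {}
    run = config.get("run") or {}
    if "taxonomy" in reporting:
        warnings.append("reporting.taxonomy has no effect in 25.9.0")
    if "probe_tags" in run:
        warnings.append("run.probe_tags puts the audit job in an error state in 25.9.0")
    return warnings


if __name__ == "__main__":
    # Illustrative config fragment with placeholder values.
    candidate = {"reporting": {"taxonomy": "example"}, "run": {"probe_tags": ["example"]}}
    for warning in find_known_issue_fields(candidate):
        print("WARNING:", warning)
```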

NeMo Evaluator

  • Simple Evals, Safety Harness, RAG, and Retriever jobs do not report samples_processed for job progress tracking.