Release Notes for NeMo Microservices

Check out the latest release notes for the NeMo microservices.

Release 25.4.0

Summary

This is the first general availability (GA) release of the NeMo microservices.

To learn about the NeMo microservices, start from the following links:

  • About NeMo Microservices: Learn about the NeMo microservices. Refer to Overview of NeMo Microservices.

  • Key Features: Discover the key features of NeMo microservices. Refer to Key Features.

  • Core Concepts: Understand the core concepts of NeMo microservices. Refer to Concepts.

  • Get Started: Get started with NeMo microservices on a minikube cluster. Refer to About Getting Started with NeMo Microservices.

Known Issues

  • The following are the known issues for the DGX Cloud Admission Controller microservice.

    • There is a known vulnerability in the Go crypto library that can expose the system to a DDoS attack if you change the default configuration in the DGX Cloud Admission Controller Helm chart values file. Do not enable runaicontroller or dgxcExport. For more information, refer to the DGX Cloud Admission Controller Helm installation page.

  • The following are the known issues for the Evaluator API.

    • The cancel endpoint is not available for evaluation jobs.

    • The logs endpoint is not available for evaluation jobs. Instead, use the download-results endpoint. For more information, refer to Get Evaluation Results.

    • The PATCH method is not supported.

  • The following are the known issues for NeMo Evaluator.

    • For tool-calling evaluation jobs, the nemo-ms-evaluator-about is delayed when type information is incomplete. Tool calls might take more than 30 seconds if the descriptions for array types lack items specifications, or if the descriptions for object types lack properties specifications. Include these details in every tool description. For more information, refer to Custom Tool Calling Evaluation.

    • For tool-calling evaluation jobs, the microservice currently does not support functions with more than 8 parameters. Tool calls might freeze the NIM if a tool description includes a function with more than 8 parameters. If this occurs, restart the NIM. For more information, refer to Custom Tool Calling Evaluation.

    • When you run an LM Evaluation Harness evaluation of type gsm8k or one of its variants with the chat template flag applied, the results for a subset of model endpoints differ from their corresponding public benchmark results. Prompt tokens from the model server add an extra beginning token.
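
The two tool-calling issues listed for NeMo Evaluator can be checked before a job is submitted. The following is a minimal sketch, not part of the NeMo microservices: it assumes tool descriptions follow an OpenAI-style layout in which each tool has a function entry whose parameters field is a JSON Schema object. The helper names find_schema_gaps and check_tool and the search_flights tool are hypothetical. It reports array types without items, object types without properties, and functions with more than 8 parameters.

```python
from typing import Any

MAX_PARAMETERS = 8  # limit stated in the known issue for tool-calling jobs


def find_schema_gaps(schema: Any, path: str = "parameters") -> list[str]:
    """Return messages for arrays lacking 'items' and objects lacking 'properties'."""
    if not isinstance(schema, dict):
        return []
    gaps: list[str] = []
    schema_type = schema.get("type")
    if schema_type == "array":
        if "items" not in schema:
            gaps.append(f"{path}: array type lacks 'items'")
        else:
            # Recurse so nested arrays and objects are checked as well.
            gaps.extend(find_schema_gaps(schema["items"], f"{path}.items"))
    elif schema_type == "object":
        if "properties" not in schema:
            gaps.append(f"{path}: object type lacks 'properties'")
        else:
            for name, sub_schema in schema["properties"].items():
                gaps.extend(find_schema_gaps(sub_schema, f"{path}.{name}"))
    return gaps


def check_tool(tool: dict[str, Any]) -> list[str]:
    """Collect problems for one tool description (assumed OpenAI-style layout)."""
    parameters = tool.get("function", {}).get("parameters", {})
    problems = find_schema_gaps(parameters)
    count = len(parameters.get("properties", {}))
    if count > MAX_PARAMETERS:
        problems.append(
            f"function has {count} parameters; limit is {MAX_PARAMETERS}"
        )
    return problems


# Hypothetical tool description with two incomplete type specifications.
example_tool = {
    "type": "function",
    "function": {
        "name": "search_flights",
        "parameters": {
            "type": "object",
            "properties": {
                "airports": {"type": "array"},   # missing 'items'
                "filters": {"type": "object"},   # missing 'properties'
                "date": {"type": "string"},
            },
        },
    },
}

for problem in check_tool(example_tool):
    print(problem)
```

Running a check like this over every tool description before you create the evaluation job is one way to catch the conditions that lead to slow or frozen tool calls. If your tool descriptions use a different layout, adjust the lookup in check_tool accordingly.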