Release Notes#
v2.0.1#
Features#
Added support for NeMo microservices v25.6.0. Updated the following NeMo microservices custom resources to support NeMo microservices v25.6.0:

The NeMo Evaluator custom resource now supports updated `evaluationImages` for evaluation jobs.

The NeMo Customizer custom resource now supports Customization Targets to represent a model that can be customized (fine-tuned) using the Customizer service. Use `data.customizationTargets` to define your model targets in your Customizer model ConfigMap instead of `data.models`, which was used in previous versions. Refer to the NeMo Customization Targets documentation for more details.
Refer to the NeMo microservices release notes for a full list of changes in this release.
Added the `spec.proxy` parameter to the NIM cache and NIM service custom resources. This improves support for clusters operating behind HTTP proxies by allowing you to specify your proxy configuration with the `spec.proxy` parameter. You must configure it in both your NIM cache and NIM service custom resources. This parameter replaces the `spec.certConfig` parameter, which is now deprecated.

Added support for specifying gRPC and metrics ports in the NIM service custom resource for non-LLM NIMs running a Triton Inference Server. (PR #490)
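As a minimal sketch of the new proxy support, a NIM cache resource using `spec.proxy` might look like the following. The `apiVersion`, the `spec.proxy` subfield names, and all values are illustrative assumptions, not confirmed by these notes; check the NIM Operator CRD reference for the exact schema.

```yaml
# Hypothetical sketch: apiVersion, proxy subfield names, and values are
# assumptions; verify against the NIM Operator CRD reference.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMCache
metadata:
  name: example-nim-cache
spec:
  proxy:
    httpProxy: http://proxy.example.com:3128      # assumed subfield
    httpsProxy: http://proxy.example.com:3128     # assumed subfield
    noProxy: localhost,127.0.0.1,.cluster.local   # assumed subfield
```

As the note above states, the same `spec.proxy` block must also be set in the corresponding NIM service custom resource.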
Added support for specifying a custom scheduler in a NIM service custom resource. Use `spec.schedulerName` to specify the name of the scheduler to use for NIM jobs. If no custom scheduler name is set, the default Kubernetes scheduler is used. (PR #489)

Added support for size limits on the emptyDir volume created for NIMService deployments and NeMoCustomizer training jobs. Specify either `.spec.storage.sharedMemorySizeLimit` in the NIM service custom resource or `.spec.training.sharedMemorySizeLimit` in the NeMo Customizer custom resource to set a shared memory limit. By default, an emptyDir volume is created with no size limit. (PR #492)

Added support for setting annotations on PVCs created by the NIM Operator. (PR #508)
Added support for pulling models and datasets from Hugging Face Hub and NeMo Data Store into your NIM cache using the `spec.source.hf` and `spec.source.dataStore` parameters.
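A sketch of a NIM cache pulling from Hugging Face Hub with `spec.source.hf`; only the `spec.source.hf` and `spec.source.dataStore` parameter names come from these notes, so the subfields shown are assumptions that illustrate the shape such a resource might take:

```yaml
# Hypothetical sketch: apiVersion and the subfields under spec.source.hf
# are assumptions; consult the NIM cache CRD reference for the real schema.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMCache
metadata:
  name: example-hf-cache
spec:
  source:
    hf:
      endpoint: https://huggingface.co   # assumed subfield
      authSecret: hf-api-secret          # assumed subfield
```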
Bug fixes#
Updated the NIM service status to include model information for non-LLM NIMs. (PR #498)
Fixed an issue where the resource field was missing in Helm upgrade hooks, which could prevent the upgrade CRD jobs from running properly.
Fixed a bug where the NIM pipeline and NIM service status was incorrectly marked as `Failed` while the NIM cache was still in progress. (PR #504)

Fixed an issue where the NIM cache would fail to find non-LLM models when no profile was specified in the NIM cache custom resource. (PR #513)
Fixed an issue in the Data Flywheel with Jupyter notebook tutorial where the NIM Operator was not creating PVCs when the default storage class used a provisioner other than “local-path”.
Removals and Deprecations#
Deprecated the NIM cache `spec.certConfig` parameter. If you were using the `spec.certConfig` parameter to specify custom CA certificates in previous versions, update your NIM cache resources to use `spec.proxy` and add your proxy configuration to the NIM service. Refer to Supporting Custom CA Certificates for details.
Known Issues#
There is a known issue with NVIDIA NeMo Retriever models where filtering does not work correctly when `nimcache.spec.source.ngc.model.engine` is set to “tensorrt”. This issue is due to a swap in the product-name string between the node label and the model manifest.
v2.0.0#
Features#
Added support for deploying the NVIDIA NeMo microservices as custom resources with the NVIDIA NIM Operator. NVIDIA NeMo microservices are a modular set of tools that you can use to customize, evaluate, and secure large language models (LLMs) while optimizing AI applications across on-premises or cloud-based Kubernetes clusters. Deploying these microservices with the NIM Operator provides the ability to manage your AI workflows across your Kubernetes cluster.
The NIM Operator supports deploying the following NeMo microservices to your cluster as custom resources:
NeMo core microservices
NeMo Customizer
NeMo Evaluator
NeMo Guardrails
NeMo platform microservices
NeMo Data Store
NeMo Entity Store
Get started with the NeMo microservices.
Refer to the NeMo microservices documentation for details about using these microservices.
Improved NIMService status to detail model information including the cluster endpoint, external endpoint, and name of the model the service is connected to.
Updated required fields for NIMService and NIMPipeline custom resources including:
`nimservice.spec.image.repository`
`nimservice.spec.image.tag`
`nimpipeline.spec.image.repository`
`nimpipeline.spec.image.tag`
`nimpipeline.spec.expose`
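A minimal NIMService manifest exercising the required image fields listed above might look like the following; the `apiVersion` and the repository and tag values are illustrative assumptions:

```yaml
# Hypothetical sketch: apiVersion and values are illustrative assumptions.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: example-nim-service
spec:
  image:
    repository: nvcr.io/nim/meta/llama3-8b-instruct   # required
    tag: "1.0.0"                                      # required
```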
Enable caching of pre-built TRT-LLM engines and optimized artifacts for non-LLM NIMs, such as Riva and BioNeMo NIMs.
Add support for configuring annotations and security contexts for the NIM Operator deployment via Helm values, fixing issue #333.
Decouple NIM Operator upgrades from NIMService upgrades. NIMService pods are now only restarted when their corresponding CR specifications are updated.
Improve support for clusters operating behind HTTP proxies, including injection of proxy environment variables and custom CA certificates.
Bug fixes#
Fixed an issue where an empty storage class was set in the PVCs created by the NIM Operator for caching NIMs.
v1.0.1#
Breaking Changes#
Renamed the `spec.gpuSelectors` field in the NIM cache custom resource to `spec.nodeSelector`. The purpose of the field remains the same: to specify the node selector labels for scheduling the caching job. Refer to About the NIM Cache Custom Resource Definition.

Changed the Operator pod metrics from HTTPS protocol on port `8443` to HTTP protocol on port `8080`.
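The rename can be sketched as the following before/after fragment; the node label and value are illustrative assumptions, not taken from these notes:

```yaml
# v1.0.0 (old field, removed):
# spec:
#   gpuSelectors:
#     feature.node.kubernetes.io/pci-10de.present: "true"   # illustrative label
# v1.0.1 (new field, same purpose):
spec:
  nodeSelector:
    feature.node.kubernetes.io/pci-10de.present: "true"     # illustrative label
```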
Features#
Added a `spec.env` field to the NIM cache custom resource to support environment variables. One use of the field is to specify variables such as `HTTPS_PROXY` for air-gapped and specialized networks. Refer to Caching Models in Air-Gapped Environments.

Updated the `spec.expose.service.type` field in the NIM service custom resource to support common service types, such as `LoadBalancer`.

Added a `spec.runtimeClassName` field to the NIM service custom resource to support setting the runtime class on a NIM service deployment.

Removed the `kube-rbac-proxy` container from the Operator pod. This change improves the security posture of the Operator. Previously, you might have needed to provide TLS certificates when you configured Prometheus. With this release, you no longer need to provide the certificates.

Certified the Operator for use with Red Hat OpenShift Container Platform.
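The v1.0.1 additions above might be combined as in the following sketch; the `apiVersion` and all values are illustrative assumptions:

```yaml
# Hypothetical sketch: apiVersion and values are illustrative assumptions.
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMCache
metadata:
  name: example-nim-cache
spec:
  env:
  - name: HTTPS_PROXY                    # proxy for specialized or air-gapped networks
    value: http://proxy.example.com:3128
---
apiVersion: apps.nvidia.com/v1alpha1
kind: NIMService
metadata:
  name: example-nim-service
spec:
  runtimeClassName: nvidia               # assumed runtime class name
  expose:
    service:
      type: LoadBalancer                 # newly supported common service type
```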
v1.0.0#
Features#
NVIDIA NIM Operator is new.
Known Issues#
The container versions for the NeMo Retriever Text Embedding NIM and NeMo Retriever Text Reranking NIM are not publicly available and result in an image pull backoff error. The Operator and documentation were developed with release candidate versions of these microservices.
The Operator does not support configuring NIM microservices in a multi-node deployment.
For VMware vSphere with Tanzu clusters using vGPU software, to use an inference model that requires more than one GPU, the NVIDIA A100 or H100 GPUs must be connected with NVLink or NVLink Switch. These clusters also do not support multi-GPU models with L40S GPUs and vGPU software.
The Operator is not verified in an air-gapped network environment.
The sample RAG application cannot be deployed on Red Hat OpenShift Container Platform.
The Operator has a transitive dependency on go.uber.org/zap v1.26.0. Findings indicate Cross-Site Scripting (XSS) vulnerabilities in the Zap package.