Release Notes

v1.0.1

Breaking Changes

  • Renamed the spec.gpuSelectors field in the NIM cache custom resource to spec.nodeSelector. The purpose of the field remains the same: to specify the node selector labels for scheduling the caching job. Refer to About the NIM Cache Custom Resource Definition.

  • Changed the Operator pod metrics from HTTPS protocol on port 8443 to HTTP protocol on port 8080.
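
For manifests that still use the old field name, the rename can be applied mechanically. The following Python sketch is an illustration only, not an official migration tool. The function name rename_gpu_selectors, the apiVersion value, the resource name, and the node label are assumptions chosen for the example; verify them against your own cluster before relying on them.

```python
import copy
import json


def rename_gpu_selectors(manifest):
    """Return a copy of a NIM cache manifest that uses spec.nodeSelector.

    Hypothetical helper for illustration; it is not part of the Operator.
    """
    migrated = copy.deepcopy(manifest)
    spec = migrated.get("spec")
    if isinstance(spec, dict) and "gpuSelectors" in spec:
        selectors = spec.pop("gpuSelectors")
        # If nodeSelector is already set, keep it and drop the old field.
        spec.setdefault("nodeSelector", selectors)
    return migrated


# Example manifest written for v1.0.0. The apiVersion, name, and label
# are assumptions for this sketch; check them against your cluster.
old_manifest = {
    "apiVersion": "apps.nvidia.com/v1alpha1",
    "kind": "NIMCache",
    "metadata": {"name": "example-nim-cache"},
    "spec": {"gpuSelectors": {"nvidia.com/gpu.present": "true"}},
}

new_manifest = rename_gpu_selectors(old_manifest)
print(json.dumps(new_manifest["spec"], indent=2))
```

The helper copies the input rather than editing it in place, so the original manifest remains available for comparison.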

Features

  • Added a spec.env field to the NIM cache custom resource to support environment variables. One use of the field is to specify variables such as HTTPS_PROXY for air-gapped and specialized networks. Refer to Caching Models in Air-Gapped Environments.

  • Updated the spec.expose.service.type field in the NIM service custom resource to support common service types, such as LoadBalancer.

  • Added a spec.runtimeClassName field to the NIM service custom resource to support setting the runtime class on a NIM service deployment.

  • Removed the kube-rbac-proxy container from the Operator pod. This change improves the security posture of the Operator. Previously, you might have needed to provide TLS certificates when you configured Prometheus. With this release, you no longer need to provide the certificates.

  • Certified the Operator for use with Red Hat OpenShift Container Platform.
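
The new fields can be pictured as manifest fragments, built here as Python dictionaries. This is a sketch under assumptions. Only the field paths spec.env, spec.expose.service.type, and spec.runtimeClassName come from the notes above. The helper names, the proxy URL, the NO_PROXY variable, the name/value shape of each spec.env entry (borrowed from the usual Kubernetes container convention), and the nvidia runtime class name are illustrative.

```python
import json


def nim_cache_env_fragment(https_proxy, no_proxy=None):
    """Build a spec fragment that sets proxy variables on a NIM cache.

    Hypothetical helper; the entry shape (name/value) is an assumption.
    """
    env = [{"name": "HTTPS_PROXY", "value": https_proxy}]
    if no_proxy:
        env.append({"name": "NO_PROXY", "value": no_proxy})
    return {"spec": {"env": env}}


def nim_service_fragment(service_type="LoadBalancer", runtime_class=None):
    """Build a spec fragment for a NIM service (hypothetical helper)."""
    spec = {"expose": {"service": {"type": service_type}}}
    if runtime_class is not None:
        spec["runtimeClassName"] = runtime_class
    return {"spec": spec}


# The proxy address and runtime class name below are placeholders.
cache_fragment = nim_cache_env_fragment(
    "http://proxy.example.com:3128", no_proxy="10.0.0.0/8"
)
service_fragment = nim_service_fragment(runtime_class="nvidia")

print(json.dumps(cache_fragment, indent=2))
print(json.dumps(service_fragment, indent=2))
```

These fragments are not complete manifests. Other required fields are omitted, so merge the fragments into your existing NIM cache and NIM service resources before applying them.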

v1.0.0

Features

  • NVIDIA NIM Operator is new.

Known Issues

  • The container versions for the NeMo Retriever Text Embedding NIM and NeMo Retriever Text Reranking NIM are not publicly available and result in an image pull back-off (ImagePullBackOff) error. The Operator and documentation were developed with release candidate versions of these microservices.

  • The Operator does not support configuring NIM microservices in a multi-node deployment.

  • For VMware vSphere with Tanzu clusters using vGPU software, to use an inference model that requires more than one GPU, the NVIDIA A100 or H100 GPUs must be connected with NVLink or NVLink Switch. These clusters also do not support multi-GPU models with L40S GPUs and vGPU software.

  • The Operator is not verified in an air-gapped network environment.

  • The sample RAG application cannot be deployed on Red Hat OpenShift Container Platform.

  • The Operator has a transitive dependency on go.uber.org/zap v1.26.0. Findings indicate Cross-Site Scripting (XSS) vulnerabilities in the Zap package.