NVIDIA Enterprise RAG LLM Operator

Release Notes

Initial Release

The initial release of the NVIDIA Enterprise RAG LLM Operator enables NVIDIA AI Enterprise customers to deploy an Operator that manages the life cycle of the following key components for RAG pipelines:

  • NVIDIA Inference Microservice

  • NVIDIA NeMo Retriever Embedding Microservice

NVIDIA provides a sample RAG pipeline that demonstrates deploying an LLM model, pgvector as a sample vector database, a chatbot web application, and a query server that communicates with the microservices and the vector database.

Known Issues

  • Autoscaling the microservices is not operational. As a workaround, you can scale the microservices manually with the kubectl scale sts <statefulset-name> --replicas=<n> command.

  • Changes to a Helm pipeline specification might not roll out automatically after you apply them. As a workaround, you can trigger a rollout with the kubectl rollout restart sts <statefulset-name> command.

  • The Operator is not verified in an air-gapped network environment.

© Copyright 2024, NVIDIA. Last updated on Mar 21, 2024.