NeMo Evaluator Deployment Guide#
You can deploy NVIDIA NeMo Evaluator by using Helm. To use Helm, you must have a Kubernetes cluster installed and Helm ready to use.
Prerequisites#
Dependencies:
Argo Workflows (Evaluation jobs are orchestrated by Argo Workflows.)
Milvus (Used for Retriever and RAG evaluations.)
PostgreSQL (The persistent data store for NeMo Evaluator.)
For Argo Workflows, Milvus, and PostgreSQL, you can install them with the default values from the NeMo Evaluator Helm chart, or you can use your own versions.
Kubernetes
Secrets:
Install NIM for LLMs#
Use the following documentation to install NIM for LLMs. The service must be running before you can use NeMo Evaluator.
The inference URL for each evaluation job is specified at the API request level. Make sure the target model is running with NIM for LLMs before you create the evaluation target and submit an evaluation job for it.
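Before you create the target, you can confirm that the NIM for LLMs service is reachable. The following commands are a minimal check; the namespace, service name, and local port are placeholders for your own deployment, and the /v1/models path assumes the default NIM for LLMs OpenAI-compatible API on port 8000.
kubectl -n <nim-namespace> get svc
kubectl -n <nim-namespace> port-forward service/<nim-service> 8000:8000
curl http://localhost:8000/v1/models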
Install and Configure NeMo Data Store#
Use the following documentation to install NeMo Data Store. The service must be running before you can use NeMo Evaluator.
In the custom-values.yaml file, set the evaluator.external.dataStore.endpoint key to the NeMo Data Store endpoint.
external:
  dataStore:
    # external.dataStore.endpoint references the endpoint URL of a NeMo Data Store installed externally to this chart
    endpoint: "http://<nemo-data-store-service>.<nemo-data-store-namespace>.svc.cluster.local:8000"
Configure Argo Workflows#
NeMo Evaluator orchestrates evaluation jobs using Argo Workflows. You can install Argo Workflows from the NeMo Evaluator Helm chart, or you can use your own Argo Workflows installation.
Install Argo Workflows with the Helm chart#
By default, the NeMo Evaluator Helm chart installs Argo Workflows by using the Argo Workflows Helm chart.
The following code snippet shows the default Argo Workflows configuration in the NeMo Evaluator Helm chart, with argoWorkflows.enabled set to true and the following pre-configured parameters:
crds: Install the CRDs required to run an Argo Workflows server. There are two options: set it to true to install the CRDs from Helm, or set it to false and install them manually the first time. Refer to the custom resource definition section in the Argo Workflows documentation for more information.
argoServiceAccount: Create a service account to execute the workflow.
argoWorkflows.server.authModes: Set the authentication mode to server.
argoWorkflows:
  enabled: true
  serviceName: argo-workflows-server
  server:
    authModes:
      - "server"
    servicePort: 2746
    secure: true
  crds:
    install: false
external:
  argoWorkflows:
    endpoint: ""
argoServiceAccount:
  create: true
  name: workflow-executor
Warning
Due to a known issue with certain Argo Workflows CRDs being installed at the cluster scope, Argo Workflows can only be installed once, at the cluster level. This limits re-installation, and installation of Argo Workflows within a namespace, when using the NeMo Evaluator Helm chart. Therefore, we strongly recommend that you choose the setup option with a pre-installed Argo Workflows, as described in the following section.
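To check whether Argo Workflows CRDs are already installed in your cluster before you choose an installation option, you can list them. This is a generic check that assumes only that kubectl has access to the cluster.
kubectl get crd | grep argoproj.io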
Use Your Own Argo Workflows#
To connect to your own pre-installed Argo Workflows, set argoWorkflows.enabled to false in custom-values.yaml and configure the endpoint details.
argoWorkflows:
  enabled: false
external:
  argoWorkflows:
    endpoint: "<url to the pre-installed argo workflows>"
Configure Milvus#
The Milvus vector database is used as a document store for the Retriever and RAG pipelines that are evaluated with NeMo Evaluator. You can install Milvus with the NeMo Evaluator Helm chart, or you can use your own Milvus installation.
Install Milvus with the Helm chart#
To install Milvus with the default settings from the NeMo Evaluator Helm chart, set milvus.enabled to true in custom-values.yaml.
Warning
The default Milvus installation in the NeMo Evaluator Helm chart uses the Milvus Helm chart. For production evaluations, connect to your own external Milvus (version 2.3.4 or later). Refer to Use Your Own Milvus for more information.
milvus:
  enabled: true
  serviceName: milvus
  cluster:
    enabled: false
  etcd:
    enabled: false
  pulsar:
    enabled: false
  minio:
    enabled: false
  tls:
    enabled: false
  standalone:
    persistence:
      enabled: true
      persistentVolumeClaim:
        size: 100Gi
        storageClass: standard
  extraEnv:
    - name: LOG_LEVEL
      value: error
  extraConfigFiles:
    user.yaml: |+
      etcd:
        use:
          embed: true
        data:
          dir: /var/lib/milvus/etcd
      common:
        storageType: local
Tip
For Retriever and RAG evaluations with large datasets, such as hotpotqa, we recommend that you set the storage size to at least 100Gi.
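After the chart is installed, you can confirm that the Milvus persistent volume claim was created with the expected size and storage class. The namespace below is a placeholder for your installation namespace.
kubectl -n <NAMESPACE> get pvc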
Use Your Own Milvus#
To connect to your own externally installed Milvus, set milvus.enabled to false in custom-values.yaml and configure the endpoint.
milvus:
  enabled: false
external:
  milvus:
    endpoint: "<url to the pre-installed milvus>"
Configure with External PostgreSQL#
By default, the NeMo Evaluator Helm chart uses the Bitnami PostgreSQL chart to deploy a PostgreSQL database. Refer to the PostgreSQL section for information on how to configure the microservice with an external PostgreSQL database.
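The following fragment is only a sketch of what an external database configuration typically looks like in custom-values.yaml. The key names (postgresql.enabled, externalDatabase.*) are assumptions for illustration; use the keys documented in the PostgreSQL section referenced above.
postgresql:
  enabled: false            # assumed key: disable the bundled Bitnami PostgreSQL
externalDatabase:           # assumed key: connection details for your own PostgreSQL
  host: "<postgres-host>"
  port: 5432
  database: "<database-name>"
  existingSecret: "<secret-with-credentials>"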
Port-forward the NeMo Evaluator Microservice#
You can verify the installation by launching an evaluation job. Use the following procedure.
Port-forward the microservice to your local machine, adjusting the service name based on your release name:
kubectl -n <NAMESPACE> port-forward service/myrelease-nemo-evaluator 7331:7331
After port-forwarding, your data scientists can use the local host URL to access the NeMo Evaluator microservice APIs.
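As a quick check, you can query the service through the forwarded port. The /health path is an assumption based on the excluded-URLs setting shown in the monitoring section below.
curl http://localhost:7331/health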
Monitor Your Installation#
NeMo Evaluator is auto-instrumented with the OpenTelemetry SDK. It can export traces, metrics, and logs in the OpenTelemetry-standard OTLP format.
By default, NeMo Evaluator is configured with exporters disabled. In this mode, only logs are printed to the console.
To enable OpenTelemetry exporters:
1. Set otelExporterEnabled to true in the Helm chart.
2. Configure OpenTelemetry by adding standard OTel environment variables to otelEnvVars, as shown in the following example.
otelEnvVars:
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://<otel-collector>:4317" # OTLP endpoint where the exporters send telemetry
  OTEL_SERVICE_NAME: "nemo-evaluator" # name of the service associated with the telemetry records
  OTEL_TRACES_EXPORTER: otlp # traces exporter, set to "none" if not needed
  OTEL_METRICS_EXPORTER: otlp # metrics exporter, set to "none" if not needed
  OTEL_LOGS_EXPORTER: otlp # logs exporter, set to "none" if not needed
  OTEL_PROPAGATORS: "tracecontext,baggage" # propagators configuration for tracing
  OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=$(NAMESPACE)" # additional OTel record attributes
  OTEL_PYTHON_EXCLUDED_URLS: "health" # URLs that are excluded from exporting telemetry
  OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED: "true" # enables auto-instrumentation for logging
You can configure NeMo Evaluator to export telemetry to your own pre-installed OpenTelemetry Collector or use the collector that is included with the NeMo Evaluator Helm chart.
Set opentelemetry-collector.enabled to true to install an OpenTelemetry Collector with the Helm chart. In this mode, OTEL_EXPORTER_OTLP_ENDPOINT is automatically set to the OpenTelemetry Collector endpoint.
Set zipkin.enabled to true to install Zipkin (a UI for tracing) with the Helm chart.
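For example, a minimal custom-values.yaml fragment that enables the exporters along with the bundled collector and Zipkin could look like the following; this only combines the keys described above.
otelExporterEnabled: true
opentelemetry-collector:
  enabled: true
zipkin:
  enabled: true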
Values#
| Name | Description | Value |
|---|---|---|
| zipkin.enabled | Specify whether this chart deploys Zipkin for tracing. | |
| opentelemetry-collector.enabled | Specify whether this chart deploys OpenTelemetry Collector for metrics. | |
| | OpenTelemetry Collector configurations. | |
| otelExporterEnabled | Enable OpenTelemetry exporters for NeMo Evaluator. | false |
| otelEnvVars | Env variables to configure OpenTelemetry for NeMo Evaluator; sane defaults in chart. | |
| | Log level for both OTLP and console exporters. | |
| | OpenTelemetry Collector configurations. Refer to the OpenTelemetry Setup documentation for details. | |