# Managing NeMo Data Stores
## About NeMo Data Store
NVIDIA NeMo Data Store manages the life cycle of models and datasets in your projects and simplifies the storage and retrieval of models, datasets, and evaluation results in your applications.
The NIM Operator deploys a NeMo Data Store as a Kubernetes custom resource, `nemodatastores.apps.nvidia.com`. A NeMo Data Store resource relies on a database and an object store that are configured to hold model data.
Read the NeMo Data Store documentation for details on using a data store.
## Prerequisites
All the common NeMo microservice prerequisites.
Note
You can use the NeMo Dependencies Ansible Playbook to deploy all the following NeMo Data Store microservice dependencies.
Storage
A PostgreSQL database to store repository, branch, and LFS metadata. It can be externally provisioned or hosted in your cluster. Refer to the Helm installation guide for more information on installing a PostgreSQL database.
Kubernetes
Access to an NFS-backed Persistent Volume that supports the `ReadWriteMany` access mode. The volume stores Git history for models and datasets. You can create a PVC and specify its name when you create the NeMo Data Store resource, or you can have the Operator create the PVC.
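If you create the PVC yourself, a manifest along the following lines is one way to request a `ReadWriteMany` volume. This is a minimal sketch, not a required configuration: the claim name `pvc-shared-data`, the `nemo` namespace, the storage class placeholder, and the `10Gi` size are assumptions that mirror the sample values used elsewhere on this page, so adjust them for your cluster.

```yaml
# Hypothetical pre-created claim; every value here is an assumption to adapt.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-shared-data          # name you later reference from the NeMo Data Store resource
  namespace: nemo                # same namespace as the NeMo Data Store
spec:
  accessModes:
    - ReadWriteMany              # required access mode for the shared Git history volume
  storageClassName: <storage-class-name>   # a storage class backed by NFS in your cluster
  resources:
    requests:
      storage: 10Gi
```

Apply the claim before you create the NeMo Data Store resource, and then specify the claim name in the resource.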
The NeMo Data Store relies on several Kubernetes secrets to manage model data. The NeMo Dependencies Ansible Playbook creates the default secrets, or you can apply the secrets file manually if you are deploying a NeMo Data Store as a separate component.
Download the NeMo Data Store default secrets file, and then apply it:
```console
$ kubectl apply -f nemo-datastore-default-secrets.yaml -n nemo
```
Create a secret for your database by creating a file called `nemo-datastore-secrets.yaml`, like the following example:
```yaml
---
apiVersion: v1
stringData:
  password: <ndspass>
kind: Secret
metadata:
  name: <datastore-pg-existing-secret>
  namespace: nemo
type: Opaque
---
```
Apply the manifest:
```console
$ kubectl apply -f nemo-datastore-secrets.yaml -n nemo
```
## Deploying a NeMo Data Store
Update the `<inputs>` in the following sample manifest with values for your cluster configuration.
Create a file, such as `nemo-datastore.yaml`, with contents like the following example:
```yaml
apiVersion: apps.nvidia.com/v1alpha1
kind: NemoDatastore
metadata:
  name: nemodatastore-sample
  namespace: nemo
spec:
  secrets:
    datastoreConfigSecret: "<nemo-ms-nemo-datastore>"
    datastoreInitSecret: "<nemo-ms-nemo-datastore-init>"
    datastoreInlineConfigSecret: "<nemo-ms-nemo-datastore-inline-config>"
    giteaAdminSecret: "<gitea-admin-credentials>"
    lfsJwtSecret: "<nemo-ms-nemo-datastore--lfs-jwt>"
  databaseConfig:
    credentials:
      user: <ndsuser>
      secretName: <datastore-pg-existing-secret>
      passwordKey: <password>
    host: <datastore-pg-postgresql>.<nemo>.svc.cluster.local
    port: 5432
    databaseName: <ndsdb>
  pvc:
    name: "pvc-shared-data"
    create: true
    storageClass: "<storage-class-name>"
    volumeAccessMode: ReadWriteMany
    size: "10Gi"
  expose:
    service:
      type: ClusterIP
      port: 8000
  image:
    repository: nvcr.io/nvidia/nemo-microservices/datastore
    tag: "25.04"
    pullPolicy: IfNotPresent
    pullSecrets:
      - ngc-secret
  replicas: 1
  resources:
    requests:
      memory: "256Mi"
      cpu: "500m"
    limits:
      memory: "512Mi"
      cpu: "1"
```
Refer to the Configuring a NeMo Data Store section for more details on configuration options.
Apply the manifest:
```console
$ kubectl apply -n nemo -f nemo-datastore.yaml
```
Optional: View the NeMo Data Store status:
```console
$ kubectl get nemodatastores.apps.nvidia.com -n nemo
```
*Partial output*
```output
NAME                   STATUS   AGE
nemodatastore-sample   Ready    5m
```
Optional: View information about the NeMo Data Store:
```console
$ kubectl describe nemodatastore nemodatastore-sample -n nemo
```
*Partial output*
```output
...
Conditions:
  Last Transition Time:  2025-04-24T02:02:36Z
  Message:               deployment "nemodatastore-sample" successfully rolled out
  Reason:                Ready
  Status:                True
  Type:                  Ready
  Last Transition Time:  2025-04-24T01:57:35Z
  Message:
  Reason:                Ready
  Status:                False
  Type:                  Failed
State:                   Ready
```
## Verify NeMo Data Store
1. List the NeMo Data Store services.
```console
$ kubectl get services -n nemo
```
*Example output*
```output
nemodatastore-sample ClusterIP XX.XXX.XXX.XXX <none> 8000/TCP 31m
```
1. Start a pod that has access to the `curl` command.
Substitute any pod that has the command and meets your organization's security requirements:
```console
$ kubectl run --rm -it -n default curl --image=curlimages/curl:latest -- ash
```
After the pod starts, you are connected to the `ash` shell in the pod.
1. Verify the NeMo Data Store.
```console
$ curl -X GET "http://nemodatastore-sample.nemo:8000/v1/hf/api/datasets"
```
## Configuring a NeMo Data Store
The following table describes the commonly modified fields of the NeMo Data Store custom resource.
| Field | Description | Default Value |
|---|---|---|
| | Specifies user-supplied annotations to add to the pod. | None |
| | Specifies the name of a generic secret that contains `NGC_API_KEY`. This allows deployments to pull images from the NVIDIA GPU Cloud image catalog. | None |
| | Specifies the external PostgreSQL configuration details. | None |
| | Specifies the password key used in the database credentials secret. | |
| | Specifies the secret name for the database credentials. | None |
| | Specifies the user for the database. | None |
| | Specifies the name for the database. | None |
| | Specifies the host for the database. | None |
| | Specifies the port for the database. | |
| | If you have an ingress controller, values like the sample ingress configuration after this table configure an ingress. | |
| | Specifies the network port number for the NeMo Data Store microservice. | |
| | Specifies the Kubernetes service type to create for the NeMo Data Store microservice. | |
| | Specifies the group for the pods. This value is used to set the security context of the pod in the | |
| | Specifies repository, tag, pull policy, and pull secret for the container image. | None |
| | Specifies the user-supplied labels to add to the pod. | None |
| | When set to | |
| | Specifies the location and credentials for accessing the external object store. | None |
| | Specifies the bucket name where LFS files are stored. This can be the name of an existing bucket, or the name of a bucket that the NIM Operator creates for you. | None |
| | Specifies the password key in the | None |
| | Specifies the secret name for the object store login. | None |
| | Specifies the user for the object store. | None |
| | Specifies the fully qualified object store endpoint. | None |
| | Specifies the CSP region where the bucket is hosted. | None |
| | When | |
| | Specifies whether SSL transport for the object store is enabled. | |
| | When set to | |
| | Specifies the name for the PVC. | None |
| | Specifies the size, in Gi, for the PVC to create. This field is required if you specify | None |
| | Specifies the StorageClass for the PVC to create. Leave empty if you have | None |
| | Specifies to create a subpath on the PVC and cache the model profiles in the directory. | None |
| | Specifies the access mode for the PVC to create. | None |
| | Specifies the number of NeMo Data Store replicas to have on the cluster. | None |
| | Specifies the memory and CPU request for the NeMo Data Store. | None |
| | Specifies the memory and CPU limits for the NeMo Data Store. | None |
| | When set to The | |
| | Specifies the required default secrets needed by the NeMo Data Store microservice. A default configuration of these secrets is shown in the sample secrets configuration after this table. | None |
| | Specifies the tolerations for the pods. | None |
| | Specifies the user ID for the pod. This value is used to set the security context of the pod in the | |

Sample ingress configuration:
```yaml
ingress:
  enabled: true
  spec:
    ingressClassName: nginx
    host: nemo-datastore.example.com
    paths:
      - path: /
        pathType: Prefix
```
Sample secrets configuration:
```yaml
secrets:
  datastoreConfigSecret: "<nemo-ms-nemo-datastore>"
  datastoreInitSecret: "<nemo-ms-nemo-datastore-init>"
  datastoreInlineConfigSecret: "<nemo-ms-nemo-datastore-inline-config>"
  giteaAdminSecret: "<gitea-admin-credentials>"
  lfsJwtSecret: "nemo-ms-nemo-datastore--lfs-jwt"
```
## Next Steps
Refer to the NeMo microservices documentation for details on