Managing NeMo Data Stores#

About NeMo Data Store#

NVIDIA NeMo Data Store manages the life cycle of models and datasets in your projects and simplifies the storage and retrieval of models, datasets, and evaluation results in your applications.

The NIM Operator deploys a NeMo Data Store as a Kubernetes custom resource, nemodatastores.apps.nvidia.com. A NeMo Data Store resource relies on a database and an object store that are configured to hold model data.

Read the NeMo Data Store documentation for details on using a data store.

Prerequisites#

Note

You can use the NeMo Dependencies Ansible Playbook to deploy all the following NeMo Data Store microservice dependencies.

Storage

  • A PostgreSQL database to store repository, branch, and LFS metadata. It can be externally provisioned or hosted in your cluster. Refer to the Helm installation guide for more information on installing a PostgreSQL database.

Kubernetes

  • Access to an NFS-backed Persistent Volume that supports the ReadWriteMany access mode. The volume stores the Git history for models and datasets.

    You can create a PVC and specify its name when you create the NeMo Data Store resource, or you can have the Operator create the PVC for you.

  • The NeMo Data Store relies on several Kubernetes secrets to manage model data. The NeMo Dependencies Ansible Playbook creates the default secrets, or you can apply the secrets file manually if you are deploying a NeMo Data Store as a separate component.

    Download the NeMo Data Store default secrets file.

    $ kubectl apply -f nemo-datastore-default-secrets.yaml -n nemo
    
  • Create a secret for your database by creating a file called nemo-datastore-secrets.yaml, like the following example.

    ---
    apiVersion: v1
    stringData:
      password: <ndspass>
    kind: Secret
    metadata:
      name: <datastore-pg-existing-secret>
      namespace: nemo
    type: Opaque
    ---
    

    Apply the manifest:

    $ kubectl apply -f nemo-datastore-secrets.yaml -n nemo
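
The PVC prerequisite listed above can be met with a PVC that you create yourself. The following manifest is a minimal sketch of such a PVC, not an official template: the name pvc-shared-data and the 10Gi size are borrowed from the sample NeMo Data Store manifest in this page, and the storage class is a placeholder that you must replace with an NFS-backed class available in your cluster.

```yaml
# Illustrative PVC for the NeMo Data Store. All values are assumptions;
# adjust them for your cluster before applying.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-shared-data              # assumed name; reference it in spec.pvc.name
  namespace: nemo
spec:
  accessModes:
    - ReadWriteMany                  # access mode required by the prerequisite
  storageClassName: <storage-class-name>   # placeholder for an NFS-backed class
  resources:
    requests:
      storage: 10Gi                  # assumed size
```

If you let the Operator create the PVC instead, you do not need this manifest.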
    

Deploying a NeMo Data Store#

Replace the <inputs> placeholders in the following sample manifest with values for your cluster configuration.

  1. Create a file, such as nemo-datastore.yaml, with contents like the following example:

    apiVersion: apps.nvidia.com/v1alpha1
    kind: NemoDatastore
    metadata:
      name: nemodatastore-sample
      namespace: nemo
    spec:
      secrets:
        datastoreConfigSecret: "<nemo-ms-nemo-datastore>"
        datastoreInitSecret: "<nemo-ms-nemo-datastore-init>"
        datastoreInlineConfigSecret: "<nemo-ms-nemo-datastore-inline-config>"
        giteaAdminSecret: "<gitea-admin-credentials>"
        lfsJwtSecret: "<nemo-ms-nemo-datastore--lfs-jwt>" 
      databaseConfig:
        credentials:
          user: <ndsuser>
          secretName: <datastore-pg-existing-secret>
          passwordKey: <password>
        host: <datastore-pg-postgresql>.<nemo>.svc.cluster.local
        port: 5432
        databaseName: <ndsdb>
      pvc:
        name: "pvc-shared-data"
        create: true
        storageClass: "<storage-class-name>"
        volumeAccessMode: ReadWriteMany
        size: "10Gi"
      expose:
        service:
          type: ClusterIP
          port: 8000
      image:
        repository: nvcr.io/nvidia/nemo-microservices/datastore
        tag: "25.04"
        pullPolicy: IfNotPresent
        pullSecrets:
          - ngc-secret
      replicas: 1
      resources:
        requests:
          memory: "256Mi"
          cpu: "500m"
        limits:
          memory: "512Mi"
          cpu: "1"
    

    Refer to the Configuring a NeMo Data Store section for more details on configuration options.

  2. Apply the manifest:

    $ kubectl apply -n nemo -f nemo-datastore.yaml
    
  3. Optional: View NeMo Data Store status:

    $ kubectl get nemodatastores.apps.nvidia.com -n nemo
    

    Partial Output

    NAME                   STATUS     AGE
    nemodatastore-sample   Ready      5m 
    
  4. Optional: View information about the NeMo Data Store:

    $ kubectl describe nemodatastore nemodatastore-sample -n nemo
    

    Partial Output

    ...
    Conditions:
     Last Transition Time:  2025-04-24T02:02:36Z
     Message:               deployment "nemodatastore-sample" successfully rolled out
    
     Reason:                Ready
     Status:                True
     Type:                  Ready
     Last Transition Time:  2025-04-24T01:57:35Z
     Message:
     Reason:                Ready
     Status:                False
     Type:                  Failed
    State:                  Ready
    
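
The sample manifest in step 1 asks the Operator to create the PVC. If you already created the PVC yourself, the pvc section might instead look like the following fragment. This is a sketch under that assumption: the PVC name is illustrative and must match the name of your existing PVC.

```yaml
# Fragment of a NemoDatastore spec that reuses an existing PVC.
# The PVC name is an assumption for illustration.
spec:
  pvc:
    name: "pvc-shared-data"   # name of the PVC that already exists in the namespace
    create: false             # the Operator does not create or delete the PVC
```

With create set to false, the storageClass field can be left empty, as described in the configuration table.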

Verifying a NeMo Data Store#

  1. List the NeMo Data Store services:

    $ kubectl get services -n nemo
    

    Example Output

    nemodatastore-sample     ClusterIP   XX.XXX.XXX.XXX   <none>        8000/TCP               31m
    
  2. Start a pod that has access to the curl command. Substitute any pod that has the command and meets your organization's security requirements:

    $ kubectl run --rm -it -n default curl --image=curlimages/curl:latest -- ash
    

    After the pod starts, you are connected to the ash shell in the pod.

  3. From the shell in the pod, verify the NeMo Data Store:

    curl -X GET "http://nemodatastore-sample.nemo:8000/v1/hf/api/datasets"
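
If an interactive pod is not practical, the same request can be issued from a one-off Kubernetes Job. The following manifest is a sketch rather than a documented procedure: the Job name nemodatastore-check is hypothetical, and the image, namespace, and URL are the ones used in the steps above.

```yaml
# Illustrative Job that sends the verification request once and exits.
# The Job name is hypothetical; the URL matches the curl command above.
apiVersion: batch/v1
kind: Job
metadata:
  name: nemodatastore-check
  namespace: default
spec:
  backoffLimit: 0                 # do not retry; a failure means the check failed
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: curl
          image: curlimages/curl:latest
          command:
            - curl
            - --fail              # return a non-zero exit code on HTTP errors
            - --silent
            - --show-error
            - http://nemodatastore-sample.nemo:8000/v1/hf/api/datasets
```

After you apply the manifest, the Job logs contain the response body, and the Job status indicates whether the request succeeded.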
    

Configuring a NeMo Data Store#

The following table describes the commonly modified fields for the NeMo Data Store custom resource.

Field

Description

Default Value

spec.annotations

Specifies the user-supplied annotations to add to the pod.

None

spec.authSecret (required)

Specifies the name of a generic secret that contains NGC_API_KEY. This allows deployments to pull images from the NVIDIA GPU Cloud image catalog.

None

spec.databaseConfig (required)

Specifies the external PostgreSQL configuration details.

None

spec.databaseConfig.credentials.passwordKey

Specifies the password key used in the database credentials secret.

password

spec.databaseConfig.credentials.secretName (required)

Specifies the secret name for the database credentials.

None

spec.databaseConfig.credentials.user (required)

Specifies the user for the database.

None

spec.databaseConfig.databaseName (required)

Specifies the name for the database.

None

spec.databaseConfig.host (required)

Specifies the host for the database.

None

spec.databaseConfig.port

Specifies the port for the database.

5432

spec.expose.ingress.enabled

When set to true, the Operator creates a Kubernetes Ingress resource for the NeMo Data Store. Specify the ingress specification in the spec.expose.ingress.spec field.

If you have an ingress controller, values like the following sample configure an ingress for the / endpoint.

ingress:
  enabled: true
  spec:
    ingressClassName: nginx
    host: nemo-datastore.example.com
    paths:
      - path: /
        pathType: Prefix

false

spec.expose.service.port (required)

Specifies the network port number for the NeMo Data Store microservice.

8000

spec.expose.service.type

Specifies the Kubernetes service type to create for the NeMo Data Store microservice.

ClusterIP

spec.groupID

Specifies the group for the pods. This value is used to set the security context of the pod in the runAsGroup and fsGroup fields.

2000

spec.image (required)

Specifies repository, tag, pull policy, and pull secret for the container image.

None

spec.labels

Specifies the user-supplied labels to add to the pod.

None

spec.metrics.enabled

When set to true, the Operator configures a Prometheus service monitor for the service. Specify the service monitor specification in the spec.metrics.serviceMonitor field. Refer to the Observability page for more details.

false

spec.objectStoreConfig

Specifies the location and credentials for accessing the external object store.

None

spec.objectStoreConfig.bucketName

Specifies the bucket name where LFS files are stored. This can be the name of an existing bucket, or the name of a bucket the NIM Operator will create for you.

None

spec.objectStoreConfig.credentials.passwordKey

Specifies the password key in the CredentialsSecret secret for the object store credentials.

None

spec.objectStoreConfig.credentials.secretName

Specifies the secret name for the object store login.

None

spec.objectStoreConfig.credentials.user

Specifies the user for the object store.

None

spec.objectStoreConfig.endpoint

Specifies the fully qualified object store endpoint.

None

spec.objectStoreConfig.region

Specifies the cloud service provider (CSP) region where the bucket is hosted.

None

spec.objectStoreConfig.serveDirect

When set to true, traffic is served directly from the object store.

true

spec.objectStoreConfig.ssl

Specifies whether SSL transport for the object store is enabled.

false

spec.pvc.create

When set to true, the Operator creates the PVC for you. If you delete a NeMo Data Store resource and this field was set to true, the Operator also deletes the PVC.

false

spec.pvc.name

Specifies the name for the PVC.

None

spec.pvc.size

Specifies the size, in Gi, for the PVC to create.

This field is required if you specify create: true.

None

spec.pvc.storageClass

Specifies the StorageClass for the PVC to create. Leave empty if you have create set to false and you already created the PVC.

None

spec.pvc.subPath

Specifies a subpath to create on the PVC. The model profiles are cached in this directory.

None

spec.pvc.volumeAccessMode

Specifies the access mode for the PVC to create.

None

spec.replicas

Specifies the number of NeMo Data Store replicas to run on the cluster.

None

spec.resources.requests

Specifies the memory and CPU requests for the NeMo Data Store.

None

spec.resources.limits

Specifies the memory and CPU limits for the NeMo Data Store.

None

spec.scale.enabled

When set to true, the Operator creates a Kubernetes horizontal pod autoscaler (HPA) for the NeMo Data Store. Specify the HPA specification in the spec.scale.hpa field.

The spec.scale.hpa field supports the following subfields: minReplicas, maxReplicas, metrics, and behavior. These fields correspond to the same fields in a horizontal pod autoscaler resource specification.

false

spec.secrets.datastoreConfigSecret (required)

Specifies the default secrets required by the NeMo Data Store microservice. A default configuration of these secrets is available in the NeMo Data Store default secrets file. For details on configuring these secrets, refer to the Gitea configuration documentation.

secrets:
  datastoreConfigSecret: "<nemo-ms-nemo-datastore>"
  datastoreInitSecret: "<nemo-ms-nemo-datastore-init>"
  datastoreInlineConfigSecret: "<nemo-ms-nemo-datastore-inline-config>"
  giteaAdminSecret: "<gitea-admin-credentials>"
  lfsJwtSecret: "<nemo-ms-nemo-datastore--lfs-jwt>"

None

spec.tolerations

Specifies the tolerations for the pods.

None

spec.userID

Specifies the user ID for the pod. This value is used to set the security context of the pod in the runAsUser field.

1000
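
To show how several of the optional fields in the table fit together, the following fragment sketches an external object store and autoscaling configuration. The field names come from the table above. Every value is a placeholder or an assumption for illustration, and the metrics entry assumes the standard Kubernetes autoscaling/v2 format, based on the statement that the spec.scale.hpa subfields correspond to the fields of a horizontal pod autoscaler specification.

```yaml
# Illustrative fragment of a NemoDatastore spec. All values are assumptions.
spec:
  objectStoreConfig:
    endpoint: <object-store-endpoint>     # fully qualified endpoint
    bucketName: <bucket-name>             # bucket that holds LFS files
    region: <region>
    ssl: true                             # enable SSL transport (default is false)
    serveDirect: true                     # serve traffic directly from the object store
    credentials:
      user: <object-store-user>
      secretName: <object-store-secret>
      passwordKey: <password-key>
  scale:
    enabled: true
    hpa:
      minReplicas: 1
      maxReplicas: 3                      # assumed upper bound
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 80      # assumed CPU utilization target
```

Merge only the sections you need into your own manifest, and keep the required fields from the deployment sample.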

Next Steps#