File Storage

The NeMo Platform Files service stores uploaded files and datasets. By default, it uses filesystem storage backed by a Kubernetes ReadWriteMany PersistentVolumeClaim (PVC). Alternatively, you can configure S3-compatible object storage.

Prerequisites

If you use S3 storage, you must provision and manage your own bucket and credentials. The platform does not create buckets or IAM roles, and it does not manage permissions on your behalf.

Storage Options

| Storage Type | Description |
|---|---|
| local (default) | Local filesystem via PVC |
| s3 | S3-compatible object storage |

Local Storage (Default)

By default, the Files service uses local filesystem storage. Files are stored on the shared PVC configured in Persistent Volumes.

The default configuration is equivalent to:

platformConfig:
  files:
    default_storage_config:
      type: local
      path: /vol/files

You do not need to add this to your values.yaml. Once the PVC is set up as described in Persistent Volumes, local storage works out of the box.

S3 Object Storage

For production workloads, S3-compatible storage is recommended over ReadWriteMany (RWX) PVCs for better performance. This option works with AWS S3, MinIO, Ceph, and other S3-compatible infrastructure.

When you use S3 for file storage, files are stored directly in your S3 bucket, but the shared PVC is still required for jobs storage.

Configuration

Configure S3 storage in your Helm values:

platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      region: us-east-1
      use_sdk_auth: true

The platform uses the boto3 credential chain for S3 authentication. It is the cluster administrator’s responsibility to ensure the platform pod has valid credentials available through one of the standard boto3 mechanisms:

  • IAM Roles for Service Accounts (IRSA) on EKS - credentials are automatically injected via service account annotations

  • Environment variables - AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY mounted into the pod

  • Shared credential files - AWS credentials file mounted at ~/.aws/credentials

See the boto3 credentials documentation for the full credential resolution order.
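For the IRSA option, the service account annotation can be sketched as follows. The `eks.amazonaws.com/role-arn` annotation is the standard EKS mechanism. The service account name, account ID, and role name are placeholders, and how your Helm values expose service account annotations depends on the chart, so treat this manifest as illustrative rather than as the platform's own configuration.

```yaml
# Illustrative ServiceAccount for IRSA on EKS.
# Assumptions: the name, account ID, and role name below are hypothetical placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nemo-platform  # hypothetical; use the service account your platform pod runs as
  annotations:
    # EKS injects temporary credentials for this IAM role into pods
    # that run under this service account.
    eks.amazonaws.com/role-arn: "arn:aws:iam::<account_id>:role/<role_name>"
```

With this mechanism no access keys are mounted into the pod. The boto3 credential chain picks up the injected web identity credentials on its own.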

Example: Injecting Credentials from Kubernetes Secrets

To inject S3 credentials from a Kubernetes secret, first create the secret:

kubectl create secret generic s3-credentials \
  --from-literal=access_key_id='<my_aws_access_key_id>' \
  --from-literal=secret_access_key='<my_aws_secret_access_key>'
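If you manage cluster resources declaratively, the same secret can be written as a manifest. This is a sketch equivalent to the command above, using the same secret name and keys. The values are placeholders, and a file containing real credentials should not be committed to source control.

```yaml
# Declarative equivalent of the kubectl command above.
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
type: Opaque
stringData:
  # stringData accepts plain text; Kubernetes stores it base64-encoded.
  access_key_id: "<my_aws_access_key_id>"
  secret_access_key: "<my_aws_secret_access_key>"
```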

Then use the api.env field in your Helm values with valueFrom to inject the credentials:

api:
  env:
    AWS_ACCESS_KEY_ID:
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: access_key_id
    AWS_SECRET_ACCESS_KEY:
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: secret_access_key

platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      region: us-east-1
      use_sdk_auth: true

Additional Options

Using a Prefix

You can use the prefix field to scope the Files service to a specific path within your bucket. The prefix functions like a directory path, making it useful for organizing files or sharing a bucket across multiple applications.

platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      prefix: nemo-platform/files
      region: us-east-1
      use_sdk_auth: true

All files will be stored under s3://my-nemo-bucket/nemo-platform/files/.
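As an illustration of sharing one bucket, two deployments could each point at the same bucket with a different prefix. The prefix values below are hypothetical, and each YAML document would go in that deployment's own values file.

```yaml
# values file for a hypothetical staging deployment
platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      prefix: staging/files      # hypothetical prefix
      region: us-east-1
      use_sdk_auth: true
---
# values file for a hypothetical production deployment
platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      prefix: production/files   # hypothetical prefix
      region: us-east-1
      use_sdk_auth: true
```

A prefix only separates object keys. Access isolation between the two deployments still depends on the bucket permissions you manage.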

S3-Compatible Storage

For S3-compatible storage such as MinIO or Ceph, specify a custom endpoint URL:

platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      endpoint_url: http://minio.minio-system.svc.cluster.local:9000
      region: us-east-1
      use_sdk_auth: true

Note

Some older S3-compatible systems may require signature_version: s3 instead of the default s3v4. Only change this if you encounter signature-related errors.
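A sketch of that override, assuming `signature_version` sits alongside the other `default_storage_config` fields listed in the Configuration Reference. The endpoint URL is a hypothetical placeholder for a legacy system.

```yaml
platformConfig:
  files:
    default_storage_config:
      type: s3
      bucket: my-nemo-bucket
      endpoint_url: http://legacy-s3.example.internal:9000  # hypothetical endpoint
      region: us-east-1
      use_sdk_auth: true
      signature_version: s3  # only for systems that reject the default s3v4
```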

Configuration Reference

| Field | Type | Default | Description |
|---|---|---|---|
| type | string | local | Storage type: local or s3 |
| bucket | string | - | S3 bucket name (required for S3) |
| prefix | string | "" | Optional path prefix within the bucket |
| region | string | - | AWS region (e.g., us-east-1) |
| endpoint_url | string | - | Custom S3 endpoint for S3-compatible storage |
| use_sdk_auth | boolean | - | Use boto3 credential chain for authentication |
| signature_version | string | s3v4 | AWS signature version (s3v4 or s3 for legacy systems) |