6. Storage User Guide#
Run:ai on DGX Cloud has a storage control feature currently designed to provide these functions:
Storage class encapsulation:
a standard set of storage classes to encapsulate shared storage across the CSP storage variants, allowing customers to clearly understand their quota management
Support for object storage mounts:
an API for mounting object storage into customer workloads
Volume protection:
an API for the retention and deletion of provisioned storage volumes
6.1. Storage Classes#
The storage control feature provides the following two custom storage classes:
dgxc-enterprise-file: the highest-performance shared filesystem

dgxc-standard-object: POSIX-mounted object storage
| Run:ai on DGX Cloud Storage Class | CSP | CSP Driver |
|---|---|---|
| dgxc-enterprise-file | AWS | FSx for Lustre |
| dgxc-standard-object | AWS | AWS Mountpoint |
6.1.1. Enterprise Storage Profiles#
These are the AWS storage classes that are encapsulated by dgxc-enterprise-file.
AWS-based clusters leverage shared storage provided by FSx Lustre.
| Service Tier | Storage Class | PVC Size | Deployment | Access Mode |
|---|---|---|---|---|
| FSx Lustre (SSD) | lustre-sc | 12-160 TiB | PERSISTENT 2 | read/write/many |
6.2. Object Storage#
Integrated object storage can be provisioned for AWS. You will first need the OIDC URL for the cluster which should be provided to you by your TAM as part of onboarding.
AWS Mountpoint is a POSIX-compatible object storage interface. However, it is important to understand how it differs from non-object filesystems with respect to the creation, modification, and deletion of files as objects. Also, in general, object storage has performance characteristics that are not well suited to many small files; it is instead optimal for fewer, larger files such as datasets.
Note
The following sets of instructions make use of multiple CLIs:

the CSP CLI (aws) to interact with your CSP account

kubectl to interact with the Run:ai on DGX Cloud Kubernetes cluster
6.2.1. Creating a New AWS Object Storage Resource#
The following steps for creating a new AWS object resource in Run:ai on DGX Cloud involve multiple interactions with the AWS console, the AWS CLI, and kubectl in the Run:ai on DGX Cloud cluster.
The first step is to create a new bucket, if necessary, in the AWS console at https://us-east-1.console.aws.amazon.com/s3.
This URL depends on the region where you want to provision the S3 bucket, so update the URL or change the region within the console
after logging in. In the following instructions, we will use a $BUCKET_NAME of “my-s3-bucket-$ACCOUNT_ID”.
Note
S3 bucket names exist at a global scope so it is important to ensure that the name of your bucket is reasonably unique. Appending
the $ACCOUNT_ID helps provide a unique name.
The next step is to create an OIDC identity provider in IAM at https://us-east-1.console.aws.amazon.com/iam. Again, set the region as appropriate.
The details of this will be the following:
Provider type: OpenID Connect
Provider URL: <the OIDC URL from your TAM>
Audience: sts.amazonaws.com
Next, use the following to capture your AWS account ID to a variable.
ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
echo "ACCOUNT ID: $ACCOUNT_ID"
Once the S3 bucket and OIDC identity provider are created, we will proceed with setting up Bash environment variables specific
to the AWS-related requirements. Specify a different $BUCKET_NAME if you used a custom name in the AWS console, ensuring it is globally unique.
BUCKET_NAME="my-s3-bucket-$ACCOUNT_ID"
S3_REGION="<Customer S3 region; for example: us-east-1>"
POLICY_NAME="s3-example-policy"
ROLE_NAME="s3-example-role"
ROLE_DESCRIPTION="Role for testing AWS S3"
OIDC_URL="<OIDC URL provided by your TAM>"
Note that the $S3_REGION will depend on where you provisioned your bucket.
We will create a new project in the NVIDIA Run:ai UI that will correspond to the Kubernetes namespace for this example. Note that NVIDIA Run:ai by default
prepends runai- to the project name, so the project s3-example in NVIDIA Run:ai becomes the Kubernetes namespace runai-s3-example.
Next we define some variables for the Kubernetes-related resources that will be used later. Specify a different $BUCKET_NAME
if you used a custom name in the AWS console.
NAMESPACE="runai-s3-example"
DATA_SOURCE_NAME="s3-example-data-source-1"
PVC_NAME="s3-example-pvc-1"
MOUNT_PATH="/mnt/s3_example"
SERVICE_ACCOUNT="default"
Now we will create a JSON file that defines an AWS Security Token Service (STS) role policy using the OIDC URL provided by your TAM. Note the use of Bash variable manipulation to omit the scheme from the URL. Then we create the role using the AWS CLI.
cat >aws-role.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$ACCOUNT_ID:oidc-provider/${OIDC_URL#https://}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "${OIDC_URL#https://}:aud": "sts.amazonaws.com",
          "${OIDC_URL#https://}:sub": "system:serviceaccount:$NAMESPACE:$SERVICE_ACCOUNT"
        }
      }
    }
  ]
}
EOF

aws iam create-role --role-name "$ROLE_NAME" \
  --assume-role-policy-document file://aws-role.json \
  --description "$ROLE_DESCRIPTION"
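The `${OIDC_URL#https://}` expansion above uses Bash shortest-prefix removal to strip the scheme, since IAM expects the provider without it. A quick illustration with a hypothetical OIDC URL:

```shell
# Hypothetical OIDC URL, for illustration only
OIDC_URL="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123"

# ${OIDC_URL#https://} removes the shortest match of "https://" from the
# front of the value, leaving only the host and path
echo "${OIDC_URL#https://}"   # → oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123
```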
Now we define and create the AWS policy we want for accessing the S3 bucket in another JSON file.
cat >aws-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ReadWrite",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::$BUCKET_NAME"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::$BUCKET_NAME/*"
    }
  ]
}
EOF

aws iam create-policy --policy-name "$POLICY_NAME" \
  --policy-document file://aws-policy.json
Now that we have created both a role and a bucket access policy, we will attach the policy to that role.
aws iam attach-role-policy --policy-arn arn:aws:iam::"$ACCOUNT_ID":policy/"$POLICY_NAME" \
  --role-name "$ROLE_NAME"
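The role ARN built from these variables is what the service account annotation in the next step references. A sketch with example values (the account ID below is purely illustrative):

```shell
# Example values, for illustration only
ACCOUNT_ID="123456789012"
ROLE_NAME="s3-example-role"

# The role ARN that the service account annotation will reference
echo "arn:aws:iam::$ACCOUNT_ID:role/$ROLE_NAME"   # → arn:aws:iam::123456789012:role/s3-example-role
```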
Kubernetes has the concept of service accounts as resources which are essentially non-human identities that can interact with the API server and other resources. Every namespace (NVIDIA Run:ai project) is provided a default service account which is used by pods and other resources if no other service account is specified. Here we will annotate the default service account with the AWS account and role that we created so that the pod that uses the PVC Data Source can access the bucket.
kubectl annotate sa default -n "$NAMESPACE" \
eks.amazonaws.com/role-arn=arn:aws:iam::"$ACCOUNT_ID":role/"$ROLE_NAME"
Finally, we can define and create an NvStorageLocation resource that binds the bucket to a PVC Data Source. Note that the
NvStorageLocation spec currently requires an arbitrary capacity that is unrelated to the size of the object storage.
Here is a list of valid mount options for AWS Mountpoint.
| Option | Description |
|---|---|
| --read-only | Mount file system in read-only mode |
| --allow-delete | Allow delete operations on file system |
| --allow-overwrite | Allow overwrite operations on file system |
| --auto-unmount | Automatically unmount on exit |
| --allow-root | Allow root user to access file system |
| --allow-other | Allow other users, including root, to access file system |
| --uid <UID> | Owner UID [default: current user's UID] |
| --gid <GID> | Owner GID [default: current user's GID] |
| --dir-mode <DIR_MODE> | Directory permissions [default: 0755] |
| --file-mode <FILE_MODE> | File permissions [default: 0644] |
cat >aws-object-data-source.yaml <<EOF
apiVersion: "storage.dgxc.nvidia.com/v1beta1"
kind: NvStorageLocation
metadata:
  name: "$DATA_SOURCE_NAME"
  namespace: "$NAMESPACE"
spec:
  description: "An example NvStorageLocation for an AWS bucket"
  volumeName: "$PVC_NAME"
  mountPath: "$MOUNT_PATH"
  pvc:
    objectSpec:
      endpointUrl: "https://s3.$S3_REGION.amazonaws.com"
      bucket: "$BUCKET_NAME"
      mountOptions:
        # Set a user and group ID if desired
        - uid=1000
        - gid=1000
        # The following is an example of a mount option we might choose to apply
        # allowing other users access to the filesystem
        - allow-other
    # Currently some capacity needs to be specified
    capacity: "1Gi"
EOF

kubectl apply -f aws-object-data-source.yaml
6.2.1.1. Multiple Pods and S3 Bucket Combinations for AWS#
In the case that one pod needs to access two or more S3 buckets, we can add the additional buckets to the policy defined in aws-policy.json
and then attach that single policy to the role. In the case that an S3 bucket needs to be accessed by pods in different namespaces,
we need to provide an IAM role like the one in aws-role.json for each namespace (and possibly for each service account, if not using default).
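As a sketch of the first case, the Resource entries in aws-policy.json can be turned into lists covering each bucket. In the variant below, my-other-bucket is a hypothetical second bucket name and the account ID in $BUCKET_NAME is an example value; substitute your own names:

```shell
# Example value; use your real bucket name
BUCKET_NAME="my-s3-bucket-123456789012"

# Hypothetical multi-bucket variant of aws-policy.json: each Resource
# becomes a list naming every bucket the pod should reach
cat >aws-policy-multi.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ReadWrite",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": [
        "arn:aws:s3:::$BUCKET_NAME",
        "arn:aws:s3:::my-other-bucket"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": [
        "arn:aws:s3:::$BUCKET_NAME/*",
        "arn:aws:s3:::my-other-bucket/*"
      ]
    }
  ]
}
EOF
```

The file would then replace aws-policy.json in the aws iam create-policy step above; the role attachment is unchanged.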
6.2.1.2. Mount Semantics for AWS S3#
AWS Mountpoint supports creating new objects in your S3 bucket by allowing writes to new files. If the --allow-overwrite flag is set at startup time,
Mountpoint also supports replacing existing objects by allowing writes to existing files, but only when the O_TRUNC flag is used at open time to truncate
the existing file. In both cases, writes must always start from the beginning of the file and must be made sequentially.
Mountpoint allows creating new directories with commands like mkdir. Creating a new directory is a local operation and no changes are made to your S3 bucket. A new directory will only be visible to other clients once a file has been written and uploaded inside it. You cannot remove or rename an existing directory with Mountpoint. However, you can remove a new directory created locally if no files have been written inside it. Mountpoint does not support hard or symbolic links.
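The write patterns above can be sketched with ordinary shell commands. The block below runs against a local temporary directory purely to show the command shapes; the comments describe what Mountpoint would and would not allow on a real mount, and the paths are illustrative:

```shell
# Local stand-in directory; on a real Mountpoint mount this would be the S3 mount path
DEMO_DIR=$(mktemp -d)

echo "first version" > "$DEMO_DIR/object.txt"    # new file, sequential write: allowed
echo "second version" > "$DEMO_DIR/object.txt"   # '>' opens with O_TRUNC: allowed only with --allow-overwrite
# echo "more" >> "$DEMO_DIR/object.txt"          # append to an existing object: rejected by Mountpoint

mkdir "$DEMO_DIR/staging"   # on Mountpoint this is local-only until a file is uploaded inside it

cat "$DEMO_DIR/object.txt"   # → second version
```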
Here are more detailed semantics for Mountpoint.
6.2.2. Destroying an Object Storage Resource#
The previously created NvStorageLocation resources can be destroyed by using the same YAML that created them to delete them from the cluster.
# for the AWS resource
kubectl delete -f aws-object-data-source.yaml
This will not destroy the CSP bucket itself; it only removes the PVC for it.
6.3. Volume Protection#
In the interest of data protection, Run:ai on DGX Cloud modifies any PersistentVolume request by default to have a retention policy of “Retain”, regardless of whether the underlying StorageClass has a default policy of “Delete”. This effectively means that all volumes will remain even after being released by a pod. This ensures that important data will not be inadvertently lost due to default Kubernetes cluster storage policies. However, it also means that these volumes can accumulate over time and continue to count against your CSP project storage quota.
Note
The following steps require the privileges described at Advanced Kubernetes Usage for Admins.
You can use the same NvStorage resource mentioned in Managing Your Storage Utilization (CLI) to change the retention policy of any volume. After you have definitively identified a volume that can be deleted, use the following command:
kubectl edit nvstorages dgxc-enterprise-file -n dgxc-tenant-cluster-policies
This command will launch your editor, where you can modify the retention policy for any existing volume. For example:
spec:
instances:
pvc-0ffa54ec-049b-4ad8-847e-8476b44e18ca:
name: pvc-0ffa54ec-049b-4ad8-847e-8476b44e18ca
persistentVolumeReclaimPolicy: Retain
This could be changed to:
spec:
instances:
pvc-0ffa54ec-049b-4ad8-847e-8476b44e18ca:
name: pvc-0ffa54ec-049b-4ad8-847e-8476b44e18ca
persistentVolumeReclaimPolicy: Delete
Once you save your changes, the volume will be deleted when it is no longer used. Note that a PersistentVolume will not be deleted until all of its pod references are gone; in other words, until there are no more pods in the cluster referring to the volume claim.
6.4. Data Volumes in Run:ai on DGX Cloud#
NVIDIA Run:ai has recently introduced the Data Volumes feature, which enables the replication of PersistentVolumeClaims (PVCs) and their PersistentVolumes (PVs) across Run:ai scopes:
multiple departments
multiple projects
The current implementation binds a PVC to a PV in each desired namespace, as is customary in Kubernetes. However, the implementation also makes it possible to create PVCs that are bound to PV replicas of each other across scopes. This is an ideal mechanism for AI research teams who want to share datasets, checkpoints, or any other data assets. The Data Volumes implementation treats the original PVC/PV as read-write, while the other PVCs created from the original are read-only. Run:ai is responsible for tracking this replica relationship. The origin PV and the replica PVs all point to the same underlying file storage volume.
An example Data Volume workflow:
1. Create a source PVC (src0) with read-write-many (rwx) access.

2. Click NEW DATA VOLUME in the UI.

3. Enter a unique name for the data volume.

4. Set the project where the data is located, and choose the PVC created in step 1.

5. In the scope, choose the project or department where you want to share the PVC.

6. After finishing with the data volume, delete the jobs that use the data volume.

7. Delete the data volume.

8. Finally, once the source PVC and its data are also ready for deletion, edit its PV retention policy as described in Volume Protection.
Warning
Since the Data Volumes implementation relies on all of the replica PVs pointing to the same file storage volume provided by the CSP, it is vital that users DO NOT explicitly delete a replica PV. You should only delete the PVs when you are certain that there are no longer any workload consumers of the volume and it is safe to finally delete the data.