RDG for DPF Zero Trust (DPF-ZT) with OVN VPC DPU service
Date: 2025-07-17
DPF Operator Installation

Install a CSI to Back the DPUCluster etcd

Download the local-path-provisioner Helm chart to your current working directory and create a namespace for it:

Jump Node Console

$ curl https://codeload.github.com/rancher/local-path-provisioner/tar.gz/v0.0.30 | tar -xz --strip=3 local-path-provisioner-0.0.30/deploy/chart/local-path-provisioner/
$ kubectl create ns local-path-provisioner

The following values will be used for the installation:

manifests/01-dpf-operator-installation/helm-values/local-path-provisioner.yml

tolerations:
  - operator: Exists
    effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
  - operator: Exists
    effect: NoSchedule
    key: node-role.kubernetes.io/master

Run the following command:

Jump Node Console

$ helm install -n local-path-provisioner local-path-provisioner ./local-path-provisioner --version 0.0.30 -f ./manifests/01-dpf-operator-installation/helm-values/local-path-provisioner.yml
 
NAME: local-path-provisioner
LAST DEPLOYED: Tue Jul 8 13:43:06 2025
NAMESPACE: local-path-provisioner
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
...

Ensure that the pod in the local-path-provisioner namespace is in the Ready state:

Jump Node Console

$ kubectl wait --for=condition=ready --namespace local-path-provisioner pods --all
pod/local-path-provisioner-75f649c47c-rsvb8 condition met
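The kamaji-etcd values used later in this guide reference a storage class named local-path, so it can be worth confirming that the provisioner registered it before moving on. The following is a minimal sketch, not part of the official procedure: the function name storageclass_exists is hypothetical, and it assumes the chart creates the class under its default name, local-path.

```shell
# Hypothetical helper: succeeds if the named StorageClass exists in the cluster.
storageclass_exists() {
    # "kubectl get ... -o name" exits non-zero when the object is missing;
    # the output itself is discarded because only the exit status matters.
    kubectl get storageclass "$1" -o name >/dev/null 2>&1
}

# Usage (assumes the chart's default class name, local-path):
#   storageclass_exists local-path || echo "StorageClass local-path not found" >&2
```

If the check fails, the etcd PersistentVolumeClaims created later would likely stay in the Pending state, so it is cheaper to catch the problem here.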

Create Storage Required by the DPF Operator

The following YAML file defines the storage (for the BFB images) that is required by the DPF Operator.

manifests/01-dpf-operator-installation/nfs-storage-for-bfb-dpf-ga.yaml

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: bfb-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  nfs: 
    path: /mnt/dpf_share/bfb
    server: $NFS_SERVER_IP
  persistentVolumeReclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bfb-pvc
  namespace: dpf-operator-system
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
  storageClassName: ""

Run the following commands to first create the namespace for the DPF Operator, then substitute the environment variables using envsubst, and apply the YAML files:

Jump Node Console

$ kubectl create namespace dpf-operator-system
$ cat manifests/01-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -
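envsubst replaces an unset variable with an empty string, so a missing $NFS_SERVER_IP would produce a PersistentVolume with an empty NFS server field. A small guard, sketched below, can catch this before anything is applied. The helper name require_vars is hypothetical and is not part of the DPF tooling.

```shell
# Hypothetical helper: fail if any named environment variable is unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: stop before envsubst can render an empty NFS server address.
#   require_vars NFS_SERVER_IP && \
#     cat manifests/01-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -
```

The function reports every missing variable rather than stopping at the first one, which makes it easier to fix the environment in a single pass.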

Additional Dependencies

The following table lists all required Helm chart dependencies with their specific versions and purposes:

| Helm Chart | Version | Description | Required Post/Pre-installation |
|---|---|---|---|
| cert-manager | 1.18.1 | Certificate management for Kubernetes, provides automatic TLS certificate issuance and renewal | Pre-installation |
| argo-cd | 7.8.2 | GitOps continuous delivery tool for Kubernetes, necessary for DPUService integration | Pre-installation |
| node-feature-discovery | 0.17.1 | Discovers and advertises hardware features and capabilities of DPUs in the cluster | Pre-installation |
| maintenance-operator | 0.2.0 | Manages node maintenance operations and ensures graceful handling of node updates | Pre-installation |
| kamaji | 1.1.0 | Kubernetes cluster management platform for creating and managing the DPU Kubernetes clusters | Pre-installation |

All of these components must be installed before the DPF Operator itself is installed.

We provide a working helmfile configuration that can be used to install all dependencies with the correct values.

The helmfiles are located at deploy/helmfiles/ in the DPF repository.

This approach ensures consistent deployment across different environments and simplifies the installation process.

However, this is provided as a demo option and is not covered by official NVIDIA support.

Run the following commands to install the dependencies:

Jump Node Console

$ wget https://github.com/helmfile/helmfile/releases/download/v1.1.2/helmfile_1.1.2_linux_amd64.tar.gz
$ tar -xvf helmfile_1.1.2_linux_amd64.tar.gz
$ sudo mv ./helmfile /usr/local/bin/
$ helmfile version
$ cd ~/doca-platform/deploy/helmfiles/
$ helmfile init --force
$ helmfile apply -f prereqs.yaml --color --suppress-diff --skip-diff-on-install --concurrency 0 --hide-notes
$ cd ~/doca-platform/docs/public/user-guides/hbn_only/
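The commands in this section rely on several tools being available on the jump node. As a minimal sketch, the check below reports any that are missing from PATH. The function name check_tools is hypothetical, and the tool list in the usage comment is simply the set of commands invoked in this section.

```shell
# Hypothetical helper: report every required tool that is missing from PATH.
check_tools() {
    rc=0
    for tool in "$@"; do
        # "command -v" is the POSIX way to test whether a command can be resolved.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage: the tools invoked in this section.
#   check_tools kubectl helm helmfile envsubst
```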

DPF Operator Deployment

The DPF Operator Helm values are detailed in the following YAML file:

manifests/01-dpf-operator-installation/helm-values/dpf-operator.yml

kamaji-etcd:
  persistentVolumeClaim:
    storageClassName: local-path
node-feature-discovery:
  worker:
    extraEnvs:
      - name: "KUBERNETES_SERVICE_HOST"
        value: "$TARGETCLUSTER_API_SERVER_HOST"
      - name: "KUBERNETES_SERVICE_PORT"
        value: "$TARGETCLUSTER_API_SERVER_PORT"

Run the following commands to substitute the environment variables and install the DPF Operator:

Jump Node Console

$ helm repo add --force-update dpf-repository ${REGISTRY}
$ helm repo update
$ helm upgrade --install -n dpf-operator-system dpf-operator dpf-repository/dpf-operator --version=$TAG
 
## For development purposes, if $REGISTRY is an OCI registry, use this command instead:
$ envsubst < ./manifests/01-dpf-operator-installation/helm-values/dpf-operator.yml | helm upgrade --install -n dpf-operator-system dpf-operator $REGISTRY/dpf-operator --version=$TAG --values -
 
Release "dpf-operator" does not exist. Installing it now.
coalesce.go:286: warning: cannot overwrite table with non table for dpf-operator.parca.server.tolerations (map[])
NAME: dpf-operator
LAST DEPLOYED: Tue May 20 23:18:22 2025
NAMESPACE: dpf-operator-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
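The two install commands above differ mainly in how the chart is referenced: through the dpf-repository alias created by helm repo add, or directly through the registry when it is an OCI registry. The sketch below shows one way to select the reference automatically. The function name dpf_chart_ref is hypothetical, and it assumes that an OCI registry is written with Helm's oci:// scheme.

```shell
# Hypothetical helper: pick the chart reference for "helm upgrade --install".
# Assumption: an OCI registry is given with Helm's oci:// scheme.
dpf_chart_ref() {
    case "$1" in
        # Direct OCI reference; a trailing slash on the registry is dropped first.
        oci://*) echo "${1%/}/dpf-operator" ;;
        # Classic chart repository: use the alias created by "helm repo add".
        *)       echo "dpf-repository/dpf-operator" ;;
    esac
}

# Usage:
#   helm upgrade --install -n dpf-operator-system dpf-operator \
#     "$(dpf_chart_ref "$REGISTRY")" --version="$TAG"
```

For a classic repository, helm repo add and helm repo update still need to be run first, as shown above.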

Verify the DPF Operator installation by ensuring the deployment is available and all the pods are ready:

Note

The following verification commands may need to be run multiple times to ensure the conditions are met.

Jump Node Console

## Ensure the DPF Operator deployment is available.
$ kubectl rollout status deployment --namespace dpf-operator-system dpf-operator-controller-manager
deployment "dpf-operator-controller-manager" successfully rolled out
 
## Ensure all pods in the DPF Operator system are ready.
$ kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all
pod/dpf-operator-argocd-application-controller-0 condition met
pod/dpf-operator-argocd-redis-5bc74d76fc-v6l7m condition met
pod/dpf-operator-argocd-repo-server-86c9454fc9-zqtqf condition met
pod/dpf-operator-argocd-server-554d9f446-lntpv condition met
pod/dpf-operator-controller-manager-67599cdcb7-5dchf condition met
pod/dpf-operator-kamaji-6dcf4ccdfd-fg64w condition met
pod/dpf-operator-kamaji-etcd-0 condition met
pod/dpf-operator-kamaji-etcd-1 condition met
pod/dpf-operator-kamaji-etcd-2 condition met
pod/dpf-operator-maintenance-operator-666b88bfcd-p72nn condition met
pod/dpf-operator-node-feature-discovery-gc-656b95dc48-gwtsb condition met
pod/dpf-operator-node-feature-discovery-master-76d5695c7c-6kwfz condition met
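Because the verification commands may need to be run several times, they can be wrapped in a small retry loop. kubectl wait gives up after 30 seconds by default, so repeating it is often simpler than guessing a single large timeout. The following is a minimal sketch: the function name retry is hypothetical, and the attempt count and delay in the usage comment are arbitrary.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping <delay>
# seconds between failed attempts. Returns 0 on the first success and 1 if
# every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i/$attempts failed: $*" >&2
        # Do not sleep after the final attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   retry 10 30 kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all
```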

