RDG for DPF with OVN-Kubernetes and HBN Services

DPF Operator Installation

Cert-manager Installation

Cert-manager is an extensible X.509 certificate controller for Kubernetes workloads. It obtains certificates from a variety of issuers, both popular public issuers and private ones, keeps those certificates valid and up to date, and attempts to renew them at a configured time before expiry.

In this deployment, cert-manager is a prerequisite: it provides the certificates for the webhooks used by DPF and its dependencies.

  1. Install cert-manager using Helm.

    1. The following values are used for the Helm chart installation:

      manifests/02-dpf-operator-installation/helm-values/cert-manager.yml

startupapicheck:
  enabled: false
crds:
  enabled: true
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-role.kubernetes.io/master
              operator: Exists
        - matchExpressions:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
tolerations:
  - operator: Exists
    effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
  - operator: Exists
    effect: NoSchedule
    key: node-role.kubernetes.io/master
cainjector:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
          - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
  tolerations:
    - operator: Exists
      effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
    - operator: Exists
      effect: NoSchedule
      key: node-role.kubernetes.io/master
webhook:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
          - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists
  tolerations:
    - operator: Exists
      effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
    - operator: Exists
      effect: NoSchedule
      key: node-role.kubernetes.io/master

    2. Run the following commands:

      Jump Node Console

$ helm repo add jetstack https://charts.jetstack.io --force-update
$ helm upgrade --install --create-namespace --namespace cert-manager cert-manager jetstack/cert-manager --version v1.16.1 -f ./manifests/02-dpf-operator-installation/helm-values/cert-manager.yml
 
Release "cert-manager" does not exist. Installing it now.
NAME: cert-manager
LAST DEPLOYED: Tue May 20 12:59:30 2025
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.16.1 has been deployed successfully!

  2. Verify that all pods in the cert-manager namespace are in the Ready state:

    Jump Node Console

$ kubectl wait --for=condition=ready --namespace cert-manager pods --all
pod/cert-manager-6ffdf6c5f8-7k7zz condition met
pod/cert-manager-cainjector-66b8577665-fgcqg condition met
pod/cert-manager-webhook-5cb94cb7b6-9rk9m condition met
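
Because the values file above sets startupapicheck.enabled to false, the chart's post-install API check is skipped. As an optional extra check, a throwaway self-signed Issuer and Certificate can confirm that certificates are actually issued. The following is a minimal sketch and not part of the DPF procedure: the namespace cert-manager-test, the resource names test-selfsigned and selfsigned-cert, and the helper render_cert_manager_smoke_test are all hypothetical names introduced here for illustration.

```shell
# Sketch: print a throwaway self-signed Issuer and Certificate for a cert-manager smoke test.
# All resource names below are hypothetical; none of them are required by DPF.
render_cert_manager_smoke_test() {
  local ns="${1:-cert-manager-test}"
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${ns}
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: ${ns}
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: ${ns}
spec:
  dnsNames:
    - example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
EOF
}

# Usage (not executed here):
#   render_cert_manager_smoke_test | kubectl apply -f -
#   kubectl wait --for=condition=Ready --namespace cert-manager-test \
#     certificate/selfsigned-cert --timeout=60s
#   kubectl delete namespace cert-manager-test
```

If the Certificate reaches the Ready condition, the controller, cainjector and webhook are working together; deleting the test namespace removes everything the sketch created.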

Install a CSI to Back the DPUCluster etcd

  1. Download the local-path-provisioner Helm chart to your current working directory and create a namespace for it:

    Jump Node Console

$ curl https://codeload.github.com/rancher/local-path-provisioner/tar.gz/v0.0.30 | tar -xz --strip=3 local-path-provisioner-0.0.30/deploy/chart/local-path-provisioner/
$ kubectl create ns local-path-provisioner

  2. The following values will be used for the installation:

    manifests/02-dpf-operator-installation/helm-values/local-path-provisioner.yml

tolerations:
  - operator: Exists
    effect: NoSchedule
    key: node-role.kubernetes.io/control-plane
  - operator: Exists
    effect: NoSchedule
    key: node-role.kubernetes.io/master

    Run the following command to install the chart:

    Jump Node Console

$ helm install -n local-path-provisioner local-path-provisioner ./local-path-provisioner --version 0.0.30 -f ./manifests/02-dpf-operator-installation/helm-values/local-path-provisioner.yml
 
NAME: local-path-provisioner
LAST DEPLOYED: Tue May 20 13:01:40 2025
NAMESPACE: local-path-provisioner
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
...

  3. Ensure that the pod in the local-path-provisioner namespace is in the Ready state:

    Jump Node Console

$ kubectl wait --for=condition=ready --namespace local-path-provisioner pods --all
pod/local-path-provisioner-75f649c47c-fbccd condition met
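
The DPF Operator values in the DPF Operator Deployment section request a StorageClass named local-path for kamaji-etcd, so it can be worth confirming that the class exists before moving on. The following is a minimal sketch under that assumption; storage_class_exists is a hypothetical helper name, not a kubectl or DPF command.

```shell
# Sketch: confirm that a StorageClass exists before a chart references it.
# storage_class_exists is a hypothetical helper introduced for illustration.
storage_class_exists() {
  local name="$1"
  # "kubectl get" exits non-zero when the named resource is not found.
  kubectl get storageclass "$name" -o name >/dev/null 2>&1
}

# Usage (not executed here):
#   storage_class_exists local-path || echo "StorageClass local-path not found" >&2
```

The check only reports whether the class is present; it does not validate the provisioner's node path or permissions.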

Create Storage Required by the DPF Operator

  • Create the namespace for the operator:

    Jump Node Console

$ kubectl create ns dpf-operator-system

  • The following YAML file defines the storage (for the BFB image) that is required by the DPF Operator.

    manifests/02-dpf-operator-installation/nfs-storage-for-bfb-dpf-ga.yaml

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: bfb-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  nfs:
    path: /mnt/dpf_share/bfb
    server: $NFS_SERVER_IP
  persistentVolumeReclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bfb-pvc
  namespace: dpf-operator-system
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
  storageClassName: ""

  • Run the following command to substitute the environment variables with envsubst and apply every YAML file in the directory:

    Jump Node Console

$ cat manifests/02-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -
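
envsubst replaces a variable that is unset, or set but not exported, with an empty string and reports no error, so a missing NFS_SERVER_IP would produce a PersistentVolume with an empty server field. The following is a minimal sketch of a preflight check for that case; require_env is a hypothetical helper name, not part of envsubst or DPF.

```shell
# Sketch: fail early when a variable that envsubst will substitute is unset,
# empty, or not exported. require_env is a hypothetical helper name.
require_env() {
  local name missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is also all that envsubst can see.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing or empty environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (not executed here):
#   require_env NFS_SERVER_IP && \
#     cat manifests/02-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -
```

The same helper could be called with REGISTRY, TAG, TARGETCLUSTER_API_SERVER_HOST and TARGETCLUSTER_API_SERVER_PORT before the DPF Operator installation, since those variables are consumed in the same way.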

DPF Operator Deployment

  1. The DPF Operator Helm values are detailed in the following YAML file:

    manifests/02-dpf-operator-installation/helm-values/dpf-operator.yml

kamaji-etcd:
  persistentVolumeClaim:
    storageClassName: local-path
node-feature-discovery:
  worker:
    extraEnvs:
      - name: "KUBERNETES_SERVICE_HOST"
        value: "$TARGETCLUSTER_API_SERVER_HOST"
      - name: "KUBERNETES_SERVICE_PORT"
        value: "$TARGETCLUSTER_API_SERVER_PORT"

    Run the following commands to substitute the environment variables and install the DPF Operator:

    Jump Node Console

$ helm repo add --force-update dpf-repository ${REGISTRY}
$ helm repo update
$ envsubst < ./manifests/02-dpf-operator-installation/helm-values/dpf-operator.yml | helm upgrade --install -n dpf-operator-system dpf-operator dpf-repository/dpf-operator --version=$TAG --values -
 
Release "dpf-operator" does not exist. Installing it now.
NAME: dpf-operator
LAST DEPLOYED: Tue May 20 13:18:58 2025
NAMESPACE: dpf-operator-system
STATUS: deployed
REVISION: 1
TEST SUITE: None

  2. Verify the DPF Operator installation by ensuring the deployment is available and all the pods are ready:

    Note

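    The following verification commands may need to be run more than once before the conditions are met: kubectl wait gives up after 30 seconds by default, and the pods can take longer than that to become ready.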

    Jump Node Console

$ kubectl rollout status deployment --namespace dpf-operator-system dpf-operator-controller-manager
deployment "dpf-operator-controller-manager" successfully rolled out
 
$ kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all
pod/dpf-operator-argocd-application-controller-0 condition met
pod/dpf-operator-argocd-redis-5bc74d76fc-dclfd condition met
pod/dpf-operator-argocd-repo-server-86c9454fc9-5wwkw condition met
pod/dpf-operator-argocd-server-554d9f446-sbz8b condition met
pod/dpf-operator-controller-manager-67599cdcb7-mzsc8 condition met
pod/dpf-operator-kamaji-6dcf4ccdfd-hdzwb condition met
pod/dpf-operator-kamaji-etcd-0 condition met
pod/dpf-operator-kamaji-etcd-1 condition met
pod/dpf-operator-kamaji-etcd-2 condition met
pod/dpf-operator-maintenance-operator-666b88bfcd-hx8h5 condition met
pod/dpf-operator-node-feature-discovery-gc-656b95dc48-z9tld condition met
pod/dpf-operator-node-feature-discovery-master-76d5695c7c-d6jlj condition met
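
Instead of re-running the readiness check by hand, the retries can be scripted. The following is a minimal sketch, not an official DPF tool: wait_pods_ready is a hypothetical helper, and the attempt count, per-attempt timeout and pause between attempts are assumed defaults that should be tuned to the environment.

```shell
# Sketch: retry the pod readiness check with an explicit timeout per attempt.
# wait_pods_ready and its default values are assumptions made for illustration.
wait_pods_ready() {
  local namespace="$1" attempts="${2:-5}" timeout="${3:-120s}" pause="${4:-5}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --timeout overrides the 30-second default of "kubectl wait".
    if kubectl wait --for=condition=ready --namespace "$namespace" pods --all --timeout="$timeout"; then
      return 0
    fi
    echo "attempt $i/$attempts: pods in $namespace are not ready yet" >&2
    # Pause so that an immediate failure (for example, no pods created yet) does not spin.
    sleep "$pause"
    i=$((i + 1))
  done
  return 1
}

# Usage (not executed here):
#   wait_pods_ready dpf-operator-system
```

The function returns 0 as soon as one attempt succeeds and 1 after the last attempt fails, so it can gate the next installation step in a script.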

© Copyright 2025, NVIDIA. Last updated on Jul 29, 2025.