DOCA Platform Framework (DPF) Documentation v25.7.0

HBN in DPF Zero Trust

Note

For better formatting of the code, follow this guide from the source GitHub repo at github.com/NVIDIA/doca-platform, under docs/public/user-guides/zero-trust/use-cases/hbn/README.md.

This guide provides instructions for deploying the NVIDIA DOCA Platform Framework (DPF) on high-performance, bare-metal infrastructure in Zero Trust mode, utilizing the DPU BMC and Redfish. It focuses on provisioning NVIDIA® BlueField®-3 DPUs with DPF, installing the HBN DPUService on those DPUs, and enabling workload traffic to pass through HBN before leaving the DPU.

To run this guide, clone the repo from github.com/NVIDIA/doca-platform and change to the docs/public/user-guides/zero-trust/use-cases/hbn directory.

The system must be set up as described in the prerequisites.

In addition, for this use case, the Top of Rack (ToR) switch must be configured for unnumbered BGP towards the two uplink ports of the DPU, with HBN acting as the BGP peer, and must advertise routes over BGP to allow ECMP from the DPU. Additional information on how to do this can be found in the RDG for DPF Zero Trust (DPF-ZT) with HBN DPU Service.
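For illustration only, a minimal sketch of what such a ToR configuration could look like on a Cumulus Linux switch using NVUE is shown below. The switch ports (swp1, swp2) and AS number are assumptions; the authoritative switch configuration for your fabric is described in the RDG referenced above:

## Assumed example: swp1/swp2 face the DPU uplinks; AS 65001 is arbitrary.
nv set router bgp autonomous-system 65001
nv set router bgp router-id 10.10.10.1
nv set vrf default router bgp neighbor swp1 remote-as external
nv set vrf default router bgp neighbor swp2 remote-as external
nv set vrf default router bgp address-family ipv4-unicast redistribute connected enable on
nv config apply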

Software prerequisites

The following tools must be installed on the machine where the commands contained in this guide run:

  • kubectl

  • helm

  • envsubst

This guide assumes that the setup includes only 2 workers with DPUs. If your setup has more than 2 workers, you will need to set additional variables to enable the rest of the DPUs, as sketched below.
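For example, a third DPU could be handled by exporting an additional (hypothetical) serial variable and referencing it wherever DPU1_SERIAL and DPU2_SERIAL appear in the manifests (hostname patterns, IPAM allocations, and so on):

## Hypothetical example for a third DPU; this variable name is an assumption and is not used by the provided manifests.
export DPU3_SERIAL=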

0. Required variables

The following variables are required by this guide. Sensible defaults are provided where possible, but many values are specific to the target infrastructure.

Commands in this guide are run in the same directory that contains this readme.

  Environment variables file

## IP Address for the Kubernetes API server of the target cluster on which DPF is installed.
## This should never include a scheme or a port.
## e.g. 10.10.10.10
export TARGETCLUSTER_API_SERVER_HOST=

## Virtual IP used by the load balancer for the DPU Cluster. Must be a reserved IP from the management subnet and not
## allocated by DHCP.
export DPUCLUSTER_VIP=

## Interface on which the DPUCluster load balancer will listen. Should be the management interface of the control plane node.
export DPUCLUSTER_INTERFACE=

## IP address to the NFS server used as storage for the BFB.
export NFS_SERVER_IP=

## The repository URL for the NVIDIA Helm chart registry.
## Usually this is the NVIDIA Helm NGC registry. For development purposes, this can be set to a different repository.
export HELM_REGISTRY_REPO_URL=https://helm.ngc.nvidia.com/nvidia/doca

## The repository URL for the HBN container image.
## Usually this is the NVIDIA NGC registry. For development purposes, this can be set to a different repository.
export HBN_NGC_IMAGE_URL=nvcr.io/nvidia/doca/doca_hbn

## The DPF REGISTRY is the Helm repository URL where the DPF Operator Chart resides.
## Usually this is the NVIDIA Helm NGC registry. For development purposes, this can be set to a different repository.
export REGISTRY=https://helm.ngc.nvidia.com/nvidia/doca

## The DPF TAG is the version of the DPF components which will be deployed in this guide.
export TAG=v25.7.0

## URL to the BFB used in the `bfb.yaml` and linked by the DPUSet.
export BFB_URL="https://content.mellanox.com/BlueField/BFBs/Ubuntu22.04/bf-bundle-3.1.0-76_25.07_ubuntu-22.04_prod.bfb"

## IP_RANGE_START and IP_RANGE_END
## These define the IP range for DPU discovery via Redfish/BMC interfaces
## Example: If your DPUs have BMC IPs in range 192.168.1.100-110
## export IP_RANGE_START=192.168.1.100
## export IP_RANGE_END=192.168.1.110
export IP_RANGE_START=
export IP_RANGE_END=

# The password used for DPU BMC root login, must be the same for all DPUs
export BMC_ROOT_PASSWORD=

## Serial number of DPUs. If you have more than 2 DPUs, you will need to parameterize the system accordingly and expose
## additional variables.
## All serial numbers must be in lowercase.
export DPU1_SERIAL=
export DPU2_SERIAL=

Modify the variables in manifests/00-env-vars/envvars.env to fit your environment, then source the file:


source manifests/00-env-vars/envvars.env
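
Optionally, you can sanity-check that the mandatory variables have been set before proceeding. The following bash snippet is a convenience check, not part of the official flow; it assumes the variable names used above:

## Print any required variable that is still empty.
for v in TARGETCLUSTER_API_SERVER_HOST DPUCLUSTER_VIP DPUCLUSTER_INTERFACE NFS_SERVER_IP \
         IP_RANGE_START IP_RANGE_END BMC_ROOT_PASSWORD DPU1_SERIAL DPU2_SERIAL; do
  [ -z "${!v}" ] && echo "WARNING: $v is not set"
done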


1. DPF Operator installation

Create storage required by the DPF Operator

A number of environment variables must be set before running this command.

kubectl create ns dpf-operator-system
cat manifests/01-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -

This deploys the following objects:

  PersistentVolume and PersistentVolumeClaim for the provisioning controller

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: bfb-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  nfs:
    path: /mnt/dpf_share/bfb
    server: $NFS_SERVER_IP
  persistentVolumeReclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bfb-pvc
  namespace: dpf-operator-system
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
  storageClassName: ""

Create DPU BMC shared password secret

In Zero Trust mode, provisioning DPUs requires authentication with Redfish. To do so, you must set the same BMC root password for all DPUs that DPF is going to manage.

For more information on how to set the BMC root password, refer to the BlueField DPU Administrator Quick Start Guide.

The password is provided to DPF by creating the following secret:


kubectl create secret generic -n dpf-operator-system bmc-shared-password --from-literal=password=$BMC_ROOT_PASSWORD
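
You can also confirm that the DPU BMCs are reachable over Redfish with this password. This is an optional check; the BMC IP below is a placeholder:

## Optional: query the standard Redfish Systems collection on one DPU BMC (replace <dpu-bmc-ip>).
curl -k -u root:$BMC_ROOT_PASSWORD https://<dpu-bmc-ip>/redfish/v1/Systems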


Additional Dependencies

Before deploying the DPF Operator, ensure that Helm is properly configured according to the Helm prerequisites.

Warning

This is a critical prerequisite step that must be completed for the DPF Operator to function properly.


Deploy the DPF Operator

A number of environment variables must be set before running this command.

HTTP Registry (default)

If $REGISTRY is an HTTP registry (the default value), use this command:

helm repo add --force-update dpf-repository ${REGISTRY}
helm repo update
helm upgrade --install -n dpf-operator-system dpf-operator dpf-repository/dpf-operator --version=$TAG


OCI Registry

For development purposes, if $REGISTRY is an OCI registry, use this command:


helm upgrade --install -n dpf-operator-system dpf-operator $REGISTRY/dpf-operator --version=$TAG

Verification

These verification commands may need to be run multiple times to ensure the condition is met.

Verify the DPF Operator installation with:

## Ensure the DPF Operator deployment is available.
kubectl rollout status deployment --namespace dpf-operator-system dpf-operator-controller-manager
## Ensure all pods in the DPF Operator system are ready.
kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all
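
If you prefer a single blocking call over re-running the checks, kubectl wait also accepts a timeout, for example:

kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all --timeout=300s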

2. DPF system installation

This section involves creating the DPF system components and some basic infrastructure required for a functioning DPF-enabled cluster.

Deploy the DPF System components

A number of environment variables must be set before running this command.

kubectl create ns dpu-cplane-tenant1
cat manifests/02-dpf-system-installation/*.yaml | envsubst | kubectl apply -f -

This will create the following objects:

  DPF Operator to install the DPF System components

---
apiVersion: operator.dpu.nvidia.com/v1alpha1
kind: DPFOperatorConfig
metadata:
  name: dpfoperatorconfig
  namespace: dpf-operator-system
spec:
  dpuDetector:
    disable: true
  provisioningController:
    bfbPVCName: "bfb-pvc"
    dmsTimeout: 900
    installInterface:
      installViaRedfish:
        # Set this to the IP of one of your control plane nodes + 8080 port
        bfbRegistryAddress: "$TARGETCLUSTER_API_SERVER_HOST:8080"
  kamajiClusterManager:
    disable: false

  DPUCluster to serve as Kubernetes control plane for DPU nodes

---
apiVersion: provisioning.dpu.nvidia.com/v1alpha1
kind: DPUCluster
metadata:
  name: dpu-cplane-tenant1
  namespace: dpu-cplane-tenant1
spec:
  type: kamaji
  maxNodes: 10
  clusterEndpoint:
    # deploy keepalived instances on the nodes that match the given nodeSelector.
    keepalived:
      # interface on which keepalived will listen. Should be the oob interface of the control plane node.
      interface: $DPUCLUSTER_INTERFACE
      # Virtual IP reserved for the DPU Cluster load balancer. Must not be allocatable by DHCP.
      vip: $DPUCLUSTER_VIP
      # virtualRouterID must be in range [1,255], make sure the given virtualRouterID does not duplicate with any existing keepalived process running on the host
      virtualRouterID: 126
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""

  DPUDiscovery to discover DPUDevices or DPUNodes

---
apiVersion: provisioning.dpu.nvidia.com/v1alpha1
kind: DPUDiscovery
metadata:
  name: dpu-discovery
  namespace: dpf-operator-system
spec:
  ipRangeSpec:
    ipRange:
      startIP: $IP_RANGE_START
      endIP: $IP_RANGE_END

Verification

These verification commands may need to be run multiple times to ensure the condition is met.

Verify the DPF System with:

## Ensure the provisioning and DPUService controller manager deployments are available.
kubectl rollout status deployment --namespace dpf-operator-system dpf-provisioning-controller-manager dpuservice-controller-manager
## Ensure all other deployments in the DPF Operator system are Available.
kubectl rollout status deployment --namespace dpf-operator-system
## Ensure the DPUCluster is ready for nodes to join.
kubectl wait --for=condition=ready --namespace dpu-cplane-tenant1 dpucluster --all

3. DPU Provisioning and Service Installation

There are two types of installation a user can do: the first uses the PFs of the host, and the second uses both PFs and VFs. Choose the one that best fits your use case.

In the following section, we provision our DPUs and the services that will run on them. The user is expected to create a DPUDeployment object that reflects the set of DPUServices that should run on a set of DPUs.

To learn more about DPUDeployments, see the DPUDeployment documentation.

Using PFs

In this scenario, PF0 and PF1 are connected to separate VRFs, which means that:

  • PF0 on Host 1 will be able to communicate with PF0 on Host 2

  • PF0 on Host 1 will not be able to communicate with PF1 on Host 1 and 2

  • PF1 on Host 1 will be able to communicate with PF1 on Host 2

  • PF1 on Host 1 will not be able to communicate with PF0 on Host 1 and 2

We make use of a PF on the host to test traffic.

Create the DPUDeployment, DPUServiceConfig, DPUServiceTemplate and other necessary objects

A number of environment variables must be set before running this command.


cat manifests/03.1-dpudeployment-installation-pf/*.yaml | envsubst | kubectl apply -f -

This will deploy the following objects:

  BFB to download Bluefield Bitstream to a shared volume

---
apiVersion: provisioning.dpu.nvidia.com/v1alpha1
kind: BFB
metadata:
  name: bf-bundle
  namespace: dpf-operator-system
spec:
  url: $BFB_URL

  HBN DPUFlavor to correctly configure the DPUs on provisioning


--- apiVersion: provisioning.dpu.nvidia.com/v1alpha1 kind: DPUFlavor metadata: name: hbn namespace: dpf-operator-system spec: dpuMode: zero-trust bfcfgParameters: - UPDATE_ATF_UEFI=yes - UPDATE_DPU_OS=yes - WITH_NIC_FW_UPDATE=yes configFiles: - operation: override path: /etc/mellanox/mlnx-bf.conf permissions: "0644" raw: | ALLOW_SHARED_RQ="no" IPSEC_FULL_OFFLOAD="no" ENABLE_ESWITCH_MULTIPORT="yes" - operation: override path: /etc/mellanox/mlnx-ovs.conf permissions: "0644" raw: | CREATE_OVS_BRIDGES="no" OVS_DOCA="yes" - operation: override path: /etc/mellanox/mlnx-sf.conf permissions: "0644" raw: "" grub: kernelParameters: - console=hvc0 - console=ttyAMA0 - earlycon=pl011,0x13010000 - fixrttc - net.ifnames=0 - biosdevname=0 - iommu.passthrough=1 - cgroup_no_v1=net_prio,net_cls - hugepagesz=2048kB - hugepages=3072 nvconfig: - device: '*' parameters: - PF_BAR2_ENABLE=0 - PER_PF_NUM_SF=1 - PF_TOTAL_SF=20 - PF_SF_BAR_SIZE=10 - NUM_PF_MSIX_VALID=0 - PF_NUM_PF_MSIX_VALID=1 - PF_NUM_PF_MSIX=228 - INTERNAL_CPU_MODEL=1 - INTERNAL_CPU_OFFLOAD_ENGINE=0 - SRIOV_EN=1 - NUM_OF_VFS=46 - LAG_RESOURCE_ALLOCATION=1 ovs: rawConfigScript: | _ovs-vsctl() { ovs-vsctl --no-wait --timeout 15 "$@" } _ovs-vsctl set Open_vSwitch . other_config:doca-init=true _ovs-vsctl set Open_vSwitch . other_config:dpdk-max-memzones=50000 _ovs-vsctl set Open_vSwitch . other_config:hw-offload=true _ovs-vsctl set Open_vSwitch . other_config:pmd-quiet-idle=true _ovs-vsctl set Open_vSwitch . other_config:max-idle=20000 _ovs-vsctl set Open_vSwitch . other_config:max-revalidator=5000 _ovs-vsctl --if-exists del-br ovsbr1 _ovs-vsctl --if-exists del-br ovsbr2 _ovs-vsctl --may-exist add-br br-sfc _ovs-vsctl set bridge br-sfc datapath_type=netdev _ovs-vsctl set bridge br-sfc fail_mode=secure _ovs-vsctl --may-exist add-port br-sfc p0 _ovs-vsctl set Interface p0 type=dpdk _ovs-vsctl set Interface p0 mtu_request=9216 _ovs-vsctl set Port p0 external_ids:dpf-type=physical

  DPUDeployment to provision DPUs on worker nodes


--- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUDeployment metadata: name: hbn namespace: dpf-operator-system spec: dpus: bfb: bf-bundle flavor: hbn nodeEffect: noEffect: true dpuSets: - nameSuffix: "dpuset1" nodeSelector: matchLabels: feature.node.kubernetes.io/dpu-enabled: "true" services: doca-hbn: serviceTemplate: doca-hbn serviceConfiguration: doca-hbn serviceChains: switches: - ports: - serviceInterface: matchLabels: interface: p0 - service: name: doca-hbn interface: p0_if - ports: - serviceInterface: matchLabels: interface: p1 - service: name: doca-hbn interface: p1_if - ports: - serviceInterface: matchLabels: interface: pf0hpf - service: name: doca-hbn interface: pf0hpf_if - ports: - serviceInterface: matchLabels: interface: pf1hpf - service: name: doca-hbn interface: pf1hpf_if

  DPUServiceConfig and DPUServiceTemplate to deploy HBN workloads to the DPUs


--- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceConfiguration metadata: name: doca-hbn namespace: dpf-operator-system spec: deploymentServiceName: "doca-hbn" serviceConfiguration: serviceDaemonSet: annotations: k8s.v1.cni.cncf.io/networks: |- [ {"name": "iprequest", "interface": "ip_lo", "cni-args": {"poolNames": ["loopback"], "poolType": "cidrpool"}}, {"name": "iprequest", "interface": "ip_pf0hpf", "cni-args": {"poolNames": ["pool1"], "poolType": "cidrpool", "allocateDefaultGateway": true}}, {"name": "iprequest", "interface": "ip_pf1hpf", "cni-args": {"poolNames": ["pool2"], "poolType": "cidrpool", "allocateDefaultGateway": true}} ] helmChart: values: configuration: perDPUValuesYAML: | - hostnamePattern: "*" values: bgp_peer_group: hbn vrf1: RED vrf2: BLUE l3vni1: 100001 l3vni2: 100002 - hostnamePattern: "dpu-node-${DPU1_SERIAL}*" values: bgp_autonomous_system: 65101 - hostnamePattern: "dpu-node-${DPU2_SERIAL}*" values: bgp_autonomous_system: 65201 startupYAMLJ2: | - header: model: bluefield nvue-api-version: nvue_v1 rev-id: 1.0 version: HBN 2.4.0 - set: evpn: enable: on route-advertise: {} interface: lo: ip: address: {{ ipaddresses.ip_lo.ip }}/32: {} type: loopback p0_if,p1_if,pf0hpf_if,pf1hpf_if: type: swp link: mtu: 9000 pf0hpf_if: ip: address: {{ ipaddresses.ip_pf0hpf.cidr }}: {} vrf: {{ config.vrf1 }} pf1hpf_if: ip: address: {{ ipaddresses.ip_pf1hpf.cidr }}: {} vrf: {{ config.vrf2 }} nve: vxlan: arp-nd-suppress: on enable: on source: address: {{ ipaddresses.ip_lo.ip }} router: bgp: enable: on graceful-restart: mode: full vrf: default: router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on multipaths: ebgp: 16 l2vpn-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on neighbor: p0_if: peer-group: {{ config.bgp_peer_group }} type: unnumbered address-family: l2vpn-evpn: enable: on add-path-tx: off p1_if: peer-group: {{ config.bgp_peer_group }} type: unnumbered address-family: l2vpn-evpn: enable: on add-path-tx: off path-selection: multipath: aspath-ignore: on peer-group: {{ config.bgp_peer_group }}: address-family: ipv4-unicast: enable: on l2vpn-evpn: enable: on remote-as: external router-id: {{ ipaddresses.ip_lo.ip }} {{ config.vrf1 }}: evpn: enable: on vni: {{ config.l3vni1 }}: {} router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on route-export: to-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on {{ config.vrf2 }}: evpn: enable: on vni: {{ config.l3vni2 }}: {} router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on route-export: to-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on interfaces: - name: p0_if network: mybrhbn - name: p1_if network: mybrhbn - name: pf0hpf_if network: mybrhbn - name: pf1hpf_if network: mybrhbn

---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceTemplate
metadata:
  name: doca-hbn
  namespace: dpf-operator-system
spec:
  deploymentServiceName: "doca-hbn"
  helmChart:
    source:
      repoURL: $HELM_REGISTRY_REPO_URL
      version: 1.0.3
      chart: doca-hbn
    values:
      image:
        repository: $HBN_NGC_IMAGE_URL
        tag: 3.1.0-doca3.1.0
      resources:
        memory: 6Gi
        nvidia.com/bf_sf: 4

  DPUServiceInterfaces for physical ports on the DPU


--- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: p0 namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "p0" spec: interfaceType: physical physical: interfaceName: p0 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: p1 namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "p1" spec: interfaceType: physical physical: interfaceName: p1 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: pf0hpf namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "pf0hpf" spec: interfaceType: pf pf: pfID: 0 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: pf1hpf namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "pf1hpf" spec: interfaceType: pf pf: pfID: 1

  DPUServiceIPAM to set up IP Address Management on the DPUCluster

---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool1
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.121.0/24"
    gatewayIndex: 2
    prefixSize: 29
    # These preallocations are not necessary. We specify them so that the validation commands are straightforward.
    allocations:
      dpu-node-${DPU1_SERIAL}-${DPU1_SERIAL}: 10.0.121.0/29
      dpu-node-${DPU2_SERIAL}-${DPU2_SERIAL}: 10.0.121.8/29
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool2
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.122.0/24"
    gatewayIndex: 2
    prefixSize: 29

  DPUServiceIPAM for the loopback interface in HBN

---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: loopback
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "11.0.0.0/24"
    prefixSize: 32

Verification

These verification commands may need to be run multiple times to ensure the condition is met.

Note that the DPUService name will have a random suffix. For example, doca-hbn-l2xsl.
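
To see the generated names, you can list the DPUServices directly:

kubectl get dpuservices -n dpf-operator-system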

Verify the DPU and Service installation with:

## Ensure the DPUServices are created and have been reconciled.
kubectl wait --for=condition=ApplicationsReconciled --namespace dpf-operator-system dpuservices -l svc.dpu.nvidia.com/owned-by-dpudeployment=dpf-operator-system_hbn
## Ensure the DPUServiceIPAMs have been reconciled
kubectl wait --for=condition=DPUIPAMObjectReconciled --namespace dpf-operator-system dpuserviceipam --all
## Ensure the DPUServiceInterfaces have been reconciled
kubectl wait --for=condition=ServiceInterfaceSetReconciled --namespace dpf-operator-system dpuserviceinterface --all
## Ensure the DPUServiceChains have been reconciled
kubectl wait --for=condition=ServiceChainSetReconciled --namespace dpf-operator-system dpuservicechain --all
## Ensure the DPUs have the condition Initialized (this may take time)
kubectl wait --for=condition=Initialized --namespace dpf-operator-system dpu --all

or with dpfctl:


$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments NAME NAMESPACE STATUS REASON SINCE MESSAGE DPFOperatorConfig/dpfoperatorconfig dpf-operator-system │ ├─Ready False Pending 17m The following conditions are not ready: │ │ * SystemComponentsReady │ └─SystemComponentsReady False Error 16m System components must be ready for DPF Operator to continue: │ * nvidia-k8s-ipam: DPUService dpf-operator-system/nvidia-k8s-ipam is not ready └─DPUDeployments └─DPUDeployment/hbn dpf-operator-system │ ├─Ready False Pending 11m The following conditions are not ready: │ │ * DPUSetsReady │ └─DPUSetsReady False Pending 11m Objects are not ready: │ * dpf-operator-system/hbn-dpuset1 ├─DPUServiceChains │ └─DPUServiceChain/hbn-8kkjz dpf-operator-system Ready: True Success 11m ├─DPUServiceInterfaces │ └─4 DPUServiceInterfaces... dpf-operator-system Ready: True Success 11m See doca-hbn-p0-if-mcqp4, doca-hbn-p1-if-6x2hh, doca-hbn-pf0hpf-if-q9lvk, doca-hbn-pf1hpf-if-979t7 ├─DPUSets │ └─DPUSet/hbn-dpuset1 dpf-operator-system │ ├─BFB/bf-bundle dpf-operator-system Ready: True Ready 13m File: bf-bundle-3.1.0-53_25.07_ubuntu-22.04_prod.bfb, DOCA: 3.1.0 │ └─DPUs │ ├─DPU/dpu-node-mt2402xz0f6v-mt2402xz0f6v dpf-operator-system │ │ └─Ready False OS Installing 8m39s │ └─DPU/dpu-node-mt2404xz0c98-mt2404xz0c98 dpf-operator-system │ └─Ready False OS Installing 8m39s └─Services ├─DPUServiceTemplates │ └─DPUServiceTemplate/doca-hbn dpf-operator-system Ready: True Success 13m └─DPUServices └─DPUService/doca-hbn-jmj45 dpf-operator-system Ready: True Success 11m

Making the DPUs Ready

To make the DPUs ready, we need to manually power cycle the hosts. To avoid corruption, do this gracefully: shut down the host and DPU, power off the server, and then power it back on. This should only happen once the DPU object signals that it is waiting for a power cycle. The described flow can be automated by the admin depending on the infrastructure.
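As an illustration only, on servers that expose a host BMC with IPMI, the power cycle could be driven remotely along these lines. The BMC address and credentials are placeholders, and the graceful OS shutdown must still happen first:

## Hypothetical example: after gracefully shutting down the host OS, power cycle the server via the host BMC.
ipmitool -I lanplus -H <host-bmc-ip> -U <bmc-user> -P <bmc-password> chassis power cycle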

The following verification command may need to be run multiple times to ensure the condition is met.

## Ensure the DPUs have the condition WaitingForManualPowerCycleOrReboot (this may take time)
kubectl wait --for=condition=WaitingForManualPowerCycleOrReboot --namespace dpf-operator-system dpu --all

or with dpfctl:


$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments NAME NAMESPACE STATUS REASON SINCE MESSAGE DPFOperatorConfig/dpfoperatorconfig dpf-operator-system │ ├─Ready False Pending 66m The following conditions are not ready: │ │ * SystemComponentsReady │ └─SystemComponentsReady False Error 66m System components must be ready for DPF Operator to continue: │ * nvidia-k8s-ipam: DPUService dpf-operator-system/nvidia-k8s-ipam is not ready └─DPUDeployments └─DPUDeployment/hbn dpf-operator-system │ ├─Ready False Pending 61m The following conditions are not ready: │ │ * DPUSetsReady │ └─DPUSetsReady False Pending 61m Objects are not ready: │ * dpf-operator-system/hbn-dpuset1 ├─DPUServiceChains │ └─DPUServiceChain/hbn-8kkjz dpf-operator-system Ready: True Success 61m ├─DPUServiceInterfaces │ └─4 DPUServiceInterfaces... dpf-operator-system Ready: True Success 61m See doca-hbn-p0-if-mcqp4, doca-hbn-p1-if-6x2hh, doca-hbn-pf0hpf-if-q9lvk, doca-hbn-pf1hpf-if-979t7 ├─DPUSets │ └─DPUSet/hbn-dpuset1 dpf-operator-system │ ├─BFB/bf-bundle dpf-operator-system Ready: True Ready 62m File: bf-bundle-3.1.0-53_25.07_ubuntu-22.04_prod.bfb, DOCA: 3.1.0 │ └─DPUs │ ├─DPU/dpu-node-mt2402xz0f6v-mt2402xz0f6v dpf-operator-system │ │ ├─Rebooted False WaitingForManualPowerCycleOrReboot 11m │ │ └─Ready False Rebooting 11m │ └─DPU/dpu-node-mt2404xz0c98-mt2404xz0c98 dpf-operator-system │ ├─Rebooted False WaitingForManualPowerCycleOrReboot 5m49s │ └─Ready False Rebooting 5m49s └─Services ├─DPUServiceTemplates │ └─DPUServiceTemplate/doca-hbn dpf-operator-system Ready: True Success 62m └─DPUServices └─DPUService/doca-hbn-jmj45 dpf-operator-system Ready: True Success 61m

At this point, we have to power cycle the hosts. Once all the hosts are back online, we have to remove an annotation from the DPUNodes. The user can choose to remove this annotation node by node, but to keep this guide simple, we remove it from all of them at once.


kubectl annotate dpunodes -n dpf-operator-system --all provisioning.dpu.nvidia.com/dpunode-external-reboot-required-

After this is done, all DPUs should become Ready:


kubectl wait --for="jsonpath={.status.phase}=Ready" --namespace dpf-operator-system dpu --all

or with dpfctl:


$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments NAME NAMESPACE STATUS REASON SINCE MESSAGE DPFOperatorConfig/dpfoperatorconfig dpf-operator-system Ready: True Success 8m19s └─DPUDeployments └─DPUDeployment/hbn dpf-operator-system Ready: True Success 19s ├─DPUServiceChains │ └─DPUServiceChain/hbn-8kkjz dpf-operator-system Ready: True Success 90m ├─DPUServiceInterfaces │ └─4 DPUServiceInterfaces... dpf-operator-system Ready: True Success 48s See doca-hbn-p0-if-mls69, doca-hbn-p1-if-dv6ds, doca-hbn-pf0hpf-if-q9lvk, doca-hbn-pf1hpf-if-979t7 ├─DPUSets │ └─DPUSet/hbn-dpuset1 dpf-operator-system │ ├─BFB/bf-bundle dpf-operator-system Ready: True Ready 91m File: bf-bundle-3.1.0-53_25.07_ubuntu-22.04_prod.bfb, DOCA: 3.1.0 │ └─DPUs │ └─2 DPUs... dpf-operator-system Ready: True DPUReady 25m See dpu-node-mt2402xz0f6v-mt2402xz0f6v, dpu-node-mt2404xz0c98-mt2404xz0c98 └─Services ├─DPUServiceTemplates │ └─DPUServiceTemplate/doca-hbn dpf-operator-system Ready: True Success 91m └─DPUServices └─DPUService/doca-hbn-6rhsx dpf-operator-system Ready: True Success 21s

Test Traffic

After the DPUs are provisioned and the rest of the objects are Ready, we can test traffic by assigning an IP to PF0 on the host of each DPU and running a simple ping. Although the configuration enables both PFs, we focus on PF0 for testing traffic. Assuming PF0 is named ens5f0np0:
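If you are unsure of the PF interface name on the host, one way to identify it is to list the non-virtual network devices and inspect the driver of a candidate interface. The interface name below is the same assumption as above:

## List network interfaces excluding virtual ones, then check the driver (mlx5_core for BlueField/ConnectX PFs).
ls -l /sys/class/net/ | grep -v virtual
ethtool -i ens5f0np0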

On the host with DPU with serial number DPU1_SERIAL:

ip link set dev ens5f0np0 up
ip addr add 10.0.121.1/29 dev ens5f0np0
ip route add 10.0.121.0/24 dev ens5f0np0 via 10.0.121.2

On the host with DPU with serial number DPU2_SERIAL:

ip link set dev ens5f0np0 up
ip addr add 10.0.121.9/29 dev ens5f0np0
ip route add 10.0.121.0/24 dev ens5f0np0 via 10.0.121.10

On the host with DPU with serial number DPU1_SERIAL:

$ ping 10.0.121.9 -c3
PING 10.0.121.9 (10.0.121.9) 56(84) bytes of data.
64 bytes from 10.0.121.9: icmp_seq=1 ttl=64 time=0.387 ms
64 bytes from 10.0.121.9: icmp_seq=2 ttl=64 time=0.344 ms
64 bytes from 10.0.121.9: icmp_seq=3 ttl=64 time=0.396 ms

--- 10.0.121.9 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2053ms
rtt min/avg/max/mdev = 0.344/0.375/0.396/0.022 ms

Using PFs + VFs

In this scenario, PF0, PF1, VF10 of PF0, and VF10 of PF1 are connected to separate VRFs, which means that:

  • PF0 on Host 1 will be able to communicate with PF0 on Host 2

  • PF0 on Host 1 will not be able to communicate with PF1 on Host 1 and 2

  • PF0 on Host 1 will not be able to communicate with PF0VF10 on Host 1 and 2

  • PF0 on Host 1 will not be able to communicate with PF1VF10 on Host 1 and 2

  • PF1 on Host 1 will be able to communicate with PF1 on Host 2

  • PF1 on Host 1 will not be able to communicate with PF0 on Host 1 and 2

  • PF1 on Host 1 will not be able to communicate with PF0VF10 on Host 1 and 2

  • PF1 on Host 1 will not be able to communicate with PF1VF10 on Host 1 and 2

  • PF0VF10 on Host 1 will be able to communicate with PF0VF10 on Host 2

  • PF0VF10 on Host 1 will not be able to communicate with PF0 on Host 1 and 2

  • PF0VF10 on Host 1 will not be able to communicate with PF1 on Host 1 and 2

  • PF0VF10 on Host 1 will not be able to communicate with PF1VF10 on Host 1 and 2

  • PF1VF10 on Host 1 will be able to communicate with PF1VF10 on Host 2

  • PF1VF10 on Host 1 will not be able to communicate with PF0 on Host 1 and 2

  • PF1VF10 on Host 1 will not be able to communicate with PF1 on Host 1 and 2

  • PF1VF10 on Host 1 will not be able to communicate with PF0VF10 on Host 1 and 2

We make use of a PF and a VF on the host to test traffic.

Create the DPUDeployment, DPUServiceConfig, DPUServiceTemplate and other necessary objects

A number of environment variables must be set before running this command.


cat manifests/03.2-dpudeployment-installation-pf-vf/*.yaml | envsubst | kubectl apply -f -

This will deploy the following objects:

  BFB to download Bluefield Bitstream to a shared volume

---
apiVersion: provisioning.dpu.nvidia.com/v1alpha1
kind: BFB
metadata:
  name: bf-bundle
  namespace: dpf-operator-system
spec:
  url: $BFB_URL

  HBN DPUFlavor to correctly configure the DPUs on provisioning


--- apiVersion: provisioning.dpu.nvidia.com/v1alpha1 kind: DPUFlavor metadata: name: hbn namespace: dpf-operator-system spec: dpuMode: zero-trust bfcfgParameters: - UPDATE_ATF_UEFI=yes - UPDATE_DPU_OS=yes - WITH_NIC_FW_UPDATE=yes configFiles: - operation: override path: /etc/mellanox/mlnx-bf.conf permissions: "0644" raw: | ALLOW_SHARED_RQ="no" IPSEC_FULL_OFFLOAD="no" ENABLE_ESWITCH_MULTIPORT="yes" - operation: override path: /etc/mellanox/mlnx-ovs.conf permissions: "0644" raw: | CREATE_OVS_BRIDGES="no" OVS_DOCA="yes" - operation: override path: /etc/mellanox/mlnx-sf.conf permissions: "0644" raw: "" grub: kernelParameters: - console=hvc0 - console=ttyAMA0 - earlycon=pl011,0x13010000 - fixrttc - net.ifnames=0 - biosdevname=0 - iommu.passthrough=1 - cgroup_no_v1=net_prio,net_cls - hugepagesz=2048kB - hugepages=3072 nvconfig: - device: '*' parameters: - PF_BAR2_ENABLE=0 - PER_PF_NUM_SF=1 - PF_TOTAL_SF=20 - PF_SF_BAR_SIZE=10 - NUM_PF_MSIX_VALID=0 - PF_NUM_PF_MSIX_VALID=1 - PF_NUM_PF_MSIX=228 - INTERNAL_CPU_MODEL=1 - INTERNAL_CPU_OFFLOAD_ENGINE=0 - SRIOV_EN=1 - NUM_OF_VFS=46 - LAG_RESOURCE_ALLOCATION=1 ovs: rawConfigScript: | _ovs-vsctl() { ovs-vsctl --no-wait --timeout 15 "$@" } _ovs-vsctl set Open_vSwitch . other_config:doca-init=true _ovs-vsctl set Open_vSwitch . other_config:dpdk-max-memzones=50000 _ovs-vsctl set Open_vSwitch . other_config:hw-offload=true _ovs-vsctl set Open_vSwitch . other_config:pmd-quiet-idle=true _ovs-vsctl set Open_vSwitch . other_config:max-idle=20000 _ovs-vsctl set Open_vSwitch . other_config:max-revalidator=5000 _ovs-vsctl --if-exists del-br ovsbr1 _ovs-vsctl --if-exists del-br ovsbr2 _ovs-vsctl --may-exist add-br br-sfc _ovs-vsctl set bridge br-sfc datapath_type=netdev _ovs-vsctl set bridge br-sfc fail_mode=secure _ovs-vsctl --may-exist add-port br-sfc p0 _ovs-vsctl set Interface p0 type=dpdk _ovs-vsctl set Interface p0 mtu_request=9216 _ovs-vsctl set Port p0 external_ids:dpf-type=physical

  DPUDeployment to provision DPUs on worker nodes


--- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUDeployment metadata: name: hbn namespace: dpf-operator-system spec: dpus: bfb: bf-bundle flavor: hbn nodeEffect: noEffect: true dpuSets: - nameSuffix: "dpuset1" nodeSelector: matchLabels: feature.node.kubernetes.io/dpu-enabled: "true" services: doca-hbn: serviceTemplate: doca-hbn serviceConfiguration: doca-hbn serviceChains: switches: - ports: - serviceInterface: matchLabels: interface: p0 - service: name: doca-hbn interface: p0_if - ports: - serviceInterface: matchLabels: interface: p1 - service: name: doca-hbn interface: p1_if - ports: - serviceInterface: matchLabels: interface: pf0hpf - service: name: doca-hbn interface: pf0hpf_if - ports: - serviceInterface: matchLabels: interface: pf1hpf - service: name: doca-hbn interface: pf1hpf_if - ports: - serviceInterface: matchLabels: interface: pf0vf10 - service: name: doca-hbn interface: pf0vf10_if - ports: - serviceInterface: matchLabels: interface: pf1vf10 - service: name: doca-hbn interface: pf1vf10_if

  DPUServiceConfig and DPUServiceTemplate to deploy HBN workloads to the DPUs


--- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceConfiguration metadata: name: doca-hbn namespace: dpf-operator-system spec: deploymentServiceName: "doca-hbn" serviceConfiguration: serviceDaemonSet: annotations: k8s.v1.cni.cncf.io/networks: |- [ {"name": "iprequest", "interface": "ip_lo", "cni-args": {"poolNames": ["loopback"], "poolType": "cidrpool"}}, {"name": "iprequest", "interface": "ip_pf0hpf", "cni-args": {"poolNames": ["pool1"], "poolType": "cidrpool", "allocateDefaultGateway": true}}, {"name": "iprequest", "interface": "ip_pf1hpf", "cni-args": {"poolNames": ["pool2"], "poolType": "cidrpool", "allocateDefaultGateway": true}}, {"name": "iprequest", "interface": "ip_pf0vf10", "cni-args": {"poolNames": ["pool3"], "poolType": "cidrpool", "allocateDefaultGateway": true}}, {"name": "iprequest", "interface": "ip_pf1vf10", "cni-args": {"poolNames": ["pool4"], "poolType": "cidrpool", "allocateDefaultGateway": true}} ] helmChart: values: configuration: perDPUValuesYAML: | - hostnamePattern: "*" values: bgp_peer_group: hbn vrf1: RED vrf2: BLUE vrf3: GREEN vrf4: YELLOW l3vni1: 100001 l3vni2: 100002 l3vni3: 100003 l3vni4: 100004 - hostnamePattern: "dpu-node-${DPU1_SERIAL}*" values: bgp_autonomous_system: 65101 - hostnamePattern: "dpu-node-${DPU2_SERIAL}*" values: bgp_autonomous_system: 65201 startupYAMLJ2: | - header: model: bluefield nvue-api-version: nvue_v1 rev-id: 1.0 version: HBN 2.4.0 - set: evpn: enable: on route-advertise: {} interface: lo: ip: address: {{ ipaddresses.ip_lo.ip }}/32: {} type: loopback p0_if,p1_if,pf0vf10_if,pf1vf10_if,pf0hpf_if,pf1hpf_if: type: swp link: mtu: 9000 pf0vf10_if: ip: address: {{ ipaddresses.ip_pf0vf10.cidr }}: {} vrf: {{ config.vrf1 }} pf1vf10_if: ip: address: {{ ipaddresses.ip_pf1vf10.cidr }}: {} vrf: {{ config.vrf2 }} pf0hpf_if: ip: address: {{ ipaddresses.ip_pf0hpf.cidr }}: {} vrf: {{ config.vrf3 }} pf1hpf_if: ip: address: {{ ipaddresses.ip_pf1hpf.cidr }}: {} vrf: {{ config.vrf4 }} nve: vxlan: arp-nd-suppress: on enable: on source: address: {{ ipaddresses.ip_lo.ip }} router: bgp: enable: on graceful-restart: mode: full vrf: default: router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on multipaths: ebgp: 16 l2vpn-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on neighbor: p0_if: peer-group: {{ config.bgp_peer_group }} type: unnumbered address-family: l2vpn-evpn: enable: on add-path-tx: off p1_if: peer-group: {{ config.bgp_peer_group }} type: unnumbered address-family: l2vpn-evpn: enable: on add-path-tx: off path-selection: multipath: aspath-ignore: on peer-group: {{ config.bgp_peer_group }}: address-family: ipv4-unicast: enable: on l2vpn-evpn: enable: on remote-as: external router-id: {{ ipaddresses.ip_lo.ip }} {{ config.vrf1 }}: evpn: enable: on vni: {{ config.l3vni1 }}: {} router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on route-export: to-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on {{ config.vrf2 }}: evpn: enable: on vni: {{ config.l3vni2 }}: {} router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on route-export: to-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on {{ config.vrf3 }}: evpn: enable: on vni: {{ config.l3vni3 }}: {} router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on route-export: to-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on {{ config.vrf4 }}: 
evpn: enable: on vni: {{ config.l3vni4 }}: {} router: bgp: address-family: ipv4-unicast: enable: on redistribute: connected: enable: on route-export: to-evpn: enable: on autonomous-system: {{ config.bgp_autonomous_system }} enable: on interfaces: - name: p0_if network: mybrhbn - name: p1_if network: mybrhbn - name: pf0vf10_if network: mybrhbn - name: pf1vf10_if network: mybrhbn - name: pf0hpf_if network: mybrhbn - name: pf1hpf_if network: mybrhbn

---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceTemplate
metadata:
  name: doca-hbn
  namespace: dpf-operator-system
spec:
  deploymentServiceName: "doca-hbn"
  helmChart:
    source:
      repoURL: $HELM_REGISTRY_REPO_URL
      version: 1.0.3
      chart: doca-hbn
    values:
      image:
        repository: $HBN_NGC_IMAGE_URL
        tag: 3.1.0-doca3.1.0
      resources:
        memory: 6Gi
        nvidia.com/bf_sf: 6

  DPUServiceInterfaces for physical ports on the DPU


--- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: p0 namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "p0" spec: interfaceType: physical physical: interfaceName: p0 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: p1 namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "p1" spec: interfaceType: physical physical: interfaceName: p1 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: pf0vf10-rep namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "pf0vf10" spec: interfaceType: vf vf: parentInterfaceRef: p0 pfID: 0 vfID: 10 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: pf1vf10-rep namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "pf1vf10" spec: interfaceType: vf vf: parentInterfaceRef: p1 pfID: 1 vfID: 10 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: pf0hpf namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "pf0hpf" spec: interfaceType: pf pf: pfID: 0 --- apiVersion: svc.dpu.nvidia.com/v1alpha1 kind: DPUServiceInterface metadata: name: pf1hpf namespace: dpf-operator-system spec: template: spec: template: metadata: labels: interface: "pf1hpf" spec: interfaceType: pf pf: pfID: 1

  DPUServiceIPAM to set up IP Address Management on the DPUCluster

---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool1
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.121.0/24"
    gatewayIndex: 2
    prefixSize: 29
    # These preallocations are not necessary. We specify them so that the validation commands are straightforward.
    allocations:
      dpu-node-${DPU1_SERIAL}-${DPU1_SERIAL}: 10.0.121.0/29
      dpu-node-${DPU2_SERIAL}-${DPU2_SERIAL}: 10.0.121.8/29
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool2
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.122.0/24"
    gatewayIndex: 2
    prefixSize: 29
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool3
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.123.0/24"
    gatewayIndex: 2
    prefixSize: 29
    # These preallocations are not necessary. We specify them so that the validation commands are straightforward.
    allocations:
      dpu-node-${DPU1_SERIAL}-${DPU1_SERIAL}: 10.0.123.0/29
      dpu-node-${DPU2_SERIAL}-${DPU2_SERIAL}: 10.0.123.8/29
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool4
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.124.0/24"
    gatewayIndex: 2
    prefixSize: 29

  DPUServiceIPAM for the loopback interface in HBN

---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: loopback
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "11.0.0.0/24"
    prefixSize: 32

Verification

These verification commands may need to be run multiple times to ensure the condition is met.

Note that the DPUService name will have a random suffix. For example, doca-hbn-l2xsl.

Verify the DPU and Service installation with:

## Ensure the DPUServices are created and have been reconciled.
kubectl wait --for=condition=ApplicationsReconciled --namespace dpf-operator-system dpuservices -l svc.dpu.nvidia.com/owned-by-dpudeployment=dpf-operator-system_hbn
## Ensure the DPUServiceIPAMs have been reconciled
kubectl wait --for=condition=DPUIPAMObjectReconciled --namespace dpf-operator-system dpuserviceipam --all
## Ensure the DPUServiceInterfaces have been reconciled
kubectl wait --for=condition=ServiceInterfaceSetReconciled --namespace dpf-operator-system dpuserviceinterface --all
## Ensure the DPUServiceChains have been reconciled
kubectl wait --for=condition=ServiceChainSetReconciled --namespace dpf-operator-system dpuservicechain --all
## Ensure the DPUs have the condition Initialized (this may take time)
kubectl wait --for=condition=Initialized --namespace dpf-operator-system dpu --all

or with dpfctl:


$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments NAME NAMESPACE STATUS REASON SINCE MESSAGE DPFOperatorConfig/dpfoperatorconfig dpf-operator-system │ ├─Ready False Pending 3m13s The following conditions are not ready: │ │ * SystemComponentsReady │ └─SystemComponentsReady False Error 2m28s System components must be ready for DPF Operator to continue: │ * nvidia-k8s-ipam: DPUService dpf-operator-system/nvidia-k8s-ipam is not ready └─DPUDeployments └─DPUDeployment/hbn dpf-operator-system │ ├─Ready False Pending 77s The following conditions are not ready: │ │ * DPUSetsReady │ └─DPUSetsReady False Pending 79s Objects are not ready: │ * dpf-operator-system/hbn-dpuset1 ├─DPUServiceChains │ └─DPUServiceChain/hbn-5zgs4 dpf-operator-system Ready: True Success 79s ├─DPUServiceInterfaces │ └─6 DPUServiceInterfaces... dpf-operator-system Ready: True Success 79s See doca-hbn-p0-if-w6f6b, doca-hbn-p1-if-p7565, doca-hbn-pf0hpf-if-wb84j, doca-hbn-pf0vf10-if-mr6fj, │ doca-hbn-pf1hpf-if-cnbz8, doca-hbn-pf1vf10-if-7r6r6 ├─DPUSets │ └─DPUSet/hbn-dpuset1 dpf-operator-system │ ├─BFB/bf-bundle dpf-operator-system Ready: True Ready 105s File: bf-bundle-3.1.0-76_25.07_ubuntu-22.04_prod.bfb, DOCA: 3.1.0 │ └─DPUs │ ├─DPU/dpu-node-mt2402xz0f6v-mt2402xz0f6v dpf-operator-system │ │ └─Ready False OS Installing 72s │ └─DPU/dpu-node-mt2404xz0c98-mt2404xz0c98 dpf-operator-system │ └─Ready False OS Installing 69s └─Services ├─DPUServiceTemplates │ └─DPUServiceTemplate/doca-hbn dpf-operator-system Ready: True Success 104s └─DPUServices └─DPUService/doca-hbn-bjqbh dpf-operator-system Ready: True Success 77s

Making the DPUs Ready

To make the DPUs ready, we need to manually power cycle the hosts. To avoid corruption, do this gracefully: shut down the host and DPU, power off the server, and then power it back on. This should only happen once the DPU object signals that it is waiting for a power cycle. The described flow can be automated by the admin depending on the infrastructure.

The following verification command may need to be run multiple times to ensure the condition is met.

## Ensure the DPUs have the condition WaitingForManualPowerCycleOrReboot (this may take time)
kubectl wait --for=condition=WaitingForManualPowerCycleOrReboot --namespace dpf-operator-system dpu --all

or with dpfctl:


$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments NAME NAMESPACE STATUS REASON SINCE MESSAGE DPFOperatorConfig/dpfoperatorconfig dpf-operator-system │ ├─Ready False Pending 17m The following conditions are not ready: │ │ * SystemComponentsReady │ └─SystemComponentsReady False Error 16m System components must be ready for DPF Operator to continue: │ * nvidia-k8s-ipam: DPUService dpf-operator-system/nvidia-k8s-ipam is not ready └─DPUDeployments └─DPUDeployment/hbn dpf-operator-system │ ├─Ready False Pending 15m The following conditions are not ready: │ │ * DPUSetsReady │ └─DPUSetsReady False Pending 15m Objects are not ready: │ * dpf-operator-system/hbn-dpuset1 ├─DPUServiceChains │ └─DPUServiceChain/hbn-5zgs4 dpf-operator-system Ready: True Success 15m ├─DPUServiceInterfaces │ └─6 DPUServiceInterfaces... dpf-operator-system Ready: True Success 15m See doca-hbn-p0-if-w6f6b, doca-hbn-p1-if-p7565, doca-hbn-pf0hpf-if-wb84j, doca-hbn-pf0vf10-if-mr6fj, │ doca-hbn-pf1hpf-if-cnbz8, doca-hbn-pf1vf10-if-7r6r6 ├─DPUSets │ └─DPUSet/hbn-dpuset1 dpf-operator-system │ ├─BFB/bf-bundle dpf-operator-system Ready: True Ready 15m File: bf-bundle-3.1.0-76_25.07_ubuntu-22.04_prod.bfb, DOCA: 3.1.0 │ └─DPUs │ ├─DPU/dpu-node-mt2402xz0f6v-mt2402xz0f6v dpf-operator-system │ │ ├─Rebooted False WaitingForManualPowerCycleOrReboot 2m36s │ │ └─Ready False Rebooting 2m36s │ └─DPU/dpu-node-mt2404xz0c98-mt2404xz0c98 dpf-operator-system │ ├─Rebooted False WaitingForManualPowerCycleOrReboot 2m36s │ └─Ready False Rebooting 2m36s └─Services ├─DPUServiceTemplates │ └─DPUServiceTemplate/doca-hbn dpf-operator-system Ready: True Success 15m └─DPUServices └─DPUService/doca-hbn-bjqbh dpf-operator-system Ready: True Success 15m

At this point, we have to power cycle the hosts. Once all the hosts are back online, we have to remove an annotation from the DPUNodes. The user can choose to remove this annotation node by node, but to keep this guide simple, we remove it from all of them at once.


kubectl annotate dpunodes -n dpf-operator-system --all provisioning.dpu.nvidia.com/dpunode-external-reboot-required-

After this is done, all DPUs should become Ready:


kubectl wait --for="jsonpath={.status.phase}=Ready" --namespace dpf-operator-system dpu --all

or with dpfctl:


$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments NAME NAMESPACE STATUS REASON SINCE MESSAGE NAME NAMESPACE STATUS REASON SINCE MESSAGE DPFOperatorConfig/dpfoperatorconfig dpf-operator-system Ready: True Success 6m5s └─DPUDeployments └─DPUDeployment/hbn dpf-operator-system Ready: True Success 2s ├─DPUServiceChains │ └─DPUServiceChain/hbn-5zgs4 dpf-operator-system Ready: True Success 36s ├─DPUServiceInterfaces │ └─6 DPUServiceInterfaces... dpf-operator-system Ready: True Success 6s See doca-hbn-p0-if-w6f6b, doca-hbn-p1-if-p7565, doca-hbn-pf0hpf-if-wb84j, doca-hbn-pf0vf10-if-mr6fj, │ doca-hbn-pf1hpf-if-cnbz8, doca-hbn-pf1vf10-if-7r6r6 ├─DPUSets │ └─DPUSet/hbn-dpuset1 dpf-operator-system │ ├─BFB/bf-bundle dpf-operator-system Ready: True Ready 28m File: bf-bundle-3.1.0-76_25.07_ubuntu-22.04_prod.bfb, DOCA: 3.1.0 │ └─DPUs │ └─2 DPUs... dpf-operator-system Ready: True DPUReady 5m52s See dpu-node-mt2402xz0f6v-mt2402xz0f6v, dpu-node-mt2404xz0c98-mt2404xz0c98 └─Services ├─DPUServiceTemplates │ └─DPUServiceTemplate/doca-hbn dpf-operator-system Ready: True Success 28m └─DPUServices └─DPUService/doca-hbn-bjqbh dpf-operator-system Ready: True Success 3s

Test Traffic

After the DPUs are provisioned and the rest of the objects are Ready, we can test traffic by assigning an IP to PF0 on the host of each DPU and running a simple ping. Although the configuration enables both PFs, we focus on PF0 for testing traffic. Assuming PF0 is named ens5f0np0:

On the host with DPU with serial number DPU1_SERIAL:

ip link set dev ens5f0np0 up
ip addr add 10.0.121.1/29 dev ens5f0np0
ip route add 10.0.121.0/24 dev ens5f0np0 via 10.0.121.2

On the host with DPU with serial number DPU2_SERIAL:

ip link set dev ens5f0np0 up
ip addr add 10.0.121.9/29 dev ens5f0np0
ip route add 10.0.121.0/24 dev ens5f0np0 via 10.0.121.10

On the host with DPU with serial number DPU1_SERIAL:

$ ping 10.0.121.9 -c3
PING 10.0.121.9 (10.0.121.9) 56(84) bytes of data.
64 bytes from 10.0.121.9: icmp_seq=1 ttl=64 time=0.387 ms
64 bytes from 10.0.121.9: icmp_seq=2 ttl=64 time=0.344 ms
64 bytes from 10.0.121.9: icmp_seq=3 ttl=64 time=0.396 ms

--- 10.0.121.9 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2053ms
rtt min/avg/max/mdev = 0.344/0.375/0.396/0.022 ms

In addition, we can test traffic by assigning an IP to the 10th VF of PF0 on the host of each DPU and running a simple ping. We could use any VF, but the DPUDeployment and DPUServiceInterface would need to be adjusted accordingly. The first step is to create the VFs on the host of each DPU:


echo 12 > /sys/class/net/ens5f0np0/device/sriov_numvfs
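
Optionally, confirm that the VFs were created before continuing. The PF name below is the same assumption as above:

## Show the configured VF count and the per-VF entries reported by the PF.
cat /sys/class/net/ens5f0np0/device/sriov_numvfs
ip link show ens5f0np0 | grep vf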

Then, assuming the VF is named ens5f0v10:

On the host with DPU with serial number DPU1_SERIAL:

ip link set dev ens5f0v10 up
ip addr add 10.0.123.1/29 dev ens5f0v10
ip route add 10.0.123.0/24 dev ens5f0v10 via 10.0.123.2

On the host with DPU with serial number DPU2_SERIAL:

ip link set dev ens5f0v10 up
ip addr add 10.0.123.9/29 dev ens5f0v10
ip route add 10.0.123.0/24 dev ens5f0v10 via 10.0.123.10

On the host with DPU with serial number DPU1_SERIAL:

$ ping 10.0.123.9 -c3
PING 10.0.123.9 (10.0.123.9) 56(84) bytes of data.
64 bytes from 10.0.123.9: icmp_seq=1 ttl=64 time=0.387 ms
64 bytes from 10.0.123.9: icmp_seq=2 ttl=64 time=0.344 ms
64 bytes from 10.0.123.9: icmp_seq=3 ttl=64 time=0.396 ms

--- 10.0.123.9 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2053ms
rtt min/avg/max/mdev = 0.344/0.375/0.396/0.022 ms

4. Clean up

This section covers only the DPF-related components and not the prerequisites, as these must be managed by the admin.

Delete the DPF Operator system and DPF Operator

kubectl delete -n dpf-operator-system dpfoperatorconfig dpfoperatorconfig --wait
helm uninstall -n dpf-operator-system dpf-operator --wait


Delete DPF Operator PVC

kubectl -n dpf-operator-system delete pvc bfb-pvc
kubectl delete pv bfb-pv

Note: there can be a race condition when deleting the underlying Kamaji cluster, which runs the DPU cluster control plane in this guide. If that happens, it may be necessary to manually remove finalizers from the DPUCluster and Datastore objects.
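As a rough sketch, finalizers can be cleared with a merge patch of the following shape. The Datastore name is a placeholder, and the DPUCluster name matches the one used in this guide:

kubectl -n dpu-cplane-tenant1 patch dpucluster dpu-cplane-tenant1 --type=merge -p '{"metadata":{"finalizers":[]}}'
kubectl patch datastore <datastore-name> --type=merge -p '{"metadata":{"finalizers":[]}}'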

© Copyright 2025, NVIDIA. Last updated on Sep 3, 2025.