RDG for DPF Zero Trust (DPF-ZT) with HBN and SNAP Block Bootable Disk DPU Services
Created on Jan 1, 2026 (v 25.10)
This Reference Deployment Guide (RDG) provides detailed instructions for deploying the NVIDIA DOCA Platform Framework (DPF) in Zero-Trust mode on high-performance, bare-metal infrastructure. The guide focuses on configuring Host-Based Networking (HBN) and Storage-Defined Network Accelerated Processing (SNAP) services on NVIDIA® BlueField®-3 DPUs, with SNAP operating in block device mode. This deployment delivers secure, isolated, and hardware-accelerated environments optimized for Zero-Trust architectures.
This document is an extension of the RDG for DPF Zero Trust (DPF-ZT) (referred to as the Baseline RDG). It details the additional steps and modifications required to deploy and orchestrate the SNAP DPU service together with the HBN DPU service, on top of the services described in the Baseline RDG.
The guide is intended for experienced system administrators, systems engineers, and solution architects who build highly secure bare-metal environments with Host-Based Networking enabled using NVIDIA BlueField DPUs for acceleration, isolation, and infrastructure offload.
This reference implementation, as the name implies, is a specific, opinionated deployment example designed to address the use case described above.
While other approaches may exist to implement similar solutions, this document provides a detailed guide for this particular method.
| Term | Definition | Term | Definition |
|---|---|---|---|
| BFB | BlueField Bootstream | MAAS | Metal as a Service |
| BGP | Border Gateway Protocol | OVN | Open Virtual Network |
| CNI | Container Network Interface | PVC | Persistent Volume Claim |
| CRD | Custom Resource Definition | RDG | Reference Deployment Guide |
| CSI | Container Storage Interface | RDMA | Remote Direct Memory Access |
| DHCP | Dynamic Host Configuration Protocol | SF | Scalable Function |
| DOCA | Data Center Infrastructure-on-a-Chip Architecture | SFC | Service Function Chaining |
| DOCA SNAP | NVIDIA® DOCA™ Storage-Defined Network Accelerated Processing | SPDK | Storage Performance Development Kit |
| DPF | DOCA Platform Framework | SR-IOV | Single Root Input/Output Virtualization |
| DPU | Data Processing Unit | TOR | Top of Rack |
| DTS | DOCA Telemetry Service | VF | Virtual Function |
| GENEVE | Generic Network Virtualization Encapsulation | VLAN | Virtual LAN (Local Area Network) |
| HBN | Host-Based Networking | VRR | Virtual Router Redundancy |
| IPAM | IP Address Management | VTEP | Virtual Tunnel End Point |
| K8S | Kubernetes | VXLAN | Virtual Extensible LAN |
The NVIDIA BlueField-3 Data Processing Unit is a powerful infrastructure compute platform designed for high-speed processing of software-defined networking, storage, and cybersecurity. With up to 400 Gb/s of throughput, BlueField-3 combines robust computing, high-speed networking, and extensive programmability to deliver hardware-accelerated, software-defined solutions for demanding workloads.
Deploying and managing DPUs and their associated DOCA services, especially at scale, can be quite challenging. Without a proper provisioning and orchestration system, handling the DPU lifecycle and configuring DOCA services place a heavy operational burden on system administrators. The NVIDIA DOCA Platform Framework (DPF) addresses this challenge by streamlining and automating the lifecycle management of DOCA services.
NVIDIA DOCA unleashes the full power of the BlueField® platform, empowering organizations to rapidly build next-generation applications and services that offload, accelerate, and isolate critical data center workloads. By leveraging DOCA, businesses can achieve unmatched performance, security, and efficiency across modern infrastructure.
A prime example of this innovation is NVIDIA DOCA SNAP, a breakthrough DPU-based storage solution designed to accelerate and optimize storage protocols using BlueField's advanced hardware acceleration. DOCA SNAP delivers a family of services that virtualize local storage at the hardware level, presenting networked storage as local block devices to the host and emulating physical drives over the PCIe bus. In our use case, the block device serves as the boot drive. With DOCA SNAP, organizations gain high-performance, low-latency access to storage by bypassing traditional filesystem overhead and interacting directly with raw block devices. This results in faster data access, reduced CPU utilization, and improved workload efficiency. Integrated into the DOCA Platform Framework (DPF), SNAP is packaged as containerized components deployed seamlessly across the x86 and DPU Kubernetes clusters, delivering a scalable, Zero-Trust architecture for the modern data center.
DPF supports multiple deployment models. This guide focuses on the Zero Trust bare-metal deployment model. In this scenario:
The DPU is managed through its Baseboard Management Controller ( BMC )
All management traffic occurs over the DPU's out-of-band ( OOB ) network
The host is considered an untrusted entity with respect to the data center network. The DPU acts as a barrier between the host and the network.
The host sees the DPU as a standard NIC , with no access to the internal DPU management plane (Zero Trust Mode)
This Reference Deployment Guide (RDG) provides a step-by-step example for installing DPF in Zero-Trust mode with HBN and SNAP DPU services. As part of the reference implementation, open-source components outside the scope of DPF (e.g., MAAS, pfSense, Kubespray) are used to simulate a realistic customer deployment environment. The guide includes the full end-to-end deployment process, including:
Infrastructure provisioning
DPF deployment
DPU provisioning (Redfish)
Service configuration and deployment
Service chaining
In this guide, the Storage Performance Development Kit (SPDK) is used as an example of a storage backend service.
This storage backend service is used only for demonstration purposes and is not intended or supported for production use cases.
Key Components and Technologies
NVIDIA BlueField® Data Processing Unit (DPU)
The NVIDIA® BlueField® data processing unit (DPU) ignites unprecedented innovation for modern data centers and supercomputing clusters. With its robust compute power and integrated software-defined hardware accelerators for networking, storage, and security, BlueField creates a secure and accelerated infrastructure for any workload in any environment, ushering in a new era of accelerated computing and AI.
NVIDIA DOCA Software Framework
NVIDIA DOCA™ unlocks the potential of the NVIDIA® BlueField® networking platform. By harnessing the power of BlueField DPUs and SuperNICs, DOCA enables the rapid creation of applications and services that offload, accelerate, and isolate data center workloads. It lets developers create software-defined, cloud-native, DPU- and SuperNIC-accelerated services with zero-trust protection, addressing the performance and security demands of modern data centers.
10/25/40/50/100/200 and 400G Ethernet Network Adapters
The industry-leading NVIDIA® ConnectX® family of smart network interface cards (SmartNICs) offers advanced hardware offloads and accelerations.
NVIDIA Ethernet adapters enable the highest ROI and lowest Total Cost of Ownership for hyperscale, public and private clouds, storage, machine learning, AI, big data, and telco platforms.
The NVIDIA® LinkX® product family of cables and transceivers provides the industry’s most complete line of 10, 25, 40, 50, 100, 200, and 400GbE in Ethernet and 100, 200 and 400Gb/s InfiniBand products for Cloud, HPC, hyperscale, Enterprise, telco, storage and artificial intelligence, data center applications.
NVIDIA Spectrum Ethernet Switches
Flexible form-factors with 16 to 128 physical ports, supporting 1GbE through 400GbE speeds.
Based on a ground-breaking silicon technology optimized for performance and scalability, NVIDIA Spectrum switches are ideal for building high-performance, cost-effective, and efficient Cloud Data Center Networks, Ethernet Storage Fabric, and Deep Learning Interconnects.
NVIDIA combines the benefits of NVIDIA Spectrum™ switches, based on an industry-leading application-specific integrated circuit (ASIC) technology, with a wide variety of modern network operating system choices, including NVIDIA Cumulus® Linux , SONiC and NVIDIA Onyx®.
NVIDIA® Cumulus® Linux is the industry's most innovative open network operating system that allows you to automate, customize, and scale your data center network like no other.
Kubernetes is an open-source container orchestration platform for deployment automation, scaling, and management of containerized applications.
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS and Kubernetes cluster configuration management tasks, and provides:
A highly available cluster
Composable attributes
Support for most popular Linux distributions
Solution Design
Solution Logical Design
The logical design includes the following components:
1 x Hypervisor node (KVM-based) with ConnectX-7:
1 x Firewall VM
1 x Jump Node VM
1 x MaaS VM
1 x DPU DHCP VM
3 x K8s Master VMs running all K8s management components
2 x Worker nodes (PCIe Gen5), each with 1 x BlueField-3 NIC
Storage Target Node with ConnectX-7 and SPDK target apps
Single High-Speed (HS) switch
1 Gb Host Management network
SFC Logical Diagram
The DOCA Platform Framework simplifies DPU management by providing orchestration through a K8s API. It handles the provisioning and lifecycle management of DPUs, orchestrates specialized DPU services, and automates service function chaining (SFC) tasks. This ensures seamless deployment of NVIDIA DOCA services, allowing traffic to be efficiently offloaded and routed through HBN's data plane. The SFC logical diagram implemented in this guide is shown below.
Disk Emulation Logical Diagram
The following logical diagram illustrates the main components involved in mounting a disk to a tenant workload pod.
Upon receiving a request for a new emulated NVMe drive, the DOCA SNAP components attach a block device (BDEV) over NVMe-oF, using either the RDMA or TCP storage protocol, to the required bare-metal worker node. The DPU then emulates it as a block device on the x86 host via the "BlueField NVMe SNAP Controller".
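For intuition only, the DPU-side attach is conceptually similar to a manual NVMe-oF connection from an initiator to the storage target, after which the namespace is exposed to the host as an emulated drive. The sketch below is illustrative and not a deployment step; SNAP performs the equivalent attach internally on the DPU, and the subsystem NQN shown is a placeholder:

# Illustrative only - SNAP establishes the equivalent NVMe-oF connection on the DPU itself
nvme connect -t rdma -a 10.0.124.1 -s 4420 -n <target-subsystem-nqn>
nvme list   # the remote namespace then appears as a local NVMe block device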
Firewall Design
The pfSense firewall in this solution serves a dual purpose:
Firewall – Provides an isolated environment for the DPF system, ensuring secure operations
Router – Enables internet access and connectivity between the host management network and the high-speed network
Port-forwarding rules for SSH and RDP are configured on the firewall to route traffic to the jump node’s IP address in the host management network. From the jump node, administrators can manage and access various devices in the setup, as well as handle the deployment of the Kubernetes (K8s) cluster and DPF components.
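For example, assuming the firewall forwards WAN TCP port 2222 to the jump node's SSH port (the port number and user name below are illustrative assumptions, not values defined by this guide), administrators on the trusted LAN can reach the jump node as follows:

# Illustrative example of reaching the jump node through the firewall's port-forwarding rule
ssh -p 2222 <jump-node-user>@<firewall-wan-ip>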
The following diagram illustrates the firewall design used in this solution:
Software Stack Components
Make sure to use the exact same versions for the software stack as described above.
Bill of Materials
Node and Switch Definitions
These are the definitions and parameters used for deploying the demonstrated fabric:
Switch Port Usage

| Switch | Quantity | Ports |
|---|---|---|
| mgmt-switch (SN2201) | 1 | swp1-6 |
| hs-switch (SN3700) | 1 | swp1, swp2, swp11-14, swp32 |
Hosts

| Rack | Server Type | Server Name | Switch Port | IP and NICs | Default Gateway |
|---|---|---|---|---|---|
| Rack1 | Hypervisor Node | | mgmt-switch: swp1 / hs-switch: | lab-br (interface eno1): Trusted LAN IP; mgmt-br (interface eno2): -; hs-br (interface ens2f0np0): | Trusted LAN GW |
| Rack1 | Storage Target Node | | mgmt-switch: / hs-switch: | enp1s0f0: 10.0.110.25/24; enp144s0f0np0: 10.0.124.1/24 | 10.0.110.254 |
| Rack1 | Worker Node | | mgmt-switch: / hs-switch: | ens15f0: 10.0.110.21/24; ens5f0np0/ens5f1np1: 10.0.120.0/22 | 10.0.110.254 |
| Rack1 | Worker Node | | mgmt-switch: / hs-switch: | ens15f0: 10.0.110.22/24; ens5f0np0/ens5f1np1: 10.0.120.0/22 | 10.0.110.254 |
| Rack1 | Firewall (Virtual) | | - | WAN (lab-br): Trusted LAN IP; LAN (mgmt-br): 10.0.110.254/24; OPT1 (hs-br): 172.169.50.1/30 | Trusted LAN GW |
| Rack1 | Jump Node (Virtual) | | - | enp1s0: 10.0.110.253/24 | 10.0.110.254 |
| Rack1 | MAAS (Virtual) | | - | enp1s0: 10.0.110.252/24 | 10.0.110.254 |
| Rack1 | DPU DHCP (Virtual) | | - | enp1s0: 10.0.125.4/24 | 10.0.125.1 |
| Rack1 | Master Node (Virtual) | | - | enp1s0: 10.0.110.1/24 | 10.0.110.254 |
| Rack1 | Master Node (Virtual) | | - | enp1s0: 10.0.110.2/24 | 10.0.110.254 |
| Rack1 | Master Node (Virtual) | | - | enp1s0: 10.0.110.3/24 | 10.0.110.254 |
Wiring
Hypervisor Node
Bare Metal Worker Node
Storage Target Node
Fabric Configuration
Updating Cumulus Linux
As a best practice, make sure to use the latest released Cumulus Linux NOS version.
For information on how to upgrade Cumulus Linux, refer to the Cumulus Linux User Guide.
Configuring the Cumulus Linux Switch
The SN3700 switch (hs-switch) is configured as follows. The commands below configure BGP unnumbered on hs-switch; note that Cumulus Linux enables BGP equal-cost multipathing (ECMP) by default.
SN3700 Switch Console
nv set bridge domain br_default vlan 10 vni 10
nv set evpn state enabled
nv set interface eth0 ipv4 dhcp-client state enabled
nv set interface eth0 type eth
nv set interface eth0 vrf mgmt
nv set interface lo ipv4 address 11.0.0.101/32
nv set interface lo type loopback
nv set interface swp1 ipv4 address 172.169.50.2/30
nv set interface swp1 link speed auto
nv set interface swp1-32 type swp
nv set interface swp2 ipv4 address 10.0.125.254/24
nv set interface swp32 bridge domain br_default access 10
nv set nve vxlan source address 11.0.0.101
nv set nve vxlan state enabled
nv set qos roce mode lossless
nv set qos roce state enabled
nv set router bgp autonomous-system 65001
nv set router bgp graceful-restart mode full
nv set router bgp router-id 11.0.0.101
nv set router bgp state enabled
nv set system hostname hs-switch
nv set vrf default router bgp address-family ipv4-unicast network 10.0.125.0/24
nv set vrf default router bgp address-family ipv4-unicast network 11.0.0.101/32
nv set vrf default router bgp address-family ipv4-unicast state enabled
nv set vrf default router bgp address-family ipv6-unicast redistribute connected state enabled
nv set vrf default router bgp address-family ipv6-unicast state enabled
nv set vrf default router bgp address-family l2vpn-evpn state enabled
nv set vrf default router bgp neighbor swp11 enforce-first-as disabled
nv set vrf default router bgp neighbor swp11 peer-group hbn
nv set vrf default router bgp neighbor swp11 type unnumbered
nv set vrf default router bgp neighbor swp12 enforce-first-as disabled
nv set vrf default router bgp neighbor swp12 peer-group hbn
nv set vrf default router bgp neighbor swp12 type unnumbered
nv set vrf default router bgp neighbor swp13 enforce-first-as disabled
nv set vrf default router bgp neighbor swp13 peer-group hbn
nv set vrf default router bgp neighbor swp13 type unnumbered
nv set vrf default router bgp neighbor swp14 enforce-first-as disabled
nv set vrf default router bgp neighbor swp14 peer-group hbn
nv set vrf default router bgp neighbor swp14 type unnumbered
nv set vrf default router bgp path-selection multipath aspath-ignore enabled
nv set vrf default router bgp peer-group hbn address-family ipv4-unicast default-route-origination state enabled
nv set vrf default router bgp peer-group hbn address-family ipv4-unicast state enabled
nv set vrf default router bgp peer-group hbn address-family ipv6-unicast state enabled
nv set vrf default router bgp peer-group hbn address-family l2vpn-evpn state enabled
nv set vrf default router bgp peer-group hbn enforce-first-as disabled
nv set vrf default router bgp peer-group hbn remote-as external
nv set vrf default router bgp state enabled
nv set vrf default router static 0.0.0.0/0 address-family ipv4-unicast
nv set vrf default router static 0.0.0.0/0 via 172.169.50.1 type ipv4-address
nv config apply -y
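Optionally, after applying the configuration (and once the HBN services on the DPUs come up later in the deployment), the switch state can be inspected with NVUE show commands. This is a brief sketch, assuming these show commands are available on the installed Cumulus Linux release:

SN3700 Switch Console

nv show interface
nv show vrf default router bgp neighbor
nv show evpn
nv show qos roce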
The SN2201 switch (mgmt-switch) is configured as follows:
SN2201 Switch Console
nv set bridge domain br_default untagged 1
nv set interface swp1-6 link state up
nv set interface swp1-6 type swp
nv set interface swp1-6 bridge domain br_default
nv config apply -y
Installation and Configuration
Make sure that the BIOS settings on the worker node servers have SR-IOV enabled and that the servers are tuned for maximum performance.
All worker nodes must have the same PCIe placement for the BlueField-3 NIC and must display the same interface name.
Make sure that you have DPU BMC and OOB MAC addresses.
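To help confirm that all worker nodes expose the BlueField-3 NIC at the same PCIe address and with the same interface name (as required above), the following illustrative commands can be run on each worker node and the outputs compared:

Worker Node Console

lspci -D | grep -i bluefield
ip -br link show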
No change from the Baseline RDG (Section "Deployment and Configuration", Subsection "Host Configuration").
Hypervisor Installation and Configuration
No change from the Baseline RDG (Section "Hypervisor Installation and Configuration").
Prepare Infrastructure Servers
No change from the Baseline RDG (Section "Deployment and Configuration", Subsection "Prepare Infrastructure Servers") regarding Jump VM, MaaS VM.
The Firewall VM should be configured according to the section "Firewall VM - pfSense Installation and Interface Configuration" in the RDG for DPF with OVN-Kubernetes and HBN Services.
Provisioning "DPU DHCP VM"
Install Rocky Linux 9.0 in the minimal server configuration.
Manually configure the IP address 10.0.125.4/24 with default gateway 10.0.125.1 and your preferred DNS server, as sketched below.
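On Rocky Linux, the static address can be applied with NetworkManager's nmcli. This is a minimal sketch, assuming the connection profile is named enp1s0 and 8.8.8.8 is the preferred DNS server:

Jump Node Console

sudo nmcli connection modify enp1s0 ipv4.method manual ipv4.addresses 10.0.125.4/24 ipv4.gateway 10.0.125.1 ipv4.dns 8.8.8.8
sudo nmcli connection up enp1s0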
Install the following packages:
Jump Node Console
sudo dnf -y update
sudo dnf install -y lldpd dnsmasq
Apply the following configuration to dnsmasq in the /etc/dnsmasq.conf file:
/etc/dnsmasq.conf
#
#Disable the DNS server set:
port=0
# port=53
#
#Setup the server to be your authoritative DHCP server
#
dhcp-authoritative
#
#Set the DHCP server to hand addresses sequentially
#
dhcp-sequential-ip
#
#Enable more detailed logging for DHCP
#
log-dhcp
log-queries
no-resolv
log-facility=/var/log/dnsmasq.log
domain=x86.dpf.rdg.local.domain
local=/x86.dpf.rdg.local.domain/
server=8.8.8.8
#
#Create different dhcp scopes for each of the three simulated subnets here, using tags for ID
#Format is: dhcp-range=<your_tag_here>,<start_of_scope>,<end_of_scope>,<subnet_mask>,<lease_time>
#
dhcp-range=subnet0,10.0.120.2,10.0.120.6,255.255.255.248,8h
dhcp-option=subnet0,42,192.114.62.250
dhcp-option=subnet0,6,10.0.125.4
dhcp-option=subnet0,3,10.0.120.1
dhcp-range=subnet1,10.0.120.10,10.0.120.14,255.255.255.248,8h
dhcp-option=subnet1,42,192.114.62.250
dhcp-option=subnet1,6,10.0.125.4
dhcp-option=subnet1,3,10.0.120.9
Info: The following dnsmasq configuration is customized for our specific deployment use case and should not be used as a default configuration.
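Optionally, validate the configuration syntax before starting the service using dnsmasq's built-in test mode:

Jump Node Console

sudo dnsmasq --test -C /etc/dnsmasq.conf
### Expected output: ###
dnsmasq: syntax check OK.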
Start dnsmasq.service and enable it to start automatically at boot:
Jump Node Console
sudo systemctl start dnsmasq.service
sudo systemctl enable dnsmasq.service
Check service status
Jump Node Console
sudo systemctl status dnsmasq.service

### Command output should look like: ###

dnsmasq.service - DNS caching server.
     Loaded: loaded (/usr/lib/systemd/system/dnsmasq.service; enabled; preset: disabled)
     Active: active (running) since Wed 2025-12-24 08:49:28 EST; 2 weeks 3 days ago
 Invocation: 10eb617fa5fe4bedb1fc021ddcc7751f
    Process: 1172 ExecStart=/usr/sbin/dnsmasq (code=exited, status=0/SUCCESS)
   Main PID: 1193 (dnsmasq)
      Tasks: 1 (limit: 23017)
     Memory: 2M (peak: 2.5M)
        CPU: 112ms
     CGroup: /system.slice/dnsmasq.service
             └─1193 /usr/sbin/dnsmasq

Dec 24 08:49:28 hbn-dhcp systemd[1]: Starting dnsmasq.service - DNS caching server....
Dec 24 08:49:28 hbn-dhcp systemd[1]: Started dnsmasq.service - DNS caching server..
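Later, once DPU-side clients start requesting addresses, DHCP activity can be observed in the log and lease files. This is a minimal sketch; the lease file path is the distribution default and may differ on your system:

Jump Node Console

sudo tail -n 20 /var/log/dnsmasq.log
sudo cat /var/lib/dnsmasq/dnsmasq.leases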
Provision SPDK Target Apps on Storage Target Node
Log in to the Storage Target Node and switch to the root account:
Jump Node Console
$ ssh target
$ sudo -i
Build SPDK from source (root privileges are required):
Jump Node Console
git clone https://github.com/spdk/spdk
cd spdk
git submodule update --init
apt update && apt install meson python3-pyelftools -y
./scripts/pkgdep.sh --rdma
./configure --with-rdma
make
Run SPDK target:
Jump Node Console
# Get all nvme devices
lshw -c storage -businfo
Bus info          Device   Class     Description
===========================================================
pci@0000:08:00.0           storage   PCIe Data Center SSD
pci@0000:00:11.4           storage   C610/X99 series chipset sSATA Controller [AHCI mode]
pci@0000:00:1f.2           storage   C610/X99 series chipset 6-Port SATA Controller [AHCI mode]
pci@0000:81:00.0  scsi4    storage   MegaRAID SAS-3 3108 [Invader]

# Start target
scripts/setup.sh
build/bin/nvmf_tgt &

# Add bdevs with nvme backend
scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t PCIe -a 0000:08:00.0

# Add logical volume store on base bdev
scripts/rpc.py bdev_lvol_create_lvstore Nvme0n1 lvs0

# Display current logical volume list
scripts/rpc.py bdev_lvol_get_lvstores

scripts/rpc_http_proxy.py 10.0.110.25 8000 exampleuser examplepassword &
SPDK target is ready.
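Optionally, confirm that the RPC HTTP proxy started above answers remote JSON-RPC calls. The sketch below uses curl with the example credentials; the exact request and response format follows SPDK's JSON-RPC conventions and may vary between SPDK versions:

Jump Node Console

curl -u exampleuser:examplepassword -X POST http://10.0.110.25:8000 \
     -H "Content-Type: application/json" \
     -d '{"id": 1, "method": "bdev_lvol_get_lvstores"}'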
Provision Master VMs and Worker Nodes Using MaaS
No change from the Baseline RDG (Section "Provision Master VMs Using MaaS").
UEFI BIOS mode is required for the bare-metal worker nodes.
DPU Service Installation
Adjust the DPUDeployment, DPUServiceConfiguration, DPUServiceTemplate, and other necessary objects according to your setup environment.
Before deploying the objects under the doca-platform/docs/public/user-guides/zero-trust/use-cases/hbn-snap directory, a few adjustments are required.
Several environment variables must be set before running the deployment commands; they are defined in manifests/00-env-vars/envvars.env and sourced in a later step.
Change to the directory containing the readme.md, from which all subsequent commands will be run:
Jump Node Console
$ cd doca-platform/docs/public/user-guides/zero-trust/use-cases/hbn-snap
Modify the variables in manifests/00-env-vars/envvars.env to fit your environment, then source the file.
Warning: Replace the values of the variables in the following file with values that fit your setup. Specifically, pay attention to DPUCLUSTER_INTERFACE and BMC_ROOT_PASSWORD.
manifests/00-env-vars/envvars.env
## IP Address for the Kubernetes API server of the target cluster on which DPF is installed.
## This should never include a scheme or a port.
## e.g. 10.10.10.10
export TARGETCLUSTER_API_SERVER_HOST=10.0.110.10

## Port for the Kubernetes API server of the target cluster on which DPF is installed.
export TARGETCLUSTER_API_SERVER_PORT=6443

## Virtual IP used by the load balancer for the DPU Cluster. Must be a reserved IP from the management subnet and not allocated by DHCP.
export DPUCLUSTER_VIP=10.0.110.200

## DPU_P0 is the name of the first port of the DPU. This name must be the same on all worker nodes.
#export DPU_P0=enp204s0f0np0

## Interface on which the DPUCluster load balancer will listen. Should be the management interface of the control plane node.
export DPUCLUSTER_INTERFACE=enp1s0

# IP address to the NFS server used as storage for the BFB.
export NFS_SERVER_IP=10.0.110.253

## The repository URL for the NVIDIA Helm chart registry.
## Usually this is the NVIDIA Helm NGC registry. For development purposes, this can be set to a different repository.
export HELM_REGISTRY_REPO_URL=https://helm.ngc.nvidia.com/nvidia/doca

## The repository URL for the HBN container image.
## Usually this is the NVIDIA NGC registry. For development purposes, this can be set to a different repository.
export HBN_NGC_IMAGE_URL=nvcr.io/nvidia/doca/doca_hbn

## The repository URL for the SNAP VFS container image.
## Usually this is the NVIDIA NGC registry. For development purposes, this can be set to a different repository.
export SNAP_NGC_IMAGE_URL=nvcr.io/nvidia/doca/doca_vfs

## The DPF REGISTRY is the Helm repository URL where the DPF Operator Chart resides.
## Usually this is the NVIDIA Helm NGC registry. For development purposes, this can be set to a different repository.
export REGISTRY=https://helm.ngc.nvidia.com/nvidia/doca

## The DPF TAG is the version of the DPF components which will be deployed in this guide.
export TAG=v25.10.0

## URL to the BFB used in the `bfb.yaml` and linked by the DPUSet.
export BFB_URL="https://content.mellanox.com/BlueField/BFBs/Ubuntu24.04/bf-bundle-3.2.1-34_25.11_ubuntu-24.04_64k_prod.bfb"

## IP_RANGE_START and IP_RANGE_END
## These define the IP range for DPU discovery via Redfish/BMC interfaces
## Example: If your DPUs have BMC IPs in range 192.168.1.100-110
## export IP_RANGE_START=192.168.1.100
## export IP_RANGE_END=192.168.1.110

## IP_RANGE_START and IP_RANGE_END
## Start of DPUDiscovery IpRange
export IP_RANGE_START=10.0.110.75
## End of DPUDiscovery IpRange
export IP_RANGE_END=10.0.110.76

# The password used for DPU BMC root login, must be the same for all DPUs
# For more information on how to set the BMC root password refer to BlueField DPU Administrator Quick Start Guide.
export BMC_ROOT_PASSWORD=<set your BMC_ROOT_PASSWORD>

## Serial number of DPUs. If you have more than 2 DPUs, you will need to parameterize the system accordingly and expose
## additional variables.
## All serial numbers must be in lowercase.
## Serial number of DPU1
export DPU1_SERIAL=mt2334xz09f0
## Serial number of DPU2
export DPU2_SERIAL=mt2334xz09f1
Export environment variables for the installation:
Jump Node Console
$ source manifests/00-env-vars/envvars.env
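As a quick sanity check, confirm that the variables are now set in the current shell (the values shown correspond to the example file above):

Jump Node Console

$ echo $TAG $DPU1_SERIAL $DPU2_SERIAL
v25.10.0 mt2334xz09f0 mt2334xz09f1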
Apply the necessary updates to all YAML files (dpudeployment.yaml, hbn-dpuserviceconfig.yaml, hbn-dpuservicetemplate.yaml, hbn-ipam.yaml) located in the manifests/03.1-dpudeployment-installation-nvme/ directory:
dpudeployment.yaml
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUDeployment
metadata:
  name: hbn-snap
  namespace: dpf-operator-system
spec:
  dpus:
    bfb: bf-bundle-$TAG
    flavor: hbn-snap-nvme-$TAG
    nodeEffect:
      noEffect: true
    dpuSets:
    - nameSuffix: "dpuset1"
      dpuAnnotations:
        storage.nvidia.com/preferred-dpu: "true"
      nodeSelector:
        matchLabels:
          feature.node.kubernetes.io/dpu-enabled: "true"
  services:
    doca-hbn:
      serviceTemplate: doca-hbn
      serviceConfiguration: doca-hbn
    snap-host-controller:
      serviceTemplate: snap-host-controller
      serviceConfiguration: snap-host-controller
    snap-node-driver:
      serviceTemplate: snap-node-driver
      serviceConfiguration: snap-node-driver
    doca-snap:
      serviceTemplate: doca-snap
      serviceConfiguration: doca-snap
    block-storage-dpu-plugin:
      serviceTemplate: block-storage-dpu-plugin
      serviceConfiguration: block-storage-dpu-plugin
    spdk-csi-controller:
      serviceTemplate: spdk-csi-controller
      serviceConfiguration: spdk-csi-controller
    spdk-csi-controller-dpu:
      serviceTemplate: spdk-csi-controller-dpu
      serviceConfiguration: spdk-csi-controller-dpu
  serviceChains:
    switches:
    - ports:
      - serviceInterface:
          matchLabels:
            interface: p0
      - service:
          name: doca-hbn
          interface: p0_if
    - ports:
      - serviceInterface:
          matchLabels:
            interface: p1
      - service:
          name: doca-hbn
          interface: p1_if
    - ports:
      - serviceInterface:
          matchLabels:
            interface: pf0hpf
      - service:
          name: doca-hbn
          interface: pf0hpf_if
    - ports:
      - service:
          name: doca-snap
          interface: app_sf
          ipam:
            matchLabels:
              svc.dpu.nvidia.com/pool: storage-pool
      - service:
          name: doca-hbn
          interface: snap_if
hbn-dpuserviceconfig.yaml
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceConfiguration
metadata:
  name: doca-hbn
  namespace: dpf-operator-system
spec:
  deploymentServiceName: "doca-hbn"
  serviceConfiguration:
    serviceDaemonSet:
      annotations:
        k8s.v1.cni.cncf.io/networks: |-
          [
            {"name": "iprequest", "interface": "ip_lo", "cni-args": {"poolNames": ["loopback"], "poolType": "cidrpool"}},
            {"name": "iprequest", "interface": "ip_pf0hpf", "cni-args": {"poolNames": ["pool1"], "poolType": "cidrpool", "allocateDefaultGateway": true}}
          ]
    helmChart:
      values:
        configuration:
          perDPUValuesYAML: |
            - hostnamePattern: "*"
              values:
                bgp_peer_group: hbn
            - hostnamePattern: "dpu-node-${DPU1_SERIAL}*"
              values:
                bgp_autonomous_system: 65101
            - hostnamePattern: "dpu-node-${DPU2_SERIAL}*"
              values:
                bgp_autonomous_system: 65201
          startupYAMLJ2: |
            - header:
                model: bluefield
                nvue-api-version: nvue_v1
                rev-id: 1.0
                version: HBN 3.0.0
            - set:
                evpn:
                  enable: on
                nve:
                  vxlan:
                    enable: on
                    source:
                      address: {{ ipaddresses.ip_lo.ip }}
                bridge:
                  domain:
                    br_default:
                      vlan:
                        '10':
                          vni:
                            '10': {}
                interface:
                  lo:
                    ip:
                      address:
                        {{ ipaddresses.ip_lo.ip }}/32: {}
                    type: loopback
                  p0_if,p1_if,snap_if:
                    type: swp
                    link:
                      mtu: 9000
                  pf0hpf_if:
                    ip:
                      address:
                        {{ ipaddresses.ip_pf0hpf.cidr }}: {}
                    type: swp
                    link:
                      mtu: 9000
                  snap_if:
                    bridge:
                      domain:
                        br_default:
                          access: 10
                  vlan10:
                    type: svi
                    vlan: 10
                router:
                  bgp:
                    autonomous-system: {{ config.bgp_autonomous_system }}
                    enable: on
                    graceful-restart:
                      mode: full
                    router-id: {{ ipaddresses.ip_lo.ip }}
                service:
                  dhcp-relay:
                    default:
                      server:
                        10.0.125.4: {}
                vrf:
                  default:
                    router:
                      bgp:
                        address-family:
                          ipv4-unicast:
                            enable: on
                            redistribute:
                              connected:
                                enable: on
                          ipv6-unicast:
                            enable: on
                            redistribute:
                              connected:
                                enable: on
                          l2vpn-evpn:
                            enable: on
                        enable: on
                        neighbor:
                          p0_if:
                            peer-group: {{ config.bgp_peer_group }}
                            type: unnumbered
                          p1_if:
                            peer-group: {{ config.bgp_peer_group }}
                            type: unnumbered
                        path-selection:
                          multipath:
                            aspath-ignore: on
                        peer-group:
                          {{ config.bgp_peer_group }}:
                            address-family:
                              ipv4-unicast:
                                enable: on
                              ipv6-unicast:
                                enable: on
                              l2vpn-evpn:
                                enable: on
                            remote-as: external
  interfaces:
    - name: p0_if
      network: mybrhbn
    - name: p1_if
      network: mybrhbn
    - name: pf0hpf_if
      network: mybrhbn
    - name: snap_if
      network: mybrhbn
hbn-dpuservicetemplate.yaml
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceTemplate
metadata:
  name: doca-hbn
  namespace: dpf-operator-system
spec:
  deploymentServiceName: "doca-hbn"
  helmChart:
    source:
      repoURL: $HELM_REGISTRY_REPO_URL
      version: 1.0.5
      chart: doca-hbn
    values:
      image:
        repository: $HBN_NGC_IMAGE_URL
        tag: 3.2.1-doca3.2.1
      resources:
        memory: 6Gi
        nvidia.com/bf_sf: 4
hbn-ipam.yaml
---
apiVersion: svc.dpu.nvidia.com/v1alpha1
kind: DPUServiceIPAM
metadata:
  name: pool1
  namespace: dpf-operator-system
spec:
  ipv4Network:
    network: "10.0.120.0/22"
    gatewayIndex: 1
    prefixSize: 29

Run the following command to deploy the DPUDeployment:
Jump Node Console
$ cat manifests/03.1-dpudeployment-installation-nvme/*.yaml | envsubst |kubectl apply -f -
Apply the following updates to the manifests/04.1-storage-configuration-nvme/policy-block-dpustoragepolicy.yaml file, then run the command below to deploy the storage configuration:
manifests/04.1-storage-configuration-nvme/policy-block-dpustoragepolicy.yaml
---
apiVersion: storage.dpu.nvidia.com/v1alpha1
kind: DPUStoragePolicy
metadata:
  name: policy-block
  namespace: dpf-operator-system
spec:
  dpuStorageVendors:
    - spdk-csi
  selectionAlgorithm: "NumberVolumes"
  parameters:
    num_queues: "16"

Jump Node Console
$ cat manifests/04.1-storage-configuration-nvme/*.yaml | envsubst |kubectl apply -f -
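Optionally, verify that the storage policy object was created. The resource name used below is assumed from the DPUStoragePolicy kind and may need adjustment for your environment:

Jump Node Console

$ kubectl -n dpf-operator-system get dpustoragepolicy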
Wait for the Rebooted stage, then manually power-cycle the bare-metal hosts:
Jump Node Console
$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments

### Command partial output ###
----
DPUs
|-DPU/dpu-node-mt2334xz09f0-mt2334xz09f0  dpf-operator-system
| |-Rebooted  False  WaitingForManualPowerCycleOrReboot  51m
| |-Ready     False  Rebooting                           51m
|-DPU/dpu-node-mt2334xz09f1-mt2334xz09f1  dpf-operator-system
| |-Rebooted  False  WaitingForManualPowerCycleOrReboot  51m
| |-Ready     False  Rebooting                           51m
----
After the DPUs are up, run the following command:
Jump Node Console
$ kubectl -n dpf-operator-system annotate dpunode --all provisioning.dpu.nvidia.com/dpunode-external-reboot-required-
At this point, the DPU workers should be added to the cluster. As they are being added to the cluster, the DPUs are provisioned. Finally, validate that all the DPU-related objects are now in the Ready state:
Jump Node Console
$ kubectl -n dpf-operator-system exec deploy/dpf-operator-controller-manager -- /dpfctl describe dpudeployments
Congratulations, the DPF system has been successfully installed.
Provisioning the SNAP DPU Service Block Device
Before starting SNAP block device provisioning, reboot your bare-metal hosts to apply the latest BlueField firmware settings.
Review the YAML configuration files before deploying the objects under the doca-platform/docs/public/user-guides/zero-trust/use-cases/hbn-snap/manifests/05.1-storage-test-nvme directory.
dpuvolume.yaml
---
apiVersion: storage.dpu.nvidia.com/v1alpha1
kind: DPUVolume
metadata:
name: test-volume-static-pf-${DPU1_SERIAL}
namespace: dpf-operator-system
spec:
dpuStoragePolicyName: policy-block
resources:
requests:
storage: 60Gi
accessModes:
- ReadWriteOnce
volumeMode: Block
---
apiVersion: storage.dpu.nvidia.com/v1alpha1
kind: DPUVolume
metadata:
name: test-volume-static-pf-${DPU2_SERIAL}
namespace: dpf-operator-system
spec:
dpuStoragePolicyName: policy-block
resources:
requests:
storage: 60Gi
accessModes:
- ReadWriteOnce
volumeMode: Block
dpuvolumeattachment.yaml
---
apiVersion: storage.dpu.nvidia.com/v1alpha1
kind: DPUVolumeAttachment
metadata:
name: test-volume-attachment-static-pf-${DPU1_SERIAL}
namespace: dpf-operator-system
spec:
dpuNodeName: dpu-node-${DPU1_SERIAL}
dpuVolumeName: test-volume-static-pf-${DPU1_SERIAL}
functionType: pf
hotplugFunction: false
---
apiVersion: storage.dpu.nvidia.com/v1alpha1
kind: DPUVolumeAttachment
metadata:
name: test-volume-attachment-static-pf-${DPU2_SERIAL}
namespace: dpf-operator-system
spec:
dpuNodeName: dpu-node-${DPU2_SERIAL}
dpuVolumeName: test-volume-static-pf-${DPU2_SERIAL}
functionType: pf
hotplugFunction: false
Run the following command to deploy the SNAP block devices:
Jump Node Console
$ cat manifests/05.1-storage-test-nvme/*.yaml | envsubst |kubectl apply -f -
dpuvolumeattachment.storage.dpu.nvidia.com/test-volume-attachment-static-pf-mt2334xz09f0 created
dpuvolumeattachment.storage.dpu.nvidia.com/test-volume-attachment-static-pf-mt2334xz09f1 created
dpuvolume.storage.dpu.nvidia.com/test-volume-static-pf-mt2334xz09f0 created
dpuvolume.storage.dpu.nvidia.com/test-volume-static-pf-mt2334xz09f1 created
Check deployment:
Jump Node Console
$ kubectl get dpuvolume -A
NAMESPACE NAME DPUSTORAGEPOLICYNAME VOLUMEMODE SIZE READY AGE
dpf-operator-system test-volume-static-pf-mt2334xz09f0 policy-block Block 60Gi True 16s
dpf-operator-system test-volume-static-pf-mt2334xz09f1 policy-block Block 60Gi True 16s
$ kubectl get dpuvolumeattachments -A
NAMESPACE NAME DPUVOLUMENAME DPUNODENAME FUNCTIONTYPE HOTPLUGFUNCTION READY AGE
dpf-operator-system test-volume-attachment-static-pf-mt2334xz09f0 test-volume-static-pf-mt2334xz09f0 dpu-node-mt2334xz09f0 pf false True 25s
dpf-operator-system test-volume-attachment-static-pf-mt2334xz09f1 test-volume-static-pf-mt2334xz09f1 dpu-node-mt2334xz09f1 pf false True 25s
SNAP block device deployed successfully.
Bare-Metal Server Customization
Using the SNAP block device as the boot block device requires server BIOS customization steps that depend on the server manufacturer.
On our server, the "NVMe controller and Drive information" screen looks like this:
Install the OS from a virtual ISO (in our case, Rocky Linux).
A completed Rocky Linux OS installation on our server in UEFI BIOS mode should look like this:
From inside the installed OS, it looks like this:
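For reference, the emulated drive can also be inspected from the command line inside the installed OS. This is a minimal sketch; device naming and the reported model string depend on the setup, and nvme-cli may need to be installed first:

Worker Node Console

$ lsblk -d -o NAME,SIZE,TYPE,MODEL
$ nvme list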
Congratulations! The SNAP NVMe drive has been successfully configured and is now ready for use.
Over the past few years, Vitaliy Razinkov has been working as a Solutions Architect on the NVIDIA Networking team, responsible for complex Kubernetes/OpenShift and Microsoft's leading solutions, research and design. He previously spent more than 25 years in senior positions at several companies. Vitaliy has written several reference design guides on Microsoft technologies, RoCE/RDMA accelerated machine learning in Kubernetes/OpenShift, and container solutions, all of which are available on the NVIDIA Networking Documentation website.
This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality. NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice. Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete. NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.