The HBN service is available on NGC, NVIDIA's container catalog. For information about deploying DOCA containers on BlueField, refer to the NVIDIA DOCA Container Deployment Guide.

Pull the latest DOCA container resource as a *.zip file from NGC and extract it to the <resource> folder. The HBN resource is available on NGC.

The HBN script ( hbn-dpu-setup.sh ) performs the following steps on BlueField Arm which are required for HBN service to run:

Sets up interface MTU if needed.

Sets up mount points between BlueField Arm and HBN container for logs and configuration persistency.

Sets up various paths as needed by supervisord and other services inside container.

Enables the REST API access if needed.

Creates or updates credentials.

The script is located in <resource>/scripts/doca_hbn/<hbn_version>/ folder, which is downloaded as part of the DOCA Container Resource.

Note To achieve the desired configuration on HBN's first boot, users can update the default NVUE or flat-file (network interfaces and FRR) configuration files before running the preparation script. These files are located in <resource>/scripts/doca_hbn/<hbn_version>/ .

For NVUE-based configuration: etc/nvue.d/startup.yaml

For flat-files based configuration: etc/network/interfaces , etc/frr/frr.conf , and etc/frr/daemons



Run the following commands to execute the hbn-dpu-setup.sh script:

    cd <resource>/scripts/doca_hbn/2.4.0/
    chmod +x hbn-dpu-setup.sh
    sudo ./hbn-dpu-setup.sh

The following is the help menu for the hbn-dpu-setup.sh script:

    ./hbn-dpu-setup.sh -h
    usage: hbn-dpu-setup.sh
    hbn-dpu-setup.sh -m|--mtu <MTU>                 Use <MTU> bytes for all HBN interfaces (default 9216)
    hbn-dpu-setup.sh -u|--username <username>       User creation
    hbn-dpu-setup.sh -p|--password <password>       Password for --username <username>
    hbn-dpu-setup.sh -e|--enable-rest-api-access    Enable REST API from external IPs
    hbn-dpu-setup.sh -h|--help

To enable the REST API access:

Change the default password for the nvidia username:

    ./hbn-dpu-setup.sh -u nvidia -p <new-password>

Enable REST API:

    ./hbn-dpu-setup.sh --enable-rest-api-access

Perform BlueField system-level reset.

The HBN container .yaml configuration file is called doca_hbn.yaml and is located in the <resource>/configs/<doca_version>/ directory. To spawn the HBN container, copy the doca_hbn.yaml file to the /etc/kubelet.d directory:

    cd <resource>/configs/2.9.0/
    sudo cp doca_hbn.yaml /etc/kubelet.d/

Kubelet automatically pulls the container image from NGC and spawns a pod executing the container. The DOCA HBN Service starts executing right away.

To inspect the HBN container and verify if it is running correctly:

Check HBN pod and container status and logs:

Examine the currently active pods and their IDs (it may take up to 20 seconds for the pod to start):

    sudo crictl pods

View currently active containers and their IDs:

    sudo crictl ps

Examine logs of a given container:

    sudo crictl logs <container-id>

Examine kubelet logs if something did not work as expected:

    sudo journalctl -u kubelet@mgmt

Log into the HBN container:

    sudo crictl exec -it $(sudo crictl ps | grep hbn | awk '{print $1;}') bash

While logged into the HBN container, verify that the frr , nl2doca , and neighmgr services are running:

    (hbn-container)$ supervisorctl status frr
    (hbn-container)$ supervisorctl status nl2doca
    (hbn-container)$ supervisorctl status neighmgr

Users may also examine various logs under /var/log inside the HBN container.
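The three per-service checks above can be folded into one pass. The following is a minimal sketch, not part of the HBN tooling: check_hbn_services is a hypothetical helper name, and it assumes the usual supervisorctl status layout in which the first column is the program name and the second is its state.

```shell
# Report every required HBN service that is not in the RUNNING state.
# Reads `supervisorctl status` output on stdin. The service names come from
# the verification step above; the function name is hypothetical.
check_hbn_services() {
    awk -v want="frr nl2doca neighmgr" '
        BEGIN { n = split(want, names, " ") }
        { state[$1] = $2 }                      # column 1: name, column 2: state
        END {
            bad = 0
            for (i = 1; i <= n; i++) {
                s = (names[i] in state) ? state[names[i]] : "MISSING"
                if (s != "RUNNING") { print names[i] ": " s; bad = 1 }
            }
            exit bad                            # non-zero if anything is not RUNNING
        }'
}
```

Inside the HBN container this would be used as `supervisorctl status | check_hbn_services`. The function prints nothing and returns 0 when all three services are running.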

The HBN service comes with four types of configurable interfaces:

Two uplinks ( p0_if , p1_if )

Two PF port representors ( pf0hpf_if , pf1hpf_if )

User-defined number of VFs (i.e., pf0vf0_if , pf0vf1_if , …, pf1vf0_if , pf1vf1_if , …)

DPU interfaces to connect to services running on BlueField, outside of the HBN container ( pf0dpu1_if and pf0dpu3_if )

The *_if suffix indicates that these are sub-functions and are different from the physical uplinks (i.e., PFs, VFs). They can be viewed as virtual interfaces from a virtualized BlueField.

Each of these interfaces is connected outside the HBN container to the corresponding physical interface. See section "Service Function Chaining" (SFC) for more details.

The HBN container runs as an isolated namespace and does not see any interfaces outside the container ( oob_net0 , real uplinks and PFs, *_if_r representors).

This is the default deployment model of HBN. In this model, only one OVS bridge is created.

The following is a sample bf.cfg and the resulting OVS and port configurations:

Sample bf.cfg :

    BR_HBN_UPLINKS="p0,p1"
    BR_HBN_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf12,pf1vf0-pf1vf1"
    BR_HBN_SFS="svc1,svc2"

Generated hbn.conf :

    [BR_HBN_UPLINKS]
    p0
    p1
    [BR_HBN_REPS]
    pf0hpf
    pf0vf0
    pf0vf1
    pf0vf2
    pf0vf3
    pf0vf4
    pf0vf5
    pf0vf6
    pf0vf7
    pf0vf8
    pf0vf9
    pf0vf10
    pf0vf11
    pf0vf12
    pf1hpf
    pf1vf0
    pf1vf1
    [BR_HBN_SFS]
    svc1
    svc2
    [BR_SFC_UPLINKS]
    [BR_SFC_REPS]
    [BR_SFC_SFS]
    [BR_HBN_SFC_PATCH_PORTS]
    [LINK_PROPAGATION]
    p0:p0_if_r
    p1:p1_if_r
    pf0hpf:pf0hpf_if_r
    pf0vf0:pf0vf0_if_r
    pf0vf1:pf0vf1_if_r
    pf0vf2:pf0vf2_if_r
    pf0vf3:pf0vf3_if_r
    pf0vf4:pf0vf4_if_r
    pf0vf5:pf0vf5_if_r
    pf0vf6:pf0vf6_if_r
    pf0vf7:pf0vf7_if_r
    pf0vf8:pf0vf8_if_r
    pf0vf9:pf0vf9_if_r
    pf0vf10:pf0vf10_if_r
    pf0vf11:pf0vf11_if_r
    pf0vf12:pf0vf12_if_r
    pf1hpf:pf1hpf_if_r
    pf1vf0:pf1vf0_if_r
    pf1vf1:pf1vf1_if_r
    svc1_r:svc1_if_r
    svc2_r:svc2_if_r
    [ENABLE_BR_SFC]
    [ENABLE_BR_SFC_DEFAULT_FLOWS]
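The sample shows that a range entry such as pf0vf0-pf0vf12 in bf.cfg becomes one name per VF in the generated hbn.conf. The sketch below only illustrates that mapping so a value can be previewed before provisioning. It is not the tool that generates hbn.conf, and expand_entry and expand_list are hypothetical helper names.

```shell
# Expand one bf.cfg list entry such as "pf0vf0-pf0vf12" into individual names.
# Entries without a range (e.g. "pf0hpf") are printed unchanged.
expand_entry() {
    case "$1" in
        *-*)
            first=${1%%-*}; last=${1##*-}
            # Strip the trailing digits to get the shared prefix, e.g. "pf0vf".
            prefix=$(printf '%s' "$first" | sed 's/[0-9]*$//')
            lo=${first#"$prefix"}; hi=${last#"$prefix"}
            i=$lo
            while [ "$i" -le "$hi" ]; do
                printf '%s%s\n' "$prefix" "$i"
                i=$((i + 1))
            done
            ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Expand a whole comma-separated value, one name per line, in input order.
expand_list() {
    old_ifs=$IFS; IFS=,
    for entry in $1; do expand_entry "$entry"; done
    IFS=$old_ifs
}
```

For example, `expand_list 'pf0hpf,pf1vf0-pf1vf1'` prints pf0hpf, pf1vf0, and pf1vf1 on separate lines. The sketch keeps the input order, whereas the generated file in the sample groups the names by PF.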

The following is a sample bf.cfg and the resulting OVS and port configurations:

Sample bf.cfg :

    BR_HBN_UPLINKS=""
    BR_SFC_UPLINKS="p0,p1"
    BR_HBN_REPS=""
    BR_SFC_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf1,pf1vf0-pf1vf1"
    BR_HBN_SFS=""
    BR_SFC_SFS=""
    BR_HBN_SFC_PATCH_PORTS="tss0"
    LINK_PROPAGATION="pf0hpf:tss0"
    ENABLE_BR_SFC=yes
    ENABLE_BR_SFC_DEFAULT_FLOWS=yes

Generated hbn.conf :

    [BR_HBN_UPLINKS]
    [BR_HBN_REPS]
    [BR_HBN_SFS]
    [BR_SFC_UPLINKS]
    p0
    p1
    [BR_SFC_REPS]
    pf0hpf
    pf0vf0
    pf0vf1
    pf1hpf
    pf1vf0
    pf1vf1
    [BR_SFC_SFS]
    [BR_HBN_SFC_PATCH_PORTS]
    tss0
    [LINK_PROPAGATION]
    pf0hpf:tss0
    p0:p0_if_r
    p1:p1_if_r
    pf0vf0:pf0vf0_if_r
    pf0vf1:pf0vf1_if_r
    pf1hpf:pf1hpf_if_r
    pf1vf0:pf1vf0_if_r
    pf1vf1:pf1vf1_if_r
    [ENABLE_BR_SFC]
    yes
    [ENABLE_BR_SFC_DEFAULT_FLOWS]
    yes

The following is a sample bf.cfg and the resulting OVS and port configurations:

Sample bf.cfg :

    BR_HBN_UPLINKS="p1"
    BR_SFC_UPLINKS="p0"
    BR_HBN_REPS="pf1hpf,pf0vf0"
    BR_SFC_REPS="pf0hpf,pf0vf1"
    BR_HBN_SFS="svc1,svc2"
    BR_SFC_SFS="ovn"
    BR_HBN_SFC_PATCH_PORTS="tss0"
    LINK_PROPAGATION="pf0hpf:tss0"
    ENABLE_BR_SFC=yes
    ENABLE_BR_SFC_DEFAULT_FLOWS=yes

Generated hbn.conf :

    [BR_HBN_UPLINKS]
    p1
    [BR_HBN_REPS]
    pf0vf0
    pf1hpf
    [BR_HBN_SFS]
    svc1
    svc2
    [BR_SFC_UPLINKS]
    p0
    [BR_SFC_REPS]
    pf0hpf
    pf0vf1
    [BR_SFC_SFS]
    ovn
    [BR_HBN_SFC_PATCH_PORTS]
    tss0
    [LINK_PROPAGATION]
    pf0hpf:tss0
    p1:p1_if_r
    p0:p0_if_r
    pf0vf0:pf0vf0_if_r
    pf0hpf:pf0hpf_if_r
    pf0vf1:pf0vf1_if_r
    pf1hpf:pf1hpf_if_r
    svc1_r:svc1_if_r
    svc2_r:svc2_if_r
    ovn_r:ovn_if_r
    [ENABLE_BR_SFC]
    yes
    [ENABLE_BR_SFC_DEFAULT_FLOWS]
    yes
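The [LINK_PROPAGATION] sections in the samples follow a visible naming pattern: uplinks and representors map to <name>_if_r, and service SFs map from <name>_r to <name>_if_r. The sketch below reproduces only that pattern, as inferred from the samples. default_link_pair is a hypothetical helper, and explicit entries such as pf0hpf:tss0 are not modeled.

```shell
# Print the default LINK_PROPAGATION pair for a port, following the pattern
# seen in the sample hbn.conf files (an inference, not a documented rule).
#   kind "port" (uplink, PF or VF representor): <name>:<name>_if_r
#   kind "sf"   (service SF such as svc1):      <name>_r:<name>_if_r
default_link_pair() {
    case "$1" in
        port) printf '%s:%s_if_r\n' "$2" "$2" ;;
        sf)   printf '%s_r:%s_if_r\n' "$2" "$2" ;;
        *)    echo "unknown kind: $1" >&2; return 1 ;;
    esac
}
```

For example, `default_link_pair port p0` prints p0:p0_if_r and `default_link_pair sf ovn` prints ovn_r:ovn_if_r, matching the entries in the last sample.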

When HBN is deployed with SFC, the interface state of the following network devices is propagated to their corresponding SFs:

Uplinks – p0 , p1

PFs – pf0hpf , pf1hpf

VFs – pf0vfX , pf1vfX where X is the VF number

For example, if the p0 uplink cable gets disconnected:

p0 transitions to DOWN state with NO-CARRIER (default behavior on Linux); and

p0 state is propagated to p0_if whose state also becomes DOWN with NO-CARRIER

After p0 connection is reestablished:

p0 transitions to UP state; and

p0 state is propagated to p0_if whose state becomes UP

Interface state propagation only happens in the uplink/PF/VF-to-SF direction.

A daemon called sfc-state-propagation runs on BlueField, outside of the HBN container, to sync the state. The daemon listens to netlink notifications for interfaces and transfers the state to SFs.

In the HBN container, the MTU of all interfaces is set to 9216 by default. The MTU of specific interfaces can be overridden using flat-files configuration or NVUE.

On the BlueField side (i.e., outside of the HBN container), the MTU of the uplink, PF, and VF interfaces is also set to 9216. This can be changed by modifying /etc/systemd/network/30-hbn-mtu.network or by adding a new configuration file for specific interfaces in the /etc/systemd/network directory.

To reload this configuration, run:

    systemctl restart systemd-networkd
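As an illustration of a per-interface configuration file, the sketch below writes a small systemd-networkd .network file that sets the MTU of one interface through the MTUBytes= option of the [Link] section. The helper name write_mtu_override and the file name 20-<ifname>-mtu.network are assumptions, not part of HBN. systemd-networkd applies only the first .network file, in lexical order, that matches an interface, so check how the shipped 30-hbn-mtu.network matches interfaces before choosing a prefix.

```shell
# Write a systemd-networkd .network file that sets the MTU of one interface.
#   $1 = target directory, $2 = interface name, $3 = MTU in bytes
# The file name pattern is a hypothetical choice for this sketch.
write_mtu_override() {
    cat > "$1/20-$2-mtu.network" <<EOF
[Match]
Name=$2

[Link]
MTUBytes=$3
EOF
}
```

Run as root, `write_mtu_override /etc/systemd/network p0 1500` followed by the restart command above would apply the change. Writing to a scratch directory first makes it easy to review the file before installing it.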





Various SF ports (named pf0dpuX_if , where X is in [0..n]) are available on the BlueField Arm. They can be used to run any service on BlueField, with HBN providing network connectivity. These ports can be named flexibly after the service. For example, to support an OVN service, an interface named ovn can be created for use by the OVN service running on the BlueField Arm, and it gets a corresponding HBN port named ovn_if . These interfaces are created using either BR_SFC_SFS or BR_HBN_SFS , depending on which bridge needs the service interface and on the mode of service deployment.

Traffic between BlueField and the outside world is hardware-accelerated when the HBN side port is an L3 interface or access-port using switch virtual interface (SVI). So, it is treated the same way as PF or VF ports from a traffic handling standpoint.

Info There are 2 SF port pairs created by default on the BlueField Arm side, so 2 separate DOCA services can run at the same time.





The uplink ports must always be kept administratively up for proper operation of HBN. Otherwise, the NVIDIA® ConnectX® firmware brings down the corresponding representor port, which causes data forwarding to stop.

Note A change in the operational status of an uplink (e.g., carrier down) results in traffic being switched to the other uplink.

When using ECMP failover on the two uplink SFs, locally disabling one uplink does not result in traffic switching to the second uplink. Locally disabling a link here means setting one uplink admin DOWN directly on BlueField.

To test ECMP failover scenarios correctly, the uplink must be disabled from its remote counterpart (i.e., execute admin DOWN on the remote system's link which is connected to the uplink).

The preconfigured default user credentials are as follows:

Username    nvidia
Password    nvidia

NVUE user credentials can be added post installation:

This can be done by specifying additional --username and --password options to the HBN startup script (refer to "Running HBN Preparation Script"). For example:

    sudo ./hbn-dpu-setup.sh -u newuser -p newpassword

After executing this script, respawn the container or start the decrypt-user-add script inside the running HBN container:

    supervisorctl start decrypt-user-add
    decrypt-user-add: started

The script creates a new user in the HBN container:

    cat /etc/passwd | grep newuser
    newuser:x:1001:1001::/home/newuser:/bin/bash

Interface                         Interface Type        NVUE Type
p0_if                             Uplink representor    swp
p1_if                             Uplink representor    swp
lo                                Loopback              loopback
pf0hpf_if                         Host representor      swp
pf1hpf_if                         Host representor      swp
pf0vfx_if (where x is 0-255)      VF representor        swp
pf1vfx_if (where x is 0-255)      VF representor        swp

The following directories are mounted from BlueField Arm to the HBN container namespace and are persistent across HBN service restarts and BlueField reboots:

BlueField Arm Mount Point               HBN Container Mount Point

Configuration file mount points
/var/lib/hbn/etc/network/               /etc/network/
/var/lib/hbn/etc/frr/                   /etc/frr/
/var/lib/hbn/etc/nvue.d/                /etc/nvue.d/
/var/lib/hbn/etc/supervisor/conf.d/     /etc/supervisor/conf.d/
/var/lib/hbn/var/lib/nvue/              /var/lib/nvue/

Support and log file mount points
/var/lib/hbn/var/support/               /var/support/
/var/log/doca/hbn/                      /var/log/hbn/
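When editing or collecting files from the BlueField Arm side, it helps to translate a path seen inside the container into its persistent location outside it. The sketch below encodes the mount table above. hbn_host_path is a hypothetical helper name, and the file names in the usage line are placeholders.

```shell
# Map a path inside the HBN container to its location on the BlueField Arm,
# using the mount table above. Returns 1 (and prints nothing) for paths that
# are not under one of the persistent mounts.
hbn_host_path() {
    case "$1" in
        /etc/network/*|/etc/frr/*|/etc/nvue.d/*|/etc/supervisor/conf.d/*|/var/lib/nvue/*|/var/support/*)
            # These mounts keep the container path under the /var/lib/hbn prefix.
            printf '/var/lib/hbn%s\n' "$1" ;;
        /var/log/hbn/*)
            # Logs are the exception: /var/log/hbn maps to /var/log/doca/hbn.
            printf '/var/log/doca/hbn/%s\n' "${1#/var/log/hbn/}" ;;
        *) return 1 ;;
    esac
}
```

For example, `hbn_host_path /etc/frr/frr.conf` prints /var/lib/hbn/etc/frr/frr.conf, and `hbn_host_path /var/log/hbn/example.log` prints /var/log/doca/hbn/example.log.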

The first step to use SR-IOV is to create Virtual Functions (VFs) on the host server.

VFs can be created using the following command:

    echo N | sudo tee /sys/class/net/<host-rep>/device/sriov_numvfs

Where:

<host-rep> is one of the two host representors (e.g., ens1f0 or ens1f1 )

N is the desired total number of VFs, where 0 ≤ N ≤ 16

Set N = 0 to delete all the VFs

N = 16 is the maximum number of VFs supported on HBN across all representors
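A small wrapper can enforce the 0 to 16 limit before touching sysfs. This is a minimal sketch: set_numvfs and the SYSFS_ROOT variable are hypothetical, the latter existing only so the function can be tried against a scratch directory. The sketch checks the value for a single representor and does not track the total across both.

```shell
# Validate N against the 0..16 limit stated above, then write it to the
# sriov_numvfs file of the given host representor.
#   $1 = host representor (e.g. ens1f0), $2 = N
# SYSFS_ROOT defaults to /sys; override it only for dry runs.
set_numvfs() {
    rep=$1; n=$2; root=${SYSFS_ROOT:-/sys}
    case "$n" in
        ''|*[!0-9]*) echo "N must be a non-negative integer" >&2; return 1 ;;
    esac
    if [ "$n" -gt 16 ]; then
        echo "N must be between 0 and 16" >&2; return 1
    fi
    printf '%s\n' "$n" > "$root/class/net/$rep/device/sriov_numvfs"
}
```

On the host server, run as root, `set_numvfs ens1f0 4` would request four VFs on the first representor, and `set_numvfs ens1f0 0` would delete them.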



VFs created on the host must have corresponding VF representor devices and SF devices for HBN on BlueField side. For example:

ens1f0vf0 is the first SR-IOV VF device from the first host representor; this interface is created on the host server

pf0vf0 is the corresponding VF representor device to ens1f0vf0 ; this device is present on the BlueField Arm side and automatically created at the same time as ens1f0vf0 is created by the user on the host side

pf0vf0_if is the corresponding SF device for pf0vf0 which is used to connect the VF to HBN pipeline

The SF devices for VFs are created ahead of time, when provisioning the BlueField and installing the DOCA image on it. See section "Enabling SFC" for how to select the number of SFs to create ahead of time.

The SF devices for VFs (i.e., pfXvfY_if ) are pre-mapped to work with the corresponding VF representors when these are created with the command from the previous step.

Two management VRFs are automatically configured for HBN when BlueField is deployed with SFC:

The first management VRF is outside the HBN container on BlueField. This VRF provides separation between out-of-band (OOB) traffic (via oob_net0 or tmfifo_net0 ) and data-plane traffic via uplinks and PFs.

The second management VRF is inside the HBN container and provides similar separation. The OOB traffic (via eth0 ) is isolated from the traffic via the *_if interfaces.

The management (mgmt) VRF is enabled by default when the BlueField is deployed with SFC (see section "Enabling SFC"). The mgmt VRF provides separation between the OOB management network and the in-band data plane network.

The uplinks and PFs/VFs use the default routing table while the oob_net0 (OOB Ethernet port) and the tmfifo_net0 netdevices use the mgmt VRF to route their packets.

When logging in either via SSH or the console, the shell is by default in mgmt VRF context. This is indicated by a mgmt added to the shell prompt:

    root@bf2:mgmt:/home/ubuntu
    root@bf2:mgmt:/home/ubuntu
    mgmt.

When logging into the HBN container with crictl , the HBN shell is in the default VRF. Users must switch to the mgmt VRF manually if OOB access is required. Use ip vrf exec to do so.

    root@bf2:mgmt:/home/ubuntu

The user must run ip vrf exec mgmt to perform operations requiring OOB access (e.g., apt-get update).

Network devices belonging to the mgmt VRF can be listed with the vrf utility:

    root@bf2:mgmt:/home/ubuntu
    VRF: mgmt
    --------------------
    tmfifo_net0 UP 00:1a:ca:ff:ff:03 <BROADCAST,MULTICAST,UP,LOWER_UP>
    oob_net0    UP 08:c0:eb:c0:5a:32 <BROADCAST,MULTICAST,UP,LOWER_UP>

    root@bf2:mgmt:/home/ubuntu
    vrf <OPTS>

    VRF domains:
      vrf list

    Links associated with VRF domains:
      vrf link list [<vrf-name>]

    Tasks and VRF domain asociation:
      vrf task exec <vrf-name> <command>
      vrf task list [<vrf-name>]
      vrf task identify <pid>

    NOTE: This command affects only AF_INET and AF_INET6 sockets opened by the
          command that gets exec'ed. Specifically, it has *no* impact on netlink
          sockets (e.g., ip command).

To show the routing table for the default VRF, run:

    root@bf2:mgmt:/home/ubuntu

To show the routing table for the mgmt VRF, run:

    root@bf2:mgmt:/home/ubuntu





Inside the HBN container, a separate mgmt VRF is present. Similar commands as those listed under section "MGMT VRF on BlueField Arm" can be used to query management routes.

The *_if interfaces use the default routing table while the eth0 (OOB) uses the mgmt VRF to route out-of-band packets out of the container. The OOB traffic gets NATed through the oob_net0 interface on BlueField Arm, ultimately using the BlueField OOB's IP address.

When logging into the HBN container via crictl , the shell enters the default VRF context by default. Switching to the mgmt VRF can be done using the command ip vrf exec mgmt <cmd> .
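Because every OOB operation needs the ip vrf exec mgmt prefix, a thin wrapper can save typing. This is a minimal sketch: in_mgmt_vrf and the DRY_RUN variable are hypothetical names, and the dry-run mode only prints the command line so the wrapper can be tried on a machine without a mgmt VRF.

```shell
# Run a command in the mgmt VRF so that it can reach the OOB network.
# With DRY_RUN=1 the resulting command line is printed instead of executed.
in_mgmt_vrf() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "ip vrf exec mgmt $*"
    else
        ip vrf exec mgmt "$@"
    fi
}
```

For example, `in_mgmt_vrf apt-get update` runs the package index update through the mgmt VRF, which corresponds to the `ip vrf exec mgmt <cmd>` form described above.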

On the BlueField Arm, outside the HBN container, a set of existing services run in the mgmt VRF context as they need OOB network access:

containerd

kubelet

ssh

docker

These services can be restarted and queried for their status using the command systemctl while adding @mgmt to the original service name. For example:

To restart containerd:

    root@bf2:mgmt:/home/ubuntu# systemctl restart containerd@mgmt

To query containerd status:

    root@bf2:mgmt:/home/ubuntu# systemctl status containerd@mgmt

Note The original versions of these services (without @mgmt ) are not used and must not be started.
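The @mgmt naming rule can be applied mechanically to the whole service list. The sketch below only prints the status commands rather than running them. mgmt_unit is a hypothetical helper, and the unit names are derived purely from the rule described above.

```shell
# Print the systemd unit name of the mgmt-VRF instance of a service.
mgmt_unit() { printf '%s@mgmt\n' "$1"; }

# Print (not run) the status command for each service listed above.
for svc in containerd kubelet ssh docker; do
    echo "systemctl status $(mgmt_unit "$svc")"
done
```

Piping the printed lines to a root shell on the BlueField Arm would query all four services in one go.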





If a service needs OOB access to run, it can be added to the set of services running in mgmt VRF context. Adding such a service is only possible on the BlueField Arm (i.e., outside the HBN container).

