
HBN Service Deployment

Info

Refer to the "HBN Service Release Notes" page for information on the specific hardware and software requirements for HBN.

The following subsections describe specific prerequisites for the BlueField before deploying the DOCA HBN Service.

Enabling BlueField DPU Mode

HBN requires BlueField to work in either DPU mode or zero-trust mode of operation. Information about configuring BlueField modes of operation can be found under "NVIDIA BlueField Modes of Operation".

Enabling SFC

HBN requires SFC configuration to be activated on the BlueField before running the HBN service container. SFC allows for additional services/containers to be chained to HBN and provides additional data manipulation capabilities. SFC can be configured in 3 modes:

  1. HBN-only mode – In this mode, one OVS bridge, br-hbn, is created and all HBN-specific ports are added to it. This is the default mode of operation. It is configured by setting ENABLE_BR_HBN=yes in bf.cfg and leaving ENABLE_BR_SFC at its default.

  2. Dual bridge mode – In this mode, two OVS bridges are created, br-hbn and br-sfc. All HBN-specific ports are added to the br-sfc bridge and patched into the br-hbn bridge. br-sfc can be used to add custom steering flows to direct traffic across different ports in the bridge. In this mode, both ENABLE_BR_SFC and ENABLE_BR_HBN are set to yes, the BR_HBN_XXX parameters are not set, and all ports are under the BR_SFC_XXX variables.

  3. Mixed mode – This is similar to the dual bridge mode, except that ports can be assigned to either bridge (i.e., some ports in br-hbn and some in br-sfc). In this mode, ports are split between the BR_SFC_XXX and BR_HBN_XXX variables.

    The br-sfc bridge allows defining deployment-specific rules before or after the HBN pipeline. Users can add OpenFlow rules directly to the br-sfc bridge. If ENABLE_BR_SFC_DEFAULT_FLOWS is set to yes, make sure user rules are inserted at a higher priority than the default flows so that they take effect; for example:
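
    The following is an illustrative sketch only, using standard OVS tooling; the match fields, priority value, and port names are placeholders rather than values mandated by HBN:

    ovs-ofctl add-flow br-sfc "table=0,priority=1000,in_port=p0,ip,nw_dst=192.0.2.0/24,actions=output:pf0hpf"
    ovs-ofctl dump-flows br-sfc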

The following describes the bf.cfg parameters used to configure these modes, as well as the parameters that assign ports to the various bridges:

ENABLE_BR_HBN
  Description: Setting this parameter to yes enables the br-hbn bridge.
  Note: This setting is necessary to work with HBN.
  Mandatory: Yes
  Default: no
  Example: ENABLE_BR_HBN=yes

ENABLE_BR_SFC
  Description: Setting this parameter to yes enables the br-sfc bridge.
  Info: This is only needed when the second OVS bridge is required for custom steering flows.
  Mandatory: No
  Default: no
  Example: ENABLE_BR_SFC=no

BR_HBN_UPLINKS
  Description: Uplinks added to br-hbn directly.
  Mandatory: No
  Default: p0,p1
  Example: BR_HBN_UPLINKS="p0,p1"

BR_SFC_UPLINKS
  Description: Uplinks added to br-sfc directly.
  Mandatory: No
  Default: ""
  Example: BR_SFC_UPLINKS=""

BR_HBN_REPS
  Description: PFs and VFs added to br-hbn directly.
  Mandatory: No
  Default: ""
  Example: BR_HBN_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf12,pf1vf0-pf1vf4"

BR_SFC_REPS
  Description: PFs and VFs added to br-sfc directly.
  Mandatory: No
  Default: ""
  Example: BR_SFC_REPS=""

BR_HBN_SFS
  Description: DPU ports added to br-hbn directly. These ports are mostly service ports present on the DPU which require using HBN network offload services.
  Mandatory: No
  Default: ""
  Example: BR_HBN_SFS=svc1,svc2

BR_SFC_SFS
  Description: DPU ports added to br-sfc directly.
  Mandatory: No
  Default: ""
  Example: BR_SFC_SFS=svc1,svc2

BR_HBN_SFC_PATCH_PORTS
  Description: Patch ports added to br-sfc. These are general-purpose ports meant for muxing or demuxing of traffic across various PF/VF ports.
  Mandatory: No
  Example: BR_HBN_SFC_PATCH_PORTS=patch1

LINK_PROPAGATION
  Description: Mapping of how link propagation should work. If nothing is provided, each uplink/PF/VF port reflects its status in its corresponding HBN port. For example, the status of p0 is reflected in p0_if.
  Mandatory: No
  Default: Uplink/PF/VF to the corresponding HBN port
  Example: LINK_PROPAGATION=""

ENABLE_BR_SFC_DEFAULT_FLOWS
  Description: Provides default connectivity in the br-sfc bridge so that each port can send traffic to its corresponding output port.
  Mandatory: No
  Default: no
  Example: ENABLE_BR_SFC_DEFAULT_FLOWS=yes

Info

More detail about port connectivity in each mode is provided in section "HBN Deployment Configuration".

The following subsections provide additional information about SFC and instructions on enabling it during BlueField DOCA image installation.

Deploying BlueField DOCA Image with SFC from Host

For DOCA image installation on BlueField, follow the instructions under the NVIDIA DOCA Installation Guide for Linux, with the following extra notes to enable BlueField for the HBN setup:

  1. Make sure link type is set to ETH under the "Installing Software on Host" section.

  2. Add the following parameters to the bf.cfg configuration file:

    1. This configuration example is relevant for "HBN-only mode". Set the appropriate variables and values depending on your deployment model.

    2. Enable HBN specific OVS bridge on BlueField Arm by setting ENABLE_BR_HBN=yes.

    3. Define the uplink ports to be used by HBN: BR_HBN_UPLINKS='<port>'.

      Note

      Must include both ports (i.e., p0,p1) for dual-port BlueField devices and only p0 for single-port BlueField devices.

    4. Include PF and VF ports to be used by HBN. The following example sets both PFs and 8 VFs on each uplink: BR_HBN_REPS='pf0hpf,pf1hpf,pf0vf0-pf0vf7,pf1vf0-pf1vf7'.

    5. (Optional) Include SF devices to be created and connected to the HBN bridge on the BlueField Arm side by setting BR_HBN_SFS='pf0dpu1,pf0dpu3'.

      Info

      If nothing is provided, pf0dpu1 and pf0dpu3 are created by default.

      Warning

      While older formats of bf.cfg still work in this release, they will be deprecated over the next two releases. It is therefore advisable to move to the new format to avoid upgrade issues in future releases. The following is an example of the old bf.cfg format:

      ENABLE_SFC_HBN=yes
      NUM_VFs_PHYS_PORT0=12   # <num VFs supported by HBN on Physical Port 0> (valid range: 0-127) Default 14
      NUM_VFs_PHYS_PORT1=2    # <num VFs supported by HBN on Physical Port 1> (valid range: 0-127) Default 0

  3. Then run:


    bfb-install -c bf.cfg -r rshim0 -b <BFB-image>

  4. Once SFC deployment is done, it creates three files:

    • /etc/mellanox/hbn.conf – this file can be used to modify the interface mapping and redeploy SFC without going through bf.cfg again

    • /etc/mellanox/sfc.conf – this file provides a view of how various ports are connected in different bridges

    • /etc/mellanox/mlnx-sf.conf – this file includes all the HBN ports to be created and corresponding commands to create the port

Deploying BlueField DOCA Image with SFC Using PXE Boot

To enable HBN SFC using a PXE installation environment with BFB content, use the following configuration for PXE:

bfnet=<IFNAME>:<IPADDR>:<NETMASK> or <IFNAME>:dhcp
bfks=<URL of the kickstart script>

The kickstart script (bash) should include the following lines:

cat >> /etc/bf.cfg << EOF
ENABLE_BR_HBN=yes
BR_HBN_UPLINKS='p0,p1'
BR_HBN_REPS='pf0hpf,pf1hpf,pf0vf0-pf0vf7,pf1vf0-pf1vf7'
BR_HBN_SFS='pf0dpu1,pf0dpu3'
EOF

The /etc/bf.cfg generated above is sourced by the BFB install.sh script.

Note

It is recommended to verify the accuracy of the BlueField's clock post-installation. This can be done using the following command:


$ date

Please refer to the known issues listed in the "NVIDIA DOCA Release Notes" for more information.


Redeploying SFC from BlueField

SFC can be redeployed from BlueField after the DPU has already been deployed using bf.cfg, when either the port mapping or the bridge configuration needs to be changed.

To redeploy SFC from BlueField:

  1. Edit /etc/mellanox/hbn.conf by adding or removing entries in each segment as necessary.

  2. Rerun the SFC install script:


    /opt/mellanox/sfc-hbn/install.sh -c -r

    This generates a new set of sfc.conf and mlnx-sf.conf files and reloads the DPU.

    Configuration and reload can be split into two steps by removing the -r option and rebooting BlueField after the configuration.

After the BlueField reloads, the command ovs-vsctl show should show all the new ports and bridges configured in OVS.
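
To narrow the check to the HBN bridge itself, the standard OVS CLI can also list its ports; this is a generic OVS command, not an HBN-specific tool:

ovs-vsctl list-ports br-hbn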

Deploying HBN with Other Services

When the HBN container is deployed by itself, BlueField Arm is configured with 3k hugepages. If HBN is deployed with other services, the number of hugepages must be adjusted based on the requirements of those services. For example, SNAP or NVMesh may need approximately 1k to 5k hugepages. So, if HBN is running with either of these services on the same BlueField, the total number of hugepages must be set to the sum of the hugepage requirements of all the services.

For example, if NVMesh needs 3k hugepages, 6k total hugepages must be set when running with HBN. To do that, add the following parameter to the bf.cfg configuration file alongside other desired parameters:


HUGEPAGE_COUNT=6144

Warning

This should be performed only on a BlueField-3 running with 32G of memory. Doing this on a 16G system may cause memory issues for various applications on BlueField Arm.

Also, HBN with other services is qualified only for 16 VFs.
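
After the image is installed and BlueField Arm is up, the allocated hugepages can be confirmed with a generic Linux check (the expected value depends on the HUGEPAGE_COUNT that was set):

grep HugePages_Total /proc/meminfo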

HBN Service Container Deployment

The HBN service is available on NGC, NVIDIA's container catalog. For information about the deployment of DOCA containers on top of BlueField, refer to the NVIDIA DOCA Container Deployment Guide.

Downloading DOCA Container Resource File

Pull the latest DOCA container resource as a *.zip file from NGC and extract it to the <resource> folder (doca_container_configs_2.7.0v1 in this example):

wget https://api.ngc.nvidia.com/v2/resources/nvidia/doca/doca_container_configs/versions/2.7.0v1/zip -O doca_container_configs_2.7.0v1.zip
unzip -o doca_container_configs_2.7.0v1.zip -d doca_container_configs_2.7.0v1


Running HBN Preparation Script

The HBN script (hbn-dpu-setup.sh) performs the following steps on BlueField Arm, which are required for the HBN service to run:

  1. Sets the BlueField to DPU mode if needed.

  2. Enables IPv4/IPv6 kernel forwarding.

  3. Sets up interface MTU if needed.

  4. Sets up mount points between BlueField Arm and HBN container for logs and configuration persistency.

  5. Sets up various paths as needed by supervisord and other services inside the container.

  6. Enables the REST API access if needed.

  7. Creates or updates user credentials.

The script is located in <resource>/scripts/doca_hbn/<hbn_version>/ folder, which is downloaded as part of the DOCA Container Resource.

Note

To achieve the desired configuration on HBN's first boot, users can update the default NVUE or flat (network interfaces and FRR) configuration files before running the preparation script. These files are located in <resource>/scripts/doca_hbn/<hbn_version>/. A minimal illustrative flat-files fragment follows the list below.

  • For NVUE-based configuration:

    • etc/nvue.d/startup.yaml

  • For flat-files based configuration:

    • etc/network/interfaces

    • etc/frr/frr.conf

    • etc/frr/daemons
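
The following is a minimal sketch of a flat-files interface definition, assuming the ifupdown2 syntax used by the etc/network/interfaces file; the interface name and address are placeholders only:

auto p0_if
iface p0_if
    address 192.0.2.1/31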

Run the following commands to execute the hbn-dpu-setup.sh script:

cd <resource>/scripts/doca_hbn/2.3.0/
chmod +x hbn-dpu-setup.sh
sudo ./hbn-dpu-setup.sh

The following is the help menu for the hbn-dpu-setup.sh script:

./hbn-dpu-setup.sh -h
usage: hbn-dpu-setup.sh
    hbn-dpu-setup.sh -m|--mtu <MTU>                  Use <MTU> bytes for all HBN interfaces (default 9216)
    hbn-dpu-setup.sh -u|--username <username>        User creation
    hbn-dpu-setup.sh -p|--password <password>        Password for --username <username>
    hbn-dpu-setup.sh -e|--enable-rest-api-access     Enable REST API from external IPs
    hbn-dpu-setup.sh -h|--help
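
For example, the options above can be combined to prepare the DPU with a non-default MTU and a new user (the values shown are illustrative):

sudo ./hbn-dpu-setup.sh --mtu 9000 --username newuser --password newpassword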

Enabling REST API Access

To enable the REST API access:

  1. Change the default password for the nvidia username:


    ./hbn-dpu-setup.sh -u nvidia -p <new-password>

  2. Enable REST API:


    ./hbn-dpu-setup.sh --enable-rest-api-access

  3. Perform a BlueField system-level reset.
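
Once the reset completes, the NVUE REST endpoint can be exercised remotely. The following is a sketch only; it assumes the common NVUE REST defaults of HTTPS on port 8765 with the /nvue_v1 base path and a self-signed certificate, which should be verified for your release:

curl -k -u nvidia:<new-password> https://<bluefield-oob-ip>:8765/nvue_v1/interface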

Spawning HBN Container

The HBN container's .yaml configuration file is called doca_hbn.yaml and is located in the <resource>/configs/<doca_version>/ directory. To spawn the HBN container, simply copy the doca_hbn.yaml file to the /etc/kubelet.d directory:

cd <resource>/configs/2.8.0/
sudo cp doca_hbn.yaml /etc/kubelet.d/

Kubelet automatically pulls the container image from NGC and spawns a pod executing the container. The DOCA HBN Service starts executing right away.

Verifying HBN Container is Running

To inspect the HBN container and verify if it is running correctly:

  1. Check HBN pod and container status and logs:

    1. Examine the currently active pods and their IDs (it may take up to 20 seconds for the pod to start):


      sudo crictl pods

    2. View currently active containers and their IDs:


      sudo crictl ps

    3. Examine logs of a given container:


      sudo crictl logs <container-ID>

    4. Examine kubelet logs if something did not work as expected:


      sudo journalctl -u kubelet@mgmt

  2. Log into the HBN container:


    sudo crictl exec -it $(crictl ps | grep hbn | awk '{print $1;}') bash

  3. While logged into the HBN container, verify that the frr, nl2doca, and neighmgr services are running:

    (hbn-container)$ supervisorctl status frr
    (hbn-container)$ supervisorctl status nl2doca
    (hbn-container)$ supervisorctl status neighmgr

  4. Users may also examine various logs under /var/log inside the HBN container.
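
Running supervisorctl status inside the container with no arguments is also an option; it lists every supervised service and its state in one output:

(hbn-container)$ supervisorctl status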

HBN Deployment Configuration

The HBN service comes with four types of configurable interfaces:

  • Two uplinks (p0_if, p1_if)

  • Two PF port representors (pf0hpf_if, pf1hpf_if)

  • User-defined number of VFs (i.e., pf0vf0_if, pf0vf1_if, …, pf1vf0_if, pf1vf1_if, …)

  • DPU interfaces to connect to services running on BlueField, outside of the HBN container (pf0dpu1_if and pf0dpu3_if)

The *_if suffix indicates that these are sub-functions, which are distinct from the physical interfaces (i.e., uplinks, PFs, and VFs). They can be viewed as virtual interfaces from a virtualized BlueField.

Each of these interfaces is connected outside the HBN container to the corresponding physical interface; see section "Service Function Chaining" (SFC) for more details.

The HBN container runs in an isolated network namespace and does not see any interfaces outside the container (oob_net0, real uplinks and PFs, *_if_r representors).

HBN-only Deployment Configuration

This is the default deployment model of HBN. In this model, only one OVS bridge is created.

The following is a sample bf.cfg and the resulting OVS and port configurations:

  • Sample bf.cfg:

    bf.cfg

    BR_HBN_UPLINKS="p0,p1"
    BR_HBN_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf12,pf1vf0-pf1vf1"
    BR_HBN_SFS="svc1,svc2"

  • Generated hbn.conf:

    Generated hbn.conf

    [BR_HBN_UPLINKS]
    p0
    p1

    [BR_HBN_REPS]
    pf0hpf
    pf0vf0
    pf0vf1
    pf0vf2
    pf0vf3
    pf0vf4
    pf0vf5
    pf0vf6
    pf0vf7
    pf0vf8
    pf0vf9
    pf0vf10
    pf0vf11
    pf0vf12
    pf1hpf
    pf1vf0
    pf1vf1

    [BR_HBN_SFS]
    svc1
    svc2

    [BR_SFC_UPLINKS]

    [BR_SFC_REPS]

    [BR_SFC_SFS]

    [BR_HBN_SFC_PATCH_PORTS]

    [LINK_PROPAGATION]
    p0:p0_if_r
    p1:p1_if_r
    pf0hpf:pf0hpf_if_r
    pf0vf0:pf0vf0_if_r
    pf0vf1:pf0vf1_if_r
    pf0vf2:pf0vf2_if_r
    pf0vf3:pf0vf3_if_r
    pf0vf4:pf0vf4_if_r
    pf0vf5:pf0vf5_if_r
    pf0vf6:pf0vf6_if_r
    pf0vf7:pf0vf7_if_r
    pf0vf8:pf0vf8_if_r
    pf0vf9:pf0vf9_if_r
    pf0vf10:pf0vf10_if_r
    pf0vf11:pf0vf11_if_r
    pf0vf12:pf0vf12_if_r
    pf1hpf:pf1hpf_if_r
    pf1vf0:pf1vf0_if_r
    pf1vf1:pf1vf1_if_r
    svc1_r:svc1_if_r
    svc2_r:svc2_if_r

    [ENABLE_BR_SFC]

    [ENABLE_BR_SFC_DEFAULT_FLOWS]

(Figure: HBN-only mode bridge and port layout)


Dual Bridge HBN Deployment Configuration

The following is a sample bf.cfg and the resulting OVS and port configurations:

  • Sample bf.cfg:

    bf.cfg

    BR_HBN_UPLINKS=""
    BR_SFC_UPLINKS="p0,p1"
    BR_HBN_REPS=""
    BR_SFC_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf1,pf1vf0-pf1vf1"
    BR_HBN_SFS=""
    BR_SFC_SFS=""
    BR_HBN_SFC_PATCH_PORTS="tss0"
    LINK_PROPAGATION="pf0hpf:tss0"
    ENABLE_BR_SFC=yes
    ENABLE_BR_SFC_DEFAULT_FLOWS=yes

  • Generated hbn.conf:

    Generated hbn.conf

    [BR_HBN_UPLINKS]

    [BR_HBN_REPS]

    [BR_HBN_SFS]

    [BR_SFC_UPLINKS]
    p0
    p1

    [BR_SFC_REPS]
    pf0hpf
    pf0vf0
    pf0vf1
    pf1hpf
    pf1vf0
    pf1vf1

    [BR_SFC_SFS]

    [BR_HBN_SFC_PATCH_PORTS]
    tss0

    [LINK_PROPAGATION]
    pf0hpf:tss0
    p0:p0_if_r
    p1:p1_if_r
    pf0vf0:pf0vf0_if_r
    pf0vf1:pf0vf1_if_r
    pf1hpf:pf1hpf_if_r
    pf1vf0:pf1vf0_if_r
    pf1vf1:pf1vf1_if_r

    [ENABLE_BR_SFC]
    yes

    [ENABLE_BR_SFC_DEFAULT_FLOWS]
    yes

(Figure: Dual bridge (SFC) mode bridge and port layout)


Mixed Mode HBN Deployment Configuration

The following is a sample bf.cfg and the resulting OVS and port configurations:

  • Sample bf.cfg:

    bf.cfg

    BR_HBN_UPLINKS="p1"
    BR_SFC_UPLINKS="p0"
    BR_HBN_REPS="pf1hpf,pf0vf0"
    BR_SFC_REPS="pf0hpf,pf0vf1"
    BR_HBN_SFS="svc1,svc2"
    BR_SFC_SFS="ovn"
    BR_HBN_SFC_PATCH_PORTS="tss0"
    LINK_PROPAGATION="pf0hpf:tss0"
    ENABLE_BR_SFC=yes
    ENABLE_BR_SFC_DEFAULT_FLOWS=yes

  • Generated hbn.conf:

    Generated hbn.conf

    [BR_HBN_UPLINKS]
    p1

    [BR_HBN_REPS]
    pf0vf0
    pf1hpf

    [BR_HBN_SFS]
    svc1
    svc2

    [BR_SFC_UPLINKS]
    p0

    [BR_SFC_REPS]
    pf0hpf
    pf0vf1

    [BR_SFC_SFS]
    ovn

    [BR_HBN_SFC_PATCH_PORTS]
    tss0

    [LINK_PROPAGATION]
    pf0hpf:tss0
    p1:p1_if_r
    p0:p0_if_r
    pf0vf0:pf0vf0_if_r
    pf0hpf:pf0hpf_if_r
    pf0vf1:pf0vf1_if_r
    pf1hpf:pf1hpf_if_r
    svc1_r:svc1_if_r
    svc2_r:svc2_if_r
    ovn_r:ovn_if_r

    [ENABLE_BR_SFC]
    yes

    [ENABLE_BR_SFC_DEFAULT_FLOWS]
    yes

(Figure: Mixed mode bridge and port layout)

HBN Deployment Considerations

SF Interface State Tracking

When HBN is deployed with SFC, the interface state of the following network devices is propagated to their corresponding SFs:

  • Uplinks – p0, p1

  • PFs – pf0hpf, pf1hpf

  • VFs – pf0vfX, pf1vfX where X is the VF number

For example, if the p0 uplink cable gets disconnected:

  • p0 transitions to DOWN state with NO-CARRIER (default behavior on Linux); and

  • p0 state is propagated to p0_if whose state also becomes DOWN with NO-CARRIER

After p0 connection is reestablished:

  • p0 transitions to UP state; and

  • p0 state is propagated to p0_if whose state becomes UP

Interface state propagation only happens in the uplink/PF/VF-to-SF direction.

A daemon called sfc-state-propagation runs on BlueField, outside of the HBN container, to sync the state. The daemon listens to netlink notifications for interfaces and transfers the state to SFs.
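
The propagation can be observed with standard iproute2 commands; the interface names below follow the p0/p0_if example above:

# On BlueField Arm (outside the HBN container): check the uplink's operational state
ip -br link show p0
# Inside the HBN container: the corresponding SF should mirror that state
ip -br link show p0_if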

SF Interface MTU

In the HBN container, the MTU of all interfaces is set to 9216 by default. The MTU of specific interfaces can be overridden using flat-files configuration or NVUE.

On the BlueField side (i.e., outside of the HBN container), the MTU of the uplink, PF, and VF interfaces is also set to 9216. This can be changed by modifying /etc/systemd/network/30-hbn-mtu.network or by adding a new configuration file under /etc/systemd/network for specific interfaces.

To reload this configuration, run:


systemctl restart systemd-networkd
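
For instance, a per-interface override might look like the following sketch; the file name, interface, and MTU value are illustrative, while the [Match] Name= and [Link] MTUBytes= keys are standard systemd.network options:

# /etc/systemd/network/40-custom-mtu.network (hypothetical file name)
[Match]
Name=pf0vf0

[Link]
MTUBytes=1500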


Connecting DOCA Services to HBN on BlueField Arm

There are various SF ports (named pf0dpuX_if, where X is [0..n]) on BlueField Arm which can be used to run services on BlueField while using HBN to provide network connectivity. These ports can have a flexible naming convention based on the service name. For example, to support the OVN service, an interface named ovn can be created for the OVN service running on BlueField Arm, with a corresponding HBN port named ovn_if. These interfaces are created using either BR_SFC_SFS or BR_HBN_SFS, depending on which bridge needs the service interface and on the mode of service deployment.

Traffic between BlueField and the outside world is hardware-accelerated when the HBN-side port is an L3 interface or an access port using a switch virtual interface (SVI). It is therefore treated the same way as PF or VF ports from a traffic-handling standpoint.

Info

Two SF port pairs are created by default on the BlueField Arm side, so two separate DOCA services can run at the same time.


The uplink ports must always be kept administratively up for proper operation of HBN. Otherwise, the NVIDIA® ConnectX® firmware brings down the corresponding representor port, which causes data forwarding to stop.

Note

A change in the operational status of an uplink (e.g., carrier down) results in traffic being switched to the other uplink.

When using ECMP failover on the two uplink SFs, locally disabling one uplink does not result in traffic switching to the second uplink. Disabling the link locally in this case means setting one uplink administratively DOWN directly on BlueField.

To test ECMP failover scenarios correctly, the uplink must be disabled from its remote counterpart (i.e., by executing admin DOWN on the remote system's link which is connected to the uplink).

HBN NVUE User Credentials

The preconfigured default user credentials are as follows:

Username: nvidia
Password: nvidia

NVUE user credentials can be added post installation:

  1. This can be done by passing additional --username and --password arguments to the HBN preparation script (refer to "Running HBN Preparation Script"). For example:


    sudo ./hbn-dpu-setup.sh -u newuser -p newpassword

  2. After executing this script, respawn the container or start the decrypt-user-add script inside the running HBN container:

    supervisorctl start decrypt-user-add
    decrypt-user-add: started

    The script creates a new user in the HBN container:

    cat /etc/passwd | grep newuser
    newuser:x:1001:1001::/home/newuser:/bin/bash

HBN NVUE Interface Classification

Interface                      Interface Type        NVUE Type
p0_if                          Uplink representor    swp
p1_if                          Uplink representor    swp
lo                             Loopback              loopback
pf0hpf_if                      Host representor      swp
pf1hpf_if                      Host representor      swp
pf0vfx_if (where x is 0-255)   VF representor        swp
pf1vfx_if (where x is 0-255)   VF representor        swp
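
As an example of how an swp-type interface is configured from NVUE, the following illustrative commands assign an address to an uplink SF and apply the change (the interface name and address are placeholders):

nv set interface p0_if ip address 192.0.2.1/31
nv config apply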


HBN Files Persistence

The following directories are mounted from BlueField Arm to the HBN container namespace and are persistent across HBN service restarts and BlueField reboots:

BlueField Arm Mount Point              HBN Container Mount Point

Configuration file mount points:
/var/lib/hbn/etc/network/              /etc/network/
/var/lib/hbn/etc/frr/                  /etc/frr/
/var/lib/hbn/etc/nvue.d/               /etc/nvue.d/
/var/lib/hbn/etc/supervisor/conf.d/    /etc/supervisor/conf.d/
/var/lib/hbn/var/lib/nvue/             /var/lib/nvue/

Support and log file mount points:
/var/lib/hbn/var/support/              /var/support/
/var/log/doca/hbn/                     /var/log/hbn/


SR-IOV Support in HBN

Creating SR-IOV VFs on Host

The first step to use SR-IOV is to create Virtual Functions (VFs) on the host server.

VFs can be created using the following command:


echo N | sudo tee /sys/class/net/<host-rep>/device/sriov_numvfs

Where:

  • <host-rep> is one of the two host representors (e.g., ens1f0 or ens1f1)

  • 0≤N≤16 is the desired total number of VFs:

    • Set N=0 to delete all the VFs

    • N=16 is the maximum number of VFs supported on HBN across all representors (a usage example follows this list)
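
For example, to create 8 VFs on the first host representor and confirm they exist (the interface name ens1f0 and the VF count are illustrative):

echo 8 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs
ip link show ens1f0   # the created VFs appear as "vf 0", "vf 1", ...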

Automatic Creation of VF Representors and SF Devices on BlueField

VFs created on the host must have corresponding VF representor devices and SF devices for HBN on BlueField side. For example:

  • ens1f0vf0 is the first SR-IOV VF device from the first host representor; this interface is created on the host server

  • pf0vf0 is the corresponding VF representor device to ens1f0vf0; this device is present on the BlueField Arm side and automatically created at the same time as ens1f0vf0 is created by the user on the host side

  • pf0vf0_if is the corresponding SF device for pf0vf0 which is used to connect the VF to HBN pipeline

The creation of the SF devices for VFs is done ahead of time when provisioning BlueField and installing the DOCA image on it. See section "Enabling SFC" for how to select how many SFs to create.

The SF devices for VFs (i.e., pfXvfY) are pre-mapped to work with the corresponding VF representors when these are created with the command from the previous step.

Management VRF

Two management VRFs are automatically configured for HBN when BlueField is deployed with SFC:

  • The first management VRF is outside the HBN container on BlueField. This VRF provides separation between out-of-band (OOB) traffic (via oob_net0 or tmfifo_net0) and data-plane traffic via uplinks and PFs.

  • The second management VRF is inside the HBN container and provides similar separation. The OOB traffic (via eth0) is isolated from the traffic via the *_if interfaces.

MGMT VRF on BlueField Arm

The management (mgmt) VRF is enabled by default when the BlueField is deployed with SFC (see section "Enabling SFC"). The mgmt VRF provides separation between the OOB management network and the in-band data plane network.

The uplinks and PFs/VFs use the default routing table, while the oob_net0 (OOB Ethernet port) and the tmfifo_net0 netdevices use the mgmt VRF to route their packets.

When logging in either via SSH or the console, the shell is in the mgmt VRF context by default. This is indicated by mgmt being added to the shell prompt:

root@bf2:mgmt:/home/ubuntu# ip vrf identify
mgmt

When logging into the HBN container with crictl, the HBN shell will be in the default VRF. Users must switch to MGMT VRF manually if OOB access is required. Use ip vrf exec to do so.


root@bf2:mgmt:/home/ubuntu# ip vrf exec mgmt bash

The user must run ip vrf exec mgmt to perform operations requiring OOB access (e.g., apt-get update).
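
For example, a single OOB-dependent command can be run from the default context without switching shells:

ip vrf exec mgmt apt-get update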

Network devices belonging to the mgmt VRF can be listed with the vrf utility:

root@bf2:mgmt:/home/ubuntu# vrf link list

VRF: mgmt
--------------------
tmfifo_net0    UP    00:1a:ca:ff:ff:03    <BROADCAST,MULTICAST,UP,LOWER_UP>
oob_net0       UP    08:c0:eb:c0:5a:32    <BROADCAST,MULTICAST,UP,LOWER_UP>

root@bf2:mgmt:/home/ubuntu# vrf help
vrf <OPTS>

VRF domains:
  vrf list

Links associated with VRF domains:
  vrf link list [<vrf-name>]

Tasks and VRF domain association:
  vrf task exec <vrf-name> <command>
  vrf task list [<vrf-name>]
  vrf task identify <pid>

NOTE: This command affects only AF_INET and AF_INET6 sockets opened by the command that gets exec'ed. Specifically, it has *no* impact on netlink sockets (e.g., ip command).

To show the routing table for the default VRF, run:


root@bf2:mgmt:/home/ubuntu# ip route show

To show the routing table for the mgmt VRF, run:


root@bf2:mgmt:/home/ubuntu# ip route show vrf mgmt


MGMT VRF Inside HBN Container

Inside the HBN container, a separate mgmt VRF is present. Similar commands as those listed under section "MGMT VRF on BlueField Arm" can be used to query management routes.

The *_if interfaces use the default routing table while the eth0 (OOB) uses the mgmt VRF to route out-of-band packets out of the container. The OOB traffic gets NATed through the oob_net0 interface on BlueField Arm, ultimately using the BlueField OOB's IP address.

When logging into the HBN container via crictl, the shell enters the default VRF context by default. Switching to the mgmt VRF can be done using the command ip vrf exec mgmt <cmd>.

Existing Services in MGMT VRF on BlueField Arm

On the BlueField Arm, outside the HBN container, a set of existing services run in the mgmt VRF context as they need OOB network access:

  • containerd

  • kubelet

  • ssh

  • docker

These services can be restarted and queried for their status using the command systemctl while adding @mgmt to the original service name. For example:

  • To restart containerd:


    root@bf2:mgmt:/home/ubuntu# systemctl restart containerd@mgmt

  • To query containerd status:


    root@bf2:mgmt:/home/ubuntu# systemctl status containerd@mgmt

Note

The original versions of these services (without @mgmt) are not used and must not be started.


Running New Service in MGMT VRF on BlueField Arm

If a service needs OOB access to run, it can be added to the set of services running in mgmt VRF context. Adding such a service is only possible on the BlueField Arm (i.e., outside the HBN container).

To add a service to the set of mgmt VRF services:

  1. Add it to /etc/vrf/systemd.conf (if it is not present already). For example, NTP is already listed in this file.

  2. Run the following:


    root@bf2:mgmt:/home/ubuntu# systemctl daemon-reload

  3. Stop and disable the non-VRF version of the service to be able to start the mgmt VRF one:

    root@bf2:mgmt:/home/ubuntu# systemctl stop ntp
    root@bf2:mgmt:/home/ubuntu# systemctl disable ntp
    root@bf2:mgmt:/home/ubuntu# systemctl enable ntp@mgmt
    root@bf2:mgmt:/home/ubuntu# systemctl start ntp@mgmt

© Copyright 2024, NVIDIA. Last updated on Aug 21, 2024.