HBN Service Deployment
Refer to the "HBN Service Release Notes" page for information on the specific hardware and software requirements for HBN.
The following subsections describe specific prerequisites for the BlueField before deploying the DOCA HBN Service.
Enabling BlueField DPU Mode
HBN requires BlueField to work in either DPU mode or zero-trust mode of operation. Information about configuring BlueField modes of operation can be found under "NVIDIA BlueField Modes of Operation".
Enabling SFC
HBN requires SFC configuration to be activated on the BlueField before running the HBN service container. SFC allows for additional services/containers to be chained to HBN and provides additional data manipulation capabilities. SFC can be configured in 3 modes:
HBN-only mode – In this mode, one OVS bridge is created, br-hbn, and all HBN-specific ports are added to it. This is the default mode of operation. It is configured by setting ENABLE_BR_HBN=yes in bf.cfg and leaving ENABLE_BR_SFC at its default.
Dual bridge mode – In this mode, 2 OVS bridges are created, br-hbn and br-sfc. All HBN-specific ports are added to the br-sfc bridge, and all of these ports are patched into the br-hbn bridge. br-sfc can be used to add custom steering flows that direct traffic across different ports in the bridge. In this mode, both ENABLE_BR_SFC and ENABLE_BR_HBN are set to yes. BR_HBN_XXX parameters are not set and all ports are under BR_SFC_XXX variables.
Mixed mode – This is similar to the dual bridge mode, except that ports can be assigned to either bridge (i.e., some ports in the br-hbn bridge and some in the br-sfc bridge). In this mode, ports are under both BR_SFC_XXX and BR_HBN_XXX.
The use of the br-sfc bridge allows defining deployment-specific rules before or after the HBN pipeline. Users can add OpenFlow rules directly to the br-sfc bridge. If ENABLE_BR_SFC_DEFAULT_FLOWS is set to yes, make sure user rules are inserted at a higher priority than the default flows for them to take effect.
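For example, a deployment-specific OpenFlow rule can be installed on br-sfc with ovs-ofctl from BlueField Arm. This is only an illustrative sketch: the port names and the priority value are assumptions and must match the ports actually present in your br-sfc configuration (the priority is chosen higher than the default flows so the rule takes effect):
# steer traffic arriving from pf0hpf out of uplink p0, bypassing the default flows
ovs-ofctl add-flow br-sfc "table=0,priority=100,in_port=pf0hpf,actions=output:p0"
# verify the installed flows
ovs-ofctl dump-flows br-sfc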
The following table describes various bf.cfg parameters used to configure these modes as well as other parameters which assign ports to various bridges:
Parameter | Description | Mandatory | Default Value
ENABLE_BR_HBN | Setting this parameter to yes enables the br-hbn bridge. Note: This setting is necessary to work with HBN. | Yes | no
ENABLE_BR_SFC | Setting this parameter to yes enables the br-sfc bridge. Info: This is only needed when the second OVS bridge is required for custom steering flows. | No | no
BR_HBN_UPLINKS | Uplinks added to br-hbn directly | No | p0,p1
BR_SFC_UPLINKS | Uplinks added to br-sfc directly | No | ""
BR_HBN_REPS | PFs and VFs added to br-hbn directly | No | ""
BR_SFC_REPS | PFs and VFs added to br-sfc directly | No | ""
BR_HBN_SFS | DPU ports added to br-hbn directly. These ports are mostly service ports present on the DPU which require using HBN network offload services. | No | ""
BR_SFC_SFS | DPU ports added to br-sfc directly | No | ""
BR_HBN_SFC_PATCH_PORTS | Patch ports added to br-sfc. These are general-purpose ports meant for muxing or demuxing of traffic across various PF/VF ports. | No |
LINK_PROPAGATION | Mapping of how link propagation should work. If nothing is provided, each uplink/PF/VF port reflects its status in its corresponding HBN port. For example, the status of p0 is reflected in p0_if. | No | Uplink/PF/VF to the corresponding HBN port
ENABLE_BR_SFC_DEFAULT_FLOWS | Provides default connectivity in the br-sfc bridge so that each port can send traffic to its corresponding output port | No | no
More detail about port connectivity in each mode is provided in section "HBN Deployment Configuration".
The following subsections provide additional information about SFC and instructions on enabling it during BlueField DOCA image installation.
Deploying BlueField DOCA Image with SFC from Host
For DOCA image installation on BlueField, the user should follow the instructions under NVIDIA DOCA Installation Guide for Linux with the following extra notes to enable BlueField for HBN setup:
Make sure link type is set to ETH under the "Installing Software on Host" section.
Add the following parameters to the bf.cfg configuration file:
This configuration example is relevant for "HBN-only mode". Set the appropriate variables and values depending on your deployment model.
Enable HBN specific OVS bridge on BlueField Arm by setting ENABLE_BR_HBN=yes.
Define the uplink ports to be used by HBN BR_HBN_UPLINKS='<port>'.
Note: Must include both ports (i.e., p0,p1) for dual-port BlueField devices and only p0 for single-port BlueField devices.
Include PF and VF ports to be used by HBN. The following example sets both PFs and 8 VFs on each uplink: BR_HBN_REPS='pf0hpf,pf1hpf,pf0vf0-pf0vf7,pf1vf0-pf1vf7'.
(Optional) Include SF devices to be created and connected to HBN bridge on the BlueField Arm side by setting BR_HBN_SFS='pf0dpu1,pf0dpu3'.
Info: If nothing is provided, pf0dpu1 and pf0dpu3 are created by default.
Warning: While older formats of bf.cfg still work in this release, they will be deprecated over the next two releases. It is therefore advisable to move to the new format to avoid upgrade issues in future releases. The following is an example of the old bf.cfg format:
ENABLE_SFC_HBN=yes
NUM_VFs_PHYS_PORT0=12   # <num VFs supported by HBN on Physical Port 0> (valid range: 0-127) Default 14
NUM_VFs_PHYS_PORT1=2    # <num VFs supported by HBN on Physical Port 1> (valid range: 0-127) Default 0
Then run:
bfb-install -c bf.cfg -r rshim0 -b <BFB-image>
Once SFC deployment is done, it creates three sets of files:
/etc/mellanox/hbn.conf – this file can be used to redeploy SFC without the need to pass through bf.cfg again to modify interface mapping
/etc/mellanox/sfc.conf – this file provides a view of how various ports are connected in different bridges
/etc/mellanox/mlnx-sf.conf – this file includes all the HBN ports to be created and corresponding commands to create the port
Deploying BlueField DOCA Image with SFC Using PXE Boot
To enable HBN SFC using a PXE installation environment with BFB content, use the following configuration for PXE:
bfnet=<IFNAME>:<IPADDR>:<NETMASK> or <IFNAME>:dhcp
bfks=<URL of the kickstart script>
The kickstart script (bash) should include the following lines:
cat >> /etc/bf.cfg << EOF
ENABLE_BR_HBN=yes
BR_HBN_UPLINKS='p0,p1'
BR_HBN_REPS='pf0hpf,pf1hpf,pf0vf0-pf0vf7,pf1vf0-pf1vf7'
BR_HBN_SFS='pf0dpu1,pf0dpu3'
EOF
The /etc/bf.cfg generated above is sourced by the BFB install.sh script.
It is recommended to verify the accuracy of the BlueField's clock post-installation. This can be done using the following command:
$ date
Please refer to the known issues listed in the "NVIDIA DOCA Release Notes" for more information.
Redeploying SFC from BlueField
Redeploying SFC from BlueField can be done after the DPU has already been deployed using bf.cfg, when either the port mapping or the bridge configuration needs to be changed.
To redeploy SFC from BlueField:
Edit /etc/mellanox/hbn.conf by adding or removing entries in each segment as necessary.
Rerun the SFC install script:
/opt/mellanox/sfc-hbn/install.sh -c -r
This generates a new set of sfc.conf and mlnx-sf.conf and reloads the DPU.
Configuration and reload can be split into 2 steps by removing the -r option and rebooting BlueField post configuration.
After the BlueField reloads, the command ovs-vsctl show should show all the new ports and bridges configured in OVS.
Deploying HBN with Other Services
When the HBN container is deployed by itself, BlueField Arm is configured with 3k hugepages. If HBN is deployed with other services, the actual number of hugepages must be adjusted based on the requirements of those services. For example, SNAP or NVMesh may need approximately 1k to 5k hugepages. So, if HBN is running with either of these services on the same BlueField, the total number of hugepages must be set to the sum of the hugepage requirements of all the services.
For example, if NVMesh needs 3k hugepages, 6k total hugepages must be set when running with HBN. To do that, add the following parameters to the bf.cfg configuration file alongside other desired parameters.
HUGEPAGE_COUNT=6144
This should be performed only on a BlueField-3 running with 32G of memory. Doing this on a 16G system may cause memory issues for various applications on BlueField Arm.
Also, HBN with other services is qualified only for 16 VFs.
HBN Service Container Deployment
HBN service is available on NGC, NVIDIA's container catalog. For information about the deployment of DOCA containers on top of the BlueField, refer to NVIDIA DOCA Container Deployment Guide.
Downloading DOCA Container Resource File
Pull the latest DOCA container resource as a *.zip file from NGC and extract it to the <resource> folder (doca_container_configs_2.7.0v1 in this example):
wget https://api.ngc.nvidia.com/v2/resources/nvidia/doca/doca_container_configs/versions/2.7.0v1/zip -O doca_container_configs_2.7.0v1.zip
unzip -o doca_container_configs_2.7.0v1.zip -d doca_container_configs_2.7.0v1
Running HBN Preparation Script
The HBN script (hbn-dpu-setup.sh) performs the following steps on BlueField Arm which are required for HBN service to run:
Sets the BlueField to DPU mode if needed.
Enables IPv4/IPv6 kernel forwarding.
Sets up interface MTU if needed.
Sets up mount points between BlueField Arm and HBN container for logs and configuration persistency.
Sets up various paths as needed by supervisord and other services inside container.
Enables the REST API access if needed.
Creates or updates credentials.
The script is located in <resource>/scripts/doca_hbn/<hbn_version>/ folder, which is downloaded as part of the DOCA Container Resource.
To achieve the desired configuration on HBN's first boot, users can update the default NVUE or flat-file (network interfaces and FRR) configuration files before running the preparation script. These files are located in <resource>/scripts/doca_hbn/<hbn_version>/; an illustrative flat-file sketch follows the list below.
For NVUE-based configuration:
etc/nvue.d/startup.yaml
For flat-files based configuration:
etc/network/interfaces
etc/frr/frr.conf
etc/frr/daemons
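The following is a minimal flat-file sketch of such a first-boot configuration, not a definitive setup: the loopback address, BGP AS number, and the use of BGP unnumbered on the uplink are assumptions for illustration only and must be adapted to your fabric.
# etc/network/interfaces (illustrative)
auto lo
iface lo inet loopback
    address 10.10.10.1/32

auto p0_if
iface p0_if

# etc/frr/frr.conf (illustrative; also ensure bgpd=yes in etc/frr/daemons if BGP is used)
router bgp 65101
 bgp router-id 10.10.10.1
 neighbor p0_if interface remote-as external
 address-family ipv4 unicast
  redistribute connected
 exit-address-family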
Run the following commands to execute the hbn-dpu-setup.sh script:
cd <resource>/scripts/doca_hbn/2.3.0/
chmod +x hbn-dpu-setup.sh
sudo ./hbn-dpu-setup.sh
The following is the help menu for the hbn-dpu-setup.sh script:
./hbn-dpu-setup.sh -h
usage: hbn-dpu-setup.sh
    hbn-dpu-setup.sh -m|--mtu <MTU>               Use <MTU> bytes for all HBN interfaces (default 9216)
    hbn-dpu-setup.sh -u|--username <username>     User creation
    hbn-dpu-setup.sh -p|--password <password>     Password for --username <username>
    hbn-dpu-setup.sh -e|--enable-rest-api-access  Enable REST API from external IPs
    hbn-dpu-setup.sh -h|--help
Enabling REST API Access
To enable the REST API access:
Change the default password for the nvidia username:
./hbn-dpu-setup.sh -u nvidia -p <new-password>
Enable REST API:
./hbn-dpu-setup.sh --enable-rest-api-access
Perform BlueField system-level reset.
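Once the reset completes, REST access can be checked from an external host. This is a hedged sketch only: it assumes the NVUE REST API defaults (HTTPS on port 8765 with the nvue_v1 root) and the credentials configured above; adjust it to your environment.
# query the interface state over the REST API (placeholders must be filled in)
curl -k -u nvidia:<new-password> https://<bluefield-oob-ip>:8765/nvue_v1/interface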
Spawning HBN Container
The HBN container's .yaml configuration is called doca_hbn.yaml and is located in the <resource>/configs/<doca_version>/ directory. To spawn the HBN container, simply copy the doca_hbn.yaml file to the /etc/kubelet.d directory:
cd <resource>/configs/2.8.0/
sudo cp doca_hbn.yaml /etc/kubelet.d/
Kubelet automatically pulls the container image from NGC and spawns a pod executing the container. The DOCA HBN Service starts executing right away.
Verifying HBN Container is Running
To inspect the HBN container and verify if it is running correctly:
Check HBN pod and container status and logs:
Examine the currently active pods and their IDs (it may take up to 20 seconds for the pod to start):
sudo crictl pods
View currently active containers and their IDs:
sudo crictl ps
Examine logs of a given container:
sudo crictl logs
Examine kubelet logs if something did not work as expected:
sudo journalctl -u kubelet@mgmt
Log into the HBN container:
sudo crictl exec -it $(crictl ps | grep hbn | awk '{print $1;}') bash
While logged into HBN container, verify that the frr, nl2doca, and neighmgr services are running:
(hbn-container)$ supervisorctl status frr
(hbn-container)$ supervisorctl status nl2doca
(hbn-container)$ supervisorctl status neighmgr
Users may also examine various logs under /var/log inside the HBN container.
HBN Deployment Configuration
The HBN service comes with four types of configurable interfaces:
Two uplinks (p0_if, p1_if)
Two PF port representors (pf0hpf_if, pf1hpf_if)
User-defined number of VFs (i.e., pf0vf0_if, pf0vf1_if, …, pf1vf0_if, pf1vf1_if, …)
DPU interfaces to connect to services running on BlueField, outside of the HBN container (pf0dpu1_if and pf0dpu3_if)
The *_if suffix indicates that these are sub-functions and are different from the physical uplinks (i.e., PFs, VFs). They can be viewed as virtual interfaces from a virtualized BlueField.
Each of these interfaces is connected outside the HBN container to the corresponding physical interface; see section "Service Function Chaining" (SFC) for more details.
The HBN container runs in an isolated namespace and does not see any interfaces outside the container (oob_net0, real uplinks and PFs, *_if_r representors).
HBN-only Deployment Configuration
This is the default deployment model of HBN. In this model, only one OVS bridge is created.
The following is a sample bf.cfg and the resulting OVS and port configurations:
Sample bf.cfg:
BR_HBN_UPLINKS="p0,p1"
BR_HBN_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf12,pf1vf0-pf1vf1"
BR_HBN_SFS="svc1,svc2"
Generated hbn.conf:
[BR_HBN_UPLINKS]
p0
p1
[BR_HBN_REPS]
pf0hpf
pf0vf0
pf0vf1
pf0vf2
pf0vf3
pf0vf4
pf0vf5
pf0vf6
pf0vf7
pf0vf8
pf0vf9
pf0vf10
pf0vf11
pf0vf12
pf1hpf
pf1vf0
pf1vf1
[BR_HBN_SFS]
svc1
svc2
[BR_SFC_UPLINKS]
[BR_SFC_REPS]
[BR_SFC_SFS]
[BR_HBN_SFC_PATCH_PORTS]
[LINK_PROPAGATION]
p0:p0_if_r
p1:p1_if_r
pf0hpf:pf0hpf_if_r
pf0vf0:pf0vf0_if_r
pf0vf1:pf0vf1_if_r
pf0vf2:pf0vf2_if_r
pf0vf3:pf0vf3_if_r
pf0vf4:pf0vf4_if_r
pf0vf5:pf0vf5_if_r
pf0vf6:pf0vf6_if_r
pf0vf7:pf0vf7_if_r
pf0vf8:pf0vf8_if_r
pf0vf9:pf0vf9_if_r
pf0vf10:pf0vf10_if_r
pf0vf11:pf0vf11_if_r
pf0vf12:pf0vf12_if_r
pf1hpf:pf1hpf_if_r
pf1vf0:pf1vf0_if_r
pf1vf1:pf1vf1_if_r
svc1_r:svc1_if_r
svc2_r:svc2_if_r
[ENABLE_BR_SFC]
[ENABLE_BR_SFC_DEFAULT_FLOWS]
Dual Bridge HBN Deployment Configuration
The following is a sample bf.cfg and the resulting OVS and port configurations:
Sample bf.cfg:
BR_HBN_UPLINKS=""
BR_SFC_UPLINKS="p0,p1"
BR_HBN_REPS=""
BR_SFC_REPS="pf0hpf,pf1hpf,pf0vf0-pf0vf1,pf1vf0-pf1vf1"
BR_HBN_SFS=""
BR_SFC_SFS=""
BR_HBN_SFC_PATCH_PORTS="tss0"
LINK_PROPAGATION="pf0hpf:tss0"
ENABLE_BR_SFC=yes
ENABLE_BR_SFC_DEFAULT_FLOWS=yes
Generated hbn.conf:
[BR_HBN_UPLINKS]
[BR_HBN_REPS]
[BR_HBN_SFS]
[BR_SFC_UPLINKS]
p0
p1
[BR_SFC_REPS]
pf0hpf
pf0vf0
pf0vf1
pf1hpf
pf1vf0
pf1vf1
[BR_SFC_SFS]
[BR_HBN_SFC_PATCH_PORTS]
tss0
[LINK_PROPAGATION]
pf0hpf:tss0
p0:p0_if_r
p1:p1_if_r
pf0vf0:pf0vf0_if_r
pf0vf1:pf0vf1_if_r
pf1hpf:pf1hpf_if_r
pf1vf0:pf1vf0_if_r
pf1vf1:pf1vf1_if_r
[ENABLE_BR_SFC]
yes
[ENABLE_BR_SFC_DEFAULT_FLOWS]
yes
Mixed Mode HBN Deployment Configuration
The following is a sample bf.cfg and the resulting OVS and port configurations:
Sample bf.cfg:
BR_HBN_UPLINKS="p1"
BR_SFC_UPLINKS="p0"
BR_HBN_REPS="pf1hpf,pf0vf0"
BR_SFC_REPS="pf0hpf,pf0vf1"
BR_HBN_SFS="svc1,svc2"
BR_SFC_SFS="ovn"
BR_HBN_SFC_PATCH_PORTS="tss0"
LINK_PROPAGATION="pf0hpf:tss0"
ENABLE_BR_SFC=yes
ENABLE_BR_SFC_DEFAULT_FLOWS=yes
Generated hbn.conf:
[BR_HBN_UPLINKS]
p1
[BR_HBN_REPS]
pf0vf0
pf1hpf
[BR_HBN_SFS]
svc1
svc2
[BR_SFC_UPLINKS]
p0
[BR_SFC_REPS]
pf0hpf
pf0vf1
[BR_SFC_SFS]
ovn
[BR_HBN_SFC_PATCH_PORTS]
tss0
[LINK_PROPAGATION]
pf0hpf:tss0
p1:p1_if_r
p0:p0_if_r
pf0vf0:pf0vf0_if_r
pf0hpf:pf0hpf_if_r
pf0vf1:pf0vf1_if_r
pf1hpf:pf1hpf_if_r
svc1_r:svc1_if_r
svc2_r:svc2_if_r
ovn_r:ovn_if_r
[ENABLE_BR_SFC]
yes
[ENABLE_BR_SFC_DEFAULT_FLOWS]
yes
HBN Deployment Considerations
SF Interface State Tracking
When HBN is deployed with SFC, the interface state of the following network devices is propagated to their corresponding SFs:
Uplinks – p0, p1
PFs – pf0hpf, pf1hpf
VFs – pf0vfX, pf1vfX where X is the VF number
For example, if the p0 uplink cable gets disconnected:
p0 transitions to DOWN state with NO-CARRIER (default behavior on Linux); and
p0 state is propagated to p0_if whose state also becomes DOWN with NO-CARRIER
After p0 connection is reestablished:
p0 transitions to UP state; and
p0 state is propagated to p0_if whose state becomes UP
Interface state propagation only happens in the uplink/PF/VF-to-SF direction.
A daemon called sfc-state-propagation runs on BlueField, outside of the HBN container, to sync the state. The daemon listens to netlink notifications for interfaces and transfers the state to SFs.
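To confirm that state propagation is active on BlueField Arm, the daemon's service status can be queried. The systemd unit name below is an assumption based on the daemon name and may differ on your image:
systemctl status sfc-state-propagation.service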
SF Interface MTU
In the HBN container, the MTU of all interfaces is set to 9216 by default. The MTU of specific interfaces can be overridden using flat-file configuration or NVUE.
On the BlueField side (i.e., outside of the HBN container), the MTU of the uplink, PF, and VF interfaces is also set to 9216. This can be changed by modifying /etc/systemd/network/30-hbn-mtu.network or by adding a new configuration file in the /etc/systemd/network directory for specific interfaces.
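For example, the following drop-in is a minimal sketch of overriding the MTU for a single uplink; the file name, interface name, and MTU value are illustrative assumptions:
# /etc/systemd/network/40-p0-mtu.network (illustrative)
[Match]
Name=p0

[Link]
MTUBytes=1500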
To reload this configuration, run:
systemctl restart systemd-networkd
Connecting DOCA Services to HBN on BlueField Arm
There are various SF ports (named pf0dpuX_if, where X is [0..n]) on BlueField Arm which can be used to run services on BlueField and use HBN for network connectivity. These ports can have a flexible naming convention based on the service name. For example, to support the OVN service, an interface named ovn can be created for use by the OVN service running on BlueField Arm, and it gets a corresponding HBN port named ovn_if. These interfaces are created using either BR_SFC_SFS or BR_HBN_SFS, depending on which bridge needs the service interface and on the mode of service deployment.
Traffic between BlueField and the outside world is hardware-accelerated when the HBN-side port is an L3 interface or an access port using a switch virtual interface (SVI). It is therefore treated the same way as PF or VF ports from a traffic-handling standpoint.
Two SF port pairs are created by default on the BlueField Arm side, so two separate DOCA services can run at the same time.
Disabling BlueField Uplinks
The uplink ports must always be kept administratively up for proper operation of HBN. Otherwise, the NVIDIA® ConnectX® firmware brings down the corresponding representor port, which causes data forwarding to stop.
A change in the operational status of an uplink (e.g., carrier down) results in traffic being switched to the other uplink.
When using ECMP failover on the two uplink SFs, locally disabling one uplink does not result in traffic switching to the second uplink. Disabling the local link in this case means setting one uplink administratively DOWN directly on BlueField.
To test ECMP failover scenarios correctly, the uplink must be disabled from its remote counterpart (i.e., execute admin DOWN on the remote system's link which is connected to the uplink).
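For example, the remote side of the link can be brought down from the peer switch with a command along the following lines; the interface name is illustrative and depends on the remote system:
# on the remote switch connected to the BlueField uplink
ip link set dev swp1 down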
HBN NVUE User Credentials
The preconfigured default user credentials are as follows:
Username | nvidia
Password | nvidia
NVUE user credentials can be added post installation:
This can be done by passing additional --username and --password arguments to the HBN startup script (refer to "Running HBN Preparation Script"). For example:
sudo ./hbn-dpu-setup.sh -u newuser -p newpassword
After executing this script, either respawn the container or start the decrypt-user-add script inside the running HBN container:
supervisorctl start decrypt-user-add
decrypt-user-add: started
The script creates a new user in the HBN container:
cat /etc/passwd | grep newuser
newuser:x:1001:1001::/home/newuser:/bin/bash
HBN NVUE Interface Classification
Interface | Interface Type | NVUE Type
p0_if | Uplink representor | swp
p1_if | Uplink representor | swp
lo | Loopback | loopback
pf0hpf_if | Host representor | swp
pf1hpf_if | Host representor | swp
pf0vfx_if (where x is 0-255) | VF representor | swp
pf1vfx_if (where x is 0-255) | VF representor | swp
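Because uplink and host/VF representors are classified as swp interfaces, they can be configured with standard NVUE interface commands inside the HBN container. The following is an illustrative sketch only; the interface and address are assumptions:
# assign an IP address to the uplink representor and apply the configuration
nv set interface p0_if ip address 192.168.1.1/31
nv config apply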
HBN Files Persistence
The following directories are mounted from BlueField Arm to the HBN container namespace and are persistent across HBN service restarts and BlueField reboots:
BlueField Arm Mount Point | HBN Container Mount Point
Configuration file mount points:
/var/lib/hbn/etc/network/ | /etc/network/
/var/lib/hbn/etc/frr/ | /etc/frr/
/var/lib/hbn/etc/nvue.d/ | /etc/nvue.d/
/var/lib/hbn/etc/supervisor/conf.d/ | /etc/supervisor/conf.d/
/var/lib/hbn/var/lib/nvue/ | /var/lib/nvue/
Support and log file mount points:
/var/lib/hbn/var/support/ | /var/support/
/var/log/doca/hbn/ | /var/log/hbn/
SR-IOV Support in HBN
Creating SR-IOV VFs on Host
The first step to use SR-IOV is to create Virtual Functions (VFs) on the host server.
VFs can be created using the following command:
sudo echo N > /sys/class/net/<host-rep>/device/sriov_numvfs
Where:
<host-rep> is one of the two host representors (e.g., ens1f0 or ens1f1)
0≤N≤16 is the desired total number of VFs
Set N=0 to delete all the VFs
N=16 is the maximum number of VFs supported on HBN across all representors
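For example, to create 8 VFs on the first host representor using the example interface name from the list above, a sketch such as the following can be used (tee is used here so the redirection itself runs with root privileges):
# create 8 VFs on ens1f0 (interface name illustrative)
echo 8 | sudo tee /sys/class/net/ens1f0/device/sriov_numvfs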
Automatic Creation of VF Representors and SF Devices on BlueField
VFs created on the host must have corresponding VF representor devices and SF devices for HBN on BlueField side. For example:
ens1f0vf0 is the first SR-IOV VF device from the first host representor; this interface is created on the host server
pf0vf0 is the corresponding VF representor device to ens1f0vf0; this device is present on the BlueField Arm side and automatically created at the same time as ens1f0vf0 is created by the user on the host side
pf0vf0_if is the corresponding SF device for pf0vf0 which is used to connect the VF to HBN pipeline
The creation of the SF devices for VFs is done ahead of time when provisioning BlueField and installing the DOCA image on it; see section "Enabling SFC" for how to select how many SFs to create ahead of time.
The SF devices for VFs (i.e., pfXvfY) are pre-mapped to work with the corresponding VF representors when these are created with the command from the previous step.
Management VRF
Two management VRFs are automatically configured for HBN when BlueField is deployed with SFC:
The first management VRF is outside the HBN container on BlueField. This VRF provides separation between out-of-band (OOB) traffic (via oob_net0 or tmfifo_net0) and data-plane traffic via uplinks and PFs.
The second management VRF is inside the HBN container and provides similar separation. The OOB traffic (via eth0) is isolated from the traffic via the *_if interfaces.
MGMT VRF on BlueField Arm
The management (mgmt) VRF is enabled by default when the BlueField is deployed with SFC (see section "Enabling SFC"). The mgmt VRF provides separation between the OOB management network and the in-band data plane network.
The uplinks and PFs/VFs use the default routing table, while the oob_net0 (OOB Ethernet port) and tmfifo_net0 netdevices use the mgmt VRF to route their packets.
When logging in either via SSH or the console, the shell is in the mgmt VRF context by default. This is indicated by mgmt being added to the shell prompt:
root@bf2:mgmt:/home/ubuntu#
root@bf2:mgmt:/home/ubuntu# ip vrf identify
mgmt
When logging into the HBN container with crictl, the HBN shell is in the default VRF. Users must switch to the mgmt VRF manually if OOB access is required, using ip vrf exec:
root@bf2:mgmt:/home/ubuntu# ip vrf exec mgmt bash
The user must run ip vrf exec mgmt to perform operations requiring OOB access (e.g., apt-get update).
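For example, running the update command mentioned above through the mgmt VRF:
ip vrf exec mgmt apt-get update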
Network devices belonging to the mgmt VRF can be listed with the vrf utility:
root@bf2:mgmt:/home/ubuntu# vrf link list
VRF: mgmt
--------------------
tmfifo_net0 UP 00:1a:ca:ff:ff:03 <BROADCAST,MULTICAST,UP,LOWER_UP>
oob_net0 UP 08:c0:eb:c0:5a:32 <BROADCAST,MULTICAST,UP,LOWER_UP>
root@bf2:mgmt:/home/ubuntu# vrf help
vrf <OPTS>
VRF domains:
vrf list
Links associated with VRF domains:
vrf link list [<vrf-name>]
Tasks and VRF domain asociation:
    vrf task exec <vrf-name> <command>
    vrf task list [<vrf-name>]
    vrf task identify <pid>

    NOTE: This command affects only AF_INET and AF_INET6 sockets opened by the
    command that gets exec'ed. Specifically, it has *no* impact on netlink
    sockets (e.g., ip command).
To show the routing table for the default VRF, run:
root@bf2:mgmt:/home/ubuntu# ip route show
To show the routing table for the mgmt VRF, run:
root@bf2:mgmt:/home/ubuntu# ip route show vrf mgmt
MGMT VRF Inside HBN Container
Inside the HBN container, a separate mgmt VRF is present. Commands similar to those listed under section "MGMT VRF on BlueField Arm" can be used to query management routes.
The *_if interfaces use the default routing table, while eth0 (OOB) uses the mgmt VRF to route out-of-band packets out of the container. The OOB traffic gets NATed through the oob_net0 interface on BlueField Arm, ultimately using the BlueField OOB's IP address.
When logging into the HBN container via crictl, the shell enters the default VRF context by default. Switching to the mgmt VRF can be done using the command ip vrf exec mgmt <cmd>.
Existing Services in MGMT VRF on BlueField Arm
On the BlueField Arm, outside the HBN container, a set of existing services run in the mgmt VRF context as they need OOB network access:
containerd
kubelet
ssh
docker
These services can be restarted and queried for their status using the command systemctl while adding @mgmt to the original service name. For example:
To restart containerd:
root@bf2:mgmt:/home/ubuntu# systemctl restart containerd@mgmt
To query containerd status:
root@bf2:mgmt:/home/ubuntu# systemctl status containerd@mgmt
The original versions of these services (without @mgmt) are not used and must not be started.
Running New Service in MGMT VRF on BlueField Arm
If a service needs OOB access to run, it can be added to the set of services running in mgmt VRF context. Adding such a service is only possible on the BlueField Arm (i.e., outside the HBN container).
To add a service to the set of mgmt VRF services:
Add it to /etc/vrf/systemd.conf (if it is not present already). For example, NTP is already listed in this file.
Run the following:
root@bf2:mgmt:/home/ubuntu# systemctl daemon-reload
Stop and disable the non-VRF version of the service to be able to start the mgmt VRF one:
root@bf2:mgmt:/home/ubuntu# systemctl stop ntp
root@bf2:mgmt:/home/ubuntu# systemctl disable ntp
root@bf2:mgmt:/home/ubuntu# systemctl enable ntp@mgmt
root@bf2:mgmt:/home/ubuntu# systemctl start ntp@mgmt