
HBN Service Release Notes

The following subsections provide information about new features, interoperability, known issues, and bug fixes.

HBN 3.0.0 offers the following new features and updates:

  • Aligned HBN versioning with DOCA to simplify lifecycle management

  • Improved ECMP failover performance through asynchronous processing

  • Added Bidirectional Forwarding Detection (BFD) support for BGP (GA)

  • Bug fixes

You can upgrade to HBN 3.0.0 from the previous HBN version.

Supported BlueField Networking Platforms

HBN 3.0.0 has been validated on the following NVIDIA BlueField networking platforms:

  • BlueField-2 DPUs:

    • BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; PCIe Gen4 x8; Crypto Enabled; 16GB on-board DDR; 1GbE OOB management; HHHL

    • BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; integrated BMC; PCIe Gen4 x8; Secure Boot Enabled; Crypto Enabled; 16GB on-board DDR; 1GbE OOB management; FHHL

    • BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; integrated BMC; PCIe Gen4 x8; Secure Boot Enabled; Crypto Enabled; 32GB on-board DDR; 1GbE OOB management; FHHL

    • BlueField-2 P-Series DPU 100GbE Dual-Port QSFP56; integrated BMC; PCIe Gen4 x16; Secure Boot Enabled; Crypto Enabled; 32GB on-board DDR; 1GbE OOB management; FHHL

  • BlueField-3 DPUs:

    • BlueField-3 B3210E E-Series FHHL DPU; 100GbE (default mode) / HDR100 IB; Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled

    • BlueField-3 B3220 P-Series FHHL DPU; 200GbE (default mode)/NDR200 IB; Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled

    • BlueField-3 B3240 P-Series Dual-slot FHHL DPU; 400GbE/NDR IB (default mode); Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled

  • BlueField-3 SuperNICs:

    • BlueField-3 B3210L E-Series FHHL SuperNIC, 100GbE (default mode)/HDR100 IB, Dual-port QSFP112, PCIe Gen4.0 x16, 8 Arm cores, 16GB on-board DDR, integrated BMC, Crypto Enabled

    • BlueField-3 B3220L E-Series FHHL SuperNIC, 200GbE (default mode)/NDR200 IB, Dual-port QSFP112, PCIe Gen5.0 x16, 8 Arm cores, 16GB on-board DDR, integrated BMC, Crypto Enabled

    • BlueField-3 B3140L E-Series FHHL SuperNIC, 400GbE/NDR IB (default mode), Single-port QSFP112, PCIe Gen5.0 x16, 8 Arm cores, 16GB on-board DDR, integrated BMC, Crypto Enabled

    • BlueField-3 B3140H E-Series HHHL SuperNIC, 400GbE (default mode)/NDR IB, Single-port QSFP112, PCIe Gen5.0 x16, 8 Arm cores, 16GB on-board DDR, integrated BMC, Crypto Enabled

Note

HBN does not support BlueField platforms with 8GB on-board DDR memory.


Supported BlueField OS

HBN 3.0.0 supports DOCA 3.0.0 (BSP 4.11.0) on Ubuntu 22.04 OS.

Verified Scalability Limits

HBN 3.0.0 has been tested to sustain the following maximum scalability limits:

Limit | BlueField-2 | BlueField-3 | Comments
VTEP peers (BlueFields per control plane) in the fabric | 8K* | 8K* | Number of BlueFields (VTEPs) within a single overlay fabric (reachable in the underlay)
L2 VNIs/overlay networks per BlueField | 20 | 20 | Total number of L2 VNIs in the fabric for the L2 VXLAN use case, assuming every interface is associated with its own VLAN + L2 VNI
L3 VNIs/overlay networks per BlueField | 20 for up to 4K VTEPs; 10 for up to 8K VTEPs | 20 for up to 4K VTEPs; 10 for up to 8K VTEPs | Total number of L3 VNIs in the fabric for the L3 VXLAN use case, assuming every interface is associated with its own VLAN + L2 VNI + L3 VNI + VRF
BlueFields per single L2 VNI network | 8K | 8K | Total number of DPUs configured with the same L2 VNI (3 real DPUs, 2000 emulated VTEPs)
BlueFields per single L3 VNI network | 8K | 8K | Total number of DPUs configured with the same L3 VNI (3 real DPUs, 2000 emulated VTEPs)
Maximum number of local MAC/ARP entries per BlueField | 20 | 20 | Maximum total number of MAC/ARP entries learned from the host on the DPU
Maximum number of local BGP routes per BlueField | 200 | 200 | Maximum total number of BGP routes advertised by the host to the BlueField (BGP peering with the host): 100 IPv4 + 100 IPv6
Maximum number of remote L3 LPM routes (underlay) | 8K | 8K | IPv4 or IPv6 underlay LPM routes per BlueField (default + host routes + LPM)
Maximum number of EVPN type-2 entries | 16K | 16K | Remote overlay MAC/IP entries for compute peers stored on a single BlueField (L2 EVPN)
Maximum number of EVPN type-5 entries | 32K | 80K | Remote overlay L3 LPM entries for compute peers stored on a single BlueField (L3 EVPN)
Maximum number of next-hops in an ECMP next-hop group | 16 | 16 | Maximum number of next-hops in an ECMP next-hop group (for overlay ECMP)
Maximum number of PFs on the host side | 2 | 2 | Total number of PFs visible to the host
Maximum number of VFs on the host side | 16 | 16 | Total number of VFs created on the host
Maximum number of SFs on the BlueField side | 2 | 2 | Total number of SF devices created on BlueField Arm

* Tested with 4 VNIs

Known Issues

The following table lists known issues and limitations for this release of HBN.

Reference

Description

4418454

Description: The bf.cfg file contains configurations for two uplinks, but the DPU has only one uplink. Due to this misconfiguration, SFC initialization will stall, and the HBN container will not be able to start.
Workaround: Edit the bf.cfg file to match the number of uplinks (see the sketch below).
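A minimal single-uplink sketch follows. ENABLE_BR_HBN appears elsewhere in these notes; the BR_HBN_UPLINKS variable name is an assumption and may differ across DOCA versions:

ENABLE_BR_HBN=yes
# Assumed variable name; list only the uplinks the DPU actually has
# (for example, "p0" on a single-port DPU rather than "p0,p1")
BR_HBN_UPLINKS="p0"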
Keywords: SFC, HBN, BFB installation, Single port DPU
Reported in HBN Version: 3.0.0
4255661
Description: ARP packets between the DPU and Outside Global that are forwarded using HBN are not hardware offloaded.
Workaround: N/A
Keywords: ARP, Outside World/Global, HW offload
Reported in HBN Version: 2.5.0
4255708
Description: When several ports are configured as part of a bridge and later reconfigured as L3 interfaces, only one port (the first port previously enslaved to the bridge) is correctly reprogrammed as an L3 interface in nl2doca. The remaining ports continue to appear as bridged ports in nl2doca.
Workaround: Restart the HBN container after unconfiguring the bridge ports and before reconfiguring them as L3 interfaces (see the sketch below).
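One way to restart the container is with crictl, as used elsewhere in these notes; a sketch, assuming kubelet respawns the HBN pod automatically after it is stopped:

crictl ps | grep hbn                      # find the HBN container ID
crictl stop --timeout 60 <hbn-container>  # kubelet respawns the container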
Keywords: Bridge port, L3 port
Reported in HBN Version: 2.5.0
4214631

Description: Packets with destination port 4789 coming from the host-side will be dropped if HBN is configured as L3-EVPN, leading to Customer/Tenant encapsulated VXLAN traffic drops in an L3-EVPN scenario.

This prevents running VXLAN underlay over HBN VXLAN overlay in L3 EVPN scenarios.

Workaround: N/A
Keywords: 4789, VXLAN overlay, VXLAN underlay
Reported in HBN Version: 2.5.0
4193046
Description: When LLDP is enabled on BlueField, it might not work on uplink ports when the HBN service is running. This might happen if LLDP is running without an interface filter configuration.

Workaround: Configure LLDP to run only on interfaces where LLDP is required, using a configuration file, /etc/lldpd.d/ports.conf, for the lldpd daemon. The interfaces can be specified using a regular expression, if required. For example:

  • To run LLDP only on the uplinks (p0 and p1), configure it as follows:

    $ cat /etc/lldpd.d/ports.conf
    configure system interface pattern p[01]

  • To run LLDP on the uplinks plus some host-facing PFs or VFs, configure it as follows:

    $ cat /etc/lldpd.d/ports.conf
    configure system interface pattern p[0-1],pf[0-1]hpf,pf[0-1]vf[0-12]

If this configuration file is changed while the LLDP service is running, it must be restarted using systemctl restart lldpd.

Keywords: LLDP
Reported in HBN version: 2.4.1
4011688

Description: The following critical error message is generated during HBN pod reboot. It can be safely ignored:

CRIT Server 'unix_http_server' running without any HTTP authentication checking

Workaround: N/A
Keywords: Log
Reported in HBN version: 2.4.0
4098158
Description: OVS restart may result in temporary loss of connectivity.
Workaround: N/A
Keywords: OVS
Reported in HBN version: 2.4.0
3743942
Description: The HBN container may hang in init-sfs during a container restart when the HBN YAML file (/etc/kubelet.d/doca_hbn.yaml) is modified while the container is running.
Workaround: If the container hangs in init-sfs for more than 1 minute, reload the DPU.
Keywords: Hang; container
Reported in HBN version: 2.3.0
3961387

Description: Changing the port number for the NVUE REST API using the NVUE CLI or API is not supported. Do not use the following command to change the port number:

nv set system api port <port-no>

Workaround: On HBN, NVUE is accessible only through port 8765 (the default port number).
Keywords: NVUE API; port number
Reported in HBN version: 2.3.0
3967748
Description: The command nv show system api connections does not return any data.
Workaround: N/A
Keywords: REST API; nginx
Reported in HBN version: 2.3.0
3865633
Description: Packets with destination port 4789/8472 coming from the host side will be dropped if HBN is configured as an L3 EVPN. Encapsulated VXLAN traffic will be dropped in L3 EVPN scenarios.
Workaround: N/A
Keywords: 4789, 8472
Reported in HBN version: 2.2.0
3769309
Description: A ping (or other IP connectivity check) from a locally connected host in VRF-X to an interface IP address on the DPU/HBN itself in VRF-Y will not work, even if VRF route-leaking is enabled between these two VRFs.
Workaround: N/A
Keyword: IP
Reported in HBN version: 2.2.0
3835295
Description: Traffic entering the HBN service on a host PF/VF main interface and exiting on a sub-interface of the same PF/VF (and vice versa) is not hardware offloaded. Similarly, traffic entering the HBN service on one sub-interface and exiting through another sub-interface of the same host PF/VF is also not hardware offloaded.
Workaround: N/A
Keyword: Hardware offload; interfaces
Reported in HBN version: 2.2.0
3772552
Description: The DHCP relay gateway-interface IP address does not automatically pick up the IP address assigned to the associated VRF.
Workaround: The gateway-interface IP address must be explicitly configured (see the sketch below).
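For example, an address can be assigned explicitly to the gateway interface with NVUE (the interface name and address below are illustrative placeholders):

nv set interface vlan100 ip address 10.10.0.1/24
nv config apply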
Keyword: DHCP relay gateway; IP
Reported in HBN version: 2.2.0
3891542
Description: If an NVUE-based routing policy (route map) configuration is used to associate route target extended communities with an EVPN route, only one route target can be specified.
Workaround: N/A
Keyword: NVUE; route target
Reported in HBN version: 2.2.0
3757686

Description: When the HBN container is coming up and applying a large configuration through the NVUE startup service which includes entities used by DHCP relay (for example, interfaces, SVIs, and VRFs), the DHCP relay service may go into a FATAL state. It can be observed using the following command:

supervisorctl status | grep isc-dhcp-relay
isc-dhcp-relay-vrf11    RUNNING   pid 2069, uptime 0:11:31
isc-dhcp-relay-vrf12    RUNNING   pid 2071, uptime 0:11:31
isc-dhcp-relay-vrf13    FATAL     Exited too quickly (process log may have details)
isc-dhcp-relay-vrf14    FATAL     Exited too quickly (process log may have details)

Workaround: Restart the DHCP relay service that is in the FATAL state using the command:

supervisorctl restart <relay-service-name>

Keyword: DHCP relay; fatal; container; restart
Reported in HBN version: 2.1.0
3605486
Description: When the DPU boots up after issuing a "reboot" command from the DPU itself, some host-side interfaces may remain down.

Workaround:

  1. Restart openibd:

    systemctl restart openibd

  2. Recreate SR-IOV interfaces, if required.

  3. Replay interface config. For example:

    • If using ifupdown2:

      ifreload -a 

    • If using Netplan:

      netplan apply

Keyword: Reboot
Reported in HBN version: 1.5.0
3547103
Description: IPv6 stateless ACLs are not supported.
Workaround: N/A
Keyword: IPv6 ACL
Reported in HBN version: 1.5.0
3339304
Description: Statistics for hardware-offloaded traffic are not reflected on SFs within an HBN container.
Workaround: Look up the statistics using ip -s link show on the PFs outside of the HBN container; the PFs display Tx/Rx data for traffic that is hardware-accelerated in the HBN container (see the example below).
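For example, to check counters on uplink PF p0 from the DPU side, outside the container:

ip -s link show p0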
Keyword: Statistics; container
Reported in HBN version: 1.4.0
3352003
Description: NVUE show, config, and apply commands malfunction if the nvued and nvued-startup services are not in the RUNNING and EXITED states, respectively (a state check is sketched below).
Workaround: N/A
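The service states can be checked from inside the HBN container, assuming nvued and nvued-startup are supervisord-managed programs like the other HBN daemons shown in these notes:

supervisorctl status nvued nvued-startup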
Keyword: NVUE commands
Reported in HBN version: 1.3.0
3184745
Description: The command nv show interface <intf> acl does not display correct information if there are multiple ACLs bound to the interface.
Workaround: Use the command nv show interface <intf> to view the ACLs bound to an interface.
Keyword: ACLs
Reported in HBN version: 1.2.0
3158934
Description: Deleting an NVUE user by removing their password file and restarting the decrypt-user-add service on the HBN container does not work.
Workaround: Either respawn the container after deleting the file or delete the password file corresponding to the user by running userdel -r username.
Keyword: User deletion
Reported in HBN version: 1.2.0
3185003
Description: When a packet is encapsulated with a VXLAN header, the extra bytes may cause the packet to exceed the MTU of the link. Typically, such a packet would be fragmented, but it is silently dropped without fragmentation.
Workaround: Make sure that the MTU on the uplink ports is always 50 bytes more than on the host ports so that, even after adding VXLAN headers, ingress packets do not exceed the MTU (see the sketch below).
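As an illustration, NVUE can keep the uplinks 50 bytes above the host-facing ports (the interface names and values below are placeholders, not recommendations):

nv set interface p0,p1 link mtu 9216
nv set interface pf0hpf,pf1hpf link mtu 9166
nv config apply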
Keyword: MTU; VXLAN
Reported in HBN version: 1.2.0
3184905
Description: During VXLAN encapsulation, the DF flag is not propagated to the outer header. Such a packet may be truncated when forwarded in the kernel and may be dropped in the hardware-offloaded path.
Workaround: Make sure that the MTU on the uplink ports is always 50 bytes more than on the host ports so that, even after adding VXLAN headers, ingress packets do not exceed the MTU.
Keyword: VXLAN
Reported in HBN version: 1.2.0
3188688
Description: When stopping a container using the crictl stop command, you may receive an error. The command uses a timeout of 0, which is insufficient to stop all the processes in the HBN container.

Workaround: Pass a greater timeout value when stopping the HBN container by running:

crictl stop --timeout 60 <hbn-container>

Keyword: Timeout
Reported in HBN version: 1.2.0
3129749
Description: The same ACL rule cannot be applied in both the inbound and outbound direction on a port.
Workaround: N/A
Keyword: ACLs
Reported in HBN version: 1.2.0
3126560
Description: The system's time zone cannot be modified using NVUE in the HBN container.

Workaround: The time zone can be changed manually by symlinking the /etc/localtime file to the appropriate binary time zone file in the /usr/share/zoneinfo directory. For example:

sudo ln -sf /usr/share/zoneinfo/GMT /etc/localtime

Keyword: Time zone; NVUE
Reported in HBN version: 1.2.0
3118204
Description: Auto-BGP functionality (where the ASN does not need to be configured but is dynamically inferred by the system based on the system's role as a leaf or spine device) is not supported on HBN.
Workaround: If BGP is configured and used on HBN, the BGP ASN must be manually configured (see the sketch below).
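For example, with NVUE (65101 is an illustrative ASN):

nv set router bgp autonomous-system 65101
nv config apply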
Keyword: BGP
Reported in HBN version: 1.2.0
3233088
Description: Since checksum calculation is offloaded to the hardware (and not performed by the kernel), you might see an incorrect checksum in tcpdump for locally generated, outgoing packets. BGP keepalives and updates are examples of packets that display incorrect checksums in tcpdump.
Workaround: N/A
Keyword: BGP
Reported in HBN version: 1.2.0
2821785
Description: MAC addresses are learned only in software, not in hardware. This may affect performance for pure L2 unicast traffic.
Workaround: N/A
Keyword: MAC; L2
Reported in HBN version: 1.3.0
3017202
Description: Due to disabled backend foundation units, some NVUE commands return 500 INTERNAL SERVER ERROR or 404 NOT FOUND. These commands are related to features or subsystems which are not supported on HBN.
Workaround: N/A
Keyword: Unsupported NVUE commands
Reported in HBN version: 1.3.0
2828838

Description: NetworkManager and other services not directly related to HBN may display the following message in syslog:

"netlink: read: too many netlink events. Need to resynchronize platform cache"

The message has no functional impact and can be safely ignored.

Workaround: N/A
Keyword: Error
Reported in HBN version: 1.3.0

Fixed Issues

The following table lists the known issues which have been fixed for this release of HBN.

Reference

Description

4200335

Description: DNS resolution might sometimes fail if resolv.conf is not updated with the proper name server, leading to a loss of OOB connectivity.

Fixed in HBN Version: 3.0.0

4196880

Description: DHCP issues may lead to an incomplete resolv.conf in the HBN container. The consequences can be DNS resolution failures and/or the hostname being set to 'localhost'.

Fixed in HBN Version: 3.0.0

4264397

Description: OVS does not punt IPv6 neighbor advertisements with a unicast destination MAC address to the CPU; therefore, the endpoint MAC may not be learned on the VTEP as long as the endpoint is silent, and traffic towards the endpoint is software-forwarded. This applies only to completely silent end hosts which do not initiate any IPv6 neighbor solicitation messages; once the end host initiates traffic, traffic is hardware-forwarded. The issue persists only if the endpoints never initiate any traffic and only send IPv6 neighbor advertisements in response to IPv6 neighbor solicitations (rare).

Fixed in HBN Version: 3.0.0

4155959

Description: With uplinks in the br-sfc bridge, IPv6 traffic in the uplink-to-uplink direction causes an OVS crash, resulting in a complete traffic drop.

Fixed in HBN Version: 2.5.0

4197067

Description: The management VRF does not have an IPv6 address configured, resulting in the absence of a default IPv6 route in the management VRF. Consequently, IPv6 connectivity on the management port is unavailable and only IPv4 connectivity is supported.

Fixed in HBN Version: 2.5.0

4093502

Description: VRF interfaces have a loopback address, but these loopback addresses have scope global, not scope host, which can break source IP address lookup for packets originating from the VRF.

Fixed in HBN version: 2.4.0

4029473

Description: Rarely, after deleting then re-creating an interface, BGP peering over that interface may announce IPv6 routes with an IPv4-mapped IPv6 address as the next hop, which the BGP peer device at the other end can reject.

Fixed in HBN version: 2.4.0

4125363

Description: On newer BlueField-2 and BlueField-3 devices, /sys/class/dmi/id/sys_vendor shows Nvidia instead of https://www.mellanox.com, causing NVUE to fail to apply configurations.

Fixed in HBN version: 2.4.0

3965589

Description: When SR-IOV VFs are created or deleted, then recreated, some ports may stay in the ethX naming format and not be properly renamed to the pfXvfY format. This results in the ports remaining in an error state in the output of ovs-vsctl show, because SFC and HBN do not recognize them.

Fixed in HBN version: 2.4.0

4004191

Description: Due to security fixes on BlueField-2, the number of context switches increased by 20%, which may result in user applications (for example, nl2doca) running slower.

Fixed in HBN version: 2.4.0

3880352

Description: Deleting and re-adding SR-IOV ports might result in some ports in the br-hbn bridge going into an error state.

Fixed in HBN version: 2.4.0

3960825

Description: When either ENABLE_SFC_HBN or ENABLE_BR_HBN is set to yes in bf.cfg, the initial DHCP request from oob_net0 during the adapter boot does not contain the NVIDIA/BF/OOB string in DHCP option 60 (vendor class identifier).

Fixed in HBN version: 2.3.0

3538167

Description: An explicit restart of the FRR service may be required if the BGP AS number is changed via NVUE.

Fixed in HBN version: 2.3.0

3360699

Description: To decrease the default MTU on HBN interfaces, you must make the change on both the BlueField and within HBN, then reboot the BlueField for the change to take effect.

Fixed in HBN version: 2.3.0

3864080

Description: When an interface is toggled off and on, its sub-interfaces lose their IPv6 addresses and do not get them back.

Fixed in HBN version: 2.3.0

3632344

Description: HBN interfaces on the BlueField side (outside the HBN container) may not receive their proper MTU from systemd-networkd.

Fixed in HBN version: 2.2.0

3760869

Description: Datapath flows with very low PPS may be deleted before the aging time (60 seconds) in environments with a large number of routes (16K+).

Fixed in HBN version: 2.2.0

3770992

Description: It is not possible to configure an IPv6 default (::/0) static route using NVUE.

Fixed in HBN version: 2.2.0

3824881

Description: When the number of unique ECMP groups exceeds 6, programming prefixes with more than 6 ECMP groups fails. Uniqueness is determined by the ECMP content, so multiple routes with the same nexthop paths use only one ECMP group.

Fixed in HBN version: 2.2.0

3705894

Description: In an EVPN Symmetric Routing scenario, IPv6 traffic is not hardware-offloaded.

Fixed in HBN version: 2.2.0

3519324

Description: The DOCA HBN container takes one minute longer to spawn, compared to the 1.4.0 HBN release.

Fixed in HBN version: 2.1.0

3219539

Description: TC rules are set by OVS to map uplink and host representor ports to the HBN service. These rules can expire, so packets may need to be software-forwarded periodically to refresh them.

Fixed in HBN version: 2.1.0

3610971

Description: The output of the nv show interface command does not display information about VRFs, VXLAN, or bridges.

Fixed in HBN version: 2.0.0

3452914

Description: IPv6 OOB connectivity from the HBN container stops working if the br-mgmt interface on the DPU goes down. When going down, the br-mgmt interface loses its IPv6 address, which is used as the gateway address for the HBN container. If the br-mgmt interface comes back up, its IPv6 address is not added back and IPv6 OOB connectivity from the HBN container will not work.

Fixed in HBN version: 1.5.0

3191433

Description: ECMP selection for the underlay path uses the ingress port and identifies uplink ports via round robin. This may not result in a uniform traffic distribution.

Fixed in HBN version: 1.4.0

3049879

Description: When reloading (ifreload) an empty /etc/network/interfaces file, the previously created interfaces are not deleted.

Fixed in HBN version: 1.4.0

3284607

Description: When an ACL is configured to match IPv4 and L4 parameters (protocol TCP/UDP, source and destination ports), the ACL also matches IPv6 traffic with the specified L4 parameters.

Fixed in HBN version: 1.4.0

3282113

Description: Some DPUs experience an issue with the clock settings after installing a BlueField OS in an HBN setting in which the date reverts back to "Thu Sep 8, 2022."

Fixed in HBN version: 1.4.0

3354029

Description: If interfaces configured for BGP unnumbered peering are not defined in the /etc/network/interfaces file, BGP peering will not be established on them.

Fixed in HBN version: 1.4.0

© Copyright 2025, NVIDIA. Last updated on May 6, 2025.