HBN Service Release Notes
The following subsections provide information on HBN service new features, interoperability, known issues, and bug fixes.
HBN 2.0.0 offers the following new features and updates:
New hardware-accelerated dataplane based on OVS-DOCA
Added support for HBN interoperating with the SNAP storage service for NVMe-oF and NVMe block device emulation scenarios (for NVIDIA® BlueField®-2 only)
Supported BlueField Networking Platforms
HBN 2.0.0 has been validated on the following NVIDIA BlueField Networking Platforms:
BlueField-2 DPU platforms:
BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; PCIe Gen4 x8; Crypto Enabled; 16GB on-board DDR; 1GbE OOB management; HHHL
BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; integrated BMC; PCIe Gen4 x8; Secure Boot Enabled; Crypto Enabled; 16GB on-board DDR; 1GbE OOB management; FHHL
BlueField-2 P-Series DPU 25GbE Dual-Port SFP56; integrated BMC; PCIe Gen4 x8; Secure Boot Enabled; Crypto Enabled; 32GB on-board DDR; 1GbE OOB management; FHHL
BlueField-2 P-Series DPU 100GbE Dual-Port QSFP56; integrated BMC; PCIe Gen4 x16; Secure Boot Enabled; Crypto Enabled; 32GB on-board DDR; 1GbE OOB management; FHHL
BlueField-3 DPU platforms:
BlueField-3 B3210 P-Series FHHL DPU; 100GbE (default mode) / HDR100 IB; Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled
BlueField-3 B3220 P-Series FHHL DPU; 200GbE (default mode) / NDR200 IB; Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled
BlueField-3 B3240 P-Series Dual-slot FHHL DPU; 400GbE / NDR IB (default mode); Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled
Single-port BlueField platforms are currently not supported with HBN.
Supported BlueField OS
HBN 2.0.0 supports DOCA 2.5.0 (BSP 4.5.0) on Ubuntu 22.04 OS.
Verified Scalability Limits
HBN 2.0.0 has been tested to sustain the following maximum scalability limits:
| Limit | BlueField-2 | BlueField-3 | Comments |
|---|---|---|---|
| VTEP peers (DPUs per control plane) in the fabric | 2k | 2k | Number of DPUs (VTEPs) within a single overlay fabric (reachable in the underlay) |
| VNIs/overlay networks in the fabric | 18 | 18 | Total number of L2 VNIs in the fabric (max VNIs = max VFs + max PFs) |
| DPUs per VNI/overlay network | 3, 2000 | 3, 2000 | Total number of DPUs configured with the same VNI (3 real DPUs, 2000 emulated VTEPs) |
| Tenants (L3 VNIs) per server | 8 | 8 | Maximum number of tenants on the same host server |
| VMs/pods per server | 16 | 16 | Maximum number of IP addresses advertised by EVPN per DPU |
| Maximum number of L3 LPM routes (underlay) | 256 | 256 | IPv4 or IPv6 underlay LPM routes per DPU |
| Maximum number of EVPN type-2 entries | 4K | 4K | Remote overlay MAC/IP entries for compute peers stored on a single DPU (L2 EVPN use case) |
| Maximum number of EVPN type-5 entries | 16K | 16K | Remote overlay L3 LPM entries for compute peers stored on a single DPU (L3 EVPN use case). Note: supported at beta level. |
| Maximum number of PFs | 2 | 2 | Total number of PFs visible to the host |
| Maximum number of VFs | 16 | 16 | Total number of VFs created on the host |
The following are the known issues and limitations for this release of HBN.

3705894
Description: In an EVPN symmetric routing scenario, IPv6 traffic is not hardware offloaded. Only IPv6 traffic routed over L3 VNIs to remote hosts is affected.
Workaround: N/A
Keyword: EVPN; IPv6
Reported in HBN version: 2.0.0

3378928
Description: When an interface is brought down or deleted (e.g., an SVI deletion), the routes learned over that interface are removed from the kernel, but no netlink notification is sent. As a result, these routes remain in nl2doca and, consequently, in the FDB. If those routes are not re-installed when the interface is brought back up, the stale entries persist in nl2doca until nl2doca is restarted or the workaround below is applied.
Workaround: Resync the netlink cache with the kernel.
Keyword: Container
Reported in HBN version: 2.0.0

3519324
Description: The DOCA HBN container takes about 1 minute longer to spawn compared to the previous HBN release (1.4.0).
Workaround: N/A
Keyword: Container
Reported in HBN version: 1.5.0

3605486
Description: When the DPU boots up after issuing a "reboot" command from the DPU itself, some host-side interfaces may remain down.
Workaround: N/A
Keyword: Reboot
Reported in HBN version: 1.5.0

3547103
Description: IPv6 stateless ACLs are not supported.
Workaround: N/A
Keyword: IPv6 ACL
Reported in HBN version: 1.5.0

3339304
Description: Statistics for hardware-offloaded traffic are not reflected on SFs inside the HBN container.
Workaround: Look up the statistics using ip -s link show on the PFs outside of the HBN container; the PFs show Tx/Rx statistics for traffic that is hardware-accelerated in the HBN container (see the example after this entry).
Keyword: Statistics; container
Reported in HBN version: 1.4.0

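For example, outside of the HBN container, assuming the uplink PF is named p0 (substitute the actual PF name):

ip -s link show dev p0
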
3352003
Description: NVUE show, config, and apply commands malfunction if the nvued and nvued-startup services are not in the RUNNING and EXITED states, respectively.
Workaround: N/A
Keyword: NVUE commands
Reported in HBN version: 1.3.0

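A quick way to check these states, assuming the HBN container's services are managed by supervisord (an assumption; verify against your container image):

supervisorctl status nvued nvued-startup
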
3168683
Description: If many interfaces are participating in EVPN/routing, the routing process can run out of memory.
Workaround: Have a maximum of 8 VF interfaces participating in routing/VXLAN.
Keyword: Routing; memory
Reported in HBN version: 1.2.0

3219539
Description: OVS programs TC rules to map uplink and host representor ports to the HBN service. These rules age out and can cause packets to be software-forwarded periodically to refresh them.
Workaround: The timeout value can be adjusted by changing the OVS other_config:max-idle parameter. The shipped default value is 10000 ms (10 s). See the example after this entry.
Keyword: SFC; aging
Reported in HBN version: 1.2.0

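For example, to raise the idle timeout to 30 seconds (the value is illustrative; max-idle is expressed in milliseconds):

ovs-vsctl set Open_vSwitch . other_config:max-idle=30000
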
3184745
Description: The command nv show interface <intf> acl does not show correct information if there are multiple ACLs bound to the interface.
Workaround: Use the command nv show interface <intf> to view the ACLs bound to an interface (see the example after this entry).
Keyword: ACLs
Reported in HBN version: 1.2.0

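For example, for a host-facing port representor inside the HBN container (the interface name pf0hpf_sf is illustrative; use the actual interface name):

nv show interface pf0hpf_sf
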
3158934
Description: Deleting an NVUE user by removing their password file and restarting the decrypt-user-add service on the HBN container does not work.
Workaround: Either respawn the container after deleting the password file, or delete the user by running userdel -r <username>.
Keyword: User deletion
Reported in HBN version: 1.2.0

3185003
Description: When a packet is encapsulated with a VXLAN header, the extra bytes may cause the packet to exceed the MTU of the link. Typically such a packet would be fragmented, but it is silently dropped and no fragmentation happens.
Workaround: Make sure the MTU on the uplink ports is always at least 50 bytes larger than on the host ports so that, even after the VXLAN headers are added, ingress packets do not exceed the MTU (see the example after this entry).
Keyword: MTU; VXLAN
Reported in HBN version: 1.2.0

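For example, assuming the host ports use an MTU of 9000 and the uplinks are named p0 and p1 (names and values are illustrative), the uplink MTU could be set to 9050 or higher:

ip link set dev p0 mtu 9050
ip link set dev p1 mtu 9050
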
3184905
Description: On VXLAN encapsulation, the DF flag is not propagated to the outer header. Such a packet may be truncated when forwarded in the kernel and may be dropped when hardware offloaded.
Workaround: Make sure the MTU on the uplink ports is always at least 50 bytes larger than on the host ports so that, even after the VXLAN headers are added, ingress packets do not exceed the MTU.
Keyword: VXLAN
Reported in HBN version: 1.2.0

3188688
Description: When stopping the container with the crictl stop command, an error may be reported because the command uses a default timeout of 0, which is not enough to stop all the processes in the HBN container.
Workaround: Pass a timeout value when stopping the HBN container (see the example after this entry).
Keyword: Timeout
Reported in HBN version: 1.2.0

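A minimal sketch, assuming a 60-second timeout is sufficient and using a placeholder container ID:

crictl stop --timeout 60 <hbn-container-id>
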
3129749
Description: The same ACL rule cannot be applied in both the inbound and outbound directions on a port.
Workaround: N/A
Keyword: ACLs
Reported in HBN version: 1.2.0

3126560
Description: The system time zone cannot be modified using NVUE in the HBN container.
Workaround: The time zone can be changed manually by symlinking /etc/localtime to a binary time zone file under /usr/share/zoneinfo (see the example after this entry).
Keyword: Time zone; NVUE
Reported in HBN version: 1.2.0

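For example, to set the time zone to US Eastern (the zone file is illustrative; any file under /usr/share/zoneinfo may be used):

ln -sf /usr/share/zoneinfo/America/New_York /etc/localtime
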
3118204
Description: Auto-BGP functionality (where the ASN does not need to be configured but is dynamically inferred by the system based on its role as a leaf or spine device) is not supported on HBN.
Workaround: If BGP is configured and used on HBN, the BGP ASN must be configured manually (see the example after this entry).
Keyword: BGP
Reported in HBN version: 1.2.0

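A minimal sketch of setting the ASN explicitly, assuming NVUE is used for BGP configuration (the ASN value 65101 is illustrative):

nv set router bgp autonomous-system 65101
nv config apply
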
3233088
Description: Since checksum calculation is offloaded to the hardware (not done by the kernel), it is expected to see an incorrect checksum in tcpdump output for locally generated, outgoing packets. BGP keepalives and updates are among the packets that show such incorrect checksums in tcpdump.
Workaround: N/A
Keyword: BGP
Reported in HBN version: 1.2.0

2821785
Description: MAC addresses are not learned in hardware but only in software. This may affect performance for pure L2 unicast traffic.
Workaround: N/A
Keyword: MAC; L2
Reported in HBN version: 1.3.0

3017202
Description: Due to disabled backend foundation units, some NVUE commands return 500 INTERNAL SERVER ERROR or 404 NOT FOUND. These commands are related to features or subsystems which are not supported on HBN.
Workaround: N/A
Keyword: Unsupported NVUE commands
Reported in HBN version: 1.3.0

2828838
Description: NetworkManager and other services not directly related to HBN may log an error message in syslog. The message has no functional impact and may be ignored.
Workaround: N/A
Keyword: Error
Reported in HBN version: 1.3.0

The following are the known issues that have been fixed in this release of HBN.

3610971
Description: The output of the command nv show interface does not display information about VRFs, VXLAN, and bridge.
Fixed in HBN version: 2.0.0

3378928
Description: Service function interfaces (*_sf) inside the HBN container are UP at container start, irrespective of their presence in the /etc/network/interfaces file. However, once any of them is added to /etc/network/interfaces and later removed, it stays DOWN unless it is added back.
Fixed in HBN version: 2.0.0

3452914
Description: IPv6 OOB connectivity from the HBN container stops working if the br-mgmt interface on the DPU goes down. When going down, the br-mgmt interface loses its IPv6 address, which is used as the gateway address for the HBN container. If the br-mgmt interface comes back up, its IPv6 address is not added back, and IPv6 OOB connectivity from the HBN container does not work.
Fixed in HBN version: 1.5.0

3191433
Description: ECMP selection for the underlay path is based on the ingress port, with uplink ports selected in a round-robin manner. This may not result in a uniform spread of traffic.
Fixed in HBN version: 1.4.0

3049879
Description: When reloading (ifreload) an empty /etc/network/interfaces file, the previously created interfaces are not deleted.
Fixed in HBN version: 1.4.0

3284607
Description: When an ACL is configured to match IPv4 and L4 parameters (protocol tcp/udp, source and destination ports), the ACL also matches IPv6 traffic with the specified L4 parameters.
Fixed in HBN version: 1.4.0

3282113
Description: Some DPUs experience an issue with clock settings after installing a BlueField OS in an HBN setting, in which the date reverts to "Thu Sep 8, 2022".
Fixed in HBN version: 1.4.0

3354029
Description: If the interfaces on which BGP unnumbered peering is configured are not defined in the /etc/network/interfaces configuration file, BGP peering does not get established on them.
Fixed in HBN version: 1.4.0