NVIDIA BlueField Platform Software Troubleshooting Guide

Virtio-net

These guidelines are intended for virtio-net users and customers. Reading the user manual first is recommended.

Command                                          Description

systemctl status virtio-net-controller.service   Check the status of the virtio-net-controller service
virtnet -v                                       Show the version of virtio-net-controller
virtnet -h                                       List the virtnet command-line help
virtnet list                                     List all virtnet devices and their general information
virtnet query -p x                               Query detailed information of device x

To check the controller log, run the following from the DPU side:

$ journalctl -u virtio-net-controller -f -n 400

This command shows the latest 400 lines of the log and keeps following new entries. Adjust the number of lines as needed.

BlueField-3 Jumbo MTU Does Not Work

Problem

Ping fails for packet sizes greater than 1500/4000 after configuring jumbo MTU.

Solution

Jumbo MTU is supported starting from the following kernel and software versions:

Release     Minimum Version

Upstream    VM kernel: 4.18.0-193.el8.x86_64
            VM Linux version supports big MTU after 4.11.
Ubuntu      DOCA_2.5.0_BSP_4.5.0_Ubuntu_22.04
Virtnet     v1.7 or v1.6.26

The following steps configure jumbo MTU:

  1. Change the MTU of the uplink representor (or bond) from the BlueField Arm OS:

    # echo 9216 > /sys/bus/pci/devices/0000:03:00.0/net/p0/mtu

  2. Restart virtio-net-controller from the BlueField Arm OS:

    # systemctl restart virtio-net-controller

  3. Change the corresponding device MTU on BlueField Arm OS. For example, for the first VF on the first PF, run:

    # virtnet modify -p 0 -v 0 device -t 9216

  4. Reload the virtio driver from the guest OS:

    # modprobe -rv virtio-net && modprobe -v virtio-net

  5. Verify the VQs' MTU configuration is correct on BlueField Arm OS:

    # virtnet query -p 0 -v 0 --dbg_stats | grep jumbo_mtu
    "jumbo_mtu": 1
    "jumbo_mtu": 1

  6. Change the MTU of the virtio-net interface from the guest OS:

    # echo 9216 > /sys/bus/pci/devices/0000:af:00.2/virtio0/net/enp175s0f2/mtu
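
After the last step, the new MTU can be exercised end to end from the guest OS. The following is a minimal sketch and not part of the official procedure: it computes the largest ICMP echo payload that fits in a single frame, assuming plain IPv4 (20-byte IP header plus 8-byte ICMP header), and prints a ping command with fragmentation forbidden. The function name icmp_payload_for_mtu and the peer address 192.0.2.1 are placeholders chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch: derive the largest ICMP echo payload for a given MTU.
# Assumption: IPv4 without options, i.e. 20-byte IP + 8-byte ICMP header.
icmp_payload_for_mtu() {
    local mtu="$1"
    echo $(( mtu - 20 - 8 ))
}

# Hypothetical values for illustration only.
MTU=9216
PEER=192.0.2.1   # documentation address; replace with a real peer

# "-M do" sets the don't-fragment flag, so the ping succeeds only if
# every hop on the path carries frames of this size.
echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$MTU") $PEER"
```

If the printed command fails while a 1472-byte payload succeeds, one of the MTU settings from the steps above has likely not taken effect.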

Virtio-net-controller.service Fails to Start

Problem

The problem can be verified using the following commands:

# virtnet list
ERR: Can't connect to virtnet controller: [Errno 111] Connection refused
Check 'systemctl status virtio-net-controller'
Or controller is not ready to accept commands

# systemctl status virtio-net-controller
virtio-net-controller.service - Nvidia VirtIO Net Controller Daemon
   Loaded: loaded (/etc/systemd/system/virtio-net-controller.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2023-10-27 17:46:59 CDT; 2min 26s ago
     Docs: file:/opt/mellanox/mlnx_virtnet/README.md
  Process: 29652 ExecStart=/usr/sbin/virtio_net_manager (code=exited, status=0/SUCCESS)
 Main PID: 29652 (code=exited, status=0/SUCCESS)


Solution

The problem may have one of the following causes.

Virtio-net Not Enabled

  1. Check if mlxconfig has VIRTIO_NET_EMULATION_ENABLE enabled:

    # mlxconfig -d 03:00.0 -e q | grep -i VIRTIO_NET_EMULATION_ENABLE
    *        VIRTIO_NET_EMULATION_ENABLE                 False(0)        True(1)         True(1)

    The second and third value columns (current and next-boot configuration) should both read True(1).

  2. If they are not, perform the following from the BlueField Arm side:

    # mlxconfig -d 03:00.0 s VIRTIO_NET_EMULATION_ENABLE=1

  3. Perform a BlueField system-level reset as documented in the BlueField software documentation.
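
The check in step 1 can be scripted. The following is a minimal sketch under stated assumptions, not an NVIDIA-provided tool: the helper name setting_enabled is made up for this example, and it assumes that the last two fields of the matching line are the current and next-boot values, as in the sample output of step 1.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the current and next-boot values of a setting
# are both True(1). Reads "mlxconfig -e q" output on stdin.
# Assumption: the last two fields of the line are current and next-boot.
setting_enabled() {
    local name="$1"
    awk -v name="$name" '
        {
            for (i = 1; i <= NF; i++)
                if ($i == name) { found = 1; cur = $(NF - 1); nxt = $NF }
        }
        END { exit !(found && cur == "True(1)" && nxt == "True(1)") }
    '
}

# Demonstration on the sample line from step 1.
sample='*        VIRTIO_NET_EMULATION_ENABLE                 False(0)        True(1)         True(1)'
if printf '%s\n' "$sample" | setting_enabled VIRTIO_NET_EMULATION_ENABLE; then
    echo "virtio-net emulation is enabled"
else
    echo "virtio-net emulation is not enabled"
fi
```

On a live system, the sample line would be replaced by piping the output of "mlxconfig -d 03:00.0 -e q" into the function.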

Not Enough SFs Reserved

This can happen when VIRTIO_NET_EMULATION_NUM_PF exceeds PF_TOTAL_SF, because each virtio-net PF/VF requires a corresponding SF to be created:

# mlxconfig -d 03:00.0 -e q | grep -iE 'PF_TOTAL_SF|VIRTIO_NET_EMULATION_NUM_PF'
*        VIRTIO_NET_EMULATION_NUM_PF                 0               4               4
*        PF_TOTAL_SF                                 0               8               8

Info

By default, the BlueField creates an SF for each PF. Take this into consideration when reserving PF_TOTAL_SF.

Function Not Implemented Error when Creating VF

Problem

Creating a virtio-net VF returns an error from the command line:

# echo 3 > /sys/bus/pci/drivers/virtio-pci/0000:41:00.2/sriov_numvfs
write error: Function not implemented

The host-side dmesg shows the following:

[ 301.204661] virtio-pci 0000:41:00.2: Driver doesn't support SRIOV configuration via sysfs


Solution

Virtio SR-IOV is only supported starting from the following kernel version:

Release     Kernel Version

Upstream    4.18 with commit cfecc2918d2b3
Ubuntu      Ubuntu-hwe-4.18.0-9.10_18.04.1
CentOS      3.10.0-957.el7 / 7.6.1810
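
For upstream-numbered kernels, the running version can be compared against the 4.18 threshold as sketched below. This is an illustrative helper, not an official check: the function name kernel_at_least is made up for this example, it relies on GNU "sort -V", and it cannot detect distribution backports such as the CentOS 3.10.0-957.el7 kernel listed above.

```shell
#!/usr/bin/env bash
# Sketch: succeed if kernel release $2 (default: the running kernel) is
# at least version $1. Uses GNU "sort -V" (version sort) from coreutils.
kernel_at_least() {
    local min="$1" have="${2:-$(uname -r)}"
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}

if kernel_at_least 4.18; then
    echo "$(uname -r) is 4.18 or newer"
else
    echo "$(uname -r) is older than 4.18; check for a distribution backport"
fi
```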

Guest OS Stuck when Creating VF

Problem

The following command from the hypervisor hangs:

# echo 100 > /sys/bus/pci/drivers/virtio-pci/0000:89:00.4/sriov_numvfs


Solution

This can happen when more virtio-net functions are reserved than SFs (VIRTIO_NET_EMULATION_NUM_PF + VIRTIO_NET_EMULATION_NUM_VF > PF_TOTAL_SF), because each virtio-net PF/VF requires a corresponding SF to be created. Example:

# mlxconfig -d 03:00.0 -e q | grep -iE 'PF_TOTAL_SF|VIRTIO_NET_EMULATION_NUM_PF|VIRTIO_NET_EMULATION_NUM_VF'
*        VIRTIO_NET_EMULATION_NUM_VF                 0               126             126
*        VIRTIO_NET_EMULATION_NUM_PF                 0               4               4
*        PF_TOTAL_SF                                 0               508             508

Info

By default, BlueField creates an SF for each PF. Take this into consideration when reserving PF_TOTAL_SF.

Note

BlueField supports a limited number of SFs. The SF reserved on the BlueField Arm side and host side are not shared. Make sure to remove the SFs reserved on the host side when reserving a large number on the BlueField Arm side.
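
The inequality from the Solution can be evaluated directly on the mlxconfig output. The following is a minimal sketch under stated assumptions: the helper name check_sf_budget is made up for this example, it takes the last field of each line as the next-boot value, it applies the inequality exactly as written in this section, and it does not account for the SFs that BlueField creates by default for each PF.

```shell
#!/usr/bin/env bash
# Sketch: read "mlxconfig -e q" output on stdin and fail if
# NUM_PF + NUM_VF > PF_TOTAL_SF (the inequality stated in this section).
# Assumption: the last field of each line is the next-boot value.
check_sf_budget() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "VIRTIO_NET_EMULATION_NUM_PF") pf = $NF
                if ($i == "VIRTIO_NET_EMULATION_NUM_VF") vf = $NF
                if ($i == "PF_TOTAL_SF")                 sf = $NF
            }
        }
        END {
            printf "NUM_PF+NUM_VF=%d, PF_TOTAL_SF=%d\n", pf + vf, sf
            exit (pf + vf > sf)
        }
    '
}

# Demonstration on the sample output from this section.
check_sf_budget <<'EOF'
*        VIRTIO_NET_EMULATION_NUM_VF                 0               126             126
*        VIRTIO_NET_EMULATION_NUM_PF                 0               4               4
*        PF_TOTAL_SF                                 0               508             508
EOF
```

A non-zero exit status indicates that PF_TOTAL_SF should be raised or the number of reserved virtio-net functions reduced.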

Hotplug Device Does Not Show Correctly in Guest OS

Problem

After creating a hotplug device from the BlueField side, probing virtio drivers does not create the virtio-net device correctly.

Solution

The problem may have one of the following causes.

BAR 0

BAR 0 assignment may have failed. Check dmesg from the guest OS for the corresponding hotplug BDF:

[10.874845] pci 0000:87:00.1: BAR 0: failed to assign [mem size 0x00100000]

Info

In this example, the hotplug PCIe BDF is 87:00.1. This value can be retrieved using "lspci | grep -i virtio" from the guest OS.

This can normally be resolved by adding "pci=realloc" to the Linux kernel command line (grub).
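
One way to add the parameter persistently is to append it to GRUB_CMDLINE_LINUX in /etc/default/grub and then regenerate the grub configuration. The following is a minimal sketch, not an official procedure: the helper name add_kernel_param is made up for this example, it relies on GNU sed, and the file location and regeneration command vary by distribution.

```shell
#!/usr/bin/env bash
# Sketch: append a kernel parameter to GRUB_CMDLINE_LINUX in a grub
# defaults file. Usage: add_kernel_param <param> [file]
add_kernel_param() {
    local param="$1" file="${2:-/etc/default/grub}"
    # Leave the file alone if the parameter is already on the line.
    if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i -E "s|^(GRUB_CMDLINE_LINUX=\"[^\"]*)\"|\1 ${param}\"|" "$file"
}

# Demonstration on a scratch file rather than the real /etc/default/grub.
scratch=$(mktemp)
printf 'GRUB_CMDLINE_LINUX="quiet"\n' > "$scratch"
add_kernel_param pci=realloc "$scratch"
cat "$scratch"
rm -f "$scratch"

# On a real system (as root), regenerate the configuration and reboot:
#   update-grub                                  # Debian/Ubuntu
#   grub2-mkconfig -o /boot/grub2/grub.cfg       # RHEL-like systems
```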

BAR 14/15

Assignment may have failed on another PCIe BAR. Check dmesg from the guest OS for the corresponding hotplug BDF:

[ 2893.484281] pcieport 0000:10:01.0: bridge window [mem 0x00100000-0x000fffff] to [bus 12] add_size 200000 add_align 100000
[ 2893.484285] pcieport 0000:10:01.0: BAR 14: no space for [mem size 0x00200000]
[ 2893.484287] pcieport 0000:10:01.0: BAR 14: failed to assign [mem size 0x00200000]
[ 2893.484289] pcieport 0000:10:01.0: BAR 14: no space for [mem size 0x00200000]
[ 2893.484290] pcieport 0000:10:01.0: BAR 14: failed to assign [mem size 0x00200000]

Info

In this example, the hotplug PCIe BDF is 10:01.0. This value can be retrieved using "lspci | grep -i virtio" from the guest OS.

  • This is usually caused by insufficient BAR resources. Try to reduce the PF BAR size by performing the following from the BlueField side:

    # mlxconfig -d 03:00.0 s PF_LOG_BAR_SIZE=0

  • This can also be caused by the BIOS provider not reserving enough memory. Check the guest OS's dmesg for similar messages for the PCIe bus of the BlueField device:

    [3.979061] pci_bus 0000:a0: root bus resource [mem 0x41c0800000-0x41c10fffff window] (9M)
    [3.979062] pci_bus 0000:a0: root bus resource [bus a0-bf]
    [4.017770] pci 0000:a4:00.0:   bridge window [mem 0x41c0800000-0x41c0ffffff 64bit pref] (8M)
    [4.018243] pci 0000:a4:00.0: BAR 15: no space for [mem size 0x05800000 64bit pref] (88M)
    [4.018245] pci 0000:a4:00.0: BAR 15: failed to assign [mem size 0x05800000 64bit pref]

    • On the host, the prefetchable memory limit of the root bus (a0) is only 9M. This means that all the devices under this bus (including BlueField) can be allocated only 9M of prefetchable memory in total.

    • BAR 15 is the total prefetchable memory limit on the bridge (a4) of the device. The PCI bridge window of the BlueField for prefetchable memory is 8M, but the bridge requires 88M for its child device (BlueField). After several attempts, the PCIe bridge did not find sufficient IO memory to allocate for the BlueField BARs. This can be solved by contacting the BIOS provider to provide enough memory to the PCI root.

Rescan

If the hotplug operation from the BlueField Arm side is performed before the guest OS is up, and the virtio device is not found by the command "lspci | grep -i virtio", try to rescan from the guest OS:

# echo 1 > /sys/bus/pci/rescan


No Hotplug from BIOS

The server BIOS may not support hotplug devices. This can be confirmed by checking the guest OS dmesg:

[8.209406] acpi PNP0A08:03: _OSC: platform does not support [PCIeHotplug PME]

Try to enable hotplug in the BIOS settings.


Force Hotplug

If the guest OS is running a kernel older than 4.19 and the virtio device is not found by "lspci | grep -i virtio", add the entry pciehp.pciehp_force=1 to the grub command line.

Hot-unplug Devices with Heavy Self-traffic, Guest OS Gets Call Trace

Problem

When the guest OS is running heavy traffic (e.g., iperf/iperf3) on a hotplug virtio-net device, unplugging those devices from the BlueField side at the same time may result in the guest OS hanging.

The guest OS prints a call trace similar to the following:

[ 203.886218] CPU: 35 PID: 3077 Comm: iperf3 Not tainted 6.6.0 #1
[ 203.886222] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.2.5 04/08/2021
[ 203.886224] RIP: 0010:free_old_xmit_skbs+0x5d/0xf0 [virtio_net]
[ 203.886247] Code: 41 f6 c4 01 75 75 66 90 44 89 fe 4c 89 e7 45 03 6c 24 70 e8 65 1a 0a f0 83 c3 01 49 8b 3e 48 8d 75 cc e8 26 21 d1 ef 49 89 c4 <48> 85 c0 75 d1 85 db 74 0e 4d 01 ae 80 02 00 00 49 01 9e 78 02 00
[ 203.886249] RSP: 0018:ffffac62cb837678 EFLAGS: 00000246
[ 203.886253] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9a35e7dbc000
[ 203.886255] RDX: 0000000000000000 RSI: ffffac62cb83767c RDI: ffff9a2e5e7d8900
[ 203.886257] RBP: ffffac62cb8376b0 R08: 0000000000000000 R09: 000000000003b2f0
[ 203.886259] R10: ffff9a2e4a570b00 R11: 000000000000000c R12: 0000000000000000
[ 203.886261] R13: 0000000000000000 R14: ffff9a2e62a48800 R15: 0000000000000000
[ 203.886263] FS: 00007f8444643400(0000) GS:ffff9a359f2c0000(0000) knlGS:0000000000000000
[ 203.886266] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 203.886268] CR2: 000056277998d028 CR3: 0000000127976000 CR4: 0000000000350ee0
[ 203.886270] Call Trace:
[ 203.886274] <NMI>
[ 203.886277] ? show_regs+0x6e/0x80
[ 203.886289] ? nmi_cpu_backtrace+0xb1/0x120
[ 203.886298] ? nmi_cpu_backtrace_handler+0x15/0x20
[ 203.886305] ? nmi_handle+0x6b/0x180
[ 203.886310] ? default_do_nmi+0x45/0x120
[ 203.886316] ? exc_nmi+0x142/0x1c0
[ 203.886319] ? end_repeat_nmi+0x16/0x67
[ 203.886328] ? free_old_xmit_skbs+0x5d/0xf0 [virtio_net]
[ 203.886334] ? free_old_xmit_skbs+0x5d/0xf0 [virtio_net]
[ 203.886341] ? free_old_xmit_skbs+0x5d/0xf0 [virtio_net]
[ 203.886347] </NMI>
[ 203.886348] <TASK>
[ 203.886349] ? free_old_xmit_skbs+0x8c/0xf0 [virtio_net]
[ 203.886356] start_xmit+0x149/0x500 [virtio_net]
[ 203.886364] dev_hard_start_xmit+0x95/0x1e0
[ 203.886370] ? validate_xmit_skb_list+0x51/0x80
[ 203.886374] sch_direct_xmit+0x10c/0x3a0
[ 203.886381] __dev_queue_xmit+0xa47/0xda0
[ 203.886387] ip_finish_output2+0x2ef/0x5a0
[ 203.886393] ? srso_return_thunk+0x5/0x10
[ 203.886400] ? nf_conntrack_in+0xeb/0x6c0 [nf_conntrack]
[ 203.886428] __ip_finish_output+0xb7/0x190
[ 203.886433] ip_finish_output+0x32/0x100
[ 203.886437] ip_output+0x63/0xf0
[ 203.886441] ? __pfx_ip_finish_output+0x10/0x10
[ 203.886446] ip_local_out+0x62/0x70
[ 203.886449] __ip_queue_xmit+0x18e/0x4b0
[ 203.886454] ip_queue_xmit+0x19/0x20
[ 203.886456] __tcp_transmit_skb+0xb2d/0xcd0
[ 203.886462] ? srso_return_thunk+0x5/0x10
[ 203.886469] tcp_write_xmit+0x565/0x1620
[ 203.886474] tcp_push_one+0x40/0x50
[ 203.886476] tcp_sendmsg_locked+0x350/0xee0
[ 203.886481] ? tcp_current_mss+0x75/0xd0
[ 203.886488] tcp_sendmsg+0x31/0x50
[ 203.886491] inet_sendmsg+0x47/0x80
[ 203.886498] sock_write_iter+0x163/0x190
[ 203.886507] vfs_write+0x342/0x3f0
[ 203.886517] ksys_write+0xb9/0xf0
[ 203.886520] __x64_sys_write+0x1d/0x30
[ 203.886522] do_syscall_64+0x60/0x90
[ 203.886528] ? srso_return_thunk+0x5/0x10
[ 203.886531] ? ksys_write+0xb9/0xf0
[ 203.886532] ? srso_return_thunk+0x5/0x10
[ 203.886535] ? exit_to_user_mode_prepare+0x35/0x180
[ 203.886542] ? srso_return_thunk+0x5/0x10
[ 203.886544] ? syscall_exit_to_user_mode+0x38/0x50
[ 203.886549] ? __x64_sys_write+0x1d/0x30
[ 203.886551] ? srso_return_thunk+0x5/0x10
[ 203.886553] ? do_syscall_64+0x6d/0x90
[ 203.886556] ? srso_return_thunk+0x5/0x10
[ 203.886558] ? syscall_exit_to_user_mode+0x38/0x50
[ 203.886561] ? srso_return_thunk+0x5/0x10
[ 203.886564] ? do_syscall_64+0x6d/0x90
[ 203.886566] ? __x64_sys_write+0x1d/0x30
[ 203.886568] ? srso_return_thunk+0x5/0x10
[ 203.886570] ? do_syscall_64+0x6d/0x90
[ 203.886572] ? srso_return_thunk+0x5/0x10
[ 203.886575] ? sysvec_apic_timer_interrupt+0x52/0x90
[ 203.886578] entry_SYSCALL_64_after_hwframe+0x6e/0xd8


Root Cause

Starting with kernel 5.14, the following patch introduced a while loop in the virtio-net TX path that may spin indefinitely when the VQ is broken (e.g., the device is removed) under heavy traffic:

commit a7766ef18b33674fa164e2e2916cef16d4e17f43
Author: Michael S. Tsirkin <mst@redhat.com>
Date:   Tue Apr 13 01:30:45 2021 -0400

    virtio_net: disable cb aggressively

    There are currently two cases where we poll TX vq not in response to a
    callback: start xmit and rx napi. We currently do this with callbacks
    enabled which can cause extra interrupts from the card. Used not to be
    a big issue as we run with interrupts disabled but that is no longer the
    case, and in some cases the rate of spurious interrupts is so high
    linux detects this and actually kills the interrupt.

    Fix up by disabling the callbacks before polling the tx vq.

    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


Solution

Currently, there is no official fix from the kernel side. The following workarounds may be employed:

  • Use a kernel without the offending patch

  • Stop heavy traffic while performing the unplug

Ubuntu Guest OS Stuck with Kernel 5.15.0-88/89-generic

Problem

When probing the virtio-pci and virtio-net kernel modules while running Ubuntu 22.04 with kernel 5.15.0-88/89-generic with any virtio function (i.e., PF or VF), the guest OS hangs and prints call traces as follows:

[ 2052.109566] CPU: 0 PID: 1183 Comm: systemd-udevd Tainted: P O L 5.15.0-88-generic #98-Ubuntu
[ 2052.109568] Hardware name: Red Hat KVM, BIOS 1.15.0-2.module+el8.6.0+14757+c25ee005 04/01/2014
[ 2052.109570] RIP: 0010:virtqueue_is_broken+0x9/0x20
[ 2052.109579] RSP: 0018:ffffc206423a79c0 EFLAGS: 00000246
[ 2052.109581] RAX: 0000000000000000 RBX: ffff9e8980bfa980 RCX: 0000000000000a20
[ 2052.109582] RDX: 0000000000000000 RSI: ffffc206423a79cc RDI: ffff9e89847b9000
[ 2052.109583] RBP: ffffc206423a7a60 R08: 0000000000000000 R09: 0000000000000003
[ 2052.109584] R10: 0000000000000003 R11: 0000000000000002 R12: ffffc206423a79f0
[ 2052.109585] R13: 0000000000000002 R14: 0000000000000004 R15: ffff9e8984667400
[ 2052.109586] FS: 00007f3e295388c0(0000) GS:ffff9e89bbc00000(0000) knlGS:0000000000000000
[ 2052.109588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2052.109590] CR2: 0000555613432be0 CR3: 0000000116af0002 CR4: 0000000000170ef0
[ 2052.109593] Call Trace:
[ 2052.109595] <IRQ>
[ 2052.109598] ? show_trace_log_lvl+0x1d6/0x2ea
[ 2052.109605] ? show_trace_log_lvl+0x1d6/0x2ea
[ 2052.109609] ? _virtnet_set_queues+0xbb/0x100 [virtio_net]
[ 2052.109615] ? show_regs.part.0+0x23/0x29
[ 2052.109618] ? show_regs.cold+0x8/0xd
[ 2052.109621] ? watchdog_timer_fn+0x1be/0x220
[ 2052.109625] ? lockup_detector_update_enable+0x60/0x60
[ 2052.109627] ? __hrtimer_run_queues+0x107/0x230
[ 2052.109631] ? kvm_clock_get_cycles+0x11/0x20
[ 2052.109637] ? hrtimer_interrupt+0x101/0x220
[ 2052.109640] ? __sysvec_apic_timer_interrupt+0x61/0xe0
[ 2052.109644] ? sysvec_apic_timer_interrupt+0x7b/0x90
[ 2052.109650] </IRQ>
[ 2052.109650] <TASK>
[ 2052.109651] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 2052.109655] ? virtqueue_is_broken+0x9/0x20
[ 2052.109656] ? virtnet_send_command+0x105/0x170 [virtio_net]
[ 2052.109660] _virtnet_set_queues+0xbb/0x100 [virtio_net]
[ 2052.109670] virtnet_probe+0x4ca/0xa10 [virtio_net]
[ 2052.109674] virtio_dev_probe+0x1ae/0x260
[ 2052.109676] really_probe+0x222/0x420
[ 2052.109679] __driver_probe_device+0xe8/0x140
[ 2052.109681] driver_probe_device+0x23/0xc0
[ 2052.109683] __driver_attach+0xf7/0x1f0
[ 2052.109685] ? __device_attach_driver+0x140/0x140
[ 2052.109687] bus_for_each_dev+0x7f/0xd0
[ 2052.109691] driver_attach+0x1e/0x30
[ 2052.109693] bus_add_driver+0x148/0x220
[ 2052.109695] driver_register+0x95/0x100
[ 2052.109697] register_virtio_driver+0x20/0x40
[ 2052.109698] virtio_net_driver_init+0x74/0x1000 [virtio_net]
[ 2052.109702] ? 0xffffffffc0d6f000
[ 2052.109704] do_one_initcall+0x49/0x1e0
[ 2052.109709] ? kmem_cache_alloc_trace+0x19e/0x2e0
[ 2052.109713] do_init_module+0x52/0x260
[ 2052.109716] load_module+0xb2b/0xbc0
[ 2052.109718] __do_sys_finit_module+0xbf/0x120
[ 2052.109721] __x64_sys_finit_module+0x18/0x20
[ 2052.109722] do_syscall_64+0x5c/0xc0
[ 2052.109725] ? do_syscall_64+0x69/0xc0
[ 2052.109726] ? syscall_exit_to_user_mode+0x35/0x50
[ 2052.109729] ? __x64_sys_newfstatat+0x1c/0x30
[ 2052.109733] ? do_syscall_64+0x69/0xc0
[ 2052.109735] entry_SYSCALL_64_after_hwframe+0x62/0xcc


Solution

There is a bug in upstream version v6.5-rc4, which is fixed in v6.5-rc7. Canonical backported the problematic patch to Ubuntu 5.15.0-88/89-generic, which triggers this virtio-net deadlock issue:

commit 51b813176f098ff61bd2833f627f5319ead098a5
Author: Jason Wang <jasowang@redhat.com>
Date:   Wed Aug 9 23:12:56 2023 -0400

    virtio-net: set queues after driver_ok

    Commit 25266128fe16 ("virtio-net: fix race between set queues and
    probe") tries to fix the race between set queues and probe by calling
    _virtnet_set_queues() before DRIVER_OK is set. This violates virtio
    spec. Fixing this by setting queues after virtio_device_ready().

    Note that rtnl needs to be held for userspace requests to change the
    number of queues. So we are serialized in this way.

    Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
    Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Switch the default kernel to another version (e.g., 5.15.0-79-generic).

Note

The issue is fixed in the official Ubuntu kernel starting from 5.15.0-90-generic.
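
Before loading the virtio modules, the running kernel can be compared against the two affected builds. The following is a minimal sketch; the function name is_affected_kernel is made up for this example, and the list contains only the two releases named in this section.

```shell
#!/usr/bin/env bash
# Sketch: succeed if the given kernel release (default: the running
# kernel) is one of the two Ubuntu builds named in this section.
is_affected_kernel() {
    case "${1:-$(uname -r)}" in
        5.15.0-88-generic|5.15.0-89-generic) return 0 ;;
        *) return 1 ;;
    esac
}

if is_affected_kernel; then
    echo "$(uname -r) is affected; boot another kernel before probing virtio-net"
else
    echo "$(uname -r) is not one of the affected builds"
fi
```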

There are multiple ways to switch the default kernel. The following is only one example:

Note

Users must have root permission before proceeding.

  1. Open /etc/default/grub and change GRUB_DEFAULT as follows:

    GRUB_DEFAULT=saved

  2. Save the file.

  3. Run the following to get the number of the kernel you want:

    # grep "menuentry 'Ubuntu," /boot/grub/grub.cfg

    Info

    Numbering starts from 0 (i.e., first entry is 0)

  4. Run the following to set the default kernel:

    # grub-set-default num_from_last_step

  5. Reboot.
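
To avoid counting entries by hand in step 3, the matching lines can be printed together with their zero-based index. The following is a minimal sketch; the helper name list_kernel_entries is made up for this example, and the scratch file contents are hypothetical entries used only for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: print the "menuentry 'Ubuntu," lines of a grub.cfg with a
# zero-based index, matching the numbering described in step 3.
list_kernel_entries() {
    local cfg="${1:-/boot/grub/grub.cfg}"
    # Split on single quotes so that field 2 is the entry title.
    awk -F"'" -v q="'" \
        'index($0, "menuentry " q "Ubuntu,") { printf "%d: %s\n", n++, $2 }' "$cfg"
}

# Demonstration on a scratch file with hypothetical entries.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
menuentry 'Ubuntu' --class ubuntu {
	menuentry 'Ubuntu, with Linux 5.15.0-88-generic' --class ubuntu {
	menuentry 'Ubuntu, with Linux 5.15.0-79-generic' --class ubuntu {
EOF
list_kernel_entries "$scratch"
rm -f "$scratch"
```

The index printed next to the desired kernel is the value to pass to grub-set-default in step 4.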

© Copyright 2024, NVIDIA. Last updated on Nov 12, 2024.