Date: 2024-08-14
Ubuntu Guest OS Hangs with Kernel 5.15.0-88/89-generic
When the virtio-pci and virtio-net kernel modules are probed on Ubuntu 22.04 with kernel 5.15.0-88-generic or 5.15.0-89-generic, using any virtio function (i.e., PF or VF), the guest OS hangs and prints call traces such as the following:
[ 2052.109566] CPU: 0 PID: 1183 Comm: systemd-udevd Tainted: P O L 5.15.0-88-generic #98-Ubuntu
[ 2052.109568] Hardware name: Red Hat KVM, BIOS 1.15.0-2.module+el8.6.0+14757+c25ee005 04/01/2014
[ 2052.109570] RIP: 0010:virtqueue_is_broken+0x9/0x20
[ 2052.109579] RSP: 0018:ffffc206423a79c0 EFLAGS: 00000246
[ 2052.109581] RAX: 0000000000000000 RBX: ffff9e8980bfa980 RCX: 0000000000000a20
[ 2052.109582] RDX: 0000000000000000 RSI: ffffc206423a79cc RDI: ffff9e89847b9000
[ 2052.109583] RBP: ffffc206423a7a60 R08: 0000000000000000 R09: 0000000000000003
[ 2052.109584] R10: 0000000000000003 R11: 0000000000000002 R12: ffffc206423a79f0
[ 2052.109585] R13: 0000000000000002 R14: 0000000000000004 R15: ffff9e8984667400
[ 2052.109586] FS: 00007f3e295388c0(0000) GS:ffff9e89bbc00000(0000) knlGS:0000000000000000
[ 2052.109588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2052.109590] CR2: 0000555613432be0 CR3: 0000000116af0002 CR4: 0000000000170ef0
[ 2052.109593] Call Trace:
[ 2052.109595] <IRQ>
[ 2052.109598] ? show_trace_log_lvl+0x1d6/0x2ea
[ 2052.109605] ? show_trace_log_lvl+0x1d6/0x2ea
[ 2052.109609] ? _virtnet_set_queues+0xbb/0x100 [virtio_net]
[ 2052.109615] ? show_regs.part.0+0x23/0x29
[ 2052.109618] ? show_regs.cold+0x8/0xd
[ 2052.109621] ? watchdog_timer_fn+0x1be/0x220
[ 2052.109625] ? lockup_detector_update_enable+0x60/0x60
[ 2052.109627] ? __hrtimer_run_queues+0x107/0x230
[ 2052.109631] ? kvm_clock_get_cycles+0x11/0x20
[ 2052.109637] ? hrtimer_interrupt+0x101/0x220
[ 2052.109640] ? __sysvec_apic_timer_interrupt+0x61/0xe0
[ 2052.109644] ? sysvec_apic_timer_interrupt+0x7b/0x90
[ 2052.109650] </IRQ>
[ 2052.109650] <TASK>
[ 2052.109651] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 2052.109655] ? virtqueue_is_broken+0x9/0x20
[ 2052.109656] ? virtnet_send_command+0x105/0x170 [virtio_net]
[ 2052.109660] _virtnet_set_queues+0xbb/0x100 [virtio_net]
[ 2052.109670] virtnet_probe+0x4ca/0xa10 [virtio_net]
[ 2052.109674] virtio_dev_probe+0x1ae/0x260
[ 2052.109676] really_probe+0x222/0x420
[ 2052.109679] __driver_probe_device+0xe8/0x140
[ 2052.109681] driver_probe_device+0x23/0xc0
[ 2052.109683] __driver_attach+0xf7/0x1f0
[ 2052.109685] ? __device_attach_driver+0x140/0x140
[ 2052.109687] bus_for_each_dev+0x7f/0xd0
[ 2052.109691] driver_attach+0x1e/0x30
[ 2052.109693] bus_add_driver+0x148/0x220
[ 2052.109695] driver_register+0x95/0x100
[ 2052.109697] register_virtio_driver+0x20/0x40
[ 2052.109698] virtio_net_driver_init+0x74/0x1000 [virtio_net]
[ 2052.109702] ? 0xffffffffc0d6f000
[ 2052.109704] do_one_initcall+0x49/0x1e0
[ 2052.109709] ? kmem_cache_alloc_trace+0x19e/0x2e0
[ 2052.109713] do_init_module+0x52/0x260
[ 2052.109716] load_module+0xb2b/0xbc0
[ 2052.109718] __do_sys_finit_module+0xbf/0x120
[ 2052.109721] __x64_sys_finit_module+0x18/0x20
[ 2052.109722] do_syscall_64+0x5c/0xc0
[ 2052.109725] ? do_syscall_64+0x69/0xc0
[ 2052.109726] ? syscall_exit_to_user_mode+0x35/0x50
[ 2052.109729] ? __x64_sys_newfstatat+0x1c/0x30
[ 2052.109733] ? do_syscall_64+0x69/0xc0
[ 2052.109735] entry_SYSCALL_64_after_hwframe+0x62/0xcc
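To check whether a saved kernel log shows the same signature as the trace above, a small helper along these lines may be useful. This is only a sketch: the function name matches_virtnet_hang and the log path in the usage comment are hypothetical, and the check simply looks for three function names taken from the trace.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Ubuntu or vendor tool).
# Returns 0 if the given log file contains all three function names that
# appear in the call trace above, and non-zero otherwise.
matches_virtnet_hang() {
    log_file=$1
    grep -q 'virtqueue_is_broken' "$log_file" &&
        grep -q '_virtnet_set_queues' "$log_file" &&
        grep -q 'virtnet_probe' "$log_file"
}

# Example usage (the log path is an assumption):
#   dmesg > /tmp/dmesg.txt
#   matches_virtnet_hang /tmp/dmesg.txt && echo "log matches the virtio-net hang trace"
```

A match only indicates that the log resembles this trace; it does not by itself confirm the root cause.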
A bug present in upstream kernel v6.5-rc4 was fixed in v6.5-rc7. Canonical backported the problematic patch to the Ubuntu 5.15.0-88-generic and 5.15.0-89-generic kernels, which triggers this virtio-net deadlock. The upstream fix is the following commit:
commit 51b813176f098ff61bd2833f627f5319ead098a5
Author: Jason Wang <jasowang@redhat.com>
Date: Wed Aug 9 23:12:56 2023 -0400

    virtio-net: set queues after driver_ok

    Commit 25266128fe16 ("virtio-net: fix race between set queues and
    probe") tries to fix the race between set queues and probe by calling
    _virtnet_set_queues() before DRIVER_OK is set. This violates virtio
    spec. Fixing this by setting queues after virtio_device_ready().

    Note that rtnl needs to be held for userspace requests to change the
    number of queues. So we are serialized in this way.

    Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe")
    Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
    Acked-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the default kernel to an unaffected version (e.g., 5.15.0-79-generic).
The issue is fixed in the official Ubuntu kernel starting with 5.15.0-90-generic.
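Before changing the boot configuration, it can help to confirm that the guest is actually running one of the two affected kernels. The following is a minimal sketch; is_affected_kernel is a hypothetical helper name, and the version strings are the ones named in this note.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given kernel release string is one of
# the two affected versions named in this note, and 1 otherwise.
is_affected_kernel() {
    case "$1" in
        5.15.0-88-generic|5.15.0-89-generic) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage: check the running kernel as reported by uname -r.
#   if is_affected_kernel "$(uname -r)"; then
#       echo "Running an affected kernel; switch to another version."
#   fi
```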
There are multiple ways to switch the default kernel. The following is only one example:
Users must have root permission before proceeding.
Open /etc/default/grub and change GRUB_DEFAULT as follows:
GRUB_DEFAULT=saved
Save the file.
Run the following to get the number of the kernel you want:
# grep "menuentry 'Ubuntu," /boot/grub/grub.cfg
Numbering starts from 0 (i.e., the first entry is 0).
Run the following to set the default kernel:
# grub-set-default num_from_last_step
Reboot.
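Counting entries by hand is error-prone, so the grep step above can be wrapped to print each matching entry with its zero-based number. This is a sketch only: list_ubuntu_entries is a hypothetical helper name, and it numbers only the lines that the grep pattern matches, as the steps above describe.

```shell
#!/bin/sh
# Hypothetical helper: print each "menuentry 'Ubuntu," line from a grub.cfg
# file as "<index>: <title>", with indexes starting at 0.
# The default path is the one used in the steps above.
list_ubuntu_entries() {
    grub_cfg=${1:-/boot/grub/grub.cfg}
    grep "menuentry 'Ubuntu," "$grub_cfg" |
        awk -F"'" '{ printf "%d: %s\n", NR - 1, $2 }'
}

# Example usage (reading grub.cfg may require root):
#   list_ubuntu_entries
#   grub-set-default num_from_last_step
```

If the listed entries sit inside a submenu on a given system, GRUB may expect a different identifier than a plain number; that is an assumption worth verifying against the GRUB documentation before rebooting.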