Recent 5.13 kernel has broken KVM support

Bug #1966499 reported by Stéphane Graber
This bug affects 14 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
  Impish         Fix Released   High         Po-Hsu Lin
  Jammy          Fix Released   Undecided    Unassigned

Bug Description

[Impact]
This is caused by commit 08335308 "KVM: x86: check PIR even for vCPUs
with disabled APICv". That patch depends on 7e1901f6c "KVM: VMX: prepare
sync_pir_to_irr for running with APICv disabled". Without it, a vCPU
that has APICv disabled triggers this warning in vmx_sync_pir_to_irr()
in vmx.c:
    WARN_ON(!vcpu->arch.apicv_active);

With warnings like:
------------[ cut here ]------------
WARNING: CPU: 13 PID: 6997 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]
? xfer_to_guest_mode_work+0xe2/0x110
Modules linked in: vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm joydev input_leds ioatdma rapl intel_cstate efi_pstore ipmi_si mei_me mei mac_hid acpi_pad
vcpu_run+0x4d/0x220 [kvm]
acpi_power_meter sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mgag200 i2c_algo_bit drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul sysfillrect sysimgblt ghash_clmulni_intel fb_sys_fops ixgbe cec aesni_intel rc_core crypto_simd xfrm_algo cryptd drm ahci dca i2c_i801 xhci_pci mdio libahci i2c_smbus lpc_ich xhci_pci_renesas wmi
CPU: 13 PID: 6997 Comm: qemu-system-x86 Tainted: G W I 5.13.0-39-generic #44-Ubuntu
Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.1008.031920151331 03/19/2015
kvm_arch_vcpu_ioctl_run+0xc5/0x4f0 [kvm]
RIP: 0010:vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]
Code: e8 47 f5 18 00 8b 93 00 03 00 00 89 45 ec 83 e2 20 85 d2 74 dc 48 8b 55 f0 65 48 2b 14 25 28 00 00 00 75 1d 48 8b 5d f8 c9 c3 <0f> 0b eb 87 f0 80 4b 39 40 8b 93 00 03 00 00 8b 45 ec 83 e2 20 eb
RSP: 0018:ffffae4d8d107c98 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff99c552942640 RCX: ffff99c5043a72f0
RDX: ffff99c552942640 RSI: 0000000000000001 RDI: ffff99c552942640
RBP: ffffae4d8d107cb0 R08: ffff99c86f6a7140 R09: 0000000000027100
R10: 0000000042280000 R11: 000000000000000a R12: ffff99c552942640
R13: 0000000000000000 R14: ffffae4d8d1a63e0 R15: ffff99c552942640
FS: 00007f6ae9be7640(0000) GS:ffff99c86f680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010b8a6006 CR4: 00000000001726e0
Call Trace:
<TASK>
kvm_vcpu_ioctl+0x243/0x5e0 [kvm]
vcpu_enter_guest+0x383/0xf50 [kvm]
? xfer_to_guest_mode_work+0xe2/0x110
? kvm_vm_ioctl+0x364/0x730 [kvm]
? __fget_files+0x86/0xc0
vcpu_run+0x4d/0x220 [kvm]
__x64_sys_ioctl+0x91/0xc0
do_syscall_64+0x61/0xb0
? fput+0x13/0x20
? exit_to_user_mode_prepare+0x37/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? do_syscall_64+0x6e/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? do_syscall_64+0x6e/0xb0
? do_syscall_64+0x6e/0xb0
? do_syscall_64+0x6e/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f6aebce1a2b
Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 f3 0f 00 f7 d8 64 89 01 48
RSP: 002b:00007f6ae8ffe3f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6aebce1a2b
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
RBP: 0000557d3b429b90 R08: 0000557d3a4ebff0 R09: 00000000ffffffff
kvm_arch_vcpu_ioctl_run+0xc5/0x4f0 [kvm]
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000003000 R15: 0000000000000000
</TASK>
---[ end trace 5b722d71a78069b1 ]---

This warning floods the system log files, eventually consumes all
disk space, and then crashes the server.

The issue goes away either by reverting the offending commit or by
adding the fixes below.

Reference:
https://patchwork<email address hidden>/

[Fixes]
* 0b8f11737 KVM: Add infrastructure and macro to mark VM as bugged
* 673692735 KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
* 7e1901f6c KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled

The fix is twofold. The first two patches stop the warning flood by
making the message print only once. The third patch prevents the
warning from being triggered in the first place.

The first patch needs to be backported as we're missing:
  2fdef3a2ae kvm: add PM-notifier
  fcfe1baedd KVM: stats: Support binary stats retrieval for a VM

The second patch needs some context adjustment, and the last one can
be cherry-picked.

[Test]
Test kernels can be found here:
https://people.canonical.com/~phlin/kernel/lp-1966499-kvm-warn-flood/

This issue can be verified with LXD:
  1. snap install lxd
  2. lxc launch images:ubuntu/20.04 --vm vm1

On an affected system, the dmesg output is flooded with this warning
message. With the patched kernel, the VM starts with a clean dmesg.
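
The flood check can also be expressed as a number. A minimal sketch, assuming the dmesg output has first been saved to a file; the helper name count_pir_warnings and the path /tmp/dmesg.txt are made up for illustration and are not part of the original test procedure:

```shell
# Hypothetical helper: count kernel WARNING lines raised from
# vmx_sync_pir_to_irr() in a saved log (dmesg output, kern.log, syslog).
count_pir_warnings() {
    # grep -c prints the number of matching lines. It exits 1 when the
    # count is 0, which is mapped to success here so that a clean log
    # is not treated as an error. Other grep failures still propagate.
    grep -c 'WARNING: .*vmx_sync_pir_to_irr' "$1" || [ "$?" -eq 1 ]
}
```

For example, `sudo dmesg > /tmp/dmesg.txt; count_pir_warnings /tmp/dmesg.txt` should print 0 on a patched kernel after the VM has started, and a quickly growing number on an affected one.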

I have tested this kernel on Impish; the F-5.13 kernel has been tested by
Daniël Vos (vosdev) on Launchpad. Both work as expected.

kvm-unit-tests has also been run on my Impish instance to ensure
there are no other issues.

[Where problems could occur]
This patchset changes how KVM bugs get reported in the kernel;
if it is incorrect, it might affect VMX capability.

[Original Bug Report]
Upgrading to 5.13.0-37 or 5.13.0-39 immediately crashes my production servers as they hit:
https://<email address hidden>/T/#md1f5c8c4aa01130a449a47f3e7559f06b0372f55

It looks like we need to get e90e51d5f01d included in those kernels.

Stéphane Graber (stgraber) wrote :

Mar 25 16:18:30 abydos kernel: [ 1319.549186] ------------[ cut here ]------------
Mar 25 16:18:30 abydos kernel: [ 1319.549191] WARNING: CPU: 12 PID: 15052 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9f/0xc0 [kvm_intel]
Mar 25 16:18:30 abydos kernel: [ 1319.549213] Modules linked in: wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libblake2s blake2s_x86_64 libcurve25519_generic libchacha libblake2s_generic xt_HL xt_MASQUERADE xt_TCPMSS xt_tcpudp binfmt_misc rbd unix_diag nf_conntrack_netlink veth ceph libceph fscache netfs zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat iptable_filter bpfilter nf_tables vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock shiftfs sch_ingress geneve ip6_udp_tunnel udp_tunnel nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 8021q garp mrp nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rapl intel_cstate
Mar 25 16:18:30 abydos kernel: [ 1319.549305] efi_pstore joydev input_leds cdc_acm mei_me mei ioatdma bridge stp llc bonding acpi_ipmi tls ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid sch_fq_codel ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast drm_vram_helper drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt crct10dif_pclmul fb_sys_fops crc32_pclmul ghash_clmulni_intel cec nvme rc_core aesni_intel crypto_simd ixgbe igb i2c_i801 xfrm_algo nvme_core drm i2c_smbus ahci i2c_algo_bit cryptd lpc_ich dca xhci_pci mdio libahci xhci_pci_renesas wmi
Mar 25 16:18:30 abydos kernel: [ 1319.549394] CPU: 12 PID: 15052 Comm: qemu-system-x86 Tainted: P O 5.13.0-39-generic #44~20.04.1-Ubuntu
Mar 25 16:18:30 abydos kernel: [ 1319.549399] Hardware name: Supermicro PIO-618U-T4T+-ST031/X10DRU-i+, BIOS 3.2a 11/19/2019
Mar 25 16:18:30 abydos kernel: [ 1319.549402] RIP: 0010:vmx_sync_pir_to_irr+0x9f/0xc0 [kvm_intel]
Mar 25 16:18:30 abydos kernel: [ 1319.549415] Code: 83 c4 10 5b 5d c3 48 89 df e8 5d dc eb fd 8b 93 00 03 00 00 89 45 ec 83 e2 20 85 d2 75 d2 89 c7 e8 f6 fd ff ff 8b 45 ec eb c6 <0f> 0b eb 86 f0 80 4b 39 40 8b 93 00 03 00 00 8b 45 ec 83 e2 20 eb
Mar 25 16:18:30 abydos kernel: [ 1319.549419] RSP: 0018:ffffaed30c577cd8 EFLAGS: 00010246
Mar 25 16:18:30 abydos kernel: [ 1319.549423] RAX: 0000000000000000 RBX: ffff98162a4f8000 RCX: 0000000000000000
Mar 25 16:18:30 abydos kernel: [ 1319.549425] RDX: 0000000000000400 RSI: ffffffffc0c38509 RDI: ffff98162a4f8000
Mar 25 16:18:30 abydos kernel: [ 1319.549428] RBP: ffffaed30c577cf0 R08: 0000000000000400 R09: ffff98161e73fc00
Mar 25 16:18:30 abydos kernel: [ 1319.549430] R10: 0000000000000000 R11: 0000000000000000 R12: ffff98162a4f8000
Mar 25 16:18:30 aby...

Changed in linux (Ubuntu):
status: New → Confirmed
Stéphane Graber (stgraber) wrote :

This repeats in a loop and fills tens of GBs of space with kernel logs in just a few minutes before crashing the entire system.

Simon Déziel (sdeziel) wrote :

5.13.0-38.43 has the fix but 5.13.0-39.44 doesn't, presumably because -39 includes urgent security fixes.

Stéphane Graber (stgraber) wrote :

Ah yeah, that could be. I figured I'd test what's in -proposed but if -proposed is a security only fix on top of -37, that wouldn't help much.

It's a bit frustrating because users would have gotten the busted kernel as part of -37 which includes a security fix but then the only real option to get a booting system back now is to go to pre-security-fix.

Unfortunately this is a production server and I already spent half of the week dealing with this mess so don't have more time to play kernel bingo. Server is now running a clean upstream build.

Eric Anopolsky (erpo41) wrote :

I arrived at work this morning to find a frozen workstation with a full disk. This bug was one of two relevant search results, so I'm documenting the recovery steps I took here in case it will be helpful to others:

1. Held down the power button until the PC turned off.
2. Waited a few seconds and powered the PC back on.
3. On the grub menu, chose Advanced options for Ubuntu -> Ubuntu, with Linux 5.13.0-35-generic (recovery mode) and waited for the system to boot.
4. On the Recovery Menu, chose the root option and hit enter.
5. Ran the following:

rm /var/log/kern.log # freed up about 800GB
head -n 100000 /var/log/syslog > /var/log/syslog.bak # saved a few copies of the error just in case
rm /var/log/syslog # freed up another 800GB
apt remove linux-image-5.13.0-37-generic linux-image-unsigned-5.13.0-37-generic # it was necessary to remove the second package because that's what apt tried to install when only the first one was specified
reboot # temporary fix is complete

6. Allowed the system to boot normally without intervention.
7. Logged in, opened a terminal, and confirmed that the fix worked:

uname -a
# Expected output: Linux IS01209 5.13.0-35-generic #40~20.04.1-Ubuntu SMP Mon Mar 7 09:18:32 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

virsh list
# Expected output:
# Id Name State
#---------------------------------
# 1 my-windows-vm running
#
# Note: My Windows VM is configured to start on boot. For affected users who have a Windows VM that is not configured to start on boot, it may be necessary to start the VM manually before moving on to the next step.

watch du -h /var/log/syslog /var/log/kern.log
# Expected output:
# Every 2.0s: du -h /var/... (hostname and date/time)
#
# 320K /var/log/syslog
# 100K /var/log/kern.log
#
# Note: If those two files stay the same size or grow slowly, the fix worked. If those two files grow rapidly, the fix did not work.

Note that the above steps will also uninstall the linux-generic-hwe-20.04 package, which will prevent future kernel updates. When this bug is resolved, I plan to run (and recommend that others run) `sudo apt install linux-generic-hwe-20.04` to resume kernel updates.
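
The `watch du` check in step 7 can also be scripted. A rough sketch, assuming a POSIX shell; log_growth_bytes is a hypothetical helper name, and what counts as "rapid" growth is left to the reader's judgment:

```shell
# Hypothetical helper: print how many bytes a file grew over an
# interval given in seconds (default: 5).
log_growth_bytes() {
    file=$1
    interval=${2:-5}
    # Take the file size before and after waiting, then print the difference.
    before=$(wc -c < "$file")
    sleep "$interval"
    after=$(wc -c < "$file")
    echo $((after - before))
}
```

Running `log_growth_bytes /var/log/kern.log 10` prints the number of bytes added in ten seconds. On a healthy system this should stay small; on an affected one, where the logs reportedly grow by tens of GBs within minutes, it will be very large.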

DaveTickem (dave-tickem) wrote :

Same here: I rolled back to .35 and put a hold on that version until a fix is released. The behaviour aligns with comments on the related thread: it only occurs when passing through a PCIe GPU to a guest (Linux in my case). Drop the VM back to QEMU graphics, a serial terminal, or similar, and this log spam does not materialise.

Here is a snippet of the log. I can provide more, but it is very similar to the first report.

Mar 22 07:43:27 giskard kernel: [ 295.128031] ------------[ cut here ]------------
Mar 22 07:43:27 giskard kernel: [ 295.128124] RIP: 0010:vmx_sync_pir_to_irr+0x9f/0xc0 [kvm_intel]
Mar 22 07:43:27 giskard kernel: [ 295.128137] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8bb030e90000
Mar 22 07:43:27 giskard kernel: [ 295.128143] <TASK>
Mar 22 07:43:27 giskard kernel: [ 295.128271] __x64_sys_ioctl+0x91/0xc0
Mar 22 07:43:27 giskard kernel: [ 295.128286] entry_SYSCALL_64_after_hwframe+0x44/0xae
Mar 22 07:43:27 giskard kernel: [ 295.128295] RBP: 0000560f97b50c70 R08: 0000560f966401f0 R09: 00000000ffffffff
Mar 22 07:43:27 giskard kernel: [ 295.128317] WARNING: CPU: 4 PID: 8318 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9f/0xc0 [kvm_intel]
Mar 22 07:43:27 giskard kernel: [ 295.128417] Code: 83 c4 10 5b 5d c3 48 89 df e8 5d 6c 12 00 8b 93 00 03 00 00 89 45 ec 83 e2 20 85 d2 75 d2 89 c7 e8 f6 fd ff ff 8b 45 ec eb c6 <0f> 0b eb 86 f0 80 4b 39 40 8b 93 00 03 00 00 8b 45 ec 83 e2 20 eb
Mar 22 07:43:27 giskard kernel: [ 295.128425] R13: 0000000000000000 R14: ffffb071854f23e0 R15: ffff8bafc4d43080
Mar 22 07:43:27 giskard kernel: [ 295.128432] vcpu_enter_guest+0x354/0x11e0 [kvm]
Mar 22 07:43:27 giskard kernel: [ 295.128562] do_syscall_64+0x61/0xb0
Mar 22 07:43:27 giskard kernel: [ 295.128576] RIP: 0033:0x7fcada3633db
Mar 22 07:43:27 giskard kernel: [ 295.128582] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000015
Mar 22 07:43:27 giskard kernel: [ 295.128588] </TASK>

Eric Anopolsky (erpo41) wrote :

This issue occurs in my environment with a Windows guest even without PCIe GPU passthrough:

anopolsky@IS01209:~$ virsh dumpxml windows-primary |grep hostdev|wc -l
0
anopolsky@IS01209:~$ virsh dumpxml windows-primary |grep -A 4 '<video>'
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>

Axton Grams (axton-grams) wrote :

Similar configuration with Windows guests (6 total):

root@cluster-05:/var/log# virsh dumpxml guest1 |grep -A 4 '<video>'
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </video>

This is the first error, just after starting the first vm.

Mar 28 06:27:17 cluster-05 systemd[1]: Started Virtual Machine guest1.
Mar 28 06:27:17 cluster-05 kernel: [ 21.227759] ------------[ cut here ]------------
Mar 28 06:27:17 cluster-05 kernel: [ 21.227762] WARNING: CPU: 21 PID: 5027 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9f/0xc0 [kvm_intel]
Mar 28 06:27:17 cluster-05 kernel: [ 21.227779] Modules linked in: vhost_net vhost vhost_iotlb tap ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bpfilter nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua zfs(PO) zunicode(PO) zzstd(O) ipmi_ssif zlua(O) intel_rapl_msr zavl(PO) intel_rapl_common icp(PO) zcommon(PO) znvpair(PO) spl(O) isst_if_common skx_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ast kvm crct10dif_pclmul drm_vram_helper ghash_clmulni_intel joydev drm_ttm_helper ttm rapl drm_kms_helper input_leds intel_cstate cec rc_core i2c_algo_bit efi_pstore fb_sys_fops syscopyarea sysfillrect sysimgblt mei_me ioatdma mei intel_pch_thermal dca acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler bridge acpi_pad acpi_power_meter mac_hid sch_fq_codel mii 8021q garp mrp stp llc bonding tls drm ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear
Mar 28 06:27:17 cluster-05 kernel: [ 21.227848] hid_generic usbhid hid raid1 crc32_pclmul aesni_intel crypto_simd i40e mpt3sas cryptd nvme raid_class i2c_i801 scsi_transport_sas ahci nvme_core xhci_pci i2c_smbus lpc_ich libahci xhci_pci_renesas wmi
Mar 28 06:27:17 cluster-05 kernel: [ 21.227865] CPU: 21 PID: 5027 Comm: CPU 0/KVM Tainted: P O 5.13.0-37-generic #42~20.04.1-Ubuntu
Mar 28 06:27:17 cluster-05 kernel: [ 21.227868] Hardware name: Supermicro SYS-5019P-M/X11SPM-F, BIOS 3.0c 03/27/2019
Mar 28 06:27:17 cluster-05 kernel: [ 21.227870] RIP: 0010:vmx_sync_pir_to_irr+0x9f/0xc0 [kvm_intel]
Mar 28 06:27:17 cluster-05 kernel: [ 21.227879] Code: 83 c4 10 5b 5d c3 48 89 df e8 5d 1c 30 00 8b 93 00 03 00 00 89 45 ec 83 e2 20 85 d2 75 d2 89 c7 e8 f6 fd ff ff 8b 45 ec eb c6 <0f> 0b eb 86 f0 80 4b 39 40 8b 93 00 03 00 00 8b 45 ec 83 e2 20 eb
Mar 28 06:27:17 cluster-05 kernel: [ 21.227881] RSP: 0018:ffffb6e7c1f97cf0 EFLAGS: 00010246
Mar 28 06:27:17 cluster-05 kernel: [ 21.227884] RAX: 0000000000000000 RBX: ffff9cd2b0a5a640 RCX: 0000000000000006
Mar 28 06:27:17 cluster-05 kernel: [ 21.227886] RDX: 00000000fffe2b98 RSI: ffffffffc0e9c509 RDI: ffff9cd2b0a5a640
Mar 28 06:27:17 cluster-05 kernel: [ 21.227887] RBP: ffffb6e7c1f97d08 R08: 0000000000000400 R09: 00000000ffffffff
Mar 28 06:27:17 cluster-05 kernel: [ 21.227889] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9cd2b0a5a640
Mar 28 06:27:17 ...

Po-Hsu Lin (cypressyew) wrote :

Hello,
the patch e90e51d5f01d ("KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled") has been applied to the master-next branch of our Impish tree.

commit 1b611352fd6fd42a5a672f52f5f3035d48b3cd13
Author: Paolo Bonzini <email address hidden>
Date: Tue Nov 30 07:36:41 2021 -0500

    KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled

    BugLink: https://bugs.launchpad.net/bugs/1958287

    [ Upstream commit e90e51d5f01d2baae5dcce280866bbb96816e978 ]

    There is nothing to synchronize if APICv is disabled, since neither
    other vCPUs nor assigned devices can set PIR.ON.

    Signed-off-by: Paolo Bonzini <email address hidden>
    Signed-off-by: Sasha Levin <email address hidden>
    Signed-off-by: Kamal Mostafa <email address hidden>
    Signed-off-by: Stefan Bader <email address hidden>

It should be included in 5.13.0-40.45 of SRU cycle 2022.03.21 (tracking bug lp:1966701).
I suggest giving this proposed kernel a try.

Also, can you provide the steps to reproduce this?
I can't reproduce it on a clean Impish bare-metal system running 5.13.0-39-generic with our KVM smoke test (which uses uvt-kvm to create the KVM instance).
It would be nice to improve the test coverage with it.
Thanks!

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Impish):
status: New → Confirmed
Po-Hsu Lin (cypressyew) wrote :

I managed to reproduce this with the hyperv_stimer test in kvm-unit-tests on Impish 5.13.0-39, with identical output spamming dmesg.

I can also reproduce it with 5.13.0-40.

[ 921.874608] ------------[ cut here ]------------
[ 921.874609] WARNING: CPU: 13 PID: 6997 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]
[ 921.874613] ? xfer_to_guest_mode_work+0xe2/0x110
[ 921.874616] Modules linked in: vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm joydev input_leds ioatdma rapl intel_cstate efi_pstore ipmi_si mei_me mei mac_hid acpi_pad
[ 921.874616] vcpu_run+0x4d/0x220 [kvm]
[ 921.874635] acpi_power_meter sch_fq_codel ipmi_devintf ipmi_msghandler msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid mgag200 i2c_algo_bit drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul sysfillrect sysimgblt ghash_clmulni_intel fb_sys_fops ixgbe cec aesni_intel rc_core crypto_simd xfrm_algo cryptd drm ahci dca i2c_i801 xhci_pci mdio libahci i2c_smbus lpc_ich xhci_pci_renesas wmi
[ 921.874664] CPU: 13 PID: 6997 Comm: qemu-system-x86 Tainted: G W I 5.13.0-39-generic #44-Ubuntu
[ 921.874665] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.1008.031920151331 03/19/2015
[ 921.874651] kvm_arch_vcpu_ioctl_run+0xc5/0x4f0 [kvm]
[ 921.874666] RIP: 0010:vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]
[ 921.874671] Code: e8 47 f5 18 00 8b 93 00 03 00 00 89 45 ec 83 e2 20 85 d2 74 dc 48 8b 55 f0 65 48 2b 14 25 28 00 00 00 75 1d 48 8b 5d f8 c9 c3 <0f> 0b eb 87 f0 80 4b 39 40 8b 93 00 03 00 00 8b 45 ec 83 e2 20 eb
[ 921.874673] RSP: 0018:ffffae4d8d107c98 EFLAGS: 00010046
[ 921.874674] RAX: 0000000000000000 RBX: ffff99c552942640 RCX: ffff99c5043a72f0
[ 921.874675] RDX: ffff99c552942640 RSI: 0000000000000001 RDI: ffff99c552942640
[ 921.874676] RBP: ffffae4d8d107cb0 R08: ffff99c86f6a7140 R09: 0000000000027100
[ 921.874677] R10: 0000000042280000 R11: 000000000000000a R12: ffff99c552942640
[ 921.874678] R13: 0000000000000000 R14: ffffae4d8d1a63e0 R15: ffff99c552942640
[ 921.874679] FS: 00007f6ae9be7640(0000) GS:ffff99c86f680000(0000) knlGS:0000000000000000
[ 921.874680] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 921.874681] CR2: 0000000000000000 CR3: 000000010b8a6006 CR4: 00000000001726e0
[ 921.874683] Call Trace:
[ 921.874684] <TASK>
[ 921.874684] kvm_vcpu_ioctl+0x243/0x5e0 [kvm]
[ 921.874685] vcpu_enter_guest+0x383/0xf50 [kvm]
[ 921.874720] ? xfer_to_guest_mode_work+0xe2/0x110
[ 921.874709] ? kvm_vm_ioctl+0x364/0x730 [kvm]
[ 921.874738] ? __fget_files+0x86/0xc0
[ 921.874723] vcpu_run+0x4d/0x220 [kvm]
[ 921.874742] __x64_sys_ioctl+0x91/0xc0
[ 921.874744] do_...

Po-Hsu Lin (cypressyew)
Changed in linux (Ubuntu Impish):
assignee: nobody → Po-Hsu Lin (cypressyew)
Po-Hsu Lin (cypressyew)
tags: added: 5.13 impish
Po-Hsu Lin (cypressyew) wrote :

Hello folks,
can you give this test kernel a try:
https://people.canonical.com/~phlin/kernel/lp-1966499-kvm-warn-flood/

Which contains the following commits:
 * KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
 * KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
 * KVM: Add infrastructure and macro to mark VM as bugged

$ uname -a
Linux fili 5.13.0-41-generic #46+lp1966499v2 SMP Fri Apr 8 05:39:48 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

I am not sure how to reproduce this issue by creating a new VM, so I verified it on my bare-metal test node with the hyperv_stimer test in: https://code.launchpad.net/~canonical-kernel-team/+git/kvm-unit-tests
Steps:
1. sudo apt install -y build-essential cpu-checker qemu-kvm git gcc
2. git clone -b hirsute --depth=1 https://code.launchpad.net/~canonical-kernel-team/+git/kvm-unit-tests
3. cd kvm-unit-tests; make standalone
4. sudo ./tests/hyperv_stimer

With an unpatched kernel the test fails with a timeout,
and dmesg is flooded with this warning until the disk runs out of space:
  WARNING: CPU: 13 PID: 6997 at arch/x86/kvm/vmx/vmx.c:6336 vmx_sync_pir_to_irr+0x9e/0xc0 [kvm_intel]

With the patched kernel the test passes with a clean dmesg.

Daniël Vos (vosdev) wrote :

I'm available for testing but I only have Ubuntu 20.04 servers available. I believe you built it for a newer version.

dpkg: dependency problems prevent configuration of linux-headers-5.13.0-41-generic:
 linux-headers-5.13.0-41-generic depends on libc6 (>= 2.34); however:
  Version of libc6:amd64 on system is 2.31-0ubuntu9.7.

You could also `snap install lxd` and `lxc launch images:ubuntu/20.04 --vm vm1` to run a VM using LXD.

Followed by `lxc exec vm1 bash` after 10-30 seconds

Po-Hsu Lin (cypressyew) wrote :

Hey Daniël Vos (vosdev),
thanks for the reproducer; it helps confirm that this fix works as expected and gives us more coverage in the future!
I get this warning flood right after starting the VM on my Impish test server, and the fix works.

And yes, this kernel is for Impish; I should have built a Focal HWE kernel as well. You will find the Focal 5.13 HWE kernel in the focal directory:
https://people.canonical.com/~phlin/kernel/lp-1966499-kvm-warn-flood/

Daniël Vos (vosdev) wrote :

Hey Po-Hsu Lin,

Tested on Ubuntu 20.04 and can confirm my VMs run again! Thanks for the quick fix!

Po-Hsu Lin (cypressyew)
Changed in linux (Ubuntu Impish):
status: Confirmed → In Progress
Po-Hsu Lin (cypressyew) wrote :

Thanks for your feedback, I have submitted the patch for SRU:
https://lists.ubuntu.com/archives/kernel-team/2022-April/129362.html

description: updated
Changed in linux (Ubuntu Jammy):
status: Confirmed → Fix Released
Stefan Bader (smb)
Changed in linux (Ubuntu Impish):
importance: Undecided → High
Changed in linux (Ubuntu Impish):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.13.0-41.46 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-impish' to 'verification-done-impish'. If the problem still exists, change the tag 'verification-needed-impish' to 'verification-failed-impish'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-impish
Stéphane Z (salamafet) wrote :

Works great on my side with the 5.13.0-41.46 kernel.

VMs start correctly and there is no more log spamming.

tags: added: verification-done-impish
removed: verification-needed-impish
Po-Hsu Lin (cypressyew) wrote :

Thanks for testing this!

Axton Grams (axton-grams) wrote :

Will this also be released for focal?

Po-Hsu Lin (cypressyew) wrote :

Hi,
Yes, it will ship with the Focal HWE 5.13 kernel.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.13.0-41.46

---------------
linux (5.13.0-41.46) impish; urgency=medium

  * impish/linux: 5.13.0-41.46 -proposed tracker (LP: #1969014)

  * NVMe devices fail to probe due to ACPI power state change (LP: #1942624)
    - ACPI: power: Rework turning off unused power resources
    - ACPI: PM: Do not turn off power resources in unknown state

  * Recent 5.13 kernel has broken KVM support (LP: #1966499)
    - KVM: Add infrastructure and macro to mark VM as bugged
    - KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
    - KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled

  * LRMv6: add multi-architecture support (LP: #1968774)
    - [Packaging] resync dkms-build{,--nvidia-N}

  * io_uring regression - lost write request (LP: #1952222)
    - io-wq: split bounded and unbounded work into separate lists

  * xfrm interface cannot be changed anymore (LP: #1968591)
    - xfrm: fix the if_id check in changelink

  * Use kernel-testing repo from launchpad for ADT tests (LP: #1968016)
    - [Debian] Use kernel-testing repo from launchpad

  * vmx_ldtr_test in ubuntu_kvm_unit_tests failed (FAIL: Expected 0 for L1 LDTR
    selector (got 50)) (LP: #1956315)
    - KVM: nVMX: Set LDTR to its architecturally defined value on nested VM-Exit

  * audio from external sound card is distorted (LP: #1966066)
    - ALSA: usb-audio: Fix packet size calculation regression

  * Impish update: upstream stable patchset 2022-04-12 (LP: #1968771)
    - cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
    - btrfs: tree-checker: check item_size for inode_item
    - btrfs: tree-checker: check item_size for dev_item
    - clk: jz4725b: fix mmc0 clock gating
    - vhost/vsock: don't check owner in vhost_vsock_stop() while releasing
    - parisc/unaligned: Fix fldd and fstd unaligned handlers on 32-bit kernel
    - parisc/unaligned: Fix ldw() and stw() unalignment handlers
    - KVM: x86/mmu: make apf token non-zero to fix bug
    - drm/amdgpu: disable MMHUB PG for Picasso
    - drm/i915: Correctly populate use_sagv_wm for all pipes
    - sr9700: sanity check for packet length
    - USB: zaurus: support another broken Zaurus
    - CDC-NCM: avoid overflow in sanity checking
    - x86/fpu: Correct pkru/xstate inconsistency
    - tee: export teedev_open() and teedev_close_context()
    - optee: use driver internal tee_context for some rpc
    - ping: remove pr_err from ping_lookup
    - perf data: Fix double free in perf_session__delete()
    - bnx2x: fix driver load from initrd
    - bnxt_en: Fix active FEC reporting to ethtool
    - hwmon: Handle failure to register sensor with thermal zone correctly
    - bpf: Do not try bpf_msg_push_data with len 0
    - selftests: bpf: Check bpf_msg_push_data return value
    - bpf: Add schedule points in batch ops
    - io_uring: add a schedule point in io_add_buffers()
    - net: __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends
    - tipc: Fix end of loop tests for list_for_each_entry()
    - gso: do not skip outer ip header in case of ipip and net_failover
    - openvswitch: Fix setting ipv6 fields causing hw csum failure
   ...
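
To check whether a host is already on a build at or past this release, the ABI number in `uname -r` can be compared against 41. A minimal sketch: kernel_has_fix is a hypothetical helper, and it assumes the -generic flavour of 5.13, where the release string looks like 5.13.0-41-generic. Other flavours (for example linux-intel-5.13) use different numbering and are reported as not applicable:

```shell
# Hypothetical helper: succeed if a 5.13.0-<abi>-generic release string
# is at or past ABI 41, the release this changelog belongs to.
# Returns 2 when the string is not a 5.13 generic kernel at all.
kernel_has_fix() {
    release=${1:-$(uname -r)}
    case "$release" in
        5.13.0-*-generic) ;;
        *) return 2 ;;   # assumption: only the generic flavour is handled
    esac
    abi=${release#5.13.0-}   # strip the version prefix, e.g. "41-generic"
    abi=${abi%%-*}           # keep only the ABI number, e.g. "41"
    [ "$abi" -ge 41 ]
}
```

With no argument the helper inspects the running kernel, so `kernel_has_fix && echo "fixed build" || echo "older build or not applicable"` gives a quick answer after an upgrade and reboot.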

Changed in linux (Ubuntu Impish):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-intel-5.13/5.13.0-1012.12 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal