Bionic QEMU with Bionic Kernel hangs in AMD FX-8350 with cpu-host as passthrough

Bug #1834522 reported by Rafael David Tinoco
This bug affects 1 person
Affects          Status         Importance   Assigned to           Milestone
linux (Ubuntu)   Fix Released   Medium       Unassigned
Bionic           Fix Released   Medium       Rafael David Tinoco

Bug Description

[Impact]

 * Nested AMD KVM guests do not work on AMD CPUs when host-passthrough is used as the CPU mode.
 * QEMU does not start; it hangs before VM initialization.

[Test Case]

 * A Bionic KVM guest tries to use nested KVM on an AMD CPU.
 * Use the following XML file: https://paste.ubuntu.com/p/BSyFY7ksR5/
 * Requires an AMD FX(tm)-8350 Eight-Core Processor or a similar CPU.
 * Using Xenial QEMU on top of an HWE kernel works.
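
Before running the test case it may help to confirm that nested SVM is actually enabled at the level that runs the nested guest. The following is a minimal sketch, assuming the kvm_amd module exposes its "nested" parameter under /sys/module/kvm_amd/parameters/nested; the helper names are illustrative and not part of the original report.

```python
from pathlib import Path

# Path exposed by the kvm_amd module; present only while the module is loaded.
NESTED_PARAM = Path("/sys/module/kvm_amd/parameters/nested")


def parse_nested(value: str) -> bool:
    """Interpret the parameter text: older kernels print 1/0, newer ones Y/N."""
    return value.strip().upper() in {"1", "Y"}


def nested_svm_enabled(param: Path = NESTED_PARAM) -> bool:
    """Return True if kvm_amd is loaded with nested virtualization enabled."""
    try:
        return parse_nested(param.read_text())
    except OSError:
        # Module not loaded, or not an AMD host: treat as disabled.
        return False


if __name__ == "__main__":
    print("nested SVM enabled:", nested_svm_enabled())
```

If this reports False, the hang described above is probably not what is being reproduced, since the nested guest never gets access to SVM.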

[Regression Potential]

 * KVM SVM could be affected, but the patch comes from upstream, fixes this specific issue, and has been tested by me on my current development workstation (heavy usage).

[Other Info]

 * Patches have been sent to kernel team mailing list.

Changed in linux (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in linux (Ubuntu):
importance: Medium → Undecided
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
status: Confirmed → Fix Released
importance: Undecided → Medium
Brad Figg (brad-figg)
tags: added: cscc
Rafael David Tinoco (rafaeldtinoco) wrote :

# BISECT LOG

git bisect start
# bad: [0adb32858b0bddf4ada5f364a84ed60b196dbcda] Linux 4.16
git bisect bad 0adb32858b0bddf4ada5f364a84ed60b196dbcda
# good: [d8a5b80568a9cb66810e75b182018e9edb68e8ff] Linux 4.15
git bisect good d8a5b80568a9cb66810e75b182018e9edb68e8ff
# good: [c14376de3a1befa70d9811ca2872d47367b48767] printk: Wake klogd when passing console_lock owner
git bisect good c14376de3a1befa70d9811ca2872d47367b48767
# good: [2246edfaf88dc368e8671b04afd54412625df60a] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good 2246edfaf88dc368e8671b04afd54412625df60a
# good: [dfe8db22372873d205c78a9fd5370b1b088a2b87] Merge tag 'drm-misc-fixes-2018-02-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
git bisect good dfe8db22372873d205c78a9fd5370b1b088a2b87
# bad: [4665c6b04651e96c1e2eb9129a30d6055040ff73] Merge tag 'linux-can-fixes-for-4.16-20180312' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
git bisect bad 4665c6b04651e96c1e2eb9129a30d6055040ff73
# bad: [3499de32fa6b608ba646380ac3838d30a2558ead] Merge tag 'linux-kselftest-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
git bisect bad 3499de32fa6b608ba646380ac3838d30a2558ead
# good: [65738c6b461a8bb0b056e024299738f7cc9a28b7] Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
git bisect good 65738c6b461a8bb0b056e024299738f7cc9a28b7
# good: [c23a75759191e84f4ba15b85ea4f97bd544b5362] Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good c23a75759191e84f4ba15b85ea4f97bd544b5362
# bad: [d4858aaf6bd8a90e2dacc0dfec2077e334dcedbf] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect bad d4858aaf6bd8a90e2dacc0dfec2077e334dcedbf
# good: [0eb578009a1d530a11846d7c4733a5db04730884] tools/kvm_stat: use a more pythonic way to iterate over dictionaries
git bisect good 0eb578009a1d530a11846d7c4733a5db04730884
# good: [fe2a3027e74e40a3ece3a4c1e4e51403090a907a] KVM: x86: fix backward migration with async_PF
git bisect good fe2a3027e74e40a3ece3a4c1e4e51403090a907a
# bad: [7607b7174405aec7441ff6c970833c463114040a] KVM: SVM: install RSM intercept
git bisect bad 7607b7174405aec7441ff6c970833c463114040a
# good: [e5699f56bc91a286f006b0728085e0b4e8f5749b] crypto: ccp: Fix sparse, use plain integer as NULL pointer
git bisect good e5699f56bc91a286f006b0728085e0b4e8f5749b
# good: [3e233385ef4a217a2812115ed84d4be36eb16817] KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command
git bisect good 3e233385ef4a217a2812115ed84d4be36eb16817
# first bad commit: [7607b7174405aec7441ff6c970833c463114040a] KVM: SVM: install RSM intercept

# NOTE

I was doing an "inverted" bisection, so the "bad" commit is actually the one that seems to have fixed the issue:

commit 7607b7174405aec7441ff6c970833c463114040a
Author: Brijesh Singh <email address hidden>
Date: Mon Feb 19 10:14:44 2018 -0600

    KVM: SVM: install RSM intercept

    RSM instruction is used by the SMM handler to return from SMM mode.
    Currently, rsm causes a #UD - which results in instruction fetch, decode,
    and emulate. By installing the RSM inte...
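
The inverted bisection described in the note can be automated with `git bisect run`, which treats exit code 0 as "good", 1 as "bad", and 125 as "skip". The sketch below is only an illustration of that mapping when hunting for a fix rather than a regression; `guest_boots` and the boot-test command it runs are hypothetical and were not part of the bisection above.

```python
import subprocess
from typing import Sequence

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def inverted_verdict(boots: bool) -> int:
    """Map a test result to a bisect exit code when searching for a fix.

    The hang is the old state, so a hanging guest is reported as "good"
    and a booting guest as "bad"; the first "bad" commit is then the fix.
    """
    return BAD if boots else GOOD


def guest_boots(cmd: Sequence[str], timeout_s: float = 120.0) -> bool:
    """Run a (hypothetical) boot-test command; a timeout counts as a hang."""
    try:
        return subprocess.run(list(cmd), timeout=timeout_s).returncode == 0
    except subprocess.TimeoutExpired:
        return False
```

Used as a `git bisect run` script, the file would end by exiting with `inverted_verdict(guest_boots([...]))`. Recent git versions can avoid the mental inversion entirely with `git bisect start --term-old=broken --term-new=fixed`.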


Changed in linux (Ubuntu Bionic):
status: Confirmed → In Progress
Rafael David Tinoco (rafaeldtinoco) wrote :

I've included the patch above in the Bionic tree:

commit 9bff5f095923aab04411cf4e9135b975b70e3ead (tag: Ubuntu-4.15.0-58.64, origin/master, origin/HEAD)
Author: Stefan Bader <email address hidden>
Date: Tue Aug 6 10:45:37 2019

    UBUNTU: Ubuntu-4.15.0-58.64

And it, indeed, fixed the issue.

Rafael David Tinoco (rafaeldtinoco) wrote :

Also needed:

commit 35be0aded76b54a24dc8aa678a71bca22273e8d8
Author: Sean Christopherson <email address hidden>
Date: Thu Aug 23 17:56:47 2018

    KVM: x86: SVM: Set EMULTYPE_NO_REEXECUTE for RSM emulation

    Re-execution after an emulation decode failure is only intended to
    handle a case where two or more vCPUs race to write a shadowed page, i.e.
    we should never re-execute an instruction as part of RSM emulation.

    Add a new helper, kvm_emulate_instruction_from_buffer(), to support
    emulating from a pre-defined buffer. This eliminates the last direct
    call to x86_emulate_instruction() outside of kvm_mmu_page_fault(),
    which means x86_emulate_instruction() can be unexported in a future
    patch.

    Fixes: 7607b7174405 ("KVM: SVM: install RSM intercept")
    Cc: Brijesh Singh <email address hidden>
    Signed-off-by: Sean Christopherson <email address hidden>
    Cc: <email address hidden>
    Signed-off-by: Radim Krčmář <email address hidden>

Rafael David Tinoco (rafaeldtinoco) wrote :
description: updated
Rafael David Tinoco (rafaeldtinoco) wrote :

Waiting for kernel team sponsorship. Thanks.

Christian Ehrhardt (paelzer) wrote :

BTW - Thanks for the work on this Rafael!

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Rafael David Tinoco (rafaeldtinoco) wrote :

It solves the problem and I'm sorry for taking so long to verify...

Commit:

Author: Sean Christopherson <email address hidden>
Date: Thu Aug 29 14:06:58 2019

    KVM: x86: SVM: Set EMULTYPE_NO_REEXECUTE for RSM emulation

    BugLink: https://bugs.launchpad.net/bugs/1834522

Indeed solves the problem.

Thank you!

tags: added: verification-done verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-65.74

---------------
linux (4.15.0-65.74) bionic; urgency=medium

  * bionic/linux: 4.15.0-65.74 -proposed tracker (LP: #1844403)

  * arm64: large modules fail to load (LP: #1841109)
    - arm64/kernel: kaslr: reduce module randomization range to 4 GB
    - arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419
    - arm64: fix undefined reference to 'printk'
    - arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp
    - [config] Remove CONFIG_ARM64_MODULE_CMODEL_LARGE

  * CVE-2018-20976
    - xfs: clear sb->s_fs_info on mount failure

  * br_netfilter: namespace sysctl operations (LP: #1836910)
    - net: bridge: add bitfield for options and convert vlan opts
    - net: bridge: convert nf call options to bits
    - netfilter: bridge: port sysctls to use brnf_net
    - netfilter: bridge: namespace bridge netfilter sysctls
    - netfilter: bridge: prevent UAF in brnf_exit_net()

  * tuntap: correctly set SOCKWQ_ASYNC_NOSPACE (LP: #1830756)
    - tuntap: correctly set SOCKWQ_ASYNC_NOSPACE

  * Bionic update: upstream stable patchset 2019-08-30 (LP: #1842114)
    - HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
    - MIPS: kernel: only use i8253 clocksource with periodic clockevent
    - mips: fix cacheinfo
    - netfilter: ebtables: fix a memory leak bug in compat
    - ASoC: dapm: Fix handling of custom_stop_condition on DAPM graph walks
    - bonding: Force slave speed check after link state recovery for 802.3ad
    - can: dev: call netif_carrier_off() in register_candev()
    - ASoC: Fail card instantiation if DAI format setup fails
    - st21nfca_connectivity_event_received: null check the allocation
    - st_nci_hci_connectivity_event_received: null check the allocation
    - ASoC: ti: davinci-mcasp: Correct slot_width posed constraint
    - net: usb: qmi_wwan: Add the BroadMobi BM818 card
    - qed: RDMA - Fix the hw_ver returned in device attributes
    - isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in
      start_isoc_chain()
    - netfilter: ipset: Fix rename concurrency with listing
    - isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
    - perf bench numa: Fix cpu0 binding
    - can: sja1000: force the string buffer NULL-terminated
    - can: peak_usb: force the string buffer NULL-terminated
    - net/ethernet/qlogic/qed: force the string buffer NULL-terminated
    - NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
    - HID: input: fix a4tech horizontal wheel custom usage
    - SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL
    - net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
    - net: hisilicon: make hip04_tx_reclaim non-reentrant
    - net: hisilicon: fix hip04-xmit never return TX_BUSY
    - net: hisilicon: Fix dma_map_single failed on arm64
    - libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
    - libata: add SG safety checks in SFF pio transfers
    - x86/lib/cpu: Address missing prototypes warning
    - drm/vmwgfx: fix memory leak when too many retries have occurred
    - perf ftrace: Fix failure to set cpuma...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
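
To check whether a Bionic host already runs a kernel at or past the fixed package (linux 4.15.0-65.74), the ABI number in the `uname -r` string can be compared against 65. This is a rough sketch under the assumption that the host uses the generic or lowlatency flavour of the main Bionic 4.15 kernel; derivative and HWE kernels use different numbering, so the function deliberately returns None for them. The function names are illustrative.

```python
import platform
import re
from typing import Optional

# ABI of the Bionic 4.15 kernel that carries the fix (package 4.15.0-65.74).
FIXED_ABI = 65


def bionic_ga_abi(release: str) -> Optional[int]:
    """Return the ABI number of a 4.15.0-N-generic/lowlatency release, else None."""
    match = re.match(r"^4\.15\.0-(\d+)-(?:generic|lowlatency)$", release)
    return int(match.group(1)) if match else None


def has_fix(release: str) -> Optional[bool]:
    """True/False for main Bionic 4.15 kernels; None when the series is unknown."""
    abi = bionic_ga_abi(release)
    return None if abi is None else abi >= FIXED_ABI


if __name__ == "__main__":
    print(platform.release(), "->", has_fix(platform.release()))
```

A None result means the release string is outside the series this bug was fixed in, and the changelog of that kernel has to be checked separately.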