[Regression] Bionic kernel 4.15.0-71.80 can not boot on ThunderX

Bug #1853326 reported by Ike Panhc
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Invalid        Undecided    Unassigned
Bionic           Fix Released   High         Ike Panhc

Bug Description

[Impact]
4.15.0-71-generic cannot boot on ThunderX.

Console logs:
https://pastebin.ubuntu.com/p/xcRk7VRrzF/
https://pastebin.ubuntu.com/p/DkdBqbBqqD/

[Test Case]
Boot the kernel with earlycon enabled and check whether it oopses during boot.
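
The check in this test case could be automated roughly as follows. This is a minimal sketch, not part of the original report: it assumes the earlycon output has already been captured as text, and the pattern list and the function names `find_oops_lines` and `boot_looks_clean` are illustrative choices rather than an exhaustive or official set.

```python
import re
from typing import List

# Substrings the Linux kernel commonly prints when it oopses or panics.
# Assumption: this short list is illustrative, not exhaustive.
OOPS_PATTERNS = [
    r"Internal error: Oops",
    r"Unable to handle kernel",
    r"Kernel panic - not syncing",
]
OOPS_RE = re.compile("|".join(OOPS_PATTERNS))


def find_oops_lines(console_log: str) -> List[str]:
    """Return every console line that matches one of the oops patterns."""
    return [line for line in console_log.splitlines() if OOPS_RE.search(line)]


def boot_looks_clean(console_log: str) -> bool:
    """True when no oops or panic signature appears in the captured log."""
    return not find_oops_lines(console_log)
```

A clean result only means none of the listed signatures appeared; a hang with no output would still need to be caught separately, for example with a boot timeout.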

[Regression Risk]
TBD

Ike Panhc (ikepanhc)
Changed in linux (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Ike Panhc (ikepanhc)
importance: Undecided → High
Ike Panhc (ikepanhc) wrote :

The root cause of the regression lies within the following commits:

d0f174e40a6 <email address hidden> 2019-11-12 19:04:49 +0100 arm64: enable generic CPU vulnerabilites support
f94f9d3a3e8b <email address hidden> 2019-11-12 19:04:49 +0100 arm64: add sysfs vulnerability show for meltdown
c288f6b5788d <email address hidden> 2019-11-12 19:04:48 +0100 arm64: Add sysfs vulnerability show for spectre-v1
f2485ae5fd84 <email address hidden> 2019-11-12 19:04:48 +0100 arm64: fix SSBS sanitization
1931a913df7e <email address hidden> 2019-11-12 19:04:48 +0100 KVM: arm64: Set SCTLR_EL2.DSSBS if SSBD is forcefully disabled and !vhe
fd872fd82e12 <email address hidden> 2019-11-12 19:04:48 +0100 arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3
2a3135c3033c <email address hidden> 2019-11-12 19:04:48 +0100 arm64: cpufeature: Detect SSBS and advertise to userspace
78dc3acb34fa <email address hidden> 2019-11-12 19:04:48 +0100 arm64: Get rid of __smccc_workaround_1_hvc_*
5c43fb65359d <email address hidden> 2019-11-12 19:04:48 +0100 arm64: don't zero DIT on signal return
c6c07232325a <email address hidden> 2019-11-12 19:04:48 +0100 arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
274adba3ccf6 <email address hidden> 2019-11-12 19:04:47 +0100 arm64: capabilities: Add support for checks based on a list of MIDRs
f34e57c35b72 <email address hidden> 2019-11-12 19:04:47 +0100 arm64: Add MIDR encoding for Arm Cortex-A55 and Cortex-A35
8d811d39465c <email address hidden> 2019-11-12 19:04:47 +0100 arm64: Add helpers for checking CPU MIDR against a range
b2eddaf65384 <email address hidden> 2019-11-12 19:04:47 +0100 arm64: capabilities: Clean up midr range helpers
628859e8621c <email address hidden> 2019-11-12 19:04:47 +0100 arm64: capabilities: Change scope of VHE to Boot CPU feature
3bf4ffd98cc4 <email address hidden> 2019-11-12 19:04:47 +0100 arm64: capabilities: Add support for features enabled early

Ike Panhc (ikepanhc) wrote :

Narrowed down to:

1931a913df7e <email address hidden> 2019-11-12 19:04:48 +0100 KVM: arm64: Set SCTLR_EL2.DSSBS if SSBD is forcefully disabled and !vhe
fd872fd82e12 <email address hidden> 2019-11-12 19:04:48 +0100 arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3
2a3135c3033c <email address hidden> 2019-11-12 19:04:48 +0100 arm64: cpufeature: Detect SSBS and advertise to userspace
78dc3acb34fa <email address hidden> 2019-11-12 19:04:48 +0100 arm64: Get rid of __smccc_workaround_1_hvc_*

Ike Panhc (ikepanhc) wrote :

The bisect ends at this patch. I am going to build a kernel with this patch reverted and test it.

78dc3acb34fa <email address hidden> 2019-11-12 19:04:48 +0100 arm64: Get rid of __smccc_workaround_1_hvc_*
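
The narrowing from sixteen commits to four to one follows the usual bisection pattern, which can be sketched as below. This is an illustration under stated assumptions, not the procedure actually used in this report: the commit IDs are the ones quoted above, reordered oldest first (the listing is newest first, judging by the timestamps), and `boots` is a hypothetical callback standing in for "build a kernel at this commit and boot it on ThunderX".

```python
from typing import Callable, Sequence

# Commit IDs from the suspect range quoted in the report, oldest first.
SUSPECTS = [
    "3bf4ffd98cc4",
    "628859e8621c",
    "b2eddaf65384",
    "8d811d39465c",
    "f34e57c35b72",
    "274adba3ccf6",
    "c6c07232325a",
    "5c43fb65359d",
    "78dc3acb34fa",
    "2a3135c3033c",
    "fd872fd82e12",
    "1931a913df7e",
    "f2485ae5fd84",
    "c288f6b5788d",
    "f94f9d3a3e8b",
    "d0f174e40a6",
]


def first_bad(commits: Sequence[str], boots: Callable[[str], bool]) -> str:
    """Binary-search for the oldest commit at which the kernel fails to boot.

    Assumes the tree before commits[0] boots, commits[-1] does not, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if boots(commits[mid]):
            lo = mid + 1  # regression was introduced after mid
        else:
            hi = mid  # regression is at mid or earlier
    return commits[lo]
```

With sixteen candidates this needs four boot tests. The single-culprit assumption is a simplification: bisection only finds the first commit at which the failure appears.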

Ike Panhc (ikepanhc)
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
dann frazier (dannf) wrote :

Tracking the reapplication of the reverted patches in bug 1854207.

Ike Panhc (ikepanhc) wrote :

Tried the 4.15.0-72.81 kernel on 1-socket and 2-socket ThunderX machines, and all boot OK.

Thanks.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-72.81

---------------
linux (4.15.0-72.81) bionic; urgency=medium

  * bionic/linux: 4.15.0-72.81 -proposed tracker (LP: #1854027)

  * [Regression] Bionic kernel 4.15.0-71.80 can not boot on ThunderX
    (LP: #1853326)
    - Revert "arm64: Use firmware to detect CPUs that are not affected by
      Spectre-v2"
    - Revert "arm64: Get rid of __smccc_workaround_1_hvc_*"

  * [Regression] Bionic kernel 4.15.0-71.80 can not boot on ThunderX2 and
    Kunpeng920 (LP: #1852723)
    - SAUCE: arm64: capabilities: Move setup_boot_cpu_capabilities() call to
      correct place

linux (4.15.0-71.80) bionic; urgency=medium

  * bionic/linux: 4.15.0-71.80 -proposed tracker (LP: #1852289)

  * Bionic update: upstream stable patchset 2019-10-29 (LP: #1850541)
    - panic: ensure preemption is disabled during panic()
    - f2fs: use EINVAL for superblock with invalid magic
    - [Config] updateconfigs for USB_RIO500
    - USB: rio500: Remove Rio 500 kernel driver
    - USB: yurex: Don't retry on unexpected errors
    - USB: yurex: fix NULL-derefs on disconnect
    - USB: usb-skeleton: fix runtime PM after driver unbind
    - USB: usb-skeleton: fix NULL-deref on disconnect
    - xhci: Fix false warning message about wrong bounce buffer write length
    - xhci: Prevent device initiated U1/U2 link pm if exit latency is too long
    - xhci: Check all endpoints for LPM timeout
    - usb: xhci: wait for CNR controller not ready bit in xhci resume
    - USB: adutux: fix use-after-free on disconnect
    - USB: adutux: fix NULL-derefs on disconnect
    - USB: adutux: fix use-after-free on release
    - USB: iowarrior: fix use-after-free on disconnect
    - USB: iowarrior: fix use-after-free on release
    - USB: iowarrior: fix use-after-free after driver unbind
    - USB: usblp: fix runtime PM after driver unbind
    - USB: chaoskey: fix use-after-free on release
    - USB: ldusb: fix NULL-derefs on driver unbind
    - serial: uartlite: fix exit path null pointer
    - USB: serial: keyspan: fix NULL-derefs on open() and write()
    - USB: serial: ftdi_sio: add device IDs for Sienna and Echelon PL-20
    - USB: serial: option: add Telit FN980 compositions
    - USB: serial: option: add support for Cinterion CLS8 devices
    - USB: serial: fix runtime PM after driver unbind
    - USB: usblcd: fix I/O after disconnect
    - USB: microtek: fix info-leak at probe
    - USB: dummy-hcd: fix power budget for SuperSpeed mode
    - usb: renesas_usbhs: gadget: Do not discard queues in
      usb_ep_set_{halt,wedge}()
    - usb: renesas_usbhs: gadget: Fix usb_ep_set_{halt,wedge}() behavior
    - USB: legousbtower: fix slab info leak at probe
    - USB: legousbtower: fix deadlock on disconnect
    - USB: legousbtower: fix potential NULL-deref on disconnect
    - USB: legousbtower: fix open after failed reset request
    - USB: legousbtower: fix use-after-free on release
    - staging: vt6655: Fix memory leak in vt6655_probe
    - iio: adc: ad799x: fix probe error handling
    - iio: adc: axp288: Override TS pin bias current for some models
    - iio: light: opt3001: fix mutex unlock race
    - efivar/ssdt: Don't iterate over EFI va...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
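
To check whether a given Bionic kernel package already carries the fix, its version can be compared against 4.15.0-72.81. The sketch below is a simplification: it assumes versions have the plain numeric form used in this report (for example `4.15.0-71.80`) and does not implement full Debian version ordering. The names `parse_version` and `has_fix` are hypothetical helpers.

```python
import re
from typing import Tuple

# First Bionic package version that contains the reverts, per this report.
FIXED = "4.15.0-72.81"


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a version such as '4.15.0-72.81' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


def has_fix(version: str, fixed: str = FIXED) -> bool:
    """True when `version` is at or after the fixed package version."""
    return parse_version(version) >= parse_version(fixed)
```

For versions with epochs, tildes, or alphabetic suffixes, a proper Debian version comparison should be used instead.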