Kdump broken since 4.15.0-65 on secureboot - purgatory cannot load

Bug #1869672 reported by yamato
This bug affects 1 person
Affects          Status         Importance   Assigned to            Milestone
linux (Ubuntu)   Fix Released   High         Guilherme G. Piccoli
Bionic           Fix Released   High         Guilherme G. Piccoli

Bug Description

[Impact]
* Kdump kernel can't be loaded using Linux kernel 4.15.0-65 and newer on Bionic; kexec fails to load using the "new" kexec_file_load() syscall, showing the following messages in dmesg:

kexec: Undefined symbol: __stack_chk_fail
kexec-bzImage64: Loading purgatory failed

* The cause is that the backport of upstream commit b059f801a937 ("x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS") relies on a config option guard that wasn't backported to the Ubuntu 4.15.x series.

* We also found a related issue, an undefined memcpy() symbol, caused by the same patch. We propose here a Bionic-specific fix, in the form of the patch attached in comment #18.

[Test case]

* The test consists of booting a signed kernel in a secure boot environment. This is required because the Ubuntu kernel is built with CONFIG_KEXEC_VERIFY_SIG, so kexec_file_load() can only be used on a properly signed, secure-booted system. Loading works up to 4.15.0-64 and fails from the following release through the current proposed version, 4.15.0-97. In the failing case, the kdump-tools service status also shows:

systemctl status kdump-tools
[...]
kdump-tools[895]: Starting kdump-tools: * Creating symlink /var/lib/kdump/vmlinuz
kdump-tools[895]: * Creating symlink /var/lib/kdump/initrd.img
kdump-tools[895]: kexec_file_load failed: Exec format error
kdump-tools[895]: * failed to load kdump kernel
[...]

* With the patch attached in the LP, it works normally again and I was able to collect a kdump.
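
The failing signature from this test case can also be checked mechanically. The following is a minimal sketch and not part of the original report: the function names purgatory_load_failed and kdump_load_failed are hypothetical helpers, and they match only the exact message strings quoted in this bug.

```shell
#!/bin/sh
# Hypothetical helpers: both read log text on stdin and succeed (exit 0)
# when the failure signature quoted in this report is present.

# Matches the kernel-side messages seen in dmesg.
purgatory_load_failed() {
    grep -Eq 'kexec: Undefined symbol: |kexec-bzImage64: Loading purgatory failed'
}

# Matches the user-space message printed by kdump-tools.
kdump_load_failed() {
    grep -q 'kexec_file_load failed: Exec format error'
}

# Example use on a live system (assumes dmesg and the journal are readable):
#   dmesg | purgatory_load_failed && echo "purgatory failed to load"
#   journalctl -b -u kdump-tools | kdump_load_failed && echo "kdump kernel not loaded"
```

Both helpers only report whether the text is present; they do not tell a fixed kernel from a system where kdump-tools never tried to load a kernel.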

[Regression potential]

* Given the patch is quite simple and fixes the build of purgatory, I think the regression potential is low. One potential future regression is that backports touching the purgatory Makefile could become more difficult or error-prone; given that purgatory code is rarely changed, I consider the regression potential here to be really low.


yamato (openkaz)
description: updated
Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, thanks for the bug report! Does your system have secureboot enabled? Also, can you test in a recent kernel, like 4.15.0-91?

Thanks in advance,

Guilherme

yamato (openkaz) wrote :

Guilherme,

Thank you for your reply!

> Does your system have secureboot enabled?
Yes.
I confirmed kdump works without secureboot, but we need it.

> Also, can you test in a recent kernel, like 4.15.0-91?

Yes, I also tried 4.15.0-91 and other versions.

The results:

4.15.0-45 -> OK
4.15.0-64 -> OK
4.15.0-76 -> NG
4.15.0-88 -> NG
4.15.0-91 -> NG

4.18.0-15 -> OK

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, thank you for the report and prompt test!

So, I need to set up a guest with secure boot, as I don't have one at hand. Meanwhile, can you test the in-between kernel versions, so we can narrow down when the problem was introduced?
The versions we had released in Bionic between -64 and -76 are: 4.15.0-65-generic, 4.15.0-66-generic, 4.15.0-69-generic, 4.15.0-70-generic and 4.15.0-72-generic.

If you could test all of them, that would certainly speed up the resolution while I work on a VM with secure boot support.
Thanks again,

Guilherme

Changed in makedumpfile (Ubuntu):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Undecided → High
yamato (openkaz) wrote :

Guilherme,

> can you test the in-between kernel versions, so we can narrow down when the problem was introduced?

Of course!
I have tested each kernel. Please check the results. It seems that the problem starts with 4.15.0-65.

4.15.0-45 -> OK
4.15.0-64 -> OK
* 4.15.0-65 -> NG
* 4.15.0-66 -> NG
* 4.15.0-69 -> NG
* 4.15.0-70 -> NG
* 4.15.0-72 -> NG
* 4.15.0-74 -> NG
4.15.0-76 -> NG
4.15.0-88 -> NG
4.15.0-91 -> NG

4.18.0-15 -> OK
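
A result list in this format can be reduced to the first failing version automatically. This is a minimal sketch under stated assumptions: the helper name first_bad is hypothetical, the input is expected in the order the kernels were released, and OK/NG are the only result labels.

```shell
#!/bin/sh
# Hypothetical helper: reads "VERSION -> OK|NG" lines on stdin, in release
# order, and prints the first version marked NG. Leading "*" markers and
# blank lines are ignored. Prints nothing if no version is marked NG.
first_bad() {
    sed 's/^[* ]*//' | awk '$2 == "->" && $3 == "NG" { print $1; exit }'
}

# Example use (results.txt is an assumed file holding the list above):
#   first_bad < results.txt
```

The helper does not sort its input, so a list that mixes kernel series (such as the 4.18 entry here) is still read strictly top to bottom.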

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, thanks a lot for your help! This greatly reduced the search scope.

There's one more thing I need from you - can you collect a sosreport from your system?
This way I can try to mimic your system here the best I can.
Thanks in advance,

Guilherme

yamato (openkaz) wrote :

Hi Guilherme,

Sure. I'll send my sosreport later.

However, I believe this problem is not related to my hardware, because
it also occurs on 3 different machines.

* Advantec EPC-U2117
* Virtual Machine on VMware ESXi
* UPS-APLX7-A20-0464

Here is the procedure to reproduce this problem.

1. Install 'Ubuntu server 18.04'.

2. Install 'kdump-tools'

$ sudo apt install kdump-tools

3. Create a key and import it

After rebooting, the kdump kernel loads when secure boot is disabled.
However, with secure boot enabled, the kdump kernel is blocked regardless of the kernel version.

So I need to run these commands, following this bug:
https://bugs.launchpad.net/ubuntu/+source/shim-signed/+bug/1840941

$ sudo openssl x509 -outform der -in /etc/ssl/certs/ca-certificates.crt -out new-ca.der
$ sudo mokutil --import new-ca.der
You will be prompted to set a password.

4. Reboot

After rebooting, the MOK management interface will be shown.
Press Enter and select Enroll MOK > Yes, then enter the password, then reboot.

5. Check whether kdump-tools works or not.

$ sudo systemctl status kdump-tools

Using kernel 4.15.0-64:
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2020-04-07 17:38:53 UTC; 41s ago
  Process: 878 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
 Main PID: 878 (code=exited, status=0/SUCCESS)

Apr 07 17:38:51 advanced systemd[1]: Starting Kernel crash dump capture service...
Apr 07 17:38:52 advanced kdump-tools[878]: Starting kdump-tools: * Creating symlink /var/lib/kdump/vmlinuz
Apr 07 17:38:52 advanced kdump-tools[878]: * Creating symlink /var/lib/kdump/initrd.img
Apr 07 17:38:52 advanced kdump-tools[878]: * loaded kdump kernel
Apr 07 17:38:52 advanced kdump-tools[1029]: loaded kdump kernel
Apr 07 17:38:53 advanced systemd[1]: Started Kernel crash dump capture service.

Using kernel 4.15.0-76:
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Tue 2020-04-07 16:59:01 UTC; 25min ago
 Main PID: 889 (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/kdump-tools.service

Apr 07 16:58:42 advanced systemd[1]: Starting Kernel crash dump capture service...
Apr 07 16:58:43 advanced kdump-tools[889]: Starting kdump-tools: * Creating symlink /var/lib/kdump/vmlinuz
Apr 07 16:58:43 advanced kdump-tools[889]: kdump-tools: Generating /var/lib/kdump/initrd.img-4.15.0-76-lowlatency
Apr 07 16:59:01 advanced kdump-tools[889]: * Creating symlink /var/lib/kdump/initrd.img
Apr 07 16:59:01 advanced kdump-tools[889]: kexec_file_load failed: Exec format error
Apr 07 16:59:01 advanced kdump-tools[889]: * failed to load kdump kernel
Apr 07 16:59:01 advanced systemd[1]: Started Kernel crash dump capture service.

yamato (openkaz) wrote :

First, I'll send sosreport on kernel 4.15.0-64.

yamato (openkaz) wrote :

I'll send the sosreport on kernel 4.15.0-76.

yamato (openkaz) wrote :

By the way, I found the new kernel 4.15.0-96 is available.

I also tried it, but the problem is still not fixed.

4.15.0-45 -> OK
4.15.0-64 -> OK
* 4.15.0-65 -> NG
* 4.15.0-66 -> NG
* 4.15.0-69 -> NG
* 4.15.0-70 -> NG
* 4.15.0-72 -> NG
* 4.15.0-74 -> NG
4.15.0-76 -> NG
4.15.0-88 -> NG
4.15.0-91 -> NG
** 4.15.0-96 -> NG

4.18.0-15 -> OK

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, thanks a lot for your thorough testing and for the sosreport. I'll try to reproduce here and analyze the sosreport. As soon as I have news, I'll get back to you.

Cheers,

Guilherme

yamato (openkaz) wrote :

Guilherme,

Thank you for your support.
Please feel free to tell me what I can do.

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, I have good news! I managed to reproduce this in a secure boot guest (after fighting with the keys hehe). There is a hypothesis of what's going on: one patch that was merged in -65 changed the way flags are set on purgatory, a piece of code used in the kexec process. If we use kexec in a secure boot system, it must use the kexec_file_load syscall, which relies on the kernel's purgatory. In systems without secure boot, the regular/old kexec_load syscall is used, which loads the purgatory from kexec-tools.

If the hypothesis is right, we need to figure out why kernel 5.3-hwe works - there is a potential fix there based on my analysis, could be that one... but more tests are required. In the sosreport dmesg, you can see the following related messages:

kexec: Undefined symbol: __stack_chk_fail
kexec-bzImage64: Loading purgatory failed

Work is ongoing, thanks again for the report and tests!
Cheers,

Guilherme

yamato (openkaz) wrote :

Guilherme,

I'm so glad to hear it. Thanks a lot for your effort!

Changed in makedumpfile (Ubuntu):
status: New → Confirmed
Changed in makedumpfile (Ubuntu Bionic):
status: New → Confirmed
status: Confirmed → In Progress
Changed in makedumpfile (Ubuntu):
status: Confirmed → In Progress
Changed in makedumpfile (Ubuntu Bionic):
importance: Undecided → High
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, thank you for the great report! I think I found the problem - can you test my fix proposal? It's just a matter of adding the following PPA to your system and installing the kernel:

launchpad.net/~gpiccoli/+archive/ubuntu/test1869672

You can run the following commands in order to achieve that:
sudo add-apt-repository ppa:gpiccoli/test1869672
sudo apt-get update
sudo apt-get install linux-image-4.15.0-97-generic linux-modules-4.15.0-97-generic linux-modules-extra-4.15.0-97-generic

After that, reboot into this new kernel, and test kdump load if possible. Also, please send me the output of the commands "uname -a" and "kdump-config show" when running the PPA kernel.

Thanks again,

Guilherme

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, I'm sorry, but I found yet another issue that I need to take care of in my fix. Hopefully I'll have a new test kernel by tomorrow.

Also, I've noticed that in order to test PPA kernels on secure boot systems, you'll need to enroll the PPA signing key in shim to make the kernel bootable - I'll present the details when the kernel is ready.

Thanks,

Guilherme

yamato (openkaz) wrote :

Excellent!

Of course I'll try your kernel.

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, thank you! Good news, this time the fix is working and I was able to kdump in my secureboot system. So, I'll move on and change title/component/description of this LP to proceed with the SRU - I hope next kernel (after 4.15.0-97) will have the fix =)

In order to test the kernel, you must follow the below procedures (as root user):
[I assume you don't have the proposed pocket enabled in your system, if so please disable it before testing]

add-apt-repository ppa:gpiccoli/test1869672-2
apt-get update
apt-get install linux-image-4.15.0-97-generic linux-modules-4.15.0-97-generic linux-modules-extra-4.15.0-97-generic

Then, get the file: http://ppa.launchpad.net/gpiccoli/test1869672-2/ubuntu/dists/bionic/main/signed/linux-amd64/4.15.0-97.98+TEST0000000v20200423b3/signed.tar.gz

Extract it and you'll see a file uefi.crt in "control/" folder. You can use the following command to extract its .DER key:

openssl x509 -in uefi.crt -outform der -out cert.der

Finally, I'm running "mokutil --import cert.der" to enroll the certificate in shim. After that, you must reboot and your firmware should present a MOK utility to enroll the key (OVMF does; I need to access it through the serial console when booting).

With all these steps, I was able to successfully test the kernel and produce a kernel dump.
Cheers,

Guilherme

summary: - kdump kernel can't be loaded using kernel 4.15.0-76
+ Kdump broken since 4.15.0-65 on secureboot - purgatory cannot load
no longer affects: makedumpfile (Ubuntu)
Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
tags: added: seg
Guilherme G. Piccoli (gpiccoli) wrote :
description: updated
no longer affects: makedumpfile (Ubuntu Bionic)
Guilherme G. Piccoli (gpiccoli) wrote :

SRU submitted this morning [0], unfortunately we're not going to make this cycle, the fix will need to wait for the next cycle.

Cheers,

Guilherme

[0] https://lists.ubuntu.com/archives/kernel-team/2020-April/109324.html

yamato (openkaz) wrote :

Guilherme,

I tried to use your kernel, and it works!

Here are the outputs you requested:

$ uname -a
Linux bntutest97 4.15.0-97-generic #98+TEST0000000v20200423b3-Ubuntu SMP Thu Apr 23 22:20:18 UTC 20 x86_64 x86_64 x86_64 GNU/Linux

$ sudo kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x2c000000
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-4.15.0-97-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-97-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p -s --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-97-generic root=UUID=cddaa45a-b277-465f-9410-3633976f020c ro reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1 irqpoll nousb ata_piix.prefer_ms_hyperv=0" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
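
The -s flag in the kexec command above is what selects the kexec_file_load() path discussed earlier in this bug. As a minimal sketch, the hypothetical helper below (kexec_syscall is not part of kexec-tools) classifies a kexec command line by the syscall-selection options documented in kexec(8): -s/--kexec-file-syscall and -c/--kexec-syscall. It assumes the options appear as separate arguments and does not handle combined short flags such as -ps.

```shell
#!/bin/sh
# Hypothetical helper: given a kexec command line as arguments, report which
# load syscall its flags request. Only -s/--kexec-file-syscall and
# -c/--kexec-syscall are recognised; anything else is reported as
# "unspecified", since the default depends on the kexec-tools version.
kexec_syscall() {
    result=unspecified
    for arg in "$@"; do
        case "$arg" in
            -s|--kexec-file-syscall) result=kexec_file_load ;;
            -c|--kexec-syscall)      result=kexec_load ;;
        esac
    done
    echo "$result"
}

# Example use with a shortened form of the command shown above:
#   kexec_syscall /sbin/kexec -p -s --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
```

If both options are present, the last one on the command line wins in this sketch; that is a simplification and may not match how kexec itself resolves the conflict.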

Guilherme G. Piccoli (gpiccoli) wrote :

Thank you for confirming, yamato! The patch was sent to the kernel team mailing list and will be present in the next kernel release cycle - unfortunately we missed the current cycle.

Cheers,

Guilherme

yamato (openkaz) wrote :

Guilherme,

Sure. Do you know when the kernel will be released?

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, the kernel release schedule is available at: kernel.ubuntu.com

The next cycle is shown as the last of the 3 "columns", so the kernel with this patch should be released to the -proposed pocket around May 25th and to the -updates pocket around Jun 8th, subject to delays due to security issues/CVEs, which take precedence.

Cheers,

Guilherme

yamato (openkaz) wrote :

Guilherme,

Thanks. I'm looking forward to the kernel release!

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, the kernel with the fix was released to -proposed pocket. Could you give it a try? The instructions to install a proposed kernel are available at https://wiki.ubuntu.com/Testing/EnableProposed.

The version in -proposed containing the fix for this issue is 4.15.0-102.
Thanks in advance,

Guilherme

Guilherme G. Piccoli (gpiccoli) wrote :

Hi yamato, did you have an opportunity to test the new package, to check if that fixes the problem for you?

Thanks in advance,

Guilherme

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
yamato (openkaz) wrote :

Guilherme,

Sorry for the late response; I haven't tried it yet.
I'll try it tomorrow and reply soon.

Guilherme G. Piccoli (gpiccoli) wrote :

Thanks a lot yamato, let us know how it goes =)
Cheers,

Guilherme

yamato (openkaz) wrote :

Guilherme,

I've confirmed that kdump works correctly!

yam@ubuntu18:~$ sudo systemctl status kdump-tools
● kdump-tools.service - Kernel crash dump capture service
   Loaded: loaded (/lib/systemd/system/kdump-tools.service; enabled; vendor preset: enabled)
   Active: active (exited) since Thu 2020-05-28 01:10:15 UTC; 5min ago
  Process: 540 ExecStart=/etc/init.d/kdump-tools start (code=exited, status=0/SUCCESS)
 Main PID: 540 (code=exited, status=0/SUCCESS)

May 28 01:10:12 ubuntu18 systemd[1]: Starting Kernel crash dump capture service...
May 28 01:10:13 ubuntu18 kdump-tools[540]: Starting kdump-tools: * Creating symlink /var/lib/kdump/vmlinuz
May 28 01:10:13 ubuntu18 kdump-tools[540]: * Creating symlink /var/lib/kdump/initrd.img
May 28 01:10:15 ubuntu18 kdump-tools[540]: * loaded kdump kernel
May 28 01:10:15 ubuntu18 kdump-tools[764]: loaded kdump kernel
May 28 01:10:15 ubuntu18 systemd[1]: Started Kernel crash dump capture service.
yam@ubuntu18:~$ uname -a
Linux ubuntu18 4.15.0-102-generic #103-Ubuntu SMP Fri May 15 15:22:18 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
yam@ubuntu18:~$ sudo lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic

Guilherme G. Piccoli (gpiccoli) wrote :

Great news yamato, thanks for testing. I'll mark it as verified.
Thanks once more for the great bug report and all the help provided on testing!

Cheers,

Guilherme

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-106.107

---------------
linux (4.15.0-106.107) bionic; urgency=medium

  * CVE-2020-0543
    - SAUCE: x86/cpu: Add a steppings field to struct x86_cpu_id
    - SAUCE: x86/cpu: Add 'table' argument to cpu_matches()
    - SAUCE: x86/speculation: Add Special Register Buffer Data Sampling (SRBDS)
      mitigation
    - SAUCE: x86/speculation: Add SRBDS vulnerability and mitigation documentation
    - SAUCE: x86/speculation: Add Ivy Bridge to affected list

linux (4.15.0-103.104) bionic; urgency=medium

  * bionic/linux: 4.15.0-103.104 -proposed tracker (LP: #1881272)

  * "BUG: unable to handle kernel paging request" when testing
    ubuntu_kvm_smoke_test.kvm_smoke_test with B-KVM in proposed (LP: #1881072)
    - KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
    - KVM: VMX: Mark RCX, RDX and RSI as clobbered in vmx_vcpu_run()'s asm blob

linux (4.15.0-102.103) bionic; urgency=medium

  * bionic/linux: 4.15.0-102.103 -proposed tracker (LP: #1878856)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * debian/scripts/file-downloader does not handle positive failures correctly
    (LP: #1878897)
    - [Packaging] file-downloader not handling positive failures correctly

  * Kernel log flood "ceph: Failed to find inode for 1" (LP: #1875884)
    - ceph: don't check quota for snap inode
    - ceph: quota: cache inode pointer in ceph_snap_realm

  * [UBUNTU 18.04] zpcictl --reset - contribution for kernel (LP: #1870320)
    - s390/pci: Recover handle in clp_set_pci_fn()
    - s390/pci: Fix possible deadlock in recover_store()

  * Bionic update: upstream stable patchset 2020-05-12 (LP: #1878256)
    - drm/edid: Fix off-by-one in DispID DTD pixel clock
    - drm/qxl: qxl_release leak in qxl_draw_dirty_fb()
    - drm/qxl: qxl_release leak in qxl_hw_surface_alloc()
    - drm/qxl: qxl_release use after free
    - btrfs: fix block group leak when removing fails
    - btrfs: fix partial loss of prealloc extent past i_size after fsync
    - mmc: sdhci-xenon: fix annoying 1.8V regulator warning
    - mmc: sdhci-pci: Fix eMMC driver strength for BYT-based controllers
    - ALSA: hda/realtek - Two front mics on a Lenovo ThinkCenter
    - ALSA: hda/hdmi: fix without unlocked before return
    - ALSA: pcm: oss: Place the plugin buffer overflow checks correctly
    - PM: ACPI: Output correct message on target power state
    - PM: hibernate: Freeze kernel threads in software_resume()
    - dm verity fec: fix hash block number in verity_fec_decode
    - RDMA/mlx5: Set GRH fields in query QP on RoCE
    - RDMA/mlx4: Initialize ib_spec on the stack
    - vfio: avoid possible overflow in vfio_iommu_type1_pin_pages
    - vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()
    - iommu/qcom: Fix local_base status check
    - scsi: target/iblock: fix WRITE SAME zeroing
    - iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system
    - ALSA: opti9xx: shut up gcc-10 range warning
    - nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl
    - dmaengine: dmatest: Fix iteration non-stop logic
    - selinux: properly handle multiple messages in ...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released