amd_sfh: Null pointer dereference on early device init causes early panic and fails to boot

Bug #1956519 reported by Vadik Mironov
This bug affects 9 people
Affects          Status         Importance   Assigned to       Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
Impish           Fix Released   Medium       Matthew Ruffell

Bug Description

BugLink: https://bugs.launchpad.net/bugs/1956519

[Impact]

A regression was introduced in 5.13.0-23-generic for devices using AMD Ryzen chipsets that incorporate AMD Sensor Fusion Hub (SFH) HID devices. These are mostly Ryzen-based laptops, but desktops do have the SOC embedded as well.

On early boot, when the driver initialises the device, it hits a null pointer dereference with the following stack trace:

BUG: kernel NULL pointer dereference, address: 000000000000000c
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP NOPTI
CPU: 0 PID: 175 Comm: systemd-udevd Not tainted 5.13.0-23-generic #23-Ubuntu
RIP: 0010:amd_sfh_hid_client_init+0x47/0x350 [amd_sfh]
Call Trace:
  ? __pci_set_master+0x5f/0xe0
  amd_mp2_pci_probe+0xad/0x160 [amd_sfh]
  local_pci_probe+0x48/0x80
  pci_device_probe+0x105/0x1c0
  really_probe+0x24b/0x4c0
  driver_probe_device+0xf0/0x160
  device_driver_attach+0xab/0xb0
  __driver_attach+0xb2/0x140
  ? device_driver_attach+0xb0/0xb0
  bus_for_each_dev+0x7e/0xc0
  driver_attach+0x1e/0x20
  bus_add_driver+0x135/0x1f0
  driver_register+0x95/0xf0
  ? 0xffffffffc03d2000
  __pci_register_driver+0x57/0x60
  amd_mp2_pci_driver_init+0x23/0x1000 [amd_sfh]
  do_one_initcall+0x48/0x1d0
  ? kmem_cache_alloc_trace+0xfb/0x240
  do_init_module+0x62/0x290
  load_module+0xa8f/0xb10
  __do_sys_finit_module+0xc2/0x120
  __x64_sys_finit_module+0x18/0x20
  do_syscall_64+0x61/0xb0
  ? ksys_mmap_pgoff+0x135/0x260
  ? exit_to_user_mode_prepare+0x37/0xb0
  ? syscall_exit_to_user_mode+0x27/0x50
  ? __x64_sys_mmap+0x33/0x40
  ? do_syscall_64+0x6e/0xb0
  ? do_syscall_64+0x6e/0xb0
  ? do_syscall_64+0x6e/0xb0
  ? syscall_exit_to_user_mode+0x27/0x50
  ? do_syscall_64+0x6e/0xb0
  ? exc_page_fault+0x8f/0x170
  ? asm_exc_page_fault+0x8/0x30
  entry_SYSCALL_64_after_hwframe+0x44/0xae

This causes a panic, the system cannot continue booting, and the user must select an older kernel to boot.

[Fix]

The issue was introduced in 5.13.0-23-generic by the commit:

commit d46ef750ed58cbeeba2d9a55c99231c30a172764
commit-impish 56559d7910e704470ad72da58469b5588e8cbf85
Author: Evgeny Novikov <email address hidden>
Date: Tue Jun 1 19:38:01 2021 +0300
Subject: HID: amd_sfh: Fix potential NULL pointer dereference
Link: https://github.com/torvalds/linux/commit/d46ef750ed58cbeeba2d9a55c99231c30a172764

The issue is pretty straightforward: amd_sfh_client.c attempts to dereference cl_data, but it is NULL:

$ eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.13.0-23-generic/kernel/drivers/hid/amd-sfh-hid/amd_sfh.ko amd_sfh_hid_client_init+0x47
0x0000000000000767
amd_sfh_hid_client_init
/build/linux-k2e9CH/linux-5.13.0/drivers/hid/amd-sfh-hid/amd_sfh_client.c:147:27

134 int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
135 {
...
146
147 cl_data->num_hid_devices = amd_mp2_get_sensor_num(privdata, &cl_data->sensor_idx[0]);
148
...

The patch moves the call to amd_sfh_hid_client_init() to a point before privdata->cl_data is allocated by devm_kzalloc(), so cl_data is still NULL when it is dereferenced.

+       rc = amd_sfh_hid_client_init(privdata);
+       if (rc)
+               return rc;
+
        privdata->cl_data = devm_kzalloc(&pdev->dev, sizeof(struct amdtp_cl_data), GFP_KERNEL);
        if (!privdata->cl_data)
                return -ENOMEM;
...
-       return amd_sfh_hid_client_init(privdata);
+       return 0;

The issue was fixed upstream in 5.15-rc4 by the commit:

commit 88a04049c08cd62e698bc1b1af2d09574b9e0aee
Author: Basavaraj Natikar <email address hidden>
Date: Thu Sep 23 17:59:27 2021 +0530
Subject: HID: amd_sfh: Fix potential NULL pointer dereference
Link: https://github.com/torvalds/linux/commit/88a04049c08cd62e698bc1b1af2d09574b9e0aee

The fix moves the call to amd_sfh_hid_client_init() so that it runs after privdata->cl_data is allocated but before devm_add_action_or_reset(). This fixes the original NULL pointer dereference that both commits set out to address.
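
To make the ordering concrete, here is a minimal userspace sketch in Python, not kernel code. The names ClientData, PrivData, client_init, probe_regressed and probe_fixed are hypothetical stand-ins for struct amdtp_cl_data, struct amd_mp2_dev, amd_sfh_hid_client_init() and amd_mp2_pci_probe(). None plays the role of NULL, and the devm_add_action_or_reset() step is left out.

```python
class ClientData:
    """Stand-in for struct amdtp_cl_data; only the field the oops touches."""

    def __init__(self):
        self.num_hid_devices = 0


class PrivData:
    """Stand-in for struct amd_mp2_dev."""

    def __init__(self):
        self.cl_data = None  # nothing allocated yet, like a NULL pointer


def client_init(privdata):
    # Mirrors the write at amd_sfh_client.c:147; fails if cl_data is unset.
    privdata.cl_data.num_hid_devices = 1  # placeholder sensor count
    return 0


def probe_regressed(privdata):
    # Order after d46ef750ed58: init runs before the allocation.
    rc = client_init(privdata)
    if rc:
        return rc
    privdata.cl_data = ClientData()
    return 0


def probe_fixed(privdata):
    # Order after 88a04049c08c: allocate first, then init.
    privdata.cl_data = ClientData()
    return client_init(privdata)
```

Calling probe_regressed(PrivData()) raises AttributeError, the Python analogue of the NULL dereference, while probe_fixed(PrivData()) returns 0.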

This patch also landed in 5.14.10 -stable, but it appears not to have been backported to Impish. It shares the exact same subject line as the regression commit, so it was likely dropped as a duplicate.
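
The duplicate theory can be illustrated with a short Python sketch. This is an assumption about how such a drop could happen, not a description of the Ubuntu kernel team's actual tooling; missing_by_subject and missing_by_hash are hypothetical helpers.

```python
SUBJECT = "HID: amd_sfh: Fix potential NULL pointer dereference"

# (upstream commit hash, subject) pairs taken from this report.
REGRESSION = ("d46ef750ed58cbeeba2d9a55c99231c30a172764", SUBJECT)
FIX = ("88a04049c08cd62e698bc1b1af2d09574b9e0aee", SUBJECT)


def missing_by_subject(applied, candidates):
    """Candidates whose subject line is not already in the tree."""
    seen = {subject for _, subject in applied}
    return [c for c in candidates if c[1] not in seen]


def missing_by_hash(applied, candidates):
    """Candidates whose upstream hash is not already in the tree."""
    seen = {sha for sha, _ in applied}
    return [c for c in candidates if c[0] not in seen]


applied = [REGRESSION]          # already backported
candidates = [REGRESSION, FIX]  # what the stable update offers

print(missing_by_subject(applied, candidates))  # the fix is filtered out
print(missing_by_hash(applied, candidates))     # the fix is still listed
```

Keying on the subject line hides the second commit, whereas keying on the upstream hash keeps the two commits distinct.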

[Testcase]

You need an AMD Ryzen-based system that has an AMD Sensor Fusion Hub HID device built in to test this.

Simply booting the system is enough to trigger the issue.

A test kernel is available in the following ppa:

https://launchpad.net/~mruffell/+archive/ubuntu/lp1956519-test

A community user has tested the test kernel, and has confirmed that it fixes the issue.

[Where problems could occur]

If a regression were to occur, it would only affect AMD Ryzen based systems with the AMD Sensor Fusion Hub HID device SOC. Since the changes affect the device initialisation function, a regression could cause systems to panic during boot, forcing users to revert to older kernels to start their systems.

That said, the patch is present in 5.15-rc4 and 5.14.10, is in widespread use, and is already present in Jammy.

Vadik Mironov (vadikmironov) wrote :
summary: - kernel panic after upgrading to kernel 5.13.0-23 on Asus
+ kernel panic after upgrading to kernel 5.13.0-23
Vadik Mironov (vadikmironov) wrote : Re: kernel panic after upgrading to kernel 5.13.0-23

I am attaching the full dmesg output for both bad kernel version and good version booting successfully. Please let me know if there is anything else I can provide.

Vadik Mironov (vadikmironov) wrote :
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
information type: Public → Public Security
information type: Public Security → Public
Kelsey Steele (kelsey-steele) wrote : Re: kernel panic after upgrading to kernel 5.13.0-23

Thank you for the report and information! We're working to get a fix out. Please refer to bug 1956401 for updates

Vadik Mironov (vadikmironov) wrote :

Thanks a lot Kelsey.

Vadik Mironov (vadikmironov) wrote :

Kelsey, following the suggestion from bug 1956401, I've upgraded to 5.13.0-24-generic and it's exactly the same story as with 5.13.0-23:

[ 1.330735] BUG: kernel NULL pointer dereference, address: 000000000000000c
[ 1.330768] #PF: supervisor write access in kernel mode
[ 1.330788] #PF: error_code(0x0002) - not-present page
[ 1.330809] PGD 0 P4D 0
[ 1.330822] Oops: 0002 [#1] SMP NOPTI
[ 1.330838] CPU: 0 PID: 204 Comm: systemd-udevd Not tainted 5.13.0-24-generic #24-Ubuntu
[ 1.330870] Hardware name: ASUSTeK COMPUTER INC. MINIPC PN50/PN50, BIOS 0623 05/13/2021
[ 1.330900] RIP: 0010:amd_sfh_hid_client_init+0x47/0x350 [amd_sfh]
[ 1.330930] Code: 00 53 48 83 ec 20 48 8b 5f 08 48 8b 07 48 8d b3 22 01 00 00 4c 8d b0 c8 00 00 00 e8 23 07 00 00 45 31 c0 31 c9 ba 00 00 20 00 <89> 43 0c 48 8d 83 68 01 00 00 48 8d bb 80 01 00 00 48 c7 c6 f0 6d
[ 1.330997] RSP: 0018:ffffa523c0b939e0 EFLAGS: 00010246
[ 1.331018] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1.331045] RDX: 0000000000200000 RSI: ffffffffc040c249 RDI: ffffffff9300004c
[ 1.331072] RBP: ffffa523c0b93a28 R08: 0000000000000000 R09: 0000000000000006
[ 1.331100] R10: ffffa523c0d00000 R11: 0000000000000007 R12: 0000000fffffffe0
[ 1.331127] R13: ffff8a4ac11c5cd8 R14: ffff8a4ac11570c8 R15: ffff8a4ac11c5cd8
[ 1.331154] FS: 00007feacb0ca8c0(0000) GS:ffff8a4dbf200000(0000) knlGS:0000000000000000
[ 1.331184] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.331206] CR2: 000000000000000c CR3: 0000000117148000 CR4: 0000000000350ef0
[ 1.331233] Call Trace:
[ 1.331245] <TASK>
[ 1.331255] ? __pci_set_master+0x5f/0xe0
[ 1.331276] amd_mp2_pci_probe+0xad/0x160 [amd_sfh]
[ 1.331298] local_pci_probe+0x48/0x80
[ 1.331315] pci_device_probe+0x105/0x1c0
[ 1.331333] really_probe+0x24b/0x4c0
[ 1.331351] driver_probe_device+0xf0/0x160
[ 1.331369] device_driver_attach+0xab/0xb0
[ 1.331388] __driver_attach+0xb2/0x140
[ 1.331405] ? device_driver_attach+0xb0/0xb0
[ 1.331423] bus_for_each_dev+0x7e/0xc0
[ 1.331440] driver_attach+0x1e/0x20
[ 1.331458] bus_add_driver+0x135/0x1f0
[ 1.331475] driver_register+0x95/0xf0
[ 1.331492] ? 0xffffffffc0411000
[ 1.331506] __pci_register_driver+0x57/0x60
[ 1.331524] amd_mp2_pci_driver_init+0x23/0x1000 [amd_sfh]
[ 1.331548] do_one_initcall+0x48/0x1d0
[ 1.331566] ? kmem_cache_alloc_trace+0xfb/0x240
[ 1.331587] do_init_module+0x62/0x290
[ 1.331605] load_module+0xa8f/0xb10
[ 1.331621] __do_sys_finit_module+0xc2/0x120
[ 1.331641] __x64_sys_finit_module+0x18/0x20
[ 1.332883] do_syscall_64+0x61/0xb0
[ 1.334112] ? fput+0x13/0x20
[ 1.335316] ? ksys_mmap_pgoff+0x135/0x260
[ 1.336514] ? exit_to_user_mode_prepare+0x37/0xb0
[ 1.337702] ? syscall_exit_to_user_mode+0x27/0x50
[ 1.338877] ? __x64_sys_mmap+0x33/0x40
[ 1.340036] ? do_syscall_64+0x6e/0xb0
[ 1.341180] ? do_syscall_64+0x6e/0xb0
[ 1.342303] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1.343422] RIP: 0033:0x7feacb66094d
[ 1.344527] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 ...

Vadik Mironov (vadikmironov) wrote :

Full dmesg output from 5.13.0-24 run attached

Vadik Mironov (vadikmironov) wrote :

Having played with it a bit more, it certainly does not look like a solution to 1956401 is applicable here. Unless there are any concerns, I am removing the duplicate flag and would be appreciative if anyone from the kernel team would take a look.

Perhaps this is something brought into 5.13 with the backported patches to amd-sfh-hid driver?

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Impish):
status: New → In Progress
importance: Undecided → Medium
Matthew Ruffell (mruffell) wrote :

Hi Vadik, Oliver,

Thanks for reporting, and sorry that 5.13.0-24-generic in -proposed didn't solve the issue.

Let's do some analysis:

[ 1.381250] BUG: kernel NULL pointer dereference, address: 000000000000000c
[ 1.381270] RIP: 0010:amd_sfh_hid_client_init+0x47/0x350 [amd_sfh]
[ 1.381299] Call Trace:
[ 1.381302] ? __pci_set_master+0x5f/0xe0
[ 1.381310] amd_mp2_pci_probe+0xad/0x160 [amd_sfh]
[ 1.381314] local_pci_probe+0x48/0x80
...

Okay, so a null pointer dereference in the amd_sfh module. The c in 000000000000000c probably means offset +12 in the struct we are trying to access.
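
The offset guess can be cross-checked against the Code: line and register dump in the full oops quoted earlier in this report, where the faulting bytes are <89> 43 0c and RBX is 0. The Python sketch below decodes only that one instruction form (opcode 0x89 with an 8-bit displacement). faulting_bytes and decode_store_disp8 are hypothetical helper names, and this is not a general x86 disassembler.

```python
def faulting_bytes(code_line):
    """Return the instruction bytes from the <..> marker onward."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_store_disp8(insn):
    """Decode 'mov %r32, disp8(%r64)': opcode 0x89, ModRM mod=01, no REX or SIB."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    if opcode != 0x89 or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("unsupported instruction form")
    names = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
    src = "e" + names[(modrm >> 3) & 7]   # register being stored
    base = "r" + names[modrm & 7]         # register holding the struct pointer
    if disp >= 0x80:                      # disp8 is a signed byte
        disp -= 0x100
    return src, base, disp


# Tail of the "Code:" line from the 5.13.0-24 oops in this report.
code = "ba 00 00 20 00 <89> 43 0c 48 8d 83 68 01 00 00"
src, base, disp = decode_store_disp8(faulting_bytes(code))
rbx = 0x0  # RBX value from the register dump
print(f"mov %{src},{disp:#x}(%{base}) writes to {rbx + disp:#018x}")
```

The store goes to RBX plus 0xc, and with RBX at 0 that is the address reported in CR2, which is consistent with a NULL struct pointer and a field at offset 12.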

Let's see where this is:

$ eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.13.0-23-generic/kernel/drivers/hid/amd-sfh-hid/amd_sfh.ko amd_sfh_hid_client_init+0x47
0x0000000000000767
amd_sfh_hid_client_init
/build/linux-k2e9CH/linux-5.13.0/drivers/hid/amd-sfh-hid/amd_sfh_client.c:147:27

Let's have a look:

134 int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
135 {
...
146
147 cl_data->num_hid_devices = amd_mp2_get_sensor_num(privdata, &cl_data->sensor_idx[0]);
148
...

Okay, so we are dereferencing either cl_data->num_hid_devices or &cl_data->sensor_idx[0], but they are both in cl_data, so cl_data must be NULL.

Since you mentioned that it worked in 5.13.0-22-generic, and broke in 5.13.0-23-generic, let's see if this changed in 5.13.0-23-generic:

$ git log --grep "amd_sfh" Ubuntu-5.13.0-22.22..Ubuntu-5.13.0-23.23
commit d46ef750ed58cbeeba2d9a55c99231c30a172764
commit-impish 56559d7910e704470ad72da58469b5588e8cbf85
Author: Evgeny Novikov <email address hidden>
Date: Tue Jun 1 19:38:01 2021 +0300
Subject: HID: amd_sfh: Fix potential NULL pointer dereference
Link: https://github.com/torvalds/linux/commit/d46ef750ed58cbeeba2d9a55c99231c30a172764

Okay, so this patch changes amd_mp2_pci_probe(), which is the parent function of amd_sfh_hid_client_init().

+       rc = amd_sfh_hid_client_init(privdata);
+       if (rc)
+               return rc;
+
        privdata->cl_data = devm_kzalloc(&pdev->dev, sizeof(struct amdtp_cl_data), GFP_KERNEL);
        if (!privdata->cl_data)
                return -ENOMEM;
...
-       return amd_sfh_hid_client_init(privdata);
+       return 0;

So it seems we are moving the call to amd_sfh_hid_client_init(privdata) from the end of the function up a bit, and interestingly, before the call to privdata->cl_data = devm_kzalloc().

So... we are using privdata->cl_data before it is being allocated? Looks like we have found our NULL pointer dereference.

I suppose the commit to "fix" the null pointer dereference actually introduced another one.

Looking at this commit in the upstream tree, I came across:

commit 88a04049c08cd62e698bc1b1af2d09574b9e0aee
Author: Basavaraj Natikar <email address hidden>
Date: Thu Sep 23 17:59:27 2021 +0530
Subject: HID: amd_sfh: Fix potential NULL pointer dereference
Link: https://github.com/torvalds/linux/commit/88a04049c08cd62e698bc1b1af2d09574b9e0aee

This patch seems to move the call to after cl_data is allocated, which should fix this.

- rc = amd_sfh_hid_client_init(privdata);
- if (rc)
- return rc;
-
        privdata->cl_data ...

Changed in linux (Ubuntu Impish):
assignee: nobody → Matthew Ruffell (mruffell)
tags: added: seg
Vadik Mironov (vadikmironov) wrote :

Matthew, thanks a lot for your detailed analysis. I stumbled across Evgeny's patch yesterday as a most notable change related to null ptr handling, but totally missed the second patch from Basavaraj too. How peculiar. Anyway, please do let me know once you have a kernel build and I will give it a ride.

Oliver Nissen (nextcube) wrote :

Hi Matthew,

another big thank you from my side for the quick and perfect analysis! Given the necessary instructions, I will be more than happy to test the fixed kernel.

Oliver

Iestyn Elfick (chimera-ise) wrote :

FYI - still occurs in 5.13.0-25.

Linux version 5.13.0-25-generic (buildd@lgw01-amd64-047) (gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.37) #26-Ubuntu SMP Fri Jan 7 15:48:31 UTC 2022

BUG: kernel NULL pointer dereference, address: 000000000000000c
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP NOPTI
CPU: 0 PID: 191 Comm: systemd-udevd Not tainted 5.13.0-25-generic #26-Ubuntu
Hardware name: ASUSTeK COMPUTER INC. MINIPC PN50/PN50, BIOS 0620 03/18/2021
RIP: 0010:amd_sfh_hid_client_init+0x47/0x350 [amd_sfh]
Code: 00 53 48 83 ec 20 48 8b 5f 08 48 8b 07 48 8d b3 22 01 00 00 4c 8d b0 c8 00 00 00 e8 23 07 00 00 45 31 c0 31 c9 ba 00 00 20 00 <89> 43 0c 48 8d 83 68 01 00 00 48 8d bb 80 01 00 00 48 c7 c6 20 6d
RSP: 0018:ffffa431c0a2fa60 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000200000 RSI: ffffffffc0415249 RDI: ffffffff9560004c
RBP: ffffa431c0a2faa8 R08: 0000000000000000 R09: 0000000000000006
R10: ffffa431c0d00000 R11: 0000000000000007 R12: 0000000fffffffe0
R13: ffff91d1170ead98 R14: ffff91d1014bb0c8 R15: ffff91d1170ead98
FS: 00007fa5fe59d8c0(0000) GS:ffff91d7ef600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 0000000117850000 CR4: 0000000000350ef0
Call Trace:
 ? __pci_set_master+0x5f/0xe0
 amd_mp2_pci_probe+0xad/0x160 [amd_sfh]
 local_pci_probe+0x48/0x80
 pci_device_probe+0x105/0x1c0
 really_probe+0x24b/0x4c0
 driver_probe_device+0xf0/0x160
 device_driver_attach+0xab/0xb0
 __driver_attach+0xb2/0x140
 ? device_driver_attach+0xb0/0xb0
 bus_for_each_dev+0x7e/0xc0
 driver_attach+0x1e/0x20
 bus_add_driver+0x135/0x1f0
 driver_register+0x95/0xf0
 ? 0xffffffffc041a000
 __pci_register_driver+0x57/0x60
 amd_mp2_pci_driver_init+0x23/0x1000 [amd_sfh]
 do_one_initcall+0x48/0x1d0
 ? kmem_cache_alloc_trace+0xfb/0x240
 do_init_module+0x62/0x290
 load_module+0xa8f/0xb10
 __do_sys_finit_module+0xc2/0x120
 __x64_sys_finit_module+0x18/0x20
 do_syscall_64+0x61/0xb0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_newfstatat+0x1c/0x20
 ? do_syscall_64+0x6e/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa5feb3394d
Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b3 64 0f 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc5ce60ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000056027b85af40 RCX: 00007fa5feb3394d
RDX: 0000000000000000 RSI: 00007fa5fecc33fe RDI: 000000000000000c
RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000000c R11: 0000000000000246 R12: 00007fa5fecc33fe
R13: 000056027b845ed0 R14: 000056027b85b030 R15: 000056027b84a330
Modules linked in: amd_sfh(+) i2c_hid_acpi libahci i2c_hid i2c_piix4(+) xhci_pci_renesas(+) nvme_core(+) wmi(+) video(+) fjes(+) hid
CR2: 000000000000000c
---[ end trace cc368d63aaf78960 ]---
RIP: 0010:amd_sfh_hid_client_init+0x47/0x350 [amd_sfh]
Code: 00 53 48 83 ec 20 48 8b 5f 08 48 8b 07 48 8d b3 22 0...

Matthew Ruffell (mruffell) wrote :

Hi Vadik, Oliver, Iestyn,

The test kernel has just finished building, and is ready to test. It would be great if you could install it and let me know if it fixes the issue.

The kernel is 5.13.0-23-generic, with the following commit added:

commit 88a04049c08cd62e698bc1b1af2d09574b9e0aee
Author: Basavaraj Natikar <email address hidden>
Date: Thu Sep 23 17:59:27 2021 +0530
Subject: HID: amd_sfh: Fix potential NULL pointer dereference
Link: https://github.com/torvalds/linux/commit/88a04049c08cd62e698bc1b1af2d09574b9e0aee

Please note, these test packages are NOT SUPPORTED by Canonical and are for TEST PURPOSES ONLY. ONLY install in a dedicated test environment.

Instructions to install (on an Impish system):
1) sudo add-apt-repository ppa:mruffell/lp1956519-test
2) sudo apt update
3) sudo apt install linux-image-unsigned-5.13.0-23-generic linux-modules-5.13.0-23-generic linux-modules-extra-5.13.0-23-generic linux-headers-5.13.0-23-generic
4) sudo reboot
5) uname -rv
5.13.0-23-generic #23+TEST1956519v20220112b1-Ubuntu SMP Wed Jan 12 00:24:19 UTC 20

If you are asked to abort the current kernel removal, say no.

You may need to change your grub config to boot the correct kernel. You can follow these instructions to do that: https://paste.ubuntu.com/p/WGpCWTPyTj/

Please make sure the uname is correct on boot. Sometimes newer kernels get pulled in due to metapackage dependencies not liking the linux-image-unsigned package.

Let me know if the kernel boots correctly and you no longer have a stacktrace in "sudo dmesg". If it works, I will submit the patch for SRU into the next kernel update.
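
The two checks described above (the uname string and a clean dmesg) can be scripted. A minimal Python sketch follows; kernel_looks_fixed is a hypothetical helper, and the marker strings are taken from the uname output and oops text quoted in this report.

```python
def kernel_looks_fixed(uname_rv, dmesg_text):
    """True if the test build is running and dmesg shows no amd_sfh oops."""
    # The test kernel's version string carries the bug number as a tag.
    running_test_build = "TEST1956519" in uname_rv
    # The regression shows up as a NULL dereference inside the client init.
    oops_seen = (
        "BUG: kernel NULL pointer dereference" in dmesg_text
        and "amd_sfh_hid_client_init" in dmesg_text
    )
    return running_test_build and not oops_seen
```

Feed it the output of "uname -rv" and "sudo dmesg", for example captured with subprocess.run(..., capture_output=True, text=True).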

Thanks,
Matthew

Vadik Mironov (vadikmironov) wrote :

Matthew, thanks a lot. I can confirm the issue is gone. There is a bunch of preexisting errors related to the BIOS as far as I can see, but these were present in .22 too. Please go right ahead with the patch submission and hopefully it'll make it into .26 kernel.

vadikmironov@MINIPC-PN50:~$ uname -rv
5.13.0-23-generic #23+TEST1956519v20220112b1-Ubuntu SMP Wed Jan 12 00:24:19 UTC 20
vadikmironov@MINIPC-PN50:~$ sudo dmesg | grep -i bug
[ 0.159485] ACPI BIOS Error (bug): Failure creating named object [\SMIB], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.160771] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.M291.WLAN], AE_NOT_FOUND (20210331/dswload2-162)
[ 0.162718] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.VER1], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162742] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CCI0], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162748] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CCI1], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162753] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CCI2], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162758] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CCI3], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162779] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL0], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162785] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL1], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162790] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL2], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162795] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL3], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162800] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL4], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162805] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL5], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162811] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL6], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162817] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.CTL7], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162840] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.MGI0], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162846] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.MGI1], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162852] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.MGI2], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162857] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.MGI3], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162863] ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.SBRG.EC0.MGI4], AE_ALREADY_EXISTS (20210331/dsfield-637)
[ 0.162869] ACPI BIOS Error (bug): Failure creating named object [\_S...

summary: - kernel panic after upgrading to kernel 5.13.0-23
+ amd_sfh: Null pointer dereference on early device init causes early
+ panic and fails to boot
description: updated
Matthew Ruffell (mruffell) wrote :

Hi Vadik,

Thanks for running that test kernel, and it's great to hear that it fixes your issue.

I have written up a proper SRU template for the bug, and I tidied up the patches and submitted them for SRU to the Ubuntu Kernel team mailing list:

https://lists.ubuntu.com/archives/kernel-team/2022-January/127102.html
https://lists.ubuntu.com/archives/kernel-team/2022-January/127103.html

The next steps are for the kernel team to review the patches. We need two acks from Senior Kernel Team members to be accepted into the next kernel SRU cycle.

Once we have two acks, the patch will be applied to the Impish kernel git tree, and built into the next kernel update, and placed into -proposed for verification. When this happens, I will need you to test the kernel in -proposed and again make sure it fixes your issue. If it does, I will mark the Launchpad bug as verified and the kernel will be released to -updates at the end of the cycle.

Looking at https://kernel.ubuntu.com/, the next patch deadline is the 26th of Jan, and hopefully we will get a review by then. The kernel team will build the kernel between the 31st of Jan and the 4th of Feb, and the kernel will be in -proposed between the 7th and 18th of Feb, with a release hopefully on the 21st of Feb, give or take a few days if any CVEs turn up.

I will keep you informed of each step, and I will write back when it's time to test the kernel in -proposed.

Thanks,
Matthew

Vadik Mironov (vadikmironov) wrote :

Thanks a lot Matthew. Alright, no problems at all, I'll stick to .22 kernel for now and will wait for updates.

Changed in linux (Ubuntu Impish):
status: In Progress → Fix Committed
Matthew Ruffell (mruffell) wrote :

Hi Vadik, Oliver, Iestyn,

The kernel team has reviewed the patches, and we have two acks from Senior kernel team members:

https://lists.ubuntu.com/archives/kernel-team/2022-January/127108.html
https://lists.ubuntu.com/archives/kernel-team/2022-January/127111.html

The patch has now been applied to the Impish kernel git tree:

https://lists.ubuntu.com/archives/kernel-team/2022-January/127117.html

We are all set, and accepted into the next SRU cycle. I will likely write back around the 7th of Feb, when there is a new kernel in -proposed for you to test and verify.

Thanks,
Matthew

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.13.0-28.31 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-impish' to 'verification-done-impish'. If the problem still exists, change the tag 'verification-needed-impish' to 'verification-failed-impish'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-impish
Kleber Sacilotto de Souza (kleber-souza) wrote :

Hello Matthew and everyone else affected by this issue,

The Impish kernel had to be re-spun because of a security issue and the kernel team decided to include this fix with the re-spin. Could someone please verify if the kernel mentioned on the previous automated comment fixes the bug?

Thank you.

Vadik Mironov (vadikmironov) wrote :

Kleber, works like a charm! Thanks a lot for the bugfix.

MINIPC-PN50:~$ uname -ra
Linux MINIPC-PN50 5.13.0-28-generic #31-Ubuntu SMP Thu Jan 13 17:41:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

tags: added: verification-done-impish
removed: verification-needed-impish
Iestyn Elfick (chimera-ise) wrote :

Same here, problem resolved. Great job, amazing turnaround.

Stefan (boldos) wrote :

Just out of curiosity: Is it planned to bring this to Focal too? (since the same problem can be seen in LTS Focal with kernels 5.13).

Matthew Ruffell (mruffell) wrote :

Hi Stefan,

Yes, this will be brought to Focal's HWE kernel, and will be released at the same time as the Impish kernel, which looks to be the end of the month going by https://kernel.ubuntu.com/

Vadik, Iestyn, thank you very much for verifying the Impish kernel in -proposed.

Thanks,
Matthew

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.13.0-28.31

---------------
linux (5.13.0-28.31) impish; urgency=medium

  * amd_sfh: Null pointer dereference on early device init causes early panic
    and fails to boot (LP: #1956519)
    - HID: amd_sfh: Fix potential NULL pointer dereference

  * impish: ddebs build take too long and times out (LP: #1957810)
    - [Packaging] enforce xz compression for ddebs

  * audio mute/ mic mute are not working on a HP machine (LP: #1955691)
    - ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook

  * rtw88_8821ce causes freeze (LP: #1927808)
    - rtw88: Disable PCIe ASPM while doing NAPI poll on 8821CE

  * alsa/sdw: fix the audio sdw codec parsing logic in the acpi table
    (LP: #1955686)
    - ALSA: hda: intel-sdw-acpi: harden detection of controller
    - ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2

  * icmp_redirect from selftests fails on F/kvm (unary operator expected)
    (LP: #1938964)
    - selftests: icmp_redirect: pass xfail=0 to log_test()

  * Impish update: upstream stable patchset 2021-12-17 (LP: #1955180)
    - arm64: zynqmp: Do not duplicate flash partition label property
    - arm64: zynqmp: Fix serial compatible string
    - ARM: dts: sunxi: Fix OPPs node name
    - arm64: dts: allwinner: h5: Fix GPU thermal zone node name
    - arm64: dts: allwinner: a100: Fix thermal zone node name
    - staging: wfx: ensure IRQ is ready before enabling it
    - ARM: dts: NSP: Fix mpcore, mmc node names
    - scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
    - arm64: dts: rockchip: Disable CDN DP on Pinebook Pro
    - arm64: dts: hisilicon: fix arm,sp805 compatible string
    - RDMA/bnxt_re: Check if the vlan is valid before reporting
    - bus: ti-sysc: Add quirk handling for reinit on context lost
    - bus: ti-sysc: Use context lost quirk for otg
    - usb: musb: tusb6010: check return value after calling
      platform_get_resource()
    - usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
    - ARM: dts: ux500: Skomer regulator fixes
    - staging: rtl8723bs: remove possible deadlock when disconnect (v2)
    - ARM: BCM53016: Specify switch ports for Meraki MR32
    - arm64: dts: qcom: msm8998: Fix CPU/L2 idle state latency and residency
    - arm64: dts: qcom: ipq6018: Fix qcom,controlled-remotely property
    - arm64: dts: freescale: fix arm,sp805 compatible string
    - ASoC: SOF: Intel: hda-dai: fix potential locking issue
    - clk: imx: imx6ul: Move csi_sel mux to correct base register
    - ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
    - scsi: advansys: Fix kernel pointer leak
    - ALSA: intel-dsp-config: add quirk for APL/GLK/TGL devices based on ES8336
      codec
    - firmware_loader: fix pre-allocated buf built-in firmware use
    - ARM: dts: omap: fix gpmc,mux-add-data type
    - usb: host: ohci-tmio: check return value after calling
      platform_get_resource()
    - ARM: dts: ls1021a: move thermal-zones node out of soc/
    - ARM: dts: ls1021a-tsn: use generic "jedec,spi-nor" compatible for flash
    - ALSA: ISA: not for M68K
    - tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
    - MIPS: sni:...

Changed in linux (Ubuntu Impish):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-intel-5.13/5.13.0-1010.10 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
tags: added: verification-done-focal
removed: verification-needed-focal