Pull-request: Apply mm/mglru patches to fix soft lockup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-nvidia-6.5 (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
[ 1918.995157] watchdog: BUG: soft lockup - CPU#0 stuck for 1725s! [kswapd0:42]
[ 1919.002366] Modules linked in: raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor xor_neon async_tx raid6_pq raid1 raid0 multipath linear scsi_dh_alua scsi_dh_emc scsi_dh_rdac nvme nvme_core nvme_common
[ 1919.023319] CPU: 0 PID: 42 Comm: kswapd0 Tainted: G L 6.5.0-1011-nvidia #11-Ubuntu
[ 1919.032483] Hardware name: NVIDIA Grace Hopper x4 P4496/UT2.1 DP Chassis, BIOS 01.02.00 20240120
[ 1919.042180] pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 1919.049300] pc : __rcu_read_
[ 1919.053666] lr : shrink_
[ 1919.057675] sp : ffff80008149bb70
[ 1919.061060] x29: ffff80008149bb70 x28: ffff00003ddfa600 x27: ffffdeee387571e0
[ 1919.068366] x26: ffffdeee383154f8 x25: 0000000000000001 x24: ffff301c90bf4400
[ 1919.075671] x23: ffffffffffffffff x22: 0000000000000000 x21: ffff301c90bf4400
[ 1919.082975] x20: 00000000000002d3 x19: ffff80008149bd68 x18: 0000000000000000
[ 1919.090281] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 1919.097585] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 1919.104890] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffdeee355c2008
[ 1919.112195] x8 : ffff80008149bde0 x7 : 0000000000000000 x6 : 0000000000000020
[ 1919.119500] x5 : 0000000000000001 x4 : 0000000000000000 x3 : 0000000000000000
[ 1919.126804] x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff301c90bf4400
[ 1919.134109] Call trace:
[ 1919.136606] __rcu_read_
[ 1919.140615] lru_gen_
[ 1919.144979] shrink_
[ 1919.148633] balance_
[ 1919.152464] kswapd+0x12c/0x268
[ 1919.155672] kthread+0x104/0x110
[ 1919.158970] ret_from_
[ 1942.995157] watchdog: BUG: soft lockup - CPU#0 stuck for 1747s! [kswapd0:42]
[ 1943.002366] Modules linked in: raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor xor_neon async_tx raid6_pq raid1 raid0 multipath linear scsi_dh_alua scsi_dh_emc scsi_dh_rdac nvme nvme_core nvme_common
[ 1943.023319] CPU: 0 PID: 42 Comm: kswapd0 Tainted: G L 6.5.0-1011-nvidia #11-Ubuntu
[ 1943.032483] Hardware name: NVIDIA Grace Hopper x4 P4496/UT2.1 DP Chassis, BIOS 01.02.00 20240120
[ 1943.042180] pstate: 63400009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 1943.049300] pc : lru_gen_
[ 1943.053932] lr : lru_gen_
[ 1943.058651] sp : ffff80008149bbf0
[ 1943.062037] x29: ffff80008149bbf0 x28: ffff00003ddfa600 x27: ffffdeee387571e0
[ 1943.069342] x26: ffffdeee383154f8 x25: ffff80008149bde0 x24: 00000000000007c0
[ 1943.076647] x23: 0000000000000001 x22: 0000000000000000 x21: ffff00003ddfe600
[ 1943.083952] x20: ffff00003ddfa600 x19: ffff80008149bd68 x18: 0000000000000000
[ 1943.091257] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 1943.098562] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 1943.105867] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffdeee359e4bb8
[ 1943.113172] x8 : ffff80008149bde0 x7 : 0000000000000000 x6 : 0000000000000000
[ 1943.120477] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 1943.127782] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff301c90bf4400
[ 1943.135086] Call trace:
[ 1943.137583] lru_gen_
[ 1943.141858] shrink_
[ 1943.145511] balance_
[ 1943.149342] kswapd+0x12c/0x268
[ 1943.152551] kthread+0x104/0x110
[ 1943.155849] ret_from_
CVE References
- 2023-34324
- 2023-46813
- 2023-46838
- 2023-50431
- 2023-51779
- 2023-51780
- 2023-51781
- 2023-51782
- 2023-5972
- 2023-6111
- 2023-6176
- 2023-6531
- 2023-6560
- 2023-6606
- 2023-6622
- 2023-6817
- 2023-6915
- 2023-6931
- 2023-6932
- 2024-0193
- 2024-0565
- 2024-0582
- 2024-0646
- 2024-1085
- 2024-1086
- 2024-22705
- 2024-23850
- 2024-23851
- 2024-26597
- 2024-26599
This bug is awaiting verification that the linux-nvidia-6.5/6.5.0-1014.14 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-nvidia-6.5' to 'verification-done-jammy-linux-nvidia-6.5'. If the problem still exists, change the tag 'verification-needed-jammy-linux-nvidia-6.5' to 'verification-failed-jammy-linux-nvidia-6.5'.
If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
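
When testing the -proposed kernel, one way to check whether the lockup still occurs is to scan a saved kernel log for watchdog reports like the ones in the bug description. The following is only a minimal sketch, not part of the fix or of any Ubuntu tooling: the function name `find_soft_lockups` and the regular expression are illustrative assumptions based on the log format quoted above, and the log text is assumed to have been captured already (for example, from `dmesg` output).

```python
import re

# Assumed pattern, derived from the watchdog lines in this bug report, e.g.:
# [ 1918.995157] watchdog: BUG: soft lockup - CPU#0 stuck for 1725s! [kswapd0:42]
SOFT_LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup report found in log_text (hypothetical helper)."""
    reports = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP_RE.search(line)
        if match:
            reports.append({
                "timestamp": float(match.group("ts")),
                "cpu": int(match.group("cpu")),
                "stuck_seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return reports


if __name__ == "__main__":
    # Two watchdog lines copied from the bug description.
    sample = (
        "[ 1918.995157] watchdog: BUG: soft lockup - CPU#0 stuck for 1725s! [kswapd0:42]\n"
        "[ 1942.995157] watchdog: BUG: soft lockup - CPU#0 stuck for 1747s! [kswapd0:42]\n"
    )
    for report in find_soft_lockups(sample):
        print(report)
```

An empty result on a log captured under the same workload would suggest, but not prove, that the lockup no longer reproduces; the tag should only be changed after testing on the affected hardware.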