BUG: kernel NULL pointer dereference, address: 0000000000000008

Bug #1981658 reported by Haw Loeung
This bug affects 13 people
Affects                  Status         Importance  Assigned to  Milestone
linux (Ubuntu)           Confirmed      Undecided   Unassigned
  Bionic                 Fix Released   Undecided   Unassigned
linux-hwe-5.4 (Ubuntu)   Confirmed      Undecided   Unassigned
  Bionic                 Confirmed      Undecided   Unassigned
linux-hwe-6.2 (Ubuntu)   New            Undecided   Unassigned
  Bionic                 New            Undecided   Unassigned

Bug Description

Hi,

On one of the main US Ubuntu Archive servers (banjo), we decided to reboot into an HWE kernel, the latest being 5.4.0-122, but on doing so we ran into this kernel panic:

| [ 350.776585] BUG: kernel NULL pointer dereference, address: 0000000000000008
| [ 350.783674] #PF: supervisor read access in kernel mode
| [ 350.788846] #PF: error_code(0x0000) - not-present page
| [ 350.794019] PGD 0 P4D 0
| [ 350.796631] Oops: 0000 [#1] SMP NOPTI
| [ 350.800425] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 5.4.0-122-generic #138~18.04.1-Ubuntu
| [ 350.808918] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 02/10/2022
| [ 350.817666] RIP: 0010:tcp_create_openreq_child+0x2e1/0x3e0
| [ 350.823187] Code: 08 00 00 41 8b 84 24 18 01 00 00 48 c7 83 80 08 00 00 00 00 00 00 4c 89 e6 4c 89 ef 89 83 c4 05 00 00 49 8b 84 24 f8 00 00 00 <48> 8b 40 08 e8 96 28 42 00 48 85 c0 0f b7 83 68 05 00 00 74 0a 83
| [ 350.842068] RSP: 0018:ffff9a958cce8858 EFLAGS: 00010246
| [ 350.847324] RAX: 0000000000000000 RBX: ffff897618739c80 RCX: 0000000000000007
| [ 350.854502] RDX: 0000000000000020 RSI: ffff897607afb0b0 RDI: ffff897605c85580
| [ 350.861682] RBP: ffff9a958cce8878 R08: 0000000000000178 R09: ffff89763e407800
| [ 350.868859] R10: 00000000000004c4 R11: ffff9a958cce89c7 R12: ffff897607afb0b0
| [ 350.876039] R13: ffff897605c85580 R14: ffff8976205fbe00 R15: ffff89762688b400
| [ 350.883219] FS: 0000000000000000(0000) GS:ffff89763ec00000(0000) knlGS:0000000000000000
| [ 350.891358] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| [ 350.897138] CR2: 0000000000000008 CR3: 0000001fd7914000 CR4: 0000000000340ee0
| [ 350.904319] Call Trace:
| [ 350.906787] <IRQ>
| [ 350.908824] tcp_v6_syn_recv_sock+0x8d/0x710
| [ 350.913259] ? ip6_route_output_flags_noref+0xd0/0x110
| [ 350.918435] tcp_get_cookie_sock+0x48/0x140
| [ 350.922688] cookie_v6_check+0x5a2/0x700
| [ 350.926714] tcp_v6_do_rcv+0x36c/0x3e0
| [ 350.930589] ? tcp_v6_do_rcv+0x36c/0x3e0
| [ 350.934589] tcp_v6_rcv+0xa16/0xa60
| [ 350.938102] ip6_protocol_deliver_rcu+0xd8/0x4d0
| [ 350.942750] ip6_input+0x41/0xb0
| [ 350.946000] ip6_sublist_rcv_finish+0x42/0x60
| [ 350.950385] ip6_sublist_rcv+0x235/0x260
| [ 350.954333] ? __netif_receive_skb_core+0x19d/0xc60
| [ 350.959245] ipv6_list_rcv+0x110/0x140
| [ 350.963018] __netif_receive_skb_list_core+0x157/0x260
| [ 350.968192] ? build_skb+0x17/0x80
| [ 350.971615] netif_receive_skb_list_internal+0x187/0x2a0
| [ 350.976961] gro_normal_list.part.131+0x1e/0x40
| [ 350.981519] napi_complete_done+0x94/0x120
| [ 350.985700] mlx5e_napi_poll+0x178/0x630 [mlx5_core]
| [ 350.990697] net_rx_action+0x140/0x3e0
| [ 350.994475] __do_softirq+0xe4/0x2da
| [ 350.998079] irq_exit+0xae/0xb0
| [ 351.001239] do_IRQ+0x59/0xe0
| [ 351.004228] common_interrupt+0xf/0xf
| [ 351.007913] </IRQ>
| [ 351.010029] RIP: 0010:cpuidle_enter_state+0xbc/0x440
| [ 351.015023] Code: ff e8 b8 ca 80 ff 80 7d d3 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 54 03 00 00 31 ff e8 4b 4f 87 ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 1a 03 00 00 4c 2b 7d c8 48 ba cf f7 53 e3 a5 9b c4
| [ 351.033952] RSP: 0018:ffff9a958026fe48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd6
| [ 351.041633] RAX: ffff89763ec2fe00 RBX: ffffffff84b66b40 RCX: 000000000000001f
| [ 351.048816] RDX: 00000051abe96150 RSI: 000000002abf3234 RDI: 0000000000000000
| [ 351.055997] RBP: ffff9a958026fe88 R08: 0000000000000002 R09: 000000000002f680
| [ 351.063176] R10: ffff9a958026fe18 R11: 0000000000000115 R12: ffff8976274c3800
| [ 351.070355] R13: 0000000000000001 R14: ffffffff84b66bb8 R15: 00000051abe96150
| [ 351.077540] ? cpuidle_enter_state+0x98/0x440
| [ 351.081930] ? menu_select+0x377/0x600
| [ 351.085706] cpuidle_enter+0x2e/0x40
| [ 351.089310] call_cpuidle+0x23/0x40
| [ 351.092821] do_idle+0x1f6/0x270
| [ 351.096069] cpu_startup_entry+0x1d/0x20
| [ 351.100024] start_secondary+0x166/0x1c0
| [ 351.103977] secondary_startup_64+0xa4/0xb0
| [ 351.108186] Modules linked in: binfmt_misc bonding nls_iso8859_1 ipmi_ssif edac_mce_amd kvm_amd kvm hpilo ccp ipmi_si ipmi_devintf ipmi_msghandler acpi_tad k10temp mac_hid acpi_power_meter sch_fq tcp_bbr ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear mlx5_ib raid1 ses enclosure ib_uverbs ib_core mgag200 drm_vram_helper ttm drm_kms_helper syscopyarea crct10dif_pclmul sysfillrect mlx5_core crc32_pclmul sysimgblt smartpqi fb_sys_fops uas ghash_clmulni_intel aesni_intel crypto_simd igb pci_hyperv_intf cryptd glue_helper usb_storage dca tls drm i2c_algo_bit scsi_transport_sas mlxfw nvme i2c_piix4 nvme_core wmi
| [ 351.180156] CR2: 0000000000000008
| [ 351.183629] ---[ end trace 23210cdf0c6d5851 ]---
| [ 351.322276] RIP: 0010:tcp_create_openreq_child+0x2e1/0x3e0
| [ 351.327974] Code: 08 00 00 41 8b 84 24 18 01 00 00 48 c7 83 80 08 00 00 00 00 00 00 4c 89 e6 4c 89 ef 89 83 c4 05 00 00 49 8b 84 24 f8 00 00 00 <48> 8b 40 08 e8 96 28 42 00 48 85 c0 0f b7 83 68 05 00 00 74 0a 83
| [ 351.346878] RSP: 0018:ffff9a958cce8858 EFLAGS: 00010246
| [ 351.352166] RAX: 0000000000000000 RBX: ffff897618739c80 RCX: 0000000000000007
| [ 351.359348] RDX: 0000000000000020 RSI: ffff897607afb0b0 RDI: ffff897605c85580
| [ 351.366526] RBP: ffff9a958cce8878 R08: 0000000000000178 R09: ffff89763e407800
| [ 351.373705] R10: 00000000000004c4 R11: ffff9a958cce89c7 R12: ffff897607afb0b0
| [ 351.380886] R13: ffff897605c85580 R14: ffff8976205fbe00 R15: ffff89762688b400
| [ 351.388065] FS: 0000000000000000(0000) GS:ffff89763ec00000(0000) knlGS:0000000000000000
| [ 351.396203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
| [ 351.401982] CR2: 0000000000000008 CR3: 0000001fd7914000 CR4: 0000000000340ee0
| [ 351.409162] Kernel panic - not syncing: Fatal exception in interrupt
| [ 351.415613] Kernel Offset: 0x2000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
| [ 351.437793] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
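
For anyone comparing their own crash against the trace above, the following is a minimal sketch of a signature check. It assumes that the distinguishing features are the faulting symbol (`tcp_create_openreq_child`), the fault address in CR2 (0x8), and a syncookie frame (`cookie_v4_check` or `cookie_v6_check`) in the call trace. The function names are hypothetical, and the criteria come only from the traces posted in this report, not from any official tool.

```python
import re

# Frames that indicate the SYN-cookie receive path, as seen in this report.
SYNCOOKIE_FRAMES = ("cookie_v4_check", "cookie_v6_check")


def summarize_oops(log: str) -> dict:
    """Pull out the first RIP symbol, the first CR2 value, and a syncookie flag."""
    rip = re.search(r"RIP: 0010:(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+", log)
    cr2 = re.search(r"CR2: ([0-9a-f]{16})", log)
    return {
        "rip_symbol": rip.group(1) if rip else None,
        "cr2": int(cr2.group(1), 16) if cr2 else None,
        "via_syncookies": any(frame in log for frame in SYNCOOKIE_FRAMES),
    }


def matches_this_report(log: str) -> bool:
    """True only when all three assumed features of this bug are present."""
    s = summarize_oops(log)
    return (
        s["rip_symbol"] == "tcp_create_openreq_child"
        and s["cr2"] == 0x8
        and s["via_syncookies"]
    )


# Three lines copied from the trace above.
sample = """\
[ 350.817666] RIP: 0010:tcp_create_openreq_child+0x2e1/0x3e0
[ 350.897138] CR2: 0000000000000008 CR3: 0000001fd7914000 CR4: 0000000000340ee0
[ 350.922688] cookie_v6_check+0x5a2/0x700
"""
print(matches_this_report(sample))
```

A console capture that lost the call trace will not match, because the syncookie frame cannot be confirmed from it.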

Per ~IS-Outage on Mattermost, we tried various older kernels, and -121 seems to work fine, so the problem looks to have been introduced in -122.

| [hloeung@banjo ~]$ lsb_release -a
| No LSB modules are available.
| Distributor ID: Ubuntu
| Description: Ubuntu 18.04.6 LTS
| Release: 18.04
| Codename: bionic
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 14 03:13 seq
 crw-rw---- 1 root audio 116, 33 Jul 14 03:13 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.28
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 18.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
MachineType: HPE ProLiant DL385 Gen10
Package: linux-hwe-5.4
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.utf8
 SHELL=/bin/bash
ProcFB: 0 mgag200drmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-121-generic root=UUID=a5a2675d-dc52-48f0-9273-2c6dadac446f ro console=ttyS0,115200 nosplash
ProcVersionSignature: Ubuntu 5.4.0-121.137~18.04.1-generic 5.4.189
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-121-generic N/A
 linux-backports-modules-5.4.0-121-generic N/A
 linux-firmware 1.173.21
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags: bionic
Uname: Linux 5.4.0-121-generic x86_64
UpgradeStatus: Upgraded to bionic on 2019-12-16 (940 days ago)
UserGroups: adm
_MarkForUpload: True
dmi.bios.date: 02/10/2022
dmi.bios.vendor: HPE
dmi.bios.version: A40
dmi.board.name: ProLiant DL385 Gen10
dmi.board.vendor: HPE
dmi.chassis.type: 23
dmi.chassis.vendor: HPE
dmi.modalias: dmi:bvnHPE:bvrA40:bd02/10/2022:svnHPE:pnProLiantDL385Gen10:pvr:rvnHPE:rnProLiantDL385Gen10:rvr:cvnHPE:ct23:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant DL385 Gen10
dmi.product.sku: 878615-B21
dmi.sys.vendor: HPE

Haw Loeung (hloeung)
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1981658

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Bionic):
status: New → Incomplete
Haw Loeung (hloeung) wrote : apport information

Attachments: CRDA.txt, CurrentDmesg.txt, Lspci.txt, Lsusb.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, WifiSyslog.txt

tags: added: apport-collected bionic
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu Bionic):
status: Incomplete → Confirmed
Changed in linux-hwe-5.4 (Ubuntu):
status: New → Confirmed
Changed in linux-hwe-5.4 (Ubuntu Bionic):
status: New → Confirmed
Jason Ashdown (jsonuk) wrote :

Also want to confirm that I am seeing the same kernel panic in the last couple of days with v5.4.0-122.

Jon Wilkes (plujon) wrote :

I've hit this (or a very similar problem) on 2 different DigitalOcean droplets running 20.04 LTS. Unfortunately, what I could see of the console contained only the end of the panic report, but I've transcribed it by hand below (I have screenshots if the missing numbers are wanted). The RIP was exactly the same on both 20.04.4 LTS machines I have, but slightly different from the one posted by the original filer of this bug:

    RIP: 0010:tcp_create_openreq_child+0x2fd/0x410

Here's what I got from the console:

    psmouse net_failover failover virtio_blk virtio_scsi floppy
    [ 6216.896076] CR2: 0000000000000008
    [ 6216.896830] ---[ end trace 13e3ec3bb3cc9e33 ]---
    [ 6216.897743] RIP: 0010:tcp_create_openreq_child+0x2fd/0x410
    [ 6216.896830] Code: 08 00 00 8b 83 18 01 00 00 48 89 de 4c 89 ef 49 c7 84 24 80
     08 00 00 00 00 00 00 41 89 84 24 c4 05 00 00 48 8b 83 f8 00 00 00 <48> 8b 40 08
     e8 9a d0 46 00 48 85 c0 41 0f b7 84 24 68 05 00 00 74
    [ 6216.902357] RSP: 0018:ffffa6de400b88888 EFLAGS: 00010246
    [ 6216.903497] RAX: ...
    [ 6216.... ] RDX: ...
    [ 6216.... ] RBP: ...
    [ 6216.... ] R10: ...
    [ 6216.... ] R13: ...
    [ 6216.... ] FS: ...
    [ 6216.... ] FS: ...
    [ 6216.... ] CS: ...
    [ 6216.... ] CR2: ...
    [ 6216.... ] DR0: ...
    [ 6216.... ] DR3: ...
    [ 6216.919380] Kernel panic - not syncing: Fatal exception in interrupt
     ]---

    $ uname -sr
    Linux 5.4.0-122-generic

Both droplets were using linux-image-virtual, and to avoid this problem, I reverted to 5.4.0-121 via grub-set-default.

lilideng (lilideng) wrote :

We also see this issue on Azure with Ubuntu 18.04; the kernel version is 5.4.0-1086-azure.

[ 823.785727] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 823.791661] #PF: supervisor read access in kernel mode
[ 823.791661] #PF: error_code(0x0000) - not-present page
[ 823.798898] PGD 0 P4D 0
[ 823.798898] Oops: 0000 [#1] SMP PTI
[ 823.798898] CPU: 21 PID: 0 Comm: swapper/21 Not tainted 5.4.0-1086-azure #91~18.04.1-Ubuntu
[ 823.798898] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 823.798898] RIP: 0010:tcp_create_openreq_child+0x2e1/0x3e0
[ 823.798898] Code: 08 00 00 41 8b 84 24 18 01 00 00 48 c7 83 80 08 00 00 00 00 00 00 4c 89 e6 4c 89 ef 89 83 c4 05 00 00 49 8b 84 24 f8 00 00 00 <48> 8b 40 08 e8 b6 81 4b 00 48 85 c0 0f b7 83 68 05 00 00 74 0a 83
[ 823.798898] RSP: 0018:ffffbcee00510950 EFLAGS: 00010246
[ 823.798898] RAX: 0000000000000000 RBX: ffff9e64f0c53d40 RCX: 0000000000000007
[ 823.798898] RDX: 0000000000000020 RSI: ffff9e6cd950fb60 RDI: ffff9e6ccf8d3480
[ 823.798898] RBP: ffffbcee00510970 R08: 0000000000000000 R09: ffff9e6d19007800
[ 823.798898] R10: 0000000000000514 R11: ffffbcee00510a37 R12: ffff9e6cd950fb60
[ 823.798898] R13: ffff9e6ccf8d3480 R14: ffff9e6cd7c1e200 R15: ffff9e6d1305e600
[ 823.798898] FS: 0000000000000000(0000) GS:ffff9e6d1f940000(0000) knlGS:0000000000000000
[ 823.798898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 823.798898] CR2: 0000000000000008 CR3: 000000105216e004 CR4: 00000000003706e0
[ 823.798898] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 823.798898] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 823.798898] Call Trace:
[ 823.798898] <IRQ>
[ 823.798898] tcp_v4_syn_recv_sock+0x5a/0x3d0
[ 823.798898] tcp_get_cookie_sock+0x48/0x140
[ 823.798898] cookie_v4_check+0x561/0x660
[ 823.798898] tcp_v4_do_rcv+0x1a0/0x1d0
[ 823.956063] tcp_v4_rcv+0xa86/0xad0
[ 823.956063] ip_protocol_deliver_rcu+0x31/0x1b0
[ 823.956063] ip_local_deliver_finish+0x48/0x50
[ 823.956063] ip_local_deliver+0x7e/0xe0
[ 823.956063] ? ip_protocol_deliver_rcu+0x1b0/0x1b0
[ 823.956063] ip_sublist_rcv_finish+0x42/0x60
[ 823.956063] ip_sublist_rcv+0x239/0x270
[ 823.956063] ? ip_rcv_finish_core.isra.18+0x3b0/0x3b0
[ 823.956063] ip_list_rcv+0x10d/0x130
[ 823.956063] __netif_receive_skb_list_core+0x23e/0x260
[ 823.956063] netif_receive_skb_list_internal+0x17a/0x290
[ 823.956063] gro_normal_list.part.132+0x1e/0x40
[ 823.956063] napi_complete_done+0x94/0x110
[ 823.956063] mlx5e_napi_poll+0x178/0x630 [mlx5_core]
[ 823.956063] net_rx_action+0x134/0x3c0
[ 823.956063] __do_softirq+0xde/0x2ce
[ 823.956063] irq_exit+0xd7/0xe0
[ 823.956063] hyperv_vector_handler+0x63/0x70
[ 823.956063] hyperv_callback_vector+0xf/0x20
[ 823.956063] </IRQ>
[ 823.956063] RIP: 0010:default_idle+0x2b/0x150
[ 823.956063] Code: 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 65 44 8b 25 58 85 1c 4e 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 79 99 5b 00 fb f4 <65> 44 8b 25 3d 85 1c 4e 0f 1f 44 00 00 5b 41 5c 41 5d 41 5e 5d c3
[ 823.956063] RSP: 0018:ffffbcee0011be78 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff0c
[...

Andrea Righi (arighi) wrote :

I think this might be fixed by `55573f3a3f352 tcp: make sure treq->af_specific is initialized` that is currently applied to 5.4.0-123.139 in focal-proposed.
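
The commit title is consistent with the register dumps posted in this report: the bytes marked in the `Code:` line are `<48> 8b 40 08`, RAX is 0, and CR2 is 0x8. A minimal sketch of that arithmetic follows, assuming the marked bytes are the faulting instruction and decoding only this one x86-64 form (`mov rax, [rax+disp8]`). The helper name is hypothetical, and which structure field sits at offset 8 is not stated in this thread.

```python
def decode_mov_rax_disp8(code: bytes) -> int:
    """Return the signed 8-bit displacement of `mov rax, [rax+disp8]`."""
    # 0x48 = REX.W prefix, 0x8b = MOV r64, r/m64,
    # 0x40 = ModRM with mod=01 (disp8 follows), reg=rax, rm=rax.
    if len(code) < 4 or code[:3] != bytes([0x48, 0x8B, 0x40]):
        raise ValueError("not the instruction form handled by this sketch")
    return int.from_bytes(code[3:4], "little", signed=True)


faulting = bytes.fromhex("488b4008")  # bytes marked <48> 8b 40 08 in the Code: line
rax = 0x0                             # RAX value from the register dump
fault_address = rax + decode_mov_rax_disp8(faulting)
print(hex(fault_address))  # 0x8, the same value as CR2
```

In other words, the load goes through a NULL base pointer plus an offset of 8, which is what an uninitialized pointer such as `treq->af_specific` would produce when a member at that offset is read.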

It'd be great if someone that is able to reproduce the problem could try to install this new kernel from -proposed and verify if it's still happening.

lilideng (lilideng) wrote :

We don't have repro steps; customers hit this only at some rate, so we can't verify the commit.

c (squirrelsc) wrote :

@arighi Is this fix in the latest proposed Azure kernel as well? We have customers who have hit the same issue, and we can ask them to try the fix.

Andrea Righi (arighi) wrote :

@squirrelsc the next Azure kernel in Focal that will have this fix is going to be 5.4.0-1087.92, but I don't see it in proposed yet. If it helps I can upload to a ppa an "unofficial" Azure kernel with the fix.

Andrea Righi (arighi) wrote :

...or if it's easier, I've just uploaded some debs here:
https://kernel.ubuntu.com/~arighi/lp-1981658/

This kernel (5.4.0-1087.92+arighi) is probably going to be the Azure kernel in proposed and it includes the fix that I mentioned above.

c (squirrelsc) wrote :

Thank you, Andrea. We're trying to repro it, but haven't managed to so far. Can you create a proposed kernel that includes this fix? Once we can reproduce it, we will let you know ASAP, but we can prepare the proposed kernel in the meantime.

Tim Gardner (timg-tpi) wrote :

linux-azure 5.4.0-1087.92 with commit 55573f3a3f352 ("tcp: make sure treq->af_specific is initialized") is building in https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/

c (squirrelsc) wrote :

Thank you, Tim. When can this kernel be released?

Haw Loeung (hloeung) wrote :

It seems the latest in -proposed has fixed it for us.

| [hloeung@banjo ~]$ uname -a
| Linux banjo 5.4.0-123-generic #139~18.04.1-Ubuntu SMP Wed Jul 13 21:12:05 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
| [hloeung@banjo ~]$ uptime
| 01:06:47 up 44 min, 2 users, load average: 11.01, 11.21, 9.54

Previously, it would kernel panic not long after booting up.

| [hloeung@banjo ~]$ apt-cache policy linux-image-5.4.0-123-generic
| linux-image-5.4.0-123-generic:
| Installed: 5.4.0-123.139~18.04.1
| Candidate: 5.4.0-123.139~18.04.1
| Version table:
| *** 5.4.0-123.139~18.04.1 500
| 500 http://archive.ubuntu.com//ubuntu bionic-proposed/main amd64 Packages
| 100 /var/lib/dpkg/status

Tim Gardner (timg-tpi) wrote :

@squirrelsc - Kernels with this fix are due for release Aug 1, 2022.

timeless (timeless) wrote :

Fwiw I deployed this for our server yesterday and it's been up 21 hours, whereas before it didn't really survive for more than ~6 hours on 122...

halfgaar (wiebe-halfgaar) wrote :

What's the status on the update? We're getting crashes on Ubuntu 18.04, Amazon kernel 5.4.0-1081-aws.

halfgaar (wiebe-halfgaar) wrote :

The latest in bionic-proposed is 5.4.0-1082.89~18.04.1, but I can't find references to it being fixed there? The following page doesn't mention it:

https://www.ubuntuupdates.org/package/core/bionic/main/proposed/linux-aws-5.4

There is mention of another null dereference, but it's not this one.

Bodo Petermann (bpetermann) wrote :

The changelog on ubuntuupdates is cut short.
See https://launchpad.net/ubuntu/+source/linux-aws/5.4.0-1082.89 instead. The fix is mentioned there (tcp: make sure treq->af_specific is initialized)

Tim Gardner (timg-tpi) wrote :

Due to Retbleed mitigation testing, the SRU cycle has been extended by one week. The new release date is Aug 8, 2022.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Bionic):
status: Confirmed → In Progress
Michael Scanlan (coinmover) wrote :

Just saw this; it's affecting 25,000 users and crashed my server. Can't get anything running now. Exact same error/problem. Using DigitalOcean as VPS provider. -121 was fine; it was introduced in -122.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-191.202

---------------
linux (4.15.0-191.202) bionic; urgency=medium

  * CVE-2022-2586
    - SAUCE: netfilter: nf_tables: do not allow SET_ID to refer to another table
    - SAUCE: netfilter: nf_tables: do not allow RULE_ID to refer to another chain

  * CVE-2022-2588
    - SAUCE: net_sched: cls_route: remove from list when handle is 0

  * CVE-2022-34918
    - netfilter: nf_tables: stricter validation of element data

  * BUG: kernel NULL pointer dereference, address: 0000000000000008
    (LP: #1981658)
    - tcp: make sure treq->af_specific is initialized

linux (4.15.0-190.201) bionic; urgency=medium

  * bionic/linux: 4.15.0-190.201 -proposed tracker (LP: #1981321)

  * CVE-2022-1679
    - SAUCE: ath9k: fix use-after-free in ath9k_hif_usb_rx_cb

  * Bionic update: upstream stable patchset 2022-07-06 (LP: #1980879)
    - MIPS: Use address-of operator on section symbols
    - block: drbd: drbd_nl: Make conversion to 'enum drbd_ret_code' explicit
    - can: grcan: grcan_probe(): fix broken system id check for errata workaround
      needs
    - can: grcan: only use the NAPI poll budget for RX
    - Bluetooth: Fix the creation of hdev->name
    - mmc: rtsx: add 74 Clocks in power on flow
    - mm: hugetlb: fix missing cache flush in copy_huge_page_from_user()
    - mm: userfaultfd: fix missing cache flush in mcopy_atomic_pte() and
      __mcopy_atomic()
    - ALSA: pcm: Fix races among concurrent hw_params and hw_free calls
    - ALSA: pcm: Fix races among concurrent read/write and buffer changes
    - ALSA: pcm: Fix races among concurrent prepare and hw_params/hw_free calls
    - ALSA: pcm: Fix races among concurrent prealloc proc writes
    - ALSA: pcm: Fix potential AB/BA lock with buffer_mutex and mmap_lock
    - VFS: Fix memory leak caused by concurrently mounting fs with subtype
    - batman-adv: Don't skb_split skbuffs with frag_list
    - net: Fix features skip in for_each_netdev_feature()
    - ipv4: drop dst in multicast routing path
    - netlink: do not reset transport header in netlink_recvmsg()
    - mac80211_hwsim: call ieee80211_tx_prepare_skb under RCU protection
    - hwmon: (ltq-cputemp) restrict it to SOC_XWAY
    - s390/ctcm: fix variable dereferenced before check
    - s390/ctcm: fix potential memory leak
    - s390/lcs: fix variable dereferenced before check
    - net/smc: non blocking recvmsg() return -EAGAIN when no data and
      signal_pending
    - net: sfc: ef10: fix memory leak in efx_ef10_mtd_probe()
    - hwmon: (f71882fg) Fix negative temperature
    - ASoC: max98090: Reject invalid values in custom control put()
    - ASoC: max98090: Generate notifications on changes for custom control
    - ASoC: ops: Validate input values in snd_soc_put_volsw_range()
    - tcp: resalt the secret every 10 seconds
    - usb: cdc-wdm: fix reading stuck on device close
    - USB: serial: pl2303: add device id for HP LM930 Display
    - USB: serial: qcserial: add support for Sierra Wireless EM7590
    - USB: serial: option: add Fibocom L610 modem
    - USB: serial: option: add Fibocom MA510 modem
    - cgroup/cpuset: Remove cpus_allowed/mems_allowed setup in cpuset_init_smp()
 ...

Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Released
Rajiv Ginotra (rajivginotra) wrote :

Does anybody know the steps to reproduce this issue? We are seeing the same traceback (below) in our testbed, and we are planning to take the patch mentioned in comment #15.

We have been using this kernel for a long time and have seen this issue on only a few nodes.

We appreciate your help in this regard.

Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.026015] TCP: request_sock_TCP: Possible SYN flooding on port 8033. Sending cookies. Check SNMP counters.
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.027529] BUG: kernel NULL pointer dereference, address: 0000000000000008
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.035339] #PF: supervisor read access in kernel mode
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.041083] #PF: error_code(0x0000) - not-present page
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.046838] PGD 0 P4D 0
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.049670] Oops: 0000 [#1] SMP NOPTI
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.053764] CPU: 36 PID: 230 Comm: ksoftirqd/36 Not tainted 5.4.0-122-generic #138~18.04.1
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.063007] Hardware name: Cisco Systems Inc DN2-HW-APL-L/UCSC-C220-M5SX, BIOS C220M5.4.1.3i.0.0713210713 07/13/2021
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.074770] RIP: 0010:tcp_create_openreq_child+0x2e1/0x3e0
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.080907] Code: 08 00 00 41 8b 84 24 18 01 00 00 48 c7 83 80 08 00 00 00 00 00 00 4c 89 e6 4c 89 ef 89 83 c4 05 00 00 49 8b 84 24 f8 00 00 00 <48> 8b 40 08 e8 96 b8 41 00 48 85 c0 0f b7 83 68 05 00 00 74 0a 83
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.101919] RSP: 0018:ffff97b88d207a28 EFLAGS: 00010246
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.107764] RAX: 0000000000000000 RBX: ffff8abb9d1bc600 RCX: 0000000000000007
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.115745] RDX: 0000000000000020 RSI: ffff8abb3c6e1560 RDI: ffff8acdef1b9180
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.123722] RBP: ffff97b88d207a48 R08: 0000000000000000 R09: ffff8aacffc07800
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.131699] R10: 0000000000000514 R11: ffff97b88d207b0f R12: ffff8abb3c6e1560
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.139678] R13: ffff8acdef1b9180 R14: ffff8abdfc1a7500 R15: ffff8adc75529ec0
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.147655] FS: 0000000000000000(0000) GS:ffff8adcff000000(0000) knlGS:0000000000000000
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.156705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.163144] CR2: 0000000000000008 CR3: 000000594ea0a005 CR4: 00000000007606e0
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.171120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.179099] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Feb 22 22:20:47 maglev-master-192-168-70-10 kernel: [14501.187076] PKRU: 5...

Kai-Heng Feng (kaihengfeng) wrote :

Please update the kernel to a version newer than 5.4.0-123.
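
To check a fleet of machines against the versions mentioned in this thread, something like the following minimal sketch could classify the output of `uname -r`. The table is an assumption assembled only from comments here: 5.4.0-121 reported working, 5.4.0-122 crashing, and the fix applied in 5.4.0-123 for generic; 5.4.0-1086 crashing and the fix in 5.4.0-1087 for azure; 5.4.0-1081 crashing and the fix in 5.4.0-1082 for aws. The function name and result labels are hypothetical, and anything not covered by the thread is reported as unknown rather than guessed.

```python
import re

# First package ABI number that this thread says carries the fix, per flavour.
FIXED_FROM = {"generic": 123, "azure": 1087, "aws": 1082}
# ABI numbers that commenters in this thread reported as crashing or working.
REPORTED_CRASHING = {"generic": {122}, "azure": {1086}, "aws": {1081}}
REPORTED_WORKING = {"generic": {121}}


def classify(release: str) -> str:
    """Classify a `uname -r` string for the 5.4.0 series discussed here."""
    m = re.fullmatch(r"5\.4\.0-(\d+)-([a-z]+)", release)
    if not m or m.group(2) not in FIXED_FROM:
        return "unknown"
    abi, flavour = int(m.group(1)), m.group(2)
    if abi >= FIXED_FROM[flavour]:
        return "has fix"
    if abi in REPORTED_CRASHING[flavour]:
        return "reported crashing"
    if abi in REPORTED_WORKING.get(flavour, set()):
        return "reported working"
    return "unknown"


for release in ("5.4.0-122-generic", "5.4.0-123-generic", "5.4.0-1086-azure"):
    print(release, "->", classify(release))
```

The 6.2 kernels mentioned later in this report fall outside the table and come back as unknown.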

Hassan El Jacifi (waver) wrote :

Hi all,

This bug seems to be present on kernel "6.2.0-32-generic #32~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 18 10:40:13 UTC 2 x86_64 x86_64 x86_64 GNU/Linux"

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy

Logs:
[lun sep 11 12:18:58 2023] BUG: kernel NULL pointer dereference, address: 000000000000002b
[lun sep 11 12:18:58 2023] #PF: supervisor read access in kernel mode
[lun sep 11 12:18:58 2023] #PF: error_code(0x0000) - not-present page
[lun sep 11 12:18:58 2023] PGD 0 P4D 0
[lun sep 11 12:18:58 2023] Oops: 0000 [#1] PREEMPT SMP PTI
[lun sep 11 12:18:58 2023] CPU: 6 PID: 118 Comm: kswapd0 Tainted: P OE 6.2.0-32-generic #32~22.04.1-Ubuntu
[lun sep 11 12:18:58 2023] Hardware name: System manufacturer System Product Name/ROG STRIX Z370-E GAMING, BIOS 3005 09/30/2021
[lun sep 11 12:18:58 2023] RIP: 0010:down_read_trylock+0x16/0x80
[lun sep 11 12:18:58 2023] Code: 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 65 ff 05 5c 69 6b 50 48 b9 07 00 00 00 00 00 00 80 <48> 8b 07 48 85 c8 75 57 48 8d 90 00 01 00 00 f0 48 0f b1 17 75 ed
[lun sep 11 12:18:58 2023] RSP: 0018:ffffb6e3405539a8 EFLAGS: 00010282
[lun sep 11 12:18:58 2023] RAX: 0000000000000000 RBX: fffff2dbdbbc1080 RCX: 8000000000000007
[lun sep 11 12:18:58 2023] RDX: 0000000000000000 RSI: ffffb6e340553a58 RDI: 000000000000002b
[lun sep 11 12:18:58 2023] RBP: ffffb6e3405539d8 R08: 0000000000000000 R09: 0000000000000000
[lun sep 11 12:18:58 2023] R10: 0000000000000000 R11: 0000000000000000 R12: ffff94f60ffade38
[lun sep 11 12:18:58 2023] R13: ffff94f60ffade39 R14: ffffb6e340553a58 R15: 000000000000002b
[lun sep 11 12:18:58 2023] FS: 0000000000000000(0000) GS:ffff94fd16b80000(0000) knlGS:0000000000000000
[lun sep 11 12:18:58 2023] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[lun sep 11 12:18:58 2023] CR2: 000000000000002b CR3: 0000000330e10003 CR4: 00000000003706e0
[lun sep 11 12:18:58 2023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[lun sep 11 12:18:58 2023] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[lun sep 11 12:18:58 2023] Call Trace:
[lun sep 11 12:18:58 2023] <TASK>
[lun sep 11 12:18:58 2023] ? show_regs+0x72/0x90
[lun sep 11 12:18:58 2023] ? __die+0x25/0x80
[lun sep 11 12:18:58 2023] ? page_fault_oops+0x79/0x190
[lun sep 11 12:18:58 2023] ? blk_mq_get_new_requests+0xf6/0x1a0
[lun sep 11 12:18:58 2023] ? do_user_addr_fault+0x30c/0x640
[lun sep 11 12:18:58 2023] ? exc_page_fault+0x81/0x1b0
[lun sep 11 12:18:58 2023] ? asm_exc_page_fault+0x27/0x30
[lun sep 11 12:18:58 2023] ? down_read_trylock+0x16/0x80
[lun sep 11 12:18:58 2023] ? folio_lock_anon_vma_read+0x76/0x190
[lun sep 11 12:18:58 2023] rmap_walk_anon+0x262/0x350
[lun sep 11 12:18:58 2023] folio_referenced+0x17d/0x240
[lun sep 11 12:18:58 2023] ? __pfx_folio_referenced_one+0x10/0x10
[lun sep 11 12:18:58 2023] ? __pfx_folio_lock_anon_vma_read+0x10/0x10
[lun sep 11 12:18:58 2023] shrink_folio_list+0x7ee/0xc20
[lun sep 11 12:18:58 2023] shrink_inactive_list+0x191/0x600
[lun sep 11 12:18:58 2023] ? shrink_active_list+0x2dd/0x470
[lun ...

Hassan El Jacifi (waver) wrote :

Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy

Marin Gilles (mrngilles) wrote :

I have seen the same issue on a desktop machine, either when the machine goes to sleep with a dock attached or, with the dock attached, when I run `xset dpms force off`.

lsb_release -a

Distributor ID: Ubuntu
Description: Ubuntu 23.04
Release: 23.04
Codename: lunar

uname -a

Linux hostname 6.2.0-33-generic #33-Ubuntu SMP PREEMPT_DYNAMIC Tue Sep 5 14:49:19 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
