Memory arena corruption with FUSE (was Memory allocation failure crashes kernel hard, presumably related to FUSE)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Fedora) | Won't Fix | Critical | |
linux (Ubuntu) | Fix Released | High | Seth Forshee |
Wily | Fix Released | High | Seth Forshee |
Xenial | Fix Released | High | Seth Forshee |
Bug Description
== SRU Justification ==
Impact: Races in FUSE's synchronous I/O handling can result in use-after-free bugs, which cause kernel crashes.
Fix: Two commits from fuse-next. One simply caches the result of a test to avoid a use-after-free. The other adds reference counting to the fuse_io_priv struct, replacing some convoluted rules for determining when this structure can be freed.
Test case: Tested on LP #1505948.
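For readers unfamiliar with the two techniques named in the fix, here is a minimal userspace sketch. It is not the actual fuse-next patch. The names `io_ctx`, `io_ctx_new`, `io_ctx_get`, `io_ctx_put`, `submit` and `live_ctx` are hypothetical stand-ins invented for illustration, and the completion is simulated synchronously rather than on another CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-request I/O context (not the kernel struct). */
struct io_ctx {
    atomic_int refs;  /* number of holders of this context */
    bool async;       /* property the submitter tests after submitting */
};

/* Counts live contexts so the demo can check that nothing leaks.
 * Plain int: this sketch is single-threaded. */
int live_ctx;

struct io_ctx *io_ctx_new(bool async)
{
    struct io_ctx *io = malloc(sizeof(*io));
    if (!io)
        return NULL;
    atomic_init(&io->refs, 1);  /* the creator holds the first reference */
    io->async = async;
    live_ctx++;
    return io;
}

void io_ctx_get(struct io_ctx *io)
{
    atomic_fetch_add(&io->refs, 1);
}

/* Drop one reference; whoever drops the last one frees the context. */
void io_ctx_put(struct io_ctx *io)
{
    int old = atomic_fetch_sub(&io->refs, 1);
    assert(old > 0);
    if (old == 1) {
        live_ctx--;
        free(io);
    }
}

/* Simulated completion handler: releases the in-flight request's reference. */
static void complete_request(struct io_ctx *io)
{
    io_ctx_put(io);
}

/* Submit a request and report whether it was asynchronous. */
bool submit(struct io_ctx *io)
{
    bool async = io->async;  /* technique 1: cache the test result up front */
    io_ctx_get(io);          /* technique 2: the request owns a reference */
    complete_request(io);    /* in real code this may run concurrently */
    return async;            /* uses the cached value, not io->async */
}
```

With reference counting, the rule for freeing is uniform: the structure goes away when the last holder calls the put function, so neither the submitter nor the completion path needs to guess who finishes first. A caller would create the context, call `submit`, and then drop its own reference with `io_ctx_put`.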
---
Hello everybody,
Linux 4.1, 4.2, and 4.3-rc all panic immediately in our setup when we try to start a QEMU process on top of a FUSE-based mount. Here is an example stack trace:
[ 739.807817] BUG: unable to handle kernel paging request at ffff8800a4104ea0
[ 739.840201] IP: [<ffffffff811cc
[ 739.870309] PGD 2fee067 PUD 2fbf4dd063 PMD 0
[ 739.890418] Oops: 0000 [#1] SMP
[ 739.905265] Modules linked in: nbd vport_vxlan vport_gre gre ebtable_filter ebtables openvswitch ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_
[ 740.345300] CPU: 8 PID: 10550 Comm: qemu-system-x86 Not tainted 4.2.0-040200-
[ 740.386879] Hardware name: HP ProLiant DL380 Gen9, BIOS P89 05/06/2015
[ 740.416827] task: ffff882f8e958dc0 ti: ffff882f28c20000 task.ti: ffff882f28c20000
[ 740.451672] RIP: 0010:[<
[ 740.494047] RSP: 0018:ffff882f28
[ 740.518425] RAX: 0000000000000000 RBX: 00000000000000d0 RCX: 00000000000026b3
[ 740.551611] RDX: 00000000000026b2 RSI: 00000000000000d0 RDI: ffff882fbf407840
[ 740.584846] RBP: ffff882f28c23ca8 R08: 0000000000019920 R09: ffffe8d000200ab0
[ 740.618287] R10: ffffffff812e8dcd R11: ffffea00bca0ac00 R12: 00000000000000d0
[ 740.651320] R13: ffff882fbf407840 R14: ffff8800a4104ea0 R15: ffff882fbf407840
[ 740.684195] FS: 00007f2642ffd70
[ 740.722030] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 740.749469] CR2: ffff8800a4104ea0 CR3: 0000002f26f83000 CR4: 00000000001426e0
[ 740.783390] Stack:
[ 740.792577] ffffffff812e8dcd 0000000000000048 0000000000000002 ffff882f908c8468
[ 740.827003] 0000000001bef000 ffff882f928e4600 ffff882f28c23e48 ffff882f28c23d70
[ 740.860971] ffff882f28c23d38 ffffffff812e8dcd 0000000000000001 ffff882f908c8300
[ 740.894994] Call Trace:
[ 740.906211] [<ffffffff812e8
[ 740.932940] [<ffffffff812e8
[ 740.958866] [<ffffffff81177
[ 740.989318] [<ffffffff812e9
[ 741.017725] [<ffffffff811e9
[ 741.041787] [<ffffffff811e9
[ 741.065307] [<ffffffff811ea
[ 741.090141] [<ffffffff81085
[ 741.135924] [<ffffffff817a8
[ 741.183478] Code: 4c 03 05 32 d8 e3 7e 4d 8b 30 49 8b 40 10 4d 85 f6 0f 84 22 01 00 00 48 85 c0 0f 84 19 01 00 00 49 63 47 20 48 8d 4a 01 4d 8b 07 <49> 8b 1c 06 4c 89 f0 65 49 0f c7 08 0f 94 c0 84 c0 74 b9 49 63
[ 741.306817] RIP [<ffffffff811cc
The problem has also been documented by somebody else in the Fedora bug tracker at https:/
This behaviour is 100% reproducible. I have asked the fuse-devel mailing list for advice, so far without success:
http://
We are still investigating whether this issue also occurs with 4.0 and will add that information to this bug report once we have it. Any help with debugging will be greatly appreciated.
Changed in linux (Ubuntu): | |
importance: | Undecided → High |
tags: | added: kernel-da-key wily |
summary: | Memory allocation failure crashes kernel hard, presumably related to FUSE → Memory arena corruption with FUSE (was Memory allocation failure crashes kernel hard, presumably related to FUSE) |
description: | updated |
Changed in linux (Ubuntu Wily): | |
assignee: | nobody → Seth Forshee (sforshee) |
status: | Confirmed → In Progress |
Changed in linux (Ubuntu Xenial): | |
assignee: | nobody → Seth Forshee (sforshee) |
status: | Confirmed → In Progress |
Changed in linux (Ubuntu Xenial): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Wily): | |
status: | In Progress → Fix Committed |
Changed in linux (Fedora): | |
importance: | Unknown → Critical |
status: | Unknown → Won't Fix |
Description of problem:
After upgrading a node from F20 to F21, the node crashes when accessing a GlusterFS volume.
The remaining F20 nodes have no problem accessing the volume.
Aug 16 20:24:25 bagel kernel: [ 1810.077267] ------------[ cut here ]------------
fc21.x86_64 #1 ffffffff81208532>] [<ff 9b7c98 EFLAGS: 0(0000) GS:ffff88087fc40000(0000) knlGS:0000000000000000 c8c>] fuse_direct_IO+0x20c/0x340 [fuse] 2fa>] generic_file_read_iter+0x4ca/0x6...
Aug 16 20:24:25 bagel kernel: [ 1810.081945] kernel BUG at mm/slub.c:3413!
Aug 16 20:24:25 bagel kernel: [ 1810.085998] invalid opcode: 0000 [#1] SMP
Aug 16 20:24:25 bagel kernel: [ 1810.090177] Modules linked in: vhost_net vhost macvtap macvlan ebt_arp ebtable_nat fuse nfsv3 nfs_acl nfs lockd grace sunrpc fscache ebtable_filter ebtables ip6table_filter ip6_tables softdog scsi_transport_iscsi xt_physdev br_netfilter nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack vfat fat coretemp kvm_intel kvm bcache iTCO_wdt crct10dif_pclmul ipmi_devintf crc32_pclmul iTCO_vendor_support gpio_ich igb crc32c_intel ptp pps_core lpc_ich ghash_clmulni_intel i2c_i801 mfd_core ipmi_si dca ipmi_msghandler i2c_ismt tpm_tis shpchp tpm acpi_cpufreq ast i2c_algo_bit drm_kms_helper ttm drm 8021q garp mrp tun bridge stp llc bonding
Aug 16 20:24:25 bagel kernel: [ 1810.149526] CPU: 1 PID: 4794 Comm: qemu-system-x86 Not tainted 4.1.4-100.
Aug 16 20:24:25 bagel kernel: [ 1810.157603] Hardware name: Supermicro A1SRM-2758F/A1SRM-2758F, BIOS 1.2 02/16/2015
Aug 16 20:24:25 bagel kernel: [ 1810.165246] task: ffff88085a1313c0 ti: ffff8803b09b4000 task.ti: ffff8803b09b4000
Aug 16 20:24:25 bagel kernel: [ 1810.172800] RIP: 0010:[<
ffffff81208532>] kfree+0x152/0x160
Aug 16 20:24:25 bagel kernel: [ 1810.180467] RSP: 0018:ffff8803b0
00010246
Aug 16 20:24:25 bagel kernel: [ 1810.185833] RAX: 005ffff80000002c RBX: ffff8802008b9960 RCX: dead000000200200
Aug 16 20:24:25 bagel kernel: [ 1810.193032] RDX: 000077ff80000000 RSI: ffff88085a1313c0 RDI: ffff8802008b9960
Aug 16 20:24:25 bagel kernel: [ 1810.200231] RBP: ffff8803b09b7cb8 R08: ffff8803b09b7c80 R09: ffffea0008022e40
Aug 16 20:24:25 bagel kernel: [ 1810.207431] R10: 0000000000002fe4 R11: 0000000000000000 R12: 0000000149928000
Aug 16 20:24:25 bagel kernel: [ 1810.214629] R13: ffffffffa02e5c8c R14: ffff8803b09b7e50 R15: ffff8801009b5600
Aug 16 20:24:25 bagel kernel: [ 1810.221829] FS: 00007f35609ff70
Aug 16 20:24:25 bagel kernel: [ 1810.229992] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 16 20:24:25 bagel kernel: [ 1810.235799] CR2: 00007fbf24022a98 CR3: 0000000100a81000 CR4: 00000000001027e0
Aug 16 20:24:25 bagel kernel: [ 1810.243001] Stack:
Aug 16 20:24:25 bagel kernel: [ 1810.245037] ffff8802008b9960 ffff8802008b9960 0000000149928000 ffff8803b09b7da8
Aug 16 20:24:25 bagel kernel: [ 1810.252590] ffff8803b09b7d48 ffffffffa02e5c8c 0000000000004800 ffff8806eea842c0
Aug 16 20:24:25 bagel kernel: [ 1810.260145] 0000000000004800 00000001f4000000 000000014992c800 0000000000000000
Aug 16 20:24:25 bagel kernel: [ 1810.267699] Call Trace:
Aug 16 20:24:25 bagel kernel: [ 1810.270189] [<ffffffffa02e5
Aug 16 20:24:25 bagel kernel: [ 1810.276525] [<ffffffff811ac