Network related kernel panic on Atom 64bit system using saucy backport stack on precise

Bug #1251946 reported by Stéphane Graber
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

I can't report this through ubuntu-bug as the system is a firewall with restricted connectivity and the kernel bug prevents me from using the system for more than a couple minutes anyway.

I recently deployed the saucy kernel on that precise system. After the first reboot things looked fine, but just a few minutes later the whole system hung. I reproduced this reliably 3-4 times, then hooked up a console cable and captured the following kernel panic:

[ 124.428031] ------------[ cut here ]------------
[ 124.432656] Kernel BUG at ffffffff81629ce8 [verbose debug info unavailable]
[ 124.439622] invalid opcode: 0000 [#1] SMP
[ 124.443766] Modules linked in: authenc(F) esp6(F) xfrm6_mode_transport(F) ipcomp6(F) xfrm6_tunnel(F) tunnel6(F) xfrm4_mode_tunnel(F) xfrm6_mode_tunnel(F) v)
[ 124.577657] CPU: 0 PID: 12800 Comm: samba Tainted: GF WC 3.11.0-13-generic #20~precise2-Ubuntu
[ 124.586962] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
[ 124.596268] task: ffff8800cd711770 ti: ffff8800bfde2000 task.ti: ffff8800bfde2000
[ 124.603752] RIP: 0010:[<ffffffff81629ce8>] [<ffffffff81629ce8>] pskb_expand_head+0x288/0x2d0
[ 124.612305] RSP: 0018:ffff88012fc03688 EFLAGS: 00010202
[ 124.617621] RAX: 0000000000000003 RBX: ffff8800cd44d000 RCX: 0000000000000020
[ 124.624762] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 124.631900] RBP: ffff88012fc036c8 R08: 3f2c680500000060 R09: 3f2c680500000060
[ 124.639037] R10: bfcb2ffeff3e1602 R11: 20270ff0c0f20726 R12: 0000000000000050
[ 124.646175] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff81709280
[ 124.653317] FS: 00007f81c5448740(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[ 124.661407] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 124.667158] CR2: 000000000330e780 CR3: 00000000bfdc9000 CR4: 00000000000007f0
[ 124.674295] Stack:
[ 124.676318] ffffe8ffffc00640 0000000180400040 ffff8801264d6d40 ffff8800cd44d000
[ 124.683792] ffff880126ddc000 ffff880126919180 0000000000000001 ffffffff81709280
[ 124.691262] ffff88012fc036f8 ffffffff816caac0 ffff8800cd44d000 0000000000000001
[ 124.698729] Call Trace:
[ 124.701187] <IRQ>
[ 124.703121] [<ffffffff81709280>] ? xfrm6_extract_output+0x50/0x50
[ 124.709525] [<ffffffff816caac0>] xfrm_output_one+0x90/0x230
[ 124.715189] [<ffffffff816cad26>] xfrm_output_resume+0xc6/0x170
[ 124.721112] [<ffffffff816cade3>] xfrm_output2+0x13/0x20
[ 124.726429] [<ffffffff816cae2f>] xfrm_output+0x3f/0x100
[ 124.731747] [<ffffffff81709299>] xfrm6_output_finish+0x19/0x20
[ 124.737669] [<ffffffff816d5729>] ip6_fragment+0x929/0xab0
[ 124.743160] [<ffffffff81709280>] ? xfrm6_extract_output+0x50/0x50
[ 124.749344] [<ffffffff81708ed0>] ? xfrm6_local_rxpmtu+0x70/0x70
[ 124.755356] [<ffffffff81708fa2>] __xfrm6_output+0xd2/0x180
[ 124.760933] [<ffffffff817092ca>] xfrm6_output+0x2a/0x70
[ 124.766250] [<ffffffff816d4a03>] ip6_forward+0x243/0x640
[ 124.771654] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.776971] [<ffffffff816d5aa8>] ip6_rcv_finish+0x88/0x90
[ 124.782460] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.787787] [<ffffffffa027783d>] __ipv6_conntrack_in+0xdd/0x180 [nf_conntrack_ipv6]
[ 124.795539] [<ffffffffa0277905>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 124.803027] [<ffffffff816660df>] nf_iterate+0x8f/0xd0
[ 124.808172] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.813488] [<ffffffff8166619d>] nf_hook_slow+0x7d/0x150
[ 124.818891] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.824210] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.829532] [<ffffffffa021b3e2>] nf_ct_frag6_output+0xf2/0x100 [nf_defrag_ipv6]
[ 124.836932] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.842255] [<ffffffffa021a0a8>] ipv6_defrag.part.2+0x98/0xe0 [nf_defrag_ipv6]
[ 124.849577] [<ffffffffa021a11c>] ipv6_defrag+0x2c/0x30 [nf_defrag_ipv6]
[ 124.856284] [<ffffffff816660df>] nf_iterate+0x8f/0xd0
[ 124.861429] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.866743] [<ffffffff8166619d>] nf_hook_slow+0x7d/0x150
[ 124.872146] [<ffffffff816d5a20>] ? ip6_output+0xb0/0xb0
[ 124.877467] [<ffffffff816d6213>] ipv6_rcv+0x313/0x3f0
[ 124.882609] [<ffffffff81636cba>] __netif_receive_skb_core+0x5ca/0x720
[ 124.889140] [<ffffffff81636e31>] __netif_receive_skb+0x21/0x70
[ 124.895063] [<ffffffff81636ea3>] netif_receive_skb+0x23/0x90
[ 124.900825] [<ffffffffa02fe6af>] br_pass_frame_up+0x9f/0x110 [bridge]
[ 124.907364] [<ffffffffa02fe8fb>] br_handle_frame_finish+0x1db/0x330 [bridge]
[ 124.914509] [<ffffffffa02febe4>] br_handle_frame+0x194/0x260 [bridge]
[ 124.921050] [<ffffffffa02fea50>] ? br_handle_frame_finish+0x330/0x330 [bridge]
[ 124.928362] [<ffffffff81636a69>] __netif_receive_skb_core+0x379/0x720
[ 124.934894] [<ffffffff810978e6>] ? ttwu_queue+0xb6/0xd0
[ 124.940210] [<ffffffff81636e31>] __netif_receive_skb+0x21/0x70
[ 124.946135] [<ffffffff816376f1>] process_backlog+0xb1/0x190
[ 124.951799] [<ffffffff81637f34>] net_rx_action+0x134/0x260
[ 124.957378] [<ffffffff8106a510>] __do_softirq+0xe0/0x280
[ 124.962780] [<ffffffff8175071c>] call_softirq+0x1c/0x30
[ 124.968094] <EOI>
[ 124.970029] [<ffffffff81015e35>] do_softirq+0x65/0xa0
[ 124.975391] [<ffffffff8106a1a4>] local_bh_enable+0x94/0xa0
[ 124.980970] [<ffffffff816d2c31>] ip6_finish_output2+0x1e1/0x4b0
[ 124.986979] [<ffffffff816d2a50>] ? ip6_flush_pending_frames+0xb0/0xb0
[ 124.993509] [<ffffffff816d5729>] ip6_fragment+0x929/0xab0
[ 124.999002] [<ffffffff816d2a50>] ? ip6_flush_pending_frames+0xb0/0xb0
[ 125.005532] [<ffffffff816d58b0>] ? ip6_fragment+0xab0/0xab0
[ 125.011196] [<ffffffff816d21f0>] ? ac6_proc_exit+0x20/0x20
[ 125.016774] [<ffffffff816d5931>] ip6_finish_output+0x81/0xc0
[ 125.022523] [<ffffffff816d59ac>] ip6_output+0x3c/0xb0
[ 125.027667] [<ffffffff816d4332>] ? __ip6_local_out+0x72/0x80
[ 125.033418] [<ffffffff816d4369>] ip6_local_out+0x29/0x30
[ 125.038821] [<ffffffff816d4643>] ip6_push_pending_frames+0x2d3/0x450
[ 125.045268] [<ffffffff816ed236>] udp_v6_push_pending_frames+0x136/0x320
[ 125.051972] [<ffffffff816ee228>] udpv6_sendmsg+0x908/0xb30
[ 125.057551] [<ffffffff813319ae>] ? aa_net_perm+0xae/0x110
[ 125.063040] [<ffffffff816da84a>] ? ipv6_dev_get_saddr+0x22a/0x320
[ 125.069225] [<ffffffff816a58d1>] inet_sendmsg+0x61/0xb0
[ 125.074542] [<ffffffff81331a86>] ? aa_revalidate_sk+0x76/0x80
[ 125.080378] [<ffffffff8132c0f7>] ? apparmor_socket_sendmsg+0x17/0x20
[ 125.086823] [<ffffffff8161d322>] sock_sendmsg+0xc2/0xe0
[ 125.092143] [<ffffffff81089aed>] ? add_wait_queue+0x4d/0x60
[ 125.097805] [<ffffffff811fb18c>] ? ep_scan_ready_list.isra.11+0x1bc/0x1c0
[ 125.104683] [<ffffffff81620348>] SYSC_sendto+0x138/0x190
[ 125.110087] [<ffffffff81745ba5>] ? _raw_spin_lock_irq+0x15/0x20
[ 125.116096] [<ffffffff811fbee6>] ? SyS_epoll_wait+0xd6/0xf0
[ 125.121761] [<ffffffff8162097e>] SyS_sendto+0xe/0x10
[ 125.126819] [<ffffffff8174ed1d>] system_call_fastpath+0x1a/0x1f
[ 125.132824] Code: b2 ff e9 16 ff ff ff b8 f4 ff ff ff eb a6 48 89 d7 48 89 55 c0 e8 49 cb b2 ff 84 c0 48 8b 55 c0 0f 85 9c fe ff ff e9 93 fe ff ff <0f> 0b
[ 125.152869] RIP [<ffffffff81629ce8>] pskb_expand_head+0x288/0x2d0
[ 125.159072] RSP <ffff88012fc03688>
[ 125.162682] ---[ end trace 389dbe75bb7ab4d0 ]---
[ 125.167349] Kernel panic - not syncing: Fatal exception in interrupt
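
For triage, the frames of a trace like the one above can be pulled out mechanically. The following is a minimal Python sketch, assuming the console log is available as text; the regex and the `parse_frames` helper are illustrative and not part of any kernel tooling. A leading "?" on a frame means the address was found on the stack but the unwinder could not confirm it as part of the call chain.

```python
import re

# Matches frames such as
#   [<ffffffff816caac0>] xfrm_output_one+0x90/0x230
#   [<ffffffffa02fe6af>] br_pass_frame_up+0x9f/0x110 [bridge]
# The optional "?" flags a frame the kernel does not consider reliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per symbol+offset/size frame found in log_text."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # registers, stack words, headers, etc.
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),  # bytes into the function
            "size": int(m.group("size"), 16),      # total function size
            "module": m.group("module"),           # None for built-in code
            "reliable": m.group("unreliable") is None,
        })
    return frames


# Three lines copied from the trace above.
sample = """\
[ 124.703121] [<ffffffff81709280>] ? xfrm6_extract_output+0x50/0x50
[ 124.709525] [<ffffffff816caac0>] xfrm_output_one+0x90/0x230
[ 124.900825] [<ffffffffa02fe6af>] br_pass_frame_up+0x9f/0x110 [bridge]
"""
frames = parse_frames(sample)
print([f["symbol"] for f in frames if f["reliable"]])
```

Keeping only the reliable frames gives a shorter chain that is easier to compare between kernel versions.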

Additionally, the following kernel warning (it looks unrelated, but you never know) appears at boot time:
Nov 17 02:41:33 sateda kernel: [ 5.767282] gma500 0000:00:02.0: setting latency timer to 64
Nov 17 02:41:33 sateda kernel: [ 5.767457] gma500 0000:00:02.0: irq 47 for MSI/MSI-X
Nov 17 02:41:33 sateda kernel: [ 5.767569] gma500 0000:00:02.0: GPU: power management timed out.
Nov 17 02:41:33 sateda kernel: [ 5.795072] resource map sanity check conflict: 0xcf800000 0xcffbefff 0xcf800000 0xcf9fffff PCI Bus 0000:02
Nov 17 02:41:33 sateda kernel: [ 5.795081] ------------[ cut here ]------------
Nov 17 02:41:33 sateda kernel: [ 5.795094] WARNING: CPU: 0 PID: 517 at /build/buildd/linux-lts-saucy-3.11.0/arch/x86/mm/ioremap.c:171 __ioremap_caller+0x382/0x390()
Nov 17 02:41:33 sateda kernel: [ 5.795098] Info: mapping multiple BARs. Your kernel is fine.
Nov 17 02:41:33 sateda kernel: [ 5.795101] Modules linked in: snd_hda_intel(F+) gma500_gfx(F+) snd_hda_codec(F) nls_iso8859_1(F) snd_hwdep(F) snd_pcm(F) drm_kms_helper(F) xt_tcpudp(F) drm(F) psmouse(F) snd_timer(F) snd(F) serio_raw(F) lpc_ich(F) soundcore(F) i2c_algo_bit(F) snd_page_alloc(F) parport_pc(F) iptable_nat(F) nf_conntrack_ipv4(F) nf_defrag_ipv4(F) video(F) nf_nat_ipv4(F) nf_nat(F) nf_conntrack(F) mac_hid(F) ip6table_filter(F) ip6_tables(F) iptable_filter(F) lp(F) parport(F) ip_tables(F) x_tables(F) ahci(F) e1000e(F) ptp(F) pps_core(F) libahci(F)
Nov 17 02:41:33 sateda kernel: [ 5.795165] CPU: 0 PID: 517 Comm: modprobe Tainted: GF 3.11.0-13-generic #20~precise2-Ubuntu
Nov 17 02:41:33 sateda kernel: [ 5.795169] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
Nov 17 02:41:33 sateda kernel: [ 5.795174] 00000000000000ab ffff88012837b958 ffffffff8173a05d 00000000000015d0
Nov 17 02:41:33 sateda kernel: [ 5.795182] ffff88012837b9a8 ffff88012837b998 ffffffff8106539c ffff88012837b9b8
Nov 17 02:41:33 sateda kernel: [ 5.795190] ffffc90000b80000 00000000cf800000 00000000cf800000 00000000007bf000
Nov 17 02:41:33 sateda kernel: [ 5.795197] Call Trace:
Nov 17 02:41:33 sateda kernel: [ 5.795208] [<ffffffff8173a05d>] dump_stack+0x46/0x58
Nov 17 02:41:33 sateda kernel: [ 5.795217] [<ffffffff8106539c>] warn_slowpath_common+0x8c/0xc0
Nov 17 02:41:33 sateda kernel: [ 5.795223] [<ffffffff81065486>] warn_slowpath_fmt+0x46/0x50
Nov 17 02:41:33 sateda kernel: [ 5.795230] [<ffffffff81055f52>] __ioremap_caller+0x382/0x390
Nov 17 02:41:33 sateda kernel: [ 5.795249] [<ffffffffa0223dde>] ? psb_gtt_init+0x47e/0x580 [gma500_gfx]
Nov 17 02:41:33 sateda kernel: [ 5.795257] [<ffffffff8105609e>] ioremap_wc+0x2e/0x30
Nov 17 02:41:33 sateda kernel: [ 5.795272] [<ffffffffa0223dde>] psb_gtt_init+0x47e/0x580 [gma500_gfx]
Nov 17 02:41:33 sateda kernel: [ 5.795288] [<ffffffffa02282b4>] psb_driver_load+0x134/0x360 [gma500_gfx]
Nov 17 02:41:33 sateda kernel: [ 5.795317] [<ffffffffa0173de1>] drm_get_pci_dev+0x181/0x2e0 [drm]
Nov 17 02:41:33 sateda kernel: [ 5.795334] [<ffffffffa0227e65>] psb_probe+0x15/0x20 [gma500_gfx]
Nov 17 02:41:33 sateda kernel: [ 5.795341] [<ffffffff813abe3b>] local_pci_probe+0x4b/0x80
Nov 17 02:41:33 sateda kernel: [ 5.795348] [<ffffffff813ad6b9>] __pci_device_probe+0xd9/0xe0
Nov 17 02:41:33 sateda kernel: [ 5.795355] [<ffffffff813ad6fa>] pci_device_probe+0x3a/0x60
Nov 17 02:41:33 sateda kernel: [ 5.795363] [<ffffffff8149389c>] really_probe+0x6c/0x330
Nov 17 02:41:33 sateda kernel: [ 5.795369] [<ffffffff81493ce7>] driver_probe_device+0x47/0xa0
Nov 17 02:41:33 sateda kernel: [ 5.795376] [<ffffffff81493deb>] __driver_attach+0xab/0xb0
Nov 17 02:41:33 sateda kernel: [ 5.795382] [<ffffffff81493d40>] ? driver_probe_device+0xa0/0xa0
Nov 17 02:41:33 sateda kernel: [ 5.795389] [<ffffffff81491ace>] bus_for_each_dev+0x5e/0x90
Nov 17 02:41:33 sateda kernel: [ 5.795395] [<ffffffff8149345e>] driver_attach+0x1e/0x20
Nov 17 02:41:33 sateda kernel: [ 5.795402] [<ffffffff81492eec>] bus_add_driver+0x10c/0x290
Nov 17 02:41:33 sateda kernel: [ 5.795408] [<ffffffff8149436d>] driver_register+0x7d/0x160
Nov 17 02:41:33 sateda kernel: [ 5.795418] [<ffffffffa0255000>] ? 0xffffffffa0254fff
Nov 17 02:41:33 sateda kernel: [ 5.795425] [<ffffffff813ac67c>] __pci_register_driver+0x4c/0x50
Nov 17 02:41:33 sateda kernel: [ 5.795445] [<ffffffffa017405a>] drm_pci_init+0x11a/0x130 [drm]
Nov 17 02:41:33 sateda kernel: [ 5.795452] [<ffffffffa0255000>] ? 0xffffffffa0254fff
Nov 17 02:41:33 sateda kernel: [ 5.795469] [<ffffffffa0255017>] psb_init+0x17/0x1000 [gma500_gfx]
Nov 17 02:41:33 sateda kernel: [ 5.795476] [<ffffffff8100212a>] do_one_initcall+0xfa/0x1b0
Nov 17 02:41:33 sateda kernel: [ 5.795483] [<ffffffff810578d3>] ? set_memory_nx+0x43/0x50
Nov 17 02:41:33 sateda kernel: [ 5.795493] [<ffffffff8172e724>] do_init_module+0x80/0x1d1
Nov 17 02:41:33 sateda kernel: [ 5.795501] [<ffffffff810d1a49>] load_module+0x4c9/0x5f0
Nov 17 02:41:33 sateda kernel: [ 5.795507] [<ffffffff810cf240>] ? add_kallsyms+0x210/0x210
Nov 17 02:41:33 sateda kernel: [ 5.795514] [<ffffffff810d1c24>] SyS_init_module+0xb4/0x100
Nov 17 02:41:33 sateda kernel: [ 5.795522] [<ffffffff8174ed1d>] system_call_fastpath+0x1a/0x1f
Nov 17 02:41:33 sateda kernel: [ 5.795526] ---[ end trace c25c2c18f7a6e9ab ]---
Nov 17 02:41:33 sateda kernel: [ 5.824962] acpi device:29: registered as cooling_device2

The system itself is a D2500 Atom system running an up-to-date 64-bit Ubuntu 12.04. It has a considerable number of network interfaces, including bonds and bridges, and most of its traffic is IPv6.

None of the kernel traces above appear when running the original 3.2 kernel, so for the time being I've reverted to that kernel.

This is a production firewall system so I can't easily make it accessible for debugging, but I'll be happy to do test runs for any debugging kernels you may provide.

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1251946

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: saucy
Revision history for this message
Stéphane Graber (stgraber) wrote :

Moving to Confirmed to make the bot happy. I believe all immediately useful information is above; let me know if you need more.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Stéphane Graber (stgraber) wrote :

I tried the current -proposed kernel, which seems considerably more reliable: instead of panicking within a couple of minutes of boot, it now does so within 30 minutes or so. The current panic message is:

[ 1407.957715] ------------[ cut here ]------------
[ 1407.962346] Kernel BUG at ffffffff815e1310 [verbose debug info unavailable]
[ 1407.969310] invalid opcode: 0000 [#1] SMP
[ 1407.973445] Modules linked in: authenc(F) esp6(F) xfrm6_mode_transport(F) ipcomp6(F) xfrm6_tunnel(F) tunnel6(F) xfrm4_mode_tunnel(F) xfrm6_mode_tunnel(F) v)
[ 1408.087941] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF 3.11.0-14-generic #21-Ubuntu
[ 1408.096465] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
[ 1408.105773] task: ffffffff81c15440 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[ 1408.113259] RIP: 0010:[<ffffffff815e1310>] [<ffffffff815e1310>] pskb_expand_head+0x230/0x280
[ 1408.121810] RSP: 0018:ffff88012fc036b0 EFLAGS: 00010202
[ 1408.127126] RAX: 0000000000000003 RBX: ffff8800b6d80e00 RCX: 0000000000000020
[ 1408.134267] RDX: 00000000000006d2 RSI: 0000000000000012 RDI: ffff8800b6d80e00
[ 1408.141404] RBP: ffff88012fc036e8 R08: 3f2c5c0500000060 R09: 3f2c5c0500000060
[ 1408.148543] R10: f5fe8db0e1a07e51 R11: 00270ff0c0f20726 R12: 0000000000000001
[ 1408.155681] R13: 0000000000000012 R14: ffff8800b6d80e00 R15: ffff88008933c17e
[ 1408.162822] FS: 0000000000000000(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[ 1408.170912] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1408.176663] CR2: 00000000028b5180 CR3: 0000000107f8f000 CR4: 00000000000007f0
[ 1408.183800] Stack:
[ 1408.185823] 0000000000000282 ffffffff81cd2940 ffff8800b6d80e00 0000000000000001
[ 1408.193306] ffff880110c89000 ffff8800b6d80e00 ffff88008933c17e ffff88012fc03720
[ 1408.200776] ffffffff8167f5d6 ffffffffa009de62 ffff88011ea25200 ffff8800b6d80e00
[ 1408.208260] Call Trace:
[ 1408.210716] <IRQ>
[ 1408.212650] [<ffffffff8167f5d6>] xfrm_output_resume+0xa6/0x390
[ 1408.218801] [<ffffffffa009de62>] ? __nf_conntrack_confirm+0x322/0x460 [nf_conntrack]
[ 1408.226636] [<ffffffff8167f920>] xfrm_output+0x40/0x100
[ 1408.231954] [<ffffffff816bc82c>] xfrm6_output_finish+0x1c/0x20
[ 1408.237877] [<ffffffff81689dbe>] ip6_fragment+0x77e/0xa60
[ 1408.243367] [<ffffffff816bc810>] ? xfrm6_extract_output+0xe0/0xe0
[ 1408.249551] [<ffffffff816bc5ef>] __xfrm6_output+0xcf/0x170
[ 1408.255129] [<ffffffff816bc85a>] xfrm6_output+0x2a/0x70
[ 1408.260445] [<ffffffff816890da>] ip6_forward+0x29a/0x800
[ 1408.265851] [<ffffffff81696a94>] ? ip6_route_input+0xa4/0xd0
[ 1408.271599] [<ffffffff8168a210>] ? ip6_output+0xb0/0xb0
[ 1408.276917] [<ffffffff8168a290>] ip6_rcv_finish+0x80/0x90
[ 1408.282414] [<ffffffffa010b86e>] __ipv6_conntrack_in+0xce/0x1a0 [nf_conntrack_ipv6]
[ 1408.290167] [<ffffffffa010b9c7>] ipv6_conntrack_in+0x27/0x30 [nf_conntrack_ipv6]
[ 1408.297656] [<ffffffff8161cc0b>] nf_iterate+0x8b/0xa0
[ 1408.302801] [<ffffffff8168a210>] ? ip6_output+0xb0/0xb0
[ 1408.308117] [<ffffffff8161cc94>] nf_hook_slow+0x74/0x130
[ 1408.313521] [<ffffffff8168a210>] ? ip6_output+0xb0/0xb0
[ 1408.318846] [<ffffffffa00fe213>] nf_ct_frag6_output+...


Changed in linux (Ubuntu):
importance: Undecided → High
Revision history for this message
Stéphane Graber (stgraber) wrote :

And the same thing reproduced after just a few seconds on the current 3.12 kernel:

[ 54.047386] ------------[ cut here ]------------
[ 54.052018] Kernel BUG at ffffffff81607e40 [verbose debug info unavailable]
[ 54.058985] invalid opcode: 0000 [#1] SMP
[ 54.063118] Modules linked in: authenc(F) esp6(F) xfrm6_mode_transport(F) ipcomp6(F) xfrm6_tunnel(F) tunnel6(F) xfrm4_mode_tunnel(F) xfrm6_mode_tunnel(F) v)
[ 54.174614] CPU: 1 PID: 13117 Comm: nsupdate Tainted: GF 3.12.0-2-generic #7-Ubuntu
[ 54.183226] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
[ 54.192532] task: ffff88009b38af60 ti: ffff880096142000 task.ti: ffff880096142000
[ 54.200018] RIP: 0010:[<ffffffff81607e40>] [<ffffffff81607e40>] pskb_expand_head+0x230/0x280
[ 54.208570] RSP: 0018:ffff88012fc83840 EFLAGS: 00010202
[ 54.213886] RAX: 0000000000000003 RBX: ffff8800a68cf800 RCX: 0000000000000020
[ 54.221027] RDX: 0000000000000710 RSI: 0000000000000050 RDI: ffff8800a68cf800
[ 54.228164] RBP: ffff88012fc83878 R08: 46f061feff3e1602 R09: 46f061feff3e1602
[ 54.235304] R10: bfcb2ffeff3e1602 R11: 20104b7170040120 R12: 0000000000000001
[ 54.242440] R13: 0000000000000050 R14: ffff8800a68cf800 R15: ffff8800a6973840
[ 54.249581] FS: 00007fdb85441700(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 54.257672] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.263424] CR2: 00007fdb80037fd8 CR3: 000000009b25b000 CR4: 00000000000007e0
[ 54.270560] Stack:
[ 54.272583] 0000000000000286 ffffffff81cd5f80 ffff8800a68cf800 0000000000000001
[ 54.280058] ffff8800c2321400 ffff8800a68cf800 ffff8800a6973840 ffff88012fc838b0
[ 54.287535] ffffffff816a7986 ffffffffa00a1cf2 ffff8801286cbf00 ffff8800a68cf800
[ 54.295021] Call Trace:
[ 54.297475] <IRQ>
[ 54.299410] [<ffffffff816a7986>] xfrm_output_resume+0xa6/0x390
[ 54.305568] [<ffffffffa00a1cf2>] ? __nf_conntrack_confirm+0x322/0x460 [nf_conntrack]
[ 54.313404] [<ffffffff816a7cd0>] xfrm_output+0x40/0x100
[ 54.318720] [<ffffffff816b000a>] ? ip6_xmit+0x27a/0x420
[ 54.324037] [<ffffffff816e500c>] xfrm6_output_finish+0x1c/0x20
[ 54.329962] [<ffffffff816b223e>] ip6_fragment+0x78e/0xa60
[ 54.335450] [<ffffffff816e4ff0>] ? xfrm6_extract_output+0xe0/0xe0
[ 54.341636] [<ffffffff816b2680>] ? ip6_output+0xb0/0xb0
[ 54.346954] [<ffffffff816e4dcf>] __xfrm6_output+0xcf/0x170
[ 54.352531] [<ffffffff816e503a>] xfrm6_output+0x2a/0x70
[ 54.357849] [<ffffffff816b1563>] ip6_forward+0x293/0x7e0
[ 54.363253] [<ffffffff816bf0a4>] ? ip6_route_input+0xa4/0xd0
[ 54.369001] [<ffffffff816b2680>] ? ip6_output+0xb0/0xb0
[ 54.374321] [<ffffffff816b2700>] ip6_rcv_finish+0x80/0x90
[ 54.379815] [<ffffffffa00f584e>] __ipv6_conntrack_in+0xce/0x1a0 [nf_conntrack_ipv6]
[ 54.387569] [<ffffffffa00f59a7>] ipv6_conntrack_in+0x27/0x30 [nf_conntrack_ipv6]
[ 54.395061] [<ffffffff81644a9b>] nf_iterate+0x8b/0xa0
[ 54.400204] [<ffffffff816b2680>] ? ip6_output+0xb0/0xb0
[ 54.405519] [<ffffffff81644b24>] nf_hook_slow+0x74/0x130
[ 54.410923] [<ffffffff816b2680>] ? ip6_output+0xb0/0xb0
[ 54.416247] [<ffffffffa00fc213>] nf_ct_frag6_out...


Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Can you give the latest 3.12 kernel[0] a test, to see if this is already fixed in mainline? If the bug still exists, we can perform a bisect to find the commit that introduced this.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12-trusty/

tags: added: performing-bisect
Revision history for this message
Stéphane Graber (stgraber) wrote :

Same thing on a 3.8 kernel:

[ 151.530274] ------------[ cut here ]------------
[ 151.534897] Kernel BUG at ffffffff815be3a8 [verbose debug info unavailable]
[ 151.541849] invalid opcode: 0000 [#1] SMP
[ 151.545978] Modules linked in: authenc(F) esp6(F) xfrm6_mode_transport(F) ipcomp6(F) xfrm6_tunnel(F) tunnel6(F) xfrm4_mode_tunnel(F) xfrm6_mode_tunnel(F) v)
[ 151.657182] CPU 1
[ 151.659030] Pid: 13133, comm: samba Tainted: GF 3.8.0-34-generic #49-Ubuntu /D2500CC
[ 151.669380] RIP: 0010:[<ffffffff815be3a8>] [<ffffffff815be3a8>] pskb_expand_head+0x248/0x290
[ 151.677916] RSP: 0018:ffff88012fc83870 EFLAGS: 00010202
[ 151.683223] RAX: 0000000000000002 RBX: ffff880010186200 RCX: 0000000000000020
[ 151.690351] RDX: 0000000000000710 RSI: 0000000000000050 RDI: ffff880010186200
[ 151.697474] RBP: ffff88012fc838a8 R08: 46f061feff3e1602 R09: 46f061feff3e1602
[ 151.704599] R10: bfcb2ffeff3e1602 R11: 20104b7170040120 R12: 0000000000000001
[ 151.711726] R13: 0000000000000050 R14: 0000000000000548 R15: ffff88011a3fc840
[ 151.718853] FS: 00007f78e5514740(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 151.726930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 151.732668] CR2: 0000000003f1b3b0 CR3: 00000000b8640000 CR4: 00000000000007e0
[ 151.739796] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 151.746921] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 151.754047] Process samba (pid: 13133, threadinfo ffff8800b865a000, task ffff880111e9dd00)
[ 151.762297] Stack:
[ 151.764314] ffffffff8106a555 0000000000000286 ffff880010186200 0000000000000001
[ 151.771793] ffff8801167cb800 0000000000000548 ffff88011a3fc840 ffff88012fc838e0
[ 151.779248] ffffffff81656fa4 ffff88012fc838e8 ffff880010186a00 ffff880010186200
[ 151.786713] Call Trace:
[ 151.789161] <IRQ>
[ 151.791084] [<ffffffff8106a555>] ? mod_timer+0x165/0x200
[ 151.796690] [<ffffffff81656fa4>] xfrm_output_resume+0xa4/0x370
[ 151.802604] [<ffffffff81657283>] xfrm_output2+0x13/0x20
[ 151.807910] [<ffffffff816572cb>] xfrm_output+0x3b/0xe0
[ 151.813130] [<ffffffff81693529>] xfrm6_output_finish+0x19/0x20
[ 151.819044] [<ffffffff81661a45>] ip6_fragment+0x915/0xa60
[ 151.824523] [<ffffffff81693510>] ? xfrm6_extract_output+0xe0/0xe0
[ 151.830698] [<ffffffff816933e9>] __xfrm6_output+0x109/0x150
[ 151.836352] [<ffffffff8169355a>] xfrm6_output+0x2a/0x70
[ 151.841658] [<ffffffff81660bce>] ip6_forward+0x2ae/0x810
[ 151.847052] [<ffffffff8166dd69>] ? ip6_route_input+0x99/0xc0
[ 151.852793] [<ffffffff81661d00>] ? ip6_output+0xb0/0xb0
[ 151.858099] [<ffffffff81661d80>] ip6_rcv_finish+0x80/0x90
[ 151.863590] [<ffffffffa011f7ef>] __ipv6_conntrack_in+0xdf/0x180 [nf_conntrack_ipv6]
[ 151.871327] [<ffffffffa011f917>] ipv6_conntrack_in+0x27/0x30 [nf_conntrack_ipv6]
[ 151.878802] [<ffffffff815f6216>] nf_iterate+0x86/0xb0
[ 151.883938] [<ffffffff81661d00>] ? ip6_output+0xb0/0xb0
[ 151.889244] [<ffffffff815f62b4>] nf_hook_slow+0x74/0x130
[ 151.894638] [<ffffffff81661d00>] ? ip6_output+0xb0/0xb0
[ 151.899953] [<ffffffffa0117e62>] nf_ct_frag6_output+0xc2/0x100 [nf_defrag_ipv6]
[ ...


Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I'd like to perform a bisect to figure out what commit caused this regression. We need to identify the earliest kernel where the issue started happening as well as the latest kernel that did not have this issue.

Can you test the following kernels and report back? We are looking for the first kernel version that exhibits this bug:
v3.3 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/
v3.4 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-quantal/
v3.5 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5-quantal/
v3.6 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.6-quantal/
v3.7 final: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-raring/

You don't have to test every kernel, just up until the kernel that first has this bug.

Thanks in advance!
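
One way to cut down the number of test boots is to search the version list by halving rather than testing in order, the same idea `git bisect` applies to commits. Below is a minimal Python sketch of that search. The `first_bad` helper, the `is_bad` stand-in and the placeholder outcome are all hypothetical; in practice `is_bad` means booting that kernel and waiting for a panic. It assumes that once a version is bad, every later version is bad too.

```python
def first_bad(versions, is_bad):
    """Return the earliest version for which is_bad(version) is true.

    versions must be ordered oldest to newest. Returns None if no
    version is bad.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid          # first bad version is mid or earlier
        else:
            lo = mid + 1      # first bad version is after mid
    return versions[lo] if lo < len(versions) else None


# Candidate kernels, oldest to newest (3.8 is already known to panic).
candidates = ["3.3", "3.4", "3.5", "3.6", "3.7", "3.8"]

# Placeholder outcome for illustration only: 3.8 is known to be bad,
# but nothing here establishes that it is the first bad version.
PLACEHOLDER_FIRST_BAD = "3.8"

boots = []  # records which kernels the search asked to test


def is_bad(version):
    # Stand-in for "boot this kernel and see whether it panics".
    boots.append(version)
    return candidates.index(version) >= candidates.index(PLACEHOLDER_FIRST_BAD)


result = first_bad(candidates, is_bad)
print(result, boots)
```

With six candidates this needs at most three test boots. Since the time to panic in this report ranges from a couple of minutes to about half an hour, a kernel should only be marked good after a generous soak period.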

Revision history for this message
Stéphane Graber (stgraber) wrote :

I've been running the standard backported 3.5 kernel for an hour without getting a panic.

I'm currently running a 3.11 debug kernel from Andy, once that one panics (probably in the next hour), I'll then try the 3.6 and 3.7 you linked above to try and figure out when the issue was introduced.

Revision history for this message
Stéphane Graber (stgraber) wrote :

And just as I was typing that, Andy's kernel panicked. Here's the dump:

[ 580.009012] ------------[ cut here ]------------
[ 580.013637] kernel BUG at /home/apw/build/ubuntu-saucy/ubuntu-saucy/net/core/skbuff.c:1059!
[ 580.021989] invalid opcode: 0000 [#1] SMP
[ 580.026133] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user xfrm4_tunne
[ 580.115883] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-14-generic #21lp1251946v201311181709
[ 580.124755] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
[ 580.134063] task: ffffffff81c15440 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[ 580.141548] RIP: 0010:[<ffffffff815e26be>] [<ffffffff815e26be>] pskb_expand_head+0x23e/0x280
[ 580.150099] RSP: 0018:ffff88012fc036b0 EFLAGS: 00010202
[ 580.155417] RAX: 0000000000000003 RBX: ffff88009d282700 RCX: 0000000000000020
[ 580.162554] RDX: 00000000000006d2 RSI: 0000000000000012 RDI: ffff88009d282700
[ 580.169693] RBP: ffff88012fc036e8 R08: 3f2c4b0500000060 R09: 3f2c4b0500000060
[ 580.176833] R10: f5fe8db0e1a07e51 R11: 00270ff0c0f20726 R12: 0000000000000001
[ 580.183969] R13: 0000000000000012 R14: ffff88009d282700 R15: ffff8800b28991fe
[ 580.191110] FS: 0000000000000000(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[ 580.199202] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 580.204951] CR2: 00007fd03715d000 CR3: 00000001121c5000 CR4: 00000000000007f0
[ 580.212090] Stack:
[ 580.214112] 0000000000000282 ffffffff81cd2940 ffff88009d282700 0000000000000001
[ 580.221595] ffff880117c1c800 ffff88009d282700 ffff8800b28991fe ffff88012fc03720
[ 580.229057] ffffffff81680aa6 ffffffffa00a5e62 ffff88009d282c00 ffff88009d282700
[ 580.236534] Call Trace:
[ 580.238988] <IRQ>
[ 580.240923] [<ffffffff81680aa6>] xfrm_output_resume+0xa6/0x390
[ 580.247082] [<ffffffffa00a5e62>] ? __nf_conntrack_confirm+0x322/0x460 [nf_conntrack]
[ 580.254917] [<ffffffff81680df0>] xfrm_output+0x40/0x100
[ 580.260235] [<ffffffff816bdd1c>] xfrm6_output_finish+0x1c/0x20
[ 580.266156] [<ffffffff8168b26d>] ip6_fragment+0x77d/0xa60
[ 580.271648] [<ffffffff816bdd00>] ? xfrm6_extract_output+0xe0/0xe0
[ 580.277833] [<ffffffff816bdadf>] __xfrm6_output+0xcf/0x170
[ 580.283409] [<ffffffff816bdd4a>] xfrm6_output+0x2a/0x70
[ 580.288726] [<ffffffff8168a58a>] ip6_forward+0x29a/0x800
[ 580.294130] [<ffffffff81697f54>] ? ip6_route_input+0xa4/0xd0
[ 580.299881] [<ffffffff8168b6c0>] ? ip6_output+0xb0/0xb0
[ 580.305198] [<ffffffff8168b740>] ip6_rcv_finish+0x80/0x90
[ 580.310695] [<ffffffffa012486e>] __ipv6_conntrack_in+0xce/0x1a0 [nf_conntrack_ipv6]
[ 580.318448] [<ffffffffa01249c7>] ipv6_conntrack_in+0x27/0x30 [nf_conntrack_ipv6]
[ 580.325937] [<ffffffff8161e04b>] nf_iterate+0x8b/0xa0
[ 580.331083] [<ffffffff8168b6c0>] ? ip6_output+0xb0/0xb0
[ 580.336397] [<ffffffff8161e0d4>] nf_hook_slow+0x74/0x130
[ 580.341799] [<ffffffff8168b6c0>] ? ip6_output+0xb0/0xb0
[ 580.347127] [<ffffffffa011b213>] nf_ct_frag6_output+0xe3/0x100 [nf_defrag_ipv6]
[ 580.354524] [<ffffffff8168b6c0>] ? ip6_output+0xb0/0xb0
[ 580.359847] [...


Revision history for this message
Stéphane Graber (stgraber) wrote :

It's now been 45 minutes without a panic on the 3.6 mainline build, so I'm considering that one good and moving on to 3.7.

Revision history for this message
Stéphane Graber (stgraber) wrote :

And it's now been over 30 minutes on 3.7 mainline still without any crash.

Now getting the 3.8 mainline to confirm that one gives the same panic as the Ubuntu 3.8 did.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Looks like the issue was introduced in 3.8. The mainline 3.8 kernel panicked just over two minutes after booting:

[ 127.201762] ------------[ cut here ]------------
[ 127.206379] Kernel BUG at ffffffff815cca98 [verbose debug info unavailable]
[ 127.213331] invalid opcode: 0000 [#1] SMP
[ 127.217459] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user xfrm4_tunne
[ 127.305010] CPU 0
[ 127.306856] Pid: 0, comm: swapper/0 Not tainted 3.8.0-030800-generic #201302181935 /D2500CC
[ 127.316853] RIP: 0010:[<ffffffff815cca98>] [<ffffffff815cca98>] pskb_expand_head+0x278/0x2c0
[ 127.325391] RSP: 0018:ffff88012fc03590 EFLAGS: 00010202
[ 127.330697] RAX: 0000000000000002 RBX: ffff880129311300 RCX: 0000000000000020
[ 127.337823] RDX: 0000000000000000 RSI: 0000000000000012 RDI: 00000000000006c0
[ 127.344949] RBP: ffff88012fc035d0 R08: 196ccefeff3e1602 R09: 196ccefeff3e1602
[ 127.352074] R10: e9976c2389223888 R11: 201011b570040120 R12: 0000000000000012
[ 127.359200] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816a3e50
[ 127.366326] FS: 0000000000000000(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[ 127.374402] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 127.380144] CR2: 0000000000a7f7c0 CR3: 0000000107b86000 CR4: 00000000000007f0
[ 127.387270] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 127.394395] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 127.401521] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c15440)
[ 127.409768] Stack:
[ 127.411779] ffffe8ffffc000e0 0000000000000286 ffff880127d38b40 ffff880129311300
[ 127.419249] ffff880118456c00 ffff8800ad8bf180 0000000000000001 ffffffff816a3e50
[ 127.426712] ffff88012fc03600 ffffffff816674d0 ffff880129311300 0000000000000001
[ 127.434178] Call Trace:
[ 127.436628] <IRQ>
[ 127.438559] [<ffffffff816a3e50>] ? xfrm6_extract_output+0x50/0x50
[ 127.444950] [<ffffffff816674d0>] xfrm_output_one+0x90/0x220
[ 127.450605] [<ffffffff81667726>] xfrm_output_resume+0xc6/0x170
[ 127.456517] [<ffffffff816677e3>] xfrm_output2+0x13/0x20
[ 127.461826] [<ffffffff8166782f>] xfrm_output+0x3f/0x100
[ 127.467133] [<ffffffff816a3e69>] xfrm6_output_finish+0x19/0x20
[ 127.473046] [<ffffffff81672105>] ip6_fragment+0x925/0xaa0
[ 127.478525] [<ffffffff816a3e50>] ? xfrm6_extract_output+0x50/0x50
[ 127.484698] [<ffffffff816a3b50>] ? xfrm6_local_error+0x80/0x80
[ 127.490613] [<ffffffff816a3c52>] __xfrm6_output+0x102/0x170
[ 127.496265] [<ffffffff816a3e9a>] xfrm6_output+0x2a/0x70
[ 127.501575] [<ffffffff816713d2>] ip6_forward+0x242/0x650
[ 127.506967] [<ffffffff816723f0>] ? ip6_output+0xb0/0xb0
[ 127.512275] [<ffffffff81672478>] ip6_rcv_finish+0x88/0x90
[ 127.517755] [<ffffffff816723f0>] ? ip6_output+0xb0/0xb0
[ 127.523071] [<ffffffffa011b882>] __ipv6_conntrack_in+0xe2/0x180 [nf_conntrack_ipv6]
[ 127.530804] [<ffffffff815cc890>] ? pskb_expand_head+0x70/0x2c0
[ 127.536722] [<ffffffffa011b945>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 127.544197] [<ffffffff816058ff>]...

Stéphane Graber (stgraber) wrote :

Oh, actually, just before I could reboot into the original 3.2 kernel, I got another panic on that 3.8 mainline kernel. This one was a bit different: it appeared to be stuck in some kind of loop for a while, eventually giving up after 20s and then rebooting (as my machines have panic=1).

Here's the very long trace:
[ 143.626520] ------------[ cut here ]------------
[ 143.631143] Kernel BUG at ffffffff815cca98 [verbose debug info unavailable]
[ 143.638096] invalid opcode: 0000 [#1] SMP
[ 143.642224] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user xfrm4_tunne
[ 143.729843] CPU 0
[ 143.731692] Pid: 15332, comm: nsupdate Not tainted 3.8.0-030800-generic #201302181935 /D2500CC
[ 143.741955] RIP: 0010:[<ffffffff815cca98>] [<ffffffff815cca98>] pskb_expand_head+0x278/0x2c0
[ 143.750492] RSP: 0018:ffff88012fc03728 EFLAGS: 00010202
[ 143.755801] RAX: 0000000000000002 RBX: ffff8800cea2bf00 RCX: 0000000000000020
[ 143.762925] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 143.770051] RBP: ffff88012fc03768 R08: 46f061feff3e1602 R09: 46f061feff3e1602
[ 143.777177] R10: bfcb2ffeff3e1602 R11: 20104b7170040120 R12: 0000000000000050
[ 143.784302] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816a3e50
[ 143.791429] FS: 00007f887ef0c700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[ 143.799505] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 143.805245] CR2: 00007f3df316c000 CR3: 00000000a734b000 CR4: 00000000000007f0
[ 143.812370] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 143.819496] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 143.826623] Process nsupdate (pid: 15332, threadinfo ffff8800a7356000, task ffff8801177c45c0)
[ 143.835133] Stack:
[ 143.837149] ffffe8ffffc00250 0000000000000282 ffff88012977c700 ffff8800cea2bf00
[ 143.844603] ffff8801188bf800 ffff8800a70fc540 0000000000000001 ffffffff816a3e50
[ 143.852066] ffff88012fc03798 ffffffff816674d0 ffff8800cea2bf00 0000000000000001
[ 143.859531] Call Trace:
[ 143.861983] <IRQ>
[ 143.863913] [<ffffffff816a3e50>] ? xfrm6_extract_output+0x50/0x50
[ 143.870313] [<ffffffff816674d0>] xfrm_output_one+0x90/0x220
[ 143.875968] [<ffffffff81667726>] xfrm_output_resume+0xc6/0x170
[ 143.881880] [<ffffffff816677e3>] xfrm_output2+0x13/0x20
[ 143.887187] [<ffffffff8166782f>] xfrm_output+0x3f/0x100
[ 143.892494] [<ffffffff816a3e69>] xfrm6_output_finish+0x19/0x20
[ 143.898408] [<ffffffff81672105>] ip6_fragment+0x925/0xaa0
[ 143.903887] [<ffffffff816a3e50>] ? xfrm6_extract_output+0x50/0x50
[ 143.910061] [<ffffffff816a3b50>] ? xfrm6_local_error+0x80/0x80
[ 143.915976] [<ffffffff816a3c52>] __xfrm6_output+0x102/0x170
[ 143.921629] [<ffffffff816a3e9a>] xfrm6_output+0x2a/0x70
[ 143.926935] [<ffffffff816713d2>] ip6_forward+0x242/0x650
[ 143.932330] [<ffffffff816723f0>] ? ip6_output+0xb0/0xb0
[ 143.937637] [<ffffffff81672478>] ip6_rcv_finish+0x88/0x90
[ 143.943118] [<ffffffff816723f0>] ? ip6_output+0xb0/0xb0
[ 143.948433] [<ffffffffa011b882>] __ipv6_conntrack_in+0xe2/0x180 [nf_c...

Joseph Salisbury (jsalisbury) wrote :

Can you test the following kernels and report back? We are looking for the earliest kernel version that exhibits this bug:

v3.8-rc4: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc4-raring/

If v3.8-rc4 does not exhibit the bug then test v3.8-rc6:
v3.8-rc6: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc6-raring/

If v3.8-rc4 does exhibit the bug then test v3.8-rc2:
v3.8-rc2: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.8-rc2-raring
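
The testing order above is one step of a manual search for the earliest affected release candidate. A minimal Python sketch of that decision, assuming a hypothetical helper named `next_kernel_to_test` (not part of any Ubuntu tooling) and only the three versions named in this comment:

```python
def next_kernel_to_test(rc4_affected: bool) -> str:
    """Return the follow-up kernel to test after v3.8-rc4 (hypothetical helper)."""
    # If rc4 panics, the regression is older, so step back to rc2.
    # If rc4 is clean, the regression came later, so step forward to rc6.
    return "v3.8-rc2" if rc4_affected else "v3.8-rc6"

print(next_kernel_to_test(True))   # v3.8-rc2
print(next_kernel_to_test(False))  # v3.8-rc6
```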

Stéphane Graber (stgraber) wrote :

rc4 is affected:
[ 76.048846] ------------[ cut here ]------------
[ 76.053467] Kernel BUG at ffffffff815cf5b8 [verbose debug info unavailable]
[ 76.060419] invalid opcode: 0000 [#1] SMP
[ 76.064549] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user xfrm4_tunne
[ 76.150972] CPU 1
[ 76.152821] Pid: 13301, comm: samba Not tainted 3.8.0-030800rc4-generic #201301172335 /D2500CC
[ 76.163086] RIP: 0010:[<ffffffff815cf5b8>] [<ffffffff815cf5b8>] pskb_expand_head+0x278/0x2c0
[ 76.171623] RSP: 0018:ffff88012fc83728 EFLAGS: 00010202
[ 76.176929] RAX: 0000000000000002 RBX: ffff8800be3ddd00 RCX: 0000000000000020
[ 76.184054] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 76.191180] RBP: ffff88012fc83768 R08: 3f2c7d0500000060 R09: 3f2c7d0500000060
[ 76.198307] R10: bfcb2ffeff3e1602 R11: 20270ff0c0f20726 R12: 0000000000000050
[ 76.205431] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816a6450
[ 76.212557] FS: 00007fccaaa4b740(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 76.220633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 76.226375] CR2: 00007ff1d5664018 CR3: 00000000b6e18000 CR4: 00000000000007e0
[ 76.233500] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 76.240625] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 76.247751] Process samba (pid: 13301, threadinfo ffff8800b6e1e000, task ffff8801175f1740)
[ 76.256001] Stack:
[ 76.258018] ffff88012fc9bfec 0000000180400040 ffff880127b5c6e0 ffff8800be3ddd00
[ 76.265472] ffff8801131ac800 ffff8800b38a9c00 0000000000000001 ffffffff816a6450
[ 76.272944] ffff88012fc83798 ffffffff81669b70 ffff8800be3ddd00 0000000000000001
[ 76.280416] Call Trace:
[ 76.282868] <IRQ>
[ 76.284799] [<ffffffff816a6450>] ? xfrm6_extract_output+0x50/0x50
[ 76.291192] [<ffffffff81669b70>] xfrm_output_one+0x90/0x220
[ 76.296845] [<ffffffff81669dc6>] xfrm_output_resume+0xc6/0x170
[ 76.302758] [<ffffffff81669e83>] xfrm_output2+0x13/0x20
[ 76.308064] [<ffffffff81669ecf>] xfrm_output+0x3f/0x100
[ 76.313374] [<ffffffff816a6469>] xfrm6_output_finish+0x19/0x20
[ 76.319287] [<ffffffff81674785>] ip6_fragment+0x925/0xaa0
[ 76.324766] [<ffffffff816a6450>] ? xfrm6_extract_output+0x50/0x50
[ 76.330940] [<ffffffff816a6150>] ? xfrm6_local_error+0x80/0x80
[ 76.336853] [<ffffffff816a6252>] __xfrm6_output+0x102/0x170
[ 76.342505] [<ffffffff816a649a>] xfrm6_output+0x2a/0x70
[ 76.347815] [<ffffffff81673a52>] ip6_forward+0x242/0x650
[ 76.353207] [<ffffffff81674a70>] ? ip6_output+0xb0/0xb0
[ 76.358516] [<ffffffff81674af8>] ip6_rcv_finish+0x88/0x90
[ 76.363997] [<ffffffff81674a70>] ? ip6_output+0xb0/0xb0
[ 76.369311] [<ffffffffa0108882>] __ipv6_conntrack_in+0xe2/0x180 [nf_conntrack_ipv6]
[ 76.377044] [<ffffffff815cf3b0>] ? pskb_expand_head+0x70/0x2c0
[ 76.382963] [<ffffffffa0108945>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 76.390437] [<ffffffff816083ff>] nf_iterate+0x8f/0xd0
[ 76.395571] [<ffffffff816530a8>] ? inet_frag_destroy+0xd8/0x120
[ 76.40...

Stéphane Graber (stgraber) wrote :

and rc2 is also affected:

[ 60.661389] ------------[ cut here ]------------
[ 60.666011] Kernel BUG at ffffffff815cc9e8 [verbose debug info unavailable]
[ 60.672961] invalid opcode: 0000 [#1] SMP
[ 60.677091] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user xfrm4_tunne
[ 60.763556] CPU 1
[ 60.765399] Pid: 13066, comm: nsupdate Not tainted 3.8.0-030800rc2-generic #201301022235 /D2500CC
[ 60.775913] RIP: 0010:[<ffffffff815cc9e8>] [<ffffffff815cc9e8>] pskb_expand_head+0x278/0x2c0
[ 60.784451] RSP: 0018:ffff88012fc83728 EFLAGS: 00010202
[ 60.789758] RAX: 0000000000000002 RBX: ffff8800b2869e00 RCX: 0000000000000020
[ 60.796884] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 60.804009] RBP: ffff88012fc83768 R08: 46f061feff3e1602 R09: 46f061feff3e1602
[ 60.811135] R10: bfcb2ffeff3e1602 R11: 20104b7170040120 R12: 0000000000000050
[ 60.818260] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816a3860
[ 60.825385] FS: 00007f9d55f36700(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 60.833463] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 60.839203] CR2: 00007fa7bfab2f90 CR3: 00000000b13e8000 CR4: 00000000000007e0
[ 60.846328] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 60.853455] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 60.860580] Process nsupdate (pid: 13066, threadinfo ffff8800b2bfe000, task ffff8800cebd5c80)
[ 60.869088] Stack:
[ 60.871099] ffffe8ffffc80744 0000000180400040 ffff880129380ac0 ffff8800b2869e00
[ 60.878561] ffff880117395000 ffff8801124968c0 0000000000000001 ffffffff816a3860
[ 60.886032] ffff88012fc83798 ffffffff81666f40 ffff8800b2869e00 0000000000000001
[ 60.893486] Call Trace:
[ 60.895929] <IRQ>
[ 60.897861] [<ffffffff816a3860>] ? xfrm6_extract_output+0x50/0x50
[ 60.904254] [<ffffffff81666f40>] xfrm_output_one+0x90/0x220
[ 60.909907] [<ffffffff81667196>] xfrm_output_resume+0xc6/0x170
[ 60.915819] [<ffffffff81667253>] xfrm_output2+0x13/0x20
[ 60.921125] [<ffffffff8166729f>] xfrm_output+0x3f/0x100
[ 60.926433] [<ffffffff816a3879>] xfrm6_output_finish+0x19/0x20
[ 60.932348] [<ffffffff81671b55>] ip6_fragment+0x925/0xaa0
[ 60.937828] [<ffffffff816a3860>] ? xfrm6_extract_output+0x50/0x50
[ 60.944000] [<ffffffff816a3560>] ? xfrm6_local_error+0x80/0x80
[ 60.949915] [<ffffffff816a3662>] __xfrm6_output+0x102/0x170
[ 60.955567] [<ffffffff816a38aa>] xfrm6_output+0x2a/0x70
[ 60.960875] [<ffffffff81670e22>] ip6_forward+0x242/0x650
[ 60.966268] [<ffffffff81671e40>] ? ip6_output+0xb0/0xb0
[ 60.971576] [<ffffffff81671ec8>] ip6_rcv_finish+0x88/0x90
[ 60.977058] [<ffffffff81671e40>] ? ip6_output+0xb0/0xb0
[ 60.982372] [<ffffffffa00f2882>] __ipv6_conntrack_in+0xe2/0x180 [nf_conntrack_ipv6]
[ 60.990106] [<ffffffff815cc7e0>] ? pskb_expand_head+0x70/0x2c0
[ 60.996025] [<ffffffffa00f2945>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 61.003498] [<ffffffff816057ff>] nf_iterate+0x8f/0xd0
[ 61.008635] [<ffffffff81671e40>] ? ip6_output+0xb0/0xb0
[...

Stéphane Graber (stgraber) wrote :

and same goes for rc1:

[ 80.428925] ------------[ cut here ]------------
[ 80.433548] Kernel BUG at ffffffff815cc5d8 [verbose debug info unavailable]
[ 80.440501] invalid opcode: 0000 [#1] SMP
[ 80.444637] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user xfrm4_tunne
[ 80.531123] CPU 1
[ 80.532970] Pid: 12843, comm: samba Not tainted 3.8.0-030800rc1-generic #201212212135 /D2500CC
[ 80.543235] RIP: 0010:[<ffffffff815cc5d8>] [<ffffffff815cc5d8>] pskb_expand_head+0x278/0x2c0
[ 80.551773] RSP: 0018:ffff88012fc83728 EFLAGS: 00010202
[ 80.557080] RAX: 0000000000000002 RBX: ffff88010fad3600 RCX: 0000000000000020
[ 80.564204] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 80.571330] RBP: ffff88012fc83768 R08: 46f061feff3e1602 R09: 46f061feff3e1602
[ 80.578456] R10: bfcb2ffeff3e1602 R11: 20104b7170040120 R12: 0000000000000050
[ 80.585581] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816a34a0
[ 80.592708] FS: 00007f393ae12740(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 80.600784] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 80.606523] CR2: 00000000013dfb78 CR3: 00000000b4c27000 CR4: 00000000000007e0
[ 80.613651] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 80.620774] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 80.627900] Process samba (pid: 12843, threadinfo ffff8800b4c3a000, task ffff8800bc915c80)
[ 80.636150] Stack:
[ 80.638160] ffff88012fc9bff8 0000000000000282 ffff880127e10360 ffff88010fad3600
[ 80.645614] ffff880118bfe400 ffff8800cdb47c00 0000000000000001 ffffffff816a34a0
[ 80.653069] ffff88012fc83798 ffffffff81666b80 ffff88010fad3600 0000000000000001
[ 80.660541] Call Trace:
[ 80.662991] <IRQ>
[ 80.664922] [<ffffffff816a34a0>] ? xfrm6_extract_output+0x50/0x50
[ 80.671314] [<ffffffff81666b80>] xfrm_output_one+0x90/0x220
[ 80.676968] [<ffffffff81666dd6>] xfrm_output_resume+0xc6/0x170
[ 80.682882] [<ffffffff81666e93>] xfrm_output2+0x13/0x20
[ 80.688188] [<ffffffff81666edf>] xfrm_output+0x3f/0x100
[ 80.693497] [<ffffffff816a34b9>] xfrm6_output_finish+0x19/0x20
[ 80.699410] [<ffffffff81671795>] ip6_fragment+0x925/0xaa0
[ 80.704891] [<ffffffff816a34a0>] ? xfrm6_extract_output+0x50/0x50
[ 80.711064] [<ffffffff816a31a0>] ? xfrm6_local_error+0x80/0x80
[ 80.716977] [<ffffffff816a32a2>] __xfrm6_output+0x102/0x170
[ 80.722631] [<ffffffff816a34ea>] xfrm6_output+0x2a/0x70
[ 80.727939] [<ffffffff81670a62>] ip6_forward+0x242/0x650
[ 80.733331] [<ffffffff81671a80>] ? ip6_output+0xb0/0xb0
[ 80.738638] [<ffffffff81671b08>] ip6_rcv_finish+0x88/0x90
[ 80.744119] [<ffffffff81671a80>] ? ip6_output+0xb0/0xb0
[ 80.749434] [<ffffffffa0111882>] __ipv6_conntrack_in+0xe2/0x180 [nf_conntrack_ipv6]
[ 80.757168] [<ffffffff815cc3d0>] ? pskb_expand_head+0x70/0x2c0
[ 80.763086] [<ffffffffa0111945>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 80.770562] [<ffffffff8160545f>] nf_iterate+0x8f/0xd0
[ 80.775698] [<ffffffff81671a80>] ? ip6_output+0xb0/0xb0
[ 80.781...

Joseph Salisbury (jsalisbury) wrote :

I'll start a bisect between v3.7 and v3.8-rc1 and post a test kernel shortly.

Joseph Salisbury (jsalisbury) wrote :

I started a kernel bisect between v3.7 and v3.8-rc1. The kernel bisect will require testing of about 7-10 test kernels.

I built the first test kernel, up to the following commit:
6be35c700f742e911ecedd07fcc43d4439922334

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Joseph Salisbury (jsalisbury) wrote :

In addition to the bisect, I'll poke through the git logs and see if a change sticks out, as well as look upstream for similar issues.

Stéphane Graber (stgraber) wrote :

That kernel panics:

[ 65.076034] ------------[ cut here ]------------
[ 65.080663] Kernel BUG at ffffffff81604508 [verbose debug info unavailable]
[ 65.087628] invalid opcode: 0000 [#1] SMP
[ 65.091771] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user zram(C) xfrm4_tunnel e
[ 65.196955] CPU: 0 PID: 12938 Comm: nsupdate Tainted: G WC 3.11.5-031105-generic #201311181306
[ 65.206343] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
[ 65.215652] task: ffff880125d24650 ti: ffff8800aa40a000 task.ti: ffff8800aa40a000
[ 65.223137] RIP: 0010:[<ffffffff81604508>] [<ffffffff81604508>] pskb_expand_head+0x288/0x2d0
[ 65.231688] RSP: 0018:ffff88012fc03688 EFLAGS: 00010202
[ 65.237005] RAX: 0000000000000003 RBX: ffff88012619ee00 RCX: 0000000000000020
[ 65.244145] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 65.251283] RBP: ffff88012fc036c8 R08: 3f2c7e0500000060 R09: 3f2c7e0500000060
[ 65.258422] R10: bfcb2ffeff3e1602 R11: 20270ff0c0f20726 R12: 0000000000000050
[ 65.265560] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816e3880
[ 65.272699] FS: 00007f2688d2e700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[ 65.280791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 65.286541] CR2: 00007f268d12b000 CR3: 00000000aa76b000 CR4: 00000000000007f0
[ 65.293680] Stack:
[ 65.295700] ffffe8ffffc00690 0000000180400040 ffff88012757f260 ffff88012619ee00 ...

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
97ebe8f55ae99059c0ad3d3be5c0417647f5e3e0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Stéphane Graber (stgraber) wrote :

No luck with that one either:

[ 136.619649] ------------[ cut here ]------------
[ 136.624273] Kernel BUG at ffffffff81604508 [verbose debug info unavailable]
[ 136.631238] invalid opcode: 0000 [#1] SMP
[ 136.635382] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth xfrm_user zram(C) xfe
[ 136.740573] CPU: 1 PID: 13007 Comm: samba Tainted: G WC 3.11.5-031105-generic #201311181306
[ 136.749703] Hardware name: /D2500CC, BIOS CCCDT10N.86A.0039.2013.0425.1625 04/25/2013
[ 136.759012] task: ffff8800bd082ee0 ti: ffff8800b768e000 task.ti: ffff8800b768e000
[ 136.766496] RIP: 0010:[<ffffffff81604508>] [<ffffffff81604508>] pskb_expand_head+0x288/0x2d0
[ 136.775049] RSP: 0018:ffff88012fc83688 EFLAGS: 00010202
[ 136.780365] RAX: 0000000000000003 RBX: ffff8800b3e7b500 RCX: 0000000000000020
[ 136.787505] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 136.794643] RBP: ffff88012fc836c8 R08: 3f2c7d0500000060 R09: 3f2c7d0500000060
[ 136.801782] R10: bfcb2ffeff3e1602 R11: 20270ff0c0f20726 R12: 0000000000000050
[ 136.808919] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff816e3880
[ 136.816059] FS: 00007f37ca7a2740(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 136.824150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 136.829902] CR2: 00000000020fa630 CR3: 00000000b7680000 CR4: 00000000000007e0
[ 136.837040] Stack:
[ 136.839062] ffffe8ffffc8064c 0000000000000286 ffff880127ea9260 ffff8800b3e7b500
[ 136.846536] ffff8801268c4800 ffff8800b3ffcc40 0000000000000001 ffffffff816e3880
[ 136.854023] ffff88012fc836f8 ffffffff816a52e0 ffff8800b3e7b500 0000000000000001
[ 136.861499] Call Trace:
[ 136.863955] <IRQ>
[ 136.865890] [<ffffffff816e3880>] ? xfrm6_extract_output+0x50/0x50
[ 136.872303] [<ffffffff816a52e0>] xfrm_output_one+0x90/0x230
[ 136.877965] [<ffffffff816a5546>] xfrm_output_resume+0xc6/0x170
[ 136.883889] [<ffffffff816a5603>] xfrm_output2+0x13/0x20
[ 136.889206] [<ffffffff816a564f>] xfrm_output+0x3f/0x100
[ 136.894525] [<ffffffff816e3899>] xfrm6_output_finish+0x19/0x20
[ 136.900448] [<ffffffff816aff49>] ip6_fragment+0x929/0xab0
[ 136.905938] [<ffffffff816e3880>] ? xfrm6_extract_output+0x50/0x50
[ 136.912122] [<ffffffff816e34d0>] ? xfrm6_local_rxpmtu+0x70/0x70
[ 136.918133] [<ffffffff816e35a2>] __xfrm6_output+0xd2/0x180
[ 136.923711] [<ffffffff816e38ca>] xfrm6_output+0x2a/0x70
[ 136.929028] [<ffffffff816af223>] ip6_forward+0x243/0x640
[ 136.934431] [<ffffffff816b0240>] ? ip6_output+0xb0/0xb0
[ 136.939749] [<ffffffff816b02c8>] ip6_rcv_finish+0x88/0x90
[ 136.945239] [<ffffffff816b0240>] ? ip6_output+0xb0/0xb0
[ 136.950566] [<ffffffffa024883d>] __ipv6_conntrack_in+0xdd/0x180 [nf_conntrack_ipv6]
[ 136.958317] [<ffffffffa0248905>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 136.965807] [<ffffffff816408ff>] nf_iterate+0x8f/0xd0
[ 136.970951] [<ffffffff816b0240>] ? ip6_output+0xb0/0xb0
[ 136.976266] [<ffffffff816409bd>] nf_hook_slow+0x7d/0x150
[ 136.981668] [<ffffffff816b0240>] ? ip6_output+0xb0/0xb0
[ 136.986988] [<ffffffff816b0240>] ? ip6_outp...

Stéphane Graber (stgraber) wrote :

Hmm, is there actually any difference between those two kernels? The uname in the panic appears identical.

Joseph Salisbury (jsalisbury) wrote :

Thanks for the update. I'll build the next kernel based on your test result. The mainline kernel build script I am using just creates generic kernel names. However, the build timestamp should be different between the kernels.
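
One way to check which build a panic came from is to pull the release string and the `#` build stamp out of the panic header and compare them. The sketch below is an illustration only: the helper name `kernel_build` is hypothetical, and the regular expression assumes the stamp is a 12-digit number following `#`, as it is in the mainline-style headers quoted in this report.

```python
import re
from typing import Optional, Tuple

# Assumption: the release string is followed by "#" and a 12-digit build stamp,
# as in the panic headers quoted in this report.
_BUILD_RE = re.compile(r"(\d+\.\d+\.\d+-\S+)\s+#(\d{12})\b")

def kernel_build(line: str) -> Optional[Tuple[str, str]]:
    """Return (release, build_stamp) from a panic header line, or None."""
    m = _BUILD_RE.search(line)
    return (m.group(1), m.group(2)) if m else None

# Header lines copied from the two panics reported for the first two test kernels.
first = kernel_build(
    "CPU: 0 PID: 12938 Comm: nsupdate Tainted: G WC 3.11.5-031105-generic #201311181306"
)
second = kernel_build(
    "CPU: 1 PID: 13007 Comm: samba Tainted: G WC 3.11.5-031105-generic #201311181306"
)
print(first)            # ('3.11.5-031105-generic', '201311181306')
print(first == second)  # True
```

For the two panics quoted above, both the release string and the build stamp come out identical.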

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
7bcb57cde66c19df378f3468ea342166a8a4504d

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1249719

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Stéphane Graber (stgraber) wrote :

Sorry I didn't get back to you earlier, I needed reliable internet the past few days. I'm doing a test run now.

Stéphane Graber (stgraber) wrote :

Well, I would if the kernels were there ;)

Joseph Salisbury (jsalisbury) wrote :

Whoops, sorry, I pasted the incorrect URL to the test kernel. It can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1251946

And now that I look at it, all the URLs were wrong. Let me restart the bisect to ensure all the tests were OK.

I'll post the next test kernel shortly.

Joseph Salisbury (jsalisbury) wrote :

I built the first test kernel again, up to the following commit:
6be35c700f742e911ecedd07fcc43d4439922334

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Stéphane Graber (stgraber) wrote :

So while that kernel didn't panic on me during more than 15 minutes of use, it wasn't terribly useful: PPPoE wouldn't work with it, which meant no internet connection and, as a result, not much traffic.

So I won't say that this kernel was stable since it's almost guaranteed that the traffic causing the panic in the first place didn't happen during that test...

I tried 3 boots to see if this was somehow an intermittent issue but apparently it wasn't. All network interfaces would come up fine, except for the two WAN ppp links which would get stuck trying to find the modem.

Nov 26 02:06:09 sateda pppd[2109]: Plugin rp-pppoe.so loaded.
Nov 26 02:06:09 sateda pppd[2109]: pppd 2.4.5 started by root, uid 0
Nov 26 02:06:44 sateda pppd[2109]: Timeout waiting for PADO packets
Nov 26 02:06:44 sateda pppd[2109]: Unable to complete PPPoE Discovery
Nov 26 02:07:01 sateda pppd[2109]: PPP session is 4132
Nov 26 02:07:01 sateda pppd[2109]: Failed to create PPPoE socket: Address family not supported by protocol
Nov 26 02:07:01 sateda pppd[2109]: Sent PADT
Nov 26 02:07:38 sateda pppd[2109]: Timeout waiting for PADO packets
Nov 26 02:07:38 sateda pppd[2109]: Unable to complete PPPoE Discovery
Nov 26 02:08:15 sateda pppd[2109]: Timeout waiting for PADO packets
Nov 26 02:08:15 sateda pppd[2109]: Unable to complete PPPoE Discovery

The "Failed to create PPPoE socket: Address family not supported by protocol" seems to be the issue there and doesn't appear when establishing a session from either the current 3.2 or any of the kernels I previously tested.

In case that's useful, here's the dmesg as it was before I gave up and switched back to 3.2:
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.7.0-030700-generic (jsalisbury@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201311251856 SMP Mon Nov 25 23:58:26 UTC 2013
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.7.0-030700-generic root=/dev/mapper/internal-root ro panic=1 console=ttyS3,115200
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] Disabled fast string operations
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000008efff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000008f000-0x000000000008ffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000000090000-0x000000000009ffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cede1fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000cede2000-0x00000000cee6efff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000cee6f000-0x00000000cee97fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000cee98000-0x00000000ceebefff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ceebf000-0x00000000ceef9fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000ceefa000-0x00000000cefbefff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000cefbf000-0x00000000ceff0fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000ceff1000-0x00000000ceffefff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x00000000ce...

Joseph Salisbury (jsalisbury) wrote :

Thanks for testing. I skipped that commit, since we could not test it.

I built the next test kernel, up to the following commit:
b1ca079e7eb6ca862247fe694f320367bd962a24

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.
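
Skipping an untestable commit, much like `git bisect skip` does, keeps the search going but can leave more than one candidate at the end. The following Python sketch is a simplified model over a linear history, not git's actual algorithm; the function name, the commit names, and the verdicts are all made up for illustration.

```python
from typing import Callable, List

def bisect_first_bad(commits: List[str], test: Callable[[str], str]) -> List[str]:
    """Narrow down the first bad commit in a linear history.

    commits[0] is known good and commits[-1] is known bad. test() returns
    "good", "bad", or "skip" (untestable). The result lists the candidate
    first-bad commits: one entry if conclusive, several if skips got in the way.
    """
    lo, hi = 0, len(commits) - 1   # lo: last known good, hi: first known bad
    skipped = set()
    while True:
        # Commits strictly between the two ends that have not been skipped.
        open_idx = [i for i in range(lo + 1, hi) if i not in skipped]
        if not open_idx:
            break
        mid = open_idx[len(open_idx) // 2]
        verdict = test(commits[mid])
        if verdict == "good":
            lo = mid
        elif verdict == "bad":
            hi = mid
        else:
            skipped.add(mid)
    # Everything after the last good commit, up to the first bad one, is a candidate.
    return commits[lo + 1 : hi + 1]

history = [f"c{i}" for i in range(8)]  # made-up commit names

def verdict(commit: str) -> str:
    n = int(commit[1:])
    if n == 4:
        return "skip"                  # e.g. the kernel boots but cannot be exercised
    return "bad" if n >= 5 else "good"

print(bisect_first_bad(history, verdict))  # ['c4', 'c5']
```

In this made-up run the skipped commit sits right next to the first bad one, so the search cannot tell the two apart; that is the cost of skipping rather than fixing the test kernel.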

Stéphane Graber (stgraber) wrote :

Same thing as your previous kernel, PPPoE doesn't work so my machine is essentially useless and the panicking code path doesn't get exercised. So the result is "no idea".

Joseph Salisbury (jsalisbury) wrote :

I also skipped that commit, since we could not test it.

I built the next test kernel, up to the following commit:
abed9a6bf2bb79e94ac6d6127f70be2f9718bb33

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Stéphane Graber (stgraber) wrote :

That kernel is still useless to me...

So I went looking a bit closer at your test kernels and it looks like the problem is the config you're using: the kernels simply lack PPPoE support entirely...

root@sateda:~# grep -i pppoe /boot/config-3.7.0-030700rc5-generic

root@sateda:~# grep -i pppoe /boot/config-3.2.0-57-generic
CONFIG_PPPOE=m

I suspect you'll want to enable that module and then restart the bisect from the start so we can get relevant results.
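
The same check can be scripted before a test kernel is handed out. This is a minimal sketch, assuming the usual `OPTION=value` layout of files such as `/boot/config-*`; the helper name `config_value` and the sample config strings are invented for illustration.

```python
import re
from typing import Optional

def config_value(config_text: str, option: str) -> Optional[str]:
    """Return the value of a kernel config option ("y", "m", ...), or None if it is not set."""
    # Match only real assignments at the start of a line, so commented-out
    # entries such as "# CONFIG_PPPOE is not set" do not count.
    m = re.search(rf"^{re.escape(option)}=(.*)$", config_text, re.MULTILINE)
    return m.group(1) if m else None

# Made-up config excerpts standing in for the two files compared above.
with_pppoe = "CONFIG_PPP=m\nCONFIG_PPPOE=m\n"
without_pppoe = "CONFIG_PPP=m\n"

print(config_value(with_pppoe, "CONFIG_PPPOE"))     # m
print(config_value(without_pppoe, "CONFIG_PPPOE"))  # None
```

In practice the text would be read from the config file shipped with the kernel package under test.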

Joseph Salisbury (jsalisbury) wrote :

Thanks for finding that. That config option is disabled in the mainline config, so I'll now build the test kernels with a Saucy tree.

I built the first test kernel again, up to the following commit:
6be35c700f742e911ecedd07fcc43d4439922334

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Stéphane Graber (stgraber) wrote :

That kernel still doesn't have PPPoE support in its config...

Joseph Salisbury (jsalisbury) wrote :

Ok, one more shot. It looks like PPPoE wasn't enabled for amd64 in mainline. I enabled it and rebuilt a test kernel.

Again, it's up to the following commit:
6be35c700f742e911ecedd07fcc43d4439922334

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Stéphane Graber (stgraber) wrote :

Sorry for the delay, I just got back from a couple of weeks on another continent so didn't feel like testing this remotely :)

I'll do a test run now. By the way, since I first reported this issue, I managed to reproduce it on two more machines running the 3.11 kernel. It seems to vary with the number of netfilter rules loaded: the more you have, the more likely it is to panic (or something along those lines).

Stéphane Graber (stgraber) wrote :

That kernel panics:

[ 68.626968] ------------[ cut here ]------------
[ 68.631590] kernel BUG at /home/jsalisbury/bugs/lp1251946/linux-stable/net/core/skbuff.c:1040!
[ 68.640188] invalid opcode: 0000 [#1] SMP
[ 68.644324] Modules linked in: authenc esp6 xfrm6_mode_transport ipcomp6 xfrm6_tunnel tunnel6 xfrm4_mode_tunnel xfrm6_mode_tunnel veth zram(C) xfrm_user xfe
[ 68.747033] CPU 1
[ 68.748881] Pid: 12623, comm: samba Tainted: G WC 3.7.0-030700-generic #201312051340 /D2500CC
[ 68.759828] RIP: 0010:[<ffffffff815c0e76>] [<ffffffff815c0e76>] pskb_expand_head+0x286/0x2d0
[ 68.768358] RSP: 0018:ffff88012fc83728 EFLAGS: 00010202
[ 68.773664] RAX: 0000000000000002 RBX: ffff8800bd879100 RCX: 0000000000000020
[ 68.780792] RDX: 0000000000000000 RSI: 0000000000000050 RDI: 00000000000006c0
[ 68.787916] RBP: ffff88012fc83768 R08: 3f2c680500000060 R09: 3f2c680500000060
[ 68.795043] R10: bfcb2ffeff3e1602 R11: 20270ff0c0f20726 R12: 0000000000000050
[ 68.802167] R13: 0000000000000020 R14: 0000000000000001 R15: ffffffff81697d80
[ 68.809294] FS: 00007ff55143d740(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 68.817371] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 68.823112] CR2: 00000000010dcff0 CR3: 00000000b0ab4000 CR4: 00000000000007e0
[ 68.830237] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 68.837361] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 68.844489] Process samba (pid: 12623, threadinfo ffff8800b0ad2000, task ffff880125f5c560)
[ 68.852738] Stack:
[ 68.854754] ffffe8ffffc8028c 0000000000000282 ffff880127566d40 ffff8800bd879100
[ 68.862227] ffff880115ee5000 ffff8800a3bea000 0000000000000001 ffffffff81697d80
[ 68.869680] ffff88012fc83798 ffffffff8165b4d0 ffff8800bd879100 0000000000000001
[ 68.877143] Call Trace:
[ 68.879586] <IRQ>
[ 68.881517] [<ffffffff81697d80>] ? xfrm6_extract_output+0x50/0x50
[ 68.887918] [<ffffffff8165b4d0>] xfrm_output_one+0x90/0x220
[ 68.893573] [<ffffffff8165b726>] xfrm_output_resume+0xc6/0x170
[ 68.899486] [<ffffffff8165b7e3>] xfrm_output2+0x13/0x20
[ 68.904791] [<ffffffff8165b82f>] xfrm_output+0x3f/0x100
[ 68.910099] [<ffffffff81697d99>] xfrm6_output_finish+0x19/0x20
[ 68.916013] [<ffffffff816660f5>] ip6_fragment+0x925/0xab0
[ 68.921495] [<ffffffff81697d80>] ? xfrm6_extract_output+0x50/0x50
[ 68.927666] [<ffffffff81697a80>] ? xfrm6_local_error+0x80/0x80
[ 68.933580] [<ffffffff81697b82>] __xfrm6_output+0x102/0x170
[ 68.939232] [<ffffffff81697dca>] xfrm6_output+0x2a/0x70
[ 68.944541] [<ffffffff816653c2>] ip6_forward+0x242/0x650
[ 68.949935] [<ffffffff816663f0>] ? ip6_output+0xb0/0xb0
[ 68.955243] [<ffffffff81666478>] ip6_rcv_finish+0x88/0x90
[ 68.960721] [<ffffffff816663f0>] ? ip6_output+0xb0/0xb0
[ 68.966038] [<ffffffffa01c6882>] __ipv6_conntrack_in+0xe2/0x180 [nf_conntrack_ipv6]
[ 68.973770] [<ffffffff815c0c60>] ? pskb_expand_head+0x70/0x2d0
[ 68.979688] [<ffffffffa01c6945>] ipv6_conntrack_in+0x25/0x30 [nf_conntrack_ipv6]
[ 68.987165] [<ffffffff815f9e6f>] nf_iterate+0x8f/0xd0
[ 68.992299] [<ffffffff81644a38>] ? inet_fr...

Joseph Salisbury (jsalisbury) wrote :

I built the next test kernel, up to the following commit:
97ebe8f55ae99059c0ad3d3be5c0417647f5e3e0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Can you test that kernel and report back whether it has the bug? I will build the next test kernel based on your test results.

Stéphane Graber (stgraber) wrote :

That kernel doesn't have PPPoE support so it's useless to me, I need a distro kernel for those tests :)

Joseph Salisbury (jsalisbury) wrote :

I re-built the next test kernel, up to the following commit, which should have PPPoE enabled:
97ebe8f55ae99059c0ad3d3be5c0417647f5e3e0

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1251946

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired