Comment 3 for bug 190010

Matt Mackall (mpm-selenic) wrote : Re: [Bug 190010] Re: Random oopsen with Xen on Ubuntu Gutsy

Looks like this problem might have been related to running a 32-bit
kernel and userspace on a 64-bit machine. Switching everything to pure
64-bit seems to have eliminated the problem.
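
For anyone wanting to check whether a box is in the same mixed state, here is a minimal sketch (not something I ran as part of this bug). It assumes Linux on x86, where the "lm" (long mode) flag in /proc/cpuinfo marks a 64-bit-capable CPU; a hypervisor may mask that flag from a guest. The function names are made up for illustration.

```python
import platform
import struct


def pointer_bits() -> int:
    """Width in bits of a pointer in this Python process (userspace)."""
    return struct.calcsize("P") * 8


def cpu_supports_64bit(cpuinfo_text: str) -> bool:
    """True if the x86 'lm' (long mode) flag appears in /proc/cpuinfo text.

    Assumption: Linux on x86; other architectures report this differently.
    """
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is the space-separated flag list.
            return "lm" in line.partition(":")[2].split()
    return False


def describe_bitness(cpuinfo_text: str) -> str:
    """Summarize kernel architecture, userspace pointer width, CPU capability."""
    machine = platform.machine()  # as reported by the running kernel
    cpu64 = cpu_supports_64bit(cpuinfo_text)
    return (f"kernel reports {machine}; userspace is {pointer_bits()}-bit; "
            f"CPU 64-bit capable: {cpu64}")


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is unavailable
    print(describe_bitness(text))
```

A kernel reporting i686 on a CPU that has the lm flag would be the 32-bit-on-64-bit combination described above.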

On Thu, 2008-03-27 at 22:08 +0000, Leann Ogasawara wrote:
> ** Changed in: linux (Ubuntu)
> Sourcepackagename: linux-source-2.6.22 => linux
> Importance: Undecided => High
> Assignee: (unassigned) => Ubuntu Kernel Team (ubuntu-kernel-team)
> Status: Incomplete => Triaged
>
> --
> Random oopsen with Xen on Ubuntu Gutsy
> https://bugs.launchpad.net/bugs/190010
> You received this bug notification because you are a direct subscriber
> of the bug.
>
> Status in Source Package "linux" in Ubuntu: Triaged
>
> Bug description:
> Binary package hint: linux-image-2.6.22-14-xen
>
> I've got an AMD64 machine that's exhibiting random oopses and database
> corruption using the 2.6.22-14 kernel from Ubuntu Gutsy and Xen
> 3.1.0-0ubuntu18, also from Gutsy.
>
> Running the same database without Xen under heavy load appears to work
> fine. Machine has 3G of ECC and memtest86 works fine as well. Also,
> machine ran the database just fine for a year before (unfortunately
> simultaneously) switching to Gutsy and Xen.
>
> Any suggestions?
>
> MySQL said:
>
> The relevant bit of the logs is this:
>
> Jan 10 23:46:26 vegguide mysqld[22376]: InnoDB: Database page corruption on disk or a failed
> Jan 10 23:46:26 vegguide mysqld[22376]: InnoDB: file read of page 1559.
> Jan 10 23:46:26 vegguide mysqld[22376]: InnoDB: You may have to recover from a backup.
>
> A couple of the oopsen pasted below:
>
> [297993.758358] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000001
> [297993.758381] printing eip:
> [297993.758392] 006c6000 -> *pde = 00000000:6d67f001
> [297993.758398] 27f46000 -> *pme = 00000000:00000000
> [297993.758405] Oops: 0000 [#1]
> [297993.758408] SMP
> [297993.758415] Modules linked in: af_packet xt_multiport iptable_filter ip_tables x_tables ipv6 evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod fuse apparmor commoncap
> [297993.758448] CPU: 0
> [297993.758449] EIP: 0061:[<c0176eec>] Not tainted VLI
> [297993.758452] EFLAGS: 00010002 (2.6.22-14-xen #1)
> [297993.758468] EIP is at kmem_cache_alloc+0x5c/0xe0
> [297993.758474] eax: 00000000 ebx: 00000001 ecx: 00000000 edx: c1bfe8a0
> [297993.758480] esi: 00000000 edi: 00000000 ebp: 00000020 esp: e0003c00
> [297993.758487] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0069
> [297993.758493] Process pdflush (pid: 4538, ti=e0002000 task=c210ea60 task.ti=e0002000)
> [297993.758498] Stack: 00000000 c1c34f60 c028b35d 00000000 c20b4b40 00000003 c20b4b40 c028b35d
> [297993.758515] c12a7480 c20b4b40 00000003 c029fd40 c029f25c c013bdba c03d26a0 c2048000
> [297993.758532] c03ec4a8 c20b4b40 00000000 00000400 c0290b67 c2048000 00000000 003d0900
> [297993.758549] Call Trace:
> [297993.758555] [<c028b35d>] skb_clone+0x2d/0x250
> [297993.758567] [<c028b35d>] skb_clone+0x2d/0x250
> [297993.758575] [<c029fd40>] snap_rcv+0x0/0xa0
> [297993.758582] [<c029f25c>] llc_rcv+0xdc/0x2b0
> [297993.758589] [<c013bdba>] clocksource_get_next+0x3a/0x40
> [297993.758602] [<c0290b67>] netif_receive_skb+0x237/0x420
> [297993.758613] [<c027119d>] netif_poll+0x4fd/0xbf0
> [297993.758621] [<c010891b>] sched_clock+0x3b/0x80
> [297993.758630] [<c011e5f3>] scheduler_tick+0xf3/0x100
> [297993.758643] [<c02930de>] net_rx_action+0xde/0x260
> [297993.758652] [<c0127302>] __do_softirq+0x92/0x130
> [297993.758662] [<c012742c>] do_softirq+0x8c/0x90
> [297993.758670] [<c0106e20>] do_IRQ+0x40/0x70
> [297993.758677] [<ee08b799>] journal_add_journal_head+0x69/0x160 [jbd]
> [297993.758699] [<c0259946>] evtchn_do_upcall+0xb6/0xf0
> [297993.758709] [<c01057a6>] hypervisor_callback+0x46/0x4e
> [297993.758718] [<ee0c18cc>] walk_page_buffers+0x1c/0x70 [ext3]
> [297993.758740] [<ee0c4ad1>] ext3_ordered_writepage+0xe1/0x1a0 [ext3]
> [297993.758759] [<ee0c1920>] bget_one+0x0/0x10 [ext3]
> [297993.758774] [<c0157a38>] __writepage+0x8/0x30
> [297993.758782] [<c0157ef4>] write_cache_pages+0x214/0x310
> [297993.758791] [<c0157a30>] __writepage+0x0/0x30
> [297993.758801] [<c0158010>] generic_writepages+0x20/0x30
> [297993.758810] [<c0158069>] do_writepages+0x49/0x50
> [297993.758817] [<c01980c3>] __writeback_single_inode+0x93/0x3c0
> [297993.758828] [<c012bb9e>] del_timer_sync+0xe/0x20
> [297993.758838] [<c02ff246>] schedule+0x356/0x900
> [297993.758847] [<c01f5d9d>] _atomic_dec_and_lock+0x3d/0x70
> [297993.758857] [<c019877e>] sync_sb_inodes+0x17e/0x240
> [297993.758866] [<c0198c49>] writeback_inodes+0x99/0xd0
> [297993.758876] [<c0158715>] wb_kupdate+0x85/0xf0
> [297993.758885] [<c0158ab0>] pdflush+0x0/0x260
> [297993.758892] [<c0158bf8>] pdflush+0x148/0x260
> [297993.758900] [<c0158690>] wb_kupdate+0x0/0xf0
> [297993.758908] [<c0136312>] kthread+0x42/0x70
> [297993.758915] [<c01362d0>] kthread+0x0/0x70
> [297993.758922] [<c0105927>] kernel_thread_helper+0x7/0x10
> [297993.758930] =======================
> [297993.758934] Code: 02 01 c6 44 02 01 01 89 f9 0f b6 f1 64 a1 08 00 42 c0 8b 94 83 90 00 00 00 85 d2 74 72 8b 42 0c 85 c0 74 6b 8b 5a 0c 0f b7 42 0a <8b> 04 83 89 42 0c 89 fa 84 d2 74 2e 64 a1 08 00 42 c0 c1 e0 06
> [297993.759012] EIP: [<c0176eec>] kmem_cache_alloc+0x5c/0xe0 SS:ESP 0069:e0003c00
> [297993.759030] Kernel panic - not syncing: Fatal exception in interrupt
>
>
> [16785.730498] BUG: unable to handle kernel paging request at virtual address 00100104
> [16785.730517] printing eip:
> [16785.730528] 2d463000 -> *pde = 00000000:45af1001
> [16785.730533] 2d506000 -> *pme = 00000000:00000000
> [16785.730540] Oops: 0000 [#1]
> [16785.730543] SMP
> [16785.730550] Modules linked in: xt_multiport iptable_filter ip_tables x_tables ipv6 evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod fuse apparmor commoncap
> [16785.730581] CPU: 0
> [16785.730582] EIP: 0061:[<ee0c18c9>] Not tainted VLI
> [16785.730584] EFLAGS: 00010286 (2.6.22-14-xen #1)
> [16785.730607] EIP is at walk_page_buffers+0x19/0x70 [ext3]
> [16785.730613] eax: 00000000 ebx: fffffffe ecx: ffffffff edx: 00100100
> [16785.730619] esi: 00100100 edi: c18404b4 ebp: ffffffff esp: c2149e04
> [16785.730625] ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0069
> [16785.730631] Process pdflush (pid: 60, ti=c2148000 task=c20fcf90 task.ti=c2148000)
> [16785.730635] Stack: 00000000 c1143210 c1830b20 c18404b4 c1143210 c1143210 ee0c4ad1 00001000
> [16785.730652] 00000000 ee0c1920 00000000 c1830b20 c2149f70 c28b46f4 00000000 c2149f70
> [16785.730666] 00000000 c0157a38 c1830b20 c0157ef4 00000000 0000000e c0157a30 c28b46f4
> [16785.730681] Call Trace:
> [16785.730687] [<ee0c4ad1>] ext3_ordered_writepage+0xe1/0x1a0 [ext3]
> [16785.730703] [<ee0c1920>] bget_one+0x0/0x10 [ext3]
> [16785.730717] [<c0157a38>] __writepage+0x8/0x30
> [16785.730727] [<c0157ef4>] write_cache_pages+0x214/0x310
> [16785.730736] [<c0157a30>] __writepage+0x0/0x30
> [16785.730745] [<c0158010>] generic_writepages+0x20/0x30
> [16785.730753] [<c0158069>] do_writepages+0x49/0x50
> [16785.730760] [<c01980c3>] __writeback_single_inode+0x93/0x3c0
> [16785.730770] [<c012bb9e>] del_timer_sync+0xe/0x20
> [16785.730780] [<c02ff246>] schedule+0x356/0x900
> [16785.730789] [<c019877e>] sync_sb_inodes+0x17e/0x240
> [16785.730799] [<c0198c49>] writeback_inodes+0x99/0xd0
> [16785.730807] [<c0158715>] wb_kupdate+0x85/0xf0
> [16785.730816] [<c0158ab0>] pdflush+0x0/0x260
> [16785.730823] [<c0158bf8>] pdflush+0x148/0x260
> [16785.730830] [<c0158690>] wb_kupdate+0x0/0xf0
> [16785.730838] [<c0136312>] kthread+0x42/0x70
> [16785.730846] [<c01362d0>] kthread+0x0/0x70
> [16785.730852] [<c0105927>] kernel_thread_helper+0x7/0x10
> [16785.730861] =======================
> [16785.730864] Code: 74 f0 31 c0 39 cb 5b 0f 92 c0 c3 8d b4 26 00 00 00 00 55 57 89 d7 56 53 83 ec 08 89 0c 24 31 c9 89 44 24 04 8b 6a 14 8d 5c 0d 00 <8b> 72 04 3b 1c 24 76 2f 89 d8 29 e8 3b 44 24 1c 73 25 8b 44 24
> [16785.730938] EIP: [<ee0c18c9>] walk_page_buffers+0x19/0x70 [ext3] SS:ESP 0069:c2149e04
--
Mathematics is the supreme nostalgia of our time.