ISST-LTE:pVM:thymelp2:ubuntu 16.04: cannot analyse vmcore with crash (s390x, ppc64)

Bug #1564487 reported by bugproxy
Affects         Status        Importance  Assigned to     Milestone
crash (Ubuntu)  Fix Released  Undecided   Chris J Arges

Bug Description

== Comment: #0 - Ping Tian Han - 2016-03-31 02:19:27 ==
---Problem Description---
Using the kexec-tools from https://bugzilla.linux.ibm.com/show_bug.cgi?id=136588#c58 , we dumped a vmcore for bug 139815 after quitting xmon. But analysing it with crash fails:

% sudo crash /usr/lib/debug/boot/vmlinux-4.4.0-15-generic /var/crash/201603310043/dump.201603310043

crash 7.1.4
Copyright (C) 2002-2015 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc64le-unknown-linux-gnu"...

please wait... (gathering module symbol data)
crash: invalid structure member offset: module_num_symtab
       FILE: kernel.c LINE: 3450 FUNCTION: module_init()

[/usr/bin/crash] error trace: 1008af0c => 10124250 => 10179bd0 => 10083378

  10083378: (undetermined)
  10179bd0: OFFSET_verify+80
  10124250: module_init+1104
  1008af0c: main_loop+764

%

== Comment: #4 - Hari Krishna Bathini - 2016-03-31 08:20:39 ==
(In reply to comment #2)
> Hi,
>
> Crash failing with 4.4+ kernels is a known issue which is fixed by the
> upstream patch
>
> commit 6f1f78e33474d00d5f261d7ed9d835c558b34d61
> Author: Dave Anderson <email address hidden>
> Date: Wed Jan 20 09:56:36 2016 -0500
>
> Fix for the changes made to the kernel module structure introduced by
> this kernel commit for Linux 4.5 and later kernels:
>
> commit 7523e4dc5057e157212b4741abd6256e03404cf1
> module: use a structure to encapsulate layout.
>
> Without the patch, the crash session fails during initialization
> with the error message: "crash: invalid structure member offset:
> module_init_text_size".
> (<email address hidden>)
>

We also need the below upstream patch along with the patch mentioned above to fix this issue

commit 098cdab16dfa6a85e9dad2cad604dee14ee15f66
Author: Dave Anderson <email address hidden>
Date: Fri Feb 12 14:32:53 2016 -0500

    Fix for the changes made to the kernel module structure introduced by
    this kernel commit for Linux 4.5 and later kernels:

      commit 8244062ef1e54502ef55f54cced659913f244c3e
      modules: fix longstanding /proc/kallsyms vs module insertion race.

    Without the patch, the crash session fails during initialization
    with the error message: "crash: invalid structure member offset:
    module_num_symtab".
    (<email address hidden>)

Thanks
Hari
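
For triage, the two commit messages quoted above pair each error string with the crash commit that addresses it. The following is a minimal sketch of that lookup, not part of crash itself: the names `FIXES`, `ERROR_RE` and `suggest_fix` are hypothetical, and the table contains only the two pairs stated in this report.

```python
import re

# Offset name printed by crash -> upstream crash commit that fixes it.
# Both pairs are copied from the commit messages quoted in this report.
FIXES = {
    "module_init_text_size": "6f1f78e33474d00d5f261d7ed9d835c558b34d61",
    "module_num_symtab": "098cdab16dfa6a85e9dad2cad604dee14ee15f66",
}

# Matches the error line crash prints during initialization.
ERROR_RE = re.compile(
    r"^crash: invalid structure member offset: (\S+)", re.MULTILINE
)


def suggest_fix(crash_output):
    """Return (offset_name, commit) for the first matching error, or None.

    commit is None when the offset name is not in the FIXES table.
    """
    match = ERROR_RE.search(crash_output)
    if match is None:
        return None
    name = match.group(1)
    return name, FIXES.get(name)


# Excerpt of the failing session from the problem description.
sample = """please wait... (gathering module symbol data)
crash: invalid structure member offset: module_num_symtab
       FILE: kernel.c LINE: 3450 FUNCTION: module_init()
"""

if __name__ == "__main__":
    print(suggest_fix(sample))
```

An offset name that is not in the table yields `None` for the commit, so an unrelated failure is not attributed to either patch.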

== Comment: #9 - Hendrik Brueckner - 2016-03-31 11:47:13 ==
Problem also exists on s390x:

bugproxy (bugproxy)
tags: added: architecture-all bugnameltc-139847 severity-high targetmilestone-inin1604
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
Kevin W. Rudd (kevinr)
affects: ubuntu → crash (Ubuntu)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-03-31 12:05 EDT-------
Canonical,

Since this impacts multiple architectures, do you need/want individual mirror requests for each arch?

Changed in crash (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Canonical Kernel Team (canonical-kernel-team)
Breno Leitão (breno-leitao) wrote :

Dear Canonical,

These two patches are required for the 4.4 kernel because Tim cherry-picked kernel commit 8244062ef1e54502ef55f54cced659913f244c3e, which became 21b06ff8d4dd559b93ca60e26e8ffd811d030b39. That broke crash, so the crash tool now needs these two patches:

 * 098cdab16dfa6a85e9dad2cad604dee14ee15f66
 * 6f1f78e33474d00d5f261d7ed9d835c558b34d61

Changed in crash (Ubuntu):
status: New → Confirmed
Chris J Arges (arges)
Changed in crash (Ubuntu):
assignee: Canonical Kernel Team (canonical-kernel-team) → Chris J Arges (arges)
Chris J Arges (arges) wrote :

Uploaded crash_7.1.4-1ubuntu4 which contains the missing patch. Hopefully that will be available for testing soon.

Changed in crash (Ubuntu):
status: Confirmed → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-01 10:56 EDT-------
(In reply to comment #16)
> Uploaded crash_7.1.4-1ubuntu4 which contains the missing patch. Hopefully
> that will be available for testing soon.

Just so we are clear, you mean "missing patches"? Because there are two of them
as mentioned already by Breno:

098cdab16dfa6a85e9dad2cad604dee14ee15f66
6f1f78e33474d00d5f261d7ed9d835c558b34d61

Thanks
Hari

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package crash - 7.1.4-1ubuntu4

---------------
crash (7.1.4-1ubuntu4) xenial; urgency=medium

  * d/p/0002-Fix-for-the-changes-made-to-the-kernel-module-struct.patch:
    Cherry-pick additional crash commit needed to fix issues related to
    kernel commit 8244062e. (LP: #1564487)

 -- Chris J Arges <email address hidden> Fri, 01 Apr 2016 08:46:07 -0500

Changed in crash (Ubuntu):
status: In Progress → Fix Released
Breno Leitão (breno-leitao) wrote :

The package just appeared in the archive and I was able to try it on ppc64el.

 sudo crash /usr/lib/debug/boot/vmlinux-4.4.0-16-generic dump.201603291717

crash 7.1.4
Copyright (C) 2002-2015 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc64le-unknown-linux-gnu"...

      KERNEL: /usr/lib/debug/boot/vmlinux-4.4.0-16-generic
    DUMPFILE: dump.201603291717 [PARTIAL DUMP]
        CPUS: 8
        DATE: Tue Mar 29 17:17:45 2016
      UPTIME: 01:41:34
LOAD AVERAGE: 0.07, 0.23, 0.22
       TASKS: 873
    NODENAME: 1604
     RELEASE: 4.4.0-16-generic
     VERSION: #32-Ubuntu SMP Thu Mar 24 22:31:14 UTC 2016
     MACHINE: ppc64le (3425 Mhz)
      MEMORY: 8 GB
       PANIC: "sysrq: SysRq : Trigger a crash"
         PID: 8951
     COMMAND: "bash"
        TASK: c000000003debd90 [THREAD_INFO: c0000001f6c4c000]
         CPU: 7
       STATE: TASK_RUNNING (SYSRQ)

crash> dmesg | tail
[ 6094.597764] [c0000001f6c4fd90] [c0000000002e1940] vfs_write+0xc0/0x230
[ 6094.597845] [c0000001f6c4fde0] [c0000000002e297c] SyS_write+0x6c/0x110
[ 6094.597926] [c0000001f6c4fe30] [c000000000009204] system_call+0x38/0xb4
[ 6094.598005] Instruction dump:
[ 6094.598047] 3842e9a0 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 39494ee4
[ 6094.598181] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 6094.598336] ---[ end trace 02e18198e9ef49b7 ]---
[ 6094.601009]
[ 6094.601080] Sending IPI to other CPUs
[ 6094.604482] IPI complete

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-04 10:36 EDT-------
*** Bug 139867 has been marked as a duplicate of this bug. ***
