kexec/kdump not working in Ubuntu 16.04

Bug #1546260 reported by bugproxy
Affects: kexec-tools (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Taco Screen team
Milestone: (none listed)

Bug Description

== Comment: #0 - Praveen K. Pandey <email address hidden> - 2016-02-04 04:11:48 ==
Hi

I installed Ubuntu 16.04 on PowerVM, KVM, and PowerNV and set up kdump, using the PPA build kdump-tools 1:1.5.9-4~lp1536904 to work around the broken symlink problem, and then triggered kdump. During the dump the console hung, and I could not get a vmcore.

Steps to reproduce:

1- Install Ubuntu 16.04
2- Install the PPA build of kdump mentioned in bug135822
3- Start the kdump service
4- Trigger a panic

Expected Result :

Able to generate kdump

Actual Result :

The system hangs during the dump process; this looks to me like an initrd issue.

LOG:

root@ubuntu:~# kdump-config load
Modified cmdline:BOOT_IMAGE=/boot/vmlinux-4.4.0-2-generic root=UUID=9d1662d4-db2a-4a1a-b8d7-8b0d4e7872dd ro splash quiet irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service elfcorehdr=156288K
 * loaded kdump kernel

root@ubuntu:~# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-2-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.4.0-2-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.4.0-2-generic root=UUID=2a78d070-4b73-4da6-b7eb-d41164c909fa ro splash quiet irqpoll maxcpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
root@ubuntu:~#
root@ubuntu:~# kdump-config status
current state : ready to kdump
root@ubuntu:~#

root@ubuntu:~# echo 1 > /proc/sys/kernel/sysrq

root@ubuntu:~# echo c > /proc/sysrq-trigger
[ 164.495349] sysrq: SysRq : Trigger a crash
[ 164.495360] Unable to handle kernel paging request for data at address 0x00000000
[ 164.495364] Faulting instruction address: 0xc000000000654db4
[ 164.495368] Oops: Kernel access of bad area, sig: 11 [#1]
[ 164.495370] SMP NR_CPUS=2048 NUMA pSeries
[ 164.495374] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc uas usb_storage pseries_rng rtc_generic autofs4 ibmvscsi ibmveth
[ 164.495389] CPU: 0 PID: 1281 Comm: bash Not tainted 4.4.0-2-generic #16-Ubuntu
[ 164.495393] task: c0000003f1f2a200 ti: c000000006ef8000 task.ti: c000000006ef8000
[ 164.495396] NIP: c000000000654db4 LR: c000000000655e68 CTR: c000000000654d80
[ 164.495399] REGS: c000000006efb990 TRAP: 0300 Not tainted (4.4.0-2-generic)
[ 164.495401] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242222 XER: 00000001
[ 164.495410] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c000000000655e68 c000000006efbc10 c000000001583800 0000000000000063
GPR04: c0000003ff609c50 c0000003ff61b4f0 c0000003ffe6c200 000000000000015b
GPR08: 0000000000000007 0000000000000001 0000000000000000 c0000003ffe72710
GPR12: c000000000654d80 c000000007ae0000 ffffffffffffffff 0000000022000000
GPR16: 0000000010170710 000001002c890608 00000000101406f0 00000000100c7100
GPR20: 0000000000000000 000000001017df98 0000000010140588 0000000000000000
GPR24: 0000000010153440 000000001017b848 c0000000014c8540 0000000000000004
GPR28: c0000000014c8900 0000000000000063 c0000000014810bc 0000000000000000
[ 164.495451] NIP [c000000000654db4] sysrq_handle_crash+0x34/0x50
[ 164.495454] LR [c000000000655e68] __handle_sysrq+0xe8/0x270
[ 164.495456] Call Trace:
[ 164.495460] [c000000006efbc10] [c000000000de3008] _fw_tigon_tg3_bin_name+0x2cc30/0x33d18 (unreliable)
[ 164.495464] [c000000006efbc30] [c000000000655e68] __handle_sysrq+0xe8/0x270
[ 164.495468] [c000000006efbcd0] [c000000000656608] write_sysrq_trigger+0x78/0xa0
[ 164.495472] [c000000006efbd00] [c000000000374e70] proc_reg_write+0xb0/0x110
[ 164.495476] [c000000006efbd50] [c0000000002dcabc] __vfs_write+0x6c/0xe0
[ 164.495480] [c000000006efbd90] [c0000000002dd7f0] vfs_write+0xc0/0x230
[ 164.495484] [c000000006efbde0] [c0000000002de82c] SyS_write+0x6c/0x110
[ 164.495488] [c000000006efbe30] [c000000000009204] system_call+0x38/0xb4
[ 164.495490] Instruction dump:
[ 164.495493] 3842ea80 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 394956e4
[ 164.495499] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 164.495506] ---[ end trace 5ddbc181c028cb6c ]---
[ 164.497176]
[ 164.497180] Sending IPI

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-136588 severity-critical targetmilestone-inin1604
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1546260/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-02-17 13:30 EDT-------
The verify_sha256_digest() function call returns 1 in the snippet below
on Ubuntu 16.04, which leads to the hang
---
printf("I'm in purgatory\n");
setup_arch();
if (verify_sha256_digest()) {
	for (;;) {
		/* loop forever */
	}
}
---
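
A rough sketch of the kind of check a function like verify_sha256_digest() performs: hash a set of recorded memory regions and compare the result with a digest stored at load time, returning nonzero on a mismatch. Everything here is illustrative. `struct region`, `digest_regions`, and `verify_digest` are hypothetical names, and FNV-1a stands in for SHA-256 only to keep the example short; this is not the kexec-tools implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: one memory range that was hashed
 * when the image was loaded. */
struct region {
	const unsigned char *start;
	size_t len;
};

/* FNV-1a, a stand-in hash for this sketch only. It is NOT SHA-256. */
uint64_t fnv1a_update(uint64_t h, const unsigned char *p, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;	/* 64-bit FNV prime */
	}
	return h;
}

/* Hash all regions in order, as one continuous stream. */
uint64_t digest_regions(const struct region *r, size_t n)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* 64-bit FNV offset basis */

	for (size_t i = 0; i < n; i++)
		h = fnv1a_update(h, r[i].start, r[i].len);
	return h;
}

/* Returns 0 when the regions still match the digest recorded at load
 * time, and 1 otherwise. */
int verify_digest(const struct region *r, size_t n, uint64_t expected)
{
	return digest_regions(r, n) != expected;
}
```

With this shape, a nonzero return is what sends the caller into the infinite loop in the snippet above. A failure therefore means either that the hashed memory changed or that the checking code itself misbehaved.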

The problem seems to be with the compiler. The kexec binary in Ubuntu 16.04 is compiled
with gcc v5.2.1 [GCC: (Ubuntu 5.2.1-27ubuntu1) 5.2.1 20151129 to be precise].
With a kexec binary compiled with gcc v4.9.2, we don't hit the problem.
Not sure what changed between v4.9.2 and v5.2.1 that makes the purgatory unhappy.

While someone from the compiler team looks at this, you could use a kexec binary built with
gcc v4.9.2 as a workaround for this problem.

Thanks
Hari

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-02-19 09:22 EDT-------
Please provide materials that will allow the compiler team to reproduce the bug, including preprocessed source and compilation flags. Detailed instructions for GCC bug reporting may be found at https://gcc.gnu.org/bugs/.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-02-22 12:46 EDT-------
Hi William

The purgatory code checks the sanity of the code before proceeding [with a verify_sha256_digest() call].
This call fails when compiled with gcc version 5.3.1-8ubuntu2 (available in Ubuntu 16.04), while
it works just fine when compiled with gcc version 4.9.2-10ubuntu12, available in Ubuntu 15.04.

Couple of observations that may be relevant:

1. The below warning is observed when compiled with gcc version 5.3.1-8ubuntu2

purgatory/arch/ppc64/console-ppc64.c: In function 'putchar':
purgatory/arch/ppc64/console-ppc64.c:42:2: warning: '*((void *)&buff+8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
plpar_hcall_norets(H_PUT_TERM_CHAR, 0, 1,

2. In the failing environment "printf("I'm in purgatory\n")" also yields no output on console for pseries.
This usually would succeed if the call plpar_hcall_norets() (hypervisor call) goes through.

Also, things work fine with 5.3.1-8ubuntu2 compiler if -O0 is used instead of -Os.
Attaching the preprocessed code for purgatory and also the compile options. Please let me know,
if I need to add more material.

Thanks
Hari

bugproxy (bugproxy) wrote : Preprocessed kexec-tools purgatory code for ppc64

------- Comment (attachment only) From <email address hidden> 2016-02-22 12:48 EDT-------

bugproxy (bugproxy) wrote : purgatory compile options used in failure scenario

------- Comment (attachment only) From <email address hidden> 2016-02-22 12:49 EDT-------

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-02-22 14:12 EDT-------
*** Bug 136794 has been marked as a duplicate of this bug. ***

------- Comment From <email address hidden> 2016-02-22 15:51 EDT-------
Hari, thanks, very helpful! There is still at least one bit missing for me to build the whole thing -- could you please attach ppc64_asm.h? This is included by two of the .S files.

bugproxy (bugproxy) wrote : purgatory/arch/ppc64/ppc64_asm.h file

------- Comment on attachment From <email address hidden> 2016-02-23 00:22 EDT-------

(In reply to comment #27)
> Hari, thanks, very helpful! There is still at least one bit missing for me
> to build the whole thing -- could you please attach ppc64_asm.h? This is
> included by two of the .S files.

Sorry for missing that out.

Thanks
Hari

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-02-24 19:04 EDT-------
One thing I noticed may be relevant since we're talking about -Os code, which means out-of-line register save and restore functions. gcc-5 moves some code before the register save call, while gcc-4.9 doesn't seem to. It is important that calls to functions like _savegpr0_14 do *not* go through stubs (for example to reach a destination more than 32M away) that might destroy regs. The save/restore functions have non-standard calling conventions, so you can't use stubs that destroy *any* of the volatile regs. In particular, r0, r11 and r12 are not available for use by a stub.
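
As a small illustration of when a stub would be needed at all: a relative PowerPC branch (I-form) encodes a 24-bit word offset, which gives a signed 26-bit byte displacement of roughly ±32 MiB. The sketch below checks whether a direct call can reach its target. The function and macro names are hypothetical and come from no toolchain; this only shows the arithmetic and does not model what the linker actually does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reach of a relative branch with a 24-bit word offset: a signed
 * 26-bit byte displacement, i.e. -32 MiB up to +32 MiB - 4. */
#define BRANCH_MIN (-(INT64_C(1) << 25))
#define BRANCH_MAX ((INT64_C(1) << 25) - 4)

/* True if a direct relative branch at address `from` can reach `to`
 * without an intermediate stub. Targets must be 4-byte aligned
 * relative to the branch. */
bool branch_reaches(uint64_t from, uint64_t to)
{
	int64_t disp = (int64_t)(to - from);

	return (disp & 3) == 0 && disp >= BRANCH_MIN && disp <= BRANCH_MAX;
}
```

If `branch_reaches()` were false for a call to a save/restore function, the linker would have to insert a stub. That is the case the comment warns about, because such a stub must not clobber r0, r11 or r12.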

bugproxy (bugproxy) wrote : toc alignment patch

------- Comment on attachment From <email address hidden> 2016-02-25 07:44 EDT-------

This is not a compiler issue after all. See
http://lists.infradead.org/pipermail/kexec/2016-February/015440.html and
http://lists.infradead.org/pipermail/kexec/2016-February/015441.html

These two plus this patch should fix the problem

bugproxy (bugproxy) wrote : optional patch to make better use of ld feature

------- Comment (attachment only) From <email address hidden> 2016-02-25 07:46 EDT-------

bugproxy (bugproxy) wrote : optional patch to avoid ia64 damage

------- Comment (attachment only) From <email address hidden> 2016-02-25 07:47 EDT-------

bugproxy (bugproxy) wrote : rebased patch set

------- Comment on attachment From <email address hidden> 2016-02-25 22:00 EDT-------

The toc alignment patch didn't apply cleanly after Anton's patches. This tarfile contains a rebased patchset including Anton's patches.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-02-26 04:50 EDT-------
> kexec load operation failed with this patch on an ubuntu lpar (ppc64le).

Yes, please drop "Use ld to provide ppc64 register save/restore functions". This ld feature is broken. :-(

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-03-09 15:25 EDT-------
*** Bug 138700 has been marked as a duplicate of this bug. ***

------- Comment From <email address hidden> 2016-03-09 15:28 EDT-------
*** Bug 138700 has been marked as a duplicate of this bug. ***

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-03-13 09:48 EDT-------
*** Bug 138700 has been marked as a duplicate of this bug. ***

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-03-15 10:11 EDT-------
The bug is fixed by the below patch-set posted by Anton & Alan

http://lists.infradead.org/pipermail/kexec/2016-February/015446.html
http://lists.infradead.org/pipermail/kexec/2016-February/015447.html
http://lists.infradead.org/pipermail/kexec/2016-February/015448.html

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-03-21 13:03 EDT-------
Moving previous comment to external - Canonical please consider this patch.
(In reply to comment #56)
> Hi Hari
>
> able to apply the patches cleanly on kexec-tools-2.0.11 I think
> canonical can apply on kexec-tools-2.0.11 .
>
> Regards
> Praveen

Changed in ubuntu:
status: New → Confirmed
affects: ubuntu → kexec-tools (Ubuntu)
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-03-24 01:46 EDT-------
(In reply to comment #51)
> The bug is fixed by the below patch-set posted by Anton & Alan
>

...now accepted upstream with commit ids

> http://lists.infradead.org/pipermail/kexec/2016-February/015446.html

4a2ae3a39c64dc43e9d094be9541253234ff4822

> http://lists.infradead.org/pipermail/kexec/2016-February/015447.html

1e423dc297d10eb7ff25c829d2856ef12fc81d77

> http://lists.infradead.org/pipermail/kexec/2016-February/015448.html

3debb8cf3272216119cb2e59a4963ce3c18fe8e3

Thanks
Hari

Breno Leitão (breno-leitao) wrote :
Louis Bouchard (louis)
Changed in kexec-tools (Ubuntu Xenial):
status: Confirmed → In Progress
no longer affects: kexec-tools (Ubuntu Xenial)
Louis Bouchard (louis) wrote :

Hello,

I have a new test package ready for testing if you are able to verify. I included the three upstream commits. I haven't identified the changes made to the purgatory/arch/ppc64/Makefile file (part of the rebase tarfile) so those are not included.

Since you are already using the PPA that contains kdump-tools 1:1.5.9-4~lp1536904, I can make the test package available there so you can test without having to add a new PPA.

You should be aware, though, that you should no longer use that PPA. The modifications made to kexec-tools are now present in the public archive, in the 1:1.5.9-5 package.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-03-28 07:54 EDT-------
(In reply to comment #68)
> Hello,
>
> I have a new test package ready for testing if you are able to verify. I
> included the three upstream commits. I haven't identified the changes made
> to the purgatory/arch/ppc64/Makefile file (part of the rebase tarfile) so
> those are not included.
>
> Since you are already using the PPA that contains kdump-tools
> 1:1.5.9-4~lp1536904, I can make the test package available there so you can
> test without having to add a new PPA.
>
> You should be aware though that you should no longer use that PPA. The
> modifications made to kexec-tools are now present in the public archive. in
> the 1:1.5.9-5 package.

Hi louis ,

I need some help; I am confused by the statement above. We had a bug raised for the broken kdump initrd and vmlinux links, we got the PPA build of kdump 1:1.5.9-4~lp1536904 for it, and that was resolved. Are the three kexec patches being pushed to the same PPA repo, or are they part of the official 1:1.5.9-5 package?

Regards
Praveen

Louis Bouchard (louis) wrote :

Hello,

I am sorry that my previous comment was unclear and a bit misleading.

The kdump package in the PPA (kdump 1:1.5.9-4~lp1536904) fixes https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1536904 as you said. This is now in the official archive so you should no longer be using the PPA. Please confirm that you will be using the official archive so I can proceed to delete this PPA.

Now the kexec-tools fixes have been uploaded to a different PPA : ppa:louis-bouchard/kexec-tools-test

Please verify that the new package does indeed fix your issue. If this is the case, I will see that it gets uploaded to the archive.

Kind regards,

...Louis

Breno Leitão (breno-leitao) wrote :

Hey Louis, thanks for the package.

> Now the kexec-tools fixes have been uploaded to a different PPA : ppa:louis-bouchard/kexec-tools-test

It seems that this PPA is not enabled to build for ppc64el architecture. Would you enable that, please?

Meanwhile, I can test by building from source.

Breno Leitão (breno-leitao) wrote :

Anyway, I just got the source package from ppa:louis-bouchard/kexec-tools-test and built it on ppc64el and it seems to be working fine.

The dump files are being generated:

$ find
.
./kexec_cmd
./201603291535
./201603291535/dump.201603291535
./201603291535/dmesg.201603291535
./201603291717
./201603291717/dump.201603291717
./201603291717/dmesg.201603291717

I think we are good to have it on the main archive

bugproxy (bugproxy) wrote : kexec-tools fixed

------- Comment (attachment only) From <email address hidden> 2016-03-30 13:06 EDT-------

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2016-04-04 00:57 EDT-------
Hi, Canonical. What is the ETA for this to be released?

Louis Bouchard (louis) wrote :

Hello,

Sorry for the missing build and missing answers, I was not subscribed to the bug.

Following your tests, I have uploaded the fixed version to the archive which is waiting for approval since we are under beta freeze.

Since this is bugfixing, it should be approved. I'll let you know once it hits the archive.

Kind regards,

...Louis

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kexec-tools - 1:2.0.10-1ubuntu2

---------------
kexec-tools (1:2.0.10-1ubuntu2) xenial; urgency=medium

  * [PowerPC64] Fix failure in purgatory when compiled with gcc5
    Application of upstream fixes so kexec-tools work with gcc5 on PowerPC64

    commit 4a2ae3a39c64dc43e9d094be9541253234ff4822,
    1e423dc297d10eb7ff25c829d2856ef12fc81d77,
    3debb8cf3272216119cb2e59a4963ce3c18fe8e3
    [lp: #1546260]

 -- Louis Bouchard <email address hidden> Tue, 05 Apr 2016 11:14:18 +0200

Changed in kexec-tools (Ubuntu):
status: In Progress → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-05 09:42 EDT-------
*** Bug 137124 has been marked as a duplicate of this bug. ***

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-11 00:44 EDT-------
Hi

Verified this bug on Friday's build (8 April 2016), and the problem appears to be fixed.

Regards
Praveen

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2016-04-14 19:46 EDT-------
*** Bug 136793 has been marked as a duplicate of this bug. ***
