Louis,

While we can't test this without access to a machine with large amounts of memory, is it possible to apply this patch and provide an image to IBM for testing?

Michael

On 02/01/2017 11:09 PM, bugproxy wrote:
> Public bug reported:
>
> Problem Description
> ===========================
> On Ubuntu 16.10, tried kdump on a Brazos system (32TB memory and 192 cores). When a panic is triggered, the kdump process gets stuck during boot and a forced reboot is needed. After the reboot, the system had captured only vmcore-incomplete.
>
> Reproducible Step:
> ======================
> 1- Install Ubuntu 16.10
> 2- Boot the system with 31TB and 192 cores
> 3- Configure kdump on the system
> 4- Verify that kdump is running on the system
> 5- Trigger a panic on the system
>
> Actual Result
> --------------------------
> The kdump process gets stuck during boot and a forced reboot is needed.
>
> Expected Result
> -----------------------------
> Kdump proceeds and the vmcore is captured successfully.
>
> LOG:
>
> root@ltc-brazos1:~# cat /proc/cmdline
> BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash crashkernel=4096M
> root@ltc-brazos1:~# kdump-config show
> DUMP_MODE: kdump
> USE_KDUMP: 1
> KDUMP_SYSCTL: kernel.panic_on_oops=1
> KDUMP_COREDIR: /var/crash
> crashkernel addr:
> /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.4.0-30-generic
> kdump initrd:
> /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.4.0-30-generic
> current state: ready to kdump
>
> kexec command:
> /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinux-4.4.0-30-generic root=UUID=516c4b1b-6700-4b55-bd37-d61c4c5af6af ro quiet splash irqpoll nr_cpus=1 nousb systemd.unit=kdump-tools.service" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz
> root@ltc-brazos1:~#
> root@ltc-brazos1:~# dpkg -l | grep kdump
> ii kdump-tools 1:1.6.0-2 all scripts and tools for automating kdump (Linux crash dumps)
> root@ltc-brazos1:~#
> root@ltc-brazos1:~# echo c > /proc/sysrq-trigger
>
>
> ltc-brazos1 login: [ 416.229464] sysrq: SysRq : Trigger a crash
> [ 416.229496] Unable to handle kernel paging request for data at address 0x00000000
> [ 416.229502] Faulting instruction address: 0xc000000000670014
> [ 416.229508] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 416.229511] SMP NR_CPUS=2048 NUMA pSeries
> [ 416.229517] Modules linked in: pseries_rng btrfs xor raid6_pq rtc_generic sunrpc autofs4 ses enclosure ipr
> [ 416.229532] CPU: 65 PID: 404785 Comm: bash Not tainted 4.4.0-30-generic #49-Ubuntu
> [ 416.229537] task: c00001f9d583c8e0 ti: c00001fa13cd8000 task.ti: c00001fa13cd8000
> [ 416.229543] NIP: c000000000670014 LR: c0000000006710c8 CTR: c00000000066ffe0
> [ 416.229548] REGS: c00001fa13cdb990 TRAP: 0300 Not tainted (4.4.0-30-generic)
> [ 416.229552] MSR: 8000000000009033 CR: 28242222 XER: 00000001
> [ 416.229565] CFAR: c000000000008468 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
> GPR00: c0000000006710c8 c00001fa13cdbc10 c0000000015b5d00 0000000000000063
> GPR04: c00001fab9049c50 c00001fab905b4e0 c0001f3fff3d0000 0000000000000313
> GPR08: 0000000000000007 0000000000000001 0000000000000000 c0001f3fff3dec68
> GPR12: c00000000066ffe0 c000000007546980 ffffffffffffffff 0000000022000000
> GPR16: 0000000010170dc8 00000100174901d8 0000000010140f58 00000000100c7570
> GPR20: 0000000000000000 000000001017dd58 0000000010153618 000000001017b608
> GPR24: 00003ffff8966c94 0000000000000001 c0000000014f8e58 0000000000000004
> GPR28: c0000000014f9218 0000000000000063 c0000000014b11dc 0000000000000000
> [ 416.229631] NIP [c000000000670014] sysrq_handle_crash+0x34/0x50
> [ 416.229636] LR [c0000000006710c8] __handle_sysrq+0xe8/0x270
> [ 416.229640] Call Trace:
> [ 416.229645] [c00001fa13cdbc10] [c000000000e08f28] _fw_tigon_tg3_bin_name+0x2ce58/0x342b0 (unreliable)
> [ 416.229652] [c00001fa13cdbc30] [c0000000006710c8] __handle_sysrq+0xe8/0x270
> [ 416.229658] [c00001fa13cdbcd0] [c000000000671868] write_sysrq_trigger+0x78/0xa0
> [ 416.229666] [c00001fa13cdbd00] [c00000000037ae30] proc_reg_write+0xb0/0x110
> [ 416.229673] [c00001fa13cdbd50] [c0000000002e186c] __vfs_write+0x6c/0xe0
> [ 416.229678] [c00001fa13cdbd90] [c0000000002e25a0] vfs_write+0xc0/0x230
> [ 416.229684] [c00001fa13cdbde0] [c0000000002e35dc] SyS_write+0x6c/0x110
> [ 416.229690] [c00001fa13cdbe30] [c000000000009204] system_call+0x38/0xb4
> [ 416.229695] Instruction dump:
> [ 416.229698] 38425d20 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d220019 394931e4
> [ 416.229707] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
> [ 416.229717] ---[ end trace 16e5fbbf7faa7340 ]---
> [ 416.232059]
> [ 416.232086] Sending IPI to other CPUs
> [ 416.242558] IPI complete
> I'm in purgatory
> -> smp_release_cpus()
> spinning_secondaries = 1528
> <- smp_release_cpus()
> <- setup_system()
> [ 1.146155] sd 0:2:1:0: [sdb] Assuming drive cache: write through
> [ 1.154176] sd 0:2:0:0: [sda] Assuming drive cache: write through
> /dev/sdb2: recovering journal
> /dev/sdb2: clean, 69482/136331264 files, 9047821/545318400 blocks
>
> ---------------------------------------------------------------------------------------
> --------------------------------------------------------------------------------------
> Ubuntu 16.10 . . . . [boot splash line repeats continuously on the console]
>
>
> ---------------------------------------------------------------------------------------
> --------------------------------------------------------------------------------------
> ---------------------------------------------------------------------------------------
> --------------------------------------------------------------------------------------
>
> After forced reboot:
>
> root@ltc-brazos1:/var/crash# ls
> 201607161510 kexec_cmd
> root@ltc-brazos1:/var/crash# cd 201607161510/
> root@ltc-brazos1:/var/crash/201607161510# ls
> vmcore-incomplete
> root@ltc-brazos1:
>
> Note: waited more than 2 hours for the kdump process.
>
> Regards
> Praveen
>
> == Comment: #12 - Vaishnavi Bhat