[UBUNTU 18.04] Ubuntu 18.04 kernel 4.15.0-194 crashes on IPL

Bug #1994601 reported by bugproxy
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Ubuntu on IBM z Systems | Fix Released | Critical | Skipper Bug Screeners |
linux (Ubuntu) | Invalid | Undecided | Canonical Kernel Team |
Bionic | Fix Released | Undecided | Luke Nowakowski-Krijger |

Bug Description

SRU Justification:
==================

[ Impact ]

 * Ubuntu 18.04 / bionic installations with latest kernel 4.15.0-194
   are no longer able to IPL (boot) on IBM z14 or newer hardware.

 * This issue got introduced by upstream commit e4f74400308c
   "s390/archrandom: simplify back to earlier design and initialize earlier"
   that was SRUed to 18.04/bionic based on LP#1989625,
   which changed the s390 IPL/boot code in the kernel/arch/random area.

 * The reason seems to be that the bad patch moves the decision about
   whether arch randomness is available into the setup.c function
   setup_randomness().
   This code uses the static key s390_arch_random_available.
   But in the Canonical kernel the initialization function
   for the jump labels (which the static keys are based on),
   jump_label_init(), is called in the generic start_kernel(),
   whereas in the upstream kernel the init function is
   called early in setup_arch().

 * Reverting this commit from bionic master-next makes bionic systems
   again bootable.
   (https://launchpad.net/~fheimes/+archive/ubuntu/test/)
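
One way to see which ordering a given kernel source tree follows is to look for the jump_label_init() call site. The following is a minimal sketch and not part of the original report: the helper name jump_label_init_callers is made up for illustration, and it assumes start_kernel() lives in init/main.c and the s390 setup_arch() in arch/s390/kernel/setup.c.

```shell
#!/bin/sh
# Sketch: list the files (out of the two candidates named in this report)
# in which jump_label_init() is called.
# Assumption: the argument is the root of a kernel source tree.
jump_label_init_callers() {
    tree=${1:-.}
    for f in init/main.c arch/s390/kernel/setup.c; do
        [ -f "$tree/$f" ] || continue
        # Match call sites ("jump_label_init();"), not prototypes in headers.
        if grep -q 'jump_label_init();' "$tree/$f"; then
            echo "$f"
        fi
    done
}

jump_label_init_callers "${1:-.}"
```

If only init/main.c is printed, the tree matches the ordering described above for the Canonical kernel. A match in arch/s390/kernel/setup.c by itself does not prove that the boot problem is gone.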

[ Test Plan ]

 * An IBM z14 or LinuxONE II or newer system is needed.

 * Now install latest bionic on that system - it doesn't matter if it's
   on LPAR, z/VM or KVM.

 * After the installation (and the trigger of the post-install reboot),
   the system will not come up.

 * A patched kernel (with e4f74400308c reverted) can be tested in the
   following way:

 * Install 18.04 GA and prevent it from doing any kernel updates.

 * That is, install in 'island' mode
   or select in d-i 'Advanced Installation'
   and explicitly choose '4.15.0-50 generic' to install.

 * That allows the system to come up and to update the kernel to
   a modified one.

 * Then reboot and verify if the system comes up properly.
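
The kernel update step of this test plan could look roughly like the sketch below. It is not taken from the report: the PPA shorthand ppa:fheimes/test is derived from the PPA URL given above, and it is an assumption that a regular upgrade pulls in the modified kernel from that PPA. The script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set APPLY=1 in the environment to really run them.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

install_test_kernel() {
    # Assumption: the test PPA from this report is still published.
    run sudo add-apt-repository -y ppa:fheimes/test
    run sudo apt-get update
    # Assumption: the PPA kernel supersedes the installed one,
    # so a normal upgrade installs it.
    run sudo apt-get -y dist-upgrade
    run sudo reboot
}

install_test_kernel
# After the reboot, 'uname -r' on the guest shows which kernel came up.
```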

[ Where problems could occur ]

 * Problems could occur due to the fact that the commit
   was not cleanly reversible because of minor context changes.

 * Adjustments that were needed might break other things if not
   done carefully.

 * Further commits (applied after e4f74400308c) may still rely
   on the bad e4f74400308c commit - or even further patches
   (from upstream stable).

 * In worst case IPL / boot might get broken,
   even on hardware older than z14.

 * Whether the revert works fine can be easily tested, and was tested, based on
   https://launchpad.net/~fheimes/+archive/ubuntu/test/
   and the above test plan.

[ Other Info ]

 * Ubuntu 20.04 (focal, using legacy image with virt-install)
   was tested as well, but is not affected by this issue.
__________

---Problem Description---
Ubuntu 18.04 crashes during IPL with no output on the console.

Contact Information = Viktor Mihajlovski <email address hidden>

---uname output---
n/a

Machine Type = 3906

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 Install Ubuntu 18.04 as a KVM guest using the following command:

virt-install -n bionic --cdrom /var/lib/libvirt/images/ubuntu-18.04.5-server-s390x.iso --memory 2048 --disk size=8

then reboot.

Stack trace output:
 no

Oops output:
 no

== Comment: #1 - Viktor Mihajlovski <email address hidden> - 2022-10-25 10:48:30 ==
Installing under z/VM leads to the same failure.

== Comment: #2 - Viktor Mihajlovski <email address hidden> - 2022-10-25 10:55:10 ==
I have captured a dump using virsh dump --memory-only. The output of crash log is uploaded

== Comment: #7 - Harald Freudenberger <email address hidden> - 2022-10-26 07:33:52 ==
Looks like all Ubuntu 18.04 installations on s390 are not working any more.
It is not only an issue with z14; z17 also fails to run a freshly installed Ubuntu 18.04.

== Comment: #8 - Harald Freudenberger <email address hidden> - 2022-10-26 08:25:52 ==
When I use the 'advanced installation', where I am able to choose the kernel package, and then choose 4.15.0-50 generic, the installed Ubuntu 18.04 comes up fine. So this issue was introduced somewhere between kernel 4.15.0-50 and 4.15.0-194.

bugproxy (bugproxy) wrote : crash log output

Default Comment by Bridge

tags: added: architecture-s39064 bugnameltc-200228 severity-critical targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Launchpad Janitor (janitor) wrote : Re: [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Frank Heimes (fheimes) wrote :

Thanks for reporting this.
What is the version of the Ubuntu host on which you want to install the 18.04 guest?
I'd like to know so that I can try to properly recreate this.

Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
importance: Undecided → Critical
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-26 11:12 EDT-------
I have the strong suspicion that this is the cause:

commit 6edb63a7b6cd57825e47cf6a8600b694a19f0d90
Author: Jason A. Donenfeld <email address hidden>
Date: Sat Jun 11 00:20:23 2022 +0200

s390/archrandom: simplify back to earlier design and initialize earlier
BugLink: https://bugs.launchpad.net/bugs/1989625
commit e4f74400308cb8abde5fdc9cad609c2aba32110c upstream.

...

because that's the only patch which has changes in the s390 start kernel/arch/random areas.

------- Comment From <email address hidden> 2022-10-26 11:15 EDT-------
Recreating this is totally simple: just install an Ubuntu 18.04 - it does not matter if it is on LPAR, on z/VM or as a KVM guest.
If my suspicion is right, this happens since kernel 4.15.0-194.

Frank Heimes (fheimes) wrote : Re: [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL

Trying to recreate this, I was able to:
- successfully complete an 18.04 installation on a 22.10 host (that's what I had at hand) and
- successfully complete an 18.04 installation on a z/VM 6.4 host
I selected in both cases the strongly recommended option 'Install security updates automatically'.
(And don't be surprised that the installed system reports itself as 18.04.6, even if the 18.04.5 ISO image was used - there was another update needed for other architectures that led to a .6.)

Please can you share more details about the system you are using, especially the KVM host?

Changed in ubuntu-z-systems:
status: New → Incomplete
Frank Heimes (fheimes) wrote :

Btw. for the KVM guest install I used:

sudo qemu-img create -f qcow2 /var/lib/libvirt/images/bionic.qcow2 8G

sudo virt-install --name bionic --vcpus 2 --ram 1024 --disk path=/var/lib/libvirt/images/bionic.qcow2,size=5,bus=virtio,format=qcow2 --os-type linux --os-variant generic --network network=default,model=virtio --graphics none --console pty -c /var/lib/libvirt/images/ubuntu-18.04.5-server-s390x.iso

uname -a
Linux ubuntu 4.15.0-194-generic #205-Ubuntu SMP Fri Sep 16 19:53:54 UTC 2022 s390x s390x s390x GNU/Linux

Frank Heimes (fheimes) wrote :

Again, 18.04.5 installations work for me. I used this (latest) 18.04.5 ISO:
https://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04.5-server-s390x.iso

If there is a reason to revert 's390/archrandom: simplify back to earlier design and initialize earlier' we could consider that, but it works here (z13 with z/VM 6.4 and z13 with Ubuntu 22.10 KVM host).

Frank Heimes (fheimes) wrote :

Since the Ubuntu kernel team does regression testing for every kernel update and SRU, a non-bootable kernel would have been identified and would not have left -proposed; it would not even have landed in -proposed.
That makes me think that the issue is elsewhere - on the KVM host side?!

I may try on z15, too (at least a KVM guest install).

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-10-26 11:48 EDT-------
Digging into details gives:

The reason seems to be that the patch
moves the decision about whether arch randomness
is available into the setup.c function setup_randomness().
This code uses the static key s390_arch_random_available.
But in the Canonical kernel the initialization function
for the jump labels (which the static keys are based on),
jump_label_init(), is called in the generic start_kernel(),
whereas in the upstream kernel the init function is
called early in setup_arch().
So maybe another patch, one which moves the jump_label_init() call
from start_kernel() to setup_arch(), is missing here.

I did NOT install any security fixes ... will try.

Frank Heimes (fheimes) wrote : Re: [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL

Ah - interesting.
I could recreate the situation on z15 / L1III:

$ virsh start bionic --console
Domain 'bionic' started
Connected to domain 'bionic'
Escape character is ^] (Ctrl + ])
ubuntu@z15test:~$

There is obviously a difference in the behavior of the z13 and z15.

I can create a test kernel with the mentioned patch reverted...

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-26 12:09 EDT-------
Well, my freshly installed KVM guest from https://cdimage.ubuntu.com/releases/18.04/release/ubuntu-18.04.5-server-s390x.iso crashes even with security updates enabled.

Frank Heimes (fheimes) wrote : Re: [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL

Hi Harald, see my previous comment - I could recreate this on z15 now (but it's fine on z13).
So seems to be related to (z14) z15 and newer?!
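
To check quickly which hardware generation a system runs on, the machine type can be read from /proc/sysinfo on s390x. The following is only a sketch: the function names are made up, the list of machine types (3906/3907 for z14, 8561/8562 for z15, 3931/3932 for z16) is a static assumption that will not know newer models, and the "z14 or newer" boundary is the one suspected in this thread.

```shell
#!/bin/sh
# Return 0 if the given IBM Z machine type belongs to z14 or a newer
# generation known to this (static, possibly incomplete) list.
is_z14_or_newer() {
    case $1 in
        3906|3907|8561|8562|3931|3932) return 0 ;;  # z14, z15, z16 families
        *) return 1 ;;
    esac
}

# Print the machine type from a sysinfo file (default: /proc/sysinfo,
# which exists on s390x only).
machine_type() {
    awk '$1 == "Type:" { print $2; exit }' "${1:-/proc/sysinfo}" 2>/dev/null
}

mt=$(machine_type || true)
if [ -z "$mt" ]; then
    echo "no s390 machine type found"
elif is_z14_or_newer "$mt"; then
    echo "type $mt: z14 or newer"
else
    echo "type $mt: older than z14 (or unknown to this list)"
fi
```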

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-26 12:54 EDT-------
My configuration is OpenSUSE (KVM host) on a z14; we used the latest cloud build from: https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-s390x.img

Frank Heimes (fheimes) wrote : Re: [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL

A test kernel is being built at this PPA:
https://launchpad.net/~fheimes/+archive/ubuntu/test
(will be available in a few minutes)

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-26 14:29 EDT-------
(In reply to comment #21)
> Hi Harald, see my previous comment - I could recreate this on z15 now (but
> it's fine on z13).
> So seems to be related to (z14) z15 and newer?!

We do patch the kernel depending on hardware level and enable newer instructions. Harald's finding that a static key is used before it is initialized might indicate a problem with that.

Frank, can you build a kernel with
commit 95e61b1b5d6394b53d147c0fcbe2ae70fbe09446
Author: Vasily Gorbik <email address hidden>
AuthorDate: Thu Jun 18 17:17:19 2020 +0200
Commit: Heiko Carstens <email address hidden>
CommitDate: Mon Jun 29 16:28:39 2020 +0200

s390/setup: init jump labels before command line parsing

and check if this helps?

Frank Heimes (fheimes) wrote : Re: [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL

Another build is running right now at https://launchpad.net/~fheimes/+archive/ubuntu/test2
with 's390/archrandom: simplify back to earlier design and initialize earlier' still in and 's390/setup: init jump labels before command line parsing' on top.
This will take a while - will test it tomorrow.

Changed in ubuntu-z-systems:
status: Incomplete → Confirmed
Frank Heimes (fheimes) wrote :

Hi Christian, all,
both kernels are now ready:

1) with "s390/archrandom: simplify back to earlier design and initialize earlier" reverted:
f98553359746 (HEAD -> master-next) Revert "s390/archrandom: simplify back to earlier design and initialize earlier"
(https://launchpad.net/~fheimes/+archive/ubuntu/test/)
I was able to successfully test this kernel, a KVM guest on a z15 LPAR boots after having this modified kernel installed.

2) with "s390/archrandom: simplify back to earlier design and initialize earlier",
but "s390/setup: init jump labels before command line parsing" on top:
4ab78427995e (HEAD -> master-next) s390/setup: init jump labels before command line parsing
(https://launchpad.net/~fheimes/+archive/ubuntu/test2/)
I was NOT able to successfully test this, a KVM guest on a z15 LPAR was still NOT able to boot with this modified kernel.

Now, should we proceed with the revert (which btw. did not revert cleanly, I had to massage it a little bit), or do you want to fix upstream "s390/archrandom: simplify back to earlier design and initialize earlier"?

(Just fyi: the last day for commits for the next Ubuntu kernel SRU cycle is Nov 2nd - so we should have decided on a fix by Nov 1st to get it in w/o missing one SRU cycle.)

Frank Heimes (fheimes) wrote :

I just tried 20.04 (focal legacy image), just to be sure, but it's not affected and worked fine.

tags: added: patch
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-27 07:25 EDT-------
> Now, should we proceed with the revert (which btw. did not revert cleanly, I
> had to massage it a little bit), or do you want to fix upstream
> "s390/archrandom: simplify back to earlier design and initialize earlier"?

If Viktor can confirm that the revert works, I would go with a revert. Obviously the upstream fix is not enough and we might need more patches which are not known by now.

Frank Heimes (fheimes)
description: updated
Frank Heimes (fheimes)
summary: - [UBUNTU 18.04] Ubuntu 18.04 crashes during IPL
+ [UBUNTU 18.04] Ubuntu 18.04 kernel 4.15.0-194 crashes on IPL
Dimitri John Ledkov (xnox) wrote : Re: [Bug 1994601] Re: [UBUNTU 18.04] Ubuntu 18.04 kernel 4.15.0-194 crashes on IPL

Separately, the recent rework in random causes hangs and inability to boot on
x86 virtio-rnd machines too.

I feel like I need to escalate this issue to upstream kernel,
together/separately from the virtio-rnd one.

It seems like rnd rewrites are not quite as robust as the old status quo.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-27 08:49 EDT-------
With Frank's revert [build https://launchpad.net/~fheimes/+archive/ubuntu/test/]
kernel boots up OK.

Frank Heimes (fheimes) wrote :

Kernel SRU request submitted for bionic:
https://lists.ubuntu.com/archives/kernel-team/2022-October/134331.html
changing status to 'In Progress'.

Changed in linux (Ubuntu):
status: Confirmed → In Progress
Changed in ubuntu-z-systems:
status: Confirmed → In Progress
Changed in linux (Ubuntu):
assignee: Skipper Bug Screeners (skipper-screen-team) → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Luke Nowakowski-Krijger (lukenow)
status: New → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Frank Heimes (fheimes) wrote :

Ubuntu 18.04 kernel 4.15.0-196 (currently in -proposed) includes this fix.
https://launchpad.net/ubuntu/+source/linux/4.15.0-196.207
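
Whether a given bionic kernel falls into the broken range can be checked from its release string. This is a rough sketch under assumptions: the function names are hypothetical, the lower bound 194 comes from the suspicion voiced earlier in this bug ("this happens since kernel 4.15.0-194"), and the upper bound 196 from the fixed version named above. It only looks at the ABI number of 4.15.0 kernels and says nothing about architecture or hardware level.

```shell
#!/bin/sh
# Print the ABI number of a 4.15.0 kernel release string,
# e.g. "4.15.0-194-generic" -> "194". Fails for other kernels.
kernel_abi() {
    rel=${1#4.15.0-}                 # strip the upstream version prefix
    [ "$rel" != "$1" ] || return 1   # prefix missing: not a 4.15.0 kernel
    abi=${rel%%-*}                   # keep the part before the flavour
    case $abi in
        ''|*[!0-9]*) return 1 ;;     # not a plain number
    esac
    printf '%s\n' "$abi"
}

# Return 0 if the release lies in the range assumed to be broken:
# from 4.15.0-194 up to, but not including, 4.15.0-196.
has_ipl_regression() {
    abi=$(kernel_abi "$1") || return 1
    [ "$abi" -ge 194 ] && [ "$abi" -lt 196 ]
}

release=${1:-$(uname -r)}
if has_ipl_regression "$release"; then
    echo "$release: in the affected range"
else
    echo "$release: not in the affected range"
fi
```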

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-196.207

---------------
linux (4.15.0-196.207) bionic; urgency=medium

  * bionic/linux: 4.15.0-196.207 -proposed tracker (LP: #1994992)

  * [UBUNTU 18.04] Ubuntu 18.04 kernel 4.15.0-194 crashes on IPL (LP: #1994601)
    - SAUCE: Revert "s390/archrandom: simplify back to earlier design and
      initialize earlier"

 -- Luke Nowakowski-Krijger <email address hidden> Thu, 27 Oct 2022 13:56:02 -0700

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin1804
removed: targetmilestone-inin---
Frank Heimes (fheimes)
Changed in linux (Ubuntu):
status: In Progress → Invalid
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-11-10 11:26 EDT-------
I re-ran the installation as described above and now I was able to boot the VM. Thanks!

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure-4.15/4.15.0-1162.177 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-bionic-linux-azure-4.15 verification-needed-bionic
Frank Heimes (fheimes) wrote :

This bug doesn't affect the azure kernel or any other non-linux-generic kernels.
Hence updating the tags to unblock the process...

tags: added: verification-done-bionic
removed: verification-needed-bionic