Moonshot ProLiant m400 fails to boot "Wrong Ramdisk Image Format"

Bug #1900796 reported by Po-Hsu Lin
This bug affects 1 person
Affects                  Status         Importance   Assigned to   Milestone
ubuntu-kernel-tests      New            Undecided    Unassigned
flash-kernel (Ubuntu)    Fix Released   Undecided    Unassigned
  Xenial                 New            Undecided    Unassigned
  Bionic                 Confirmed      Undecided    Unassigned
  Focal                  Fix Released   Undecided    Unassigned
  Groovy                 Fix Released   Undecided    Unassigned
  Hirsute                Fix Released   Undecided    Unassigned
plymouth (Ubuntu)        Invalid        Undecided    Unassigned
  Xenial                 Invalid        Undecided    Unassigned
  Bionic                 Invalid        Undecided    Unassigned
  Focal                  Invalid        Undecided    Unassigned
  Groovy                 Invalid        Undecided    Unassigned
  Hirsute                Invalid        Undecided    Unassigned

Bug Description

[Impact]
Due to a firmware (u-boot) bug in reading ext4 filesystem extents, ProLiant m400 systems may fail to boot after installing a new kernel. This seems to be exacerbated when there is limited free space on the /boot filesystem. HPE is no longer providing new firmware fixes for this platform.

[Test Case]
Install a new kernel and reboot. When this bug is triggered, you'll see the following errors (emphasis <<>> mine):

## Executing script at 4004000000
11349894 bytes read in 312 ms (34.7 MiB/s)
<<invalid extent block>>
## Booting kernel from Legacy Image at 4002000000 ...
   Image Name: kernel 5.8.0-25-generic
   Created: 2020-10-21 5:26:34 UTC
   Image Type: ARM Linux Kernel Image (gzip compressed)
   Data Size: 11349830 Bytes = 10.8 MiB
   Load Address: 00080000
   Entry Point: 00080000
   Verifying Checksum ... OK
Wrong Ramdisk Image Format
<<Ramdisk image is corrupt or invalid>>
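
When checking many captured serial console logs for this failure, a small filter can help. The following is a minimal sketch, not part of flash-kernel: the function name has_extent_boot_failure is hypothetical, and it assumes that either of the two messages quoted above is enough to flag a log.

```shell
#!/bin/sh
# Hypothetical helper (not part of flash-kernel): read a captured u-boot
# console log on stdin and succeed if it contains either failure message
# quoted in this report.
has_extent_boot_failure() {
    grep -q -e 'invalid extent block' -e 'Wrong Ramdisk Image Format'
}
```

For example, `has_extent_boot_failure < console.log && echo "u-boot extent bug suspected"`, where console.log is whatever file the serial console was captured to.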

[Where Problems Could Occur]
The workaround I've added here attempts to defragment the boot files so that the u-boot parsing bug is not triggered. It is only activated for machines tagged with a certain property, and only the m400 server is tagged with that property. If there is a bug in detecting the platform or the property, it could of course impact other platforms, though it should be said that this code uses a well-established flash-kernel pattern.

On the m400, the code only applies the workaround if /boot is on an ext4 filesystem (the Ubuntu default). If the filesystem detection code is buggy, we may unintentionally run e4defrag on a non-ext4 filesystem, which could cause errors. Those errors currently only cause a warning to be printed; they do not fail the script. Users who miss this warning could still end up with an unbootable system if the workaround fails, which it may if the disk is very close to full. Long term, we should consider making this error fatal.
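
The behaviour described above (check for ext4, defragment, warn rather than fail) can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual flash-kernel patch: the function names needs_extent_workaround and defrag_boot_file are hypothetical, and the filesystem type is looked up with findmnt from util-linux, which may differ from how flash-kernel detects it.

```shell
#!/bin/sh
# Sketch only; NOT the flash-kernel patch. Function names are hypothetical.

# Succeed only when the given filesystem type is ext4.
needs_extent_workaround() {
    [ "$1" = "ext4" ]
}

# Defragment one boot file if it lives on ext4. A failure only prints a
# warning, mirroring the non-fatal behaviour described in the text.
defrag_boot_file() {
    file="$1"
    # Assumption: findmnt (util-linux) is available; -T resolves the mount
    # that contains the given path.
    fstype="$(findmnt -n -o FSTYPE -T "$file" 2>/dev/null)" || fstype=""
    if ! needs_extent_workaround "$fstype"; then
        return 0
    fi
    if ! e4defrag "$file" >/dev/null 2>&1; then
        echo "Warning: e4defrag failed for $file; the system may not boot" >&2
    fi
    return 0
}
```

It would be called as root on a generated file, for example `defrag_boot_file /boot/uInitrd`. Making the failure fatal, as suggested above, would mean returning non-zero from the warning branch.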

Po-Hsu Lin (cypressyew) wrote :
Po-Hsu Lin (cypressyew)
description: updated
Po-Hsu Lin (cypressyew) wrote :

There is no such issue on another ARM64 ThunderX node (starmie-kernel, with architecture set to arm64/generic on MAAS):

$ sudo apt install plymouth
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  plymouth-theme-ubuntu-text
Suggested packages:
  desktop-base plymouth-themes
The following packages will be upgraded:
  plymouth plymouth-theme-ubuntu-text
2 upgraded, 0 newly installed, 0 to remove and 31 not upgraded.
Need to get 121 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://ports.ubuntu.com/ubuntu-ports focal-proposed/main arm64 plymouth-theme-ubuntu-text arm64 0.9.4git20200323-0ubuntu6.1 [9148 B]
Get:2 http://ports.ubuntu.com/ubuntu-ports focal-proposed/main arm64 plymouth arm64 0.9.4git20200323-0ubuntu6.1 [112 kB]
Fetched 121 kB in 0s (318 kB/s)
(Reading database ... 112976 files and directories currently installed.)
Preparing to unpack .../plymouth-theme-ubuntu-text_0.9.4git20200323-0ubuntu6.1_arm64.deb ...
Unpacking plymouth-theme-ubuntu-text (0.9.4git20200323-0ubuntu6.1) over (0.9.4git20200323-0ubuntu6) ...
Preparing to unpack .../plymouth_0.9.4git20200323-0ubuntu6.1_arm64.deb ...
Unpacking plymouth (0.9.4git20200323-0ubuntu6.1) over (0.9.4git20200323-0ubuntu6) ...
Setting up plymouth (0.9.4git20200323-0ubuntu6.1) ...
update-initramfs: deferring update (trigger activated)
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
Setting up plymouth-theme-ubuntu-text (0.9.4git20200323-0ubuntu6.1) ...
update-initramfs: deferring update (trigger activated)
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for systemd (245.4-4ubuntu3.2) ...
Processing triggers for initramfs-tools (0.136ubuntu6.3) ...
update-initramfs: Generating /boot/initrd.img-5.8.0-25-generic
W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast
Unsupported platform on EFI system, doing nothing.

dann frazier (dannf) wrote :

I think this is a firmware issue. Normally you should see something like:

288 bytes read in 34 ms (7.8 KiB/s)
## Executing script at 4004000000
10741547 bytes read in 323 ms (31.7 MiB/s) <- This is loading the kernel
64692535 bytes read in 1672 ms (36.9 MiB/s) <- This is loading the initramfs
## Booting kernel from Legacy Image at 4002000000 ...
   Image Name: kernel 5.4.0-56-generic
   Created: 2020-12-08 1:43:18 UTC
   Image Type: ARM Linux Kernel Image (gzip compressed)
   Data Size: 10741483 Bytes = 10.2 MiB
   Load Address: 00080000
   Entry Point: 00080000
   Verifying Checksum ... OK
## Loading init Ramdisk from Legacy Image at 4005000000 ...
   Image Name: ramdisk 5.4.0-56-generic
   Created: 2020-12-08 1:43:18 UTC
   Image Type: ARM Linux RAMDisk Image (uncompressed)
   Data Size: 64692471 Bytes = 61.7 MiB
   Load Address: 00000000
   Entry Point: 00000000
   Verifying Checksum ... OK
## Flattened Device Tree blob at 4003000000
   Booting using the fdt blob at 0x0000004003000000
   Uncompressing Kernel Image ... OK
   Loading Ramdisk to 4fec24d000, end 4feffff0f7 ... OK
   Loading Device Tree to 0000004000078000, end 000000400007fa67 ... OK

But in your output, where we'd expect to see the initramfs load size/rate, we instead see "invalid extent block". And then later:

Wrong Ramdisk Image Format
Ramdisk image is corrupt or invalid

This could be a u-boot bug, possibly fixed by this patch:
https://lists.denx.de/pipermail/u-boot/2014-January/170802.html

I was able to reproduce this by installing bionic, upgrading to the 5.4-based HWE kernel, rebooting, then upgrading my HWE kernel from bionic-proposed. Possibly the interesting bit here is that the running kernel, and therefore its ext4 driver, was 5.4-based when the new uInitrd was generated. I was able to get it working by defragmenting the file:

$ sudo e4defrag -c /boot/uInitrd
e4defrag 1.44.1 (24-Mar-2018)
<File> now/best size/ext
/boot/uInitrd 15/1 2469 KB

 Total/best extents 15/1
 Average size per extent 2469 KB
 Fragmentation score 1
 [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
 This file (/boot/uInitrd) does not need defragmentation.
 Done.
$ sudo e4defrag /boot/uInitrd
e4defrag 1.44.1 (24-Mar-2018)
ext4 defragmentation for /boot/uInitrd
[1/1]/boot/uInitrd: 100% [ OK ]
 Success: [1/1]
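
Note that e4defrag -c reported a fragmentation score of 1 ("does not need defragmentation") even though the file had 15 extents, so the score alone is not a useful signal here. A minimal sketch for pulling the extent count out of that output follows. The function names are hypothetical, and treating any file with more than one extent as a candidate for defragmentation is an assumption, not something this report establishes as the exact trigger.

```shell
#!/bin/sh
# Hypothetical helpers; they parse `e4defrag -c` output supplied on stdin.

# Print the current extent count from the "Total/best extents N/M" line.
current_extents() {
    awk '/Total\/best extents/ { split($NF, a, "/"); print a[1] }'
}

# Succeed if the file has more than one extent.
# Assumption: a single extent is the safe case for the affected u-boot.
has_multiple_extents() {
    n="$(current_extents)"
    [ -n "$n" ] && [ "$n" -gt 1 ]
}
```

For example: `sudo e4defrag -c /boot/uInitrd | has_multiple_extents && sudo e4defrag /boot/uInitrd`.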

Since that is ext4-specific, a workaround might be to use an ext2/ext3 /boot. There is newer firmware available for m400s that we could try, but "An active warranty or support agreement covering Proliant servers must be linked to your HPE Support Center profile to access this BIOS.", so I can't access it :(

Changed in plymouth (Ubuntu):
status: New → Invalid
dann frazier (dannf) wrote :

A workaround might be to have flash-kernel automatically defrag files it generates on ext4, so adding a task for it.

Changed in flash-kernel (Ubuntu):
status: New → Confirmed
dann frazier (dannf)
description: updated
dann frazier (dannf) wrote :

Here's a patch for flash-kernel that works around the issue.

summary: - plymouth in proposed cause F-5.8 unable to boot on moonshot ARM64 "Wrong
- Ramdisk Image Format"
+ Moonshot ProLiant m400 fails to boot "Wrong Ramdisk Image Format"
tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package flash-kernel - 3.100ubuntu5

---------------
flash-kernel (3.100ubuntu5) hirsute; urgency=medium

  * Add workaround for older u-boot versions that can fail to read files
    using ext4 extents, and enable it for HP ProLiant m400 Moonshot Server
    Cartridges. LP: #1900796

 -- dann frazier <email address hidden> Mon, 14 Dec 2020 16:48:14 -0700

Changed in flash-kernel (Ubuntu):
status: Confirmed → Fix Released
dann frazier (dannf) wrote :

The latest available firmware still has the problem, so it looks like we'll need to work around it.

dann frazier (dannf)
Changed in plymouth (Ubuntu Groovy):
status: New → Invalid
Changed in plymouth (Ubuntu Focal):
status: New → Invalid
Changed in plymouth (Ubuntu Bionic):
status: New → Invalid
Changed in plymouth (Ubuntu Xenial):
status: New → Invalid
Changed in flash-kernel (Ubuntu Groovy):
status: New → Confirmed
Changed in flash-kernel (Ubuntu Focal):
status: New → Confirmed
Changed in flash-kernel (Ubuntu Bionic):
status: New → Confirmed
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Po-Hsu, or anyone else affected,

Accepted flash-kernel into groovy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/flash-kernel/3.103ubuntu1~20.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-groovy to verification-done-groovy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-groovy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in flash-kernel (Ubuntu Groovy):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-groovy
Brian Murray (brian-murray) wrote :

Hello Po-Hsu, or anyone else affected,

Accepted flash-kernel into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/flash-kernel/3.103ubuntu1~20.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in flash-kernel (Ubuntu Focal):
status: Confirmed → Fix Committed
tags: added: verification-needed-focal
dann frazier (dannf)
description: updated
dann frazier (dannf)
tags: added: verification-done verification-done-focal verification-done-groovy
removed: verification-needed verification-needed-focal verification-needed-groovy
dann frazier (dannf) wrote :

I've verified that an m400 remains bootable after upgrading to and executing the focal-proposed and groovy-proposed versions of flash-kernel.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package flash-kernel - 3.103ubuntu1~20.10.1

---------------
flash-kernel (3.103ubuntu1~20.10.1) groovy; urgency=medium

  * Backport latest upstream version to groovy (LP: #1904890)
  * This includes Dann Frazier's workaround for LP: #1900796

 -- Dave Jones <email address hidden> Tue, 12 Jan 2021 17:14:57 +0000

Changed in flash-kernel (Ubuntu Groovy):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for flash-kernel has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package flash-kernel - 3.103ubuntu1~20.04.1

---------------
flash-kernel (3.103ubuntu1~20.04.1) focal; urgency=medium

  * Backport latest upstream version to groovy (LP: #1904890)
  * This includes Dann Frazier's workaround for LP: #1900796

 -- Dave Jones <email address hidden> Tue, 12 Jan 2021 17:14:57 +0000

Changed in flash-kernel (Ubuntu Focal):
status: Fix Committed → Fix Released