subiquity raid+lvm installation failed

Bug #1784124 reported by Michael Kofler
This bug affects 3 people
Affects    Status        Importance  Assigned to  Milestone
subiquity  Fix Released  Undecided   Unassigned

Bug Description

I just tested Subiquity as of 18.04.1 in a virtual machine with two disks.

first try:

I wanted to join two disks into a RAID-1 array, then set up LVM on the RAID, then set up three logical volumes for /boot, / and swap. This failed: Subiquity insists that /boot be directly on a disk.

second try:

I created two RAID-1 volumes, a small one for /boot and a large one for LVM, with /boot directly on the first RAID-1 volume and LVM on the second, holding logical volumes for / and swap. This also failed: Subiquity still insists on /boot on a local disk. (This defeats the purpose of RAID: if the wrong disk fails, I won't be able to boot.)

third try:

/boot directly on disk 1. The remaining space of disk 1 and all space on disk 2 combined into a RAID-1 array, with LVM on it and two LVs for / and swap. Subiquity accepts this setup, but the installation fails soon afterwards with 'an error has occurred'. Quite helpful ...

I attached one screenshot with my setup and will try to add two more screenshots later on.

Michael Hudson-Doyle (mwhudson) wrote :

The "can't create /boot on RAID" limitation is unfortunate, but fixing it is fiddly and it didn't get fixed in time for the point release. I'm not sure why your final attempt failed; we need to implement subiquity collect-logs or something similar :/

Changed in subiquity:
status: New → Incomplete
Tom Reynolds (tomreyn) wrote :

Michael: You ack'd that /boot on RAID does not work to date, so there seems to be an issue there (admittedly duplicated by bug 1785332). Why is it incomplete?

Michael Hudson-Doyle (mwhudson) wrote :

If this bug was just about /boot on raid, I'd have marked it as a
duplicate. But it also mentions an install failure with not enough
information to debug, hence incomplete.


Tom Reynolds (tomreyn) wrote :

Thanks for clarifying, Michael H-D. So I understand we are looking for more information on Michael K.'s "third try" with this setup:

HDD1 (10GB)
PT1: 1MB, bios_grub
PT2: 1GB, ext4, /boot
PT3: 8.997GB, comp. of md0

HDD2 (10GB)
PT1: 9GB, comp. of md0

MD
MD0: 8.989GB, RAID-1, PV of vg0

VG0
"lvswap": 1GB, swap
"lvroot": 7.988GB, ext4, /

As Michael K. reports, this failed (not directly) after partitioning with 'an error has occurred', using the 18.04.1 server-live installer.

I have just reproduced this error with the exact same HDD, partition, MD RAID-1, LVM setup (also using VirtualBox), using the 18.04.2 server-live amd64 installer.

I suspect that the differently sized component devices of the md0 RAID-1 array cause curtin to become confused about how much storage is available on this array for LVs.

From install.log:
        start: cmd-install/stage-partitioning/builtin/cmd-block-meta: configuring lvm_partition: lv-0
        Running command ['lvcreate', 'vg0', '--name', 'lvswap', '--zero=y', '--wipesignatures=y', '--size', '1073741824b'] with allowed return codes [0] (capture=False)
          Logical volume "lvswap" created.
...
        finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: configuring lvm_partition: lv-0
        start: cmd-install/stage-partitioning/builtin/cmd-block-meta: configuring lvm_partition: lv-1
        Running command ['lvcreate', 'vg0', '--name', 'lvroot', '--zero=y', '--wipesignatures=y', '--size', '8577351680b'] with allowed return codes [0] (capture=False)
          Volume group "vg0" has insufficient free space (2044 extents): 2045 required.
        An error occured handling 'lv-1': ProcessExecutionError - Unexpected error while running command.
        Command: ['lvcreate', 'vg0', '--name', 'lvroot', '--zero=y', '--wipesignatures=y', '--size', '8577351680b']
        Exit code: 5
        Reason: -

Full logs are attached.
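
To make the numbers in the log easier to follow, here is a small sketch that converts the two lvcreate sizes into extents. It assumes LVM's default physical extent size of 4 MiB (the log does not state the extent size, but 8577351680 bytes divides into exactly 2045 such extents, which matches the error message). It also assumes vg0 held nothing but lvswap when the second lvcreate ran. The function and variable names are illustrative only.

```python
# Assumption: LVM's default physical extent size (4 MiB); not stated in the log.
EXTENT_BYTES = 4 * 1024 * 1024


def extents_needed(size_bytes, extent_bytes=EXTENT_BYTES):
    """Whole extents required to hold size_bytes (rounded up)."""
    return -(-size_bytes // extent_bytes)


lvswap_extents = extents_needed(1073741824)   # size passed to lvcreate for lvswap
lvroot_extents = extents_needed(8577351680)   # size passed to lvcreate for lvroot

free_after_swap = 2044                        # from the lvcreate error message
# Assumption: the VG contained only lvswap at this point.
vg_total_extents = free_after_swap + lvswap_extents

shortfall = lvswap_extents + lvroot_extents - vg_total_extents
print(lvswap_extents, lvroot_extents, vg_total_extents, shortfall)
# prints: 256 2045 2300 1
```

Under these assumptions the two LVs together ask for 2301 extents while the VG provides 2300, so the request is short by exactly one extent (4 MiB).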

Tom Reynolds (tomreyn) wrote :

Testing again with identically sized md0 component devices does not fix it. It fails with the same message (except that I named the VG vg1 this time):
   Volume group "vg1" has insufficient free space (2044 extents): 2045 required.

So this rather seems to be an off-by-one issue or something else entirely.

Partitioning scheme:

HDD1 (10GB)
PT1: 1MB, bios_grub
PT2: 1GB, ext4, /boot
PT3: 8.997GB, comp. of md0

HDD2 (10GB)
PT1: 1MB, unassigned
PT2: 1GB, unassigned
PT3: 8.997GB, comp. of md0

MD
MD0: 8.989GB, RAID-1, PV of vg1

VG1
"lvswap": 1GB, swap
"lvroot": 7.988GB, ext4, /
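
For illustration only, here is one way an installer could guard against an overshoot of this kind: align each requested LV size down to whole extents and shrink the last LV if the total exceeds what the VG reports as free. This is a hypothetical sketch, not subiquity's or curtin's actual code; the function name, the 4 MiB extent size and the figure of 2300 free extents (2044 free plus the 256 already used by lvswap) are assumptions.

```python
# Assumption: LVM's default physical extent size (4 MiB).
EXTENT_BYTES = 4 * 1024 * 1024


def clamp_to_free_extents(requested_bytes, free_extents, extent_bytes=EXTENT_BYTES):
    """Return LV sizes in bytes, aligned down to whole extents.

    If the layout needs more extents than are free, the last LV is shrunk
    by the difference. Raises ValueError if that would leave it empty.
    """
    extents = [size // extent_bytes for size in requested_bytes]
    overshoot = sum(extents) - free_extents
    if overshoot > 0:
        if overshoot >= extents[-1]:
            raise ValueError("volume group too small for requested layout")
        extents[-1] -= overshoot
    return [count * extent_bytes for count in extents]


# Sizes from the failing install: lvswap, then lvroot.
sizes = clamp_to_free_extents([1073741824, 8577351680], free_extents=2300)
print(sizes)
# prints: [1073741824, 8573157376]
```

With these inputs lvroot comes out one extent smaller (2044 extents), which is what the VG had left in the failing run.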

Tom Reynolds (tomreyn) wrote :

I believe I have shown that the partitioning issue ("unknown error" after manual partitioning) is a real, reproducible issue, affecting both Michael K. and me. Setting this back to confirmed.
If more info is needed then please state so and list the required information. Thanks!

Changed in subiquity:
status: Incomplete → Confirmed
Michael Hudson-Doyle (mwhudson) wrote :

Both aspects of this bug are now fixed in the edge channel and will be released soon. (The "can't put /boot on RAID" has been fixed for a while, actually, but the RAID size calculation fix is new).

Changed in subiquity:
status: Confirmed → Fix Committed
Changed in subiquity:
status: Fix Committed → Fix Released