RAID is incorrectly determined as DEGRADED, preventing boot in 12.04
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
mdadm (Ubuntu) | Confirmed | High | Unassigned | | |
Bug Description
After upgrading from 11.04 to 12.04 in two steps, my server failed to boot, printing:
"Could not start the RAID in degraded mode.", referring to /dev/md/3. It then dropped to an initramfs shell.
My RAID setup is the following:
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1] [linear] [multipath] [raid0] [raid10]
md3 : active raid0 dm-2[0] sdc2[2] sdb2[1] sdd2[3]
82075648 blocks super 1.2 1024k chunks
md0 : active raid1 sdf1[1] sde1[0]
530048 blocks [2/2] [UU]
md4 : active raid5 sdf3[1] sdh3[4] sdg3[2] sde3[0]
5856021120 blocks super 1.2 level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
md2 : active raid6 sdh2[3] sdf2[1] sdg2[2] sde2[0]
1950720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
md1 : active raid5 sda1[0] sdc1[2] sdb1[1] sdd1[3]
11712000 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
The following are my mount points:
# mount
/dev/mapper/
/dev/md0 on /boot type ext4 (rw)
/dev/md3 on /something/
# grep -e md -e mapper -e boot /etc/fstab
/dev/mapper/
UUID=9b199b09-
/dev/mapper/
/dev/mapper/
/dev/md3 /something/
Current crypttab setup:
# cat /etc/crypttab
md1_crypt /dev/md1 none luks,discard
md2_crypt /dev/md2 /dev/urandom cipher=
md3_crypt /dev/md3 /some/key/file cipher=
sda2_crypt /dev/sda2 /some/key/file cipher=
md4_crypt /dev/md4 /some/key/file cipher=
As you can see, the first fact is that /dev/md3 should not be relevant for booting the system. It's not the rootfs, it's not the swap, and it's not /boot, which are all I need to get my system up and running.
The second fact, which contributes to the problem, is that /dev/md3 is a RAID0 (zero tolerance for a missing drive) that includes a device which is set up through the crypttab (sda2_crypt). So one slice of /dev/md3 is encrypted.
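To spot arrays in this situation, one could scan /proc/mdstat for arrays that have a device-mapper member. The following is only a minimal sketch: the function name `arrays_with_dm_members` is hypothetical, and it assumes that members backed by crypttab mappings show up as `dm-N`, as dm-2 does in the md3 line above.

```shell
# Hypothetical helper: list md arrays that have at least one
# device-mapper member (dm-N). Reads /proc/mdstat-style text on stdin.
arrays_with_dm_members()
{
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" {
        # Fields after "mdN : <state>" are the level and the members.
        for (i = 4; i <= NF; i++)
            if ($i ~ /^dm-[0-9]+\[/) { print $1; break }
    }'
}
```

Running `arrays_with_dm_members < /proc/mdstat` on a system like the one described would be expected to list md3 only, since it is the only array with a dm-* member.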
During boot, this is what happens:
1) The system mounts the initrd stuff (which has a local derivation of fstab, mdadm.conf and crypttab). It tries to determine what to do. It determines that the system has an encrypted rootfs, and correctly prompts for the password.
2) /dev/mapper/
3) The system moves on to determine how to assemble the rest of the RAIDs. It reads mdadm.conf (the problem persists even if I remove this file, although md3 is then named md127) and finds definitions of md0, md1, md2, md3 & md4. It will try to run the stuff from /usr/share/
4) /usr/share/
md3 : active raid0 dm-2[0] sdc2[2] sdb2[1] sdd2[3]
82075648 blocks super 1.2 1024k chunks
Notice the first device.
5) For some reason, even with BOOT_DEGRADED=true set in /etc/default/mdadm, the setting is ignored for this RAID0, probably because the array is marked as faulty rather than degraded?
6) The system halts. Throws me into the initramfs-shell.
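The "faulty rather than degraded" guess in step 5 can be checked by hand from the initramfs shell. According to the mdadm(8) man page, `mdadm --misc --detail --test` exits with 0 for a normal array, 1 when at least one device has failed, 2 when so many devices have failed that the array is unusable, and 4 on an error getting information. The sketch below only maps those codes to labels; the function names are hypothetical, and whether the initramfs scripts treat the codes differently is not something this report confirms.

```shell
# Hypothetical helper: translate the exit status of
# `mdadm --misc --detail --test <array>` into a label.
# Status meanings are taken from the mdadm(8) man page.
classify_md_status()
{
    case "$1" in
        0) echo "normal" ;;
        1) echo "degraded" ;;
        2) echo "unusable" ;;
        4) echo "error" ;;
        *) echo "unknown" ;;
    esac
}

# Only status 1 describes an array that can still run with
# missing members; a RAID0 missing a member cannot.
is_startable_degraded()
{
    [ "$(classify_md_status "$1")" = "degraded" ]
}
```

For example, `mdadm --misc --detail --test /dev/md3; classify_md_status $?` would show which of these states mdadm reports for md3.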
I got the system successfully booting by "hacking" the mdadm-functions file:
--- usr/share/
+++ /usr/share/
@@ -3,8 +3,9 @@
degraded_arrays()
{
- mdadm --misc --scan --detail --test >/dev/null 2>&1
- return $((! $?))
+# mdadm --misc --scan --detail --test >/dev/null 2>&1
+ return 0
+# return $((! $?))
}
mountroot_fail()
@@ -83,10 +84,11 @@
echo "Started the RAID in degraded mode."
return 0
else
+ mdadm --stop /dev/md3
echo "Could not start the RAID in degraded mode."
fi
fi
fi
fi
- return 1
+ return 0
}
So basically I force mdadm-functions to always return 0 and never check for degraded arrays. In addition, I make it stop the incorrectly assembled /dev/md3, which will be re-assembled after the initramfs completes anyhow.
This setup was working in 11.04.
Lucky me having a remote serial console to actually solve it... :)
This should be quite reproducible on any 12.04 setup.
Changed in mdadm (Ubuntu): | |
importance: | Undecided → High |
tags: | added: precise |
Changed in mdadm (Ubuntu): | |
status: | New → Confirmed |
Potential solution:
Make the initramfs scripts completely ignore anything that is not critical for mounting the rootfs and booting the system. They should not even bother with /dev/md3, as it plays no part in successfully reaching a login prompt.
In this case, initializing only the RAIDs required for a successful boot would suffice. The initramfs would then assemble the required crypt devices and hand the system over to init, which would assemble the rest of the RAIDs successfully and then the rest of the encrypted devices.
This is basically what happens in 11.04.
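As a rough illustration of how "critical for the rootfs" could be decided, the sketch below looks up the fstab entry for / and follows a single crypttab hop to the underlying device. It is not taken from the Ubuntu initramfs scripts: the function name `root_backing_device` is hypothetical, and it deliberately ignores LVM, UUID= entries and deeper stacking.

```shell
# Hypothetical helper: print the device backing the root filesystem,
# following at most one crypttab mapping.
# $1 = path to fstab, $2 = path to crypttab
root_backing_device()
{
    fstab=$1
    crypttab=$2
    # The second fstab field is the mount point; skip comment lines.
    root_dev=$(awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "$fstab")
    [ -n "$root_dev" ] || return 1
    case "$root_dev" in
        /dev/mapper/*)
            name=${root_dev#/dev/mapper/}
            # crypttab: first field is the mapping name, second the source.
            src=$(awk -v n="$name" '$1 == n { print $2; exit }' "$crypttab")
            # No crypttab entry (e.g. an LVM volume): report the mapper device.
            echo "${src:-$root_dev}"
            ;;
        *)
            echo "$root_dev"
            ;;
    esac
}
```

With a root entry pointing at a mapping named md1_crypt and the crypttab above, `root_backing_device /etc/fstab /etc/crypttab` would print /dev/md1, so only that array (plus whatever backs /boot and swap) would need to be assembled before handing over to init.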