Critical Dapper bug: Missing secondary from software RAID1, no message or warning

Bug #58893 reported by Timothy Miller
Affects: mdadm (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have two identical drives. I set up software RAID1 on them and installed Dapper.

After installing, I shut down and pulled the power connector from the secondary disk. I booted up, and everything appeared to be fine.

The thing that really disturbs me is that there were no messages of any kind reporting that the secondary drive was offline. This is a serious problem: the user would never know that one of their drives had failed, so they wouldn't know to replace it, and their data would be at serious risk. (Especially since, if one of two identical drives bought at the same time fails, the other is probably not far behind.)
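
For reference, the array state can still be checked by hand; a minimal sketch, assuming a single array at /dev/md0 (the device name is illustrative):

    $ cat /proc/mdstat               # a degraded RAID1 shows [U_] instead of [UU]
    $ sudo mdadm --detail /dev/md0   # shows "State : ... degraded" and lists the members

But nothing surfaces this to the user unless they know to go looking for it.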

Revision history for this message
Timothy Miller (theosib) wrote :

I would like to suggest that, when the user logs in, they should see a graphical dialog that reports that a drive is missing from the array (or whatever the situation is) so that they know that there's a problem.

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

This is what mdadm monitor is for and it is installed by default. dpkg-reconfigure mdadm if you need to change default settings.
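
As a minimal sketch of verifying that monitoring is actually wired up (paths per the Debian/Ubuntu mdadm packaging; the MAILADDR value shown is illustrative):

    $ grep MAILADDR /etc/mdadm/mdadm.conf            # where alert mail is sent
    MAILADDR root
    $ sudo dpkg-reconfigure mdadm                    # adjust the daemon/mail defaults
    $ sudo mdadm --monitor --scan --oneshot --test   # emit one test alert per array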

Changed in mdadm:
status: Unconfirmed → Rejected
Revision history for this message
Timothy Miller (theosib) wrote :

With all due respect, I think it's inappropriate to reject this bug. I think you didn't understand the bug I was reporting. If mdadm is installed by default, then it wasn't automatically configured properly. It reported nothing to me, so either the installation didn't set it up right, or it's broken.

The bug I'm reporting is that there is no in-your-face report that the drive is missing, and there certainly should be. Please reopen this bug.

And if you're talking about the command-line mdadm tools, then that's REALLY missing the point, because I'm talking about a usability problem here, and users should not be required to drop to the command line to manage their RAID array.

Revision history for this message
Dustin Kirkland (kirkland) wrote :

Hi Timothy-

My apologies to you, on behalf of whoever closed this report as "Invalid". That was clearly the wrong response, as they did not understand your problem.

I was able to confirm the behavior you were seeing.

I have spent quite a bit of time addressing a number of RAID issues in Intrepid, and I think this one should now be resolved.

There is now a debconf question within mdadm, which will set a value of BOOT_DEGRADED=true|false in /etc/initramfs-tools/conf.d/mdadm.
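
A minimal sketch of the resulting file (the comment text is illustrative; if you edit it by hand rather than through debconf, regenerate the initramfs afterwards with "sudo update-initramfs -u"):

    # /etc/initramfs-tools/conf.d/mdadm
    # BOOT_DEGRADED=true -> boot automatically even if an array is degraded
    BOOT_DEGRADED=false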

If it's set to 'true', then the administrator wants to boot the system even if an md-degraded event has occurred. This is desirable behavior for some sysadmins who want their unattended, RAID-protected system to boot no matter what.

If it's set to 'false', the initramfs will spend 30 seconds trying to construct the md device, and if that fails, a meaningful error message is printed, along with an interactive prompt asking if you would like to boot your system on this degraded array [y/N]. The default is "no", and the prompt will time out if there is no response within 15 seconds, in which case you will be dropped to a busybox shell in the initramfs.
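
If you only need to get past a failed disk once, there is also a kernel command-line override, so the file does not have to be edited at all (a sketch; verify the exact parameter name against the initramfs scripts in your mdadm version):

    bootdegraded=true    # appended to the kernel line in the boot loader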

Would it be possible for you to test the new behavior in Intrepid?
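
A gentler way to exercise this than pulling a power connector is to fail one member administratively; a minimal sketch, assuming the array is /dev/md0 with member /dev/sdb1 (both names illustrative):

    $ sudo mdadm /dev/md0 --fail /dev/sdb1     # mark the member faulty
    $ sudo mdadm /dev/md0 --remove /dev/sdb1   # detach it from the array
    $ cat /proc/mdstat                         # should now show [U_]
    $ sudo reboot                              # observe the degraded-boot handling
    $ sudo mdadm /dev/md0 --add /dev/sdb1      # afterwards, re-add and resync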

I'm going to mark this bug as "Fix Released" for now. If you test it in Intrepid and still see a problem, please respond here and we can reopen it.

Thanks,
:-Dustin

Changed in mdadm:
status: Invalid → Fix Released