Comment 2 for bug 1185394

Martin Pitt (pitti) wrote : Re: [Bug 1185394] [NEW] systemd-udev fails when processing many logical volumes on boot

Before I forget everything over the long weekend, some braindump:

I tried to reproduce this with today's server iso install in a VM. I
used two virtio disks: the first one (sda) with a 5 GB root and two
PVs, the second one (sdb) with two more PVs. I built two VGs with 8
LVs in total, putting /home on VG1/LV1, /home/iso on VG2/LV2 (trying
to replicate Stefan's setup), and mounted the other LVs to places like
/data1. This has booted fine 10 out of 10 times.

I tried to add a sleep to 85-lvm2.rules to simulate a long-running
vgchange, but it still works fine. I modified udev's initramfs-bottom
script to show processes before (got some 10 udev workers) and after
(no udev processes left) the udevadm control --exit, so this also
works.
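
A rough sketch of that before/after process listing, assuming a POSIX
shell in the initramfs. udev_procs and log_udev_procs are hypothetical
helper names, not part of the shipped script, and which ps options are
available depends on the busybox build in the initramfs:

```shell
# Hypothetical helper: keep only ps lines that mention a udev daemon or worker.
# The [u] bracket keeps a grep command line from matching itself in ps output.
udev_procs() {
    grep '[u]devd' || true
}

# Hypothetical helper: print a labelled snapshot of the running udev processes.
log_udev_procs() {
    echo "== udev processes $1 =="
    ps | udev_procs
}

# Intended use inside the init-bottom script (not run here):
#   log_udev_procs "before exit"
#   udevadm control --exit
#   log_udev_procs "after exit"
```

If workers survive the control --exit, they should still show up in
the second snapshot.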

The conjecture is that due to our 0024-avoid-exit-deadlock-for-dm_cookie.patch
the workers for the LVs are not stopped by the control --exit and thus
race with the following mount --move of /dev to /root/dev. I could
never reproduce this in my VM, so this is what I would like to find
out from Stefan with (4) and (1). Our udev 175 initramfs script
behaved in pretty much the same way, but we should compare the outputs
of (1) and (4) between udev 175 and 202.

I'd also like to look at the core dump to see where it actually
crashes. I guess something tries to open /dev/null and does not expect
a failure, and thus runs into a NULL FILE*, but it would be nice to
confirm this.

I do not have a good idea why /dev/null is missing. Right after the
mount --move we add a compat symlink from /dev/ to /root/dev, so that
udev workers and their callouts should in theory even be able to
continue after the mount --move. We did not have this compat symlink
in udev 175; in theory its absence should make things worse, but just
to be sure to cover all bases, here's:

  (5) comment out the last line (ln -s) in
      /usr/share/initramfs-tools/scripts/init-bottom/udev

(plus "sudo update-initramfs -u"), and see whether that makes a
difference. In theory it should lead to lost events and more errors
like "/dev/null missing", but perhaps we are missing some weird check
in the lvm2/dmsetup scripts/rules.
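
For convenience, step (5) could be scripted roughly like this. It is
only a sketch: comment_out_symlink is a hypothetical helper name, and
it assumes the symlink line starts with "ln -s" (after optional
whitespace), so check the file by hand afterwards:

```shell
# Hypothetical helper: comment out every line in the given file that
# starts with "ln -s" (optionally indented), rewriting the file in place.
comment_out_symlink() {
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file first, then copy it back so the
    # original file keeps its permissions.
    sed 's/^\([[:space:]]*\)ln -s/\1#ln -s/' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Intended use, as root (not run here):
#   comment_out_symlink /usr/share/initramfs-tools/scripts/init-bottom/udev
#   update-initramfs -u
```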

Thanks,

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)