Starting libvirtd takes too long because of "udevadm settle" timeout

Bug #1027987 reported by Andreas Ntaflos
This bug affects 3 people
Affects            Status         Importance   Assigned to   Milestone
libvirt (Ubuntu)   Fix Released   Medium       Unassigned
  Precise          Fix Released   Medium       Unassigned
  Quantal          Fix Released   Medium       Unassigned

Bug Description

=====================================
SRU Justification:
1. Impact: starting libvirtd takes 2 minutes
2. Development fix: in Debian and Quantal, this is fixed by explicitly
   giving a shorter timeout when calling udevadm settle.
3. Stable fix: same as development fix
4. Test case: define an lvm based storage pool, then stop and start libvirt.
5. Regression potential: we're lowering a timeout, so in theory the timing
   of some races could be changed.
=====================================
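
The development fix in item 2 amounts to passing an explicit timeout to "udevadm settle" instead of relying on its default. A minimal Python sketch of that idea follows. It is not the libvirt patch itself (libvirt is written in C), and the helper names settle_command and wait_for_devices are made up for illustration. The 10-second value is the one quoted from the Debian changelog in this report.

```python
import subprocess


def settle_command(timeout_s=10):
    """Build the argv for `udevadm settle` with an explicit timeout (seconds)."""
    return ["udevadm", "settle", f"--timeout={int(timeout_s)}"]


def wait_for_devices(timeout_s=10, runner=subprocess.run):
    """Run `udevadm settle` and report whether it exited successfully.

    `runner` is a hypothetical injection point so the function can be
    exercised on machines without udev; it defaults to subprocess.run.
    """
    result = runner(settle_command(timeout_s), check=False)
    # udevadm settle exits non-zero when the queue did not drain in time.
    return result.returncode == 0
```

Calling wait_for_devices() on an affected host should return after at most roughly ten seconds, rather than blocking for minutes.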
This bug has already been reported in Debian at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=663931, but Libvirt 0.9.8 in Ubuntu Precise suffers from the same problem.

When an LVM-based storage pool is present, libvirtd takes around three minutes to start up because a timeout occurs when libvirtd calls "udevadm settle". In the Debian bug report this has been identified as a problem in Libvirt. Additional info is found here: http://article.gmane.org/gmane.linux.hotplug.devel/17421

A local workaround is to issue "service udev restart" before starting libvirtd.

The bug has been worked around in Debian's Libvirt 0.9.12 by reducing the "udevadm settle" timeout to ten seconds.

Can we expect this workaround to somehow make it into Libvirt 0.9.8 and Ubuntu Precise?

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for reporting this bug. Marking it confirmed based on separate Debian reports.

I see no commits in the libvirt git tree addressing this. Do you know whether it has been discussed on IRC (or on the libvirt mailing list)?

Changed in libvirt (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Andreas Ntaflos (daff) wrote :

Sorry, no idea where this commit is supposed to be. I was going off the information in the last entry of the Debian bug report, which contains this:

* [202939f] Reduce udevadm settle timeout to 10 seconds
     (Closes: #663931)

Maybe this is a Debian-only commit that didn't go back into Libvirt?

Ryan Tandy (rtandy) wrote :

Affected here on a fresh precise server install using lvm. Nice timing Andreas on creating this bug :)

My interpretation of the Debian bug is that they put that workaround in their own package but were seeking a proper fix upstream (probably in the kernel).

Strange thing is that I built a very similar server a couple of weeks ago and I haven't noticed this issue. Wondering whether that might reflect a difference in startup order/timing, or... ??

Ryan Tandy (rtandy) wrote :

Ah... the answer is because on the other server I forgot to define a pool... the VMs are just referencing the LVs directly :) ... any idea what I'm missing out on by not having the VG defined as a pool in libvirt?

Andreas Ntaflos (daff) wrote :

Ryan, what you are missing out on is Libvirt being able to create volumes (LVs) in that pool (the VG) on command. I assume you create LVs manually via lvcreate and then tell Libvirt to use them for VMs? If Libvirt manages the pool you can use virt-manager, virsh or any of the APIs/bindings to talk to Libvirt and command the creation of volumes for VMs.
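
As a rough illustration of what defining the VG as a pool involves, here is a small Python sketch that generates a libvirt storage pool definition of type "logical". The function name logical_pool_xml and the volume group name vg0 are placeholders for illustration, and the assumption is that the VG already exists and its device nodes live under /dev/<vg name>.

```python
import xml.etree.ElementTree as ET


def logical_pool_xml(vg_name):
    """Return libvirt pool XML for an existing LVM volume group (sketch)."""
    pool = ET.Element("pool", type="logical")
    ET.SubElement(pool, "name").text = vg_name
    source = ET.SubElement(pool, "source")
    ET.SubElement(source, "name").text = vg_name      # name of the VG
    ET.SubElement(source, "format", type="lvm2")
    target = ET.SubElement(pool, "target")
    # Assumed location of the LV device nodes for this VG.
    ET.SubElement(target, "path").text = f"/dev/{vg_name}"
    return ET.tostring(pool, encoding="unicode")


print(logical_pool_xml("vg0"))
```

The output could be saved to a file and loaded with "virsh pool-define", followed by "virsh pool-start"; after that, something like "virsh vol-create-as" can create LVs in the pool on request.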

An interesting point you brought up: we have had dozens of servers running Ubuntu 12.04 for many weeks now where either this problem was not present (or not as prominent) or I simply have not noticed it consciously before. Unfortunately I can't test and experiment much since most of these servers are in production. On the servers where I can experiment, the problem is definitely present and noticeable.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 0.9.13-0ubuntu5

---------------
libvirt (0.9.13-0ubuntu5) quantal; urgency=low

  * add patch Reduce-udevadm-settle-timeout-to-10-seconds.patch (copied from
    Debian tree) to fix 3 minute hang during pool-refresh when using LVM
    backed pools. (LP: #1027987)
  * debian/control: add pm-utils to libvirt-bin Suggests. (LP: #994476)
 -- Serge Hallyn <email address hidden> Thu, 26 Jul 2012 11:05:18 -0500

Changed in libvirt (Ubuntu Quantal):
status: Confirmed → Fix Released
Andreas Ntaflos (daff) wrote :

Is this going to make it into 12.04.2?

Serge Hallyn (serge-hallyn) wrote :

This will need to wait until the current package staged in precise-proposed is promoted to precise-updates, but here is the debdiff for precise.

Changed in libvirt (Ubuntu Precise):
status: New → Triaged
importance: Undecided → Medium
description: updated
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Andreas, or anyone else affected,

Accepted libvirt into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/libvirt/0.9.8-2ubuntu17.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in libvirt (Ubuntu Precise):
status: Triaged → Fix Committed
tags: added: verification-needed
Chris Halse Rogers (raof) wrote :

SRU team ping: this bug has been awaiting verification for a month now. It is blocking the fix for bug #1055658 from going into precise-updates.

Could someone please verify that the package in precise-proposed fixes this bug, so we may release this update?

Serge Hallyn (serge-hallyn) wrote :

I've tried to reproduce it today, but the bug wasn't happening with the version in -updates.

Ryan Tandy (rtandy) wrote :

I can't seem to reproduce it now either. I installed a couple of older revisions of both libvirt and udev but couldn't get it to happen. I wonder if a change somewhere else has affected this.

Serge Hallyn (serge-hallyn) wrote :

Hi Ryan,

can you confirm that at least the version currently in -proposed doesn't break anything for you?

Ryan Tandy (rtandy) wrote :

Well, I'm not exactly an advanced libvirt user; but libvirt 0.9.8-2ubuntu17.5 from proposed has been working well for all my use cases, and I'm still not seeing the symptoms this bug was originally reported about: "virsh pool-refresh vg0" and "udevadm settle" finish quickly. Fixed or at least invalidated, IMO.
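
For anyone who wants to repeat this kind of check, a small Python sketch for timing a command is shown below. The helper name timed_run and the 10-second limit are arbitrary choices for illustration, not part of any libvirt or udev tooling.

```python
import subprocess
import time


def timed_run(argv, limit_s=10.0):
    """Run argv and return (elapsed_seconds, finished_within_limit)."""
    start = time.monotonic()
    # Output is discarded; only the wall-clock duration matters here.
    subprocess.run(argv, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    elapsed = time.monotonic() - start
    return elapsed, elapsed <= limit_s
```

On an affected host, timed_run(["udevadm", "settle"]) or timed_run(["virsh", "pool-refresh", "vg0"]) would be expected to take minutes; with the fix, or on a system that no longer shows the problem, both should finish well within the limit.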

Serge Hallyn (serge-hallyn) wrote :

Thanks, Ryan.

Clint, based on comment #14 can I call this verification-done?

Brian Murray (brian-murray) wrote :

Adam and I discussed this in #ubuntu-release and think that it is safe to mark it as verification-done.

tags: added: verification-done
removed: verification-needed
Colin Watson (cjwatson) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 0.9.8-2ubuntu17.5

---------------
libvirt (0.9.8-2ubuntu17.5) precise-proposed; urgency=low

  * add patch Reduce-udevadm-settle-timeout-to-10-seconds.patch (copied from
    Debian tree) to fix 3 minute hang during pool-refresh when using LVM
    backed pools. (LP: #1027987)
  * add upstream patch command-avoid-double-close-bugs to avoid a race when
    starting multiple VMs concurrently. (LP: #1055658)
 -- Serge Hallyn <email address hidden> Wed, 03 Oct 2012 11:48:31 -0500

Changed in libvirt (Ubuntu Precise):
status: Fix Committed → Fix Released