qemu-img qcow2 conversion hangs on large core systems

Bug #1457639 reported by dann frazier
This bug affects 3 people
Affects               Status        Importance  Assigned to  Milestone
Ubuntu Cloud Archive  Fix Released  Undecided   Unassigned
  Kilo                Triaged       High        Unassigned
  Liberty             Fix Released  High        Unassigned
qemu (Ubuntu)         Fix Released  High        Unassigned
  Vivid               Won't Fix     High        Unassigned
  Wily                Fix Released  High        Unassigned

Bug Description

[Impact]
qemu-img frequently hangs when converting a file to qcow2 on a system with a large number of cores.

[Test Case]
On a system with a large number of cores (my test system has 48), run:

wget http://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-arm64-disk1.img -O disk.img
for i in $(seq 1 10); do qemu-img convert -O qcow2 disk.img disk.qcow2; done

Mine always hangs within the first 3 iterations.
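
Because a hung conversion blocks the loop above indefinitely, it can help to run each attempt under a time limit and count how many attempts hit it. The following is only a sketch: the helper name count_hangs and the 600-second limit are hypothetical, and it assumes coreutils timeout is available and that the hung qemu-img process exits on SIGTERM.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly, each attempt under a time
# limit, and print how many attempts hit the limit.
# usage: count_hangs ITERATIONS LIMIT_SECONDS COMMAND [ARGS...]
count_hangs() {
    iterations=$1
    limit=$2
    shift 2
    hangs=0
    i=1
    while [ "$i" -le "$iterations" ]; do
        # Send the command's own stdout to stderr so that the only thing
        # printed on stdout is the final count.
        timeout "$limit" "$@" >&2
        # coreutils timeout exits with status 124 when the limit is reached.
        if [ "$?" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "$hangs"
}

# Example invocation (the 600-second limit is an arbitrary assumption and
# should comfortably exceed a normal conversion time on the test system):
# count_hangs 10 600 qemu-img convert -O qcow2 disk.img disk.qcow2
```

A non-zero count means at least one conversion did not finish within the limit, which on an affected system most likely indicates the hang rather than a slow run.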

[Regression Risk]
The fix is a clean cherry-pick from upstream, which should give us a clear path to resolution if regressions are found.

Tags: hs-arm64
dann frazier (dannf)
Changed in qemu (Ubuntu Vivid):
status: New → Confirmed
importance: Undecided → High
dann frazier (dannf) wrote :

I pushed a fix based on the debian-unstable branch to:
  git://git.launchpad.net/~dannf/qemu

I haven't been able to reproduce this issue on Debian unstable, so I didn't file a Debian bug. However, as the upstream commit log describes, reproduction is very sensitive to the environment, so failing to reproduce it doesn't imply the bug is *not* there. I'm therefore assuming it makes more sense to get the fix in through Debian than to carry an Ubuntu-specific patch.

dann frazier (dannf) wrote :

Debian has been updated to 2.3 which includes the upstream fix, so ignore my last comment. I assume wily will get a 2.3 upload soon, and that will fix the issue there.

I have pushed a branch for vivid to git://git.launchpad.net/~dannf/qemu (lp1457639-vivid)

Serge Hallyn (serge-hallyn) wrote :

This should be fixed in 1:2.3+dfsg-4ubuntu1

Changed in qemu (Ubuntu Wily):
status: Confirmed → Fix Released
William Grant (wgrant) wrote :

We can reproduce this naturally within a few hours on ppc64el with a backport of vivid's qemu.

Changed in qemu (Ubuntu Vivid):
assignee: nobody → Serge Hallyn (serge-hallyn)
dann frazier (dannf)
Changed in qemu (Ubuntu Vivid):
status: Confirmed → In Progress
Changed in qemu (Ubuntu Vivid):
assignee: Serge Hallyn (serge-hallyn) → nobody
Haw Loeung (hloeung) wrote :

Any chance of having the fixes (or a fixed version) backported to trusty? We need this for the HP Moonshot hardware we're using in ScalingStack, and we are currently working around it with a backport of the version in Vivid [1].

[1] https://launchpad.net/~canonical-is-sa/+archive/ubuntu/arm64-infra-workarounds/+packages

Serge Hallyn (serge-hallyn) wrote :

I don't see where anyone has identified which patches actually fixed this, and there is no obvious message in git log, so offhand I'd say chances are slim. Could you use the cloud archive, which should have the newer versions?

dann frazier (dannf) wrote :

@Serge: I identified the patch that fixes this - see comment #2.

dann frazier (dannf) wrote :

Sorry for the additional post - I just noticed that @Haw was asking about trusty.

As I recall, the issue fixed by the patch I provided was introduced in vivid, and was fixed before wily. In other words, only vivid should be impacted. I'm surprised to hear that it impacts trusty.

William Grant (wgrant) wrote : Re: [Bug 1457639] Re: qemu-img qcow2 conversion hangs on large core systems

On 28/10/15 08:42, dann frazier wrote:
> Sorry for the additional post - I just noticed that @Haw was asking
> about trusty.
>
> As I recall, the issue fixed by the patch I provided was introduced in
> vivid, and was fixed before wily. In other words, only vivid should be
> impacted. I'm surprised to hear that it impacts trusty.

It's the kilo cloud-archive version that we're using, which is a
backport of vivid's. It's vivid that needs the SRU.

 affects cloud-archive

Haw Loeung (hloeung)
no longer affects: qemu (Ubuntu Trusty)
Changed in qemu (Ubuntu Vivid):
status: In Progress → Won't Fix
James Page (james-page)
Changed in cloud-archive:
status: New → Fix Released