Ubuntu 10.04 LTS as guest freezes after xm restore

Bug #881542 reported by Michal Daszkowski
This bug affects 4 people
Affects         Status        Importance  Assigned to   Milestone
linux (Ubuntu)  Invalid       Undecided   Unassigned
Lucid           Fix Released  Medium      Stefan Bader

Bug Description

SRU Justification:

Impact: Since "xen: Use IRQF_FORCE_RESUME" was accepted into upstream stable for 2.6.32, a save/restore cycle will leave the guest in a halt state after restore.

Fix: Two upstream patches (one of those would be in the latest stable, the other had to be reverted as it broke compilation on some architectures). The suggested patch is the revised version.
* xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
* genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier

Testcase:
* Boot pv-guest
* Run xm save <id> <file> on the dom0
* Run xm restore <file> on the dom0
* Try to re-connect to the console with xm console <id>
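The testcase steps above can be sketched as a small dom0 helper. This is only an illustration: the function names and the idea of passing the domain by name are assumptions (the report uses `<id>`, and the numeric ID may differ after a restore), and the script does not detect the freeze itself. With the bug present, the final `xm console` step simply stays unresponsive.

```python
import subprocess


def save_restore_commands(domain, checkpoint):
    """Return the dom0 commands from the testcase, in order, as argv lists.

    `domain` and `checkpoint` are placeholders for <id> and <file>.
    """
    return [
        ["xm", "save", domain, checkpoint],   # suspend the guest to a file
        ["xm", "restore", checkpoint],        # bring it back from that file
        ["xm", "console", domain],            # try to re-attach to the console
    ]


def run_cycle(domain, checkpoint, runner=subprocess.check_call):
    """Run each step in order; the default runner raises if a step fails."""
    for argv in save_restore_commands(domain, checkpoint):
        runner(argv)
```

Calling `run_cycle("lucid-pv", "/tmp/lucid-pv.chk")` on the dom0 would walk through the cycle; both arguments here are made-up example values. Passing a different `runner` (for example `print`) shows the commands without executing them.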

---

Description: Ubuntu 10.04.3 LTS
Release: 10.04

When I restore a virtual machine after it has been saved (for example during shutdown), it either freezes completely or sometimes keeps working, but I cannot execute interactive commands.

This problem is related to the linux-image-2.6.32-34-server package, because when I use the 2.6.32-33 kernel, everything works.

The dom0 system is Debian 6.0.2; the Xen hypervisor version is 4.0.1-2.
---
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg: [ 11.980044] eth0: no IPv6 routers present
DistroRelease: Ubuntu 10.04
Lspci: Error: [Errno 2] No such file or directory
Lsusb: Error: [Errno 2] No such file or directory
Package: linux (not installed)
PciMultimedia:

ProcCmdLine: root=/dev/xvda2 ro root=/dev/xvda2 ro
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-34.77-server 2.6.32.44+drm33.19
Regression: Yes
Reproducible: Yes
Tags: lucid suspend resume regression-update needs-upstream-testing
Uname: Linux 2.6.32-34-server x86_64
UserGroups:

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 881542

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: lucid
Michal Daszkowski (orr) wrote : BootDmesg.txt

apport information

tags: added: apport-collected
description: updated
Michal Daszkowski (orr) wrote : ProcCpuinfo.txt

apport information

Michal Daszkowski (orr) wrote : ProcInterrupts.txt

apport information

Michal Daszkowski (orr) wrote : ProcModules.txt

apport information

Michal Daszkowski (orr) wrote : UdevDb.txt

apport information

Michal Daszkowski (orr) wrote : UdevLog.txt

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Michal Daszkowski (orr) wrote :

The previous version of this package, 2.6.32-33-server, works properly. The problems started after the upgrade to version 2.6.32-34. Sorry for the mistake in the first description.

Stefan Bader (smb) wrote :

Thanks for opening the bug. A few more pieces of information would be useful. From the dmesg I can see that the current failing version is 2.6.32-34.77. To get the exact version of the older kernel, it would be good if you could boot that one and attach the dmesg gathered there.
Also, in your mail you said you use PVM. Are you providing the kernel and initrd from the Xen cfg, or are you using pygrub or pvgrub? When the restore fails, are you able to connect to the Xen console (xm console), and does that show any errors?

I only had an HVM Lucid server install prepared, which I tried with the .34 kernel. But I am also running a 4.1.1 hypervisor, so I need to try getting that started in PVM mode. Meanwhile, there is an even newer kernel pending release. The packages can be found under the builds at https://launchpad.net/ubuntu/lucid/+source/linux/2.6.32-35.78. Maybe that already fixes the issue, or at least changes the behaviour enough that it no longer happens.

Stefan Bader (smb) wrote :

Alright, I can re-create the issue. And the 2.6.32-35 kernel is broken, too. Cannot say much more right now but I will be able to collect any further data locally.

Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
status: Confirmed → In Progress
Stefan Bader (smb) wrote :

The following commit is causing this. Given that it was only added because the Xen folks asked for the better implementation and fixing the issue seems to require an unknown number of additional patches with unknown side effects, I would get this reverted (and the previous patch adding the infrastructure).

Author: Thomas Gleixner <email address hidden>
Date: Sat Feb 5 20:08:59 2011 +0000
    xen: Use IRQF_FORCE_RESUME
    commit 676dc3cf5bc36a9e129a3ad8fe3bd7b2ebf20f5d upstream.

Stefan Bader (smb)
Changed in linux (Ubuntu Lucid):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu):
assignee: Stefan Bader (stefan-bader-canonical) → nobody
status: In Progress → Invalid
Stefan Bader (smb)
description: updated
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Lucid):
status: In Progress → Fix Committed
jagudo (jagudo) wrote :

Same problem on Lucid (10.04) with today's update: 2.6.32-36-generic

Davim (davim) wrote :

It's solved on 2.6.32-37-generic-pae. Please make sure this is not forgotten in the next update...

Stefan Bader (smb) wrote :

Should be fix-released since 2.6.32-37.80 (not auto updated because the same change via upstream stable replaced the patch with the bug reference).

Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released
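
The version numbers mentioned across this thread can be condensed into a rough check for whether a Lucid guest kernel falls in the affected range. This is a sketch under assumptions: the function and constant names are invented for illustration, the range (broken from 2.6.32-34, fixed from 2.6.32-37) is taken only from the comments above, and the -36 report came from a -generic kernel that may or may not be the same Xen issue.

```python
import re

FIRST_BROKEN_ABI = 34  # 2.6.32-34 is the first kernel reported as failing
FIRST_FIXED_ABI = 37   # 2.6.32-37.80 is reported as carrying the fix


def lucid_abi(release):
    """Extract the ABI number from a string such as '2.6.32-34-server'."""
    match = re.match(r"2\.6\.32-(\d+)", release)
    return int(match.group(1)) if match else None


def affected_by_restore_freeze(release):
    """True if the release is in the range this report describes as broken.

    Returns None for anything that is not a 2.6.32 Lucid kernel, since the
    report says nothing about other kernels.
    """
    abi = lucid_abi(release)
    if abi is None:
        return None
    return FIRST_BROKEN_ABI <= abi < FIRST_FIXED_ABI
```

Inside a guest, the input could come from `platform.release()` or `uname -r`; for example, `affected_by_restore_freeze("2.6.32-34-server")` falls in the affected range, while `"2.6.32-33-server"` and `"2.6.32-37-generic-pae"` do not.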