cloud-image VM causes kernel panic if image is resized

Bug #1123220 reported by Neil Wilson
This bug affects 2 people
Affects: cloud-initramfs-tools (Ubuntu)
Status: Won't Fix
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

I'd very much like to use the official cloud images from cloud-images.ubuntu.com on the Brightbox cloud, but I can never get them to boot properly in KVM.

If the disk is resized then the VM always crashes with a kernel panic.

The process is as follows, on a Precise host running libvirt:

- wget http://cloud-images.ubuntu.com/quantal/current/quantal-server-cloudimg-amd64-disk1.img
- sudo cp quantal-server-cloudimg-amd64-disk1.img /var/lib/libvirt/images/test.img
- sudo qemu-img resize /var/lib/libvirt/images/test.img 20G
- virsh create test.xml

If you view the console in virt-manager you'll find that the kernel has panicked on the disk remount after resizing.

Issuing a 'virsh reset srv-7867c' causes the boot to progress normally. Similarly, if the disk isn't resized, the boot progresses normally.

The same problem occurs:

- with all the standard images, with varying degrees of feedback in the failure message
- with the RHEL6 version of libvirt and kvm
- when virtio is removed from the disk stanza and replaced with IDE emulation

Any ideas?

Related bugs:
 * bug 1122245: booting from a cloud image hangs until virsh console is used
 * bug 1061977: Machine fails to commission when console=ttyS0 is present on kernel opts
 * bug 1016695: add console=tty1 to cloud-image kernel boot parameters

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: cloud-initramfs-growroot 0.4ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-37.58-virtual 3.2.35
Uname: Linux 3.2.0-37-virtual x86_64
ApportVersion: 2.0.1-0ubuntu17.1
Architecture: amd64
Date: Tue Feb 12 15:48:19 2013
MarkForUpload: True
PackageArchitecture: all
SourcePackage: cloud-initramfs-tools
UpgradeStatus: No upgrade log present (probably fresh install)

Scott Moser (smoser)
no longer affects: ubuntu-on-ec2
Scott Moser (smoser)
affects: cloud-initramfs-tools (Ubuntu) → ubuntu
tags: added: cloud-images cloud-images-build
affects: ubuntu → cloud-initramfs-tools (Ubuntu)
Scott Moser (smoser) wrote :

This is really interesting.
It's easily reproducible with:
 wget http://cloud-images.ubuntu.com/quantal/current/quantal-server-cloudimg-amd64-disk1.img -O disk.img
 qemu-img resize disk.img 20G
 kvm -serial none -drive file=disk.img,if=virtio -curses -m 256

The problem is "fixed" if you remove '-serial none' from the kvm command line, and thus get the default serial device that kvm adds.

The original test.xml can be fixed in a similar manner by simply adding:
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>

It can also be fixed by mounting the image and removing 'console=ttyS0' from the kernel command lines in /boot/grub/grub.cfg.
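
A minimal sketch of the core step in that last workaround: dropping one console= argument from a kernel command line. The function name strip_console is hypothetical, and writing the result back into grub.cfg inside the mounted image is deliberately left out, so treat this as an illustration rather than a complete fix:

```shell
# strip_console NAME TOKEN...: print the tokens, minus any console=NAME
# argument (with or without ",options"). Hypothetical helper, not part
# of cloud-initramfs-tools.
strip_console() {
    drop=$1
    shift
    out=""
    for tok in "$@"; do
        case "$tok" in
            console="$drop"|console="$drop",*) continue ;;
        esac
        # join the surviving tokens with single spaces
        out="${out:+$out }$tok"
    done
    printf '%s\n' "$out"
}

strip_console ttyS0 root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0
```

Matching whole tokens rather than substrings avoids touching a different device such as ttyS01.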

It's hard to see, because observing it makes it work. But I suspect that the root of the problem is that cloud-initramfs-growpart is writing to stdout, which is redirected to /dev/console, and /dev/console writes are going to the nonexistent device 'ttyS0' (as the kernel command line tells it to).

Those writes are failing, and something is then leaving the disk in a bad state.

Scott Moser (smoser) wrote :

I'm going to mark this 'low', as the workaround is easy: attach a serial device.

Changed in cloud-initramfs-tools (Ubuntu):
importance: Undecided → Low
status: New → Triaged
description: updated
Scott Moser (smoser) wrote :

It seems to me that 'echo hi mom' from the initramfs (or from an upstart job with 'output console') should never result in failure. This is failing in the above situation because the kernel cmdline has 'console=tty1 console=ttyS0' on it, but ttyS0 is not a valid device.
So, it seems the following are potential fixes to the issue:
 a.) kernel doing better sanity checking on console= argument and not attaching /dev/console to something that is going to fail
 b.) initramfs and upstart verifying that /dev/console can be written to, and if not, then redirecting output to somewhere that *is* writable (possibly /run/ or /dev/null).

Clearly, I could change cloud-initramfs-growroot to not fail when it fails to write to its stdout, but that does not seem like a general fix.

For initramfs, we could do something very early in the initramfs like this:

if ! echo "initramfs running" > /dev/console; then
  # /dev/console is not writable; try each console= device from the cmdline
  read cmdline < /proc/cmdline
  consoles=""
  for tok in $cmdline; do
    # skip tokens that are not console= arguments
    [ "${tok#console=}" != "${tok}" ] || continue
    # strip the "console=" prefix and any ",options" suffix
    tok=${tok#console=}; tok=${tok%%,*}
    # reverse order on cmdline
    consoles="$tok $consoles"
  done
  failed=""
  found=""
  for console in $consoles; do
    dev="/dev/${console#/dev/}"
    if echo "initramfs running" > "$dev"; then
      found=$dev
      break
    fi
    # remember every device that could not be written to
    failed="${failed:+${failed} }$dev"
  done
  if [ -n "$found" ]; then
    exec > "$found" 2>&1
  else
    exec > "/run/initramfs.log" 2>&1
  fi
  echo "WARN: Failed write to /dev/console${failed:+ and ${failed}}"
fi
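
The console= parsing in that snippet can be tried outside an initramfs. Below is a minimal sketch, assuming the command line is passed in as a string instead of being read from /proc/cmdline; the function name list_consoles is hypothetical. It prints the devices last-listed first, since the kernel uses the last console= argument for /dev/console:

```shell
# list_consoles CMDLINE: print the console= devices found in CMDLINE,
# last-listed first. Hypothetical helper for illustration only.
list_consoles() {
    consoles=""
    for tok in $1; do
        case "$tok" in
            console=*) ;;
            *) continue ;;
        esac
        tok=${tok#console=}   # drop the "console=" prefix
        tok=${tok%%,*}        # drop options such as ",115200n8"
        consoles="$tok${consoles:+ $consoles}"
    done
    printf '%s\n' "$consoles"
}

list_consoles "root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0,115200n8"
# prints: ttyS0 tty1
```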

Scott Moser (smoser) wrote :

Oh, and there is more information on bug 1061977, where I attempted to do 'a' as a solution, or find a way to re-assign /dev/console from user-space.

Scott Moser (smoser) wrote :

In the snippet above, it would also make good sense to write the warning to /dev/kmsg.
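
A minimal sketch of such a write, assuming a hypothetical helper named kmsg_warn. The "<4>" prefix is the kernel log level for warnings; the optional second argument exists only so the function can be exercised without root by pointing it at an ordinary file:

```shell
# kmsg_warn MESSAGE [TARGET]: write a warning-level line to the kernel log.
# TARGET defaults to /dev/kmsg and is overridable for testing.
# Hypothetical helper, not part of cloud-initramfs-tools.
kmsg_warn() {
    target=${2:-/dev/kmsg}
    # the braces keep a failed open of $target from printing a shell error
    { printf '<4>initramfs: %s\n' "$1" > "$target"; } 2>/dev/null
}

# Example: kmsg_warn "Failed write to /dev/console"
```

The function returns non-zero if the target cannot be written, so a caller can fall back to the log file instead.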

Scott Moser (smoser) wrote :

I'm adding an email thread that I had with smb, apw, and slangasek.
It discusses the issue and potential solutions in some detail.

Scott Moser (smoser)
Changed in cloud-init (Ubuntu):
status: New → Triaged
importance: Undecided → Low
description: updated
Scott Moser (smoser) wrote :

I'm going to mark this 'Won't Fix' in cloud-initramfs-tools and remove the cloud-init task.

no longer affects: cloud-init (Ubuntu)
Changed in cloud-initramfs-tools (Ubuntu):
status: Triaged → Won't Fix