Official UEC image fails to boot

Bug #464418 reported by Torsten Spindler
This bug affects 4 people
Affects              Status   Importance  Assigned to  Milestone
eucalyptus (Ubuntu)  Invalid  High        Unassigned

Bug Description

Build Version/Date: 1.6~bzr931-0ubuntu7
Environment used for testing: Ubuntu 9.10

Summary: The official UEC image fails to boot

Steps to Reproduce:
 1. Download http://uec-images.ubuntu.com/releases/karmic/release/ubuntu-9.10-uec-amd64.tar.gz
 2. Upload to UEC
 3. Run image

Expected result: running image

Actual result: image drops to initramfs upon start

Tags: uec-images
Torsten Spindler (tspindler) wrote :

Console output for failed start attempt.

Nick Barcet (nijaba) wrote :

From your log it looks like /dev/sda1 (the root partition) is not found:

Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/sda1 does not exist. Dropping to a shell!

Wondering what is causing this...
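
The initramfs message above suggests checking the root= and rootdelay= boot arguments. As a minimal sketch (not part of the original report), the helper below extracts both values from a kernel command line string, such as the output of `cat /proc/cmdline` in the initramfs shell. The function names parse_cmdline and root_settings are hypothetical and chosen only for illustration.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a {key: value} dict.

    Bare flags such as "ro" or "quiet" map to None. Only the first "="
    splits key from value, so "root=UUID=abc" keeps "UUID=abc" intact.
    """
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        args[key] = value if sep else None
    return args


def root_settings(cmdline: str) -> tuple:
    """Return (root, rootdelay) as strings, or None where an argument is absent."""
    args = parse_cmdline(cmdline)
    return args.get("root"), args.get("rootdelay")
```

If root comes back as /dev/sda1 but that device is missing from `ls /dev`, the wait time is unlikely to be the problem; the disk presented to the guest is the more likely suspect.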

Thierry Carrez (ttx) wrote :

Boots for me... Is it something you can reproduce every time ?

Changed in eucalyptus (Ubuntu):
status: New → Incomplete
importance: Undecided → High
tags: added: uec-images
Torsten Spindler (tspindler) wrote : Re: [Bug 464418] Re: Official UEC image fails to boot

On Fri, 2009-10-30 at 10:07 +0000, Thierry Carrez wrote:
> Boots for me... Is it something you can reproduce every time ?

Yes, 4 out of 4 times. Can repeat it more often if needed.

Thierry Carrez (ttx) wrote :

Sounds like an error in image bundling/registration then. What steps did you follow to upload the image ?

Torsten Spindler (tspindler) wrote :

On Fri, 2009-10-30 at 10:50 +0000, Thierry Carrez wrote:
> Sounds like an error in image bundling/registration then. What steps did
> you follow to upload the image ?

https://help.ubuntu.com/community/UEC/BundlingImages

The only option I left out is the -r $IARCH

Dustin Kirkland  (kirkland) wrote :

Which -t machine type are you starting the instance with?

:-Dustin

Torsten Spindler (tspindler) wrote :

On Fri, 2009-10-30 at 13:33 +0000, Dustin Kirkland wrote:
> Which -t machine type are you starting the instance with?

-t m1.large

Dustin Kirkland  (kirkland) wrote :

Hmm, I'm running with -t c1.medium just fine here.

--
:-Dustin

Mathias Gug (mathiaz) wrote :

On Fri, Oct 30, 2009 at 11:02:13AM -0000, Torsten Spindler wrote:
> On Fri, 2009-10-30 at 10:50 +0000, Thierry Carrez wrote:
> > Sounds like an error in image bundling/registration then. What steps did
> > you follow to upload the image ?
>
> https://help.ubuntu.com/community/UEC/BundlingImages
>
> The only option I left out is the -r $IARCH
>

Could you provide the exact sequence of commands you've used to bundle/register the image?

--
Mathias Gug
Ubuntu Developer http://www.ubuntu.com

Torsten Spindler (tspindler) wrote :

Here are the relevant parts from my shell history:

  euca-bundle-image --kernel true -i karmic-uec-amd64-vmlinuz-virtual
  euca-upload-bundle -m /tmp/karmic-uec-amd64-vmlinuz-virtual.manifest.xml
  euca-upload-bundle -m /tmp/karmic-uec-amd64-vmlinuz-virtual.manifest.xml -b karmic-uec-kernel
  euca-register karmic-uec-kernel/karmic-uec-amd64-vmlinuz-virtual.manifest.xml
  euca-bundle-image --ramdisk true karmic-uec-amd64-initrd-virtual
  euca-bundle-image --ramdisk true -i karmic-uec-amd64-initrd-virtual
  euca-upload-bundle -b karmic-uec-ramdisk -m /tmp/karmic-uec-amd64-initrd-virtual.manifest.xml
  euca-register karmic-uec-ramdisk/karmic-uec-amd64-initrd-virtual.manifest.xml
  euca-bundle-image --ramdisk eri-3A9E1A79 --kernel eki-3E0D1AA0 -i karmic-uec-amd64.img
  euca-upload-bundle -b uec-karmic-image -m /tmp/karmic-uec-amd64.img.manifest.xml
  euca-register uec-karmic-image/karmic-uec-amd64.img.manifest.xml
  euca-run-instances -k mykey emi-D4B814FB -t m1.large
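
The history above repeats the same bundle, upload, register pattern three times (kernel, ramdisk, root image). As a rough sketch, the helper below only builds those command lines as argument lists so the sequence can be reviewed or logged before anything is run; it does not execute them. It uses only the flags that appear in the history, and it assumes the manifest lands in /tmp as it did there. The helper name bundle_and_register is hypothetical, and the eki/eri identifiers must be taken from the output of the earlier euca-register calls.

```python
def bundle_and_register(path, bucket, extra_bundle_args):
    """Return the bundle/upload/register commands for one artifact.

    Each command is an argv list. Nothing is executed here.
    Assumption: euca-bundle-image writes <name>.manifest.xml to /tmp,
    as seen in the shell history above.
    """
    name = path.rsplit("/", 1)[-1]
    manifest = name + ".manifest.xml"
    return [
        ["euca-bundle-image", *extra_bundle_args, "-i", path],
        ["euca-upload-bundle", "-b", bucket, "-m", "/tmp/" + manifest],
        ["euca-register", bucket + "/" + manifest],
    ]


# Kernel and ramdisk are bundled first; their registered IDs feed the image step.
kernel_cmds = bundle_and_register(
    "karmic-uec-amd64-vmlinuz-virtual", "karmic-uec-kernel", ["--kernel", "true"])
ramdisk_cmds = bundle_and_register(
    "karmic-uec-amd64-initrd-virtual", "karmic-uec-ramdisk", ["--ramdisk", "true"])
# The IDs below are the ones from this report; they differ on every cloud.
image_cmds = bundle_and_register(
    "karmic-uec-amd64.img", "uec-karmic-image",
    ["--ramdisk", "eri-3A9E1A79", "--kernel", "eki-3E0D1AA0"])
```

Each list can be handed to subprocess.run(cmd, check=True) once the printed sequence looks right.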

Nick Barcet (nijaba)
Changed in eucalyptus (Ubuntu):
status: Incomplete → New
Torsten Spindler (tspindler) wrote :

This seems to be specific to the cloud I run at home. I have the image running on a different cloud and it works fine there.

Torsten Spindler (tspindler) wrote :

I can no longer reproduce the problem with the cloud I currently have running. I suggest marking the bug as Invalid.

Thierry Carrez (ttx) wrote :

OK, please reopen if you can find what was specific to that "home" cloud :)

Changed in eucalyptus (Ubuntu):
status: New → Invalid
Boris Devouge (bdevouge) wrote :

I am affected by this too and see a 50% failure rate when running images: one out of every two instances launched from my 'custom' image exhibits the same issue.

Kindjal (kindjal) wrote :

I too am affected by this.

ubuntu 9.10
eucalyptus 1.6~bzr931-0ubuntu7.5

Is there a fix? Is this a misconfiguration?

Scott Moser (smoser) wrote :

Kindjal, can you give more information on this setup ? Can you also attach a console log ?

The failure just doesn't make any sense to me, and none of Dustin, Thierry, or myself have seen it.

Kindjal (kindjal) wrote :

Scott,

Thank you for your interest.

My setup consists of two physical hosts, one acting as Cloud Controller (CC), the other acting as Node Controller (NC). They were both installed via the Ubuntu 9.10 netboot installation:

  http://cdimage.ubuntu.com/netboot/

Note that the netboot install does not offer a boot-time splash screen with an "Install Ubuntu Enterprise Cloud" option. Rather, you do a "normal" install, and when you get to the software selection step (tasksel), there are three options related to cloud:

Cloud computing cluster
Cloud computing node
Ubuntu Enterprise Cloud (instance)

I had suspected that choosing:

Ubuntu Enterprise Cloud (instance)

would lead to a subsequent prompt of "cluster" or "node", but it did not. The system resulting from that selection would not boot. You get to Grub (grub2 I believe), and when the kernel is loaded you get:

 invalid magic number

Investigating this leads me to believe there's a problem with grub2 loading ext4 filesystems. It looks like this bug:

https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/487689

Instead of following that path, I re-installed using the tasksel choices:

Cloud computing cluster
Cloud computing node

These installs complete without error.

I proceeded through configuration steps as per:

https://help.ubuntu.com/community/UEC/CDInstall

Including the selection of a guest OS image on the NC.

For some time, I tried running guest VMs using Xen. This led me down a two-day diversion of discovering Ubuntu's departure from Xen support, lack of native kernel support with the Ubuntu kernel, the installation of a Debian kernel, issues with Xen-3.3 and that kernel, building Xen-3.4.2, discovery of issues with Xen-3.4.2, patching it, discovering issues with the Debian kernel and grub2, all of which led finally to a system that would boot, but fail to mount its root file system. I aborted that path and started over focusing on KVM.

In all, I believe I tried install and setup 5 or 6 times, each time resulting in setups that weren't quite right, mostly due to networking. My CC and NC are not on the same network segment, which led to issues with MANAGED networking. It took me a while to land on SYSTEM as a network mode, having to manually configure a bridge on the NC.

Another side note, in httpd-nc_error_log I see:

ERROR: Disallowed command //usr/share/eucalyptus/populate_arp.pl

I find:
https://bugs.launchpad.net/ubuntu/+source/eucalyptus/+bug/461829

I attempted to resolve this via a patch to wrappers.conf:

root@blade7-2-2:/etc/eucalyptus# diff -u wrappers.conf.orig wrappers.conf
--- wrappers.conf.orig 2010-03-26 10:10:52.017817733 -0500
+++ wrappers.conf 2010-03-26 10:11:31.434067644 -0500
@@ -47,3 +47,5 @@
 powerwake /usr/bin/powerwake
 virsh /usr/bin/virsh
 which /usr/bin/which
+# Add by hand xref https://bugs.launchpad.net/ubuntu/+source/eucalyptus/+bug/461829
+populate_arp.pl /usr/share/eucalyptus/populate_arp.pl 0

But I still see the errors.

I discovered that eucalyptus is an upstart job, but eucalyptus-cc is an init.d script.

I discovered that eucalyptus-cc needs "cleanstop" and that that really means ...

Kindjal (kindjal) wrote :

Here's a console.log

Kindjal (kindjal) wrote :

Here's another console.log with the other failure to mount fs.

Scott Moser (smoser) wrote :

> I had suspected that choosing:
>
> Ubuntu Enterprise Cloud (instance)

You definitely do not want 'instance'. Instance is the thing running
inside the cloud. (I.e., this would help you get package selection for an
"image" to run in a VM inside the cloud.)

> system that would boot, but fail to mount its root file system. I
> aborted that path and started over focusing on KVM.

Sorry for the lost time. KVM is definitely the "supported path".

> In all, I believe I tried install and setup 5 or 6 times, each time
> resulting in setups that weren't quite right, mostly due to networking.
> My CC and NC are not on the same network segment, which led to issues
> with MANAGED networking. It took me a while to land on SYSTEM as a
> network mode, having to manually configure a bridge on the NC.

While I would not expect you to fail where you did, the UEC images will
not function correctly in any networking mode that lacks a metadata
service (i.e., STATIC or SYSTEM)
(http://open.eucalyptus.com/wiki/EucalyptusNetworking_v1.5.2).

The images are built to obtain information about themselves from the
metadata service, and will not function if it is not present.

Another thing to note is that images for EC2/Eucalyptus will expect to
find a root filesystem at /dev/sda1 , not at /dev/sda. If you just boot
an image in kvm with '-hda disk.img' you will quite likely fail with "can't
find root" as the /etc/fstab will be expecting /dev/sda1 and you will have
presented a partition image as a disk. Eucalyptus takes the partition
image and turns it into a "full disk" with a partition table.

I believe the problems you're running into (at this point at least) are
more with eucalyptus configuration than with the images themselves.
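
The partition-image versus full-disk distinction described above can be checked from the first few kilobytes of an image file. The following is a heuristic sketch only, not how Eucalyptus itself decides: it looks for an MBR signature with at least one used partition entry, and otherwise for the ext2/3/4 superblock magic (0xEF53 at byte offset 1080). The function names are hypothetical, and the check assumes an MBR-style table and an ext filesystem.

```python
import struct

SECTOR = 512
MBR_SIGNATURE = b"\x55\xaa"    # last two bytes of sector 0
EXT_MAGIC_OFFSET = 1024 + 56   # s_magic field inside the ext2/3/4 superblock
EXT_MAGIC = 0xEF53


def has_partition_table(data: bytes) -> bool:
    """True if sector 0 carries an MBR signature and a non-empty partition entry."""
    if len(data) < SECTOR or data[510:512] != MBR_SIGNATURE:
        return False
    # Four 16-byte entries start at offset 446; byte 4 of each is the type.
    return any(data[446 + 16 * i + 4] != 0 for i in range(4))


def has_ext_superblock(data: bytes) -> bool:
    """True if an ext2/3/4 superblock magic sits where a bare filesystem puts it."""
    if len(data) < EXT_MAGIC_OFFSET + 2:
        return False
    (magic,) = struct.unpack_from("<H", data, EXT_MAGIC_OFFSET)
    return magic == EXT_MAGIC


def classify_image(data: bytes) -> str:
    """Return "disk", "partition", or "unknown" for the leading bytes of an image."""
    if has_partition_table(data):
        return "disk"
    if has_ext_superblock(data):
        return "partition"
    return "unknown"
```

Reading the first 2048 bytes of the image file and passing them to classify_image is enough. A result of "partition" means that booting the file directly with '-hda disk.img' would present the filesystem as /dev/sda rather than /dev/sda1.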
