Rebundled uec instance boot fail

Bug #551847 reported by Andy
This bug affects 1 person
Affects              Status        Importance  Assigned to  Milestone
euca2ools            Fix Released  Undecided   Unassigned
euca2ools (Ubuntu)   Fix Released  Low         Unassigned
  Lucid              Fix Released  Low         Unassigned

Bug Description

Binary package hint: plymouth

Steps to reproduce.

The master-controller and all nodes were updated today with apt-get upgrade.

#On master controller
wget http://uec-images.ubuntu.com/server/lucid/20100330/lucid-server-uec-amd64.tar.gz
uec-publish-tarball lucid-server-uec-amd64.tar.gz mybucket
euca-run-instances -k mykey -t c1.medium emi-BDB314B8

scp -i mykey.priv .euca/* ubuntu@10.16.2.177:/home/ubuntu/
ssh -i mykey.priv ubuntu@10.16.2.177

#On new running instance
. eucarc
euca-bundle-vol -c ${EC2_CERT} -k ${EC2_PRIVATE_KEY} -u ${EC2_USER_ID} --ec2cert ${EUCALYPTUS_CERT} --no-inherit --kernel eki-F4AD10E3 --ramdisk eri-090F114A -d /mnt -r x86_64 -p myimage -s 2048

euca-upload-bundle -b test3 -m /mnt/myimage.manifest.xml
euca-register test3/myimage.manifest.xml

#On Master
euca-run-instances emi-1C210CB6 -k mykey -t c1.medium
euca-get-console-output i-3FC807A8

After about 10 minutes euca-get-console-output still shows only:

...
...
Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
[ 3.822253] e1000: 0000:00:03.0: e1000_probe: (PCI:33MHz:32-bit) d0:0d:3f:c8:07:a8
[ 3.871628] e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 3.895344] EXT3 FS on sda1, internal journal

Tags: uec
Andy (abarringer)
tags: added: uec
Revision history for this message
Steve Langasek (vorlon) wrote :

Nothing here points to plymouth being involved in the failure. Reassigning to cloud-init.

affects: plymouth (Ubuntu) → cloud-init (Ubuntu)
Revision history for this message
Scott Moser (smoser) wrote :

I really would expect that this is a problem with euca-bundle-vol. Putting this in euca2ools. Sorry for the movement spam.

affects: cloud-init (Ubuntu) → euca2ools (Ubuntu)
Thierry Carrez (ttx)
summary: - uec instance boot fail
+ Rebundled uec instance boot fail
Changed in euca2ools (Ubuntu):
importance: Undecided → High
Revision history for this message
Neil Soman (neilsoman) wrote :

Is this still an issue? Any updates?

How do we reproduce this on the upstream side?

thanks.

Revision history for this message
Scott Moser (smoser) wrote :

I can confirm this on beta2 image (20100407.1).

Changed in euca2ools (Ubuntu):
status: New → Confirmed
Revision history for this message
Scott Moser (smoser) wrote :

Ok, I found the source of this problem. It's really a dupe of bug 341006, or at the very least, a fix to that bug that addressed the MAC addresses used by UEC would make this problem go away.

The reason there is no output after the kernel messages in the console above is that upstart is quietly waiting for eth0 to come up, but it never will: /etc/udev/rules.d/70-persistent-net.rules, copied from the bundled instance, has already given 'eth0' to that instance's network device, so the network device in the new instance gets a different name.
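To check for this on a suspect image, something like the following could be used. It is only a sketch: the helper name pinned_mac is made up here, and it assumes the rules file contains the usual udev-generated lines with an ATTR{address}=="..." match and a NAME="eth0" assignment.

```shell
#!/bin/sh
# Hypothetical helper: print the MAC address that a persistent-net rules
# file has bound to an interface name (default: eth0).
# Assumes udev-generated lines containing ATTR{address}=="..." and NAME="...".
pinned_mac() {
    rules=$1
    iface=${2:-eth0}
    [ -r "$rules" ] || return 1
    # Drop comment lines, keep rules naming the interface,
    # then extract the value of the ATTR{address} match.
    grep -v '^[[:space:]]*#' "$rules" \
        | grep "NAME=\"$iface\"" \
        | sed -n 's/.*ATTR{address}=="\([^"]*\)".*/\1/p' \
        | head -n 1
}
```

If `pinned_mac /etc/udev/rules.d/70-persistent-net.rules` prints a MAC that does not match any of the addresses under /sys/class/net/*/address, then eth0 is reserved for a device that is not present in this instance.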

So, that leaves us with the following solutions / workarounds:

a.) cloud-init (or some other package) could possibly add some rules or blacklists so the persistent-net rules were not written in images where it was installed. (I'd have to see if this would be possible)
b.) euca-bundle-vol could take it upon itself to remove that file
c.) the user can 'sudo rm -f /etc/udev/rules.d/70-persistent-net.rules' before invoking euca-bundle-vol
d.) the user can use a cleaner approach to re-bundling an image.
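
Workaround (c) could be wrapped up roughly as below. This is a sketch, not part of any package: the function name and the optional root prefix argument are invented for illustration, and the second file name (z25_persistent-net.rules) is the one mentioned later in this report.

```shell
#!/bin/sh
# Hypothetical helper: remove persistent udev net rules under an optional
# root prefix (no argument or "/" means the running system).
strip_persistent_net_rules() {
    root=${1:-}
    for f in 70-persistent-net.rules z25_persistent-net.rules; do
        # "${root%/}" strips one trailing slash so "/" and "" both yield /etc/...
        rm -f "${root%/}/etc/udev/rules.d/$f"
    done
}
```

Run as root on the instance, calling strip_persistent_net_rules with no argument right before euca-bundle-vol has the same effect as the 'rm -f' above.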

I generally don't like the idea behind euca-bundle-vol, and think that a tool like this is going to result in a never-ending series of hacks to take a booted instance and turn it into a clean image. I would much prefer that re-bundling was done by downloading the image (http://uec-images.ubuntu.com/releases), extracting it, mounting loopback, modifying, and unmounting.
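
The loopback approach might look something like this. It is a rough sketch under several assumptions: the function name is invented, it must run as root with loop-device support, and it assumes the extracted image is a bare filesystem image that 'mount -o loop' can mount directly.

```shell
#!/bin/sh
# Hypothetical helper: loop-mount an extracted image, modify it, unmount.
# Usage: modify_image IMAGE_FILE MOUNT_POINT   (needs root)
modify_image() {
    img=$1
    mnt=$2
    [ -f "$img" ] || { echo "no such image: $img" >&2; return 1; }
    mkdir -p "$mnt" || return 1
    mount -o loop "$img" "$mnt" || return 1
    # Make changes under "$mnt" here. As an example, drop any stale
    # persistent net rules so eth0 is free in the new instance.
    rm -f "$mnt/etc/udev/rules.d/70-persistent-net.rules"
    umount "$mnt"
}
```

After unmounting, the modified image would still have to be published and registered again; that step is not shown here.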

That said, I'll take a look and see if 'a' could be accomplished.

Revision history for this message
Scott Moser (smoser) wrote :

I'm changing the importance of this to 'low' given the easy workaround of 'rm /etc/udev/rules.d/70-persistent-net.rules'

Changed in euca2ools (Ubuntu):
importance: High → Low
Revision history for this message
Mathias Gug (mathiaz) wrote : Re: [Bug 551847] Re: Rebundled uec instance boot fail

On Thu, Apr 08, 2010 at 09:08:43PM -0000, Scott Moser wrote:
>
> a.) cloud-init (or some other package) could possibly add some rules or black lists so the persistent-net rules were not written in images where it was installed. (I'd have to see if this would be possible)

KVM uses the same vendor id for MAC addresses. Does EC2 use something similar? If so we could just blacklist MAC addresses to not be written to /etc/udev/rules.d/70-persistent-net.rules.

--
Mathias Gug
Ubuntu Developer http://www.ubuntu.com

Revision history for this message
Scott Moser (smoser) wrote :

On Fri, 9 Apr 2010, Mathias Gug wrote:

> Kvm uses the same vendor id for mac addresses. Does EC2 use something
> similar? If so we could just blacklist mac addresses to not be written
> to /etc/udev/rules.d/70-persistent-net.rules.

Read the mentioned bug for more information.

Thierry Carrez (ttx)
Changed in euca2ools (Ubuntu Lucid):
assignee: nobody → Scott Moser (smoser)
Revision history for this message
Scott Moser (smoser) wrote :

Just for reference, http://forum.eucalyptus.com/forum/multiple-network-interface-configuration discusses multiple network interfaces in a eucalyptus instance. So simple blacklisting there could possibly, though it is unlikely, lead to inconsistent naming (a kernel change isn't going to occur in Eucalyptus, and the kernel driver would be the same for both devices).

I was previously unaware of this. It's something that will need to be addressed. The current cloud-init expects that

Revision history for this message
Scott Moser (smoser) wrote :

The current cloud-init expects that 'eth0' will come up and have the metadata service on it.

Revision history for this message
Scott Moser (smoser) wrote :

I just realized that this same issue was fixed in ec2-bundle-vol in bug 308548 .

As such, it would be in keeping with that to make the same fix to euca-bundle-vol.

Revision history for this message
Scott Moser (smoser) wrote :

This is basically the bug 308548 fix moved to euca2ools. I've added an environment variable hack-around in the unlikely event that the user wishes for /etc/udev/rules.d/70-persistent-net.rules and /etc/udev/rules.d/z25_persistent-net.rules to be copied to the target. If that's the case, setting 'EUCA_BUNDLE_VOL_EMPTY_EXCLUDES=1' will not pre-fill the excludes list.
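
To illustrate the behaviour described, here is a shell rendering of the idea. It is not the actual patch (that lives in euca2ools itself); the function name is made up, and checking for the exact value '1' is an assumption.

```shell
#!/bin/sh
# Illustration only: print the paths that would be pre-filled into the
# excludes list, or nothing when EUCA_BUNDLE_VOL_EMPTY_EXCLUDES=1 is set.
default_excludes() {
    if [ "${EUCA_BUNDLE_VOL_EMPTY_EXCLUDES:-}" = "1" ]; then
        return 0   # user opted out: start with an empty excludes list
    fi
    printf '%s\n' \
        /etc/udev/rules.d/70-persistent-net.rules \
        /etc/udev/rules.d/z25_persistent-net.rules
}
```

So to keep the rules files in the bundle, the variable would be set in the environment of the euca-bundle-vol invocation, for example by prefixing the command with EUCA_BUNDLE_VOL_EMPTY_EXCLUDES=1.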

Revision history for this message
Scott Moser (smoser) wrote :

Quoting nurmi:

| As discussed this morning, we don't foresee any problem that would
| come up if this patch were put in place. Neil has looked at the
| patch, and agrees that it will solve this specific problem. In the
| future, several options will likely be explored, ranging from
| modifications to the UEC/EC2 VMs themselves to remove persistent net
| rules, to adding a general notion of 'distro excludes' to euca2ools.

Changed in euca2ools (Ubuntu Lucid):
assignee: Scott Moser (smoser) → nobody
status: Confirmed → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package euca2ools - 1.2-0ubuntu10

---------------
euca2ools (1.2-0ubuntu10) lucid; urgency=low

  * euca-bundle-vol: exclude persistent udev net device rules LP: #551847
 -- Scott Moser <email address hidden> Wed, 14 Apr 2010 16:42:22 -0400

Changed in euca2ools (Ubuntu Lucid):
status: In Progress → Fix Released
Revision history for this message
Scott Moser (smoser) wrote :
Changed in euca2ools:
status: New → Fix Committed
Changed in euca2ools:
status: Fix Committed → Fix Released