[SRU] non-interactive grub updates broken for /dev/xvda devices on Cloud-Images/Cloud-init

Bug #1336855 reported by Ben Howard
This bug affects 6 people
Affects              Status        Importance  Assigned to  Milestone
cloud-init           Fix Released  Medium      Unassigned
cloud-init (Ubuntu)  Fix Released  Critical    Unassigned
  Declined for Saucy by Scott Moser
  Precise            Fix Released  Critical    Unassigned
  Trusty             Fix Released  Critical    Unassigned
  Utopic             Fix Released  Critical    Unassigned

Bug Description

[SRU JUSTIFICATION]

[IMPACT] Cloud-init, as part of the first boot, configures grub-pc to set the device that grub should install to. However, on HVM instances, /dev/xvda and /dev/xvda1 are not considered (only sda, sda1, vda and vda1). Since AWS HVM instances and Xen use /dev/xvdX devices, any Grub update with an ABI change will break these instances, rendering them unable to boot.

[FIX] Cloud-init has been patched to understand /dev/xvda devices and set the correct grub-pc/install_devices. Further, cloud-init's postinst has been patched to repair existing instances that might be affected by this bug.

[Test Case 1]
1. Boot HVM instance store AMI ami-90b156f8 (us-east-1)
2. Update grub
3. Update cloud-init from -proposed
4. Reboot instance
5. instance should come back up

[Test Case 2 -- 12.04 Only]
1. Boot HVM instance store AMI ami-90b156f8 (us-east-1)
2. run "cloud-init-cfg grub_dpkg --frequency always"
3. run "debconf-show grub-pc", confirm that grub-pc/install_devices is /dev/xvda
4. update grub
5. Reboot
6. instance should come back up

[Test Case 3 -- 14.04 Only]
1. Boot HVM instance store AMI ami-1f958c76 (us-east-1)
2. run "cloud-init single grub_dpkg --frequency always"
3. run "debconf-show grub-pc", confirm that grub-pc/install_devices is /dev/xvda
4. update grub
5. Reboot
6. instance should come back up

[Test Case 4]
1. Install from -proposed
2. Simulate a first-run:
   echo "grub-pc grub-pc/install_devices select /dev/sda" | debconf-set-selections
3. Run: cloud-init single --name=grub-dpkg --frequency=always
4. Run: debconf-show grub-pc
5. confirm that /dev/xvda is shown as the install device
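
The confirmation step in the test cases above can be scripted. The following is a minimal sketch, not part of the SRU: the function name `parse_debconf_show` is hypothetical, and the assumed output layout (one `name: value` pair per line, with an optional leading `* ` on questions that have been seen) is not shown anywhere in this report.

```python
def parse_debconf_show(text):
    """Map question names to values from `debconf-show`-style output.

    Assumed layout (not taken from this bug report): one question per
    line as "name: value", optionally prefixed with "* ".
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("* "):
            # Drop the marker that flags questions already seen.
            line = line[2:]
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values
```

Feeding it the captured output of `debconf-show grub-pc` and checking that the `grub-pc/install_devices` entry equals `/dev/xvda` would correspond to the last step of each test case.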

----ORIGINAL report----

It looks like a recent update to grub or the kernel on 12.04 is breaking
unattended installs on EC2 for HVM instances.

You can reproduce the problem by doing the following:

region: us-east-1
virtualization type: HVM (e.g. r3.xlarge)
AMI ID: ami-7a916212

dpkg --configure -a
apt-get update
apt-get install -y ruby ruby-dev libicu-dev libssl-dev libxslt-dev libxml2-dev monit
apt-get dist-upgrade -y


tags: added: cloud-images precise
Revision history for this message
Robert C Jennings (rcj) wrote :

I can recreate as well.

AMI ID: ubuntu-precise-12.04-amd64-server-20140606 (ami-a69665ce)
Availability zone: us-east-1b
Instance type: t2.small
Root device type: ebs (ssd)

# sudo apt-get update
# sudo apt-get upgrade
 <recreate>

Revision history for this message
Robert C Jennings (rcj) wrote :

script output from a recreate

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

The relevant piece here is that the grub installation process is not selecting the right device.

Setting up grub-pc-bin (1.99-21ubuntu3.15) ...
Setting up grub-pc (1.99-21ubuntu3.15) ...
/usr/sbin/grub-probe: error: cannot stat `/dev/sda'.
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-65-virtual
Found initrd image: /boot/initrd.img-3.2.0-65-virtual
Found linux image: /boot/vmlinuz-3.2.0-64-virtual
Found initrd image: /boot/initrd.img-3.2.0-64-virtual
Found memtest86+ image: /boot/memtest86+.bin
done

Revision history for this message
Robert C Jennings (rcj) wrote :
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Narrowed this down to Cloud-init. Cloud-init is not selecting the right device, and is reconfiguring Grub to use /dev/sda:
Jul 2 16:53:10 ip-10-169-38-57 [CLOUDINIT] __init__.py[DEBUG]: handling grub-dpkg with freq=None and args=[]
Jul 2 16:53:10 ip-10-169-38-57 [CLOUDINIT] cc_grub_dpkg.py[DEBUG]: setting grub debconf-set-selections with '/dev/sda','false'
Jul 2 16:53:10 ip-10-169-38-57 [CLOUDINIT] __init__.py[DEBUG]: handling apt-pipelining with freq=None and args=[]

And the mounts:
$ mount
/dev/xvda1 on / type ext4 (rw)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
/dev/xvdb on /mnt type ext3 (rw,_netdev)

Changed in ubuntu:
status: New → Confirmed
importance: Undecided → High
affects: ubuntu → cloud-init (Ubuntu)
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

The reason for this happening is quite simple: /dev/xvda1 is only considered as a potential grub device if /dev/xvda does not exist. Since it does appear on HVM instances, the logic is invalid.

The following works:

ben@prongs:/work/patching/cloud-init/cloudinit/config$ bzr diff
=== modified file 'cloudinit/config/cc_grub_dpkg.py'
--- cloudinit/config/cc_grub_dpkg.py 2014-02-12 19:56:55 +0000
+++ cloudinit/config/cc_grub_dpkg.py 2014-07-02 17:00:47 +0000
@@ -46,7 +46,8 @@
             idevs_empty = "false"
         if idevs is None:
             idevs = "/dev/sda"
-            for dev in ("/dev/sda", "/dev/vda", "/dev/sda1", "/dev/vda1"):
+            for dev in ("/dev/sda", "/dev/vda", "/dev/xvda", "/dev/sda1",
+                        "/dev/vda1", "/dev/xvda1"):
                 if os.path.exists(dev):
                     idevs = dev
                     break
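
The effect of the patched loop can be tried outside cloud-init with a small standalone sketch. The function name `pick_grub_device` and its injectable `exists` parameter are hypothetical and exist only for illustration; the candidate order and the `/dev/sda` fallback follow the diff above.

```python
import os

# Candidate order as in the patched loop: whole disks first, then the
# first partition of each disk.
CANDIDATES = ("/dev/sda", "/dev/vda", "/dev/xvda",
              "/dev/sda1", "/dev/vda1", "/dev/xvda1")


def pick_grub_device(exists=os.path.exists, candidates=CANDIDATES):
    """Return the first candidate that exists, else the /dev/sda default."""
    for dev in candidates:
        if exists(dev):
            return dev
    # Same fallback the module uses when nothing matches.
    return "/dev/sda"
```

Passing a fake `exists` that only knows about `/dev/xvda` and `/dev/xvda1` returns `/dev/xvda`, while the pre-patch candidate tuple falls through to `/dev/sda` on the same machine, which matches the behaviour seen in the log above.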

Changed in cloud-init (Ubuntu):
importance: High → Critical
Revision history for this message
Robert C Jennings (rcj) wrote :

Confirmed your fix Ben. Changed /usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/cc_grub_dpkg.py, removed /var/lib/cloud/instance/sem/config-grub-dpkg, and ran cloud-init-cfg grub-dpkg. The cloud-init log confirmed xvda was selected and an upgrade of grub worked.

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Here is a simple work around for the time being:

    [ -b /dev/xvda ] && {
          echo "grub-pc grub-pc/install_devices string /dev/xvda" | \
             debconf-set-selections
          echo "grub-pc grub-pc/install_devices_empty boolean false" | \
             debconf-set-selections
    }

(This will check if /dev/xvda is a block device and then tell dpkg to use the right device for installing grub)
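
For anyone driving this from provisioning code rather than a shell, the same workaround can be sketched in Python. This is an illustration under stated assumptions, not something tested in this bug: the function names are hypothetical, and `apply_workaround` must run as root on a system where `debconf-set-selections` is installed.

```python
import os
import stat
import subprocess


def is_block_device(path):
    """Equivalent of the shell test `[ -b path ]`."""
    try:
        return stat.S_ISBLK(os.stat(path).st_mode)
    except OSError:
        return False


def grub_selections(device):
    """Build the two debconf lines used by the workaround."""
    return (
        "grub-pc grub-pc/install_devices string %s\n"
        "grub-pc grub-pc/install_devices_empty boolean false\n" % device
    )


def apply_workaround(device="/dev/xvda"):
    """Feed the selections to debconf-set-selections on stdin."""
    if not is_block_device(device):
        return False
    subprocess.run(["debconf-set-selections"],
                   input=grub_selections(device).encode(), check=True)
    return True
```

Like the shell version, `apply_workaround()` does nothing unless the device exists as a block device.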

summary: - non-interactive grub updates for 12.04 break on AWS
+ [SRU] non-interactive grub updates for 12.04 break on AWS
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote : Re: [SRU] non-interactive grub updates for 12.04 break on AWS

To clarify, the fixes tested so far only address the cloud-init piece. Because the cloud-init module in question only runs at first boot, the patch only fixes the problem for future instances. Existing instances that upgrade grub will still break. This also means that if an ABI change takes place in 14.04, this bug will affect 14.04 as well.

We need to come up with a fix for existing instances as well.

description: updated
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Worked out a postinst script that 1) detects whether the root device and the grub-pc install device are mismatched; 2) if so, resets the grub-pc device and then installs grub to the device in question. The changes should be safe, but I would like a review to be sure.
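
The mismatch check described here can be sketched as follows. This is not the postinst script itself (which is not shown in this report); the function names are hypothetical, the input is assumed to be `/proc/mounts`-style text, and the partition stripping assumes simple device names such as `/dev/xvda1`.

```python
import re


def root_device(mounts_text):
    """Return the /dev/... device mounted on / from /proc/mounts-style text."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if (len(fields) >= 2 and fields[1] == "/"
                and fields[0].startswith("/dev/")):
            # Keep the last match, since later mounts shadow earlier ones.
            found = fields[0]
    return found


def parent_disk(device):
    """Strip a trailing partition number: /dev/xvda1 -> /dev/xvda."""
    return re.sub(r"\d+$", "", device)


def needs_fix(mounts_text, configured):
    """True if the configured grub device matches neither root nor its disk."""
    root = root_device(mounts_text)
    if root is None:
        return False
    return configured not in (root, parent_disk(root))
```

With a root filesystem on `/dev/xvda1`, a configured install device of `/dev/sda` is reported as a mismatch, while `/dev/xvda` is not.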

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :
tags: added: patch
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :
Revision history for this message
Robert C Jennings (rcj) wrote :

I have reviewed the patch (lp:~utlemming/ubuntu/precise/cloud-init/lp1336855 through r211) and tested it on EC2 and an OpenStack cloud and it performed well. The upgrade of cloud-init forced a grub-install and it ran with the correct root device.

Revision history for this message
Robert C Jennings (rcj) wrote :

That should have been "with the correct *boot* device"

Revision history for this message
Dusty (draper7) wrote :

I was able to work around the issue by modifying /etc/default/grub and running update-grub (after apt-get upgrade). I assume it has something to do with the console settings?

----- bad grub -----
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
#GRUB_HIDDEN_TIMEOUT=0
#GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0"
GRUB_CMDLINE_LINUX=""

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

#Disable recordfail timeout
GRUB_RECORDFAIL_TIMEOUT=0
-----

----- working grub -----
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
#GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX=""

# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"

# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480

# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"

# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"

#Disable recordfail timeout
#GRUB_RECORDFAIL_TIMEOUT=0
-----

Revision history for this message
Ron Wail (ronwail) wrote :

This is also a problem in trusty using the latest release build (20140813) with cloud-init 0.7.5.

Revision history for this message
Ron Wail (ronwail) wrote :

Has there been any progress on getting this resolved in the released cloud image builds?

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Uploaded a new branch for SRU after chatting with Scott. Pending review.

Revision history for this message
Ron Wail (ronwail) wrote : Re: [Bug 1336855] Re: [SRU] non-interactive grub updates for 12.04 break on AWS

Thanks for the reply... I'm still working out how the Ubuntu
daily/release dev process works :)

I just found the page that documents what SRU is :)

Am I right in thinking that after review it will be released on precise
and trusty and hit the cloud-images/AWS at the same time?

We're currently running on precise but moving to trusty at the same time
as we migrate our deployment processes to rely more on cloud-init on
local vagrant as well as AWS builds... so there's lots of moving parts :)

Thanks,

Ron

On 2014-09-18 05:52 , Ben Howard wrote:
> Uploaded a new branch for SRU after chatting with Scott. Pending review.
>
> ** Branch linked: lp:~utlemming/ubuntu/precise/cloud-
> init/lp1336855-1363260
>

Ron

--
"Design depends largely on constraints." - Charles Eames

Changed in cloud-init (Ubuntu Trusty):
status: New → Confirmed
Changed in cloud-init (Ubuntu Precise):
status: New → Confirmed
importance: Undecided → Critical
Changed in cloud-init (Ubuntu Trusty):
importance: Undecided → Critical
assignee: nobody → Ben Howard (utlemming)
Changed in cloud-init (Ubuntu Precise):
assignee: nobody → Ben Howard (utlemming)
Changed in cloud-init (Ubuntu Utopic):
assignee: nobody → Ben Howard (utlemming)
description: updated
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote : Re: [SRU] non-interactive grub updates for 12.04 break on AWS

In re: to comment #19
> Am I right in thinking that after review it will be released on precise and trusty and hit the cloud-images/AWS at the same time?

The update will be released to the update archive around the same time for 12.04 and 14.04. The next batch of cloud-images released after this update hits the archive will include the fix.

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Attached debdiff against trusty-proposed. Tested and confirmed it works.

description: updated
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :
summary: - [SRU] non-interactive grub updates for 12.04 break on AWS
+ [SRU] non-interactive grub updates broken for /dev/xvda devices on
+ Cloud-Images/Cloud-init
Changed in cloud-init (Ubuntu Trusty):
status: Confirmed → In Progress
Changed in cloud-init (Ubuntu Utopic):
status: Confirmed → In Progress
Changed in cloud-init (Ubuntu Precise):
status: Confirmed → In Progress
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Attached debdiff for Utopic

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.6~bzr1020-0ubuntu1

---------------
cloud-init (0.7.6~bzr1020-0ubuntu1) utopic; urgency=medium

  [ Ben Howard ]
  * Fix for cloud-init misidentifying grub install device (LP: #1336855).

  [ Scott Moser ]
  * New upstream snapshot.
    * cc_grub_dpkg: consider /dev/xvda as candidate for grub installation
      (LP: #1336855)
    * resizefs: fix backgrounding of resizefs (LP: #1338614)
    * cloud-init-blocknet: remove debug code
 -- Scott Moser <email address hidden> Tue, 23 Sep 2014 14:20:09 -0400

Changed in cloud-init (Ubuntu Utopic):
status: In Progress → Fix Released
Revision history for this message
Chris J Arges (arges) wrote : Please test proposed package

Hello Ben, or anyone else affected,

Accepted cloud-init into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/cloud-init/0.6.3-0ubuntu1.14 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Verified for precise. Waiting on trusty-proposed to appear.

tags: added: needed verification-precise-done
removed: verification-needed
Mathew Hodson (mhodson)
tags: added: verification-done-precise
removed: needed verification-precise-done
Revision history for this message
Chris J Arges (arges) wrote :

Hello Ben, or anyone else affected,

Accepted cloud-init into trusty-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/cloud-init/0.7.5-0ubuntu1.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cloud-init (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Confirmed for both 12.04 and 14.04. Marking as verification done.

tags: added: trusty verification-done
removed: verification-done-precise verification-needed
Revision history for this message
Scott Moser (smoser) wrote :

fixed in 0.7.6

Changed in cloud-init:
importance: Undecided → Medium
status: New → Fix Released
Revision history for this message
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.5-0ubuntu1.3

---------------
cloud-init (0.7.5-0ubuntu1.3) trusty-proposed; urgency=medium

  * d/patches/lp-1336855-grub_xvda.patch: include xvda devices for
    consideration for grub configuration (LP: #1336855).
 -- Ben Howard <email address hidden> Thu, 18 Sep 2014 16:47:23 -0600

Changed in cloud-init (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.6.3-0ubuntu1.14

---------------
cloud-init (0.6.3-0ubuntu1.14) precise-proposed; urgency=medium

  * debian/patches/lp-1363260-add-cloudsigma_ds.patch: backport from
    14.04/14.10 CloudSigma datasource to enable CloudSigma (LP: #1363260).
  * debian/patches/lp-1336855-grub_xvda.patch: include xvda devices for
    consideration for grub configuration (LP: #1336855).
    - added logic to postinst to fix broken grub installations
 -- Ben Howard <email address hidden> Fri, 29 Aug 2014 15:40:23 -0600

Changed in cloud-init (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
James Falcon (falcojr) wrote :