[SRU] race condition between startup script and cloud-init

Bug #1389399 reported by Ben Howard
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
gce-utils (Ubuntu)
Fix Released
High
Unassigned
Nominated for Precise by Ben Howard
Nominated for Trusty by Ben Howard
Nominated for Utopic by Ben Howard
Nominated for Vivid by Ben Howard

Bug Description

[IMPACT] Roughly 40% of users using google start-up scripts with apt commands will fail.

[CAUSE] Cloud-init and the Google Startup Scripts processes may run at the same time. If the cloud-init apt cloud-config module is running at the start of the Google Startup script then it may fail.

[FIX] Start the Google Startup scripts after cloud-init has finished.

[REGRESSION POTENTIAL] The risk for regressions is low.

[TESTING]
1. Create a new image from the test PPA
2. Launch an instance with a startup script of "apt-get -y update; apt-get -y install --download-only ubuntu-desktop && touch /tmp/it_worked"
3. Wait for instance to come up; confirm that /tmp/it_worked exists.

[ORIGINAL REPORT]
The image ships with archive.ubuntu.com in the bits on disk. Cloud-init changes that to gce.clouds.archive.ubuntu.com and also runs apt-get update. The user's startup script in this case calls apt-add-repository and then apt-get update. When these overlap, problems arise. When they don't, the process works. We've tested this by adding a sleep command (e.g. 'sleep 30') to the start of the user startup script in our reproduction runs.

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

The root cause here is that gce-utils has a job called "google.conf" that triggers the start of the startup scripts. It starts on "runlevel [2345]".

I pushed a build recipe change, that mirrors the fix uploaded here:
https://launchpad.net/~cpc-gce-team/+archive/ubuntu/lp1389399

We just need to QA the fix, then release new images.

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

PPA builds confirm the fix of preventing the race condition. We are in the process of finalizing more QA. I would like Google to confirm the fix before we upload and process the partner SRU.

Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Uploaded the fix to 15.04; it is pending in -proposed.

description: updated
summary: - race condition between startup script and cloud-init
+ [SRU] race condition between startup script and cloud-init
description: updated
Revision history for this message
Ben Howard (darkmuggle-deactivatedaccount) wrote :

PPA fix has been verified by GCE. Uploaded 12.04, 14.04 and 14.10 fix. Pending SRU acceptance.

Changed in gce-utils (Ubuntu):
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.