cloud-init restarts during package upgrade when it shouldn't, regenerating SSH host keys on Azure Focal

Bug #1999159 reported by James Falcon
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
cloud-init (Ubuntu)
Fix Released
High
Unassigned
Focal
Fix Released
High
Unassigned

Bug Description

=== Begin SRU Template ===
[Impact]
In 0f6db8c5 first included in focal in version 22.3-13-g70ce6442-0ubuntu1~20.04.1, Build-Depends was changed from "debhelper (>= 9.20160709)" to "debhelper-compat (= 10)". This is causing cloud-init to erroneously have a restart section in the post-install script:

# Automatically added by dh_systemd_start/12.10ubuntu1
if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
 if [ -d /run/systemd/system ]; then
  systemctl --system daemon-reload >/dev/null || true
  if [ -n "$2" ]; then
   deb-systemd-invoke try-restart 'cloud-config.service' 'cloud-config.target' 'cloud-final.service' 'cloud-init-hotplugd.service' 'cloud-init-hotplugd.socket' 'cloud-init-local.service' 'cloud-init.service' 'cloud-init.target' >/dev/null || true
  fi
 fi
fi

This script makes cloud-init restart on upgrade with services running interleaved and out of order. On older Azure instances, this creates a race on upgrade that can lead to ssh host keys being replaced.

[Test Case]
1.
- Launch a focal instance
- Ensure /var/lib/dpkg/info/cloud-init.postinst does not contain a section restarting cloud-init services
- Ensure /var/log/cloud-init.log contains no logs of cloud-init services running interleaved or out of order
2.
- Upgrade a focal instance from pre-22.3 to the latest version
- Ensure /var/lib/dpkg/info/cloud-init.postinst does not contain a section restarting cloud-init
- Ensure /var/log/cloud-init.log and /var/log/cloud-init-output.log do not contain any logs with the upgraded package version referencd 22.4.2-0ubuntu0~20.04.2 as that would indicate cloud-init serviers were restarted

Verification Test script

#!/bin/bash
set -ex

echo ==== Focal Verification of https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1999159 ====

# Create upgrade to proposed script
cat > setup_proposed.sh <<EOF
#/bin/bash
mirror=http://archive.ubuntu.com/ubuntu
echo deb \$mirror \$(lsb_release -sc)-proposed main | tee /etc/apt/sources.list.d/proposed.list
apt-get update -q;
apt-get install -qy cloud-init;
EOF

lxc launch ubuntu-daily:focal sru-f
while ! lxc exec sru-f -- cloud-init status --wait --long; do
   sleep 5
done

echo "=== OLD version Before cloud-init upgrade"
lxc exec sru-f -- cloud-init -v
echo "=== Track number of boot events cloud-init saw before upgrade from 22.4.2"
BOOT_RECORDS=$(lxc exec sru-f -- cloud-init analyze show | grep 'boot record')

echo "=== OLD version postinst contains try-restart logic to restart cloud-init services"
lxc exec sru-f grep try-restart /var/lib/dpkg/info/cloud-init.postinst && echo SUCCESS: OLD version still restarts services || echo FAILURE: OLD version does not try-restart cloud-init services

echo "=== Upgrade to -proposed version of cloud-init: ....20.04.2"
lxc file push setup_proposed.sh sru-f/
lxc exec sru-f -- bash /setup_proposed.sh
lxc exec sru-f -- cloud-init -v
echo "=== Number of boot events from cloud-init service restart should not have changed"
NEW_BOOT_RECORDS=$(lxc exec sru-f -- cloud-init analyze show | grep 'boot record')
if [ "$BOOT_RECORDS" == "$NEW_BOOT_RECORDS" ]; then
   echo SUCCESS same number of boot records across upgrade
else
   echo FAILURE: different boot records seen across upgrade $NEW_BOOT_RECORDS
fi
echo "=== NEW version postinst does NOT contain try-restart logic for cloud-init services"
lxc exec sru-f grep try-restart /var/lib/dpkg/info/cloud-init.postinst && echo FAILURE: NEW version still restarts services || echo SUCCESS: NEW version does not try-restart cloud-init services

echo === No new log messages indicating NEW version of cloud-init was restarted
lxc exec sru-f grep 22.4.2-0ubuntu0~20.04.2 /var/log/cloud-init-output.log && echo SUCCESS: not service restarts across upgrade || echo FAILURE: found service restarts across upgrade

[Regression Potential]
It is possible some functionality added in version 10 of debhelper is no longer available. Since this change was made in version 22.3 of cloud-init, this would be limited to focal versions of cloud-init 22.3 and 22.4. The control.tar.gz when building with debhelper compat 9 vs 10 has been examined to ensure there are no other functional changes to the output besides the fix for this regression and this analysis is included in https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1999159/comments/3.

[Other Info]
PR with change: https://github.com/canonical/cloud-init/pull/1900
Potentially related debhelper bug: https://bugs.launchpad.net/ubuntu/+source/debconf/+bug/1959054

=== End SRU Template ===

James Falcon (falcojr)
description: updated
description: updated
Chad Smith (chad.smith)
description: updated
description: updated
Changed in cloud-init (Ubuntu):
status: New → In Progress
Changed in cloud-init (Ubuntu Focal):
status: New → In Progress
Chad Smith (chad.smith)
summary: - cloud-init restarts after upgrade when it shouldn't
+ cloud-init restarts after upgrade when it shouldn't, regenerating SSH
+ host keys on Azure Focal
summary: - cloud-init restarts after upgrade when it shouldn't, regenerating SSH
- host keys on Azure Focal
+ cloud-init restarts during package upgrade when it shouldn't,
+ regenerating SSH host keys on Azure Focal
James Falcon (falcojr)
description: updated
James Falcon (falcojr)
description: updated
James Falcon (falcojr)
description: updated
Revision history for this message
Chad Smith (chad.smith) wrote (last edit ):
Download full text (3.4 KiB)

In discussion with @vorlon. general sentiment for making debhelper compat version changes in stable releases is a no-no because of this type of bug being introduced by automated debhelper injected content in the binary package. In order to verify that we don't have a potential to regress something else by moving back from debhelper-compat 10 to debhelper-compat 9 we need to also analyze a diff of any debian control files that are automatically generated.

the diff of such control files is here and obtained by the following procedure.

```
lxc launch ubuntu-dev-tools sru-f;
lxc exec sru-f bash
$ apt install ubuntu-dev-tools -y;
$ pull-lp-debs cloud-init 22.4.2-0ubuntu0~20.04.1;
$ add-apt-repository ppa:chad.smith/cloud-init-uploads -y;
$ pull-ppa-debs ppa:chad.smith/cloud-init-uploads cloud-init 22.4.2-0ubuntu0~20.04.2

$ dpkg-deb -R cloud-init_22.4.2-0ubuntu0~20.04.1_all.deb 20.04.1
$ dpkg-deb -R cloud-init_22.4.2-0ubuntu0~20.04.2_all.deb 20.04.2
$ diff -urN 20.04.1 20.04.2 | pastebinit
```

Resulting pastebin of that effort on focal:

https://paste.ubuntu.com/p/pCWx2tCNVq/

Analysis on the potential downgrade exposure paths for debhelper compat 10 back to 9 to ensure no dragons lurk in the debhelper compat downgrade path for any control scripts.

== debian/control
 version changes and a Installed Size increase
 -Version: 22.4.2-0ubuntu0~20.04.1
+Version: 22.4.2-0ubuntu0~20.04.2

== DEBIAN/md5sums deltas:
   usr/lib/python3/dist-packages/cloudinit/version.py # changed due to .2 vs .1 in pkg'd version string
   usr/share/doc/cloud-init/changelog.Debian.gz

== DEBIAN/postinst
-- postinst.1 dropped the try-restart section which caused the original restart of services across package upgrade here when we moved to debhelper-compat 10

-# Automatically added by dh_systemd_start/12.10ubuntu1
-if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
- if [ -d /run/systemd/system ]; then
- systemctl --system daemon-reload >/dev/null || true
- if [ -n "$2" ]; then
- deb-systemd-invoke try-restart 'cloud-config.service' 'cloud-config.target' 'cloud-final.service' 'cloud-init-hotplugd.service' 'cloud-init-hotplugd.socket' 'cloud-init-local.service' 'cloud-init.service' 'cloud-init.target' >/dev/null || true
- fi
- fi
-fi

-- postinst.2: identical dh_python3 section is a little earlier in the file in deb compat level 9

+
+# Automatically added by dh_python3:
+if which py3compile >/dev/null 2>&1; then
+ py3compile -p cloud-init
+fi
+if which pypy3compile >/dev/null 2>&1; then
+ pypy3compile -p cloud-init || true
+fi
+
+# End automatically added section

-- postinst.3: remove escaping of tilde '\~' in all rm_conffile lines in debhelper-compat when moving back to compat level 9

compat lvl 10: dpkg-maintscript-helper rm_conffile /etc/init/cloud-config.conf 0.7.9-243-ge74d775-0ubuntu2\~ -- "$@"

compat lvl 9: dpkg-maintscript-helper rm_conffile /etc/init/cloud-config.conf 0.7.9-243-ge74d775-0ubuntu2~ -- "$@"

--- postinst.4: Additional bookend comments around each rm_conffile line

+# Autom...

Read more...

Revision history for this message
Chad Smith (chad.smith) wrote :

Binary controll tarball diffs between 20.04.1 (debhelper-compat = 10) versus 20.04.2 (debhelper-compat = 9)

Attached is the diff between the two binary package versions generated before (compat 10) and after the revert of d/control d/compat (compat 9)

Revision history for this message
Chad Smith (chad.smith) wrote (last edit ):

Per the breakdown and alaysis of my comment #1 above:

  The only differences in debhelper-compate script changes we are seeing across the downgrade of debhelper-compat 10 to debhelper-compat 9:

   1. The removal errant try-restart <all cloud-init services>
       -- -which we need to drop and was the intent of this bug/fix
   2. Migration of identical dh_python3 section is a little earlier
      -- no risk change, same content in a different part of the file
   3.bookend comments and escaping of ~ in rm_conffiles
      -- were working in both compat 10 and compat 9 version of our postinst postrm prerm scripts.

Additionally, the intent of this changeset is also to revert to the previous debhelper compat state that was already present and fully functional in cloud-init prior to 22.3-13. So the way this will continue to behave was reasonable for years prior to this change. Our only risk would be in the downgrade from compat level 10 to compat level 9 and the diffs we see in control files don't indicate risk there.

Revision history for this message
Steve Langasek (vorlon) wrote : Please test proposed package

Hello James, or anyone else affected,

Accepted cloud-init into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/22.4.2-0ubuntu0~20.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

description: updated
description: updated
Changed in cloud-init (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Chad Smith (chad.smith)
Changed in cloud-init (Ubuntu Focal):
importance: Undecided → High
Changed in cloud-init (Ubuntu):
importance: Undecided → High
Chad Smith (chad.smith)
description: updated
Revision history for this message
Chad Smith (chad.smith) wrote :

SRU verification script

description: updated
Revision history for this message
Chad Smith (chad.smith) wrote :

SRU verification script for focal upgrade

Revision history for this message
Chad Smith (chad.smith) wrote :
Download full text (8.8 KiB)

SRU verification results pointing to no try-restart in cloud-init.postinst, and confirmed no service restarts across upgrade path. Any service restart would trigger new logs in either cloud-init analyze show or cloud-init-output.log.

+ echo ==== Focal Verification of https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1999159 ====
==== Focal Verification of https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1999159 ====
+ cat
+ lxc launch ubuntu-daily:focal sru-f
Creating sru-f
Starting sru-f
+ lxc exec sru-f -- cloud-init status --wait --long
.........................................................
status: done
time: Fri, 09 Dec 2022 00:01:17 +0000
detail:
DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net]
+ echo '=== OLD version Before cloud-init upgrade'
=== OLD version Before cloud-init upgrade
+ lxc exec sru-f -- cloud-init -v
/usr/bin/cloud-init 22.3.4-0ubuntu1~20.04.1
+ echo '=== Track number of boot events cloud-init saw before upgrade from 22.4.2'
=== Track number of boot events cloud-init saw before upgrade from 22.4.2
++ lxc exec sru-f -- cloud-init analyze show
++ grep 'boot record'
+ BOOT_RECORDS='1 boot records analyzed'
+ echo '=== OLD version postinst contains try-restart logic to restart cloud-init services'
=== OLD version postinst contains try-restart logic to restart cloud-init services
+ lxc exec sru-f grep try-restart /var/lib/dpkg/info/cloud-init.postinst
   deb-systemd-invoke try-restart 'cloud-config.service' 'cloud-config.target' 'cloud-final.service' 'cloud-init-hotplugd.service' 'cloud-init-hotplugd.socket' 'cloud-init-local.service' 'cloud-init.service' 'cloud-init.target' >/dev/null || true
+ echo SUCCESS: OLD version still restarts services
SUCCESS: OLD version still restarts services
+ echo '=== Upgrade to -proposed version of cloud-init: ....20.04.2'
=== Upgrade to -proposed version of cloud-init: ....20.04.2
+ lxc file push setup_proposed.sh sru-f/
+ lxc exec sru-f -- bash /setup_proposed.sh
deb http://archive.ubuntu.com/ubuntu focal-proposed main
Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
Get:2 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal-proposed InRelease [267 kB]
Get:6 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [1892 kB]
Get:7 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [8628 kB]
Get:8 http://security.ubuntu.com/ubuntu focal-security/main Translation-en [310 kB]
Get:9 http://security.ubuntu.com/ubuntu focal-security/main amd64 c-n-f Metadata [11.5 kB]
Get:10 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [1383 kB]
Get:11 http://security.ubuntu.com/ubuntu focal-security/restricted Translation-en [195 kB]
Get:12 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 c-n-f Metadata [596 B]
Get:13 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [777 kB]
Get:14 http://security.ubuntu.com/ubuntu focal-security/universe Translation-en [150 kB]
Get...

Read more...

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 22.4.2-0ubuntu0~20.04.2

---------------
cloud-init (22.4.2-0ubuntu0~20.04.2) focal; urgency=medium

  * d/compat & d/control: revert bump debhelper-comat to v10. Avoid
    service restarts across package upgrade (LP: #1999159)
    - d/compat: replaced with compat level 9
    - d/control: Build-Depends: revert to debhelper >= 9

 -- Chad Smith <email address hidden> Thu, 08 Dec 2022 09:45:11 -0700

Changed in cloud-init (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote : Update Released

The verification of the Stable Release Update for cloud-init has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Chad Smith (chad.smith)
Changed in cloud-init (Ubuntu):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.