Path to swapfile doesn't use a static device path

Bug #1896638 reported by Francis Ginther
This bug affects 1 person
Affects                      Status         Importance  Assigned to        Milestone
ec2-hibinit-agent (Ubuntu)   Fix Released   Undecided   Unassigned
  Xenial                     Fix Released   Undecided   Unassigned
  Bionic                     Fix Released   Undecided   Unassigned
  Focal                      Fix Released   Undecided   Unassigned
  Groovy                     Fix Released   Undecided   Unassigned
hibagent (Ubuntu)            Fix Released   Undecided   Unassigned
  Xenial                     Won't Fix      Undecided   Unassigned
  Bionic                     Fix Released   Undecided   Alberto Contreras
  Focal                      Fix Released   Undecided   Alberto Contreras
  Groovy                     Won't Fix      Undecided   Unassigned
  Jammy                      Fix Released   Undecided   Alberto Contreras
  Kinetic                    Fix Released   Undecided   Alberto Contreras

Bug Description

[Impact]

* Using the device name in the resume= option on the kernel cmdline leads to failure to resume from hibernation when the device name is not stable, which can be the case for NVMe drives.

[Test Case]

* ec2-hibinit-agent

  * Set up an EC2 instance to allow hibernation
  * Wait for hibinit-agent.service to finish starting
  * /etc/default/grub.d/99-set-swap.cfg should refer to the resume partition by PARTUUID in the resume= option

* hibagent

  * Spin up an EC2 spot instance with `hibernate` as `Interruption behavior` [1].
  * Install the latest hibagent: `sudo apt-get install hibagent`
  * Enable hibernation: `sudo /usr/bin/enable-ec2-spot-hibernation`
  * Create an AWS FIS experiment template to send a spot-instance-interruption signal [2], make it point to the created instance and launch it.
    Note: This step is optional, one can wait for AWS EC2 to send the interruption signal, but it could take a lot of time.
  * After some minutes, EC2 will send a signal to resume the interrupted instance.
  * Verify the instance has correctly been resumed from hibernation.
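
The ec2-hibinit-agent check on /etc/default/grub.d/99-set-swap.cfg above could be automated roughly as follows. This is only a sketch: the helper names `resume_device` and `uses_partuuid` are made up for illustration and are not part of either agent.

```python
def resume_device(cfg_text):
    """Return the value of the last resume= kernel argument in cfg_text, or None."""
    found = None
    for line in cfg_text.splitlines():
        # Treat quotes as separators so the last argument is not glued to '"'.
        for token in line.replace('"', " ").split():
            # "resume_offset=..." does not start with "resume=", so it is skipped.
            if token.startswith("resume="):
                found = token[len("resume="):]
    return found


def uses_partuuid(cfg_text):
    """True if the resume device is given as PARTUUID=... rather than a /dev name."""
    device = resume_device(cfg_text)
    return device is not None and device.startswith("PARTUUID=")
```

To use it, read the contents of 99-set-swap.cfg into a string and pass that string to `uses_partuuid()`.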

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/interruption-behavior.html#specifying-spot-interruption-behavior
[2] https://catalog.us-east-1.prod.workshops.aws/workshops/5fc0039f-9f15-47f8-aff0-09dc7b1779ee/en-US/030-basic-content/078-ec2-spot/020-spot-ec2-interrup

[Regression Potential]

* Failure to discover the PARTUUID makes the system unable to resume. A crash in the agent would leave the system unable to set up hibernation or unable to resume. (On Focal, PARTUUID is already in use, even without this fix.)

[Original Bug Text]

When the agent inserts the resume device path and offset into the kernel cmdline, it uses device names such as the following:

`resume_offset=223232 resume=/dev/nvme1n1p1`

The issue is that `/dev/nvme1n1p1` is not static. After a reboot, the block device may appear at `/dev/nvme0n1p1` instead, so the kernel fails to find the swapfile that holds the hibernation image.

The solution should be to use a persistent block device naming scheme.
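
For illustration, udev already publishes persistent names as symlinks under /dev/disk/ (for example /dev/disk/by-partuuid). A minimal sketch of looking up the persistent names of a device follows. The function name `persistent_names` and its `by_dir` parameter are hypothetical, and this is not what the agent itself does.

```python
import os


def persistent_names(device, by_dir="/dev/disk/by-partuuid"):
    """Return the names in by_dir whose symlinks resolve to the same node as device."""
    target = os.path.realpath(device)
    matches = []
    for name in sorted(os.listdir(by_dir)):
        # Each entry is a symlink such as by-partuuid/<uuid> -> ../../nvme1n1p1
        if os.path.realpath(os.path.join(by_dir, name)) == target:
            matches.append(name)
    return matches
```

The kernel name (nvme0n1p1 or nvme1n1p1) can change between boots, but the symlink name follows the partition itself, which is the property the resume= option needs.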


Revision history for this message
Dimitri John Ledkov (xnox) wrote :

please use by-id with uuid path as listed in https://wiki.ubuntu.com/FSTAB

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

i.e. ext4 /dev/disk/by-uuid/9958d086-4f20-4807-8976-58807a5bc7e0 or some such, if it will be accepted by our initrd.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

although, our cloud images normally boot by-label, thus potentially we should use the same LABEL= device, as specified in fstab of our cloud-images.

Revision history for this message
Francis Ginther (fginther) wrote :

I was unable to get either the by-uuid or by-label paths to work in my initial hacking. The kernel cmdline looked like what I would expect, for example '... resume_offset=225280 resume=/dev/disk/by-label/cloudimg-rootfs', but the hibernation image was not restored. Not currently seeing anything in the dmesg that indicates what failed. This was with the 4.15 kernel on xenial.

Revision history for this message
Francis Ginther (fginther) wrote :

I've updated the find_device_for_file() function in /usr/bin/hibinit-agent to:

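from subprocess import check_output

def find_device_for_file(filename):
    # Find the device backing the swap file's filesystem ('df -P /swap')
    df_out = check_output(['df', '-P', filename]).decode('ascii')
    # Second line of the df output, first column: the device name
    dev_str = df_out.split("\n")[1].split()[0]
    # Ask lsblk for that device's partition UUID, which is stable across reboots
    lsblk_out = check_output(['lsblk', '-dno', 'PARTUUID', dev_str]).decode('ascii')
    uuid_str = lsblk_out.strip()
    return "PARTUUID=%s" % (uuid_str)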

I've tested this on xenial and bionic. With xenial, I would see 2-3% failures on certain instance types due to this issue. Now I see none.
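
One caveat with the function above: if lsblk prints nothing for the device, it returns a bare "PARTUUID=" and the system would not be able to resume. A possible guard is sketched below. The helper name `build_resume_args` is hypothetical and this is not taken from the uploaded fix.

```python
def build_resume_args(lsblk_output, offset):
    """Build the resume arguments, refusing to emit an empty PARTUUID."""
    partuuid = lsblk_output.strip()
    if not partuuid:
        # Better to fail while setting up hibernation than at resume time.
        raise ValueError("lsblk reported no PARTUUID for the swap file's device")
    return "resume_offset=%d resume=PARTUUID=%s" % (offset, partuuid)
```

Raising here makes the failure visible when the agent runs, instead of silently writing a cmdline that cannot be resumed from.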

Balint Reczey (rbalint)
Changed in ec2-hibinit-agent (Ubuntu Focal):
status: New → Invalid
Changed in ec2-hibinit-agent (Ubuntu Groovy):
status: New → Invalid
Changed in ec2-hibinit-agent (Ubuntu Bionic):
status: New → Confirmed
Changed in ec2-hibinit-agent (Ubuntu Focal):
status: Invalid → Confirmed
Changed in ec2-hibinit-agent (Ubuntu Groovy):
status: Invalid → Confirmed
Balint Reczey (rbalint)
Changed in ec2-hibinit-agent (Ubuntu Xenial):
status: New → Confirmed
Changed in ec2-hibinit-agent (Ubuntu Groovy):
status: Confirmed → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu11

---------------
ec2-hibinit-agent (1.0.0-0ubuntu11) groovy; urgency=medium

  * Disable suspending the system (LP: #1898087)
  * Always set resume device by PARTUUID instead of by device name.
    Based on patch by Francis Ginther. (LP: #1896638)

 -- Balint Reczey <email address hidden> Wed, 14 Oct 2020 20:49:52 +0200

Changed in ec2-hibinit-agent (Ubuntu Groovy):
status: In Progress → Fix Released
Balint Reczey (rbalint)
description: updated
Balint Reczey (rbalint)
description: updated
Revision history for this message
Balint Reczey (rbalint) wrote :

I've uploaded the fix for Focal.

Changed in ec2-hibinit-agent (Ubuntu Focal):
status: Confirmed → In Progress
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Francis, or anyone else affected,

Accepted ec2-hibinit-agent into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu9.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ec2-hibinit-agent (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
Balint Reczey (rbalint) wrote :

Verified 1.0.0-0ubuntu9.1 on Focal:

ubuntu@ip-172-31-1-183:~$ sudo apt purge ec2-hibinit-agent
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
  ec2-hibinit-agent*
0 upgraded, 0 newly installed, 1 to remove and 22 not upgraded.
After this operation, 65.5 kB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 87759 files and directories currently installed.)
Removing ec2-hibinit-agent (1.0.0-0ubuntu9) ...
(Reading database ... 87744 files and directories currently installed.)
Purging configuration files for ec2-hibinit-agent (1.0.0-0ubuntu9) ...
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/40-force-partuuid.cfg'
Sourcing file `/etc/default/grub.d/50-cloudimg-settings.cfg'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.4.0-1029-aws
Found initrd image: /boot/initrd.img-5.4.0-1029-aws
Found linux image: /boot/vmlinuz-5.4.0-1024-aws
Found initrd image: /boot/initrd.img-5.4.0-1024-aws
Found Ubuntu 20.04.1 LTS (20.04) on /dev/xvda1
done
ubuntu@ip-172-31-1-183:~$ sudo apt install ec2-hibinit-agent
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  ec2-hibinit-agent
0 upgraded, 1 newly installed, 0 to remove and 22 not upgraded.
Need to get 11.5 kB of archives.
After this operation, 66.6 kB of additional disk space will be used.
Get:1 http://us-east-2.ec2.archive.ubuntu.com/ubuntu focal-proposed/main amd64 ec2-hibinit-agent all 1.0.0-0ubuntu9.1 [11.5 kB]
Fetched 11.5 kB in 0s (297 kB/s)
Selecting previously unselected package ec2-hibinit-agent.
(Reading database ... 87738 files and directories currently installed.)
Preparing to unpack .../ec2-hibinit-agent_1.0.0-0ubuntu9.1_all.deb ...
Unpacking ec2-hibinit-agent (1.0.0-0ubuntu9.1) ...
Setting up ec2-hibinit-agent (1.0.0-0ubuntu9.1) ...
Created symlink /etc/systemd/system/multi-user.target.wants/hibinit-agent.service → /lib/systemd/system/hibinit-agent.service.
ubuntu@ip-172-31-1-183:~$ service hibinit-agent status
● hibinit-agent.service - EC2 instance hibernation setup agent
     Loaded: loaded (/lib/systemd/system/hibinit-agent.service; enabled; vendor preset: enabled)
     Active: inactive (dead) since Thu 2020-10-29 20:14:43 UTC; 15s ago
       Docs: file:/usr/share/doc/ec2-hibinit-agent/README
   Main PID: 29451 (code=exited, status=0/SUCCESS)

Oct 29 20:14:43 ip-172-31-1-183 hibinit-agent[29451]: Swap pre-heating is skipped, the swap blocks won't be touched during to ensure they are ready
Oct 29 20:14:43 ip-172-31-1-183 hibinit-agent[29451]: Running: mkswap /swap-hibinit
Oct 29 20:14:43 ip-172-31-1-183 hibinit-agent[29451]: Running: swapon /swap-hibinit
Oct 29 20:14:43 ip-172-31-1-183 hibinit-agent[29451]: Updating the kernel offset for the swapfile: /swap-hibinit
Oct 29 20:14:43 ip-172-31-1-183 hibinit-agent[29451]: Updating GRUB to use the device PARTUUID=a2f52878-01 with offset 1687552 for resume
Oct 29 20:14:43 ip-172-31-1-183 hibinit-agent[29451]: GRUB configuration i...


tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Francis Ginther (fginther) wrote :

I ran this through our automated hibernation testing. On spot checking the logs, I do see the resume path being set to the PARTUUID as expected by this SRU. I consider this to cover `verification-done-focal` for our testing.

Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Francis, or anyone else affected,

Accepted ec2-hibinit-agent into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu4~18.04.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ec2-hibinit-agent (Ubuntu Bionic):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-bionic
removed: verification-done
Changed in ec2-hibinit-agent (Ubuntu Xenial):
status: Confirmed → Fix Committed
tags: added: verification-needed-xenial
Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Francis, or anyone else affected,

Accepted ec2-hibinit-agent into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ec2-hibinit-agent/1.0.0-0ubuntu4~16.04.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Chris Halse Rogers (raof) wrote : Update Released

The verification of the Stable Release Update for ec2-hibinit-agent has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu9.1

---------------
ec2-hibinit-agent (1.0.0-0ubuntu9.1) focal; urgency=medium

  * Disable suspending the system (LP: #1898087)
  * Always set resume device by PARTUUID instead of by device name.
    Based on patch by Francis Ginther. (LP: #1896638)

 -- Balint Reczey <email address hidden> Fri, 02 Oct 2020 18:07:48 +0200

Changed in ec2-hibinit-agent (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Balint Reczey (rbalint) wrote :

Verified 1.0.0-0ubuntu4~18.04.5 on Bionic:

ubuntu@ip-172-31-0-42:~$ dpkg -l ec2-hibinit-agent | cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=================-======================-============-=================================
ii ec2-hibinit-agent 1.0.0-0ubuntu4~18.04.5 all Amazon EC2 hibernation agent
ubuntu@ip-172-31-0-42:~$ service hibinit-agent status
● hibinit-agent.service - EC2 instance hibernation setup agent
   Loaded: loaded (/lib/systemd/system/hibinit-agent.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2020-11-20 18:26:20 UTC; 2min 18s ago
     Docs: file:/usr/share/doc/ec2-hibinit-agent/README
 Main PID: 3496 (code=exited, status=0/SUCCESS)

Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Will check if swap is at least: 4000 megabytes
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: There's sufficient swap available (have 4194304000, need 4194304000)
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Running: swapon /swap-hibinit
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Updating the kernel offset for the swapfile: /swap-hibinit
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Updating GRUB to use the device PARTUUID=e9adeae8-01 with offset 362496 for resume
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: GRUB configuration is updated
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Setting swap device to 51713 with offset 362496
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Done updating the swap offset. Turning swapoff
Nov 20 18:26:20 ip-172-31-0-42 hibinit-agent[20133]: Running: swapoff /swap-hibinit
Nov 20 18:26:20 ip-172-31-0-42 systemd[1]: Started EC2 instance hibernation setup agent.
ubuntu@ip-172-31-0-42:~$ cat /etc/default/grub.d/99-set-swap.cfg
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT no_console_suspend=1 resume_offset=362496 resume=PARTUUID=e9adeae8-01"
ubuntu@ip-172-31-0-42:~$
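
A side note on the "Setting swap device to 51713" line in the log above: the number looks like a combined device number. Assuming the usual Linux (glibc) dev_t layout, it decodes to major 202, minor 1, which is consistent with the Xen virtual disk partition /dev/xvda1 that the unfixed package wrote on this instance. The sketch below only shows the arithmetic; `split_dev` is an illustrative name, and that the agent logs a raw dev_t value is an assumption.

```python
def split_dev(dev):
    """Split a Linux dev_t value into (major, minor) using the glibc bit layout."""
    # Major: bits 8-19, plus high bits starting at bit 32.
    major = ((dev >> 8) & 0xFFF) | ((dev >> 32) & ~0xFFF)
    # Minor: bits 0-7, plus bits 20-31 shifted down above the low byte.
    minor = (dev & 0xFF) | ((dev >> 12) & ~0xFF)
    return major, minor


print(split_dev(51713))  # → (202, 1)
```

The device number is what the running kernel uses for the swap area; the PARTUUID on the kernel cmdline is what the next boot uses to find the same partition again.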

Verified 1.0.0-0ubuntu4~16.04.4 on Xenial:

ubuntu@ip-172-31-13-97:~$ dpkg -l ec2-hibinit-agent | cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=================-======================-============-=================================
ii ec2-hibinit-agent 1.0.0-0ubuntu4~16.04.4 all Amazon EC2 hibernation agent
ubuntu@ip-172-31-13-97:~$ service hibinit-agent status
● hibinit-agent.service - EC2 instance hibernation setup agent
   Loaded: loaded (/lib/systemd/system/hibinit-agent.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2020-11-20 18:15:31 UTC; 12min ago
     Docs: file:/usr/share/doc/ec2-hibinit-agent/README
 Main PID: 10521 (code=exited, status=0/SUCCESS)

Nov 20 18:15:31 ip-172-31-13-97 hibinit-agent[10516]: Allocating 4194304000 bytes in /swap-hibinit
Nov 20 18:15:31 ip-172-31-...


Revision history for this message
Balint Reczey (rbalint) wrote :

For the record, this is the output with the unfixed packages:

ubuntu@ip-172-31-0-42:~$ cat /etc/default/grub.d/99-set-swap.cfg
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT no_console_suspend=1 resume_offset=1716224 resume=/dev/xvda1"
ubuntu@ip-172-31-0-42:~$ dpkg -l ec2-hibinit-agent
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=====================================-=======================-=======================-================================================================================
ii ec2-hibinit-agent 1.0.0-0ubuntu4~18.04.4 all Amazon EC2 hibernation agent

ubuntu@ip-172-31-13-97:~$ cat /etc/default/grub.d/99-set-swap.cfg
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT no_console_suspend=1 resume_offset=1652736 resume=/dev/xvda1"
ubuntu@ip-172-31-13-97:~$ dpkg -l ec2-hibinit-agent
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=====================================-=======================-=======================-================================================================================
ii ec2-hibinit-agent 1.0.0-0ubuntu4~16.04.3 all Amazon EC2 hibernation agent

tags: added: verification-done verification-done-bionic verification-done-xenial
removed: verification-needed verification-needed-bionic verification-needed-xenial
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu4~18.04.5

---------------
ec2-hibinit-agent (1.0.0-0ubuntu4~18.04.5) bionic; urgency=medium

  * Always set resume device by PARTUUID instead of by device name.
    Based on patch by Francis Ginther. (LP: #1896638)

 -- Balint Reczey <email address hidden> Tue, 03 Nov 2020 19:41:35 +0100

Changed in ec2-hibinit-agent (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-hibinit-agent - 1.0.0-0ubuntu4~16.04.4

---------------
ec2-hibinit-agent (1.0.0-0ubuntu4~16.04.4) xenial; urgency=medium

  * Always set resume device by PARTUUID instead of by device name.
    Based on patch by Francis Ginther. (LP: #1896638)

 -- Balint Reczey <email address hidden> Tue, 03 Nov 2020 19:43:33 +0100

Changed in ec2-hibinit-agent (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote :

The Groovy Gorilla has reached end of life, so this bug will not be fixed for that release.

Changed in hibagent (Ubuntu Groovy):
status: New → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hibagent - 1.0.1+git20230216.9ac1209f7-0ubuntu1

---------------
hibagent (1.0.1+git20230216.9ac1209f7-0ubuntu1) lunar; urgency=medium

  * New upstream snapshot from their `ubuntu` branch (LP: #1896638)
    - d/p/disable-hibernate-test.patch: disable a test that only works on an
      actual EC2 instance
  * d/p/setuptools-fix-package-discovery.patch: fix compatibility with modern
    setuptools
  * d/control: drop X-Python3-Versions as it lists a long obsolete lower bound

 -- Simon Chopin <email address hidden> Wed, 15 Feb 2023 11:49:49 +0100

Changed in hibagent (Ubuntu):
status: New → Fix Released
no longer affects: ec2-hibinit-agent (Ubuntu Jammy)
no longer affects: ec2-hibinit-agent (Ubuntu Kinetic)
Changed in hibagent (Ubuntu Kinetic):
assignee: nobody → Alberto Contreras (aciba)
Changed in hibagent (Ubuntu Jammy):
assignee: nobody → Alberto Contreras (aciba)
Changed in hibagent (Ubuntu Focal):
assignee: nobody → Alberto Contreras (aciba)
Changed in hibagent (Ubuntu Bionic):
assignee: nobody → Alberto Contreras (aciba)
Revision history for this message
Simon Chopin (schopin) wrote :

I've uploaded Alberto's fixes to Kinetic, Jammy, Focal and Bionic, now waiting for SRU processing.

Revision history for this message
Robie Basak (racb) wrote :

> [Test Case]

I'm just passing by so this isn't a full review, but if the reason to land the SRU is to unbreak hibernation in some cases, then this is what you should be testing. Testing some technical effect which you believe to be the root cause is not sufficient. Please expand your test plan to actually test the user story you're trying to fix.

Revision history for this message
Alberto Contreras (aciba) wrote :

I understand that, but I believe it is not possible. Per [1], only Amazon EC2 can resume a hibernated Spot Instance, thus I do not think we can reproduce the user story.

This is also explained in the test plan of #2013336 .

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/hibernate-spot-instances.html

Revision history for this message
Robie Basak (racb) wrote :

Could you set up a spot instance and wait for it to be hibernated and resumed to ensure it works before considering the SRU verified? Or is that not going to be practical?

Revision history for this message
Alberto Contreras (aciba) wrote :

That's a great idea, thanks!

I have found a way to send the hibernation signal using AWS FIS and reflected the user story test in the [Test Case].

Thanks for pointing this out.

description: updated
Changed in hibagent (Ubuntu Bionic):
status: New → In Progress
Changed in hibagent (Ubuntu Focal):
status: New → In Progress
Changed in hibagent (Ubuntu Jammy):
status: New → In Progress
Changed in hibagent (Ubuntu Kinetic):
status: New → In Progress
Revision history for this message
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Francis, or anyone else affected,

Accepted hibagent into kinetic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/hibagent/1.0.1-0ubuntu2.22.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-kinetic to verification-done-kinetic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-kinetic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in hibagent (Ubuntu Kinetic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-kinetic
removed: verification-done
Changed in hibagent (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed-jammy
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello Francis, or anyone else affected,

Accepted hibagent into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/hibagent/1.0.1-0ubuntu2.22.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in hibagent (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed-focal
removed: verification-done-focal
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello Francis, or anyone else affected,

Accepted hibagent into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/hibagent/1.0.1-0ubuntu1.20.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in hibagent (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed-bionic
removed: verification-done-bionic
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello Francis, or anyone else affected,

Accepted hibagent into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/hibagent/1.0.1-0ubuntu1.18.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Alberto Contreras (aciba) wrote :

It looks like AWS EC2 has disabled the ability to request spot instances with the interruption behavior set as 'hibernate'.
I have tried to reproduce it in multiple regions and with multiple valid instance types and I consistently get the following error:

```
launchSpecTemporarilyBlacklisted Repeated errors have occurred processing the launch specification "t3.micro, ami-08d931621368a5861, Linux/UNIX, eu-west-3a while launching spot instance". It will not be retried for at least 13 minutes. Error message: The request with instanceType 't3.micro' and Linux/UNIX is not supported when instanceInterruptionBehavior is set to 'hibernate'. (Service: AmazonEC2; Status Code: 400; Error Code: InvalidParameterCombination; Proxy: null)
```

I have been able to verify that hibernation works and that this bug is fixed by simulating the workflow on a regular (non-spot) instance running Bionic, Focal, Jammy and Kinetic:

apt purge ec2-hibinit-agent
apt-get update
apt-get upgrade -y

cat <<EOF >/etc/apt/sources.list.d/ubuntu-$(lsb_release -cs)-proposed.list
# Enable Ubuntu proposed archive
deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -cs)-proposed restricted main multiverse universe
EOF

apt-get update
apt-get install -y hibagent
apt-cache policy hibagent
systemctl is-active hibagent.target || /usr/bin/enable-ec2-spot-hibernation

# Verify no errors
systemctl status hibagent
journalctl -u hibagent

# Verify lp #1896638 (resume partition by PARTUUID)
grep PART /etc/default/grub.d/99-set-swap.cfg

systemctl hibernate

# Start the instance and verify the hibernation resuming was okay
systemctl status hibinit-agent
journalctl --reverse

tags: added: verification-done verification-done-bionic verification-done-focal verification-done-jammy verification-done-kinetic
removed: verification-done-xenial verification-needed verification-needed-bionic verification-needed-focal verification-needed-jammy verification-needed-kinetic
tags: removed: verification-done
Alberto Contreras (aciba) wrote :

Xenial has reached end of life, so this bug will not be fixed for that release.

Changed in hibagent (Ubuntu Xenial):
status: New → Won't Fix
tags: added: verification-done
Andreas Hasenack (ahasenack) wrote :

Hi Alberto, since you wrote a script to test this in comment #31, do you have the logs of its output?

Alberto Contreras (aciba) wrote :

Hello Andreas.

I have attached the logs of a verification session on an instance running Jammy.

Please let me know if more checks are needed or if I should attach the logs for Bionic, Focal and Kinetic.

Thanks.

Łukasz Zemczak (sil2100) wrote :

Hey Alberto! Yes, please provide the output of this script for all the verified series, if possible. Thank you.

Alberto Contreras (aciba) wrote :

I have attached the logs for the remaining series. Thanks!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hibagent - 1.0.1-0ubuntu1.18.04.1

---------------
hibagent (1.0.1-0ubuntu1.18.04.1) bionic; urgency=medium

  * d/p/lp1896638-set-resume-device-by-partition-uuid: Set resume device
    by PARTUUID instead of by name. Thanks to Tony Nie <email address hidden>.
    (LP: #1896638)

 -- Alberto Contreras <email address hidden> Fri, 31 Mar 2023 10:02:42 +0200

Changed in hibagent (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hibagent - 1.0.1-0ubuntu1.20.04.1

---------------
hibagent (1.0.1-0ubuntu1.20.04.1) focal; urgency=medium

  * d/p/lp1896638-set-resume-device-by-partition-uuid: Set resume device
    by PARTUUID instead of by name. Thanks to Tony Nie <email address hidden>.
    (LP: #1896638)
  * d/p/lp2013336-do-not-modify-GRUB-config-on-GRUB2-systems: Do not attempt to
    modify non existent GRUB config file. Thanks to Robert Schweikert
    <email address hidden>. (LP: #2013336)

 -- Alberto Contreras <email address hidden> Tue, 04 Apr 2023 18:31:39 +0200

Changed in hibagent (Ubuntu Focal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hibagent - 1.0.1-0ubuntu2.22.04.1

---------------
hibagent (1.0.1-0ubuntu2.22.04.1) jammy; urgency=medium

  * d/p/lp1896638-set-resume-device-by-partition-uuid: Set resume device
    by PARTUUID instead of by name. Thanks to Tony Nie <email address hidden>.
    (LP: #1896638)
  * d/p/lp2013336-do-not-modify-GRUB-config-on-GRUB2-systems: Do not attempt to
    modify non existent GRUB config file. Thanks to Robert Schweikert
    <email address hidden>. (LP: #2013336)

 -- Alberto Contreras <email address hidden> Tue, 04 Apr 2023 18:22:05 +0200

Changed in hibagent (Ubuntu Jammy):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hibagent - 1.0.1-0ubuntu2.22.10.1

---------------
hibagent (1.0.1-0ubuntu2.22.10.1) kinetic; urgency=medium

  * d/p/lp1896638-set-resume-device-by-partition-uuid: Set resume device
    by PARTUUID instead of by name. Thanks to Tony Nie <email address hidden>.
    (LP: #1896638)
  * d/p/lp2013336-do-not-modify-GRUB-config-on-GRUB2-systems: Do not attempt to
    modify non existent GRUB config file. Thanks to Robert Schweikert
    <email address hidden>. (LP: #2013336)

 -- Alberto Contreras <email address hidden> Tue, 04 Apr 2023 18:03:13 +0200

Changed in hibagent (Ubuntu Kinetic):
status: Fix Committed → Fix Released