SEV_SNP failure to init

Bug #2037316 reported by John Cabaj
Affects               Status         Importance  Assigned to  Milestone
linux-aws (Ubuntu)    Fix Released   High        Unassigned
  Jammy               Invalid        Undecided   Unassigned
  Lunar               Fix Released   High        Unassigned
  Mantic              Fix Released   High        Unassigned
linux-gcp (Ubuntu)    Fix Released   Undecided   Unassigned
  Jammy               Fix Committed  Undecided   Unassigned
  Lunar               Fix Released   Undecided   Unassigned
  Mantic              Fix Released   Undecided   Unassigned

Bug Description

[Impact]

* Kernel fails to boot on SEV-SNP instances when compiled with GCC 12.3.0

[Fix]

* https://<email address hidden>/

[Test Case]

* Compile tested
* Boot tested
* Tested by Google

[Where things could go wrong]

* The patches are relatively isolated and keep similar checking functionality, only performing it earlier in boot. The chance of regression is likely low.
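
The report does not show the patches themselves. As background only: x86_virt_bits, named in one of the patch titles in the changelogs, is the kernel's record of how many virtual address bits the CPU implements, and a typical use of such a value is checking that an address is canonical. The Python sketch below illustrates that kind of check under this assumption. It is not the kernel code, and `is_canonical` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1


def is_canonical(addr: int, virt_bits: int) -> bool:
    """Return True if a 64-bit address is canonical for virt_bits address bits.

    An address is canonical when bits [63:virt_bits-1] are all copies of
    bit virt_bits-1, i.e. sign-extending the low virt_bits bits gives the
    original value back.
    """
    if not 0 < virt_bits <= 64:
        # A width of 0 (value not set up yet) has no meaningful answer.
        raise ValueError("virt_bits must be in 1..64")
    addr &= MASK64
    low = addr & ((1 << virt_bits) - 1)
    sign = 1 << (virt_bits - 1)
    # Sign-extend the low bits, then wrap back into 64 bits.
    extended = ((low ^ sign) - sign) & MASK64
    return extended == addr


if __name__ == "__main__":
    for a in (0x00007FFFFFFFFFFF, 0xFFFF800000000000, 0x0000800000000000):
        print(hex(a), is_canonical(a, 48))
```

The sketch rejects a width of 0 explicitly, since a check of this kind has nothing sensible to compare against until the width is known.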

John Cabaj (john-cabaj) wrote :

Patches have been submitted to mantic:linux-gcp and lunar:linux-gcp for the 2023.10.02 SRU cycle. jammy:linux-gcp-6.2 will pick up the changes from lunar:linux-gcp during the same SRU cycle.

Changed in linux-gcp (Ubuntu Jammy):
status: New → Fix Committed
Changed in linux-gcp (Ubuntu Lunar):
status: New → Fix Committed
Changed in linux-gcp (Ubuntu Mantic):
status: New → Fix Committed
Tim Gardner (timg-tpi)
Changed in linux-aws (Ubuntu Lunar):
status: New → Fix Committed
Changed in linux-aws (Ubuntu Jammy):
status: New → Invalid
Tim Gardner (timg-tpi)
Changed in linux-aws (Ubuntu Mantic):
status: New → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-gcp - 6.5.0-1006.6

---------------
linux-gcp (6.5.0-1006.6) mantic; urgency=medium

  * mantic/linux-gcp: 6.5.0-1006.6 -proposed tracker (LP: #2037626)

  * SEV_SNP failure to init (LP: #2037316)
    - x86/sev-es: Allow copy_from_kernel_nofault in earlier boot
    - x86/sev-es: Only set x86_virt_bits to correct value

  * Miscellaneous Ubuntu changes
    - [Config] update gcc version in annotations

  [ Ubuntu: 6.5.0-7.7 ]

  * mantic/linux: 6.5.0-7.7 -proposed tracker (LP: #2037611)
  * kexec enable to load/kdump zstd compressed zimg (LP: #2037398)
    - [Packaging] Revert arm64 image format to Image.gz
  * Mantic minimized/minimal cloud images do not receive IP address during
    provisioning (LP: #2036968)
    - [Config] Enable virtio-net as built-in to avoid race
  * Miscellaneous Ubuntu changes
    - SAUCE: Add mdev_set_iommu_device() kABI
    - [Config] update gcc version in annotations

  [ Ubuntu: 6.5.0-6.6 ]

  * mantic/linux: 6.5.0-6.6 -proposed tracker (LP: #2035595)
  * Mantic update: v6.5.3 upstream stable release (LP: #2035588)
    - drm/amd/display: ensure async flips are only accepted for fast updates
    - cpufreq: intel_pstate: set stale CPU frequency to minimum
    - tpm: Enable hwrng only for Pluton on AMD CPUs
    - Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
    - Revert "fuse: in fuse_flush only wait if someone wants the return code"
    - Revert "f2fs: clean up w/ sbi->log_sectors_per_block"
    - Revert "PCI: tegra194: Enable support for 256 Byte payload"
    - Revert "net: macsec: preserve ingress frame ordering"
    - reiserfs: Check the return value from __getblk()
    - splice: always fsnotify_access(in), fsnotify_modify(out) on success
    - splice: fsnotify_access(fd)/fsnotify_modify(fd) in vmsplice
    - splice: fsnotify_access(in), fsnotify_modify(out) on success in tee
    - eventfd: prevent underflow for eventfd semaphores
    - fs: Fix error checking for d_hash_and_lookup()
    - iomap: Remove large folio handling in iomap_invalidate_folio()
    - tmpfs: verify {g,u}id mount options correctly
    - selftests/harness: Actually report SKIP for signal tests
    - vfs, security: Fix automount superblock LSM init problem, preventing NFS sb
      sharing
    - ARM: ptrace: Restore syscall restart tracing
    - ARM: ptrace: Restore syscall skipping for tracers
    - btrfs: zoned: skip splitting and logical rewriting on pre-alloc write
    - erofs: release ztailpacking pclusters properly
    - locking/arch: Avoid variable shadowing in local_try_cmpxchg()
    - refscale: Fix uninitalized use of wait_queue_head_t
    - clocksource: Handle negative skews in "skew is too large" messages
    - powercap: arm_scmi: Remove recursion while parsing zones
    - OPP: Fix potential null ptr dereference in dev_pm_opp_get_required_pstate()
    - OPP: Fix passing 0 to PTR_ERR in _opp_attach_genpd()
    - selftests/resctrl: Add resctrl.h into build deps
    - selftests/resctrl: Don't leak buffer in fill_cache()
    - selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
    - selftests/resctrl: Close perf value read fd on errors
    - sched/fair: remove ut...

Changed in linux-gcp (Ubuntu Mantic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 6.5.0-1007.7

---------------
linux-aws (6.5.0-1007.7) mantic; urgency=medium

  * mantic/linux-aws: 6.5.0-1007.7 -proposed tracker (LP: #2037624)

  * SEV_SNP failure to init (LP: #2037316)
    - x86/sev-es: Allow copy_from_kernel_nofault in earlier boot
    - x86/sev-es: Only set x86_virt_bits to correct value

  * Miscellaneous Ubuntu changes
    - [Config] update toolchain version in annotations

  [ Ubuntu: 6.5.0-7.7 ]

  * mantic/linux: 6.5.0-7.7 -proposed tracker (LP: #2037611)
  * kexec enable to load/kdump zstd compressed zimg (LP: #2037398)
    - [Packaging] Revert arm64 image format to Image.gz
  * Mantic minimized/minimal cloud images do not receive IP address during
    provisioning (LP: #2036968)
    - [Config] Enable virtio-net as built-in to avoid race
  * Miscellaneous Ubuntu changes
    - SAUCE: Add mdev_set_iommu_device() kABI
    - [Config] update gcc version in annotations

  [ Ubuntu: 6.5.0-6.6 ]

  * mantic/linux: 6.5.0-6.6 -proposed tracker (LP: #2035595)
  * Mantic update: v6.5.3 upstream stable release (LP: #2035588)
    - drm/amd/display: ensure async flips are only accepted for fast updates
    - cpufreq: intel_pstate: set stale CPU frequency to minimum
    - tpm: Enable hwrng only for Pluton on AMD CPUs
    - Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN
    - Revert "fuse: in fuse_flush only wait if someone wants the return code"
    - Revert "f2fs: clean up w/ sbi->log_sectors_per_block"
    - Revert "PCI: tegra194: Enable support for 256 Byte payload"
    - Revert "net: macsec: preserve ingress frame ordering"
    - reiserfs: Check the return value from __getblk()
    - splice: always fsnotify_access(in), fsnotify_modify(out) on success
    - splice: fsnotify_access(fd)/fsnotify_modify(fd) in vmsplice
    - splice: fsnotify_access(in), fsnotify_modify(out) on success in tee
    - eventfd: prevent underflow for eventfd semaphores
    - fs: Fix error checking for d_hash_and_lookup()
    - iomap: Remove large folio handling in iomap_invalidate_folio()
    - tmpfs: verify {g,u}id mount options correctly
    - selftests/harness: Actually report SKIP for signal tests
    - vfs, security: Fix automount superblock LSM init problem, preventing NFS sb
      sharing
    - ARM: ptrace: Restore syscall restart tracing
    - ARM: ptrace: Restore syscall skipping for tracers
    - btrfs: zoned: skip splitting and logical rewriting on pre-alloc write
    - erofs: release ztailpacking pclusters properly
    - locking/arch: Avoid variable shadowing in local_try_cmpxchg()
    - refscale: Fix uninitalized use of wait_queue_head_t
    - clocksource: Handle negative skews in "skew is too large" messages
    - powercap: arm_scmi: Remove recursion while parsing zones
    - OPP: Fix potential null ptr dereference in dev_pm_opp_get_required_pstate()
    - OPP: Fix passing 0 to PTR_ERR in _opp_attach_genpd()
    - selftests/resctrl: Add resctrl.h into build deps
    - selftests/resctrl: Don't leak buffer in fill_cache()
    - selftests/resctrl: Unmount resctrl FS if child fails to run benchmark
    - selftests/resctrl: Close perf value read fd on errors
    - sched/fair: rem...

Changed in linux-aws (Ubuntu Mantic):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gcp/6.2.0-1017.19 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lunar-linux-gcp' to 'verification-done-lunar-linux-gcp'. If the problem still exists, change the tag 'verification-needed-lunar-linux-gcp' to 'verification-failed-lunar-linux-gcp'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-lunar-linux-gcp-v2 verification-needed-lunar-linux-gcp
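
The tag change requested above amounts to swapping one tag for another. A minimal Python sketch of that rule follows; it only manipulates a local set of strings and does not talk to Launchpad. The function name `apply_verification` and its parameters are hypothetical.

```python
def apply_verification(tags, series_pkg, passed):
    """Return the tag set after recording a verification result.

    series_pkg is the suffix used in the tags, e.g. "lunar-linux-gcp".
    If the verification-needed tag is absent, the tags are returned unchanged.
    """
    needed = f"verification-needed-{series_pkg}"
    outcome = f"verification-{'done' if passed else 'failed'}-{series_pkg}"
    current = set(tags)
    if needed not in current:
        return current
    return (current - {needed}) | {outcome}


if __name__ == "__main__":
    tags = {"kernel-spammed-lunar-linux-gcp-v2",
            "verification-needed-lunar-linux-gcp"}
    print(sorted(apply_verification(tags, "lunar-linux-gcp", passed=True)))
```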
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/6.2.0-1015.15 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lunar-linux-aws' to 'verification-done-lunar-linux-aws'. If the problem still exists, change the tag 'verification-needed-lunar-linux-aws' to 'verification-failed-lunar-linux-aws'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-lunar-linux-aws-v2 verification-needed-lunar-linux-aws
Thomas Bechtold (toabctl) wrote :

I tested the kernel from lunar-proposed (6.2.0.1015.16) on AWS. The image boots with SEV-SNP enabled:

# sudo dmesg | grep -i sev
[ 5.563677] Memory Encryption Features active: AMD SEV SEV-ES SEV-SNP
[ 6.140250] SEV: Using SNP CPUID table, 64 entries present.
[ 8.507286] SEV: SNP guest platform device initialized.
[ 20.829729] sev-guest sev-guest: Initialized SEV guest driver (using vmpck_id 0)
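
The same check can be scripted when verifying several instances. The sketch below, in Python, extracts the feature list from the "Memory Encryption Features active" line of saved dmesg output; the function name `active_features` is hypothetical, and the line format is assumed to match the output shown above.

```python
import re

# Matches the summary line printed by the kernel, as seen in the dmesg output.
FEATURES_RE = re.compile(r"Memory Encryption Features active:\s*(.+)$")


def active_features(dmesg_text):
    """Return the SEV feature names listed in dmesg, or [] if none are found."""
    for line in dmesg_text.splitlines():
        match = FEATURES_RE.search(line)
        if match:
            # The line reads e.g. "AMD SEV SEV-ES SEV-SNP"; drop the vendor token.
            return [tok for tok in match.group(1).split() if tok.startswith("SEV")]
    return []


if __name__ == "__main__":
    sample = (
        "[ 5.563677] Memory Encryption Features active: AMD SEV SEV-ES SEV-SNP\n"
        "[ 6.140250] SEV: Using SNP CPUID table, 64 entries present.\n"
    )
    features = active_features(sample)
    print(features, "SEV-SNP" in features)
```

Checking for "SEV-SNP" in the returned list distinguishes an SNP guest from one where only plain SEV is active.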

# apt-cache policy linux-aws
linux-aws:
  Installed: 6.2.0.1015.16
  Candidate: 6.2.0.1015.16
  Version table:
 *** 6.2.0.1015.16 100
        100 http://archive.ubuntu.com/ubuntu lunar-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     6.2.0.1013.14 500
        500 http://us-east-2.ec2.archive.ubuntu.com/ubuntu lunar-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu lunar-security/main amd64 Packages
     6.2.0.1003.4 500
        500 http://us-east-2.ec2.archive.ubuntu.com/ubuntu lunar/main amd64 Packages
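
To confirm programmatically that the installed package really came from -proposed, the policy output can be parsed. This is a minimal Python sketch that assumes the layout shown above; `parse_policy` and `installed_pocket` are hypothetical helper names.

```python
def parse_policy(text):
    """Extract the Installed and Candidate versions from apt-cache policy output."""
    result = {}
    for line in text.splitlines():
        stripped = line.strip()
        for key in ("Installed", "Candidate"):
            prefix = key + ":"
            if stripped.startswith(prefix):
                result[key.lower()] = stripped[len(prefix):].strip()
    return result


def installed_pocket(text):
    """Return the archive pocket (e.g. "lunar-proposed") of the '***' entry."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if not line.strip().startswith("***"):
            continue
        for origin in lines[i + 1:]:
            parts = origin.split()
            # Origin lines start with a numeric priority; anything else
            # means the next version entry has begun.
            if not parts or not parts[0].isdigit():
                break
            if len(parts) >= 3 and parts[1].startswith("http"):
                return parts[2].split("/")[0]
    return None


if __name__ == "__main__":
    sample = """linux-aws:
  Installed: 6.2.0.1015.16
  Candidate: 6.2.0.1015.16
  Version table:
 *** 6.2.0.1015.16 100
        100 http://archive.ubuntu.com/ubuntu lunar-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     6.2.0.1013.14 500
        500 http://security.ubuntu.com/ubuntu lunar-security/main amd64 Packages
"""
    print(parse_policy(sample), installed_pocket(sample))
```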

tags: added: verification-done-lunar-linux-aws
removed: verification-needed-lunar-linux-aws
Stefan Bader (smb)
Changed in linux-aws (Ubuntu Lunar):
importance: Undecided → High
Changed in linux-aws (Ubuntu Mantic):
importance: Undecided → High
John Cabaj (john-cabaj) wrote :

Tested the kernel from lunar-proposed (6.2.0-1018-gcp) on GCP n2d instances. The image boots with SEV enabled:

john_cabaj@john-cabaj-sev-snp2:~$ sudo dmesg | grep -i sev
[ 0.303257] Memory Encryption Features active: AMD SEV

tags: added: verification-done-lunar-linux-gcp
removed: verification-needed-lunar-linux-gcp
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-gcp - 6.2.0-1017.19

---------------
linux-gcp (6.2.0-1017.19) lunar; urgency=medium

  * lunar/linux-gcp: 6.2.0-1017.19 -proposed tracker (LP: #2038064)

  * CVE-2023-42755
    - [Config] remove NET_CLS_RSVP and NET_CLS_RSVP6

  * SEV_SNP failure to init (LP: #2037316)
    - x86/sev-es: Allow copy_from_kernel_nofault in earlier boot
    - x86/sev-es: Only set x86_virt_bits to correct value

  [ Ubuntu: 6.2.0-35.35 ]

  * lunar/linux: 6.2.0-35.35 -proposed tracker (LP: #2038229)
  * Packaging resync (LP: #1786013)
    - [Packaging] update helper scripts
  * CVE-2023-4244
    - netfilter: nf_tables: don't skip expired elements during walk
    - netfilter: nf_tables: integrate pipapo into commit protocol
    - netfilter: nft_set_rbtree: fix overlap expiration walk
    - netfilter: nf_tables: adapt set backend to use GC transaction API
    - netfilter: nft_set_hash: mark set element as dead when deleting from packet
      path
    - netfilter: nf_tables: drop map element references from preparation phase
    - netfilter: nf_tables: GC transaction API to avoid race with control plane
    - netfilter: nf_tables: remove busy mark and gc batch API
    - netfilter: nf_tables: don't fail inserts if duplicate has expired
    - netfilter: nf_tables: fix kdoc warnings after gc rework
    - netfilter: nf_tables: fix GC transaction races with netns and netlink event
      exit path
    - netfilter: nf_tables: GC transaction race with netns dismantle
    - netfilter: nf_tables: GC transaction race with abort path
    - netfilter: nf_tables: use correct lock to protect gc_list
    - netfilter: nf_tables: defer gc run if previous batch is still pending
    - netfilter: nft_dynset: disallow object maps
    - netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction
  * CVE-2023-5197
    - netfilter: nf_tables: skip bound chain in netns release path
    - netfilter: nf_tables: disallow rule removal from chain binding
  * CVE-2023-4921
    - net: sched: sch_qfq: Fix UAF in qfq_dequeue()
  * CVE-2023-4881
    - netfilter: nftables: exthdr: fix 4-byte stack OOB write
  * CVE-2023-4623
    - net/sched: sch_hfsc: Ensure inner classes have fsc curve
  * CVE-2023-4622
    - af_unix: Fix null-ptr-deref in unix_stream_sendpage().
  * CVE-2023-42756
    - netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
  * CVE-2023-42755
    - net/sched: Retire rsvp classifier
    - [Config] remove NET_CLS_RSVP and NET_CLS_RSVP6
  * CVE-2023-42753
    - netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for
      ip_set_hash_netportnet.c
  * CVE-2023-42752
    - igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU
    - net: add SKB_HEAD_ALIGN() helper
    - net: remove osize variable in __alloc_skb()
    - net: factorize code in kmalloc_reserve()
    - net: deal with integer overflows in kmalloc_reserve()
  * CVE-2023-34319
    - xen/netback: Fix buffer overrun triggered by unusual packet

 -- John Cabaj <email address hidden> Thu, 05 Oct 2023 21:59:43 -0500
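
Changelogs in this layout group patch titles under an entry that carries the bug reference, so the patches shipped for a given bug can be pulled out mechanically. The following Python sketch assumes the "* title (LP: #N)" and "- patch" conventions seen above; `patches_for_bug` is a hypothetical name.

```python
import re

LP_RE = re.compile(r"\(LP: #(\d+)\)")


def patches_for_bug(changelog, bug_id):
    """Return patch titles listed under entries that reference LP bug bug_id."""
    patches = []
    title = ""
    in_title = False   # still reading a (possibly wrapped) entry title
    matched = False    # current entry references bug_id
    for raw in changelog.splitlines():
        line = raw.strip()
        if line.startswith("* "):
            title, in_title = line, True
            matched = bug_id in LP_RE.findall(title)
        elif line.startswith("- "):
            in_title = False
            if matched:
                patches.append(line[2:])
        elif line.startswith("[") or line.startswith("--"):
            # Section header or signature line ends the current entry.
            in_title = matched = False
        elif line and in_title:
            # Wrapped title: the LP reference may sit on the continuation line.
            title += " " + line
            matched = bug_id in LP_RE.findall(title)
        elif line and matched and patches:
            # Wrapped patch title.
            patches[-1] += " " + line
    return patches


if __name__ == "__main__":
    sample = """
  * lunar/linux-gcp: 6.2.0-1017.19 -proposed tracker (LP: #2038064)

  * SEV_SNP failure to init (LP: #2037316)
    - x86/sev-es: Allow copy_from_kernel_nofault in earlier boot
    - x86/sev-es: Only set x86_virt_bits to correct value

  [ Ubuntu: 6.2.0-35.35 ]
"""
    for patch in patches_for_bug(sample, "2037316"):
        print(patch)
```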

Changed in linux-gcp (Ubuntu Lunar):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws-6.5/6.5.0-1008.8~22.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-aws-6.5' to 'verification-done-jammy-linux-aws-6.5'. If the problem still exists, change the tag 'verification-needed-jammy-linux-aws-6.5' to 'verification-failed-jammy-linux-aws-6.5'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-aws-6.5-v2 verification-needed-jammy-linux-aws-6.5
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-aws - 6.2.0-1015.15

---------------
linux-aws (6.2.0-1015.15) lunar; urgency=medium

  * lunar/linux-aws: 6.2.0-1015.15 -proposed tracker (LP: #2038059)

  * SEV_SNP failure to init (LP: #2037316)
    - x86/sev-es: Allow copy_from_kernel_nofault in earlier boot
    - x86/sev-es: Only set x86_virt_bits to correct value

  [ Ubuntu: 6.2.0-36.37 ]

  * lunar/linux: 6.2.0-36.37 -proposed tracker (LP: #2038076)
  * Regression for ubuntu_bpf test build caused by upstream bdeeed3498c7
    (LP: #2035181)
    - selftests/bpf: fix static assert compilation issue for test_cls_*.c
  * CVE-2023-4244
    - netfilter: nf_tables: don't skip expired elements during walk
    - netfilter: nf_tables: adapt set backend to use GC transaction API
    - netfilter: nft_set_hash: mark set element as dead when deleting from packet
      path
    - netfilter: nf_tables: GC transaction API to avoid race with control plane
    - netfilter: nf_tables: don't fail inserts if duplicate has expired
    - netfilter: nf_tables: fix kdoc warnings after gc rework
    - netfilter: nf_tables: fix GC transaction races with netns and netlink event
      exit path
    - netfilter: nf_tables: GC transaction race with netns dismantle
    - netfilter: nf_tables: GC transaction race with abort path
    - netfilter: nf_tables: use correct lock to protect gc_list
    - netfilter: nf_tables: defer gc run if previous batch is still pending
    - netfilter: nft_dynset: disallow object maps
    - netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction
  * CVE-2023-4563
    - netfilter: nf_tables: remove busy mark and gc batch API
  * CVE-2023-42756
    - netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP
  * CVE-2023-4623
    - net/sched: sch_hfsc: Ensure inner classes have fsc curve
  * Fix unstable audio at low levels on Thinkpad P1G4 (LP: #2037077)
    - ALSA: hda/realtek - ALC287 I2S speaker platform support
  * Lunar update: upstream stable patchset 2023-09-21 (LP: #2037005)
    - Upstream stable to v6.1.41, v6.4.6
    - io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
    - ALSA: hda/realtek - remove 3k pull low procedure
    - ALSA: hda/realtek: Add quirk for Clevo NS70AU
    - ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx
    - maple_tree: set the node limit when creating a new root node
    - maple_tree: fix node allocation testing on 32 bit
    - keys: Fix linking a duplicate key to a keyring's assoc_array
    - perf probe: Add test for regression introduced by switch to
      die_get_decl_file()
    - btrfs: fix warning when putting transaction with qgroups enabled after abort
    - fuse: revalidate: don't invalidate if interrupted
    - fuse: Apply flags2 only when userspace set the FUSE_INIT_EXT
    - btrfs: set_page_extent_mapped after read_folio in btrfs_cont_expand
    - btrfs: zoned: fix memory leak after finding block group with super blocks
    - fuse: ioctl: translate ENOSYS in outarg
    - btrfs: fix race between balance and cancel/pause
    - selftests: tc: set timeout to 15 minutes
    - selftests: tc: add 'ct' action kconfig dep
    - regmap: Drop initial version of m...

Changed in linux-aws (Ubuntu Lunar):
status: Fix Committed → Fix Released