linux-azure-cvm: Properly reallocate the kernel image

Bug #1980023 reported by Marcelo Cerri
This bug affects 1 person
Affects                   Status         Importance   Assigned to     Milestone
linux-azure-cvm (Ubuntu)  New            Undecided    Unassigned
  Focal                   Fix Released   Critical     Marcelo Cerri
llvm-defaults (Ubuntu)    New            Undecided    Unassigned

Bug Description

[Impact]

The kernel header defines a field called init_size that specifies the amount of memory that the kernel requires for the in-place decompression, and the bootloader is expected to load the kernel into a buffer of this size. This doesn't happen when using the systemd EFI stub to load the kernel though - the kernel image is stored on disk in a PE section with a virtual size no larger than the compressed size, so it's loaded into memory by the bootloader into a buffer that's too small for the in-place decompression. The initrd is loaded into memory immediately after the kernel.
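
As an illustration of where init_size lives, here is a minimal Python sketch that reads it from an x86 bzImage. The offsets (the "HdrS" signature at 0x202, the protocol version at 0x206, init_size at 0x260) follow the kernel's x86 boot protocol documentation; the function name read_init_size is hypothetical and is not part of any existing tool.

```python
import struct

HDR_MAGIC_OFFSET = 0x202   # "HdrS" signature of the x86 setup header
VERSION_OFFSET = 0x206     # boot protocol version, 2 bytes, little-endian
INIT_SIZE_OFFSET = 0x260   # init_size, 4 bytes, little-endian (protocol 2.10+)


def read_init_size(image: bytes) -> int:
    """Return init_size from an x86 bzImage, or raise ValueError."""
    if len(image) < INIT_SIZE_OFFSET + 4:
        raise ValueError("image too short to contain the setup header")
    if image[HDR_MAGIC_OFFSET:HDR_MAGIC_OFFSET + 4] != b"HdrS":
        raise ValueError("missing HdrS signature")
    (version,) = struct.unpack_from("<H", image, VERSION_OFFSET)
    if version < 0x020A:
        raise ValueError("boot protocol older than 2.10 has no init_size")
    (init_size,) = struct.unpack_from("<I", image, INIT_SIZE_OFFSET)
    return init_size
```

For a unified kernel image like the one discussed in this bug, the input would be the contents of the .linux PE section rather than the whole file.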

To work around this, the kernel's EFI stub allocates a new buffer of init_size bytes and relocates the kernel image into it (21cb9b41: "efi/x86: Always relocate the kernel for EFI handover entry"), but this code has a bug - it copies init_size bytes from the source buffer (i.e., where the kernel image was loaded into memory by the bootloader) to the new buffer. This ends up reading past the end of the .linux and .initrd PE sections and all of the memory regions allocated by the bootloader, resulting in an out-of-bounds read and causing problems with Confidential VMs.

This is fixed by 688eb282: "efi/x86: Only copy the compressed kernel image in efi_relocate_kernel()", which needs to be backported to the kernel we provide for CVM. Ideally, this would have been fixed in systemd's EFI stub by setting the virtual size of the .linux PE section to init_size, which would cause the bootloader to load the kernel into a buffer large enough, making this additional relocation unnecessary.
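
The difference between the buggy and the fixed copy can be modelled in a few lines of Python. This is only a toy model of the behaviour described above, not the kernel code: the names relocate, read_checked and OutOfBoundsRead are hypothetical, and an exception stands in for the out-of-bounds read, which real hardware would perform silently or fault on.

```python
class OutOfBoundsRead(Exception):
    """Raised where the real code would read past the source buffer."""


def read_checked(src: bytes, length: int) -> bytes:
    """Read `length` bytes from `src`, failing instead of running past the end."""
    if length > len(src):
        raise OutOfBoundsRead(
            f"read of {length} bytes from a {len(src)}-byte buffer"
        )
    return src[:length]


def relocate(src: bytes, init_size: int, copy_len: int) -> bytearray:
    """Allocate init_size bytes and copy copy_len bytes of the image into them."""
    if copy_len > init_size:
        raise ValueError("copy length exceeds the destination buffer")
    dst = bytearray(init_size)          # new buffer large enough to decompress in place
    dst[:copy_len] = read_checked(src, copy_len)
    return dst


image = bytes(range(16))   # stand-in for the compressed image loaded by the bootloader
init_size = 64             # stand-in for the header's init_size

# Fixed behaviour: copy only the compressed image.
relocated = relocate(image, init_size, len(image))

# Buggy behaviour: copying init_size bytes reads past the end of the source.
try:
    relocate(image, init_size, init_size)
except OutOfBoundsRead as exc:
    print("buggy copy:", exc)
```

In both cases the destination buffer is init_size bytes long; only the number of bytes read from the source changes.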

[Test Plan]

Tested by Microsoft and boot tested by me.

[Where problems could occur]

Decompression might fail if init_size is wrong, causing the system to not boot.

Marcelo Cerri (mhcerri)
Changed in linux-azure-cvm (Ubuntu Focal):
assignee: nobody → Marcelo Cerri (mhcerri)
status: New → In Progress
importance: Undecided → Critical
Marcelo Cerri (mhcerri) wrote :
summary: - Properly realocate the the kernel image
+ Properly reallocate the the kernel image
summary: - Properly reallocate the the kernel image
+ linux-azure-cvm: Properly reallocate the kernel image
Marcelo Cerri (mhcerri)
Changed in linux-azure-cvm (Ubuntu Focal):
status: In Progress → Fix Committed
Dimitri John Ledkov (xnox) wrote :

> Ideally, this would have been fixed in systemd's EFI stub by setting the virtual size of the .linux PE section to init_size, which would cause the bootloader load the kernel into a buffer large enough, making this additional relocation unnecessary.

I'm not too sure that is a bug in systemd though. The kernel.efi image (including the addition of the .linux PE section) is never built by systemd itself, but by other tooling (e.g. core-initrd, dracut, etc.).

Dimitri John Ledkov (xnox) wrote :

$ objdump -x ./usr/lib/linux/efi/kernel.efi-5.4.0-1085-azure-cvm

./usr/lib/linux/efi/kernel.efi-5.4.0-1085-azure-cvm: file format pei-x86-64
./usr/lib/linux/efi/kernel.efi-5.4.0-1085-azure-cvm
architecture: i386:x86-64, flags 0x00000133:
HAS_RELOC, EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x0000000000004000

Characteristics 0x206
 executable
 line numbers stripped
 debugging information removed

Time/Date Thu Jan 1 01:00:00 1970
Magic 020b (PE32+)
MajorLinkerVersion 2
MinorLinkerVersion 34
SizeOfCode 0000000000007600
SizeOfInitializedData 000000000264e800
SizeOfUninitializedData 0000000000000000
AddressOfEntryPoint 0000000000004000
BaseOfCode 0000000000004000
ImageBase 0000000000000000
SectionAlignment 00001000
FileAlignment 00000200
MajorOSystemVersion 0
MinorOSystemVersion 0
MajorImageVersion 0
MinorImageVersion 0
MajorSubsystemVersion 0
MinorSubsystemVersion 0
Win32Version 00000000
SizeOfImage 04c65000
SizeOfHeaders 00000400
CheckSum 0265f929
Subsystem 0000000a (EFI application)
DllCharacteristics 00000000
SizeOfStackReserve 0000000000000000
SizeOfStackCommit 0000000000000000
SizeOfHeapReserve 0000000000000000
SizeOfHeapCommit 0000000000000000
LoaderFlags 00000000
NumberOfRvaAndSizes 00000010

Note that SizeOfUninitializedData is 0000000000000000.

Dimitri John Ledkov (xnox) wrote :

The Data Directory
Entry 0 0000000000000000 00000000 Export Directory [.edata (or where ever we found it)]
Entry 1 0000000000000000 00000000 Import Directory [parts of .idata]
Entry 2 0000000000000000 00000000 Resource Directory [.rsrc]
Entry 3 0000000000000000 00000000 Exception Directory [.pdata]
Entry 4 0000000002658790 00000780 Security Directory
Entry 5 000000000000c000 0000000a Base Relocation Directory [.reloc]
Entry 6 0000000000000000 00000000 Debug Directory
Entry 7 0000000000000000 00000000 Description Directory
Entry 8 0000000000000000 00000000 Special Directory
Entry 9 0000000000000000 00000000 Thread Storage Directory [.tls]
Entry a 0000000000000000 00000000 Load Configuration Directory
Entry b 0000000000000000 00000000 Bound Import Directory
Entry c 0000000000000000 00000000 Import Address Table Directory
Entry d 0000000000000000 00000000 Delay Import Directory
Entry e 0000000000000000 00000000 CLR Runtime Header
Entry f 0000000000000000 00000000 Reserved

No idea what these things are

Dimitri John Ledkov (xnox) wrote :

PE File Base Relocations (interpreted .reloc section contents)

Virtual Address: 000024b2 Chunk size 10 (0xa) Number of fixups 1
 reloc 0 offset 0 [24b2] ABSOLUTE

Sections:
Idx Name Size VMA LMA File off Algn
  0 .text 00007500 0000000000004000 0000000000004000 00000400 2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .reloc 0000000a 000000000000c000 000000000000c000 00007a00 2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data 00002128 000000000000d000 000000000000d000 00007c00 2**5
                  CONTENTS, ALLOC, LOAD, DATA
  3 .dynamic 00000110 0000000000010000 0000000000010000 00009e00 2**3
                  CONTENTS, ALLOC, LOAD, DATA
  4 .rela 00000e58 0000000000011000 0000000000011000 0000a000 2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynsym 00000378 0000000000012000 0000000000012000 0000b000 2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .cmdline 00000050 0000000000030000 0000000000030000 0000b400 2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .sbat 000000ff 0000000000050000 0000000000050000 0000b600 2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 .linux 009e5980 0000000002000000 0000000002000000 0000b800 2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .initrd 01c64f31 0000000003000000 0000000003000000 009f1200 2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

Note that the VMA and LMA are the same for the .linux section, and the size of the .linux section is set to exactly the size of its contents.

Should we increase the .linux size to init_size then? Or push .initrd to always be much later, without much care?

Dimitri John Ledkov (xnox) wrote :

I wonder if we have to set VirtualSize to something large enough in the section specification (which none of the GNU or LLVM tooling supports), or if it is enough to ensure that the .initrd VMA is far enough away. At the moment it is just 16MB away, whilst we need about 50MB instead.

I am thinking of reading init_size and pushing the .initrd section further away, to see whether that lets the kernel image be loaded without the need to relocate it.
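
The arithmetic behind the 16MB figure can be checked against the objdump output above. In this rough sketch the section addresses, the .linux size and the section alignment are taken from that output, while ASSUMED_INIT_SIZE is a placeholder for the "about 50MB" mentioned here and not a value read from the actual kernel. The helper names are hypothetical.

```python
LINUX_VMA = 0x2000000        # .linux VMA from the objdump section table
LINUX_SIZE = 0x9E5980        # .linux size from the objdump section table
INITRD_VMA = 0x3000000       # .initrd VMA from the objdump section table
SECTION_ALIGNMENT = 0x1000   # SectionAlignment from the optional header

# Assumption: stands in for "about 50MB"; the real init_size is not given here.
ASSUMED_INIT_SIZE = 50 * 1024 * 1024


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def min_initrd_vma(linux_vma: int, init_size: int,
                   alignment: int = SECTION_ALIGNMENT) -> int:
    """Lowest aligned VMA that leaves init_size bytes free after linux_vma."""
    return align_up(linux_vma + init_size, alignment)


gap = INITRD_VMA - LINUX_VMA
print(f"current gap: {gap // (1024 * 1024)} MiB")
print(f"gap large enough: {gap >= ASSUMED_INIT_SIZE}")
print(f"lowest usable .initrd VMA: {min_initrd_vma(LINUX_VMA, ASSUMED_INIT_SIZE):#x}")
```

Under that assumption the compressed image fits in the current gap, but an in-place decompression area of init_size bytes would not.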

Dimitri John Ledkov (xnox) wrote :

Playing with this more: we need llvm-objcopy to allow properly setting VirtualSize on COFF binaries, and then to set that to the init_size of the kernel.
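
For reference, each PE section header stores VirtualSize (the size of the section once loaded into memory) separately from SizeOfRawData (the size on disk), so the two can in principle differ. The following Python sketch only reads these fields from an image to compare them with init_size; it does not patch anything and is not a substitute for the llvm-objcopy support mentioned above. The function name pe_sections is hypothetical.

```python
import struct


def pe_sections(data: bytes):
    """Yield (name, virtual_size, virtual_address, raw_size) for each PE section."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of the PE signature
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    coff = e_lfanew + 4                                   # start of the COFF file header
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size                          # section table follows the optional header
    for i in range(num_sections):
        off = table + 40 * i                              # each section header is 40 bytes
        name, vsize, vaddr, raw_size = struct.unpack_from("<8sIII", data, off)
        yield name.rstrip(b"\0").decode("ascii", "replace"), vsize, vaddr, raw_size


def undersized(data: bytes, section: str, init_size: int) -> bool:
    """True if the named section's VirtualSize is smaller than init_size."""
    for name, vsize, _vaddr, _raw in pe_sections(data):
        if name == section:
            return vsize < init_size
    raise KeyError(section)
```

Calling undersized(image_bytes, ".linux", init_size) on a kernel.efi image would then report whether the bootloader's buffer for .linux is too small for in-place decompression.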

no longer affects: systemd (Ubuntu Focal)
affects: systemd (Ubuntu) → llvm-toolchain-14 (Ubuntu)
affects: llvm-toolchain-14 (Ubuntu) → llvm-defaults (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure-cvm/5.4.0-1085.90+cvm2 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-azure-cvm - 5.4.0-1085.90+cvm2

---------------
linux-azure-cvm (5.4.0-1085.90+cvm2) focal; urgency=medium

  * focal/linux-azure-cvm: 5.4.0-1085.90+cvm2 -proposed tracker (LP: #1980027)

  * linux-azure-cvm: Properly reallocate the kernel image (LP: #1980023)
    - efi/x86: Only copy the compressed kernel image in efi_relocate_kernel()

 -- Marcelo Henrique Cerri <email address hidden> Mon, 27 Jun 2022 23:43:47 -0300

Changed in linux-azure-cvm (Ubuntu Focal):
status: Fix Committed → Fix Released