Comment 12 for bug 1624096

Laszlo Ersek (Red Hat) (lersek) wrote :

@Jason, there are two separate topics in your question.

First, controlling the boot order from the QEMU command line (i.e., filtering and/or reordering the persistent UEFI boot options, meaning (a) those that already exist in the varstore from earlier boots, plus (b) those that OVMF's platform BDS regenerates at every boot).

For this, you have to use the

    -device XXXX,bootindex=N

property, which in turn necessitates the modern, separate notation for backend/frontend.

For example, for network devices you have to spell out

    -netdev XXXX,id=netdev0,... \
    -device virtio-net-pci,netdev=netdev0,bootindex=2

For disks, for example with the virtio-blk-pci frontend, you need

    -drive if=none,id=drive0,file=ZZZ,... \
    -device virtio-blk-pci,drive=drive0,bootindex=1

The various shorthands like "-net nic", "-hda", "-drive if=virtio" don't allow you to specify the bootindex=N property, and therefore are unsuitable for OVMF. (At least if you want to control the boot order from the QEMU command line.)

So, in this specific case, assuming you have one QCOW2 system disk (created with qemu-img) that you want to install Ubuntu to, plus the installer ISO you want to install from, I would recommend:

    -drive if=pflash,readonly,format=raw,file=PATH_TO_OVMF_CODE_FD \
    -drive if=pflash,format=raw,file=PATH_TO_PRIVATE_VARSTORE \
    \
    -debugcon file:ovmf.debug.log \
    -global isa-debugcon.iobase=0x402 \
    \
    -chardev stdio,signal=off,mux=on,id=char0 \
    -mon chardev=char0,mode=readline,default \
    -serial chardev:char0 \
    \
    -device virtio-scsi-pci,id=scsi0 \
    \
    -drive id=sysdisk,if=none,format=qcow2,discard=on,cache=writeback,file=... \
    -device scsi-hd,drive=sysdisk,bus=scsi0.0,bootindex=1 \
    \
    -drive id=installer,if=none,format=raw,file=... \
    -device scsi-cd,drive=installer,bus=scsi0.0,bootindex=2 \

This will (a) capture the OVMF log; (b) give you access to both the QEMU monitor and the guest's serial console -- switch between them with [C-a c]; (c) create a virtio-scsi disk and CD-ROM for the guest, with the (target) system disk and the installer ISO, respectively; (d) assign bootindex=1 to the system disk, and bootindex=2 to the installer ISO.

The upshot is that when you first boot the VM, the installer ISO will be launched (because the system disk is still empty), but after installation, the VM will boot off of the system disk.

If there is a (QEMU default, or manually configured) virtual NIC in the VM as well, then PXE boot will *not* be attempted. The reason is that you assign a bootindex to at least one device, but no bootindex is assigned to the NIC. This will cause OVMF to filter out any UEFI boot options (created manually or automatically) that would refer to the NIC.

If the yakkety installer still doesn't boot with the above command line snippet (*and* with the shim bug fixed or worked around), then I'd say the installer ISO is malformed in some other way.

The second topic is why the shim bug doesn't hit hard on some physical systems. For this, consider how EFI_FILE_PROTOCOL.Close() works -- it releases the entire container structure that contains EFI_FILE_PROTOCOL. When you next call FileProtocol->Close() through the same pointer (i.e., a use-after-free), the Close function pointer is read from freed storage.

As I mentioned earlier, due to OVMF setting bit #3 (value 8) in PcdDebugPropertyMask, memory that is freed gets scrubbed with the byte value 0xAF. (Funnily enough, this hex value comes from the name of Andrew Fish, the inventor of EFI.) So when you read the Close function pointer from an EFI_FILE_PROTOCOL instance that has been closed (released) already, you get 0xAFAFAFAFAFAFAFAF -- that's why you see such an instruction pointer (RIP) in the register dump above.

Now, when shim executes the use-after-free (= the second close) on a UEFI system that does *not* do the memory scrubbing on free, then all the earlier contents of the freed EFI_FILE_PROTOCOL instance are likely still in place. Hence the call probably corrupts memory elsewhere, but it does not blow up at once. (Which is actually much worse bug behavior.)

This is why you don't see any direct symptoms on physical machines: memory scrubbing on free is a debugging feature, and apparently none of the physical firmwares in question enable it. The upstream shim commit that fixes the regression also mentions "This issue only affects certain systems".