SEV does not enable IOMMU on SCSI controller

Bug #1845986 reported by Adam Spiers
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Stephen Finucane
Train
Fix Committed
Undecided
Stephen Finucane

Bug Description

https://review.opendev.org/#/c/644565/ added logic to libvirt/designer.py for enabling iommu for certain devices where virtio is used. This is required for AMD SEV[0]. However it missed the case of a SCSI controller where the model is virtio-scsi, e.g.:

    <controller type='scsi' index='0' model='virtio-scsi'>

As with other virtio devices, here a child element needs to be added to the config when SEV is enabled:

    <driver iommu="on" />

[0] http://specs.openstack.org/openstack/nova-specs/specs/train/approved/amd-sev-libvirt-support.html#proposed-change

Changed in nova:
assignee: nobody → Adam Spiers (adam.spiers)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/685756

Changed in nova:
status: New → In Progress
Revision history for this message
Eric Fried (efried) wrote :

Can you please explain more about what actually breaks?

Revision history for this message
Eric Fried (efried) wrote :

Fix proposed to branch: master (before the bug, so didn't get picked up by the bot)
Review: https://review.opendev.org/684825

Eric Fried (efried)
Changed in nova:
importance: Undecided → High
tags: added: train-rc-potential
Matt Riedemann (mriedem)
Changed in nova:
assignee: Adam Spiers (adam.spiers) → Boris Bobrov (bbobrov)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/686414

Revision history for this message
Stephen Finucane (stephenfinucane) wrote :

As noted by Sean at [1]

> right so to clarify, in its current state on master,
> SEV will only cause the error below if you enable have the following image metadata properties are set
>
> (hw_disk_bus=scsi or hw_cdrom_bus=scsi) and hw_scsi_modle=virtio-scsi
> hw_video_model=virtio (this is the default on arm but SEV only works on AMD x86_64
> hw_qemu_guest_agent=yes
>
> in all other cases it should work correctly.
>
> the error is caused because when any of the above combination of image properties are set a virio devices
> is created without instruct qemu to use dma mappable memory for the device by setting driver=iommu.
>
> as a result SEV will try to encrypt the device memory which will cause the guest kernel to lockup when udev tries to initialise the devices.
>
> sev will be functional if the default disk/video models are used and if the qemu disk agent is not used.

[1] https://review.opendev.org/#/c/686414/1/releasenotes/notes/bug-1845986-95cbede0a296b088.yaml@5

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/686414
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9545edc79d75e75d47c51d5da651975c01e919ec
Submitter: Zuul
Branch: stable/train

commit 9545edc79d75e75d47c51d5da651975c01e919ec
Author: Stephen Finucane <email address hidden>
Date: Thu Oct 3 15:23:53 2019 +0100

    docs: Highlight the current broken state of SEV

    This won't be resolved in time for Train GA, so add a release note
    highlighting the problem until such a time as the release is fixed.

    Change-Id: Iae30e12084640d1c0f072d2db18653111988929e
    Signed-off-by: Stephen Finucane <email address hidden>
    Related-Bug: #1845986
    Stable-Only

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/train)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: stable/train
Review: https://review.opendev.org/685756
Reason: This is definitely not ready and the change on master looks semi-abandoned since there has been radio silence from the SUSE team for the past couple of weeks. We can restore and update this if/when the fix is approved on master.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/693072

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/696697

Changed in nova:
assignee: Boris Bobrov (bbobrov) → Stephen Finucane (stephenfinucane)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/696697
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2b0024c04070dcffc056e9df7b9ced40cbfdcc00
Submitter: Zuul
Branch: master

commit 2b0024c04070dcffc056e9df7b9ced40cbfdcc00
Author: Boris Bobrov <email address hidden>
Date: Wed Sep 25 20:07:50 2019 +0200

    Switch to uses_virtio to enable iommu driver for AMD SEV

    In order to use AMD SEV, devices that use virtio should also use
    iommu driver. The problem is to understand that a device uses virtio,
    because there are many attributes and values that can indicate it.
    For example, it can be `model='virtio'` for rng and
    `<target bus='virtio'/>` for disk.

    Add new property for devices `uses_virtio`, that will indicate that a
    device in its current configuration is a virtio device, and implement it
    for known devices.

    The patch does not change how resulting libvirt XMLs look and does not
    fix user-visible bugs. This refactoring is required for further work on
    AMD SEV related patches.

    The changes to memballoon-related tests were required because the tests
    were using fake values that does are not possible in real life and that
    were preventing tests of the new method. The tests now use the value
    that nova currently uses to generate libvirt xml.

    Change-Id: I0f85918996128d573b5dbd6ac49a9c2356cd40a9
    Partial-Bug: #1845986

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/684825
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1a88d35cb161df34f11ce17fed3b62621f774672
Submitter: Zuul
Branch: master

commit 1a88d35cb161df34f11ce17fed3b62621f774672
Author: Boris Bobrov <email address hidden>
Date: Fri Nov 29 13:40:00 2019 +0100

    Also enable iommu for virtio controllers and video in libvirt

    Ie54fca066f33 added logic to libvirt/designer.py for enabling iommu
    for certain devices where virtio is used. This is required for AMD
    SEV[0]. However it missed two cases.

    Firstly, a SCSI controller can have the model as 'virtio-scsi', e.g.:

        <controller type='scsi' index='0' model='virtio-scsi'>

    As with other virtio devices, here a child element needs to be added
    to the config when SEV is enabled:

        <driver iommu="on" />

    We do not need to cover the case of a controller with type
    'virtio-serial' now, since even though it is supported by libvirt, it
    is not currently used anywhere in Nova.

    Secondly, a video device can be virtio, e.g. when vgpus are in use:

        <video>
            <model type='virtio'/>
        </video>

    Also take this opportunity to clarify the corresponding documentation
    around disk bus options.

    [0] http://specs.openstack.org/openstack/nova-specs/specs/train/approved/amd-sev-libvirt-support.html#proposed-change

    Partial-Bug: #1845986
    Change-Id: I626c35d1653e6a25125320032d0a4a0c67ab8bcf

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/693072
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c2fd8294fdb615d5229c4eb2abef1226b7cc580b
Submitter: Zuul
Branch: master

commit c2fd8294fdb615d5229c4eb2abef1226b7cc580b
Author: Boris Bobrov <email address hidden>
Date: Tue Nov 5 20:06:29 2019 +0100

    Create a controller for qga when SEV is used

    When a guest agent is requested, it requires a virtio-serial controller.
    If the controller is not created explicitly, it will be created by
    libvirt. But if AMD SEV is also requested, the controller is expected
    to use the iommu driver.

    In order to achive that, the controller is created explicitly with the
    required driver.

    Change-Id: I47c248649b9d77e8bd7c5350fc84d89342be4623
    Closes-Bug: 1845986

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/698738

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/train
Review: https://review.opendev.org/698739

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/698746

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/698738
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f38e151b0d2aac5eee80890dd0eaa5ad6b6c1799
Submitter: Zuul
Branch: stable/train

commit f38e151b0d2aac5eee80890dd0eaa5ad6b6c1799
Author: Boris Bobrov <email address hidden>
Date: Wed Sep 25 20:07:50 2019 +0200

    Switch to uses_virtio to enable iommu driver for AMD SEV

    In order to use AMD SEV, devices that use virtio should also use
    iommu driver. The problem is to understand that a device uses virtio,
    because there are many attributes and values that can indicate it.
    For example, it can be `model='virtio'` for rng and
    `<target bus='virtio'/>` for disk.

    Add new property for devices `uses_virtio`, that will indicate that a
    device in its current configuration is a virtio device, and implement it
    for known devices.

    The patch does not change how resulting libvirt XMLs look and does not
    fix user-visible bugs. This refactoring is required for further work on
    AMD SEV related patches.

    The changes to memballoon-related tests were required because the tests
    were using fake values that does are not possible in real life and that
    were preventing tests of the new method. The tests now use the value
    that nova currently uses to generate libvirt xml.

    Change-Id: I0f85918996128d573b5dbd6ac49a9c2356cd40a9
    Partial-Bug: #1845986
    (cherry picked from commit 2b0024c04070dcffc056e9df7b9ced40cbfdcc00)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/685756
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0662697bbf982b60c595d363324b9b6e501f9ded
Submitter: Zuul
Branch: stable/train

commit 0662697bbf982b60c595d363324b9b6e501f9ded
Author: Boris Bobrov <email address hidden>
Date: Fri Nov 29 13:40:00 2019 +0100

    Also enable iommu for virtio controllers and video in libvirt

    Ie54fca066f33 added logic to libvirt/designer.py for enabling iommu
    for certain devices where virtio is used. This is required for AMD
    SEV[0]. However it missed two cases.

    Firstly, a SCSI controller can have the model as 'virtio-scsi', e.g.:

        <controller type='scsi' index='0' model='virtio-scsi'>

    As with other virtio devices, here a child element needs to be added
    to the config when SEV is enabled:

        <driver iommu="on" />

    We do not need to cover the case of a controller with type
    'virtio-serial' now, since even though it is supported by libvirt, it
    is not currently used anywhere in Nova.

    Secondly, a video device can be virtio, e.g. when vgpus are in use:

        <video>
            <model type='virtio'/>
        </video>

    Also take this opportunity to clarify the corresponding documentation
    around disk bus options.

    [0] http://specs.openstack.org/openstack/nova-specs/specs/train/approved/amd-sev-libvirt-support.html#proposed-change

    Partial-Bug: #1845986
    Change-Id: I626c35d1653e6a25125320032d0a4a0c67ab8bcf
    (cherry picked from commit 1a88d35cb161df34f11ce17fed3b62621f774672)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/698739
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7bdd8e97fb5c4ae302eff79d257cec6b8cd2d27c
Submitter: Zuul
Branch: stable/train

commit 7bdd8e97fb5c4ae302eff79d257cec6b8cd2d27c
Author: Boris Bobrov <email address hidden>
Date: Tue Nov 5 20:06:29 2019 +0100

    Create a controller for qga when SEV is used

    When a guest agent is requested, it requires a virtio-serial controller.
    If the controller is not created explicitly, it will be created by
    libvirt. But if AMD SEV is also requested, the controller is expected
    to use the iommu driver.

    In order to achive that, the controller is created explicitly with the
    required driver.

    Change-Id: I47c248649b9d77e8bd7c5350fc84d89342be4623
    Closes-Bug: 1845986
    (cherry picked from commit c2fd8294fdb615d5229c4eb2abef1226b7cc580b)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/698746
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c64a136ea7db77d6d1062a73b04956fb6c5cdd9e
Submitter: Zuul
Branch: stable/train

commit c64a136ea7db77d6d1062a73b04956fb6c5cdd9e
Author: Stephen Finucane <email address hidden>
Date: Thu Dec 12 15:40:15 2019 +0000

    Revert "docs: Highlight the current broken state of SEV" (partially)

    This partially reverts commit 9545edc79d75e75d47c51d5da651975c01e919ec.
    The change in the admin docs is removed but the release note is kept
    since that applies to 20.0 GA and any patch releases made before this
    fix was merged.

    Change-Id: I8cd8594408480aa83e88b0c2e91189a656d7890d
    Signed-off-by: Stephen Finucane <email address hidden>
    Related-Bug: 1845986

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.1.0

This issue was fixed in the openstack/nova 20.1.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.