[20.04] Allow to reset an opencapi adapter

Bug #1862121 reported by bugproxy
This bug affects 1 person
Affects                           Status        Importance  Assigned to                             Milestone
The Ubuntu-power-systems project  Fix Released  Medium      Ubuntu on IBM Power Systems Bug Triage
linux (Ubuntu)                    Fix Released  Undecided   Canonical Kernel Team

Bug Description

== Comment: #0 - Frederic Barrat <email address hidden> - 2020-01-27 10:56:49 ==
---Problem Description---
We've added code in firmware that allows resetting an opencapi adapter on a powerpc system and retraining the opencapi link.

On Linux, resetting the opencapi link reuses the existing PCI hotplug framework, but a bit of enablement code is missing, and we'd like to add it to Ubuntu 20.04 since it's an LTS release.

Contact Information = <email address hidden>

---Additional Hardware Info---
opencapi adapter needed

---uname output---
Linux wsp02 5.3.0-26-generic #28-Ubuntu SMP Wed Dec 18 05:34:53 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = all P9 supporting opencapi: mihawk and AC922 (witherspoon)

---Debugger---
A debugger is not configured

Stack trace output:
 no

Oops output:
 no

System Dump Info:
  The system is not configured to capture a system dump.

*Additional Instructions for <email address hidden>:
-Attach sysctl -a output to the bug.

== Comment: #1 - Frederic Barrat <email address hidden> - 2020-01-27 11:00:49 ==
The linux code in question has been on the mailing list for a while and is expected to be merged in the 5.6 merge window opening today.
Here is a link to the patches on the mailing list. I'll post the official commit IDs once they are known.

http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=144336

Once merged upstream, is it possible to add them to the 20.04 kernel? Of course, I'll handle the backport if it's needed.

== Comment: #3 - Frederic Barrat <email address hidden> - 2020-01-30 02:51:05 ==
The patches have been merged into the powerpc maintainer's 'next' tree, will be in kernel 5.6, and now have official commit IDs:
05dd7da76986 powerpc/powernv/ioda: Fix ref count for devices with their own PE
80f1ff83fa11 powerpc/powernv/ioda: Protect PE list
c1a2feade085 powerpc/powernv/ioda: set up PE on opencapi device when enabling
f724385fea01 powerpc/powernv/ioda: Release opencapi device
bbb789046084 powerpc/powernv/ioda: Find opencapi slot for a device node
658ab186dd22 pci/hotplug/pnv-php: Remove erroneous warning
323c2a26ff43 pci/hotplug/pnv-php: Improve error msg on power state change failure
ea53919b15bf pci/hotplug/pnv-php: Register opencapi slots
be1611e043de pci/hotplug/pnv-php: Relax check when disabling slot
748ac391ab9a pci/hotplug/pnv-php: Wrap warnings in macro
49ce94b8677c ocxl: Add PCI hotplug dependency to Kconfig

They apply cleanly on kernel v5.4 and v5.5, starting from the bottom.
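
A minimal sketch of what "starting from the bottom" could look like in practice, reading that sentence literally: the list above is reversed so that the last entry is applied first. The helper name `cherry_pick_commands` is hypothetical, and the script only prints the commands; it does not run git.

```python
# Commit IDs exactly as listed in the comment, top to bottom.
COMMITS_AS_LISTED = [
    "05dd7da76986",
    "80f1ff83fa11",
    "c1a2feade085",
    "f724385fea01",
    "bbb789046084",
    "658ab186dd22",
    "323c2a26ff43",
    "ea53919b15bf",
    "be1611e043de",
    "748ac391ab9a",
    "49ce94b8677c",
]


def cherry_pick_commands(commits_as_listed):
    """Return git commands in application order (bottom of the list first).

    Assumption: "starting from the bottom" means the last listed commit
    is applied first. -x records the original commit ID in the message.
    """
    return [f"git cherry-pick -x {sha}" for sha in reversed(commits_as_listed)]


if __name__ == "__main__":
    for command in cherry_pick_commands(COMMITS_AS_LISTED):
        print(command)
```

The printed commands would be run inside a checkout of the target kernel tree (v5.4 or v5.5 according to the comment).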

They impact powerpc-only code and even more specifically the powernv platform.
The PCI hotplug module modifications for powernv are to make the hotplug driver aware of the new opencapi slots.
The "ioda" modifications are to manage the state of opencapi devices properly, since they can now be destroyed and recreated when the opencapi link is reset.

Could the patches be added to the kernel which will be used for Ubuntu 20.04?
Thanks
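
For readers unfamiliar with the PCI hotplug framework: slots registered with it appear under /sys/bus/pci/slots, each with a `power` attribute. The sketch below power-cycles one slot through that attribute. It assumes that this power cycle is what triggers the opencapi reset once the patches are in place; the report does not spell out the user-space steps, and the function name and slot name are hypothetical.

```python
import time
from pathlib import Path

# Standard sysfs location of PCI hotplug slot directories.
PCI_SLOTS_DIR = Path("/sys/bus/pci/slots")


def power_cycle_slot(slot_name, slots_dir=PCI_SLOTS_DIR, settle_seconds=1.0):
    """Power a hotplug slot off and back on via its sysfs 'power' attribute.

    Assumption: power-cycling the slot is what resets the opencapi adapter
    and retrains the link; the bug report does not describe this step.
    """
    power_file = Path(slots_dir) / slot_name / "power"
    if not power_file.exists():
        raise FileNotFoundError(f"no hotplug slot {slot_name!r} under {slots_dir}")
    power_file.write_text("0")  # slot off: the device behind it is removed
    time.sleep(settle_seconds)
    power_file.write_text("1")  # slot on: the device is rediscovered
    return power_file.read_text().strip()
```

On a real system this needs root, and the slot name has to be taken from the directory listing of /sys/bus/pci/slots rather than guessed.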

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-183470 severity-medium targetmilestone-inin2004
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
importance: Undecided → Medium
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
status: New → Triaged
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
Frank Heimes (fheimes) wrote :

All the 11 commits can be found in linux-next and are tagged with:
next-20200130
next-20200205
next-20200207
So highly likely to be pulled into 5.6.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-02-11 05:12 EDT-------
(In reply to comment #7)
> All the 11 commits can be found in linux-next and are tagged with:
> next-20200130
> next-20200205
> next-20200207
> So highly likely to be pulled into 5.6.

The commits are now in 5.6-rc1

Any definitive decision about the kernel version used as a base for 20.04?

Frank Heimes (fheimes) wrote : Re: Allow to reset an opencapi adapter

The kernel version for Ubuntu 20.04 is fixed, it will be kernel 5.4, see: https://kernel.ubuntu.com/

The discussion is ongoing whether this can be cherry-picked/backported into 5.4 or not...

Frank Heimes (fheimes)
summary: - Allow to reset an opencapi adapter
+ [20.04] Allow to reset an opencapi adapter
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-02-19 02:53 EDT-------
Some additional comments about the impact of the patches. They only impact powerpc and more specifically the powernv platform. Opencapi devices are only supported on the powernv platform, and require a Power9 processor.

The first 5 patches (powerpc/powernv/pci-ioda) impact the opencapi PCI devices, and to a much lesser extent the nvlink (GPU) devices.
Patch 05dd7da76986 ("powerpc/powernv/ioda: Fix ref count for devices with their own PE") technically impacts nvlink (GPUs) PCI devices, but I'm only replacing a refcounting leak by a (oh so slightly) less ugly leak.
Patch 80f1ff83fa11("powerpc/powernv/ioda: Protect PE list") is for both nvlink and opencapi, but it looks pretty safe. nvlink devices are discovered at boot and never go away. Opencapi devices, with that series, can now come and go, that's why I'm adding the extra lock protection.
The 3 other patches for powerpc/powernv/pci-ioda really impact opencapi PCI devices only.
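
To illustrate why a list needs lock protection once its members can come and go at runtime, here is a user-space analogy in Python. It is not the kernel code from patch 80f1ff83fa11; the class `PEList` and its methods are hypothetical and only show the general pattern of guarding every access to a shared list with one lock.

```python
import threading


class PEList:
    """Toy stand-in for a list whose entries can be added and removed at runtime."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []

    def add(self, pe):
        with self._lock:
            self._entries.append(pe)

    def remove(self, pe):
        with self._lock:
            self._entries.remove(pe)

    def snapshot(self):
        # Copy under the lock so callers never iterate a list that is changing.
        with self._lock:
            return list(self._entries)


if __name__ == "__main__":
    pes = PEList()
    pes.add("boot-time-device")      # discovered at boot, never goes away
    pes.add("hotplugged-device")     # may be removed and re-added on reset
    pes.remove("hotplugged-device")
    print(pes.snapshot())
```

With devices that are only ever added at boot, unlocked access is harmless in practice; the lock matters as soon as removal can race with a walk of the list.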

The second part of the series is for the powernv PCI hotplug driver (pnv-php) but there's little there other than adding the new opencapi slots. For existing PCI slots, it's mostly cosmetics.

Finally, the last patch is to add a dependency on the PCI hotplug framework for the ocxl driver. Ocxl is the opencapi driver. As mentioned above, it is only used on Power9 and it's the driver I'm maintaining.

I hope it clarifies things. I do believe the impact is fairly limited outside of opencapi device support and I'm trying pretty hard to make that work.

Frank Heimes (fheimes) wrote :

Hello Frederic, thank you, this is really helpful for understanding the impact and potential risks.
This LP ticket is still on the list of tickets for 20.04.
The final decision will be made by the kernel team.
Please keep in mind that the kernel freeze for 20.04 is at the beginning of April:
https://wiki.ubuntu.com/FocalFossa/ReleaseSchedule
So there is still some time to go, hence it may take some more days to get a final decision.
But stay tuned ...

Frank Heimes (fheimes) wrote :

Patch request submitted:
https://lists.ubuntu.com/archives/kernel-team/2020-February/107733.html
changing status to 'In Progress'.

Changed in ubuntu-power-systems:
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: New → In Progress
Frank Heimes (fheimes) wrote :

The patch was applied:
https://lists.ubuntu.com/archives/kernel-team/2020-February/107820.html
hence changing the status to Fix Committed.

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2020-03-12 13:13 EDT-------
I've tested the opencapi reset functionality successfully on the following kernel:

Linux zz1 5.4.0-18-generic #22-Ubuntu SMP Sat Mar 7 18:06:34 UTC 2020 ppc64le ppc64le ppc64le GNU/Linux

Marking bug with 'verification-done-focal'
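
A small sketch for checking whether a running focal kernel is at least the tested 5.4.0-18 build, based on the release string format shown in the uname output above. The function names are hypothetical, and the comparison is a plain tuple ordering that is only meant for Ubuntu 5.4 focal kernels; treating any other release string as meaningful is an assumption.

```python
import re

# Base version and ABI number of the kernel tested in this comment (5.4.0-18-generic).
FIXED_BASE = (5, 4, 0)
FIXED_ABI = 18


def parse_release(release):
    """Split a release string such as '5.4.0-18-generic' into ((5, 4, 0), 18)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch, abi = (int(part) for part in match.groups())
    return (major, minor, patch), abi


def has_opencapi_reset_fix(release):
    """Return True if the release is 5.4.0-18 or later by simple tuple ordering."""
    base, abi = parse_release(release)
    return (base, abi) >= (FIXED_BASE, FIXED_ABI)
```

On a live system the release string can be obtained with `platform.release()` from the Python standard library, for example `has_opencapi_reset_fix(platform.release())`.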

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-18.22

---------------
linux (5.4.0-18.22) focal; urgency=medium

  * focal/linux: 5.4.0-18.22 -proposed tracker (LP: #1866488)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync getabis
    - [Packaging] update helper scripts

  * Add sysfs attribute to show remapped NVMe (LP: #1863621)
    - SAUCE: ata: ahci: Add sysfs attribute to show remapped NVMe device count

  * [20.04 FEAT] Compression improvements in Linux kernel (LP: #1830208)
    - lib/zlib: add s390 hardware support for kernel zlib_deflate
    - s390/boot: rename HEAP_SIZE due to name collision
    - lib/zlib: add s390 hardware support for kernel zlib_inflate
    - s390/boot: add dfltcc= kernel command line parameter
    - lib/zlib: add zlib_deflate_dfltcc_enabled() function
    - btrfs: use larger zlib buffer for s390 hardware compression
    - [Config] Introducing s390x specific kernel config option CONFIG_ZLIB_DFLTCC

  * [UBUNTU 20.04] s390x/pci: increase CONFIG_PCI_NR_FUNCTIONS to 512 in kernel
    config (LP: #1866056)
    - [Config] Increase CONFIG_PCI_NR_FUNCTIONS from 64 to 512 starting with focal
      on s390x

  * CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set (LP: #1865332)
    - [Config] CONFIG_IP_MROUTE_MULTIPLE_TABLES=y

  * Dell XPS 13 9300 Intel 1650S wifi [34f0:1651] fails to load firmware
    (LP: #1865962)
    - iwlwifi: remove IWL_DEVICE_22560/IWL_DEVICE_FAMILY_22560
    - iwlwifi: 22000: fix some indentation
    - iwlwifi: pcie: rx: use rxq queue_size instead of constant
    - iwlwifi: allocate more receive buffers for HE devices
    - iwlwifi: remove some outdated iwl22000 configurations
    - iwlwifi: assume the driver_data is a trans_cfg, but allow full cfg

  * [FOCAL][REGRESSION] Intel Gen 9 brightness cannot be controlled
    (LP: #1861521)
    - Revert "USUNTU: SAUCE: drm/i915: Force DPCD backlight mode on Dell Precision
      4K sku"
    - Revert "UBUNTU: SAUCE: drm/i915: Force DPCD backlight mode on X1 Extreme 2nd
      Gen 4K AMOLED panel"
    - SAUCE: drm/dp: Introduce EDID-based quirks
    - SAUCE: drm/i915: Force DPCD backlight mode on X1 Extreme 2nd Gen 4K AMOLED
      panel
    - SAUCE: drm/i915: Force DPCD backlight mode for some Dell CML 2020 panels

  * [20.04 FEAT] Enable proper kprobes on ftrace support (LP: #1865858)
    - s390/ftrace: save traced function caller
    - s390: support KPROBES_ON_FTRACE

  * alsa/sof: load different firmware on different platforms (LP: #1857409)
    - ASoC: SOF: Intel: hda: use fallback for firmware name
    - ASoC: Intel: acpi-match: split CNL tables in three
    - ASoC: SOF: Intel: Fix CFL and CML FW nocodec binary names.

  * [UBUNTU 20.04] Enable CONFIG_NET_SWITCHDEV in kernel config for s390x
    starting with focal (LP: #1865452)
    - [Config] Enable CONFIG_NET_SWITCHDEV in kernel config for s390x starting
      with focal

  * Focal update: v5.4.24 upstream stable release (LP: #1866333)
    - io_uring: grab ->fs as part of async offload
    - EDAC: skx_common: downgrade message importance on missing PCI device
    - net: dsa: b53: Ensure the default VID is untagged
    - net: fib_rules: Correctly set table field when table number exceeds 8 bit...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released