s390/dasd: reduce the default queue depth and nr of hardware queues

Bug #1852257 reported by bugproxy
This bug affects 1 person
Affects                    Status         Importance   Assigned to
Ubuntu on IBM z Systems    Fix Released   High         Frank Heimes
linux (Ubuntu)             Fix Released   Undecided    Skipper Bug Screeners
  Bionic                   Fix Released   High         Unassigned

Bug Description

SRU Justification:
==================

[Impact]

* On s390x systems with a small memory footprint but a large number of DASD disks, memory can get depleted (even during installation), which can eventually lead to a situation where the OOM killer kicks in (followed by even more problems).

* Starting with kernel 4.18, the patch below reduces memory consumption per active DASD device by about 90%.

* The below backport is needed to fix this and get the improvement into bionic's kernel 4.15.
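
To illustrate why a 90% saving per device matters on small systems, here is a minimal arithmetic sketch. The per-DASD footprint and the memory budget used in the example call are hypothetical placeholders chosen for illustration only; they are not measurements from this bug. The function names are likewise made up for this sketch.

```python
def patched_footprint_kib(stock_kib: int, savings_percent: int = 90) -> int:
    """Per-DASD footprint after applying the given percentage saving."""
    if not 0 <= savings_percent <= 100:
        raise ValueError("savings_percent must be between 0 and 100")
    return stock_kib * (100 - savings_percent) // 100


def max_active_dasds(budget_kib: int, per_dasd_kib: int) -> int:
    """How many DASDs fit into a memory budget at a given per-device cost."""
    if per_dasd_kib <= 0:
        raise ValueError("per_dasd_kib must be positive")
    return budget_kib // per_dasd_kib


if __name__ == "__main__":
    # ASSUMPTION: 20 MiB per DASD on the stock kernel and an 800 MiB budget
    # are hypothetical values, not numbers taken from the bug report.
    stock = 20 * 1024
    budget = 800 * 1024
    patched = patched_footprint_kib(stock)
    print("stock kernel:  ", max_active_dasds(budget, stock), "DASDs")
    print("patched kernel:", max_active_dasds(budget, patched), "DASDs")
```

With any starting footprint, a 90% saving corresponds to the 10:1 ratio per activated DASD, so the same memory budget holds roughly ten times as many active devices.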

[Fix]

* 3284da34a87ab7a527a593f89bbdaf6debe9e713 3284da3 "s390/dasd: reduce the default queue depth and nr of hardware queues"

* Backport: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1852257/+attachment/5304830/+files/0001-s390-dasd-reduce-the-default-queue-depth-and-nr-of-h.patch

[Test Case]

* Configure an s390x system (z/VM guest or LPAR) with only a little RAM, but lots of DASD devices.

* Now gradually enable more and more DASDs and monitor the memory usage (be sure to exclude shared memory and cache).

* One can notice a difference in memory usage of about 10:1 per activated DASD when comparing the current stock 4.15 kernel with a patched 4.15 kernel.

* With less than about 1GB of memory and 40+ DASD devices one may start to run into an OOM situation w/o the patch.
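
The monitoring step in the test case could be sketched roughly as follows. This is only one possible way to do it, assuming that "used" memory is computed as MemTotal - MemFree - Buffers - Cached - SReclaimable from /proc/meminfo (shared memory is accounted within Cached there, so it is excluded as well). The helper names are made up for this sketch, and enabling the DASDs between the two snapshots is left to the tester.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def used_kib(info):
    """Used memory without buffers and cache (cache includes shared memory)."""
    return (info["MemTotal"] - info["MemFree"] - info.get("Buffers", 0)
            - info.get("Cached", 0) - info.get("SReclaimable", 0))


def per_dasd_kib(before, after, newly_enabled):
    """Average growth in used memory per DASD enabled between two snapshots."""
    if newly_enabled <= 0:
        raise ValueError("newly_enabled must be positive")
    return (used_kib(after) - used_kib(before)) / newly_enabled


def read_meminfo(path="/proc/meminfo"):
    """Take a snapshot of the running system."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

A possible usage: take a snapshot with `read_meminfo()`, enable a known number of DASDs, take a second snapshot, and pass both to `per_dasd_kib()`. Running this on the stock and on the patched kernel gives the two per-device figures to compare.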

[Regression Potential]

* The regression potential can be considered as moderate, since:

* this is purely s390x specific

* it again only affects DASD disk storage (no zFCP/SCSI disk storage)

* and it's again only limited to smaller systems (so more z/VM guests rather than LPARs).

[Other Info]

* A cherry-pick to 4.15 wasn't clean (problem in one line), hence the backport, which applied, compiled and worked fine.

__________

This fix is already available with Ubuntu 18.10, but it also needs to be integrated into the Ubuntu 18.04 LTS kernel (4.15).

The upstream git commit is available at:
https://github.com/torvalds/linux/commit/3284da34a87a

Backport information will be provided.

bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-182420 severity-high targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → High
status: New → Triaged
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
assignee: nobody → Frank Heimes (frank-heimes)
bugproxy (bugproxy) wrote : reduce the default queue depth and nr of hardware queues

------- Comment on attachment From <email address hidden> 2019-11-12 10:22 EDT-------

This is the original patch re-based on the 18.04 master-next branch. Hope this is sufficient for you.

Frank Heimes (fheimes) wrote :

Thanks Stefan, looks good - applies and compiles fine.
I'll proceed with the kernel SRU...

Changed in ubuntu-z-systems:
status: Triaged → Confirmed
Frank Heimes (fheimes)
description: updated
Frank Heimes (fheimes) wrote :

Kernel SRU request submitted:
https://lists.ubuntu.com/archives/kernel-team/2019-November/thread.html#105478

Changing status in the 'Bionic' nomination to 'In Progress'
and status in 'linux (Ubuntu)' (which reflects the current development release, hence today 'Focal') to 'Fix Released', since the patch was already accepted upstream with 4.18.

Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu):
status: New → Fix Released
Changed in ubuntu-z-systems:
status: Confirmed → In Progress
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Frank Heimes (fheimes) wrote :

I successfully verified that memory usage, especially in environments with multiple DASD disks (tested with up to 8 DASDs), is reduced significantly when kernel 4.15.0.73 from bionic proposed is used.

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-74.84

---------------
linux (4.15.0-74.84) bionic; urgency=medium

  * bionic/linux: 4.15.0-74.84 -proposed tracker (LP: #1856749)

  * [Hyper-V] KVP daemon fails to start on first boot of disco VM (LP: #1820063)
    - [Packaging] bind hv_kvp_daemon startup to hv_kvp device

  * Unrevert "arm64: Use firmware to detect CPUs that are not affected by
    Spectre-v2" (LP: #1854207)
    - arm64: Get rid of __smccc_workaround_1_hvc_*
    - arm64: Use firmware to detect CPUs that are not affected by Spectre-v2

  * Bionic kernel panic on Cavium ThunderX CN88XX (LP: #1853485)
    - SAUCE: irqchip/gic-v3-its: Add missing return value in
      its_irq_domain_activate()

linux (4.15.0-73.82) bionic; urgency=medium

  * bionic/linux: 4.15.0-73.82 -proposed tracker (LP: #1854819)

  * CVE-2019-14901
    - SAUCE: mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()

  * CVE-2019-14896 // CVE-2019-14897
    - SAUCE: libertas: Fix two buffer overflows at parsing bss descriptor

  * CVE-2019-14895
    - SAUCE: mwifiex: fix possible heap overflow in mwifiex_process_country_ie()

  * CVE-2019-18660: patches for Ubuntu (LP: #1853142) // CVE-2019-18660
    - powerpc/64s: support nospectre_v2 cmdline option
    - powerpc/book3s64: Fix link stack flush on context switch
    - KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel

  * Please add patch fixing RK818 ID detection (LP: #1853192)
    - SAUCE: mfd: rk808: Fix RK818 ID template

  * [SRU][B/OEM-B/OEM-OSP1/D] Enable new Elan touchpads which are not in current
    whitelist (LP: #1853246)
    - HID: quirks: Fix keyboard + touchpad on Lenovo Miix 630
    - Input: elan_i2c - export the device id whitelist
    - HID: quirks: Refactor ELAN 400 and 401 handling

  * Lenovo dock MAC Address pass through doesn't work in Ubuntu (LP: #1827961)
    - r8152: Add macpassthru support for ThinkPad Thunderbolt 3 Dock Gen 2

  * s390/dasd: reduce the default queue depth and nr of hardware queues
    (LP: #1852257)
    - s390/dasd: reduce the default queue depth and nr of hardware queues

  * External microphone can't work on some dell machines with the codec alc256
    or alc236 (LP: #1853791)
    - SAUCE: ALSA: hda/realtek - Move some alc256 pintbls to fallback table
    - SAUCE: ALSA: hda/realtek - Move some alc236 pintbls to fallback table

  * Memory leak in net/xfrm/xfrm_state.c - 8 pages per ipsec connection
    (LP: #1853197)
    - xfrm: Fix memleak on xfrm state destroy

  * CVE-2019-19083
    - drm/amd/display: memory leak

  * update ENA driver for DIMLIB dynamic interrupt moderation (LP: #1853180)
    - net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it
    - net: ena: switch to dim algorithm for rx adaptive interrupt moderation
    - net: ena: reimplement set/get_coalesce()
    - net: ena: enable the interrupt_moderation in driver_supported_features
    - net: ena: remove code duplication in
      ena_com_update_nonadaptive_moderation_interval _*()
    - net: ena: remove old adaptive interrupt moderation code from ena_netdev
    - net: ena: remove ena_restore_ethtool_params() and relevant fields
    - net: ena: remov...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in ubuntu-z-systems:
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2020-01-07 02:54 EDT-------
IBM Bugzilla Status-> closed, Fix Released with Bionic
