Fix for zlib CRC32 optimization for s390x

Bug #1982583 reported by Ilya Leoshkevich
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu on IBM z Systems
In Progress
High
Skipper Bug Screeners
zlib (Ubuntu)
In Progress
High
Unassigned
Jammy
Incomplete
Undecided
Unassigned
Kinetic
Won't Fix
Undecided
Unassigned

Bug Description

SRU Justification:
------------------

[ Impact ]

 * There were two issues identified in the current
   zlib CRC32 optimization for s390x implementation:

 * 1) s390_crc32_vx() signature mismatch
      which causes a warning

 * 2) '-DS390_CRC32_VX' was not added to SFLAGS
      which results in vectorization being enabled only in the static library.

 * The fixes are quite small and affect each only one line:

 * 1) by using unsigned longs instead of uint32_t in s390_crc32_vx declaration

 * 2) by add line 'SFLAGS="$SFLAGS -DS390_CRC32_VX"'

[ Test Plan ]

 * An affected Ubuntu release ([20.04], 22.04 and 22.10) installed
   on a z15/LinuxONE III or newer system is needed.

 * Then it's possible to test the updated package with the help
   of a small test program (in C) that checks for
   s390_crc32_vx() signature mismatches.

 * The bug reporter has a set of s390x-specific tests that will be executed.

 * Test will be done by IBM.

[ Where problems could occur ]

 * The fixes are each limited to one line, hence there are
   not many issues to expect, other than:

 * Typos (e.g. in the flags), mixing of CFLAGS and SFLAGS,

 * in case the changed data type in s390_crc32_vx is causing issues
   inside of s390_crc32_vx or in other parts of the code.

 * Structural and syntactical issues can be identified with a test build
   that was done for all affected Ubuntu releases and for all major archs:
   https://launchpad.net/~fheimes/+archive/ubuntu/lp1990379+lp1982583

[ Other Info ]

 * This bug (LP#1982583) is solved in combination with LP#1990379,
   so that only one package update is needed.
   However, LP#1990379 also affects Focal, but this bug only Jammy and Kinetic.

 * To fix LP#1990379 also for focal the debdiff mentioned there is needed, too.
__________

'zlib CRC32 optimization for s390x works only in a static library'

I've discovered two issues in lp1932010-ibm-z-add-vectorized-crc32-implementation.patch:

1) s390_crc32_vx() signature mismatch, resulting in a warning.
2) -DS390_CRC32_VX is not added to SFLAGS, resulting in vectorization being enabled only in the static library.

I've attached the updated patch.

Revision history for this message
Ilya Leoshkevich (iii-i) wrote :
bugproxy (bugproxy)
tags: added: architecture-s39064 bugnameltc-199129 severity-high targetmilestone-inin---
Frank Heimes (fheimes)
description: updated
summary: - zlib CRC32 optimization for s390x works only in a static library
+ Fix for zlib CRC32 optimization for s390x
Revision history for this message
Frank Heimes (fheimes) wrote :

I transferred the modification into a separate patch (d/p/lp1982583-fix-for-zlib-crc32-optimization-for-s390x.patch),
and build a patched version (for all major architectures) in PPA for jammy and kinetic:
https://launchpad.net/~fheimes/+archive/ubuntu/lp1990379+lp1982583
and attaching here the debdiff for jammy and kinetic - which are for both LP bugs, this LP#1982583 and LP#1990379.

Changed in ubuntu-z-systems:
assignee: nobody → Skipper Bug Screeners (skipper-screen-team)
Changed in zlib (Ubuntu):
importance: Undecided → High
Changed in ubuntu-z-systems:
importance: Undecided → High
Changed in zlib (Ubuntu):
status: New → In Progress
Changed in ubuntu-z-systems:
status: New → In Progress
Frank Heimes (fheimes)
description: updated
description: updated
tags: added: foundation-triage-discuss
Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2022-10-10 19:01 EDT-------
Thanks! With

https://ppa.launchpadcontent.net/fheimes/lp1990379+lp1982583/ubuntu kinetic/main s390x zlib1g s390x 1:1.2.11.dfsg-4.1ubuntu2

the following command:

python3 -c 'import zlib; zlib.crc32(b"X"*5000000000)'

begins to work ~2x faster.

Revision history for this message
Simon Chopin (schopin) wrote :

Folded into the LP #1990379 uploads to Jammy, Kinetic and Lunar.

Revision history for this message
Steve Langasek (vorlon) wrote :

> The fixes are each limited to one line, hence there are not many issues to expect

Except that one of these one-line changes is stated to enable a new optimization path in the shared library which was accidentally not enabled before. Since almost everything in Ubuntu uses the shared library and not the static library, this code is basically untested in Ubuntu up until now, and this is a core library with many reverse-dependencies!

The code was originally introduced in impish, and the related bug LP: #1932010 doesn't show anything in the way of a test plan.

I am holding this SRU until the change a) has landed in the devel series, b) there is a more comprehensive test plan to check for regressions (including on non-z15 hardware) than a small test program.

Changed in zlib (Ubuntu Kinetic):
status: New → Incomplete
Changed in zlib (Ubuntu Jammy):
status: New → Incomplete
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2022-11-22 04:52 EDT-------
As a comprehensive test plan I would suggest the following:

* Run zlib-ng testsuite against Ubuntu zlib. This is possible since https://github.com/zlib-ng/zlib-ng/commit/e63f36b1cf615a81e2cfa2d97fc54a5f493c9c19, the testsuite is quite comprehensive.

* Run pyzlib testsuite - https://github.com/iii-i/pyzlib. It stresses some DFLTCC-related corner cases, might be useful for CRC-32 as well.

* Do a large transfer with rsync -z. Copying locally is enough.

* Do a stress-ng run, e.g. stress-ng --zlib --timeout 60s.

This needs to be run 4 times:

- On a >=z15 machine.
- On a >=z15 machine with DFLTCC=0 environment variable.
- On a <=z14 machine (no DFLTCC).
- On a <=zEC12 machine (no vgfma at runtime).

Revision history for this message
Ilya Leoshkevich (iii-i) wrote :

I ran tests 1-3 with Frank's 1:1.2.13.dfsg-1ubuntu3. They all pass; the performance improvement is also measurable.

Test 4 turned out to be meaningless: Ubuntu requires at least z13.

Revision history for this message
Utkarsh Gupta (utkarsh) wrote :

Ubuntu 22.10 (Kinetic Kudu) has reached end of life, so this bug will not be fixed for that specific release.

Changed in zlib (Ubuntu Kinetic):
status: Incomplete → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.