glibc 2.38/armhf: can't find libgcc_s.so.1 during tests

Bug #2031495 reported by Simon Chopin
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
glibc (Ubuntu)  Fix Released  High        Unassigned

Bug Description

[impact]

The glibc testsuite has been broken on armhf for mantic. This needs to be fixed to help ensure that future SRUs don't introduce unseen regressions on that arch.

[test case]

Ensure that the armhf autopkgtests are successful in -proposed, and that the testsuite has been run in its entirety.

[regression potential]

While there aren't any direct regression risks at runtime, if this patch is wrong the tests could import the wrong versions of the library, resulting in hard-to-debug test failures.

[original report]
glibc 2.38-1ubuntu3 tests fail with the following failure (on multiple instances):

4112s /tmp/autopkgtest.rMSc2W/build.aCV/src/build-tree/armhf-prof/elf/tst-pldd: error while loading shared libraries: libgcc_s.so.1: cannot open shared object file: No such file or directory

Simon Chopin (schopin)
tags: added: foundations-todo
Michael Hudson-Doyle (mwhudson) wrote :

I spent a while poking at this and got completely confused, not even managing to reproduce the failures (possibly because I wasn't running in a container?). If someone manages to get an environment set up where the failing tests can be run and fail in the same way as they do in the logs, I'm certainly happy to do some debugging.

Simon Chopin (schopin) wrote :

Alright, I managed to get it into a state where I could "easily" reproduce it, and after many hours of bisecting on my Pi the culprit seems to be 1d5024f4f052c12e404d42d3b5bfe9c3e9fd27c4 (confirmed by reverting it on master).

I can't do too much more about it since I'm off for a week.

Michael Hudson-Doyle (mwhudson) wrote :

I guess I'm not overly surprised that adding "-fexceptions -fasynchronous-unwind-tables" to a link command adds a dependence on libgcc_s.so.1. Don't understand why it causes this failure though.
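
One way to confirm the new dependency is to list the DT_NEEDED entries of a test binary. What follows is a minimal sketch, assuming `readelf -d` output in its usual binutils layout. The helper name `needed_libs` and the sample text are illustrative and were not captured from the failing build.

```python
import re

# Matches dynamic-section lines such as:
#  0x00000001 (NEEDED)   Shared library: [libgcc_s.so.1]
_NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[(?P<soname>[^\]]+)\]")


def needed_libs(readelf_dynamic_output: str) -> list:
    """Return the sonames listed as DT_NEEDED in `readelf -d` output."""
    return [m.group("soname") for m in _NEEDED_RE.finditer(readelf_dynamic_output)]


# Illustrative sample only (assumption), not taken from the armhf logs.
READELF_SAMPLE = """\
Dynamic section at offset 0xf08 contains 26 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libgcc_s.so.1]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000c (INIT)                       0x4f8
"""

if __name__ == "__main__":
    print(needed_libs(READELF_SAMPLE))
```

Running the same parser on the real output of `readelf -d` for a test binary built before and after the suspect commit would show whether libgcc_s.so.1 only appears with the new flags.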

Steve Langasek (vorlon) wrote :

This bug shows up on update_excuses for glibc 2.38-1ubuntu5 but there are no failing autopkgtests. Presumed fixed.

Changed in glibc (Ubuntu):
status: Triaged → Fix Released
Simon Chopin (schopin) wrote :

It wasn't fixed, but since it only impacts the -prof variant of libc6 and due to the closeness of the release I preferred not to hold the upload on that. Reopening.

Changed in glibc (Ubuntu):
status: Fix Released → In Progress
tags: removed: update-excuse
Changed in glibc (Ubuntu):
importance: Critical → High
Simon Chopin (schopin) wrote :

My current state of investigation is that the new dependency on libgcc_s.so.1 exposes a bug in the code building the testroot: it basically copies the few .so dependencies of a test binary into the testroot using the path as exposed by ld.so. However, that path doesn't necessarily match the ld.so search path due to usr-merge.

AFAICT it wasn't a problem so far, because there was no actual dependency on libgcc_s.so.1 in the tests themselves, until the new flags added the dep for some architectures, notably armhf. And it only fails on the -prof variant because the standard slibdir is /lib, so ld.so would still find the lib, but -prof changes slibdir to its own -prof subdirectory.

Now, I still don't know why it fails in the autopkgtests but the build itself passes. Might be from different filesystems being used, or just whatever syscall is used to compute said path behaves differently on an armhf kernel vs an arm64. I'm still trying to figure out where in the code the path is first computed, to see if it's easily fixable there, but an alternative solution would be to detect a usr-merged system when building the testroot, and mirror it there.
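
The path mismatch, and the alternative of mirroring usr-merge in the testroot, can be sketched with a toy model. This is not glibc's testroot code: the helper names, the `/lib/triplet` and `/usr/lib/triplet` directories, and the single-entry search path are all hypothetical, and the model ignores the ld.so cache entirely.

```python
import os
import shutil
import tempfile


def is_usr_merged(root: str = "/") -> bool:
    """True if <root>/lib is a symlink, as it is on a usr-merged system."""
    return os.path.islink(os.path.join(root, "lib"))


def mirror_usr_merge(testroot: str, top_dirs=("lib", "bin", "sbin")) -> None:
    """Create <testroot>/<d> -> usr/<d> symlinks for entries that do not exist yet."""
    for d in top_dirs:
        os.makedirs(os.path.join(testroot, "usr", d), exist_ok=True)
        link = os.path.join(testroot, d)
        if not os.path.lexists(link):
            os.symlink(os.path.join("usr", d), link)


def install_dep(testroot: str, reported_path: str, source_file: str) -> str:
    """Copy a dependency into the testroot at the path the loader reported."""
    dest = os.path.join(testroot, reported_path.lstrip("/"))
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.copyfile(source_file, dest)
    return dest


def lookup(testroot: str, search_dirs, soname: str):
    """Return the first match for soname in search_dirs under testroot, else None."""
    for d in search_dirs:
        candidate = os.path.join(testroot, d.lstrip("/"), soname)
        if os.path.exists(candidate):
            return candidate
    return None


def demo() -> tuple:
    """Return (found_without_mirror, found_with_mirror) for the hypothetical layout."""
    results = []
    for mirrored in (False, True):
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "dummy.so")
            with open(src, "wb") as f:
                f.write(b"\0")
            testroot = os.path.join(tmp, "testroot")
            os.makedirs(testroot)
            if mirrored:
                mirror_usr_merge(testroot)
            # Assumption: the library is reported under /lib/..., while the
            # variant under test only searches /usr/lib/... for it.
            install_dep(testroot, "/lib/triplet/libgcc_s.so.1", src)
            found = lookup(testroot, ["/usr/lib/triplet"], "libgcc_s.so.1")
            results.append(found is not None)
    return tuple(results)


if __name__ == "__main__":
    print(demo())
```

In this model the lookup only succeeds once the testroot carries the same `lib -> usr/lib` symlink as the host. A real implementation would presumably call something like `is_usr_merged("/")` first and mirror only when the host is actually merged.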

Simon Chopin (schopin) wrote :

After some more digging, it turns out that the reason the paths don't match is that the tested ld.so still uses the system /etc/ld.so.cache when identifying the DSO to load.

The fix for this should be fairly straightforward, I'll write a patch and send it upstream tomorrow.
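
To see which path the system cache hands out for a given soname, the output of `ldconfig -p` can be inspected. Below is a small sketch assuming the usual `soname (flags) => path` line format. The `cache_paths` helper and the sample entries are illustrative, not copied from the affected machine.

```python
import re

# Matches cache entries such as:
#   libgcc_s.so.1 (libc6,hard-float) => /lib/arm-linux-gnueabihf/libgcc_s.so.1
_ENTRY_RE = re.compile(r"^\s*(?P<soname>\S+) \((?P<flags>[^)]*)\) => (?P<path>\S+)\s*$")


def cache_paths(ldconfig_p_output: str, soname: str) -> list:
    """Return every path that `ldconfig -p` output lists for soname, in order."""
    paths = []
    for line in ldconfig_p_output.splitlines():
        m = _ENTRY_RE.match(line)
        if m and m.group("soname") == soname:
            paths.append(m.group("path"))
    return paths


# Illustrative sample only (assumption); real flags and paths may differ.
LDCONFIG_SAMPLE = (
    "2 libs found in cache `/etc/ld.so.cache'\n"
    "\tlibgcc_s.so.1 (libc6,hard-float) => /lib/arm-linux-gnueabihf/libgcc_s.so.1\n"
    "\tlibc.so.6 (libc6,hard-float) => /lib/arm-linux-gnueabihf/libc.so.6\n"
)

if __name__ == "__main__":
    print(cache_paths(LDCONFIG_SAMPLE, "libgcc_s.so.1"))
```

Comparing the path returned here with the directories the freshly built ld.so searches inside the testroot would make the mismatch visible.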

Simon Chopin (schopin) wrote :
tags: removed: foundations-todo
Simon Chopin (schopin)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package glibc - 2.38-3ubuntu1

---------------
glibc (2.38-3ubuntu1) noble; urgency=medium

  * debian/patches/git-updates.diff: update from upstream stable branch
    Dropped changes, superseded by the upstream git updates:
    - debian/patches/CVE-2023-4911.patch: terminate immediately if end of
      input is reached in elf/dl-tunables.c.
    - d/p/u/0001-Fix-leak-in-getaddrinfo-introduced-by-the-fix-for-CV:
      Cherry-picked to fix a regression in one of the previous CVE fixes
  * Merge 2.38-3 from Debian experimental
    Dropped changes, included in Debian:
    - debian/patches/hurd-i386/git-powerpc-longjmp.diff: Fix build after chk
      hidden builtin fix.
  * Drop d/p/lp2032624.patch as advised by upstream.
    Downstream users will have to actually implement those types or stop
    pretending they're GCC. (LP: #2032624)
  * d/p/lp2031495.patch: fix test suite on armhf for -prof variant
    (LP: #2031495)
  * d/control.in/i386: fix math-vector-fortran.h file move (LP: #2039234)

 -- Simon Chopin <email address hidden> Mon, 23 Oct 2023 18:54:07 +0200

Changed in glibc (Ubuntu):
status: In Progress → Fix Released