Autopkgtest failures on amd64

Bug #2048768 reported by Danilo Egea Gondolfo
This bug affects 3 people
Affects                       Status                    Importance   Assigned to   Milestone
gcc-10 (Ubuntu)               Confirmed                 Undecided    Unassigned
  Noble                       Won't Fix                 Undecided    Unassigned
gcc-11 (Ubuntu)               Status tracked in Noble
  Noble                       Fix Released              Undecided    Unassigned
gcc-12 (Ubuntu)               Status tracked in Noble
  Noble                       Fix Released              Undecided    Unassigned
gcc-13 (Ubuntu)               Status tracked in Noble
  Noble                       Fix Released              Undecided    Unassigned
gcc-8 (Ubuntu)                Status tracked in Noble
  Noble                       Invalid                   Undecided    Unassigned
gcc-9 (Ubuntu)                Confirmed                 Undecided    Unassigned
  Noble                       Won't Fix                 Undecided    Unassigned
linux (Ubuntu)                Status tracked in Noble
  Noble                       Fix Released              Undecided    Unassigned
llvm-toolchain-14 (Ubuntu)    Confirmed                 Undecided    Unassigned
  Noble                       Won't Fix                 Undecided    Unassigned
llvm-toolchain-15 (Ubuntu)    Confirmed                 Undecided    Unassigned
  Noble                       Won't Fix                 Undecided    Unassigned
llvm-toolchain-16 (Ubuntu)    Status tracked in Noble
  Noble                       Confirmed                 Undecided    Unassigned

Bug Description

Some tests related to the address sanitizer are occasionally failing on amd64 (also for llvm-toolchain-15 and 16):

--------------
FAIL: LLVM regression suite :: test_leaksan.c (38 of 45)
746s ******************** TEST 'LLVM regression suite :: test_leaksan.c' FAILED ********************
746s Script:
746s --
746s : 'RUN: at line 4'; /usr/bin/clang-14 -o /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp -fsanitize=address -g /tmp/autopkgtest.gHVujV/autopkgtest_tmp/tests/test_leaksan.c
746s : 'RUN: at line 5'; env ASAN_OPTIONS="log_path=stdout:exitcode=0" /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp 2>&1 > /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp.out
746s : 'RUN: at line 6'; grep -q "detected memory leaks" /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp.out
746s --
746s Exit Code: 139
746s
746s Command Output (stderr):
746s --
746s /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.script: line 3: 3335 Segmentation fault (core dumped) env ASAN_OPTIONS="log_path=stdout:exitcode=0" /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp 2>&1 > /tmp/autopkgtest.gHVujV/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp.out
--------------

If you run the test manually, it usually works but occasionally crashes:

--------------------
ubuntu@autopkgtest:/tmp/autopkgtest.oXC2FP/autopkgtest_tmp/build/tests/Output$ ./test_leaksan.c.tmp

=================================================================
==8631==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x5e9c3441ed12 in __interceptor_malloc (/tmp/autopkgtest.oXC2FP/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp+0xa3d12) (BuildId: 6f71ac388125722ade1ea86ee3661c0d884dd193)
    #1 0x5e9c3445acb8 in main /tmp/autopkgtest.oXC2FP/autopkgtest_tmp/tests/test_leaksan.c:13:7
    #2 0x7e84e1e280cf (/lib/x86_64-linux-gnu/libc.so.6+0x280cf) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)

SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).
ubuntu@autopkgtest:/tmp/autopkgtest.oXC2FP/autopkgtest_tmp/build/tests/Output$ ./test_leaksan.c.tmp

=================================================================
==8634==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x5f19be5f6d12 in __interceptor_malloc (/tmp/autopkgtest.oXC2FP/autopkgtest_tmp/build/tests/Output/test_leaksan.c.tmp+0xa3d12) (BuildId: 6f71ac388125722ade1ea86ee3661c0d884dd193)
    #1 0x5f19be632cb8 in main /tmp/autopkgtest.oXC2FP/autopkgtest_tmp/tests/test_leaksan.c:13:7
    #2 0x77c7d3c280cf (/lib/x86_64-linux-gnu/libc.so.6+0x280cf) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)

SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).

ubuntu@autopkgtest:/tmp/autopkgtest.oXC2FP/autopkgtest_tmp/build/tests/Output$ ./test_leaksan.c.tmp
Segmentation fault (core dumped)
--------------------

After some investigation I found that it will not fail with ASLR disabled:

$ sudo sysctl kernel.randomize_va_space=0

$ while : ; do env ASAN_OPTIONS="log_path=stdout:exitcode=0" ./test_leaksan.c.tmp >/dev/null; if [ $? -ne 0 ] ; then echo crashed ; fi done

If you enable ASLR it will start to crash:

$ sudo sysctl kernel.randomize_va_space=2

$ while : ; do env ASAN_OPTIONS="log_path=stdout:exitcode=0" ./test_leaksan.c.tmp >/dev/null; if [ $? -ne 0 ] ; then echo crashed ; fi done
Segmentation fault (core dumped)
crashed
Segmentation fault (core dumped)
crashed
Segmentation fault (core dumped)
crashed
Segmentation fault (core dumped)
crashed
Segmentation fault (core dumped)
crashed

With ASLR enabled system-wide, running the binary under "setarch -R" (which disables ASLR for that process only) also avoids the crash.
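
The loops above can be wrapped in a small helper that counts failures over a fixed number of runs, which makes it easier to compare runs with and without ASLR. This is only a sketch: count_failures is a hypothetical helper name, not part of any test suite mentioned in this report.

```shell
# count_failures N CMD [ARGS...]
# Runs CMD N times, discards its output, and prints how many runs
# exited with a non-zero status (a segfault shows up as status 139).
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

Possible usage, assuming the test binary from this report is in the current directory:

    count_failures 100 env ASAN_OPTIONS="log_path=stdout:exitcode=0" ./test_leaksan.c.tmp
    count_failures 100 setarch "$(uname -m)" -R env ASAN_OPTIONS="log_path=stdout:exitcode=0" ./test_leaksan.c.tmp

If the analysis in this report holds, the first count should be non-zero on an affected kernel and the second should be 0.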

Revision history for this message
Danilo Egea Gondolfo (danilogondolfo) wrote :

After some more digging, this problem is affecting builds with ASAN (address sanitizer) in LLVM 17 too (and likely 16) and also GCC.

Running the netplan.io tests with ASAN (built with gcc) on Noble will lead to crashes:

+ ./_leakcheckbuild/tests/ctests/test_netplan_validation
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
/home/danilo/bin/netplan-leak-check.sh: line 19: 463561 Segmentation fault (core dumped) ./${test}

All these tests will run just fine if I disable ASLR (sudo sysctl kernel.randomize_va_space=0)

I'm running kernel 6.6.0-14-generic.

The llvm-toolchain-17 tests fail for different reasons before and after kernel 6.6 migrated to Noble:

Before: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/amd64/l/llvm-toolchain-17/20231227_142627_0e232@/log.gz

613s wasm-ld-17: error: /usr/lib/wasm32-wasi/libc++.a(stdlib_new_delete.cpp.o): undefined symbol: __cxa_allocate_exception
613s wasm-ld-17: error: /usr/lib/wasm32-wasi/libc++.a(stdlib_new_delete.cpp.o): undefined symbol: __cxa_throw

After: https://autopkgtest.ubuntu.com/results/autopkgtest-noble/noble/amd64/l/llvm-toolchain-17/20240106_050845_ecb23@/log.gz

292s sanitize=address is failing

I can get to the same errors by disabling and re-enabling ASLR in my Noble test image.

Revision history for this message
Danilo Egea Gondolfo (danilogondolfo) wrote :

To make a stronger correlation with kernel 6.6, I booted my system with kernel 6.5.0-14-generic and the netplan.io tests with ASAN work just fine.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

We are about to move to the v6.7 kernel, as v6.6 is sort of dead to us already.

Are you able to test with linux-generic-wip (linux-unstable source) from this ppa https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/unstable?field.series_filter=noble ?

Revision history for this message
Danilo Egea Gondolfo (danilogondolfo) wrote :

Hi Dimitri, I just tried it here. The same problem is happening with kernel 6.7 and the same workaround (disabling ASLR) works.

Example of failure:

clang++-$VERSION -O1 -g -fsanitize=address -fno-omit-frame-pointer foo.cpp
ASAN_OPTIONS=verbosity=1 ./a.out &> foo.log || true
./debian/qualify-clang.sh: line 634: 4281 Segmentation fault (core dumped) ASAN_OPTIONS=verbosity=1 ./a.out &> foo.log
if ! grep -q "Init done" foo.log; then
    echo "asan verbose mode failed"
    cat foo.log
    exit 42
fi
asan verbose mode failed
==4281==Registered root region at 0x736c74f00b70 of size 48
==4281==Registered root region at 0x736c74b007a0 of size 32
==4281==Unregistered root region at 0x736c74f00b70 of size 48
==4281==Unregistered root region at 0x736c74b007a0 of size 32
==4281==AddressSanitizer: failed to intercept '__isoc99_printf'
==4281==Registered root region at 0x736c74f00b70 of size 48
==4281==Registered root region at 0x736c74b007a0 of size 32
==4281==Unregistered root region at 0x736c74f00b70 of size 48
==4281==Unregistered root region at 0x736c74b007a0 of size 32
==4281==AddressSanitizer: failed to intercept '__isoc99_sprintf'
==4281==Registered root region at 0x736c74f00b70 of size 48
==4281==Registered root region at 0x736c74b007a0 of size 32
==4281==Unregistered root region at 0x736c74f00b70 of size 48
==4281==Unregistered root region at 0x736c74b007a0 of size 32
==4281==AddressSanitizer: failed to intercept '__isoc99_snprintf'
==4281==Registered root region at 0x736c74f00b70 of size 48
==4281==Registered root region at 0x736c74b007a0 of size 32
==4281==Unregistered root region at 0x736c74f00b70 of size 48
==4281==Unregistered root region at 0x736c74b007a0 of size 32
==4281==AddressSanitizer: failed to intercept '__isoc99_fprintf'
....
==4281==Unregistered root region at 0x736c74b00780 of size 32
==4281==AddressSanitizer: failed to intercept 'crypt'
==4281==Registered root region at 0x736c74f00b70 of size 48
==4281==Registered root region at 0x736c74b00780 of size 32
==4281==Unregistered root region at 0x736c74f00b70 of size 48
==4281==Unregistered root region at 0x736c74b00780 of size 32
==4281==AddressSanitizer: failed to intercept 'crypt_r'
==4281==Registered root region at 0x736c74900f40 of size 64
==4281==Registered root region at 0x736c74b00780 of size 32
==4281==Unregistered root region at 0x736c74900f40 of size 64
==4281==Unregistered root region at 0x736c74b00780 of size 32
==4281==AddressSanitizer: failed to intercept '__cxa_rethrow_primary_exception'
==4281==AddressSanitizer: libc interceptors initialized
|| `[0x10007fff8000, 0x7fffffffffff]` || HighMem ||
|| `[0x02008fff7000, 0x10007fff7fff]` || HighShadow ||
|| `[0x00008fff7000, 0x02008fff6fff]` || ShadowGap ||
|| `[0x00007fff8000, 0x00008fff6fff]` || LowShadow ||
|| `[0x000000000000, 0x00007fff7fff]` || LowMem ||
MemToShadow(shadow): 0x00008fff7000 0x000091ff6dff 0x004091ff6e00 0x02008fff6fff
redzone=16
max_redzone=2048
quarantine_size_mb=256M
thread_local_quarantine_size_kb=1024K
malloc_context_size=30
SHADOW_SCALE: 3
SHADOW_GRANULARITY: 8
SHADOW_OFFSET: 0x7fff8000
==4281==Installed the sigaction for signal 11
==4281==Installed the sigaction for signal 7
==428...

Revision history for this message
Danilo Egea Gondolfo (danilogondolfo) wrote :

The same problem was reported on Arch and they could confirm it started due to changes in the kernel related to ASLR.

https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/20

Relevant config changes from kernel 6.5 to 6.6 on Ubuntu:

-CONFIG_ARCH_MMAP_RND_BITS=28
+CONFIG_ARCH_MMAP_RND_BITS=32
 CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
-CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS=16

Setting vm.mmap_rnd_bits to 28 seems to be enough to work around the problem.
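
A minimal sketch of checking whether a machine is running with more mmap randomization bits than the value that worked here. The function names are hypothetical, and the limit of 28 is simply the value reported above to "seem to be enough", not a documented safe boundary. The script only prints the workaround command; it does not change any setting.

```shell
# exceeds_rnd_bits CURRENT LIMIT
# Succeeds (exit status 0) when CURRENT is strictly greater than LIMIT.
exceeds_rnd_bits() {
    [ "$1" -gt "$2" ]
}

# check_mmap_rnd_bits [LIMIT]
# Reads the running kernel's vm.mmap_rnd_bits and reports whether it is
# above LIMIT (default 28, the value used as a workaround in this report).
check_mmap_rnd_bits() {
    limit=${1:-28}
    current=$(cat /proc/sys/vm/mmap_rnd_bits 2>/dev/null) || return 2
    if exceeds_rnd_bits "$current" "$limit"; then
        echo "vm.mmap_rnd_bits=$current is above $limit"
        echo "possible workaround: sudo sysctl vm.mmap_rnd_bits=$limit"
    else
        echo "vm.mmap_rnd_bits=$current is not above $limit"
    fi
}
```

Calling check_mmap_rnd_bits with no arguments uses the limit of 28. A value set with sysctl on the command line does not persist across reboots.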

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in llvm-toolchain-14 (Ubuntu):
status: New → Confirmed
Revision history for this message
James Paton-Smith (jamesps) wrote :

This issue might also affect 22.04 with the HWE kernel

Going from kernel 6.5.0-21 to 6.5.0-25, we're seeing the same change in kernel parameters

On a system with kernel 6.5.0-21 I have these kernel parameters set:
vm.mmap_rnd_bits = 28
vm.mmap_rnd_compat_bits = 8

On a system with the new kernel 6.5.0-25:
vm.mmap_rnd_bits = 32
vm.mmap_rnd_compat_bits = 16
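
To put the two settings in perspective, the sketch below computes how large a range the mmap base can be shifted over for a given number of random bits. It assumes the random value is applied at page granularity with 4 KiB pages, as on amd64; mmap_rnd_span_bytes is a hypothetical helper name.

```shell
# mmap_rnd_span_bytes BITS [PAGE_SHIFT]
# Prints the size in bytes of the range covered by BITS random bits
# applied at page granularity. PAGE_SHIFT defaults to 12 (4 KiB pages).
mmap_rnd_span_bytes() {
    bits=$1
    page_shift=${2:-12}
    echo $(( 1 << (bits + page_shift) ))
}

mmap_rnd_span_bytes 28   # prints 1099511627776  (1 TiB)
mmap_rnd_span_bytes 32   # prints 17592186044416 (16 TiB)
```

Under these assumptions, going from 28 to 32 bits widens the range from 1 TiB to 16 TiB, which fits the later gcc-12 changelog entry describing a conflict between the ASan allocator base and high-entropy ASLR.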

Revision history for this message
Seth Arnold (seth-arnold) wrote :
Matthias Klose (doko)
no longer affects: llvm-toolchain-17 (Ubuntu)
no longer affects: llvm-toolchain-18 (Ubuntu)
no longer affects: llvm-toolchain-17 (Ubuntu Noble)
no longer affects: llvm-toolchain-18 (Ubuntu Noble)
Matthias Klose (doko)
Changed in gcc-13 (Ubuntu Noble):
status: New → Fix Committed
Matthias Klose (doko)
no longer affects: gcc-14 (Ubuntu)
no longer affects: gcc-14 (Ubuntu Noble)
Changed in gcc-8 (Ubuntu Noble):
status: New → Invalid
Revision history for this message
Philippe Blain (phil-blain) wrote :

Hello, I see that https://git.launchpad.net/ubuntu/+source/gcc-13/commit/?id=6c5be2a496335c513dbe6fa85df2402cfc0f0a8b imported the LLVM patch for ASan. There was also a similar change in LLVM for LSan: https://github.com/llvm/llvm-project/commit/5ffe955570a5d743bbbae204ce1b132e89fa86dc.

Is there a plan to include that patch as well?

I've experienced the same kind of failure with LSan on GCC 13: https://github.com/phil-blain/git/actions/runs/8281327655/job/22659588626. This is the `linux-leaks` job in the Git CI suite, which uses GCC 13.1.0 from the ubuntu-toolchain-r/test PPA.

Side question: will the PPA also be updated with the ASan patch?

Thanks!

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gcc-10 (Ubuntu):
status: New → Confirmed
Changed in gcc-11 (Ubuntu):
status: New → Confirmed
Changed in gcc-12 (Ubuntu):
status: New → Confirmed
Changed in gcc-9 (Ubuntu):
status: New → Confirmed
Changed in llvm-toolchain-15 (Ubuntu):
status: New → Confirmed
Changed in llvm-toolchain-16 (Ubuntu):
status: New → Confirmed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gcc-13 - 13.2.0-21ubuntu1

---------------
gcc-13 (13.2.0-21ubuntu1) noble; urgency=medium

  * Merge with Debian; remaining changes:
    - Build from upstream sources.

gcc-13 (13.2.0-21) unstable; urgency=medium

  * Update to git 20240324 from the gcc-13 branch.
    - Fix PR fortran/101135, PR fortran/55978.
  * Update the proposed patch to fix PR ada/114064.
  * Move the _FORTIFY_SOURCE setting from cc1 to the driver.
  * Refresh patches.

gcc-13 (13.2.0-20) unstable; urgency=medium

  * Update to git 20240323 from the gcc-13 branch.
    - Fix PR other/109668, PR tree-optimization/114231,
      PR tree-optimization/112793, PR tree-optimization/113670,
      PR middle-end/113622, PR tree-optimization/114203, PR middle-end/114070,
      PR middle-end/114070, PR tree-optimization/114027, PR debug/112718,
      PR tree-optimization/113910, PR tree-optimization/111736,
      PR tree-optimization/114396, PR target/113950 (PPC),
      PR target/111822 (x86), PR target/114160 (RISCV), PR sanitizer/112709,
      PR sanitizer/112709, PR target/114339 (x86), PR middle-end/113907,
      PR target/114310 (AArch64), PR rtl-optimization/110079,
      PR rtl-optimization/114211, PR target/114184 (x86), PR fortran/114001,
      PR fortran/103715, PR fortran/110826, PR fortran/100988,
      PR fortran/104819, PR libstdc++/66146, PR libstdc++/114316,
      PR libstdc++/114147.
  * Make vhdl known to the PPC backend.
  * Use the proposed patch to fix PR ada/114064, 64bit definitions for
    time_t_bits type on 32bit archs.
  * libsanitizer: Remove crypt and crypt_r interceptors for releases
    with glibc (>= 2.31). Remove the build hacks for crypt.h.
  * libstdc++-dev: Install libstdc++_libbacktrace.a. Note that all Filesystem TS
    and std::stacktrace symbols were added to libstdc++exp.a and GCC 14 stops
    shipping libstdc++_libbacktrace.a. Closes: #1065359.

 -- Matthias Klose <email address hidden> Sun, 24 Mar 2024 14:57:17 +0100

Changed in gcc-13 (Ubuntu Noble):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 6.8.0-20.20

---------------
linux (6.8.0-20.20) noble; urgency=medium

  * noble/linux: 6.8.0-20.20 -proposed tracker (LP: #2058221)

  * Noble update: v6.8.1 upstream stable release (LP: #2058224)
    - x86/mmio: Disable KVM mitigation when X86_FEATURE_CLEAR_CPU_BUF is set
    - Documentation/hw-vuln: Add documentation for RFDS
    - x86/rfds: Mitigate Register File Data Sampling (RFDS)
    - KVM/x86: Export RFDS_NO and RFDS_CLEAR to guests
    - Linux 6.8.1

  * Autopkgtest failures on amd64 (LP: #2048768)
    - [Packaging] update to clang-18

  * Miscellaneous Ubuntu changes
    - SAUCE: apparmor4.0.0: LSM stacking v39: fix build error with
      CONFIG_SECURITY=n
    - [Config] amd64: MITIGATION_RFDS=y

 -- Paolo Pisati <email address hidden> Mon, 18 Mar 2024 11:08:14 +0100

Changed in linux (Ubuntu Noble):
status: Confirmed → Fix Released
Revision history for this message
Matthias Klose (doko) wrote :

marked old toolchain versions as won't fix

Changed in gcc-9 (Ubuntu Noble):
status: Confirmed → Won't Fix
Changed in gcc-10 (Ubuntu Noble):
status: Confirmed → Won't Fix
Changed in gcc-11 (Ubuntu Noble):
status: Confirmed → Won't Fix
Changed in llvm-toolchain-14 (Ubuntu Noble):
status: Confirmed → Won't Fix
Changed in llvm-toolchain-15 (Ubuntu Noble):
status: Confirmed → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gcc-12 - 12.3.0-17ubuntu1

---------------
gcc-12 (12.3.0-17ubuntu1) noble; urgency=medium

  * Merge with Debian; remaining changes:
    - Build from upstream sources.

gcc-12 (12.3.0-17) unstable; urgency=medium

  * Fix typo in distro_defaults spec.

gcc-12 (12.3.0-16) unstable; urgency=medium

  * Update to git 20240403 from the gcc-12 branch.
    - Fix PR tree-optimization/111407, PR target/113233 (loong64),
      PR sanitizer/97696, PR tree-optimization/110838,
      PR tree-optimization/91838, PR target/108743 (Darwin),
      PR target/101737 (SH), PR tree-optimization/110221, PR ada/113979,
      PR d/112285, PR d/112290, PR d/114171, PR d/113758, PR d/113125,
      PR fortran/107426, PR fortran/50410, PR fortran/103715,
      PR libstdc++/40380, PR libstdc++/107376, PR libstdc++/112473,
      PR libstdc++/111172, PR libstdc++/112089, PR libstdc++/113960,
      PR libstdc++/108846, PR libstdc++/86419, PR libstdc++/109758,
      PR libstdc++/110593, PR libstdc++/66146, PR libstdc++/107500,
      PR libstdc++/114147, PR libstdc++/110167.
  * ASan: move allocator base to avoid conflict with high-entropy ASLR
    for x86-64 Linux. Patch taken from LLVM. LP: #2048768.
  * Make vhdl known to the PPC backend.
  * libsanitizer: Remove crypt and crypt_r interceptors for releases
    with glibc (>= 2.31).
  * d/p/gcc-distro-defaults: Make -fstack-protector-explicit known.
  * Move the _FORTIFY_SOURCE setting from cc1 to the driver.
  * Update the proposed patch to fix PR ada/114064.
  * Apply proposed patch for PR libquadmath/114533 (Simon Chopin).
    Addresses: #1064426. LP: #2052929.
  * Refresh patches.

 -- Matthias Klose <email address hidden> Wed, 03 Apr 2024 14:27:33 +0200

Changed in gcc-12 (Ubuntu Noble):
status: Confirmed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gcc-11 - 11.4.0-9ubuntu1

---------------
gcc-11 (11.4.0-9ubuntu1) noble; urgency=medium

  * Merge with Debian; remaining changes:
    - Build from upstream sources.

gcc-11 (11.4.0-9) unstable; urgency=medium

  * Update to git 20240412 from the gcc-11 branch.
    - Fix PR fortran/114474.
  * Fix the gm2 bootstrap.

 -- Matthias Klose <email address hidden> Fri, 12 Apr 2024 10:32:22 +0200

Changed in gcc-11 (Ubuntu Noble):
status: Won't Fix → Fix Released