cmsg_so_mark.sh / cmsg_time.sh / cmsg_ipv6.sh in net from ubuntu_kernel_selftests hang with non-amd64

Bug #2000667 reported by Po-Hsu Lin
This bug affects 1 person
Affects              Status        Importance  Assigned to   Milestone
ubuntu-kernel-tests  Fix Released  Undecided   Po-Hsu Lin
linux (Ubuntu)       Fix Released  Medium      Andrea Righi
  Kinetic            Fix Released  Medium      Po-Hsu Lin
  Lunar              Fix Released  Medium      Andrea Righi

Bug Description

[Impact]
The cmsg_* tests in the net suite of ubuntu_kernel_selftests hang on
non-amd64 systems. The hang runs until the test timeout expires, which
eventually causes "Incomplete" test results on RISCV kernels.

The cause is an infinite while loop in cmsg_sender.c: a char variable
is used to take the getopt() return value, and it should be an int
instead.

[Fix]
* 1573c68820 ("selftests: net: fix cmsg_so_mark.sh test hang")
This patch can be cherry-picked into both Kinetic and Lunar. These test
cases are only available in these newer kernels.

[Test]
Compile the patched cmsg_sender.c on a non-amd64 system; the cmsg_*
tests should no longer hang.

[Where problems could occur]
The change is limited to testing tools, with no impact on real kernel functionality.

[Original Bug Report]
Issue found with 5.19.0-1010.11, 5.19.0-1011.12

This issue does not exist in 5.19.0-1009.10 because the net test could not be built at that time.

Test output:
 Running 'make run_tests -C net TEST_PROGS=cmsg_so_mark.sh TEST_GEN_PROGS='' TEST_CUSTOM_PROGS='''
  make: Entering directory '/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/net'
  make --no-builtin-rules ARCH=riscv -C ../../../.. headers_install
  make[1]: Entering directory '/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux'
    INSTALL ./usr/include
  make[1]: Leaving directory '/home/ubuntu/autotest/client/tmp/ubuntu_kernel_selftests/src/linux'
  TAP version 13
  1..1
  # selftests: net: cmsg_so_mark.sh
 Timer expired (5400 sec.), nuking pid 82951

A manual test shows it gets stuck at:
$ sudo ./cmsg_so_mark.sh
+ NS=ns
+ IP4=172.16.0.1/24
+ TGT4=172.16.0.2
+ IP6=2001:db8:1::1/64
+ TGT6=2001:db8:1::2
+ MARK=1000
+ trap cleanup EXIT
+ ip netns add ns
+ ip netns exec ns sysctl -w 'net.ipv4.ping_group_range=0 2147483647'
+ ip -netns ns link add type dummy
+ ip -netns ns link set dev dummy0 up
+ ip -netns ns addr add 172.16.0.1/24 dev dummy0
+ ip -netns ns addr add 2001:db8:1::1/64 dev dummy0
+ ip -netns ns rule add fwmark 1000 lookup 300
+ ip -6 -netns ns rule add fwmark 1000 lookup 300
+ ip -netns ns route add prohibit any table 300
+ ip -6 -netns ns route add prohibit any table 300
+ BAD=0
+ TOTAL=0
+ for ovr in setsock cmsg both
+ for i in 4 6
+ '[' 4 == 4 ']'
+ TGT=172.16.0.2
+ for p in u i r
+ '[' u == u ']'
+ prot=UDP
+ '[' u == i ']'
+ '[' u == r ']'
+ '[' setsock == setsock ']'
+ m=-M
+ '[' setsock == cmsg ']'
+ '[' setsock == both ']'
+ ip netns exec ns ./cmsg_sender -4 -p u -M 1001 172.16.0.2 1234
(test stuck here)

Po-Hsu Lin (cypressyew)
summary: - cmsg_so_mark.sh in net from ubuntu_kernel_selftests hang with RISCV
+ cmsg_so_mark.sh in net from ubuntu_kernel_selftests hang with K-RISCV
kernel
Revision history for this message
Po-Hsu Lin (cypressyew) wrote (last edit ): Re: cmsg_so_mark.sh in net from ubuntu_kernel_selftests hang with K-RISCV kernel

Discussion upstream:
https://lore.kernel<email address hidden>/t/

strace, dmesg, syslog did not provide any useful information.

Investigation with gdb shows that cmsg_sender gets stuck in an infinite loop when parsing arguments in cs_parse_args():

while ((o = getopt(argc, argv, "46sS:p:m:M:d:tf:F:c:C:l:L:H:")) != -1)

The "char o" is the culprit here; it should be an int instead.
Otherwise the loop never terminates: plain char is unsigned on these riscv instances, so the -1 returned by getopt() is stored as 255 and never compares equal to -1.
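
The effect can be reproduced on any architecture with the sketch below. It is only an illustration under stated assumptions: the three function names are hypothetical, and explicit unsigned char / signed char types stand in for how plain char behaves on the affected ABIs and on amd64 respectively.

```c
#include <stdbool.h>

/*
 * Emulates "char o" on an ABI where plain char is unsigned: -1 is
 * truncated to 255, then promoted back to int 255 for the comparison.
 */
bool end_seen_unsigned_char(int getopt_ret)
{
	unsigned char o = (unsigned char)getopt_ret;
	int promoted = o;	/* what the != -1 comparison actually sees */

	return promoted == -1;
}

/*
 * Emulates "char o" on an ABI where plain char is signed (amd64):
 * -1 round-trips, which is why the bug stays hidden there.
 */
bool end_seen_signed_char(int getopt_ret)
{
	signed char o = (signed char)getopt_ret;
	int promoted = o;

	return promoted == -1;
}

/* The fixed declaration: an int holds the getopt() return value unchanged. */
bool end_seen_int(int getopt_ret)
{
	int o = getopt_ret;

	return o == -1;
}
```

Passing -1 to each function shows the difference: only the unsigned char variant fails to recognise the end-of-options value, which matches the loop in cs_parse_args() never exiting.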

Revision history for this message
Po-Hsu Lin (cypressyew) wrote (last edit ):

This actually affects the generic kernel on non-amd64 (the issue exists upstream).
I will remove the riscv / allwinner variants and track this against the linux package.

Patch sent:
https://<email address hidden>/

tags: added: arm64 ppc64el riscv64 s390x
no longer affects: linux-allwinner (Ubuntu)
no longer affects: linux-allwinner (Ubuntu Kinetic)
no longer affects: linux-allwinner (Ubuntu Lunar)
no longer affects: linux-riscv (Ubuntu Kinetic)
no longer affects: linux-riscv (Ubuntu)
no longer affects: linux-riscv (Ubuntu Lunar)
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 2000667

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Po-Hsu Lin (cypressyew) wrote : Re: cmsg_so_mark.sh in net from ubuntu_kernel_selftests hang with K-RISCV kernel

This also causes cmsg_time.sh to fail on non-amd64, as it calls cmsg_sender as well.

summary: - cmsg_so_mark.sh in net from ubuntu_kernel_selftests hang with K-RISCV
- kernel
+ cmsg_so_mark.sh / cmsg_time.sh in net from ubuntu_kernel_selftests hang
+ with K-RISCV kernel
Po-Hsu Lin (cypressyew)
summary: cmsg_so_mark.sh / cmsg_time.sh in net from ubuntu_kernel_selftests hang
- with K-RISCV kernel
+ with non-amd64
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

This affects the cmsg_ipv6.sh test as well. On amd64 this test fails because of bug 2000709.

summary: - cmsg_so_mark.sh / cmsg_time.sh in net from ubuntu_kernel_selftests hang
- with non-amd64
+ cmsg_so_mark.sh / cmsg_time.sh / cmsg_ipv6.sh in net from
+ ubuntu_kernel_selftests hang with non-amd64
Po-Hsu Lin (cypressyew)
Changed in ubuntu-kernel-tests:
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux (Ubuntu Kinetic):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux (Ubuntu Lunar):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: Incomplete → In Progress
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu Lunar):
assignee: Po-Hsu Lin (cypressyew) → Andrea Righi (arighi)
importance: Undecided → Medium
status: In Progress → Fix Committed
Changed in linux (Ubuntu Kinetic):
importance: Undecided → Medium
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.19.0-35.36 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux verification-needed-kinetic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-oem-6.1/6.1.0-1007.7 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-oem-6.1 verification-needed-jammy
Revision history for this message
Po-Hsu Lin (cypressyew) wrote (last edit ):

Test passed with linux/5.19.0-35.36 on arm64 nodes

We don't have non-amd64 arch for linux-oem-6.1/6.1.0-1007.7

tags: added: verification-done-jammy verification-done-kinetic
removed: verification-needed-jammy verification-needed-kinetic
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.19.0-35.36

---------------
linux (5.19.0-35.36) kinetic; urgency=medium

  * kinetic/linux: 5.19.0-35.36 -proposed tracker (LP: #2004652)

  * CVE-2023-0461
    - SAUCE: Fix inet_csk_listen_start after CVE-2023-0461

linux (5.19.0-34.35) kinetic; urgency=medium

  * kinetic/linux: 5.19.0-34.35 -proposed tracker (LP: #2004299)

  * LXD containers using shiftfs on ZFS or TMPFS broken on 5.15.0-48.54
    (LP: #1990849)
    - [SAUCE] shiftfs: fix -EOVERFLOW inside the container

  * Kinetic update: upstream stable patchset 2023-01-27 (LP: #2004051)
    - ASoC: fsl_sai: use local device pointer
    - serial: Add rs485_supported to uart_port
    - serial: fsl_lpuart: Fill in rs485_supported
    - x86/sgx: Create utility to validate user provided offset and length
    - x86/sgx: Add overflow check in sgx_validate_offset_length()
    - binder: validate alloc->mm in ->mmap() handler
    - ceph: Use kcalloc for allocating multiple elements
    - ceph: fix NULL pointer dereference for req->r_session
    - wifi: mac80211: fix memory free error when registering wiphy fail
    - wifi: mac80211_hwsim: fix debugfs attribute ps with rc table support
    - riscv: dts: sifive unleashed: Add PWM controlled LEDs
    - audit: fix undefined behavior in bit shift for AUDIT_BIT
    - wifi: airo: do not assign -1 to unsigned char
    - wifi: mac80211: Fix ack frame idr leak when mesh has no route
    - wifi: ath11k: Fix QCN9074 firmware boot on x86
    - spi: stm32: fix stm32_spi_prepare_mbr() that halves spi clk for every run
    - selftests/bpf: Add verifier test for release_reference()
    - Revert "net: macsec: report real_dev features when HW offloading is enabled"
    - platform/x86: ideapad-laptop: Disable touchpad_switch
    - platform/x86: touchscreen_dmi: Add info for the RCA Cambio W101 v2 2-in-1
    - platform/x86/intel/pmt: Sapphire Rapids PMT errata fix
    - scsi: ibmvfc: Avoid path failures during live migration
    - scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBC
    - drm: panel-orientation-quirks: Add quirk for Acer Switch V 10 (SW5-017)
    - block, bfq: fix null pointer dereference in bfq_bio_bfqg()
    - arm64/syscall: Include asm/ptrace.h in syscall_wrapper header.
    - nvmet: fix memory leak in nvmet_subsys_attr_model_store_locked
    - Revert "drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10
      properly""
    - ALSA: usb-audio: add quirk to fix Hamedal C20 disconnect issue
    - RISC-V: vdso: Do not add missing symbols to version section in linker script
    - MIPS: pic32: treat port as signed integer
    - xfrm: fix "disable_policy" on ipv4 early demux
    - xfrm: replay: Fix ESN wrap around for GSO
    - af_key: Fix send_acquire race with pfkey_register
    - ARM: dts: am335x-pcm-953: Define fixed regulators in root node
    - ASoC: hdac_hda: fix hda pcm buffer overflow issue
    - ASoC: sgtl5000: Reset the CHIP_CLK_CTRL reg on remove
    - ASoC: soc-pcm: Don't zero TDM masks in __soc_pcm_open()
    - x86/hyperv: Restore VP assist page after cpu offlining/onlining
    - scsi: storvsc: Fix handling of srb_status and capacity change events
    - ASoC: max983...

Changed in linux (Ubuntu Kinetic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 6.1.0-16.16

---------------
linux (6.1.0-16.16) lunar; urgency=medium

  * lunar/linux: 6.1.0-16.16 -proposed tracker (LP: #2008480)

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- temporarily drop broken dkms

 -- Andrea Righi <email address hidden> Fri, 24 Feb 2023 14:24:48 +0100

Changed in linux (Ubuntu Lunar):
status: Fix Committed → Fix Released
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Hints removed, closing this bug.

Changed in ubuntu-kernel-tests:
status: In Progress → Fix Released
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.19.0-1021.22 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-aws verification-needed-kinetic
removed: verification-done-kinetic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure/5.19.0-1022.23 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-azure
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Issue does not exist with K-azure 5.19.0-1022.23 / K-aws 5.19.0-1021.22

tags: added: verification-done-kinetic
removed: verification-needed-kinetic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-starfive/5.19.0-1014.16 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-starfive verification-needed-kinetic
removed: verification-done-kinetic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-riscv/5.19.0-1015.16 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-riscv
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-allwinner/5.19.0-1009.9 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux-allwinner
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

These no longer block the RISCV64 tests from proceeding.

tags: added: verification-done-kinetic
removed: verification-needed-kinetic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia-5.19/5.19.0-1009.9 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-nvidia-5.19 verification-needed-jammy
removed: verification-done-jammy