Redpine: Observed kernel panic while running wireless regression tests

Bug #1777858 reported by Siva Rebbagondla
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Critical     Unassigned
Xenial           Fix Released   Undecided    Unassigned

Bug Description

SRU Justification:
------------------

Impact:
    The kernel freezes or panics when running wireless tests.

Test case:
    1. Create a wireless soft-ap/station upon reboot.

    2. Install checkbox-plano:
       $ sudo snap install --devmode checkbox-plano

    3. Run the wifi-ap test case below:
       $ checkbox-plano.checkbox-cli run .*device .*wireless/caracalla-wifi_ap_.*wlan0_auto

    4. After 3-4 iterations, the kernel crashes as shown below:
       BUG: unable to handle kernel NULL pointer dereference at (null)
       IP: [<ffffffff810a63df>] exit_creds+0x1f/0x50
       PGD 0
       Oops: 0002 [#1] SMP
       CPU: 0 PID: 6502 Comm: rmmod Tainted: G OE 4.4.0-128-generic #154-Ubuntu
       Hardware name: Dell Inc. Edge Gateway 3003/ , BIOS 01.00.00 04/17/2017
       Stack:
       ffff88007392e600 ffff880075847dc0 ffffffff8108160a 0000000000000000
       ffff88007392e600 ffff880075847de8 ffffffff810a484b ffff880076127000
       ffff88003cd3a800 ffff880074f12a00 ffff880075847e28 ffffffffc09bed15
       Call Trace:
       [<ffffffff8108160a>] __put_task_struct+0x5a/0x140
       [<ffffffff810a484b>] kthread_stop+0x10b/0x110
       [<ffffffffc09bed15>] rsi_disconnect+0x2f5/0x300 [ven_rsi_sdio]
       [<ffffffff81578bcb>] ? __pm_runtime_resume+0x5b/0x80
       [<ffffffff816f0918>] sdio_bus_remove+0x38/0x100
       [<ffffffff8156cc64>] __device_release_driver+0xa4/0x150
       [<ffffffff8156d7a5>] driver_detach+0xb5/0xc0
       [<ffffffff8156c6c5>] bus_remove_driver+0x55/0xd0
       [<ffffffff8156dfbc>] driver_unregister+0x2c/0x50
       [<ffffffff816f0b8a>] sdio_unregister_driver+0x1a/0x20
       [<ffffffffc09bf0f5>] rsi_module_exit+0x15/0x30 [ven_rsi_sdio]
       [<ffffffff8110cad8>] SyS_delete_module+0x1b8/0x210
       [<ffffffff81851dc8>] entry_SYSCALL_64_fastpath+0x1c/0xbb

Fix:
    kthread_stop() already waits for the thread to exit, so there is no
    need to call wait_for_completion() separately before it.
    The issue is resolved by removing wait_for_completion() from rsi_disconnect().
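
The point that the stop call itself waits for the thread can be illustrated with a userspace analogy. The sketch below is not the driver code: it uses POSIX threads, and `struct worker`, `worker_start()` and `worker_stop()` are hypothetical names chosen for illustration. Here `pthread_join()` plays the role of the wait that kthread_stop() performs, so the caller needs no additional wait of its own.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker, loosely modelling a kernel thread (userspace analogy). */
struct worker {
    pthread_t thread;
    atomic_bool should_stop; /* plays the role of kthread_should_stop() */
    atomic_bool finished;    /* set by the thread just before it returns */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    assert(w != NULL);
    /* Loop until asked to stop, as a kernel thread would. */
    while (!atomic_load(&w->should_stop))
        sched_yield();

    atomic_store(&w->finished, true);
    return NULL;
}

static int worker_start(struct worker *w)
{
    atomic_init(&w->should_stop, false);
    atomic_init(&w->finished, false);
    return pthread_create(&w->thread, NULL, worker_main, w);
}

/*
 * Request the stop and wait for the thread in a single call.
 * pthread_join() blocks until worker_main() has returned, so callers
 * do not need a separate wait before or after calling this function.
 */
static int worker_stop(struct worker *w)
{
    atomic_store(&w->should_stop, true);
    return pthread_join(w->thread, NULL);
}
```

A caller would pair one `worker_start()` with exactly one `worker_stop()`; once `worker_stop()` returns 0, the thread is guaranteed to have finished.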

Regression Potential:
    Ran step 3 30 times; no kernel panic was observed.

This bug is for tracking purposes only; please don't triage.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1777858

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: xenial
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
importance: Undecided → Critical
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Shrirang Bagul (shrirang-bagul) wrote :

Verified on Dell IoT Edge 300x kernel snap (r93) based on 4.4.0-132.158

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-134.160

---------------
linux (4.4.0-134.160) xenial; urgency=medium

  * linux: 4.4.0-134.160 -proposed tracker (LP: #1787177)

  * locking sockets broken due to missing AppArmor socket mediation patches
    (LP: #1780227)
    - UBUNTU SAUCE: apparmor: fix apparmor mediating locking non-fs, unix sockets

  * Backport namespaced fscaps to xenial 4.4 (LP: #1778286)
    - Introduce v3 namespaced file capabilities
    - commoncap: move assignment of fs_ns to avoid null pointer dereference
    - capabilities: fix buffer overread on very short xattr
    - commoncap: Handle memory allocation failure.

  * Xenial update to 4.4.140 stable release (LP: #1784409)
    - usb: cdc_acm: Add quirk for Uniden UBC125 scanner
    - USB: serial: cp210x: add CESINEL device ids
    - USB: serial: cp210x: add Silicon Labs IDs for Windows Update
    - n_tty: Fix stall at n_tty_receive_char_special().
    - staging: android: ion: Return an ERR_PTR in ion_map_kernel
    - n_tty: Access echo_* variables carefully.
    - x86/boot: Fix early command-line parsing when matching at end
    - ath10k: fix rfc1042 header retrieval in QCA4019 with eth decap mode
    - i2c: rcar: fix resume by always initializing registers before transfer
    - ipv4: Fix error return value in fib_convert_metrics()
    - kprobes/x86: Do not modify singlestep buffer while resuming
    - nvme-pci: initialize queue memory before interrupts
    - netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain()
    - ARM: dts: imx6q: Use correct SDMA script for SPI5 core
    - ubi: fastmap: Correctly handle interrupted erasures in EBA
    - mm: hugetlb: yield when prepping struct pages
    - tracing: Fix missing return symbol in function_graph output
    - scsi: sg: mitigate read/write abuse
    - s390: Correct register corruption in critical section cleanup
    - drbd: fix access after free
    - cifs: Fix infinite loop when using hard mount option
    - jbd2: don't mark block as modified if the handle is out of credits
    - ext4: make sure bitmaps and the inode table don't overlap with bg
      descriptors
    - ext4: always check block group bounds in ext4_init_block_bitmap()
    - ext4: only look at the bg_flags field if it is valid
    - ext4: verify the depth of extent tree in ext4_find_extent()
    - ext4: include the illegal physical block in the bad map ext4_error msg
    - ext4: clear i_data in ext4_inode_info when removing inline data
    - ext4: add more inode number paranoia checks
    - ext4: add more mount time checks of the superblock
    - ext4: check superblock mapped prior to committing
    - HID: i2c-hid: Fix "incomplete report" noise
    - HID: hiddev: fix potential Spectre v1
    - HID: debug: check length before copy_to_user()
    - x86/mce: Detect local MCEs properly
    - x86/mce: Fix incorrect "Machine check from unknown source" message
    - media: cx25840: Use subdev host data for PLL override
    - mm, page_alloc: do not break __GFP_THISNODE by zonelist reset
    - dm bufio: avoid sleeping while holding the dm_bufio lock
    - dm bufio: drop the lock when doing GFP_NOIO allocation
    - mtd: rawnand: mxc: set spa...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Brad Figg (brad-figg)
tags: added: cscc