vivid/linux: total ADT test failures

Bug #1558447 reported by Andy Whitcroft
This bug affects 3 people
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  Undecided   Tim Gardner
Vivid           Won't Fix     High        Unassigned
Xenial          Fix Released  Undecided   Tim Gardner

Bug Description

vivid/linux is hanging the ADT tests in toto. We seem to lose sshd; the console log ends thus:

Ubuntu 15.04 adt ttyS0

adt login: [ 240.080182] INFO: task sshd:693 blocked for more than 120 seconds.
[ 240.081937] Not tainted 3.19.0-57-generic #63-Ubuntu
[ 240.083413] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.085791] INFO: task sshd:760 blocked for more than 120 seconds.
[ 240.087441] Not tainted 3.19.0-57-generic #63-Ubuntu
[ 240.088939] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.091099] INFO: task sshd:763 blocked for more than 120 seconds.
[ 240.092776] Not tainted 3.19.0-57-generic #63-Ubuntu
[ 240.094232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.096431] INFO: task sshd:765 blocked for more than 120 seconds.
[ 240.098060] Not tainted 3.19.0-57-generic #63-Ubuntu
[ 240.099514] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
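
When triaging a console log like the one above, it can help to pull out just the hung-task reports. The following is a minimal sketch, not part of the ADT tooling: the function name `find_hung_tasks` and the `SAMPLE` string are hypothetical, and the regular expression assumes the message format seen in this log.

```python
import re

# Matches the kernel hung-task detector line, e.g.
# "[ 240.080182] INFO: task sshd:693 blocked for more than 120 seconds."
# The pattern is an assumption based on the log excerpt in this report.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(console_log):
    """Return (timestamp, command, pid, seconds) for each hung-task report."""
    hits = []
    for line in console_log.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            hits.append(
                (float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"]))
            )
    return hits


# Hypothetical sample built from lines quoted in this report.
SAMPLE = """\
adt login: [ 240.080182] INFO: task sshd:693 blocked for more than 120 seconds.
[ 240.081937] Not tainted 3.19.0-57-generic #63-Ubuntu
[ 240.085791] INFO: task sshd:760 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for ts, comm, pid, secs in find_hung_tasks(SAMPLE):
        print(f"{ts:.6f} {comm} pid={pid} blocked>{secs}s")
```

Several reports for the same command name within a few milliseconds, as here with sshd, suggest that the tasks are all waiting on a common resource rather than failing independently.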

Andy Whitcroft (apw) wrote :

Full console log.

Martin Pitt (pitti)
Changed in linux (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Vivid):
importance: Undecided → High
Andy Whitcroft (apw) wrote :

Reproduced this on a vivid VM. All of the failing processes are hanging sending data to a unix domain socket.

Tracked this down to a bad backport applied to the kernel:

  commit 5981b6969d649a4fe1d402c915eb751c1243410c
  Author: Joseph Salisbury <email address hidden>
  Date: Fri Mar 4 14:33:47 2016 -0500

    Revert "af_unix: Revert 'lock_interruptible' in stream receive code"

The applied backport is missing the removal of a mutex_lock() call in the second stanza. The version that was reviewed and acked did include that removal, so the patch was possibly damaged when it was applied, due to the presence of the commit below:

  commit 0c0b98c6005644d1d00c8b91fbb0cea0032f99b2
  Author: Rainer Weikusat <email address hidden>
  Date: Mon Feb 8 18:47:19 2016 +0000

    af_unix: Don't set err in unix_stream_read_generic unless there was an error

That commit appears to be the proper fix for the issue the revert was attempting to address.
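
To illustrate why a leftover lock call can hang every caller, here is a user-space sketch in Python. It is not the kernel code: `readlock`, `receive_buggy`, and `receive_fixed` are hypothetical names, and it assumes the leftover mutex_lock() results in the same non-recursive lock being taken twice on one path. A timeout stands in for what would otherwise be an indefinite block.

```python
import threading

# Stand-in for a non-recursive mutex; threading.Lock is not reentrant.
readlock = threading.Lock()


def receive_buggy(timeout=0.2):
    """Take the lock twice on one path, as a mis-merged hunk might."""
    readlock.acquire()
    try:
        # Leftover acquire: without the timeout this would block forever,
        # because the same thread already holds the lock.
        got_second = readlock.acquire(timeout=timeout)
        if got_second:
            readlock.release()
        return got_second
    finally:
        readlock.release()


def receive_fixed():
    """Take the lock once and release it once."""
    with readlock:
        return True


if __name__ == "__main__":
    print("buggy path acquired second lock:", receive_buggy())
    print("fixed path completed:", receive_fixed())
```

In the sketch the second acquire simply times out. A blocking acquire with no timeout would never return, which is consistent with the tasks reported as blocked for more than 120 seconds.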

Andy Whitcroft (apw) wrote :

Confirmed that reverting the commit below allows the kernel to boot:

  commit 5981b6969d649a4fe1d402c915eb751c1243410c
  Author: Joseph Salisbury <email address hidden>
  Date: Fri Mar 4 14:33:47 2016 -0500

    Revert "af_unix: Revert 'lock_interruptible' in stream receive code"

Stefan Bader (smb) wrote :

That sounds about right. Confusingly, the email refers to the latter patch, saying the issue might be fixed by it. Unfortunately, it looks like the original revert was part of the fix and we already had both in vivid, so undoing the revert was not only no longer needed but would probably have been fatal even without the misapplication.

While looking through the current history of that file, I noticed yet another tweak around that area. So maybe, while reverting the wrong revert of the revert, it might be worth looking at

commit 18eceb818dc37bbc783ec7ef7703f270cc6cd281
Author: Rainer Weikusat <email address hidden>
Date: Thu Feb 18 12:39:46 2016 +0000

    af_unix: Don't use continue to re-execute unix_stream_read_generic loop

as well...

tags: added: kernel-da-key
Joseph Salisbury (jsalisbury) wrote :

I performed some testing and can confirm that reverting commit 5981b69 in vivid master-next does not bring back the original bug it was meant to fix, bug 1540731. For confirmation, I also built a test kernel with both commit 5981b69 and commit 0c0b98c6 reverted, and the original bug did in fact return. I would say commit 0c0b98c6 is indeed the real fix for bug 1540731.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Vivid):
status: New → Confirmed
hackeron (hackeron) wrote :

Experiencing the same :( - any workaround?

Ilya Kotov (forkotov02)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Ilya Kotov (forkotov02)
assignee: Ilya Kotov (forkotov02) → nobody
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Tim Gardner (timg-tpi)
status: Invalid → Fix Committed
hackeron (hackeron) wrote :

Thank you, problem went away with latest kernel upgrade.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-15.31

---------------
linux (4.4.0-15.31) xenial; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1559252

  * Xilinx KU3 Capi card does not show up in Ubuntu 16.04 (LP: #1557001)
    - SAUCE: (noup) cxl: Allow initialization on timebase sync failures

  * policy namespace stacking (LP: #1379535)
    - Revert "UBUNTU: SAUCE: Move replacedby allocation into label_alloc"
    - Revert "UBUNTU: SAUCE: Fixup: __label_update() still doesn't handle some cases correctly."
    - Revert "UBUNTU: SAUCE: fix: audit "no_new_privs" case for exec failure"
    - Revert "UBUNTU: SAUCE: fixup: warning about aa_label_vec_find_or_create not being static"
    - Revert "UBUNTU: SAUCE: apparmor: fix refcount race when finding a child profile"
    - Revert "UBUNTU: SAUCE: fixup: cast poison values to remove warnings"
    - Revert "UBUNTU: SAUCE: fixup: get rid of unused var build warning"
    - Revert "UBUNTU: SAUCE: fixup: 20/23 locking issue around in __label_update"
    - Revert "UBUNTU: SAUCE: fixup: make __share_replacedby private to get rid of build warning"
    - Revert "UBUNTU: SAUCE: fix: replacedby forwarding is not being properly update when ns is destroyed"
    - Revert "UBUNTU: SAUCE: apparmor: fix log of apparmor audit message when kern_path() fails"
    - Revert "UBUNTU: SAUCE: fixup: cleanup return handling of labels"
    - Revert "UBUNTU: SAUCE: apparmor: fix: ref count leak when profile sha1 hash is read"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: query label file permission"
    - Revert "UBUNTU: SAUCE: apparmor: Don't remove label on rcu callback if the label has already been removed"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: break circular refcount for label that is directly freed."
    - Revert "UBUNTU: SAUCE: apparmor: Fix: refcount bug when inserting label update that transitions ns"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: now that insert can force replacement use it instead of remove_and_insert"
    - Revert "UBUNTU: SAUCE: apparmor Fix: refcount bug in pivotroot mediation"
    - Revert "UBUNTU: SAUCE: apparmor: ensure that repacedby sharing is done correctly"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: update replacedby allocation to take a gfp parameter"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: convert replacedby update to be protected by the labelset lock"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: add required locking of __aa_update_replacedby on merge path"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: deadlock in aa_put_label() call chain"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: label_vec_merge insertion"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: ensure new labels resulting from merge have a replacedby"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: refcount leak in aa_label_merge"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: refcount race between locating in labelset and get"
    - Revert "UBUNTU: SAUCE: apparmor: Fix: label merge handling of marking unconfined and stale"
    - Revert "UBUNTU: SAUCE: apparmor: add underscores to indicate aa_label_next_not_in_set() use needs locking"
    - Revert "UBUNTU: SAUCE: apparmor: debug: POISON label and replaceby ...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Andy Whitcroft (apw) wrote : Closing unsupported series nomination.

This bug was nominated against a series that is no longer supported, i.e. vivid. The bug task representing the vivid nomination is being closed as Won't Fix.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu Vivid):
status: Confirmed → Won't Fix