[Regression] APM Merlin boards fail to recover link after interface down/up

Bug #1785739 reported by dann frazier
This bug affects 1 person
Affects          Status         Importance   Assigned to    Milestone
linux (Ubuntu)   Fix Released   Undecided    Unassigned
  Xenial         Fix Released   Undecided    dann frazier
  Bionic         Fix Released   Undecided    Unassigned
  Cosmic         Fix Released   Undecided    Unassigned

Bug Description

[Impact]
On an APM Merlin (X-Gene2) board, the onboard 1G interface fails to re-establish link after the interface is brought down and back up. Juju with the MAAS provider does this after every install (to configure a bridge), which makes that configuration unusable.

This was actually a regression introduced by the fix for LP: #1632739 in 4.4.0-48.69.

[Test Case]
1) From a remote system, start a ping to the IP address of the Merlin board's eth0 interface.
2) On the Merlin board, run: sudo ifdown eth0; sudo ifup eth0
3) Expected: the ping recovers once the interface is back up. Actual: it does not.
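
The remote side of this test case can be scripted. The following is a minimal sketch, not part of the original report: it assumes a Linux `ping` (iputils) that accepts `-c` and `-W`, and the names `ping_once` and `wait_for_recovery` are hypothetical helpers introduced here. Run it from the remote system right after step 2; it returns how long recovery took, or None if the link never came back within the timeout.

```python
import subprocess
import time


def ping_once(host: str) -> bool:
    """Send a single ICMP echo request; True if the system ping got a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],  # assumes iputils-style options
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_recovery(host, timeout=60.0, interval=1.0,
                      probe=ping_once, sleep=time.sleep):
    """Poll `probe(host)` until it succeeds or `timeout` seconds elapse.

    Returns the seconds waited before the first success, or None on timeout.
    `probe` and `sleep` are injectable so the loop can be exercised without
    a real network.
    """
    start = time.monotonic()
    while True:
        if probe(host):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        sleep(interval)
```

For example, `wait_for_recovery("192.0.2.10")` (a placeholder address, substitute the board's eth0 IP) returning None after the ifdown/ifup in step 2 would reproduce the failure described above.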

[Fix]
commit 84a527a41f38a80353f185d05e41b021e1ff672b
Author: Shaohui Xie <email address hidden>
Date: Tue May 10 17:42:26 2016 +0800

    net: phylib: fix interrupts re-enablement in phy_start

[Regression Risk]
The fix appears to be straightforward: don't enable interrupts on a PHY that does not have a valid interrupt. The fixes are upstream, so any regressions should get upstream support.
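
The kernel change itself is C code in phy_start; what follows is only a small Python model of the decision it makes, given as a sketch. The sentinel values PHY_POLL (-1) and PHY_IGNORE_INTERRUPT (-2) are assumed from include/linux/phy.h in 4.4-era kernels, and `should_enable_interrupts_on_start` is a hypothetical name used for illustration.

```python
# Sentinel values for a PHY's irq field. Assumed to match include/linux/phy.h
# in 4.4-era kernels; treat them as assumptions, not a quotation of the source.
PHY_POLL = -1              # no interrupt line, the PHY state is polled
PHY_IGNORE_INTERRUPT = -2  # the MAC driver handles PHY events itself


def phy_interrupt_is_valid(irq: int) -> bool:
    """True only when irq refers to a real interrupt line."""
    return irq not in (PHY_POLL, PHY_IGNORE_INTERRUPT)


def should_enable_interrupts_on_start(irq: int) -> bool:
    """Model of the fixed behaviour: when a halted PHY is restarted,
    interrupts are re-enabled only if the PHY has a valid interrupt."""
    return phy_interrupt_is_valid(irq)
```

In this model a polled PHY (irq equal to PHY_POLL) is never asked to enable interrupts when the interface comes back up, which is the condition the fix description relies on.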

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1785739

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Bionic):
status: New → Incomplete
Changed in linux (Ubuntu Xenial):
status: New → Incomplete
dann frazier (dannf)
Changed in linux (Ubuntu Cosmic):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Bionic):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Cosmic):
assignee: dann frazier (dannf) → nobody
Changed in linux (Ubuntu Xenial):
assignee: nobody → dann frazier (dannf)
status: Incomplete → In Progress
description: updated
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
dann frazier (dannf)
description: updated
tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-135.161

---------------
linux (4.4.0-135.161) xenial; urgency=medium

  * linux: 4.4.0-135.161 -proposed tracker (LP: #1788766)

  * [Regression] APM Merlin boards fail to recover link after interface down/up
    (LP: #1785739)
    - net: phylib: fix interrupts re-enablement in phy_start
    - net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT

  * qeth: don't clobber buffer on async TX completion (LP: #1786057)
    - s390/qeth: don't clobber buffer on async TX completion

  * nvme: avoid cqe corruption (LP: #1788035)
    - nvme: avoid cqe corruption when update at the same time as read

  * CacheFiles: Error: Overlong wait for old active object to go away.
    (LP: #1776254)
    - cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
    - cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

  * fscache cookie refcount updated incorrectly during fscache object allocation
    (LP: #1776277) // fscache cookie refcount updated incorrectly during fscache
    object allocation (LP: #1776277)
    - fscache: Fix reference overput in fscache_attach_object() error handling

  * FS-Cache: Assertion failed: FS-Cache: 6 == 5 is false (LP: #1774336)
    - Revert "UBUNTU: SAUCE: CacheFiles: fix a read_waiter/read_copier race"
    - fscache: Allow cancelled operations to be enqueued
    - cachefiles: Fix refcounting bug in backing-file read monitoring

  * linux-cloud-tools-common: Ensure hv-kvp-daemon.service starts before
    walinuxagent.service (LP: #1739107)
    - [Debian] hyper-v -- Ensure that hv-kvp-daemon.service starts before
      walinuxagent.service

 -- Khalid Elmously <email address hidden> Sun, 26 Aug 2018 23:56:50 -0400

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Brad Figg (brad-figg)
tags: added: cscc