[sos39][networking] dpdk port flapping observed during sosreport execution

Bug #1925351 reported by Eric Desrochers
This bug affects 1 person
Affects             Status        Importance  Assigned to      Milestone
sosreport (Ubuntu)  Fix Released  Undecided   Unassigned
Xenial              Fix Released  High        Eric Desrochers
Bionic              Fix Released  Undecided   Unassigned
Focal               Fix Released  Undecided   Unassigned
Groovy              Fix Released  Undecided   Unassigned
Hirsute             Fix Released  Undecided   Unassigned

Bug Description

[IMPACT]

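The sosreport networking plugin unconditionally runs 'ethtool -e'.

EEPROM dump collection might hang on specific types of devices, or otherwise negatively impact the system. In this report it caused DPDK ports to flap while sosreport was running.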

ETHTOOL(8)
       -e --eeprom-dump
              Retrieves and prints an EEPROM dump for the specified network device. When raw is enabled, then it dumps the raw EEPROM data to stdout. The length and offset parameters allow dumping certain portions of the EEPROM. Default is to dump the entire EEPROM.

[TEST PLAN]

* On a Xenial system

## Should not execute 'ethtool -e' (new default behaviour)
* sosreport -o networking

## Should execute 'ethtool -e' (only when sosreport is explicitly told to, i.e. the former default behaviour)
* sosreport -o networking --plugin-option networking.eepromdump

## Should not execute 'ethtool -e' with -a if /etc/sos.conf contains the following tunables
* sosreport -a
---
[tunables]
networking.eepromdump = off
---

## Should execute 'ethtool -e' with -a if /etc/sos.conf does not contain those tunables
* sosreport -a
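
To make the checks above repeatable, here is a minimal sketch that looks for 'ethtool -e' output in an unpacked sosreport. It assumes command output lands under sos_commands/networking/ in files whose names start with 'ethtool_-e_'; that layout and the helper name eeprom_dumps_collected are assumptions for illustration, so adjust them to what your report actually contains.

```python
import tempfile
from pathlib import Path


def eeprom_dumps_collected(report_root):
    """Return the sorted names of 'ethtool -e' output files in an unpacked report.

    ASSUMPTION: command output is stored under sos_commands/networking/ in
    files whose names start with 'ethtool_-e_'. Adjust if your layout differs.
    """
    cmd_dir = Path(report_root) / "sos_commands" / "networking"
    if not cmd_dir.is_dir():
        return []
    return sorted(p.name for p in cmd_dir.glob("ethtool_-e_*"))


if __name__ == "__main__":
    # Build a fake unpacked report so the sketch runs without sosreport itself.
    with tempfile.TemporaryDirectory() as root:
        net = Path(root) / "sos_commands" / "networking"
        net.mkdir(parents=True)
        (net / "ethtool_-i_eth0").write_text("driver: hypothetical\n")
        print(eeprom_dumps_collected(root))  # no EEPROM dump file yet
        (net / "ethtool_-e_eth0").write_text("Offset  Values\n")
        print(eeprom_dumps_collected(root))  # EEPROM dump file now present
```

An empty list corresponds to the "should not execute" cases, a non-empty list to the "should execute" cases.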

[WHERE PROBLEM OCCURS]

No problem expected.

The networking plugin is now gated so that 'ethtool -e' is not run by default: 'sosreport -o networking' no longer executes the 'ethtool -e' command.

On the other hand, anyone who really wants the EEPROM dump can use the 'eepromdump' plugin option as follows:

sosreport -o networking --plugin-option networking.eepromdump

Yes, we are changing the default behaviour, and for good reason: as it stands, it may do more harm than good. The former behaviour is still available if *REALLY* needed, once the risk is understood.
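
For readers who want to see the shape of the change, the gating described above can be sketched roughly as follows. This is an illustration only, not the sos plugin code: the function name ethtool_commands is hypothetical, and 'ethtool -i' merely stands in for the queries that are always collected.

```python
def ethtool_commands(devices, eepromdump=False):
    """Build the ethtool command lines to collect for each device.

    Illustrative only, not the sos plugin code. 'ethtool -i' stands in
    for the queries that are collected unconditionally.
    """
    commands = []
    for dev in devices:
        commands.append("ethtool -i %s" % dev)  # always collected
        if eepromdump:  # opt-in only, off by default
            commands.append("ethtool -e %s" % dev)
    return commands


if __name__ == "__main__":
    print(ethtool_commands(["eth0"]))
    print(ethtool_commands(["eth0"], eepromdump=True))
```

The point of the design is that the risky command is only ever added when the caller asks for it explicitly.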

Note:
sosreport -a ## Will still execute 'ethtool -e' as it turns all options to 'True'.

       -a, --alloptions
              Set all boolean options to True for all enabled plug-ins.

Unless one specifies the following in /etc/sos.conf:
---
[tunables]
networking.eepromdump = off
---

So 'sosreport -a', combined with the right tunables in /etc/sos.conf, will not execute 'ethtool -e'.
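
The interaction between -a and the tunables can be modelled with a short sketch. It only encodes the order described in this report (default off, -a turns boolean options on, a [tunables] entry has the last word); it is not how sos itself merges its settings, it does not model an explicit --plugin-option on the command line, and the function name eepromdump_enabled is hypothetical.

```python
import configparser

# Same fragment as shown above for /etc/sos.conf.
SOS_CONF = """
[tunables]
networking.eepromdump = off
"""


def eepromdump_enabled(conf_text, alloptions=False):
    """Resolve the effective value of networking.eepromdump.

    Order modelled (taken from this report, not from the sos source):
    default off, '-a' turns it on, a [tunables] entry has the last word.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    enabled = alloptions  # default is off; -a sets boolean options to True
    if parser.has_option("tunables", "networking.eepromdump"):
        enabled = parser.getboolean("tunables", "networking.eepromdump")
    return enabled


if __name__ == "__main__":
    print(eepromdump_enabled("", alloptions=True))        # -a, no tunables
    print(eepromdump_enabled(SOS_CONF, alloptions=True))  # -a, tunable set to off
```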

[OTHER INFORMATION]

Upstream fix:
https://github.com/sosreport/sos/commit/a74ef444a72691fab9c65fa679687cc2d6e0fc8c

$ git describe --contains aca8bd83
4.1~44

$ rmadison sosreport
=> sosreport | 3.9.1-1ubuntu0.16.04.1 | xenial-updates
sosreport | 4.1-1ubuntu0.18.04.1 | bionic-updates
sosreport | 4.1-1ubuntu0.20.04.1 | focal-updates
sosreport | 4.1-1ubuntu0.20.10.1 | groovy-updates
sosreport | 4.1-1ubuntu1 | hirsute

Eric Desrochers (slashd)
Changed in sosreport (Ubuntu):
status: New → Fix Released
Changed in sosreport (Ubuntu Xenial):
status: New → In Progress
assignee: nobody → Eric Desrochers (slashd)
importance: Undecided → High
description: updated
tags: added: serg sts
tags: added: seg
removed: serg
Eric Desrochers (slashd)
summary: - [networking] dpdk port flapping observed during sosreport execution
+ [sos39][networking] dpdk port flapping observed during sosreport execution
Eric Desrochers (slashd)
Changed in sosreport (Ubuntu Xenial):
importance: High → Critical
Eric Desrochers (slashd) wrote :

[XENIAL][PRE-SRU TESTING]

An impacted user did 'pre-SRU' testing. Here's what has been brought to my attention:

"
Thanks for the test package.
We tested it, and as per expectation, it is not executing ethtool -e and in turn not causing dpdk port flap.
Please let us know when can we have the official fix.
"

- Eric

Changed in sosreport (Ubuntu Xenial):
importance: Critical → High
Eric Desrochers (slashd)
description: updated
Changed in sosreport (Ubuntu Bionic):
status: New → Fix Released
Changed in sosreport (Ubuntu Focal):
status: New → Fix Released
Changed in sosreport (Ubuntu Groovy):
status: New → Fix Released
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Eric, or anyone else affected,

Accepted sosreport into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/sosreport/3.9.1-1ubuntu0.16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in sosreport (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-xenial
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (sosreport/3.9.1-1ubuntu0.16.04.2)

All autopkgtests for the newly accepted sosreport (3.9.1-1ubuntu0.16.04.2) for xenial have finished running.
The following regressions have been reported in tests triggered by the package:

sosreport/3.9.1-1ubuntu0.16.04.2 (armhf)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/xenial/update_excuses.html#sosreport

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Eric Desrochers (slashd) wrote :

[XENIAL VERIFICATION]

I have tested the proposed package (3.9.1-1ubuntu0.16.04.2) and I confirm that 'ethtool -e' is now executed only when the networking plugin's eepromdump option is turned on, as explained in the [TEST PLAN].

- Eric

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package sosreport - 3.9.1-1ubuntu0.16.04.2

---------------
sosreport (3.9.1-1ubuntu0.16.04.2) xenial; urgency=medium

  * d/p/0005-networking-collect-ethtool-e-device-conditionally-only.patch:
    - EEPROM dump collection might hang on specific types of devices, or
      negatively impact the system otherwise. As a safe option, sos report
      should collect the command when explicitly asked via a plugopt only.
      (LP: #1925351)

 -- Eric Desrochers <email address hidden> Wed, 21 Apr 2021 09:07:20 -0400

Changed in sosreport (Ubuntu Xenial):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for sosreport has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
