'netplan apply' fails when trying to activate another interface on another QETH device ...

Bug #1756322 reported by Frank Heimes
This bug affects 1 person
Affects                    Status         Importance  Assigned to                  Milestone
Netplan                    Fix Released   Undecided   Unassigned
Ubuntu on IBM z Systems    Fix Released   High        Canonical Foundations Team
netplan.io (Ubuntu)        Fix Released   Undecided   Unassigned
  Xenial                   Invalid        Undecided   Unassigned
  Artful                   Invalid        Undecided   Unassigned
nplan (Ubuntu)             Fix Released   Undecided   Unassigned
  Xenial                   Fix Released   Undecided   Unassigned
  Artful                   Won't Fix      Undecided   Unassigned
systemd (Ubuntu)           Invalid        Undecided   Unassigned
  Xenial                   Invalid        Undecided   Unassigned
  Artful                   Invalid        Undecided   Unassigned

Bug Description

[Impact]
Server users on s390x configuring qeth devices.

[Test case]
1) Reconfigure an interface for a QETH device
2) Verify that 'netplan apply' completes successfully, without error.

[Regression potential]
This change has minimal potential for regression: it only skips "replugging" for qeth-based devices, i.e. "disconnecting" them by unbinding and rebinding the driver. Potential issues would be limited to failure to rename interfaces without a reboot, for configurations that depend on this (but that would not have worked anyway, because netplan apply failed to rebind the device).
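
As a rough illustration of the "skip qeth from replugging" idea, here is a minimal Python sketch. It is not the actual netplan patch: the names NO_REBIND_DRIVERS, driver_name and should_replug are hypothetical, and the sysfs_root parameter exists only so the logic can be tried against a fake directory tree. The only fact taken from this report is that the interface's device/driver symlink ends in the driver name (qeth here).

```python
import os

# Drivers whose devices must not be unbound and rebound. Only "qeth" comes
# from this bug report; the constant and helper names are hypothetical.
NO_REBIND_DRIVERS = {"qeth"}


def driver_name(ifname, sysfs_root="/sys"):
    """Return the kernel driver bound to ifname, or None if it has none."""
    link = os.path.join(sysfs_root, "class", "net", ifname, "device", "driver")
    try:
        # e.g. "../../../bus/ccwgroup/drivers/qeth" -> "qeth"
        return os.path.basename(os.readlink(link))
    except OSError:
        return None


def should_replug(ifname, sysfs_root="/sys"):
    """True only if ifname has a driver and that driver may be rebound."""
    driver = driver_name(ifname, sysfs_root)
    return driver is not None and driver not in NO_REBIND_DRIVERS
```

Interfaces without a driver symlink are treated as not repluggable in this sketch; that is a design assumption, not something stated in the report.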

---

When trying to add another interface for a QETH device on an s390x system, 'netplan apply' fails:

sudo netplan apply
Cannot replug encc003: [Errno 19] No such device
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 110, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 40, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 110, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 87, in command_apply
    stdout=fd, stderr=fd)
  File "/usr/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/net/encc003']' returned non-zero exit status 4.

It seems like rebinding of qeth devices is not allowed.
With qeth devices, I guess one needs to "offline & online" them...
Or unbind the whole group of them, as there are three of them per interface.

ubuntu@s1lp14:/sys/class/net/encc006/device$ ls -latr
total 0
drwxr-xr-x 5 root root 0 Mar 16 02:06 ..
-rw-r--r-- 1 root root 4096 Mar 16 13:44 uevent
drwxr-xr-x 6 root root 0 Mar 16 13:44 .
-rw-r--r-- 1 root root 4096 Mar 16 13:44 online
lrwxrwxrwx 1 root root 0 Mar 16 13:44 subsystem -> ../../../bus/ccwgroup
lrwxrwxrwx 1 root root 0 Mar 16 13:44 driver -> ../../../bus/ccwgroup/drivers/qeth
drwxr-xr-x 2 root root 0 Mar 16 13:44 vnicc
--w------- 1 root root 4096 Mar 16 13:44 recover
-rw-r--r-- 1 root root 4096 Mar 16 13:44 priority_queueing
-rw-r--r-- 1 root root 4096 Mar 16 13:44 portno
-rw-r--r-- 1 root root 4096 Mar 16 13:44 portname
-rw-r--r-- 1 root root 4096 Mar 16 13:44 performance_stats
-rw-r--r-- 1 root root 4096 Mar 16 13:44 layer2
-rw-r--r-- 1 root root 4096 Mar 16 13:44 isolation
-rw-r--r-- 1 root root 4096 Mar 16 13:44 hw_trap
lrwxrwxrwx 1 root root 0 Mar 16 13:44 cdev2 -> ../../css0/0.0.0bb4/0.0.c008
lrwxrwxrwx 1 root root 0 Mar 16 13:44 cdev1 -> ../../css0/0.0.0bb3/0.0.c007
lrwxrwxrwx 1 root root 0 Mar 16 13:44 cdev0 -> ../../css0/0.0.0bb2/0.0.c006
-rw-r--r-- 1 root root 4096 Mar 16 13:44 buffer_count
-rw-r--r-- 1 root root 4096 Mar 16 13:44 bridge_role
-rw-r--r-- 1 root root 4096 Mar 16 13:44 bridge_reflect_promisc
-rw-r--r-- 1 root root 4096 Mar 16 13:44 bridge_hostnotify
--w------- 1 root root 4096 Mar 16 13:50 ungroup
-r--r--r-- 1 root root 4096 Mar 16 13:50 switch_attrs
-r--r--r-- 1 root root 4096 Mar 16 13:50 state
drwxr-xr-x 2 root root 0 Mar 16 13:50 power
-r--r--r-- 1 root root 4096 Mar 16 13:50 inbuf_size
-r--r--r-- 1 root root 4096 Mar 16 13:50 if_name
-r--r--r-- 1 root root 4096 Mar 16 13:50 chpid
-r--r--r-- 1 root root 4096 Mar 16 13:50 card_type
-r--r--r-- 1 root root 4096 Mar 16 13:50 bridge_state
drwxr-xr-x 2 root root 0 Mar 16 13:50 blkt
drwxr-xr-x 3 root root 0 Mar 16 13:50 net
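
The "three of them per interface" grouping is visible in the cdev0, cdev1 and cdev2 symlinks in the listing above. A small Python sketch, assuming only that layout, collects the bus IDs behind those symlinks; ccwgroup_members and the sysfs_root parameter are hypothetical names used for illustration.

```python
import os
import re


def ccwgroup_members(ifname, sysfs_root="/sys"):
    """Return the bus IDs behind the cdevN symlinks of ifname, ordered by N."""
    dev_dir = os.path.join(sysfs_root, "class", "net", ifname, "device")
    members = []
    for entry in os.listdir(dev_dir):
        match = re.fullmatch(r"cdev(\d+)", entry)
        if match:
            # e.g. "../../css0/0.0.0bb2/0.0.c006" -> "0.0.c006"
            target = os.readlink(os.path.join(dev_dir, entry))
            members.append((int(match.group(1)), os.path.basename(target)))
    return [bus_id for _, bus_id in sorted(members)]
```

For a directory laid out like the listing above, this would return the three subchannel device IDs that make up the one qeth group device.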

$ echo 'encc006' | sudo tee driver/unbind
encc006
tee: driver/unbind: No such device

$ echo 'cdev0' | sudo tee driver/unbind
cdev0
tee: driver/unbind: No such device

$ echo '0.0.c006' | sudo tee driver/unbind
0.0.c006
ubuntu@s1lp14:/sys/class/net/encc006/device$
Mar 16 13:52:28 s1lp14 sudo[8046]: ubuntu : TTY=pts/1 ; PWD=/sys/devices/qeth/0.0.c006 ; USER=root ; COMMAND=/usr/bin/tee driver/unbind
Mar 16 13:52:28 s1lp14 sudo[8046]: pam_unix(sudo:session): session opened for user root by ubuntu(uid=0)
Mar 16 13:52:28 s1lp14 systemd-networkd[7772]: encc006: Lost carrier
Mar 16 13:52:28 s1lp14 systemd-timesyncd[1078]: Network configuration changed, trying to establish connection.
Mar 16 13:52:28 s1lp14 systemd-timesyncd[1078]: Synchronized to time server 91.189.89.198:123 (ntp.ubuntu.com).
Mar 16 13:52:28 s1lp14 systemd-networkd[7772]: encc006.2653: Lost carrier
Mar 16 13:52:28 s1lp14 systemd-timesyncd[1078]: Network configuration changed, trying to establish connection.
Mar 16 13:52:28 s1lp14 kernel: failed to kill vid 8100/2653 for device encc006
Mar 16 13:52:28 s1lp14 sudo[8046]: pam_unix(sudo:session): session closed for user root
Mar 16 13:52:28 s1lp14 systemd-timesyncd[1078]: Synchronized to time server 91.189.89.198:123 (ntp.ubuntu.com).

However, rebinding like that does not work.

Either qeth devices should be skipped, or one should operate on the whole group of them, simulating 'chzdev -d c006; chzdev -e c006' (or simply calling that).

Note that the device id to pass to chzdev is whatever the device symlink points to, e.g. 0.0.c006 in this case:

$ ls -latr /sys/class/net/encc006/device
lrwxrwxrwx 1 root root 0 Mar 16 13:55 /sys/class/net/encc006/device -> ../../../0.0.c006
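
A minimal Python sketch of that lookup, under the assumption that the device symlink always ends in the bus ID as shown above. It only builds the 'chzdev -d' / 'chzdev -e' command lines mentioned earlier and does not run them; qeth_bus_id, chzdev_cycle_commands and sysfs_root are hypothetical names.

```python
import os


def qeth_bus_id(ifname, sysfs_root="/sys"):
    """Return the last component of the interface's device symlink."""
    link = os.path.join(sysfs_root, "class", "net", ifname, "device")
    # e.g. "../../../0.0.c006" -> "0.0.c006"
    return os.path.basename(os.readlink(link))


def chzdev_cycle_commands(ifname, sysfs_root="/sys"):
    """Build (but do not execute) the disable/enable pair for ifname."""
    bus_id = qeth_bus_id(ifname, sysfs_root)
    return [["chzdev", "-d", bus_id], ["chzdev", "-e", bus_id]]
```

A caller could pass each list to subprocess.check_call as root; whether cycling the device this way is appropriate on a live system is outside what this report establishes.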

Frank Heimes (fheimes) wrote :
summary: - netplan apply fails when trying to add another interface
+ netplan apply fails when trying to add another interface with QETH
+ devices ...
Frank Heimes (fheimes)
description: updated
Frank Heimes (fheimes)
summary: - netplan apply fails when trying to add another interface with QETH
- devices ...
+ 'netplan apply' fails when trying to activate another interface on
+ another QETH device ...
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
importance: Undecided → High
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Dimitri John Ledkov (xnox) wrote :

Netplan does not enable s390x devices; one therefore needs to `chzdev -e` the devices before netplan kicks in or before 'netplan apply' is used.
One cannot use netplan to "create" qeth devices.

Dimitri John Ledkov (xnox) wrote :

Also, netplan apply generates and reloads udev rules. Therefore, udev tries to apply settings to things that might not exist.

Frank Heimes (fheimes) wrote :

Sure - a 'chzdev -e' was done.
For details please have a look at the attached steps.txt - you'll find it.

Dimitri John Ledkov (xnox) wrote :

OK, reading more carefully, it looks like udevadm is failing when called by netplan.

description: updated
Changed in systemd (Ubuntu):
status: New → Invalid
Changed in netplan.io (Ubuntu):
status: New → Confirmed
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Confirmed
tags: added: id-5aac06833481524435312429
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 0.34.1

---------------
netplan.io (0.34.1) bionic; urgency=medium

  * Makefile: be more distro-agnostic, support different paths for pyflakes,
    pycodestyle, etc.
  * Makefile: allow varying install destinations.
  * Do not attempt to rebind driver 'qeth'. (LP: #1756322)
  * docs/netplan.md: clarify the behavior of 'dhcp6: yes'.
  * docs/netplan.md, docs/manpage.md: rework documentation files to generate
    the manpage with its own headers and other things that don't apply to other
    doc formats such as HTML.
  * migrate: command renamed from ifupdown-migrate, although still disabled.
  * tests: re-instate bridge-priority integration test.
  * Added .spec build rules file for building RPM packages.
  * debian/postinst: reworded "breadcrumbs" written to /etc/network/interfaces.

 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 22 Mar 2018 15:09:12 -0400

Changed in netplan.io (Ubuntu):
status: Confirmed → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: Confirmed → Fix Released
Frank Heimes (fheimes) wrote :

After a quick check I can confirm that it's fixed - thx!

Changed in netplan:
status: New → Fix Released
description: updated
Changed in nplan (Ubuntu):
status: New → Fix Released
Changed in netplan.io (Ubuntu Xenial):
status: New → Invalid
Changed in netplan.io (Ubuntu Artful):
status: New → Invalid
Changed in systemd (Ubuntu Xenial):
status: New → Invalid
Changed in systemd (Ubuntu Artful):
status: New → Invalid
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Frank, or anyone else affected,

Accepted nplan into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nplan/0.32~16.04.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nplan (Ubuntu Artful):
status: New → Won't Fix
Changed in nplan (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed verification-needed-xenial
Mathieu Trudel-Lapierre (cyphermox) wrote :

@Frank,

Are you able to help test this? I do not have access to hardware with qeth interfaces.

Frank Heimes (fheimes) wrote :

Successfully verified this new nplan version 0.32~16.04.5 from xenial proposed on a 16.04 z/VM guest with latest (non-proposed) updates (except nplan itself).
Was able to activate two different qeth devices on that z/VM guest with 'netplan apply' (one after the other) and both worked fine. I could use them for remote ssh access and they also survived a reboot.
See attached cmd-log for more details.

Adjusting the tags accordingly.

tags: added: verification-done verification-done-xenial
removed: verification-needed verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nplan - 0.32~16.04.5

---------------
nplan (0.32~16.04.5) xenial; urgency=medium

  * bond/bridge: Support suffixes for time-based values so things like
    "mii-monitor-interval" can support milliseconds. (LP: #1745597)
  * Do not attempt to rebind driver 'qeth'. (LP: #1756322)
  * Allow setting ClientIdentifier=mac for networkd-renderered devices
    (LP: #1738998)
  * IPv6: accept-ra should default to being unset, so that the kernel default
    can be used. (LP: #1732002)
  * doc/netplan.md: Clarify the behavior for time-based values for bonds
    and bridges. (LP: #1756587)
  * critical: provide a way to set "CriticalConnection=true" on a networkd
    connection, especially for remote-fs scenarios. (LP: #1769682)
  * networkd: don't wipe out /run/netplan on generate: we do want to keep any
    YAML configurations in that directory, we just need to remove generated
    wpasupplicant configs. (LP: #1764869)

 -- Mathieu Trudel-Lapierre <email address hidden> Tue, 08 May 2018 10:36:24 -0400

Changed in nplan (Ubuntu Xenial):
status: Fix Committed → Fix Released
Chris Halse Rogers (raof) wrote : Update Released

The verification of the Stable Release Update for nplan has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

BloodBlight (bpwoods) wrote :

I am still seeing this error on Ubuntu 18.04 with netplan.io 0.97:

root@server:/etc/netplan# netplan apply
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 43, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 106, in command_apply
    stderr=subprocess.DEVNULL)
  File "/usr/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/net/enp1s0f0:0']' returned non-zero exit status 4.

root@MediaServer:/etc/netplan# apt search netplan.io
...
netplan.io/bionic-updates,now 0.97-0ubuntu1~18.04.1 amd64 [installed,automatic]
  YAML network configuration abstraction for various backends
...

Steve Langasek (vorlon) wrote : Re: [Bug 1756322] Re: 'netplan apply' fails when trying to activate another interface on another QETH device ...

On Mon, Jun 03, 2019 at 12:09:45AM -0000, BloodBlight wrote:
> I am still seeing this error on Ubuntu 18.04 with netplan.io 0.97:

> root@server:/etc/netplan# netplan apply
> Traceback (most recent call last):
> File "/usr/sbin/netplan", line 23, in <module>
> netplan.main()
> File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
> self.run_command()
> File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
> self.func()
> File "/usr/share/netplan/netplan/cli/commands/apply.py", line 43, in run
> self.run_command()
> File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
> self.func()
> File "/usr/share/netplan/netplan/cli/commands/apply.py", line 106, in command_apply
> stderr=subprocess.DEVNULL)
> File "/usr/lib/python3.6/subprocess.py", line 291, in check_call
> raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['udevadm', 'test-builtin',
> 'net_setup_link', '/sys/class/net/enp1s0f0:0']' returned non-zero exit
> status 4.

This is a bug report about QETH interfaces, which are an s390x-specific
driver. Since you are seeing an issue on amd64, please file a separate bug
report for this problem.
