RuntimeError: duplicate mac found! both 'wwan1' and 'wwan0'

Bug #2008888 reported by Ivar Simensen
This bug affects 1 person
Affects: cloud-init
Status: Fix Released
Importance: Undecided
Assigned to: Chad Smith

Bug Description

Hi
This bug is a variant of https://bugs.launchpad.net/cloud-init/+bug/1997922.

I got this error only when PXE booting a target with a modem that has dual interface support, and using ubuntu-22.04.2-live-server-amd64.iso or ubuntu-22.04.1-live-server-amd64.iso. After a manual installation, the target boots fine without this error.

Detected modems that cause this error:
Quectel EG25
Quectel RM510Q-GLHA
Sierra Wireless MC7455

This cloud-init failure prevents automatic installation of targets on our production line.

Tested solution:
After studying the info in bug 1997922, I downloaded https://github.com/canonical/cloud-init/blob/main/cloudinit/net/__init__.py and changed line 1043 to:
    if driver == "mscc_felix" or driver == "fsl_enetc" or driver == "qmi_wwan":

I then created a customized version of ubuntu-22.04.2-live-server-amd64.iso and tested it on our production line, and now everything works fine.
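To illustrate why adding the driver name to that condition helps, here is a minimal sketch of a duplicate-MAC check with a driver exemption. It is not the actual cloud-init code: the function name map_macs_to_names, the IGNORED_DRIVERS set and the driver_of callback are hypothetical names chosen for this example. Only the three driver strings and the error message format come from this report.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Drivers whose interfaces are skipped when a duplicate MAC is seen.
# The three names are taken from the modified condition quoted above;
# keeping them in a set is an assumption made for this sketch.
IGNORED_DRIVERS = {"mscc_felix", "fsl_enetc", "qmi_wwan"}


def map_macs_to_names(
    interfaces: Iterable[Tuple[str, str]],
    driver_of: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Return {mac: name}; raise on a duplicate MAC unless the driver is ignored.

    interfaces is an iterable of (name, mac) pairs, and driver_of is a
    hypothetical callback that returns the driver name for an interface.
    """
    seen: Dict[str, str] = {}
    for name, mac in interfaces:
        if mac in seen:
            if driver_of(name) in IGNORED_DRIVERS:
                # Known to report a shared MAC: keep the first name seen.
                continue
            raise RuntimeError(
                "duplicate mac found! both '%s' and '%s'" % (name, seen[mac])
            )
        seen[mac] = name
    return seen
```

With a callback that reports qmi_wwan for both wwan0 and wwan1, the second interface is skipped instead of raising, while two igb interfaces with the same MAC would still produce the RuntimeError.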

Logs:
cat > get_driver.py <<EOF
from cloudinit.net import device_driver
import sys

print(device_driver(sys.argv[1]))
EOF

for nic in wwan0 wwan1 eno1 eno2 eno3 eno4 eno5 eno6; do
 echo "----- $nic"
 ls -l /sys/class/net/$nic/
 python3 ./get_driver.py $nic;
done

Output:
---- wwan0
lrwxrwxrwx 1 root root 0 Mar 1 08:11 /sys/class/net/wwan0 -> ../../devices/pci0000:00/0000:00:15.0/usb2/2-1/2-1:1.4/net/wwan0
qmi_wwan
---- wwan1
lrwxrwxrwx 1 root root 0 Mar 1 08:11 /sys/class/net/wwan1 -> ../../devices/pci0000:00/0000:00:15.0/usb1/1-4/1-4:1.4/net/wwan1
qmi_wwan
---- eno1
lrwxrwxrwx 1 root root 0 Mar 1 08:10 /sys/class/net/eno1 -> ../../devices/pci0000:00/0000:00:09.0/0000:02:00.0/net/eno1
igb
---- eno2
lrwxrwxrwx 1 root root 0 Mar 1 08:10 /sys/class/net/eno2 -> ../../devices/pci0000:00/0000:00:0a.0/0000:03:00.0/net/eno2
igb
---- eno3
lrwxrwxrwx 1 root root 0 Mar 1 08:10 /sys/class/net/eno3 -> ../../devices/pci0000:00/0000:00:0b.0/0000:04:00.0/net/eno3
igb
---- eno4
lrwxrwxrwx 1 root root 0 Mar 1 08:10 /sys/class/net/eno4 -> ../../devices/pci0000:00/0000:00:0c.0/0000:05:00.0/net/eno4
igb
---- eno5
lrwxrwxrwx 1 root root 0 Mar 1 08:10 /sys/class/net/eno5 -> ../../devices/pci0000:00/0000:00:16.0/0000:0a:00.0/net/eno5
ixgbe
---- eno6
lrwxrwxrwx 1 root root 0 Mar 1 08:10 /sys/class/net/eno6 -> ../../devices/pci0000:00/0000:00:16.0/0000:0a:00.1/net/eno6
ixgbe

Please let me know if you want more info or help testing a final solution.

Chad Smith (chad.smith) wrote :

Thank you for filing this bug and helping improve Ubuntu and cloud-init.

While I think your solution is probably what we need to go forward with, I wanted to double check a couple of other data points in SysFs so we can be aware if there may be other config artifacts that can alert cloud-init to this type of device in the future.

1. In your bug report above, it looks like you intended to loop over each device, running 'ls -l /sys/class/net/$nic/'. But the output looks like it came from 'ls -l /sys/class/net/$nic' (without the trailing forward slash '/'). That only tells us which directory the symlink points to, but we need to see the files and timestamps inside that linked directory.

Please run:
for dev in wwan0 wwan1 eno1 eno2 eno3 eno4 eno5 eno6; do
  echo ---- $dev;
  ls -l --full-time /sys/class/net/$dev/device/driver;
  cat /sys/class/net/$dev/device/driver;
  ls -l --full-time /sys/class/net/$dev/;
done

Much thanks. The --full-time timestamps are important when we compare against the cloud-init logs collected from journalctl so we can see specifically the ordering of when device driver detection is complete for the device versus cloud-init's network discovery operations (logged in cloud-init.log).
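For anyone who prefers to collect the same data points from Python, the following is a rough sketch that reads the symlink target, the driver name and the driver symlink's modification time for one interface. The function name describe_nic and the returned keys are hypothetical, and the sys_net parameter exists only so the function can be pointed at a different directory for testing. It is not a replacement for the requested shell output, which also lists the files inside each device directory.

```python
import os
from datetime import datetime
from typing import Dict, Optional


def describe_nic(name: str, sys_net: str = "/sys/class/net") -> Dict[str, Optional[str]]:
    """Collect the symlink target, driver name and driver-link mtime for a NIC."""
    nic_path = os.path.join(sys_net, name)
    driver_link = os.path.join(nic_path, "device", "driver")
    info: Dict[str, Optional[str]] = {
        "name": name,
        "target": None,
        "driver": None,
        "driver_link_mtime": None,
    }
    if os.path.islink(nic_path):
        # Where the /sys/class/net entry points (what plain 'ls -l' showed).
        info["target"] = os.readlink(nic_path)
    if os.path.islink(driver_link):
        # The driver name is the last path component of the symlink target.
        info["driver"] = os.path.basename(os.readlink(driver_link))
        # lstat() reports the timestamp of the link itself, not its target.
        mtime = os.lstat(driver_link).st_mtime
        info["driver_link_mtime"] = datetime.fromtimestamp(mtime).isoformat()
    return info
```

Calling describe_nic("wwan0") on the affected system would return a dictionary whose timestamp can then be compared by hand against entries in cloud-init.log.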

Chad Smith (chad.smith) wrote :

Marked bug status incomplete while we await /sys/class/net/ output from your affected system.

Please also attach the output of `ip addr`.

Once the /sys/class/net information and the `ip addr` output are attached, please set this bug back to New so we can confirm the solution needed.

Many thanks

Changed in cloud-init:
status: New → Incomplete
Ivar Simensen (ivar-simensen) wrote :

Here is a new cloud-init log from today.

Ivar Simensen (ivar-simensen) wrote :

And here is the output from the requested for loop.

Ivar Simensen (ivar-simensen) wrote :

And here is the ip addr output:

root@ubuntu-server:/root# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether c4:00:ad:ab:a3:6c brd ff:ff:ff:ff:ff:ff
    altname enp2s0
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether c4:00:ad:ab:a3:6d brd ff:ff:ff:ff:ff:ff
    altname enp3s0
    inet 172.24.2.58/24 brd 172.24.2.255 scope global eno2
       valid_lft forever preferred_lft forever
4: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether c4:00:ad:ab:a3:6e brd ff:ff:ff:ff:ff:ff
    altname enp4s0
5: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether c4:00:ad:ab:a3:6f brd ff:ff:ff:ff:ff:ff
    altname enp5s0
6: eno5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether c4:00:ad:ab:a3:70 brd ff:ff:ff:ff:ff:ff
    altname enp10s0f0
7: eno6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether c4:00:ad:ab:a3:71 brd ff:ff:ff:ff:ff:ff
    altname enp10s0f1
8: wwan0: <POINTOPOINT,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/none
9: wwan1: <POINTOPOINT,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/none
root@ubuntu-server:/root#

Changed in cloud-init:
status: Incomplete → New
Chad Smith (chad.smith)
Changed in cloud-init:
status: New → Triaged
Chad Smith (chad.smith) wrote :

Thank you for the additional logs and /sys/class/net files. They confirm that our best course of action on this hardware at the moment is to ignore duplicate MACs based on the associated driver. I've started work on a quick upstream PR that will land shortly with this support. I've uploaded a package for Ubuntu Lunar that would allow testing if you have access to these modems.

Steps to test:
sudo add-apt-repository ppa:chad.smith/lp-2008888 -y
sudo apt install cloud-init -y
sudo cloud-init clean --logs --reboot
# After reboot
cloud-init status --long  # assert no errors
grep -E 'Traceback|WARNING' /var/log/cloud-init.log  # assert no warnings/traces
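
The last check can also be scripted. The following is a small sketch that scans a log for the same two patterns; the function names find_problem_lines and check_log are hypothetical, and only the log path and the pattern come from the steps above.

```python
import re
from typing import Iterable, List

# Same pattern as in the manual check above.
PATTERN = re.compile(r"Traceback|WARNING")


def find_problem_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that contains 'Traceback' or 'WARNING'."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def check_log(path: str = "/var/log/cloud-init.log") -> List[str]:
    """Scan a cloud-init log file and return the matching lines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return find_problem_lines(fh)
```

An empty list from check_log() corresponds to the "no warnings/traces" condition.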

Chad Smith (chad.smith) wrote :
Changed in cloud-init:
status: Triaged → In Progress
assignee: nobody → Chad Smith (chad.smith)
Changed in cloud-init:
status: In Progress → Fix Committed
Chad Smith (chad.smith) wrote : Fixed in cloud-init version 23.2.

This bug is believed to be fixed in cloud-init version 23.2. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released