Network manager crashes and restarts 2 seconds after establishing a VPN connection

Bug #1794966 reported by Piotr Kołaczkowski
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
network-manager (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

When trying to get on a company VPN, the connection is initially established and works (I can access intranet websites), but after ~1-2 seconds from starting VPN connection the Network Manager seems to crash and restart, taking all connections down. This makes VPN totally unusable.

syslog:

// VPN activation:
...
Sep 28 13:48:22 p5520 nm-dispatcher: req:4 'vpn-up' [tun0]: new request (2 scripts)
Sep 28 13:48:22 p5520 nm-dispatcher: req:4 'vpn-up' [tun0]: start running ordered scripts...
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5368] device (tun0): Activation: starting connection 'tun0' (6bd0ba91-b53b-4e9a-b94c-394a9a9f336b)
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5370] device (tun0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Sep 28 13:48:22 p5520 systemd[1]: Starting resolvconf-pull-resolved.service...
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5377] device (tun0): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5382] device (tun0): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5384] device (tun0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5391] device (tun0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5396] device (tun0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Sep 28 13:48:22 p5520 systemd[1]: Started resolvconf-pull-resolved.service.
Sep 28 13:48:22 p5520 NetworkManager[6892]: <info> [1538135302.5471] device (tun0): Activation: successful, device activated.

// VPN active and usable - why can't it stay so?!!!!

Sep 28 13:48:22 p5520 nm-dispatcher: req:5 'up' [tun0]: new request (2 scripts)
Sep 28 13:48:26 p5520 org.gnome.Shell.desktop[3652]: (/usr/lib/firefox/firefox:7239): dconf-WARNING **: 13:48:26.618: Unable to open /var/lib/flatpak/exports/share/dconf/profile/user: Brak dostępu
Sep 28 13:48:27 p5520 systemd[1]: Reloading OpenBSD Secure Shell server.
Sep 28 13:48:27 p5520 systemd[1]: Reloaded OpenBSD Secure Shell server.
Sep 28 13:48:27 p5520 nm-dispatcher: req:5 'up' [tun0]: start running ordered scripts...
Sep 28 13:48:27 p5520 systemd-resolved[833]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep 28 13:48:28 p5520 systemd[1]: Reloading OpenBSD Secure Shell server.
Sep 28 13:48:28 p5520 systemd[1]: Reloaded OpenBSD Secure Shell server.
Sep 28 13:48:28 p5520 systemd-resolved[833]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
Sep 28 13:48:28 p5520 systemd[1]: Reloading OpenBSD Secure Shell server.
Sep 28 13:48:28 p5520 systemd[1]: Reloaded OpenBSD Secure Shell server.

// Something fishy here:

Sep 28 13:48:30 p5520 NetworkManager[6892]: malloc(): corrupted unsorted chunks 3
Sep 28 13:48:30 p5520 systemd[1]: NetworkManager.service: Main process exited, code=dumped, status=6/ABRT
Sep 28 13:48:30 p5520 systemd[1]: NetworkManager.service: Failed with result 'core-dump'.
Sep 28 13:48:30 p5520 gnome-shell[3652]: Removing a network device that was not added
Sep 28 13:48:30 p5520 gnome-shell[3652]: message repeated 3 times: [ Removing a network device that was not added]
Sep 28 13:48:30 p5520 systemd[1]: NetworkManager.service: Service RestartSec=100ms expired, scheduling restart.
Sep 28 13:48:30 p5520 systemd[1]: NetworkManager.service: Scheduled restart job, restart counter is at 5.
Sep 28 13:48:30 p5520 systemd[1]: Stopped Network Manager Wait Online.
Sep 28 13:48:30 p5520 systemd[1]: Stopping Network Manager Wait Online...
Sep 28 13:48:30 p5520 systemd[1]: Stopped Network Manager.
Sep 28 13:48:30 p5520 systemd[1]: NetworkManager.service: Found left-over process 1581 (dhclient) in control group while starting unit. Ignoring.
Sep 28 13:48:30 p5520 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 28 13:48:30 p5520 systemd[1]: NetworkManager.service: Found left-over process 7924 (dhclient) in control group while starting unit. Ignoring.
Sep 28 13:48:30 p5520 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 28 13:48:30 p5520 systemd[1]: Starting Network Manager...
Sep 28 13:48:30 p5520 NetworkManager[8337]: <info> [1538135310.5754] NetworkManager (version 1.12.2) is starting... (after a restart)
Sep 28 13:48:30 p5520 NetworkManager[8337]: <info> [1538135310.5754] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 10-dns-resolved.conf, 20-connectivity-ubuntu.conf, no-mac-addr-change.conf) (etc: 10-globally-managed-devices.conf, default-wifi-powersave-on.conf)
Sep 28 13:48:30 p5520 NetworkManager[8337]: <info> [1538135310.5804] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Sep 28 13:48:30 p5520 NetworkManager[8337]: <info> [1538135310.5811] manager[0x559281e37020]: monitoring kernel firmware directory '/lib/firmware'.
Sep 28 13:48:30 p5520 NetworkManager[8337]: <info> [1538135310.5813] monitoring ifupdown state file '/run/network/ifstate'.
Sep 28 13:48:30 p5520 systemd[1]: Started Network Manager.

Can you at least tell me if there is a workaround for this?

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: network-manager 1.12.2-0ubuntu4
ProcVersionSignature: Ubuntu 4.18.0-7.8-generic 4.18.5
Uname: Linux 4.18.0-7-generic x86_64
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Fri Sep 28 13:49:02 2018
DistributionChannelDescriptor:
 # This is a distribution channel descriptor
 # For more information see http://wiki.ubuntu.com/DistributionChannelDescriptor
 canonical-oem-somerville-xenial-amd64-20160624-2
InstallationDate: Installed on 2017-04-12 (533 days ago)
InstallationMedia: Ubuntu 16.04 "Xenial" - Build amd64 LIVE Binary 20160624-10:47
NetworkManager.state:
 [main]
 NetworkingEnabled=true
 WirelessEnabled=false
 WWANEnabled=false
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=pl_PL.UTF-8
 SHELL=/bin/bash
SourcePackage: network-manager
UpgradeStatus: No upgrade log present (probably fresh install)
nmcli-nm:
 RUNNING VERSION STATE STARTUP CONNECTIVITY NETWORKING WIFI-HW WIFI WWAN-HW WWAN
 running 1.12.2 connected started full enabled enabled disabled enabled disabled

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :
Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

The problem seem to exist only if the laptop is connected to wired connection through TB16 dock.
Otherwise, when on WiFi, VPN connects fine and stays stable.

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

Tried rebooting the laptop / replugging the dock / rebooting the dock. No change.
My TB16 firmware is up-to-date.

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :
Download full text (10.4 KiB)

Ok, I did some debugging and this looks really weird.
First, trying to run NetworkManager --debug manually:

<info> [1538138381.5570] audit: op="connection-activate" uuid="1a43441b-6b26-4e0c-8343-976afcf790fd" name="DataStax" pid=3717 uid=1001 result="success"
<info> [1538138381.5755] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",0]: Started the VPN service, PID 8148
<info> [1538138381.5973] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",0]: Saw the service appear; activating connection
<info> [1538138381.6059] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",0]: VPN plugin: state changed: starting (3)
<info> [1538138381.6060] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",0]: VPN connection: (ConnectInteractive) reply received
<info> [1538138381.6485] manager: (tun0): new Tun device (/org/freedesktop/NetworkManager/Devices/6)
<info> [1538138381.6586] devices added (path: /sys/devices/virtual/net/tun0, iface: tun0)
<info> [1538138381.6586] device added (path: /sys/devices/virtual/net/tun0, iface: tun0): no ifupdown configuration found.
<info> [1538138382.5173] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",0]: VPN connection: (IP4 Config Get) reply received from old-style plugin
<info> [1538138382.5177] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: VPN Gateway: 12.111.84.106
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Tunnel Device: "tun0"
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: IPv4 configuration:
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Internal Address: 10.150.1.247
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Internal Prefix: 32
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Internal Point-to-Point Address: 10.150.1.247
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Static Route: 0.0.0.0/0 Next Hop: 0.0.0.0
<info> [1538138382.5178] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Static Route: 10.150.1.247/32 Next Hop: 0.0.0.0
<info> [1538138382.5179] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Internal DNS: 10.200.1.10
<info> [1538138382.5179] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: Internal DNS: 10.200.1.11
<info> [1538138382.5179] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: DNS Domain: 'datastax.lan'
<info> [1538138382.5179] vpn-connection[0x55c2f915e0b0,1a43441b-6b26-4e0c-8343-976afcf790fd,"DataStax",7:(tun0)]: Data: No IPv6 configuration
<info> [1...

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

Retried a few times and it works fine under valgrind.

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

Looks like a bug in the memory allocator.

Running fine under tcmalloc_minimal:
root@p5520:/home/pkolaczk# export LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4"
root@p5520:/home/pkolaczk# NetworkManager --debug
<info> [1538139465.9143] NetworkManager (version 1.12.2) is starting... (after a restart)
<info> [1538139465.9143] Read config: /etc/NetworkManager/NetworkManager.conf (lib: 10-dns-resolved.conf, 20-connectivity-ubuntu.conf, no-mac-addr-change.conf) (etc: 10-globally-managed-devices.conf, default-wifi-powersave-on.conf)
...
<info> [1538139466.1844] manager: rfkill: WWAN enabled by radio killswitch; disabled by state file
iface-state: 'external')
<info> [1538139473.5529] device (tun0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
<info> [1538139473.5622] device (tun0): Activation: successful, device activated.
// no crash

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

I'm attaching the crashdump. Core dump truncated, in order to not leak any sensitive information (VPN passwords, etc). It looks like there is a stacktrace there - I hope it helps.

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

Is there a way to configure only NetworkManager to use tcmalloc permanently?

I tried adding LD_PRELOAD to /etc/init.d/network-manager or /etc/init/network-manager.conf and it does not work.

However setting LD_PRELOAD tcmalloc globally for the whole system in /etc/ld.preload.conf does fix network manager but breaks other things like firefox.

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

How can I get debug symbols for NetworkManager?

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager (Ubuntu):
status: New → Confirmed
Revision history for this message
Iain Lane (laney) wrote :

I think this might be fixed in 1.12.4 - could you test it when it comes to cosmic please? (hopefully later today)

Revision history for this message
Piotr Kołaczkowski (pkolaczk-u) wrote :

Tested it today and indeed it looks like it was fixed. Thanks!

Revision history for this message
Iain Lane (laney) wrote :

Thanks to you!

Changed in network-manager (Ubuntu):
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.