Ubuntu
libvirt package

running guests freeze when a guest is powered down

Bug #673705 reported by Mathias Gug on 2010-11-10

This bug affects 1 person

	Status	Importance	Assigned to
libvirt (Fedora)	Fix Released	Medium	redhat-bugs #609463
libvirt (Ubuntu)	Fix Released	Low	Unassigned
linux (Ubuntu)	Invalid	Undecided	Unassigned

Bug Description

Binary package hint: qemu-kvm

I'm running multiple guests via libvirt+kvm on my laptop. When I power off one of them, all remaining running guests freeze for 30 seconds.

On the host, top shows the ksmd process as using the most cpu.

I've attached a sar output file. One guest was powered down 02:56:07 PM. The remaining running guest froze until 02:56:37 PM.

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: qemu-kvm 0.12.5+noroms-0ubuntu7
ProcVersionSignature: Ubuntu 2.6.35-22.35-generic 2.6.35.4
Uname: Linux 2.6.35-22-generic x86_64
Architecture: amd64
Date: Wed Nov 10 14:58:57 2010
EcryptfsInUse: Yes
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
KvmCmdLine:
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
117 28028 1 9 155582 296184 2 14:54 ? 00:00:26 /usr/bin/kvm -S -M pc-0.12 -enable-kvm -m 390 -smp 1,sockets=1,cores=1,threads=1 -name t-test2 -uuid d013faff-add9-48d4-8aa3-70a24f8717b2 -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/t-test2.monitor,server,nowait -mon chardev=monitor,mode=readline -rtc base=utc -boot cd -drive file=/home/mathiaz/reference/vms/t-test2/disk.qcow2,if=none,id=drive-virtio-disk0,boot=on,format=qcow2 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -device virtio-net-pci,vlan=0,id=net0,mac=52:54:00:ae:d2:51,bus=pci.0,addr=0x3 -net tap,fd=40,vlan=0,name=hostnet0 -usb -vnc 127.0.0.1:1 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
MachineType: LENOVO 3249CTO
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-22-generic root=UUID=97b2f151-9aee-416a-8156-0585e0766d3d ro quiet splash
ProcEnviron:
PATH=(custom, user)
LANG=en_CA.utf8
SHELL=/bin/bash
SourcePackage: qemu-kvm
dmi.bios.date: 01/26/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 6QET35WW (1.05 )
dmi.board.name: 3249CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6QET35WW(1.05):bd01/26/2010:svnLENOVO:pn3249CTO:pvrThinkPadX201:rvnLENOVO:rn3249CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 3249CTO
dmi.product.version: ThinkPad X201
dmi.sys.vendor: LENOVO

Tags:

Related branches

lp:~serge-hallyn/ubuntu/natty/libvirt/fix-macaddr

lp:~serge-hallyn/ubuntu/natty/libvirt/fix-maccaddr2

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-10:

sar.output (308.7 KiB, application/octet-stream)
BootDmesg.txt (60.5 KiB, text/plain; charset="utf-8")
CurrentDmesg.txt (24.0 KiB, text/plain; charset="utf-8")
Dependencies.txt (2.3 KiB, text/plain; charset="utf-8")
Lspci.txt (12.6 KiB, text/plain; charset="utf-8")
Lsusb.txt (490 bytes, text/plain; charset="utf-8")
ProcCpuinfo.txt (3.3 KiB, text/plain; charset="utf-8")
ProcInterrupts.txt (1.9 KiB, text/plain; charset="utf-8")
ProcModules.txt (4.2 KiB, text/plain; charset="utf-8")
RelatedPackageVersions.txt (544 bytes, text/plain; charset="utf-8")
UdevDb.txt (106.2 KiB, text/plain; charset="utf-8")
UdevLog.txt (234.0 KiB, text/plain; charset="utf-8")

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-10:

The kvm command line included in the attached information belongs to the remaining running guest, which freezes for 30 seconds after another guest is shut down. It is started from libvirt, and both guests use a qemu snapshot file as the backend for their disk device.

summary:

- running guests freeze when one of the guest is powered down
+ running guests freeze when a guest is powered down

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2010-11-10:

I assume you've not made any customizations to /etc/default/qemu-kvm
or /etc/init/qemu-kvm.conf? I'll try to reproduce this tonight.

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-10:

It seems that it is actually the network that freezes when another guest is shut down. Sharing a screen session between an ssh connection and a console session (via virt-viewer) shows that only the ssh connection is frozen. The remaining guest keeps running correctly, as seen over the console connection.

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-10:

qemu-kvm default file (144 bytes, text/plain)

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-10:

qemu-kvm upstart job (1.2 KiB, text/plain)

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2010-11-11:

This sounds most like comments #2 and #16 in bug 584048.

It can't be bug #579892, because that patch is in maverick.

Do you have wicd or network-manager running?

Can you give the output of:

   iptables -L?
   cat /etc/network/interfaces
   cat /proc/net/arp (both before and during a network freeze)

Also, could you follow the recipe at https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/616064/comments/33 ? That should determine whether this is a kernel or libvirt/qemu bug.

thanks,
-serge

Changed in qemu-kvm (Ubuntu):
status:	New → Incomplete
importance:	Undecided → Medium
assignee:	nobody → Serge Hallyn (serge-hallyn)
importance:	Medium → Low

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-11:

Network manager is running.

$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:53 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:53 
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:67 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:67 
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:53 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:53 
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:67 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:67

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            192.168.233.0/24    state RELATED,ESTABLISHED 
ACCEPT     all  --  192.168.233.0/24     0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable 
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable 
ACCEPT     all  --  0.0.0.0/0            192.168.122.0/24    state RELATED,ESTABLISHED 
ACCEPT     all  --  192.168.122.0/24     0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable 
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

$ sudo iptables -nL -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  tcp  --  192.168.233.0/24    !192.168.233.0/24    masq ports: 1024-65535 
MASQUERADE  udp  --  192.168.233.0/24    !192.168.233.0/24    masq ports: 1024-65535 
MASQUERADE  all  --  192.168.233.0/24    !192.168.233.0/24    
MASQUERADE  tcp  --  192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535 
MASQUERADE  udp  --  192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535 
MASQUERADE  all  --  192.168.122.0/24    !192.168.122.0/24

$ cat /etc/network/interfaces
auto lo
iface lo inet loopback

Two guests: 179 was shut down, 110 froze.

Before:
$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.242.1    0x1         0x2         00:12:17:1a:50:47     *        eth0
192.168.122.110  0x1         0x2         52:54:00:a2:4e:07     *        virbr0
192.168.122.179  0x1         0x2         52:54:00:72:58:e3     *        virbr0

During the freeze:
$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.242.1    0x1         0x2         00:12:17:1a:50:47     *        eth0
192.168.122.110  0x1         0x2         52:54:00:a2:4e:07     *        virbr0
192.168.122.179  0x1         0x2         52:54:00:72:58:e3     *        virbr0

Just after the freeze stopped:
$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.242.1    0x1         0x2         00:12:17:1a:50:47     *        eth0
192.168.122.110  0x1         0x2         52:54:00:a2:4e:07     *        virbr0
192.168.122.179  0x1         0x2         52:54:00:72:58:e3     *        virbr0

Some time after the freeze stopped (guess: ~30 seconds):
$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.242.1    0x1         0x2         00:12:17:1a:50:47     *        eth0
192.168.122.110  0x1         0x2         52:54:00:a2:4e:07     *        virbr0

Revision history for this message

Mathias Gug (mathiaz) wrote on 2010-11-11:

Running https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/616064/comments/33 recipe for 15 minutes didn't lead to a lock up.

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2010-11-11:

#10

Sorry, Mathiaz, that recipe tests for connections which just
pause randomly. We need to test when a guest shuts down,
or, in other words, when the veth device is removed from the
bridge. So:

1. Fire up a guest with libvirt. Monitor its network
continuously, i.e. fire up a screen session over ssh running
while [ 1 ]; do echo -n .; sleep 5s; done
and keep that open so you can see any pauses.

2. Get a usable ns_exec:

    git clone git://git.sr71.net/~hallyn/cr_tests.git
    cd cr_tests
    git checkout ns_exec
    make ns_exec
    cp ns_exec /bin/

3. Create a veth tunnel

sudo ip link add type veth

4. Open two root terminals to configure a network namespace for our test

terminal 1:
ip link add type veth
terminal 2:
/bin/ns_exec -cmn /bin/bash
echo $$ # call this $pid henceforth
terminal 1:
ifconfig veth0 0.0.0.0 up
brctl addif virbr0 veth0
ip link set veth1 netns $pid # use pid from above
terminal 2:
ifconfig veth1 up
dhclient veth1

5. Now we want to emulate shutting down a libvirt guest. Let's try
several ways:

A. From the host root shell, just remove veth0 from the bridge:

brctl delif virbr0 veth0

B. Shut down the veth interfaces. Try veth0 and veth1 on separate
runs (ifconfig veth0 down).

C. Just exit the child shell.

D. Shut down the child shell, and then remove the veth interfaces
altogether, by doing:

ip link del veth0

After each test, please remove the veth devices so that the commands
in step 4 (which reference veth0/veth1) stay correct:

ip link del veth0
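
The dot loop in step 1 only shows a pause visually. A minimal sketch for measuring it, assuming you log one epoch timestamp on the host each time a dot arrives over ssh. The find_gaps helper and the sample timestamps are made up for illustration and are not part of the recipe above.

```shell
#!/bin/sh
# find_gaps THRESHOLD: read one epoch timestamp per line on stdin and
# report every interval between consecutive lines longer than THRESHOLD
# seconds. Hypothetical helper, not part of any package.
find_gaps() {
    awk -v limit="$1" '
        NR > 1 && $1 - prev > limit {
            printf "gap of %d s between %d and %d\n", $1 - prev, prev, $1
        }
        { prev = $1 }
    '
}

# Made-up arrival times: one tick every 5 s, with a single 30 s stall.
printf '%s\n' 100 105 110 140 145 | find_gaps 10
```

With ticks every 5 seconds, a threshold of 10 leaves normal jitter unreported, while a stall like the 30-second freeze from the bug description shows up as a single line.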

Revision history for this message

Jeremy Foshee (jeremyfoshee) wrote on 2011-01-12:

#11

Hi Mathias,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 673705

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags:	added: needs-kernel-logs
tags:	added: needs-upstream-testing
tags:	added: kj-triage
Changed in linux (Ubuntu):
status:	New → Incomplete

Revision history for this message

Derek Simkowiak (ubuntu-cool-st) wrote on 2011-04-10:

#12

This bug is probably the same issue as this one:

https://bugs.launchpad.net/ubuntu/maverick/+source/qemu-kvm/+bug/584048

The problem is with Linux bridging. When an interface (and with it a MAC address) is added to or removed from the bridge, as happens for KVM, VirtualBox, or even LXC guests, the bridge may change its own MAC, and that is when this symptom appears. See comment #60 at the URL above, and also see this:

https://www.redhat.com/archives/libvir-list/2010-July/msg00450.html

The workaround is to use a MAC address that starts with "fe" (or any really high number) for your guests. This causes the kernel to default to the hardware MAC for the bridge.
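
To illustrate the mechanism: a Linux bridge without an explicitly set address takes the lowest MAC among its attached ports. The sketch below applies that rule to the two guest MACs from the ARP output earlier in this report. It assumes the guest MACs stand in for the host-side port MACs (the real tap addresses may differ, for example in the first byte), and bridge_mac is a hypothetical helper, not a real tool.

```shell
#!/bin/sh
# bridge_mac MAC...: print the address a bridge with no explicitly set
# MAC would take, i.e. the lowest one among its attached ports.
# Assumes lowercase, colon-separated MACs, so a byte-wise sort matches
# numeric order.
bridge_mac() {
    printf '%s\n' "$@" | LC_ALL=C sort | head -n 1
}

guest_110=52:54:00:a2:4e:07   # guest that froze
guest_179=52:54:00:72:58:e3   # guest that was shut down

before=$(bridge_mac "$guest_110" "$guest_179")
after=$(bridge_mac "$guest_110")

echo "bridge MAC with both guests: $before"
echo "bridge MAC after 179 stops:  $after"
[ "$before" = "$after" ] || echo "bridge MAC changed"
```

Under these assumptions guest 179 holds the lowest address, so shutting it down changes the bridge MAC, which is consistent with guest 110 losing its network at that moment.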

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2011-04-10: Re: [Bug 673705] Re: running guests freeze when a guest is powered down

#13

Quoting Derek Simkowiak (<email address hidden>):
> This bug is probably a duplicate issue as this:
>
> https://bugs.launchpad.net/ubuntu/maverick/+source/qemu-kvm/+bug/584048
>
> The problem is with Linux bridging. When adding or removing a MAC
> address (like for KVM, VirtualBox, or even LXC) then if the bridge
> changes its MAC, this symptom happens. See comment #60 at the URL
> above, and also see this:
>
> https://www.redhat.com/archives/libvir-list/2010-July/msg00450.html
>
> The workaround is to use a MAC address that starts with "fe" (or any
> high really high number) for your guests. This causes the kernel to
> default to the hardware MAC for the bridge.

Derek, thanks so much for this. You're almost 100% correct. I had even
suspected that bug (in comment #7), but I failed to make the simple
connection explaining this behavior.

The proposed solution does *not* suffice, and in fact I just reproduced
the bug in natty!

The current solution counts on the bridge being associated with a physical
NIC. But our default configuration uses a NAT bridge which starts with
no physical devices attached. So it starts with a zeroed-out MAC address.
Then every time you start a VM with a lower MAC address than the first
VM's, you can see this pause. And if you shut down the VM with the
lowest MACADDR, you'll again see the pause.

Changed in qemu-kvm (Ubuntu):
status:	Incomplete → Confirmed
Changed in linux (Ubuntu):
status:	Incomplete → Invalid
Changed in qemu-kvm (Ubuntu):
importance:	Low → High

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2011-04-10:

#14

Hi Mathias,

I've contacted upstream about this issue.

For now, the simplest workaround would be to create a tap or veth NIC with a low macaddr and always keep it attached to your virbr0.
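
A minimal sketch of that workaround, under stated assumptions: the interface name (virbr0-pin) and the MAC (02:00:00:00:00:01) are made up for illustration, and it assumes an iproute2 recent enough to provide `ip tuntap` (older systems may need a different tool to create the tap device). The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually apply them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# pin_bridge_mac BRIDGE: attach a permanent tap port with a low MAC so
# that guest ports coming and going never become the lowest address.
pin_bridge_mac() {
    bridge=$1
    pin_if="${bridge}-pin"        # hypothetical interface name
    pin_mac=02:00:00:00:00:01     # hypothetical low, locally administered MAC

    run ip tuntap add dev "$pin_if" mode tap
    run ip link set dev "$pin_if" address "$pin_mac"
    run brctl addif "$bridge" "$pin_if"
}

pin_bridge_mac virbr0
```

The idea is that as long as this port stays attached and its MAC is lower than that of any guest port, starting or stopping guests should no longer change which address is the lowest on the bridge.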

Serge Hallyn (serge-hallyn) on 2011-04-10

affects:	qemu-kvm (Ubuntu) → libvirt (Ubuntu)
Changed in libvirt (Ubuntu):
assignee:	Serge Hallyn (serge-hallyn) → nobody

Dave Walker (davewalker) on 2011-04-11

tags:

added: server-nrs

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2011-04-12:

#15

Upstream fixed this in 0.9.0 with commit 5754dbd56d4738112a86776c09e810e32f7c3224

Serge Hallyn (serge-hallyn) on 2011-04-12

Changed in libvirt (Ubuntu):
status:	Confirmed → In Progress
assignee:	nobody → Serge Hallyn (serge-hallyn)

Serge Hallyn (serge-hallyn) on 2011-04-12

Changed in libvirt (Ubuntu):
importance:	High → Low
status:	In Progress → Triaged
assignee:	Serge Hallyn (serge-hallyn) → nobody
tags:	removed: server-nrs

Revision history for this message

Serge Hallyn (serge-hallyn) wrote on 2011-10-04:

#16

Marking this Fix Released, as the upstream commit fixing it is in oneiric.

Changed in libvirt (Ubuntu):
status:	Triaged → Fix Released

Bug Watch Updater (bug-watch-updater) on 2017-10-27

Changed in libvirt (Fedora):
importance:	Unknown → Medium
status:	Unknown → Fix Released


Remote bug watches

redhat-bugs #609463
[CLOSED ERRATA]
