EC2 kernel crash due to vmalloc

Bug #1350522 reported by Ben Howard
This bug affects 3 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   High         Unassigned
Utopic           Fix Released   High         Unassigned

Bug Description

During Alpha-2 automated testing, I saw the following in a log:

19:35:02 [ 2.475810] systemd-udevd[95]: starting version 204
19:35:02 [ 2.547049] ------------[ cut here ]------------
19:35:02 [ 2.547065] WARNING: CPU: 0 PID: 97 at /build/buildd/linux-3.16.0/mm/vmalloc.c:128 vmap_page_range_noflush+0x2d1/0x370()
19:35:02 [ 2.547069] Modules linked in:
19:35:02 [ 2.547073] CPU: 0 PID: 97 Comm: systemd-udevd Not tainted 3.16.0-6-generic #11-Ubuntu
19:35:02 [ 2.547077] 0000000000000009 ffff880002defb98 ffffffff81755538 0000000000000000
19:35:02 [ 2.547082] ffff880002defbd0 ffffffff8106bb0d ffff88000400ec88 0000000000000001
19:35:02 [ 2.547086] ffff880002fcfb00 ffffffffc0391000 0000000000000000 ffff880002defbe0
19:35:02 [ 2.547090] Call Trace:
19:35:02 [ 2.547096] [<ffffffff81755538>] dump_stack+0x45/0x56
19:35:02 [ 2.547101] [<ffffffff8106bb0d>] warn_slowpath_common+0x7d/0xa0
19:35:02 [ 2.547104] [<ffffffff8106bbea>] warn_slowpath_null+0x1a/0x20
19:35:02 [ 2.547108] [<ffffffff81197c31>] vmap_page_range_noflush+0x2d1/0x370
19:35:02 [ 2.547112] [<ffffffff81197cfe>] map_vm_area+0x2e/0x40
19:35:02 [ 2.547115] [<ffffffff8119a058>] __vmalloc_node_range+0x188/0x280
19:35:02 [ 2.547120] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547124] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547129] [<ffffffff8104f294>] module_alloc+0x74/0xd0
19:35:02 [ 2.547132] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547135] [<ffffffff810e92b4>] module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547146] [<ffffffff810e9a6c>] layout_and_allocate+0x74c/0xc70
19:35:02 [ 2.547149] [<ffffffff810ea063>] load_module+0xd3/0x1b70
19:35:02 [ 2.547154] [<ffffffff811cfeb1>] ? vfs_read+0xf1/0x170
19:35:02 [ 2.547157] [<ffffffff810e7aa1>] ? copy_module_from_fd.isra.46+0x121/0x180
19:35:02 [ 2.547161] [<ffffffff810ebc76>] SyS_finit_module+0x86/0xb0
19:35:02 [ 2.547167] [<ffffffff8175de7f>] tracesys+0xe1/0xe6
19:35:02 [ 2.547169] ---[ end trace 8a5de7fc66e75fe4 ]---
19:35:02 [ 2.547172] vmalloc: allocation failure, allocated 20480 of 24576 bytes
19:35:02 [ 2.547175] systemd-udevd: page allocation failure: order:0, mode:0xd2
19:35:02 [ 2.547180] CPU: 0 PID: 97 Comm: systemd-udevd Tainted: G W 3.16.0-6-generic #11-Ubuntu
19:35:02 [ 2.547183] ffffffff81a88bc0 ffff880002defc08 ffffffff81755538 00000000000000d2
19:35:02 [ 2.547187] ffff880002defc90 ffffffff811642bf ffffffff81a88bc0 ffff880002defc28
19:35:02 [ 2.547191] 00003fff00000018 ffff880002defca0 ffff880002defc40 0000000000000163
19:35:02 [ 2.547195] Call Trace:
19:35:02 [ 2.547197] [<ffffffff81755538>] dump_stack+0x45/0x56
19:35:02 [ 2.547202] [<ffffffff811642bf>] warn_alloc_failed+0xdf/0x130
19:35:02 [ 2.547207] [<ffffffff8119a118>] __vmalloc_node_range+0x248/0x280
19:35:02 [ 2.547210] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547214] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547217] [<ffffffff8104f294>] module_alloc+0x74/0xd0
19:35:02 [ 2.547220] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547224] [<ffffffff810e92b4>] module_alloc_update_bounds+0x14/0x70
19:35:02 [ 2.547229] [<ffffffff810e9a6c>] layout_and_allocate+0x74c/0xc70
19:35:02 [ 2.547232] [<ffffffff810ea063>] load_module+0xd3/0x1b70
19:35:02 [ 2.547235] [<ffffffff811cfeb1>] ? vfs_read+0xf1/0x170
19:35:02 [ 2.547238] [<ffffffff810e7aa1>] ? copy_module_from_fd.isra.46+0x121/0x180
19:35:02 [ 2.547242] [<ffffffff810ebc76>] SyS_finit_module+0x86/0xb0
19:35:02 [ 2.547246] [<ffffffff8175de7f>] tracesys+0xe1/0xe6
19:35:02 [ 2.547248] Mem-Info:
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 30 20:19 seq
 crw-rw---- 1 root audio 116, 33 Jul 30 20:19 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.4-0ubuntu2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg:
 [ 42.901579] systemd-logind[1378]: New seat seat0.
 [ 42.932887] systemd-logind[1378]: New session 1 of user ubuntu.
DistroRelease: Ubuntu 14.10
Ec2AMI: ami-c571d8d8
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: sa-east-1b
Ec2InstanceType: m3.medium
Ec2Kernel: aki-5553f448
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
Package: linux-image-virtual 3.16.0.5.6
PackageArchitecture: amd64
PciMultimedia:

ProcFB:

ProcKernelCmdLine: root=LABEL=cloudimg-rootfs ro console=hvc0
ProcModules:

ProcVersionSignature: Ubuntu 3.16.0-5.10-generic 3.16.0-rc6
RelatedPackageVersions:
 linux-restricted-modules-3.16.0-5-generic N/A
 linux-backports-modules-3.16.0-5-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: utopic ec2-images third-party-packages
Uname: Linux 3.16.0-5-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy netdev plugdev sudo video
_MarkForUpload: True
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 31 20:13 seq
 crw-rw---- 1 root audio 116, 33 Jul 31 20:13 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.5-0ubuntu1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg:
 [ 17.975584] systemd-logind[1248]: New seat seat0.
 [ 17.993902] systemd-logind[1248]: New session 1 of user ubuntu.
DistroRelease: Ubuntu 14.10
Ec2AMI: ami-b7ce1dc0
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: eu-west-1c
Ec2InstanceType: m3.medium
Ec2Kernel: aki-52a34525
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
Package: linux-image-virtual 3.16.0.6.7
PackageArchitecture: amd64
PciMultimedia:

ProcFB:

ProcKernelCmdLine: root=LABEL=cloudimg-rootfs ro console=hvc0 initcall_debug debug ignore_loglevel LOGLEVEL=8
ProcModules:

ProcVersionSignature: Ubuntu 3.16.0-6.11-generic 3.16.0-rc7
RelatedPackageVersions:
 linux-restricted-modules-3.16.0-6-generic N/A
 linux-backports-modules-3.16.0-6-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: utopic ec2-images
Uname: Linux 3.16.0-6-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True

affects: ubuntu → linux (Ubuntu)
Ben Howard (darkmuggle-deactivatedaccount) wrote : BootDmesg.txt

apport information

tags: added: apport-collected ec2-images third-party-packages utopic
description: updated
Ben Howard (darkmuggle-deactivatedaccount) wrote : Dependencies.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : ProcCpuinfo.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : ProcEnviron.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : ProcInterrupts.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : UdevDb.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : UdevLog.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : WifiSyslog.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Another stack trace from another instance:

20:34:16 [ 2.425540] systemd-udevd[95]: starting version 204
20:34:16 [ 2.576097] ------------[ cut here ]------------
20:34:16 [ 2.576111] WARNING: CPU: 0 PID: 98 at /build/buildd/linux-3.16.0/mm/vmalloc.c:128 vmap_page_range_noflush+0x2d1/0x370()
20:34:16 [ 2.576115] Modules linked in:
20:34:16 [ 2.576120] CPU: 0 PID: 98 Comm: systemd-udevd Not tainted 3.16.0-6-generic #11-Ubuntu
20:34:16 [ 2.576123] 0000000000000009 ffff88000316fb98 ffffffff81755538 0000000000000000
20:34:16 [ 2.576128] ffff88000316fbd0 ffffffff8106bb0d ffff88000400df68 0000000000000001
20:34:16 [ 2.576132] ffff880003393900 ffffffffc01ed000 0000000000000000 ffff88000316fbe0
20:34:16 [ 2.576136] Call Trace:
20:34:16 [ 2.576147] [<ffffffff81755538>] dump_stack+0x45/0x56
20:34:16 [ 2.576152] [<ffffffff8106bb0d>] warn_slowpath_common+0x7d/0xa0
20:34:16 [ 2.576156] [<ffffffff8106bbea>] warn_slowpath_null+0x1a/0x20
20:34:16 [ 2.576159] [<ffffffff81197c31>] vmap_page_range_noflush+0x2d1/0x370
20:34:16 [ 2.576163] [<ffffffff81197cfe>] map_vm_area+0x2e/0x40
20:34:16 [ 2.576167] [<ffffffff8119a058>] __vmalloc_node_range+0x188/0x280
20:34:16 [ 2.576175] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
20:34:16 [ 2.576178] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
20:34:16 [ 2.576183] [<ffffffff8104f294>] module_alloc+0x74/0xd0
20:34:16 [ 2.576187] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
20:34:16 [ 2.576190] [<ffffffff810e92b4>] module_alloc_update_bounds+0x14/0x70
20:34:16 [ 2.576194] [<ffffffff810e9a6c>] layout_and_allocate+0x74c/0xc70
20:34:16 [ 2.576198] [<ffffffff810ea063>] load_module+0xd3/0x1b70
20:34:16 [ 2.576202] [<ffffffff811cfeb1>] ? vfs_read+0xf1/0x170
20:34:16 [ 2.576206] [<ffffffff810e7aa1>] ? copy_module_from_fd.isra.46+0x121/0x180
20:34:16 [ 2.576209] [<ffffffff810ebc76>] SyS_finit_module+0x86/0xb0
20:34:16 [ 2.576217] [<ffffffff8175de7f>] tracesys+0xe1/0xe6
20:34:16 [ 2.576220] ---[ end trace 0f39a359bc0e983c ]---
20:34:16 [ 2.576223] vmalloc: allocation failure, allocated 20480 of 24576 bytes
20:34:16 [ 2.576226] systemd-udevd: page allocation failure: order:0, mode:0xd2
20:34:16 [ 2.576229] CPU: 0 PID: 98 Comm: systemd-udevd Tainted: G W 3.16.0-6-generic #11-Ubuntu
20:34:16 [ 2.576232] ffffffff81a88bc0 ffff88000316fc08 ffffffff81755538 00000000000000d2
20:34:16 [ 2.576236] ffff88000316fc90 ffffffff811642bf ffffffff81a88bc0 ffff88000316fc28
20:34:16 [ 2.576239] 00003fff00000018 ffff88000316fca0 ffff88000316fc40 0000000000000163
20:34:16 [ 2.576246] Call Trace:
20:34:16 [ 2.576249] [<ffffffff81755538>] dump_stack+0x45/0x56
20:34:16 [ 2.576254] [<ffffffff811642bf>] warn_alloc_failed+0xdf/0x130
20:34:16 [ 2.576257] [<ffffffff8119a118>] __vmalloc_node_range+0x248/0x280
20:34:16 [ 2.576261] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
20:34:16 [ 2.576264] [<ffffffff810e92b4>] ? module_alloc_update_bounds+0x14/0x70
20:34:16 [ 2.576267] [<ffffffff8104f294>] module_alloc+...


Ben Howard (darkmuggle-deactivatedaccount) wrote :

I am seeing this happen in about 5% of the launches on EC2.

Chris J Arges (arges) wrote :

Ben,

A few questions that would be useful to answer:
* It would be useful to correlate Xen hypervisor versions. On each launch with warnings we need to grep 'Xen version' in dmesg.
* Can you add the following to kernel cmdline options to get additional debug information:
initcall_debug debug ignore_loglevel LOGLEVEL=8
* If you see this problem in an instance, if you reboot does the same problem occur?

Thanks,

Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/1350522

tags: added: iso-testing
Ben Howard (darkmuggle-deactivatedaccount) wrote : BootDmesg.txt

apport information

description: updated
Ben Howard (darkmuggle-deactivatedaccount) wrote : Dependencies.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : ProcCpuinfo.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : ProcEnviron.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : ProcInterrupts.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : UdevDb.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : UdevLog.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote : WifiSyslog.txt

apport information

Ben Howard (darkmuggle-deactivatedaccount) wrote :

* It would be useful to correlate Xen hypervisor versions. On each launch with warnings we need to grep 'Xen version' in dmesg.
Xen 4.2, w/ Amazon patches.

* Can you add the following to kernel cmdline options to get additional debug information:
initcall_debug debug ignore_loglevel LOGLEVEL=8
Added in the latest set of logs

* If you see this problem in an instance, if you reboot does the same problem occur?
Multiple reboots yield the same issue, but at different points
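
When many instances are launched, the hypervisor version check from the first question can be scripted instead of grepped by hand. A minimal sketch, assuming the dmesg text has already been captured into a string and contains a line of the form "Xen version: <version>"; the helper name `xen_version` is made up for illustration.

```python
import re

# Hypothetical helper: pull the hypervisor version out of captured dmesg text.
# Assumes the kernel printed a line of the form "Xen version: <version> ...".
XEN_VERSION_RE = re.compile(r"Xen version:\s*(\S+)")


def xen_version(dmesg_text):
    """Return the reported Xen version string, or None if no such line exists."""
    match = XEN_VERSION_RE.search(dmesg_text)
    return match.group(1) if match else None


# Sample line in the format dmesg uses on a Xen PV guest.
sample = "[ 0.000000] Xen version: 3.1.2-308.8.2.el5 (preserve-AD)"
print(xen_version(sample))
```

Collecting this value next to a "warning seen / not seen" flag per launch would be one way to correlate failures with hypervisor versions.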

Stefan Bader (smb) wrote :

So I guess the next question would be whether, in the 95% of launches that do work (deduced from the ~5% failing), all hosts run a Xen version != 4.2, or a mix of versions including 4.2. At least the fact that this persists across plain reboots makes some hypervisor code involvement likely.

The log looks to me like further proof that it is systemd-udevd itself at the bottom of the problem. Which probably should not surprise me with the usual tendency of taking over the world that systemd seems to have. So systemd-udevd is loading a module when it gets this. We can see it is a vmalloc, so the size of the area is not that critical.
All the cases I looked at had "vmalloc: allocation failure, allocated 20480 of 24576 bytes" after the first trace. The way I understand that first warning is: we got some space to store a PTE (page table entry) but that slot seems to already (or still) contain a valid entry. This would be returned as -EBUSY.

After the warning about the PTE, the two lines about the allocation failure are printed:
  vmalloc: allocation failure, allocated 20480 of 24576 bytes
  systemd-udevd: page allocation failure: order:0, mode:0xd2
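
For comparing many console logs, the byte counts in the first of these lines can be extracted and converted to pages. A rough sketch, assuming 4 KiB pages; `parse_vmalloc_failure` and the regular expression are illustrative names, not kernel interfaces.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86_64 base page size

# Matches the message quoted above; the pattern itself is only illustrative.
VMALLOC_FAIL_RE = re.compile(
    r"vmalloc: allocation failure, allocated (\d+) of (\d+) bytes"
)


def parse_vmalloc_failure(line):
    """Return (allocated_pages, requested_pages), or None if the line does not match."""
    match = VMALLOC_FAIL_RE.search(line)
    if match is None:
        return None
    allocated, requested = (int(group) for group in match.groups())
    return allocated // PAGE_SIZE, requested // PAGE_SIZE


line = "19:35:02 [ 2.547172] vmalloc: allocation failure, allocated 20480 of 24576 bytes"
print(parse_vmalloc_failure(line))
```

Under that page-size assumption, 20480 of 24576 bytes corresponds to 5 of 6 pages.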

The order-0 allocation is a bit odd. The flags/mode I would decode as:
  GFP_KERNEL(__GFP_WAIT|__GFP_IO|__GFP_FS)|__GFP_HIGHMEM

So this is a normal kernel page allocation (one that is allowed to wait) trying for highmem (which should fall back to normal->dma32->dma). I am not sure what exactly the memory printouts tell us; I will try to chat with Andy about those. But more or less this sequence seems to repeat with varying amounts of requested/returned memory. So it is a bit weird that a) a normal order-0 allocation fails and b) this only happens in a few cases.
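
The decoding of mode 0xd2 can be reproduced mechanically. A small sketch, assuming the low GFP flag bit values are the ones defined in include/linux/gfp.h of the 3.16 tree (they have changed in other kernel versions, so check the source that matches the kernel in use); `decode_gfp` is a made-up helper name.

```python
# Low GFP flag bits, assumed to match include/linux/gfp.h in the 3.16 tree.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}


def decode_gfp(mode):
    """Return the names of the known flag bits set in mode, lowest bit first."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    unknown = mode & ~sum(GFP_BITS)  # bits this table does not cover
    if unknown:
        names.append(hex(unknown))
    return names


print(decode_gfp(0xd2))
```

With that table, 0xd2 breaks down into __GFP_HIGHMEM, __GFP_WAIT, __GFP_IO and __GFP_FS, which matches the GFP_KERNEL|__GFP_HIGHMEM reading above.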

Stefan Bader (smb) wrote :

One good thing: I get the same output when I start a utopic cloud-image based PV guest on an Ubuntu 14.04 based Xen host (Xen-4.4).

Stefan Bader (smb) wrote :

Merge: d877215 b7dd0e3
Author: Linus Torvalds <email address hidden>
Date: Wed Jul 30 09:00:20 2014 -0700

    Merge tag 'stable/for-linus-3.16-rc7-tag' of git://git.kernel.org/pub/scm/li

    Pull Xen fix from David Vrabel:
     "Fix BUG when trying to expand the grant table. This seems to occur
      often during boot with Ubuntu 14.04 PV guests"

commit b7dd0e350e0bd4c0fddcc9b8958342700b00b168
Author: David Vrabel <email address hidden>
Date: Fri Jul 11 16:42:34 2014 +0100

    x86/xen: safely map and unmap grant frames when in atomic context

That really reeks of being the problem. Will do a test kernel and re-try.

Stefan Bader (smb) wrote :

Unfortunately it seems that patch (alone?) isn't helping. Or I made some mistake in picking things. At least it still happens on the PV guest (but not on an HVM guest which really points to this patch). :/

Bastelnerk (bastelnerk) wrote :

I'm running Ubuntu 14.04 on Jiffybox, a German EC2 alternative. Due to an unrelated problem with the 3.13 kernel (see https://www.df.eu/forum/threads/72787-CONFIG_FHANDLE-Support-im-Jiffy-Kernel?highlight=kernel if curious), it is not possible to use the stock kernel: it will not boot. After installing the kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.16-utopic/ the system boots successfully, which is a great improvement on the earlier state. However, it cannot insert any modules and gives the same error messages as quoted above.

More info:

root@localhost:~# dmesg | grep "Xen version"
[ 0.000000] Xen version: 3.1.2-308.8.2.el5 (preserve-AD)

sample error message from dmesg:

[ 4.244198] ------------[ cut here ]------------
[ 4.244213] WARNING: CPU: 0 PID: 404 at /home/apw/COD/linux/mm/vmalloc.c:128 vmap_pte_range+0x13d/0x170()
[ 4.244215] Modules linked in:
[ 4.244220] CPU: 0 PID: 404 Comm: systemd-udevd Tainted: G W 3.16.0-031600-generic #201408031935
[ 4.244222] 0000000000000080 ffff8800f3f8fb38 ffffffff81786525 000000000000546e
[ 4.244225] 0000000000000000 ffff8800f3f8fb78 ffffffff8107207c ffffffff81005719
[ 4.244228] ffff8800057417a0 ffffffffc00f4000 ffff8800f3f8fc94 0000000000000163
[ 4.244230] Call Trace:
[ 4.244235] [<ffffffff81786525>] dump_stack+0x46/0x58
[ 4.244238] [<ffffffff8107207c>] warn_slowpath_common+0x8c/0xc0
[ 4.244242] [<ffffffff81005719>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[ 4.244250] [<ffffffff810720ca>] warn_slowpath_null+0x1a/0x20
[ 4.244255] [<ffffffff811a8dfd>] vmap_pte_range+0x13d/0x170
[ 4.244259] [<ffffffff8100a9be>] ? xen_pud_val+0xe/0x10
[ 4.244262] [<ffffffff811a8f52>] vmap_pud_range+0x122/0x1c0
[ 4.244266] [<ffffffff811a9086>] vmap_page_range_noflush+0x96/0xc0
[ 4.244269] [<ffffffff811a90e1>] map_vm_area+0x31/0x50
[ 4.244272] [<ffffffff811aa180>] __vmalloc_area_node+0x130/0x1e0
[ 4.244275] [<ffffffff811a9e06>] __vmalloc_node_range+0x86/0xd0
[ 4.244278] [<ffffffff810f41fd>] ? module_alloc_update_bounds+0x1d/0x80
[ 4.244282] [<ffffffff810f41fd>] ? module_alloc_update_bounds+0x1d/0x80
[ 4.244286] [<ffffffff81055454>] module_alloc+0x74/0xd0
[ 4.244289] [<ffffffff810f41fd>] ? module_alloc_update_bounds+0x1d/0x80
[ 4.244291] [<ffffffff810f41fd>] module_alloc_update_bounds+0x1d/0x80
[ 4.244294] [<ffffffff810f4287>] move_module+0x27/0x1c0
[ 4.244297] [<ffffffff810f44d6>] layout_and_allocate+0xa6/0xd0
[ 4.244299] [<ffffffff810f46fa>] load_module+0x12a/0x5f0
[ 4.244302] [<ffffffff810f4d6e>] SyS_finit_module+0xae/0xd0
[ 4.244306] [<ffffffff81793fad>] system_call_fastpath+0x1a/0x1f
[ 4.244308] ---[ end trace e313b695aad8c05b ]---

root@localhost:~# lsmod
Module Size Used by

root@localhost:~# free -m
             total       used       free     shared    buffers     cached
Mem:          3998        483       3515          0         16        393
-/+ buffers/cache:         73       3925
Swap:          511          0        511

Obviously, I would be grateful for any hints on how to proceed!

Stefan Bader (smb) wrote :

No, this has been identified as an unfortunate fallout from Kernel Address Space Layout Randomization. I am working on it. But right now you would need to re-compile the kernel and disable CONFIG_RANDOMIZE_BASE. I shared the kernel that I used to confirm this on http://people.canonical.com/~smb/lp1350522/.
Right now, I need to figure out why this has this effect on PV guests and also why the command line that is supposed to disable it (nokaslr) is not helping.
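
To see whether a given kernel is affected by this, it can help to check both the build option and the command line. A minimal sketch, assuming the kernel config and command line have been read into strings (on Ubuntu these would typically come from /boot/config-<release> and /proc/cmdline); `kaslr_status` is a hypothetical helper. It only reports what was configured and requested, not whether randomization actually took effect.

```python
def kaslr_status(config_text, cmdline_text):
    """Return (built_in, nokaslr_requested) for the given config and cmdline text.

    built_in is True when the config contains CONFIG_RANDOMIZE_BASE=y.
    nokaslr_requested is True when "nokaslr" is a standalone cmdline option.
    """
    built_in = any(
        line.strip() == "CONFIG_RANDOMIZE_BASE=y"
        for line in config_text.splitlines()
    )
    nokaslr_requested = "nokaslr" in cmdline_text.split()
    return built_in, nokaslr_requested


# Illustrative inputs, not taken from an affected machine.
sample_config = "CONFIG_RELOCATABLE=y\nCONFIG_RANDOMIZE_BASE=y\n"
sample_cmdline = "root=LABEL=cloudimg-rootfs ro console=hvc0 nokaslr"
print(kaslr_status(sample_config, sample_cmdline))
```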

Stefan Bader (smb) wrote :

Ok, finally it seems I found the problem. It seems Xen setup code is accidentally setting up the kernel page tables in a way that causes the last 2G of memory to be identically mapped (kernel mappings). This would just work normally because the page table that covers the first 1G would correctly be clean for the second 512M (which started the module space before). If modules ever reached more than 512M of memory this just would have happened with the old layout, too.
But now that the kernel image is increased to 1G, we start to use the bad page table immediately.
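
The address arithmetic behind this can be sketched as follows. The 512M and 1G image sizes come from the explanation above; the base address of the kernel mapping and the 1G coverage of one upper-level page table entry are assumptions about the x86_64 layout, and the helper names are made up.

```python
# Assumed x86_64 constants (not stated in this report):
KERNEL_MAP_BASE = 0xffffffff80000000  # start of the kernel text mapping
PUD_SHIFT = 30                        # one upper-level entry covers 1G
ENTRIES_PER_TABLE = 512
MIB = 1 << 20


def modules_start(kernel_image_size):
    """Module space is assumed to begin right after the kernel image area."""
    return KERNEL_MAP_BASE + kernel_image_size


def pud_index(addr):
    """Index of the 1G slot (upper-level page table entry) that covers addr."""
    return (addr >> PUD_SHIFT) & (ENTRIES_PER_TABLE - 1)


for size in (512 * MIB, 1024 * MIB):
    start = modules_start(size)
    print(f"image {size // MIB}M: modules start at {start:#x}, 1G slot {pud_index(start)}")
```

Under these assumptions the old layout starts modules in the same 1G slot as the kernel image (510), while the 1G image pushes the start of module space into the next slot (511), which would be the one covered by the wrongly populated page table.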

Sent this patch upstream (right now only tested with the new layout)

tags: added: patch
Stefan Bader (smb)
Changed in linux (Ubuntu Utopic):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.16.0-12.18

---------------
linux (3.16.0-12.18) utopic; urgency=low

  [ Paolo Pisati ]

  * Revert "[Debian] dtb: symlink /lib/firmware/(uname -r)/device-tree to
    /boot/dtb-(uname -r) to make flash-kernel happy"
  * Revert "[Debian] dtb: don't remove a symlink dereferencing an existing
    directory"
  * Revert "[Debian] dtb: don't follow symlink when checking for a
    directory"
  * Revert "[Debian] dtb: symlink from /boot/dtb-$(uname -r) to /boot/dtb"
  * Revert "[Debian] dtb: move dtbs installation to /boot/dtb-$(uname -r)"

linux (3.16.0-12.17) utopic; urgency=low

  [ Andy Whitcroft ]

  * Release Tracking Bug
    - LP: #1363032
  * Revert "[Config] Switch kernel to vmlinuz (from vmlinux) on ppc64el"

  [ dann frazier ]

  * [Config] CONFIG_ARM_GIC_V3=y

  [ Douglas Lehr ]

  * SAUCE: (no-up) PCI: Increase BAR size quirk for IBM ipr SAS Crocodile
    adapters
    - LP: #1361364

  [ Marc Dietrich ]

  * [Config] arm/tegra/d-i: framebuffer and usb support for Tegra SoCs

  [ Paolo Pisati ]

  * [Config] armhf: REGULATOR_TWL4030=y
  * [Debian] dtb: move dtbs installation to /boot/dtb-$(uname -r)
  * [Debian] dtb: symlink from /boot/dtb-$(uname -r) to /boot/dtb
  * [Debian] dtb: don't follow symlink when checking for a directory
  * [Debian] dtb: don't remove a symlink dereferencing an existing
    directory
  * [Debian] dtb: symlink /lib/firmware/(uname -r)/device-tree to
    /boot/dtb-(uname -r) to make flash-kernel happy

  [ Stefan Bader ]

  * SAUCE: x86/xen: Fix setup of 64bit kernel pagetables
    - LP: #1350522

  [ Upstream Kernel Changes ]

  * drm/tegra: add MODULE_DEVICE_TABLEs
  * kvm: iommu: fix the third parameter of kvm_iommu_put_pages
    (CVE-2014-3601)
    - LP: #1362443
    - CVE-2014-3601
  * isofs: Fix unbounded recursion when processing relocated directories
    - LP: #1362447, #1362448
    - CVE-2014-5472
  * arm64/crypto: fix makefile rule for aes-glue-%.o
  * irq-gic: remove file name from heading comment
  * irqchip: gic: Move some bits of GICv2 to a library-type file
  * irqchip: gic-v3: Initial support for GICv3
  * arm64: GICv3 device tree binding documentation
  * arm64: boot protocol documentation update for GICv3
  * KVM: arm/arm64: vgic: move GICv2 registers to their own structure
  * KVM: ARM: vgic: introduce vgic_ops and LR manipulation primitives
  * KVM: ARM: vgic: abstract access to the ELRSR bitmap
  * KVM: ARM: vgic: abstract EISR bitmap access
  * KVM: ARM: vgic: abstract MISR decoding
  * KVM: ARM: vgic: move underflow handling to vgic_ops
  * KVM: ARM: vgic: abstract VMCR access
  * KVM: ARM: vgic: introduce vgic_enable
  * KVM: ARM: introduce vgic_params structure
  * KVM: ARM: vgic: split GICv2 backend from the main vgic code
  * KVM: ARM: vgic: revisit implementation of irqchip_in_kernel
  * arm64: KVM: remove __kvm_hyp_code_{start,end} from hyp.S
  * arm64: KVM: split GICv2 world switch from hyp code
  * arm64: KVM: move HCR_EL2.{IMO,FMO} manipulation into the vgic switch
    code
  * KVM: ARM: vgic: add the GICv3 backend
  * arm64: KVM: vgic: add GICv3 world switch
  * arm64: KVM: vgic: enable GICv2 emulation on top on GICv3 hardware
  * arm64...


Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released