Bug #686692 “natty kernel does not boot on ec2 t1.micro”

Scott Moser (smoser) wrote on 2010-12-07:

#1

log of i386 instance t1.micro (5.3 KiB, text/plain)
BootDmesg.txt (13.9 KiB, text/plain; charset="utf-8")
Dependencies.txt (1.8 KiB, text/plain; charset="utf-8")
ProcCpuinfo.txt (1.2 KiB, text/plain; charset="utf-8")
ProcCpuinfo_.txt (1.2 KiB, text/plain; charset="utf-8")
ProcInterrupts.txt (1.7 KiB, text/plain; charset="utf-8")
UdevDb.txt (40.2 KiB, text/plain; charset="utf-8")
UdevLog.txt (84.3 KiB, text/plain; charset="utf-8")

Scott Moser (smoser) wrote on 2010-12-07:

#2

log of amd64 instance t1.micro (1.8 KiB, text/plain)

Changed in linux (Ubuntu):
assignee:	nobody → Stefan Bader (stefan-bader-canonical)
importance:	Undecided → High
milestone:	none → natty-alpha-2
status:	New → Confirmed

Jeremy Foshee (jeremyfoshee) on 2010-12-08

tags:

added: kernel-series-unknown

Stefan Bader (smb) wrote on 2010-12-10:

#3

This is not the solution yet, unfortunately. Looking at bug #667796, we found that XEN_MAX_DOMAIN_MEMORY limits the memory a domU reports. In Natty this has changed to a fixed config option of 128GB. That change came with a fairly large rework of the mmu code, though, so only changing the value back to 70 is not enough to make it work again. The following commit may at least be a starting point:

commit 58e05027b530ff081ecea68e38de8d59db8f87e0
Author: Jeremy Fitzhardinge <email address hidden>
Date: Fri Aug 27 13:28:48 2010 -0700

    xen: convert p2m to a 3 level tree

    Make the p2m structure a 3 level tree which covers the full possible
    physical space.

    The p2m structure contains mappings from the domain's pfns to system-wide
    mfns. The structure has 3 levels and two roots. The first root is for
    the domain's own use, and is linked with virtual addresses. The second
    is all mfn references, and is used by Xen on save/restore to allow it to
    update the p2m mapping for the domain.

    At boot, the domain builder provides a simple flat p2m array for all the
    initially present pages. We construct the two levels above that using
    the early_brk allocator. After early boot time, set_phys_to_machine()
    will allocate any missing levels using the normal kernel allocator
    (at GFP_KERNEL, so it must be called in a normal blocking context).

    Because the early_brk() API requires us to pre-reserve the maximum amount
    of memory we could allocate, there is still a CONFIG_XEN_MAX_DOMAIN_MEMORY
    config option, but its only negative side-effect is to increase the
    kernel's apparent bss size. However, since all unused brk memory is
    returned to the heap, there's no real downside to making it large.

Ubuntu Foundations Team Bug Bot (crichton) on 2010-12-20

tags:

added: regression-release
removed: regression-update

Stefan Bader (smb) wrote on 2011-01-18:

#4

Some updates here: the good news is that I am able to reproduce this on a local CentOS-based installation. The bad news so far is that the DomU crashes so quickly that I get no output at all, even when attaching directly to the console on "xm create".

But at least I found a lead. The crashes happen if the guest memory is less than 1G and not divisible by 4. So 615M crashes, but 616M will boot (as will 612M and so on). There is also a visible change in the memory layout presented to Linux. While previously max_pfn was used directly to create the e820 map, there is now an additional 8M in the data returned by the memory hypercall. I cannot say right now whether that relates directly to the crash, but one can see that when starting a guest with mem=616, Linux will report 624M of memory. There is a lot of shifting around and recalculating going on which I have yet to understand.

Scott Moser (smoser) wrote on 2011-01-18:

#5

@Stefan,
just for reference, could you attach your xen config for this instance? I'd like to recreate it.

Stefan Bader (smb) wrote on 2011-01-18:

#6

name = "NattyServerMicro32"
kernel = "/root/boot/pv-grub-hd0-V1.01-i386.gz"
memory = 616
vcpus = 1
disk = [ 'file:/root/amis/natty-server-uec-i386.img,sda1,w' ]
vif = [ '' ]

Not sure the vif would really work like this. I seem to have problems getting the boot to complete (I currently have the cloud-init stuff disabled, as I have no magic meta server).

Stefan Bader (smb) wrote on 2011-01-18:

#7

One further step finally. Using 'on_crash = "coredump-destroy"' and after creating /var/xen/dump, I was able to extract the following from the dump file:

<6>[    0.000000] ACPI in unprivileged domain disabled
<3>[    0.000000] max_pfn used = 26700(26700000)
<3>[    0.000000] Xen: map base 0 + 26f00000
<3>[    0.000000] Xen: map end = 26f00000
<3>[    0.000000] map size reduzed to 26700000
<3>[    0.000000] delta = 800000, extra_pages = 2048
<3>[    0.000000] extra_mem_start = 26700000
<3>[    0.000000] Xen: reserve c166f000c15d2000 - 800
<6>[    0.000000] released 0 pages of unused memory
<3>[    0.000000] Xen: extra_limit = 159488
<3>[    0.000000] Xen: adding 2048 extra pages at 644874240
<6>[    0.000000] BIOS-provided physical RAM map:
<6>[    0.000000]  Xen: 0000000000000000 - 00000000000a0000 (usable)
<6>[    0.000000]  Xen: 00000000000a0000 - 0000000000100000 (reserved)
<6>[    0.000000]  Xen: 0000000000100000 - 0000000026f00000 (usable)
<6>[    0.000000] NX (Execute Disable) protection: active
<6>[    0.000000] DMI not present or invalid.
<7>[    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
<7>[    0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
<6>[    0.000000] last_pfn = 0x26f00 max_arch_pfn = 0x1000000
<6>[    0.000000] Scanning 0 areas for low memory corruption
<7>[    0.000000] initial memory mapped : 0 - 01fff000
<6>[    0.000000] init_memory_mapping: 0000000000000000-0000000026f00000
<7>[    0.000000]  0000000000 - 0026f00000 page 4k
<7>[    0.000000] kernel direct mapping tables up to 26f00000 @ 1ec4000-1fff000
<1>[    0.000000] BUG: unable to handle kernel NULL pointer dereference at   (null)
<1>[    0.000000] IP: [<c0107397>] xen_set_pte+0x27/0x60
<4>[    0.000000] *pdpt = 0000000000000000 *pde = 0000000000000000 
<0>[    0.000000] Oops: 0003 [#1] SMP 
<0>[    0.000000] last sysfs file: 
<4>[    0.000000] Modules linked in:
<4>[    0.000000] 
<4>[    0.000000] Pid: 0, comm: swapper Not tainted 2.6.37-12-virtual #26+lp686692v3 /
<4>[    0.000000] EIP: e019:[<c0107397>] EFLAGS: 00010046 CPU: 0
<4>[    0.000000] EIP is at xen_set_pte+0x27/0x60
<4>[    0.000000] EAX: 00000000 EBX: c1fe7800 ECX: 00000000 EDX: c0848000
<4>[    0.000000] ESI: 00000003 EDI: 00000000 EBP: c0849e14 ESP: c0849e04
<4>[    0.000000]  DS: e021 ES: e021 FS: 00d8 GS: 00e0 SS: e021
<0>[    0.000000] Process swapper (pid: 0, ti=c0848000 task=c084f060 task.ti=c0848000)
<0>[    0.000000] Stack:
<4>[    0.000000]  c1fe7800 c1fe7800 00000003 00000000 c0849e30 c08aa7ca 00000fff fffff003
<4>[    0.000000]  e6700000 00000000 00026700 c0849e38 c01362be c0849e8c c08b9961 c0849e64
<4>[    0.000000]  46cf9ef8 00026701 c1fe7800 c0a3f998 00000133 00000100 00026f00 00000000
<0>[    0.000000] Call Trace:
<4>[    0.000000]  [<c08aa7ca>] ? xen_set_pte_init+0x6b/0x72
<4>[    0.000000]  [<c01362be>] ? set_pte+0xe/0x10
<4>[    0.000000]  [<c08b9961>] ? kernel_physical_mapping_init+0x1c9/0x291
<4>[    0.000000]  [<c06122b6>] ? init_memory_mapping+0x1e6/0x340
<4>[    0.000000]  [<c08ac037>] ? setup_arch+0x6ce/0x935
<4>[    0.000000]  [<c010798e>] ? __raw_callee_save_xen_restore_fl+0x6/0x8
<4>[    0.000000]  [<c08a6549>] ? start_kernel+0xba/0x356
<4>[    0.000000]  [<c08a60ed>] ? i386_start_kernel+0xdc/0xe4
<4>[    0.000000]  [<c08a9ab2>] ? xen_start_kernel+0x56b/0x573
<4>[    0.000000]  [<c0409095>] ? regulator_check_current_limit.clone.9+0x65/0xe0
<0>[    0.000000] Code: 00 00 00 00 55 89 e5 83 ec 10 89 5d f4 89 75 f8 89 7d fc e8 e4 3b 00 00 f6 c6 04 89 c3 89 d6 89 cf 75 19 e8 2c e6 02 00 89 7b 04 <89> 33 8b 7d fc 8b 5d f4 8b 75 f8 89 ec 5d c3 66 90 c7 04 24 f1 
<0>[    0.000000] EIP: [<c0107397>] xen_set_pte+0x27/0x60 SS:ESP e021:c0849e04
<0>[    0.000000] CR2: 0000000000000000
<4>[    0.000000] ---[ end trace a7919e7f17c0a725 ]---
<0>[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
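
For anyone following the numbers in the dump, the arithmetic can be checked with a short sketch. It assumes 4 KiB pages. The variable names are illustrative only and simply hold values copied from the log lines above.

```python
PAGE_SIZE = 4096          # assumed x86 page size in bytes
MIB = 1024 * 1024

# Values copied from the dump (hex in the log unless noted).
max_pfn = 0x26700         # "max_pfn used = 26700(26700000)"
map_end = 0x26F00000      # "Xen: map end = 26f00000"
delta = 0x800000          # "delta = 800000, extra_pages = 2048"
extra_start = 644874240   # decimal, "adding 2048 extra pages at 644874240"

guest_bytes = max_pfn * PAGE_SIZE

print("guest memory MiB :", guest_bytes // MIB)    # 615
print("map end MiB      :", map_end // MIB)        # 623
print("delta MiB        :", delta // MIB)          # 8
print("delta in pages   :", delta // PAGE_SIZE)    # 2048
print("extra start == end of guest memory:", extra_start == guest_bytes)
print("guest memory + delta == map end   :", guest_bytes + delta == map_end)
```

So this dump comes from a 615M guest, and the extra 8M (2048 pages) begins exactly where the guest's own memory ends.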

Stefan Bader (smb) wrote on 2011-01-19:

#8

I think I see the issue now. When Xen sets up the p2m tree, it loops from 0 to max_pfn-1, incrementing by the number of p2m mappings in a leaf. If max_pfn is a multiple of 4M this works out. But if not, we need an additional leaf to be initialized (which is only partially used).

I need to think about how to make this work best. Maybe the end_pfn needs to be rounded up to the next multiple of P2M_PER_PAGE. The next question would be how many places need to be touched, as there is at least one other place which sets up the corresponding pfn to mfn mapping...
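
As a rough illustration of the rounding idea, here is a minimal sketch. It assumes that a leaf is one 4 KiB page of pointer-sized entries, which matches the 4MB figure for 32bit. The helper names are hypothetical and are not kernel functions.

```python
PAGE_SIZE = 4096  # assumed x86 page size in bytes


def p2m_per_page(word_size):
    """Entries in one leaf page, assuming pointer-sized entries."""
    return PAGE_SIZE // word_size


def leaf_coverage_bytes(word_size):
    """Guest memory covered by one fully used leaf."""
    return p2m_per_page(word_size) * PAGE_SIZE


def round_up_pfn(pfn, per_page):
    """Round pfn up to the next multiple of per_page."""
    return (pfn + per_page - 1) // per_page * per_page


per_page_32 = p2m_per_page(4)          # 32-bit: 4-byte entries
print(per_page_32, leaf_coverage_bytes(4) // (1024 * 1024))

max_pfn_615m = 0x26700                 # 615M guest, from the dump
max_pfn_616m = 616 * 1024 * 1024 // PAGE_SIZE

print(hex(round_up_pfn(max_pfn_615m, per_page_32)))
print(hex(round_up_pfn(max_pfn_616m, per_page_32)))
```

With 1024 entries per leaf, a 615M guest ends 768 entries into its last leaf, while a 616M guest ends exactly on a leaf boundary and needs no rounding.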

Stefan Bader (smb) wrote on 2011-01-19:

#9

<3>[ 0.000000] smb: pfn=266ff calling set_pte(c1fe77f8, 6b3003)
<3>[ 0.000000] smb: pfn=26700 calling set_pte(c1fe7800, 3)
<1>[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)

This was seen with some added debug output. Basically, pfn_pte for the last pfn returns an invalid pte.

Stefan Bader (smb) wrote on 2011-01-20:

#10

Patch to correctly set up a partial leaf. (3.1 KiB, text/plain)

Ok, so it was the right place but a completely wrong explanation. The problem is not that the last part of the pointers is missed, but that it is not missed. The kernel is given a flat array of address pointers by the domain constructor, along with the number of pointers in that array. With recent changes, the Xen kernel code tries to map this into a 3-level tree structure, where each leaf contains a part of that array. To conserve memory, the 2nd level points directly at parts of the flat array, which is ok as long as the whole 4k area contains valid pointers. But for memory assignments which are not a multiple of 4MB (or 2MB for 64bit), the last leaf would contain some undefined pointers instead of invalid markers.

The attached patch assumes that it is not good to meddle with the memory at the end of the external array, so if there is a final leaf that would only be partially filled, it allocates a new page, initializes it and then copies the valid pointers from the original array.
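
The approach described above can be modelled in a few lines. This is only a sketch of the idea, not the attached patch: Python lists stand in for pages, INVALID is a hypothetical stand-in for the kernel's invalid-entry marker, and build_leaves is an illustrative name.

```python
P2M_PER_PAGE = 1024   # entries per leaf, assuming 4 KiB pages and 32-bit pointers
INVALID = -1          # hypothetical stand-in for the kernel's invalid marker


def build_leaves(flat, per_page=P2M_PER_PAGE):
    """Split the flat p2m array into leaves of per_page entries.

    Full chunks are used as they are (the kernel would point straight
    into the flat array). A trailing partial chunk gets a fresh leaf
    that is first filled with INVALID and then receives a copy of the
    valid entries, so nothing past the end of the array is ever read.
    """
    leaves = []
    for start in range(0, len(flat), per_page):
        chunk = flat[start:start + per_page]
        if len(chunk) == per_page:
            leaves.append(chunk)
        else:
            leaf = [INVALID] * per_page      # newly allocated, initialized page
            leaf[:len(chunk)] = chunk        # copy only the valid pointers
            leaves.append(leaf)
    return leaves


# Tiny example with 4 entries per leaf and 6 valid entries.
for leaf in build_leaves([10, 11, 12, 13, 14, 15], per_page=4):
    print(leaf)
```

For the 615M guest from the dump (0x26700 entries), this yields 154 leaves, with the last one holding 768 valid entries followed by 256 invalid markers.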

tags:

added: patch

Stefan Bader (smb) wrote on 2011-01-20:

#11

With that patch applied I was able to successfully boot t1.micro instances with a 2.6.37 kernel:

ubuntu@ip-10-112-5-120:~$ uname -a
Linux ip-10-112-5-120 2.6.37-12-virtual #26+686692v2 SMP Thu Jan 20 11:30:38 UTC 2011 x86_64 GNU/Linux
ubuntu@ip-10-112-5-120:~$ echo $(wget -q -O- http://169.254.169.254/latest/meta-data/instance-type)
t1.micro
ubuntu@ip-10-112-5-120:~$ uname -m
x86_64

ubuntu@ip-10-117-61-4:~$ uname -a
Linux ip-10-117-61-4 2.6.37-12-virtual #26+686692v2 SMP Thu Jan 20 11:33:17 UTC 2011 i686 GNU/Linux
ubuntu@ip-10-117-61-4:~$ echo $(wget -q -O- http://169.254.169.254/latest/meta-data/instance-type)
t1.micro
ubuntu@ip-10-117-61-4:~$ uname -m
i686

Next step will be to send this upstream to see whether it is an acceptable approach or not.

Changed in linux (Ubuntu Natty):
status:	Confirmed → In Progress

Stefan Bader (smb) on 2011-01-21

Changed in linux (Ubuntu Natty):
status:	In Progress → Fix Committed

Scott Moser (smoser) wrote on 2011-01-31:

#12

I'm still unable to boot i386 instances. I tested
us-east-1 ami-5c3fcf35 canonical ebs/ubuntu-natty-daily-i386-server-20110131
It resulted in no console output and an unreachable instance on t1.micro.

So, i386 is still broken on t1.micro (the same ami does boot on m1.small).

However, x86_64 is functional. I just verified
us-east-1 ami-2e3fcf47 canonical ebs/ubuntu-natty-daily-amd64-server-20110131

$ uname -r
2.6.38-1-virtual
$ uname -m
x86_64
$ ec2metadata --instance-type
t1.micro
$ dpkg -S /boot/vmlinuz-$(uname -r)
linux-image-2.6.38-1-virtual: /boot/vmlinuz-2.6.38-1-virtual

Martin Pitt (pitti) on 2011-02-04

Changed in linux (Ubuntu Natty):
milestone:	natty-alpha-2 → natty-alpha-3

Scott Moser (smoser) wrote on 2011-02-04:

#13

This was fix-released by Stefan in 2.6.38-1.28. Alpha2 boots on amd64 on t1.micro. We've opened bug 710754 to address the i386 issue.

Changed in linux (Ubuntu Natty):
status:	Fix Committed → Fix Released

Andy Whitcroft (apw) wrote on 2011-02-10:

#14

This bug was fixed in the package linux - 2.6.38-1.27

---------------
linux (2.6.38-1.27) natty; urgency=low

[ Andy Whitcroft ]

  * ubuntu: AUFS -- update aufs-update to track new locations of headers
  * ubuntu: AUFS -- update to c5021514085a5d96364e096dbd34cadb2251abfd
  * SAUCE: ensure root is ready before running usermodehelpers in it
  * correct the Vcs linkage to point to natty
  * rebase to linux tip e78bf5e6cbe837daa6ab628a5f679548742994d3
  * [Config] update configs following rebase
    e78bf5e6cbe837daa6ab628a5f679548742994d3
  * SAUCE: Yama: follow changes to generic_permission
  * ubuntu: compcache -- follow changes to bd_claim/bd_release
  * ubuntu: iscsitarget -- follow changes to open_bdev_exclusive
  * ubuntu: ndiswrapper -- fix interaction between __packed and packed
  * ubuntu: AUFS -- update to 806051bcbeec27748aae2b7957726a4e63ff308e
  * update package version to match payload version
  * rebase to e6f597a1425b5af64917be3448b29e2d5a585ac8
  * rebase to v2.6.38-rc1
  * [Config] updateconfigs following rebase to v2.6.38-rc1
  * SAUCE: x86 fix up jiffies/jiffies_64 handling
  * rebase to linus tip 2b1caf6ed7b888c95a1909d343799672731651a5
  * [Config] updateconfigs following rebase to
    2b1caf6ed7b888c95a1909d343799672731651a5
  * [Config] disable CONFIG_TRANSPARENT_HUGEPAGE to fix i386 boot crashes
  * ubuntu: AUFS -- suppress benign plink warning messages
    - LP: #621195
  * [Config] CONFIG_NR_CPUS=256 for amd64 -server flavour
  * rebase to v2.6.38-rc2
  * rebase to mainline d315777b32a4696feb86f2a0c9e9f39c94683649
  * rebase to c723fdab8aa728dc2bf0da6a0de8bb9c3f588d84
  * [Config] update configs following rebase to
    c723fdab8aa728dc2bf0da6a0de8bb9c3f588d84
  * [Config] disable CONFIG_AD7152 to fix FTBS on armel versatile
  * [Config] disable CONFIG_AD7150 to fix FTBS on armel versatile
  * [Config] disable CONFIG_RTL8192CE to fix FTBS on armel omap
  * [Config] disable CONFIG_MANTIS_CORE to fix FTBS on armel versatile

[ Kees Cook ]

  * SAUCE: kernel: make /proc/kallsyms mode 400 to reduce ease of attacking

[ Stefan Bader ]

  * Temporarily disable RODATA for virtual i386
    - LP: #699828

[ Tim Gardner ]

  * [Config] CONFIG_NLS_DEFAULT=utf8
    - LP: #683690
  * [Config] CONFIG_HIBERNATION=n
  * update bnx2 firmware files in d-i/firmware/nic-modules

[ Upstream Kernel Changes ]

  * Revert "drm/radeon/bo: add some fallback placements for VRAM only
    objects."
  * packaging: make System.map mode 0600
  * thinkpad_acpi: Always report scancodes for hotkeys
    - LP: #702407
  * sched: tg->se->load should be initialised to tg->shares
  * Input: sysrq -- ensure sysrq_enabled and __sysrq_enabled are consistent
  * brcm80211: include linux/slab.h for kfree
  * pch_dma: add include/slab.h for kfree
  * i2c-eg20t: include linux/slab.h for kfree
  * gpio/ml_ioh_gpio: include linux/slab.h for kfree
  * tty: include linux/slab.h for kfree
  * winbond: include linux/delay.h for mdelay et al

[ Upstream Kernel Changes ]

  * mark the start of v2.6.38 versioning
  * rebase v2.6.37 to v2.6.38-rc2 + c723fdab8aa728dc2bf0da6a0de8bb9c3f588d84
    - LP: #689886
    - LP: #702125
    - LP: #608775
    - LP: #215802
...

Affects          Status         Importance   Assigned to    Milestone
linux (Ubuntu)   Fix Released   High         Stefan Bader   Ubuntu natty-alpha-3
Natty            Fix Released   High         Stefan Bader   Ubuntu natty-alpha-3