kernel only recognizes 32G of memory

Bug #667796 reported by Scott Moser
This bug affects 4 people
Affects              Status         Importance   Assigned to    Milestone
linux (Ubuntu)       Invalid        Medium       Unassigned
  Maverick           Fix Released   Medium       Stefan Bader
linux-ec2 (Ubuntu)   Invalid        Undecided    Unassigned
  Maverick           Invalid        Undecided    Unassigned

Bug Description

SRU justification:

Impact: The config option XEN_MAX_DOMAIN_MEMORY caps how much memory a Xen guest can see. The default for 64-bit is 32GB, which is why m2.4xlarge instances report only this amount of memory.

Fix: Set this limit to 70GB. It is kept below 80GB because there is a known restriction for t1.micro instances at about that size.

Testcase: An m2.4xlarge booted with this set to 32GB shows 32GB of memory; with it set to 70GB, it correctly reports 68GB. A t1.micro was booted as well to verify this has not caused problems there.
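
As a rough illustration of the impact and fix described above, the following sketch models the limit as a simple cap on what the guest sees. The function name and the min() model are assumptions made for illustration only; the kernel's actual accounting is more involved, and the 68.4 figure is the m2.4xlarge size quoted elsewhere in this report.

```python
def visible_memory_gb(limit_gb: float, instance_gb: float) -> float:
    """Simplified model (assumption): the guest sees at most the configured limit."""
    return min(limit_gb, instance_gb)


# Memory of an m2.4xlarge as quoted in this report (assumed here to be in GB).
M2_4XLARGE_GB = 68.4

for limit in (32, 70):
    seen = visible_memory_gb(limit, M2_4XLARGE_GB)
    print(f"limit={limit}GB -> guest sees {seen}GB")
```

With the limit at 32 the model is capped at 32; with the limit at 70 the full instance size is visible, which matches the testcase.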

---

The following is copied from https://bugs.launchpad.net/ubuntu/+source/linux/+bug/651370/comments/14 . I'm fairly sure that the problem also occurs with a stock image (without the additional command line options suggested in that bug).

Brandon's and Scott's workaround works for me partly, but the kernel on an instance started in such a way seems to detect only 32 GB of memory even for a m2.4xlarge instance which should have 68.4 GB available, according to the EC2 instances page. Is this a side-effect of the workaround, or a completely separate bug?

Maverick results:
ubuntu@ip-10-230-9-87:~$ uname -a
Linux ip-10-230-9-87 2.6.35-22-virtual #35-Ubuntu SMP Sat Oct 16 23:19:29 UTC 2010 x86_64 GNU/Linux
ubuntu@ip-10-230-9-87:~$ ec2metadata --instance-type
m2.4xlarge
ubuntu@ip-10-230-9-87:~$ free
             total used free shared buffers cached
Mem: 32810684 667628 32143056 0 6444 32152
-/+ buffers/cache: 629032 32181652
Swap: 0 0 0

Expected results (from a SUSE 11 guest):
ip-10-230-45-187:~ # uname -a
Linux ip-10-230-45-187 2.6.32.19-0.3-ec2 #1 SMP 2010-09-17 20:28:21 +0200 x86_64 x86_64 x86_64 GNU/Linux
ip-10-230-45-187:~ # curl http://169.254.169.254/latest/meta-data/instance-type
m2.4xlarge
ip-10-230-45-187:~ # free
             total used free shared buffers cached
Mem: 71705116 2361584 69343532 0 10972 126424
-/+ buffers/cache: 2224188 69480928
Swap: 0 0 0
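
To make the two `free` totals above easier to compare, here is a small sketch that converts them to GiB. It assumes the totals are in KiB, which is the default unit of `free`; the helper name is hypothetical.

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib: int) -> float:
    """Convert a KiB count (the default unit of `free`) to GiB."""
    return kib / KIB_PER_GIB


# "total" values copied from the two transcripts above.
maverick_total_kib = 32810684
suse_total_kib = 71705116

print(f"Maverick guest: {kib_to_gib(maverick_total_kib):.2f} GiB")
print(f"SUSE guest:     {kib_to_gib(suse_total_kib):.2f} GiB")
```

The Maverick guest comes out just under 32 GiB, while the SUSE guest is a little above 68 GiB.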

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: linux-image-2.6.35-22-virtual 2.6.35-22.33
Regression: Yes
Reproducible: Yes
ProcVersionSignature: User Name 2.6.35-22.33-virtual 2.6.35.4
Uname: Linux 2.6.35-22-virtual x86_64
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg:

Date: Thu Oct 28 13:35:42 2010
Ec2AMI: ami-548c783d
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1d
Ec2InstanceType: t1.micro
Ec2Kernel: aki-427d952b
Ec2Ramdisk: unavailable
Lspci:

Lsusb: Error: command ['lsusb'] failed with exit code 1:
ProcCmdLine: root=LABEL=uec-rootfs ro console=hvc0
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcModules: acpiphp 18752 0 - Live 0xffffffffa0000000
SourcePackage: linux


Revision history for this message
Scott Moser (smoser) wrote :

I added linux-ec2. This issue is also a problem on lucid.

I tested on:
  us-east-1 ami-4a0df923 canonical ebs/ubuntu-lucid-10.04-amd64-server-20101020
$ wget http://169.254.169.254/latest/meta-data/instance-type -O - -q ; echo
m2.2xlarge
$ free
             total used free shared buffers cached
Mem: 35840228 1561456 34278772 0 22996 359664
-/+ buffers/cache: 1178796 34661432
Swap: 0 0 0
$ uname -a
Linux domU-12-31-39-18-31-09 2.6.32-309-ec2 #18-Ubuntu SMP Mon Oct 18 21:00:50 UTC 2010 x86_64 GNU/Linux

Revision history for this message
Mikael Gueck (gumi) wrote :

[ 0.000000] Linux version 2.6.35-22-virtual (buildd@yellow) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #35-Ubuntu SMP Sat Oct 16 23:19:29 UTC 2010 (Ubuntu 2.6.35-22.35-virtual 2.6.35.4)
[8343235.991081] Memory: 32797532k/33554432k available (5816k kernel code, 448k absent, 756452k reserved, 5366k data, 828k init)

Partial boot dmesg from an EC2 m2.4xlarge instance with 65GB of available memory.
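
The dmesg line above can be checked against the 32G figure with a short sketch like the one below. The regular expression and function name are assumptions written for this one line format, not a general dmesg parser.

```python
import re

# Memory line copied from the dmesg excerpt above.
DMESG_LINE = (
    "[8343235.991081] Memory: 32797532k/33554432k available "
    "(5816k kernel code, 448k absent, 756452k reserved, 5366k data, 828k init)"
)


def parse_memory_line(line: str) -> tuple[int, int]:
    """Return (available_kib, total_kib) from a 'Memory: Ak/Bk available' line."""
    match = re.search(r"Memory:\s*(\d+)k/(\d+)k available", line)
    if match is None:
        raise ValueError("not a kernel Memory: line")
    return int(match.group(1)), int(match.group(2))


available_kib, total_kib = parse_memory_line(DMESG_LINE)
gib, remainder = divmod(total_kib, 1024 * 1024)
print(f"total = {gib} GiB with {remainder} KiB left over")
```

The total in that line is an exact multiple of 1 GiB, which is what one would expect from a configured cap rather than from the instance's real size.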

Revision history for this message
Scott Moser (smoser) wrote :

My comment #2 above is invalid. I ran an m2.2xlarge, and showed it had 32G of memory. An m2.2xlarge *should* have 32G of memory. I've now verified that the same AMI (ami-4a0df923) does show 64G of memory on an m2.4xlarge.
$ dpkg -S /boot/vmlinuz-$(uname -r)
linux-image-2.6.32-309-ec2: /boot/vmlinuz-2.6.32-309-ec2
$ uname -r
2.6.32-309-ec2
$ free
             total used free shared buffers cached
Mem: 71680228 2628480 69051748 0 22724 355024
-/+ buffers/cache: 2250732 69429496
Swap: 0 0 0
$ curl http://169.254.169.254/latest/meta-data/instance-type; echo
m2.4xlarge

Changed in linux-ec2 (Ubuntu):
status: New → Invalid
Revision history for this message
Mikael Gueck (gumi) wrote :

I was wondering why that kernel version would exhibit this bug, since that's the version I've downgraded my Maverick instances to in order to get them to show the correct amount of memory.

Once we get a Maverick weekly EC2 image which reliably boots up without hitting bug 651370, I will post the kernel logs from it here.

Revision history for this message
Brandon Black (blblack) wrote :

My experience so far has been that the Lucid kernels do not have this memory size bug, only the Maverick ones.

Scott Moser (smoser)
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Scott Moser (smoser) wrote :

Brandon, correct. This is a maverick kernel issue only. I incorrectly stated that it also affected lucid. I've since removed the 'linux-ec2' task for this bug (linux-ec2 is the kernel package for lucid, linux is the kernel package for maverick).

Revision history for this message
Ed Swierk (eswierk) wrote :

I booted a t1.micro instance with an image based on ami-ca1f4f8f, installed the updated maverick kernel from bug 651370 (https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-2.6.35-24-virtual_2.6.35-24.42_amd64.deb), and converted the instance to m2.4xlarge. The m2.4xlarge instance still sees only 32 GB of memory when it boots.

Revision history for this message
Ed Swierk (eswierk) wrote :

The most likely culprit seems to be CONFIG_XEN_MAX_DOMAIN_MEMORY=32 in config-2.6.35-24-virtual.
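
For anyone wanting to check this setting on their own instance, a minimal sketch follows. It only parses config text; the sample string and function name are hypothetical, and feeding it the contents of a file such as /boot/config-2.6.35-24-virtual assumes the kernel package installs its config there.

```python
from typing import Optional

OPTION = "CONFIG_XEN_MAX_DOMAIN_MEMORY"

# Hypothetical excerpt in kernel .config syntax, for illustration only.
SAMPLE_CONFIG = """\
# CONFIG_EXAMPLE_UNSET is not set
CONFIG_XEN=y
CONFIG_XEN_MAX_DOMAIN_MEMORY=32
"""


def xen_max_domain_memory(config_text: str) -> Optional[int]:
    """Return the configured limit, or None if the option is absent or unset."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(OPTION + "="):
            return int(line.split("=", 1)[1])
    return None


print(xen_max_domain_memory(SAMPLE_CONFIG))
```

A value of 32 here would be consistent with the instance seeing only 32 GB.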

Revision history for this message
Stefan Bader (smb) wrote :

Thanks Ed, about to test changing that config.

Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
status: Confirmed → In Progress
Revision history for this message
Stefan Bader (smb) wrote :

It really is as simple as that:

uname -a
Linux domU-12-31-39-18-27-15 2.6.35-24-virtual #42+lp667796v1 SMP Thu Dec 9 18:08:46 UTC 2010 x86_64 GNU/Linux

dmesg |grep Mem
[14960639.962084] Memory: 70324392k/71680000k available (5817k kernel code, 448k absent, 1355160k reserved, 5515k data, 828k init)

Test kernel can be found at: http://people.canonical.com/~smb/lp667796/. I will follow up on the steps to SRU this tomorrow.

Stefan Bader (smb)
Changed in linux-ec2 (Ubuntu Maverick):
status: New → Invalid
Changed in linux (Ubuntu Maverick):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
importance: Undecided → Medium
status: New → In Progress
Changed in linux (Ubuntu):
assignee: Stefan Bader (stefan-bader-canonical) → nobody
status: In Progress → Invalid
Stefan Bader (smb)
description: updated
Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Accepted into maverick-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in linux (Ubuntu Maverick):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Steve Conklin (sconklin) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed' to 'verification-done'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

Revision history for this message
Stefan Bader (smb) wrote :

ubuntu@domU-12-31-39-17-11-68:~$ free -m
             total used free shared buffers cached
Mem: 68690 1381 67308 0 6 31
-/+ buffers/cache: 1344 67345
Swap: 0 0 0

ubuntu@domU-12-31-39-17-11-68:~$ uname -a
Linux domU-12-31-39-17-11-68 2.6.35-25-virtual #43-Ubuntu SMP Thu Jan 6 22:59:06 UTC 2011 x86_64 GNU/Linux

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.35-25.44

---------------
linux (2.6.35-25.44) maverick-proposed; urgency=low

  [ Upstream Kernel Changes ]

  * Revert "drm/radeon/kms: properly compute group_size on 6xx/7xx"
    - LP: #703553

linux (2.6.35-25.43) maverick-proposed; urgency=low

  [ Brad Figg ]

  - LP: #697948

  [ Andy Whitcroft ]

  * [Config] add vmware-balloon driver to -virtual flavour
    - LP: #592039

  [ Manoj Iyer ]

  * SAUCE: Enable jack sense for Thinkpad Edge 13
    - LP: #685015

  [ Robert Hooker ]

  * Revert "(pre-stable): input: Support Clickpad devices in ClickZone
    mode"
    - LP: #669399

  [ Stefan Bader ]

  * Set virtual flavour maximum of domain visible memory to 70G
    - LP: #667796

  [ Takashi Iwai ]

  * SAUCE: input: Support Clickpad devices in ClickZone mode
    - LP: #516329

  [ Tim Gardner ]

  * [Config] Add nfsd modules to -virtual flavour
    - LP: #688070
  * [Config] Added autofs4.ko to -virtual flavour
    - LP: #692917

  [ Upstream Kernel Changes ]

  * intel_idle: delete substates DEBUG modparam
    - LP: #684888
  * intel_idle: delete power_policy modparam, and choose substate functions
    - LP: #684888
  * intel_idle: add support for Westmere-EX
    - LP: #684888
  * intel_idle: recognize Lincroft Atom Processor
    - LP: #684888
  * x86, mwait: Move mwait constants to a common header file
    - LP: #684888
  * intel_idle: Change mode 755 => 644
    - LP: #684888
  * intel_idle: add missing __percpu markup
    - LP: #684888
  * cpuidle: extend cpuidle and menu governor to handle dynamic states
    - LP: #684888
  * intel_idle: Voluntary leave_mm before entering deeper
    - LP: #684888
  * intel_idle: enable Atom C6
    - LP: #684888
  * intel_idle: simplify test for leave_mm()
    - LP: #684888
  * intel_idle: delete bogus data from cpuidle_state.power_usage
    - LP: #684888
  * intel_idle: add initial Sandy Bridge support
    - LP: #684888
  * intel_idle: do not use the LAPIC timer for ATOM C2
    - LP: #684888
  * staging: usbip: Notify usb core of port status changes
    - LP: #686158
  * staging: usbip: Process event flags without delay
    - LP: #686158
  * Staging: phison: fix problem caused by libata change
    - LP: #686158
  * perf_events: Fix bogus AMD64 generic TLB events
    - LP: #686158
  * perf_events: Fix bogus context time tracking
    - LP: #686158
  * powerpc/perf: Fix sampling enable for PPC970
    - LP: #686158
  * pcmcia: synclink_cs: fix information leak to userland
    - LP: #686158
  * sched: Drop all load weight manipulation for RT tasks
    - LP: #686158
  * sched: Fix string comparison in /proc/sched_features
    - LP: #686158
  * bluetooth: Fix missing NULL check
    - LP: #686158
  * futex: Fix errors in nested key ref-counting
    - LP: #686158
  * cifs: fix broken oplock handling
    - LP: #686158
  * libahci: fix result_tf handling after an ATA PIO data-in command
    - LP: #686158
  * mm, x86: Saving vmcore with non-lazy freeing of vmas
    - LP: #686158
  * x86, cpu: Fix renamed, not-yet-shipping AMD CPUID feature bit
    - LP: #686158
  * x86, kexec: Make sure to stop all CPUs before exiting the kernel
    - LP: #686158
  * x86, olpc: Don...

Changed in linux (Ubuntu Maverick):
status: Fix Committed → Fix Released