lucid system randomly locks up, does not recover
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Invalid | Undecided | Unassigned | | |
Bug Description
This system had months upon months of uptime in LTS 8.04.01. After upgrading to LTS 10.04.01, the longest the system has gone before freezing is 10 hours (that was with the 2.6.34-
X is not running on this system. It runs the following software for my home network:
kvm w/6-10 VMs running
apache webserver w/php & mysql
nagios v3 (same version compiled under 8.04.01, running with those config files, as user nagios)
samba
bind9
iptables for use as firewall
dhcpd
NFS server
dovecot (serving up old mail archives)
postfix (in a forwarding-only mode)
OpenSSH server
Nothing is logged anywhere when these freezes occur. I've enabled remote syslogging to another Linux box on my network, and when the 8.04.01 system stops responding, it simply stops -- nothing interesting is logged at all. It's as if someone had just switched the power off.
RAM checked out fine during an 8-hour memtest86+ run (3 passes were completed).
CPU temperature was fine after an hour of running burnK7 from cpuburn.
None of the 6 SATA drives reports any SMART errors.
I found a couple of PSU calculators online, and all of them indicated that the 500-watt FSP power supply in the system is more than sufficient for all attached hardware.
No USB devices are plugged in.
The screen goes blank immediately when a freeze occurs. System activity seems to have little effect -- sometimes it freezes when the system is nearly idle, sometimes when the load average is around 2 (this machine has a dual-core CPU, so that shouldn't be alarming). There are no signs of memory starvation at any point -- I have yet to see swap in use, even under very heavy load.
The system also experienced these hangs before KVM was installed. I was using VirtualBox, and I migrated to KVM because I thought VirtualBox was causing the crashes.
I have been experimenting with different timers.
My longest uptime was with the mainline kernel, booted with the following options:
quiet ro splash ipv6.disable=1 clock_source=hpet
I am now experimenting with clock_source=
I have tried booting with the noacpi option, but that does not appear to help.
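When comparing timers like this, it may help to confirm which clocksource the running kernel actually selected, since the kernel's documented boot parameter is spelled `clocksource=` (no underscore) and an unrecognized option is not applied. The following is a minimal sketch, assuming the standard sysfs files `current_clocksource` and `available_clocksource` are present; the function names `read_clocksources` and `check_clocksource` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Standard sysfs location for clocksource information on Linux.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksources read from sysfs."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def check_clocksource(requested, base=SYSFS_CLOCKSOURCE):
    """Describe whether the requested clocksource is the active one."""
    current, available = read_clocksources(base)
    if requested == current:
        return f"{requested} is active"
    if requested in available:
        return f"{requested} is available, but {current} is active"
    return f"{requested} is not offered by this kernel; {current} is active"
```

After a reboot, calling `print(check_clocksource("hpet"))` would show whether the requested timer took effect or whether the kernel fell back to another one.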
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-
Regression: Yes
Reproducible: No
ProcVersionSign
Uname: Linux 2.6.32-26-server x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.
Date: Thu Dec 9 08:19:29 2010
Frequency: Once a day.
HibernationDevice: RESUME=
IwConfig: Error: [Errno 2] No such file or directory
MachineType: BIOSTAR Group A740G M2+
ProcCmdLine: root=UUID=
ProcEnviron:
PATH=(custom, no user)
LANG=en_US.UTF-8
SHELL=/bin/bash
RelatedPackageV
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
dmi.bios.date: 05/02/2008
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 080014
dmi.board.
dmi.board.name: A740G M2+
dmi.board.vendor: BIOSTAR Group
dmi.board.version: 6.0
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: BIOSTAR Group
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.name: A740G M2+
dmi.product.
dmi.sys.vendor: BIOSTAR Group
As suggested by tgardner in #ubuntu-kernel, I am now booting the machine with the following line: 2.6.32-26-server root=UUID=5626f5d7-0210-432c-9200-ec6a1d599df3 ro pci=nomsi
/boot/vmlinuz-
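To verify that an option such as `pci=nomsi` really reached the kernel after a reboot, the running command line can be read from `/proc/cmdline` and checked. This is a rough sketch only: `parse_cmdline` and `has_option` are hypothetical helper names, quoting is not handled, and a repeated option keeps only its last value.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of option -> value.

    Bare flags such as "ro" map to None. Only the first "=" separates
    key from value, so "root=UUID=..." keeps "UUID=..." as its value.
    """
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def has_option(cmdline, key, value=None):
    """Return True if key is present and, when given, value is among its
    comma-separated settings (e.g. "nomsi" in "pci=nomsi,noacpi")."""
    options = parse_cmdline(cmdline)
    if key not in options:
        return False
    if value is None:
        return True
    current = options[key]
    return current is not None and value in current.split(",")
```

On the affected machine, something like `has_option(open("/proc/cmdline").read(), "pci", "nomsi")` should return `True` if the boot line was applied as intended.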