Comment 15 for bug 688068

Jeremy Anderson (jeremy-angelar) wrote :

New server crash this morning. Two KVM guests were running, but I was only running a small Perl script that reads an 8109-byte XML file, changes a couple of values via XML::Twig, and prints the result to stdout. After rebooting, the same script ran just fine.

I DID capture a vmstat snapshot, along with the last output of top:

crash at:

lost connectivity at Fri Dec 17 08:40:11 CST 2010

final top screen:

top - 08:58:35 up 1 day, 47 min, 5 users, load average: 0.51, 0.48, 0.42
Tasks: 191 total, 1 running, 190 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.5%us, 1.8%sy, 0.0%ni, 94.4%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 8160340k total, 3287448k used, 4872892k free, 177300k buffers
Swap: 1630460k total, 0k used, 1630460k free, 2237028k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17939 root 20 0 528m 267m 3380 S 25 3.4 131:50.81 kvm
   54 root 25 5 0 0 0 S 19 0.0 80:09.12 ksmd
 1709 root 20 0 199m 5508 3420 S 2 0.1 9:37.83 libvirtd
17357 jeremy 20 0 19272 1432 1028 R 1 0.0 2:39.89 top
17904 jeremy 20 0 100m 1828 876 S 1 0.0 3:28.46 sshd
 1775 bind 20 0 204m 29m 2224 S 0 0.4 0:36.81 named
 2548 nagios 20 0 28472 1468 880 S 0 0.0 2:57.31 nagios
17186 jeremy 20 0 100m 1828 872 S 0 0.0 0:06.69 sshd
17708 jeremy 20 0 100m 1828 872 S 0 0.0 0:00.80 sshd
17910 jeremy 20 0 9612 428 336 S 0 0.0 1:31.52 nc
    1 root 20 0 23896 2036 1244 S 0 0.0 0:09.89 init
    2 root 20 0 0 0 0 S 0 0.0 0:00.01 kthreadd
    3 root 20 0 0 0 0 S 0 0.0 0:03.06 ksoftirqd/0
    4 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/0:0
    6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
    7 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1
    8 root 20 0 0 0 0 S 0 0.0 0:00.01 kworker/1:0

last vmstat output (sampling at 1 second):

 0 0 0 4873808 177300 2237028 0 0 0 0 4283 8235 4 2 95 0
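
For anyone reading the raw vmstat line above without its header, here is a rough Python sketch that labels each field. The column order is an assumption on my part: it is the default 16-column procps vmstat layout (r b swpd free buff cache si so bi bo in cs us sy id wa), since the header line itself was not captured. The free, buff and cache values do line up with the Mem/Swap lines from top, which suggests the mapping is right. The names VMSTAT_COLUMNS and label_vmstat_line are just illustrative.

```python
# Assumed column order: default procps vmstat layout (16 columns).
# The header was not captured in the report, so this mapping is an assumption.
VMSTAT_COLUMNS = [
    "r", "b",                        # procs: runnable, blocked
    "swpd", "free", "buff", "cache", # memory (KiB)
    "si", "so",                      # swap in / out
    "bi", "bo",                      # blocks in / out
    "in", "cs",                      # interrupts, context switches per second
    "us", "sy", "id", "wa",          # CPU percentages
]


def label_vmstat_line(line):
    """Map one whitespace-separated vmstat sample line to named fields."""
    values = [int(v) for v in line.split()]
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError(
            "expected %d fields, got %d" % (len(VMSTAT_COLUMNS), len(values))
        )
    return dict(zip(VMSTAT_COLUMNS, values))


# The last sample captured before the crash, copied from the output above.
sample = label_vmstat_line(
    " 0 0 0 4873808 177300 2237028 0 0 0 0 4283 8235 4 2 95 0"
)

for name, value in sample.items():
    print("%5s: %d" % (name, value))
```

Read this way, the last sample shows no swap in use (swpd, si and so are all 0), no blocked processes, and the CPU about 95% idle, which matches the final top screen.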

final command was a Perl script that parses XML, reading in an 8109-byte file:

jeremy@valhalla:~/sandbox/esx$ ./jda2.pl
Write failed: Broken pipe

Note that this machine has 8 GB of RAM in a 4 x 2 GB configuration. I will remove two sticks of RAM later today and see whether that eliminates the crash. A friend has warned me that he has seen random freezes in machines with fully populated RAM banks. I will also contact the vendor, Biostar, to ask whether there are known issues with this RAM and this motherboard, or with fully populated RAM banks.