Peter,

I got it to crash again, this time with a nice kernel dump. The dump can be fetched here:
http://www.rmoesbergen.nl/linux-image-3.2.0-34-generic.0.crash.gz

The crash itself looked like this:

Dec 21 11:07:32 ealxs00161 kernel: [63272.392812] sd 4:0:1:1: emc: ALUA failover mode detected
Dec 21 11:07:32 ealxs00161 kernel: [63272.392820] sd 4:0:1:1: emc: at SP B Port 1 (owned, default SP B)
Dec 21 11:07:32 ealxs00161 kernel: [63272.393180] sd 3:0:0:1: emc: ALUA failover mode detected
Dec 21 11:07:32 ealxs00161 kernel: [63272.393187] sd 3:0:0:1: emc: at SP B Port 0 (owned, default SP B)
Dec 21 11:10:36 ealxs00161 kernel: [63455.641431] qla2xxx [0000:07:00.0]-500b:3: LOOP DOWN detected (2 3 0 0).
Dec 21 11:10:52 ealxs00161 multipathd: sdf: remove path (uevent)
Dec 21 11:10:52 ealxs00161 kernel: [63471.548255] rport-3:0-1: blocked FC remote port time out: removing target and saving binding
Dec 21 11:10:52 ealxs00161 kernel: [63471.676065] rport-3:0-0: blocked FC remote port time out: removing target and saving binding
Dec 21 11:11:08 ealxs00161 cimserver[2079]: Authentication failed for user=root.
Dec 21 11:11:10 ealxs00161 cimserver[2079]: Authentication failed for user=root.
Dec 21 11:13:28 ealxs00161 kernel: [63627.745648] INFO: task jbd2/dm-1-8:1530 blocked for more than 120 seconds.
Dec 21 11:13:28 ealxs00161 kernel: [63627.746025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 21 11:13:28 ealxs00161 kernel: [63627.756371] jbd2/dm-1-8 D ffff8803aa11a620 0 1530 2 0x00000000
Dec 21 11:13:28 ealxs00161 kernel: [63627.756380] ffff880416141ac0 0000000000000046 ffff880416141a60 ffff88042ee137c0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756388] ffff880416141fd8 ffff880416141fd8 ffff880416141fd8 00000000000137c0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756395] ffffffff81c0d020 ffff880415ef9700 ffff880416141a90 ffff88042ee14080
Dec 21 11:13:28 ealxs00161 kernel: [63627.756403] Call Trace:
Dec 21 11:13:28 ealxs00161 kernel: [63627.756416] [] ? __lock_page+0x70/0x70
Dec 21 11:13:28 ealxs00161 kernel: [63627.756431] [] schedule+0x3f/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756441] [] io_schedule+0x8f/0xd0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756451] [] sleep_on_page+0xe/0x20
Dec 21 11:13:28 ealxs00161 kernel: [63627.756460] [] __wait_on_bit+0x5f/0x90
Dec 21 11:13:28 ealxs00161 kernel: [63627.756470] [] wait_on_page_bit+0x78/0x80
Dec 21 11:13:28 ealxs00161 kernel: [63627.756481] [] ? autoremove_wake_function+0x40/0x40
Dec 21 11:13:28 ealxs00161 kernel: [63627.756492] [] filemap_fdatawait_range+0x10c/0x1a0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756503] [] filemap_fdatawait+0x2b/0x30
Dec 21 11:13:28 ealxs00161 kernel: [63627.756516] [] journal_finish_inode_data_buffers+0x70/0x170
Dec 21 11:13:28 ealxs00161 kernel: [63627.756528] [] jbd2_journal_commit_transaction+0x665/0x1240
Dec 21 11:13:28 ealxs00161 kernel: [63627.756538] [] ? add_wait_queue+0x60/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756548] [] kjournald2+0xbb/0x220
Dec 21 11:13:28 ealxs00161 kernel: [63627.756557] [] ? add_wait_queue+0x60/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.756566] [] ? commit_timeout+0x10/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.756575] [] kthread+0x8c/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756587] [] kernel_thread_helper+0x4/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.756596] [] ? flush_kthread_worker+0xa0/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.756606] [] ? gs_change+0x13/0x13
Dec 21 11:13:28 ealxs00161 kernel: [63627.756612] Kernel panic - not syncing: hung_task: blocked tasks
Dec 21 11:13:28 ealxs00161 kernel: [63627.768425] Pid: 66, comm: khungtaskd Tainted: G W 3.2.0-34-generic #53-Ubuntu
Dec 21 11:13:28 ealxs00161 kernel: [63627.779691] Call Trace:
Dec 21 11:13:28 ealxs00161 kernel: [63627.790147] [] panic+0x91/0x1a4
Dec 21 11:13:28 ealxs00161 kernel: [63627.800888] [] check_hung_task+0xb2/0xc0
Dec 21 11:13:28 ealxs00161 kernel: [63627.811370] [] check_hung_uninterruptible_tasks+0x11b/0x140
Dec 21 11:13:28 ealxs00161 kernel: [63627.821998] [] ? check_hung_uninterruptible_tasks+0x140/0x140
Dec 21 11:13:28 ealxs00161 kernel: [63627.833715] [] watchdog+0x4f/0x60
Dec 21 11:13:28 ealxs00161 kernel: [63627.844538] [] kthread+0x8c/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.855370] [] kernel_thread_helper+0x4/0x10
Dec 21 11:13:28 ealxs00161 kernel: [63627.866367] [] ? flush_kthread_worker+0xa0/0xa0
Dec 21 11:13:28 ealxs00161 kernel: [63627.877343] [] ? gs_change+0x13/0x13

Output of ps xa, just before the crash:

PID TTY STAT TIME COMMAND
1 ? Ss 0:02 /sbin/init
2 ? S 0:00 [kthreadd]
3 ? S 0:01 [ksoftirqd/0]
6 ? S 0:01 [migration/0]
7 ? S 0:00 [watchdog/0]
8 ? S 0:00 [migration/1]
10 ? S 0:00 [ksoftirqd/1]
12 ? S 0:00 [watchdog/1]
13 ? S 0:01 [migration/2]
15 ? S 0:00 [ksoftirqd/2]
16 ? S 0:00 [watchdog/2]
17 ? S 0:00 [migration/3]
19 ? S 0:00 [ksoftirqd/3]
20 ? S 0:00 [watchdog/3]
21 ? S 0:00 [migration/4]
23 ? S 0:00 [ksoftirqd/4]
24 ? S 0:00 [watchdog/4]
25 ? S 0:00 [migration/5]
27 ? S 0:00 [ksoftirqd/5]
28 ? S 0:00 [watchdog/5]
29 ? S 0:00 [migration/6]
30 ? S 0:00 [kworker/6:0]
31 ? S 0:00 [ksoftirqd/6]
32 ? S 0:00 [watchdog/6]
33 ? S 0:00 [migration/7]
35 ? S 0:00 [ksoftirqd/7]
36 ? S 0:00 [watchdog/7]
37 ? S 0:00 [migration/8]
38 ? S 0:00 [kworker/8:0]
39 ? S 0:00 [ksoftirqd/8]
40 ? S 0:00 [watchdog/8]
41 ? S 0:00 [migration/9]
42 ? S 0:00 [kworker/9:0]
43 ? S 0:00 [ksoftirqd/9]
44 ? S 0:00 [watchdog/9]
45 ? S 0:00 [migration/10]
47 ? S 0:00 [ksoftirqd/10]
48 ? S 0:00 [watchdog/10]
49 ? S 0:00 [migration/11]
51 ? S 0:00 [ksoftirqd/11]
52 ? S 0:00 [watchdog/11]
53 ? S< 0:00 [cpuset]
54 ? S< 0:00 [khelper]
55 ? S 0:00 [kdevtmpfs]
56 ? S< 0:00 [netns]
58 ? S 0:00 [sync_supers]
59 ? S 0:00 [bdi-default]
60 ? S< 0:00 [kintegrityd]
61 ? S< 0:00 [kblockd]
62 ? S< 0:00 [ata_sff]
63 ? S 0:00 [khubd]
64 ? S< 0:00 [md]
66 ? S 0:00 [khungtaskd]
67 ? S 0:14 [kswapd0]
68 ? SN 0:00 [ksmd]
69 ? SN 0:00 [khugepaged]
70 ? S 0:00 [fsnotify_mark]
71 ? S 0:00 [ecryptfs-kthrea]
72 ? S< 0:00 [crypto]
80 ? S< 0:00 [kthrotld]
81 ? S 0:00 [scsi_eh_0]
82 ? S 0:00 [scsi_eh_1]
104 ? S< 0:00 [devfreq_wq]
265 ? S 0:00 [scsi_eh_2]
267 ? S 0:00 [hpsa]
349 ? S 0:00 [kworker/6:1]
352 ? S 0:00 [kworker/9:1]
353 ? S 0:00 [kworker/10:1]
354 ? S 0:00 [kworker/4:1]
357 ? S< 0:00 [kdmflush]
365 ? S 0:00 [jbd2/sda1-8]
366 ? S< 0:00 [ext4-dio-unwrit]
458 ? S 0:00 upstart-udev-bridge --daemon
461 ? Ss 0:00 /sbin/udevd --daemon
547 ? S< 0:00 [kmpathd]
548 ? S< 0:00 [kmpath_handlerd]
626 ? S< 0:00 [edac-poller]
660 ? S 0:00 [scsi_eh_3]
702 ? S< 0:00 [kpsmoused]
861 ? S< 0:00 [qla2xxx_3_dpc]
862 ? Ss 0:00 rpcbind -w
864 ? S< 0:00 [scsi_wq_3]
877 ? S 0:00 [scsi_eh_4]
879 ? Ss 0:00 rpc.statd -L
888 ? S< 0:00 [rpciod]
891 ? S< 0:00 [nfsiod]
893 ? S 0:00 upstart-socket-bridge --daemon
895 ? S< 0:00 [qla2xxx_4_dpc]
896 ? S< 0:00 [scsi_wq_4]
902 ? S< 0:00 [bond0]
1054 ? S< 0:00 [kdmflush]
1109 ? S< 0:00 [kdmflush]
1490 ? S 0:06 [jbd2/dm-2-8]
1491 ? S< 0:00 [ext4-dio-unwrit]
1530 ? D 0:42 [jbd2/dm-1-8]
1531 ? S< 0:00 [ext4-dio-unwrit]
1573 ? Ss 0:00 /usr/sbin/sshd -D
1576 ? Ss 0:00 rpc.idmapd
1580 ? Ss 0:00 dbus-daemon --system --fork --activation=upstart
1603 ? Sl 0:02 rsyslogd -c5
1677 tty4 Ss+ 0:00 /sbin/getty -8 38400 tty4
1684 tty5 Ss+ 0:00 /sbin/getty -8 38400 tty5
1693 tty2 Ss+ 0:00 /sbin/getty -8 38400 tty2
1697 tty3 Ss+ 0:00 /sbin/getty -8 38400 tty3
1703 tty6 Ss+ 0:00 /sbin/getty -8 38400 tty6
1710 ? Ss 0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
1712 ? Ss 0:00 cron
1715 ? Ss 0:00 atd
1727 ? S 0:00 /usr/sbin/zabbix_agentd
1729 ? Ss 0:20 /usr/sbin/irqbalance
1733 ? Ssl 0:00 whoopsie
1738 ? Ssl 2:51 /usr/sbin/mysqld
1745 ? S 0:35 /usr/sbin/zabbix_agentd
1746 ? S 0:10 /usr/sbin/zabbix_agentd
1747 ? S 0:10 /usr/sbin/zabbix_agentd
1748 ? S 0:11 /usr/sbin/zabbix_agentd
1749 ? S 0:12 /usr/sbin/zabbix_agentd
1750 ? S 0:11 /usr/sbin/zabbix_agentd
1751 ? S 0:01 /usr/sbin/zabbix_agentd
1919 ? S 0:00 [kworker/5:2]
2004 ? S 0:00 [kworker/8:2]
2010 ? S 0:00 [kworker/11:2]
2011 ? S 0:00 [kworker/11:3]
2024 ? Sl 0:01 /opt/Unisphere/bin/hostagent -f /etc/Unisphere/agent.config
2046 ? SLl 0:08 /sbin/multipathd
2079 ? SLsl 0:19 /opt/microsoft/scx/bin/scxcimserver
2177 tty1 Ss+ 0:00 /sbin/getty -8 38400 tty1
2179 ? S 0:00 [flush-8:0]
2180 ? D 1:05 [flush-252:1]
2181 ? S 0:00 [flush-252:2]
2257 ? Ssl 0:28 /opt/microsoft/scx/bin/scxcimprovagt 0 9 12 root SCXCoreProviderModule
2422 ? Ss 0:02 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 109:115
2625 ? Ss 0:00 sshd: ronaldm [priv]
2824 ? S 0:00 sshd: ronaldm@pts/0
2825 pts/0 Ss 0:00 -bash
2924 pts/0 S 0:00 sudo -i
2929 pts/0 S 0:00 -bash
3194 ? Ssl 0:01 /opt/microsoft/scx/bin/scxcimprovagt 0 8 14 scoma SCXUserCoreProviderModule
3207 ? S 0:17 [kworker/1:3]
3614 ? Ss 0:00 sshd: ronaldm [priv]
3753 ? S 0:00 sshd: ronaldm@pts/2
3754 pts/2 Ss 0:00 -bash
3868 pts/2 S 0:00 sudo -i
3874 pts/2 S 0:00 -bash
4925 ? S 0:00 [kworker/7:3]
5248 ? S 0:00 [kworker/u:2]
5251 ? S 0:00 [kworker/u:3]
5348 ? S 0:00 [kworker/10:2]
5353 ? S 0:00 [kworker/1:1]
5361 ? S 0:00 [kworker/0:1]
5382 ? S 0:00 [kworker/3:0]
5383 ? S 0:00 [kworker/3:3]
5384 ? S 0:00 [kworker/5:3]
5387 ? S 0:00 [kworker/0:5]
5391 ? S 0:00 [kworker/1:2]
5691 ? S 0:00 [kworker/7:4]
6088 ? S 0:00 [kworker/1:4]
6221 ? S 0:00 [kworker/4:2]
6260 ? S 0:00 [kworker/2:0]
6261 ? S 0:00 [kworker/2:4]
6521 pts/0 D+ 0:17 bonnie++ -d . -u root
6655 ? S 0:00 /sbin/udevd --daemon
6656 ? S 0:00 /sbin/udevd --daemon
6910 ? S 0:00 [kworker/1:0]
6915 pts/2 R+ 0:00 ps xa

Acceptatie - DB01 (root@ealxs00161):~# ps xa | grep multi
2046 ? SLl 0:08 /sbin/multipathd
6917 pts/2 S+ 0:00 grep --color=auto multi

Also, just before the crash:

Acceptatie - DB01 (root@ealxs00161):~# multipath -ll
LUN-DATABASE (36006016061e02e003cf1aca4ae07e211) dm-2 DGC,VRAID
size=200G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:1:1 sdi 8:128 active ready running
| `- #:#:#:# - #:# active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:0:1 sde 8:64 active ready running
  `- #:#:#:# - #:# active faulty running
LUN-LOGGING (36006016061e02e000286c1adae07e211) dm-1 DGC,VRAID
size=20G features='0' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=130 status=active
| |- 4:0:0:0 sdd 8:48 active ready running
| `- #:#:#:# - #:# active faulty running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 4:0:1:0 sdh 8:112 active ready running
  `- #:#:#:# - #:# active faulty running

Output of dmsetup table -v before starting the tests:

Name: vg-swap
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 2
Event number: 0
Major, minor: 252, 0
Number of targets: 1
UUID: LVM-BySGZfHLAZg250K7UjTxYBStGjTdkb2CE8b7q7HMxBUtJso72BPYfnAcLpxixYP4
0 3997696 linear 8:2 512

Name: LUN-DATABASE
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 4
Major, minor: 252, 2
Number of targets: 1
UUID: mpath-36006016061e02e003cf1aca4ae07e211
0 419430400 multipath 1 queue_if_no_path 1 emc 2 1 round-robin 0 2 1 8:128 1000 8:32 1000 round-robin 0 2 1 8:64 1000 8:96 1000

Name: LUN-LOGGING
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 4
Major, minor: 252, 1
Number of targets: 1
UUID: mpath-36006016061e02e000286c1adae07e211
0 41943040 multipath 1 queue_if_no_path 1 emc 2 1 round-robin 0 2 1 8:48 1000 8:80 1000 round-robin 0 2 1 8:112 1000 8:16 1000

Output of lsscsi -lv before starting the tests:

[2:0:0:0] storage HP P420i 3.04 -
  state=running queue_depth=1020 scsi_level=6 type=12 device_blocked=0 timeout=0
  dir: /sys/bus/scsi/devices/2:0:0:0 [/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:0:0/2:0:0:0]
[2:0:0:1] disk HP LOGICAL VOLUME 3.04 /dev/sda
  state=running queue_depth=1020 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/2:0:0:1 [/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host2/target2:0:0/2:0:0:1]
[3:0:0:0] disk DGC VRAID 0531 /dev/sdb
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:0:0 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-0/target3:0:0/3:0:0:0]
[3:0:0:1] disk DGC VRAID 0531 /dev/sdc
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:0:1 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-0/target3:0:0/3:0:0:1]
[3:0:1:0] disk DGC VRAID 0531 /dev/sdf
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:1:0 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-1/target3:0:1/3:0:1:0]
[3:0:1:1] disk DGC VRAID 0531 /dev/sdg
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/3:0:1:1 [/sys/devices/pci0000:00/0000:00:01.0/0000:07:00.0/host3/rport-3:0-1/target3:0:1/3:0:1:1]
[4:0:0:0] disk DGC VRAID 0531 /dev/sdd
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:0:0 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-0/target4:0:0/4:0:0:0]
[4:0:0:1] disk DGC VRAID 0531 /dev/sde
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:0:1 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-0/target4:0:0/4:0:0:1]
[4:0:1:0] disk DGC VRAID 0531 /dev/sdh
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:1:0 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-1/target4:0:1/4:0:1:0]
[4:0:1:1] disk DGC VRAID 0531 /dev/sdi
  state=running queue_depth=32 scsi_level=5 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/4:0:1:1 [/sys/devices/pci0000:00/0000:00:1c.0/0000:0a:00.0/host4/rport-4:0-1/target4:0:1/4:0:1:1]

I hope this helps...