system stops booting after message JDB: barrier-based sync failed on md1-8 - disabling barriers ... wrong # of devices in RAID

Bug #598086 reported by Kluth
Affects: dmraid (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: dmraid

I am using Ubuntu 10.04 Kernel 2.6.32-22-generic
Asus K8V SE Deluxe BIOS: AMIBIOS Version 08.00.09 ID: A0058002
Promise controller deactivated
4 IDE hard disks (Samsung SP1634N) configured as RAID 0, connected via the VIA VT8237 controller
All hard disks are shown identically in the BIOS

I created the RAID with the partitioning tool included on the Ubuntu 10.04 64-bit minimal installation CD.
The system worked fine for two weeks or so.

after changing /etc/fstab by adding

tmpfs /tempFileSystem tmpfs noexec,default,noatime,nodiratime 0 0
and removing the line that mounts the floppy
/dev/fd0 /media/floppy auto rw,noauto,user,sync 0 0 # this line is an example because I can't read the files on my hard disk...

the system hangs after the message
JDB: barrier-based sync failed on md1-8 - disabling barriers

(The message before the GRUB screen saying that the system can't find a floppy still appears, although the floppy controller is deactivated in the BIOS.)
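
For reference, a minimal sketch of how a tmpfs entry is commonly written, using the generic keyword "defaults"; the mount point is taken from the line added above and the exact option set is an assumption:

tmpfs /tempFileSystem tmpfs defaults,noexec,noatime,nodiratime 0 0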

I can switch to tty7 and back to tty1, but not to the ones in between (I have not disabled them; tty7 shows just a blinking cursor).
If I attach a USB CD-ROM drive, it is detected and a message is printed -> the system does not hang completely.
I cannot connect via SSH (I don't know whether I have configured the SSH daemon yet).
If I hit Ctrl-Alt-Del, the system shuts down.

I can look at some earlier messages using Shift-PageUp. There is a message "md0: unknown partition table", but I seem to remember that this message has been there ever since installation.
Two lines later comes:

EXT4-fs (md1): mounted filesystem with ordered data mode
Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.

The GRUB entries are:
normal:

recordfail
insmod raid
insmod mdraid
insmod ext2
set root='(md1)'
search --no-floppy --fs-uuid --set 88d5917f-fdb3-4673-a3bc-82e29469467a
linux /boot/vmlinuz-2.6.32-22-generic root=UUID=88d5917f-fdb3-4673-a3bc-82e29469467a ro splash
initrd /boot/initrd.img-2.6.32-22-generic

recovery:

recordfail
insmod raid
insmod mdraid
insmod ext2
set root='(md1)'
search --no-floppy --fs-uuid --set 88d5917f-fdb3-4673-a3bc-82e29469467a
echo 'Linux 2.6.32-22-generic wird geladen ...'
linux /boot/vmlinuz-2.6.32-22-generic root=UUID=88d5917f-fdb3-4673-a3bc-82e29469467a ro single
echo ' Initiale Ramdisk wird geladen ...'
initrd /boot/initrd.img-2.6.32-22-generic

I tried the following:

-Booting via "Recovery mode" -> nothing changed
-Booting with the added boot option nodmraid gives me the message "dmraid-activate: WARNING: dmraid disable by boot option" -> hangs after the same message
-Booting the Ubuntu 10.04 kernel 2.6.32-21-generic live CD

fdisk -l

Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x3870bf41

   Device Boot Start End Blocks Id System
/dev/sda1 * 1 244 1951744 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2 244 19458 154336257 5 Extended
/dev/sda5 244 19458 154336256 fd Linux raid autodetect

Disk /dev/sdb: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xeee62945

   Device Boot Start End Blocks Id System
/dev/sdb1 1 244 1951744 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdb2 244 19458 154336257 5 Extended
/dev/sdb5 244 19458 154336256 fd Linux raid autodetect

Disk /dev/sdc: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x3898d9fa

   Device Boot Start End Blocks Id System
/dev/sdc1 1 244 1951744 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdc2 244 19458 154336257 5 Extended
/dev/sdc5 244 19458 154336256 fd Linux raid autodetect

Disk /dev/sdd: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x386237a8

   Device Boot Start End Blocks Id System
/dev/sdd1 * 1 244 1951744 fd Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sdd2 244 19458 154336257 5 Extended
/dev/sdd5 244 19458 154336256 fd Linux raid autodetect

dmraid -r
/dev/sdb: pdc, "pdc_icfhhggf-0", stripe, ok, 312499968 sectors, data@ 0

dmraid -V
dmraid version: 1.0.0.rc16 (2009.09.16) shared
dmraid library version: 1.0.0.rc16 (2009.09.16)
device-mapper version: 4.15.0

dmraid -b
/dev/sdd: 312581808 total, "S0BKJ10YC31771"
/dev/sdc: 312581808 total, "S0BKJ1FYC02391"
/dev/sdb: 312581808 total, "S016J10X454363"
/dev/sda: 312581808 total, "S0BKJ10YC31770"

dmraid -ay
ERROR: pdc: wrong # of devices in RAID set "pdc_icfhhggf-0" [1/2] on /dev/sdb
ERROR: keeping degraded mirror set "pdc_icfhhggf"

RAID set "pdc_icfhhggf-0" was not activateddmraid -tay
ERROR: pdc: wrong # of devices in RAID set "pdc_icfhhggf-0" [1/2] on /dev/sdb
ERROR: keeping degraded mirror set "pdc_icfhhggf"
pdc_icfhhggf-0: 0 312499968 linear /dev/zero 0
ERROR: no mapping possible for RAID set pdc_icfhhggf

I am wondering why there is [1/2] in the line
ERROR: pdc: wrong # of devices in RAID set "pdc_icfhhggf-0" [1/2] on /dev/sdb
What does it mean? Should it not be [1/4]?

dmraid -ay -v -d
DEBUG: not isw at 160041884672
DEBUG: isw trying hard coded -2115 offset.
DEBUG: not isw at 160040802816
DEBUG: not isw at 160041884672
DEBUG: isw trying hard coded -2115 offset.
DEBUG: not isw at 160040802816
DEBUG: not isw at 160041884672
DEBUG: isw trying hard coded -2115 offset.
DEBUG: not isw at 160040802816
DEBUG: not isw at 160041884672
DEBUG: isw trying hard coded -2115 offset.
DEBUG: not isw at 160040802816
DEBUG: _find_set: searching pdc_icfhhggf-0
DEBUG: _find_set: not found pdc_icfhhggf-0
DEBUG: _find_set: searching pdc_icfhhggf
DEBUG: _find_set: not found pdc_icfhhggf
DEBUG: _find_set: searching pdc_icfhhggf-0
DEBUG: _find_set: not found pdc_icfhhggf-0
DEBUG: checking pdc device "/dev/sdb"
ERROR: pdc: wrong # of devices in RAID set "pdc_icfhhggf-0" [1/2] on /dev/sdb
DEBUG: set status of set "pdc_icfhhggf-0" to 2
ERROR: keeping degraded mirror set "pdc_icfhhggf"
RAID set "pdc_icfhhggf-0" was not activated
DEBUG: freeing devices of RAID set "pdc_icfhhggf-0"
DEBUG: freeing device "pdc_icfhhggf-0", path "/dev/sdb"
DEBUG: freeing devices of RAID set "pdc_icfhhggf"

Do I have to repair something? How?
How can I mount the filesystem from a live CD so that I can change the fstab back?
Are the changes in fstab the reason for this problem? Or maybe an automatic run of fsck?
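
For reference, a rough sketch of how the array might be assembled and mounted from a live CD, assuming the mdadm superblocks are intact; the mount point /mnt/raid is an assumption, and /dev/md1 is taken from the boot messages above:

sudo apt-get install mdadm              # install mdadm if the live session does not include it
sudo mdadm --assemble --scan --verbose  # assemble all arrays described by the on-disk superblocks
cat /proc/mdstat                        # check whether md0/md1 came up
sudo mkdir -p /mnt/raid
sudo mount /dev/md1 /mnt/raid           # mount the ext4 root filesystem
sudo nano /mnt/raid/etc/fstab           # revert the fstab change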

Tags: dmraid
Phillip Susi (psusi) wrote :

It sounds like you are trying to use both mdadm software RAID and dmraid fake hardware RAID at the same time. You cannot do that. You should go into your BIOS and delete the RAID array there.

Changed in dmraid (Ubuntu):
status: New → Incomplete
Kluth (kluth-weas) wrote :

The onboard Promise controller is disabled
The VIA VT8237 (VIA BIOS 2.01) is used as a regular IDE controller; no RAID array is defined.
With this configuration the system worked fine for 2 weeks.
If the problem results from a hardware or BIOS defect:
How can I mount the array manually? The system starts booting from the array, so it must be readable, or do I misunderstand something? When I use the live CD, I can see the array but cannot open it.

Phillip Susi (psusi) wrote :

Disabling the controller in the BIOS does not remove the RAID configuration from the disks, which is why dmraid -r is still showing them. You can also erase the fake RAID signatures with dmraid -E. I'm not sure how you managed to get the system working like this, since usually dmraid will take over the disks and prevent mdadm from activating them. It is likely that you have yet another problem with your filesystem, but dmraid would not be involved in that; if you aren't trying to use it, then you should erase the dmraid signatures from the disks, and that will end dmraid's involvement.
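
A rough sketch of the suggested clean-up, assuming the stale metadata should be removed from every disk dmraid reports; dmraid normally asks for confirmation before erasing:

sudo dmraid -r    # list the fake RAID signatures that dmraid still finds
sudo dmraid -rE   # erase those signatures from the member disks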

Kluth (kluth-weas) wrote :

I used dmraid -rE to delete the signatures and rebooted. The boot behaviour is still the same.
If I examine the array using the "disk tool" from the live CD, I can still see the array and the partitions seem to be fine. A self-test of the disks shows no errors. But if I try to start the array I get the error: Not running, not enough components to start

root@ubuntu:/home/ubuntu# mdadm -E /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 4a91e63e:efd84daf:4b142e9b:4593d65a
  Creation Time : Fri Jun 11 10:21:53 2010
     Raid Level : raid0
  Used Dev Size : 0
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Jun 11 10:21:53 2010
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : c62a585 - correct
         Events : 1

     Chunk Size : 64K

      Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1

   0 0 8 1 0 active sync /dev/sda1
   1 1 8 17 1 active sync /dev/sdb1
   2 2 8 33 2 active sync /dev/sdc1
   3 3 8 49 3 active sync /dev/sdd1
root@ubuntu:/home/ubuntu# mdadm -E /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 4a91e63e:efd84daf:4b142e9b:4593d65a
  Creation Time : Fri Jun 11 10:21:53 2010
     Raid Level : raid0
  Used Dev Size : 0
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Jun 11 10:21:53 2010
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : c62a597 - correct
         Events : 1

     Chunk Size : 64K

      Number Major Minor RaidDevice State
this 1 8 17 1 active sync /dev/sdb1

   0 0 8 1 0 active sync /dev/sda1
   1 1 8 17 1 active sync /dev/sdb1
   2 2 8 33 2 active sync /dev/sdc1
   3 3 8 49 3 active sync /dev/sdd1
root@ubuntu:/home/ubuntu# mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 4a91e63e:efd84daf:4b142e9b:4593d65a
  Creation Time : Fri Jun 11 10:21:53 2010
     Raid Level : raid0
  Used Dev Size : 0
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Jun 11 10:21:53 2010
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : c62a5a9 - correct
         Events : 1

     Chunk Size : 64K

      Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1

   0 0 8 1 0 active sync /dev/sda1
   1 1 8 17 1 active sync /dev/sdb1
   2 2 8 33 2 active sync /dev/sdc1
   3 3 8 49 3 active sync /dev/sdd1
root@ubuntu:/home/ubuntu# mdadm -E /dev/sdd1
/dev/sdd1:
          Magic : a92b4efc
        Ver...


Kluth (kluth-weas) wrote :

Addition to comment #4:
I tried with and without the boot option "nodmraid".

Kluth (kluth-weas) wrote :

Maybe this additional info helps:
The four disks are partitioned as follows:

Primary Master
-Primary Partition: 2GB Linux RAID autodetect
-Extended 158GB
--158GB Linux RAID autodetect

Primary Slave
-Primary Partition: 2GB Linux RAID autodetect Flag bootable
-Extended 158GB
--158GB Linux RAID autodetect

Secondary Master
-Primary Partition: 2GB Linux RAID autodetect
-Extended 158GB
--158GB Linux RAID autodetect

Secondary Slave
-Primary Partition: 2GB Linux RAID autodetect Flag bootable
-Extended 158GB
--158GB Linux RAID autodetect

The combined 2 GB partitions should form the swap partition.
The combined 158 GB partitions should form the ext4 partition.

As shown in comment #4, only information for the sd[abcd]1 devices is recognized. Where is the information for sd[abcd]5? Or how do I recreate it?
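
A rough sketch of how the second set of members could be checked from a live CD; the assumption is that the superblocks on the sd[abcd]5 partitions are still readable:

sudo mdadm -E /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5   # examine the superblocks of the larger members
sudo mdadm --examine --scan                             # print array lines for everything mdadm can find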

Danny Wood (danwood76) wrote :

Hi,

You can only see information for sda1 etc. because you are issuing the command 'mdadm -E /dev/sda1'.
A better way to check the status of mdadm RAID is to issue 'cat /proc/mdstat', which will show you the status of each RAID set.

It's possible that you have corrupted one of the disks and it needs to sync.
But the only way to check that is by issuing 'cat /proc/mdstat'.

If it does need to sync, you will have to let it; depending on the size of the disks it could take a while.
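
A rough sketch of the suggested check; the watch loop is an assumption on top of the advice above, useful only while a resync is actually running:

cat /proc/mdstat                # shows each md array, its member devices, and any resync progress
watch -n 5 cat /proc/mdstat     # re-run every 5 seconds to follow a resync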

Changed in dmraid (Ubuntu):
status: Incomplete → Invalid