Comment 11 for bug 893450

Peter Petrakis (peter-petrakis) wrote :

After re-examining the evidence and chatting with Serge, a few things come to mind.

1) There's a disparity between the conclusion formed in comment #3 and the latest
udev logs. It could simply be that, this time, the udev logging was kept running long
enough for that last LV to be activated. It does appear that if we wait long enough
(approx. 10 mins), all LVs will eventually come online.

2) The crux: is udev being starved of events, or is it being stalled by some other action?

3) Comment #9 compelled me to dig into the backing stores of the software RAID 1
set. The WD20EARS drives appear to use 4K physical sectors, which, if true, would
lend credence to *slowness* (udev being starved of events) being the reason it takes
almost 10 mins to discover all the LVs.

Follow up:
1) verify the sector size of the backing stores by installing sg3-utils and posting the results of
 # sg_readcap /dev/sda
and
 # cat /sys/block/sda/queue/physical_block_size

2) the basic dd test is fine; please re-run it with a sweep of block sizes (512, 4k, 8k,
1024k) and "count=100" so it doesn't run forever (see the sketch after this list)

3) an additional I/O benchmark, such as iozone or iometer, would also be informative;
whichever is your preference (an example iozone invocation follows below).
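
To make the sweep in item 2 concrete, here is a minimal sketch; /dev/sda, the read
direction, and the iflag=direct flag (to bypass the page cache) are my assumptions,
so substitute the device or target file you used in the original dd test:
 # for bs in 512 4k 8k 1024k; do echo bs=$bs; dd if=/dev/sda of=/dev/null bs=$bs count=100 iflag=direct 2>&1 | tail -1; done
Note that at bs=512, count=100 only moves 50K, so the smaller sizes will finish almost
instantly; the relative throughput across block sizes is what's interesting.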
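
For item 3, one possible iozone invocation (a sketch; the iozone3 package, the mount
point /mnt/test on the affected LV, and the 512m size cap are all placeholders):
 # iozone -a -g 512m -f /mnt/test/iozone.tmp
-a runs the automatic sweep of file and record sizes, -g caps the maximum file size so
the run stays reasonable, and -f points the scratch file at the filesystem under test.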

Conclusion:
If this is an alignment issue then we've got a lot of work to do. You might be able
to resize/grow the MD array so that the stripe size is a multiple of the physical sector
size, though you would need to carry the exercise all the way through, including the
LVM metadata offsets. It would be optimal if you had a duplicate set of drives to
reproduce this issue on a separate system. Thanks.
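
In the meantime, a few read-only checks would show where things actually start on
disk (a sketch; /dev/md0, /dev/sda, and /dev/sdb are assumptions, substitute your
member devices):
 # parted /dev/sda unit s print
 # parted /dev/sdb unit s print
 # mdadm --detail /dev/md0
 # pvs -o +pe_start --units s
The parted output shows whether each partition starts on a 4K (8-sector) boundary,
mdadm --detail describes the array layout, and pe_start shows where LVM begins
laying out data on the PV.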