pvmove coredumps

Bug #38007 reported by hunger
Affects: lvm2 (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Michael Vogt
Milestone: (none listed)

Bug Description

Moving a PV causes the pvmove app to coredump after a while.

Revision history for this message
hunger (hunger) wrote : strace output of pvmove.

Attached is the strace output of pvmove /dev/sda5 /dev/sdb4 (both drives do exist, of course ;-)).

Revision history for this message
Nathan Howell (neh) wrote :

I've just run into this problem as well. I'm moving (or trying to move) data off of a failing 300GB drive onto an identical new drive. (pvmove -v /dev/hdc /dev/hde) Both drives are entirely PVs (no partitions). My strace looks the same as the one already attached.

Here are a few chat excerpts from #lvm (I'm phlaegel):

11:42 agk| what version is it?
11:43 agk| run 'lvm version'
11:43 agk| (or check package info)
11:44 agk| and is the kernel the one that came with the distro, or a newer one you installed later?
11:45 agk| [in some cases new kernel requires new tools - changed around 2.6.12]
11:45 phlaegel| LVM version: 2.01.15 (2005-10-16)
11:45 phlaegel| Library version: 1.01.05 (2005-09-26)
11:45 phlaegel| Driver version: 4.4.0
11:45 agk| OK, that version's fine with earlier kernels
11:46 agk| but not after 2.6.12-ish
11:46 phlaegel| that'll be it then
11:46 phlaegel| 2.6.15
11:47 agk| you need 2.02.* and 1.02.*

12:08 agk| if it's still there, what does 'dmsetup status' say - the line that has 'mirror' in it
12:12 phlaegel| vg_tv-pvmove0: 0 586063872 mirror 2 22:0 33:0 572328/572328
12:14 agk| lovely.
12:14 agk| Yes, there's a bug in that line of code that got fixed.
12:14 agk| I don't remember it though!
12:15 agk| if (sscanf(pos, "%u %n", mirror_count, used) != 1) {
12:15 agk| became
12:15 agk| if (sscanf(pos, "%u %n", &mirror_count, &used) != 1) {
12:15 agk| 2.01.15 has that bug.
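
The fix agk quotes comes down to how sscanf receives its output arguments: %u needs a pointer to unsigned and %n a pointer to int, so passing the variables by value is undefined behaviour and can plausibly produce the segfault reported here. The following is a minimal sketch, not the actual LVM2 source. The helper name parse_mirror_params and its signature are hypothetical, and the field layout (mirror count, one device per mirror, then synced/total) is assumed from the status line quoted above.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not LVM2 code): parse the parameters that follow
 * the word "mirror" in a 'dmsetup status' line, for example
 * "2 22:0 33:0 572328/572328".
 * Returns 0 on success, -1 if the string does not match the assumed layout.
 */
int parse_mirror_params(const char *params, unsigned *mirror_count,
                        unsigned long long *synced,
                        unsigned long long *total)
{
    const char *pos = params;
    int used = 0;
    unsigned i;

    /* Both outputs are pointers. %n is not counted in the return value,
     * so a successful parse of the count returns exactly 1. */
    if (sscanf(pos, "%u %n", mirror_count, &used) != 1)
        return -1;
    pos += used;

    /* Skip one "major:minor" device field per mirror leg. */
    for (i = 0; i < *mirror_count; i++) {
        used = 0;
        /* %*s is suppressed, so success returns 0; EOF means no field. */
        if (sscanf(pos, "%*s %n", &used) != 0 || used == 0)
            return -1;
        pos += used;
    }

    /* What remains is assumed to be "<synced>/<total>" regions. */
    if (sscanf(pos, "%llu/%llu", synced, total) != 2)
        return -1;

    return 0;
}
```

Called with the parameters from the line above ("2 22:0 33:0 572328/572328"), this reports two mirror legs with all 572328 regions in sync. Building with compiler format-string warnings enabled (for example GCC's -Wformat) usually flags the original by-value mistake at compile time.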

Hope this helps...

Revision history for this message
hunger (hunger) wrote :

Two people report this bug now... I guess that is "Confirmed".

Changing severity to "major" as a bug like this renders lvm useless in corporate environments (which dapper seems to be targeting).

Changed in lvm2:
status: Unconfirmed → Confirmed
Revision history for this message
Matt Zimmerman (mdz) wrote :

Does it work if you retry the operation? Does it cause any problems with the volume, or only fail to execute the move?

Revision history for this message
Matt Zimmerman (mdz) wrote :

According to comments, we could fix this by merging lvm2 2.02.02 and devmapper 2:1.02.03-1 from unstable.

I've reviewed the upstream changelogs and they look OK for Dapper. I am a little nervous about this innocent-looking changelog entry:

  A pvresize implementation.

and some code has been shuffled from lvm2 into devmapper, but this has been in Debian for almost 6 months now and seems stable.

Michael, please merge these.

Changed in lvm2:
assignee: nobody → mvo
Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

pvresize is safe, but in order to sync lvm you will also need device-mapper and a huge amount of testing.

Fabio

Revision history for this message
hunger (hunger) wrote :

I used an FC5 rescue CD to move the LVs I had, and so far have had no need to retest.

AFAIR I did test after the last LVM updates and the problem was still there. If it helps, I can retest when I get home to a hard drive that I can spare :-) pvmove should not ruin data when crashing at that point (it has just set up a dm-mirror, which starts to copy data over), but I am not eager to wager my data on that.

I do agree with Fabio though: upgrading LVM drags in device-mapper and other stuff. I tried to rebuild LVM from Debian's sources, but gave up when I saw what that entails. Doing something like that this close to a release might be risky, but then shipping a broken pvmove in an "enterprise grade" distribution will give really bad press, I am afraid.

OTOH, I could do some tests this weekend when I am home again and could grab a spare HDD for my laptop to do further testing next week when I am on the road again.

Revision history for this message
Michael Vogt (mvo) wrote :

Thanks for your kind offer to test it.

Here is a (source-only) merge of the latest debian lvm2 with our packages:
http://people.ubuntu.com/~mvo/test/lvm2/

I would appreciate testing and will upload if it is all good.

Thanks,
 Michael

Michael Vogt (mvo)
Changed in lvm2:
status: Confirmed → In Progress
Revision history for this message
hunger (hunger) wrote :

I ran some simplistic tests (copying a 1GiB PV back and forth) and that works for me. This used to trigger the bug, so I consider it to be fixed with your patch.

I further did some reboots and normal kind of operation stuff: No new bugs to report with your new code. I'd recommend uploading it.

Thanks for fixing this bug (which was really troubling since it bit only when something was so screwed up that you needed to shuffle your PVs around)!

Revision history for this message
Michael Vogt (mvo) wrote :

This should be fixed with the upload of devmapper_1.02.05-1ubuntu1 and lvm2_2.02.02-1ubuntu1

Cheers,
 Michael

Changed in lvm2:
status: In Progress → Fix Released
Revision history for this message
Nathan Howell (neh) wrote :

I had a chance to test this a couple of days ago (with a couple of fairly large moves on two different machines) and pvmove seems to be working correctly now. Thanks for the fix.

Revision history for this message
Christoph Rauch (christoph-rauch) wrote : pvmove STILL coredumps.

root@xen04:~# dpkg -l|grep devmapper1.02
ii libdevmapper1.02 1.02.05-1ubuntu1 [...]

root@xen04:~# pvmove
Segmentation fault

root@xen04:~# strace pvmove
[...]
stat64("/dev/fibre/dspam01-root", {st_mode=S_IFBLK|0600, st_rdev=makedev(253, 28), ...}) = 0
stat64("/dev/fibre/dspam01-root", {st_mode=S_IFBLK|0600, st_rdev=makedev(253, 28), ...}) = 0
open("/dev/fibre/dspam01-root", O_RDONLY|O_DIRECT|O_LARGEFILE|O_NOATIME) = 3
fstat64(3, {st_mode=S_IFBLK|0600, st_rdev=makedev(253, 28), ...}) = 0
ioctl(3, BLKBSZGET, 0x80eafb8) = 0
_llseek(3, 0, [0], SEEK_SET) = 0
read(3, <unfinished ...>
+++ killed by SIGSEGV +++

Changed in lvm2:
status: Fix Released → Confirmed
Revision history for this message
Christoph Rauch (christoph-rauch) wrote :

never mind... might just be that hard drive dying below my feet without me noticing. :-(

Changed in lvm2:
status: Confirmed → Fix Released