volume connect failed due to lun-1 missing

Bug #1096773 reported by li,chen
This bug affects 4 people
Affects   Status    Importance   Assigned to   Milestone
Cinder    Invalid   Undecided    Unassigned

Bug Description

We're working on Ubuntu 12.10 with the Folsom release (cinder-volume 2012.2.1-0ubuntu1).

We hit an error when trying to attach a volume to an instance.
We found the following error in compute.log:

2013-01-07 00:28:33 ERROR nova.compute.manager [req-a2aaabd3-a863-4efe-822e-ff864ad7639c 18c9f5d9db9b4ba983c7b1538098d3f1 07f6249cdb1843ec9ebb10768656acd3] [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] Failed to attach volume 17b0a32c-44bc-4b64-8dc0-cddf904af9ce at /dev/vdc
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] Traceback (most recent call last):
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1971, in _attach_volume
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] mountpoint)
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 117, in wrapped
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] temp_level, payload)
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] self.gen.next()
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 92, in wrapped
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] return f(*args, **kw)
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 635, in attach_volume
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] mount_device)
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 627, in volume_driver_method
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] return method(connection_info, *args, **kwargs)
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 752, in inner
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] retval = f(*args, **kwargs)
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/volume.py", line 174, in connect_volume
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] % (host_device))
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0] NovaException: iSCSI device not found at /dev/disk/by-path/ip-192.168.4.96:3260-iscsi-iqn.2010-10.org.openstack:volume-17b0a32c-44bc-4b64-8dc0-cddf904af9ce-lun-1
2013-01-07 00:28:33 TRACE nova.compute.manager [instance: 553f1a69-6bcc-425b-990c-197da83b3ab0]
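
The path in the NovaException above follows the by-path naming used for iSCSI devices: portal address, target IQN, then LUN number. Below is a minimal sketch of how that name is put together, which may help when checking by hand on the compute node whether the device node exists. The helper names `iscsi_by_path` and `device_present` are hypothetical and this is not nova's actual code; it is written for Python 3, while the trace comes from Python 2.7.

```python
import os


def iscsi_by_path(portal: str, iqn: str, lun: int) -> str:
    """Build the by-path device name in the form seen in the trace.

    Hypothetical helper for illustration; the format is copied from
    the path in the error message, not from nova's source.
    """
    return "/dev/disk/by-path/ip-{0}-iscsi-{1}-lun-{2}".format(portal, iqn, lun)


def device_present(portal: str, iqn: str, lun: int) -> bool:
    """Return True if the expected device node exists on this host."""
    return os.path.exists(iscsi_by_path(portal, iqn, lun))


# Values taken from the error message in this report.
expected = iscsi_by_path(
    "192.168.4.96:3260",
    "iqn.2010-10.org.openstack:volume-17b0a32c-44bc-4b64-8dc0-cddf904af9ce",
    1,
)
print(expected)
```

If the target only exports LUN 0, no `-lun-1` entry will appear under /dev/disk/by-path on the initiator, which matches the failure in the trace.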

So we went back to the iSCSI server to check the iSCSI target, and found something strange: LUN 1 is missing!

Target 79: iqn.2010-10.org.openstack:volume-17b0a32c-44bc-4b64-8dc0-cddf904af9ce
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 788
            Initiator: iqn.1993-08.org.debian:01:d0ae3c2cb9b3
            Connection: 0
                IP Address: 192.168.4.86
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET 004f0000
            SCSI SN: beaf790
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
    Account information:
    ACL information:
        ALL

What confuses us most about this issue is that it does not happen to every LVM volume, just some of them.
For example, when we create 50 volumes on one machine and try to attach them to instances, there is always 1 that fails while the other 49 work fine.

John Griffith (john-griffith) wrote :

Is there any consistency on which target this occurs on? I'm looking at putting a check in the create for this, however it doesn't address the root cause of the problem.

li,chen (chen-li) wrote :

I guess not; we didn't find any consistency.

John Griffith (john-griffith) wrote :

Wondering if you can find anything helpful in the cinder-volume logs? Specifically, something related to the create call on the volume that's missing its LUN entry?

Changed in cinder:
status: New → Incomplete
li,chen (chen-li) wrote :

The volume.log does not have any error message.
When I created 100 volumes, I found that one of the targets does not have LUN 1. The attached file is the volume.log from creating these 100 volumes.

After creating the 100 volumes, I ran the command below:
>> tgt-admin --show | grep "LUN: 1" -B 18 | grep Target

I found that target 81, for volume-28546ff0-af27-4c03-87a3-c47c08e614c5, is missing LUN 1. The output is below:

Target 1: iqn.2010-10.org.openstack:volume-3bca3b27-d965-4c98-902a-438df543a268
Target 2: iqn.2010-10.org.openstack:volume-ccb332d0-39e2-4fe8-97f3-67103548e576
......
Target 79: iqn.2010-10.org.openstack:volume-37c88812-a95d-4bea-86a7-b2e6f9a4070c
Target 80: iqn.2010-10.org.openstack:volume-56b19556-c1a3-4d83-a569-7797efc27814
Target 82: iqn.2010-10.org.openstack:volume-2eaa3f75-a5b6-43f9-a1f1-cf0138e9bc88
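
The grep pipeline above lists the targets that do have LUN 1, so the broken target has to be spotted as a gap in the numbering. A small sketch that reports the missing ones directly could look like this; it assumes the `tgt-admin --show` text layout shown in the bug description, the function name `targets_missing_lun` is hypothetical, and the sample input uses made-up target names (Python 3).

```python
import re

# Patterns assume the "tgt-admin --show" layout shown in this report.
TARGET_RE = re.compile(r"^Target (\d+): (\S+)")
LUN_RE = re.compile(r"^\s+LUN: (\d+)\s*$")


def targets_missing_lun(show_output, lun=1):
    """Return (target_id, iqn) pairs whose LUN list lacks `lun`."""
    luns_by_target = {}
    current = None
    for line in show_output.splitlines():
        match = TARGET_RE.match(line)
        if match:
            # A new "Target N: iqn" header starts a new section.
            current = (int(match.group(1)), match.group(2))
            luns_by_target[current] = set()
            continue
        match = LUN_RE.match(line)
        if match and current is not None:
            luns_by_target[current].add(int(match.group(1)))
    return [t for t, luns in luns_by_target.items() if lun not in luns]


# Made-up sample: target 79 has only the controller LUN 0.
SAMPLE = """Target 79: iqn.2010-10.org.openstack:volume-a
    LUN information:
        LUN: 0
            Type: controller
Target 80: iqn.2010-10.org.openstack:volume-b
    LUN information:
        LUN: 0
            Type: controller
        LUN: 1
            Type: disk
"""

for target_id, iqn in targets_missing_lun(SAMPLE):
    print("Target %d (%s) has no LUN 1" % (target_id, iqn))
```

On a real host the input would be the captured output of `tgt-admin --show`, for example via `subprocess.check_output`, which typically needs root privileges.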

Benedikt von St. Vieth (b-von-st-vieth) wrote :

Hi, slightly different situation, but I am also losing LUN 1 here with
Ubuntu 12.04.2, cinder-volume 2012.2.1-0ubuntu1~cloud0:
If I create a new volume, this works (the logical volume is created, tgt-admin shows it) and I am able to attach it to an instance.
After a reboot of the host providing the iSCSI target, tgt-admin shows that none of the old volumes has a LUN 1, and therefore I am not able to attach them to an instance anymore. For new volumes it works, until the next reboot.
I am not able to see anything in the logs. Any idea?

Benedikt von St. Vieth (b-von-st-vieth) wrote :

Okay, after a system reboot I had to restart tgt once again to make it aware of LUN 1. I don't know why only LUN 0 appears after a reboot, but "service tgt restart" solved the problem. I don't think this is a cinder problem, sorry for the inconvenience.

John Griffith (john-griffith) wrote :

This is a valid issue, and it can be caused by the device being busy or in use on reboot, etc. We should come up with a way of checking for this and trying to eliminate the issues.

Also, I suspect this is related to root cause of: LP #1226337

Changed in cinder:
status: Incomplete → Confirmed
Sean McGinnis (sean-mcginnis) wrote :

I believe this has been addressed indirectly by other changes since this was filed. Please reopen if this is still an issue.

Changed in cinder:
status: Confirmed → Invalid