Cinder

Netapp: dfm lun list refresh not up-to-date

Bug #1095633 reported by Brano Zarnovican on 2013-01-03

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Cinder	Fix Released	Undecided	Ben Swartzlander	Cinder 2013.1 "grizzly"

Bug Description

Short summary:

When creating a volume from a snapshot, the operation may fail with the error "No entry in LUN table for volume". The LUN representing the volume was created correctly on the filer, but DFM did not refresh its LUN list in time, which causes the error.

Detailed description:

The OpenStack NetApp driver calls DFM APIs to manage LUNs. If the SOAP call goes directly to DFM (as in create_volume), DFM correctly updates its internal LUN list. However, in some cases (as in create_volume_from_snapshot), the driver uses DFM only as a proxy to the target filer. The driver creates the LUN on the filer and then asks DFM to refresh its LUN list. The new LUN is expected to appear in the refreshed DFM list, after which it is added to the driver's internal self.discovered_luns list. This can fail, because the list of LUNs returned by DFM does not always contain the newly created LUN.

As a consequence, 'create_volume_from_snapshot' returns successfully, but the 'ensure_export' method invoked immediately afterwards fails because the entry is missing from 'self.discovered_luns'.

Versions:
* Openstack Netapp driver from latest Folsom branch
* DFM version 5.1
* OnTap: 7.3.6P5

Notes:
* after several attempts to refresh the DFM LUN list, the new LUN eventually appears, but it is unpredictable when
* this issue applies to 7-mode. The code for Cluster mode appears to be different.
* if DFM is supposed to return an up-to-date list once the refresh has completed, then this is a DFM problem (because it does not). Otherwise, it should be handled in the driver

Symptoms:
'lun show' command on the filer shows most up-to-date list
'dfm lun list' shows DFM's internal LUN list which may be "behind", missing LUNs created recently

Steps to reproduce:
I have written a simple script (attached) which imitates the driver's code for refreshing the LUN list.
1) Inside the script, update DFM's API URL and credentials (same as in nova.conf)

2) run interactively on the host with access to DFM
# python -i /tmp/lun_refresh.py
2013-01-02 17:04:18,524 INFO Soap client init..
2013-01-02 17:04:22,444 INFO Soap client init done.
2013-01-02 17:04:23,523 DEBUG Discovered 3 datasets and 5 LUNs
>>>
>>> ds = d._get_dataset('OpenStack_103a49bb861e485ea05aa78f9b0216bd')
>>> host_id = '<filers-hostname>'

3) on Netapp filer, create 'test2' LUN manually
> lun create -s 1g -t linux /vol/OpenStack_103a49bb861e485ea05aa78f9b0216bd/test2

4) switch to python cli and run refresh
>>> d._refresh_and_discover(host_id, ds, 'test2')
2013-01-02 17:04:59,697 INFO Starting refresh..
2013-01-02 17:05:14,782 INFO calling TimestampList
2013-01-02 17:05:29,884 INFO calling TimestampList
2013-01-02 17:05:44,987 INFO calling TimestampList
2013-01-02 17:06:00,087 INFO calling TimestampList
2013-01-02 17:06:15,187 INFO calling TimestampList
2013-01-02 17:06:15,272 INFO Finished refresh..
2013-01-02 17:06:15,529 INFO DFM lun refresh FOUND volume "test2"
(GOOD case, new lun is there after first refresh..)

5) repeat steps 3) and 4) for other test LUNs
Netapp:
> lun create -s 1g -t linux /vol/OpenStack_103a49bb861e485ea05aa78f9b0216bd/test3
python cli:
>>> d._refresh_and_discover(host_id, ds, 'test3')
2013-01-02 17:06:28,025 INFO Starting refresh..
2013-01-02 17:06:43,094 INFO calling TimestampList
2013-01-02 17:06:43,179 INFO Finished refresh..
2013-01-02 17:06:43,438 DEBUG DFM lun refresh did not return volume "test3" (1/3)
2013-01-02 17:06:43,438 INFO Starting refresh..
2013-01-02 17:06:58,512 INFO calling TimestampList
2013-01-02 17:06:58,597 INFO Finished refresh..
2013-01-02 17:06:58,865 DEBUG DFM lun refresh did not return volume "test3" (2/3)
2013-01-02 17:06:58,865 INFO Starting refresh..
2013-01-02 17:07:13,946 INFO calling TimestampList
2013-01-02 17:07:14,032 INFO Finished refresh..
2013-01-02 17:07:14,290 INFO DFM lun refresh FOUND volume "test3"
(BAD case, first two DFM lun refreshes did NOT return newly created LUN, third one worked)

Dirty workaround:
Just repeat the DFM LUN refresh several times and stop when the new LUN appears in the list. This is what the '_refresh_and_discover' method does in the attached 'lun_refresh.py' script.
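
The retry logic described above can be sketched roughly as follows. This is only an illustration, not the code from the attached script: `refresh` and `list_luns` are hypothetical callables standing in for the DFM refresh request and the LUN listing, and the attempt count of 3 simply mirrors the "(1/3)" counter in the log output.

```python
import time


def refresh_until_found(refresh, list_luns, lun_name, attempts=3, delay=0.0):
    """Repeat a refresh until lun_name is listed or attempts run out.

    refresh and list_luns are hypothetical stand-ins for the DFM calls;
    they are not real driver or DFM APIs.
    Returns the attempt number on which the LUN was found, or None.
    """
    for attempt in range(1, attempts + 1):
        refresh()
        if lun_name in list_luns():
            return attempt
        if attempt < attempts:
            # Give DFM some time before asking again.
            time.sleep(delay)
    return None
```

As noted below, a bounded retry like this only makes success more likely; returning None after the last attempt leaves the caller to decide how to report the failure.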

I will probably create a proper patch against Folsom later on. The logic will be the same as in the attached script (minus DEBUG messages). However, I believe this workaround is quite lame. Repeating the same operation N times still does not guarantee that it will work. I have seen cases where the DFM LUN list was out of date for minutes (!) even when refresh was called repeatedly via SOAP and/or 'dfm lun discover ...'. It would be nice if somebody could look into DFM itself to find out why the refresh does not work.

Regards,

Brano Zarnovican

Revision history for this message

Brano Zarnovican (zarnovican) wrote on 2013-01-03:

helper script to reproduce the problem (7.7 KiB, text/x-python)

Chuck Short (zulcss) on 2013-01-03

affects:

nova → cinder

Mike Perez (thingee) on 2013-01-03

tags:

added: driver

Brano Zarnovican (zarnovican) wrote on 2013-01-07:

More info about the DFM refresh problem..

1) A request to refresh the LUN list is ignored if it is issued too soon after the previous refresh finished. The grace period appears to be around a minute. During this period DFM returns the timestamp of the previous monitor execution. Currently, the driver runs DfmObjectRefresh but does not check whether the returned timestamp is in the past, only that it is non-zero.

https://github.com/openstack/nova/blob/stable/folsom/nova/volume/netapp.py#L870

Fix/workaround: Repeat DfmObjectRefresh requests until the timestamp you get back is later than the time of the first Refresh request.
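
A minimal sketch of that timestamp check, assuming the monitor finish time and the time of the first request are comparable numbers (which in practice means the two clocks have to agree). `request_refresh` and `get_monitor_timestamp` are hypothetical callables, not the real DFM SOAP calls.

```python
import time


def refresh_until_newer(request_refresh, get_monitor_timestamp,
                        first_request_time, max_requests=5, delay=0.0):
    """Re-request a refresh until the monitor timestamp is newer.

    request_refresh and get_monitor_timestamp are hypothetical stand-ins
    for DfmObjectRefresh and the timestamp lookup. A timestamp that is
    zero or not later than first_request_time is treated as stale.
    Returns the number of requests issued, or None if all were stale.
    """
    for n in range(1, max_requests + 1):
        request_refresh()
        finished = get_monitor_timestamp()
        if finished > first_request_time:
            return n
        if n < max_requests:
            # The request was probably ignored; wait before retrying.
            time.sleep(delay)
    return None
```

The only change compared with a non-zero check is the comparison against `first_request_time`, which is what rejects a timestamp left over from the previous monitor run.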

2) (Theory) Running the 'lun' monitor does not refresh the LUN list if the new LUN is inside a qtree which has not been discovered yet. Even if you run DfmObjectRefresh(.., ChildType='lun_path') multiple times and let it finish correctly, the new LUN still won't appear. It looks as if the qtree (and its LUN) appears only after the 'file_system' monitor has been executed. This monitor is NOT triggered by the ChildType='lun_path' parameter.

Fix/workaround: Explicitly trigger both 'file_system' and 'lun' monitors.

Brano Zarnovican (zarnovican) wrote on 2013-01-07:

0001-BUGFIX-1095633-Netapp-driver-repeat-DFM-lun-refresh.patch (2.4 KiB, text/plain)

I have written a patch against Folsom which implements those two fixes.

1) We repeat the Refresh request if the monitor finish time returned by DFM is older than our first request. Unfortunately, this introduces a new requirement: the driver host and the DFM host must be time-synchronized.

2) Instead of running
server.DfmObjectRefresh(.., ChildType='lun_path')
we run
server.DfmObjectRefresh(.., MonitorNames=['file_system', 'lun'])

The former triggers only the DFM lun monitor. The latter triggers both the file_system and lun monitors. The file_system monitor is needed to discover a newly created qtree.

> My question is whether it would slow down operations at all in the 85-90% of the time when DFM does the right thing.

I would say it might be slower even if everything works perfectly on the DFM side, because we trigger two monitors instead of one. Whether they run concurrently, I don't know. The lun monitor always seems to be slower than the file_system monitor.

Rushi Agrawal (rushiagr) on 2013-01-22

Changed in cinder:
assignee:	nobody → Rushi Agrawal (rushiagr)

OpenStack Infra (hudson-openstack) wrote on 2013-01-22: Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/20229

Changed in cinder:
status:	New → In Progress

Rushi Agrawal (rushiagr) wrote on 2013-01-22:

Thanks Brano. The time-synchronization requirement is a bit of an overhead in my opinion.

I was attempting to do something like this:
If we could somehow get the difference between the two clocks, that is, the driver host's and DFM's (within a margin of around 30 seconds), by contacting DFM before even cloning the LUN, that would do the job.

I will look into it in more detail in the coming days.
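
One way this idea could look, purely as a sketch: read the remote clock once, take the midpoint of the local time before and after the request, and use the difference as an offset. How DFM's current time would actually be obtained is not specified here, so `remote_now` is just a value assumed to come from some such call, and the function names and the 30-second default margin are illustrative.

```python
def estimate_clock_offset(local_before, remote_now, local_after):
    """Estimate (remote clock - local clock) in seconds.

    local_before and local_after are local timestamps taken around the
    request that returned remote_now. The midpoint is used as the local
    time at which the remote clock was read.
    """
    return remote_now - (local_before + local_after) / 2.0


def finished_after_request(remote_finish, local_request, offset, margin=30.0):
    """Check whether a remote finish time is later than a local request time.

    The remote timestamp is translated into local time using offset; margin
    is the tolerated error of the offset estimate, in seconds.
    """
    return (remote_finish - offset) > (local_request - margin)
```

With an offset measured before cloning the LUN, the monitor timestamps returned by DFM could be compared against local request times without requiring the two hosts to be synchronized.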

Thanks again.

OpenStack Infra (hudson-openstack) on 2013-02-21

Changed in cinder:
assignee:	Rushi Agrawal (rushiagr) → John Griffith (john-griffith)

Ben Swartzlander (bswartz) on 2013-02-21

Changed in cinder:
assignee:	John Griffith (john-griffith) → Ben Swartzlander (bswartz)

OpenStack Infra (hudson-openstack) wrote on 2013-02-21: Fix merged to cinder (master)

Reviewed: https://review.openstack.org/20229
Committed: http://github.com/openstack/cinder/commit/f7bcf951aa038f23ea32f9215458f3fd59f33590
Submitter: Jenkins
Branch: master

commit f7bcf951aa038f23ea32f9215458f3fd59f33590
Author: Rushi Agrawal <email address hidden>
Date: Mon Jan 21 21:31:48 2013 +0530

Fix stale volume list for NetApp 7-mode ISCSI driver

    While contacting filer through DFM in order to create volume from
    snapshot, the operation may fail with an error "No entry in LUN
    table for volume". Although the LUN representing the volume was
    created on the filer, the LUN list was not refreshed in time, which
    caused an error. This fix handles this situation for creating
    volume from snapshots.

Note that this fix adds the requirement that the driver host and
DFM machine should be reasonably time-synchronized.

Fixes bug 1095633

Change-Id: I77fff4c36a3af72d28f2d01988a6067919093718