test_tagged_attachment randomly fails with: sequence or collection is not empty: [{u'type': u'disk', u'bus': u'pci', u'address': u'0000:00:07.0', u'serial': u'56e9a4a0-51fc-4319-845b-02c6a21d5b16', u'tags': [u'volume-tag']}]

Bug #1775491 reported by Matt Riedemann
Affects: tempest
Status: Fix Released
Importance: High
Assigned to: Ghanshyam Mann
Milestone: (none)

Bug Description

Seen here:

http://logs.openstack.org/95/572195/6/check/tempest-full/6e332b1/job-output.txt.gz#_2018-06-06_22_51_16_149472

2018-06-06 22:51:16.149472 | controller | {2} tempest.api.compute.servers.test_device_tagging.TaggedAttachmentsTest.test_tagged_attachment [87.604595s] ... FAILED
2018-06-06 22:51:16.149532 | controller |
2018-06-06 22:51:16.149586 | controller | Captured traceback:
2018-06-06 22:51:16.149637 | controller | ~~~~~~~~~~~~~~~~~~~
2018-06-06 22:51:16.149726 | controller | Traceback (most recent call last):
2018-06-06 22:51:16.149855 | controller | File "tempest/common/utils/__init__.py", line 88, in wrapper
2018-06-06 22:51:16.149938 | controller | return f(*func_args, **func_kwargs)
2018-06-06 22:51:16.150112 | controller | File "tempest/api/compute/servers/test_device_tagging.py", line 376, in test_tagged_attachment
2018-06-06 22:51:16.150189 | controller | self.verify_empty_devices)
2018-06-06 22:51:16.150356 | controller | File "tempest/api/compute/servers/test_device_tagging.py", line 88, in verify_metadata_from_api
2018-06-06 22:51:16.150443 | controller | CONF.compute.build_interval):
2018-06-06 22:51:16.150678 | controller | File "tempest/lib/common/utils/test_utils.py", line 107, in call_until_true
2018-06-06 22:51:16.150764 | controller | if func(*args, **kwargs):
2018-06-06 22:51:16.150924 | controller | File "tempest/api/compute/servers/test_device_tagging.py", line 83, in get_and_verify_metadata
2018-06-06 22:51:16.151000 | controller | verify_method(md_json)
2018-06-06 22:51:16.151170 | controller | File "tempest/api/compute/servers/test_device_tagging.py", line 310, in verify_empty_devices
2018-06-06 22:51:16.151276 | controller | self.assertEmpty(md_dict['devices'])
2018-06-06 22:51:16.151380 | controller | File "tempest/test.py", line 863, in assertEmpty
2018-06-06 22:51:16.151454 | controller | self.assertFalse(items, msg)
2018-06-06 22:51:16.151649 | controller | File "/opt/stack/tempest/.tox/tempest/local/lib/python2.7/site-packages/unittest2/case.py", line 696, in assertFalse
2018-06-06 22:51:16.151732 | controller | raise self.failureException(msg)
2018-06-06 22:51:16.152319 | controller | AssertionError: [{u'type': u'disk', u'bus': u'pci', u'address': u'0000:00:07.0', u'serial': u'56e9a4a0-51fc-4319-845b-02c6a21d5b16', u'tags': [u'volume-tag']}] is not false : sequence or collection is not empty: [{u'type': u'disk', u'bus': u'pci', u'address': u'0000:00:07.0', u'serial': u'56e9a4a0-51fc-4319-845b-02c6a21d5b16', u'tags': [u'volume-tag']}]

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22sequence%20or%20collection%20is%20not%20empty%3A%5C%22%20AND%20message%3A%5C%22u'tags'%3A%20%5Bu'volume-tag'%5D%7D%5D%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d

87 hits in 7 days, check and gate, all failures.

The failing test was merged on May 31, so this is pretty recent, and it has a reasonably high failure rate:

https://review.openstack.org/#/c/391947/

This is in the tempest logs before it fails:

http://logs.openstack.org/95/572195/6/check/tempest-full/6e332b1/job-output.txt.gz#_2018-06-06_23_23_46_573362

There is a sleep in this test, so it's obviously racy:

https://review.openstack.org/#/c/391947/62/tempest/api/compute/servers/test_device_tagging.py@374

That should probably loop, sleeping until the verify method passes or we time out.
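
Concretely, a minimal sketch of that approach (reusing tempest's call_until_true helper, which already appears in the traceback above; fetch_metadata and verify_method here are hypothetical stand-ins for the test's real helpers, not the actual patch):

    from tempest import config
    from tempest.lib import exceptions as lib_exc
    from tempest.lib.common.utils import test_utils

    CONF = config.CONF

    def wait_for_metadata(fetch_metadata, verify_method):
        # Poll the metadata until verify_method stops raising, instead of a
        # single fixed-length sleep that races with nova's metadata cache.
        def get_and_verify_metadata():
            md_json = fetch_metadata()      # e.g. fetch the metadata JSON from the guest
            try:
                verify_method(md_json)      # raises AssertionError while metadata is stale
                return True
            except AssertionError:
                return False

        if not test_utils.call_until_true(get_and_verify_metadata,
                                          CONF.compute.build_timeout,
                                          CONF.compute.build_interval):
            raise lib_exc.TimeoutException(
                'Metadata never reached the expected state')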

Matt Riedemann (mriedem)
Changed in tempest:
status: New → Confirmed
importance: Undecided → High
Ghanshyam Mann (ghanshyammann) wrote:

Yes, it is easily reproducible in a local env by increasing nova's metadata_cache_expiration config option to something higher (I used 50 sec).
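
For reference, a hypothetical nova.conf tweak for that reproduction (in recent releases the option lives in the [api] group; older releases used [DEFAULT]):

    [api]
    metadata_cache_expiration = 50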

It fails in both places: in verify_empty_devices() [1] as well as in verify_device_metadata() [2].

[1] https://github.com/openstack/tempest/blob/5aeb551a58078abd34884583ac70ad425f4590ba/tempest/api/compute/servers/test_device_tagging.py#L310

[2] https://github.com/openstack/tempest/blob/5aeb551a58078abd34884583ac70ad425f4590ba/tempest/api/compute/servers/test_device_tagging.py#L306

As Matt mentioned in Gerrit, we should go with a wait loop here rather than a hard-coded sleep.

OpenStack Infra (hudson-openstack) wrote: Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/573107

Changed in tempest:
assignee: nobody → Ghanshyam Mann (ghanshyammann)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote: Fix merged to tempest (master)

Reviewed: https://review.openstack.org/573107
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=cc59f9d9887a6424f0e963567802ecd3ed696b7c
Submitter: Zuul
Branch: master

commit cc59f9d9887a6424f0e963567802ecd3ed696b7c
Author: Ghanshyam <email address hidden>
Date: Thu Jun 7 06:57:57 2018 +0000

    Fix race condition in device tag testing

    The device tag tests (test_device_tagging.py) have a hard-coded
    sleep matching the default cache time of the metadata service. That
    cache time is configurable on the nova side (metadata_cache_expiration).

    So instead of sleeping for a hard-coded value, we should use a wait
    loop to fetch the metadata info for verification. To be safe,
    build_timeout is used as the max time to fetch the expected metadata
    info; otherwise the test fails with a timeout.

    Change-Id: Ic59a12f3fbcb7ef1972c03cb0c92d08a0a35498b
    Closes-Bug: #1775491

Changed in tempest:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote: Fix included in openstack/tempest 19.0.0

This issue was fixed in the openstack/tempest 19.0.0 release.
