Volume service hostgroup@tripleo_iscsi failed to start.: CappedVersionUnknown: Unrecoverable Error

Bug #1730111 reported by wes hayutin
This bug affects 1 person
Affects: tripleo
Status: Triaged
Importance: Critical
Assigned to: Unassigned

Bug Description

http://logs.openstack.org/45/514945/2/gate/legacy-tripleo-ci-centos-7-containers-multinode/aaed466/logs/subnode-2/var/log/cinder/volume.log.txt.gz#_2017-11-04_20_29_30_332

2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume [req-68a155d2-f1a9-47aa-b56d-f7a94cf58490 - - - - -] Volume service hostgroup@tripleo_iscsi failed to start.: CappedVersionUnknown: Unrecoverable Error: Versioned Objects in DB are capped to unknown version 1.30. Most likely your environment contains only new services and you're trying to start an older one. Use `cinder-manage service list` to check that and upgrade this service.
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume Traceback (most recent call last):
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/cmd/volume.py", line 97, in main
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume cluster=cluster)
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/service.py", line 395, in create
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume cluster=cluster)
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/service.py", line 148, in __init__
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume *args, **kwargs)
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/volume/manager.py", line 203, in __init__
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume *args, **kwargs)
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/manager.py", line 177, in __init__
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume self.scheduler_rpcapi = scheduler_rpcapi.SchedulerAPI()
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/rpc.py", line 208, in __init__
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume serializer = base.CinderObjectSerializer(obj_version_cap)
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume File "/usr/lib/python2.7/site-packages/cinder/objects/base.py", line 509, in __init__
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume raise exception.CappedVersionUnknown(version=version_cap)
2017-11-04 20:29:30.332 114538 ERROR cinder.cmd.volume CappedVersionUnknown: Unrecoverable Error: Versioned Objects in DB are capped to unknown version 1.30. Most likely your environment contains only new services and you're trying to start an older one. Use `cinder-manage service list` to check that and upgrade this service.

This causes tempest to fail:
http://logs.openstack.org/45/514945/2/gate/legacy-tripleo-ci-centos-7-containers-multinode/aaed466/logs/undercloud/home/zuul/tempest_output.log.txt.gz#_2017-11-04_20_24_47

2017-11-04 20:25:28 | {0} tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_volume_boot_pattern [8.664533s] ... FAILED
2017-11-04 20:25:28 |
2017-11-04 20:25:28 | Captured traceback:
2017-11-04 20:25:28 | ~~~~~~~~~~~~~~~~~~~
2017-11-04 20:25:28 | Traceback (most recent call last):
2017-11-04 20:25:28 | File "/usr/lib/python2.7/site-packages/tempest/common/utils/__init__.py", line 89, in wrapper
2017-11-04 20:25:28 | return f(self, *func_args, **func_kwargs)
2017-11-04 20:25:28 | File "/usr/lib/python2.7/site-packages/tempest/scenario/test_volume_boot_pattern.py", line 100, in test_volume_boot_pattern
2017-11-04 20:25:28 | volume_origin = self._create_volume_from_image()
2017-11-04 20:25:28 | File "/usr/lib/python2.7/site-packages/tempest/scenario/test_volume_boot_pattern.py", line 43, in _create_volume_from_image
2017-11-04 20:25:28 | return self.create_volume(name=vol_name, imageRef=img_uuid)
2017-11-04 20:25:28 | File "/usr/lib/python2.7/site-packages/tempest/scenario/manager.py", line 234, in create_volume
2017-11-04 20:25:28 | volume['id'], 'available')
2017-11-04 20:25:28 | File "/usr/lib/python2.7/site-packages/tempest/common/waiters.py", line 204, in wait_for_volume_resource_status
2017-11-04 20:25:28 | resource_name=resource_name, resource_id=resource_id)
2017-11-04 20:25:28 | tempest.exceptions.VolumeResourceBuildErrorException: volume d10f881d-088d-4d0c-95ee-2789a761fe1b failed to build and is in ERROR status
2017-11-04 20:25:28 |
2017-11-04 20:25:28 |

wes hayutin (weshayutin) wrote :

FYI
query: >-
  message:"tempest.exceptions.VolumeResourceBuildErrorException" AND
  (build_name:legacy-tripleo-ci-centos-7-containers-multinode OR
  build_name:legacy-tripleo-ci-centos-7-multinode-oooq OR
  build_name:legacy-tripleo-ci-centos-7-scenario001-multinode-oooq-container
  ) AND
  tags:console
$ elastic-recheck-query queries/foo.yaml
total hits: 142
build_branch
  100% master
build_change
  4% 457822
  4% 510464
  4% 510537
  4% 514759
  2% 471245
build_name
  81% legacy-tripleo-ci-centos-7-containers-multinode
  18% legacy-tripleo-ci-centos-7-scenario001-multinode-oooq-container
build_node
  100% centos-7
build_queue
  76% check
  23% gate
build_status
  100% FAILURE
build_zuul_url
  100% N/A
filename
  100% logs/undercloud/var/log/extra/logstash.txt
log_url
  1% http://logs.openstack.org/01/512501/6/gate/legacy-tripleo-ci-centos-7-containers-multinode/e02da75/logs/undercloud/var/log/extra/logstash.txt
  1% http://logs.openstack.org/02/515402/5/check/legacy-tripleo-ci-centos-7-containers-multinode/e83aa70/logs/undercloud/var/log/extra/logstash.txt
  1% http://logs.openstack.org/02/515502/1/check/legacy-tripleo-ci-centos-7-containers-multinode/8d8772d/logs/undercloud/var/log/extra/logstash.txt
  1% http://logs.openstack.org/06/516206/6/gate/legacy-tripleo-ci-centos-7-containers-multinode/7d1059e/logs/undercloud/var/log/extra/logstash.txt
  1% http://logs.openstack.org/07/472607/113/check/legacy-tripleo-ci-centos-7-containers-multinode/b8e2b20/logs/undercloud/var/log/extra/logstash.txt
node_provider
  28% ovh-bhs1
  21% rax-iad
  14% rax-dfw
  9% rax-ord
  8% ovh-gra1
port
  4% 52177
  4% 33524
  4% 47365
  2% 32860
  2% 32880
project
  54% openstack/tripleo-heat-templates
  18% openstack/tripleo-common
  12% openstack/tripleo-quickstart-extras
  7% openstack/puppet-tripleo
  2% openstack/instack-undercloud
tags
  100% logstash.txt console postci
voting
  100% 1

Gorka Eguileor (gorka) wrote :

The code used for Cinder-Volume (which is running on bare metal) is out of sync with the code being run for Cinder-API and Cinder-Scheduler (which are running in containers).

Cinder-Volume is running a package from the 31st of October, so it's missing the patch [1] that moves the Cinder Versioned Objects version from 1.29 to 1.30, while the other two services are running with that patch.

This means that when the scheduler starts, it records in the DB that it's running v1.30. When the Volume service then initializes the RPC client used to connect to the scheduler, it tries to set v1.30 for the Versioned Objects payload, but fails because the latest version known to that service is v1.29.

The solution is to get all services in sync, either by updating the Cinder-Volume service package or by using an older version for the containers.

[1] https://review.openstack.org/#/c/512276/
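
To make the mismatch concrete, here is a rough sketch of the kind of check that fails. This is not Cinder's actual code: the function names, the list of known versions, and the assumption that the cap is the lowest version recorded in the DB are all hypothetical, chosen only to mirror the error message in the traceback.

```python
class CappedVersionUnknown(Exception):
    """Hypothetical stand-in for the exception named in the traceback."""


def parse_version(version):
    # "1.30" -> (1, 30), so that 1.30 sorts after 1.29 (not as a float).
    return tuple(int(part) for part in version.split("."))


def compute_version_cap(db_service_versions):
    # Assumption: the cap is the lowest object version recorded in the DB
    # by the services that have already started.
    return min(db_service_versions, key=parse_version)


def start_service(known_versions, db_service_versions):
    # A starting service must know the capped version to build its serializer.
    cap = compute_version_cap(db_service_versions)
    if cap not in known_versions:
        raise CappedVersionUnknown(
            "Versioned Objects in DB are capped to unknown version %s" % cap
        )
    return cap


if __name__ == "__main__":
    # API and scheduler (containers) already recorded 1.30 in the DB.
    db_versions = ["1.30", "1.30"]
    # Old Cinder-Volume package: its history stops at 1.29.
    try:
        start_service(["1.28", "1.29"], db_versions)
    except CappedVersionUnknown as exc:
        print("volume service failed to start:", exc)
```

With the Volume package updated so that it also knows 1.30, the same call would return the cap instead of raising, which is the "get all services in sync" fix described above.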

Alan Bishop (alan-bishop) wrote :

I think this may be another manifestation of the problem that caused [1], and this issue will also be resolved when [2] merges (which is stuck behind [3]).

[1] https://bugs.launchpad.net/tripleo/+bug/1729253
[2] https://review.openstack.org/517038
[3] https://review.openstack.org/517222

wes hayutin (weshayutin) wrote :

Adding the promotion-blocker tag because this is failing in upstream gate jobs, not just in check jobs.

tags: added: promotion-blocker