centos-8-scenario002-standalone failing tempest - SSH timed out

Bug #1904223 reported by Marios Andreou
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

The tripleo-ci-centos-8-scenario002-standalone job is failing during the tempest run at [1][2][3] with an error like:

        * tempest.lib.exceptions.SSHTimeout: Connection to the 192.168.24.136 via SSH timed out.

Digging a little, I see the following warning in the nova-compute log around the same time, at [4] for example:

        * 2020-11-13 07:35:37.866 7 WARNING os_brick.encryptors.luks [req-d17a67ac-f40c-4326-9402-b058873f547e d18971a27a894516a2856684ddb50a25 b79a73b43d4941799d137e1adce638c2 - default default] isLuks exited abnormally (status 1): : oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.

Not sure if that is the root cause yet, though. This is blocking various patches, for example https://review.opendev.org/#/c/762497 (needed for the unrelated bug #1903508, which is a victoria promotion blocker).

[1] https://80137ce53930819135d8-42d904af0faa486c8226703976d821a0.ssl.cf2.rackcdn.com/762497/4/check/tripleo-ci-centos-8-scenario002-standalone/175ce9a/logs/undercloud/var/log/tempest/stestr_results.html
[2] https://9cce6a92f5e67989bf3f-9f131354b122204fb24a7b43973ed8e6.ssl.cf1.rackcdn.com/762584/1/check/tripleo-ci-centos-8-scenario002-standalone/6a50122/logs/undercloud/var/log/tempest/stestr_results.html
[3] http://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_770/762584/1/check/tripleo-ci-centos-8-scenario002-standalone/7709997/logs/undercloud/var/log/tempest/stestr_results.html
[4] https://9cce6a92f5e67989bf3f-9f131354b122204fb24a7b43973ed8e6.ssl.cf1.rackcdn.com/762584/1/check/tripleo-ci-centos-8-scenario002-standalone/6a50122/logs/undercloud/var/log/containers/nova/nova-compute.log

tags: added: promotion-blocker
Ronelle Landy (rlandy) wrote :

Added https://review.opendev.org/762865 to the skiplist while debugging is in progress.

Marios Andreou (marios-b) wrote :

I'm digging here but haven't found anything useful yet.

rlandy merged an addition to the skiplist at https://review.opendev.org/#/c/763078/ ("Add periodic sc002 tests to skip for bug 1904223").

The main issue I've seen so far is the one from the description, i.e. the nova-compute log has:

        * https://80137ce53930819135d8-42d904af0faa486c8226703976d821a0.ssl.cf2.rackcdn.com/762497/4/check/tripleo-ci-centos-8-scenario002-standalone/175ce9a/logs/undercloud/var/log/containers/nova/nova-compute.log
        * 2020-11-12 20:53:08.963 7 WARNING os_brick.encryptors.luks [req-02224c40-831a-4539-bc17-229de4975b5e 724335db83504df59d7bc8cbd4461b80 17940204a4f64c4dab4937478c3585eb - default default] isLuks exited abnormally (status 1): : oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.

and it seems to line up with the tempest failure time:

ft1.2: barbican_tempest_plugin.tests.scenario.test_volume_encryption.VolumeEncryptionTest.test_encrypted_cinder_volumes_luks[compute,id-89165fb4-5534-4b9d-8429-97ccffb8f86f,image,volume]testtools.testresult.real._StringException: pythonlogging:'': {{{
2020-11-12 20:52:44,912 253297 WARNING [barbican_tempest_plugin.tests.scenario.barbican_manager] Starting Tempest 25.0.0 release, CONF.scenario.img_file need a full path for the image. CONF.scenario.img_dir was deprecated and will be removed in the next release. Till Tempest 25.0.0, old behavior is maintained and keep working but starting Tempest 26.0.0, you need to specify the full path in CONF.scenario.img_file config option.
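
To make the "lines up" observation concrete, here is a minimal sketch that parses the leading timestamps of the two log lines quoted in this comment and prints the gap between them. The helper name `parse_log_timestamp` is hypothetical, and the log lines are shortened copies of the ones above; it only assumes both logs start with a `YYYY-MM-DD HH:MM:SS` stamp followed by milliseconds after either `.` or `,`.

```python
from datetime import datetime


def parse_log_timestamp(line: str) -> datetime:
    """Parse the leading 23-character timestamp of a log line.

    nova-compute uses '.' before the milliseconds and the tempest
    output uses ',', so normalise to '.' before parsing.
    """
    stamp = line[:23].replace(",", ".")
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


# Shortened copies of the lines quoted in this comment (request IDs dropped).
nova_line = "2020-11-12 20:53:08.963 7 WARNING os_brick.encryptors.luks isLuks exited abnormally (status 1)"
tempest_line = "2020-11-12 20:52:44,912 253297 WARNING [barbican_tempest_plugin.tests.scenario.barbican_manager]"

delta = parse_log_timestamp(nova_line) - parse_log_timestamp(tempest_line)
print(f"isLuks warning logged {delta.total_seconds():.3f}s after the first tempest log line")
```

This only shows that the warning falls inside the window of the failing test; it does not by itself establish that the warning is the cause.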

Marios Andreou (marios-b) wrote :

Since we are now skipping this test in scenario002 on check/gate and periodic jobs, it will be difficult to debug.

So to make debugging easier I posted a revert at https://review.opendev.org/763359 (includes https://review.opendev.org/762865 and https://review.opendev.org/#/c/763078/) and am testing it with https://review.rdoproject.org/r/31154
We can use this to run on demand with the test enabled, or to test a fix by adding a Depends-On in r/31154.

Marios Andreou (marios-b) wrote :

Confirmed we are still seeing the failure in the test run at https://review.rdoproject.org/r/31154:

        * https://logserver.rdoproject.org/54/31154/1/check/tripleo-ci-centos-8-scenario002-standalone/665b8bb/logs/undercloud/var/log/tempest/stestr_results.html.gz
        * 2020-11-19 13:13:14,598 321649 ERROR [barbican_tempest_plugin.tests.scenario.manager] (VolumeEncryptionTest:test_encrypted_cinder_volumes_luks) Initializing SSH connection to 192.168.24.112 failed. Error: Connection to the 192.168.24.112 via SSH timed out.
        * 2020-11-19 13:13:14.598 321649 ERROR barbican_tempest_plugin.tests.scenario.manager paramiko.ssh_exception.AuthenticationException: Authentication failed.

So you can use https://review.rdoproject.org/r/31154 (with Depends-On https://review.opendev.org/#/c/763359/) for debugging, and re-run it when we think we have a fix.

Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Slawek Kaplonski (slaweq) wrote :

Looking at the logs from the failed job, it seems to me that this is the same issue as in https://bugs.launchpad.net/tripleo/+bug/1906769
Can we mark it as a duplicate of 1906769?

Marios Andreou (marios-b) wrote :

@Slawek let's do that once we confirm? It might well be a duplicate, but we won't know for sure until we see whether whatever fixes the other bug also closes this one.

Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Changed in tripleo:
milestone: xena-1 → xena-2
Changed in tripleo:
milestone: xena-2 → xena-3
Ronelle Landy (rlandy) wrote :

No scenario002 entries remain in the skiplist, so closing this out.

Changed in tripleo:
status: Triaged → Fix Released