neutron-ovn-tempest-full-multinode-ovs-master job is failing 100% of the time

Bug #1886807 reported by Slawek Kaplonski
Affects: neutron
Status: Confirmed
Importance: High
Assigned to: Maciej Jozefczyk
Milestone: (none)
Changed in neutron:
assignee: nobody → Maciej Jozefczyk (maciej.jozefczyk)
Maciej Jozefczyk (maciejjozefczyk) wrote:

The test

neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers

appears to fail every time.

Interestingly, the same test passes with the combination:

OVN_BRANCH: branch-20.03
OVS_BRANCH: branch-2.13

It seems to be a regression caused by a commit recently merged into the OVN master branch.
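
For anyone trying to reproduce the passing combination locally: OVN_BRANCH and OVS_BRANCH are devstack variables (the job builds OVN/OVS from source), so a local.conf reproducer would pin them like this (a sketch, assuming the usual devstack setup for these jobs):

[[local|localrc]]
OVN_BRANCH=branch-20.03
OVS_BRANCH=branch-2.13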

Maciej Jozefczyk (maciejjozefczyk) wrote:

My previous comment is wrong. It looks like a race condition. Full log:
http://paste.openstack.org/show/796117/

There are many errors linked to the failed tests. All of them fail with a similar issue:

Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin [None req-74120644-67fe-4b11-964e-86299ee5407f tempest-RoutersIpV6Test-2067377291 tempest-RoutersIpV6Test-2067377291] Unable to update lrouter for deb1a165-eaaa-4406-aee5-bacc5d106260: sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'standardattributes' expected to update 1 row(s); 0 were matched.
Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin Traceback (most recent call last):
Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin File "/opt/stack/neutron/neutron/services/ovn_l3/plugin.py", line 157, in update_router
Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin original_router)
...
Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/persistence.py", line 1028, in _emit_update_statements
Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin % (table.description, len(records), rows)
Jul 18 00:33:03.396324 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR neutron.services.ovn_l3.plugin sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'standardattributes' expected to update 1 row(s); 0 were matched.

It looks like the main cause is somewhere around this function:

Jul 18 01:09:50.692083 ubuntu-bionic-rax-iad-0018417357 neutron-server[7625]: ERROR oslo_db.api File "/opt/stack/neutron/neutron/services/ovn_l3/plugin.py", line 163, in update_router
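
For context, SQLAlchemy raises exactly this error when a versioned UPDATE matches zero rows, i.e. when a concurrent transaction has already bumped the row's revision counter. Below is a minimal standalone sketch of the mechanism (the table name only mirrors the traceback; Neutron's real revision bumping lives in its revision plugin, so this is just an illustration):

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class StandardAttr(Base):
    __tablename__ = 'standardattributes'
    id = Column(Integer, primary_key=True)
    revision_number = Column(Integer, nullable=False)
    description = Column(String)
    # With a version counter the ORM emits
    # "UPDATE ... WHERE id = :id AND revision_number = :expected"
    # and raises StaleDataError on a zero rowcount.
    __mapper_args__ = {'version_id_col': revision_number}

# File-backed DB so the two sessions below use separate connections.
engine = create_engine('sqlite:///stale_demo.db')
Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

with Session() as s:
    s.add(StandardAttr(id=1, revision_number=1, description='router'))
    s.commit()

s1, s2 = Session(), Session()
r1 = s1.get(StandardAttr, 1)  # both "requests" read revision_number == 1
r2 = s2.get(StandardAttr, 1)
r2.description = 'request B'
s2.commit()                   # bumps revision_number to 2
r1.description = 'request A'
s1.commit()                   # UPDATE ... WHERE revision_number = 1 -> 0 rows:
# sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table
# 'standardattributes' expected to update 1 row(s); 0 were matched.

The oslo_db.api frame at the top of that trace suggests update_router already runs under the DB retry decorator, so the question is rather why the retry machinery does not recover from this.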

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote:

Hello:

I saw that Nova is not spawning the VM correctly [1]. This recurs in all failing test cases [2].

The problem is almost always the same: the test case cannot connect to the VM. E.g.: [3]

This error [2] could be related to [4]. When Nova compute runs a command inside a privsep context, the privsep daemon code calling [5] does not implement the workaround provided in the LP bug.
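
To make that path concrete, the call chain in question has roughly this shape (an illustrative sketch only, not Nova's actual code; Nova defines its real contexts in nova.privsep, and the '__main__' prefix below is only there because oslo.privsep requires entrypoints to live in modules under the context's prefix):

from oslo_concurrency import processutils
from oslo_privsep import capabilities, priv_context

# Illustrative context; Nova's real ones are defined under nova/privsep/.
demo_pctxt = priv_context.PrivContext(
    '__main__',   # entrypoints must live in modules under this prefix
    cfg_section='demo_privsep',
    capabilities=[capabilities.CAP_SYS_ADMIN],
)

@demo_pctxt.entrypoint
def privileged_execute(*cmd):
    # This body runs inside the forked privsep daemon, not in the
    # nova-compute process itself, which is why a workaround applied
    # only on the caller's side of [5] would not take effect here.
    return processutils.execute(*cmd)

# Calling privileged_execute('ip', 'link') would spawn the privsep
# daemon (via sudo privsep-helper), so actually running it needs root.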

I've pushed [6] in an attempt to fix this issue.

Regards.

[1] https://a5e1b7092bd931de9d7c-99461a827f8c9e81159099d5f417814c.ssl.cf1.rackcdn.com/738163/21/check/neutron-ovn-tempest-full-multinode-ovs-master/462d24b/compute1/logs/screen-n-cpu.txt
[2] http://paste.openstack.org/show/798558/
[3] http://paste.openstack.org/show/798570/
[4] https://bugs.launchpad.net/nova/+bug/1863021
[5] https://github.com/openstack/oslo.concurrency/blob/ff1e681656ff67fe5e4a276d9dabd77a29db08cb/oslo_concurrency/processutils.py#L193
[6] https://review.opendev.org/#/c/755256/
