native/idl ovsdb driver loses some ovsdb transactions

Bug #1630920 reported by Bence Romsics
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
networking-bgpvpn
Fix Released
Undecided
Unassigned
neutron
Fix Released
Undecided
Terry Wilson

Bug Description

It seems the 'native' and the 'vsctl' ovsdb drivers behave differently. The native/idl driver seems to lose some ovsdb transactions, at least the transactions setting the 'other_config' ovs port attribute.

I have written about this in a comment of an earlier bug report (https://bugs.launchpad.net/neutron/+bug/1626010). But I opened this new bug report because the two problems seem to be independent and that other comment may have gone unnoticed.

It is not completely clear to me what difference this causes in user-observable behavior. I think it at least leads to losing information about which conntrack zone to use in the openvswitch firewall driver. See here:

https://github.com/openstack/neutron/blob/3ade301/neutron/agent/linux/openvswitch_firewall/firewall.py#L257

The details:

If I use the vsctl ovsdb driver:

ml2_conf.ini:
[ovs]
ovsdb_interface = vsctl

then I see this:

$ > /opt/stack/logs/q-agt.log
$ sudo ovs-vsctl list Port | grep other_config | grep -c net_uuid
1
$ openstack server create --flavor cirros256 --image cirros-0.3.4-x86_64-uec --nic net-id=net0 --wait vm0
$ sudo ovs-vsctl list Port | grep other_config | grep -c net_uuid
2
$ openstack server delete vm0
$ sleep 3
$ sudo ovs-vsctl list Port | grep other_config | grep -c net_uuid
1
$ egrep -c 'Transaction caused no change' /opt/stack/logs/q-agt.log
0

But if I use the (default) native driver:

ml2_conf.ini:
[ovs]
ovsdb_interface = native

Then this happens:

$ > /opt/stack/logs/q-agt.log
$ sudo ovs-vsctl list Port | grep other_config | grep -c net_uuid
1
$ openstack server create --flavor cirros256 --image cirros-0.3.4-x86_64-uec --nic net-id=net0 --wait vm0
$ sudo ovs-vsctl list Port | grep other_config | grep -c net_uuid
1
$ openstack server delete vm0
$ sleep 3
$ sudo ovs-vsctl list Port | grep other_config | grep -c net_uuid
1
$ egrep -c 'Transaction caused no change' /opt/stack/logs/q-agt.log
22

A sample log message from q-agt.log:

2016-10-06 09:23:05.447 DEBUG neutron.agent.ovsdb.impl_idl [-] Running txn command(idx=0): DbSetCommand(table=Port, col_values=(('other_config', {'tag': 1}),), record=tap8e2a390d-63) from (pid=6068) do_commit /opt/stack/neutron/neutron/agent/ovsdb/impl_idl.py:99
2016-10-06 09:23:05.448 DEBUG neutron.agent.ovsdb.impl_idl [-] Transaction caused no change from (pid=6068) do_commit /opt/stack/neutron/neutron/agent/ovsdb/impl_idl.py:126

devstack version: 563d377
neutron version: 3ade301

tags: added: ovs-lib
Revision history for this message
Terry Wilson (otherwiseguy) wrote :

Transaction caused no change log message will never show up on vsctl because it is a native-only log message. So don't use that as a gauge of success when comparing. The caused no change message should only come back from the database when you are committing something that doesn't actually change the database.

Revision history for this message
Bence Romsics (bence-romsics) wrote :

My original reason for opening this ticket was probably better described in the other ticket, let me try to explain better.

One place where we actually use the 'other_config' field is this openvswitch firewall rule:

table=0, priority=90,dl_dst=fa:16:3e:8a:48:37 actions=load:0x4b->NXM_NX_REG5[],load:0xfff->NXM_NX_REG6[],resubmit(,81)

Here we load the value 0xfff into register 6. But this only happens with the native driver. Using the vsctl driver we would store the port's actual vlan tag into reg6. The below code place shows the reason for this difference is the lack of 'other_config':

https://github.com/openstack/neutron/blob/3ade301/neutron/agent/linux/openvswitch_firewall/firewall.py#L257

Here's the same rule (with a different reg6 value) from the ovs firewall devref.
http://docs.openstack.org/developer/neutron/devref/openvswitch_firewall.html:

table=0, priority=90,dl_dst=fa:16:3e:a4:22:10 actions=load:0x1->NXM_NX_REG5[],load:0x284->NXM_NX_REG6[],resubmit(,81)

The registers will be used to set up conntrack zones. So I think if we don't have 'other_config' we may end up using the same conntrack zone for traffic that should be conntracked in different conntrack zones. That's quite an arcane error, but looks real.

Also please notice that some of the 'Transaction caused no change' messages come back in reaction to this:

DbSetCommand(table=Port, col_values=(('other_config', {'tag': 1}),), record=tap8e2a390d-63)

If that DbSetCommand caused no change because other_config was already set before, we still should see other_config filled in the output of:

sudo ovs-vsctl list Port | grep other_config

But in my devstack I never see non-empty other_config with the native driver.

Changed in neutron:
assignee: nobody → Terry Wilson (otherwiseguy)
status: New → In Progress
Revision history for this message
Terry Wilson (otherwiseguy) wrote :

The problem is how set_db_attribute is called. The column type of Port.other_config is {'max': 'unlimited', 'value': 'string', 'key': 'string', 'min': 0}, i.e. a dict with string keys and string values. We are passing in an int segmentation_id and a None physical_network. It just happens to work with ovs-vsctl because there is no type since it is all just CLI strings. The fix is when we store things in other_config/external_ids we have to coerce the values to strings to match the column type.

We could try to add some kind of schema inspection to the ovsdb native implementation, but that's a lot of overhead. Really, the OVS Python library itself could try to coerce things to the correct type itself. But its still going to be fastest/best to just pass the right types from the outset.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/383540

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/383054
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=cb48d15466643b72931ef43085421a9fb1eaf870
Submitter: Jenkins
Branch: master

commit cb48d15466643b72931ef43085421a9fb1eaf870
Author: Terry Wilson <email address hidden>
Date: Mon Oct 3 22:32:42 2016 -0500

    set_db_attribute differs between vsctl and native

    On the following:

    b.set_db_attribute('Port', pname, 'other_config', {'a': 'b'})
    b.set_db_attribute('Port', pname, 'other_config', {'c': 'd'})

    will produce different results between the vsctl and native OVSDB
    implementations. vsctl will merge the values into a single dict
    and native will overwrite the dict.

    This patch makes the native implementation mirror vsctl.

    Related-Bug: #1630920
    Change-Id: Ie7680a301b8b3ee8e5654666e2aea78ecbd07a12

Revision history for this message
Thomas Morin (tmmorin-orange) wrote :

We have hit this bug when setting up a tempest scenario test in networking-bgpvpn [1].
The setup we have is simple: ML2 ovs, a service plugin and an agent extension.

Isn't it surprising that our simple tempest gate setup triggers the bug and none of the neutron tempest test ?

[1] https://bugs.launchpad.net/neutron/+bug/1642388

Revision history for this message
Thomas Morin (tmmorin-orange) wrote :

If I'm correct, the string/int issue is preventing from properly tagging VM ports.

Wouldn't it make sense to tag this bug with Importance Critical ?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/383540
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3d500d36608e83d202c1a6c8438ea5961a7debe1
Submitter: Jenkins
Branch: master

commit 3d500d36608e83d202c1a6c8438ea5961a7debe1
Author: Terry Wilson <email address hidden>
Date: Thu Oct 6 21:52:56 2016 -0500

    Only send string values to OVSDB other_config column

    The other_config columns in OVSDB are defined as maps with string
    keys and string values. The OVS agent was passing an integer
    segmentation id and could pass None as the physical_network.
    Unfortunately, the upstream Python OVS library does not pass the
    exceptions through to us.

    Change-Id: Iafa6be3749b1ee863f5fa71150c708fc46951510
    Closes-Bug: #1630920

Changed in neutron:
status: In Progress → Fix Released
Changed in bgpvpn:
status: New → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/408825

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/408825
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c15aa7ee9797603d812ad160673ef09a3203384f
Submitter: Jenkins
Branch: stable/newton

commit c15aa7ee9797603d812ad160673ef09a3203384f
Author: Terry Wilson <email address hidden>
Date: Thu Oct 6 21:52:56 2016 -0500

    Only send string values to OVSDB other_config column

    The other_config columns in OVSDB are defined as maps with string
    keys and string values. The OVS agent was passing an integer
    segmentation id and could pass None as the physical_network.
    Unfortunately, the upstream Python OVS library does not pass the
    exceptions through to us.

    Change-Id: Iafa6be3749b1ee863f5fa71150c708fc46951510
    Closes-Bug: #1630920
    (cherry picked from commit 3d500d36608e83d202c1a6c8438ea5961a7debe1)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0b2

This issue was fixed in the openstack/neutron 10.0.0.0b2 development milestone.

tags: added: neutron-proactive-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/422026

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/422026
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=121283d42a43c9e98c232f634aeed55b7eedb957
Submitter: Jenkins
Branch: stable/mitaka

commit 121283d42a43c9e98c232f634aeed55b7eedb957
Author: Terry Wilson <email address hidden>
Date: Thu Oct 6 21:52:56 2016 -0500

    Only send string values to OVSDB other_config column

    The other_config columns in OVSDB are defined as maps with string
    keys and string values. The OVS agent was passing an integer
    segmentation id and could pass None as the physical_network.
    Unfortunately, the upstream Python OVS library does not pass the
    exceptions through to us.

    Change-Id: Iafa6be3749b1ee863f5fa71150c708fc46951510
    Closes-Bug: #1630920
    (cherry picked from commit 3d500d36608e83d202c1a6c8438ea5961a7debe1)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.2.0

This issue was fixed in the openstack/neutron 9.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 8.4.0

This issue was fixed in the openstack/neutron 8.4.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.