DBDeadlock trying to update instance_actions.updated_at

Bug #1739658 reported by Matt Riedemann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Matt Riedemann

Bug Description

Seeing this here:

http://logs.openstack.org/58/511358/47/check/legacy-tempest-dsvm-neutron-full/539f670/logs/screen-n-cpu.txt.gz?level=TRACE#_Dec_21_14_57_35_530720

Dec 21 14:57:35.530720 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server [None req-bf61a8c5-88f4-419c-a3c3-4f076b60ec96 tempest-ServerActionsTestJSON-1969878288 tempest-ServerActionsTestJSON-1969878288] Exception during message handling: RemoteError: Remote error: DBDeadlock (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'UPDATE instance_actions SET updated_at=%(updated_at)s WHERE instance_actions.id = %(instance_actions_id)s'] [parameters: {'instance_actions_id': 313, 'updated_at': datetime.datetime(2017, 12, 21, 14, 57, 35, 420626)}]

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22RemoteError%3A%20Remote%20error%3A%20DBDeadlock%20(pymysql.err.InternalError)%20(1213%2C%20u'Deadlock%20found%20when%20trying%20to%20get%20lock%3B%20try%20restarting%20transaction')%20%5BSQL%3A%20u'UPDATE%20instance_actions%20SET%20updated_at%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=10d

This would be a result of the @wrap_instance_event decorator in the compute manager creating an instance action event:

Dec 21 14:57:35.534644 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/utils.py", line 910, in decorated_function
Dec 21 14:57:35.534785 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server with EventReporter(context, event_name, instance_uuid):
Dec 21 14:57:35.534905 ubuntu-xenial-ovh-bhs1-0001558681 nova-compute[15980]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/utils.py", line 881, in __enter__

Logstash shows that this started within the last 10 days, which would have been when this change merged:

https://github.com/openstack/nova/commit/011b36aa93fed0fd2f3b06e9dd5bd27ea9ccbf6e

We have the retry decorator on action_event_finish but not action_event_start because we didn't think we'd need it, but I guess we do.

Tags: db
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/529672

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/529672
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3e3bc32db064de5783d2aab3ebd3977d43d8eacb
Submitter: Zuul
Branch: master

commit 3e3bc32db064de5783d2aab3ebd3977d43d8eacb
Author: Matt Riedemann <email address hidden>
Date: Thu Dec 21 13:41:41 2017 -0500

    Add retry_on_deadlock decorator to action_event_start

    This decorator already exists on action_event_finish
    but we didn't think we'd need it on action_event_start,
    but looking at failures in the logs over the last 10 days
    it's clear we do.

    Change-Id: If11ebc64a3275a76d9ee6b39f1cd00f2ef0413de
    Closes-Bug: #1739658

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b3

This issue was fixed in the openstack/nova 17.0.0.0b3 development milestone.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.