kombu reconnecting needs better exception catching

Bug #888621 reported by Andrea Rosa
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Fix Released | Medium | Andrea Rosa |

Bug Description

NOTE(comstud): Amended this bug report to include another case that needs fixing. Will fix together.

(Reported by Andrea:)
If there are socket issues, e.g. after a restart of the rabbitmq server, declaring a consumer can raise an exception when it tries to declare its queue:
2011-10-31 09:17:19,073 ERROR nova.rpc [-] Exception during message handling
(nova.rpc): TRACE: Traceback (most recent call last):
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 620, in _process_data
(nova.rpc): TRACE: rval = node_func(context=ctxt, **node_args)
(nova.rpc): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 129, in wrapped
(nova.rpc): TRACE: raise Error(str(e))
(nova.rpc): TRACE: Error: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')
(nova.rpc): TRACE:

(Reported by comstud:)
http://paste.openstack.org/show/4231/

Thierry Carrez (ttx) wrote :

What would be the expected behavior?

Changed in nova:
status: New → Incomplete
Andrea Rosa (andrea-rosa-m) wrote :

The expected behavior is that we catch the exception and try to reconnect to the server.
The reconnection process will throw out the broken connection from the connection pool and create a new one.
I think you can assign this bug to me, I am working on it: https://review.openstack.org/#change,1503
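
For illustration, the catch-and-reconnect behavior described above could look roughly like the sketch below. This is not the code from the linked review: `ReconnectingClient`, `ensure`, and the `connect` callable are hypothetical names, and the set of exceptions worth retrying is an assumption that a real implementation would take from its transport library.

```python
import time


class ReconnectingClient(object):
    """Hypothetical wrapper; not nova's actual impl_kombu code."""

    def __init__(self, connect, retry_errors=(IOError,),
                 interval=0.0, max_retries=3):
        self._connect = connect            # callable returning a new connection
        self._retry_errors = retry_errors  # assumed set of connection errors
        self._interval = interval          # seconds to wait between attempts
        self._max_retries = max_retries
        self.connection = connect()

    def reconnect(self):
        # Throw out the broken connection and create a new one.
        try:
            self.connection.close()
        except Exception:
            pass  # the old connection is already unusable
        self.connection = self._connect()

    def ensure(self, method, *args, **kwargs):
        # Call method(connection, ...), reconnecting on connection errors.
        attempt = 0
        while True:
            try:
                return method(self.connection, *args, **kwargs)
            except self._retry_errors:
                attempt += 1
                if attempt > self._max_retries:
                    raise
                time.sleep(self._interval)
                self.reconnect()
```

With something like this, declaring a consumer's queue would go through `client.ensure(declare_fn, ...)`, so a broker restart leads to a retry on a new connection rather than an unhandled error.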

Thierry Carrez (ttx)
Changed in nova:
assignee: nobody → Andrea Rosa (andrea-rosa-m)
importance: Undecided → Medium
status: Incomplete → In Progress
summary: - exception for decalre consumer in the case of socket error
+ exception for declare consumer in the case of socket error
Chris Behrens (cbehrens)
summary: - exception for declare consumer in the case of socket error
+ kombu reconnecting needs better exception catching
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/2973

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/2973
Committed: http://github.com/openstack/nova/commit/59e8ae1362f33ab30b2dc900dcbde30efc5a57c8
Submitter: Jenkins
Branch: master

commit 59e8ae1362f33ab30b2dc900dcbde30efc5a57c8
Author: Chris Behrens <email address hidden>
Date: Wed Jan 11 12:35:42 2012 -0800

    Implement more complete kombu reconnecting

    Fixes bug 888621

    We were missing some wrapping around when consumers are declared and
    a case where we had an exception we weren't trapping. In the latter
    case, it's not easy to trap it because you'd have to bypass the kombu
    interface and import amqplib and try to trap one of its exceptions.
    What I've implemented here looks for 'timeout' in any exception, even
    though I really don't like it. :)

    Fixes HACKING violations while I'm at it.

    Change-Id: I0132fbc4377e221b0a366d0340652147ddb33c87
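
As a rough illustration of the "look for 'timeout' in any exception" approach mentioned in the commit message, a minimal sketch might be the following. The helper names `looks_like_timeout` and `call_tolerating_timeouts` are hypothetical and not taken from the commit; what the caller does once a timeout is detected (for example, reconnecting) is left out here.

```python
def looks_like_timeout(exc):
    """Return True if the exception's message mentions 'timeout'.

    This is a string match, so it is fragile: it depends on the wording
    of the underlying library's error messages.
    """
    return 'timeout' in str(exc)


def call_tolerating_timeouts(func, *args, **kwargs):
    """Call func; swallow timeout-looking errors, re-raise everything else.

    Returns (True, result) on success and (False, None) when a
    timeout-looking exception was caught.
    """
    try:
        return True, func(*args, **kwargs)
    except Exception as exc:
        if not looks_like_timeout(exc):
            raise
        return False, None
```

The upside of matching on the message is that nothing has to be imported from below the kombu interface; the downside is the one the commit message already hints at, namely that any unrelated exception whose text happens to contain "timeout" is treated the same way.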

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → essex-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: essex-3 → 2012.1