nova-scheduler doesn't reconnect to the database when started while the database is down

Bug #1444532 reported by Alejandro Comisario
This bug affects 1 person
Affects                   Status   Importance  Assigned to  Milestone
OpenStack Compute (nova)  Invalid  Undecided   Unassigned
nova (Ubuntu)             Invalid  Undecided   Unassigned

Bug Description

In the Juno release (Ubuntu packages), if nova-scheduler is started while the database is down, the service never reconnects. The stack trace is as follows:

AUDIT nova.service [-] Starting scheduler node (version 2014.2.2)
ERROR nova.openstack.common.threadgroup [-] (OperationalError) (2003, "Can't connect to MySQL server on '10.128.30.11' (111)") None None
TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line 125, in wait
TRACE nova.openstack.common.threadgroup x.wait()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line 47, in wait
TRACE nova.openstack.common.threadgroup return self.thread.wait()
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 173, in wait
TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 293, in switch
TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 212, in main
TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/service.py", line 490, in run_service
TRACE nova.openstack.common.threadgroup service.start()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/service.py", line 169, in start
TRACE nova.openstack.common.threadgroup self.host, self.binary)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 161, in service_get_by_args
TRACE nova.openstack.common.threadgroup binary=binary, topic=None)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 949, in wrapper
TRACE nova.openstack.common.threadgroup return func(*args, **kwargs)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/server.py", line 139, in inner
TRACE nova.openstack.common.threadgroup return func(*args, **kwargs)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 279, in service_get_all_by
TRACE nova.openstack.common.threadgroup result = self.db.service_get_by_args(context, host, binary)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 136, in service_get_by_args
TRACE nova.openstack.common.threadgroup return IMPL.service_get_by_args(context, host, binary)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 125, in wrapper
TRACE nova.openstack.common.threadgroup return f(*args, **kwargs)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 490, in service_get_by_args
TRACE nova.openstack.common.threadgroup result = model_query(context, models.Service).\
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 213, in model_query
TRACE nova.openstack.common.threadgroup session = kwargs.get('session') or get_session(use_slave=use_slave)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 101, in get_session
TRACE nova.openstack.common.threadgroup facade = _create_facade_lazily()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 91, in _create_facade_lazily
TRACE nova.openstack.common.threadgroup _ENGINE_FACADE = db_session.EngineFacade.from_config(CONF)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/session.py", line 795, in from_config
TRACE nova.openstack.common.threadgroup retry_interval=conf.database.retry_interval)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/session.py", line 711, in __init__
TRACE nova.openstack.common.threadgroup **engine_kwargs)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/session.py", line 386, in create_engine
TRACE nova.openstack.common.threadgroup connection_trace=connection_trace
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/utils.py", line 890, in __call__
TRACE nova.openstack.common.threadgroup self._url_from_target(target), target, arg, kw)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/utils.py", line 927, in _dispatch_on
TRACE nova.openstack.common.threadgroup return self._dispatch_on_db_driver(dbname, driver, arg, kw)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/utils.py", line 981, in _dispatch_on_db_driver
TRACE nova.openstack.common.threadgroup if self._invoke_fn(fn, arg, kw) is not None:
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/utils.py", line 930, in _invoke_fn
TRACE nova.openstack.common.threadgroup return fn(*arg, **kw)
TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/session.py", line 456, in _init_events
TRACE nova.openstack.common.threadgroup realmode = engine.execute("SHOW VARIABLES LIKE 'sql_mode'").fetchone()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1751, in execute
TRACE nova.openstack.common.threadgroup connection = self.contextual_connect(close_with_result=True)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1799, in contextual_connect
TRACE nova.openstack.common.threadgroup self.pool.connect(),
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 338, in connect
TRACE nova.openstack.common.threadgroup return _ConnectionFairy._checkout(self)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 641, in _checkout
TRACE nova.openstack.common.threadgroup fairy = _ConnectionRecord.checkout(pool)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 440, in checkout
TRACE nova.openstack.common.threadgroup rec = pool._do_get()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 961, in _do_get
TRACE nova.openstack.common.threadgroup return self._create_connection()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 285, in _create_connection
TRACE nova.openstack.common.threadgroup return _ConnectionRecord(self)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 411, in __init__
TRACE nova.openstack.common.threadgroup self.connection = self.__connect()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 537, in __connect
TRACE nova.openstack.common.threadgroup connection = self.__pool._creator()
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 96, in connect
TRACE nova.openstack.common.threadgroup connection_invalidated=invalidated
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
TRACE nova.openstack.common.threadgroup reraise(type(exception), exception, tb=exc_tb)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 90, in connect
TRACE nova.openstack.common.threadgroup return dialect.connect(*cargs, **cparams)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 377, in connect
TRACE nova.openstack.common.threadgroup return self.dbapi.connect(*cargs, **cparams)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
TRACE nova.openstack.common.threadgroup return Connection(*args, **kwargs)
TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
TRACE nova.openstack.common.threadgroup super(Connection, self).__init__(*args, **kwargs2)
TRACE nova.openstack.common.threadgroup OperationalError: (OperationalError) (2003, "Can't connect to MySQL server on '10.128.30.11' (111)") None None
TRACE nova.openstack.common.threadgroup

This does not happen if nova-scheduler has already connected and the database then goes away: once the database is up again, the service does reconnect. In that case, though, the failing calls are logged from a different part of the code (nova.servicegroup.drivers.db), whereas in the case that never recovers they are logged from nova.openstack.common.threadgroup. The stack trace when reconnection works is as follows:

ERROR nova.servicegroup.drivers.db [-] model server went away
TRACE nova.servicegroup.drivers.db Traceback (most recent call last):
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/servicegroup/drivers/db.py", line 99, in _report_state
TRACE nova.servicegroup.drivers.db service.service_ref, state_catalog)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 180, in service_update
TRACE nova.servicegroup.drivers.db return self._manager.service_update(context, service, values)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 949, in wrapper
TRACE nova.servicegroup.drivers.db return func(*args, **kwargs)
TRACE nova.servicegroup.drivers.db File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/server.py", line 139, in inner
TRACE nova.servicegroup.drivers.db return func(*args, **kwargs)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 320, in service_update
TRACE nova.servicegroup.drivers.db svc = self.db.service_update(context, service['id'], values)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 150, in service_update
TRACE nova.servicegroup.drivers.db return IMPL.service_update(context, service_id, values)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 125, in wrapper
TRACE nova.servicegroup.drivers.db return f(*args, **kwargs)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 181, in wrapped
TRACE nova.servicegroup.drivers.db return f(*args, **kwargs)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 524, in service_update
TRACE nova.servicegroup.drivers.db with_compute_node=False, session=session)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 424, in _service_get
TRACE nova.servicegroup.drivers.db result = query.first()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2341, in first
TRACE nova.servicegroup.drivers.db ret = list(self[0:1])
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2208, in __getitem__
TRACE nova.servicegroup.drivers.db return list(res)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2412, in __iter__
TRACE nova.servicegroup.drivers.db return self._execute_and_instances(context)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2425, in _execute_and_instances
TRACE nova.servicegroup.drivers.db close_with_result=True)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2416, in _connection_from_session
TRACE nova.servicegroup.drivers.db **kw)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 854, in connection
TRACE nova.servicegroup.drivers.db close_with_result=close_with_result)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 858, in _connection_for_bind
TRACE nova.servicegroup.drivers.db return self.transaction._connection_for_bind(engine)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 329, in _connection_for_bind
TRACE nova.servicegroup.drivers.db transaction = conn.begin()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 420, in begin
TRACE nova.servicegroup.drivers.db self.__transaction = RootTransaction(self)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1357, in __init__
TRACE nova.servicegroup.drivers.db self.connection._begin_impl(self)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 490, in _begin_impl
TRACE nova.servicegroup.drivers.db self.dispatch.begin(self)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/event/attr.py", line 260, in __call__
TRACE nova.servicegroup.drivers.db fn(*args, **kw)
TRACE nova.servicegroup.drivers.db File "/usr/local/lib/python2.7/dist-packages/oslo/db/sqlalchemy/session.py", line 331, in _begin_ping_listener
TRACE nova.servicegroup.drivers.db connection.scalar(select([1]))
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 659, in scalar
TRACE nova.servicegroup.drivers.db return self.execute(object, *multiparams, **params).scalar()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 729, in execute
TRACE nova.servicegroup.drivers.db return meth(self, multiparams, params)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 321, in _execute_on_connection
TRACE nova.servicegroup.drivers.db return connection._execute_clauseelement(self, multiparams, params)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
TRACE nova.servicegroup.drivers.db compiled_sql, distilled_params
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 893, in _execute_context
TRACE nova.servicegroup.drivers.db None, None)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1169, in _handle_dbapi_exception
TRACE nova.servicegroup.drivers.db dbapi_conn_wrapper = self.connection
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 236, in connection
TRACE nova.servicegroup.drivers.db return self._revalidate_connection()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 244, in _revalidate_connection
TRACE nova.servicegroup.drivers.db self.__connection = self.engine.raw_connection()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1848, in raw_connection
TRACE nova.servicegroup.drivers.db return self.pool.unique_connection()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 280, in unique_connection
TRACE nova.servicegroup.drivers.db return _ConnectionFairy._checkout(self)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 641, in _checkout
TRACE nova.servicegroup.drivers.db fairy = _ConnectionRecord.checkout(pool)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 442, in checkout
TRACE nova.servicegroup.drivers.db dbapi_connection = rec.get_connection()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 505, in get_connection
TRACE nova.servicegroup.drivers.db self.connection = self.__connect()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 537, in __connect
TRACE nova.servicegroup.drivers.db connection = self.__pool._creator()
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 96, in connect
TRACE nova.servicegroup.drivers.db connection_invalidated=invalidated
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
TRACE nova.servicegroup.drivers.db reraise(type(exception), exception, tb=exc_tb)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 90, in connect
TRACE nova.servicegroup.drivers.db return dialect.connect(*cargs, **cparams)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 377, in connect
TRACE nova.servicegroup.drivers.db return self.dbapi.connect(*cargs, **cparams)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
TRACE nova.servicegroup.drivers.db return Connection(*args, **kwargs)
TRACE nova.servicegroup.drivers.db File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
TRACE nova.servicegroup.drivers.db super(Connection, self).__init__(*args, **kwargs2)
TRACE nova.servicegroup.drivers.db OperationalError: (OperationalError) (2003, "Can't connect to MySQL server on '10.128.30.11' (111)") None None
TRACE nova.servicegroup.drivers.db
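
To make the difference between the two traces concrete, here is a minimal sketch, assuming the startup lookup has no error handling while the periodic state report logs the error and tries again on its next tick. FakeDatabase, start_service and report_state are hypothetical names for illustration only, not nova's actual code:

```python
class FakeDatabase(object):
    """Stand-in for MySQL; `up` toggles whether connections succeed."""

    def __init__(self):
        self.up = False

    def connect(self):
        if not self.up:
            raise ConnectionError("Can't connect to MySQL server")
        return "connection"


def start_service(db):
    # One-shot startup lookup: nothing catches the error, so it
    # propagates and ends the thread that was starting the service.
    return db.connect()


def report_state(db, log):
    # Periodic heartbeat: the error is logged and swallowed, so the
    # next tick simply tries again.
    try:
        db.connect()
        log.append("ok")
    except ConnectionError:
        log.append("model server went away")
```

Under that assumption, a failure in the heartbeat path recovers by itself once the database is back, while a failure in the startup path is never retried.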

affects: ubuntu → nova (Ubuntu)
zhaobo (zhaobo6) wrote :

I cannot reproduce the error you mentioned. Could you give more details about your environment?
I tried to reproduce it, but I did not see the service fail to reconnect to MySQL.

Alejandro Comisario (alejandro-f) wrote :

Thanks for the reply!
Basically, what I do is shut down the database and restart nova-scheduler.
The scheduler logs show that it cannot connect to the database. Then start the database and try to launch a VM: the scheduler still says it cannot connect, even though the database is already up.

Tell me if you can reproduce it.

Sylvain Bauza (sylvain-bauza) wrote :

Not sure it's Nova-related, since the connection is managed by MySQLdb here.
This all comes down to the configuration passed to SQLAlchemy (SQLA), as described in http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html

Please check the "connection" option in your nova.conf file, since that is the name of the option.
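
For illustration only, a minimal sketch of what retrying the first connection attempt could look like. The helper connect_with_retry and its parameters are hypothetical; they only borrow the name retry_interval that appears in the first traceback, and this is not the code oslo.db or SQLAlchemy actually runs:

```python
import time


def connect_with_retry(connect, max_retries=10, retry_interval=10,
                       sleep=time.sleep):
    """Call connect() until it succeeds or the retries run out.

    max_retries=-1 means "retry forever" (a convention assumed here).
    """
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            # Give up once the allowed number of retries is used up.
            if max_retries != -1 and attempt >= max_retries:
                raise
            attempt += 1
            sleep(retry_interval)
```

With something like this around the initial connection, a service started while the database is down would keep trying instead of failing once at startup.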

Changed in nova:
status: New → Invalid
Chuck Short (zulcss)
Changed in nova (Ubuntu):
status: New → Invalid