nova-network sometimes crashes with bad state

Bug #719004 reported by Narayan Desai
This bug affects 2 people
Affects                   Status   Importance  Assigned to  Milestone
OpenStack Compute (nova)  Invalid  Medium      Unassigned

Bug Description

We've run into a problem with nova-network (both with bzr655 and bzr669) where nova-network crashes with the following traceback. Upstart unhelpfully restarts it, which results in it dying again. I've put in a workaround that traps this error and skips the entry, which seems to right the system once it works through the rabbitmq backlog that has built up. (In our case, the backlog was 90K events, only half of which have been processed over the last 90 minutes.)

(nova.root): TRACE: Traceback (most recent call last):
(nova.root): TRACE: File "/usr/bin/nova-network", line 44, in <module>
(nova.root): TRACE: service.serve()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 231, in serve
(nova.root): TRACE: x.start()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 81, in start
(nova.root): TRACE: self.manager.init_host()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 470, in init_host
(nova.root): TRACE: super(VlanManager, self).init_host()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 129, in init_host
(nova.root): TRACE: self._on_set_network_host(ctxt, network['id'])
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 579, in _on_set_network_host
(nova.root): TRACE: self.driver.update_dhcp(context, network_id)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 296, in update_dhcp
(nova.root): TRACE: f.write(get_dhcp_hosts(context, network_id))
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 279, in get_dhcp_hosts
(nova.root): TRACE: hosts.append(_host_dhcp(fixed_ip_ref))
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 369, in _host_dhcp
(nova.root): TRACE: return "%s,%s.%s,%s" % (instance_ref['mac_address'],
(nova.root): TRACE: TypeError: 'NoneType' object is unsubscriptable
(nova.root): TRACE:

Revision history for this message
Kost (kost-isi) wrote :

Hi,

I am also seeing this error running an ubuntu image or the ttylinux image using FlatDHCP...

2011-02-15 16:49:52,956 ERROR nova.root [-] Exception during message handling
(nova.root): TRACE: Traceback (most recent call last):
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 192, in receive
(nova.root): TRACE: rval = node_func(context=ctxt, **node_args)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 418, in allocate_fixed_ip
(nova.root): TRACE: self.driver.update_dhcp(context, network_ref['id'])
(nova.root): TRACE: TypeError: 'NoneType' object is unsubscriptable
(nova.root): TRACE:
2011-02-15 16:49:52,957 ERROR nova.rpc [-] Returning exception 'NoneType' object is unsubscriptable to caller
2011-02-15 16:49:52,957 ERROR nova.rpc [-] ['Traceback (most recent call last):\n', ' File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 192, in receive\n rval = node_func(context=ctxt, **node_args)\n', ' File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 418, in allocate_fixed_ip\n self.driver.update_dhcp(context, network_ref[\'id\'])\n', "TypeError: 'NoneType' object is unsubscriptable\n"]

Revision history for this message
Thierry Carrez (ttx) wrote :

Apparently in some cases db.network_get_associated_fixed_ips returns a FixedIp with fixed_ip_ref['instance']=None... but I don't understand this code enough to tell if that's a normal use case that should be supported in get_dhcp_hosts, or if we should find the root cause.
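
If a fixed IP with no instance turns out to be a state that get_dhcp_hosts should tolerate, the "trap the error and skip the entry" workaround from the description could look roughly like the sketch below. This is not the nova code: it uses plain dicts in place of the DB objects, and the 'hostname' and 'address' keys and the domain argument are assumptions made for illustration. Only 'instance' and 'mac_address' appear in the traceback above.

```python
def host_dhcp(fixed_ip_ref, domain="novalocal"):
    """Format one DHCP hosts entry, or return None if no instance is attached.

    'hostname', 'address' and the domain default are assumed names,
    not taken from the traceback.
    """
    instance_ref = fixed_ip_ref.get("instance")
    if instance_ref is None:
        # This is the case that raised the TypeError: skip it instead.
        return None
    return "%s,%s.%s,%s" % (
        instance_ref["mac_address"],
        instance_ref["hostname"],
        domain,
        fixed_ip_ref["address"],
    )


def get_dhcp_hosts(fixed_ip_refs):
    """Join the entries for all fixed IPs that still have an instance."""
    entries = (host_dhcp(ref) for ref in fixed_ip_refs)
    return "\n".join(entry for entry in entries if entry is not None)
```

Skipping the entry only hides the symptom. Whether the row should exist at all is still the open root-cause question.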

Changed in nova:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
Edina Varga (edina-varga) wrote :

I have a similar error:

(nova.root): TRACE: Traceback (most recent call last):
(nova.root): TRACE: File "/usr/bin/nova-network", line 44, in <module>
(nova.root): TRACE: service.serve()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 231, in serve
(nova.root): TRACE: x.start()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 81, in start
(nova.root): TRACE: self.manager.init_host()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 467, in init_host
(nova.root): TRACE: super(VlanManager, self).init_host()
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 125, in init_host
(nova.root): TRACE: self._on_set_network_host(ctxt, network['id'])
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 568, in _on_set_network_host
(nova.root): TRACE: network_ref)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 169, in ensure_vlan_bridge
(nova.root): TRACE: interface = ensure_vlan(vlan_num)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 178, in ensure_vlan
(nova.root): TRACE: _execute("sudo vconfig set_name_type VLAN_PLUS_VID_NO_PAD")
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 327, in _execute
(nova.root): TRACE: return utils.execute(cmd, *args, **kwargs)
(nova.root): TRACE: File "/usr/lib/pymodules/python2.6/nova/utils.py", line 147, in execute
(nova.root): TRACE: cmd=cmd)
(nova.root): TRACE: ProcessExecutionError: Unexpected error while running command.
(nova.root): TRACE: Command: sudo vconfig set_name_type VLAN_PLUS_VID_NO_PAD
(nova.root): TRACE: Exit code: 1
(nova.root): TRACE: Stdout: ''
(nova.root): TRACE: Stderr: 'sudo: no tty present and no askpass program specified\n'

Revision history for this message
Masanori Itoh (itohm) wrote :

Hi Edina, folks,

I think the issue Edina reported is different from the original issue by Narayan and Kost.
If 'requiretty' is set in /etc/sudoers and nova-network was started from an init script at boot time,
we'll see the message Edina got:

  "(nova.root): TRACE: Stderr: 'sudo: no tty present and no askpass program"

So, could you check your /etc/sudoers, Edina?

from man sudoers
       requiretty If set, sudo will only run when the user is logged in
                       to a real tty. When this flag is set, sudo can only be
                       run from a login session and not via other means such
                       as cron(8) or cgi-bin scripts. This flag is off by
                       default.
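
A quick way to check for the flag is to scan the sudoers text for Defaults lines that enable it. The sketch below is a rough text match, not a full sudoers parser: the function name is made up for this example, and it does not follow #include directives or files under /etc/sudoers.d.

```python
import re


def requiretty_lines(sudoers_text):
    """Return (line_number, line) pairs for Defaults lines enabling requiretty."""
    hits = []
    for number, line in enumerate(sudoers_text.splitlines(), start=1):
        stripped = line.strip()
        # Comment lines start with "#", so they never match "Defaults".
        if not stripped.startswith("Defaults"):
            continue
        # Ignore the negated form "!requiretty", which turns the flag off.
        if re.search(r"(?<![!\w])requiretty\b", stripped):
            hits.append((number, stripped))
    return hits
```

Reading /etc/sudoers normally requires root, so the text has to be obtained with suitable privileges before passing it in.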

Revision history for this message
Thierry Carrez (ttx) wrote :

Yes, the issue from Edina should be split to another bug, since it's not the same issue.

Revision history for this message
Masanori Itoh (itohm) wrote :

Hi,

BTW, are these issues (Narayan's and Kost's) still reproducible in trunk?
If so, I would like to know more.

In particular, I'm wondering whether 'nova-manage create network' was executed
successfully before starting nova-network and, if so, with what fixed_range, num_networks and network_size.
The output of 'nova-manage network list' and 'nova-manage fixed list' would also help, assuming the issue is reproducible
using the trunk PPA...

Revision history for this message
Narayan Desai (narayan-desai) wrote : Re: [Bug 719004] Re: nova-network crashes with bad data

I never had the ability to reproduce this bug on demand. It appeared
that there was some bad content in either the network database or
network-bound rabbitmq messages that would cause the code to
traceback, that would eventually flush itself out.

I'm in the process of upgrading our version of nova now; that should
be done in the next few days hopefully. We will see if it still occurs
then. Though, even in that case, I wouldn't necessarily be convinced
that the issue was gone.
 -nld

Thierry Carrez (ttx)
Changed in nova:
importance: High → Undecided
status: Confirmed → Incomplete
Revision history for this message
Mark Nelson (mark-msi) wrote : Re: nova-network crashes with bad data

I just encountered this earlier tonight. It definitely is related to data in the networks table and fixed_ips not matching correctly. I've been screwing around with different network managers and deleting and recreating networks and this was the end result.

To fix it, I deleted everything from the networks and fixed_ips tables and then recreated the network with nova-manage and everything was fine.
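
Before wiping the tables, it may be worth listing which rows are inconsistent. The sketch below runs against a throwaway SQLite schema. The table names come from this thread, but the column names (network_id, instance_id, address), the instances table and the function name are assumptions, so the query would need adapting to the real schema.

```python
import sqlite3


def orphaned_fixed_ips(conn):
    """Return (id, address) for fixed IPs pointing at a missing network or instance."""
    query = """
        SELECT f.id, f.address
        FROM fixed_ips AS f
        LEFT JOIN networks AS n ON n.id = f.network_id
        LEFT JOIN instances AS i ON i.id = f.instance_id
        WHERE n.id IS NULL
           OR (f.instance_id IS NOT NULL AND i.id IS NULL)
        ORDER BY f.id
    """
    return conn.execute(query).fetchall()


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE networks (id INTEGER PRIMARY KEY);
        CREATE TABLE instances (id INTEGER PRIMARY KEY);
        CREATE TABLE fixed_ips (
            id INTEGER PRIMARY KEY,
            address TEXT,
            network_id INTEGER,
            instance_id INTEGER
        );
        INSERT INTO networks VALUES (1);
        INSERT INTO instances VALUES (10);
        INSERT INTO fixed_ips VALUES (1, '10.0.0.2', 1, 10);   -- consistent
        INSERT INTO fixed_ips VALUES (2, '10.0.0.3', 1, 99);   -- instance missing
        INSERT INTO fixed_ips VALUES (3, '10.0.0.4', 2, NULL); -- network missing
        INSERT INTO fixed_ips VALUES (4, '10.0.0.5', 1, NULL); -- unallocated
    """)
    return conn
```

With the demo data, orphaned_fixed_ips(build_demo_db()) reports rows 2 and 3 and leaves the unallocated address alone.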

Mark

Revision history for this message
Narayan Desai (narayan-desai) wrote : Re: [Bug 719004] Re: nova-network crashes with bad data

That is interesting; when I had the problem, I added exception
handling for the error (and ignored it) and eventually the system
righted itself. I got the impression that there was some bad temporary
state.
 -nld

Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
status: Incomplete → Confirmed
summary: - nova-network crashes with bad data
+ nova-network sometimes crashes with bad state
Revision history for this message
Brian Lamar (blamar) wrote :

I feel like this bug report is too generic. Are there still situations where nova-network crashes? If so, we need to create a bug for each individual case and not for the general case. I'm going through bugs looking for things to fix right now, and I'm not certain this one is relevant anymore. Can anyone provide recent tracebacks with the specific situations in which nova-network crashes?

Revision history for this message
yong sheng gong (gongysh) wrote :

Regarding 'sudo: no tty present and no askpass program specified\n':

It seems you are not running as the root account. Devstack's solution is to run:

( umask 226 && echo "stack ALL=(ALL) NOPASSWD:ALL" \
        > /etc/sudoers.d/50_stack_sh )

Revision history for this message
Tom Fifield (fifieldt) wrote :

As this bug is generic-ish and relates to quite old code, I propose marking it as 'invalid' and encouraging the poster to submit a new bug with the specific errors seen on newer code.

Changed in nova:
status: Confirmed → Invalid