DNS stops during install

Bug #993666 reported by Duncan McGreggor
This bug affects 6 people
Affects: devstack
Status: Fix Released
Importance: Low
Assigned to: Davanum Srinivas (DIMS)

Bug Description

Today, I was trying to get the latest devstack running on a VM (Ubuntu 12.04, updated today, running on Mac OS X 10.6.8 using VirtualBox 4.1.14 r77440).

When devstack got to the point indicated in the full-res image linked below (somewhere between 'add_nova_opt vncserver_listen=127.0.0.1' and 'nova-manage db sync'), I lost DNS.

As a result, later parts of the install fail (e.g., downloading the image file).

This happens regardless of whether my VM network is running in bridged or NATed mode.

In order to narrow things down, I was running a ping to Google, and as soon as I saw that the hostname stopped resolving, I hit ^c. I wasn't able to tell exactly at which point DNS failed, due to the low time resolution and the undetermined reflex response time of my fingers :-)

Full resolution screenshot:
  http://www.flickr.com/photos/oubiwann/7137334925/sizes/o/in/photostream/
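A finer-grained way to catch the moment resolution breaks is to log a timestamped probe from a second terminal instead of watching ping. This is only a sketch: the function name dns_probe is made up for illustration, and it assumes getent and GNU date (for the %N nanosecond field) are available, as they are on a stock Ubuntu install.

```shell
# Hypothetical helper (not part of devstack): print one timestamped line
# saying whether the given host name currently resolves.
dns_probe() {
    host=${1:-launchpad.net}
    # getent goes through the system resolver, so it follows /etc/resolv.conf.
    if getent hosts "$host" >/dev/null 2>&1; then
        status=OK
    else
        status=FAIL
    fi
    # %N (nanoseconds) is a GNU date extension; this is an assumption.
    printf '%s %s %s\n' "$(date '+%H:%M:%S.%N')" "$host" "$status"
}

# Example loop, run in a second terminal while ./stack.sh is running:
# while true; do dns_probe launchpad.net; sleep 0.2; done
```

The first FAIL line can then be matched against the trace output of stack.sh.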

Duncan McGreggor (oubiwann) wrote :

To get a better sense of what was going on, I added a "read" at several places to pause the ./stack.sh script.

With the first one, I hit the jackpot:

+ add_nova_opt rabbit_password=0a88a309651cf87fedee
+ echo rabbit_password=0a88a309651cf87fedee
+ add_nova_opt glance_api_servers=10.0.2.15:9292
+ echo glance_api_servers=10.0.2.15:9292
+ add_nova_opt force_dhcp_release=True
+ echo force_dhcp_release=True
+ '[' -n '' ']'
+ '[' False '!=' False ']'
+ '[' False '!=' False ']'
+ '[' True '!=' True ']'
+ [[ -z '' ]]
+ [[ -n '' ]]

The read stopped the script right after 'force_dhcp_release' and before the XenServer checks.

From that point (with the script "paused"), I was able to restart the network and then resume the script, resulting in a complete, error-free install.
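The pause technique described above can be sketched as a small helper that is dropped into the script at the points of interest. The name pause_point and the STACK_PAUSE variable are hypothetical and are not part of devstack; the original debugging simply used a bare "read".

```shell
# Hypothetical helper: when STACK_PAUSE=1, announce where the script is
# and wait for Enter; otherwise do nothing.
pause_point() {
    [ "${STACK_PAUSE:-0}" = "1" ] || return 0
    printf 'paused at %s; press Enter to continue\n' "$1" >&2
    # Blocks until a line arrives on stdin.
    read -r pause_reply
}

# Example placement inside the script being debugged:
# pause_point "after force_dhcp_release"
```

While the script is paused, the network can be checked or restarted from another terminal before pressing Enter.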

Dean Troyer (dtroyer) wrote :

Is this an ongoing problem? I don't see it in a similar configuration (VB 4.1.14, OS/X 10.7, Ubuntu 12.04).

Changed in devstack:
status: New → Incomplete

Salvatore Orlando (salvatore-orlando) wrote :

The command "killall dnsmasq || true" kills not just the dnsmasq instances created by nova-network but also the one that NetworkManager uses as a proxy to the DNS server.
This is why DNS stops working, IMHO.
I have proposed a fix that ensures that no dnsmasq process being killed has NetworkManager as its PPID.

However, I reckon it should be up to unstack.sh to destroy dangling dnsmasq instances. If you agree with the latter statement, we can either reject the patch and propose another one that works exclusively on unstack.sh, or create a different bug report.

Changeset available for review at: https://review.openstack.org/#/c/8730
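The idea of sparing the dnsmasq that belongs to NetworkManager can be illustrated roughly as follows. This is not the code from the changeset; select_dnsmasq_pids is a hypothetical name, and the sketch assumes the process list arrives as "PID PPID COMMAND" lines, which is the shape produced by procps "ps -eo pid=,ppid=,comm=".

```shell
# Hypothetical helper: read "PID PPID COMMAND" lines on stdin and print the
# PIDs of dnsmasq processes whose parent is NOT NetworkManager.
select_dnsmasq_pids() {
    awk '
        { comm[$1] = $3; ppid[$1] = $2 }
        END {
            for (pid in comm) {
                if (comm[pid] != "dnsmasq") continue
                p = ppid[pid]
                # Skip the instance that NetworkManager started.
                if ((p in comm) && comm[p] == "NetworkManager") continue
                print pid
            }
        }
    ' | sort -n
}

# Possible use in place of "killall dnsmasq || true" (assumes GNU xargs):
# ps -eo pid=,ppid=,comm= | select_dnsmasq_pids | xargs -r sudo kill
```

Keeping the selection separate from the kill makes it possible to inspect the list of PIDs before anything is terminated.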

Changed in devstack:
status: Incomplete → Confirmed
assignee: nobody → Salvatore Orlando (salvatore-orlando)
Changed in devstack:
status: Confirmed → In Progress

Dean Troyer (dtroyer) wrote :

stack.sh generally cleans up certain resources as it runs, so this is appropriate to have there. unstack.sh should also clean up after killing the OpenStack processes. Typically for non-trivial bits of code I'd create a function and put it in the 'functions' file; for this I could go either way.

Changed in devstack:
importance: Undecided → Low

Duncan McGreggor (oubiwann) wrote : Re: [Bug 993666] Re: DNS stops during install

On Thu, May 31, 2012 at 2:12 PM, Dean Troyer <email address hidden> wrote:
> Is this an ongoing problem?  I don't see it in a similar configuration

Sorry, Dean -- missed this message. Yes, it's an on-going issue.

Salvatore Orlando (salvatore-orlando) wrote :

Dean, I reckon a function is always the best approach, no matter whether the code is trivial or not.
I will update and resubmit the patch.

Paul Maunders (sq2d3bipy0t2o-paul) wrote :

I also encountered this problem today with a fresh install of Ubuntu Precise, following the devstack quick start instructions. The script failed while it was trying to download the CirrOS UEC images.

--2012-07-20 18:16:34-- http://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-uec.tar.gz
Resolving launchpad.net (launchpad.net)... failed: Name or service not known.
wget: unable to resolve host address `launchpad.net'

I could see that all my nameservers had been removed from resolv.conf:

$ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.0.1

Did the patch ever get merged into the master branch?

Can anyone suggest a workaround?
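The symptom shown above, a resolv.conf whose only nameserver is the loopback address, can be detected with a check along these lines. The function name is hypothetical, and treating any 127.x address as "local forwarder only" is an assumption made for illustration.

```shell
# Hypothetical helper: succeed when the file has at least one nameserver
# entry and every entry points at a loopback address (127.x.x.x), meaning
# resolution depends entirely on a local forwarder such as dnsmasq.
only_loopback_nameservers() {
    awk '
        $1 == "nameserver" { total++; if ($2 ~ /^127\./) loopback++ }
        END { exit !(total > 0 && total == loopback) }
    ' "${1:-/etc/resolv.conf}"
}

# Example:
# only_loopback_nameservers && echo "DNS depends on the local dnsmasq"
```

If the check succeeds and no dnsmasq process is running, name resolution will fail in the way shown in the wget output.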

Salvatore Orlando (salvatore-orlando) wrote :

My bad.
I totally forgot about this patch.
I won't have time for fixing and resubmitting it today, so if you want to
go ahead feel free to do that!

Dean had some fairly easy comments. Also the patch definitely needs a
rebase.

Salvatore


Paul Maunders (sq2d3bipy0t2o-paul) wrote :

I was using the Desktop version of Precise, so I've now re-installed via a minimal install and it worked fine.

Salvatore Orlando (salvatore-orlando) wrote :

I suspect that's because Ubuntu minimal probably does not use
NetworkManager.

Salvatore


Vikas Deolaliker (vikasd) wrote :

The workaround is to append a line of the form "nameserver <your nameserver IP address>" to /etc/resolv.conf at the point where the script begins to download the image.
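A minimal sketch of that workaround is below. The function name add_nameserver is made up, and 192.0.2.53 in the example is a documentation placeholder rather than a real resolver. Note that on systems where resolvconf generates /etc/resolv.conf, a manual entry may be overwritten later, as the header of that file warns.

```shell
# Hypothetical helper: append "nameserver <ip>" to a resolv.conf-style file
# unless that exact line is already present.
add_nameserver() {
    ip=$1
    file=${2:-/etc/resolv.conf}
    grep -qxF "nameserver $ip" "$file" 2>/dev/null ||
        printf 'nameserver %s\n' "$ip" >> "$file"
}

# Example (run as root, substituting the address of a reachable resolver):
# add_nameserver 192.0.2.53 /etc/resolv.conf
```

Checking for the line first keeps repeated runs from adding duplicate entries.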

Richard Lincoln (r-w-lincoln) wrote :

FWIW, I worked around this with:

$ screen -x stack
$ screen -ls
$ screen -X -S 3917.stack quit
$ wget -c http://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-uec.tar.gz -O /home/rwl/tmp/devstack/files/cirros-0.3.0-x86_64-uec.tar.gz
$ ./stack.sh

I then had to disconnect and reconnect my network with NM.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to devstack (master)

Fix proposed to branch: master
Review: https://review.openstack.org/20659

Changed in devstack:
assignee: Salvatore Orlando (salvatore-orlando) → Davanum Srinivas (DIMS) (dims-v)

OpenStack Infra (hudson-openstack) wrote : Fix merged to devstack (master)

Reviewed: https://review.openstack.org/20659
Committed: http://github.com/openstack-dev/devstack/commit/d71d6e71b37d97e3fd4922608ae41f9ff53bc4d0
Submitter: Jenkins
Branch: master

commit d71d6e71b37d97e3fd4922608ae41f9ff53bc4d0
Author: Davanum Srinivas <email address hidden>
Date: Mon Jan 28 19:15:57 2013 -0500

    Dns stops working on precise when network manager is enabled

    In Precise and Quantal, we nuke the dnsmasq launched by NetworkManager

    Fixes LP# 993666

    Change-Id: I4b39010765e2cbbea1ca3fc3120bf329015b7a56

Changed in devstack:
status: In Progress → Fix Released