nis doesn't work anymore after upgrade to 12.04

Bug #1007293 reported by alex jordan
This bug affects 8 people
Affects           Status        Importance  Assigned to     Milestone
rpcbind (Ubuntu)  Fix Released  High        Steve Langasek
Precise           Fix Released  High        Steve Langasek
Quantal           Fix Released  High        Steve Langasek

Bug Description

[Impact]
On a system which uses both NIS and NFS, a boot-time NFS mount can trigger portmap to be started before the local filesystem is mounted, resulting in ypbind also starting prematurely. The portmap job needs to set an environment variable so ypbind knows not to start yet.

[Test Case]
1. Configure a system with NFS mounts that are mounted at boot time and which is configured to use a NIS domain.
2. Boot the system with the rpcbind package in precise-updates. Verify that ypbind does not start up correctly.
3. Upgrade to rpcbind from precise-proposed.
4. Reboot the system and verify that ypbind consistently starts up correctly.

[Regression Potential]
None that I can see. The only change is to export a variable to the upstart event which was always supposed to be there. In theory this could cause some jobs to now start *later* than expected, but I'm only aware of NFS and NIS jobs that are affected by portmap.

I have been managing about 20 Ubuntu computers since release 5.04 and have the strong feeling that problems are increasing rapidly from one distribution upgrade to another. None of my LTS version upgrades works out of the box today. The main problem I have is upgrading or installing the nis package. I cannot log in anymore after the upgrade to 12.04 (neither from the last LTS version nor from version 11.10). After a complete uninstallation of the nis and network-manager packages nis works more or less, but I have to restart it manually after every reboot.
I also tried a completely new installation, but nis doesn't work there either. After I installed the NIS package on a fresh Ubuntu system, I cannot log in anymore. The computers boot, but no graphical desktop manager comes up, only a console with a login prompt. And after login, no shell comes up.

And this happens one month after a new LTS release. I am not amused, and I am now testing and thinking about a change to Fedora Linux. I have the feeling that Ubuntu developers focus only on single-computer installations, disregard networking abilities and have a lot of problems with upstart, which I still do not really understand.

:<
Alex

Naveen Kumard (naveensheoran-munna) wrote :

This should help you; do the following.

Go to a console (Ctrl+Alt+F1 ... F6) and run the command

rm -rf .gnome .gnome2 .gconf .gconfd .metacity

After this, restart your system.

Try this command.

alex jordan (urgretl) wrote :

OK,
I'm speaking now only of a newly installed Ubuntu 12.04 LTS system:

NIS is not working after reboot; on my other machines (Ubuntu 10.04-11.10 and Debian 6) it works very well.
Your hint does not have any positive effect and I don't understand the logic behind it. Why should I remove configuration files in my home directory if I'm not able to log in? I have to restart nis manually, then it works.

Maybe this bug has more side effects, because:

automounts are not working either!

After some hours of tearing my hair out:
....
removing network-manager and editing /etc/network/interfaces: no success, network response is very slow (e.g. ping).
reinstalling network-manager and reconfiguring it from scratch, commenting out everything in /etc/network/interfaces: network response is OK again, nis is working after reboot!!! What's going on?

But

autofs is still not running: I give up for today and create static mounts.

No hints, no messages, just not working.

Clint Byrum (clint-fewbar) wrote :

alex, this is somewhat surprising to hear, as up until 12.04, NIS was actually quite broken on bootup for most users due to many race conditions in the way it started up. In 12.04, the startup was moved to being managed by upstart jobs, and this fixed the problem for many users (in fact we have updates pending for 10.04 to fix this for those users).

Could you perhaps create a tarball of /etc/init from a machine that is affected, and attach it to this bug report? Try:

tar -cz -f init.tar.gz -C /etc init

Thanks!

Also if you can run this command on an affected machine, it will help us verify the versions of software you have:

apport-collect 1007293

(Marking as High Importance, Incomplete).

Changed in nis (Ubuntu):
status: New → Incomplete
importance: Undecided → High
Steve Langasek (vorlon) wrote :

hi alex,

It would also be helpful to see the output of this command as root:

 initctl list | grep yp

and to have the contents of /etc/default/nis attached.

alex jordan (urgretl) wrote :

Hi Steve, hi Clint,

thanks for the answer and sorry for the delay:

# initctl list | grep yp
ypbind start/running, process 1256
ypserv stop/waiting
ypxfrd stop/waiting
start-ypbind stop/waiting
yppasswdd stop/waiting

Best regards
Alex

alex jordan (urgretl) wrote :

Oh sorry, I forgot the content of /etc/default/nis

Alex

vagk (vagk-p) wrote :

I am having similar issues concerning nis after upgrading to ubuntu 12.04

ypbind does not start on boot.

The status command does not show a running process

  service ypbind status
  ypbind start/running

nor

  ps -A | grep ypbind

I inserted
   exec > /var/log/nis-debug.log 2>&1
in /etc/init/ypbind.conf after script line

and the output in nis-debug.log was
  No NIS server and no -broadcast option specified.
  Add a NIS server to the /etc/yp.conf configuration file,
  or start ypbind with the -broadcast option.

my /etc/yp.conf is
  ypserver nisserver
(nisserver is resolved by DNS fine; I tried using the IP instead, but the problem seems to be elsewhere)

I don't use network-manager (purged) and my /etc/network/interfaces file is the following
  allow-hotplug lo eth0
  auto lo eth0
  iface lo inet loopback
  iface eth0 inet dhcp

The problem seems to be that when ypbind is initially started networking is not up yet

For now I execute manually after each boot the following command (I will probably add it at /etc/rc.local although this is ugly)
  service ypbind restart

Some lines from /var/log/syslog showing that ypbind is started before networking and although it is respawning it does not succeed

Jun 4 16:47:48 ubuntutestclient ypbind: Host name lookup failure
..
Jun 4 16:47:48 ubuntutestclient dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3
Jun 4 16:47:48 ubuntutestclient dhclient: DHCPREQUEST of 192.168.10.53 on eth0 to 255.255.255.255 port 67
Jun 4 16:47:48 ubuntutestclient dhclient: DHCPOFFER of 192.168.10.53 from 192.168.10.165
Jun 4 16:47:48 ubuntutestclient dhclient: DHCPACK of 192.168.10.53 from 192.168.10.165
Jun 4 16:47:48 ubuntutestclient rpcbind: Cannot open '/run/rpcbind/rpcbind.xdr' file for reading, errno 2 (No such file or directory)
Jun 4 16:47:48 ubuntutestclient rpcbind: Cannot open '/run/rpcbind/portmap.xdr' file for reading, errno 2 (No such file or directory)
Jun 4 16:47:48 ubuntutestclient kernel: [ 8.984711] init: portmap-wait (statd) main process (531) killed by TERM signal
..
Jun 4 16:47:48 ubuntutestclient kernel: [ 9.448250] init: ypbind main process (583) terminated with status 1
Jun 4 16:47:48 ubuntutestclient kernel: [ 9.448372] init: ypbind main process ended, respawning
..
Jun 4 16:47:54 ubuntutestclient kernel: [ 15.403354] init: wait-for-state (atdypbind) main process (1071) killed by TERM signal
Jun 4 16:47:54 ubuntutestclient kernel: [ 15.423670] init: wait-for-state (gdmypbind) main process (1076) killed by TERM signal
Jun 4 16:47:54 ubuntutestclient kernel: [ 15.429334] init: wait-for-state (lightdmypbind) main process (1084) killed by TERM signal
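
When skimming a long syslog like the excerpt above, it can help to keep only the upstart (`init:`) lines for the jobs in question. The following is a minimal sketch, assuming the log line format shown above; the function name `init_events` is made up for illustration and the sample lines are copied from the excerpt.

```shell
#!/bin/sh
# Keep only upstart ("init:") messages whose job name matches the given
# alternation. The name must be followed by a space or end of line, so
# "portmap" does not also match "portmap-wait".
# Usage on a real system (hypothetical): init_events 'ypbind|portmap' < /var/log/syslog
init_events() {
    grep -E "init: ($1)( |\$)"
}

# Sample lines taken from the syslog excerpt in this comment.
sample='Jun 4 16:47:48 ubuntutestclient ypbind: Host name lookup failure
Jun 4 16:47:48 ubuntutestclient kernel: [ 8.984711] init: portmap-wait (statd) main process (531) killed by TERM signal
Jun 4 16:47:48 ubuntutestclient kernel: [ 9.448250] init: ypbind main process (583) terminated with status 1
Jun 4 16:47:48 ubuntutestclient kernel: [ 9.448372] init: ypbind main process ended, respawning'

# Prints only the two "init: ypbind ..." lines.
printf '%s\n' "$sample" | init_events 'ypbind'
```

This only narrows the view; the ordering of events still has to be read from the timestamps by hand.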

I have 30 client machines with about 300 users in a school and this is a serious issue that prevents me from upgrading yet.

Please look into it and I will try to help as much as I can

Steve Langasek (vorlon) wrote : Re: [Bug 1007293] Re: nis doesn't work anymore after upgrade to 12.04

On Mon, Jun 04, 2012 at 02:43:32PM -0000, vagk wrote:
> I inserted
> exec > /var/log/nis-debug.log 2>&1
> in /etc/init/ypbind.conf after script line

That really shouldn't be necessary. You should be able to read the log
files from /var/log/upstart, without modifying the upstart job.

alex, do you have any interesting log files in /var/log/upstart on your
system related to ypbind?

> and the output in nis-debug.log was
> No NIS server and no -broadcast option specified.
> Add a NIS server to the /etc/yp.conf configuration file,
> or start ypbind with the -broadcast option.

> my /etc/yp.conf is
> ypserver nisserver

> (nisserver is resolved by DNS fine; I tried using the IP instead, but the
> problem seems to be elsewhere)

> I don't use network-manager (purged) and my /etc/network/interfaces file is the following
> allow-hotplug lo eth0
> auto lo eth0
> iface lo inet loopback
> iface eth0 inet dhcp

> The problem seems to be that when ypbind is initially started networking
> is not up yet

There's no reason for this that I can see, if your network is configured in
/etc/network/interfaces. The ypbind job itself is configured to start only
after the network is up. ypbind may also be started by the start-ypbind
job, but this job also only starts in response to a series of other jobs
that all wait for the network.

alex, vagk: could each of you please set YPBINDARGS="-no-dbus -debug" in
/etc/default/nis, reboot, and attach /var/log/upstart/ypbind.log so we can
see exactly what's failing?

> Some lines from /var/log/syslog showing that ypbind is started before
> networking and although it is respawning it does not succeed

> Jun 4 16:47:48 ubuntutestclient ypbind: Host name lookup failure
> ..
> Jun 4 16:47:48 ubuntutestclient dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3
> Jun 4 16:47:48 ubuntutestclient dhclient: DHCPREQUEST of 192.168.10.53 on eth0 to 255.255.255.255 port 67
> Jun 4 16:47:48 ubuntutestclient dhclient: DHCPOFFER of 192.168.10.53 from 192.168.10.165
> Jun 4 16:47:48 ubuntutestclient dhclient: DHCPACK of 192.168.10.53 from 192.168.10.165
> Jun 4 16:47:48 ubuntutestclient rpcbind: Cannot open '/run/rpcbind/rpcbind.xdr' file for reading, errno 2 (No such file or directory)
> Jun 4 16:47:48 ubuntutestclient rpcbind: Cannot open '/run/rpcbind/portmap.xdr' file for reading, errno 2 (No such file or directory)
> Jun 4 16:47:48 ubuntutestclient kernel: [ 8.984711] init: portmap-wait (statd) main process (531) killed by TERM signal
> ..
> Jun 4 16:47:48 ubuntutestclient kernel: [ 9.448250] init: ypbind main process (583) terminated with status 1
> Jun 4 16:47:48 ubuntutestclient kernel: [ 9.448372] init: ypbind main process ended, respawning
> ..
> Jun 4 16:47:54 ubuntutestclient kernel: [ 15.423670] init: wait-for-state (gdmypbind) main process (1076) killed by TERM signal
> Jun 4 16:47:54 ubuntutestclient kernel: [ 15.429334] init: wait-for-state (lightdmypbind) main process (1084) killed by TERM signal

This is quite unusual since as I said, each of these other services (atd and
lightdm) are supposed to wait for events ('runlevel 2') that are themselves
dependent on the network b...


alex jordan (urgretl) wrote :

I compared ypbind.log with that of a properly working machine, and I could not find any differences.
/etc/yp.conf exists and the NIS servers respond to ping packets as expected.

alex jordan (urgretl) wrote :

Just FYI,

I'm now playing with a fresh virtual (VBOX) Ubuntu 12.04 64-bit machine (I don't want to damage a more or less running system) and got the same behaviour, with a non-working nis environment. The only packages I added via apt-get are nis, autofs, tcsh, csh, nfs-kernel-server:

After a local login I did:
# ypwhich
ypwhich: Can't communicate with ypbind

I also cannot find any process (e.g. with ps -ef | grep yp) that seems to be related to ypbind!

If I try to start ypbind:
# start ypbind
start: Job is already running: ypbind

But I could not see it (again looking at the snapshot of the current processes with root privileges)!!

If I restart ypbind (# restart ypbind), the process obviously restarts and nis is running.
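
The mismatch described here (upstart reports the job as running, yet no process exists) shows up in the status line itself: a healthy job reports ", process N", as in the earlier `initctl list` output, while the broken state reports only "start/running". A minimal sketch for checking this, assuming the status line formats quoted in this report; the function name `job_pid` is made up for illustration.

```shell
#!/bin/sh
# Print the PID from an upstart status line, or nothing if the line
# reports no process.
job_pid() {
    printf '%s\n' "$1" | sed -n 's/.*, process \([0-9][0-9]*\)$/\1/p'
}

job_pid 'ypbind start/running, process 1256'   # prints 1256
job_pid 'ypbind start/running'                 # prints nothing
```

On an affected machine this could be fed with the live output, for example `job_pid "$(status ypbind)"`, and an empty result would point at the state described above.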

From my humble point of view this bug must be hurting a lot of Ubuntu users all over the world.
Alex

vagk (vagk-p) wrote :

>alex, vagk: could each of you please set YPBINDARGS="-no-dbus -debug" in
>/etc/default/nis, reboot, and attach /var/log/upstart/ypbind.log so we can
>see exactly what's failing?

Here is an excerpt of the /var/log/upstart/ypbind.log when it fails to start

Setting NIS domainname to: sch.local
571: parsing config file
571: Trying entry: ypserver nisserver
571: parsed ypserver nisserver
571: add_server() domain: sch.local, host: nisserver, slot: 0
571: Host name lookup failure
571: No entry found.
No NIS server and no -broadcast option specified.
Add a NIS server to the /etc/yp.conf configuration file,
or start ypbind with the -broadcast option.
Binding to YP server .....backgrounded

Although my understanding of upstart procedures is minimal this line in /etc/init/ypbind.conf got me curious

              and ((filesystem and static-network-up) or failsafe-boot)))

My setup in /etc/network/interfaces (as stated in my previous post) is to get the IP from DHCP (which, by the way, I assign based on MAC address so that it is always the same)

I changed the interfaces file to get static ip and retried.
It failed but for a different reason.
My /etc/yp.conf has no direct ip but a cname
  ypserver nisserver
When networking got up /etc/resolv.conf was empty and that is because resolvconf package deleted the information that the last dhclient command wrote to /etc/resolv.conf

So I removed package resolvconf (which I don't like in lab setups, same as network-manager) and voila!

I rebooted and now there was a 10 second delay before lightdm started and ypbind got running.

So to conclude :
 Static IP on /etc/network/interfaces and nis server by IP in /etc/yp.conf works (or by cname and without resolvconf package)
 IP from dhcp on /etc/network/interfaces does not work. It seems that ypbind starts before networking

Changing to static ips on 30+ clients is not an option for a lot of reasons.

By the way my tests are also in a vbox that was running 10.04 with no problems (with just nis and nfs client packages added) and was upgraded to 12.04.

Steve Langasek (vorlon) wrote :

On Tue, Jun 05, 2012 at 02:25:02PM -0000, vagk wrote:
> I changed the interfaces file to get static ip and retried.

This should not matter because dhclient doesn't return control to ifupdown
until it successfully acquires an IP.

> It failed but for a different reason.

> My /etc/yp.conf has no direct ip but a cname
> ypserver nisserver

> When networking got up /etc/resolv.conf was empty and that is because
> resolvconf package deleted the information that the last dhclient command
> wrote to /etc/resolv.conf

That's because you had invalid network configuration information: you were
no longer using DHCP, but had not specified another source for DNS server
information.

> So I removed package resolvconf (which I don't like in lab setups, same
> as network-manager) and voila!

resolvconf is the supported mechanism for managing /etc/resolv.conf in
12.04. Outside of your experiments here for debugging, I strongly advise
that you keep it installed.

> I rebooted and now there was a 10 second delay before lightdm started
> and ypbind got running.

What is the system doing during this 10 second delay?

> Static IP on /etc/network/interfaces and nis server by IP in /etc/yp.conf
> works (or by cname and without resolvconf package)

> IP from dhcp on /etc/network/interfaces does not work. It seems that
> ypbind starts before networking

I'm still lacking the debugging information to see why it's starting before
networking for you. For that I need to see the syslog output of a boot with
--verbose.

Steve Langasek (vorlon) wrote :

On Tue, Jun 05, 2012 at 12:33:14PM -0000, alex wrote:
> If I try to start ypbind:
> # start ypbind
> start: Job is already running: ypbind

When this happens, it may be useful to grab the output of 'status ypbind' as
well.

But regardless, your log shows the same hostname lookup failure, so we'll
need to see the syslog output when booting with --verbose to understand why.

vagk (vagk-p) wrote :

>I'm still lacking the debugging information to see why it's starting before
>networking for you. For that I need to see the syslog output of a boot with
>--verbose.

I changed the grub parameters to nosplash debug --verbose and am attaching the syslog file.

Do you also want the syslog from the static IP setup that worked?

Steve Langasek (vorlon) wrote :

Ok, here's the problem:

Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.910762] init: portmap main process (524) became new process (527)
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.921625] init: portmap state changed from spawned to post-start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.921947] init: portmap state changed from post-start to running
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.922172] init: Handling started event
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.922498] init: ypbind goal changed

This shows that ypbind is being started as soon as portmap starts. But the start condition for ypbind is:

start on (started portmap ON_BOOT=
          or (started portmap ON_BOOT=y
              and ((filesystem and static-network-up) or failsafe-boot)))

So it appears that portmap is being started with ON_BOOT= instead of with ON_BOOT=y, confusing ypbind into believing that as soon as portmap starts, ypbind can also be started.

I believe this is due to a bug in either the portmap-wait or the statd job, one of which needs to set ON_BOOT=y.
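
To make the effect of the two branches easier to see, here is a simplified model of the start condition quoted above, written as a plain shell function. It is only a sketch of the logic as described in this comment, not of how upstart matches events internally; the function name `ypbind_may_start` and the yes/no arguments are made up for illustration.

```shell
#!/bin/sh
# Simplified model of the ypbind start condition quoted above.
# Arguments: the ON_BOOT value passed with "started portmap", then yes/no
# for whether filesystem, static-network-up and failsafe-boot were seen.
ypbind_may_start() {
    on_boot=$1
    filesystem=$2
    network=$3
    failsafe=$4
    # Branch 1: "started portmap ON_BOOT=" (empty value) starts at once.
    if [ -z "$on_boot" ]; then
        return 0
    fi
    # Branch 2: "started portmap ON_BOOT=y" also needs the boot events.
    if [ "$on_boot" = y ]; then
        if [ "$filesystem" = yes ] && [ "$network" = yes ]; then
            return 0
        fi
        if [ "$failsafe" = yes ]; then
            return 0
        fi
    fi
    return 1
}

ypbind_may_start "" no  no  no && echo "ON_BOOT= : starts immediately"
ypbind_may_start y  no  no  no || echo "ON_BOOT=y: waits for filesystem and network"
ypbind_may_start y  yes yes no && echo "ON_BOOT=y: starts once both events are seen"
```

The first call is the situation in the log: portmap is started without ON_BOOT=y, so nothing holds ypbind back until the network is up.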

Going back a bit in the boot:

Jun 5 22:24:03 intwinxpubuntu6 kernel: [ 5.844424] init: Handling remote-filesystems event
Jun 5 22:24:03 intwinxpubuntu6 kernel: [ 6.950731] init: Handling local-filesystems event
Jun 5 22:24:03 intwinxpubuntu6 kernel: [ 7.038074] init: Handling filesystem event
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 7.415998] init: statd-mounting goal changed from stop to start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 7.422316] init: statd-mounting state changed from waiting to starting
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.364354] init: statd goal changed from stop to start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.371049] init: statd state changed from waiting to starting
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.377212] init: statd state changed from starting to pre-start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.405912] init: statd pre-start process (489)
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.565510] init: portmap-wait (statd) goal changed from stop to start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.565696] init: portmap-wait (statd) state changed from waiting to starting
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.566173] init: portmap-wait (statd) state changed from starting to pre-start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.566342] init: portmap-wait (statd) state changed from pre-start to spawned
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.600918] init: portmap-wait (statd) main process (505)
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.601291] init: portmap-wait (statd) state changed from spawned to post-start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.611092] init: portmap-wait (statd) state changed from post-start to running
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.816319] init: portmap goal changed from stop to start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.816509] init: portmap state changed from waiting to starting
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8.816991] init: portmap state changed from starting to pre-start
Jun 5 22:24:04 intwinxpubuntu6 kernel: [ 8...


vagk (vagk-p) wrote :

>This shows that ypbind is being started as soon as portmap starts. But the start condition for ypbind is:
>
>start on (started portmap ON_BOOT=
> or (started portmap ON_BOOT=y
> and ((filesystem and static-network-up) or failsafe-boot)))

>So it appears that portmap is being started with ON_BOOT= instead of with ON_BOOT=y, confusing ypbind into believing that as soon as portmap starts, ypbind can also be started.

>I believe this is due to a bug in either the portmap-wait or the statd job, one of which needs to set ON_BOOT=y.

I tried changing /etc/init/ypbind.conf reversing the conditions just to test
   start on (started portmap ON_BOOT=y
        or (started portmap ON_BOOT=
            and ((filesystem and static-network-up) or failsafe-boot)))

The boot process paused at two points (between 9 and 19 sec, and after 30 sec)

ypbind started successfully!

Does this verify your assumptions of where the problem is?

Steve Langasek (vorlon) wrote :

I believe the right place to export ON_BOOT=y is part of the /etc/init/portmap-wait.conf job. You should be able to confirm this locally by changing this line:

  start portmap || true

to:

  start portmap ON_BOOT=y || true

I'll prepare an upload to fix this.
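
For anyone who wants to try the one-line change before the fixed package arrives, the edit can be rehearsed on a scratch file first. This is a minimal sketch: the scratch file is only a stand-in containing the line quoted above (the rest of the real /etc/init/portmap-wait.conf is not shown in this report), and the variable names are illustrative.

```shell
#!/bin/sh
# Stand-in for the relevant line of /etc/init/portmap-wait.conf.
# The real job file contains more than this; only the quoted line is used.
conf=$(mktemp)
printf '%s\n' '# (rest of the job omitted)' '  start portmap || true' > "$conf"

# Add ON_BOOT=y to the "start portmap" call and leave other lines alone.
sed 's/^\([[:space:]]*start portmap\) || true$/\1 ON_BOOT=y || true/' "$conf" > "$conf.new"

# Show the result; the second line now carries ON_BOOT=y.
cat "$conf.new"
```

On a real system the same sed expression could be applied to the job file itself with root privileges, after keeping a backup copy.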

affects: nis (Ubuntu) → rpcbind (Ubuntu)
Changed in rpcbind (Ubuntu):
status: Incomplete → Triaged
Steve Langasek (vorlon)
Changed in rpcbind (Ubuntu Precise):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Steve Langasek (vorlon)
Changed in rpcbind (Ubuntu Quantal):
assignee: nobody → Steve Langasek (vorlon)
status: Triaged → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package rpcbind - 0.2.0-7ubuntu2

---------------
rpcbind (0.2.0-7ubuntu2) quantal; urgency=low

  * debian/rpcbind.portmap-wait.upstart: export ON_BOOT=y when starting
    portmap from here; since portmap and statd will only race to start on
    boot, and not when started from a maintainer script, we can assume that
    the only time this 'start' command has any effect is at boot time, so
    it's always correct to say ON_BOOT=y here. This fixes a problem with
    racy startup of portmap causing other services that depend on portmap to
    start too early. LP: #1007293.
 -- Steve Langasek <email address hidden> Tue, 05 Jun 2012 15:21:15 -0700

Changed in rpcbind (Ubuntu Quantal):
status: In Progress → Fix Released
alex jordan (urgretl) wrote :

Good morning,

thanks, but the actual bug still persists.

# status ypbind
ypbind start/running

No pid found. But after "re-"starting ypbind, I can see a pid that belongs to ypbind:
...
root 2464 1 0 09:07 ? 00:00:00 ypbind --no-dbus
...
The verbose syslog is attached.

vagk (vagk-p) wrote :

>I believe the right place to export ON_BOOT=y is part of the /etc/init/portmap-wait.conf job. You should be able to confirm this locally by changing this line:
>
> start portmap || true
>
>to:
>
> start portmap ON_BOOT=y || true

Changing this line worked for me.

ypbind started successfully!

Thanks for fixing this!

Steve Langasek (vorlon) wrote :

alex, your syslog shows the events firing in the right order: static-network-up, then filesystem, and only then does ypbind start.

So you and vagk are seeing two different bugs, unfortunately.

Your bug is that, even though static-network-up has been emitted, name resolution is failing on the system.

It appears that this is because you are using network-manager for your network configuration, not ifupdown. Boot-time scripts have no way to automatically determine if they should wait for NM-controlled interfaces; to fix the reliability of your boot, you will need to do one of two things:
 - configure your eth0 interface via /etc/network/interfaces instead of through NetworkManager, or
 - edit /etc/init/ypbind.conf locally, to replace 'static-network-up' with 'net-device-up IFACE=eth0'.

Unfortunately, as I said there's no way for upstart jobs to know automatically which NM interfaces they should wait for, so there doesn't seem to be anything we can do to fix this in the Ubuntu package.
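
The second option above amounts to a one-word substitution in the start condition. A minimal sketch of that edit, rehearsed on a scratch file that holds only the start condition as quoted earlier in this report; `eth0` is the interface named above and may differ on other machines, and the variable names are illustrative.

```shell
#!/bin/sh
# The ypbind start condition as quoted earlier in this report, written to
# a scratch file so the edit can be tried without touching /etc/init.
job=$(mktemp)
cat > "$job" <<'EOF'
start on (started portmap ON_BOOT=
          or (started portmap ON_BOOT=y
              and ((filesystem and static-network-up) or failsafe-boot)))
EOF

# Wait for the named interface instead of static-network-up.
sed 's/static-network-up/net-device-up IFACE=eth0/' "$job" > "$job.new"

# Show what changed (diff exits non-zero when the files differ).
diff "$job" "$job.new" || true
```

Because this is a local change to a packaged job file, it would have to be reviewed again whenever the nis package is upgraded.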

alex jordan (urgretl) wrote :

thanks Steve, but:

> - configure your eth0 interface via /etc/network/interfaces instead of through NetworkManager

This method gets my nis working, but the machine only works well if I set up a dynamic host configuration (iface eth0 inet dhcp). I want a static IP configuration, though. That is the reason why I switched to network-manager with Ubuntu 11.10: a static IP configuration in /etc/network/interfaces causes various other problems:

1. There is a response delay for all network packets, e.g. after I run a ping command, I have to wait about 5 minutes to get the first response. The ping round-trip time itself seems to be OK (~0.1ms). Internet browsing also gets very slow because of a response delay of about 5 seconds. Maybe this is a DNS problem, because it only happens if I request an internet address which I have not visited immediately before??

2. boot time takes minimum twice as long.
3. If I create a client.conf file under /etc/cups/, boot time takes much much longer.

> - edit /etc/init/ypbind.conf locally, to replace 'static-network-up' with 'net-device-up IFACE=eth0'.

sorry no success. I tried it on a virgin Ubuntu installation too (with static IP). The result is the same (see #19).

Sorry

alex jordan (urgretl) wrote :

sorry, I mean five seconds, not five minutes!

alex jordan (urgretl) wrote :

Finally I fixed it:

I replaced the NIS server name entries in yp.conf with their IP addresses. Now everything works fine, even autofs.
The only additional thing one has to do is insert the following line into /etc/lightdm/lightdm.conf

greeter-show-manual-login=true

Otherwise one cannot login as a nis user from login manager lightdm. From my point of view, this has to be done automatically during installation of nis or ldap client packages.

I don't know why I have to give IP addresses instead of qualified names. Can it be that
either DNS is starting after nis?
or the boot is too quick to wait for the DNS response?
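
To find machines that still depend on name resolution at the moment ypbind starts, the yp.conf entries can be checked for host names. A minimal sketch, assuming the one-argument `ypserver HOST` form shown earlier in this report (other yp.conf forms are not handled); the function name `ypservers_by_name` is made up, and 192.0.2.20 is a placeholder documentation address, not a real server.

```shell
#!/bin/sh
# List "ypserver" entries that are host names rather than IPv4 addresses.
# Only the one-argument "ypserver HOST" form is handled here.
ypservers_by_name() {
    awk '$1 == "ypserver" && $2 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $2 }' "$1"
}

# Scratch copy with one name-based and one address-based entry.
ypconf=$(mktemp)
printf '%s\n' 'ypserver nisserver' 'ypserver 192.0.2.20' > "$ypconf"

ypservers_by_name "$ypconf"   # prints: nisserver
```

On a client this would be run against /etc/yp.conf; any name it prints has to be resolvable before ypbind starts.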

Nice WE
Alex

Steve Langasek (vorlon) wrote :

On Fri, Jun 08, 2012 at 09:04:35AM -0000, alex jordan wrote:

> This method makes my nis working. But only if I setup a dynamic host
> configuration (iface eth0 inet dhcp), the machine works well. But I want
> to have a static IP configuration. And this is the reason why I changed
> to use the network-manager since Ubuntu 11.10, because static IP
> configuration in /etc/network/interfaces causes different other
> problems:

> 1. There is a response delay of all network packets, i.e. after I did a
> ping command, I have to wait about 5 minutes to get the first response.
> Ping response timespan seems to be OK (~0.1ms) . Internet browsing gets
> also very slow because of a response delay of about 5 seconds. Maybe
> this is a dns problem, because this only happens, if I request an
> internet address which I have not visited immediately before??

How do you have your DNS configured in this case? This sounds like you're
ending up with a different DNS configuration via DHCP than you have when
manually configuring. Note that 12.04 uses resolvconf, which means that DNS
servers should be specified as part of the network interface configuration
in /etc/network/interfaces (see "dns-nameservers" in the resolvconf(8)
manpage for details).
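
As an illustration of "part of the network interface configuration": the dns-* options belong inside an `iface` stanza. The sketch below writes two scratch files and checks them with a small awk filter. It is a simplified check, not a reimplementation of ifupdown's parser; the function name `check_dns_options` is made up, and all addresses and the search domain are placeholders.

```shell
#!/bin/sh
# Report dns-* options that are not inside an "iface" stanza.
# Simplified: only iface/auto/allow-*/mapping/source keywords are tracked.
check_dns_options() {
    awk '
        $1 == "iface" { in_iface = 1; next }
        $1 == "auto" || $1 ~ /^allow-/ || $1 == "mapping" || $1 == "source" { in_iface = 0; next }
        $1 ~ /^dns-/ && !in_iface { printf "line %d: %s outside iface stanza\n", NR, $1 }
    ' "$1"
}

# Static configuration with the DNS options inside the stanza.
good=$(mktemp)
cat > "$good" <<'EOF'
auto eth0
iface eth0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    gateway 192.0.2.1
    dns-nameservers 192.0.2.53
    dns-search example.org
EOF

# Same option placed before any iface stanza.
bad=$(mktemp)
cat > "$bad" <<'EOF'
dns-nameservers 192.0.2.53
auto eth0
iface eth0 inet dhcp
EOF

check_dns_options "$good"   # prints nothing
check_dns_options "$bad"    # prints: line 1: dns-nameservers outside iface stanza
```

On a real machine the function would be pointed at /etc/network/interfaces.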

Steve Langasek (vorlon) wrote :

On Fri, Jun 08, 2012 at 12:21:33PM -0000, alex jordan wrote:
> Finally I fixed it:

> I replaced the nis-server name entries in yp.conf with its ip-addresses.
> Now everything works fine, even autofs. The only thing one has to do
> additionally is, inserting the following line into
> /etc/lightdm/lightdm.conf

> greeter-show-manual-login=true

> Otherwise one cannot login as a nis user from login manager lightdm.
> From my point of view, this has to be done automatically during
> installation of nis or ldap client packages.

Can you file a separate bug report against lightdm for this? This sounds
like a buggy assumption about user enumeration on the part of lightdm.

alex jordan (urgretl) wrote :

never ending story:

To #25:

I configured the DNS servers in /etc/network/interfaces by adding the following lines manually:

dns-nameservers server1 server2 server3
dns-search domain.name

nothing else! I left resolv.conf untouched.
This bug is the reason why I postponed the upgrade of my machines until the Ubuntu upgrade works in a more or less proper way. I tested it on three of my machines, and all show the same behavior. If I reinstall network-manager (instead of using ifupdown) to avoid network response delays, the following happens on all machines:

1. After reboot: bootmessage says:" * starting cups printing/server"

2. Perhaps 20 seconds nothing happens

3. After that bootmessage says: "Waiting for network configuration ...."

4. Perhaps 30 seconds nothing happens

5. After that bootmessage says: "Waiting up to 60 more seconds for network configuration ..."

6. Perhaps 60 seconds nothing happens

7. After that bootmessage says: "Booting system without full network configuration"

8. System boots up to login manager

But network is still not configured, one has to restart network-manager manually to get the system working.

In summary: If I install a Ubuntu 12.04 system from scratch, I can have a working system, but I have to use network-manager and IP numbers in yp.conf instead of machine names. If I want to upgrade a Ubuntu 10.04LTS system, I get a lot of trouble.

Alex

Steve Langasek (vorlon) wrote :

On Tue, Jun 12, 2012 at 02:16:48PM -0000, alex jordan wrote:
> 2. Perhaps 20 seconds nothing happens

> 3. After that bootmessage says: "Waiting for network configuration ...."

> 4. Perhaps 30 seconds nothing happens

> 5. After that bootmessage says: "Waiting up to 60 more seconds for
> network configuration ..."

> 6. Perhaps 60 seconds nothing happens

> 7. After that bootmessage says: "Booting system without full network
> configuration"

> 8. System boots up to login manager

Ok, that's highly unusual but should have nothing to do with nis. What are
the contents of /etc/network/interfaces when this happens?

alex jordan (urgretl) wrote :

It's also highly unusable for me,

I attached an example of my /etc/network/interfaces file.

Alex

alex jordan (urgretl) wrote :

Sorry Steve, I didn't read your last post attentively enough. When this happens, I have either removed /etc/network/interfaces, or commented out the primary network interface part, or commented out everything except the dns part, and so on. So I think it has nothing to do with the interfaces file.

Alex

Steve Langasek (vorlon) wrote :

Thanks, Alex. That interfaces file looks completely sane to me, and *should* result in the network being brought up quickly at boot time and in order, no question. So I wonder if something is failing in the ifupdown hooks.

Can you show me the contents of these three directories:
/etc/network/if-pre-up.d/
/etc/network/if-up.d/
/etc/resolvconf/update.d/

Can you also check whether you get a /var/log/upstart/network-interface-eth0.log file at boot, and whether it has any possibly-relevant contents?

alex jordan (urgretl) wrote :

Hi Steve,

in /etc/network/if-pre-up.d/
- wireless-tools
- wpasupplicant -> ../../wpa_supplicant/ifupdown.sh

in /etc/network/if-up.d/
- 000resolvconf
- avahi-autoipd
- avahi-daemon
- ntpdate
- openssh-server
- upstart
- wpasupplicant -> ../../wpa_supplicant/ifupdown.sh

/etc/resolvconf/update.d/
- dnscache
- libc

In /var/lib/networking.log.1.gz and network-interface-eth0.log.1.gz I found the following entries: "/etc/network/interfaces:17: misplaced option".
This points to the dns-nameservers entry: "dns-nameservers 139.xxx.xxx.xxx 139.xxx.xxx.xxx 139.xxx.xxx.xxx"
Where the x are numbers of course.
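
As far as I understand ifupdown, "misplaced option" is reported when an option line such as dns-nameservers is not inside an iface (or mapping) stanza, for example because the iface line above it was commented out. The sketch below is a simplified model of that rule, not ifupdown's real parser, and the function name check_interfaces is made up for illustration.

```shell
# Report option lines that do not belong to any iface/mapping stanza.
# Simplified assumption: "iface" and "mapping" open a stanza; "auto",
# "allow-*" and "source" lines close it; comments and blanks are ignored.
check_interfaces() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }
        $1 == "iface" || $1 == "mapping" { in_stanza = 1; next }
        $1 == "auto" || $1 ~ /^allow-/ || $1 == "source" { in_stanza = 0; next }
        !in_stanza { printf "%s:%d: misplaced option\n", FILENAME, FNR }
    ' "$1"
}
```

Running check_interfaces /etc/network/interfaces prints nothing for a file where every option sits under a stanza. To look at the exact line the log complains about, sed -n '17p' /etc/network/interfaces is enough.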

I have discovered another strange behavior: the commands w and who no longer show NIS users on any of the upgraded machines that are not working properly.

Alex

Revision history for this message
vagk (vagk-p) wrote :

> I have discovered another strange behavior: the command w or who doesn't show nis users anymore on all upgraded, not properly working machines.

This seems to concern lightdm

See https://bugs.launchpad.net/ubuntu/precise/+source/lightdm/+bug/870297

Revision history for this message
alex jordan (urgretl) wrote :

Still no success in solving my network/nis problems. I now believe that this is a bug in upstart. But I don't want to waste my time trying to understand upstart, and I am now switching my machines one by one to Debian wheezy. It seems to me that even the Debian testing release runs more smoothly than Ubuntu.

Sorry, but trouble increases from one Ubuntu release to the next and the current LTS version is obviously still a beta version from my humble point of view :o(

Alex

Revision history for this message
Manuel Melo (manuel-nuno-melo) wrote :

I don't know whether a 'me too' comment is still relevant, but I must add that in my network we experienced almost exactly the same problem as Alex, with overall laggy and only randomly successful reboots.
This randomness seemed to depend a bit on hardware, as certain groups of boxes would be more susceptible to it, but in the end it just reflects the race condition Steve described.

The problem was effectively solved by upgrading to 12.04 and patching /etc/init/portmap-wait.conf as per Steve's suggestion (#17). The problem could not be fixed on 11.10 as per the same patch, which I guess makes sense because NIS was not an upstart job back then.

Since all our NIS- and NFS-related lookups are done via hosts file I cannot comment on the subsequent problems Alex encountered.

Steve Langasek (vorlon)
description: updated
Revision history for this message
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello alex, or anyone else affected,

Accepted rpcbind into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/rpcbind/0.2.0-7ubuntu1.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in rpcbind (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
no!chance (ralf-fehlau) wrote :

I have had the same problems on my server since 2012-10-03. My machine was upgraded from 10.04 and worked fine until yesterday. It began with problems with autofs, and then I saw these problems with nis. My version of rpcbind is already 0.2.0-7ubuntu1.1. The effects are the same as in Alex's description.

In my install history, the only upgraded package was xdiagnose.

I tried to patch my /etc/init/portmap-wait.conf as suggested in comment #17, but with no effect.
Is there any possibility to view the log for this upstart job? In my logs, I can only see "Binding to YP server .....backgrounded". Nothing more.

Revision history for this message
no!chance (ralf-fehlau) wrote :

Sorry, I saw that my version is not the same as in the proposed-repo (1.1 vs. 1.2). But if the patch in comment #17 is the only difference, it has no effect for me.

Revision history for this message
Steve Langasek (vorlon) wrote :

no!chance,

Please test with the actual package from precise-proposed and report whether the bug is still present.

Alex, have you tested the rpcbind package in precise-proposed? There has been no feedback confirming whether it fixes the issue you're seeing. We won't publish this to the LTS unless someone confirms that it fixes the issue.

Revision history for this message
no!chance (ralf-fehlau) wrote :

I installed the deb-Package manually, because it's not a good idea to activate the proposed-packages on a production system. There is no difference ... rpcbind does not start.

$ sudo dpkg -i rpcbind_0.2.0-7ubuntu1.2_amd64.deb
(Reading database ... 375204 files and directories currently installed.)
Preparing to replace rpcbind 0.2.0-7ubuntu1.1 (using rpcbind_0.2.0-7ubuntu1.2_amd64.deb) ...
portmap stop/waiting
Unpacking replacement rpcbind ...
Setting up rpcbind (0.2.0-7ubuntu1.2) ...
portmap start/running, process 10375
Processing triggers for ureadahead ...
Processing triggers for man-db ...

$ sudo service rpcbind-boot status
rpcbind-boot stop/waiting

$ sudo service rpcbind-boot restart
stop: Unknown instance:
rpcbind-boot stop/waiting

$ sudo service rpcbind-boot status
rpcbind-boot stop/waiting

$ sudo service ypserv status
ypserv stop/waiting

$ sudo service ypserv restart
stop: Unknown instance:
ypserv start/running, process 18342

$ sudo service ypserv status
ypserv stop/waiting

$ initctl list | grep yp
ypbind start/running
ypserv stop/waiting
ecryptfs-utils-restore stop/waiting
ypxfrd stop/waiting
start-ypbind stop/waiting
yppasswdd stop/waiting
cryptdisks-udev stop/waiting
cryptdisks-enable stop/waiting
ecryptfs-utils-save stop/waiting

yptest still fails with "Keine Kommunikation mit »ypbind« möglich", roughly "cannot communicate with ypbind" (I don't know the exact English wording of this message).

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 1007293] Re: nis doesn't work anymore after upgrade to 12.04

On Thu, Oct 04, 2012 at 11:43:25AM -0000, no!chance wrote:
> I installed the deb-Package manually, because it's not a good idea to
> activate the proposed-packages on a production system. There is no
> difference ... rpcbind does not start.

> $ sudo dpkg -i rpcbind_0.2.0-7ubuntu1.2_amd64.deb
> (Reading database ... 375204 files and directories currently installed.)
> Preparing to replace rpcbind 0.2.0-7ubuntu1.1 (using rpcbind_0.2.0-7ubuntu1.2_amd64.deb) ...
> portmap stop/waiting
> Unpacking replacement rpcbind ...
> Setting up rpcbind (0.2.0-7ubuntu1.2) ...
> portmap start/running, process 10375
> Processing triggers for ureadahead ...
> Processing triggers for man-db ...

> $ sudo service rpcbind-boot status
> rpcbind-boot stop/waiting

> $ sudo service rpcbind-boot restart
> stop: Unknown instance:
> rpcbind-boot stop/waiting

> $ sudo service rpcbind-boot status
> rpcbind-boot stop/waiting

> $ sudo service ypserv status
> ypserv stop/waiting

This is not a valid test of the changed patch - this does not accurately
model the boot-time startup sequence. The test in the bug description calls
for a reboot to verify the fix - are you at liberty to do a reboot test?

Revision history for this message
no!chance (ralf-fehlau) wrote :

Sorry! I forgot to mention that I did a reboot in the hope that the situation would change. The problem is the same before and after a reboot.

Regards,
Ralf

Revision history for this message
no!chance (ralf-fehlau) wrote :

Hi there,
is there any possibility to get a detailed log of the rpcbind process? I don't think that we can solve this problem without it.
Regards,
Ralf

Revision history for this message
no!chance (ralf-fehlau) wrote :

I think (I hope), I have found the reason. It was apparmor. Maybe, there were profile updates in the past?!

I did the following:

$ sudo aa-complain rpcbind
$ sudo service rpcbind restart
$ sudo service ypbind restart
$ yptest
was ok and gave me the list of all users, groups and others

$ ls -l /net/remote-machine/remote-share/....
was ok and gave me the list of all files and directories, which had been failing for the last week!

$ sudo mount remote-host:/remote-share /mnt
does not give me any error message

All in all: Looks good.
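
To confirm that AppArmor really was the cause, it may help to look for denial records in the system log. The sketch below assumes that kernel audit messages end up in /var/log/syslog and contain the string apparmor="DENIED"; the helper name apparmor_denials is made up for illustration.

```shell
# Print AppArmor denial lines that mention the given pattern
# (for example a binary or profile name such as rpcbind).
# The second argument is the log file; /var/log/syslog is an assumed default.
apparmor_denials() {
    pattern=$1
    logfile=${2:-/var/log/syslog}
    grep -F 'apparmor="DENIED"' "$logfile" | grep -F -- "$pattern"
}
```

Called as apparmor_denials rpcbind (with sudo if the log is not world-readable), it prints nothing when no denial was logged for that name. The name= and denied_mask= fields of a matching line show which path and access type the profile refused.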

By the way: In 10.04 the logs were much better. Since the introduction of unity, most logfiles (e.g. /var/log/messages) are empty and the remaining logs - I think - are much less informative.

Regards,
Ralf

Revision history for this message
Steve Langasek (vorlon) wrote :

On Mon, Oct 08, 2012 at 01:11:04PM -0000, no!chance wrote:
> I think (I hope), I have found the reason. It was apparmor. Maybe, there
> were profile updates in the past?!

> I did the following:

> $ sudo aa-complain rpcbind
> $ sudo service rpcbind restart
> $ sudo service ypbind restart
> $ yptest
> was ok and gave me the list of all users, groups and others

So after making these changes, does everything come up correctly now on
reboot?

> By the way: In 10.04 the logs were much better. Since the introduction
> of unity, most logfiles (e.g. /var/log/messages) are empty and the
> remaining logs - I think - are much less informative.

Logging has nothing to do with unity. The change in log handling has only
consolidated the set of log files being written to, there are no changes to
the verbosity. /var/log/syslog contains all the information that
/var/log/messages used to. Furthermore, 12.04 logs the output of all
upstart jobs under /var/log/upstart.
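
A minimal sketch of reading those per-job logs, assuming the files are named after the job (for example ypbind.log) and that rotated copies carry a .1.gz style suffix as seen earlier in this report. The helper name upstart_log is made up for illustration.

```shell
# Print the current and rotated logs of one upstart job.
# The second argument overrides the log directory (default /var/log/upstart).
upstart_log() {
    job=$1
    dir=${2:-/var/log/upstart}
    for f in "$dir/$job.log"*; do
        [ -e "$f" ] || continue      # glob matched nothing
        echo "== $f"
        case $f in
            *.gz) zcat "$f" ;;       # rotated, compressed log
            *)    cat "$f" ;;        # current log
        esac
    done
}
```

For example, sudo sh -c with this function followed by upstart_log ypbind would show what ypbind wrote during boot; the files are normally readable only by root.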

Revision history for this message
no!chance (ralf-fehlau) wrote :

Hi Steve,

yes, after making these changes, all services come up after a reboot ... with no problems.

And yes, I know that unity and the logs have nothing to do with each other. OK, maybe it was only my impression that I couldn't find some messages in the logfiles in the past. Maybe I have to review my logfiles more thoroughly.

Regards,
Ralf

Steve Langasek (vorlon)
tags: added: verification-donee
removed: verification-needed
tags: added: verification-done
removed: verification-donee
Revision history for this message
Colin Watson (cjwatson) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package rpcbind - 0.2.0-7ubuntu1.2

---------------
rpcbind (0.2.0-7ubuntu1.2) precise-proposed; urgency=low

  * debian/rpcbind.portmap-wait.upstart: export ON_BOOT=y when starting
    portmap from here; since portmap and statd will only race to start on
    boot, and not when started from a maintainer script, we can assume that
    the only time this 'start' command has any effect is at boot time, so
    it's always correct to say ON_BOOT=y here. This fixes a problem with
    racy startup of portmap causing other services that depend on portmap to
    start too early. LP: #1007293.
 -- Steve Langasek <email address hidden> Tue, 05 Jun 2012 15:21:15 -0700

Changed in rpcbind (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Sergey Pashinin (pashinin) wrote :

I had a similar problem with NFS: idmapd was not started after reboot on my Ubuntu 12.10 x64.

I used to have in my fstab:
10.254.239.1:/usr/data/disk_1 /usr/data/disk_1 nfs4 _netdev,rsize=8192,wsize=8192,timeo=1,soft,retry=0,auto,bg 0 0

The only thing I changed was the IP address, which I replaced with a host name:
domain.com:/usr/data/disk_1 /usr/data/disk_1 nfs4 _netdev,rsize=8192,wsize=8192,timeo=1,soft,retry=0,auto,bg 0 0

Actually domain.com resolves to 10.254.239.1 (my DNS server is also on 10.254.239.1).
But there are even "Can't be resolved" errors in the log, so I don't know why it works. (Maybe it starts some needed inet services.)

But still it mounts(!) now and everything is working.
I hope that it will help somebody.

Revision history for this message
no!chance (ralf-fehlau) wrote :

Same procedure as every year ... nis isn't working on 14.04 LTS, either! I agree, it looks like the main target is standalone computers or tablets, not computers in a production environment.

I will start using CentOS.
