libhal_ctx_init() returns FALSE, causing Xserver to fail to set up keyboard/mouse

Bug #276857 reported by Ted Gould
Affects       Status   Importance  Assigned to       Milestone
hal (Ubuntu)  Invalid  High        Bryce Harrington
Intrepid      Invalid  High        Bryce Harrington

Bug Description

Binary package hint: xorg

More than half the time after restarting my computer, I end up at a GDM login screen with neither mouse nor keyboard working. I've removed the input device configuration from my xorg.conf, and I'm using HAL for the configuration. It seems that X is failing to connect to HAL at boot.

The workaround is to switch to a terminal and kill X; it then restarts correctly.

The problem was worse for me earlier in the Intrepid cycle. I believe that this was because I was running the nvidia 177 drivers instead of the 173 I'm running now. Though, other things could have changed.

Tags: iso-testing
Matt Zimmerman (mdz) wrote :

By default, hal starts at S24 while gdm starts at S30. Have you changed the startup order at all?

Can you confirm whether hal is running, e.g. by adding "lshal >> somelogfile" to /etc/init.d/gdm:start?
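
A minimal sketch for checking the start order mentioned above. It assumes a SysV-style runlevel directory such as /etc/rc2.d (an assumption; the actual runlevel may differ) and a helper name, start_order, invented here for illustration. It only lists the hal and gdm start links in the order the shell glob sorts them.

```shell
# Print the start links for hal and gdm in a SysV rc directory, in sorted
# order. The default directory (/etc/rc2.d) is an assumption.
start_order() {
    rcdir=${1:-/etc/rc2.d}
    for link in "$rcdir"/S[0-9][0-9]*; do
        name=${link##*/}
        case $name in
            # Matches e.g. S24hal, S30gdm, and renamed links like S99zzgdm.
            S[0-9][0-9]*hal|S[0-9][0-9]*gdm) echo "$name" ;;
        esac
    done
}
```

With the default order described above, hal should be listed before gdm.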

Ted Gould (ted) wrote : Re: [Bug 276857] Re: On reboot X requires restart because mouse and keyboard are unresponsive

On Sat, 2008-10-11 at 16:17 +0000, Matt Zimmerman wrote:
> By default, hal starts at S24 while gdm starts at S30. Have you changed
> the startup order at all?
>
> Can you confirm whether hal is running, e.g. by adding "lshal >>
> somelogfile" to /etc/init.d/gdm:start?

I've attached the two lshal outputs: one from gdm:start and one from when
the system is completely up. They differ by five devices:

udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial'
udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if0'
udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if0_bluetooth_hci_1b636334f3'
udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if1'
udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if2'

These all seem to be enhanced ACPI keys and similar things. I'm not
sure why they would have an effect. They seem to be on the same USB
chain, so I wonder if there is an init issue with the USB tree. It
does seem, though, that the state of HAL is different when X starts the
first time vs. the second.
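
For anyone repeating this comparison, a rough sketch of how the device difference between two saved lshal dumps could be extracted. The function name and the file arguments are hypothetical; it assumes each device in the dump starts with a line beginning "udi = ", as in the excerpt above.

```shell
# Print the udi lines that appear only in the second lshal dump.
# $1 and $2 are saved `lshal` outputs (hypothetical file names).
udi_only_in_second() {
    first=$(mktemp) && second=$(mktemp) || return 1
    grep "^udi = " "$1" | sort -u > "$first"
    grep "^udi = " "$2" | sort -u > "$second"
    # comm -13 suppresses lines unique to file 1 and lines common to both.
    comm -13 "$first" "$second"
    rm -f "$first" "$second"
}
```

Usage would be something like `udi_only_in_second lshal.gdm-start.txt lshal.running.txt`, where the second file name is only a placeholder.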

Sebastien Bacher (seb128) wrote : Re: On reboot X requires restart because mouse and keyboard are unresponsive

The issue seems to be similar to bug #271138.

Matt Zimmerman (mdz) wrote : Re: [Bug 276857] Re: On reboot X requires restart because mouse and keyboard are unresponsive

On Sun, Oct 12, 2008 at 01:59:57PM -0000, Ted Gould wrote:
> On Sat, 2008-10-11 at 16:17 +0000, Matt Zimmerman wrote:
> > By default, hal starts at S24 while gdm starts at S30. Have you changed
> > the startup order at all?
> >
> > Can you confirm whether hal is running, e.g. by adding "lshal >>
> > somelogfile" to /etc/init.d/gdm:start?
>
> I've attached the two different lshals. One from gdm:start and one when
> the system is completely running. The difference is 5 devices.
>
> udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial'
> udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if0'
> udi =
> '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if0_bluetooth_hci_1b636334f3'
> udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if1'
> udi = '/org/freedesktop/Hal/devices/usb_device_5ac_8205_noserial_if2'
>
> These seem to be all enhanced ACPI keys and similar things. I'm not
> sure why they would have an effect. They seem to be in the same USB
> chain, I wonder if there is an init issue with the USB tree. Though it
> does seem like the state of HAL is different when X starts the first
> time vs. the second.

Bizarre...can you post a copy of the instrumented gdm init script? Are you
running lshal before gdm even starts, so that we're certain it is up and
running already?

--
 - mdz

Pablo Angulo (pablo-angulo) wrote : Re: On reboot X requires restart because mouse and keyboard are unresponsive

>...I believe that this was because I was running the nvidia 177 drivers instead of the 173 I'm running now. Though, other things could have changed...

It also happened on one of my computers, which has an ATI Radeon card and uses the free ati driver. It happened last week on Intrepid and never before: not with Hardy, and not during the previous month with Intrepid.

Ted Gould (ted) wrote : Re: [Bug 276857] Re: On reboot X requires restart because mouse and keyboard are unresponsive
  • gdm Edit (3.3 KiB, application/x-shellscript; name="gdm")

On Mon, 2008-10-13 at 09:11 +0000, Matt Zimmerman wrote:
> Bizarre...can you post a copy of the instrumented gdm init script? Are you
> running lshal before gdm even starts, so that we're certain it is up and
> running already?

Yes, I am doing that. The script is attached for reference.

Ted Gould (ted) wrote : Re: [Bug 276857] Re: On reboot X requires restart because mouse and keyboard are unresponsive

On Mon, 2008-10-13 at 09:11 +0000, Sebastien Bacher wrote:
> the issue seems to be similar to bug #271138

I think that they're probably duplicates. I'm not sure which one should
be marked a dup of the other.

Ted Gould (ted) wrote : Re: On reboot X requires restart because mouse and keyboard are unresponsive

I've tried to find various fixes/workarounds for this and I haven't found any. For the record, in case other people want to try and work on it, I've done the following:

* Added a sleep to the start of gdm
* Added a sleep to the end of HAL
* Added another startup task that sleeps for 60 seconds
* Moved the gdm start to S99zzgdm

None of those have fixed this issue for me. I'm still looking for answers.

Changed in xorg:
assignee: nobody → bryce
importance: Undecided → High
status: New → Confirmed
assignee: bryce → bryceharrington
milestone: none → ubuntu-8.10
Martin Pitt (pitti) wrote :

Ted, in your Xorg.0.log communication with hal seems to fail:

  (EE) config/hal: couldn't initialise context: (null) ((null))

and there is no attempt at all to load evdev. However, your lshal.gdm-start.txt clearly shows that hal *is* running.
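
A small sketch of the log check described above, for others who want to see whether they hit the same failure. The function name is made up for illustration, and treating any mention of "evdev" as a sign that input hotplug proceeded is an assumption, not something the X server guarantees.

```shell
# Report whether an Xorg log shows the hal context init failure.
# The default log path is the usual one but is still an assumption.
hal_init_status() {
    log=${1:-/var/log/Xorg.0.log}
    if grep -q "config/hal: couldn't initialise context" "$log"; then
        echo "hal context init failed"
    elif grep -q "evdev" "$log"; then
        echo "evdev referenced, no hal init error"
    else
        echo "no hal init error, no evdev lines"
    fi
}
```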

Also, I find it strange that *all* of Ted's devices in lshal have the "input.keymap" capability and

  input.keymap.data = {'e13c:brightnessdown', 'e13d:brightnessup'}

that doesn't sound right, and it might be what's confusing X.org. That makes me wonder whether it is a bug in hal-info on some specific laptop models.

Martin Pitt (pitti) wrote :

Ted, do you still get this if you temporarily purge hal-info? That should get rid of the ubiquitous input.keymap properties in hal.

As for the Xorg.0.log hal issue, X.org config/hal.c prints that if libhal_ctx_init() fails. This can happen due to a variety of reasons, and finding out why would need a gdb attached to the X server with a breakpoint at this function. I see that this is hard to do, though, so maybe we need a hal with some fprintf() logging augmentation to find out where it fails.

Ted's experiments with delaying the gdm startup seem to indicate that it's not a race condition between hal being busy attaching all the input.keymap properties to devices and gdm starting up, so the first X.org startup apparently does some input device detection which changes system state.

So my current theory is that Ted has like a gazillion hal-setup-keymap processes running (spawned from /usr/share/hal/fdi/policy/10osvendor/10-keymap.fdi) which might just get into each other's way. Ted, can you please check that in your augmented gdm script? (ps aux|grep hal should do). Thanks!

Ted Gould (ted) wrote : Re: [Bug 276857] Re: On reboot X requires restart because mouse and keyboard are unresponsive

On Tue, 2008-10-21 at 11:24 +0000, Martin Pitt wrote:
> So my current theory is that Ted has like a gazillion hal-setup-keymap
> processes running (spawned from
> /usr/share/hal/fdi/policy/10osvendor/10-keymap.fdi) which might just get
> into each other's way. Ted, can you please check that in your augmented
> gdm script? (ps aux|grep hal should do). Thanks!

I've attached ps's before and after, without the grep just to ensure
that nothing is lost. I don't see any instances of hal-setup-keymap. I
do find it odd that ps is reporting hald as using 14% of the CPU in the
first case.

gpm.ps.first.txt is the time that X doesn't work. gpm.ps.second.txt is
the time that X does.

The fact that every device had some keymap information was my fault. I
had a file in /usr/share/hal/fdi/... (attached) where I am trying to fix
the hotkeys on my laptop (bug 257377). The FDI file is wrong, and was
putting that information on every device. I rebooted without it and I'm
still getting the same issue with no mouse/keyboard on first GDM prompt.

Bryce Harrington (bryce) wrote : Re: On reboot X requires restart because mouse and keyboard are unresponsive

xserver: config/hal.c:

    if (!libhal_ctx_set_dbus_connection(info->hal_ctx, info->system_bus)) {
        LogMessage(X_ERROR, "config/hal: couldn't associate HAL context with bus\n");
        goto out_ctx;
    }
    if (!libhal_ctx_init(info->hal_ctx, &error)) {
        LogMessage(X_ERROR, "config/hal: couldn't initialise context: %s (%s)\n",
               error.name, error.message);
        goto out_ctx;
    }

libhal_ctx_init is failing; there are 5 conditions under which it returns false. Probably you want to install a debug version of hal, break on libhal_ctx_init(), and see which of the conditions are getting hit.

My guess is that either dbus_bus_name_has_owner() or dbus_connection_add_filter() is failing.

Changed in xorg:
status: Confirmed → Triaged
Ted Gould (ted) wrote : Re: [Bug 276857] Re: On reboot X requires restart because mouse and keyboard are unresponsive

On Tue, 2008-10-21 at 23:58 +0000, Bryce Harrington wrote:
> My guess is that either dbus_bus_name_has_owner() or
> dbus_connection_add_filter() are failing.

Further debugging -- no answers

I added a line to my gdm init script to list the names of all the
services on the system bus. In both situations the list is
basically identical. I think this rules out dbus_bus_name_has_owner()
failing. The command line is:

dbus-send --print-reply --system \
    --dest=org.freedesktop.DBus /org/freedesktop/DBus \
    org.freedesktop.DBus.ListNames

And the results are attached.
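
A sketch of how two saved ListNames replies could be compared without the noise of unique connection names (the ":1.N" entries, which differ from boot to boot). It assumes the reply was saved from `dbus-send --print-reply` and that each name appears on its own `string "..."` line; the function name is hypothetical.

```shell
# Extract the well-known bus names from a saved ListNames reply.
# Unique names (starting with ':') are dropped because they vary per run.
bus_names() {
    sed -n 's/^ *string "\(.*\)"$/\1/p' "$1" | grep -v '^:' | sort -u
}
```

The output of `bus_names` for the failing and working boots can then be compared with diff.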

Martin Pitt (pitti) wrote :

After you removed the faulty .fdi file, does the Xorg.0.log still actually show a failure to connect to hal?

Martin Pitt (pitti) wrote :

I tried this from the other end, i.e. by minimizing the time between hal startup and X.

First, I switched to a VT and

  sudo killall hald gdm

Experiment 1: Just "startx"
  -> I got the "config/hal: couldn't initialise context: (null) ((null))" in Xorg.0.log, and I did not have keyboard and mouse (I do not have an xorg.conf).

So that demonstrates the "no hal" -> "no love" effect.

Experiment 2: sudo hald; startx

hald usually stays in the foreground until all the "coldplugging" is done, and then forks off into the background. This is more or less what the init script does as well. startx avoids all the gdm delays etc. This worked fine, I had keyboard and mouse, and Xorg.0.log didn't complain.

So let's see what else is different on your system:

 * Does the above experiment work for you? "sudo hald; startx" -> does that give you mouse/keyboard or not?

 * If you boot with "text" added to the kernel command line, gdm is not started. If you start (a) gdm or (b) startx manually after a text-only boot, does it work (a1/b1) the first time, (a2/b2) the second time?

 * If the previous experiment still fails for you, please boot with "text" again, go to a VT, do

     sudo killall hald
     sudo hald --verbose=yes --daemon=no 2>&1 | tee /tmp/hal.log

   then, on a second VT, do "startx" or start gdm. Once X is started up, kill it again, Control-C the foreground hald on console 1, and attach /tmp/hal.log and /var/log/Xorg.0.log.

Thanks!

Ted Gould (ted) wrote :

On Tue, 2008-10-21 at 23:58 +0000, Bryce Harrington wrote:
> xserver: config/hal.c:
>
>     if (!libhal_ctx_set_dbus_connection(info->hal_ctx, info->system_bus)) {
>         LogMessage(X_ERROR, "config/hal: couldn't associate HAL context with bus\n");
>         goto out_ctx;
>     }
>     if (!libhal_ctx_init(info->hal_ctx, &error)) {
>         LogMessage(X_ERROR, "config/hal: couldn't initialise context: %s (%s)\n",
>                error.name, error.message);
>         goto out_ctx;
>     }
>
> libhal_ctx_init is failing; there are 5 conditions under which it
> returns false. Probably you want to install a debug version of hal,
> break on libhal_ctx_init(), and see which of the conditions are getting
> hit.

Okay, life gets weirder.

So I patched libhal to fill in the error item, as in the attached patch
(this should probably go upstream sometime). The error that is occurring
is that X can't find the name org.freedesktop.Hal. So I constructed the
same call that DBus is making and put it into my /etc/init.d/gdm, as
follows:

dbus-send --system --print-reply \
    --dest=org.freedesktop.DBus /org/freedesktop/DBus \
    org.freedesktop.DBus.NameHasOwner string:org.freedesktop.Hal

When that executes it returns True.

So, in summary, dbus-send can get the name before X starts, but then X
can't get it when it is started.
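
For reference, a sketch of how the NameHasOwner query could be wrapped in a polling loop inside an init script. This is a diagnostic aid only: in this report the query already returned True before X started and X still failed, so the loop is not a fix. The DBUS_SEND override variable and both function names are hypothetical, and the check assumes the reply contains the text "boolean true" when the name has an owner.

```shell
# Command used to query the bus; overridable for testing (hypothetical knob).
DBUS_SEND=${DBUS_SEND:-dbus-send}

# Succeeds if the system bus reports an owner for org.freedesktop.Hal.
hal_has_owner() {
    "$DBUS_SEND" --system --print-reply \
        --dest=org.freedesktop.DBus /org/freedesktop/DBus \
        org.freedesktop.DBus.NameHasOwner string:org.freedesktop.Hal 2>/dev/null |
        grep -q "boolean true"
}

# Poll up to $1 times, sleeping $2 seconds between attempts.
wait_for_hal() {
    tries=${1:-10}
    delay=${2:-1}
    while [ "$tries" -gt 0 ]; do
        hal_has_owner && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

A call such as `wait_for_hal 30 1 || echo "hal not on bus" >> somelogfile` before gdm starts would at least record whether the name was visible at that point.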

Ted Gould (ted) wrote : Re: [Bug 276857] Re: libhal_ctx_init() returns FALSE, causing Xserver to fail to set up keyboard/mouse

First thing to realize: when trying to follow these instructions, print
them BEFORE rebooting!

On Wed, 2008-10-22 at 15:02 +0000, Martin Pitt wrote:
> Experiment 1: Just "startx"
> -> I got the "config/hal: couldn't initialise context: (null) ((null))" in Xorg.0.log, and I did not have keyboard and mouse (I do not have an xorg.conf).
>
> So that demonstrates the "no hal" -> "no love" effect.

Ditto.

> Experiment 2: sudo hald; startx
>
> hald usually stays in the foreground until all the "coldplugging" is
> done, and then forks off into the background. This is more or less what
> the init script does as well. startx avoids all the gdm delays etc. This
> worked fine, I had keyboard and mouse, and Xorg.0.log didn't complain.

Works for me also.

So, the question is why this is different from boot. I tried increasing
the system load and flushing the disk cache. On separate VTs I ran:

    make -i -j 12
    find . -name "*" -exec cat {} \; &> /dev/null
    find . -name "*" -exec cat {} \; &> /dev/null

Things still work.

But this got me looking into why boot would be different. One thought
was that readahead was affecting things. So I changed
/etc/readahead/boot to include only /bin/bash. No effect. I
then changed it to include all of the fdi files in /usr/share/hal. No
effect.

> * If you boot with adding "text" to the kernel command line, gdm is not
> started. If you start (a) gdm or (b) startx manually after a text-only
> boot, does it work (a1/b1) the first time, (a2/b2) the second time?

It works the first time if booting without gdm and then starting it.

> * If the previous experiment still fails for you, please boot with
> "text" again, go to a VT, do
>
> sudo killall hald
> sudo hald --verbose=yes --daemon=no 2>&1 | tee /tmp/hal.log

hal.log is uninteresting. But attached.

I started looking at the dbus logs to see if there was anything
interesting. This involved running dbus-monitor on the system bus and
watching things come on and off. X is connecting and getting a name.
There's nothing weird going on there. There were no noticeable
differences between the two executions.

All in all, it seems to be something with the boot sequence
specifically. I'm not sure exactly how. DBus traffic? Network Manager
tasking HAL so it can't respond? I'm unsure why X believes that there
is no HAL.

Ted Gould (ted) wrote : I'm going insane

Okay, I can't recreate this anymore. After removing GDM from startup
and then adding it back in, I really can't do anything to make it happen
again. I've literally been trying for the last two hours. Uhm, I don't
know what to say. Thank you everyone for your help and attention. I
don't know what other debugging can be done without someone being able
to recreate it.

Changed in hal:
status: Triaged → Invalid
tags: added: iso-testing