Sometimes something happens on dbus which makes apps using it crash

Bug #74946 reported by Sebastien Bacher
Affects              Status         Importance   Assigned to   Milestone
D-Bus                Fix Released   Medium
dbus-glib (Fedora)   Fix Released   Medium
dbus-glib (Ubuntu)   Fix Released   High         Unassigned

Bug Description

I was working on a gdm bug from a VT; after a reboot I got apport crashes for epiphany, gaim, gnome-session, update-manager, xchat-gnome and evolution-alarm-notify. According to the backtraces the crash happens in dbus-glib. One example backtrace:

Core was generated by `epiphany https://launchpad.net/bugs/74895'.
Program terminated with signal 11, Segmentation fault.
#0 0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0xb7decdf0 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7f542b6 in nsProfileLock::FatalSignalHandler (signo=11) at nsProfileLock.cpp:206
#3 <signal handler called>
#4 0xb7db390c in find_name_in_info (a=0x0, b=0x8847944) at dbus-gproxy.c:496
#5 0xb70a242e in IA__g_slist_find_custom (list=0x8ea4260, data=0x8847944, func=0xb7db38f0 <find_name_in_info>)
    at gslist.c:389
#6 0xb7db7d3c in dbus_g_proxy_manager_filter (connection=0x81ad438, message=0x8847200, user_data=0x88484a8)
    at dbus-gproxy.c:720
#7 0xb7d81a92 in dbus_connection_dispatch (connection=0x81ad438) at dbus-connection.c:4267
#8 0xb7daf57d in message_queue_dispatch (source=0x81aee38, callback=0, user_data=0x0) at dbus-gmain.c:101
#9 0xb708a6f2 in IA__g_main_context_dispatch (context=0x81879b8) at gmain.c:2045
#10 0xb708d6cf in g_main_context_iterate (context=0x81879b8, block=1, dispatch=1, self=0x815e4d8) at gmain.c:2677

Bug #74682 is another example of this crash.

Revision history for this message
In , Kimmo Hämäläinen (kimmo-hamalainen) wrote :

Created an attachment (id=6920)
proposed patch

Revision history for this message
In , Kimmo Hämäläinen (kimmo-hamalainen) wrote :

(From update of attachment 6920)
This turned out to be a buggy patch. I'll attach a new one when it's been
tested.

Revision history for this message
In , Kimmo Hämäläinen (kimmo-hamalainen) wrote :

Created an attachment (id=7401)
proposed patch

Here is the tested, fixed patch.

Revision history for this message
In , Rob Taylor (robtaylor) wrote :

I'd like to look into the 'neater' option mentioned in the patch.

Revision history for this message
In , Kimmo Hämäläinen (kimmo-hamalainen) wrote :

It seems that unassociate_proxies() should also remove the proxy from the proxy
list before adding it to the data->destroyed list, because the caller destroys
the proxy.

Revision history for this message
In , Kimmo Hämäläinen (kimmo-hamalainen) wrote :

Created an attachment (id=7809)
proposed patch

Here is a modified patch that applies against 0.72.

Revision history for this message
In , Ray (ray-redhat-bugs) wrote :

so I'm not exactly sure what I was doing, but I noticed that my autohide panel
wouldn't come down, so I switched to an open terminal and typed ps -ef to see this:

rstrode 5665 1 0 14:03 ? 00:00:00 /usr/libexec/gnome_segv gnome-at-properties 11 2.16.0
rstrode 5666 5078 0 14:03 ? 00:00:00 /usr/libexec/gnome_segv evolution-alarm-notify 11 2.8.0
rstrode 5667 4902 0 14:03 ? 00:00:00 /usr/libexec/gnome_segv gswitchit 11 0
rstrode 5668 4888 0 14:03 ? 00:00:00 /usr/libexec/gnome_segv gnome-power-manager 11 2.16.0
rstrode 5664 1 0 14:03 ? 00:00:00 /usr/libexec/gnome_segv gnome-panel 11 2.16.0
rstrode 5669 4733 0 14:03 ? 00:00:00 /usr/libexec/gnome_segv gnome-session 11 2.16.0
rstrode 5670 5665 0 14:03 ? 00:00:00 /usr/bin/bug-buddy --appname=gnome-at-properties --pid=5470 --package-ver=(null)
rstrode 5671 5668 0 14:03 ? 00:00:00 /usr/bin/bug-buddy --appname=gnome-power-manager --pid=4888 --package-ver=(null)
rstrode 5672 5666 0 14:03 ? 00:00:00 /usr/bin/bug-buddy --appname=evolution-alarm-notify --pid=5078 --package-ver=(null)
rstrode 5673 5664 0 14:03 ? 00:00:00 /usr/bin/bug-buddy --appname=gnome-panel --pid=4834 --package-ver=(null)
rstrode 5674 5667 0 14:03 ? 00:00:00 /usr/bin/bug-buddy --appname=gswitchit --pid=4902 --package-ver=(null)
rstrode 5675 5669 0 14:03 ? 00:00:00 /usr/bin/bug-buddy --appname=gnome-session --pid=4733 --package-ver=(null)

I tried to gdb attach to gnome-session and I got this:

#0 0x0084e402 in __kernel_vsyscall ()
No symbol table info available.
#1 0x4956ac93 in __waitpid_nocancel () from /lib/libpthread.so.0
No symbol table info available.
#2 0x45189cf6 in gnome_gtk_module_info_get () from /usr/lib/libgnomeui-2.so.0
No symbol table info available.
#3 <signal handler called>
No symbol table info available.
#4 0x450adf7c in find_name_in_info (a=0x0, b=0x8ebb84c) at dbus-gproxy.c:496
No locals.
#5 0x4982416e in g_slist_find_custom () from /lib/libglib-2.0.so.0
No symbol table info available.
#6 0x450b16cc in dbus_g_proxy_manager_filter (connection=0x8ebb178,
    message=0x8ebb6b8, user_data=0x8ebd5e0) at dbus-gproxy.c:716
        name = 0x8ebb84c "org.gnome.YelpService"
        prev_owner = 0x8ebb868 ":1.22"
        new_owner = 0x8ebb874 ""
        derr = {name = 0x0, message = 0x0, dummy1 = 1, dummy2 = 0, dummy3 = 1,
  dummy4 = 1, dummy5 = 1, padding1 = 0xbfcacc18}
        manager = <value optimized out>
        __PRETTY_FUNCTION__ = "dbus_g_proxy_manager_filter"
#7 0x45050f05 in dbus_connection_dispatch () from /lib/libdbus-1.so.3
No symbol table info available.
#8 0x450a9ddd in message_queue_dispatch (source=0x8ebcb48, callback=0,
    user_data=0x0) at dbus-gmain.c:113
        connection = (DBusConnection *) 0x8ebb178
#9 0x4980c342 in g_main_context_dispatch () from /lib/libglib-2.0.so.0
No symbol table info available.
#10 0x4980f31f in g_main_context_check () from /lib/libglib-2.0.so.0
No symbol table info available.
#11 0x4980f6c9 in g_main_loop_run () from /lib/libglib-2.0.so.0
No symbol table info available.
#12 0x49c22be4 in gtk_main () from /usr/lib/libgtk-x11-2.0.s...


Revision history for this message
In , Ray (ray-redhat-bugs) wrote :

In fact, it looks like the backtrace is missing the
dbus_g_proxy_manager_replace_name_owner frame.

The crash is triggered by this block of code:

   │707 else │
   │708 { │
   │709 DBusGProxyNameOwnerInfo *info; │
   │710 GSList *link; │
   │711 │
   │712 /* Name owner changed or deleted */ │
   │713 │
   │714 names = g_hash_table_lookup (manager->owner_names, prev_owner│
   │715 │
  >│716 link = g_slist_find_custom (names, name, find_name_in_info); │
   │717 │
where name is "org.gnome.YelpService" and names looks bogus:

(gdb) p *(struct _GSList *) 0x8ed8f48
$18 = {data = 0x0, next = 0x0}
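
To make the failure mode concrete, here is a minimal sketch of what frame #4 appears to be doing. It does not use GLib: Node, NameInfo and find_custom are hypothetical stand-ins for GSList, DBusGProxyNameOwnerInfo and g_slist_find_custom, and the body of the comparator is an assumption (the backtrace only shows that it was called with a=0x0). The "safe" variant is for illustration only and is not the patch discussed in this report.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for DBusGProxyNameOwnerInfo. */
typedef struct { const char *name; } NameInfo;

/* Hypothetical stand-in for GSList. */
typedef struct Node { void *data; struct Node *next; } Node;

typedef int (*CompareFn)(const void *element, const void *key);

/* Same contract as g_slist_find_custom: return the first node whose
 * element compares equal (0) to key, or NULL if there is none. */
static Node *find_custom(Node *list, const void *key, CompareFn cmp)
{
    for (Node *n = list; n != NULL; n = n->next)
        if (cmp(n->data, key) == 0)
            return n;
    return NULL;
}

/* Assumed shape of the crashing comparator: it reads info->name
 * unconditionally, so a NULL element faults (a=0x0 in frame #4). */
static int find_name_in_info_unsafe(const void *a, const void *b)
{
    const NameInfo *info = a;
    return strcmp(info->name, (const char *)b);
}

/* Defensive variant: a NULL element simply never matches. */
static int find_name_in_info_safe(const void *a, const void *b)
{
    const NameInfo *info = a;
    if (info == NULL || info->name == NULL)
        return 1;
    return strcmp(info->name, (const char *)b);
}
```

With a list shaped like the one printed above ({data = 0x0, next = 0x0}), the unsafe comparator dereferences NULL on the first node, while the safe one lets the search return "not found". A NULL check like this only hides the symptom; the list should not contain a NULL element in the first place.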

Revision history for this message
In , Ray (ray-redhat-bugs) wrote :

Also, note this function (dbus_g_proxy_manager_replace_name_owner) has another
code path that could conceivably add an element to the names list with null data:

   │716 link = g_slist_find_custom (names, name, find_name_in_info); │
   │717 │
   │718 info = NULL; │
   │719 if (link != NULL) │
   │720 { │
                  (fill in info here ...)
   │727 } |
   │728 │
   │729 if (new_owner[0] == '\0') │
   │730 { │
                   (...)
   │748 } │
   │749 else │
   │750 { │
   │751 insert_nameinfo (manager, new_owner, info); │
   │752 }

and insert_nameinfo does
   │563 names = g_slist_append (names, info); │

I have no idea if it hit this code path earlier or not, I just noticed it when
snooping around.
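
A rough sketch of that code path and of one way to guard it, assuming the owner lists behave like plain singly linked lists. Nothing here is dbus-glib code: Node, NameInfo, list_append, list_take and replace_name_owner are hypothetical names, and the guard (skip the insert when no info was found) is just one option.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for DBusGProxyNameOwnerInfo and GSList. */
typedef struct { const char *name; } NameInfo;
typedef struct Node { NameInfo *info; struct Node *next; } Node;

/* Append info to *list; returns 0 on allocation failure, 1 on success. */
static int list_append(Node **list, NameInfo *info)
{
    Node *node = malloc(sizeof *node);
    if (node == NULL)
        return 0;
    node->info = info;
    node->next = NULL;
    while (*list != NULL)
        list = &(*list)->next;
    *list = node;
    return 1;
}

/* Unlink and return the info for name, or NULL if it is not tracked. */
static NameInfo *list_take(Node **list, const char *name)
{
    for (; *list != NULL; list = &(*list)->next) {
        Node *node = *list;
        if (node->info != NULL && strcmp(node->info->name, name) == 0) {
            NameInfo *info = node->info;
            *list = node->next;
            free(node);
            return info;
        }
    }
    return NULL;
}

/* The name moved from one owner's list to another's. Without the NULL
 * check, an untracked name would append a node whose info is NULL,
 * which is the kind of element the comparator later trips over. */
static int replace_name_owner(Node **from, Node **to, const char *name)
{
    NameInfo *info = list_take(from, name);
    if (info == NULL)
        return 0;               /* untracked name: nothing to insert */
    return list_append(to, info);
}
```

Calling replace_name_owner for a name that was never tracked leaves both lists untouched, so a later search over the new owner's list never meets a NULL element.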

Revision history for this message
In , Ray (ray-redhat-bugs) wrote :

*** Bug 215952 has been marked as a duplicate of this bug. ***

Revision history for this message
In , RHEL (rhel-redhat-bugs) wrote :

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release. Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release. This request is not yet committed for
inclusion.

Revision history for this message
In , Matthias (matthias-redhat-bugs) wrote :

It also looks to me like info might be leaked in that function you cite:

Here we remove info from the owner_names table:

      if (link != NULL)
        {
          info = link->data;

          names = g_slist_delete_link (names, link);

          if (names == NULL)
            g_hash_table_remove (manager->owner_names, prev_owner);
        }

And here we do nothing with it, assuming new_owner is not empty:

      if (new_owner[0] == '\0')
        {
          DBusGProxyUnassociateData data;
          GSList *tmp;

          data.name = name;
          data.destroyed = NULL;

          /* A service went away, we need to unassociate proxies */
          g_hash_table_foreach (manager->proxy_lists,
                                unassociate_proxies, &data);

          UNLOCK_MANAGER (manager);

          for (tmp = data.destroyed; tmp; tmp = tmp->next)
            dbus_g_proxy_destroy (tmp->data);
          g_slist_free (data.destroyed);

          LOCK_MANAGER (manager);
        }

I think info may need freeing in that case ?
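
As a sketch of what freeing it could look like, assuming info owns a heap-allocated copy of the name. The NameInfo layout, name_info_new, name_info_free, settle_unlinked_info and the live_infos counter are all hypothetical and exist only to make the ownership rule visible; this is not the attached patch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; the snippets above only show the type's name. */
typedef struct { char *name; } NameInfo;

static int live_infos = 0;  /* allocation counter, for illustration only */

static NameInfo *name_info_new(const char *name)
{
    NameInfo *info = malloc(sizeof *info);
    if (info == NULL)
        return NULL;
    size_t len = strlen(name) + 1;
    info->name = malloc(len);
    if (info->name == NULL) {
        free(info);
        return NULL;
    }
    memcpy(info->name, name, len);
    live_infos++;
    return info;
}

static void name_info_free(NameInfo *info)
{
    if (info == NULL)
        return;
    free(info->name);
    free(info);
    live_infos--;
}

/* Called once info has been unlinked from the previous owner's list.
 * Returns the info the caller should re-insert under new_owner, or
 * NULL when there is nothing left to insert. */
static NameInfo *settle_unlinked_info(NameInfo *info, const char *new_owner)
{
    if (new_owner[0] == '\0') {
        /* The name has no owner any more, so nothing refers to info. */
        name_info_free(info);
        return NULL;
    }
    return info;
}
```

The point of routing both branches through one function is that every unlinked info is either handed back for re-insertion or freed, so neither branch can silently drop it.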

Revision history for this message
In , Matthias (matthias-redhat-bugs) wrote :

Created attachment 142259
possible patch

Here is an untested patch that addresses both issues.
Does that look reasonable, John ?

Revision history for this message
In , Matthias Clasen (mclasen) wrote :

Don't you want to create a new info in the else part ?

Revision history for this message
In , Rstrode (rstrode) wrote :
Revision history for this message
In , Matthias Clasen (mclasen) wrote :

The RH bug has my alternative patch

Revision history for this message
In , Matthias (matthias-redhat-bugs) wrote :

John says the patch looks good, and there are very similar patches in upstream bugzilla.

Revision history for this message
In , David (david-redhat-bugs) wrote :

Thanks Matthias, I've included this patch

 * Tue Nov 28 2006 David Zeuthen <email address hidden> - 0.70-5
 - Add dbus-glib-0.70-fix-info-leak.patch
 - Resolves: #216034

Package is partially built in Brew (waiting on s390x).

Ray, can you verify that this package works? Thanks.

Revision history for this message
In , Ray (ray-redhat-bugs) wrote :

I don't have a reliable way to reproduce the problem and it's only happened to
me a few times, unfortunately.

Revision history for this message
In , Ray (ray-redhat-bugs) wrote :

I wonder if we could write a test case that just takes control of a bus name and
releases it over and over again.

Revision history for this message
In , Kimmo Hämäläinen (kimmo-hamalainen) wrote :

(In reply to comment #7)
> Don't you want to create a new info in the else part ?

I'm not sure, because of the comment "Name owner changed or deleted" in the
code: if the info was not found, doesn't that mean that nothing was changed
or deleted? I.e. if the info is not found, the name owner is not related to
GProxies. Or am I confused?

Revision history for this message
In , Otaylor-redhat (otaylor-redhat) wrote :

Just wanted to note here that this is regularly crashing gnome-session,
since it blanket listens to all NameOwnerChanged signals; I don't see
a GNOME bugzilla issue for this, but there are probably some dups there
if you looked hard enough.

I think that Kimmo is right and you don't want to insert a new info
object, the message was simply not related to any GProxy objects that
the library was tracking.

Revision history for this message
In , Owen (owen-redhat-bugs) wrote :

You don't want to take control and release it, instead you want to
go from one owner to another owner. Calling :

 mugshot --replace

a few times (it won't quit the old one with mugshot < 1.1.27, but it
still replaces the D-Bus name) should take down your session pretty reliably.

According to a conversation that Havoc and I had, the patch in the
upstream bug report is most likely more correct than the one here ...
the proxy code in D-Bus should simply not care about changes in the
name owner for names it doesn't have a proxy for, so inserting a
dummy info entry doesn't make sense. (We didn't review it line by
line.)

Revision history for this message
In , Rob Taylor (robtaylor) wrote :

Yep, that patch looks right to me, I'll prep a release with it in asap.

Changed in dbus-glib:
importance: Undecided → High
Revision history for this message
Sebastien Bacher (seb128) wrote :

bt full from apport-retrace:

 #1 0xb7decdf0 in raise () from /lib/tls/i686/cmov/libc.so.6
 No symbol table info available.
 #2 0xb7f542b6 in nsProfileLock::FatalSignalHandler (signo=11) at nsProfileLock.cpp:206
        unblock_sigs = {__val = {1024, 0 <repeats 31 times>}}
        oldact = <value optimized out>
 #3 <signal handler called>
 No symbol table info available.
 #4 0xb7db390c in find_name_in_info (a=0x0, b=0x8847944) at dbus-gproxy.c:496
 No locals.
 #5 0xb70a242e in IA__g_slist_find_custom (list=0x8ea4260, data=0x8847944, func=0xb7db38f0 <find_name_in_info>)
     at gslist.c:389
        __PRETTY_FUNCTION__ = "IA__g_slist_find_custom"
 #6 0xb7db7d3c in dbus_g_proxy_manager_filter (connection=0x81ad438, message=0x8847200, user_data=0x88484a8)
     at dbus-gproxy.c:720
        name = 0x8847944 "org.gnome.YelpService"
        prev_owner = 0x8847960 ":1.58"
        new_owner = 0x884796c ""
        derr = {name = 0x0, message = 0x0, dummy1 = 1, dummy2 = 1, dummy3 = 0, dummy4 = 0, dummy5 = 1,
   padding1 = 0x812bd7e}
        manager = <value optimized out>
        __PRETTY_FUNCTION__ = "dbus_g_proxy_manager_filter"
 #7 0xb7d81a92 in dbus_connection_dispatch (connection=0x81ad438) at dbus-connection.c:4267
        filter = (DBusMessageFilter *) 0x8db8f06
        next = (DBusList *) 0x8847574
        message = (DBusMessage *) 0x8847200
        link = <value optimized out>
        filter_list_copy = (DBusList *) 0x8847550
        message_link = (DBusList *) 0x8847508
        result = DBUS_HANDLER_RESULT_NOT_YET_HANDLED
        status = <value optimized out>
        __FUNCTION__ = "dbus_connection_dispatch"
 #8 0xb7daf57d in message_queue_dispatch (source=0x81aee38, callback=0, user_data=0x0) at dbus-gmain.c:101
        connection = (DBusConnection *) 0x81ad438
 #9 0xb708a6f2 in IA__g_main_context_dispatch (context=0x81879b8) at gmain.c:2045
 No locals.

Revision history for this message
Sebastien Bacher (seb128) wrote :

Upstream pointed to a similar bug which has a patch

Revision history for this message
Sebastian Dröge (slomo) wrote :

This should be fixed by dbus-glib 0.72-0ubuntu2 which was uploaded some seconds ago...

Changed in dbus-glib:
status: Unconfirmed → Fix Released
Changed in dbus:
status: Unknown → In Progress
Changed in dbus-glib:
status: Unknown → Fix Committed
Revision history for this message
In , RHEL (rhel-redhat-bugs) wrote :

A package has been built which should help the problem described in
this bug report. This report is therefore being closed with a resolution
of CURRENTRELEASE. You may reopen this bug report if the solution does
not work for you.

Revision history for this message
In , Rob Taylor (robtaylor) wrote :

Fixed in git head, will be in dbus-glib 0.73

Changed in dbus:
status: In Progress → Fix Released
Changed in dbus-glib:
status: Fix Committed → Fix Released
Changed in dbus:
importance: Unknown → Medium
Changed in dbus:
importance: Medium → Unknown
Changed in dbus:
importance: Unknown → Medium
Changed in dbus-glib (Fedora):
importance: Unknown → Medium