Random ofono crash in network-registration

Bug #1234491 reported by Tony Espy
This bug affects 1 person
Affects: ofono (Ubuntu)
Status: Fix Released
Importance: Undecided
Assigned to: Tony Espy

Bug Description

Dave Morley reported a couple of bugs today with mobile data connections dropping after a few hours. His testing was done with system image #70 on a maguro phone. He was also running a test version of NetworkManager (version: to-be-filled-in) that had fixes to the mobile-data reconnect logic.

While gathering data for these bugs, he noticed an ofono crash file in /var/crash.

Here's the decoded backtrace:

root@ubuntu-phablet:/var/crash# ./dbg-runner.sh _usr_sbin_ofonod.0.crash
GNU gdb (GDB) 7.6-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/ofonod...Reading symbols from /usr/lib/debug/.build-id/fd/50bdbfc1dbd42c130f5b3c962a0e3ae44d2b89.debug...done.
done.
[New LWP 1029]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `ofonod -p ril,rilmodem,provision,mbpi,nettime'.
Program terminated with signal 11, Segmentation fault.
#0 ril_register_cb (message=0xa44d48, user_data=0xa43290)
    at drivers/rilmodem/network-registration.c:371
371 drivers/rilmodem/network-registration.c: No such file or directory.
#0 ril_register_cb (message=0xa44d48, user_data=0xa43290)
    at drivers/rilmodem/network-registration.c:371
        cbd = 0xa43290
        cb = 0x74215 <register_callback>
        nd = 0x0
        error = {type = OFONO_ERROR_TYPE_NO_ERROR, error = 0}
#1 0x0001aa04 in handle_response (message=0xa44d48, p=0xa3d200)
    at gril/gril.c:366
        count = 12
        i = <optimized out>
        req = 0xa46000
        found = 1
#2 dispatch (message=0xa44d48, p=<optimized out>) at gril/gril.c:509
        data_len = <optimized out>
        unsolicited_field = <optimized out>
        id_num_field = <optimized out>
        bufp = <optimized out>
        datap = <optimized out>
#3 new_bytes (rbuf=0xa35790, user_data=0xa3d200) at gril/gril.c:607
        rbytes = <optimized out>
        p = 0xa3d200
        len = 16
        wrap = 16
        buf = 0xa3d2d0 "c"
        __FUNCTION__ = "new_bytes"
#4 0x0001b4ae in received_data (channel=0xa3d090, cond=G_IO_IN,
    data=0xa3d270) at gril/grilio.c:124
        buf = <optimized out>
        io = <optimized out>
        status = G_IO_STATUS_AGAIN
        rbytes = 0
        toread = <optimized out>
        total_read = 16
        read_count = 2
#5 0x40184b6a in g_main_context_dispatch ()
   from /lib/arm-linux-gnueabihf/libglib-2.0.so.0
No symbol table info available.
#6 0x40184dca in ?? () from /lib/arm-linux-gnueabihf/libglib-2.0.so.0
No symbol table info available.
#7 0x40184dca in ?? () from /lib/arm-linux-gnueabihf/libglib-2.0.so.0
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0x401112b0 (LWP 1029)):
#0 ril_register_cb (message=0xa44d48, user_data=0xa43290)
    at drivers/rilmodem/network-registration.c:371
        cbd = 0xa43290
        cb = 0x74215 <register_callback>
        nd = 0x0
        error = {type = OFONO_ERROR_TYPE_NO_ERROR, error = 0}
#1 0x0001aa04 in handle_response (message=0xa44d48, p=0xa3d200)
    at gril/gril.c:366
        count = 12
        i = <optimized out>
        req = 0xa46000
        found = 1
#2 dispatch (message=0xa44d48, p=<optimized out>) at gril/gril.c:509
        data_len = <optimized out>
        unsolicited_field = <optimized out>
        id_num_field = <optimized out>
        bufp = <optimized out>
        datap = <optimized out>
#3 new_bytes (rbuf=0xa35790, user_data=0xa3d200) at gril/gril.c:607
        rbytes = <optimized out>
        p = 0xa3d200
        len = 16
        wrap = 16
        buf = 0xa3d2d0 "c"
        __FUNCTION__ = "new_bytes"
#4 0x0001b4ae in received_data (channel=0xa3d090, cond=G_IO_IN,
    data=0xa3d270) at gril/grilio.c:124
        buf = <optimized out>
        io = <optimized out>
        status = G_IO_STATUS_AGAIN
        rbytes = 0
        toread = <optimized out>
        total_read = 16
        read_count = 2
#5 0x40184b6a in g_main_context_dispatch ()
   from /lib/arm-linux-gnueabihf/libglib-2.0.so.0
No symbol table info available.
#6 0x40184dca in ?? () from /lib/arm-linux-gnueabihf/libglib-2.0.so.0
No symbol table info available.
#7 0x40184dca in ?? () from /lib/arm-linux-gnueabihf/libglib-2.0.so.0
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Tony Espy (awe)
Changed in ofono (Ubuntu):
assignee: nobody → Tony Espy (awe)
Tony Espy (awe) wrote :

I haven't been able to reproduce this yet, but will try again tomorrow.

Also, based on inspecting the code, this might be related to the delayed-registration technique used by most of the rilmodem modules.

Tony Espy (awe)
Changed in ofono (Ubuntu):
status: New → In Progress
Tony Espy (awe) wrote :

OK, so after staring at the code for a few hours, I found the bug.

The crash is at line 371 of drivers/rilmodem/network-registration.c, in the function 'ril_register_cb()':

g_ril_print_response_no_args(nd->ril, message);

The stack trace shows 'nd' being 0x0, which indeed will cause a segfault.

'nd' is set from the value of cbd->user.

This callback is triggered by two functions:

 - 'ril_register_auto()'

 - 'ril_register_manual()'

The first function properly sets the user data of the callback struct ('cbd'); the second function does not. So if ril_register_manual() is ever called (most likely from the ubuntu-system-settings cellular panel), the callback will dereference a NULL 'nd' and crash.
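To illustrate the failure mode described above, here is a minimal sketch in C. It is not the actual ofono code: the struct layouts, cb_data_new_with_user() and register_response() are hypothetical stand-ins, and the NULL check is only one possible way to make the handler fail safely. The idea is that a single constructor fills in every field of the callback data, so neither the automatic nor the manual registration path can leave 'user' unset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the rilmodem types; not the real ofono definitions. */
struct netreg_data {
	int ril_handle;		/* placeholder for the GRil pointer (nd->ril) */
};

typedef void (*register_cb_t)(int ok, void *data);

struct cb_data {
	register_cb_t cb;	/* completion callback */
	void *data;		/* argument passed to cb */
	void *user;		/* driver data; must point at a netreg_data */
};

/* Allocate callback data with every field set, so no caller can omit 'user'. */
static struct cb_data *cb_data_new_with_user(register_cb_t cb, void *data,
						struct netreg_data *nd)
{
	struct cb_data *cbd;

	assert(nd != NULL);

	cbd = calloc(1, sizeof(*cbd));
	if (cbd == NULL)
		return NULL;

	cbd->cb = cb;
	cbd->data = data;
	cbd->user = nd;

	return cbd;
}

/* Sketch of the response handler: 0 on success, -1 if 'user' was never set. */
static int register_response(struct cb_data *cbd)
{
	struct netreg_data *nd = cbd->user;

	if (nd == NULL)
		return -1;	/* the crashing code dereferenced nd at this point */

	(void) nd->ril_handle;	/* stands in for the nd->ril access on line 371 */

	if (cbd->cb != NULL)
		cbd->cb(1, cbd->data);

	return 0;
}
```

With this shape, both registration paths would call cb_data_new_with_user() rather than filling in the struct by hand, which matches the intent of always setting the netreg data in the callback data.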

Tony Espy (awe) wrote :

@Dave

Any chance you were messing around with the cellular system setting and trying to configure manual operator selection?

Tony Espy (awe) wrote :

Also, it appears the ability to set manual operator configuration is dependent on a particular file contained on the SIM:

* According to CPHS 4.2, EFcsp is an array of two-byte service
* entries, each consisting of a one byte service group
* identifier followed by 8 bits; each bit is indicating
* availability of a specific service or feature.
*
* The PLMN mode bit, if present, indicates whether manual
* operator selection should be disabled or enabled. When
* unset, the device is forced to automatic mode; when set,
* manual selection is to be enabled. The latter is also the
* default.

It appears this bit is unset on my phone, so I can't set the mode to manual by trying to force operator registration. I will modify the code to ignore this bit later, but will first concentrate on pushing a patch for the bug described in comment #2.
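As a rough sketch of the check the quoted comment describes, the function below walks the two-byte EFcsp entries and reads the PLMN mode bit. The function name is hypothetical, and the constants are assumptions rather than values taken from this report: 0xC0 for the "Value Added Services" service group and 0x80 for the PLMN mode bit, as I understand CPHS 4.2.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: CPHS "Value Added Services" group id and PLMN mode bit. */
#define CSP_GROUP_VALUE_ADDED_SERVICES	0xC0
#define CSP_PLMN_MODE_BIT		0x80

/*
 * Returns 1 if manual operator selection is allowed, 0 if the SIM forces
 * automatic mode. Manual selection is the default when no entry is present.
 */
static int csp_manual_selection_allowed(const unsigned char *data, size_t len)
{
	size_t i;

	assert(data != NULL || len == 0);

	/* EFcsp is a sequence of two-byte entries: group id, then 8 service bits. */
	for (i = 0; i + 1 < len; i += 2) {
		if (data[i] != CSP_GROUP_VALUE_ADDED_SERVICES)
			continue;

		return (data[i + 1] & CSP_PLMN_MODE_BIT) ? 1 : 0;
	}

	return 1;
}
```

A SIM whose entry carries the group id with the bit cleared would report 0 here, which is consistent with manual registration being refused on the phone described above.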

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ofono - 1.12+bzr6836-0ubuntu1

---------------
ofono (1.12+bzr6836-0ubuntu1) saucy; urgency=low

  [ Ricardo Salveti de Araujo ]
  * Changing packaging tree to make it part of the daily CI jobs
    - Changing source format to 1.0 (required by CI)
    - Creating bzr bd compatible branch, with the same content as available in
      the previous package

  [ Tony Espy ]
  - Ensure that *netreg_data is always set in callback data (LP: #1234491)
  - Re-factor rilmodem initialization code to enable set online/offline
    (LP: #1210502)
  - Fixing parcel parsing when probing for mute.

  [ Mathieu Trudel-Lapierre ]
  * Make the package use bzr-builddeb split-mode to properly build the source
    tarball.
 -- Mathieu Trudel-Lapierre <email address hidden> Tue, 08 Oct 2013 10:47:02 -0400

Changed in ofono (Ubuntu):
status: In Progress → Fix Released