Kernel crash and reboot when accessing video device

Bug #1187189 reported by Stéphane Graber
This bug affects 2 people
Affects                      Status        Importance  Assigned to  Milestone
linux-mako (Ubuntu)          Confirmed     High        Unassigned
lxc-android-config (Ubuntu)  Fix Released  Undecided   Unassigned
systemd (Ubuntu)             Invalid       Undecided   Unassigned

Bug Description

While working on the container flip, we noticed that starting udev and calling udevadm trigger would cause the device to reboot instantly.

After some effort, we tracked this down to video4linux and specifically the 60-persistent-v4l rules file. Looking into it, we see it's calling v4l_id and that's what's causing the reboot.
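
To confirm which rules files invoke a helper such as v4l_id, searching the rules directory is usually enough. A minimal sketch, assuming the stock rules live under /lib/udev/rules.d; the function name rules_using and its optional directory argument are illustrative and not part of udev:

```shell
# List udev rules files that mention a given helper program.
# rules_using is a hypothetical helper name; the default directory
# is the stock udev rules location seen in this report.
rules_using() {
    helper="$1"
    dir="${2:-/lib/udev/rules.d}"
    # -l prints only the names of matching files;
    # -F treats the pattern as a fixed string, not a regex
    grep -lF -- "$helper" "$dir"/*.rules 2>/dev/null
}

# Example: rules_using v4l_id
```

On an affected image this would be expected to name 60-persistent-v4l.rules, which is the file moved aside in the steps below.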

Here are some basic reproduction steps for saucy post-container flip:
root@android:/ # mv /lib/udev/rules.d/60-persistent-v4l.rules /root
root@android:/ # start udev
udev start/running, process 1016
root@android:/ # start udevtrigger
udevtrigger stop/waiting
root@android:/ # cat /dev/video38
--- reboot ---

It looks like any attempt to open the video device causes the panic and reboot.

Unfortunately I don't have any way to get a console or any kernel debug information, so all I can tell is that there's something very wrong going on with that video device.

Steve Langasek (vorlon)
Changed in linux-mako (Ubuntu):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-mako (Ubuntu):
status: New → Confirmed
Oliver Grawert (ogra) wrote :

to debug this, boot into recovery right after the crash, log in via adb and issue:
cat /proc/last_kmsg

that should give you the oops (if there is one)
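
Since the device reboots right away, it can help to save the log on the host and search it there. A rough sketch, assuming adb is on the host's PATH and the device is in recovery; show_oops is a hypothetical helper name, and the match patterns are just the strings a typical ARM oops prints:

```shell
# Print the lines of a saved kernel log from the first oops marker onward.
# show_oops is an illustrative name, not an existing tool.
show_oops() {
    # Set a flag on the first matching line, then print every line
    # while the flag is set (including the matching line itself).
    awk '/Unable to handle kernel|Internal error: Oops/ { found = 1 } found' "$1"
}

# Typical use from the host (assumes the device is reachable over adb):
#   adb shell cat /proc/last_kmsg > last_kmsg.txt
#   show_oops last_kmsg.txt
```

If the log contains no oops, the function prints nothing.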

Steve Langasek (vorlon) wrote :

Here's the stacktrace from /proc/last_kmsg. This is clearly another knock-on problem from udev (not ueventd) handling the firmware requests.

We may want to disable /lib/udev/rules.d/50-firmware.rules on these images as a workaround.

[13793.644710]
[13793.644710] err: request_firmware for vidc_1080p.fw error -2
[13793.644863] <3>wfd: Failed to load video encoder firmware: 1
[13793.644985] mdp4_writeback_terminate called without stopping
[13793.645076] Unable to handle kernel NULL pointer dereference at virtual address 0000000d
[13793.645229] pgd = ebdf8000
[13793.645290] [0000000d] *pgd=aa238831, *pte=00000000, *ppte=00000000
[13793.645626] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[13793.645717] Modules linked in:
[13793.645901] CPU: 1 Not tainted (3.4.0-1-mako #7-Ubuntu)
[13793.645992] PC is at do_sys_open+0x108/0x178
[13793.646084] LR is at path_openat+0x36c/0x388
[13793.646236] pc : <c0145954> lr : <c0154fa4> psr: 00000013
[13793.646236] sp : ecadbf58 ip : ecadbe78 fp : ecadbf94
[13793.646419] r10: 00000000 r9 : ecada000 r8 : c000e2a8
[13793.646511] r7 : ffffff9c r6 : ebd2e000 r5 : 00000001 r4 : 00000003
[13793.646633] r3 : 00000000 r2 : 00000000 r1 : 00010000 r0 : 00000001
[13793.646725] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[13793.646847] Control: 10c5787d Table: ac7f806a DAC: 00000015
[13793.646969]
[13793.646969] PC: 0xc01458d4:
[13793.647121] 58d4 e3832c02 13833c06 e50b2028 150b3028 e31c0901 e1a00001 03a05000 13a05002
[13793.647854] 58f4 e31c0902 03855001 eb002f75 e3700a01 e1a06000 81a04000 8a000028 e1a01004
[13793.648403] init: Handling graphics-device-added event
[13793.648647]
[13793.648708] 5914 e3a00000 eb006ce7 e2504000 ba000021 e1a03005 e1a00007 e1a01006 e24b2034
[13793.649441] 5934 eb003ddb e3700a01 e1a05000 9a000003 e1a00004 e1a04005 ebfff9e3 ea000015
[13793.650112] 5954 e590100c e2807008 e1a00007 e591a020 e1da80b0 e2088a0f e3580901 03a08181
[13793.650814] 5974 13a08020 e1a02008 eb00eb1c e3a03000 e1a01008 e58d3000 e58d3004 e1a0000a
[13793.651547] 5994 e1a02007 e3a03001 eb00ea32 e1a00004 e1a01005 ebfffc03 e1a00006 eb002f51
[13793.652279] 59b4 e1a00004 e24bd024 e89dadf0 c0aeaadb e1a0c00d e92dd800 e24cb004 e52de004
[13793.652951]
[13793.652951] LR: 0xc0154f24:
[13793.653164] 4f24 e1a03008 ebfffce3 e1a05000 e51b0030 e3760a01 e5903020 8a000006 e5933018
[13793.653866] 4f44 e5933014 e3530000 0a000002 e1a01004 e1a02006 e12fff33 e24b0034 ebffee8b
[13793.654568] 4f64 e3550000 0affff66 e5943014 e3530000 0a000004 e5943020 e3130a02 1a000001
[13793.655301] 4f84 e2840014 ebffee81 e51b0040 e3500000 0a000000 ebffcd15 e1a00004 ebffff13
[13793.656033] 4fa4 ea000002 e1a05000 eaffffee e3e05016 e1a00005
[13793.656430] init: Handling firmware-device-added event
[13793.656644] e24bd028 e89daff0 e1a0c00d
[13793.656888] 4fc4 e92dd800 e24cb004 e52de004 e8bd4000 e1a02000 e1a03001 e3e00063 e1a01002
[13793.657590] 4fe4 e3a02010 ebfff931 e89da800 e1a0c00d e92dd9f0 e24cb004 e24dd064 e52de004
[13793.658231] 5004 e8bd4000 e1a06002 e1a08000 e1a00001 e1a01002 e59b2004 e1a07003 ebfff155
[13793.658963]
[13793.658994] SP: 0xecadbed8:
[13793.659116] bed8 eefdb490 ef368f80 5c1a0fc1 c014...

Steve Langasek (vorlon) wrote :

Disabling /lib/udev/rules.d/50-firmware.rules would probably also solve the udev/ueventd boot ordering problem

Steve Langasek (vorlon) wrote :

However, if I disable firmware.rules, *but* leave persistent_v4l.rules enabled, I still get a crash on boot - probably because udev is trying to probe the hardware before ueventd has started up to handle the firmware request.

[ 4.898886] msm_mctl_dev_open mctl NULL!
[ 4.899221] Unable to handle kernel NULL pointer dereference at virtual address 00000238
[ 4.899313] pgd = edf68000
[ 4.899404] [00000238] *pgd=af5cc831, *pte=00000000, *ppte=00000000
[ 4.899710] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[ 4.899801] Modules linked in:
[ 4.899984] CPU: 2 Not tainted (3.4.0-1-mako #7-Ubuntu)
[ 4.900076] PC is at msm_mctl_dev_close+0x1c/0x164
[ 4.900137] LR is at v4l2_release+0x54/0x74
[ 4.900259] pc : <c04d66d8> lr : <c04b28d4> psr: 60000013
[ 4.900289] sp : edf1bee8 ip : edf1bf18 fp : edf1bf14
[ 4.900442] r10: ee3edd80 r9 : 00000000 r8 : eefec490
[ 4.900534] r7 : ef3c8d00 r6 : 00000010 r5 : ee3edd80 r4 : 00000000
[ 4.900656] r3 : c04d66bc r2 : 00000002 r1 : ee3edd80 r0 : ee3edd80
[ 4.900717] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 4.900839] Control: 10c5787d Table: ae96806a DAC: 00000015
[ 4.900961]
[ 4.900961] PC: 0xc04d6658:
[ 4.901083] 6658 1afffff6 e594024c ebf1a9b7 e3a03000 e584324c e1a00008 e3e0700b eb0f8442
[ 4.901815] 6678 ea000008 e2855001 eaffffff e5963000 e1550003 3affffd8 e5963000 e5843248
[ 4.902456] 6698 e1a00008 eb0f8438 e1a00007 e89daff8 c0b6408b c0b63c39 c092def1 c0b63c50
[ 4.903128] 66b8
[ 4.903189]
[ 4.903189] res_trk_download_firmware(): Request firmware download
[ 4.903433] c0b63c69 e1a0c00d e92ddff8 e24cb004 e52de004 e8bd4000 e590407c e1a0a000
[ 4.904104] 66d8 e5945238 e3550000 1a000004 e59f0128 e3e07015 e59f1124 eb0f4d12 ea000044
[ 4.904745] 66f8 e595041c e2858e16 eb0014ae e2849fbf e1a06000 e1a00008 eb0f849f e1a00009
[ 4.905417] 6718 eb0f849d e5953174 e3530001 13a07000 1a000009 e1a00005 eb002052 e2507000
[ 4.906088] 6738 a3a06000 aa000004 e59f10d0 e1a02007 e59f00cc eb0f4cfb ea00002b e3a03000
[ 4.906760]
[ 4.906760] LR: 0xc04b2854:
[ 4.906882] 2854 e1a00005 e89da878 e3000145 e89da878 e1a0c00d e92dd800 e24cb004 e52de004
[ 4.907584] 2874 e8bd4000 ebf23931 e89da800 e1a0c00d e92dd830 e24cb004 e52de004 e8bd4000
[ 4.908194] 2894 e1a00001 e1a05001 ebffff7e e5903050 e1a04000 e5933024 e3530000 0a00000b
[ 4.908805] 28b4 e5900250 e3500000 0a000000 eb101433 e5943050 e1a00005 e5933024 e12fff33
[ 4.909476] 28d4 e5940250 e3500000 0a000000 eb1013a7 e2840058 ebfbcc6b e3a00000 e89da830
[ 4.910148] 28f4 e1a0c00d e92dd878 e24cb004 e52de004 e8bd4000 e1a06000 e1a05001 ebffff61
[ 4.910819] 2914 e5903050 e1a04000 e593301c e3530000 0a000016 e5900250 e3500000 1a000004
[ 4.911490] 2934 e5943220 e3130001 03e05012 0a00000a ea000003 eb101337 e3500000 0afffff7
[ 4.912162]
[ 4.912162] SP: 0xedf1be68:
[ 4.912284] be68 c08bc3b4 c0096d28 ef08ec40 c04d66d8 60000013 ffffffff edf1bed4 eefec490
[ 4.912955] be88 00000000 ee3edd80 edf1bf14 edf1bea0 c08ba618 c0008364 ee3edd80 ee3edd80
[ 4.913596] bea8 00000002 c04d66bc 00000000 ee3edd80 00000010 ef3c8d00 eefec4...

Steve Langasek (vorlon) wrote :

Fundamentally this is a kernel bug, for panicking when the firmware load fails. But we should also be working through the issue of udev+ueventd double-handling of firmware requests. The change for bug #1187616 already lets udev read firmware from /system/firmware + /vendor/firmware, but we may not have the ordering right to make sure those mount points are available before udev starts. And Oliver has argued that there are some reasons to want ueventd to continue to handle certain drivers, due to options being set in the kernel - so we don't necessarily want to have udev take over entirely for ueventd.

In the short term, I think we'll want to work around this by having lxc-android-config divert the udev rules files to disable them. This will prevent us from being able to load firmware from /lib/firmware, but we actually don't do that anyway since everything is already in /system/firmware on these systems.
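
As a sketch of what such diversions might look like, the snippet below prints (rather than runs) one dpkg-divert command per rules file. The ".disabled" suffix and the function name print_diversions are assumptions for illustration; the report does not say which diverted names lxc-android-config actually uses.

```shell
# Print (not run) the dpkg-divert commands a package could use to move
# the two rules files aside. The ".disabled" suffix and the function
# name are assumptions; the real package may use different names.
print_diversions() {
    pkg="$1"
    for f in /lib/udev/rules.d/50-firmware.rules \
             /lib/udev/rules.d/60-persistent-v4l.rules; do
        echo "dpkg-divert --package $pkg --add --rename --divert $f.disabled $f"
    done
}

print_diversions lxc-android-config
```

Because udev only reads files whose names end in .rules, a diverted copy with a different suffix is ignored. Running the printed commands for real would require root.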

Steve Langasek (vorlon) wrote :

Apparently I missed something in the changelog, but this workaround to lxc-android-config has been uploaded. Changelog entry:

lxc-android-config (0.14) saucy; urgency=low

  * Add diversions for /lib/udev/rules.d/50-firmware.rules and
    /lib/udev/rules.d/60-persistent-v4l.rules: the first because all of
    our firmware currently lives where only ueventd sees it (meaning udev
    will return false-negatives for firmware load requests), the second
    because even with the firmware loading out of the way, v4l_id probes
    of video devices are causing kernel panics immediately after boot
    (on mako) due to udev starting earlier than ueventd. This works around
    LP: 1187189.

 -- Steve Langasek <email address hidden> Thu, 06 Jun 2013 10:39:18 -0700

Changed in lxc-android-config (Ubuntu):
status: New → Fix Released
Martin Pitt (pitti) wrote :

Does that mean we should revert the firmware lookup from /system and /vendor in udev?

Changed in systemd (Ubuntu):
status: New → Incomplete
Steve Langasek (vorlon) wrote : Re: [Bug 1187189] Re: Kernel crash and reboot when accessing video device

On Fri, Jun 07, 2013 at 04:59:51AM -0000, Martin Pitt wrote:
> Does that mean we should revert the firmware lookup from /system and
> /vendor in udev?

Possibly. You were rather quick on the draw in uploading that change, I was
expecting more discussion first. ;-)

For the moment, lxc-android-config is diverting
/lib/udev/rules.d/50-firmware.rules, which stops *all* firmware loading from
udev. This is certainly not the preferred behavior; if possible, I think we
should prefer udev to handle all the firmware loading and take ueventd out
of the picture entirely, so that we can have a single, consistent,
upstreamed handler for all of this. However, Oliver seems to have some
concerns about the robustness / maintainability of this, so I haven't pushed
things in this direction so far.

If we could trust udev to handle all the events once for both sides, then
the ueventd ordering problem would go away, the need for diversions would go
away, and we would want to stick with udev reading from /system/firmware and
/vendor/firmware directly. But it's not certain yet that we want to do
that, so I think some more investigation is still needed before we decide.

Martin Pitt (pitti) wrote :

Discussed with Oliver on IRC. I reverted the /vendor/firmware and /system/firmware lookup bits, but that was a no-op change after all, as lxc-android-config diverts the rules.

There is no need for a diversion there, BTW: lxc-android-config could just ship (or create in postinst) an empty /etc/udev/rules.d/50-firmware.rules file, which will override the file in /lib/udev/rules.d (see /etc/udev/rules.d/README). So if you want to put that into the image build process instead, or simplify the diversion, you can do that, but otherwise the diversion will work as well.
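
The override approach could be sketched as follows. mask_rules and its root argument are hypothetical names for illustration; the root argument lets the files be created in a staging tree (for example during an image build) instead of on the live system.

```shell
# Create empty rules files under <root>/etc/udev/rules.d so that they
# shadow the same-named files in /lib/udev/rules.d.
# mask_rules and its root argument are hypothetical, for illustration.
mask_rules() {
    root="$1"; shift
    mkdir -p "$root/etc/udev/rules.d" || return 1
    for name in "$@"; do
        # ": >" creates the file, or truncates it if it already exists
        : > "$root/etc/udev/rules.d/$name" || return 1
    done
}

# Example against a staging directory rather than the live system:
#   mask_rules /tmp/staging 50-firmware.rules
```

Compared with a diversion, this leaves the packaged file in /lib/udev/rules.d untouched and only needs the empty file to be removed to undo it.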

Changed in systemd (Ubuntu):
status: Incomplete → Invalid