BUG: scheduling while atomic: swapper/0/0x10000100

Bug #534549 reported by Simone Fabris
This bug affects 12 people
Affects          Status         Importance   Assigned to   Milestone
linux (Ubuntu)   Fix Released   Medium       Unassigned

Bug Description

This is related to Bluetooth.
On a fresh ubuntu-desktop Lucid Lynx install, trying to connect to my A2DP headphones triggers this bug.
It can be reproduced reliably: just boot, turn on the Bluetooth headphones, and bang!

ProblemType: KernelOops
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Annotation: Your system might become unstable now and might need to be restarted.
AplayDevices:
 **** List of PLAYBACK Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
Architecture: i386
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: AD198x Analog [AD198x Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: sf 1362 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xe8280000 irq 21'
   Mixer name : 'Analog Devices AD1984A'
   Components : 'HDA:11d4194a,103c3056,00100400'
   Controls : 10
   Simple ctrls : 6
Date: Mon Mar 8 18:42:09 2010
DistroRelease: Ubuntu 10.04
Failure: oops
HibernationDevice: RESUME=UUID=519d43d7-e3c7-483a-b039-16a0c8135575
MachineType: Hewlett-Packard HP 2140
NonfreeKernelModules: wl
Package: linux-image-2.6.32-15-generic 2.6.32-15.22
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-16-generic root=UUID=588edabc-2604-4e52-88ce-5725b02b6573 ro quiet splash
ProcVersionSignature: Ubuntu 2.6.32-16.24-generic
Regression: Yes
RelatedPackageVersions: linux-firmware 1.32
Reproducible: Yes
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
TestedUpstream: Yes
Title: BUG: scheduling while atomic: swapper/0/0x10000100
Uname: Linux 2.6.32-16-generic i686
dmi.bios.date: 06/18/2009
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 68DGU Ver. F.04
dmi.board.name: 3056
dmi.board.vendor: Hewlett-Packard
dmi.board.version: KBC Version 20.0D
dmi.chassis.asset.tag: CNU90807DB
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr68DGUVer.F.04:bd06/18/2009:svnHewlett-Packard:pnHP2140:pvrF.04:rvnHewlett-Packard:rn3056:rvrKBCVersion20.0D:cvnHewlett-Packard:ct10:cvr:
dmi.product.name: HP 2140
dmi.product.version: F.04
dmi.sys.vendor: Hewlett-Packard

Simone Fabris (simone-fabris) wrote :
Chase Douglas (chasedouglas) wrote :

The preempt count (0x10000100) shows both PREEMPT_ACTIVE and a softirq context. I *think* this likely means a process was in the middle of a softirq handler or a tasklet when a subroutine put it to sleep. I'm building a test kernel right now that will output extra debug information so we can see what the softirq stack was when it was scheduled out. I'll update this bug with a test kernel location once it is uploaded.
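
For anyone who wants to check the reading of 0x10000100 themselves, here is a minimal sketch that splits the value from the bug title into its bit fields. The masks are an assumption: they follow the preempt count layout I recall from include/linux/hardirq.h in 2.6.32 on x86 (preempt depth in bits 0-7, softirq count in bits 8-15, hardirq count in bits 16-25, NMI at bit 26, PREEMPT_ACTIVE at bit 28). The function names are made up for this example and other kernel versions may use a different layout.

```python
# Assumed bit layout of the preempt count on a 2.6.32 x86 kernel.
PREEMPT_MASK = 0x000000FF    # nesting depth of preempt_disable()
SOFTIRQ_MASK = 0x0000FF00    # softirq nesting
HARDIRQ_MASK = 0x03FF0000    # hardirq nesting
NMI_MASK = 0x04000000        # inside an NMI handler
PREEMPT_ACTIVE = 0x10000000  # preemption in progress


def parse_bug_title(title):
    """Split 'BUG: scheduling while atomic: comm/pid/0xcount' into its parts."""
    comm, pid, count = title.rsplit(": ", 1)[1].rsplit("/", 2)
    return comm, int(pid), int(count, 16)


def decode_preempt_count(value):
    """Return the individual fields packed into a preempt count value."""
    return {
        "preempt_depth": value & PREEMPT_MASK,
        "softirq_count": (value & SOFTIRQ_MASK) >> 8,
        "hardirq_count": (value & HARDIRQ_MASK) >> 16,
        "nmi": bool(value & NMI_MASK),
        "preempt_active": bool(value & PREEMPT_ACTIVE),
    }


if __name__ == "__main__":
    title = "BUG: scheduling while atomic: swapper/0/0x10000100"
    comm, pid, count = parse_bug_title(title)
    print(comm, pid, decode_preempt_count(count))
```

Under that assumed layout, the value has a softirq count of 1 and the PREEMPT_ACTIVE bit set, with everything else zero, which matches the description above.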

Changed in linux (Ubuntu):
assignee: nobody → Chase Douglas (chasedouglas)
importance: Undecided → Medium
status: New → Triaged
Chase Douglas (chasedouglas) wrote :

I've uploaded a test kernel to http://people.canonical.com/~cndougla/534549/. Please install the kernel package and boot into it. When ready, do the following:

# echo function >/sys/kernel/debug/tracing/current_tracer
# echo 1 >/sys/kernel/debug/tracing/options/latency-format
# echo 1 >/sys/kernel/debug/tracing/tracing_enabled

Then do whatever you need to trigger the issue (connect a2dp bt headset?). Once the bug has been triggered, do:

# cat /sys/kernel/debug/tracing/trace | bzip2 >/tmp/trace.bz2

Finally, attach the trace.bz2 file to this bug so we can take a look.
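
If you would rather script these steps, the following is a rough sketch that does the same thing as the commands above. It assumes debugfs is mounted at /sys/kernel/debug, that it runs as root, and that the kernel still has the tracing_enabled file (as far as I know, later kernels use tracing_on instead). The helper names start_function_trace and save_trace are invented for this example.

```python
import bz2
from pathlib import Path

# Assumed debugfs mount point; adjust if debugfs is mounted elsewhere.
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def start_function_trace(tracing_dir=TRACING_DIR):
    """Select the function tracer, enable latency format, start tracing."""
    tracing_dir = Path(tracing_dir)
    (tracing_dir / "current_tracer").write_text("function\n")
    (tracing_dir / "options" / "latency-format").write_text("1\n")
    (tracing_dir / "tracing_enabled").write_text("1\n")


def save_trace(out_path, tracing_dir=TRACING_DIR):
    """Read the trace buffer and store it bzip2-compressed at out_path.

    Returns the number of uncompressed bytes read. The whole trace is held
    in memory, which may be a lot for a long capture.
    """
    data = (Path(tracing_dir) / "trace").read_bytes()
    Path(out_path).write_bytes(bz2.compress(data))
    return len(data)


if __name__ == "__main__":
    start_function_trace()
    input("Trigger the bug, then press Enter to save the trace... ")
    print(save_trace("/tmp/trace.bz2"), "bytes captured")
```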

Thanks

Simone Fabris (simone-fabris) wrote :

Here it is.
Let me know if you need some other testing.
Thanks for your help and support, anyway.

Simone

Chase Douglas (chasedouglas) wrote :

@Simone:

Great! I've got everything I need to do more digging. The following (at the end of the trace) shows clearly what is causing the bug:

# tracer: function
#
# function latency trace v1.1.5 on 2.6.32-16-generic
# --------------------------------------------------------------------
# latency: 0 us, #120241/31431691, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:2)
#    -----------------
#    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#
#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /_--=> lock-depth
#                |||||/     delay
#  cmd     pid   |||||| time  |   caller
#     \   /      ||||||   \   |   /
  <idle>-0       0.Ns.. 57071005us : rfcomm_session_timeout <-run_timer_softirq
  <idle>-0       0.Ns.. 57071007us : rfcomm_session_del <-rfcomm_session_timeout
  <idle>-0       0.Ns.. 57071009us : rfcomm_session_clear_timer <-rfcomm_session_del
  <idle>-0       0.Ns.. 57071012us : sock_release <-rfcomm_session_del
  <idle>-0       0.Ns.. 57071014us : l2cap_sock_release <-sock_release
  <idle>-0       0.Ns.. 57071016us : l2cap_sock_shutdown <-l2cap_sock_release
  <idle>-0       0.Ns.. 57071018us+: lock_sock_nested <-l2cap_sock_shutdown
  <idle>-0       0.Ns.. 57071019us : _cond_resched <-lock_sock_nested
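
To make the excerpt easier to read, here is a small sketch that parses latency-format lines like the ones above and rebuilds the call stack that was active at the last entry. It is a heuristic, not a tool that ships with ftrace: the function tracer records only function entries, so the stack is inferred from the "func <-caller" pairs. The regular expression and the names parse_trace and active_stack are my own and only tested against this excerpt.

```python
import re

# One latency-format line: cmd-pid, CPU number plus five flag characters
# (irqs-off, need-resched, hardirq/softirq, preempt-depth, lock-depth),
# a timestamp in microseconds, then "func <-caller".
LINE_RE = re.compile(
    r"^\s*(?P<cmd>.+)-(?P<pid>\d+)\s+"
    r"(?P<cpu>\d+)(?P<irqs_off>.)(?P<need_resched>.)(?P<irq_ctx>.)"
    r"(?P<preempt_depth>.)(?P<lock_depth>.)\s+"
    r"(?P<time_us>\d+)us\s*[+!]?\s*:\s*"
    r"(?P<func>\S+)\s+<-(?P<caller>\S+)\s*$"
)

SAMPLE = """\
  <idle>-0 0.Ns.. 57071005us : rfcomm_session_timeout <-run_timer_softirq
  <idle>-0 0.Ns.. 57071007us : rfcomm_session_del <-rfcomm_session_timeout
  <idle>-0 0.Ns.. 57071009us : rfcomm_session_clear_timer <-rfcomm_session_del
  <idle>-0 0.Ns.. 57071012us : sock_release <-rfcomm_session_del
  <idle>-0 0.Ns.. 57071014us : l2cap_sock_release <-sock_release
  <idle>-0 0.Ns.. 57071016us : l2cap_sock_shutdown <-l2cap_sock_release
  <idle>-0 0.Ns.. 57071018us+: lock_sock_nested <-l2cap_sock_shutdown
  <idle>-0 0.Ns.. 57071019us : _cond_resched <-lock_sock_nested
"""


def parse_trace(lines):
    """Return one dict per trace entry, skipping header and unparsable lines."""
    events = []
    for line in lines:
        if line.lstrip().startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match:
            events.append(match.groupdict())
    return events


def active_stack(events):
    """Infer the call stack at the last entry from func/caller pairs."""
    stack = []
    for event in events:
        caller = event["caller"]
        if caller in stack:
            # Everything above the caller has already returned.
            last = len(stack) - 1 - stack[::-1].index(caller)
            del stack[last + 1:]
        else:
            stack = [caller]
        stack.append(event["func"])
    return stack


if __name__ == "__main__":
    events = parse_trace(SAMPLE.splitlines())
    print("softirq context:", all(e["irq_ctx"] == "s" for e in events))
    print(" -> ".join(active_stack(events)))
```

On this excerpt the inferred stack runs from run_timer_softirq down to _cond_resched, and every entry carries the "s" (softirq) flag.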

Now we need to figure out what here needs to be moved out of softirq context. I'll be looking into it further tomorrow.
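
As a rough illustration of what "moving work out of softirq context" usually means, here is a user-space sketch of the general pattern: the timer callback only records that the timeout fired and hands the object to a worker, and the worker, which is allowed to block, does the teardown. This is an analogy in Python threads, not kernel code and not the actual patch; Session, timer_callback and worker are hypothetical names.

```python
import queue
import threading


class Session:
    """Stand-in for a connection object with a timeout."""

    def __init__(self, name):
        self.name = name
        self.timed_out = False
        self.closed = False


def timer_callback(session, work_queue):
    """Runs in the 'atomic' context: must never block or sleep."""
    session.timed_out = True
    work_queue.put_nowait(session)  # hand off, do not tear down here


def worker(work_queue):
    """Runs in a context that may sleep; does the actual teardown."""
    while True:
        session = work_queue.get()
        if session is None:  # sentinel: stop the worker
            break
        if session.timed_out and not session.closed:
            # Blocking operations (taking locks, closing sockets) go here.
            session.closed = True


if __name__ == "__main__":
    work_queue = queue.Queue()
    thread = threading.Thread(target=worker, args=(work_queue,))
    thread.start()
    session = Session("demo")
    timer_callback(session, work_queue)
    work_queue.put(None)
    thread.join()
    print(session.name, "closed:", session.closed)
```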

Chase Douglas (chasedouglas) wrote :

Actually, this was a pretty quick find. The fix can be found here:

http://git.kernel.org/?p=linux/kernel/git/holtmann/bluetooth-2.6.git;a=commitdiff;h=485f1eff73a7b932fd3abb0dfcf804e1a1f59025

I'll take care of this tomorrow.

Chase Douglas (chasedouglas) wrote :

@Simone:

Please test the new kernel up at http://people.canonical.com/~cndougla/534549/. Install the one with the version ending with ~lp534549. If that kernel works, I'll send the patch off to the appropriate people to get it included into Lucid and upstream.

Thanks

Changed in linux (Ubuntu):
status: Triaged → In Progress
Simone Fabris (simone-fabris) wrote :

So far, so good: the new kernel has not triggered any kernel oops so far.

Chase Douglas (chasedouglas) wrote :

@Simone:

Thanks for testing. I will forward the patch on. If you do encounter any issues with the test kernel, please leave another comment.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-17.26

---------------
linux (2.6.32-17.26) lucid; urgency=low

  [ Amit Kucheria ]

  * [Config] SECURITY_FILE_CAPABILITIES dissapeared in 2.6.33

  [ Andy Whitcroft ]

  * rules -- allow architecture configurations to be missing
  * SAUCE: cdrom -- default to not locking the tray when in use
    - LP: #397734
  * expose the kernel EXTRAVERSION in dmesg and /proc/version_signature
  * record the drm version in EXTRAVERSION
  * linux-tools -- pull out the perf binary into a binary package
  * [Config] enable MMIOTRACE for graphics debugging
  * [Config] enable BLK_DEV_BSG
  * debian -- fix builds when tools are disabled
  * allow us to build default configs for automated builds
  * config -- allow locally specified configuration overrides
  * [Config] de-modularise PATA disk controllers
  * [Config] de-modularise SATA disk controllers

  [ Stefan Bader ]

  * Revert "SAUCE: (pre-stable) netfilter: xt_recent: fix buffer overflow"
    - LP: #540231
  * Revert "SAUCE: (pre-stable) netfilter: xt_recent: fix false match"
    - LP: #540231
  * [Config] Update configs for 2.6.32.10
    - LP: #540231

  [ Tim Gardner ]

  * [Config] Add vmw_pvscsi and vmxnet3 to -virtual flavour
    - LP: #531017
  * SAUCE: igb: Supress an upstream compiler complaint
  * [Config] Fix sub-flavours package conflicts
    - LP: #454827

  [ Upstream Kernel Changes ]

  * Revert "tpm_tis: TPM_STS_DATA_EXPECT workaround"
    - LP: #540231
  * Revert "(pre-stable) sched: Fix SMT scheduler regression in
    find_busiest_queue()"
    - LP: #540231
  * (pre-stable) Bluetooth: Fix sleeping function in RFCOMM within invalid
    context
    - LP: #534549
  * igb: remove unused temp variable from stats clearing path
  * igb: update comments for serdes config and update to handle duplex
  * igb: update the approach taken to acquiring and releasing the phy lock
  * igb: add locking to reads of the i2c interface
  * igb: add combined function for setting rar and pool bits
  * igb: make use of the uta to allow for promiscous mode filter
  * igb: add support for 82576NS SerDes adapter
  * igb: add function to handle mailbox lock
  * igb: fix a few items where weren't correctly setup for mbx timeout
  * igb: change how we handle alternate mac addresses
  * igb: remove microwire support from igb
  * igb: move the generic copper link setup code into e1000_phy.c
  * igb: add code to retry a phy read in the event of failure on link check
  * igb: add additional error handling to the phy code
  * igb: add flushes between RAR writes when setting mac address
  * igb: Use the instance of net_device_stats from net_device.
  * igb: Fix erroneous display of stats by ethtool -S
  * igb: add new data structure for handling interrupts and NAPI
  * igb: remove rx checksum good counter
  * igb: increase minimum rx buffer size to 1K
  * igb: move the tx and rx ring specific config into seperate functions
  * igb: remove rx_ps_hdr_len
  * igb: move SRRCTL register configuration into ring specific config
  * igb: change the head and tail offsets into pointers
  * igb: add pci device pointer to ring structure
  * igb: move rx_buffer_len into the ring structu...

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Adam Hallgat (hallgat) wrote :

It isn't fixed in Natty Narwhal. Should I open a new bug thread?

Changed in linux (Ubuntu):
assignee: Chase Douglas (chasedouglas) → nobody