bcache_writebac blocks and goes silent on trusty kernel 3.13.0-24-generic #47-Ubuntu

Bug #1357295 reported by Nick Moffitt
22
This bug affects 3 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Unassigned
Trusty
Fix Released
Medium
Stefan Bader
Utopic
Fix Released
Medium
Unassigned

Bug Description

We regularly use bcache to use SSDs as an extra block cache layer for performance on large working sets (archive mirrors, large databases, etc). Until now we have had good success with bcache on precise LTS with HWE kernels up to Saucy's.

Unfortunately when we mount a bcache-backed volume on a Trusty server, it immediately loses contact with bcache_writebac:

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
  501 root 20 0 0 0 0 D 0.0 0.0 0:00.00 bcache_writebac

kern.log shows it losing its mind pretty much immediately on startup:

    Aug 12 02:22:15 somehostname kernel: [ 6483.477618] md: bind<sdc>
    Aug 12 02:22:15 somehostname kernel: [ 6483.478599] md: bind<sdd>
    Aug 12 02:22:15 somehostname kernel: [ 6483.482016] md/raid0:md0: md_size is 781443072 sectors.
    Aug 12 02:22:15 somehostname kernel: [ 6483.482024] md: RAID0 configuration for md0 - 1 zone
    Aug 12 02:22:15 somehostname kernel: [ 6483.482026] md: zone0=[sdc/sdd]
    Aug 12 02:22:15 somehostname kernel: [ 6483.482033] zone-offset= 0KB, device-offset= 0KB, size= 390721536KB
    Aug 12 02:22:15 somehostname kernel: [ 6483.482036]
    Aug 12 02:22:15 somehostname kernel: [ 6483.482055] md0: detected capacity change from 0 to 400098852864
    Aug 12 02:22:15 somehostname kernel: [ 6483.487644] md0: unknown partition table
    Aug 12 02:26:55 somehostname kernel: [ 6763.657249] bio: create slab <bio-2> at 2
    Aug 12 02:26:55 somehostname kernel: [ 6763.658018] bcache: run_cache_set() invalidating existing data
    Aug 12 02:26:55 somehostname kernel: [ 6763.695447] bcache: register_cache() registered cache device md0
    Aug 12 02:26:55 somehostname kernel: [ 6763.696262] bcache: register_bdev() registered backing device sdb
    Aug 12 02:26:55 somehostname kernel: [ 6763.697833] bcache: bch_cached_dev_attach() Caching sdb as bcache0 on set 13dcf71e-73ea-49f7-8cd6-fb8182c3cc17
    Aug 12 02:27:24 somehostname kernel: [ 6793.114287] EXT4-fs (bcache0): mounted filesystem with ordered data mode. Opts: (null)
    Aug 12 02:30:16 somehostname kernel: [ 6964.514425] INFO: task bcache_writebac:29721 blocked for more than 120 seconds.
    Aug 12 02:30:16 somehostname kernel: [ 6964.608861] Not tainted 3.13.0-24-generic #47-Ubuntu
    Aug 12 02:30:16 somehostname kernel: [ 6964.679403] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Aug 12 02:30:16 somehostname kernel: [ 6964.779270] bcache_writebac D ffff8817dfb34440 0 29721 2 0x00000000
    Aug 12 02:30:16 somehostname kernel: [ 6964.779278] ffff882fafe57eb8 0000000000000002 ffff882fb100afe0 ffff882fafe57fd8
    Aug 12 02:30:16 somehostname kernel: [ 6964.779286] 0000000000014440 0000000000014440 ffff882fb100afe0 ffff882fb7bcac80
    Aug 12 02:30:16 somehostname kernel: [ 6964.779293] ffff882fb11c0000 ffffffffa03f5840 0000000000000000 0000000000000000
    Aug 12 02:30:16 somehostname kernel: [ 6964.779301] Call Trace:
    Aug 12 02:30:16 somehostname kernel: [ 6964.779335] [<ffffffffa03f5840>] ? read_dirty+0x400/0x400 [bcache]
    Aug 12 02:30:16 somehostname kernel: [ 6964.779346] [<ffffffff81719ee9>] schedule+0x29/0x70
    Aug 12 02:30:16 somehostname kernel: [ 6964.779356] [<ffffffff8108b2ed>] kthread+0xad/0xf0
    Aug 12 02:30:16 somehostname kernel: [ 6964.779368] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
    Aug 12 02:30:16 somehostname kernel: [ 6964.779379] [<ffffffff817263fc>] ret_from_fork+0x7c/0xb0
    Aug 12 02:30:16 somehostname kernel: [ 6964.779385] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
    Aug 12 02:32:16 somehostname kernel: [ 7084.828482] INFO: task bcache_writebac:29721 blocked for more than 120 seconds.
    Aug 12 02:32:16 somehostname kernel: [ 7084.922987] Not tainted 3.13.0-24-generic #47-Ubuntu
    Aug 12 02:32:16 somehostname kernel: [ 7084.993239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Aug 12 02:32:16 somehostname kernel: [ 7085.093858] bcache_writebac D ffff8817dfb34440 0 29721 2 0x00000000
    Aug 12 02:32:16 somehostname kernel: [ 7085.093866] ffff882fafe57eb8 0000000000000002 ffff882fb100afe0 ffff882fafe57fd8
    Aug 12 02:32:16 somehostname kernel: [ 7085.093874] 0000000000014440 0000000000014440 ffff882fb100afe0 ffff882fb7bcac80
    Aug 12 02:32:16 somehostname kernel: [ 7085.093881] ffff882fb11c0000 ffffffffa03f5840 0000000000000000 0000000000000000
    Aug 12 02:32:16 somehostname kernel: [ 7085.093897] Call Trace:
    Aug 12 02:32:16 somehostname kernel: [ 7085.093918] [<ffffffffa03f5840>] ? read_dirty+0x400/0x400 [bcache]
    Aug 12 02:32:16 somehostname kernel: [ 7085.093928] [<ffffffff81719ee9>] schedule+0x29/0x70
    Aug 12 02:32:16 somehostname kernel: [ 7085.093944] [<ffffffff8108b2ed>] kthread+0xad/0xf0
    Aug 12 02:32:16 somehostname kernel: [ 7085.093950] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
    Aug 12 02:32:16 somehostname kernel: [ 7085.093959] [<ffffffff817263fc>] ret_from_fork+0x7c/0xb0
    Aug 12 02:32:16 somehostname kernel: [ 7085.093964] [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-33-generic 3.13.0-33.58
ProcVersionSignature: Ubuntu 3.13.0-33.58-generic 3.13.11.4
Uname: Linux 3.13.0-33-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: nick 2424 F.... pulseaudio
 /dev/snd/controlC0: nick 2424 F.... pulseaudio
CurrentDesktop: Unity
Date: Fri Aug 15 11:08:19 2014
HibernationDevice: RESUME=UUID=dae1fd50-2dc0-48b2-bd93-5823478772b6
InstallationDate: Installed on 2014-05-29 (77 days ago)
InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
MachineType: LENOVO 20ALCTO1WW
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-33-generic.efi.signed root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-33-generic N/A
 linux-backports-modules-3.13.0-33-generic N/A
 linux-firmware 1.127.5
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 02/06/2014
dmi.bios.vendor: LENOVO
dmi.bios.version: GIET69WW (2.19 )
dmi.board.asset.tag: Not Available
dmi.board.name: 20ALCTO1WW
dmi.board.vendor: LENOVO
dmi.board.version: 0B98405 STD
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvrGIET69WW(2.19):bd02/06/2014:svnLENOVO:pn20ALCTO1WW:pvrThinkPadX240:rvnLENOVO:rn20ALCTO1WW:rvr0B98405STD:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 20ALCTO1WW
dmi.product.version: ThinkPad X240
dmi.sys.vendor: LENOVO

Revision history for this message
Nick Moffitt (nick-moffitt) wrote :
Stuart Bishop (stub)
tags: added: regression-release
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Stefan Bader (smb) wrote :

Is that still the case with an updated kernel? The trace shows -23.47 which was the initially released kernel.

Revision history for this message
Haw Loeung (hloeung) wrote :

Confirmed it's still happening with an updated kernel:

| [ 241.558831] INFO: task bcache_writebac:501 blocked for more than 120 seconds.
| [ 241.648361] Not tainted 3.13.0-34-generic #60-Ubuntu
| [ 241.717161] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
| [ 241.815456] bcache_writebac D ffff88303f914440 0 501 2 0x00000000
| [ 241.815464] ffff882fb6bfdeb8 0000000000000002 ffff882fb64097f0 ffff882fb6bfdfd8
| [ 241.815472] 0000000000014440 0000000000014440 ffff882fb64097f0 ffff882fb75a8bc0
| [ 241.815478] ffff882fb6be0000 ffffffffa019d840 0000000000000000 0000000000000000
| [ 241.815486] Call Trace:
| [ 241.815532] [<ffffffffa019d840>] ? read_dirty+0x400/0x400 [bcache]
| [ 241.815543] [<ffffffff817201b9>] schedule+0x29/0x70
| [ 241.815556] [<ffffffff8108b3ad>] kthread+0xad/0xf0
| [ 241.815562] [<ffffffff8108b300>] ? kthread_create_on_node+0x1d0/0x1d0
| [ 241.815573] [<ffffffff8172c6bc>] ret_from_fork+0x7c/0xb0
| [ 241.815578] [<ffffffff8108b300>] ? kthread_create_on_node+0x1d0/0x1d0

| hloeung@meteor:/var/log$ uname -a
| Linux meteor 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

| filename: /lib/modules/3.13.0-34-generic/kernel/drivers/md/bcache/bcache.ko
| author: Kent Overstreet <email address hidden>
| license: GPL
| license: GPL
| author: Kent Overstreet <email address hidden>
| srcversion: 92E6FEAECBE61D83D1F2C4F
| depends:
| intree: Y
| vermagic: 3.13.0-34-generic SMP mod_unload modversions
| signer: Magrathea: Glacier signing key
| sig_key: 50:0B:C5:C8:7D:4B:11:5C:F3:C1:50:4F:7A:92:E2:33:C6:14:3D:58
| sig_hashalgo: sha512

Revision history for this message
Haw Loeung (hloeung) wrote :

Looks like there's a pending patch[1].

I used the fix suggested by Sam Fulcomer[2] to set /sys/.../bcache/writeback_running to 0 because the cache mod we're using is writethrough.

| $ cat /sys/block/bcache0/bcache/writeback_running
| 1

| $cat /sys/block/bcache0/bcache/state
| clean

| $ cat /sys/block/bcache0/bcach/cache_mode
| [writethrough] writeback writearound none

Load seems to have dropped from a constant 2 to around 1. I've left it over night and am not seeing any more call traces being logged.

[1] http://permalink.gmane.org/gmane.linux.kernel.bcache.devel/2499
[2] http://permalink.gmane.org/gmane.linux.kernel.bcache.devel/2492

Revision history for this message
Stefan Bader (smb) wrote :

I think that the patch now (3.17-rc1) is upstream but in a slightly modified version:

commit 7f179da0d15ded8f9d56616099f940d50ec582b5
Author: Slava Pestov <email address hidden>
Date: Thu May 1 13:48:57 2014 -0700

    bcache: fix uninterruptible sleep in writeback thread

I cherry-picked that patch back into Trusty and currently check this in a test VM. The warnings are gone but the load seems to be to high, still.

Revision history for this message
Stefan Bader (smb) wrote :

So I am not sure there already is any solution to the load of 1, that I see. 3.16/3.17 kernels that I tried have even more serious issues (crashing). I pulled together three patches I got so far and build Trusty kernels which can be found at [1]. It would be good if someone could try them in a more real environment that mine.

[1] http://people.canonical.com/~smb/lp1357295/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Stefan Bader (smb)
Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Stefan Bader (smb) wrote :

SRU Justification:

Impact: Right now whenever the bcache module gets loaded, we can see softlockup warnings caused by the writeback thread.

Fix: A patch picked from 3.17-rc1 will avoid those warnings. I bundled this with one patch from upstream that avoids a crash on shutdown, which is also caused by the fact that the writeback thread is not always started now. A third patch which was sent upstream addresses the same crash when modifying the writeback_running sysfs variable. This also adds a check for the writeback thread to be running.

Testcase(s): Define caching and backing device with (bcache-tools):
  make-bcache -B <backing dev> -C <cache dev>
This will trigger the hung task after a while.
  echo 0 >/sys/block/bcache0/bcache/writeback_running
This will trigger a kernel BUG. Both issues are gone with the patch series applied.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Utopic):
status: Triaged → Fix Committed
Changed in linux (Ubuntu Trusty):
assignee: nobody → Stefan Bader (smb)
status: Triaged → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.16.0-10.15

---------------
linux (3.16.0-10.15) utopic; urgency=low

  [ dann frazier ]

  * [debian] Fix regression with ABI subversions and backport

  [ Feng Kan ]

  * SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
    modify write.
    - LP: #1357527
  * SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
    assembly code
    - LP: #1358949

  [ Stefan Bader ]

  * SAUCE: bcache: prevent crash on changing writeback_running
    - LP: #1357295

  [ Tim Gardner ]

  * [Config] CONFIG_XFRM_STATISTICS=y
  * [Config] CONFIG_SECURITY_NETWORK_XFRM=y
  * [Config] CONFIG_SENSORS_IBMPOWERNV=m
    - LP: #1353005
  * Release Tracking Bug
    - LP: #1359783

  [ Upstream Kernel Changes ]

  * intel_idle: Broadwell support
    - LP: #1256170
  * powerpc/book3s: Add basic infrastructure to handle HMI in Linux.
    - LP: #1357108
  * powerpc/powernv: Invoke opal call to handle hmi.
    - LP: #1357108
  * powerpc/book3s: handle HMIs for cpus in nap mode.
    - LP: #1357108
  * powerpc/book3s: Fix endianess issue for HMI handling on napping cpus.
    - LP: #1357108
  * powerpc: Add smp_mb() to arch_spin_is_locked()
    - LP: #1358569
  * powerpc: Add smp_mb()s to arch_spin_unlock_wait()
    - LP: #1358569
  * hwmon: (powerpc/powernv) hwmon driver for power, fan rpm, voltage and
    temperature
    - LP: #1353005
  * tools/testing/selftests/ptrace/peeksiginfo.c: add PAGE_SIZE definition
    - LP: #1358855
  * printk: Add function to return log buffer address and size
    - LP: #1359423
  * powerpc/powernv: Interface to register/unregister opal dump region
    - LP: #1359423
  * bcache: fix crash on shutdown in passthrough mode
    - LP: #1357295
  * bcache: fix uninterruptible sleep in writeback thread
    - LP: #1357295

  [ Vinayak Kale ]

  * SAUCE: (no-up) dt-bindings: Add Potenza PMU binding
    - LP: #1357527
  * SAUCE: (no-up) arm64: dts: Add PMU node for APM X-Gene Storm SOC
    - LP: #1357527
 -- Tim Gardner <email address hidden> Fri, 15 Aug 2014 12:34:33 -0600

Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
Revision history for this message
Nick Moffitt (nick-moffitt) wrote :

What's the best way for us to test this fix in Trusty right now?

Revision history for this message
Stefan Bader (smb) wrote :

Either pick the test-kernel from comment #7 or wait for a trusty kernel with that patch get into proposed (which should trigger a automated request for verification).

Revision history for this message
Nick Moffitt (nick-moffitt) wrote :

Thanks. It does indeed look like this got rid of the lockups in dmesg. Thanks!

Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Revision history for this message
Stuart Bishop (stub) wrote :

We have switched to the kernel in -proposed and everything looks good. So verified.

tags: added: verification-done-trusty
removed: verification-needed-trusty
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (5.8 KiB)

This bug was fixed in the package linux - 3.13.0-36.63

---------------
linux (3.13.0-36.63) trusty; urgency=low

  [ Joseph Salisbury ]

  * Release Tracking Bug
    - LP: #1365052

  [ Feng Kan ]

  * SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
    modify write.
    - LP: #1357527
  * SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
    assembly code
    - LP: #1358949

  [ Ming Lei ]

  * SAUCE: (no-up) Drop APM X-Gene SoC Ethernet driver
    - LP: #1360140
  * [Config] Drop XGENE entries
    - LP: #1360140
  * [Config] CONFIG_NET_XGENE=m for arm64
    - LP: #1360140

  [ Stefan Bader ]

  * SAUCE: Add compat macro for skb_get_hash
    - LP: #1358162
  * SAUCE: bcache: prevent crash on changing writeback_running
    - LP: #1357295

  [ Suman Tripathi ]

  * SAUCE: (no-up) arm64: Fix the csr-mask for APM X-Gene SoC AHCI SATA PHY
    clock DTS node.
    - LP: #1359489
  * SAUCE: (no-up) ahci_xgene: Skip the PHY and clock initialization if
    already configured by the firmware.
    - LP: #1359501
  * SAUCE: (no-up) ahci_xgene: Fix the link down in first attempt for the
    APM X-Gene SoC AHCI SATA host controller driver.
    - LP: #1359507

  [ Tuan Phan ]

  * SAUCE: (no-up) pci-xgene-msi: fixed deadlock in irq_set_affinity
    - LP: #1359514

  [ Upstream Kernel Changes ]

  * iwlwifi: mvm: Add a missed beacons threshold
    - LP: #1349572
  * mac80211: reset probe_send_count also in HW_CONNECTION_MONITOR case
    - LP: #1349572
  * genirq: Add an accessor for IRQ_PER_CPU flag
    - LP: #1357527
  * arm64: perf: add support for percpu pmu interrupt
    - LP: #1357527
  * cifs: sanity check length of data to send before sending
    - LP: #1283101
  * KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
    - LP: #1329434
  * KVM: nVMX: Rework interception of IRQs and NMIs
    - LP: #1329434
  * KVM: vmx: disable APIC virtualization in nested guests
    - LP: #1329434
  * HID: Add transport-driver functions to the USB HID interface.
    - LP: #1353021
  * ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host
    Controller driver.
    - LP: #1358498
  * fold d_kill() and d_free()
    - LP: #1354234
  * fold try_prune_one_dentry()
    - LP: #1354234
  * new helper: dentry_free()
    - LP: #1354234
  * expand the call of dentry_lru_del() in dentry_kill()
    - LP: #1354234
  * dentry_kill(): don't try to remove from shrink list
    - LP: #1354234
  * don't remove from shrink list in select_collect()
    - LP: #1354234
  * more graceful recovery in umount_collect()
    - LP: #1354234
  * dcache: don't need rcu in shrink_dentry_list()
    - LP: #1354234
  * lift the "already marked killed" case into shrink_dentry_list()
  * split dentry_kill()
    - LP: #1354234
  * expand dentry_kill(dentry, 0) in shrink_dentry_list()
    - LP: #1354234
  * shrink_dentry_list(): take parent's ->d_lock earlier
    - LP: #1354234
  * dealing with the rest of shrink_dentry_list() livelock
    - LP: #1354234
  * dentry_kill() doesn't need the second argument now
    - LP: #1354234
  * dcache: add missing lockdep annotation
    - LP: #1354234
  * fs: convert use of typedef ctl_table to struct ctl_table
 ...

Read more...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.