Crash at creation of bcache if caching size > backing size

Bug #1377130 reported by Kick In
This bug affects 1 person
Affects                Status   Importance  Assigned to  Milestone
bcache-tools (Ubuntu)  Invalid  Undecided   Unassigned
linux (Ubuntu)         Expired  High        Unassigned

Bug Description

Hi,

I found a bug that crashes the machine when you create a bcache device whose caching device is bigger than its backing device.

This can happen if you create a bcache device and plan to add a bigger disk, partition or RAID array to it later.

To reproduce it, follow these steps:

Start a VM with the utopic-amd64-desktop ISO. It must have 2 disks, one of 16G and one of 8G.

  apt-get update && apt-get install bcache-tools
  /dev/sda is 16G caching device
  /dev/sdb is 8G backing device

  make-bcache --writeback --discard -C /dev/sda -B /dev/sdb

Now the machine hangs!
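
A guard along these lines could flag this layout before make-bcache is run. It is only a sketch: the function name cache_exceeds_backing is hypothetical (it is not part of bcache-tools), it takes sizes in bytes as plain arguments, and it assumes the "16G" and "8G" disks are 16 GiB and 8 GiB.

```shell
#!/bin/sh
# Hypothetical helper (not part of bcache-tools): succeeds when the cache
# device is larger than the backing device. Both sizes are in bytes.
cache_exceeds_backing() {
    cache_bytes=$1
    backing_bytes=$2
    [ "$cache_bytes" -gt "$backing_bytes" ]
}

# Sizes from the report, assuming GiB: 16G cache, 8G backing.
if cache_exceeds_backing $((16 * 1024 * 1024 * 1024)) $((8 * 1024 * 1024 * 1024)); then
    echo "warning: cache device is larger than backing device"
fi
```

On a real system the two sizes could be read with blockdev --getsize64 /dev/sda and blockdev --getsize64 /dev/sdb (root required).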

If you reboot and reinstall bcache-tools, you can see that the bcache device has been created:

  ************************************
  bcache-super-show -f /dev/sda:
  ==============================
  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum 45C96B38A79275A7 [match]
  sb.version 0 [cache device]

  dev.label (empty)
  dev.uuid 155ab65e-e74a-4ca8-91da-467ef696db63
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.cache.first_sector 1024
  dev.cache.cache_sectors 33553408
  dev.cache.total_sectors 33554432
  dev.cache.ordered no
  dev.cache.discard yes
  dev.cache.pos 0
  dev.cache.replacement 0 [lru]

  cset.uuid bcaa2595-bbc8-43a7-8c44-5d0aca0f3fc8

  *****************************************************
  bcache-super-show /dev/sdb:
  =============================================
  sb.magic ok
  sb.first_sector 8 [match]
  sb.csum F37A1BB08A7F5DE7 [match]
  sb.version 1 [backing device]

  dev.label (empty)
  dev.uuid 341f1c83-1797-4157-aa9e-524107db5606
  dev.sectors_per_block 1
  dev.sectors_per_bucket 1024
  dev.data.first_sector 16
  dev.data.cache_mode 1 [writeback]
  dev.data.cache_state 0 [detached]

  cset.uuid bcaa2595-bbc8-43a7-8c44-5d0aca0f3fc8
  *****************************************************
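
The cache superblock numbers above are self-consistent, which the following arithmetic sketch checks. It assumes the superblock fields count 512-byte sectors; the variable names are illustrative only.

```shell
#!/bin/sh
# Assumption: bcache superblock fields are counted in 512-byte sectors.
SECTOR_BYTES=512

total_sectors=33554432   # dev.cache.total_sectors of /dev/sda
first_sector=1024        # dev.cache.first_sector of /dev/sda

# Sectors left for cache data once the leading region is skipped.
cache_sectors=$((total_sectors - first_sector))
# Whole-device size in GiB.
total_gib=$((total_sectors * SECTOR_BYTES / 1024 / 1024 / 1024))

echo "cache_sectors=$cache_sectors total_gib=$total_gib"
```

Under that assumption the result matches dev.cache.cache_sectors (33553408) and the 16G size of /dev/sda.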

You can format it:
  mkfs.ext4 /dev/bcache0
  Discarding device blocks: done
  Creating filesystem with 2097150 4k blocks and 524288 inodes
  Filesystem UUID: b99a09de-3a92-4a44-82cd-29f75f8bf0c9
  Superblock backups stored on blocks:
   32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

  Allocating group tables: done
  Writing inode tables: done
  Creating journal (32768 blocks): done
  Writing superblocks and filesystem accounting information:
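
The block count reported by mkfs.ext4 is what one would expect for /dev/bcache0 sitting on the 8G backing device. A small sketch of the arithmetic, assuming /dev/sdb is exactly 8 GiB and that dev.data.first_sector is counted in 512-byte sectors:

```shell
#!/bin/sh
# Assumptions: /dev/sdb is exactly 8 GiB and sectors are 512 bytes.
backing_bytes=$((8 * 1024 * 1024 * 1024))
data_offset_bytes=$((16 * 512))   # dev.data.first_sector = 16
fs_block_bytes=4096               # "4k blocks" in the mkfs.ext4 output

# Space left for the filesystem after the bcache data offset, in 4k blocks.
usable_blocks=$(( (backing_bytes - data_offset_bytes) / fs_block_bytes ))
echo "usable_blocks=$usable_blocks"
```

This gives 2097150, the same number mkfs.ext4 printed, so the exposed device size is the backing device minus the 8 KiB data offset and does not depend on the cache device size.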

Restarting the format triggered another bug (X.org crashing):
  free from cli:
                          total       used       free     shared    buffers     cached
  Mem:                  1017168     814040     203128     195568      91264     513064
  -/+ buffers/cache:                209712     807456
  Swap:                       0          0          0

  tail /var/log/syslog:
  [ 568.016481] [ 4983] 0 4983 3332 121 11 0 0 mdadm
  [ 568.016483] [ 5259] 0 5259 6703 300 19 0 0 mkfs.ext4
  [ 568.016484] Out of memory: Kill process 1947 (Xorg) score 38 or sacrifice child
  [ 568.016485] Killed process 1947 (Xorg) total-vm:313160kB, anon-rss:35976kB, file-rss:4232kB
  [ 568.707107] systemd-logind[1396]: Failed to start unit user@112.service: Unknown unit: user@112.service
  [ 568.707113] systemd-logind[1396]: Failed to start user service: Unknown unit: user@112.service
  [ 568.709876] systemd-logind[1396]: New session c8 of user lightdm.
  [ 568.709890] systemd-logind[1396]: Linked /tmp/.X11-unix/X0 to /run/user/112/X11-display.
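
For reading the free output above: the "-/+ buffers/cache" row is derived from the "Mem:" row. A sketch of that derivation, assuming the values are in KiB (the default unit of free):

```shell
#!/bin/sh
# Values copied from the "Mem:" row of the free output, assumed to be KiB.
total=1017168
used=814040
free_kib=203128
buffers=91264
cached=513064

# Memory used by processes, excluding buffers and page cache.
used_by_apps=$((used - buffers - cached))
# Memory that is free or held by buffers and page cache.
reclaimable=$((free_kib + buffers + cached))

echo "used_by_apps=$used_by_apps reclaimable=$reclaimable"
echo "used+free=$((used + free_kib)) total=$total"
```

The two results (209712 and 807456) match the "-/+ buffers/cache" row, and used plus free adds up to the total.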

You can unregister the bcache device, but to re-use all the disks you'll need a reboot (cf. bug 1377142):
  [ 830.811150] bcache: cached_dev_detach_finish() Caching disabled for sdb
  [ 830.949898] bcache: cache_set_free() Cache set bcaa2595-bbc8-43a7-8c44-5d0aca0f3fc8 unregistered

Kick In (kick-d)
description: updated
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1377130

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Kick In (kick-d)
description: updated
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.17 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.17-rc7-utopic/

tags: added: kernel-da-key utopic
Changed in linux (Ubuntu):
importance: Undecided → High
Chris J Arges (arges)
description: updated
Chris J Arges (arges) wrote :

I could not easily reproduce this with a VM according to your instructions. Are there other factors that are necessary to hit this issue?
When the machine hangs do you have any message printed to the console showing the backtrace?

Stefan Bader (smb) wrote :

+1 on the fail to reproduce. Followed the steps and everything worked without issues (using kvm/qemu VM with emulated devices and even sda being backed by a real ssd, but I am not sure the difference gets propagated).

Kick In (kick-d) wrote : Re: [Bug 1377130] Re: Crash at creation of bcache if caching size > backing size

Strange, I can reproduce it every time. I'm using the ISO
ubuntu-14.10-beta2-desktop-amd64.iso from http://releases.ubuntu.com/utopic/
I'm on utopic, with virt-manager.

Regards.
On 08/10/2014 13:17, Stefan Bader wrote:
> +1 on the fail to reproduce. Followed the steps and everything worked
> without issues (using kvm/qemu VM with emulated devices and even sda
> being backed by a real ssd, but I am not sure the difference gets
> propagated).
>

Kick In (kick-d) wrote :

Yes, you can get rid of bcache0, but you can't re-use the devices for anything (bcache again, or just plain fdisk/parted) unless you reboot.

In fact, whether or not you unmount before stopping the bcache device doesn't change the behaviour. I did it that way in the description to show the process.
You just stop the caching device with the echo 1 > ...
But the backing device is still in use and you have no control over it.

Ryan Harper (raharper) wrote :

I can't reproduce the hang on vivid.

As for releasing all the devices used in a bcache setup, it's certainly awkward. Fundamentally, there's no path in sysfs that contains both the cache devs and the backing devs. And AFAIK, there is no way to know that a bcacheX device is being cached by a specific cache set (/sys/fs/bcache/$UUID).

It's certainly possible to release all of the devices, but it does take some care:

http://paste.ubuntu.com/12141060/
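
As a rough illustration of the kind of care involved (this is not the content of the paste above), here is a dry-run sketch that only prints the sysfs writes it would perform. The function name release_plan and its optional sysfs-root argument are hypothetical; the stop and unregister files are the ones described in the kernel's bcache documentation, and the order (bcache devices first, then cache sets) is an assumption.

```shell
#!/bin/sh
# Dry-run sketch: print the sysfs writes that would stop every bcache
# device and unregister every cache set. Nothing is written to sysfs.
# release_plan is a hypothetical name; the first argument overrides the
# sysfs root so the function can be tried against a fake directory tree.
release_plan() {
    sysfs_root=${1:-/sys}
    # Stopping bcacheX shuts the device down and closes its backing device.
    for f in "$sysfs_root"/block/bcache*/bcache/stop; do
        if [ -e "$f" ]; then printf 'echo 1 > %s\n' "$f"; fi
    done
    # Unregistering a cache set detaches backing devices and closes the cache devices.
    for f in "$sysfs_root"/fs/bcache/*/unregister; do
        if [ -e "$f" ]; then printf 'echo 1 > %s\n' "$f"; fi
    done
}

release_plan
```

On a machine without bcache devices this prints nothing. The printed commands would have to be run as root, and whether the disks can then be re-used without a reboot is exactly what this bug and bug 1377142 question.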

Changed in bcache-tools (Ubuntu):
status: New → Invalid
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired