md/raid1: fix request counting bug in new 'barrier' code.
The new iobarrier implementation in raid1 (which keeps normal writes
and resync activity separate) counts every request that is not before
the current resync point in either next_window_requests or
current_window_requests.
It flags that the request is counted by setting ->start_next_window.
allow_barrier follows this model exactly and decrements one of the
*_window_requests if and only if ->start_next_window is set.
However wait_barrier(), which increments *_window_requests, uses a
slightly different test for setting ->start_next_window (which is set
from the return value of this function).
So there is a possibility of the counts getting out of sync, and this
leads to the resync hanging.
So change wait_barrier() to return a non-zero value in exactly the
same cases that it increments *_window_requests.
This sounds very much like the bug described in the commit below, which is not yet in our kernel (it will be on the next rebase):
commit 41a336e011887f73e7c879b60e1e3544045435cb
Author: NeilBrown <email address hidden>
Date: Tue Jan 14 11:56:14 2014 +1100
md/raid1: fix request counting bug in new 'barrier' code.
The new iobarrier implementation in raid1 (which keeps normal writes
and resync activity separate) counts every request that is not before
the current resync point in either next_window_requests or
current_window_requests.
It flags that the request is counted by setting ->start_next_window.
allow_barrier follows this model exactly and decrements one of the
*_window_requests if and only if ->start_next_window is set.
However wait_barrier(), which increments *_window_requests, uses a
slightly different test for setting ->start_next_window (which is set
from the return value of this function).
So there is a possibility of the counts getting out of sync, and this
leads to the resync hanging.
So change wait_barrier() to return a non-zero value in exactly the
same cases that it increments *_window_requests.
Bug was introduced in 3.13-rc1.
Reported-by: Bruno Wolff III <email address hidden>
URL: https://bugzilla.kernel.org/show_bug.cgi?id=68061
Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
Cc: majianpeng <email address hidden>
Signed-off-by: NeilBrown <email address hidden>
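
To make the invariant in that commit message concrete, here is a toy C sketch. It is not the raid1.c code: the counter names come from the commit message, but the struct layout, the resync_point and window_boundary fields, and the token values are all assumptions made up for illustration. The only point it shows is that the "wait" side returns non-zero in exactly the cases where it increments a counter, so the "allow" side can decrement if and only if the saved value is non-zero.

```c
#include <assert.h>

/* Toy model only. The two counters are named after the commit message;
 * every other name here is hypothetical. */
struct toy_conf {
    int  current_window_requests;
    int  next_window_requests;
    long resync_point;    /* assumed: requests below this are not counted */
    long window_boundary; /* assumed: at or above this counts as "next" */
};

enum { NOT_COUNTED = 0, IN_CURRENT = 1, IN_NEXT = 2 };

/* Returns non-zero in exactly the cases where a counter is incremented. */
int toy_wait_barrier(struct toy_conf *c, long sector)
{
    if (sector < c->resync_point)
        return NOT_COUNTED;
    if (sector >= c->window_boundary) {
        c->next_window_requests++;
        return IN_NEXT;
    }
    c->current_window_requests++;
    return IN_CURRENT;
}

/* Decrements a counter if and only if the saved token is non-zero. */
void toy_allow_barrier(struct toy_conf *c, int token)
{
    if (token == IN_NEXT)
        c->next_window_requests--;
    else if (token == IN_CURRENT)
        c->current_window_requests--;
}

/* Issues a few requests, completes them, and reports whether both
 * counters are back at zero (1 = balanced, 0 = out of sync). */
int toy_counts_balance(void)
{
    struct toy_conf c = { 0, 0, 100, 200 };
    long sectors[] = { 50, 150, 250, 100, 199, 200 };
    int tokens[6];
    int i;

    for (i = 0; i < 6; i++)
        tokens[i] = toy_wait_barrier(&c, sectors[i]);

    /* Sector 50 is below the resync point, so only five are counted. */
    assert(c.current_window_requests + c.next_window_requests == 5);

    for (i = 0; i < 6; i++)
        toy_allow_barrier(&c, tokens[i]);

    return c.current_window_requests == 0 && c.next_window_requests == 0;
}
```

Calling toy_counts_balance() from a main() should report a balanced result. If toy_wait_barrier() returned non-zero under a different test than the one guarding the increments, some requests would be incremented but never decremented (or the reverse), which is the kind of drift the commit message blames for the resync hang.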