Comment 25 for bug 1925211

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Kaihenfeng,
Thanks for your patch suggestion! I'm semantically not sure it is the right thing - to clarify your theory is that before it checked !resuming and before had the check for !cdev maybe just to avoid a deference error. And now you assume that instead of !cdev it should check if there is a cdev there.
I'm unsure - if !cdev was indeed just to protect the dereference then maybe no check at all might be better. Which would then read "if the event is IO_SCH_ORPH_UNREG or IO_SCH_UNREG then do css_sch_device_unregister.

But that I'm not immediately convinced doesn't mean much and it is easy to test and surely worth a try, so I ran v5.11 (bad) plus your patch and the result will be useful to know in any case. It is working fine, that much I can tell you.

But if my thought above was right (it was only there to avoid the potential deference error), then why check it at all. If the condition cdev==NULL is possible it would now skip to to fully remove it - we might not need that at all.
And Since I brought up the idea of dropping the cdev check entirely that was worth a try as well. So now the third check of this morning is for:
--- a/drivers/s390/cio/device.c
+++ b/drivers/s390/cio/device.c
@@ -1525,8 +1525,7 @@ static int io_subchannel_sch_event(struct subchannel *sch, int process)
        switch (action) {
        case IO_SCH_ORPH_UNREG:
        case IO_SCH_UNREG:
- if (!cdev)
- css_sch_device_unregister(sch);
+ css_sch_device_unregister(sch);
                break;
        case IO_SCH_ORPH_ATTACH:
        case IO_SCH_UNREG_ATTACH:

My patch with that change - in my test - is working as well.
Neither of the solutions has triggered other regressions in my setup - but then there are so many potential use-cases that I can't be sure without a further revew by subject matter experts.

So a summary of the recent tests:

5.11.0-16-generic #17+lp1925211v202104201520 (Seths full revert) - working
5.11.0lp1925211-patch-kaihengfeng-dirty - working
5.11.0nocdevcheck-paelzer-dirty - working

I think we'd want an answer from the IBM devs which solution (full revert, kaihenfeng patch, cpaelzer patch, another approach) they would prefer - then we can submit it upstream for them to include officially and we can carry it as delta until we rebase onto a version that has it applied anyway.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8cc0dcfdc1c0e0be107d0288f9c0cf1f4201be62