scsi: hisi_sas: Increase debugfs_dump_index after dump is completed

Bug #1982070 reported by Fred Kimmy
This bug affects 1 person
Affects           Status        Importance  Assigned to  Milestone
kunpeng920        Fix Released  Undecided   Ike Panhc
Ubuntu-20.04-hwe  Fix Released  Undecided   Ike Panhc
linux (Ubuntu)    Fix Released  Undecided   Unassigned
Jammy             Fix Released  Medium      Ike Panhc
Kinetic           Fix Released  Undecided   Unassigned

Bug Description

[Impact]
Triggering a debugfs dump on hisi_sas causes a kernel oops.

[Test Plan]
1) modprobe hisi_sas_main with "debugfs_enable=1"
2) echo 1 | sudo tee /sys/kernel/debug/hisi_sas/0000\:74\:02.0/trigger_dump
3) dmesg | grep Oops
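
The check in step 3 can be wrapped in a small helper that returns a pass/fail status. This is only a minimal sketch: the function name `log_has_oops` is made up for illustration, and it simply looks for the string `Oops`, as step 3 does.

```shell
#!/bin/sh
# log_has_oops is a hypothetical helper name, not part of any existing tool.
# It reads a kernel log on stdin and returns 0 (true) if an oops line is present.
log_has_oops() {
    grep -q 'Oops'
}

# Possible usage on a test machine (reading the kernel log may need root):
#   if dmesg | log_has_oops; then echo "FAIL: kernel oops"; else echo "PASS"; fi
```

Running the test plan before and after installing a patched kernel should flip the result from FAIL to PASS, provided the kernel log is cleared (`dmesg -C`) between runs.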

[Regression Risk]
The patch only touches code in hisi_sas, so a full test run on hisi_sas is needed. Other drivers and platforms are not affected.

================
[Bug Description]
 When the hisi_sas_main driver is loaded with the DFX function enabled (debugfs_enable=1), triggering a dump or resetting the SAS controller prints a call trace. In addition, the hisi_sas_v3_hw driver then stays in use and cannot be unloaded.

[Steps to Reproduce]
1) dmesg -C
2) dmesg
3) lsblk
4) lsscsi -p
5) lsmod | grep hisi_sas_v3
6) rmmod hisi_sas_v3_hw
7) rmmod hisi_sas_main
8) modprobe hisi_sas_main debugfs_enable=1
9) modprobe hisi_sas_v3_hw
10) cd /sys/kernel/debug/hisi_sas/0000\:74\:02.0/
11) ll
12) echo 1 > trigger_dump
13) echo 1 > trigger_dump
14) dmesg
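
The module reload and the two dump triggers from the steps above could be scripted roughly as follows. This is a sketch, not a tested reproducer: the names `run` and `reproduce` and the `DRY_RUN` and `PCI_ADDR` variables are made up for illustration, and the PCI address is the one used in this report. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
#!/bin/sh
# PCI_ADDR defaults to the controller address used in this report; adjust as needed.
PCI_ADDR="${PCI_ADDR:-0000:74:02.0}"
DEBUGFS_DIR="/sys/kernel/debug/hisi_sas/${PCI_ADDR}"

# run: print the command in dry-run mode, execute it otherwise (hypothetical helper).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reproduce: reload the drivers with debugfs enabled, then trigger the dump twice.
reproduce() {
    run rmmod hisi_sas_v3_hw
    run rmmod hisi_sas_main
    run modprobe hisi_sas_main debugfs_enable=1
    run modprobe hisi_sas_v3_hw
    for attempt in 1 2; do
        run sh -c "echo 1 > '${DEBUGFS_DIR}/trigger_dump'"
    done
    run dmesg
}
```

To actually execute the steps, call `DRY_RUN=0 reproduce` as root, and only on a test machine whose root filesystem does not sit behind this controller, since the script unloads the hisi_sas drivers.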

[Actual Results]
[ 1005.899976] sas: broadcast received: 0
[ 1005.899997] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.901775] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.901777] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.901793] sas: broadcast received: 0
[ 1005.901820] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.903563] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.903570] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.903586] sas: broadcast received: 0
[ 1005.903611] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.905387] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.905388] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.905404] sas: broadcast received: 0
[ 1005.905429] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.907161] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.907168] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.907182] sas: broadcast received: 0
[ 1005.907207] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.908944] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.908946] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.909025] sas: broadcast received: 0
[ 1005.909062] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.910912] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.910919] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.910947] sas: broadcast received: 0
[ 1005.910985] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.912843] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.912847] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.912877] sas: broadcast received: 0
[ 1005.912911] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.915191] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.915198] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.915221] sas: broadcast received: 0
[ 1005.915259] sas: REVALIDATING DOMAIN on port 0, pid:1170
[ 1005.917957] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.917965] sas: done REVALIDATING DOMAIN on port 0, pid:1170, res 0x0
[ 1005.920337] sd 4:0:11:0: [sdl] Attached SCSI disk
[ 1005.921692] sd 4:0:4:0: [sde] Attached SCSI disk
[ 1008.107610] hisi_sas_v3_hw 0000:b4:02.0: 16 hw queues
[ 1008.112712] scsi host6: hisi_sas_v3_hw
[ 1010.428061] hisi_sas_v3_hw 0000:b4:04.0: 16 hw queues
[ 1010.433120] scsi host7: hisi_sas_v3_hw
root@ubuntu:/sys/kernel/debug/hisi_sas/0000:74:02.0#

[Expected Results]
Recurrence Logs:
[ 360.441633] SET = 0, FnV = 0
[ 360.444689] EA = 0, S1PTW = 0
[ 360.447863] Data abort info:
[ 360.450783] ISV = 0, ISS = 0x00000044
[ 360.454663] CM = 0, WnR = 1
[ 360.457673] user pgtable: 4k pages, 48-bit VAs, pgdp=000000211b7ae000
[ 360.464140] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 360.470969] Internal error: Oops: 96000044 [#2] SMP
[ 360.475844] Modules linked in: hisi_sas_v3_hw hisi_sas_main nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif joydev input_leds efi_pstore arm_spe_pmu hisi_hpre ecdh_generic libcurve25519_generic hns_roce_hw_v2 ecc hisi_zip uio_pdrv_genirq uio hisi_sec2 hisi_qm uacce authenc acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler hisi_trng_v2 cppc_cpufreq sch_fq_codel ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_uverbs ib_core realtek hibmc_drm drm_vram_helper drm_ttm_helper ttm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_ce hid_generic ghash_ce sysimgblt sha2_ce fb_sys_fops sha256_arm64 mlx5_core cec usbhid ixgbe sha1_ce hns3 rc_core hclge psample xfrm_algo hid nvme drm mdio libsas mlxfw xhci_pci hnae3 nvme_core xhci_pci_renesas ahci scsi_transport_sas tls spi_dw_mmio gpio_dwapb spi_dw
[ 360.476356] aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher [last unloaded: hisi_sas_main]
[ 360.572957] CPU: 12 PID: 1389 Comm: kworker/u256:3 Tainted: G D 5.13.0-44-generic #49~20.04.1-Ubuntu
[ 360.583361] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 2280-V2 CS V5.B221.01 12/09/2021
[ 360.592983] Workqueue: 0000:b4:04.0 hisi_sas_rst_work_handler [hisi_sas_main]
[ 360.600138] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 360.606135] pc : debugfs_snapshot_regs_v3_hw+0xe4/0x418 [hisi_sas_v3_hw]
[ 360.612841] lr : debugfs_snapshot_regs_v3_hw+0xe4/0x418 [hisi_sas_v3_hw]
[ 360.619536] sp : ffff80002fdebd10
[ 360.622846] x29: ffff80002fdebd10 x28: 0000000000000000 x27: 0000000000000000
[ 360.629974] x26: ffff202000cdaa6c x25: 0000000000000000 x24: ffff0020fe8fe8b0
[ 360.637101] x23: 0000000000000000 x22: ffff202017395ce8 x21: ffff2020173a4880
[ 360.644225] x20: 0000000000000000 x19: ffff202017380880 x18: 000000000000000e
[ 360.651351] x17: 0000000000000001 x16: ffffba5e9251f9a8 x15: 0000000000000000
[ 360.658474] x14: 0000000000000000 x13: 0000000000000020 x12: ffffba5e9388fa58
[ 360.665599] x11: 0000000000000040 x10: ffffba5e949a7cc0 x9 : ffffba5e4c21a8e4
[ 360.672724] x8 : ffff202001a56ff0 x7 : 0000000000000000 x6 : 0000000000000000
[ 360.679848] x5 : ffff202001a56fc8 x4 : ffff202001a57050 x3 : 0000000000003011
[ 360.686974] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 360.694100] Call trace:
[ 360.696547] debugfs_snapshot_regs_v3_hw+0xe4/0x418 [hisi_sas_v3_hw]
[ 360.702900] hisi_sas_controller_prereset+0x8c/0xa8 [hisi_sas_main]
[ 360.709168] hisi_sas_rst_work_handler+0x28/0x48 [hisi_sas_main]
[ 360.715175] process_one_work+0x1fc/0x4b8
[ 360.719194] worker_thread+0x148/0x510
[ 360.722946] kthread+0x114/0x120
[ 360.726177] ret_from_fork+0x10/0x18
[ 360.729762] Code: d503201f 2a1403e1 f9401660 97fff114 (b8346ae0)
[ 360.735843] ---[ end trace ff032567bd4ebc0a ]---
root@mycover:~#

Normal log:
[ 1005.899976] sas: broadcast received: 0
[ 1005.899997] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.901775] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.901777] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.901793] sas: broadcast received: 0
[ 1005.901820] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.903563] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.903570] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.903586] sas: broadcast received: 0
[ 1005.903611] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.905387] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.905388] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.905404] sas: broadcast received: 0
[ 1005.905429] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.907161] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.907168] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.907182] sas: broadcast received: 0
[ 1005.907207] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.908944] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.908946] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.909025] sas: broadcast received: 0
[ 1005.909062] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.910912] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.910919] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.910947] sas: broadcast received: 0
[ 1005.910985] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.912843] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.912847] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.912877] sas: broadcast received: 0
[ 1005.912911] sas: REVALIDATING DOMAIN on port 0, pid:8
[ 1005.915191] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.915198] sas: done REVALIDATING DOMAIN on port 0, pid:8, res 0x0
[ 1005.915221] sas: broadcast received: 0
[ 1005.915259] sas: REVALIDATING DOMAIN on port 0, pid:1170
[ 1005.917957] sas: ex 570fd45f9d17b01f phys DID NOT change
[ 1005.917965] sas: done REVALIDATING DOMAIN on port 0, pid:1170, res 0x0
[ 1005.920337] sd 4:0:11:0: [sdl] Attached SCSI disk
[ 1005.921692] sd 4:0:4:0: [sde] Attached SCSI disk
[ 1008.107610] hisi_sas_v3_hw 0000:b4:02.0: 16 hw queues
[ 1008.112712] scsi host6: hisi_sas_v3_hw
[ 1010.428061] hisi_sas_v3_hw 0000:b4:04.0: 16 hw queues
[ 1010.433120] scsi host7: hisi_sas_v3_hw
root@ubuntu:/sys/kernel/debug/hisi_sas/0000:74:02.0#

[Reproducibility]
Conditionally recurs.

[Additional information2022053006819]
 Ubuntu-hwe-5.13-5.13.0-35.40_20.04.1

[Resolution]
 http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.16-rc1&id=9aec5ffa6e39926cff1a6b576c815a9cee90e259

 scsi: hisi_sas: Increase debugfs_dump_index after dump is completed

Ike Panhc (ikepanhc) wrote :

LGTM, I will try to backport and reproduce.

Changed in kunpeng920:
status: New → In Progress
assignee: nobody → Ike Panhc (ikepanhc)
Ike Panhc (ikepanhc) wrote :

I can reproduce this, but it seems the kernel log with the backtrace is the faulty one. I will build a kernel and verify.

Ike Panhc (ikepanhc) wrote :

I built a patched kernel and can trigger the dump without a kernel oops. Thanks. I will send the patch.

Ike Panhc (ikepanhc)
Changed in linux (Ubuntu):
status: New → In Progress
assignee: nobody → Ike Panhc (ikepanhc)
Ike Panhc (ikepanhc)
description: updated
Ike Panhc (ikepanhc) wrote :

The patch has been in the mainline kernel since v5.16. When the Kinetic kernel moves to v5.19, this issue will be fixed for 22.10.

Changed in linux (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → Ike Panhc (ikepanhc)
Changed in linux (Ubuntu):
assignee: Ike Panhc (ikepanhc) → nobody
status: In Progress → Fix Committed
Ike Panhc (ikepanhc) wrote :

I dug deeper into the 5.4 kernels: they also support debugfs with hisi_sas, but the mainline patch cannot be cherry-picked cleanly. I will try to reproduce this issue with a 5.4 kernel and find out whether we need to backport.

Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
importance: Undecided → Medium
Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Kinetic):
status: Fix Committed → Fix Released
Ike Panhc (ikepanhc) wrote :

I cannot reproduce this on the 5.4.0-125.141-generic kernel. The only affected Ubuntu kernel is 5.15.

Changed in kunpeng920:
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.15.0-50.56 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Ike Panhc (ikepanhc) wrote :

Jammy kernel 5.15.0-50.56 works for me. Thanks.
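
To check whether a machine already runs at least the version verified here, a version-order comparison is enough. The following is a sketch under assumptions: `version_ge` is a hypothetical helper, it relies on `sort -V` from GNU coreutils, and the threshold 5.15.0-50.56 applies only to the jammy linux package named in this comment.

```shell
#!/bin/sh
# version_ge A B: return 0 (true) if version A is greater than or equal to B.
# Hypothetical helper; depends on version sorting via GNU "sort -V".
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on Ubuntu (the package name is an assumption for the generic flavour):
#   installed=$(dpkg-query -W -f='${Version}' "linux-image-$(uname -r)")
#   version_ge "$installed" 5.15.0-50.56 && echo "kernel includes the verified fix"
```

Other kernel flavours carry their own version numbers, so the threshold must be taken from the matching changelog.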

tags: added: verification-done-jammy
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.15.0-50.56

---------------
linux (5.15.0-50.56) jammy; urgency=medium

  * jammy/linux: 5.15.0-50.56 -proposed tracker (LP: #1990148)

  * CVE-2022-3176
    - io_uring: refactor poll update
    - io_uring: move common poll bits
    - io_uring: kill poll linking optimisation
    - io_uring: inline io_poll_complete
    - io_uring: correct fill events helpers types
    - io_uring: clean cqe filling functions
    - io_uring: poll rework
    - io_uring: remove poll entry from list when canceling all
    - io_uring: bump poll refs to full 31-bits
    - io_uring: fail links when poll fails
    - io_uring: fix wrong arm_poll error handling
    - io_uring: fix UAF due to missing POLLFREE handling

  * ip/nexthop: fix default address selection for connected nexthop
    (LP: #1988809)
    - selftests/net: test nexthop without gw

  * ip/nexthop: fix default address selection for connected nexthop
    (LP: #1988809) // icmp_redirect.sh in ubuntu_kernel_selftests failed on
    Jammy 5.15.0-49.55 (LP: #1990124)
    - ip: fix triggering of 'icmp redirect'

linux (5.15.0-49.55) jammy; urgency=medium

  * jammy/linux: 5.15.0-49.55 -proposed tracker (LP: #1989785)

  * amdgpu module crash after 5.15 kernel update (LP: #1981883)
    - drm/amdgpu: fix check in fbdev init

  * scsi: hisi_sas: Increase debugfs_dump_index after dump is  completed
    (LP: #1982070)
    - scsi: hisi_sas: Increase debugfs_dump_index after dump is completed

  * [UBUNTU 22.04] s390/qeth: cache link_info for ethtool (LP: #1984103)
    - s390/qeth: cache link_info for ethtool

  * WARN in trace_event_dyn_put_ref (LP: #1987232)
    - tracing/perf: Fix double put of trace event when init fails

  * Jammy update: v5.15.60 upstream stable release (LP: #1989221)
    - x86/speculation: Make all RETbleed mitigations 64-bit only
    - selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads
    - selftests/bpf: Check dst_port only on the client socket
    - block: fix default IO priority handling again
    - tools/vm/slabinfo: Handle files in debugfs
    - ACPI: video: Force backlight native for some TongFang devices
    - ACPI: video: Shortening quirk list by identifying Clevo by board_name only
    - ACPI: APEI: Better fix to avoid spamming the console with old error logs
    - crypto: arm64/poly1305 - fix a read out-of-bound
    - KVM: x86: do not report a vCPU as preempted outside instruction boundaries
    - KVM: x86: do not set st->preempted when going back to user space
    - KVM: selftests: Make hyperv_clock selftest more stable
    - tools/kvm_stat: fix display of error when multiple processes are found
    - selftests: KVM: Handle compiler optimizations in ucall
    - KVM: x86/svm: add __GFP_ACCOUNT to __sev_dbg_{en,de}crypt_user()
    - arm64: set UXN on swapper page tables
    - btrfs: zoned: prevent allocation from previous data relocation BG
    - btrfs: zoned: fix critical section of relocation inode writeback
    - Bluetooth: hci_bcm: Add BCM4349B1 variant
    - Bluetooth: hci_bcm: Add DT compatible for CYW55572
    - dt-bindings: bluetooth: broadcom: Add BCM4349B1 DT binding
    - Bluetooth: btusb: Add support of IMC Netw...

Changed in linux (Ubuntu Jammy):
status: Fix Committed → Fix Released
Ike Panhc (ikepanhc)
Changed in kunpeng920:
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-gkeop-5.15/5.15.0-1005.7~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-bluefield/5.15.0-1010.12 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-bluefield verification-needed-jammy
removed: verification-done-jammy
Ike Panhc (ikepanhc) wrote :

This issue is already verified with -generic kernel.

tags: added: verification-done-jammy
removed: verification-needed-jammy
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia/5.15.0-1011.11 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-nvidia verification-needed-jammy
removed: verification-done-jammy
Ike Panhc (ikepanhc)
tags: added: verification-donejammy
removed: verification-needed-jammy
tags: added: verification-done-jammy
removed: verification-donejammy
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-mtk/5.15.0-1030.34 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-mtk' to 'verification-done-jammy-linux-mtk'. If the problem still exists, change the tag 'verification-needed-jammy-linux-mtk' to 'verification-failed-jammy-linux-mtk'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-mtk-v2 verification-needed-jammy-linux-mtk