show-regs can cause some samsung controllers to go offline
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| nvme-cli (Debian) | Fix Released | Unknown | | |
| nvme-cli (Ubuntu) | Fix Released | Undecided | dann frazier | |
| Groovy | Won't Fix | Undecided | dann frazier | |
| Hirsute | Fix Released | Undecided | dann frazier | |
| Impish | Fix Released | Undecided | dann frazier | |
Bug Description
[Impact]
nvme show-regs has been found to cause certain Samsung controllers
(MZ1L21T9HCLS in particular) to go offline.
[Test Case]
Run `nvme show-regs` on an affected controller device. Messages similar to the following will appear in dmesg:
[963314.311332] nvme nvme2: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[963334.951328] nvme nvme2: Device not ready; aborting reset
[963334.963114] nvme nvme2: Removing after probe failure status: -19
[963334.999600] blk_update_request: I/O error, dev nvme2n1, sector 1050640 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[963335.023410] md: super_written gets error=10
[963335.033842] md/raid1:md0: Disk failure on nvme2n1p2, disabling device.
[ +0.009599] XFS (md127): log I/O error -5
[ +0.015136] XFS (md127): xfs_do_
[ +0.000001] XFS (md127): Log I/O Error Detected. Shutting down filesystem
[ +0.009290] XFS (md127): Please unmount the filesystem and rectify the problem(s)
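As a sketch, the failure signature quoted above can also be detected programmatically. The helper below is hypothetical (not part of nvme-cli); its patterns are taken directly from the log lines in this report:

```python
import re

# Hypothetical helper: detect the controller-offline signature from this
# report in dmesg-style text. Patterns come from the log lines above.
FAILURE_PATTERNS = [
    r"controller is down; will reset",
    r"Device not ready; aborting reset",
    r"Removing after probe failure",
]

def controller_went_offline(dmesg_text: str) -> bool:
    """Return True if any known failure message appears in the text."""
    return any(re.search(p, dmesg_text) for p in FAILURE_PATTERNS)

sample = "[963314.311332] nvme nvme2: controller is down; will reset: CSTS=0x3"
print(controller_went_offline(sample))       # True
print(controller_went_offline("nvme2: ok"))  # False
```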
[Fix]
This has been fixed upstream with the following commits:
https:/
https:/
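The commits split the 64-bit register access into its 32-bit lower/upper halves (prmscl/prmscu, as described in the next section). A minimal sketch of that general idea, with illustrative names that are assumptions rather than actual nvme-cli code:

```python
# Sketch only: the idea behind reading a 64-bit register as two separate
# 32-bit halves and recombining them, instead of a single 64-bit access
# that some controllers mishandle. Names are illustrative, not nvme-cli's.
def combine_reg_halves(lower: int, upper: int) -> int:
    """Combine 32-bit lower/upper register halves into a 64-bit value."""
    return ((upper & 0xFFFFFFFF) << 32) | (lower & 0xFFFFFFFF)

print(hex(combine_reg_halves(0x89ABCDEF, 0x01234567)))  # 0x123456789abcdef
```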
[What Could Go Wrong]
Because the register prmsc is now split into prmscl/prmscu, as the specification requires, the registers displayed in show-regs output will differ. This might surprise any code that tries to parse that output. Upstream also made a formatting change to the whitespace around a field when running with -H (human-readable mode):
This:
Controller Base Address (CBA) : 0
Became:
Controller Base Address (CBA): 0
This is human-readable mode, which I at least interpret as "not for scripting", but it is possible that a user expects that specific format. We could carry an additional patch to restore this whitespace if the SRU team is so inclined.
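A script parsing this output could indeed break on the whitespace change. A minimal illustration (the regexes are hypothetical, not taken from any real consumer of show-regs output):

```python
import re

# Old -H output had a space before the colon; the new output does not.
old_line = "Controller Base Address (CBA) : 0"
new_line = "Controller Base Address (CBA): 0"

# A brittle pattern that hard-codes the old " : " separator:
brittle = re.compile(r"^(?P<name>.+\S) : (?P<value>\S+)$")
print(bool(brittle.match(old_line)))  # True
print(bool(brittle.match(new_line)))  # False -- parsing silently breaks

# A tolerant pattern that accepts either form:
tolerant = re.compile(r"^(?P<name>.+?)\s*:\s*(?P<value>\S+)$")
print(bool(tolerant.match(old_line) and tolerant.match(new_line)))  # True
```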
description: updated
Changed in nvme-cli (Debian):
  status: Unknown → Confirmed
Changed in nvme-cli (Ubuntu Impish):
  status: New → In Progress
  assignee: nobody → dann frazier (dannf)
Changed in nvme-cli (Debian):
  status: Confirmed → Fix Released
Changed in nvme-cli (Ubuntu Impish):
  status: In Progress → Triaged
Changed in nvme-cli (Ubuntu Impish):
  status: Fix Committed → Fix Released
Changed in nvme-cli (Ubuntu Hirsute):
  status: New → In Progress
Changed in nvme-cli (Ubuntu Groovy):
  status: New → In Progress
  assignee: nobody → dann frazier (dannf)
Changed in nvme-cli (Ubuntu Hirsute):
  assignee: nobody → dann frazier (dannf)
Changed in nvme-cli (Ubuntu Groovy):
  status: In Progress → Won't Fix
This is fixed in the 1.14 upstream release which I've now sync'd from Debian experimental.