USB devices not detected during boot on USB 3.0 hubs

Bug #1968210 reported by Luke Nowakowski-Krijger
This bug affects 3 people

Affects          Status         Importance   Assigned to               Milestone
linux (Ubuntu)   Incomplete     Undecided    Luke Nowakowski-Krijger
  Focal          Fix Released   Undecided    Unassigned
  Impish         Fix Released   Undecided    Unassigned

Bug Description

[SRU Justification]

[Impact]
Users with certain Intel xHCI controllers are experiencing problems
with USB devices not being detected at boot.

This happens because, when the primary roothub is registered, device
enumeration starts before the xHC is running, so devices are not
detected. The failure shows up as an error that looks something like
'usb usb1-port3: couldn't allocate usb_device'.
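
To check a machine for this symptom, the kernel log can be searched for that message. The following is a minimal sketch, assuming the log text has already been captured (for example from dmesg output). The function name is illustrative, and only the message text comes from this report.

```python
import re
from typing import List

# Matches the failure message quoted in this report, e.g.
# "usb usb1-port3: couldn't allocate usb_device".
ALLOC_FAILURE = re.compile(r"usb (usb\d+-port\d+): couldn't allocate usb_device")


def find_alloc_failures(log_text: str) -> List[str]:
    """Return the roothub ports that reported the allocation failure."""
    return ALLOC_FAILURE.findall(log_text)
```

An empty list means the message was not found in the text that was passed in. That alone does not prove that every device was enumerated.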

[Fix]
Register both root hubs along with the secondary hcd for xhci.

The original fix was reverted upstream because of regressions caused by
a race when both roothubs were registered simultaneously.
However, with those regressions addressed in commits
("usb: hub: Fix usb enumeration issue due to address0 race")
("usb: hub: Fix locking issues with address0_mutex")
the maintainers have stated that they will be reintroducing this commit.
So let's reintroduce it here to fix the issues that users are
experiencing.
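
The ordering problem and the effect of deferring the primary roothub can be illustrated with a toy model. This is only a sketch and not kernel code. It assumes that the controller starts running once the secondary hcd has been added and that enumeration on a roothub registered earlier always fails, whereas on real hardware the failure depends on timing. All names are hypothetical.

```python
from typing import List, Tuple

# Failure text quoted in this report; everything else is a made-up model.
FAILURE = "couldn't allocate usb_device"


def register_roothubs(defer_primary: bool) -> List[Tuple[str, str]]:
    """Simulate the order of roothub registration for one xHCI controller.

    Model assumption: the controller only runs once the secondary hcd has
    been added, and enumeration on a roothub registered before that fails.
    """
    events: List[Tuple[str, str]] = []
    running = False

    def register(roothub: str) -> None:
        events.append((roothub, "enumerated" if running else FAILURE))

    # Step 1: the primary hcd is added.
    if not defer_primary:
        register("primary")  # registered too early, controller not running

    # Step 2: the secondary hcd is added and the controller starts running.
    running = True
    if defer_primary:
        register("primary")  # the deferred registration happens here
    register("secondary")
    return events
```

With defer_primary=False the model records the failure for the primary roothub. With defer_primary=True both roothubs are registered only after the controller is running.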

[Test Case]
Chris Chiu confirmed that this issue exists on hardware similar to that
reported by the users, and that reverting these reverts fixes it: there
is no sign of 'couldn't allocate usb_device' and USB devices are
available after boot.
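
The "USB devices are available after boot" part of this check could be scripted roughly as follows. This is a sketch that assumes the usual sysfs layout under /sys/bus/usb/devices, where roothubs appear as usbN and interfaces contain a colon. The function names are illustrative.

```python
import os
from typing import Iterable, List


def device_entries(names: Iterable[str]) -> List[str]:
    """Keep device entries; drop interfaces ('1-3:1.0') and roothubs ('usb1')."""
    return sorted(n for n in names if ":" not in n and not n.startswith("usb"))


def list_usb_devices(sysfs_dir: str = "/sys/bus/usb/devices") -> List[str]:
    """List enumerated USB devices, or an empty list if sysfs_dir is missing."""
    if not os.path.isdir(sysfs_dir):
        return []
    return device_entries(os.listdir(sysfs_dir))
```

An empty result on a machine with devices plugged in would be consistent with the problem described here, but it is not conclusive on its own.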

[Regression Potential]
Should be low now that we carry the fixes for the regressions that
seemed to be caused by this patch series.

------------------------------------------------------------------------
Some users of certain Intel xHCI controllers have reported that their USB devices are again not being detected after boot, after similar issues were previously found and fixed. This seems to be related to both [1] and [2], with most of the discussion about the recurrence on [1]. This bug report is being filed mainly to document this new regression.

This seems to be due to the patchset for [2] being reverted upstream due to regressions.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1939638
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1945211

summary:
Changed in linux (Ubuntu Focal):
status: New → Confirmed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1968210

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu Impish):
status: New → In Progress
Changed in linux (Ubuntu Focal):
status: Confirmed → In Progress
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Impish):
status: In Progress → Fix Committed
description: updated
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the linux/5.4.0-109.123 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
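
The tag change requested above amounts to swapping one tag for another. Below is a minimal sketch of that rule on a plain set of tag strings. It does not talk to Launchpad, and the function name is illustrative.

```python
from typing import Iterable, Set


def flip_verification_tag(tags: Iterable[str], series: str, fixed: bool) -> Set[str]:
    """Replace 'verification-needed-<series>' with the done or failed tag."""
    result = set(tags)
    needed = f"verification-needed-{series}"
    if needed not in result:
        return result  # nothing is awaiting verification for this series
    result.remove(needed)
    outcome = "done" if fixed else "failed"
    result.add(f"verification-{outcome}-{series}")
    return result
```

For example, flipping {"verification-needed-focal"} for series "focal" with fixed=True yields {"verification-done-focal"}.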
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure-5.4/5.4.0-1077.80~18.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Luke Nowakowski-Krijger (lukenow) wrote :

Verification of the fix was confirmed by Chris Chiu when testing the patch changes. For now we will flip the tags to verification-done for the bionic and focal 5.4 kernels.

We are also still waiting for Dries Oeyen and Tilman Schmidt to confirm that their issues are fixed as well.

tags: added: verification-done-bionic verification-done-focal
removed: verification-needed-bionic verification-needed-focal
Dries Oeyen (driesoeyen) wrote :

Thank you, all! I'll be able to run validation tests on my end on Tuesday next week (after Easter) and will report back here with the results.

Tilman Schmidt (tgs-bonn) wrote :

Please skip confirmation from me. I currently don't have a test environment available to test the -proposed kernel.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-109.123

---------------
linux (5.4.0-109.123) focal; urgency=medium

  * focal/linux: 5.4.0-109.123 -proposed tracker (LP: #1968290)

  * USB devices not detected during boot on USB 3.0 hubs (LP: #1968210)
    - SAUCE: Revert "Revert "xhci: Set HCD flag to defer primary roothub
      registration""
    - SAUCE: Revert "Revert "usb: core: hcd: Add support for deferring roothub
      registration""

linux (5.4.0-108.122) focal; urgency=medium

  * focal/linux: 5.4.0-108.122 -proposed tracker (LP: #1966740)

  * Packaging resync (LP: #1786013)
    - [Packaging] resync dkms-build{,--nvidia-N} from LRMv5
    - debian/dkms-versions -- update from kernel-versions (main/2022.03.21)

  * Low RX performance for 40G Solarflare NICs (LP: #1964512)
    - SAUCE: sfc: The size of the RX recycle ring should be more flexible

  * [UBUNTU 20.04] KVM: Enable storage key checking for intercepted instruction
    (LP: #1962831)
    - selftests: kvm: add _vm_ioctl
    - selftests: kvm: Introduce the TEST_FAIL macro
    - KVM: selftests: Add GUEST_ASSERT variants to pass values to host
    - KVM: s390: gaccess: Refactor gpa and length calculation
    - KVM: s390: gaccess: Refactor access address range check
    - KVM: s390: gaccess: Cleanup access to guest pages
    - s390/uaccess: introduce bit field for OAC specifier
    - s390/uaccess: fix compile error
    - s390/uaccess: Add copy_from/to_user_key functions
    - KVM: s390: Honor storage keys when accessing guest memory
    - KVM: s390: handle_tprot: Honor storage keys
    - KVM: s390: selftests: Test TEST PROTECTION emulation
    - KVM: s390: Add optional storage key checking to MEMOP IOCTL
    - KVM: s390: Add vm IOCTL for key checked guest absolute memory access
    - KVM: s390: Rename existing vcpu memop functions
    - KVM: s390: Add capability for storage key extension of MEM_OP IOCTL
    - KVM: s390: Update api documentation for memop ioctl
    - KVM: s390: Clarify key argument for MEM_OP in api docs
    - KVM: s390: Add missing vm MEM_OP size check

  * 【sec-0911】 fail to reset sec module (LP: #1943301)
    - crypto: hisilicon/sec2 - Add workqueue for SEC driver.
    - crypto: hisilicon/sec2 - update SEC initialization and reset

  * Lots of hisi_qm zombie task slow down system after stress test
    (LP: #1932117)
    - crypto: hisilicon - Use one workqueue per qm instead of per qp

  * Lots of hisi_qm zombie task slow down system after stress test
    (LP: #1932117) // 【sec-0911】 fail to reset sec module (LP: #1943301)
    - crypto: hisilicon - Unify hardware error init/uninit into QM

  * [UBUNTU 20.04] Fix SIGP processing on KVM/s390 (LP: #1962578)
    - KVM: s390: Simplify SIGP Set Arch handling
    - KVM: s390: Add a routine for setting userspace CPU state

  * Move virtual graphics drivers from linux-modules-extra to linux-modules
    (LP: #1960633)
    - [Packaging] Move VM DRM drivers into modules

  * Focal update: v5.4.178 upstream stable release (LP: #1964634)
    - audit: improve audit queue handling when "audit=1" on cmdline
    - ASoC: ops: Reject out of bounds values in snd_soc_put_volsw()
    - ASoC: ops: Reject out of bounds values in snd_...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.13.0-41.46 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-impish' to 'verification-done-impish'. If the problem still exists, change the tag 'verification-needed-impish' to 'verification-failed-impish'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-impish
Dries Oeyen (driesoeyen) wrote :

I'm sorry, it looks like I won't be able to validate this fix on our Ubuntu Core 20 systems. While waiting for this fix to become available, we've used Ubuntu Core's refresh control (snap gating) mechanism to ensure a non-affected version of the pc-kernel snap is installed across our fleet of devices. Relevant docs: https://ubuntu.com/core/docs/refresh-control

Due to this snap gating "validation assertion" currently being in place, I'm unable to update my test system to the fixed revision of the pc-kernel snap (5.4.0-109.123.1, revision 978) to confirm the fix. I have a pending support ticket with Canonical to find out how to lift the snap gating restriction, but I haven't received a response yet.

However, I see that the fix has now been marked as released for Focal. Indeed, the fixed revision of the pc-kernel snap (5.4.0-109.123.1, revision 978) has already hit the "20/stable" channel as of today. Can I conclude that the fix has been accepted based on Chris Chiu's confirmation and that my confirmation is no longer required? In other words: that the fix is now released and no longer at risk of being dropped, for 20.04?

Luke Nowakowski-Krijger (lukenow) wrote :

Hi Dries,

Please do not feel rushed to validate the fix; I was asking more to confirm personally that your issue has been resolved :)

That message about dropping the fix is more of an empty threat by a bot than anything else.

So feel free at your earliest convenience to update and confirm that the new kernel fixes things for you.

Best,
- Luke

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.13.0-41.46

---------------
linux (5.13.0-41.46) impish; urgency=medium

  * impish/linux: 5.13.0-41.46 -proposed tracker (LP: #1969014)

  * NVMe devices fail to probe due to ACPI power state change (LP: #1942624)
    - ACPI: power: Rework turning off unused power resources
    - ACPI: PM: Do not turn off power resources in unknown state

  * Recent 5.13 kernel has broken KVM support (LP: #1966499)
    - KVM: Add infrastructure and macro to mark VM as bugged
    - KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
    - KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled

  * LRMv6: add multi-architecture support (LP: #1968774)
    - [Packaging] resync dkms-build{,--nvidia-N}

  * io_uring regression - lost write request (LP: #1952222)
    - io-wq: split bounded and unbounded work into separate lists

  * xfrm interface cannot be changed anymore (LP: #1968591)
    - xfrm: fix the if_id check in changelink

  * Use kernel-testing repo from launchpad for ADT tests (LP: #1968016)
    - [Debian] Use kernel-testing repo from launchpad

  * vmx_ldtr_test in ubuntu_kvm_unit_tests failed (FAIL: Expected 0 for L1 LDTR
    selector (got 50)) (LP: #1956315)
    - KVM: nVMX: Set LDTR to its architecturally defined value on nested VM-Exit

  * audio from external sound card is distorted (LP: #1966066)
    - ALSA: usb-audio: Fix packet size calculation regression

  * Impish update: upstream stable patchset 2022-04-12 (LP: #1968771)
    - cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
    - btrfs: tree-checker: check item_size for inode_item
    - btrfs: tree-checker: check item_size for dev_item
    - clk: jz4725b: fix mmc0 clock gating
    - vhost/vsock: don't check owner in vhost_vsock_stop() while releasing
    - parisc/unaligned: Fix fldd and fstd unaligned handlers on 32-bit kernel
    - parisc/unaligned: Fix ldw() and stw() unalignment handlers
    - KVM: x86/mmu: make apf token non-zero to fix bug
    - drm/amdgpu: disable MMHUB PG for Picasso
    - drm/i915: Correctly populate use_sagv_wm for all pipes
    - sr9700: sanity check for packet length
    - USB: zaurus: support another broken Zaurus
    - CDC-NCM: avoid overflow in sanity checking
    - x86/fpu: Correct pkru/xstate inconsistency
    - tee: export teedev_open() and teedev_close_context()
    - optee: use driver internal tee_context for some rpc
    - ping: remove pr_err from ping_lookup
    - perf data: Fix double free in perf_session__delete()
    - bnx2x: fix driver load from initrd
    - bnxt_en: Fix active FEC reporting to ethtool
    - hwmon: Handle failure to register sensor with thermal zone correctly
    - bpf: Do not try bpf_msg_push_data with len 0
    - selftests: bpf: Check bpf_msg_push_data return value
    - bpf: Add schedule points in batch ops
    - io_uring: add a schedule point in io_add_buffers()
    - net: __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends
    - tipc: Fix end of loop tests for list_for_each_entry()
    - gso: do not skip outer ip header in case of ipip and net_failover
    - openvswitch: Fix setting ipv6 fields causing hw csum failure
   ...

Changed in linux (Ubuntu Impish):
status: Fix Committed → Fix Released
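
To tell whether a running system already has one of the fixed kernels named in this report (5.4.0-109.123 for Focal, 5.13.0-41.46 for Impish), the kernel release string can be compared against those versions. This is a rough sketch. It assumes release strings of the form "5.4.0-109-generic" and applies only to the generic linux package, so other flavours and other base versions are reported as unknown. The function name is illustrative.

```python
import re
from typing import Optional

# First fixed versions of the generic "linux" package, taken from this report.
FIXED_IN = {"focal": (5, 4, 0, 109), "impish": (5, 13, 0, 41)}

_RELEASE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)-generic$")


def has_fix(release: str, series: str) -> Optional[bool]:
    """Return True/False for a matching generic kernel, None if unknown."""
    match = _RELEASE.match(release)
    if match is None or series not in FIXED_IN:
        return None
    version = tuple(int(part) for part in match.groups())
    fixed = FIXED_IN[series]
    if version[:3] != fixed[:3]:
        return None  # different base version; this report says nothing about it
    return version[3] >= fixed[3]
```

On a live system the release string could be taken from platform.release() in the Python standard library, for example has_fix(platform.release(), "focal").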