dGPU cannot resume because system firmware stuck in IPCS method

Bug #2021572 reported by Kai-Heng Feng
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
HWE Next
New
Undecided
Unassigned
linux (Ubuntu)
Fix Released
Undecided
Unassigned
Jammy
Invalid
Undecided
Unassigned
Lunar
Fix Released
High
Unassigned
linux-oem-6.1 (Ubuntu)
Invalid
Undecided
Unassigned
Jammy
Fix Released
High
Unassigned
Lunar
Invalid
Undecided
Unassigned

Bug Description

[Impact]
When a dGPU on a laptop is in PCIe D3cold and TBT port is connected to a monitor via a cable, unplugging the cable will make the dGPU unable to resume to D0.

[Fix]
There's a part of system firmware, IOM, need the driver to disconnect all TBT/Type-C phy to let dGPU function normally.

[Test]
With the fix applied, the dGPU continues to work after replugging video cable many times.

[Where problems could occur]
The fix is composed of several overhaul, so it's likely to regression i915 GFX driver. To limit the impact, only target OEM kernel for now.

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: New → Confirmed
importance: Undecided → High
Changed in linux-oem-6.1 (Ubuntu):
status: New → Invalid
tags: added: oem-priority originate-from-2012589 stella
tags: added: originate-from-1955560
tags: added: originate-from-1969426
tags: added: originate-from-1943101
Timo Aaltonen (tjaalton)
Changed in linux-oem-6.1 (Ubuntu Jammy):
status: Confirmed → Fix Committed
Timo Aaltonen (tjaalton)
tags: added: verification-needed-jammy
Changed in linux (Ubuntu Jammy):
status: New → Invalid
Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: verification-done-jammy
removed: verification-needed-jammy
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-oem-6.1/6.1.0-1016.16 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-oem-6.1 verification-needed-jammy
removed: verification-done-jammy
tags: added: verification-done-jammy
removed: verification-needed-jammy
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (55.2 KiB)

This bug was fixed in the package linux-oem-6.1 - 6.1.0-1016.16

---------------
linux-oem-6.1 (6.1.0-1016.16) jammy; urgency=medium

  * jammy/linux-oem-6.1: 6.1.0-1016.16 -proposed tracker (LP: #2024462)

  * Jammy update: v6.1.34 upstream stable release (LP: #2024166)
    - scsi: megaraid_sas: Add flexible array member for SGLs
    - net: sfp: fix state loss when updating state_hw_mask
    - spi: mt65xx: make sure operations completed before unloading
    - platform/surface: aggregator: Allow completion work-items to be executed in
      parallel
    - platform/surface: aggregator_tabletsw: Add support for book mode in KIP
      subsystem
    - spi: qup: Request DMA before enabling clocks
    - afs: Fix setting of mtime when creating a file/dir/symlink
    - wifi: mt76: mt7615: fix possible race in mt7615_mac_sta_poll
    - bpf, sockmap: Avoid potential NULL dereference in
      sk_psock_verdict_data_ready()
    - neighbour: fix unaligned access to pneigh_entry
    - net: dsa: lan9303: allow vid != 0 in port_fdb_{add|del} methods
    - net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294
    - bpf: Fix UAF in task local storage
    - bpf: Fix elem_size not being set for inner maps
    - net/ipv6: fix bool/int mismatch for skip_notify_on_dev_down
    - net/smc: Avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT
    - net: enetc: correct the statistics of rx bytes
    - net: enetc: correct rx_bytes statistics of XDP
    - net/sched: fq_pie: ensure reasonable TCA_FQ_PIE_QUANTUM values
    - Bluetooth: hci_sync: add lock to protect HCI_UNREGISTER
    - Bluetooth: Fix l2cap_disconnect_req deadlock
    - Bluetooth: ISO: don't try to remove CIG if there are bound CIS left
    - Bluetooth: L2CAP: Add missing checks for invalid DCID
    - wifi: mac80211: use correct iftype HE cap
    - wifi: cfg80211: reject bad AP MLD address
    - wifi: mac80211: mlme: fix non-inheritence element
    - wifi: mac80211: don't translate beacon/presp addrs
    - qed/qede: Fix scheduling while atomic
    - wifi: cfg80211: fix locking in sched scan stop work
    - selftests/bpf: Verify optval=NULL case
    - selftests/bpf: Fix sockopt_sk selftest
    - netfilter: nft_bitwise: fix register tracking
    - netfilter: conntrack: fix NULL pointer dereference in nf_confirm_cthelper
    - netfilter: ipset: Add schedule point in call_ad().
    - netfilter: nf_tables: out-of-bound check in chain blob
    - ipv6: rpl: Fix Route of Death.
    - tcp: gso: really support BIG TCP
    - rfs: annotate lockless accesses to sk->sk_rxhash
    - rfs: annotate lockless accesses to RFS sock flow table
    - net: sched: add rcu annotations around qdisc->qdisc_sleeping
    - drm/i915/selftests: Stop using kthread_stop()
    - drm/i915/selftests: Add some missing error propagation
    - net: sched: move rtm_tca_policy declaration to include file
    - net: sched: act_police: fix sparse errors in tcf_police_dump()
    - net: sched: fix possible refcount leak in tc_chain_tmplt_add()
    - bpf: Add extra path pointer check to d_path helper
    - drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram
    - lib: cpu_rmap: Fix potential use-after-free in i...

Changed in linux-oem-6.1 (Ubuntu Jammy):
status: Fix Committed → Fix Released
Changed in linux-oem-6.1 (Ubuntu Lunar):
status: New → Invalid
Changed in linux (Ubuntu Lunar):
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

these are in 6.5

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Changed in linux (Ubuntu Lunar):
status: Confirmed → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/6.2.0-34.34 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lunar-linux' to 'verification-done-lunar-linux'. If the problem still exists, change the tag 'verification-needed-lunar-linux' to 'verification-failed-lunar-linux'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-lunar-linux-v2 verification-needed-lunar-linux
tags: added: verification-done-lunar-linux
removed: verification-needed-lunar-linux
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (32.0 KiB)

This bug was fixed in the package linux - 6.2.0-34.34

---------------
linux (6.2.0-34.34) lunar; urgency=medium

  * lunar/linux: 6.2.0-34.34 -proposed tracker (LP: #2033779)

  * CVE-2023-20569
    - x86/cpu, kvm: Add support for CPUID_80000021_EAX
    - tools headers x86 cpufeatures: Sync with the kernel sources
    - x86/alternative: Optimize returns patching
    - x86/retbleed: Add __x86_return_thunk alignment checks
    - x86/srso: Add a Speculative RAS Overflow mitigation
    - x86/srso: Add IBPB_BRTYPE support
    - x86/srso: Add SRSO_NO support
    - x86/srso: Add IBPB
    - x86/srso: Add IBPB on VMEXIT
    - x86/srso: Fix return thunks in generated code
    - x86/srso: Add a forgotten NOENDBR annotation
    - x86/srso: Tie SBPB bit setting to microcode patch detection
    - Documentation/hw-vuln: Unify filename specification in index
    - Documentation/srso: Document IBPB aspect and fix formatting
    - x86/srso: Fix build breakage with the LLVM linker
    - x86: Move gds_ucode_mitigated() declaration to header
    - x86/retpoline: Don't clobber RFLAGS during srso_safe_ret()
    - x86/srso: Disable the mitigation on unaffected configurations
    - x86/retpoline,kprobes: Fix position of thunk sections with CONFIG_LTO_CLANG
    - x86/retpoline,kprobes: Skip optprobe check for indirect jumps with
      retpolines and IBT
    - x86/cpu: Fix __x86_return_thunk symbol type
    - x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk()
    - objtool/x86: Fix SRSO mess
    - x86/alternative: Make custom return thunk unconditional
    - x86/cpu: Clean up SRSO return thunk mess
    - x86/cpu: Rename original retbleed methods
    - x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
    - x86/cpu: Cleanup the untrain mess
    - x86/srso: Explain the untraining sequences a bit more
    - objtool/x86: Fixup frame-pointer vs rethunk
    - x86/static_call: Fix __static_call_fixup()
    - x86/srso: Correct the mitigation status when SMT is disabled
    - Ubuntu: [Config]: enable Speculative Return Stack Overflow mitigation

  * Please enable Renesas RZ platform serial installer (LP: #2022361)
    - [Config] enable hihope RZ/G2M serial console
    - [Config] Mark sh-sci as built-in

  * dGPU cannot resume because system firmware stuck in IPCS method
    (LP: #2021572)
    - drm/i915/tc: Abort DP AUX transfer on a disconnected TC port
    - drm/i915/tc: switch to intel_de_* register accessors in display code
    - drm/i915: Enable a PIPEDMC whenever its corresponding pipe is enabled
    - drm/i915/tc: Fix TC port link ref init for DP MST during HW readout
    - drm/i915/tc: Fix system resume MST mode restore for DP-alt sinks
    - drm/i915/tc: Wait for IOM/FW PHY initialization of legacy TC ports
    - drm/i915/tc: Factor out helpers converting HPD mask to TC mode
    - drm/i915/tc: Fix target TC mode for a disconnected legacy port
    - drm/i915/tc: Fix TC mode for a legacy port if the PHY is not ready
    - drm/i915/tc: Fix initial TC mode on disabled legacy ports
    - drm/i915/tc: Make the TC mode readout consistent in all PHY states
    - drm/i915: Add encoder hook to get the PLL type used by TC ports
    - drm/i915/tc: Assume a TC port is legacy ...

Changed in linux (Ubuntu Lunar):
status: Fix Committed → Fix Released
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure/6.2.0-1015.15 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lunar-linux-azure' to 'verification-done-lunar-linux-azure'. If the problem still exists, change the tag 'verification-needed-lunar-linux-azure' to 'verification-failed-lunar-linux-azure'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-lunar-linux-azure-v2 verification-needed-lunar-linux-azure
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws-6.2/6.2.0-1014.14~22.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-aws-6.2' to 'verification-done-jammy-linux-aws-6.2'. If the problem still exists, change the tag 'verification-needed-jammy-linux-aws-6.2' to 'verification-failed-jammy-linux-aws-6.2'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-aws-6.2-v2 verification-needed-jammy-linux-aws-6.2
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-nvidia-6.2/6.2.0-1011.11 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-nvidia-6.2' to 'verification-done-jammy-linux-nvidia-6.2'. If the problem still exists, change the tag 'verification-needed-jammy-linux-nvidia-6.2' to 'verification-failed-jammy-linux-nvidia-6.2'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-nvidia-6.2-v2 verification-needed-jammy-linux-nvidia-6.2
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.