[kernel] tty/hvc: Use opal irqchip interface if available

Bug #1728098 reported by bugproxy
This bug affects 1 person
Affects                            Status        Importance  Assigned to            Milestone
The Ubuntu-power-systems project   Fix Released  Critical    Canonical Kernel Team
linux (Ubuntu)                     Fix Released  Critical    Joseph Salisbury
Xenial                             Fix Released  Critical    Joseph Salisbury

Bug Description

== SRU Justification ==
This bug is fixed by commit 00dab8187e18. The commit updates the hvc
driver to use the OPAL irqchip if made available by the running firmware.
If it is not present, the driver falls back to the existing OPAL event number.
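
The fallback described above can be modelled in a few lines. This is a minimal userspace sketch in Python, not the kernel code from the commit: `select_console_irq`, the two callables, and the `OPAL_EVENT_CONSOLE_INPUT` placeholder string are hypothetical names chosen for illustration, and the sketch assumes a missing irqchip mapping is reported as 0 or None.

```python
from typing import Callable, Optional, Tuple

# Placeholder label for the legacy console-input event (illustrative only).
OPAL_EVENT_CONSOLE_INPUT = "opal-event-console-input"


def select_console_irq(
    irqchip_lookup: Callable[[], Optional[int]],
    event_request: Callable[[str], int],
) -> Tuple[int, str]:
    """Pick the console interrupt source, preferring the OPAL irqchip.

    irqchip_lookup: hypothetical probe that returns an IRQ number when the
        firmware exposes an irqchip interrupt, or 0/None when it does not.
    event_request: hypothetical request for the existing OPAL event number.
    """
    irq = irqchip_lookup()
    if irq:
        # Firmware provides the irqchip interface: use it.
        return irq, "irqchip"
    # Older firmware: fall back to the existing OPAL event number.
    return event_request(OPAL_EVENT_CONSOLE_INPUT), "opal-event"


# Firmware that exposes an irqchip interrupt (the value 42 is arbitrary).
with_irqchip = select_console_irq(lambda: 42, lambda event: 16)
# Firmware without one (the value 16 is arbitrary).
without_irqchip = select_console_irq(lambda: None, lambda event: 16)
```

The point of the ordering is that a kernel carrying the patch keeps working on firmware that does not provide the irqchip interface.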

Commit 00dab8187e18 was added to mainline in v4.8-rc1, so it is not needed in
releases newer than Xenial. The commit is a clean cherry-pick in Xenial.

== Fix ==
commit 00dab8187e182da41122f66c207707b192509df4
Author: Sam Mendoza-Jonas <email address hidden>
Date: Mon Jul 11 13:38:58 2016 +1000

    tty/hvc: Use opal irqchip interface if available

== Regression Potential ==
This change is specific to the hvc driver and has been in mainline since v4.8-rc1
without any issues reported.

== Test Case ==
A test kernel was built with this patch and tested by the original bug reporter.
The bug reporter states the test kernel resolved the bug.

---Problem Description---
Please backport the console IRQ patch.

commit 00dab8187e182da41122f66c207707b192509df4
Author: Sam Mendoza-Jonas <email address hidden>
Date: Mon Jul 11 13:38:58 2016 +1000

    tty/hvc: Use opal irqchip interface if available

    Update the hvc driver to use the OPAL irqchip if made available by the
    running firmware. If it is not present, the driver falls back to the
    existing OPAL event number.

    Signed-off-by: Samuel Mendoza-Jonas <email address hidden>
    Signed-off-by: Michael Ellerman <email address hidden>

---uname output---
Linux tul217p1 4.4.0-98-generic #121-Ubuntu SMP Tue Oct 10 14:23:01 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = FSP based PowerNV system

Canonical,

We need this patch in the 16.04 GA 4.4 kernel. This fix will avoid the following error:

Oct 19 22:15:07 tul217p1 kernel: sched: RT throttling activated
Oct 19 22:15:49 tul217p1 kernel: INFO: rcu_sched self-detected stall on CPU
Oct 19 22:15:49 tul217p1 kernel: 21-...: (85 GPs behind) idle=d29/140000000000002/0 softirq=26316/26316 fqs=4417
Oct 19 22:15:49 tul217p1 kernel: (t=5250 jiffies g=30019 c=30018 q=20591)
Oct 19 22:15:49 tul217p1 kernel: Call Trace:
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f110] [c0000000000fcbe0] sched_show_task+0xe0/0x180 (unreliable)
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f180] [c00000000013fcf4] rcu_dump_cpu_stacks+0xe4/0x150
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f1d0] [c000000000145424] rcu_check_callbacks+0x6b4/0x9b0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f300] [c00000000014d288] update_process_times+0x58/0xa0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f330] [c0000000001649b8] tick_sched_handle.isra.6+0x48/0xe0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f370] [c000000000164ab4] tick_sched_timer+0x64/0xd0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f3b0] [c00000000014dd54] __hrtimer_run_queues+0x124/0x450
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f440] [c00000000014ed7c] hrtimer_interrupt+0xec/0x2c0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f500] [c00000000001f5fc] __timer_interrupt+0x8c/0x290
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f550] [c00000000001f9b0] timer_interrupt+0xa0/0xe0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f580] [c0000000000099d8] restore_check_irq_replay+0x54/0x70
Oct 19 22:15:49 tul217p1 kernel: --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
                                     LR = arch_local_irq_restore+0x74/0x90
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f870] [7fffffffffffffff] 0x7fffffffffffffff (unreliable)
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f890] [c0000000000bf808] __do_softirq+0xd8/0x3e0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f980] [c0000000000bfd88] irq_exit+0xc8/0x100
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f9a0] [c00000000001f9b4] timer_interrupt+0xa4/0xe0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0f9d0] [c0000000000099d8] restore_check_irq_replay+0x54/0x70
Oct 19 22:15:49 tul217p1 kernel: --- interrupt: 901 at irq_work_queue+0x60/0xd0
                                     LR = irq_work_queue+0xa4/0xd0
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0fcc0] [c0000007f4d0fd00] 0xc0000007f4d0fd00 (unreliable)
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0fcf0] [c000000000076a98] opal_handle_events+0x108/0x130
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0fd40] [c000000000070fc8] kopald+0x78/0x100
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0fd80] [c0000000000e7374] kthread+0x124/0x150
Oct 19 22:15:49 tul217p1 kernel: [c0000007f4d0fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
Oct 19 22:15:54 tul217p1 kernel: NMI watchdog: BUG: soft lockup - CPU#21 stuck for 23s! [kopald:494]
Oct 19 22:15:54 tul217p1 kernel: Modules linked in: ibmpowernv binfmt_misc ipmi_powernv ipmi_msghandler leds_powernv powernv_rng uio_pdrv_genirq uio vmx_crypto nfsd auth_rpcgss nfs_acl lockd ib_iser grace rdma_cm iw_cm sunrpc ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ipr cxl
Oct 19 22:15:54 tul217p1 kernel: CPU: 21 PID: 494 Comm: kopald Not tainted 4.4.0-98-generic #121-Ubuntu
Oct 19 22:15:54 tul217p1 kernel: task: c0000007f4cc3f30 ti: c0000007f4d0c000 task.ti: c0000007f4d0c000
Oct 19 22:15:54 tul217p1 kernel: NIP: c000000000010964 LR: c000000000010964 CTR: c00000000001f100
Oct 19 22:15:54 tul217p1 kernel: REGS: c0000007f4d0f5f0 TRAP: 0901 Not tainted (4.4.0-98-generic)
Oct 19 22:15:54 tul217p1 kernel: MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000824 XER: 20000000
Oct 19 22:15:54 tul217p1 kernel: CFAR: c000000000009958 SOFTE: 1
                                 GPR00: c0000000000bf808 c0000007f4d0f870 c000000001608300 0000000000000900
                                 GPR04: c0000007fbd40400 0000000000000001 0000000000000018 0000000001f404db
                                 GPR08: 0000000000000000 0000000000000000 c0000007f4d0c000 0000000000000005
                                 GPR12: c00000000006e3c8 c00000000fb4c780
Oct 19 22:15:54 tul217p1 kernel: NIP [c000000000010964] arch_local_irq_restore+0x74/0x90
Oct 19 22:15:54 tul217p1 kernel: LR [c000000000010964] arch_local_irq_restore+0x74/0x90
Oct 19 22:15:54 tul217p1 kernel: Call Trace:
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0f870] [7fffffffffffffff] 0x7fffffffffffffff (unreliable)
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0f890] [c0000000000bf808] __do_softirq+0xd8/0x3e0
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0f980] [c0000000000bfd88] irq_exit+0xc8/0x100
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0f9a0] [c00000000001f9b4] timer_interrupt+0xa4/0xe0
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0f9d0] [c0000000000099d8] restore_check_irq_replay+0x54/0x70
Oct 19 22:15:54 tul217p1 kernel: --- interrupt: 901 at irq_work_queue+0x60/0xd0
                                     LR = irq_work_queue+0xa4/0xd0
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0fcc0] [c0000007f4d0fd00] 0xc0000007f4d0fd00 (unreliable)
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0fcf0] [c000000000076a98] opal_handle_events+0x108/0x130
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0fd40] [c000000000070fc8] kopald+0x78/0x100
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0fd80] [c0000000000e7374] kthread+0x124/0x150
Oct 19 22:15:54 tul217p1 kernel: [c0000007f4d0fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
Oct 19 22:15:54 tul217p1 kernel: Instruction dump:
Oct 19 22:15:54 tul217p1 kernel: 994d02ca 2fa30000 409e0024 e92d0020 61298000 7d210164 38210020 e8010010
Oct 19 22:15:54 tul217p1 kernel: 7c0803a6 4e800020 60420000 4bff186d <60000000> 4bffffe4 60420000 e92d0020
Oct 19 22:16:39 tul217p1 kernel: INFO: rcu_sched self-detected stall on CPU
Oct 19 22:16:39 tul217p1 kernel: 16-...: (301 GPs behind) idle=5b5/140000000000002/0 softirq=2284/2284 fqs=4838
Oct 19 22:16:39 tul217p1 kernel: (t=5250 jiffies g=30159 c=30158 q=20605)
Oct 19 22:16:39 tul217p1 kernel: Call Trace:
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f150] [c0000000000fcbe0] sched_show_task+0xe0/0x180 (unreliable)
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f1c0] [c00000000013fcf4] rcu_dump_cpu_stacks+0xe4/0x150
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f210] [c000000000145424] rcu_check_callbacks+0x6b4/0x9b0
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f340] [c00000000014d288] update_process_times+0x58/0xa0
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f370] [c0000000001649b8] tick_sched_handle.isra.6+0x48/0xe0
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f3b0] [c000000000164ab4] tick_sched_timer+0x64/0xd0
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f3f0] [c00000000014dd54] __hrtimer_run_queues+0x124/0x450
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f480] [c00000000014ed7c] hrtimer_interrupt+0xec/0x2c0
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f540] [c00000000001f5fc] __timer_interrupt+0x8c/0x290
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f590] [c00000000001f9b0] timer_interrupt+0xa0/0xe0
Oct 19 22:16:39 tul217p1 kernel: [c0000007f4d0f5c0] [c0000000000099d8] restore_check_irq_replay+0x54/0x70
Oct 19 22:16:39 tul217p1 kernel: --- interrupt: 901 at arch_local_irq_restore+0x74/0x90
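
When checking whether a system shows the same symptoms, the messages quoted above can be searched for mechanically. The following is a minimal sketch in Python, assuming the kernel log is available as plain text; `summarize_symptoms` and the pattern names are hypothetical, and the patterns are taken only from the lines in this report, so other kernel versions may word them differently.

```python
import re
from collections import Counter

# Patterns copied from the messages in this report (not an exhaustive list).
PATTERNS = {
    "rt_throttling": re.compile(r"sched: RT throttling activated"),
    "rcu_stall": re.compile(r"rcu_sched self-detected stall on CPU"),
    "soft_lockup": re.compile(
        r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^:\]]+):(\d+)\]"
    ),
}


def summarize_symptoms(log_text):
    """Count symptom lines and collect details of each soft lockup."""
    counts = Counter()
    lockups = []
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[name] += 1
            if name == "soft_lockup":
                cpu, seconds, task, pid = match.groups()
                lockups.append(
                    {"cpu": int(cpu), "seconds": int(seconds),
                     "task": task, "pid": int(pid)}
                )
    return dict(counts), lockups
```

A soft lockup entry whose task is `kopald`, as in the trace above, is the signature this report is about; the other two messages can also have unrelated causes.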

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-160719 severity-critical targetmilestone-inin16044
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
importance: Undecided → Critical
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu):
importance: Undecided → Critical
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
status: New → In Progress
Changed in linux (Ubuntu Xenial):
importance: Undecided → Critical
status: New → In Progress
assignee: nobody → Joseph Salisbury (jsalisbury)
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → In Progress
Joseph Salisbury (jsalisbury) wrote :

I built a Xenial test kernel with commit 00dab8187e18. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1728098/

Be sure to install both the linux-image and linux-image-extra .deb packages.

Manoj Iyer (manjo)
tags: added: triage-g
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-11-07 04:27 EDT-------
(In reply to comment #7)
> I built a Xenial test kernel with commit 00dab8187e18. The test kernel can
> be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1728098/
>
> Be sure to install both the linux-image and linux-image-extra .deb packages.

Hello,

We have verified the above image and it looks good. Please integrate the fix into the 16.04 LTS kernel.

-Vasant

Joseph Salisbury (jsalisbury) wrote :
description: updated
Stefan Bader (smb)
Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Khaled El Mously (kmously) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
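
The tag change requested above amounts to swapping one tag for another. A minimal sketch in Python, assuming the bug's tags are held as a plain list of strings; `record_verification` is a hypothetical helper and does not call any Launchpad API.

```python
def record_verification(tags, series, passed):
    """Replace 'verification-needed-<series>' with the done/failed tag."""
    needed = f"verification-needed-{series}"
    outcome = "done" if passed else "failed"
    result = f"verification-{outcome}-{series}"
    if needed not in tags:
        raise ValueError(f"tag {needed!r} is not set on this bug")
    # Keep every other tag, and its position, unchanged.
    return [result if tag == needed else tag for tag in tags]


tags = ["architecture-ppc64le", "verification-needed-xenial"]
updated = record_verification(tags, "xenial", passed=True)
```

Here `updated` holds `verification-done-xenial` in place of `verification-needed-xenial`, and the original list is left untouched.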

tags: added: verification-needed-xenial
Kleber Sacilotto de Souza (kleber-souza) wrote :

Hello IBM,

Could you please verify if the Xenial kernel currently in -proposed fixes the issue?

Thank you.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-12-07 11:07 EDT-------
(In reply to comment #15)
> Hello IBM,
>
> Could you please verify if the Xenial kernel currently in -proposed fixes
> the issue?

We have verified and confirm that it fixes the issue.

Thanks
-Vasant

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.4.0-103.126

---------------
linux (4.4.0-103.126) xenial; urgency=low

  * linux: 4.4.0-103.126 -proposed tracker (LP: #1736181)

  * CVE-2017-1000405
    - mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d()

  * CVE-2017-16939
    - netlink: add a start callback for starting a netlink dump
    - ipsec: Fix aborted xfrm policy dump crash

linux (4.4.0-102.125) xenial; urgency=low

  * linux: 4.4.0-102.125 -proposed tracker (LP: #1733541)

  * tar -x sometimes fails on overlayfs (LP: #1728489)
    - ovl: check if all layers are on the same fs
    - ovl: persistent inode number for directories

  * NVMe timeout is too short (LP: #1729119)
    - nvme: update timeout module parameter type

  * Set PANIC_TIMEOUT=10 on Power Systems (LP: #1730660)
    - [Config]: Set PANIC_TIMEOUT=10 on ppc64el

  * Cannot pair BLE remote devices when using combo BT SoC (LP: #1731467)
    - Bluetooth: increase timeout for le auto connections

  * CIFS errors on 4.4.0-98, but not on 4.4.0-97 with same config (LP: #1729337)
    - SMB3: Validate negotiate request must always be signed

  * Plantronics P610 does not support sample rate reading (LP: #1719853)
    - ALSA: usb-audio: Add sample rate quirk for Plantronics P610

  * Invalid btree pointer causes the kernel NULL pointer dereference
    (LP: #1729256)
    - xfs: reinit btree pointer on attr tree inactivation walk

  * Samba mount/umount in docker container triggers kernel Oops (LP: #1729637)
    - ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
    - ipv6: fix NULL dereference in ip6_route_dev_notify()

  * [kernel] tty/hvc: Use opal irqchip interface if available (LP: #1728098)
    - tty/hvc: Use opal irqchip interface if available

  * Device hotplugging with MPT SAS cannot work for VMWare ESXi (LP: #1730852)
    - scsi: mptsas: Fixup device hotplug for VMWare ESXi

  * NMI watchdog: BUG: soft lockup on Guest upon boot (KVM) (LP: #1727331)
    - KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread

  * Attempt to map rbd image from ceph jewel/luminous hangs (LP: #1728739)
    - crush: ensure bucket id is valid before indexing buckets array
    - crush: ensure take bucket value is valid
    - crush: add chooseleaf_stable tunable
    - crush: decode and initialize chooseleaf_stable
    - libceph: advertise support for TUNABLES5
    - libceph: MOSDOpReply v7 encoding

  * Xenial update to 4.4.98 stable release (LP: #1732698)
    - adv7604: Initialize drive strength to default when using DT
    - video: fbdev: pmag-ba-fb: Remove bad `__init' annotation
    - PCI: mvebu: Handle changes to the bridge windows while enabled
    - xen/netback: set default upper limit of tx/rx queues to 8
    - drm: drm_minor_register(): Clean up debugfs on failure
    - KVM: PPC: Book 3S: XICS: correct the real mode ICP rejecting counter
    - iommu/arm-smmu-v3: Clear prior settings when updating STEs
    - powerpc/corenet: explicitly disable the SDHC controller on kmcoge4
    - ARM: omap2plus_defconfig: Fix probe errors on UARTs 5 and 6
    - crypto: vmx - disable preemption to enable vsx in aes_ctr.c
    - iio: trigger: free trigger...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: verification-done-xenial
removed: verification-needed-xenial
Frank Heimes (fheimes)
Changed in linux (Ubuntu):
status: In Progress → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released