i40e xps management broken when > 64 queues/cpus

Bug #1820948 reported by Nivedita Singhvi
This bug affects 1 person
Affects          Status         Importance   Assigned to        Milestone
linux (Ubuntu)   Fix Released   High         Nivedita Singhvi
Bionic           Fix Released   High         Nivedita Singhvi

Bug Description

[Impact]
Transmit Packet Steering (XPS) settings cannot be read or
set for tx queues numbered 64 and above, i.e. when the
number of queues (CPUs) is higher than 64. This is
currently still an issue on the 4.15 kernel (Xenial -hwe
and Bionic kernels).

It was fixed in Intel's i40e driver version 2.7.11 and
in mainline Linux 4.16-rc1 (i.e., Cosmic and Disco
already have the fix).

Fix
-----
The following commit fixes this issue (as identified
by Lihong Yang in discussion with Intel i40e team):

"i40e: Fix the number of queues available to be mapped for use"
Commit: bc6d33c8d93f5999920e97a8c6330b8910053d4f

It requires the following commit as well:

"i40e: Do not allow use more TC queue pairs than MSI-X vectors exist"
Commit: 1563f2d2e01242f05dd523ffd56fe104bc1afd58

[Test Case]
1. Kernel version: Bionic/Xenial -hwe: any 4.15 kernel
   i40e driver version: 2.1.14-k
   Any system with > 64 CPUs

2. For any queue 0 - 63, you can read/set tx xps:

echo ffffffff > /sys/class/net/eth2/queues/tx-63/xps_cpus
echo $?
0
cat /sys/class/net/eth2/queues/tx-63/xps_cpus
00,00000000,ffffffff

  But for any queue number > 63, we see this error:

echo ffffffff > /sys/class/net/eth2/queues/tx-64/xps_cpus
echo: write error: Invalid argument

cat /sys/class/net/eth2/queues/tx-64/xps_cpus
cat: /sys/class/net/eth2/queues/tx-64/xps_cpus: Invalid argument
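
The two checks in step 2 can be swept across every tx queue of an interface. The following is a minimal sketch, not part of the original report: `check_xps_queues` is a hypothetical helper name, the directory and mask are passed in by the caller, and the real sysfs files need root to write. Note that it overwrites whatever XPS mask each queue currently has.

```shell
#!/bin/sh
# Sketch only: write an XPS CPU mask to every tx queue under a queues
# directory and report which queues accept or reject the write.
# check_xps_queues is a hypothetical helper, not an existing tool.
check_xps_queues() {
    queues_dir=$1    # e.g. /sys/class/net/eth2/queues
    mask=$2          # hex CPU mask, e.g. ffffffff
    failed=0
    # The glob expands in lexical order (tx-0, tx-1, tx-10, ...).
    for q in "$queues_dir"/tx-*; do
        [ -e "$q/xps_cpus" ] || continue
        # Subshell so that a failed redirection or write is only reported
        # through the exit status, with the error text discarded.
        if ( echo "$mask" > "$q/xps_cpus" ) 2>/dev/null; then
            echo "ok   ${q##*/}"
        else
            echo "FAIL ${q##*/}"
            failed=$((failed + 1))
        fi
    done
    # Exit status 0 only if every queue accepted the mask.
    [ "$failed" -eq 0 ]
}

# Example invocation (as root), using the interface from the test case:
#   check_xps_queues /sys/class/net/eth2/queues ffffffff
```

On an affected 4.15 kernel with more than 64 queues, this would be expected to print FAIL for tx-64 and above, matching the "Invalid argument" errors shown in step 2.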

Nivedita Singhvi (niveditasinghvi) wrote :

It's been reported by an external reporter and reproduced
internally.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → High
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
assignee: nobody → Nivedita Singhvi (niveditasinghvi)
Nivedita Singhvi (niveditasinghvi) wrote :

Will be submitting SRU request early next week; trying to get
it into this next kernel release cycle.

Changed in linux (Ubuntu):
assignee: nobody → Nivedita Singhvi (niveditasinghvi)
Changed in linux (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in linux (Ubuntu):
status: Confirmed → In Progress
Nivedita Singhvi (niveditasinghvi) wrote :

Submitted patches for SRU.

description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Nivedita Singhvi (niveditasinghvi) wrote :

I'm still trying to confirm this for Xenial.

Changed in linux (Ubuntu):
status: In Progress → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Connor Kuehl (connork) wrote :

Hi Nivedita,

The Bionic kernel containing a fix for this issue is now in the "-proposed" repository. Could you (or the external reporter mentioned earlier in the ticket) try the proposed kernel to see if it fixes the issue?

Also, I saw you left a note earlier saying that you were still investigating this for Xenial; are there any updates from that investigation?

Thank you!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-48.51

---------------
linux (4.15.0-48.51) bionic; urgency=medium

  * linux: 4.15.0-48.51 -proposed tracker (LP: #1822820)

  * Packaging resync (LP: #1786013)
    - [Packaging] update helper scripts
    - [Packaging] resync retpoline extraction

  * 3b080b2564287be91605bfd1d5ee985696e61d3c in ubuntu_btrfs_kernel_fixes
    triggers system hang on i386 (LP: #1812845)
    - btrfs: raid56: properly unmap parity page in finish_parity_scrub()

  * [P9][LTCTest][Opal][FW910] cpupower monitor shows multiple stop Idle_Stats
    (LP: #1719545)
    - cpupower : Fix header name to read idle state name

  * [amdgpu] screen corruption when using touchpad (LP: #1818617)
    - drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over (v3)
    - drm/amdgpu: Free VGA stolen memory as soon as possible.

  * [SRU][B/C/OEM]IOMMU: add kernel dma protection (LP: #1820153)
    - ACPICA: AML parser: attempt to continue loading table after error
    - ACPI / property: Allow multiple property compatible _DSD entries
    - PCI / ACPI: Identify untrusted PCI devices
    - iommu/vt-d: Force IOMMU on for platform opt in hint
    - iommu/vt-d: Do not enable ATS for untrusted devices
    - thunderbolt: Export IOMMU based DMA protection support to userspace
    - iommu/vt-d: Disable ATS support on untrusted devices

  * Add basic support to NVLink2 passthrough (LP: #1819989)
    - powerpc/powernv/npu: Do not try invalidating 32bit table when 64bit table is
      enabled
    - powerpc/powernv: call OPAL_QUIESCE before OPAL_SIGNAL_SYSTEM_RESET
    - powerpc/powernv: Export opal_check_token symbol
    - powerpc/powernv: Make possible for user to force a full ipl cec reboot
    - powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
    - powerpc/powernv: Move npu struct from pnv_phb to pci_controller
    - powerpc/powernv/npu: Move OPAL calls away from context manipulation
    - powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation
    - powerpc/pseries/npu: Enable platform support
    - powerpc/pseries: Remove IOMMU API support for non-LPAR systems
    - powerpc/powernv/npu: Check mmio_atsd array bounds when populating
    - powerpc/powernv/npu: Fault user page into the hypervisor's pagetable

  * Huawei Hi1822 NIC has poor performance (LP: #1820187)
    - net-next: hinic: fix a problem in free_tx_poll()
    - hinic: remove ndo_poll_controller
    - net-next/hinic: add checksum offload and TSO support
    - hinic: Fix l4_type parameter in hinic_task_set_tunnel_l4
    - net-next/hinic:replace multiply and division operators
    - net-next/hinic:add rx checksum offload for HiNIC
    - net-next/hinic:fix a bug in set mac address
    - net-next/hinic: fix a bug in rx data flow
    - net: hinic: fix null pointer dereference on pointer hwdev
    - hinic: optmize rx refill buffer mechanism
    - net-next/hinic:add shutdown callback
    - net-next/hinic: replace disable_irq_nosync/enable_irq

  * [CONFIG] please enable highdpi font FONT_TER16x32 (LP: #1819881)
    - Fonts: New Terminus large console font
    - [Config]: enable highdpi Terminus 16x32 font support

  * [19.04 FEAT] qeth: Enhanced link...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Nivedita Singhvi (niveditasinghvi) wrote :

Late update: the original reporter tested the proposed
kernel on systems that could reproduce the problem, and
the fix was verified successfully.

We do not yet have a way of reproducing this on Xenial (i.e.,
any 4.4 kernel). I'm leaving this open for now; once we can
reproduce and test it there, I will update this bug and push
an SRU for Xenial as well.
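
For anyone checking a machine against the fix points named in this report (Ubuntu package 4.15.0-48.51 and mainline 4.16-rc1), a rough classification of the `uname -r` string might look like the sketch below. `xps_fix_status` is a hypothetical helper, and it assumes Ubuntu-style release strings such as 4.15.0-48-generic, where the number after the first dash is the ABI. It reports "unknown" for anything older than 4.15, since the problem has not been reproduced on 4.4.

```shell
#!/bin/sh
# Sketch only: classify a kernel release string against the fix points
# mentioned in this bug. Assumes Ubuntu-style strings such as
# 4.15.0-48-generic; xps_fix_status is a hypothetical helper name.
xps_fix_status() {
    rel=$1
    ver=${rel%%-*}                       # 4.15.0
    major=${ver%%.*}                     # 4
    minor=${ver#*.}; minor=${minor%%.*}  # 15
    abi=${rel#*-}; abi=${abi%%-*}        # 48
    # Reject anything whose major/minor parts are missing or non-numeric.
    case "$major.$minor" in
        *[!0-9.]*|.*|*.) echo unknown; return 0 ;;
    esac
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 16 ]; }; then
        echo fixed        # mainline 4.16-rc1 or later
    elif [ "$major" -eq 4 ] && [ "$minor" -eq 15 ]; then
        case "$abi" in
            ''|*[!0-9]*) echo unknown ;;
            *) if [ "$abi" -ge 48 ]; then echo fixed; else echo affected; fi ;;
        esac
    else
        echo unknown      # e.g. Xenial 4.4, not reproduced in this report
    fi
}

# Example: xps_fix_status "$(uname -r)"
```

The version check is only a hint; the test case in the bug description remains the actual verification.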

tags: added: sts
tags: added: verification-done-bionic verification-done-cosmic
removed: verification-needed-bionic
tags: removed: verification-done-cosmic
Brad Figg (brad-figg)
tags: added: cscc