arm64 iperf performance suboptimal

Bug #1358949 reported by dann frazier
This bug affects 1 person
Affects          Status         Importance   Assigned to    Milestone
linux (Ubuntu)   Fix Released   High         Unassigned
Trusty           Fix Released   Undecided    dann frazier
Utopic           Fix Released   High         Unassigned

Bug Description

[Impact]
The arm64 copy_{to,from}_user implementations in the Ubuntu kernel are suboptimal. Optimized implementations have been submitted upstream and have shown a measurable improvement in network performance, most visibly at larger write sizes.

    Iperf performance increase:
                 -l (size)   1 core result
    Optimized    64B         44-51Mb/s
                 1500B       4.9Gb/s
                 30000B      16.2Gb/s
    Original     64B         34-50.7Mb/s
                 1500B       4.7Gb/s
                 30000B      14.5Gb/s
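
To put the figures above in relative terms, here is a small sketch that computes the percentage gain for the two sizes reported as single values. The 64B row is left out because both builds report it as a range. The dictionary layout and function name are illustrative only and not part of the patch.

```python
# Throughput figures copied from the results above, in Gb/s (1-core iperf runs).
# The 64B row is omitted because it is reported as a range, not a single value.
results_gbps = {
    1500: {"original": 4.7, "optimized": 4.9},
    30000: {"original": 14.5, "optimized": 16.2},
}


def percent_gain(original, optimized):
    """Relative throughput gain of optimized over original, in percent."""
    return (optimized - original) / original * 100.0


gains = {
    size: percent_gain(r["original"], r["optimized"])
    for size, r in results_gbps.items()
}

for size, gain in gains.items():
    print(f"{size}B: {gain:.1f}% faster")
```

This works out to roughly 4% at 1500B and roughly 12% at 30000B.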

[Test Case]
Generate traffic from one node to another using iperf (see above for config).
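
A minimal sketch of how the client side of such a run might be scripted, assuming iperf version 2 with `iperf -s` already running on the receiving node. The write sizes come from the results above; the server address and the 10-second duration are placeholders, since the report does not state them, and it does not say how the runs were restricted to one core.

```python
import shlex

# Write sizes from the results in the bug description (iperf's -l option, in bytes).
WRITE_SIZES = [64, 1500, 30000]


def iperf_client_commands(server, sizes=WRITE_SIZES, seconds=10):
    """Build one iperf (version 2) client command per write size.

    `server` and `seconds` are hypothetical values; the report does not give them.
    """
    return [
        ["iperf", "-c", server, "-l", str(size), "-t", str(seconds)]
        for size in sizes
    ]


# 192.0.2.10 is a documentation-only placeholder address.
commands = iperf_client_commands("192.0.2.10")
for cmd in commands:
    print(shlex.join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` on the sending node to execute the run.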

[Regression Risk]
These functions are used heavily throughout the kernel, so a defect here could have significant impact. This risk is mitigated by using an implementation heavily based on the Linaro string libraries (which are already used elsewhere, e.g. in glibc), and by active testing of this patch on real hardware using a trusty-kernel base.

Revision history for this message
dann frazier (dannf) wrote :
Joseph Salisbury (jsalisbury) wrote :

Can you also send this patch to the Ubuntu kernel-team mailing list for review?

tags: added: kernel-da-key trusty
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
tags: added: patch
Changed in linux (Ubuntu):
importance: Medium → High
dann frazier (dannf) wrote : Re: [Bug 1358949] Re: arm64 iperf performance suboptimal

On Tue, Aug 19, 2014 at 4:56 PM, Joseph Salisbury
<email address hidden> wrote:
> Can you also send this patch to the Ubuntu kernel-team mailing list for
> review?

https://lists.ubuntu.com/archives/kernel-team/2014-August/047895.html

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Trusty):
assignee: nobody → dann frazier (dannf)
status: New → Fix Committed
Changed in linux (Ubuntu Utopic):
status: Triaged → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.16.0-10.15

---------------
linux (3.16.0-10.15) utopic; urgency=low

  [ dann frazier ]

  * [debian] Fix regression with ABI subversions and backport

  [ Feng Kan ]

  * SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
    modify write.
    - LP: #1357527
  * SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
    assembly code
    - LP: #1358949

  [ Stefan Bader ]

  * SAUCE: bcache: prevent crash on changing writeback_running
    - LP: #1357295

  [ Tim Gardner ]

  * [Config] CONFIG_XFRM_STATISTICS=y
  * [Config] CONFIG_SECURITY_NETWORK_XFRM=y
  * [Config] CONFIG_SENSORS_IBMPOWERNV=m
    - LP: #1353005
  * Release Tracking Bug
    - LP: #1359783

  [ Upstream Kernel Changes ]

  * intel_idle: Broadwell support
    - LP: #1256170
  * powerpc/book3s: Add basic infrastructure to handle HMI in Linux.
    - LP: #1357108
  * powerpc/powernv: Invoke opal call to handle hmi.
    - LP: #1357108
  * powerpc/book3s: handle HMIs for cpus in nap mode.
    - LP: #1357108
  * powerpc/book3s: Fix endianess issue for HMI handling on napping cpus.
    - LP: #1357108
  * powerpc: Add smp_mb() to arch_spin_is_locked()
    - LP: #1358569
  * powerpc: Add smp_mb()s to arch_spin_unlock_wait()
    - LP: #1358569
  * hwmon: (powerpc/powernv) hwmon driver for power, fan rpm, voltage and
    temperature
    - LP: #1353005
  * tools/testing/selftests/ptrace/peeksiginfo.c: add PAGE_SIZE definition
    - LP: #1358855
  * printk: Add function to return log buffer address and size
    - LP: #1359423
  * powerpc/powernv: Interface to register/unregister opal dump region
    - LP: #1359423
  * bcache: fix crash on shutdown in passthrough mode
    - LP: #1357295
  * bcache: fix uninterruptible sleep in writeback thread
    - LP: #1357295

  [ Vinayak Kale ]

  * SAUCE: (no-up) dt-bindings: Add Potenza PMU binding
    - LP: #1357527
  * SAUCE: (no-up) arm64: dts: Add PMU node for APM X-Gene Storm SOC
    - LP: #1357527
 -- Tim Gardner <email address hidden> Fri, 15 Aug 2014 12:34:33 -0600

Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
dann frazier (dannf) wrote :

I'm seeing reasonable performance with this.

tags: added: verification-done-trusty
removed: verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.13.0-36.63

---------------
linux (3.13.0-36.63) trusty; urgency=low

  [ Joseph Salisbury ]

  * Release Tracking Bug
    - LP: #1365052

  [ Feng Kan ]

  * SAUCE: (no-up) irqchip:gic: change access of gicc_ctrl register to read
    modify write.
    - LP: #1357527
  * SAUCE: (no-up) arm64: optimized copy_to_user and copy_from_user
    assembly code
    - LP: #1358949

  [ Ming Lei ]

  * SAUCE: (no-up) Drop APM X-Gene SoC Ethernet driver
    - LP: #1360140
  * [Config] Drop XGENE entries
    - LP: #1360140
  * [Config] CONFIG_NET_XGENE=m for arm64
    - LP: #1360140

  [ Stefan Bader ]

  * SAUCE: Add compat macro for skb_get_hash
    - LP: #1358162
  * SAUCE: bcache: prevent crash on changing writeback_running
    - LP: #1357295

  [ Suman Tripathi ]

  * SAUCE: (no-up) arm64: Fix the csr-mask for APM X-Gene SoC AHCI SATA PHY
    clock DTS node.
    - LP: #1359489
  * SAUCE: (no-up) ahci_xgene: Skip the PHY and clock initialization if
    already configured by the firmware.
    - LP: #1359501
  * SAUCE: (no-up) ahci_xgene: Fix the link down in first attempt for the
    APM X-Gene SoC AHCI SATA host controller driver.
    - LP: #1359507

  [ Tuan Phan ]

  * SAUCE: (no-up) pci-xgene-msi: fixed deadlock in irq_set_affinity
    - LP: #1359514

  [ Upstream Kernel Changes ]

  * iwlwifi: mvm: Add a missed beacons threshold
    - LP: #1349572
  * mac80211: reset probe_send_count also in HW_CONNECTION_MONITOR case
    - LP: #1349572
  * genirq: Add an accessor for IRQ_PER_CPU flag
    - LP: #1357527
  * arm64: perf: add support for percpu pmu interrupt
    - LP: #1357527
  * cifs: sanity check length of data to send before sending
    - LP: #1283101
  * KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
    - LP: #1329434
  * KVM: nVMX: Rework interception of IRQs and NMIs
    - LP: #1329434
  * KVM: vmx: disable APIC virtualization in nested guests
    - LP: #1329434
  * HID: Add transport-driver functions to the USB HID interface.
    - LP: #1353021
  * ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host
    Controller driver.
    - LP: #1358498
  * fold d_kill() and d_free()
    - LP: #1354234
  * fold try_prune_one_dentry()
    - LP: #1354234
  * new helper: dentry_free()
    - LP: #1354234
  * expand the call of dentry_lru_del() in dentry_kill()
    - LP: #1354234
  * dentry_kill(): don't try to remove from shrink list
    - LP: #1354234
  * don't remove from shrink list in select_collect()
    - LP: #1354234
  * more graceful recovery in umount_collect()
    - LP: #1354234
  * dcache: don't need rcu in shrink_dentry_list()
    - LP: #1354234
  * lift the "already marked killed" case into shrink_dentry_list()
  * split dentry_kill()
    - LP: #1354234
  * expand dentry_kill(dentry, 0) in shrink_dentry_list()
    - LP: #1354234
  * shrink_dentry_list(): take parent's ->d_lock earlier
    - LP: #1354234
  * dealing with the rest of shrink_dentry_list() livelock
    - LP: #1354234
  * dentry_kill() doesn't need the second argument now
    - LP: #1354234
  * dcache: add missing lockdep annotation
    - LP: #1354234
  * fs: convert use of typedef ctl_table to struct ctl_table
 ...


Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
Andy Whitcroft (apw) wrote :

Note these patches have been found to cause crashes and have been reverted under Bug #1398596.

tags: added: arm-hs-vivid