Standard_A2_v2 Azure instance will have CPU in offline state

Bug #1831543 reported by Po-Hsu Lin
This bug affects 1 person
Affects               Status     Importance  Assigned to  Milestone
ubuntu-kernel-tests   Won't Fix  Undecided   Unassigned
linux-azure (Ubuntu)  Won't Fix  Undecided   Unassigned

Bug Description

This needs to be investigated to see if it's been disabled intentionally.
The cpu-hot-plug test shows there are 126 offlined CPUs.

 selftests: cpu-on-off-test.sh
 ========================================
 pid 9824's current affinity mask: 3
 pid 9824's new affinity mask: 1
 CPU online/offline summary:
 present_cpus = 0-1 present_max = 1
 Cpus in online state: 0-1
 Cpus in offline state: 2-127
 Limited scope test: one hotplug cpu
 (leaves cpu in the original state):
 online to offline to online: cpu 1
 not ok 1..1 selftests: cpu-on-off-test.sh [FAIL]
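
To make the 126 figure easy to check, here is a minimal sketch that expands the CPU range strings from the output above and counts them. The helper name `parse_cpu_list` is hypothetical and not part of the selftest; the range strings are copied from this report rather than read from the instance. The last step rests on the assumption that only CPUs listed as present can actually be brought online.

```python
def parse_cpu_list(text):
    """Expand a kernel-style CPU list such as "0-1,4,6-7" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means "no CPUs"
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# Values copied from the selftest output in this report.
present = parse_cpu_list("0-1")
online = parse_cpu_list("0-1")
offline = parse_cpu_list("2-127")

# Assumption: an offline CPU is only a hotplug candidate if it is also present.
offline_but_present = sorted(set(offline) & set(present))

print(len(offline))         # 126
print(offline_but_present)  # []
```

Under that assumption, none of the 126 offline CPU IDs overlap with the two present CPUs, so they look like unpopulated CPU slots rather than CPUs that were taken offline.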

Po-Hsu Lin (cypressyew)
tags: added: azure
Po-Hsu Lin (cypressyew)
tags: added: ubuntu-kernel-selftests
Po-Hsu Lin (cypressyew)
tags: added: sru-20191111
Po-Hsu Lin (cypressyew)
tags: added: amd64
Marcelo Cerri (mhcerri) wrote :

Hi, Sam. Does that only happen in this specific instance type? Can you provide more details? Thanks

Po-Hsu Lin (cypressyew) wrote :

Hi Marcelo,
It looks like we're no longer using the Standard_A2_v2 instance for SRU testing, so I will take the 5.0.0-1032.34 Bionic Azure kernel as an example. With it, the CPU hotplug test from the kernel selftests reports that the following instances have some CPUs offlined by default:
  * Standard_DS15_v2 (https://pastebin.ubuntu.com/p/Ppw9WQSNMH/)
  * Standard_DS4_v2 (https://pastebin.ubuntu.com/p/Bms5XF4Qxk/)
  * Standard_DS5_v2 (I will skip pasting outputs)
  * Basic_A2

Not sure if this is by design (maybe it's how the Azure cloud manages its resources?).

But I don't think this is why the test failed on them, as it can pass on some other instances even with some CPUs offlined by default:
  * Standard_B1ms (only one cpu, test skipped https://pastebin.ubuntu.com/p/xzJb3jVY3g/)
  * Standard_D48_v3 (test passed, https://pastebin.ubuntu.com/p/T7bPKNZ7Pw/)
  * Standard_F32s_v2 (test passed, https://pastebin.ubuntu.com/p/J68CvkZb4Y/)

You can check the test report for azure here http://10.246.72.46/trackers.html (internal link)

Sean Feole (sfeole) wrote :

Hyper-V does not work the way you may be familiar with from something like libvirt, where 2 processor cores are assigned and essentially dedicated to the virtual host.

If I'm not mistaken, this is a result of the way Hyper-V handles virtual machines. When a VM is created (BASIC_A2_V2), the hypervisor decides where the guest's processes/threads are run on the bare-metal resources. I believe this is done at random. This could be why you are seeing an unusually high number of CPUs in the "offlined" state.

I'm not sure this test should even be run on Azure, because of Hyper-V. I will take an action item to look into this further. For now, I do not think this is an issue in the least. This test suite is designed to be run on bare-metal machines, not Hyper-V VMs.

Sean Feole (sfeole)
Changed in ubuntu-kernel-tests:
status: New → Won't Fix
Changed in linux-azure (Ubuntu):
status: New → Won't Fix