Comment 26 for bug 929941

Stefan Bader (smb) wrote :

Noah,

thanks for testing and reporting the results. The first thing to do now is to decide whether v1 or v3 should be the goal. v1 could be considered well tested by now. The downside I see is that, to avoid problems with certain older hypervisor code, it uses real spinning spinlocks. This means that while waiting for a lock, the virtual CPU busy-waits (which could have some impact on the cloud host's CPU usage). It also provides no queuing, so acquiring the lock can be unfair in contended situations.
The v3 kernel would in principle use the same implementation, which could theoretically be the wrong thing on older hypervisor versions (though the chance of an instance being launched on such an older host version is likely getting smaller every day). At least it is the same risk as we have now, and the lockups happened on newer hypervisor versions. So I would tend towards the v3 solution, but for that it would be good to have more hours of testing with v3 to make sure it does not show other problems that might be related to this change.
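
To illustrate the trade-off between a plain spinning lock and a queued lock, here is a minimal user-space sketch using C11 atomics. It is not the kernel's or the hypervisor's code; the names (spin_lock_t, ticket_lock_t, smoke_check) are made up for this example, and the ticket lock is only one common way to get queuing.

```c
#include <assert.h>
#include <stdatomic.h>

/* Plain spinning lock: whoever wins the atomic test-and-set gets the lock,
 * no matter how long other waiters have been spinning (no queuing). */
typedef struct { atomic_flag locked; } spin_lock_t;

static void spin_lock(spin_lock_t *l) {
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ; /* busy wait: the (virtual) CPU burns cycles here */
}

static void spin_unlock(spin_lock_t *l) {
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Ticket lock: each waiter takes a number and is served in arrival order,
 * which makes the lock fair under contention. It still busy-waits. */
typedef struct { atomic_uint next; atomic_uint owner; } ticket_lock_t;

static void ticket_lock(ticket_lock_t *l) {
    unsigned my = atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != my)
        ; /* spin until it is our turn */
}

static void ticket_unlock(ticket_lock_t *l) {
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

/* Single-threaded smoke check (hypothetical helper): returns the number
 * of tickets that have been served. */
static unsigned smoke_check(void) {
    spin_lock_t s = { ATOMIC_FLAG_INIT };
    ticket_lock_t t = { 0, 0 };

    spin_lock(&s);
    spin_unlock(&s);

    ticket_lock(&t);
    ticket_unlock(&t);
    ticket_lock(&t);
    ticket_unlock(&t);

    /* Every ticket handed out has also been served. */
    assert(atomic_load(&t.next) == atomic_load(&t.owner));
    return atomic_load(&t.owner);
}
```

With the plain lock, a waiter can in principle lose the race repeatedly; with the ticket lock, waiters are served in order, but a waiter whose virtual CPU is not currently scheduled holds up everyone queued behind it.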

Normally the process to get a change into an official kernel is to propose it for SRU (stable release update): I propose the patches for inclusion and, when accepted, they go into a proposed kernel. Normally those kernels are prepared and made available, and verification then has to be done within a week, which does not work for a bug like this. But if there is reasonable confidence that a test kernel has been running on your busy instances without the original issue or new stability problems, this should be a good argument.

Since I built the current v3 kernels there have been other updates, too, so I will go ahead and prepare a new set. I will post here when they are ready. If you could then start migrating your instances to those and report back here when you feel confident about the stability, I would start the steps required to integrate the changes into the official kernels.