Comment 6 for bug 1157757

Stefan Bader (smb) wrote :

I got around to manually narrowing down the bisection by re-ordering the patches in the range already obtained. This led me to

* cda846f x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline

I also got around to wiring up my Intel test box with a real serial port, which I then used to send Xen debug keys. Dumping the registers of a stuck HVM VCPU shows that EAX and CR4 still hold the same value. That would indicate that code execution got at least as far as the instruction that assigns CR4, but not much further (EAX would get overwritten quite soon after).

So the contents written into CR4 were 0x1407f0. My first suspect was the PGE flag, since that appears to depend on the PG flag in CR0 being set first. However, masking that off had no effect. The offender turned out to be SMEP (supervisor mode execution protection), which is also set in the CR4 contents that seem to be passed in by Xen. After manually masking that off in trampoline_64.S:startup_32, all APs get started successfully again.
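
For anyone wanting to double-check the bit arithmetic, here is a small sketch that decodes the CR4 value above. The bit positions follow the architectural CR4 layout; the helper name decode_cr4 is made up for illustration and is not taken from the kernel or from Xen.

```python
# Architectural CR4 bit positions (bit number -> flag name).
CR4_BITS = {
    0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
    6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT",
    13: "VMXE", 14: "SMXE", 16: "FSGSBASE", 17: "PCIDE",
    18: "OSXSAVE", 20: "SMEP", 21: "SMAP",
}

CR4_PGE = 1 << 7
CR4_SMEP = 1 << 20


def decode_cr4(value):
    """Return the names of the known flags set in a CR4 value, lowest bit first."""
    return [name for bit, name in sorted(CR4_BITS.items()) if value & (1 << bit)]


cr4 = 0x1407F0  # value seen in the register dump
print(decode_cr4(cr4))
print(hex(cr4 & ~CR4_PGE))   # PGE masked off (made no difference)
print(hex(cr4 & ~CR4_SMEP))  # SMEP masked off (APs start again)
```

Both PGE (bit 7) and SMEP (bit 20) show up as set, which matches the two candidates tried above.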

Now the question is probably whether the realmode code should be more conservative or whether it is the responsibility of the hypervisor to hide this from the system. All the more so because, to my understanding, the SMEP bit in CR4 should not be set at all on this CPU, as CPUID[7] does not indicate support in bit 7 of EBX (I checked that after booting into bare-metal mode).
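
To make the "more conservative" option concrete, here is a rough sketch of the check I have in mind: only keep CR4.SMEP if CPUID leaf 7 (subleaf 0) reports SMEP in EBX bit 7. This is just an illustration of the logic, not what the trampoline code or Xen actually does; the function names and the EBX values used below are hypothetical.

```python
CPUID7_EBX_SMEP = 1 << 7   # CPUID.(EAX=7,ECX=0):EBX bit 7 = SMEP supported
CR4_SMEP = 1 << 20         # CR4 bit 20 = SMEP enable


def cpuid_reports_smep(leaf7_ebx):
    """True if the given CPUID leaf 7 EBX value advertises SMEP."""
    return bool(leaf7_ebx & CPUID7_EBX_SMEP)


def sanitize_cr4(cr4, leaf7_ebx):
    """Drop CR4.SMEP when CPUID does not advertise it (hypothetical helper)."""
    if not cpuid_reports_smep(leaf7_ebx):
        return cr4 & ~CR4_SMEP
    return cr4


# Assumed EBX values for illustration only, not read from real hardware.
print(hex(sanitize_cr4(0x1407F0, 0x00)))  # SMEP not advertised: bit is cleared
print(hex(sanitize_cr4(0x1407F0, 0x80)))  # SMEP advertised: value is unchanged
```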