The Mystery of Hyper-V's Limit Processor Functionality? (Part 2 - Final)
In my previous post I discussed how I went about trying to determine the differences in processor functionality provided by Hyper-V's Limit Processor Functionality (LPF) checkbox. You probably want to read through that in order to get the necessary background to understand this final instalment.
In this post I discuss how you can determine:
- If your operating system is running on a hypervisor,
- The processor feature differences presented for an operating system running directly on hardware versus a parent partition operating system on a hypervisor,
- The processor feature differences presented for a child partition operating system running without LPF set versus one running with it set.
In essence I ran a number of tools and found some minor discrepancies in the results I received from each. The basic premise I followed was to run them on Windows Vista x86, Windows Server 2008 (Parent partition with the Hyper-V RC0 role enabled) and then in a child partition running Windows XP SP3 with the LPF checkbox enabled and disabled.
I felt I was getting nowhere with the tools, and I couldn't arbitrate between their results because I had no view of how they determined the processor information they reported. So I wrote a tool of my own, and in the process learned a lot!
In order to do that I had to find a way to determine the processor identification and the features the processor supports. Fortunately both Intel and AMD (and I assume other processor manufacturers that provide x86 and x64 support) provide an instruction called CPUID to do this. "Brilliant", I thought, "I'll just use that and find out everything I need to know." And so I did!
I used a combination of C and Assembly to write a simple command line program that I could run in all three environments to haul out the information I required. I did not go too in-depth, but did manage to find out some really interesting bits of information.
As with the last post, I'm only going to focus on the differences I found between the various environments.
The first comparison, below, provides the differences found when running Windows Vista x86 directly on the hardware versus Windows Server 2008 with the Hyper-V role enabled. It makes for interesting reading!
The first and most notable difference is the number of CPUID registers presented in each environment (more precisely, standard CPUID functions, each of which returns its results in EAX, EBX, ECX and EDX). There are 10 when Windows runs on the bare metal and only 6 with the Hyper-V hypervisor enabled. These are important because they hold the lists of processor capabilities, and calling CPUID with EAX set to 0 tells an operating system how many of them it can query in order to determine the processor functionality. Effectively this already limits the set of processor functions that the parent partition can determine compared with an operating system running directly on the hardware without a hypervisor. The missing features relate to direct cache access and performance monitoring capabilities. They are not that interesting for the purposes of this blog entry, but they are missing when running on a hypervisor.
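For readers who want to try the EAX = 0 query themselves, here is a minimal sketch in C. It is not the tool described in this post. It assumes GCC or Clang on x86, where `<cpuid.h>` provides the `__cpuid_count` macro; the names `cpuid_regs`, `read_cpuid`, `decode_vendor` and `function_available` are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0
#endif

/* Hypothetical container for the four registers CPUID fills in. */
typedef struct { uint32_t eax, ebx, ecx, edx; } cpuid_regs;

/* Execute CPUID for one function. On other compilers or architectures
 * this sketch simply returns zeros. */
cpuid_regs read_cpuid(uint32_t function, uint32_t subfunction) {
    cpuid_regs r = {0, 0, 0, 0};
#if HAVE_CPUID
    __cpuid_count(function, subfunction, r.eax, r.ebx, r.ecx, r.edx);
#else
    (void)function;
    (void)subfunction;
#endif
    return r;
}

/* Copy the four bytes of a register, lowest byte first. */
static void put4(char *dst, uint32_t v) {
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((v >> (8 * i)) & 0xFFu);
}

/* Function 0 returns the vendor string in EBX, EDX, ECX order. */
void decode_vendor(cpuid_regs r, char out[13]) {
    put4(out + 0, r.ebx);
    put4(out + 4, r.edx);
    put4(out + 8, r.ecx);
    out[12] = '\0';
}

/* EAX from function 0 is the largest standard function number. */
int function_available(uint32_t largest_function, uint32_t function) {
    return function <= largest_function;
}

/* Print what function 0 reports on the machine running this code. */
void print_basic_info(void) {
    cpuid_regs r = read_cpuid(0, 0);
    char vendor[13];
    decode_vendor(r, vendor);
    printf("Largest standard function: %u\n", (unsigned)r.eax);
    printf("Vendor string: %s\n", vendor);
}
```

Calling `print_basic_info()` from `main` on bare metal and again in a partition should be enough to compare the largest standard function number in each environment. The decoding is kept separate from the CPUID call so that register values copied from another machine can be checked too.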
What is more interesting is the startling differences that appear when the feature flags (the bits that report what the processor's capabilities are) are queried. Although it is common sense, it remains interesting that the processor reports that it supports Virtual Machine eXtensions (VMX) when Vista is run, but does not do so when run on a hypervisor. Presumably this prevents a hypervisor from running in a child partition, because the child operating system will not detect the processor capability.
I found it interesting that MONITOR/MWAIT is supported on the hardware, and that SYSCALL/SYSRET is only present when the operating system is run on a hypervisor. AMD documentation describes SYSCALL/SYSRET as follows:
"SYSCALL and SYSRET are instructions used for low-latency system calls and returns in operating systems with a flat memory model and no segmentation. These instructions have been optimised by reducing the number of checks and memory references that are normally made so that a call or return takes less than one-fourth the number of internal clock cycles when compared to the current CALL/RET instruction method."
I assume that SYSCALL/SYSRET are enabled when VMX is active to help speed up the performance of the child partitions.
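The three flags discussed above each live at a fixed bit position, so checking them comes down to a shift and a mask. The sketch below is not the tool from this post. It takes the register values as plain integers (ECX from CPUID function 1 and EDX from extended function 0x80000001), so how you obtain them is left open. The bit positions are the ones given in the Intel and AMD CPUID documentation; the names `has_bit`, `feature_summary` and `summarise_features` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as documented for the CPUID instruction. */
enum {
    F1_ECX_MONITOR   = 3,  /* function 1, ECX: MONITOR/MWAIT            */
    F1_ECX_VMX       = 5,  /* function 1, ECX: Virtual Machine eXtensions */
    EXT1_EDX_SYSCALL = 11  /* function 0x80000001, EDX: SYSCALL/SYSRET  */
};

/* Return 1 if the given bit is set in the register value, else 0. */
int has_bit(uint32_t reg, unsigned bit) {
    return (int)((reg >> bit) & 1u);
}

/* Hypothetical summary of the three flags compared in this post. */
typedef struct {
    int monitor_mwait;
    int vmx;
    int syscall_sysret;
} feature_summary;

/* Decode the flags from ECX of function 1 and EDX of 0x80000001. */
feature_summary summarise_features(uint32_t f1_ecx, uint32_t ext1_edx) {
    feature_summary s;
    s.monitor_mwait  = has_bit(f1_ecx, F1_ECX_MONITOR);
    s.vmx            = has_bit(f1_ecx, F1_ECX_VMX);
    s.syscall_sysret = has_bit(ext1_edx, EXT1_EDX_SYSCALL);
    return s;
}
```

Running the same decode on values captured in each environment makes the differences easy to compare flag by flag, instead of reading raw hexadecimal.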
MONITOR and MWAIT instructions are described by Intel as follows:
"The MWAIT instruction is designed to operate with the MONITOR instruction. The two instructions allow the definition of an address at which to ‘wait’ (MONITOR) and an instruction that causes a predefined ‘implementation-dependent-optimised operation’ to commence at the ‘wait’ address (MWAIT). The execution of MWAIT is a hint to the processor that it can enter an implementation-dependent-optimised state while waiting for an event or a store operation to the address range armed by the preceding MONITOR instruction in program flow."
In researching this topic I came across an interesting set of documentation called Hypervisor Virtual Processor Execution at http://msdn.microsoft.com/en-us/library/bb969750.aspx. It's a bit of a pity I had to do so much work of my own just to discover the documentation, but at the same time I learned quite a lot of new information!
You'll see in the screen shot above that I had no definition for a feature called Bit 31. Bit 31 was not documented in the Intel CPUID documentation, but is set for systems that have a hypervisor running! This is a great way for applications and operating systems to determine if they are running on the hardware directly or on a hypervisor!
Before returning to the Limit Processor Functionality differences, I took a detour to find out more about this bit. I did some further research and discovered that Microsoft provides a new range of CPUID functions starting at 0x40000000, which return the hypervisor's identification, vendor, features, and minor and major release details. What a discovery! I modified my code slightly to include querying the range and it returned the following when run:
Vendor string: Microsoft Hv
That was kind of cool, because now not only could I determine if I was running on a hypervisor, but I could also find out who the vendor of the hypervisor was.
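Both checks can be sketched in a few lines of C. This is an illustration rather than the code from my tool. It assumes you already have ECX from CPUID function 1 and the EBX, ECX and EDX values returned by function 0x40000000. The hypervisor vendor string is packed in EBX, ECX, EDX order, which differs from the EBX, EDX, ECX order used by function 0. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit 31 of ECX from CPUID function 1 is set when a hypervisor
 * is present. */
int hypervisor_present(uint32_t f1_ecx) {
    return (int)((f1_ecx >> 31) & 1u);
}

/* Copy the four bytes of a register, lowest byte first. */
static void put4(char *dst, uint32_t v) {
    for (int i = 0; i < 4; i++)
        dst[i] = (char)((v >> (8 * i)) & 0xFFu);
}

/* Function 0x40000000 returns the hypervisor vendor string in
 * EBX, ECX, EDX order (12 characters, not NUL terminated). */
void decode_hypervisor_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx,
                              char out[13]) {
    put4(out + 0, ebx);
    put4(out + 4, ecx);
    put4(out + 8, edx);
    out[12] = '\0';
}

/* Convenience check for the vendor string shown in this post. */
int is_microsoft_hypervisor(const char *vendor) {
    return strcmp(vendor, "Microsoft Hv") == 0;
}
```

It is worth testing bit 31 before querying 0x40000000, because on bare metal that range is not defined and the values returned there should not be trusted.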
Finally I needed to determine the differences for child partition operating systems running with the LPF setting disabled and enabled. The resulting differences are presented below:
At this point I was more than a little disappointed, so I decided to look at the Intel CPUID documentation once more, and see what features could be defined by CPUID(3) to CPUID(6) that may help explain what could be different. For those that are technical, my "Register Index" above is actually the largest standard function number returned when I call CPUID with EAX set to 0.
Function 3 provides the Processor Serial Number. This was only provided in the Pentium III, and so does not really explain any key differences that could have been caused by enabling LPF.
Function 4 provides deterministic cache parameters. This is a little more interesting because a BIOS (yes, even the BIOS for Hyper-V's partitions!) will use this to determine the number of processor cores in a specific processor package. If you look at the results I provided for SiSoft Sandra Lite in my previous blog post, "The Mystery of Hyper-V's Limit Processor Functionality? (Part1)", you will see the results differ when LPF is enabled or disabled. This could help explain why.
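To make the function 4 fields concrete, here is a small C sketch that decodes them from raw register values, following the field layout in the Intel CPUID documentation. It is an illustration only, with hypothetical function names. Note that the core field is an upper bound on the number of core IDs in the package, not necessarily the number of cores actually present.

```c
#include <assert.h>
#include <stdint.h>

/* EAX bits 4:0: cache type (0 means no further caches to enumerate). */
unsigned cache_type(uint32_t eax) {
    return eax & 0x1Fu;
}

/* EAX bits 7:5: cache level (1, 2, 3, ...). */
unsigned cache_level(uint32_t eax) {
    return (eax >> 5) & 0x7u;
}

/* EAX bits 31:26 hold the maximum number of addressable core IDs
 * in the physical package, minus one. */
unsigned max_core_ids_per_package(uint32_t eax) {
    return ((eax >> 26) & 0x3Fu) + 1u;
}

/* Cache size in bytes = ways * partitions * line size * sets.
 * EBX 31:22 = ways - 1, EBX 21:12 = partitions - 1,
 * EBX 11:0 = line size - 1, ECX = sets - 1. */
uint64_t cache_size_bytes(uint32_t ebx, uint32_t ecx) {
    uint64_t ways       = ((ebx >> 22) & 0x3FFu) + 1u;
    uint64_t partitions = ((ebx >> 12) & 0x3FFu) + 1u;
    uint64_t line_size  = (ebx & 0xFFFu) + 1u;
    uint64_t sets       = (uint64_t)ecx + 1u;
    return ways * partitions * line_size * sets;
}
```

Function 4 is called repeatedly with ECX set to 0, 1, 2 and so on until `cache_type` returns 0. If the function is not offered at all, as appears to be the case with LPF enabled, software has to fall back on other ways of counting cores.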
Function 5 provides further information about MONITOR/MWAIT support. It's an interesting function not to have provided when LPF is enabled, because limited MWAIT support can be provided to a child partition, but obviously not with LPF enabled.
At http://msdn.microsoft.com/en-us/library/bb969743.aspx it says, "Partitions that possess the CpuPowerManagement privilege can use MWAIT to set the logical processor’s C-state if support for the instruction is present in hardware". If my research is correct this could only be true if the partition possesses the CpuPowerManagement privilege and LPF is not enabled.
Function 6 provides details about the Digital Thermal Sensor. Intel Core 2 Duo processors have a newer Digital Thermal Sensor than older processors. It allows the system to determine the processor temperature for each core and adjust clock speed and voltage. Systems can slow the processor to reduce operating temperature. I'm not particularly sure how this is useful in a child partition and why it should not be present in an environment where LPF is enabled, but it's there.
So, in an LPF environment it appears Deterministic Cache Parameters are not present, MONITOR/MWAIT functionality can never be used and the digital thermal sensor information is not available.
Lastly, according to SiSoft Sandra the maximum physical and virtual address space for a child partition without LPF enabled is 40-bit and 48-bit respectively, versus 36-bit and 32-bit for one where it is enabled. This would indicate an LPF-enabled child partition can address far less memory than child partitions that do not have it set.
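On x86 processors these address widths are reported in EAX by extended CPUID function 0x80000008. I am assuming here that this is where Sandra gets its figures, which I have not confirmed. The C sketch below decodes the two fields and converts a width in bits into addressable bytes. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* EAX bits 7:0 of function 0x80000008: physical address width in bits. */
unsigned physical_address_bits(uint32_t eax) {
    return eax & 0xFFu;
}

/* EAX bits 15:8 of function 0x80000008: virtual (linear) address width. */
unsigned virtual_address_bits(uint32_t eax) {
    return (eax >> 8) & 0xFFu;
}

/* Number of bytes addressable with the given width. Widths of 64 bits
 * or more do not fit in a uint64_t, so the result is clamped. */
uint64_t addressable_bytes(unsigned bits) {
    if (bits >= 64u)
        return UINT64_MAX;
    return (uint64_t)1 << bits;
}
```

To put the reported figures in perspective, a 40-bit physical address space covers 1 TiB, while a 36-bit one covers 64 GiB. A 32-bit virtual address space covers only 4 GiB.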
I'm sure there is more to this. If and when I find out more information I'll be sure to share it.