How to investigate and use Turbo mode, C-States and P-States in XenServer

Many application developers (including developers of CAD applications that favour a single core) may be interested in configuring XenServer so that their applications run on the processor in turbo mode where possible.

What is Turbo mode?

As Intel details on their own page:

Intel® Turbo Boost Technology provides more performance when needed on 4th generation Intel® Core™ processor-based systems. Intel® Turbo Boost Technology 2.0 automatically allows processor cores to run faster than the Thermal Design Power (TDP) configuration specified frequency if they’re operating below power, current, and temperature specification limits.

Does XenServer support Turbo mode?

Yes, it does. Turbo mode is controlled by the underlying Xen hypervisor. However, the default Xen configuration is not tuned for optimal performance. Xen’s power management (http://wiki.xen.org/wiki/Xen_power_management) is controlled by the governor, and xenpm (http://wiki.xen.org/wiki/Xenpm_command) is a tool within Xen that can control the governor. We do not recommend that users in general use xenpm to modify the governor beneath the XenServer hypervisor.

Instead, we recommend that users ensure their BIOS enables sufficient C-states to facilitate turbo mode, and request that XenServer uses a governor optimised for performance. Details are given later in this article.

Doesn’t XenServer require both C-States and Turbo Mode to be disabled?

No. This is not generally true. There has been some confusion owing to a workaround for an isolated issue affecting Intel Nehalem and Westmere processors. A KnowledgeBase article concerning Intel C-state transitions causing instability in XenServer 5.6 was published: http://support.citrix.com/article/CTX127395. That article recommended disabling C-state switching (power management) and Turbo Boost in the BIOS to address the issue. Recent versions of XenServer disable C-states automatically, and only on the specific affected processors.

This was a specific issue affecting the Intel Nehalem and Westmere processors only. The problem was caused by an Intel erratum concerning C-state switching.

This issue was not related to, or dependent upon, the operating system in use.

Turbo Boost does not require C-state transitions to be enabled; however, disabling them makes Turbo Boost significantly less likely to occur.

Ensuring sufficient P-states and C-States are enabled to allow Turbo to occur

XenServer and the underlying Xen governor can only use the subset of power states exposed by the server BIOS. Some hardware ships with a limit pre-set for power saving, meaning that only the most aggressive power-saving modes are enabled in the BIOS. That pre-set acts as a hard limit.

Whatever is configured in the BIOS limits what is available to the hypervisor. The hypervisor and its governor cannot enable additional P-states that favour performance over power saving: if those modes are not enabled in the BIOS, they cannot be enabled afterwards. This article, http://support.citrix.com/article/CTX132714, details how to remove such settings if they were applied by the manufacturer. We also advise users to ensure that BIOS settings such as SpeedStep are enabled where appropriate.

To ensure Xen uses a governor optimised for performance, run this command in Dom0:

/opt/xensource/libexec/xen-cmdline --set-xen cpufreq=xen:performance

This will take effect after the next reboot, provided the required power states are enabled in your BIOS.
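If the governor setting needs to be applied from a script, the command above can be assembled as an argument list first and reviewed before it is run. The following is only a minimal sketch: the helper name governor_command is hypothetical, and the only flag used is the --set-xen form shown in this article.

```python
# Path to the tool exactly as quoted in this article.
XEN_CMDLINE = "/opt/xensource/libexec/xen-cmdline"


def governor_command(governor="performance"):
    """Build the argument list that sets the Xen cpufreq governor.

    governor_command is a hypothetical helper name used for illustration.
    The resulting setting only takes effect after the host is rebooted.
    """
    if not governor.isalpha():
        # Guard against accidentally injecting extra boot options.
        raise ValueError("unexpected governor name: %r" % (governor,))
    return [XEN_CMDLINE, "--set-xen", "cpufreq=xen:%s" % governor]


if __name__ == "__main__":
    # Print the command for review rather than executing it. To apply it,
    # pass the list to subprocess.check_call() in Dom0 as root.
    print(" ".join(governor_command()))
```

Printing the command rather than executing it keeps the sketch safe to run on any machine; the change itself should still be made in Dom0 and followed by a reboot.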

The command:

xenpm set-scaling-governor performance

can be used to enable performance optimisation with immediate effect.

Platform Specific BIOS Information: Dell R720 – optimising for performance

To save on reboots run this command in Dom0 first:

/opt/xensource/libexec/xen-cmdline --set-xen cpufreq=xen:performance

Then reboot the host. Press F2 at boot, go to System BIOS > System Profile Settings, and change System Profile to "Performance Per Watt (OS)".

Platform Specific BIOS Information: IBM servers – optimising for performance

On IBM servers, find the "Operation Mode" option in the System Settings section and set it to "Maximum Performance".

Other Platform Specific BIOS Information

For other platforms, your server vendor should be able to supply the names of the relevant BIOS options and the correct performance settings.

BIOS gotchas - subtleties to watch out for

  • Fan Speed: For many performance BIOS settings it is necessary to ensure that the server is also configured to allow high or maximal fan speed for the setting to take effect. 
  • Maximal Peak vs. Static Max performance: Many servers, e.g. Dell's, have a BIOS setting such as "Static Max Performance". This option will in fact prevent turbo mode from occurring, as it favours maximum consistent performance.

How to investigate how your application is using C-states, P-states and Turbo Mode from the command line

Use xenpm to investigate C-state usage thus: 

[root@dt56 ~]# xenpm get-cpuidle-states

Note: If C-states are limited, e.g. as below where you can see a max C-state of 1, it will be very unlikely that turbo will occur:

Max C-state: C1

cpu id               : 0
total C-states       : 4
idle time(ms)        : 241176077
C0 [ACPI C0]         : transition [00000000000044025813]
                       residency  [00000000000005911172 ms]
C1 [ACPI C1]         : transition [00000000000044025813]
                       residency  [00000000000240858888 ms]
C2 [ACPI C2]         : transition [00000000000000000000]
                       residency  [00000000000000000000 ms]
C3 [ACPI C3]         : transition [00000000000000000000]
                       residency  [00000000000000000000 ms]
pc2                  : [00000000000000000000 ms]
pc3                  : [00000000000000000000 ms]
pc6                  : [00000000000000000000 ms]
pc7                  : [00000000000000000000 ms]
cc3                  : [00000000000000000000 ms]
cc6                  : [00000000000000000000 ms]
cc7                  : [00000000000000000000 ms]
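The header check described in the note above can be automated. The sketch below is an illustration only: the function names are hypothetical, and the threshold of C1 is an assumption taken from the note rather than a documented limit. It accepts both header spellings that appear in this article ("Max C-state" and "Max possible C-state").

```python
import re

# Matches "Max C-state: C1" as well as "Max possible C-state: C7".
HEADER_RE = re.compile(r"^Max (?:possible )?C-state:\s*C(\d+)", re.MULTILINE)


def max_c_state(xenpm_output):
    """Return the maximum C-state number from the header, or None if absent."""
    match = HEADER_RE.search(xenpm_output)
    return int(match.group(1)) if match else None


def turbo_unlikely(xenpm_output, threshold=1):
    """True when the reported maximum C-state is at or below the threshold.

    threshold=1 is an assumption based on the note in this article.
    """
    level = max_c_state(xenpm_output)
    return level is not None and level <= threshold


# Abbreviated excerpt of the output shown above.
SAMPLE = """Max C-state: C1

cpu id               : 0
total C-states       : 4
"""

if __name__ == "__main__":
    print(max_c_state(SAMPLE), turbo_unlikely(SAMPLE))
```

On a live host the text would come from running xenpm get-cpuidle-states in Dom0; here it is supplied as a string so the sketch runs anywhere.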

 

Use xenpm to investigate P-state usage thus: 

[root@dt56 ~]# xenpm get-cpufreq-states

Note: If P-states are disabled, xenpm get-cpufreq-states returns nothing on most versions of XenServer.

How to investigate how your application is using C-states, P-states and Turbo Mode using Server Status Reports (Bugtools)

A XenServer “Bugtool” (Server status report) is a snapshot of a XenServer system that captures configuration options and logs. You can generate a Bugtool by following the instructions on this page.

The quickest method is probably to use XenCenter and the menu option Tools->Server Status Report.

Once the report has been generated, it will contain a large number of logs.

If you are interested in C-States, P-States and Turbo usage, the following log files in the bugtool will be of particular interest. The first two capture the output of the xenpm commands above:

  • xenpm-get-cpufreq-states  
  • xenpm-get-cpuidle-states
  • xl-dmesg

If C-states have been disabled owing to the known Intel issues, you are likely to find a message to that effect in xl-dmesg:

(XEN) Disabling C-states C3 and C6 on Nehalem Processors due to errata
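Searching xl-dmesg for this message is easy to script. The sketch below is a minimal example under stated assumptions: the function name is hypothetical, it matches on the phrase "Disabling C-states" taken from the message quoted above, and the second line of the sample is a placeholder rather than real log output.

```python
def c_state_errata_lines(xl_dmesg_text):
    """Return hypervisor log lines that report C-states being disabled."""
    hits = []
    for line in xl_dmesg_text.splitlines():
        line = line.strip()
        # Hypervisor messages are prefixed with "(XEN)".
        if line.startswith("(XEN)") and "Disabling C-states" in line:
            hits.append(line)
    return hits


# First line is the message quoted in this article; the second is a
# placeholder standing in for unrelated log content.
SAMPLE = """(XEN) Disabling C-states C3 and C6 on Nehalem Processors due to errata
(XEN) placeholder for an unrelated log line
"""

if __name__ == "__main__":
    for hit in c_state_errata_lines(SAMPLE):
        print(hit)
```

To use it on a real report, read the xl-dmesg file from the extracted bugtool and pass its contents to the function.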

On systems where the BIOS and governor are correctly configured and Turbo is occurring, the contents of xenpm-get-cpufreq-states will look similar to the log below. There are some useful features to note:

  • The “*” next to a P-State e.g. “*P15” indicates the current P-State. 
  • P0 is turbo mode, notice that it is logged on this Intel system as 3401MHz, where the maximal non-turbo mode is P1 with 3400MHz.
  • This convention of labelling turbo mode with a frequency 1MHz above the normal maximum frequency means that XenCenter (up to and including XS6.2) does not reflect the true frequency of the turbo mode, and users may therefore conclude that turbo mode is not occurring.
  • To validate that turbo mode is occurring, and to see the frequency actually being obtained, we recommend that you use a tool such as CPU-Z installed within the guest in which your application is running.
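The +1MHz labelling convention described in the list above lends itself to a simple check. This is only a sketch: the function name is hypothetical, and it assumes the frequencies are supplied in P-state order (P0 first), as they appear in the xenpm output.

```python
def has_turbo_state(frequencies_mhz):
    """Return True when P0 is labelled exactly 1 MHz above P1.

    frequencies_mhz lists the P-state frequencies in order P0, P1, P2, ...
    The reported P0 value is a label, not the real turbo frequency.
    """
    if len(frequencies_mhz) < 2:
        return False
    return frequencies_mhz[0] - frequencies_mhz[1] == 1


if __name__ == "__main__":
    # Frequencies from the Intel system described in this article.
    print(has_turbo_state([3401, 3400, 3300, 3100]))
    # A list with no +1 MHz entry at the top.
    print(has_turbo_state([3400, 3300, 3100]))
```

A True result only shows that a turbo P-state is exposed; whether it is actually entered is shown by the residency figures.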

 

Contents of xenpm-get-cpufreq-states on a system with turbo mode frequently occurring:

 
cpu id               : 0
total P-states       : 16
usable P-states      : 16
current frequency    : 1600 MHz
P0         [3401 MHz]: transition [               68266]
                        residency  [             1823602 ms]
P1         [3400 MHz]: transition [                1396]
                        residency  [               14374 ms]
P2         [3300 MHz]: transition [                2885]
                        residency  [               52223 ms]
P3         [3100 MHz]: transition [                2090]
                        residency  [               20311 ms]
P4         [3000 MHz]: transition [                1974]
                        residency  [               21892 ms]
P5         [2900 MHz]: transition [                1820]
                        residency  [               26757 ms]
P6         [2800 MHz]: transition [                3896]
                        residency  [               90206 ms]
P7         [2600 MHz]: transition [                2262]
                        residency  [               30372 ms]
P8         [2500 MHz]: transition [                2153]
                        residency  [               25704 ms]
P9         [2400 MHz]: transition [                3854]
                        residency  [              138928 ms]
P10        [2200 MHz]: transition [                2020]
                        residency  [               42796 ms]
P11        [2100 MHz]: transition [                1802]
                        residency  [               31983 ms]
P12        [2000 MHz]: transition [                2223]
                        residency  [               42604 ms]
P13        [1900 MHz]: transition [                3953]
                        residency  [              146967 ms]
P14        [1700 MHz]: transition [                2280]
                        residency  [               61433 ms]
*P15       [1600 MHz]: transition [               63311]
                        residency  [             8768688 ms]

cpu id               : 1
total P-states       : 16
usable P-states      : 16
current frequency    : 1600 MHz
P0         [3401 MHz]: transition [               76707]
                        residency  [             1785359 ms]
P1         [3400 MHz]: transition [                1483]
                        residency  [               15921 ms]
P2         [3300 MHz]: transition [                2860]
                        residency  [               73382 ms]
P3         [3100 MHz]: transition [                2135]
                        residency  [               26447 ms]
P4         [3000 MHz]: transition [                2032]
                        residency  [               25760 ms]
P5         [2900 MHz]: transition [                1792]
                        residency  [               25958 ms]
P6         [2800 MHz]: transition [                4023]
                        residency  [               96967 ms]
P7         [2600 MHz]: transition [                2332]
                        residency  [               29420 ms]
P8         [2500 MHz]: transition [                2326]
                        residency  [               29897 ms]
P9         [2400 MHz]: transition [                3792]
                        residency  [              162292 ms]
P10        [2200 MHz]: transition [                2122]
                        residency  [               43214 ms]
P11        [2100 MHz]: transition [                1868]
                        residency  [               29894 ms]
P12        [2000 MHz]: transition [                2268]
                        residency  [               41548 ms]
P13        [1900 MHz]: transition [                4142]
                        residency  [              151858 ms]
P14        [1700 MHz]: transition [                2568]
                        residency  [               76681 ms]
*P15       [1600 MHz]: transition [               71673]
                        residency  [             8837895 ms]

cpu id               : 2
total P-states       : 16
usable P-states      : 16
current frequency    : 1600 MHz
P0         [3401 MHz]: transition [               69655]
                        residency  [             1693538 ms]
P1         [3400 MHz]: transition [                1239]
                        residency  [               13688 ms]
P2         [3300 MHz]: transition [                2706]
                        residency  [               50230 ms]
P3         [3100 MHz]: transition [                1958]
                        residency  [               18224 ms]
P4         [3000 MHz]: transition [                1738]
                        residency  [               20233 ms]
P5         [2900 MHz]: transition [                1640]
                        residency  [               22066 ms]
P6         [2800 MHz]: transition [                3748]
                        residency  [               81157 ms]
P7         [2600 MHz]: transition [                2085]
                        residency  [               23936 ms]
P8         [2500 MHz]: transition [                1998]
                        residency  [               21590 ms]
P9         [2400 MHz]: transition [                3533]
                        residency  [              109881 ms]
P10        [2200 MHz]: transition [                1887]
                        residency  [               41589 ms]
P11        [2100 MHz]: transition [                1801]
                        residency  [               35872 ms]
P12        [2000 MHz]: transition [                2114]
                        residency  [               46572 ms]
P13        [1900 MHz]: transition [                3851]
                        residency  [              134542 ms]
P14        [1700 MHz]: transition [                2257]
                        residency  [               60852 ms]
*P15       [1600 MHz]: transition [               65124]
                        residency  [             8265071 ms]
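To judge how often turbo is being entered, the residency figures in output like the above can be turned into a share of time per P-state. The sketch below is an illustration under assumptions: the function names are hypothetical, the regular expressions are written against the layout shown in this article and may need adjusting for other Xen versions, and the sample is an abbreviated excerpt (P0, P1 and P15 of cpu 0 only), so the share it prints applies to that excerpt, not to the full log.

```python
import re

# "P0         [3401 MHz]: ..." or "*P15       [1600 MHz]: ..."
PSTATE_RE = re.compile(r"^\*?P(\d+)\s+\[\s*(\d+) MHz\]")
RESIDENCY_RE = re.compile(r"residency\s+\[\s*(\d+) ms\]")


def parse_cpufreq_states(text):
    """Return {cpu_id: [(pstate, mhz, residency_ms), ...]}."""
    cpus = {}
    cpu = None
    pending = None  # (pstate, mhz) waiting for its residency line
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("cpu id"):
            cpu = int(line.split(":")[1])
            cpus[cpu] = []
            continue
        match = PSTATE_RE.match(line)
        if match:
            pending = (int(match.group(1)), int(match.group(2)))
            continue
        match = RESIDENCY_RE.search(line)
        if match and pending is not None and cpu is not None:
            cpus[cpu].append(pending + (int(match.group(1)),))
            pending = None
    return cpus


def turbo_share(states):
    """Fraction of the recorded residency spent in P0 (the turbo state)."""
    total = sum(res for _, _, res in states)
    p0 = sum(res for pstate, _, res in states if pstate == 0)
    return float(p0) / total if total else 0.0


# Abbreviated excerpt: three of the sixteen P-states of cpu 0.
SAMPLE = """cpu id               : 0
total P-states       : 16
usable P-states      : 16
current frequency    : 1600 MHz
P0         [3401 MHz]: transition [               68266]
                        residency  [             1823602 ms]
P1         [3400 MHz]: transition [                1396]
                        residency  [               14374 ms]
*P15       [1600 MHz]: transition [               63311]
                        residency  [             8768688 ms]
"""

if __name__ == "__main__":
    for cpu_id, states in sorted(parse_cpufreq_states(SAMPLE).items()):
        print("cpu %d: P0 share of excerpt = %.3f" % (cpu_id, turbo_share(states)))
```

Feeding the function the complete xenpm-get-cpufreq-states file from a bugtool gives the share over all P-states for every CPU.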



[root@chlorine ~]# xenpm get-cpuidle-states
Max possible C-state: C7

cpu id               : 0
total C-states       : 4
idle time(ms)        : 947121748
C0                   : transition [            96658423]
                        residency  [            12218212 ms]
C1                   : transition [            30104795]
                        residency  [             9097245 ms]
C2                   : transition [             2255699]
                        residency  [             3180247 ms]
C3                   : transition [            64297929]
                        residency  [           933965468 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             6327570 ms]
cc6                  : [           904652183 ms]
cc7                  : [                   0 ms]

cpu id               : 1
total C-states       : 4
idle time(ms)        : 947010848
C0                   : transition [            71787044]
                        residency  [            12101678 ms]
C1                   : transition [            33794312]
                        residency  [            12434500 ms]
C2                   : transition [             2163346]
                        residency  [             3966765 ms]
C3                   : transition [            35829386]
                        residency  [           929958228 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             6327570 ms]
cc6                  : [           904652183 ms]
cc7                  : [                   0 ms]

cpu id               : 2
total C-states       : 4
idle time(ms)        : 947822931
C0                   : transition [            86963076]
                        residency  [            11334484 ms]
C1                   : transition [            31988056]
                        residency  [            10800935 ms]
C2                   : transition [             3604956]
                        residency  [             3833763 ms]
C3                   : transition [            51370064]
                        residency  [           932491989 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             6539578 ms]
cc6                  : [           904910914 ms]
cc7                  : [                   0 ms]

cpu id               : 3
total C-states       : 4
idle time(ms)        : 946436594
C0                   : transition [            95396399]
                        residency  [            12796533 ms]
C1                   : transition [            36325531]
                        residency  [            11573286 ms]
C2                   : transition [             3821198]
                        residency  [             4217152 ms]
C3                   : transition [            55249670]
                        residency  [           929874201 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             6539578 ms]
cc6                  : [           904910914 ms]
cc7                  : [                   0 ms]

cpu id               : 4
total C-states       : 4
idle time(ms)        : 947077181
C0                   : transition [            66236337]
                        residency  [            11994798 ms]
C1                   : transition [            31604501]
                        residency  [            12857532 ms]
C2                   : transition [             2001912]
                        residency  [             4093945 ms]
C3                   : transition [            32629924]
                        residency  [           929514896 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             7209529 ms]
cc6                  : [           901426733 ms]
cc7                  : [                   0 ms]

cpu id               : 5
total C-states       : 4
idle time(ms)        : 947106994
C0                   : transition [            65818330]
                        residency  [            11947416 ms]
C1                   : transition [            33355846]
                        residency  [            13217849 ms]
C2                   : transition [             1890815]
                        residency  [             3987678 ms]
C3                   : transition [            30571669]
                        residency  [           929308228 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             7209529 ms]
cc6                  : [           901426733 ms]
cc7                  : [                   0 ms]

cpu id               : 6
total C-states       : 4
idle time(ms)        : 947108167
C0                   : transition [            87805457]
                        residency  [            12088038 ms]
C1                   : transition [            31232243]
                        residency  [            11014113 ms]
C2                   : transition [             3571187]
                        residency  [             4044278 ms]
C3                   : transition [            53002027]
                        residency  [           931314743 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             6509217 ms]
cc6                  : [           904808644 ms]
cc7                  : [                   0 ms]

cpu id               : 7
total C-states       : 4
idle time(ms)        : 947002505
C0                   : transition [            90229714]
                        residency  [            12196682 ms]
C1                   : transition [            34405450]
                        residency  [            11350884 ms]
C2                   : transition [             3596554]
                        residency  [             3976629 ms]
C3                   : transition [            52227710]
                        residency  [           930936976 ms]
pc2                  : [                   0 ms]
pc3                  : [                   0 ms]
pc6                  : [                   0 ms]
pc7                  : [                   0 ms]
cc3                  : [             6509217 ms]
cc6                  : [           904808644 ms]
cc7                  : [                   0 ms]
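The per-CPU C-state figures above can be summarised as a share of time per state, which makes it easier to see whether the deeper states are actually being used. The following is a minimal sketch with hypothetical function names; the regular expressions assume the layout shown in this article, and the sample is the C0 to C3 section of cpu 0 from the output above.

```python
import re

# Matches "C0  : transition [...]" and "C1 [ACPI C1] : transition [...]".
# Case-sensitive, so the lower-case "cc3" / "pc2" counters are ignored.
CSTATE_RE = re.compile(r"^C(\d+)\b")
RESIDENCY_RE = re.compile(r"residency\s+\[\s*(\d+) ms\]")


def parse_cpuidle_states(text):
    """Return {cpu_id: {cstate: residency_ms}}."""
    cpus = {}
    cpu = None
    pending = None  # C-state number waiting for its residency line
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("cpu id"):
            cpu = int(line.split(":")[1])
            cpus[cpu] = {}
            continue
        match = CSTATE_RE.match(line)
        if match:
            pending = int(match.group(1))
            continue
        match = RESIDENCY_RE.search(line)
        if match and pending is not None and cpu is not None:
            cpus[cpu][pending] = int(match.group(1))
            pending = None
    return cpus


def residency_shares(states):
    """Convert {cstate: residency_ms} into {cstate: fraction of total}."""
    total = sum(states.values())
    if not total:
        return {}
    return dict((c, float(ms) / total) for c, ms in states.items())


# cpu 0 from the output above (C0 to C3 section plus one cc counter).
SAMPLE = """cpu id               : 0
total C-states       : 4
idle time(ms)        : 947121748
C0                   : transition [            96658423]
                        residency  [            12218212 ms]
C1                   : transition [            30104795]
                        residency  [             9097245 ms]
C2                   : transition [             2255699]
                        residency  [             3180247 ms]
C3                   : transition [            64297929]
                        residency  [           933965468 ms]
cc6                  : [           904652183 ms]
"""

if __name__ == "__main__":
    shares = residency_shares(parse_cpuidle_states(SAMPLE)[0])
    for cstate in sorted(shares):
        print("C%d: %.1f%%" % (cstate, 100 * shares[cstate]))
```

In this excerpt C3 accounts for most of the recorded residency, in contrast to the earlier C1-limited example where C2 and C3 show no residency at all.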
