XenServer Performance Metrics

The Data Store for Performance Data

XenServer stores performance metrics in Round Robin Databases (RRDs). Each RRD consists of multiple Round Robin Archives (RRAs), and the whole database has a fixed size, because each RRA is in effect a circular buffer with a predefined maximum capacity.

Consolidated Data Points (CDPs) are produced by applying Consolidation Functions (CFs) to a number of actual data points, so that historical performance data is stored in a far more compact form.
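
The circular-buffer behaviour can be illustrated with a toy archive. This is a minimal sketch, not XenServer's implementation: the class name ToyArchive and its parameters are made up for illustration, and only the three CF names come from this document.

```python
from collections import deque

# Consolidation functions named in this document; the mapping is illustrative.
CONSOLIDATORS = {
    "AVERAGE": lambda xs: sum(xs) / len(xs),
    "MIN": min,
    "MAX": max,
}


class ToyArchive:
    """Fixed-capacity archive: each stored row consolidates `pdp_per_row` samples."""

    def __init__(self, cf, pdp_per_row, capacity):
        self.consolidate = CONSOLIDATORS[cf]
        self.pdp_per_row = pdp_per_row
        self.rows = deque(maxlen=capacity)  # the oldest row is dropped when full
        self.pending = []                   # samples waiting to be consolidated

    def add(self, sample):
        self.pending.append(sample)
        if len(self.pending) == self.pdp_per_row:
            self.rows.append(self.consolidate(self.pending))
            self.pending = []


archive = ToyArchive("AVERAGE", pdp_per_row=12, capacity=3)
for value in range(48):  # 48 samples make 4 consolidated rows; only 3 fit
    archive.add(float(value))
print(list(archive.rows))
```

Because the deque has a fixed maximum length, the archive never grows: the fourth consolidated row pushes the first one out.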

What Metrics are Available via RRDs for XenServer?

A wide range of metrics is available for XenServer, including C-State, P-State, IOPS, latency and many more. Not all of these are turned on by default. The full range of metrics available for both hosts and VMs is detailed in Chapter 9 of the XenServer 6.2 Administrator's Guide. This chapter also details how you can explore these metrics via XenCenter.

Data Granularity

Each archive in the database samples its particular metric at a specified granularity. In XenServer's case we sample:

  • Every 5 seconds for the past 10 minutes
  • Every minute for the past two hours
  • Every hour for the past week
  • Every day for the past year
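
The list above implies a fixed number of rows per archive, which can be worked out directly. This is a small sketch of the arithmetic only; the 365-day year is an assumption, and the exact row counts XenServer allocates may differ.

```python
# (step in seconds, retention in seconds) for the four archives listed above.
ARCHIVES = [
    (5, 10 * 60),                        # every 5 seconds for 10 minutes
    (60, 2 * 60 * 60),                   # every minute for 2 hours
    (60 * 60, 7 * 24 * 60 * 60),         # every hour for a week
    (24 * 60 * 60, 365 * 24 * 60 * 60),  # every day for a year (assumes 365 days)
]


def rows_per_archive(archives=ARCHIVES):
    """Return how many rows each archive needs to cover its retention period."""
    return [retention // step for step, retention in archives]


print(rows_per_archive())
```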

The sampling that takes place every 5 seconds records actual data points; the remaining RRAs use Consolidation Functions (CFs) instead. The CFs supported by XenServer are:

  • AVERAGE
  • MIN
  • MAX

RRDs exist for individual VMs (including dom0) and for the host. A VM's RRD is stored on the host on which the VM runs, or on the pool master when the VM is not running. This means that the location of a VM must be known in order to retrieve its performance data.

Getting RRDs

Downloading the whole RRD

RRDs can be downloaded over HTTP from the XenServer host on which they reside, using the HTTP handlers registered at /host_rrd and /vm_rrd. Both addresses require authentication, either by HTTP auth or by providing a valid XenAPI session reference as a query argument. Two example calls could be:

Downloading a Host RRD

wget 'http://<server>/host_rrd?session_id=OpaqueRef:<SESSION HANDLE>'

Downloading a VM RRD
In the case of downloading a VM's RRD, the VM uuid must also be specified as a query parameter:

wget 'http://<server>/vm_rrd?session_id=OpaqueRef:<SESSION HANDLE>&uuid=<VM UUID>'

Both of these calls will download XML in a format that can be imported into the rrdtool for analysis, or parsed directly.
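
The same two downloads can be scripted. The sketch below assumes session-reference authentication as in the wget examples; the function names rrd_url and download_rrd and the host name used in the usage line are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def rrd_url(server, session_id, vm_uuid=None):
    """Build the /host_rrd URL, or the /vm_rrd URL when a VM uuid is given."""
    params = {"session_id": session_id}
    handler = "host_rrd"
    if vm_uuid is not None:
        handler = "vm_rrd"
        params["uuid"] = vm_uuid
    # safe=":" keeps the colon in "OpaqueRef:..." literal, as in the wget examples.
    return "http://%s/%s?%s" % (server, handler, urlencode(params, safe=":"))


def download_rrd(server, session_id, vm_uuid=None, timeout=30):
    """Fetch the RRD XML as bytes; needs a reachable host and a valid session."""
    with urlopen(rrd_url(server, session_id, vm_uuid), timeout=timeout) as resp:
        return resp.read()


# Usage (placeholder values): print the URL that would be requested.
print(rrd_url("xs1.example.com", "OpaqueRef:abc"))
```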

Getting updates from the RRD

To avoid downloading the whole database when you already have most of the data, you can query for just the updates. This uses a different HTTP handler, /rrd_updates, which also requires authentication with either HTTP auth or a valid xapi session reference.

Updates are returned relative to a specified start time, expressed in Epoch time (the number of seconds since Jan 1 1970). The start time must be given in a query parameter named 'start'.

The XML downloaded will be in RRD XPORT format, and will contain every VM's data. This means that it is not possible to query a particular VM on its own, or in fact a particular parameter. To tell which data set corresponds to which VM, each 'legend' entry includes the VM's uuid. Each entry also begins with the CF used to consolidate the data (MAX, MIN or AVERAGE).

To retrieve the RRD updates of the host, the query parameter 'host=true' must be specified.

Downloading RRD updates for VMs

wget 'http://<server>/rrd_updates?session_id=OpaqueRef:<SESSION HANDLE>&start=10258122541'

Downloading RRD updates for a Host

wget 'http://<server>/rrd_updates?session_id=OpaqueRef:<SESSION HANDLE>&start=10258122541&host=true'

Downloading RRD updates specifying CF
It is also possible to choose which CF data sets to download (MIN, MAX or AVERAGE) by passing the CF name in the 'cf' query parameter:

wget 'http://<server>/rrd_updates?session_id=OpaqueRef:<SESSION HANDLE>&start=10258122541&cf=AVERAGE'
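
The query parameters described in this section ('start', 'host' and 'cf') can be assembled programmatically. This is a sketch only: the function name rrd_updates_url and the host name are hypothetical, and the five-minute window is an arbitrary choice.

```python
import time
from urllib.parse import urlencode


def rrd_updates_url(server, session_id, start, host=False, cf=None):
    """Build an /rrd_updates URL from the parameters described above."""
    params = {"session_id": session_id, "start": int(start)}
    if host:
        params["host"] = "true"  # ask for host updates
    if cf is not None:
        params["cf"] = cf        # e.g. "AVERAGE", "MIN" or "MAX"
    return "http://%s/rrd_updates?%s" % (server, urlencode(params, safe=":"))


# Ask for roughly the last five minutes, measured on the client's clock.
start = int(time.time()) - 5 * 60
print(rrd_updates_url("xs1.example.com", "OpaqueRef:abc", start, cf="AVERAGE"))
```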

 

Caution! Details to be aware of when using RRD updates 

  • rrd_updates will only return those data sources that are currently being collected. So if you have unplugged VBDs for example, getting the vm_rrd will give you the historical data for that VBD from when it was in use, but rrd_updates won't.
  • rrd_updates will only give you what it considers to be the most appropriate archive for your request. As stated above, there are 4 archives available: 5 second updates for a maximum of 10 minutes, 1 minute for 2 hours, 1 hour for 1 week and 1 day for 1 year. The 'start' parameter you specify on the URL of the GET determines which archive you are given. So if you specify a 'start' value that is 9 minutes before 'now', you'll get 108 rows from the 10 minute archive, but if you specify 11 minutes before 'now', you'll get 11 rows from the 2 hour archive. In particular, it's important that you use the same definition of 'now' that the server is using, since the server is likely to be in GMT and your client may be in a different time zone. Each call to rrd_updates returns a value (in the 'end' XML tag) that can be used as the 'start' parameter for the next call.
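
The archive selection described in the second point can be approximated as follows. This is a sketch under stated assumptions: the function pick_archive is hypothetical, and the exact boundary behaviour on the server (for example at exactly 10 minutes) is not specified here.

```python
# (step in seconds, retention in seconds), finest archive first.
ARCHIVES = [(5, 600), (60, 7200), (3600, 604800), (86400, 31536000)]


def pick_archive(start, now):
    """Return (step, expected row count) for the finest archive covering `start`."""
    age = now - start
    for step, retention in ARCHIVES:
        if age <= retention:
            return step, age // step
    # Older than every archive: the coarsest archive is all that is left.
    step, retention = ARCHIVES[-1]
    return step, retention // step


now = 1213617600  # a fixed 'now' so the example is reproducible
print(pick_archive(now - 9 * 60, now))   # 9 minutes back: 5 second archive
print(pick_archive(now - 11 * 60, now))  # 11 minutes back: 1 minute archive
```

The two calls reproduce the figures quoted above: 108 rows at a 5 second step, and 11 rows at a 1 minute step.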

 

Using RRDs

Once you have downloaded the XML representation of the RRDs of interest, you can either use a utility such as rrdtool, which makes it easy to plot graphs and carry out other forms of analysis, or you can parse the XML yourself.

Parsing the XML yourself might be useful for simple tasks such as determining the current network throughput, or for carrying out your own, more complex analysis on the data points. To help in parsing the XML there is an example Python library (parse_rrd) that can be used.

Examples of parsing the RRD databases:

Below are two examples that both use the sample python script to demonstrate how you can parse and use the metrics for yourself.

Example XML

DS Field Definitions

Field      Description
DS         Defines a Data Source field.
DS-Name    The name of this Data Source.
DST        Defines the Data Source Type. Can be GAUGE, COUNTER, DERIVE or ABSOLUTE.
HeartBeat  Defines the minimum heartbeat; the maximum number of seconds before a DS value is considered unknown.
Min        The minimum acceptable value. Values less than this number are considered unknown. This is optional.
Max        The maximum acceptable value. Values exceeding this number are considered unknown. This is optional.
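
The effect of the optional Min and Max fields can be sketched as a small filter. Representing "unknown" as NaN is an assumption made for this illustration, and the function name apply_ds_limits is hypothetical.

```python
import math


def apply_ds_limits(value, minimum=None, maximum=None):
    """Return value, or NaN when it falls outside the optional Min/Max bounds."""
    if minimum is not None and value < minimum:
        return math.nan
    if maximum is not None and value > maximum:
        return math.nan
    return value


print(apply_ds_limits(2070172.0, minimum=0.0, maximum=math.inf))  # within bounds
print(apply_ds_limits(-1.0, minimum=0.0))                         # below Min
```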

Example host rrd (note that a VM rrd is identically structured, but with different data sources):


<?xml version="1.0"?>
<rrd>
  <version>0003</version>
  <step>5</step>
  <lastupdate>1213616574</lastupdate>
  <ds>
    <name>memory_total_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>2070172</last_ds>
    <value>9631315.6300</value>
    <unknown_sec>0</unknown_sec>
  </ds>
  <ds>
   <!-- other dss - the order of the data sources is important
        and defines the ordering of the columns in the archives below -->
  </ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <params>
      <xff>0.5000</xff>
    </params>
    <cdp_prep> <!-- This is for internal use -->
      <ds>
        <primary_value>0.0</primary_value>
        <secondary_value>0.0</secondary_value>
        <value>0.0</value>
        <unknown_datapoints>0</unknown_datapoints>
      </ds>
      ...other dss - internal use only...
    </cdp_prep>
    <database>
     <row>
        <v>2070172.0000</v>  <!-- columns correspond to the DSs defined above -->
        <v>1756408.0000</v>
        <v>0.0</v>
        <v>0.0</v>
        <v>732.2130</v>
        <v>0.0</v>
        <v>782.9186</v>
        <v>0.0</v>
        <v>647.0431</v>
        <v>0.0</v>
        <v>0.0001</v>
        <v>0.0268</v>
        <v>0.0100</v>
        <v>0.0</v>
        <v>615.1072</v>
     </row>
     ...
    </database>
  </rra>
  ... other archives ...
</rrd>
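
A document of this shape can be read with Python's standard library. The sketch below is not the parse_rrd library mentioned earlier; it uses a cut-down, well-formed sample in which the second data source name (memory_free_kib) is assumed for illustration, and the function name read_archives is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<rrd>
  <step>5</step>
  <ds><name>memory_total_kib</name><type>GAUGE</type></ds>
  <ds><name>memory_free_kib</name><type>GAUGE</type></ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <database>
      <row><v>2070172.0000</v><v>1756408.0000</v></row>
    </database>
  </rra>
</rrd>"""


def read_archives(xml_text):
    """Return one dict per archive, with rows keyed by data source name."""
    root = ET.fromstring(xml_text)
    # Only top-level <ds> elements: their order defines the column order.
    names = [ds.findtext("name") for ds in root.findall("ds")]
    step = int(root.findtext("step"))
    archives = []
    for rra in root.findall("rra"):
        rows = [
            dict(zip(names, (float(v.text) for v in row.findall("v"))))
            for row in rra.findall("database/row")
        ]
        archives.append({
            "cf": rra.findtext("cf"),
            "seconds_per_row": step * int(rra.findtext("pdp_per_row")),
            "rows": rows,
        })
    return archives


for archive in read_archives(SAMPLE):
    print(archive["cf"], archive["seconds_per_row"], archive["rows"])
```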

Example rrd_updates - only 1 VM present, no host updates:


<xport>
  <meta>
    <start>1213578000</start>
    <step>3600</step>
    <end>1213617600</end>
    <rows>12</rows>
    <columns>12</columns>
    <legend>
      <entry>AVERAGE:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu1</entry> <!-- nb - each data source might have multiple entries for different consolidation functions -->
      <entry>AVERAGE:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu0</entry>
      <entry>AVERAGE:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:memory</entry>
      <entry>MIN:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu1</entry>
      <entry>MIN:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu0</entry>
      <entry>MIN:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:memory</entry>
      <entry>MAX:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu1</entry>
      <entry>MAX:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu0</entry>
      <entry>MAX:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:memory</entry>
      <entry>LAST:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu1</entry>
      <entry>LAST:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu0</entry>
      <entry>LAST:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:memory</entry>
    </legend>
  </meta>
  <data>
    <row>
      <t>1213617600</t>
      <v>0.0</v> <!-- once again, the order of the columns is defined by the legend above -->
      <v>0.0282</v>
      <v>209715200.0000</v>
      <v>0.0</v>
      <v>0.0201</v>
      <v>209715200.0000</v>
      <v>0.0</v>
      <v>0.0445</v>
      <v>209715200.0000</v>
      <v>0.0</v>
      <v>0.0243</v>
      <v>209715200.0000</v>
    </row>
   ...
  </data>
</xport>
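
The legend-to-column mapping can be unpacked in a few lines of Python. This sketch uses a shortened copy of the example above (two legend entries, one row) and assumes every legend entry has the four colon-separated parts seen there; the function name parse_updates is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<xport>
  <meta>
    <start>1213578000</start><step>3600</step><end>1213617600</end>
    <legend>
      <entry>AVERAGE:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu0</entry>
      <entry>MAX:vm:ecd8d7a0-1be3-4d91-bd0e-4888c0e30ab3:cpu0</entry>
    </legend>
  </meta>
  <data>
    <row><t>1213617600</t><v>0.0282</v><v>0.0445</v></row>
  </data>
</xport>"""


def parse_updates(xml_text):
    """Return (end, rows); each row is (timestamp, {legend tuple: value})."""
    root = ET.fromstring(xml_text)
    # Each legend entry splits into (cf, owner type, uuid, data source name).
    legend = [tuple(e.text.split(":", 3)) for e in root.findall("meta/legend/entry")]
    rows = []
    for row in root.findall("data/row"):
        values = [float(v.text) for v in row.findall("v")]
        rows.append((int(row.findtext("t")), dict(zip(legend, values))))
    return int(root.findtext("meta/end")), rows


end, rows = parse_updates(SAMPLE)
print("next start:", end)
for timestamp, values in rows:
    for (cf, kind, uuid, name), value in values.items():
        print(timestamp, cf, kind, uuid, name, value)
```

The returned end value is the one to pass as 'start' on the next call to rrd_updates.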

