XenServer Performance Metrics
The Data Store for Performance Data
XenServer stores performance metrics in Round Robin Databases (RRDs). Each RRD consists of multiple Round Robin Archives (RRAs) and has a fixed size, because each RRA is in effect a circular buffer with a predefined maximum capacity.
Consolidated Data Points (CDPs) are produced by applying Consolidation Functions (CFs) to a number of actual data points, which stores historical performance data in a far more compressed format.
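To make the circular-buffer and consolidation ideas concrete, here is a minimal Python sketch. It is illustrative only and is not XenServer's implementation: the names `consolidate` and `archive` are hypothetical, the sample values are invented, and the three function names are those accepted by the 'cf' query parameter described later in this document.

```python
from collections import deque

# Consolidation functions, keyed by the names the 'cf' parameter accepts.
CONSOLIDATION_FUNCTIONS = {
    "AVERAGE": lambda points: sum(points) / len(points),
    "MIN": min,
    "MAX": max,
}

def consolidate(points, cf):
    """Reduce a list of actual data points to one consolidated data point."""
    return CONSOLIDATION_FUNCTIONS[cf](points)

# A fixed-capacity archive: once it is full, appending drops the oldest entry.
archive = deque(maxlen=3)
for value in [10.0, 20.0, 30.0, 40.0]:
    archive.append(value)

oldest_first = list(archive)  # 10.0 has been overwritten by 40.0
average_cdp = consolidate([1.0, 2.0, 3.0, 6.0], "AVERAGE")
max_cdp = consolidate([1.0, 2.0, 3.0, 6.0], "MAX")
```

Because the buffer never grows, the storage cost of an archive is known in advance, at the price of losing detail about older data.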
What Metrics are Available via RRDs for XenServer?
A wide range of metrics is available for XenServer, including C-State, P-State, IOPS, latency and many more. Not all of these are turned on by default. The full range of metrics available for both hosts and VMs is detailed in Chapter 9 of the XenServer 6.2 Administrator's Guide. This chapter also details how you can explore these metrics via XenCenter.
Each archive in the database samples its particular metric at a specified granularity. In XenServer's case we sample:
- Every 5 seconds for the duration of 10 minutes
- Every minute for the past two hours
- Every hour for the past week
- Every day for the past year
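The fixed size of each archive follows directly from the list above: the number of rows is the retention period divided by the sampling interval. The short Python calculation below works this out. It is a sketch rather than anything taken from XenServer, the names `ARCHIVES` and `rows_per_archive` are hypothetical, and it assumes a 365-day year.

```python
# (step in seconds, retention in seconds) for each archive listed above.
# ASSUMPTION: "the past year" is taken to be 365 days.
ARCHIVES = {
    "5 seconds for 10 minutes": (5, 10 * 60),
    "1 minute for 2 hours": (60, 2 * 60 * 60),
    "1 hour for 1 week": (60 * 60, 7 * 24 * 60 * 60),
    "1 day for 1 year": (24 * 60 * 60, 365 * 24 * 60 * 60),
}

def rows_per_archive(archives):
    """Return the number of rows each archive needs to cover its retention period."""
    return {name: retention // step for name, (step, retention) in archives.items()}

ROWS = rows_per_archive(ARCHIVES)
for name, rows in ROWS.items():
    print(f"{name}: {rows} rows")
```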
The sampling that takes place every 5 seconds records actual data points; the subsequent RRAs use Consolidation Functions (CFs) instead. The CFs supported by XenServer are AVERAGE, MIN and MAX.
There are RRDs for individual VMs (including dom0) and for the host. The VM RRDs are stored on the host on which the VMs run, or on the pool master when they are not running. This means that the location of a VM must be known in order to retrieve the associated performance data.
Downloading the whole RRD
RRDs can be downloaded over HTTP from the XenServer host on which they reside by using the HTTP handler registered at /host_rrd or /vm_rrd. Both addresses require authentication, either by HTTP auth or by providing a valid XenAPI session reference as a query argument. So two example calls could be:
Downloading a Host RRD
wget http://<server>/host_rrd?session_id=OpaqueRef:<SESSION HANDLE>
Downloading a VM RRD
In the case of downloading a VM's RRD, the VM uuid must also be specified as a query parameter:
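As a hedged sketch of such calls, the following Python helper builds request URLs for both handlers. The helper name `rrd_url` and the server, session and UUID values are placeholders invented for illustration, and the query parameter name `uuid` for the VM is an assumption that should be checked against your XenServer version.

```python
from urllib.parse import urlencode

def rrd_url(server, handler, session_ref, **params):
    """Build a URL for an RRD HTTP handler such as 'host_rrd' or 'vm_rrd'."""
    query = {"session_id": session_ref, **params}
    # Keep ':' literal so the session reference looks as it does in the wget examples.
    return f"http://{server}/{handler}?{urlencode(query, safe=':')}"

# Placeholder values, not real credentials.
host_url = rrd_url("xenserver.example", "host_rrd", "OpaqueRef:1234")

# ASSUMPTION: the VM is identified by a query parameter called 'uuid'.
vm_url = rrd_url(
    "xenserver.example",
    "vm_rrd",
    "OpaqueRef:1234",
    uuid="00000000-0000-0000-0000-000000000000",
)
```

The resulting URL can then be fetched with wget as above, or from Python with `urllib.request.urlopen(vm_url)`.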
Downloading RRD updates for a Host
wget "http://<server>/rrd_updates?session_id=OpaqueRef:<SESSION HANDLE>&start=10258122541&host=true"
Downloading RRD updates specifying CF
You can also choose which CF's data to download (MIN, MAX or AVERAGE) by passing it as the 'cf' query parameter:
wget "http://<server>/rrd_updates?session_id=OpaqueRef:<SESSION HANDLE>&start=10258122541&cf=AVERAGE"
Caution! Details to be aware of when using RRD updates
- rrd_updates will only return those data sources that are currently being collected. So if you have unplugged VBDs for example, getting the vm_rrd will give you the historical data for that VBD from when it was in use, but rrd_updates won't.
- rrd_updates will only give you what it considers to be the most appropriate archive for your request. As stated above, there are 4 archives available: 5-second updates for a maximum of 10 minutes, 1-minute updates for 2 hours, 1-hour updates for 1 week and 1-day updates for 1 year. When you do the GET, you specify the 'start' parameter on the URL, and the server uses it to decide which archive to give you. So if you specify a 'start' value that is 9 minutes before 'now', you'll get 108 rows from the 10 minute archive, but if you specify 11 minutes before 'now', you'll get 11 rows from the 2 hour archive. In particular, it's important that you use the same definition of 'now' that the server is using, since the server is likely to be in GMT and your client may be in a different time zone. Each call to rrd_updates returns a value (in the 'end' XML tag) that can be used as the 'start' parameter for the next call.
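The archive selection described in the second caution can be reproduced with a little arithmetic. The Python sketch below is a model of the behaviour as described here, not the server's actual code. The names `expected_archive` and `start_parameter` are hypothetical, and it assumes that 'start' is expressed in seconds since the Unix epoch, which sidesteps time-zone differences but not clock drift between client and server.

```python
import time

# (step in seconds, retention in seconds), finest archive first.
# ASSUMPTION: "1 year" is taken to be 365 days.
ARCHIVES = [
    (5, 10 * 60),
    (60, 2 * 60 * 60),
    (60 * 60, 7 * 24 * 60 * 60),
    (24 * 60 * 60, 365 * 24 * 60 * 60),
]

def expected_archive(seconds_ago):
    """Return (step, rows) for the finest archive that still covers the request."""
    for step, retention in ARCHIVES:
        if seconds_ago <= retention:
            return step, seconds_ago // step
    # Older than every archive: the coarsest archive can only return what it holds.
    step, retention = ARCHIVES[-1]
    return step, retention // step

def start_parameter(seconds_ago, now=None):
    """Compute a 'start' value as whole seconds since the Unix epoch."""
    now = time.time() if now is None else now
    return int(now - seconds_ago)

nine_minutes = expected_archive(9 * 60)
eleven_minutes = expected_archive(11 * 60)
```

Where possible, prefer the 'end' value returned by the previous call over a locally computed 'start', since it reflects the server's own clock.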
Once you have downloaded the XML representation of the RRDs of interest, you can either use a utility such as rrdtool, which makes it easy to plot graphs and carry out other forms of analysis, or you can parse the XML yourself.
Parsing the XML yourself might be useful for simple tasks such as determining the current network throughput, or for carrying out your own, more complex analysis on the data points. To help in parsing the XML there is an example Python library (parse_rrd) that can be used.
Examples of parsing the RRD databases:
Below are two examples that both use the sample Python script to demonstrate how you can parse and use the metrics yourself.
- Listing the most recent data points for each parameter
- Graphing a particular parameter using GNUPlot
DS Field Definitions
|DS||Defines a Data Source Field.|
|DS-Name||The name of this Data Source.|
|DST||Defines the Data Source Type. Can be GAUGE, COUNTER, DERIVE or ABSOLUTE.|
|HeartBeat||Defines the minimum heartbeat; the maximum number of seconds before a DS value is considered unknown.|
|Min||The minimum acceptable value. Values less than this number are considered unknown. This is optional.|
|Max||The maximum acceptable value. Values exceeding this number are considered unknown. This is optional.|
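The HeartBeat, Min and Max fields together decide whether a sample is usable or unknown. The Python sketch below expresses the rules from the table; it is not XenServer's implementation, the function name `accept_sample` is hypothetical, and representing "unknown" as NaN is a choice made for this example.

```python
import math

def accept_sample(value, seconds_since_update, heartbeat, minimum=None, maximum=None):
    """Return the value if it is usable, or NaN if it should be treated as unknown."""
    if seconds_since_update > heartbeat:
        return math.nan  # no update arrived within the heartbeat
    if minimum is not None and value < minimum:
        return math.nan  # below the minimum acceptable value
    if maximum is not None and value > maximum:
        return math.nan  # above the maximum acceptable value
    return value

# Figures borrowed from the memory_total_kib data source in the host RRD example.
ok = accept_sample(2070172, 5, 300.0, minimum=0.0, maximum=math.inf)
too_late = accept_sample(2070172, 301, 300.0)
too_small = accept_sample(-1, 5, 300.0, minimum=0.0)
```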
Example host RRD (note that a VM RRD is identically structured, but with different data sources):
<?xml version="1.0"?>
<rrd>
  <version>0003</version>
  <step>5</step>
  <lastupdate>1213616574</lastupdate>
  <ds>
    <name>memory_total_kib</name>
    <type>GAUGE</type>
    <minimal_heartbeat>300.0000</minimal_heartbeat>
    <min>0.0</min>
    <max>Infinity</max>
    <last_ds>2070172</last_ds>
    <value>9631315.6300</value>
    <unknown_sec>0</unknown_sec>
  </ds>
  <ds>
    <!-- other dss - the order of the data sources is important and defines the ordering of the columns in the archives below -->
  </ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <params>
      <xff>0.5000</xff>
    </params>
    <cdp_prep> <!-- This is for internal use -->
      <ds>
        <primary_value>0.0</primary_value>
        <secondary_value>0.0</secondary_value>
        <value>0.0</value>
        <unknown_datapoints>0</unknown_datapoints>
      </ds>
      ...other dss - internal use only...
    </cdp_prep>
    <database>
      <row>
        <v>2070172.0000</v> <!-- columns correspond to the DSs defined above -->
        <v>1756408.0000</v>
        <v>0.0</v>
        <v>0.0</v>
        <v>732.2130</v>
        <v>0.0</v>
        <v>782.9186</v>
        <v>0.0</v>
        <v>647.0431</v>
        <v>0.0</v>
        <v>0.0001</v>
        <v>0.0268</v>
        <v>0.0100</v>
        <v>0.0</v>
        <v>615.1072</v>
      </row>
      ...
    </database>
  </rra>
  ... other archives ...
</rrd>
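If you choose to parse the XML yourself, the structure above maps naturally onto a list of rows keyed by data source name. The sketch below uses Python's standard xml.etree.ElementTree module rather than the parse_rrd library mentioned earlier. The function name `archive_rows` is hypothetical, and the embedded sample is a cut-down, partly invented document (the second data source and the second row are made up for illustration).

```python
import xml.etree.ElementTree as ET

# Cut-down sample in the shape of the host RRD above; values are illustrative.
SAMPLE_RRD = """<?xml version="1.0"?>
<rrd>
  <step>5</step>
  <lastupdate>1213616574</lastupdate>
  <ds><name>memory_total_kib</name><type>GAUGE</type></ds>
  <ds><name>memory_free_kib</name><type>GAUGE</type></ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <database>
      <row><v>2070172.0000</v><v>1756408.0000</v></row>
      <row><v>2070172.0000</v><v>1756000.0000</v></row>
    </database>
  </rra>
</rrd>"""

def archive_rows(rrd_xml, cf="AVERAGE", pdp_per_row=1):
    """Return the rows of the first matching archive as dicts keyed by DS name."""
    root = ET.fromstring(rrd_xml)
    # Only direct children of <rrd> are read, so the <ds> elements inside
    # <cdp_prep> are ignored. Column order follows the order of these <ds> elements.
    names = [ds.findtext("name") for ds in root.findall("ds")]
    for rra in root.findall("rra"):
        if rra.findtext("cf") == cf and int(rra.findtext("pdp_per_row")) == pdp_per_row:
            return [
                dict(zip(names, (float(v.text) for v in row.findall("v"))))
                for row in rra.findall("database/row")
            ]
    return []

rows = archive_rows(SAMPLE_RRD)
```

Matching columns to names by position relies on the ordering guarantee stated in the XML comment above, so the data sources must be read in document order.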
Example rrd_updates - only 1 VM present, no host updates: