Virtualization Blog

Discussions and observations on virtualization.

A New Year, A New Way to Build for XenServer

Building bits of XenServer outside of Citrix has in the past been a bit of a challenging task, requiring careful construction of the build environment to replicate what 'XenBuilder', our internal build system, puts together. This has meant using custom DDK VMs or carefully installing by hand a set of packages taken from one of the XenServer ISOs. With XenServer Dundee, this will be a pain of the past, and making a build environment will be just a 'docker run' away.

Part of the work that's being done for XenServer Dundee has been moving things over to using standard build tools and packaging. In previous releases there was a mix of RPMs, tarballs and patches for existing files, but for the Dundee project everything installed into dom0 is now packaged into an RPM. Drawing on the inspiration and knowledge gained while working on xenserver/buildroot, we're now building most of these dom0 packages using mock. Mock is a standard tool for building RPM packages from source RPMs (SRPMs), and it works by constructing a completely clean chroot containing only the dependencies defined by the SRPM. This means that everything needed to build a package must be in an RPM, and the dependencies defined by the SRPM must be correct too.
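
As a concrete illustration, rebuilding a package under mock typically looks something like the following (the chroot configuration and SRPM name here are only examples, not the exact invocation our build system uses):

$ mock -r epel-7-x86_64 --rebuild xcp-networkd-*.src.rpm
$ ls /var/lib/mock/epel-7-x86_64/result/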

From the point of view of making reliably reproducible builds, using mock means there is very little possibility of the build depending on the environment. But there is also a side benefit to this work: if you actually want to rebuild a bit of XenServer, you just need a yum repository with the XenServer RPMs in it, use 'yum-builddep' to put in place all of the build dependencies, and then building should be as simple as cloning the repository and typing 'make'.

The simplest place to do this would be in the dom0 environment itself, particularly now that the partition size has been bumped up to 20 gigs or so. However, that may well not be the most convenient option. In fact, for a use case like this, the mighty Docker provides a perfect solution: it can quickly pull down a standard CentOS environment, add a reference to the XenServer yum repository, install gcc, OCaml, git, emacs and generally prepare the perfect build environment for development.
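
A rough sketch of those manual steps is shown below; the repository URL is a placeholder, and the prebuilt image described next does all of this for you:

$ docker run -i -t centos:7 /bin/bash
[root@container /]# yum install -y yum-utils
[root@container /]# yum-config-manager --add-repo http://<your-xenserver-repo-mirror>/
[root@container /]# yum install -y gcc ocaml git emacs rpm-build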

In fact, even better, Docker will actually do all of these bits for you! The Docker Hub has a facility for automatically building a Docker image provided everything required is in a repository on GitHub. So we've prepared a repository containing a Dockerfile and associated gubbins that sets things up as above, and the Docker Hub then builds and hosts the resulting Docker image.

Let's dive in with an example on how to use this. Say you have a desire to change some aspect of how networking works on XenServer, something that requires a change to the networking daemon itself, 'xcp-networkd'. We'll start by rebuilding that from the source RPM. Start the docker container and install the build dependencies:

$ docker run -i -t xenserver/xenserver-build-env
[root@15729a23550b /]# yum-builddep -y xcp-networkd

This will now download and install everything required to build the network daemon. Next, let's download and build the SRPM:

[root@15729a23550b /]# yumdownloader --source xcp-networkd

At the time of writing, this downloads the SRPM "xcp-networkd-0.9.6-1+s0+0.10.0+8+g96c3fcc.el7.centos.src.rpm". This will build correctly in our environment:

[root@15729a23550b /]# rpmbuild --rebuild xcp-networkd-*
...
[root@15729a23550b /]# ls -l ~/rpmbuild/RPMS/x86_64/
total 2488
-rw-r--r-- 1 root root 1938536 Jan  7 11:15 xcp-networkd-0.9.6-1+s0+0.10.0+8+g96c3fcc.el7.centos.x86_64.rpm
-rw-r--r-- 1 root root  604440 Jan  7 11:15 xcp-networkd-debuginfo-0.9.6-1+s0+0.10.0+8+g96c3fcc.el7.centos.x86_64.rpm

To patch this, the process is just the same as for CentOS, Fedora and any other RPM-based distro, so follow one of the many guides available.
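
In outline, that workflow looks something like the following (the spec file name is an assumption; check what the SRPM actually unpacks into ~/rpmbuild/SPECS):

[root@15729a23550b /]# rpm -i xcp-networkd-*.src.rpm
[root@15729a23550b /]# cd ~/rpmbuild/SPECS
# copy your patch into ~/rpmbuild/SOURCES, add a "PatchN:" line and a
# matching "%patchN -p1" in %prep to the spec file, then:
[root@15729a23550b SPECS]# rpmbuild -ba xcp-networkd.spec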

Alternatively, you can compile straight from the source. Most of our software is hosted on GitHub, under either the xapi-project or xenserver organisations. xcp-networkd is a xapi-project repository, so we can clone it from there:

[root@15729a23550b /]# cd ~
[root@15729a23550b ~]# git clone git://github.com/xapi-project/xcp-networkd

Most of our RPMs have version numbers constructed automatically containing useful information about the source, and where the source is from git repositories the version information comes from 'git describe'.

[root@15729a23550b ~]# cd xcp-networkd
[root@15729a23550b xcp-networkd]# git describe --tags
v0.10.0-8-g96c3fcc

The important part here is the hash, in this case '96c3fcc'. Comparing with the SRPM version, we can see these are identical. We can now just type 'make' to build the binaries:

[root@15729a23550b xcp-networkd]# make

This networkd binary can then be copied onto your XenServer host and run.
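
For example, something along these lines; the built binary's name, the install path and the service name are assumptions on my part, so check your checkout and your host before copying anything over:

[root@15729a23550b xcp-networkd]# scp <built-binary> root@<xenserver-host>:/usr/sbin/xcp-networkd
[root@15729a23550b xcp-networkd]# ssh root@<xenserver-host> 'systemctl restart xcp-networkd'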

The yum repository used by the container is created directly from the snapshot ISOs uploaded to xenserver.org, using a simple bash script named update_xs_yum.sh available on GitHub. The container defaults to using the most recently available release, but the script can be used by anyone to generate a repository from the daily snapshots too, if this is required. There's still a way to go before Dundee is released, and some aspects of this workflow are in flux - for example, the RPMs aren't currently signed. However, by the time Dundee is out the door we hope to make many improvements in this area. Certainly here in Citrix, many of us have switched to using this for our day-to-day build needs, because it's simply far more convenient than our old custom chroot generation mechanism.


XenServer Dundee Preview 2 Released

With 2015 just over and 2016 beginning, it is the perfect time to give the XenServer community a holiday present: we have released Preview 2 of the next generation of XenServer, Dundee. We put a great deal of effort into resolving the various issues that had been reported, which is partly why this test build arrives somewhat later than September's Preview 1. We are confident that this preview, together with Steve Wilson's blog (https://www.citrix.com/blogs/2015/12/14/citrix-xenserver-infrastructure-strategy/) affirming Citrix's commitment to the XenServer project, makes a fine New Year's gift. In addition, we have started publishing a series of blogs on xenserver.org covering the major improvements in depth. For those of you more interested in the highlights of this preview, let me walk through them below.

Heterogeneous processor pools

XenServer has supported creating resource pools from different generations of CPUs for many years, but a change made a few years ago in Intel CPUs affected the ability to mix the newest CPUs with somewhat older ones. The good news is that with Dundee Preview 2 this situation has been completely resolved, and it may indeed improve performance. This is an area where we need to get things absolutely right, so we would be very grateful to anyone running Dundee who tries this feature and reports back on successes or any problems encountered.

Increased scalability

Modern servers keep increasing their capacity; we not only need to keep pace, but must also make sure users can create virtual machines that truly reflect the capabilities of the physical server. Dundee Preview 2 now supports 512 physical CPUs (pCPUs) and guest VMs with up to 1.5TB of RAM. You may ask whether the virtual CPU limit has been raised as well: we have increased it to 32. In the Xen Project hypervisor, Page Modification Logging (PML) is now enabled by default; the detailed PML design has been posted in the Xen Project archives for review. Finally, the XenServer dom0 kernel has been upgraded to 3.10.93.

Support for new SUSE releases

SUSE has released version 12 SP1 of both SUSE Linux Enterprise Server (SLES) and SUSE Linux Enterprise Desktop (SLED), and both are supported in Dundee.

Security updates

Since Dundee Preview 1 was released in late September, several security hotfixes have been published for XenServer 6.5 SP1. The same fixes have been applied to Dundee and are included in Preview 2.

Download information

You can download Dundee Preview 2 from the preview page (http://xenserver.org/preview), and any issues found can be reported in our defect database (https://bugs.xenserver.org).


Integrating XenServer, RDO and Neutron

XenServer is a great choice of hypervisor for OpenStack-based clouds, but there is no native integration between it and Red Hat's RDO packages. This means that setting up an integrated environment using XenServer and RDO is more difficult than it should be. This blog post aims to resolve that, giving a method by which CentOS can easily be set up to use XenServer as the hypervisor.

Environment

  • Hypervisor: XenServer 6.5
  • Guest: CentOS 7.0
  • OpenStack: Liberty
  • Network: Neutron, ML2 plugin, OVS, VLAN

Install XenServer

The XenServer integration with OpenStack has some optimizations which mean that only EXT3 storage is supported. Make sure that when installing your XenServer you select "Optimized for XenDesktop" when prompted. Use XenCenter to check that the SR type is EXT3, as fixing it after creating the VMs will require deleting the VMs and starting again.
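
If you prefer the command line, the SR type can also be checked from dom0:

    xe sr-list params=name-label,type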

Install OpenStack VM

With XenServer, the Nova Compute service must run in a virtual machine on the hypervisor that it will be controlling. As we're using CentOS 7.0 for this environment, create a VM using the CentOS 7.0 template in XenCenter. If you want to copy and paste the scripts from the rest of this post, use the name "CentOS_RDO" for this VM. Install the CentOS 7.0 VM, but shut it down before installing RDO.

Create network for OpenStack VM

In a single-box environment we need three networks: "Integration network", "External network" and "VM network". If you already have appropriate networks (e.g. a network that gives you external access), rename the existing networks to use these name-labels. Note that the helper script rdo_xenserver_helper.sh, used in some of the later steps in this post, relies on these specific name-labels, so if you choose not to use them then please also update the helper script.

You can do this via XenCenter or run the following commands in dom0:

    xe network-create name-label=openstack-int-network
    xe network-create name-label=openstack-ext-network
    xe network-create name-label=openstack-vm-network

Create virtual network interfaces for OpenStack VM

This step requires the VM to be shut down, as it's modifying the network setup and the PV tools have not been installed in the guest.

    vm_uuid=$(xe vm-list name-label=CentOS_RDO minimal=true)
    vm_net_uuid=$(xe network-list name-label=openstack-vm-network minimal=true)
    next_device=$(xe vm-param-get uuid=$vm_uuid param-name=allowed-VIF-devices | cut -d';' -f1)
    vm_vif_uuid=$(xe vif-create device=$next_device network-uuid=$vm_net_uuid vm-uuid=$vm_uuid)
    xe vif-plug uuid=$vm_vif_uuid
    ext_net_uuid=$(xe network-list name-label=openstack-ext-network minimal=true)
    next_device=$(xe vm-param-get uuid=$vm_uuid param-name=allowed-VIF-devices | cut -d';' -f1)
    ext_vif_uuid=$(xe vif-create device=$next_device network-uuid=$ext_net_uuid vm-uuid=$vm_uuid)
    xe vif-plug uuid=$ext_vif_uuid

You can also choose to use the helper script to do this in dom0:

    source rdo_xenserver_helper.sh
    create_vif

Configure OpenStack VM/hypervisor communications

Use the HIMN tool (a plugin for XenCenter) to add the host internal management network to the OpenStack VM. This effectively performs the following operations, which could also be performed manually in dom0 or by using rdo_xenserver_helper.sh:

    source rdo_xenserver_helper.sh
    create_himn

Note: If using the commands manually, they should be run when the OpenStack VM is shut down.

Set up DHCP on the HIMN network for the OpenStack VM, allowing the OpenStack VM to reach its own hypervisor on the static address 169.254.0.1. Run the helper script in domU:

    source rdo_xenserver_helper.sh
    active_himn_interface

Install RDO

Follow the RDO Quickstart installation guide step by step; this post only points out the steps that need particular attention during installation.

Run Packstack to install OpenStack

Rather than running packstack immediately, we need to generate an answerfile so that we can tweak the configuration.

Generate answer file:

    packstack --gen-answer-file=<answer-file>

Install OpenStack services:

    packstack --answer-file=<answer-file>

These items in the answer file should be changed as below:

    CONFIG_NEUTRON_ML2_TYPE_DRIVERS=vlan
    CONFIG_NEUTRON_ML2_TENANT_NETWORK_TYPES=vlan

These items in the answer file should be changed according to your environment:

    CONFIG_NEUTRON_ML2_VLAN_RANGES=physnet1:1000:1050
    CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-eth1
    CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-eth1:eth1,br-ex:eth2

NOTE:

  • physnet1 is the physical network name for the VLAN provider and tenant networks; 1000:1050 is the VLAN tag range on each physical network available for allocation to tenant networks.
  • br-eth1 is the OVS bridge for the VM network; br-ex is the OVS bridge for the External network, which the neutron L3 agent uses for external traffic.
  • eth1 is the OpenStack VM's NIC connected to the VM network; eth2 is the OpenStack VM's NIC connected to the External network.

Configure Nova and Neutron

Copy the Nova and Neutron plugins to the XenServer host:

    source rdo_xenserver_helper.sh
    install_dom0_plugins

Edit /etc/nova/nova.conf to switch the compute driver to XenServer:

    [DEFAULT]
    compute_driver=xenapi.XenAPIDriver

    [xenserver]
    connection_url=http://169.254.0.1
    connection_username=root
    connection_password=<dom0 root password>
    vif_driver=nova.virt.xenapi.vif.XenAPIOpenVswitchDriver
    ovs_int_bridge=<integration network bridge>

NOTE:

The integration bridge (ovs_int_bridge) above can be found from dom0:

    xe network-list name-label=openstack-int-network params=bridge

169.254.0.1 is the hypervisor dom0's address, which the OpenStack VM can reach via HIMN.

Install XenAPI Python XML RPC lightweight bindings.

    yum install -y python-pip
    pip install xenapi

Configure Neutron

Edit /etc/neutron/rootwrap.conf to support using XenServer remotely:

    [xenapi]
    # XenAPI configuration is only required by the L2 agent if it is to
    # target a XenServer/XCP compute host's dom0.
    xenapi_connection_url=http://169.254.0.1
    xenapi_connection_username=root
    xenapi_connection_password=<dom0 root password>

Restart Nova and Neutron Services

    for svc in api cert conductor compute scheduler; do
        service openstack-nova-$svc restart
    done
    service neutron-openvswitch-agent restart

Launch another neutron-openvswitch-agent to talk with dom0

XenServer has a separation of dom0 and domU, and all instances' VIFs are actually managed by dom0; their corresponding OVS ports are created in dom0. Thus, we must manually start another OVS agent which is in charge of these ports and talks to dom0 (refer to the xenserver_neutron diagram).

Create ovs configuration file

    cp /etc/neutron/plugins/ml2/openvswitch_agent.ini /etc/neutron/plugins/ml2/openvswitch_agent.ini.dom0

Then edit the .dom0 copy so that it contains:

    [ovs]
    integration_bridge = xapi3
    bridge_mappings = physnet1:xapi2

    [agent]
    root_helper = neutron-rootwrap-xen-dom0 /etc/neutron/rootwrap.conf
    root_helper_daemon =
    minimize_polling = False

    [securitygroup]
    firewall_driver = neutron.agent.firewall.NoopFirewallDriver

NOTE:

xapi3 is the integration bridge (xapiX in the diagram); xapi2 is the VM network bridge (xapiY in the diagram). The bridge names can be found with:

    xe network-list name-label=openstack-int-network params=bridge
    xe network-list name-label=openstack-vm-network params=bridge

Launch neutron-openvswitch-agent

    /usr/bin/python2 /usr/bin/neutron-openvswitch-agent \
        --config-file /usr/share/neutron/neutron-dist.conf \
        --config-file /etc/neutron/neutron.conf \
        --config-file /etc/neutron/plugins/ml2/openvswitch_agent.ini.dom0 \
        --config-dir /etc/neutron/conf.d/neutron-openvswitch-agent \
        --log-file /var/log/neutron/openvswitch-agent.log.dom0 &

Replace the cirros guest with one set up to work with XenServer

    nova image-delete cirros
    wget http://ca.downloads.xensource.com/OpenStack/cirros-0.3.4-x86_64-disk.vhd.tgz

    glance image-create --name cirros --container-format ovf \
        --disk-format vhd --property vm_mode=xen --visibility public \
        --file cirros-0.3.4-x86_64-disk.vhd.tgz

Launch an instance and test its connectivity

    source keystonerc_demo

[root@localhost ~(keystone_demo)]# glance image-list
+--------------------------------------+--------+
| ID | Name |
+--------------------------------------+--------+
| 5c227c8e-3cfa-4368-963c-6ebc2f846ee1 | cirros |
+--------------------------------------+--------+

 

    [root@localhost ~(keystone_demo)]# neutron net-list
+--------------------------------------+---------+--------------------------------------------------+
| id | name | subnets |
+--------------------------------------+---------+--------------------------------------------------+
| 91c0f6ac-36f2-46fc-b075-6213a241fc2b | private | 3a4eebdc-6727-43e3-b5fe-8760d64c00fb 10.0.0.0/24 |
| 7ccf5c93-ca20-4962-b8bb-bff655e29788 | public | 4e023f19-dfdd-4d00-94cc-dbea59b31698 |
+--------------------------------------+---------+--------------------------------------------------+
    nova boot --flavor m1.tiny --image cirros --nic \
net-id=91c0f6ac-36f2-46fc-b075-6213a241fc2b demo-instance
    [root@localhost ~(keystone_demo)]# neutron floatingip-create public
    Created a new floatingip:
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
| fixed_ip_address | |
| floating_ip_address | 172.24.4.228 |
| floating_network_id | 7ccf5c93-ca20-4962-b8bb-bff655e29788 |
| id | 2f0e7c1e-07dc-4c7e-b9a6-64f312e7f693 |
| port_id | |
| router_id | |
| status | DOWN |
| tenant_id | 838ec33967ff4f659b808e4a593e7085 |
+---------------------+--------------------------------------+
    nova add-floating-ip demo-instance 172.24.4.228

After the above steps, we have successfully booted an instance with a floating IP; "nova list" will show the instances:

    [root@localhost ~(keystone_demo)]# nova list
+--------------------------------------+---------------+--------+------------+-------------+--------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+---------------+--------+------------+-------------+--------------------------------+
| ac82fcc8-1609-4d34-a4a7-80e5985433f7 | demo-inst1 | ACTIVE | - | Running | private=10.0.0.3, 172.24.4.227 |
| f302a03f-3761-48e6-a786-45b324182545 | demo-instance | ACTIVE | - | Running | private=10.0.0.4, 172.24.4.228 |
+--------------------------------------+---------------+--------+------------+-------------+--------------------------------+

Test the connectivity via the floating IP: running "ping 172.24.4.228" from the OpenStack VM should produce output like:

    [root@localhost ~(keystone_demo)]# ping 172.24.4.228
    PING 172.24.4.228 (172.24.4.228) 56(84) bytes of data.
64 bytes from 172.24.4.228: icmp_seq=1 ttl=63 time=1.76 ms
64 bytes from 172.24.4.228: icmp_seq=2 ttl=63 time=0.666 ms
64 bytes from 172.24.4.228: icmp_seq=3 ttl=63 time=0.284 ms

CPU Feature Levelling in Dundee

As anyone with more than one type of server is no doubt aware, CPU feature levelling is the method by which we try to make it safe for VMs to migrate. If each host in a pool is identical, this is easy. However if the pool has non-identical hardware, we must hide the differences so that a VM which is migrated continues without crashing.

I don't think I am going to surprise anyone by admitting that the old way XenServer did feature levelling was clunky and awkward to use. Because of a step change introduced in Intel Ivy Bridge CPUs, feature levelling also ceased to work correctly. As a result, we took the time to redesign it, and the results are available for use in Dundee beta 2.

Background

When a VM boots, the kernel looks at the available featureset and, in general, turns on as much as it knows how to. Linux will go so far as to binary patch some of its hotpaths for performance reasons. Userspace libraries frequently have the same algorithm compiled multiple times, and will select the best one to use at startup based on which CPU instructions are available.

On a bare metal system, the featureset will not change while the OS is running, and the same expectations exist in the virtual world.  Migration introduces a problem with this expectation; it is literally like unplugging the hard drive and RAM from one computer, plugging it into another, and letting the OS continue from where it was. For the VM not to crash, all the features it is using must continue to work at the destination, or in other words, features which are in use must not disappear at any point.

Therefore, the principle of feature levelling is to calculate the common subset of features available across the pool, and restrict the VM to this featureset. That way the VM's featureset will always work, no matter which pool member it is currently running on.
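
As a purely illustrative sketch (this is not how the toolstack actually implements it, and the mask values below are made up), "levelling" amounts to a bitwise AND of the hosts' feature masks:

    # Bitwise-AND two feature masks given in the hyphen-separated hex format
    # reported by 'xe host-cpu-info'; the mask values used here are invented.
    level() {
        local a="$1" b="$2" out=()
        IFS='-' read -ra A <<< "$a"
        IFS='-' read -ra B <<< "$b"
        for i in "${!A[@]}"; do
            out+=( "$(printf '%08x' $(( 0x${A[i]} & 0x${B[i]} )))" )
        done
        (IFS='-'; echo "${out[*]}")
    }

    level "17cbfbff-f7fa3223" "1fc9fbf5-f6db3203"
    # prints: 17c9fbf5-f6da3203  (the common subset of the two masks)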

Hardware

There are many factors affecting the available featureset for VMs to use:

  • The CPU itself
  • The BIOS/firmware settings
  • The hypervisor command line parameters
  • The restrictions which the toolstack chooses to apply

It is also worth identifying that firmware/software updates might reduce the featureset (e.g. Intel releasing a microcode update to disable TSX on Haswell/Broadwell processors).

Hiding features is also tricky; x86 provides no architectural means to do so. Feature levelling is therefore implemented using vendor-specific extensions, which are documented as unstable interfaces and liable to change at any point in the future (as happened with Ivy Bridge).

XenServer Pre Dundee

Older versions of XenServer would require a new pool member to be identical before it would be permitted to join. Making this happen involved consulting the xe host-cpu-info features, divining some command line parameters for Xen and rebooting.

If everything went to plan, the new slave would be permitted to join the pool. Once a pool had been created, it was assumed to be homogeneous from that point on. The command line parameters had an effect on the entire host, including Xen itself. In a slightly heterogeneous case, the difference tended to be in features which only Xen would care to use, so Xen was needlessly penalised along with dom0.

XenServer Dundee

In Dundee, we have made some changes:

There are two featuresets rather than one. PV and HVM guests are fundamentally different types of virtualisation, and come with different restrictions and abilities. HVM guests will necessarily have a larger potential featureset than PV, and having a single featureset which is safe for PV guests would apply unnecessary restrictions to HVM guests.

The featuresets are recalculated and updated every boot. Assuming that servers stay the same after initial configuration is unsafe, and incorrect.

So long as the CPU Vendor is the same (i.e. all Intel or all AMD), a pool join will be permitted to happen, irrespective of the available features. The pool featuresets are dynamically recalculated every time a slave joins or leaves a pool, and every time a pool member reconnects after reboot.

When a VM is started, it will be given the current pool featureset. This permits it to move anywhere in the pool, as the pool existed when it started. Changes in pool level have no effect on running VMs. Their featureset is fixed at boot (and is fixed across migrate, suspend/resume, etc.), which matches the expectation of the OS. (One release note is that to update the VM featureset, it must be shut down fully and restarted. This is contrary to what would be expected with a plain reboot, and exists because of some metadata caching in the toolstack which is proving hard to untangle.)

Migration safety checks are performed between the VM's fixed featureset and the destination host's featureset. This way, even if the pool level drops because a less capable slave joined, an already-running VM will still be able to move anywhere except to the new, less capable slave.

The method of hiding features from VMs now has no effect on Xen and dom0. They never migrate, and will have access to all the available features (albeit with dom0 subject to being a VM in the first place).

Conclusion

We hope that the changes listed above will make Dundee far easier to use in a heterogeneous setup.

All of this information applies equally to inter-pool migration (Storage XenMotion) and intra-pool migration. In the case of upgrade from older versions of XenServer (Rolling Pool Upgrade, Storage XenMotion again), there is a complication because of having to fill in some gaps in the older toolstack's idea of the VM's featureset. In such a case, an incoming VM is assumed to have the featureset of the host it lands on, rather than the pool level. This matches the older logic (and is the only safe course of action), but does mean the VM may end up with a higher featureset than the pool level, and be less mobile as a result. Once the VM has been shut down and started back up again, it will be given the regular pool level and behave normally.

To summarise the two key points:

  • We expect feature levelling to work between any combination of CPUs from the same vendor (with the exception of PV guests on pre-Nehalem CPUs, which lacked any masking capabilities whatsoever).
  • We expect there to be no need for any manual feature configuration to join a pool together.

That being said, this is a beta release and I highly encourage you to try it out and comment/report back.


Improving dom0 Responsiveness Under Load

Recently, the XenServer Engineering Team has been working on improving the responsiveness of the control domain when it is under heavy load. Many VMs doing lots of I/O operations can prevent one from connecting to the host through SSH, or cause the XenCenter session to disconnect for no apparent reason. All of this happened when the control plane was overloaded by the datapath plane, leaving very little CPU for such important processes as sshd or xapi. Let's have a look at how much time it takes to repeatedly execute a simple xe vm-list command on a host with 20 VM pairs doing heavy network communication:

[Figure: time taken by repeated 'xe vm-list' invocations on a loaded host, before the changes]

Most of the commands took around 2-3 seconds to complete, but some of them took as long as 30 seconds. The 2-3 seconds is slower than it should be, and 20-30 seconds is way outside of a reasonable operating window. The slow reaction time of 3 seconds and the heavy spikes of 30 seconds visible on the graph above are two separate issues affecting the responsiveness of the control commands. Let's tackle them one by one.
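
Measurements of this kind are easy to reproduce with a trivial loop in dom0 (a sketch, not the harness used for the graphs in this post):

# for i in $(seq 1 100); do time xe vm-list > /dev/null; done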

To fix the 2-3 seconds slowdown, we took advantage of the Linux kernel feature called cgroups (control groups). Cgroups allows the aggregation of processes into separate hierarchies that manage their access to various resources (CPU, memory, network). In our case, we utilised the CPU resource isolation, placing all control path daemons in the cpu control group subsystem. Giving them ten times more CPU share than datapath processes guarantees they get enough computing power to execute control operations in a timely fashion. It's worth pointing out that this does not slow down the datapath when the control plane is idle; the datapath reduces its CPU usage only when control operations need to run.

[Figure: time taken by repeated 'xe vm-list' invocations with control path daemons isolated in a CPU cgroup]

We can see that the majority of the commands took just a fraction of a second to execute, which solves the first of our problems.

What about the commands that took 20-30 seconds to print out the list of VMs? This was caused by the way in which xapi handles the creation of threads, limiting the rate based on current load and memory usage in dom0. When the load gets too high, there are not enough xapi threads to handle the requests, which results in periodic spikes in the execution times of the xe commands. This feature was useful when dom0 was 32-bit and the increased number of threads might have caused issues for the stability of the whole system. Now that dom0 is 64-bit, and with the control groups enabled, we decided it is perfectly safe to get rid of xapi's thread-limiting feature.

With these changes applied, the execution times of control path commands became as one would expect them to be:

[Figure: time taken by control path commands with both the cgroup isolation and the xapi thread changes applied]

In spite of heavy I/O load, control path processes receive all the CPU they need to get the job done, so can do it without any delay, leaving the user with a nicely responsive host regardless of the load in the guests. This makes a tremendous difference to the user-experience when interacting with the host via XenCenter, the xe CLI or over SSH.

Another real-world example where we expected significant improvements is a bootstorm. In this benchmark we start more than a hundred VMs and measure how much time it takes for the guests to become fully operational (time measured from starting the 1st VM to the completion of the n-th VM). The usual strategy employed is to start 25 VMs at a time. The following is a comparison of the results before and after the changes:

[Figure: bootstorm comparison of cumulative VM boot times, before and after the changes]

Before, booting guests overloaded the control path, which slowed down the boot process of later VMs. After our improvements, the time of booting consecutive guests grows linearly, with the whole benchmark completing twice as fast compared to the build without the changes.

Another view on the same data - showing the time to boot a single VM:

[Figure: time to boot each individual VM during the bootstorm, before and after the changes]

CPU resource isolation and the xapi improvements make VMs resilient to the load generated by simultaneously booting guests. Each of them takes the same amount of time to become ready, compared to the significant increase seen on the host without the changes. That is how you would expect the control plane to operate.

What other benefits do these improvements bring for XenServer users? They will have no more problems synchronizing XenCenter with the host and issuing commands to xapi. We expect that XenDesktop users should be able to start many VMs on the pool master while leaving it still responsive to control path commands. This allows them to start more VMs on the master, reducing the necessary hardware and decreasing the total cost of ownership. Cloud administrators can have increased confidence in their ability to administer the host despite unknown and unpredictable workloads in their tenants' VMs.

TECHNICAL DETAILS

For anyone interested in playing around with the new feature, here are a couple of details of the implementation and the organisation of files in the dom0.

All the changes are contained in a single rpm called control-slice.

The control-slice itself is a systemd unit that defines a new slice to which all control-path daemons are assigned. You can find its configuration in the following file:

# cat /etc/systemd/system/control.slice
[Unit]
Description=Control path slice

[Slice]
CPUShares=10240

By modifying the CPUShares parameter one can change the CPU share that control-path processes get. Since the base value is 1024, assigning a share of, for example, 2048 would mean that control-path processes get twice the processing power of datapath processes. The default value for the control-slice is 10240, which means control-path processes get up to ten times more CPU than datapath processes. To apply changes one has to reload the configuration and restart the control.slice unit:

# systemctl daemon-reload
# systemctl restart control.slice

Each daemon that belongs to the control-slice has a simple configuration file that specifies the name of the slice that it belongs to, for example for xapi we have:

# cat /etc/systemd/system/xapi.service.d/slice.conf 
[Service]
Slice=control.slice

Last but not least, systemd provides admins with a powerful utility that allows monitoring cgroups resources utilisation. These can be examined by typing the following command:

# systemd-cgtop

The above improvements are planned for the forthcoming XenServer Dundee release, and can be tried out with the Dundee beta.2 preview. Let us know if you like it and whether it makes a difference for you!


XenServer Dundee Beta.2 Available

With 2015 quickly coming to a close, and 2016 beckoning, it's time to deliver a holiday present to the XenServer community. Today, we've released beta 2 of project Dundee. While the lag between beta 1 in September and today has been a bit longer than many would've liked, part of that lag was due to the effort involved in resolving many of the issues reported. The team is confident you'll find both beta 2 and Steve Wilson's blog affirming Citrix's commitment to XenServer to be a nice gift. As part of that gift, we're planning to have a series of blogs covering a few of the major improvements in depth, but for those of you who like the highlights - let's jump right in!

CPU leveling

XenServer has supported for many years the ability to create resource pools with processors from different CPU generations, but a few years back a change was made with Intel CPUs which impacted our ability to mix the newest CPUs with much older ones. The good news is that with Dundee beta.2, that situation should be fully resolved, and may indeed offer some performance improvements. Since this is an area where we really need to get things absolutely correct, we'd appreciate anyone running Dundee trying this out if you can and reporting back on successes and issues.

Increased scalability

Modern servers keep increasing their capacity, and not only do we need to keep pace, but we need to ensure users can create VMs which mirror the capacity of a physical machine. Dundee beta.2 now supports up to 512 physical cores (pCPUs), and can create guest VMs with up to 1.5 TB RAM. Some of you might ask about increasing vCPU limits, and we've bumped those up to 32 as well. We've also enabled Page Modification Logging (PML) in the Xen Project hypervisor as a default. The full design details for PML are posted in the Xen Project archives for review if you'd like to get into the weeds of why this is valuable. Lastly, we've bumped the kernel version to 3.10.93.

New SUSE templates

SUSE has released version 12 SP1 for both SUSE Linux Enterprise Server (SLES) and SUSE Linux Enterprise Desktop (SLED), both of which are now supported templates in Dundee.

Security updates

Since Dundee beta.1 was made available in late September, a number of security hotfixes for XenServer 6.5 SP1 have been released. Where valid, those same security patches have been applied to Dundee and are included in beta.2.

Download Information

You can download Dundee beta.2 from the Preview Download page (http://xenserver.org/preview), and any issues found can be reported in our defect database (https://bugs.xenserver.org).     


Review: XenServer 6.5 SP1 Training CXS-300

A few weeks ago, I received an invitation to participate in the first new XenServer class to be rolled out in over three years, namely CXS-300: Citrix XenServer 6.5 SP1 Administration. Those of you with good memories may recall that XenServer 6.0, on which the previous course was based, was officially released on September 30, 2011. Being an invited guest in what was to be only the third time the class had ever been held was something that just couldn't be passed up, so I hastily agreed. After all, the evolution of the product since 6.0 has been enormous. Plus, I have been a huge fan of XenServer since first working with version 5.0 back in 2008. Shortly before the open-sourcing of XenServer in 2013, I still recall the warnings of brash naysayers that XenServer was all but dead. However, things took a very different turn in the summer of 2013 with the open-source release and subsequent major efforts to improve and augment product features. While certain elements were pulled and restored and there was a bit of confusion about changes in the licensing models, things have stabilized and, all told, the power and versatility of XenServer with the 6.5 SP1 release is now at a level some thought it would never reach.

FROM 6.0 TO 6.5 – AND BEYOND

XenServer (XS for short) 6.5 SP1 made its debut on May 12, 2015. The feature set and changes are – as always – incorporated within the release notes. There are a number of changes of note that include an improved hotfix application mechanism, a whole new XenCenter layout (since 6.5), increased VM density, more guest OS support, a 64-bit kernel, the return of workload balancing (WLB) and the distributed virtual switch controller (DVSC) appliance, in-memory read caching, and many others. Significant improvements have been made to storage and network I/O performance and overall efficiency. XS 6.5 was also a release that benefited significantly from community participation in the Creedence project and the SP1 update builds upon this.

One notable point is that XenServer has been found to now host more XenDesktop/XenApp (XD/XA) instances than any other hypervisor (see this reference). And, indeed, when XenServer 6.0 was released, a lot of the associated training and testing on it was in conjunction with Provisioning Services (PVS). Some users, however, discovered XenServer long before this as a perfectly viable hypervisor capable of hosting a variety of Linux and Windows virtual machines, without having even given thought to XenDesktop or XenApp hosting. For those who first became familiar with XS in that context, the added course material covering Provisioning Services had in reality relatively little to do with XenServer functionality as an entity. Some viewed PVS as an overly emphasized component of the course and exam. In this new course, I am pleased to say that XS's original roots as a versatile hypervisor are where the emphasis now lies. XD/XA is of course discussed, but the many features fundamental to XS itself are what the course focuses on, and it does that well.

COURSE MATERIALS: WHAT’S INCLUDED

The new “mission” of the course from my perspective is to focus on the core product itself and not only understand its concepts, but to be able to walk away with practical working knowledge. Citrix puts it that the course should be “engaging and immersive”. To that effect, the instructor-led course CXS-300 can be taken in a physical classroom or via remote GoToMeeting (I did the latter) and incorporates a lecture presentation, a parallel eCourseware manual plus a student exercise workbook (lab guide) and access to a personal live lab during the entire course. The eCourseware manual serves multiple purposes, providing the means to follow along with the instructor and later enabling an independent review of the presented material. It adds a very nice feature of providing an in-line notepad for each individual topic (hence, there are often many of these on a page) and these can be used for note taking and can be saved and later edited. In fact, a great takeaway of this training is that you are given permanent access to your personalized eCourseware manual, including all your notes.

The course itself is well organized; there are so many components to XenServer that five days works out in my opinion to be about right – partly because often question and answer sessions with the instructor will take up more time than one might guess, and also, in some cases all participants may have already some familiarity with XS or other hypervisor that makes it possible to go into some added depth in some areas. There will always need to be some flexibility depending on the level of students in any particular class.

A very strong point of the course is the set of diagrams and illustrations that are incorporated, some of which are animated. These complement the written material very well, and the visual reinforcement of the subject matter is very beneficial. Below is an example, illustrating a high availability (HA) scenario:

[Figure: course illustration of a high availability (HA) scenario]

 

The course itself is divided into a number of chapters that cover the whole range of features of XS, enforced by some in-line Q&A examples in the eCourseware manual and with related lab exercises.  Included as part of the course are not only important standard components, such as HA and Xenmotion, but some that require plugins or advanced licenses, such as workload balancing (WLB), the distributed virtual switch controller (DVSC) appliance and in-memory read caching. The immediate hands-on lab exercises in each chapter with the just-discussed topics are a very strong point of the course and the majority of exercises are really well designed to allow putting the material directly to practical use. For those who have already some familiarity with XS and are able to complete the assignments quickly, the lab environment itself offers a great sandbox in which to experiment. Most components can readily be re-created if need be, so one can afford to be somewhat adventurous.

The lab, while relying heavily on the XenCenter GUI for most of the operations, does make a fair amount of use of the command line interface (CLI) for some operations. This is a very good thing for several reasons. First off, one may not always have access to XenCenter and knowing some essential commands is definitely a good thing in such an event. The CLI is also necessary in a few cases where there is no equivalent available in XenCenter. Some CLI commands offer some added parameters or advanced functionality that may again not be available in the management GUI. Furthermore, many operations can benefit from being scripted and this introduction to the CLI is a good starting point. For Windows aficionados, there are even some PowerShell exercises to whet their appetites, plus connecting to an Active Directory server to provide role-based access control (RBAC) is covered.

THE INSTRUCTOR

So far, the materials and content have been the primary points of discussion. However, what truly can make or break a class is the instructor. The class happened to be quite small, and primarily with individuals attending remotely. Attendees were in fact from four different countries in different time zones, making it a very early start for some and very late in the day for others. Roughly half of those participating in the class were not native English speakers, though all had admirable skills in both English and some form of hypervisor administration.  Being all able to keep up a common general pace allowed the class to flow exceptionally well. I was impressed with the overall abilities and astuteness of each and every participant.

The instructor, Jesse Wilson, was first class in many ways. First off, knowing the material and being able to present it well are primary prerequisites. But above and beyond that was his ability to field questions related to the topic at hand and even to go off onto relevant tangential material and be able to juggle all of that and still make sure the class stayed on schedule. Both keeping the flow going and also entertaining enough to hold students’ attention are key to holding a successful class. When elements of a topic became more of a debatable issue, he was quick to not only tackle the material in discussion, but to try this out right away in the lab environment to resolve it. The same pertained to demonstrating some themes that could benefit from a live demo as opposed to explaining them just verbally. Another strong point was his adding his own drawings to material to further clarify certain illustrations, where additional examples and explanations were helpful.

SUMMARY

All told, I found the course well structured, very relevant to the product and the working materials to be top notch. The course is attuned to the core product itself and all of its features, so all variations of the product editions are covered.

Positive points:

  • Good breadth of material
  • High-quality eCourseware materials
  • Well-presented illustrations and examples in the class material
  • Q&A incorporated into the eCourseware book
  • Ability to save course notes and permanent access to them
  • Relevant lab exercises matching the presented material
  • Real-life troubleshooting (nothing ever runs perfectly!)
  • Excellent instructor

Desiderata:

  • More “bonus” lab materials for those who want to dive deeper into topics
  • More time spent on networking and storage
  • A more responsive lab environment (which was slow at times)
  • More coverage of more complex storage Xenmotion cases in the lecture and lab

In short, this is a class that fulfills the needs of anyone from just learning about XenServer to even experienced administrators who want to dive more deeply into some of the additional features and differences that have been introduced in this latest XS 6.5 SP1 release. CXS-300: Citrix XenServer 6.5 SP1 Administration represents a makeover in every sense of the word, and I would say the end result is truly admirable.


XenServer Dundee Beta 1 Available

We are very pleased to make the first beta of XenServer Dundee available to the community. As with all pre-release downloads, this can be found on the XenServer Preview page. This release does include some potential commercial features, and if you are an existing Citrix customer you can access those features using the XenServer Tech Preview. It's also important to note that a XenServer host installed from the installer obtained from either source will have an identical version number and identical functionality. Application of a Tech Preview license unlocks the potential commercial functionality. So with the "where do I get Dundee beta 1" out of the way, I bet you're all interested in what the cool bits are, and what things might be worth paying attention to. With that in mind, here are some of the important platform differences between XenServer 6.5 SP1 and Dundee beta 1.

Updated dom0

The control domain, dom0, has undergone some significant changes. Last year we moved to a 64 bit control domain with a 3.10 kernel as part of our effort to increase overall performance and scalability. That change allowed us to increase VM density to 1000 VMs per host while making some significant gains in both storage and network performance. The dom0 improvements continue, and could have a direct impact on how you manage a XenServer.

CentOS 7

dom0 now uses CentOS 7 as its core operating system, and along with that change comes a significant change in how "agents" and some scripts run. CentOS 7 has adopted systemd, and by extension so too has XenServer. This means that shell scripts started at system initialization time will need to change to follow the unit and service definition model specified by systemd.

cgroups for Control Isolation

Certain xapi processes have been isolated into Linux control groups. This allows for greater system stability under extreme resource pressure which has created a considerably more deterministic model for VM operations. The most notable area where this can be observed is under bootstorm conditions. In XenServer 6.5 and prior, starting large numbers of VMs could result in start operations being blocked due to resource contention which could result in some VMs taking significantly longer to start than others. With xapi isolation into cgroups, VM start operations no longer block as before resulting in VM start times being much more equitable. This same optimization can be seen in other VM operations such as when large quantities of VMs are shutdown.
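
If you want to see this isolation for yourself on a beta 1 host, a couple of generic systemd commands are enough (the slice name follows the control-slice arrangement described in the dom0 responsiveness post above; check your own host):

    systemctl show xapi.service -p Slice     # which slice xapi has been placed in
    systemd-cgtop                            # live per-cgroup CPU usage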

RBAC Provider Changes

XenServer 6.5 and prior used an older version of Likewise to provide Active Directory. Likewise is now known as Power Broker, and XenServer is using the Power Broker Identity Services to provide authentication for RBAC. This has improved performance, scale and reliability, especially for complex or nested directory structures. Since RBAC is core to delegated management of a XenServer environment, we are particularly interested in feedback on any issues users might have with RBAC in Dundee beta 1.

dom0 Disk Space Usage

In XenServer 6.5 and prior, dom0 disk space was limited to 4GB. While this size was sufficient for many configurations, it was limiting for more than a few of you. As a result we've split the dom0 disk into three core partitions: system, log and swap. The system partition is now 18GB, which should provide sufficient space for some time to come. This also means that the overall install space required for XenServer increases from 8GB to 46GB. As you can imagine, given the importance of this major change, we are very interested to learn of any situations where this change prevents XenServer from installing or upgrading properly.
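
After installing or upgrading, the new layout is easy to verify from dom0 with standard tools:

    df -h /      # size of the system partition
    lsblk        # overall disk layout, including the log and swap partitions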

Storage Improvements

Having flexible storage options is very important to efficient operation of any virtualization solution. To that end, we've added in support for three highly requested storage improvements; thin provisioned block storage, NFSv4 and FCoE.

Thin Provisioned Block Storage

iSCSI and HBA block storage can now be configured to be thinly provisioned. This is of particular value to those users who provision guest storage with a high-water mark, expecting that some allocated storage won't be used. With XenServer 6.5 and prior, the storage provider would allocate the entire disk space up front, which could result in a significant reduction in storage utilization and in turn increase the cost of virtualization. Now block storage repositories can be configured with an initial size and an increment value. Since storage is critical in any virtualization solution, we are very interested in feedback on this functional change.

FCoE

Fibre Channel over Ethernet is a protocol which allows Fibre Channel traffic to travel over standard Ethernet networks. XenServer is now able to communicate with FCoE-enabled storage solutions, and can be configured at install time to allow boot from SAN with FCoE. If you are using FCoE in your environment, we are very interested in learning of any issues, as well as which CNA you used during your tests.

Operational Improvements

Many additional system level improvements have been made for Dundee beta 1, but the following highlight some of the operational improvements which have been made.

UEFI Boot

XenServer 6.5 and prior required legacy BIOS mode to be enabled on UEFI based servers. With Dundee beta 1, servers with native UEFI mode enabled should now be able to install and run XenServer. If you encounter a server which fails to install or boot XenServer in UEFI mode, please provide server details when reporting the incident.

Automatic Health Check

XenServer can now optionally generate a server status report on a schedule and automatically upload it to Citrix Insight Services (formerly known as TaaS). CIS is a free service which will then perform the analysis and report on any health issues associated with the XenServer installation. This automatic health check is in addition to the manual server status report facility which has been in XenServer for some time.

Improved Patch Management in XenCenter

Application of XenServer patches through XenCenter has become easier. The XenCenter updates wizard has been rewritten to find all patches available on Citrix’s support website, rather than ones that have been installed on other servers. This avoids missing updates, and allows automatic clean-up of patches files at the end of the installation.

Why Participate in the Beta Program

These platform highlights speak to how significant the engineering effort has been to get us to beta 1. They also overshadow some other arguably core items like a move to the Xen Project Hypervisor 4.6, host support for up to 5TB of host RAM, or even Windows guest support for up to 1TB RAM. What they do show is our commitment to the install base and their continued adoption of XenServer at scale. Last year we ran an incredibly successful prerelease program for XenServer Creedence, and it's partly through that program that XenServer 6.5 is as solid as it is. We're building on that solid base in the hopes that Dundee will better those accomplishments, and we're once again requesting your help. Download Dundee. Test it in your environment. Push it, and let us know how it goes. Just please be aware that this is prerelease code which shouldn't be placed in production and that we're not guaranteeing you'll ever be able to upgrade from it.

Download location: http://xenserver.org/prerelease

Defect database: https://bugs.xenserver.org


XenServer Dundee Alpha.3 Available

The XenServer team is pleased to announce the availability of the third alpha release in the Dundee release train. This release includes a number of performance oriented items and includes three new functional areas.

  • Microsoft Windows 10 driver support is now present in the XenServer tools. The tools have yet to be WHQL certified and are not yet working for GPU use cases, but users can safely use them to validate Windows 10 support.
  • FCoE storage support has been enabled for the Linux Bridge network stack. Note that the default network stack is OVS, so users wishing to test FCoE will need to convert the network stack to Bridge (see the sketch after this list) and will need to be aware of the feature limitations in Bridge relative to OVS.
  • Docker support present in XenServer 6.5 SP1 is now also present in Dundee
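
Converting the backend is done from dom0 and takes effect after a host reboot; the commands below are the standard XenServer network-backend switch, but double-check the Dundee release notes before running them on a pool:

    xe-switch-network-backend bridge        # switch from openvswitch to Linux Bridge, then reboot
    xe-switch-network-backend openvswitch   # switch back to OVS later if desired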

Considerable work has been performed to improve overall I/O throughput on larger systems and improve system responsiveness under heavy load. As part of this work, the number of vCPUs available to dom0 has been increased on systems with more than 8 pCPUs. Early results indicate a significant improvement in throughput compared to Creedence. We are particularly interested in hearing from users who have previously experienced responsiveness or I/O bottlenecks, so please take a look at Alpha.3 and share your observations.

Dundee alpha.3 can be downloaded from the pre-release download page.     


Behind the Mountain - Load Testing XenServer 6.5 SP1


With the right equipment (CPU and memory), previous releases of XenServer like 6.1 were able to run up to 150 concurrent instances of Microsoft Windows and Linux VMs on a single server. More recent releases like 6.2 and 6.5 pushed this envelope to the higher altitudes of 500 and 650 Windows and Linux VMs respectively. The newly released 6.5 SP1 version has climbed yet higher into the stratosphere and is now able to run an amazing 1000 Windows 7 VMs. The next question is always: OK, so the system can run these large numbers of VMs, but how many are useful in real-world environments and use cases?

To answer this, we carry out detailed performance investigations to understand how well the system behaves under LoginVSI loads, which simulate a typical knowledge-worker desktop (email, browser, apps and so on). In the case of the following test, the workload was LoginVSI Medium. We then determine the maximum number of such VMs that can be run this way while still offering the end user really good performance and responsiveness.

  • Starting with the changes made in XenServer 6.5, we show here that XS 6.5 can handle 500 VMs with every single one of them performing acceptably. XenServer 6.5 thus enjoys a massive 40% improvement over XenServer 6.2 on this metric, and a 125% improvement over XenServer 6.1. Full details and plots are shown in the article here.
  • On top of this, we have now completed the same measurements using XenServer 6.5 SP1. I am very happy to say that 6.5 SP1 is able to run a lot more: 20% more responsive VMs than 6.5 using the LoginVSI 3.5 workload. It gives a LoginVSI max score of 600 out of the maximum 1,000 Windows 7 VMs which we can run on this host.

With a bigger server, we fully expect XenServer to be able to run 1,000 VMs with the same responsiveness for full desktop workloads like LoginVSI. The full details and plots are shown in the article here.


Storage XenMotion in Dundee

Before we dive into the enhancements to the Storage XenMotion (SXM) feature in the XenServer Dundee Alpha 2 release, here is a refresher on the various VM migrations supported in XenServer and how users can leverage them for different use cases.

XenMotion refers to the live migration of a VM (with its disks residing on shared storage) from one host to another within a pool, with minimal downtime. With XenMotion we are moving the VM without touching its disks. XenMotion is very helpful during host and pool maintenance, and is used with XenServer features such as Work Load Balancing (WLB) and Rolling Pool Upgrade (RPU), where VMs residing on shared storage can be moved to another host within a pool.

Storage XenMotion is the marketing name given to two distinct XenServer features: live storage migration and VDI move. Both refer to the movement of a VM's disk (VDI) from one storage repository to another; live storage migration additionally moves the running VM from one host to another. In the initial state, the VM's disk can reside either on the local storage of a host or on shared storage. It can then be motioned to either the local storage of another host or the shared storage of a pool (or standalone host). The following classifications exist for SXM:

  • When the source and destination hosts are part of the same pool, we refer to it as IntraPool SXM. You can choose to migrate the VM's VDIs to the local storage of the destination host or to another shared storage in the pool. For example, leverage it when you need to live migrate VMs residing on a slave host to the master of a pool.
  • When the source and destination hosts are part of different pools, we refer to it as CrossPool SXM; a CLI sketch follows this list. VM migration between two standalone hosts is also regarded as CrossPool SXM. You can choose to migrate the VM's VDIs to the local storage of the destination host or to shared storage in the destination pool. For example, I often leverage CrossPool SXM when I need to migrate VMs from a pool running an old version of XenServer to a pool running the latest one.
  • When the source and destination hosts are the same, we refer to it as LiveVDI Move. For example, leverage LiveVDI Move when you want to upgrade your storage arrays but don't want to move the VMs to another host. In such cases, you can LiveVDI move a VM's VDI from the shared storage that is due to be upgraded to another shared storage in the pool without taking down your VMs.
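
As an illustration of CrossPool SXM from the CLI, a migration to another pool looks roughly like the sketch below. It assumes the xe vm-migrate parameters introduced with SXM in XenServer 6.1; the VM name, host name, credentials and SR are placeholders, and per-VDI mappings can alternatively be supplied if different disks need to land on different SRs.

VM_UUID=$(xe vm-list name-label=my-vm --minimal)       # placeholder VM name
xe vm-migrate uuid=${VM_UUID} live=true \
    remote-master=dest-pool-master.example.com \
    remote-username=root remote-password=secret \
    destination-sr-uuid=<uuid of an SR in the destination pool>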

When SXM was first rolled out in XenServer 6.1, there were some restrictions on VMs before they could be motioned: a maximum number of snapshots a VM could have while undergoing SXM, VMs with checkpoints could not be motioned, the VM had to be running (otherwise it's not live migration), and so on. For the XenServer Dundee Alpha 2 release the XenServer team has removed some of those constraints. Below is the list of enhancements brought to SXM.

1. A VM can be motioned regardless of its power status. Therefore I can successfully migrate a suspended VM, or move a halted VM, within a pool (intra pool) or across pools (cross pool).


2. A VM can have more than one snapshot and checkpoint during an SXM operation. Thus a VM can have a mix of snapshots and checkpoints and still be successfully motioned.


3. A halted VM can be copied from, say, pool A to pool B (cross-pool copy). Note that VMs that are not in the halted state cannot be cross-pool copied.


4. User-created templates can also be copied from, say, pool A to pool B (cross-pool copy). Note that system templates cannot be cross-pool copied.


This is not the end of the SXM improvements! Stay tuned: in the upcoming release we aim to reduce VM downtime further during migration operations. Do download the Dundee preview and try it out yourself.


Dundee Alpha 2 Released

I am pleased to announce that today we have made available the second alpha build for XenServer Dundee. For those of you who missed the first alpha, it was focused entirely on the move to CentOS 7 for dom0. This is an important operational change that long-time XenServer users, and those who have written management tooling for XenServer, should keep in mind throughout the Dundee development cycle. At the time of Alpha 1 no mention was made of feature changes, but with Alpha 2 we're going to talk about some features. Here are some of the important items to be aware of.

Thin Provisioning on block storage

For those who aren't aware, when a XenServer SR uses iSCSI or an HBA, virtual disks have always consumed their entire allocated space regardless of how much of the virtual disk was actually used. With Dundee we now have full thin provisioning for all block storage, independent of storage vendor. In order to take advantage of this, you will need to indicate during SR creation that thin provisioning is required. You will also be given the opportunity to specify the default VDI allocation, which allows users to balance VDI utilization against storage performance. We do know about a number of areas still needing attention, but are providing early access so that the community can identify issues our testing hasn't yet encountered.

NFS version 4

While a simple enhancement, this was identified as a priority item during the Creedence previews last year. We didn't have time then to fully implement it, but as of Dundee Alpha 2 you can specify NFS 4 during SR creation in XenCenter.
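
From the CLI, the equivalent of the XenCenter option should look roughly like the sketch below; the nfsversion device-config key reflects our understanding of the Dundee NFS SR driver and may change before release, and the server and export path are placeholders.

xe sr-create name-label="NFSv4 SR" shared=true type=nfs content-type=user \
    device-config:server=filer.example.com \
    device-config:serverpath=/export/xenserver \
    device-config:nfsversion=4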

Intel GVT-d

XenServer 6.5 SP1 introduced support for Intel GVT-d graphics in Haswell and Broadwell chips. This support has been ported to Dundee and is now present in Alpha 2. At this point GPU operations in Dundee should have feature parity with XenServer 6.5 SP1.

CIFS for virtual disk storage

For some time we've had CIFS as an option for ISO storage, but lacked it for virtual disk storage. That has been remedied, and if you are running CIFS you can now use it for all of your XenServer storage needs.
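
For the CLI-inclined, creating a CIFS/SMB SR for virtual disks should look something like the sketch below. The SR type name and device-config keys are our best guess at the new driver's interface and may change before Dundee ships; the share and credentials are placeholders.

xe sr-create name-label="CIFS SR" shared=true type=smb content-type=user \
    device-config:server=//filer.example.com/vmstore \
    device-config:username=svc_xenserver \
    device-config:password=secret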

Changed dom0 disk size

During installation of XenServer 6.5 and prior, a 4GB partition is created for dom0 with an additional 4GB partition created as a backup. For some users, the 4GB partition was too limiting, particularly if remote SYSLOG wasn't used or when third party monitoring tools were installed in dom0. With Dundee we've completely changed the local storage layout for dom0, and this has significant implications for all users wishing to upgrade to Dundee.

New layout

The new partition layout will consume 46GB from local storage. If there is less than 46 GB available, then a fresh install will fail. The new partition layout will be as follows:

  • 512 MB UEFI boot partition
  • 18 GB dom0 partition
  • 18 GB backup partition
  • 4 GB logs partition
  • 1 GB SWAP partition

As you can see from this new partition layout, we've separated logs and swap out of the main operating partition, and we're now supporting UEFI boot.

Upgrades

During upgrade, if there is at least 46 GB available, we will recreate the partition layout to match that of a fresh install. In the event 46GB isn't available, we will shrink the existing dom0 partition from 4 GB to 3.5 GB and create the 512 MB UEFI boot partition.

Downloading Alpha 2

 

Dundee Alpha 2 is available for download from xenserver.org/prerelease


Preview of XenServer Administrators Handbook

Administering any technology can be both fun and challenging at times. For many, the fun part is designing a new deployment, while for others the hardware selection process, system configuration and tuning, and the actual deployment can be a rewarding part of being an SRE. Then the challenging stuff hits, where the design and deployment become a real part of the everyday inner workings of your company and with it come upgrades, failures, and fixes. For example, you might need to figure out how to scale beyond the original design, deal with failed hardware or find ways to update an entire data center without user downtime. No matter how long you've been working with a technology, the original paradigms often do change, and there is always an opportunity to learn how to do something more efficiently.

That's where a project JK Benedict and I have been working on with the good people of O'Reilly Media comes in. The idea is a simple one. We wanted a reference guide which would contain valuable information for anyone using XenServer - period. If you are just starting out, there would be information to help you make that first deployment a successful one. If you are looking at redesigning an existing deployment, there are valuable time-saving nuggets of info, too. If you are a longtime administrator, you would find some helpful recipes to solve real problems that you may not have tried yet. We didn't focus on long theoretical discussions, and we've made sure all content is relevant in a XenServer 6.2 or 6.5 environment. Oh, and we kept it concise because your time matters.

I am pleased to announce that attendees of OSCON will be able to get their hands on a preview edition of the upcoming XenServer Administrators Handbook. Not only will you be able to thumb through a copy of the preview book, but I'll have a signing at the O'Reilly booth on Wednesday July 22nd at 3:10 PM. I'm also told the first 25 people will get free copies, so be sure to camp out ;)

Now of course everyone always wants to know which animal gets featured on the book cover. As you can see below, we have a bird. Not just any bird, mind you, but a xenops. I didn't do anything to steer O'Reilly towards this, but I find it very cool that we have an animal whose name also matches a very core component in XenServer: xenopsd. For me, that's a clear indication we've created the appropriate content, and I hope you'll agree.

[Image: preview book cover featuring a xenops]


XenServer's LUN scalability

"How many VMs can coexist within a single LUN?"

An important consideration when planning a deployment of VMs on XenServer is around the sizing of your storage repositories (SRs). The question above is one I often hear. Is the performance acceptable if you have more than a handful of VMs in a single SR? And will some VMs perform well while others suffer?

In the past, XenServer's SRs didn't always scale too well, so it was not always advisable to cram too many VMs into a single LUN. But all that changed in XenServer 6.2, allowing excellent scalability up to very large numbers of VMs. And the subsequent 6.5 release made things even better.

The following graph shows the total throughput enjoyed by varying numbers of VMs doing I/O to their VDIs in parallel, where all VDIs are in a single SR.

[Graph: aggregate throughput for varying numbers of VMs doing parallel I/O to a single SR, XenServer 6.1 vs 6.5]

In XenServer 6.1 (blue line), a single VM would achieve a modest 240 MB/s. But, counter-intuitively, adding more VMs to the same SR would cause the total to fall, reaching a low point at around 20 VMs achieving a total of only 30 MB/s, an average of only 1.5 MB/s each!

On the other hand, in XenServer 6.5 (red line), a single VM achieves 600 MB/s, and it only takes three or four VMs to max out the LUN's capabilities at 820 MB/s. Crucially, adding further VMs no longer causes the total throughput to fall; it remains constant at the maximum rate.

And how well distributed was the available throughput? Even with 100 VMs, the available throughput was spread very evenly -- on XenServer 6.5 with 100 VMs in a LUN, the highest average throughput achieved by a single VM was only 2% greater than the lowest. The following graph shows how consistently the available throughput is distributed amongst the VMs in each case:

[Graph: distribution of per-VM throughput across the VMs sharing the LUN]

Specifics

  • Host: Dell R720 (2 x Xeon E5-2620 v2 @ 2.1 GHz, 64 GB RAM)
  • SR: Hardware HBA using FibreChannel to a single LUN on a Pure Storage 420 SAN
  • VMs: Debian 6.0 32-bit
  • I/O pattern in each VM: 4 MB sequential reads (O_DIRECT, queue depth 1, single thread); an approximate fio invocation is sketched below. The graph above has a similar shape for smaller block sizes and for writes.
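
For reference, the per-VM workload described above can be approximated inside a guest with an fio invocation along these lines (a sketch; the device name is a placeholder and the runtime is arbitrary):

fio --name=seqread --filename=/dev/xvdb --rw=read --bs=4M \
    --direct=1 --ioengine=libaio --iodepth=1 --numjobs=1 \
    --runtime=60 --time_based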

When Virtualised Storage is Faster than Bare Metal

An analysis of block size, inflight requests and outstanding data

INTRODUCTION

Back in August 2014 I went to the Xen Project Developer Summit in Chicago (IL) and presented a graph that caused a few faces to go "ahn?". The graph was meant to show how well XenServer 6.5 storage throughput could scale over several guests. For that, I compared 10 fio threads running in dom0 (mimicking 10 virtual disks) with 10 guests running 1 fio thread each. The result: the aggregate throughput of the virtual machines was actually higher.

In XenServer 6.5 (used for those measurements), the storage traffic of 10 VMs corresponds to 10 tapdisk3 processes doing I/O via libaio in dom0. My measurements used the same disk areas (raw block-based virtual disks) for each fio thread or tapdisk3. So how can 10 tapdisk3 processes possibly be faster than 10 fio threads also using libaio and also running in dom0?

At the time, I hypothesised that the lack of indirect I/O support in tapdisk3 was causing requests larger than 44 KiB (the maximum supported request size in Xen's traditional blkif protocol) to be split into smaller requests. And that the storage infrastructure (a Micron P320h) was responding better to a higher number of smaller requests. In case you are wondering, I also think that people thought I was crazy.

You can check out my one year old hypothesis between 5:10 and 5:30 on the XPDS'14 recording of my talk: https://youtu.be/bbdWFB1mBxA?t=5m10s

[Slide: measured throughput as a function of request size]

TRADITIONAL STORAGE AND MERGES

For several years operating systems have been optimising storage I/O patterns (in software) before issuing them to the corresponding disk drivers. In Linux, this has been achieved via elevator schedulers and the block layer. Requests can be reordered, delayed, prioritised and even merged into a smaller number of larger requests.

Merging requests has been around for as long as I can remember. Everyone understands that fewer requests mean less overhead, and that storage infrastructures respond better to larger requests. As a matter of fact, the graph above, which shows throughput as a function of request size, is proof of that: bigger requests mean higher throughput.

It wasn't until 2010 that a proper means to fully disable request merging made it into the Linux kernel. Alan Brunelle showed a 0.56% throughput improvement (and lower CPU utilisation) by not trying to merge requests at all. I wonder if he considered that splitting requests could actually be even more beneficial.
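
For context, the knob that work introduced is still available today: request merging can be disabled per block device through sysfs. A minimal sketch (the device name is a placeholder):

cat /sys/block/nvme0n1/queue/nomerges       # 0 = merging enabled (default)
echo 2 > /sys/block/nvme0n1/queue/nomerges  # 1 = only simple merges, 2 = no merging at all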

SPLITTING I/O REQUESTS

Given the results I have seen on my 2014 measurements, I would like to take this concept a step further. On top of not merging requests, let's forcibly split them.

The rationale behind this idea is that some drives today will respond better to a higher number of outstanding requests. The Micron P320h performance testing guide says that it "has been designed to operate at peak performance at a queue depth of 256" (page 11). Similar documentation from Intel uses a queue depth of 128 to indicate peak performance of its NVMe family of products.

But it is one thing to say that a drive requires a large number of outstanding requests to perform at its peak. It is a different thing to say that a batch of 8 requests of 4 KiB each will complete quicker than one 32 KiB request.

MEASUREMENTS AND RESULTS

So let's put that to the test. I wrote a little script to measure the random read throughput of two modern NVMe drives when facing workloads with varying block sizes and I/O depth. For block sizes from 512 B to 4 MiB, I am particularly interested in analysing how these disks respond to larger "single" requests in comparison to smaller "multiple" requests. In other words, what is faster: 1 outstanding request of X bytes or Y outstanding requests of X/Y bytes?

My test environment consists of a Dell PowerEdge R720 (Intel E5-2643 v2 @ 3.5 GHz, 2 sockets, 6 cores/socket, HT enabled) with 64 GB of RAM running Debian Jessie 64-bit and the Linux 4.0.4 kernel. My two disks are an Intel P3700 (400 GB) and a Micron P320h (175 GB). Fans were set to full speed and the power profiles are configured for OS control, with a performance governor in place.

#!/bin/bash
# Sweep total outstanding data sizes from 512 B to 4 MiB. For each size,
# try every way of splitting it into 2^n equal requests (down to 512 B)
# and record the random-read throughput reported by fio.
sizes="512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288 \
       1048576 2097152 4194304"
drives="nvme0n1 rssda"

for drive in ${drives}; do
    for size in ${sizes}; do
        for ((qd=1; size/qd >= 512; qd*=2)); do
            bs=$(( size / qd ))     # request size = total outstanding data / queue depth
            tp=$(fio --terse-version=3 --minimal --rw=randread --numjobs=1  \
                     --direct=1 --ioengine=libaio --runtime=30 --time_based \
                     --name=job --filename=/dev/${drive} --bs=${bs}         \
                     --iodepth=${qd} | awk -F';' '{print $7}')
            echo "${size} ${bs} ${qd} ${tp}" | tee -a ${drive}.dat
        done
    done
done

There are several ways of looking at the results. I believe it is always worth starting with a broad overview including everything that makes sense. The graphs below contain all the data points for each drive. Keep in mind that the "x" axis represents block size (in KiB) over queue depth.

[Graph: Intel P3700 random read throughput for all block size / queue depth combinations]

[Graph: Micron P320h random read throughput for all block size / queue depth combinations]

While the Intel P3700 is faster overall, both drives share a common trait: for a certain amount of outstanding data, throughput can be significantly higher if that data is split over several inflight requests instead of a single large request. Because this workload consists of random reads, this characteristic is not evident on spinning disks (where the seek time would negatively affect the total throughput of the workload).

To make this point clearer, I have isolated the workloads involving 512 KiB of outstanding data on the P3700 drive. The graph below shows that if a workload randomly reads 512 KiB of data one request at a time (queue depth 1), the throughput will be just under 1 GB/s. If, instead, the workload reads 8 KiB of data with 64 outstanding requests at a time, the throughput roughly doubles to just under 2 GB/s.
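
To make the comparison concrete, those two data points correspond to fio invocations along these lines (a sketch using the same options as the script above; the device name is a placeholder):

# 512 KiB outstanding as a single request at a time (queue depth 1)
fio --name=qd1 --filename=/dev/nvme0n1 --rw=randread --bs=512k --iodepth=1 \
    --direct=1 --ioengine=libaio --runtime=30 --time_based
# the same 512 KiB outstanding as 64 concurrent 8 KiB requests
fio --name=qd64 --filename=/dev/nvme0n1 --rw=randread --bs=8k --iodepth=64 \
    --direct=1 --ioengine=libaio --runtime=30 --time_based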

[Graph: P3700 throughput for 512 KiB of outstanding data at varying queue depths]

CONCLUSIONS

Storage technologies are constantly evolving. At this point in time, it appears that hardware is evolving much faster than software. In this post I have discussed a paradigm of workload optimisation (request merging) that perhaps no longer applies to modern solid state drives. As a matter of fact, I am proposing that the exact opposite (request splitting) should be done in certain cases.

Traditional spinning disks have always responded better to large requests. Such workloads reduced the overhead of seek times where the head of a disk must roam around to fetch random bits of data. In contrast, solid state drives respond better to parallel requests, with virtually no overhead for random access patterns.

Virtualisation platforms and software-defined storage solutions are perfectly placed to take advantage of such paradigm shifts. By understanding the hardware infrastructure they sit on top of, as well as the workload patterns of their users (e.g. virtual desktops), requests can easily be manipulated to make better use of system resources.


New ticket statuses on bugs.xenserver.org

I just wanted to mention that we've created a couple of new ticket statuses on bugs.xenserver.org, which should make it clearer where a ticket has got to in its lifecycle.

Acknowledged Issue will be used when we've done initial triage on the issue, determined that it is a genuine problem, and made a ticket in our internal issue tracker for it. Previously we've varied between leaving these Open, and closing them as Done, but neither choice was very satisfactory, so we decided that a new status was needed. Even when an issue is acknowledged, there is no guarantee when it will be fixed; however, when it has been fixed in the latest nightly builds, we will come back and move the ticket to Done.

Wishlist is for feature requests and the like, as opposed to bugs. They will generally be assigned to our Product Management team. Again, they have a different lifecycle because, while we welcome the feedback on what features are important to you, they may be implemented only much later, if at all, depending on all the priorities which the Product Managers are receiving from different sources.

Several tickets have been converted to these two new statuses already, and hopefully this will give you a clearer idea of what we're doing with them than just leaving them all as Open.


Configuring XenApp to use two NVIDIA GRID engines

SUMMARY

The configuration of a XenApp virtual machine (VM) hosted on XenServer that supports two concurrent graphics processing engines in passthrough mode is shown to work reliably, giving a single XenApp VM more flexibility rather than having to spread access to the engines over two separate XenApp VMs. This in turn can save operating system licensing costs and could potentially be extended to incorporate additional GPU engines.

INTRODUCTION

A XenApp virtual machine (VM) that supports two or more concurrent graphics processing units (GPUs) has a number of advantages over running separate VM instances, each with its own GPU engine. For one, if users happen to be unevenly relegated to particular XenApp instances, some XenApp VMs may idle while other instances are overloaded, to the detriment of users associated with the busy instances. It is also simpler to add capacity to such a VM than to build and license yet another Windows Server VM. This study made use of an NVIDIA GRID K2 (driver release 340.66), comprising two Kepler GK104 engines and 8 GB of GDDR5 RAM (4 GB per GPU). It is hosted in a base system consisting of a Dell R720 with dual Intel Xeon E5-2680 v2 CPUs (40 VCPUs total, hyperthreaded) running XenServer 6.2 SP1, which hosts XenApp 7.6 as a VM with 16 VCPUs and 16 GB of memory on Windows 2012 R2 Datacenter.

PROCEDURE

It is important to note that these steps constitute changes that are not officially supported by Citrix or NVIDIA and are to be regarded as purely experimental at this stage.

Registry changes to XenApp were made according to these instructions provided in the Citrix Product Documentation.

On the XenServer, first list devices and look for GRID instances:
# lspci|grep -i nvid
44:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K2] (rev a1)
45:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K2] (rev a1)

Next, get the UUID of the VM:
# xe vm-list
uuid ( RO)           : 0c8a22cf-461f-0030-44df-2e56e9ac00a4
     name-label ( RW): TST-Win7-vmtst1
    power-state ( RO): running
uuid ( RO)           : 934c889e-ebe9-b85f-175c-9aab0628667c
     name-label ( RW): DEV-xapp
    power-state ( RO): running

Get the address of the existing GPU engine, if one is currently associated:
# xe vm-param-get param-name=other-config uuid=934c889e-ebe9-b85f-175c-9aab0628667c
vgpu_pci: 0/0000:44:00.0; pci: 0/0000:44:0.0; mac_seed: d229f84d-73cc-e5a5-d105-f5a3e87b82b7; install-methods: cdrom; base_template_name: Windows Server 2012 (64-bit)
(Note: ignore any vgpu_pci parameters that are irrelevant now to this process, but may be left over from earlier procedures and experiments.)

Dissociate the GPU via XenCenter or via the CLI, set GPU type to “none”.
Then, add both GPU engines following the recommendations in assigning multiple GPUs to a VM in XenServer using the other-config:pci parameter:
# xe vm-param-set uuid=934c889e-ebe9-b85f-175c-9aab0628667c
   other-config:pci=0/0000:44:0.0,0/0000:45:0.0
In other words, do not use the vgpu_pci parameter at all.

Check if the new parameters took hold:
# xe vm-param-get param-name=other-config uuid=934c889e-ebe9-b85f-175c-9aab0628667c params=all
vgpu_pci: 0/0000:44:00.0; pci: 0/0000:44:0.0,0/0000:45:0.0; mac_seed: d229f84d-73cc-e5a5-d105-f5a3e87b82b7; install-methods: cdrom; base_template_name: Windows Server 2012 (64-bit)
Next, turn GPU passthrough back on for the VM in XenCenter or via the CLI and start up the VM.

On the XenServer you should now see no GPUs available:
# nvidia-smi
Failed to initialize NVML: Unknown Error
This is good, as both K2 engines now have been allocated to the XenApp server.
On the XenServer you can also run "xn -v pci-list 934c889e-ebe9-b85f-175c-9aab0628667c" (the UUID of the VM) and should see the same two PCI devices allocated:
# xn -v pci-list 934c889e-ebe9-b85f-175c-9aab0628667c
id         pos bdf
0000:44:00.0 2   0000:44:00.0
0000:45:00.0 1   0000:45:00.0
More information can be gleaned from the “xn diagnostics” command.

Next, log onto the XenApp VM and check settings using nvidia-smi.exe. The output will resemble that of the image in Figure 1.

 

Figure 1. Output from the nvidia-smi utility, showing the allocation of both K2 engines.


Note the output correctly shows that 4096 MiB of memory is allocated for each of the two engines in the K2, totaling its full capacity of 8192 MiB. XenCenter will still show only one GPU engine allocated (see Figure 2), since it is not aware that both are allocated to the XenApp VM and currently has no way of making that distinction.

 

Figure 2. XenCenter GPU allocation, showing just one engine (all that XenServer is currently capable of displaying).

 

So, how can you tell if it is really using both GRID engines? If you run the nvidia-smi.exe program on the XenApp VM itself, you will see it has two GPUs configured in passthrough mode (see the earlier screenshot in Figure 1). Depending on how apps are launched, you will see one or the other or both of them active. As a test, we ran two concurrent Unigine "Heaven" benchmark instances; both came out with metrics within 1% of each other, and of the result when just one instance was run, and both engines showed as being active. Displayed in Figure 3 is a sample screenshot of the Unigine "Heaven" benchmark running with one active instance; note that it sees both K2 engines present, even though the process is making use of just one.


Figure 3. A sample Unigine "Heaven" benchmark frame. Note the two sets of K2 engine metrics displayed in the upper right corner.


It is evident from the display in the upper right hand corner that one engine has allocated memory and is working, as evidenced by the correspondingly higher temperature reading and memory frequency. The result of a benchmark using OpenGL at a 1024x768 pixel resolution is seen in Figure 4. Note again the difference between what is shown for the two engines, in particular the memory and temperature parameters.

Figure 4. Outcome of the benchmark. Note the higher memory and temperature on the second K2 engine.

 

When another instance is running concurrently, you see its memory and temperature also rise accordingly in addition to the load evident on the first engine, as well as activity on both engines in the output from the nvidia-smi.exe utility (Figure 5).


Figure 5. Two simultaneous benchmarks running, using both GRID K2 engines, and the nvidia-smi output.

You can also see with two instances running concurrently how the load is affected. Note in the performance graphs from XenCenter shown in Figure 6 how one copy of the “Heaven” benchmark impacts the server and then about halfway across the graphs, a second instance is launched.

Figure 6. XenCenter performance metrics of first one, then a second, concurrent Unigine "Heaven" benchmark.


CONCLUSIONS

The combination of two GRID K2 engines associated with a single, hefty XenApp VM works well for providing adequate capacity to support a number of concurrent users in GPU passthrough mode without the need to host additional XenApp instances. As there is a fair amount of leeway in the allocation of CPUs and memory to a virtualized instance under XenServer (up to 16 vCPUs and 128 GB of memory under XenServer 6.2 when these tests were run), one XenApp VM should be able to handle a reasonably large number of tasks. As many as six concurrent sessions of this high-demand benchmark with 800x600 high-resolution settings have been tested without saturating the GPUs. A more typical application, like Google Earth, consumes around 3 to 5% of the cycles of a GRID K2 engine per instance during active use, depending on the activity and size of the window, so fairly minimal. In other words, twenty or more sessions could be handled by each engine, or potentially 40 or more for the entire GRID K2 with a single XenApp VM, provided of course that the XenApp VM's memory and its own CPU resources are not overly taxed.

XenServer 6.2 already supports as many as eight physical GPUs per host, so as servers expand, one could envision having even more available engines that could be associated with a particular VM. Under some circumstances, passthrough mode affords more flexibility and makes better use of resources compared to creating specific vGPU assignments. Windows Server 2012 R2 Datacenter supports up to 64 sockets and 4 TB of memory, and hence should be able to support a significantly larger number of associated GPUs. XenServer 6.2 SP1 has a processor limit of 16 VCPUs and 128 GB of virtual memory. XenServer 6.5, officially released in January 2015, supports up to four K2 GRID cards in some physical servers and up to 192 GB of RAM per VM for some guest operating systems as does the newer release documented in the XenServer 6.5 SP1 User's Guide, so there is a lot of potential processing capacity available. Hence, a very large XenApp VM could be created that delivers a lot of raw power with substantial Microsoft server licensing savings. The performance meter shown above clearly indicates that VCPUs are the primary limiting factor in the XenApp configuration and with just two concurrent “Heaven” sessions running, about a fourth of the available CPU capacity is consumed compared to less than 3 GB of RAM, which is only a small additional amount of memory above that allocated by the first session.

These same tests were run after upgrading to XenServer 6.5 and with newer versions of the NVIDIA GRID drivers and continue to work as before. At various times, this configuration was run for many weeks on end with no stability issues or errors detected during the entire time.

ACKNOWLEDGEMENTS

I would like to thank my co-worker at NAU, Timothy Cochran, for assistance with the configuration of the Windows VMs used in this study. I am also indebted to Rachel Berry, Product Manager of HDX Graphics at Citrix and her team, as well as Thomas Poppelgaard and also Jason Southern of the NVIDIA Corporation for a number of stimulating discussions. Finally, I would like to greatly thank Will Wade of NVIDIA for making available the GRID K2 used in this study.


Security bulletin covering VENOM

Last week a vulnerability in QEMU was reported with the marketing name of "VENOM", but which is more correctly known as CVE-2015-3456. Citrix has released a security bulletin covering CVE-2015-3456, which has been updated to include hotfixes for XenServer 6.5, 6.5 SP1 and XenServer 6.2 SP1.

Learning about new XenServer hotfixes

When a hotfix is released for XenServer, it will be posted to the Citrix support web site. You can receive alerts from the support site by registering at http://support.citrix.com/profile/watches and following the instructions there. You will need to create an account if you don't have one, but the account is completely free. Whenever a security hotfix is released, there will be an accompanying security advisory in the form of a CTX knowledge base article for it, and those same KB articles will be linked on xenserver.org in the download page.

Patching XenServer hosts

XenServer admins are encouraged to schedule patching of their XenServer installations at their earliest opportunity. Please note that this bulletin does impact XenServer 6.2 hosts, and to apply the patch, all XenServer 6.2 hosts will first need to be patched to Service Pack 1, which can be found on the XenServer download page.
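
As a reminder, hotfixes can be applied through XenCenter or from the CLI on the pool master. A minimal sketch of the CLI route (the filename is a placeholder for the hotfix you downloaded from the bulletin):

PATCH_UUID=$(xe patch-upload file-name=XS65E003.xsupdate)    # upload returns the patch UUID
xe patch-pool-apply uuid=${PATCH_UUID}                       # apply to every host in the pool
xe patch-list uuid=${PATCH_UUID} params=hosts                # confirm which hosts have it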


XenServer 6.5 SP1 Released

Wait, another XenServer release? Yes folks, there is no question we've been very busy improving upon XenServer over the past year, and the pace is quite fast. In case you missed it, we released XenServer 6.5 in January (formerly known as Creedence). Just a few weeks ago I announced and made available pre-release binaries for Dundee, and now we've just announced availability at Citrix Synergy of the first service pack for XenServer 6.5. Exciting times indeed.

What's in XenServer 6.5 SP1

I could bury the lede and talk about hotfixes and roll-ups (more on that later), but the real value of SP1 is in the increased capabilities. Here are the lead items for this service pack:

  1. The Docker work we previewed in January at FOSDEM and later on xenserver.org is now available. If you've been using xscontainer in preview form, it should upgrade fine, but you should back up any VMs first. Completion of the Docker work also means that CoreOS 633.1.0 is now an officially supported operating system with SP1. Containers deployed in Ubuntu 14.04 and in RHEL, CentOS, and Oracle Enterprise Linux 7 and higher are supported.
  2. Adoption of LTS (long term support) guest support. XenServer guest support has historically required users of a guest operating system to wait for XenServer to adopt official support for point releases in order to remain in a supported configuration. Starting with SP1, all supported operating systems can be upgraded within their major version and still retain "supported" status from Citrix support. For example, if a CentOS 6.6 VM is deployed, and the CentOS project subsequently releases CentOS 6.7, then upgrading that VM to CentOS 6.7 requires no changes to XenServer in order to remain a supported configuration.
  3. Intel GVT-d support for GPU pass-through for Windows guests. This allows users of Xeon E3 Haswell processors to use the embedded GPU in those processors within a Windows guest using standard Intel graphics drivers.
  4. NVIDIA GPU pass-through to Linux VMs, allowing OpenGL and CUDA support for these operating systems.
  5. Installation of supplemental packs can now be performed through XenCenter Update. Note that since a driver disk is only a special case of a supplemental pack, driver updates, or installation of drivers not required for host installation, can now also be performed using this mechanism.
  6. Virtual machine density has been increased to 1000. This means that if you have a server which can reasonably be expected to run 1000 VMs of a given operating system, then using XenServer you can do so. No changes were made to the supported hardware configuration to accommodate this change.

Hotfix process

As with all XenServer service packs, XenServer 6.5 SP1 contains a roll-up of all existing hotfixes for XenServer 6.5. This means that when provisioning a new host, your first post-installation step should be to apply SP1. It's also important to call out that when a service pack is released, hotfixes for the prior service pack level will only be created for a further six months. In this case, hotfixes for XenServer 6.5 will only be created through November 12th, and after that point hotfixes will only be created for XenServer 6.5 SP1. In order for the development teams to streamline that transition, any defects raised for XenServer 6.5 in bugs.xenserver.org should be raised against 6.5 SP1 and not base 6.5.

Where to get XenServer 6.5 SP1

 

Downloading XenServer 6.5 SP1 is very easy: simply go to http://xenserver.org/download and download it!


Introducing XenServer Dundee

It's spring time and after a particularly brutal winter here in Boston, I for one am happy to see the signs of spring. Grass and greenery, flowers budding, and warmer days all speak to good things coming. It's also time to unveil the next major XenServer project, code named Dundee. As with Creedence last year, we're going to be giving early access to a major new version of XenServer well in advance of its release. This project will have its share of functional improvements, and a few new features, but just like last year we're going to start with the platform and progress slowly.

CentOS 7 dom0

During the Creedence pre-release program, many commented along the lines of "Why CentOS 5.x? CentOS 6 has been out for a while, and 7 is fresh." The answer to that question was pretty simple. We knew what userspace looked like with CentOS 5.x, and our users understood how to manage a CentOS 5.x system. CentOS 5 was being supported upstream until 2017, so there was no risk of us shipping something unsupported. Moving to CentOS 6.5 would have been a valid option if we hadn't already planned to move to CentOS 7, but we didn't want to change dom0 only to change it again in a year's time. Plus, if you recall, we took on quite a bit with Creedence in 2014.

So we're now a year later, and CentOS 7 makes perfect sense for dom0. Not only are there a few more upstream patches available, but Linux admins are now more comfortable with the changes in management paradigm. It's also those changes in paradigm which may present issues for you, our users, and why this first alpha is all about validation. If you manage XenServer from a tool which uses the xapi SDK, then you shouldn't experience too many problems. On the other hand, if you have favorite scripts, or tweaks you've made to configuration files, then you could be in for some extra work.

Now is also a perfect time to remind everyone that when you "upgrade" a XenServer, it's not an in-place upgrade. We preserve the configuration files we know about, and then dom0 is reimaged. Any third party packages you have installed, custom scripts, and manual configuration changes have a good chance of being lost unless you've backed them up. In this case, with a move to CentOS 7, it's also possible that those items will need to be reworked to some degree.

Understanding the pre-release process

All pre-release downloads will be on our pre-release download page. We'll be providing new tagged builds every few weeks, and generally as we achieve internal milestones. With each build, we'll call out something which you as an interested participant in XenServer Dundee should be looking at. Issues encountered can be logged in the incident database at https://bugs.xenserver.org. Since we've more than one version of XenServer covered in the incident database, please make certain you report Dundee issues under the "Dundee" version. Of course there is no guarantee we'll be able to resolve what you find, but we do want to know about it. With this first alpha, we’re interested in the “big issues” you may hit, i.e. areas which would block usage of features or functionality or cases where there is a major impact. These are really useful as the product develops and matures during the alpha stage. If you are developing something for XenServer, we invite you to ask your questions on the development mailing list, but do remember it's not a product support list.

Lastly, while we're in a pre-release period, it's also likely you may eventually encounter functionality which may form part of a commercial edition. At this point we're not committing to what functionality will actually ship, when it might ship, or whether it will require a commercial license. I understand that might be concerning, but it shouldn't be. If something is destined for a commercial edition, you'll see it "commercialized" in a Citrix Tech Preview before we release. Historically we're many months away from when a Tech Preview might happen, so right now the most important thing is to focus on the changes we're interested in your feedback on today: everything to do with a CentOS 7 dom0.

Download Dundee alpha.1: http://xenserver.org/preview


About XenServer

XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world's largest clouds and enterprises.
 
Technical support for XenServer is available from Citrix.