Virtualization Blog

Discussions and observations on virtualization.

From the Field: XenServer 7.X Pool Upgrades, Part 2: Updates and Aftermath

The XenServer upgrade process with XenServer 7 can be a bit more involved than with past upgrades, particularly if upgrading from a version prior to 7.0. In this series I’ll discuss various approaches to upgrading XS pools to 7.X and some of the issues you might encounter during the process based on my experiences in the field.

Part 1 dealt with the preliminary actions and planning that need to be taken into consideration, as well as the case for a clean installation. You can go back and review it here. Part 2 deals with the alternative, namely a true upgrade procedure, and the issues that may arise during and after it.

In-Place Upgrades to XS 7

An in-place upgrade to XS 7.X is a whole different beast compared to a clean installation, where the OS is overwritten from scratch. Mostly, it depends on whether you wish to retain the original partition scheme or go with the newer, larger, and more flexible disk partition layout. The bottom line is that despite the added work and potential issues, you may as well go through with the update, as it will make life easier down the road.

Your choices here are to retain the current disk partitioning layout or to switch to the newer partition layout. If you want to know why the new XS disk layout changed, review the “Why was the XenServer 7 Disk Layout Changed?” section in Part 1 of this blog series.

In my experience, issues found with in-place upgrades cover five areas – I’m going to cover them as follows: 

  1. Pre-XS 7 Upgrade, Staying the Course with the Original Partition Layout
  2. The 72% Solution
  3. The (In)famous Dell Utility Partition
  4. XS 7.X to XS 7.X Maintaining the Current Partition Layout
  5. XS 6.X or 7.X to XS 7.X With a Change to the New Partition Layout (plus possible issues on Dell Servers with iDRAC modules)

Pre-XS 7 Upgrade, Staying the Course with the Original Partition Layout

If you choose to stick with the original partition layout, the rest of your installation experience – whether you go with the rolling pool upgrade, or conventional upgrade – will be pretty much the same as before.

As with any such upgrade, make sure to check ahead of time that the pool is ready for this undertaking: 

  • XenCenter has been upgraded
  • The proper backups of VMs, metadata, etc. have been performed
  • All VMs are agile or, in the case of a rolling pool upgrade, at least shut down on all local SRs
  • All hosts are not only running the same version of XenServer, but also have the identical list of hotfixes applied

Depending on the various ages of the hosts and how hotfixes were applied, you could run into the situation where there is a discrepancy. This can be particularly frustrating if an older hotfix has since been superseded and the older version is no longer available, yet shows up in the applied hotfix list. Though I’d only recommend this in dire circumstances (such as this, where the XenServer version is going to be replaced by a new one anyway), there are ways to make XenServers believe they have had the same specific patches applied by manipulating the contents of the /var/update/applied directory (see, for example, this forum thread).
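To spot such a discrepancy before the upgrade complains about it, the applied hotfix lists of two hosts can be compared side by side. The following is only a minimal sketch: the function name hotfix_diff is made up for illustration, and it assumes you have already saved each host's hotfix names to a text file, one name per line (for instance from the output of xe patch-list on releases that provide it).

```shell
# Sketch: report hotfixes that appear in one host's list but not the other's.
# Assumption: each input file holds one hotfix name per line.
hotfix_diff() {
    list_a=$(mktemp) || return 1
    list_b=$(mktemp) || return 1
    # comm requires sorted input; -u also drops duplicate entries
    sort -u "$1" > "$list_a"
    sort -u "$2" > "$list_b"
    # comm -23: lines only in the first file; comm -13: lines only in the second
    comm -23 "$list_a" "$list_b" | sed 's/^/only on first host: /'
    comm -13 "$list_a" "$list_b" | sed 's/^/only on second host: /'
    rm -f "$list_a" "$list_b"
}
```

An empty result means the two lists match; anything printed is a hotfix to chase down before starting.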

One thing you might run into is the following issue, covered in the next section, which incidentally appears to be independent of whichever partition layout you use.

The 72% Solution

This is not so much a solution as a warning about what you may encounter, what to do beforehand to avoid it, and what to do afterwards should you run into it.

This issue apparently crops up frequently during the latter part of the installation process and is characterized by a long wait with a screen showing 72% completion of the installation that can last anywhere from about five minutes to over an hour in extreme cases. 

One apparent cause of this is an exceedingly large number of message files on your current version of XS. If you have the chance, it is worthwhile to examine the area where these are stored and manually clean up any ancient files under /var/lib/xcp/blobs/, in particular in the “messages” and “refs” subdirectories where most of them reside, paying attention also to the symbolic links.
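Before cleaning anything up, it helps to know how many old files are actually there. Here is a small sketch that only counts and never deletes; the function name count_old_files and the 90-day threshold in the usage comment are my own assumptions, not anything XenServer prescribes.

```shell
# Sketch: count regular files under a directory last modified more than N days ago.
# Symbolic links are deliberately not counted (-type f), so review those separately.
count_old_files() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" | wc -l | tr -d ' '
}

# Possible usage on a host (the threshold is an arbitrary example):
#   count_old_files /var/lib/xcp/blobs/messages 90
```

If the count is in the thousands, that is a hint the cleanup is worth doing before you start the upgrade.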

If you do get stuck in the midst of a long-lasting installation you can escape out by pressing ALT+F2 to get into the shell and check the installer logs under /tmp/install-log to see if it has thrown any errors (thanks, Nekta -- @nektasingh -- for that tip!). If all looks OK, continue to wait until the process completes.
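Scrolling through the whole log on a console can be tedious, so a quick filter for suspicious lines may help. This is just a sketch: the function name scan_install_log and the three keywords are assumptions on my part, and the installer may well word its problems differently.

```shell
# Sketch: show log lines that look like problems, with their line numbers.
# -i: case-insensitive, -n: prefix line numbers, -E: extended regular expression
scan_install_log() {
    grep -inE 'error|fail|traceback' "$1"
}

# Possible usage from the installer shell:
#   scan_install_log /tmp/install-log
```

No output (and a non-zero exit status from grep) simply means none of the keywords were found, not that the installation is guaranteed to be healthy.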

The (In)famous Dell Utility Partition

Using the rolling pool upgrade for the first time, I ran into a terrible situation in which the upgrade proceeded all the way to the very end and, just as the final reboot was about to happen, I got an error popping up on the console, which read:

An unrecoverable error has occurred. The error was:

Failed to install bootloader: installing for i386-pc platform.

/usr/sbin/grub-install: warning: your embedding area is unusually small. core.img won’t fit in it..

/usr/sbin/grub-install: warning: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However blocklists are UNRELIABLE and their use is discouraged..

/usr/sbin/grub-install: error: will not process with blocklists.

You can imagine the reaction.

What caused this and what can you do about it? After all, this was something I’d never encountered before in any previous XS upgrades.

It turns out that the reason for this is the Dell utility partition (see, for example, this link). The Dell utility partition is a five-block partition that Dell puts on for its own purposes at the beginning of the disk on many of its servers as they ship. This did not interfere with any installs up to and including XS 6.5 SP1, hence it came as a total surprise when I first encountered it during an XS 7 upgrade.

And while this wasn't an issue in one particular upgrade, which for whatever reason managed to squeak by without any errors being reported whatsoever, under most circumstances it's too small to hold the initial configuration needed to do the installation when installing XS 7.

What's bad is that the XenServer installation apparently doesn't perform any pre-checks to see if that first partition on the disk is big enough. This is the case for both UEFI and Legacy/BIOS boots.

The solution was simply to delete that sda1 partition altogether using fdisk and re-install. Deleting the partition can be performed live on the host prior to the installation process.
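Since the installer does not check for this, you can look for the partition yourself before starting. The sketch below scans fdisk -l style text for a partition whose type is reported as "Dell Utility". The function name find_dell_utility is invented for illustration, and matching on that label is an assumption about how fdisk describes the partition on your particular host, so treat a negative result with some caution.

```shell
# Sketch: print the device name of any partition that fdisk reports as "Dell Utility".
# Reads `fdisk -l` style text on standard input; the first field is the device name.
find_dell_utility() {
    awk '/Dell Utility/ { print $1 }'
}

# Possible usage on a host:
#   fdisk -l /dev/sda | find_dell_utility
```

This only reports; the deletion itself is still the manual fdisk step described above.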

You can then successfully bypass this issue. I have performed this surgical procedure on a number of hosts with success and have not experienced any adverse effects; you do not even require a vorpal sword to accomplish this task.

Future upgrade processes will possibly be more accommodating and perform a pre-check for this, along with looking at other potential inconsistencies such as the lack of VM agility or non-uniformity of applied hotfixes.

XS 6.X or 7.X to XS 7.X With a Change to the New Partition Layout 

This is the trickiest area where most, including me, seem to encounter issues.

One point that should be clarified right away is: 

Any local partition on or partly contained on the system disk to be upgraded is going to get destroyed in the process. 

There are no two ways about this. Hence, plan ahead: either 

  • Storage XenMotion any local VMs to pooled storage, or
  • Export them. 

If you want or need to preserve your existing local storage SRs, you’ll have to stay with the original partition scheme.

Should you decide to update to the new partition scheme, you will need to perform the following action before the upgrade:

# touch /var/preserve/safe2upgrade

The upgrade process (to preserve the pool) will need to be performed using the rolling pool upgrade. The caveat is that if anything goes wrong during any part of the upgrade process, you will have to exit and try to start over. The outcome can vary from it being simple to resume from where you left off, to having things in a broken state, depending on the circumstances.

The pre-checks performed are supposed to catch issues beforehand and allow you to take care of them before launching into the upgrade process, but these do not always trap everything! Above all, make sure that:
  • All VMs are agile, i.e. that there are no VMs running or even resident on any local storage
  • High Availability (HA) and Workload Balancing (WLB) are disabled
  • You have created metadata backups to at least one pooled storage device or exported it to an external location 
  • You have plenty of space to hold VMs on whatever hosts remain in your pool, figuring you’ll have one out at any given point and that initially, only the master will be potentially able to take on VMs from whichever host is in the process of being upgraded
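
Two of the items above, HA and WLB, are easy to overlook and easy to check from the command line. Here is a minimal sketch of such a check. The function name check_pool_flags is hypothetical, and the xe pool-list parameters shown in the usage comment (ha-enabled and wlb-enabled) are what I would expect to query, so verify them on your release before relying on this.

```shell
# Sketch: warn about pool features that should be off before the upgrade.
# Arguments: the literal strings "true" or "false" for HA and WLB, in that order.
# Returns non-zero if either feature is still enabled.
check_pool_flags() {
    ha_enabled=$1
    wlb_enabled=$2
    rc=0
    if [ "$ha_enabled" = "true" ]; then
        echo "HA is still enabled"
        rc=1
    fi
    if [ "$wlb_enabled" = "true" ]; then
        echo "WLB is still enabled"
        rc=1
    fi
    return $rc
}

# Possible usage on the pool master (parameter names are assumptions):
#   check_pool_flags "$(xe pool-list params=ha-enabled --minimal)" \
#                    "$(xe pool-list params=wlb-enabled --minimal)"
```

Keeping the check separate from the xe calls makes it easy to try out on a workstation first.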
Preferably also have recent backups of all your VMs. It’s generally also a good idea to keep good notes about your various host configurations, including all physical and virtual networks, VLANs, iSCSI and NFS connections and the like. I’d recommend doing an “export resource data” for the pool, which has been available as a utility since XS 6.5 and can run from XenCenter.

To export resource data:
  1. In the XenCenter Navigation pane, click Infrastructure and then click on the pool. 
  2. From the XenCenter menu, click Pool and then select Export Resource Data. 
  3. Browse to a location where you would like to save the report and then click Save.

You can choose between an XLS or CSV output. Note that this feature is only available on XenCenter with paid-for licensed versions of XenServer. However, it can also be run via the CLI for any (including free) version of XS 6.5 or newer using:
# xe pool-dump-database file-name=target-output-file-name
Being prepared now for the upgrade, you may still run into issues along the way, in particular if you run out of space to move VMs to a different server or if the upgrade process hangs. If the master server manages to make it through the upgrade and the upgrade process fails at some point later, one of the first consequences will be that you will no longer be able to add any external hosts to the pool because the pool will be in a mixed state of hosts running different XS versions. This is not a good situation and makes it very difficult under some circumstances to recover from.

Recommended Process for Changing XenServer Partition Layout

Through various discussions on the XenServer discussion forum as well as some of my own experimentation, this is what I recommend as the most solid and reliable way of doing an upgrade to XS 7.X with the new partition layout coming from a version that still has the old layout. It will take quite a bit longer, but is more reliable.  

In addition to all the preparatory steps listed above:

  • You will need to eject all hosts from the pool at one point or another, so make sure you have carefully recorded the various network and other settings, in particular those which are not retained as such in the metadata exports, such as the individual iSCSI network connections or NFS connections.
  • Copy your /etc/fstab file and also be sure to check for any other customizations you may have done, such as cron jobs or additions to /etc/rc.local (and also note that rc.local does not run on its own under XS 7.X, so you will need to manually enable it to do so – see the section “Enabling rc.local to Run On Its Own” below).

Once your enhanced preparation is complete: 

  1. Start with your pool master and do a rolling pool upgrade to it. Do not attempt to upgrade to the new partition layout! After it reboots, it should still have retained the role of pool master. Note that the rolling pool upgrade will not allow you to pick which host it will upgrade next beyond the master so be certain that any of these hosts can have all its VMs fit on the pool master. If necessary move some or all the VMs on the pool master onto other hosts within the pool before you commence the upgrade procedure.
  2. Pick another host, migrate all its VMs to the pool master, and upgrade that host the same way.
  3. Follow this procedure for all hosts until the pool is completely upgraded to the identical version of XS 7.X on all pool members. You can either continue to use the rolling pool upgrade or if desired, switch to manual upgrade mode.
  4. Making sure you’ve carefully recorded all important storage and network information on that host, migrate all VMs off a non-master host and eject it from the pool.
  5. Touch the file /var/preserve/safe2upgrade and plan on the local SR being wiped and re-created. Then shut down the host and perform a standalone installation to that just ejected host. It will have the new partition layout.
  6. Reduce the network settings on this host to just a single primary management interface NIC, one that matches the same one it had initially. Rejoin this just upgraded host back into the pool. Many of the pool metadata settings should automatically get recreated, in particular any of the pooled resources. Update any host-specific network settings as well as other customizations.
  7. Continue this process for the remainder of the non-master hosts.
  8. When these have all been completed, designate a new pool master using the command “xe pool-designate-new-master host-uuid=new-master-uuid”. Make sure the transition to the new pool master is properly completed. 
  9. Eject what was the original pool master and perform the standalone upgrade and rejoin the host to the pool as before.
  10. Celebrate your hard work!
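
The ordering in steps 4 through 9 is the part that is easiest to get wrong, so it can help to write the plan down before touching anything. The sketch below prints such a plan as a dry run and changes nothing. The function name plan_reinstall_order and the host names in the usage comment are hypothetical, and picking the first non-master host as the new master is purely my assumption for the sake of the example.

```shell
# Sketch: print a dry-run plan with every non-master host first and the
# original master last. $1 is the current master; the rest are all pool members.
plan_reinstall_order() {
    master=$1
    shift
    new_master=""
    for host in "$@"; do
        if [ "$host" = "$master" ]; then
            continue
        fi
        if [ -z "$new_master" ]; then
            new_master=$host
        fi
        echo "eject, reinstall, rejoin: $host"
    done
    if [ -z "$new_master" ]; then
        echo "no non-master hosts given" >&2
        return 1
    fi
    # Only after all other hosts are done does the master role move
    echo "designate new master: $new_master"
    echo "eject, reinstall, rejoin: $master"
}

# Possible usage (host names are examples):
#   plan_reinstall_order xs1 xs1 xs2 xs3
```

Printing the plan first also gives you a checklist to tick off as each host comes back into the pool.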
While this process will easily take two times as long as a standard upgrade, at least you know that things should not break so badly that you may have to spend hours of additional time making things right. As a side benefit (if you want to consider it as such), it will also force you to take stock of how your pool is configured and require you to take careful inventory. In the event of a disaster, you will be grateful to have gone through this process as it may be very close to what you may have to go through under less controlled circumstances!

I have done this myself several times and had it work correctly each time.

An Additional Item: If the Console Shows No Network Configured on Dell Servers with iDRAC modules

This condition can show up unexpectedly and while normally something like this can be handled if the host is found to be in emergency mode with an “xe pool-recover-slaves” command, that’s not always the case. And even more oddly, if instead you ssh in and run xsconsole from the CLI, all looks perfectly normal, including all the network settings that appear present and correct and also match the settings visible in XenCenter. This condition, as far as I know, seems unique to Dell servers and was seen with iDRAC 7, 8 and 9 modules.

The issue here appears to have been a change in behavior that kicked in starting with XenServer 7.1 and hence may not even be evident in XS 7.0.

The fix turned out to be to upgrade all the BIOS/firmware and iDRAC configurations. In this particular case, I made use of the method I described in this blog entry and that took care of it. Note that this still does not seem to consistently address this issue, in particular with some older hardware (e.g., R715 with an iDRAC 6, even after updating to DSU 1.4.2, BIOS 3.2.1, but being stuck at iDRAC version 1.97 – apparently not upgradeable).

Enabling rc.local To Run On Its Own

The rc.local file is not automatically executed by default under RHEL/CentOS 7 installations, and since XS 7 is based on CentOS 7, it is no exception. If you wish to customize your environment and assure this file is executed at boot time on XenServer 7.X, you will have to manually enable /etc/rc.local by running these two commands:

# chmod u+x /etc/rc.d/rc.local
# systemctl start rc-local

You can verify that rc.local is now running with the following command, which in turn should produce output similar to what is shown below:

# systemctl status rc-local

rc-local.service - /etc/rc.d/rc.local Compatibility
   Loaded: loaded (/usr/lib/systemd/system/rc-local.service; static; vendor preset: disabled)
   Active: active (exited) since Sun 2017-06-18 09:09:50 MST; 1 weeks 6 days ago
  Process: 4649 ExecStart=/etc/rc.d/rc.local start (code=exited, status=0/SUCCESS)
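
After a reinstall it is easy to forget the chmod step, and as far as I understand it the execute bit is what decides whether rc.local gets run at boot. A tiny check like the following can go into your post-install notes. The function name rc_local_ready is made up for this sketch.

```shell
# Sketch: succeed only if the given rc.local file exists and is executable.
rc_local_ready() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Possible usage on a host:
#   rc_local_ready /etc/rc.d/rc.local || echo "rc.local will not run at boot"
```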

A Final Word

Once again, I will reiterate that feedback is always appreciated and the XenServer forum is one good option. Errors, on the other hand, should be reported to the XenServer bugs site with as much information as possible to help the engineers understand and reproduce the issues.

I sincerely hope some of the information in this series will have been useful to some degree.

I would like to most gratefully acknowledge Andrew Wood, fellow Citrix Technology Professional, for the review of and constructive additions to this blog.


Recent comment in this post
Tobias Kreidl
I would like to add that an option for a more efficient, yet safe update procedure, would be to follow what's in the article, but ... Read More
Thursday, 06 July 2017 19:00

From the Field: XenServer 7.X Pool Upgrades, Part 1: Background Information and Clean Install

The XenServer upgrade process with XenServer 7 can be a bit more involved than with past upgrades, particularly if upgrading from a version prior to 7.0. In this series I’ll discuss various approaches to upgrading XenServer (XS) pools to 7.X and some of the issues you might encounter during the process based on my experiences in the field.

Part 1 deals with some of the preliminary actions that need to be taken into consideration and the planning process, as well as the case for a clean installation. Part 2 deals with the alternative, namely a true in-place upgrade procedure, and what issues may arise during and after the procedure.


Why Upgrade to XenServer 7?

XenServer (XS) 7 has been out for a little over a year now and is currently up to release 7.2. A number of improvements in scaling, functionality and features are available in XS 7, which should encourage users to upgrade to take advantage of them.


First Step in Any XS Upgrade

The first step in any XenServer upgrade is to upgrade your XenCenter to at least the minimum version required to support the newest installation of XenServer you are or will be accessing.

This step is crucial. Failure to do this causes puppies to die. This is because older versions of XenCenter won’t connect to newer versions of XS and in many cases, new features are built into the newer versions of XenCenter that are not available in older versions and support important components in the newer XS releases, including hotfix and rolling pool upgrade options, changes in licensing, and others such as supported pooled storage options. In short, your upgrade process will rapidly come to a screeching halt and there may be potential damage done unless you are working with the proper version of XenCenter.

It is also highly recommended to make backups of anything and everything prior to upgrading. More on that will be discussed later.


Summary of Pre-Upgrade Checks

From experience, the following are key activities to perform before upgrading any XenServer pool:

  • XenCenter has been upgraded to at least the version that does or will match your newest pool
  • Assure the proper backups of virtual machines (VMs), metadata, etc. have been performed
  • Make sure that all VMs are agile (can be migrated to other hosts within the pool) or in the case of a rolling pool upgrade, at least shut down on all local SRs
  • All hosts must not only be running the same version of XenServer, but must have the identical list of hotfixes applied
  • Clean up old messages in /var/lib/xcp/blobs/

I’d also recommend doing an “export resource data” for the pool, which has been available as a utility since XS 6.5 and can run from XenCenter.

To export resource data:

  1. In the XenCenter Navigation pane, click Infrastructure and then click on the pool.
  2. From the XenCenter menu, click Pool and then select Export Resource Data.
  3. Browse to a location where you would like to save the report and then click Save.

You can choose between an XLS or CSV output. Note that this feature is only available on XenCenter with paid-for licensed versions of XenServer. However, it can also be run via the CLI for any (including free) version of XS 6.5 or newer using:

# xe pool-dump-database file-name=target-file-name


Benefits of a XenServer 7 Upgrade

XenServer 7 is a significant jump from XS 6.5 SP1 in many ways, including improvements in scale and performance. It has also added a number of very nice new features (see what's new in XS 7; what's new in XS 7.1; XS 7.2).

While any upgrade can be a bit intimidating, ones that jump to a whole new edition can be more complicated and given the changes involved here, this case was no exception. I would also recommend trying to verify your hardware is present on the HCL, though it’s often slow to get populated. If in doubt and you have the extra resources, take a spare server that is as close as possible to your production hardware and try an installation on it first to make sure it’s compatible. If you don’t have such a machine but have the spare capacity, eject one of the hosts from your pool and use it to test out the upgrade. Before performing an upgrade on a production server or pool, it is always recommended to try this out first on a test configuration.

I do not recommend at this time the rolling pool or standalone upgrade that involves switching at the same time from the old to the new partition layout. Overall experiences within the user community have not been consistently positive.

It is hoped that the preparatory steps that are recommended before undertaking an upgrade do not prove to be too daunting. This series is intended to help with those preparations as well as give guidelines and tips how to go about the process and what to do if issues arise. There are clearly a number of ways to tackle upgrades and there are undoubtedly numerous other possible approaches. One might, for example, storage XenMotion VMs to other pools or make use of temporary pooled storage as part of the process. The detaching and attaching pooled storage to servers already running a newer version of XenServer is another possible route to explore.

Experimentation is always something that should be encouraged, albeit not on production systems!

Feedback is always appreciated and the XenServer forum is one good option. Errors, on the other hand, should be reported to the XenServer bugs site with preferably sufficient detail to allow the engineers to be able to understand and reproduce the issues.


XS 7.X Leveraging a Clean Installation as an Upgrade Component

A clean install can become part of an upgrade procedure if, for example, an upgraded pool already exists and individual hosts or hosts from a whole different pool are to be merged into a pool running a newer version of XenServer. Hence talking about this option is relevant to upgrading. You can potentially Storage XenMotion VMs over to a new pool, then eject and dissolve individual pool members, perform a clean installation on them, and join the hosts to the new pool. Of course the target pool must have adequate resources to hold and run the VMs being moved over until the new hosts can be configured and added. This process might involve leveraging temporary external storage, for example, by creating an NFS SR.

The easiest installation scenario is a clean install, which has thankfully always been the case and this approach may work well in some instances. This is obviously suited best for a standalone server or as mentioned above, creating a server that will be joined to an existing pool.

Regardless of which upgrade process you undertake, be sure to review your VMs afterwards to see if XenTools need to be updated.


A Few Words About the XenServer 7 New Disk Layout

The main item to be aware of is that a clean installation will always create the new GPT partition layout as follows:



Size     Purpose
4 GB     log partition (syslogd and other logs)
18 GB    backup partition for dom0 (for upgrades)
18 GB    XenServer host control domain (dom0) partition
0.5 GB   UEFI boot partition
1 GB     swap partition


This adds up to 41.5 GB. There can be a sixth partition which is optionally used for creating a local storage repository (SR) or a portion of a local SR if spanning it to other storage devices. See also James Bulpin's XS 7 blog for more details about the XS 7 architecture.
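
If you want a rough idea of how much room that leaves for a local SR on a given disk, the arithmetic is simple enough to script. This sketch assumes the 41.5 GB of system partitions can be treated as binary gigabytes (GiB) and ignores alignment and partition table overhead, so the result is an estimate only. The function name local_sr_estimate_gib is made up for illustration.

```shell
# Sketch: estimate the GiB left over for a local SR.
# $1 is the disk size in bytes; 41.5 is the sum of the five system partitions.
local_sr_estimate_gib() {
    awk -v bytes="$1" 'BEGIN { printf "%.3f\n", bytes / 1073741824 - 41.5 }'
}

# Possible usage, taking the byte count from the "Disk /dev/sda:" line of fdisk -l:
#   local_sr_estimate_gib 146163105792
```

A negative number means the disk is too small for the new layout, never mind a local SR.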

Here is what a real-life partition table looks like after performing an installation. Note that the space for a local SR ends up in partition 3 (in this case, a default LVM partition 94.6 GB in size):

# fdisk -l /dev/sda

WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sda: 146.2 GB, 146163105792 bytes, 285474816 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt

#         Start          End    Size  Type            Name
 1     46139392     83888127     18G  Microsoft basic
 2      8390656     46139391     18G  Microsoft basic
 3     87033856    285474782   94.6G  Linux LVM
 4     83888128     84936703    512M  BIOS boot parti
 5         2048      8390655      4G  Microsoft basic
 6     84936704     87033855      1G  Linux swap
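
The Size column can be double-checked from the Start and End columns, which is handy when comparing a host against the expected layout. The sketch below converts an inclusive sector range to MiB, assuming the 512-byte sectors reported in the listing. The function name sectors_to_mib is my own.

```shell
# Sketch: convert an inclusive start/end sector range to MiB.
# Assumes 512-byte sectors, as shown in the "Units" line of the fdisk output.
sectors_to_mib() {
    echo $(( ($2 - $1 + 1) * 512 / 1048576 ))
}

# Possible usage with partition 1 from the listing above:
#   sectors_to_mib 46139392 83888127
```

For partition 1 this works out to 18432 MiB, which is the 18G shown by fdisk.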


Why was the XenServer 7 Disk Layout Changed?

The XS 6.5 SP1 and earlier partition scheme was much smaller and simpler:

  • 4 GB XenServer host control domain (dom0) partition
  • 4 GB backup partition for dom0 (for upgrades)

The rest of the space was again available for use as a local SR in a third partition.

The default partition layout was changed under XS 7 for a number of reasons. For one, most storage devices these days start with a capacity of a few hundred GB and so it makes sense to be able to allocate more space for the XenServer installation itself beyond just 8 GB. In that regard, running out of space has been a common problem caused by either a lot of hotfixes being stored or the accumulation of a lot of log files. Isolating /var/log to its own partition now helps keep XS from crashing in case it fills up. Before, /var/log was just another subdirectory on the system “/” partition and inevitably resulted in a system crash when the partition became 100% full. A separate UEFI boot area was also added. Plus, the Linux swap space was assigned to its own partition. In short, there was no reason not to make use of more space and provide an improved partitioning scheme that would also scale better for larger pools containing a lot of VMs.
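
On hosts that still have the old layout, keeping an eye on how full the root partition is remains worthwhile, and the same check works for the separate log partition on XS 7. Here is a minimal sketch using the POSIX output format of df. The function name fs_usage_pct and any alert threshold you pair it with are assumptions for illustration.

```shell
# Sketch: print the usage percentage (digits only) of the filesystem holding a path.
# df -P gives stable POSIX columns; field 5 of the second line is the capacity, e.g. 42%.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Possible usage on a host (the 90 percent threshold is an arbitrary example):
#   [ "$(fs_usage_pct /var/log)" -ge 90 ] && echo "/var/log is nearly full"
```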

Note: One idiosyncrasy of XS 6.5 SP1 is that it switched to a GUID partition table in order to use more than 2 TB of local storage if available, but this has evidently been undone in 7.0 since GPT is no longer limited as such when running a 64-bit OS based on CentOS 7.

Note: There are changes in the minimum host physical system requirements, including a recommended minimum of 46 GB of disk space. This will impact environments that used smaller partitions or SSDs for the XS OS.


Stage Complete: Level Up!

Well done – you’ve completed Part 1. In Part 2, I’ll deal with the alternative to leveraging a clean install -- a true in-place upgrade procedure, and what issues may arise during and after the procedure.

Once again, I will reiterate that feedback is always appreciated and the XenServer forum is one good option. Errors, on the other hand, should be reported to the XenServer bugs site with as much information as possible to help the engineers understand and reproduce them.



XenServer 7.2 now available!

Hello XenServer community!

The XenServer team is proud to announce the release of version 7.2, which includes an array of improvements that further simplify and refine the user experience and enable even greater platform management scalability.

Click here to learn more about the new features available in XenServer 7.2.

See you on the XenServer page on!

Andy M.


Recent Comments
Awesome! All the download pages link to 7.1, tho.
Thursday, 25 May 2017 23:23
Andy Melmed
To learn about all the really cool things XenServer 7.2 has to offer our customers, visit Read More
Wednesday, 31 May 2017 21:18
Andy Melmed
Hi Christian, The links on the downloads page to access v7.2 have been updated. Thank you, Andy... Read More
Monday, 05 June 2017 13:31

Creating backups with XenServer

Backup is an essential part of the business workflow for many of our customers - be it SMB, Enterprise Server Virtualisation or Virtual Desktop Infrastructure. Making the backup experience smoother is high up on our wishlist at XenServer Engineering and the delivery of improved VM import/export performance in XS 7.1 shows our commitment to that end. To continue improving our services supporting the backup ecosystem, we would like to better understand how you use backup with XenServer:


  • How often do you back up? Do you have multiple jobs for monthly, weekly, daily backups?

  • How do you create your backups?

    • Use VM Export to backup VM metadata + disks

    • Snapshot at the VM level and use transfer/service VM to read off the snapshots

    • Use vdi-export to create differential disks (.vhd)

  • Do you use a third-party vendor for handling your backups?

  • Would support for incremental backups be useful for your use case?

Please leave a comment with your answers and any issues you may have with your backup experience today. We look forward to hearing from you!

Thank you,



Recent Comments
I use storage replication for sr and bacula for vm data. I still afraid of coalesce issue for using snapshot export.
Friday, 07 April 2017 02:30
Chandrika Srinivasan
Hi Beck, Is there a specific issue you are facing with coalesce? Which version of XenServer are you using? -Chandrika... Read More
Tuesday, 02 May 2017 14:02
Olivier Lambert
Check this: Should be integrated soon in our backup... Read More
Friday, 12 May 2017 19:23

XenCenter 7.1 update now available!

A hotfix (XS71E001) has been released for customers using XenCenter as the management console for their XenServer 7.1 virtual environments. 

This hotfix offers improvements in XenCenter UI responsiveness, as well as several fixes associated with host health check analysis, status reports and updates. Additional information pertaining to this hotfix can be found here.

As always, we encourage customers to read the hotfix release notes and install the hotfix to avoid any of the issues described in the notes.


Recent comment in this post
Tobias Kreidl
Note that this distribution includes both the XenCenter binary as well as a hotfix that needs to be applied to the XenServer host,... Read More
Friday, 07 April 2017 12:53

Introducing... XenServer 7.1!

We are pleased to announce the release of XenServer 7.1!

Click here to learn about the new features and enhancements available in 7.1.

As is customary with every new release, we encourage you to give v7.1 a spin and report any issues via

Note: We ask that you target this release exclusively for new defect reports[*].

Thank you and enjoy the latest release!

[*]In case of problems with earlier releases, pre-XS v7.0 and outside of paid support, then we recommend you upgrade to the XS v7.x series.  




Recent Comments
Andrew Halley
See here for which features are available in which versions : Read More
Friday, 24 February 2017 17:41
Hey, Great news! The Download links still reflect 7.0 release, tho. Any chance to get a download link? -Chris.... Read More
Friday, 24 February 2017 22:07
Andrew Halley
We're working on it - now done!
Monday, 27 February 2017 16:51

Staying Ahead of the Curve

Are you looking to improve the performance of your virtual servers and desktops?

Could your hypervisor use a boost when it comes to supporting graphics-intense applications?

Are you in need of an advanced security technology that offers a unique way of detecting and blocking sophisticated attacks against your data center before they cause any damage to your business?

Would you like to simplify the maintenance of your hosting infrastructure?

Does the idea of optimizing the performance, scalability, management and cost-savings of your application and desktop delivery solutions through the combination of an industry-leading hypervisor and industry-leading HCI platforms sound interesting to you?

Would you feel more comfortable knowing your hosting infrastructure was fully-supported for the next 10 years?

If you answered "yes" to any of the above, click here to learn more!

Until next time,




XenServer High-Availability Alternative HA-Lizard


XenServer (XS) contains a native high-availability (HA) option which allows quite a bit of flexibility in determining the state of a pool of hosts and under what circumstances Virtual Machines (VMs) are to be restarted on alternative hosts in the event of the loss of the ability of a host to be able to serve VMs. HA is a very useful feature that protects VMs from staying failed in the event of a server crash or other incident that makes VMs inaccessible. Allowing a XS pool to help itself maintain the functionality of VMs is an important feature and one that plays a large role in sustaining as much uptime as possible. Permitting the servers to automatically deal with fail-overs makes system administration easier and allows for more rapid reaction times to incidents, leading to increased up-time for servers and the applications they run.

XS allows for the designation of three different treatments of Virtual Machines: (1) always restart, (2) restart if possible, and (3) do not restart. The VMs designated with the highest restart priority will be the first to be restarted, and all will be handled provided adequate resources (primarily, host memory) are available. A specific start order, allowing for some VMs to be checked to be running before others, can also be established. VMs will be automatically distributed among whatever remaining XS hosts are considered active. Note that, where necessary, VMs with expandable (dynamic) memory will be shrunk down to accommodate additional VMs, and those VMs designated to be restarted will also be run with reduced memory. If additional capacity exists to run more VMs, those designated as “restart if possible” will be brought online. VMs that are not considered essential will typically be marked as “do not restart” and hence will be left off even if they had been running before; any of those that should be restarted have to be started manually, resources permitting.
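To make the priority handling above a little more concrete, here is a minimal sketch in Python. It is not XenServer's actual HA planner: the function name, the tuple layout and the simple greedy memory check are all assumptions made purely for illustration.

```python
# Hypothetical illustration of the three restart treatments described above.
# This is NOT XenServer's HA algorithm; names and the greedy strategy are assumptions.

PRIORITY_ORDER = {"always": 0, "if-possible": 1}  # "never" is intentionally absent


def plan_restarts(vms, free_memory_mb):
    """Pick VMs to restart, highest priority first, while memory lasts.

    vms: list of (name, priority, memory_mb) tuples, where priority is
         "always", "if-possible" or "never".
    Returns (restart, skipped): names that fit, and names that did not.
    """
    # "never" VMs are left off; the rest are ordered by priority (stable sort).
    candidates = sorted(
        (vm for vm in vms if vm[1] in PRIORITY_ORDER),
        key=lambda vm: PRIORITY_ORDER[vm[1]],
    )
    restart, skipped = [], []
    for name, _priority, memory_mb in candidates:
        if memory_mb <= free_memory_mb:
            restart.append(name)
            free_memory_mb -= memory_mb
        else:
            skipped.append(name)
    return restart, skipped
```

The real toolstack also considers start order and dynamic memory ranges, which this sketch deliberately leaves out.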

XS also allows for specifying the minimum number of active hosts to remain to accommodate failures; larger pools that are not overly populated with VMs can readily accommodate even two or more host failures.

The election of which hosts are “live” and should be considered active members of the pool follows a rather involved process that combines network accessibility with access to an independent, designated pooled Storage Repository (SR) that serves as an additional metric. The pooled SR can also be a Fibre Channel device, making it independent of Ethernet connections. A quorum-based algorithm is applied to establish which servers are up and active as members of the pool and which, in the event of a pool master failure, should be elected the new pool master.



Without going into more detail, suffice it to say that this methodology works very well, but it has a few prerequisites that need to be taken into consideration. First of all, the mandate that a pooled storage device be available clearly means that a pool consisting of hosts that only make use of local storage will be precluded. Second, for a quorum to be possible, a minimum of three hosts is required in the pool; otherwise HA results will be unpredictable, as the election of a pool master can become ambiguous. This comes about because of the so-called “split brain” issue, which is endemic in many different operating system environments that employ a quorum as a means of making such a decision. Furthermore, while fencing (the process of isolating the host) is the typical recourse, the lack of intercommunication can result in a wrong decision being made and hence loss of access to VMs. Having experimented with two-host pools and the native XenServer HA, I would say that an estimate of it working about half the time is about right and, from a statistical viewpoint, pretty much what you would expect.

This limitation is, however, still of immediate concern to those with either no pooled storage and/or only two hosts in a pool. With a little bit of extra network connectivity, a relatively simple and inexpensive solution to the external SR can be provided by making a very small NFS-based SR available. The second condition, however, is not readily rectified without the expense of at least one additional host and all the connectivity associated with it. In some cases, this may simply not be an affordable option.



For a number of years now, an alternative method of providing HA has been available through the program package provided by HA-Lizard, a community project that provides a free alternative that neither depends on external SRs nor requires a minimum of three hosts within a pool. In this blog, the focus will be on the standard HA-Lizard version and, because it is particularly hard to handle, on the situation of a two-node pool.

I had been experimenting for some time with HA-Lizard and found in particular that I was able to create failure scenarios that needed some improvement. HA-Lizard’s Salvatore Costantino was more than willing to lend an ear to the cases I had found and this led further to a very productive collaboration on investigating and implementing means to deal with a number of specific cases involving two-host pools. The result of these several months of efforts is a new HA-Lizard release that manages to address a number of additional scenarios above and beyond its earlier capabilities.

It is worthwhile mentioning that there are two ways of deploying HA-Lizard:

1) Most use cases combine HA-Lizard and iSCSI-HA, which creates a two-node pool using local storage while maintaining full VM agility, with VMs being able to run on either host. DRBD is implemented in this type of deployment, and it works very well making use of the real-time storage replication.

2) HA-Lizard, only, is used with an external Storage Repository (as in this particular case).

Before going into details of the investigation, a few words should go towards a brief explanation of how this works. Note that there is only Internet connectivity (the use of a heuristic network node) and no external SR, so how is a split brain situation then avoidable?

This is how I'd describe the course of action in this two-node situation:

If a node sees the gateway, assume it's alive. If it cannot, assume it's a good candidate for fencing. If the node that cannot see the gateway is the master, it should internally kill any running VMs and surrender its ability to be the master and fence itself. The slave node should promote itself to master and attempt to restart any missing VMs. Any that are on the previous master will probably fail though, because there is no communication to the old master. If the old VMs cannot be restarted, eventually the new master will be able to restart them regardless after a toolstack restart. If the slave node fails by not being able to communicate with the network, as long as the master still sees the network and not the slave’s network, it can assume the slave needs to fence itself, kill off its VMs and assume that they will be restarted on the current master. The slave needs to realize it cannot communicate out, and therefore should kill off any of its VMs and fence itself.
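The course of action above can be sketched as a small decision function. This is only my own illustration in Python, not HA-Lizard's code: the `Role` enum, the function name and the action strings are hypothetical, and the function assumes it is called only once a node has lost contact with its peer.

```python
from enum import Enum


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"


def actions_when_peer_unreachable(role, sees_gateway):
    """Return the ordered actions a node takes on its own when it cannot reach its peer.

    Hypothetical sketch of the pre-agreed rules described in the text;
    the action names are illustrative labels, not HA-Lizard commands.
    """
    if not sees_gateway:
        # An isolated node assumes it is the failed one: stop VMs and fence itself.
        actions = ["kill_local_vms"]
        if role is Role.MASTER:
            # An isolated master also gives up its ability to be master.
            actions.append("surrender_master_role")
        actions.append("fence_self")
        return actions

    # The node still sees the gateway, so it assumes the peer is fencing itself.
    if role is Role.SLAVE:
        return ["promote_to_master", "restart_missing_vms"]
    return ["restart_missing_vms"]
```

Because each node evaluates the same rules against only its own view of the gateway, the two nodes never need to talk to each other to arrive at compatible actions.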

Naturally, the trickier part comes with the timing of the various actions, since each node has to blindly assume the other is going to conduct a sequence of events. The key here is that these are all agreed on ahead of time and as long as each follows its own specific instructions, it should not matter that each of the two nodes cannot see the other node. In essence, the lack of communication in this case allows for creating a very specific course of action! If both nodes fail, obviously the case is hopeless, but that would be true of any HA configuration in which no node is left standing.

Various test plans were worked out for various cases and the table below elucidates the different test scenarios, what was expected and what was actually observed. It is very encouraging that the vast majority of these cases can now be properly handled.


Particularly tricky here was the case of rebooting the master server from the shell, without first disabling HA-Lizard (something one could readily forget to do). Since the fail-over process takes a while, a large number of VMs cannot be handled before the communication breakdown takes place, hence one is left with a bit of a mess to clean up in the end. Nevertheless, it’s still good to know what happens if something takes place that rightfully shouldn’t!

The other cases, whether intentional or not, are handled predictably and reliably, which is of course the intent. Typically, a two-node pool isn’t going to have a lot of complex VM dependencies, so the lack of a start order of VMs should not be perceived as a big shortcoming. Support for this feature may even be added in a future release.



HA-Lizard is a viable alternative to the native Citrix HA configuration. It’s straightforward to set up and can handle standard failover cases with a selective “restart/do not restart” setting for each VM or can be globally configured. There are quite a number of configuration parameters which the reader is encouraged to research in the extensive HA-Lizard documentation. There is also an on-line forum which serves as a source for information and prompt assistance with issues. This most recent release 2.1.3 is supported on both XenServer 6.5 and 7.0.

Above all, HA-Lizard shines when it comes to handling a non-pooled storage environment and in particular, all configurations of the dreaded two-node pool configuration. From my direct experience, HA-Lizard now handles the vast majority of issues involved in a two-node pool and can do so more reliably than the non-supported two-node pool using Citrix’ own HA application. It has been possible to conduct a lot of tests with various cases and, importantly, to do so multiple times to ensure the actions are predictable and repeatable.

I would encourage taking a look at HA-Lizard and giving it a good test run. The software is free (contributions are accepted) and it is in extensive use and has a proven track record.  For a two-host pool, I can frankly not think of a better alternative, especially with these latest improvements and enhancements.

I would also like to thank Salvatore Costantino for the opportunity to participate in this investigation and am very pleased to see the fruits of this collaboration. It has been one way of contributing to the Citrix XenServer user community that many can immediately benefit from.







Recent comment in this post
JK Benedict
I hath no idea why more have not read this intense article! As always: bravo, sir! BRAVO!
Wednesday, 04 January 2017 12:43

PCI Pass-Through on XenServer 7.0

Plenty of people have asked me over the years how to pass-through generic PCI devices to virtual machines running on XenServer. Whilst it isn't officially supported by Citrix, it's none the less perfectly possible to do; just note that your mileage may vary, because clearly it's not rigorously tested with all the possible different types of device people might want to pass-through (from TV cards, to storage controllers, to USB hubs...!).

The process on XenServer 7.0 differs somewhat from previous releases, in that the Dom0 control domain is now CentOS 7.0-based, and UEFI boot (in addition to BIOS boot) is supported. Hence, I thought it would be worth writing up the latest instructions, for those who are feeling adventurous.

Of course, XenServer officially supports pass-through of GPUs to both Windows and Linux VMs, hence this territory isn't as uncharted as might first appear: pass-through in itself is fine. The wrinkles will be to do with a particular given piece of hardware.

A Short Introduction to PCI Pass-Through

Firstly, a little primer on what we're trying to do.

Your host will have a PCI bus, with multiple devices hosted on it, each with its own unique ID on the bus (more on that later; just remember this as "B:D.f"). In addition, each device has a globally unique vendor ID and device ID, which allows the operating system to look up what its human-readable name is in the PCI IDs database text file on the system. For example, vendor ID 10de corresponds to the NVIDIA Corporation, and device ID 11b4 corresponds to the Quadro K4200. Each device can then (optionally) have multiple sub-vendor and sub-device IDs, e.g. if an OEM has its own branded version of a supplier's component.

Normally, XenServer's control domain, Dom0, is given all PCI devices by the Xen hypervisor. Drivers in the Linux kernel running in Dom0 each bind to particular PCI device IDs, and thus make the hardware actually do something. XenServer then provides synthetic devices (emulated or para-virtualised) such as SCSI controllers and network cards to the virtual machines, passing the I/O through Dom0 and then out to the real hardware devices.

This is great, because it means the VMs never see the real hardware, and thus we can live migrate VMs around, or start them up on different physical machines, and the virtualised operating systems will be none the wiser.

If, however, we want to give a VM direct access to a piece of hardware, we need to do something different. The main reason one might want to is because the hardware in question isn't easy to virtualise, i.e. the hypervisor can't provide a synthetic device to a VM, and somehow then "share out" the real hardware between those synthetic devices. This is the case for everything from an SSL offload card to a GPU.

Aside: Virtual Functions

There are three ways of sharing out a PCI device between VMs. The first is what XenServer does for network cards and storage controllers, where a synthetic device is given to the VM, but then the I/O streams can effectively be mixed together on the real device (e.g. it doesn't matter that traffic from multiple VMs is streamed out of the same physical network card: that's what will end up happening at a physical switch anyway). That's fine if it's I/O you're dealing with.

The second is to use software to share out the device. Effectively you have some kind of "manager" of the hardware device that is responsible for sharing it between multiple virtual machines, as is done with NVIDIA GRID GPU virtualisation, where each VM still ends up with a real slice of GPU hardware, but controlled by a process in Dom0.

The third is to virtualise at the hardware device level, and have a PCI device expose multiple virtual functions (VFs). Each VF provides some subset of the functionality of the device, isolated from other VFs at the hardware level. Several VMs can then each be given their own VF (using exactly the same mechanism as passing through an entire PCI device). A couple of examples are certain Intel network cards, and AMD's MxGPU technology.

OK, So How Do I Pass-Through a Device?

Step 1

Firstly, we have to stop any driver in Dom0 claiming the device. In order to do that, we'll need to ascertain what the ID of the device we're interested in passing through is. We'll use B:D.f (Bus, Device, function) numbering to specify it.

Running lspci will tell you what's in your system:

davidcot@helical:~$ lspci
00:00.0 Host bridge: Intel Corporation 82X38/X48 Express DRAM Controller
00:01.0 PCI bridge: Intel Corporation 82X38/X48 Express Host-Primary PCI Express Bridge
00:06.0 PCI bridge: Intel Corporation 82X38/X48 Express Host-Secondary PCI Express Bridge
00:1a.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 290] (rev a1)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express (rev 02)

Once you've found the device you're interested in, say 04:00.0 for my network card, we tell Dom0 to exclude it from being bound to by normal drivers. You can add to the Dom0 boot line as follows:

/opt/xensource/libexec/xen-cmdline --set-dom0 "xen-pciback.hide=(04:00.0)"

(What this does is edit /boot/grub/grub.cfg for you, or if you're booting using UEFI, /boot/efi/EFI/xenserver/grub.cfg instead!)
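If you have more than a handful of devices to sift through, Step 1 can be scripted. The following Python sketch pulls B:D.f IDs out of lspci output and builds the `xen-pciback.hide=` argument shown above. The function names are my own, and chaining several `(B:D.f)` groups for multiple devices is an assumption you should verify on your own host before relying on it.

```python
import re

# Matches lines such as "04:00.0 Ethernet controller: Broadcom ..."
LSPCI_LINE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$", re.IGNORECASE)


def find_devices(lspci_output, keyword):
    """Return the B:D.f IDs of lspci lines whose description contains keyword."""
    matches = []
    for line in lspci_output.splitlines():
        m = LSPCI_LINE.match(line.strip())
        if m and keyword.lower() in m.group(2).lower():
            matches.append(m.group(1))
    return matches


def hide_argument(bdfs):
    """Build the Dom0 boot argument for the given B:D.f IDs.

    Assumption: multiple devices are written as consecutive (B:D.f) groups.
    """
    return "xen-pciback.hide=" + "".join("({})".format(bdf) for bdf in bdfs)
```

The resulting string is what you would pass to `xen-cmdline --set-dom0` as in the command above.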

Step 2

Reboot! At the moment, a driver in Dom0 probably still has hold of your device, hence you need to reboot the host to get it relinquished.

Step 3

The easy bit: tell the toolstack to assign the PCI device to the VM. Run:

xe vm-list

And note the UUID of the VM you're interested in, then:

xe vm-param-set other-config:pci=0/0000:<B:D.f> uuid=<vm uuid>

Where, of course, <B:D.f> is the ID of the device you found in step 1 (like 04:00.0), and <vm uuid> corresponds to the VM you care about.
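A typo in the B:D.f is an easy mistake to make here, so a tiny helper that validates the ID before building the command can save a reboot cycle. This is a hedged Python sketch: the function name is hypothetical, and it only assembles the argument list for the `xe vm-param-set` command shown above rather than running it.

```python
import re

BDF = re.compile(r"[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def pci_passthrough_args(vm_uuid, bdf):
    """Build the argument list for the xe vm-param-set call from Step 3.

    Raises ValueError if bdf is not in B:D.f form (for example "04:00.0").
    """
    if not BDF.fullmatch(bdf):
        raise ValueError("not a B:D.f device ID: {!r}".format(bdf))
    return [
        "xe",
        "vm-param-set",
        "other-config:pci=0/0000:{}".format(bdf),
        "uuid={}".format(vm_uuid),
    ]
```

The list can be handed to `subprocess.run` in Dom0, or simply joined with spaces and pasted into the shell.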

Step 4

Start your VM. At this point if you run lspci (or equivalent) within the VM, you should now see the device. However, that doesn't mean it will spring into life, because...

Step 5

Install a device driver for the piece of hardware you passed-through. The operating system within the VM may already ship with a suitable device driver, but if not, you'll need to go and get the appropriate one from the device manufacturer. This will normally be the standard Linux/Windows/other one that you would use for a physical system; the only difference occurs when you're using a virtual function, where the VF driver is likely to be a special one.

Health Warnings

As indicated above, pass-through has advantages and disadvantages. You'll get direct access to the hardware (and hence, for some functions, higher performance), but you'll forgo luxuries such as the ability to live migrate the virtual machine around (there's state now sitting on real hardware, versus virtual devices), and the ability to use high availability for that VM (because HA doesn't take into account how many free PCI devices of the right sort you have in your resource pool).

In addition, not all PCI devices take well to being passed through, and not all servers like doing so (e.g. if you're extending the PCI bus in a blade system to an expansion module, this can sometimes cause problems). Your mileage may therefore vary.

If you do get stuck, head over to the XenServer discussion forums and people will try to help out, but just note that Citrix doesn't officially support generic PCI pass-through, hence you're in the hands of the (very knowledgeable) community.


Hopefully this has helped clear up how pass-through is done on XenServer 7.0; do comment and let us know how you're using pass-through in your environment, so that we can learn what people want to do, and think about what to officially support on XenServer in the future!

Recent Comments
Tobias Kreidl
Yay, great to see this published in clear, concise steps! This is one for the XenServer forum to point to! ... Read More
Saturday, 05 November 2016 03:38
David Cottingham
If you want both the audio and GPU devices given to your VM, then yes, you need to use the procedure once for each device. Howeve... Read More
Monday, 07 November 2016 10:18
David Cottingham
Understood: it would be a performance gain for at least some use cases, as you're getting raw access to the NIC. The downside is t... Read More
Monday, 07 November 2016 10:06

Better Together: PVS and XenServer!

XenServer adds new functionality to further simplify and enhance the secure and on-demand delivery of applications and desktops to enterprise environments.

If you haven't visited the Citrix blogs recently, we encourage you to visit to read about the latest integration efforts between PVS and XS.

If you're a Citrix customer, this article is a must read!

Andy Melmed, Senior Solutions Architect, XenServer PM



Set Windows guest VM Static IP address in XenServer

A Bit of Why

For a XenServer Virtual Machine (VM) administrator, the traditional way to set a static IP address on a VM has not been very direct, because XenServer historically did not provide an API for setting a VM's IP address from any management tool. To change the IP settings of a VM in XenServer, you had to email the VM's user and ask them to make the change manually, or install third-party tools to set the VM's IP address for you. When creating new VMs for users by cloning, setting the IP address could mean multiple reboots.

To provide a better user experience, XenServer now provides an easier way to set a static IP address on a guest VM.


Set static IP for XenServer 7.0 Windows guest VM

XenServer 7.0 can now set the IP address of a Windows guest VM through the following interfaces:

  • IPv4
    • Set VM IPv4 address by command line interface(CLI):
      xe vif-configure-ipv4 address=  gateway=  mode=[static/none] uuid=
    • Set VM IPv4 address by XAPI
      VIF.configure_ipv4(vifObject, "static/none", "Some IP address", "some gateway address")
  • IPv6
    • Set VM IPv6 address by command line interface(CLI):
      xe vif-configure-ipv6 address=  gateway=  mode=[static/none] uuid=
    • Set VM IPv6 address by XAPI
      VIF.configure_ipv6(vifObject, "static/none", "Some IP address", "some gateway address")


The mode "none" removes the current static IP setting and returns the VM to DHCP mode. If the static IP was not set through the new interface, setting the mode to "none" does nothing.
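As a rough illustration of how the CLI form above might be driven from a script, here is a Python sketch that validates the inputs and assembles the `xe vif-configure-ipv4` argument list. The function name is hypothetical, the example addresses come from the documentation range 192.0.2.0/24, and whether the address should carry a prefix length (such as /24) is not stated in this post, so the sketch accepts both forms.

```python
import ipaddress


def vif_configure_ipv4_args(vif_uuid, mode, address=None, gateway=None):
    """Build the argument list for xe vif-configure-ipv4.

    mode must be "static" or "none"; address and gateway are required
    (and validated) only for "static". Raises ValueError on bad input.
    """
    if mode not in ("static", "none"):
        raise ValueError("mode must be 'static' or 'none'")
    args = ["xe", "vif-configure-ipv4", "uuid={}".format(vif_uuid), "mode={}".format(mode)]
    if mode == "static":
        if address is None or gateway is None:
            raise ValueError("static mode needs both address and gateway")
        # Accepts "192.0.2.10" as well as "192.0.2.10/24"; raises ValueError otherwise.
        ipaddress.IPv4Interface(address)
        ipaddress.IPv4Address(gateway)
        args += ["address={}".format(address), "gateway={}".format(gateway)]
    return args
```

For IPv6 the same shape would apply with `xe vif-configure-ipv6` and the `IPv6Interface`/`IPv6Address` classes.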

Dive into details

The diagram below shows how the configuration works:


By using the interface:

1. XAPI will first store the IP configuration to XenStore as:

/local/domain//xenserver/device/vif= ""
  static-ip-setting = ""
     mac = "some mac address"
     error-code = "some error code"
     error-msg = "some error message"
     address = "some IP address"
     gateway = "some gateway address"

2. XenStore notifies the XenServer guest agent of the configuration change.

3. The XenServer guest agent receives the notification and sets the IP address using netsh.

4. After setting the IP address, the XenServer guest agent writes the result of the operation to the XenStore keys error-code and error-msg.


1. Install the XenServer PV tools in the Windows guest VM.

2. From the command line interface (CLI), identify the virtual network interface (VIF) whose IP address you want to set:

[root@dt65 ~]# xe vm-vif-list vm="Windows 7 (32-bit) (1)"
uuid ( RO)                         : 7dc56d5b-492c-bcf5-2549-b580dc928274
        vm-name-label ( RO): Windows 7 (32-bit) (1)
                     device ( RO): 1
                        MAC ( RO): 3e:aa:c3:dd:a7:ba
           network-uuid ( RO): 98f9a3b6-ad3f-14b3-da59-e3abc888e58e
network-name-label ( RO): Pool-wide network associated with eth1

uuid ( RO)                         : 0f59a97b-afcf-b6db-582d-2411d5bbc449
        vm-name-label ( RO): Windows 7 (32-bit) (1)
                     device ( RO): 0
                        MAC ( RO): 62:a1:03:31:a3:ee
           network-uuid ( RO): 41dac7d6-4a11-c9e6-cc48-ded78ceaf446
network-name-label ( RO): Pool-wide network associated with eth0

3. Call the new interface to set the IP address:

[root@dt65 ~]# xe vif-configure-ipv4 uuid=0f59a97b-afcf-b6db-582d-2411d5bbc449 mode=static address= gateway=

4. Check the result in the XenStore keys "error-code" and "error-msg":

[root@XenServer ~]# xenstore-ls /local/domain/13/xenserver/device/vif
0 = ""
  static-ip-setting = ""
    mac = "62:a1:03:31:a3:ee"
    error-code = "0"
    error-msg = ""
    address = ""
    gateway = ""
1 = ""
  static-ip-setting = ""
    mac = "3e:aa:c3:dd:a7:ba"
    error-code = "0"
    error-msg = ""
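Checking the result by eye is fine for one VM, but for several it can help to parse the xenstore-ls output. The Python sketch below is an illustration only: the function names are my own, and it assumes the output has the shape shown above (a numeric VIF device key followed by its settings), ignoring indentation altogether.

```python
def parse_vif_results(xenstore_ls_output):
    """Map each VIF device number to its key/value settings.

    Indentation is ignored: a purely numeric key starts a new VIF, and all
    following keys are attached to it.
    """
    results = {}
    current = None
    for raw in xenstore_ls_output.splitlines():
        line = raw.strip()
        if " = " not in line:
            continue  # skip prompts, blank lines and anything unexpected
        key, _, value = line.partition(" = ")
        value = value.strip('"')
        if key.isdigit():
            current = key
            results[current] = {}
        elif current is not None:
            results[current][key] = value
    return results


def failed_vifs(results):
    """Return {device: error-msg} for VIFs whose error-code is not "0"."""
    return {
        dev: settings.get("error-msg", "")
        for dev, settings in results.items()
        if settings.get("error-code") != "0"
    }
```

An empty dictionary from `failed_vifs` corresponds to the all-zero error codes in the output above.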

Recent comment in this post
How use netsh.exe set IPs?
Monday, 07 November 2016 02:13

Enable XSM on XenServer 6.5 SP1

1 Introduction

Certain virtualization environments require the extra security provided by XSM and FLASK. XenServer 7 benefits from its upgrade of the control domain to CentOS 7, which includes support for enabling XSM and FLASK. But what about legacy XenServer 6.5 installations that also require the added security? XSM and FLASK may be enabled on XenServer 6.5 as well, but it requires a bit more work.

Note that XSM is not currently a user-visible feature in XenServer, or a supported technology.

This article describes how to enable XSM and FLASK in XenServer 6.5 SP1. It makes the assumption that the reader is familiar with accessing, building, and deploying XenServer's Xen RPMs from source. While this article pertains to resources from SP1 source RPMs (XS65ESP1-src-pkgs.tar.bz2, included with SP1), a similar approach can be followed for other XenServer 6.5 hotfixes.

2 Patching Xen and xen.spec

XenServer issues some hypercalls not handled by Xen's XSM hooks. The following patch shows one possible way to handle these operations and commands, which is to always permit them.

diff --git a/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c b/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c
index 0cf7daf..a41fcc4 100644
--- a/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c
+++ b/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c
@@ -727,6 +727,12 @@ static int flask_domctl(struct domain *d, int cmd)
     case XEN_DOMCTL_cacheflush:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__CACHEFLUSH);

+    case XEN_DOMCTL_get_runstate_info:
+        return 0;
+
+    case XEN_DOMCTL_setcorespersocket:
+        return 0;
+
     default:
         printk("flask_domctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -782,6 +788,9 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_numainfo:
         return domain_has_xen(current->domain, XEN__PHYSINFO);

+    case XEN_SYSCTL_consoleringsize:
+        return 0;
+
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -1299,6 +1308,9 @@ static int flask_platform_op(uint32_t op)
     case XENPF_get_cpuinfo:
         return domain_has_xen(current->domain, XEN__GETCPUINFO);

+    case XENPF_get_cpu_features:
+        return 0;
+
     default:
         printk("flask_platform_op: Unknown op %d\n", op);
         return -EPERM;

The only other file that needs patching is Xen's RPM spec file, xen.spec. Modify HV_COMMON_OPTIONS as shown below.  Change this line:

% define HV_COMMON_OPTIONS max_phys_cpus=256

to:

% define HV_COMMON_OPTIONS max_phys_cpus=256 XSM_ENABLE=y FLASK_ENABLE=y

3 Compiling and Loading a Policy

To build a security policy, navigate to tools/flask/policy in Xen's source tree. Run make to compile the default security policy. It will have a name like xenpolicy.24, depending on your version of checkpolicy.

Copy xenpolicy.24 over to Dom0's /boot directory. Open /boot/extlinux.conf and modify the default section's append /boot/xen.gz ... line so it has --- /boot/xenpolicy.24 at the end. For example:

append /boot/xen.gz dom0_mem=752M,max:752M [.. snip ..] splash --- /boot/initrd-3.10-xen.img --- /boot/xenpolicy.24

After making this change, reboot.
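If you would rather script the edit to the append line than make it by hand, something along these lines may help. It is a Python sketch under assumptions: the function name is my own, it operates on a single append line passed in as a string, and it leaves reading and writing /boot/extlinux.conf to you.

```python
def add_policy_module(append_line, policy_path="/boot/xenpolicy.24"):
    """Return append_line with ' --- <policy_path>' added at the end.

    The line is returned unchanged if the policy is already listed as one
    of its '---'-separated modules, so running it twice is harmless.
    """
    modules = [part.strip() for part in append_line.split("---")]
    if policy_path in modules:
        return append_line
    return append_line.rstrip() + " --- " + policy_path
```

Take a copy of extlinux.conf before changing it, since a malformed append line can leave the host unable to boot.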

While booting (or afterwards, via xl dmesg), you should see messages indicating XSM and FLASK initialized, read the security policy, and started in permissive mode. For example:

(XEN) XSM Framework v1.0.0 initialized
(XEN) Policy len  0x1320, start at ffff830117ffe000.
(XEN) Flask:  Initializing.
(XEN) Flask:  Starting in permissive mode.
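To check for these messages without scrolling through the whole log, the output of xl dmesg can be scanned with a couple of regular expressions. The following Python sketch assumes the message wording matches the example above; the function name and the returned dictionary keys are hypothetical.

```python
import re


def flask_status(dmesg_text):
    """Report whether XSM initialized and which mode Flask started in.

    Returns a dict with "xsm_initialized" (bool) and "flask_mode"
    (for example "permissive", or None if no such message was found).
    """
    xsm = re.search(r"XSM Framework v[\d.]+ initialized", dmesg_text)
    mode = re.search(r"Flask:\s+Starting in (\w+) mode", dmesg_text)
    return {
        "xsm_initialized": xsm is not None,
        "flask_mode": mode.group(1) if mode else None,
    }
```

You could feed it the captured output of `xl dmesg` from Dom0, for instance via `subprocess.run(["xl", "dmesg"], capture_output=True, text=True).stdout`.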

4 Exercises for the Reader

  1. Create a more sophisticated implementation for handling XenServer hypercalls in xen/xsm/flask/hooks.c.
  2. Write (and load) a custom policy.
  3. Boot with flask_enforcing=1 set, and study any violations that occur (see xl dmesg output).
Recent comment in this post
anshul makkar
Good Work. Once the user builds the default policy, it should work for most of the scenario except for the specialized once like p... Read More
Friday, 28 October 2016 11:22

Making a difference to the next release

With the first alpha build of the upcoming XenServer release (codenamed "Ely") announced in Andy Melmed's blog on 11th October, I thought it would be useful to provide a little retrospective on how you, the community, have helped get it off to a great start by providing great feedback on previous alphas, betas and releases, and how this feedback has been used to strengthen the codebase for Ely as a whole.
As I am sure you are well aware, community users can make use of the incident tracking database to raise issues or problems they've found on their hardware or configuration with recent alphas, betas and XenServer releases.  These incidents are raised in the form of XSO tickets, which can then be commented upon by other members of the community and folks who work on the product.  
We listened
Looking back on all of the XSO tickets raised on the latest 7.0 release - these total more than 200 individual incident reports.  I want to take the time to thank everyone who contributed to these, often detailed, specific, constructive reports, and for working iteratively to understand more of the underlying issues.  Some of these investigations are ongoing, and need further feedback, but many of them are sufficiently clear to move forward to the next step.  
We understood
The incident reports were triaged and, by working with the user community, more than 80% of them have been processed.  Frequently this involved questions and answers to get a better handle on what was the underlying problem.  Then trying a change to the configuration or even a private fix to see and confirm if it related to the problem or resolved it.  The enthusiasm and skill of the reporters has been amazing, and continually useful.  At this point - we've separated the incidents into those which can be fixed as bugs, and those which are requests for features.  The latter have been provided to Citrix product management for consideration.  
We did
Of those which can be fixed as bugs, we raised or updated 45 acknowledged defects in XenServer.  More than 70% of these are already fixed, with another 20% being actively worked on.  The small remainder are blocked for some reason and awaiting a change elsewhere in the product, upstream or in our ability to test.  The 70% of fixes have now either become part of some of the hotfixes which have been released for 7.0, or are in the codebase already and are being progressively released as part of the Ely alpha programme for the community to try.
So what's next?  With work continuing apace on Ely, we have now opened "Ely alpha" as an affects-version in the incident database to raise issues with this latest build.  At the same time, in the spirit of continuing to progressively improve the actively developing codebase, we have removed the 6.5 SP1 affects-version so folks can focus on the new release.
Finally - on behalf of all users - my personal thanks to everyone who has helped improve both Dundee and Ely - both by reporting incidents, triaging and fixing them and by continuing to give your feedback on the latest new version.  This really makes a difference to all members of the community.

XenServer Ely Alpha 1 Available

Hear ye, hear ye… we are pleased to announce that an alpha release of XenServer Project Ely is now available for download! After Dundee (7.0), we've come a little closer to Cambridge (the birthplace of Xen) for our codename, as the city of Ely is just up the road.

Since releasing version 7.0 in May, the XenServer engineering team has been working fervently to prepare the platform with the latest innovations in server virtualization technology. As a precursor, a pre-release containing the prerequisites for enabling a number of powerful (and really cool!) new features has been made available for download from the pre-release page.

What's In it?


The following is a brief description of some of the feature-prerequisites included in this pre-release:


Xen 4.7:  This release of Xen adds support for "live-patching" of the Xen hypervisor, allowing issues to be patched without requiring a host reboot. In this alpha release there is no functionality for you to test in this area, but we thought it was worth telling you about nonetheless. Xen 4.7 also includes various performance improvements, and updates to the virtual machine introspection code (surfaced in XenServer as Direct Inspect).


Kernel 4.4: Updated kernel to support future feature considerations. All device drivers will be at the upstream versions; we'll be updating these with drops direct from the hardware vendors as we go through the development cycle.


VM import/export performance: a longstanding request from our user community, we've worked to improve the import/export speeds of VMs, and Ely alpha 1 now averages 2x faster than the previous version.


What We'd Like Help With


The purpose of this alpha release is really to make sure that a variety of hardware works with project Ely. Because we've updated core platform components (Xen and the Dom0 kernel), it's important to check that all is well on hardware that we don't have in our QA labs. Thus, the more people who can download this build, install it, and run a couple of VMs to check all is well, the better.


Additionally, we've been working with the community (over on XSO-445) on improving VM import/export performance: we'd like to see whether the improvements we've seen in our tests are what you see too. If they're not, we can figure out why and fix it :-).




This is pre-release software, not for production use. Upgrades from XenServer 7.0 should work fine, but it goes without saying that you should ensure you back up any critical data.


Reporting Bugs


We encourage visitors to download the pre-release and provide us with your feedback. If you do find a problem, please head over to the bug tracker and file a ticket. Please be sure to include a server status report!


Now that we've moved up to a new pre-release project, it's time to remove the XS 6.5 SP1 fix version from the bug tracker, in order that we keep it tidy. You'll see an "Ely alpha" affects version is now present instead.


What Next?


Stay tuned for another pre-release build in the near future: as you may have heard, we've been keeping busy!

As always, we look forward to working with the XenServer community to make the next major release of XenServer the best version ever!




Andy M.

Senior Solutions Architect - XenServer PM



XenServer Hotfix XS65ESP1035 Released

News Flash: XenServer Hotfix XS65ESP1035 Released

Indeed, I was alerted early this morning (06:00 EST) via email that Citrix has released hotfix XS65ESP1035 for XenServer 6.5 SP1.  The official release and content is filed under CTX216249, which can be found here:

As of the writing of this article, this hotfix has not yet been added to CTX138115 (entitled "Recommended Updates for XenServer Hotfixes") or, as we like to call it "The Fastest Way to Patch A Vanilla XenServer With One or Two Reboots!"  I imagine that resource will be updated to reflect XS65ESP1035 soon.

Personally/Professionally, I will be installing this hotfix as, per CTX216249, I am excited to read what is addressed/fixed:

  • Duplicate entry for XS65ESP1021 was created when both XS65ESP1021 and XS65ESP1029 were applied.
  • When BATMAP (Block Allocation Map) in Xapi database contains erroneous data, the parent VHD (Virtual Hard Disk) does not get inflated causing coalesce failures and ENOSPC errors.
  • After deleting a snapshot on a pool member that is not the pool master, a coalesce operation may not succeed. In such cases, the coalesce process can constantly retry to complete the operation, resulting in the creation of multiple RefCounts that can consume a lot of space on the pool member.
In addition, this hotfix contains the following improvement:
  • This fix lets users set a custom retrans value for their NFS SRs thereby giving them more fine-grained control over how they want NFS mounts to behave in their environment.



This is a storage-based hotfix and while we can create VMs all day, we rely on the storage substrate to hold our precious VHDs, so plan accordingly to deploy it!

Applying The Patch Manually

As a disclaimer of sorts, always plan your patching during a maintenance window to prevent any production outages.  For me, I am currently up-to-date and will be rebooting my XenServer host(s) in a few hours, so I manually applied this patch.

Why?  If you look in XenCenter for updates, you won't see this hotfix listed (yet).  If it was available in XenCenter, checks and balances would inform me I need to suspend, migrate, or shutdown VMs.  For a standalone host, I really can't do that.  In my pool, I can't reboot for a few hours, but I need this patch installed, so I simply do the following on my XenServer stand-alone server OR XenServer primary/master server:

Using the command line in XenCenter, I make a directory in /root/ called "ups" and then descend into that directory because I plan to use wget (Web Get) to download the patch via its link in

[root@colossus ~]# mkdir ups
[root@colossus ~]# cd ups

Now, using wget I specify what to download over port 80 and to save it as "":

[root@colossus ups]# wget -O

We then see the usual wget progress bar and once it is complete, I can unzip the file "":

HTTP request sent, awaiting response... 200 OK
Length: 110966324 (106M) [application/zip]
Saving to: `'

100%[======================================>] 110,966,324 1.89M/s   in 56s    
2016-08-25 11:06:32 (1.90 MB/s) - `' saved [110966324/110966324]
[root@colossus ups]# unzip 
  inflating: XS65ESP1035.xsupdate   
  inflating: XS65ESP1035-src-pkgs.tar.bz2

I'm a big fan of using shortcuts - especially where UUIDs are involved.  Now that I have the patch ready to expand onto my XenServer master/stand-alone server, I want to create some kind of variable so I don't have to remember my host's UUID or the patch's UUID. 

For the host, I can simply source in a file that contains the XenServer primary/master server's INSTALLATION_UUID (better known as the host's UUID):

[root@colossus ups]# source /etc/xensource-inventory 
[root@colossus ups]# echo $INSTALLATION_UUID

With the variable $INSTALLATION_UUID set, I can now expand the patch and capture its own UUID:

[root@colossus ups]# patchUUID=`xe patch-upload file-name=XS65ESP1035.xsupdate`
[root@colossus ups]# echo $patchUUID

NOW, I apply the patch to the host (yes, it still needs to be rebooted, but within a few hours) using both variables in the following command:

[root@colossus ups]# xe patch-apply uuid=$patchUUID host-uuid=$INSTALLATION_UUID
Preparing...                ##################################################
kernel                      ##################################################
unable to stat /sys/class/block//var/swap/swap.001: No such file or directory
Preparing...                ##################################################
sm                          ##################################################
Preparing...                ##################################################
blktap                      ##################################################
Preparing...                ##################################################
kpartx                      ##################################################
Preparing...                ##################################################
Preparing...                ##################################################
device-mapper-multipath     ##################################################

At this point, I can back out of the "ups" directory and remove it.  Likewise, I can also check to see if the patch UUID is listed in the XAPI database:

[root@colossus ups]# cd ..
[root@colossus ~]# rm -rf ups/
[root@colossus ~]# ls
[root@colossus ~]# xe patch-list uuid=$patchUUID
uuid ( RO)                    : cdf9eb54-c3da-423d-88ca-841b864f926b
              name-label ( RO): XS65ESP1035
        name-description ( RO): Public Availability: fixes to Storage
                    size ( RO): 21958176
                   hosts (SRO): 207cd7c1-da20-479b-98bc-e84cac64d0c0
    after-apply-guidance (SRO): restartHost

So, nothing really special -- just a quick way to apply patches to a XenServer primary/master server.  In the same manner, you can substitute the $INSTALLATION_UUID with other host UUIDs in a pool configuration, etc.
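For anyone who prefers to script these steps, the following is a minimal Python sketch of the same workflow. It only parses the inventory text and builds the two xe command lines shown above; it does not run them. The helper names (read_inventory, upload_command, apply_command) are made up for illustration, the KEY='value' layout of the inventory file is assumed from the fact that it can be sourced by the shell, and the UUIDs are simply the sample values from the listing above.

```python
import shlex


def read_inventory(text):
    """Parse KEY='value' lines from shell-sourceable inventory text."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # strips the surrounding quotes
        values[key] = parts[0] if parts else ""
    return values


def upload_command(update_file):
    """Command that uploads the update; xe prints the patch UUID."""
    return ["xe", "patch-upload", f"file-name={update_file}"]


def apply_command(patch_uuid, host_uuid):
    """Command that applies an uploaded patch to a single host."""
    return ["xe", "patch-apply", f"uuid={patch_uuid}", f"host-uuid={host_uuid}"]


if __name__ == "__main__":
    # Sample inventory line; on a real host, read /etc/xensource-inventory.
    sample = "INSTALLATION_UUID='207cd7c1-da20-479b-98bc-e84cac64d0c0'\n"
    host_uuid = read_inventory(sample)["INSTALLATION_UUID"]
    print(" ".join(upload_command("XS65ESP1035.xsupdate")))
    print(" ".join(apply_command("cdf9eb54-c3da-423d-88ca-841b864f926b", host_uuid)))
```

The command lists could be handed to subprocess.run on the host itself; keeping command construction separate from execution makes it easy to review what would be run before a maintenance window.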

Well, off to reboot and thanks for reading!


-jkbs | @xenfomation | My Citrix Blog

To receive updates about the latest XenServer Software Releases, login or sign-up to pick and choose the content you need from




XenServer 7.0 performance improvements part 4: Aggregate I/O throughput improvements

The XenServer team has made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the fourth in a series of articles that will describe the principal improvements. For the previous ones, see:


In this article we return to the theme of I/O throughput. Specifically, we focus on improvements to the total throughput achieved by a number of VMs performing I/O concurrently. Measurements show that XenServer 7.0 enjoys aggregate network throughput over three times faster than XenServer 6.5, and also has an improvement to aggregate storage throughput.

What limits aggregate I/O throughput?

When a number of VMs are performing I/O concurrently, the total throughput that can be achieved is often limited by dom0 becoming fully busy, meaning it cannot do any additional work per unit time. The I/O backends (netback for network I/O and tapdisk3 for storage I/O) together consume 100% of available dom0 CPU time.

How can this limit be overcome?

Whenever there is a CPU bottleneck like this, there are two possible approaches to improving the performance:

  1. Reduce the amount of CPU time required to perform I/O.
  2. Increase the processing capacity of dom0, by giving it more vCPUs.

Surely approach 2 is easy and will give a quick win...? Intuitively, we might expect the total throughput to increase proportionally with the number of dom0 vCPUs.

Unfortunately it's not as straightforward as that. The following graph shows what happens to the aggregate network throughput on XenServer 6.5 when the number of dom0 vCPUs is artificially increased. (In this case, we are measuring the total network throughput of 40 VMs communicating amongst themselves on a single Dell R730 host.)


Counter-intuitively, the aggregate throughput decreases as we add more processing power to dom0! (This explains why the default was at most 8 vCPUs in XenServer 6.5.)

So is there no hope for giving dom0 more processing power...?

The explanation for the degradation in performance is that certain operations run more slowly when there are more vCPUs present. In order to make dom0 work better with more vCPUs, we needed to understand what those operations are, and whether they can be made to scale better.
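One way to picture how extra vCPUs can reduce throughput is a generic scalability model. The sketch below uses the Universal Scalability Law, which is a stand-in chosen for illustration and not the analysis the XenServer team performed. The contention and crosstalk coefficients are hypothetical numbers, not values measured on XenServer.

```python
def relative_throughput(vcpus, contention, crosstalk):
    """Universal Scalability Law: n / (1 + a*(n - 1) + b*n*(n - 1)).

    contention (a) models time spent waiting on serialised work such as locks;
    crosstalk (b) models work that grows with every pair of vCPUs.
    The result is relative to a single vCPU.
    """
    n = vcpus
    return n / (1 + contention * (n - 1) + crosstalk * n * (n - 1))


if __name__ == "__main__":
    # Hypothetical coefficients: "before" has heavy pairwise overhead,
    # "after" has most of that overhead removed.
    for label, crosstalk in (("before", 0.02), ("after", 0.0005)):
        for n in (1, 2, 4, 8, 16):
            value = relative_throughput(n, contention=0.05, crosstalk=crosstalk)
            print(f"{label:6s} vCPUs={n:2d} relative throughput={value:.2f}")
```

With a large crosstalk term the curve peaks and then falls as vCPUs are added, which is the shape described above; shrinking that term lets throughput keep rising over the same range.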

Three such areas of poor scalability were discovered deep in the innards of Xen by Malcolm Crossley and David Vrabel, and improvements were made for each:

  1. Maptrack lock contention – improved by commit dff515dfeac4c1c13422a128c558ac21ddc6c8db
  2. Grant-table lock contention – improved by commit b4650e9a96d78b87ccf7deb4f74733ccfcc64db5
  3. TLB flush on grant-unmap – improved by

The result of improving these areas is dramatic – see the green line in the following graph:


Now, throughput scales very well as the number of vCPUs increases. This means that, for the first time, it is now beneficial to allocate many vCPUs to dom0 – so that when there is demand, dom0 can deliver. Hence we have given XenServer 7.0 a higher default number of dom0 vCPUs.

How many vCPUs are now allocated to dom0 by default?

Most hosts will now get 16 vCPUs by default, but the exact number depends on the number of CPU cores on the host. The following graph summarises how the default number of dom0 vCPUs is calculated from the number of CPU cores on various current and historic XenServer releases:


Summary of improvements

I will conclude with some aggregate I/O measurements comparing XenServer 6.5 and 7.0 under default settings (no dom0 configuration changes) on a Dell R730xd.

  1. Aggregate network throughput – twenty pairs of 32-bit Debian 6.0 VMs sending and receiving traffic generated with iperf 2.0.5.
  2. Aggregate storage IOPS – twenty 32-bit Windows 7 SP1 VMs each doing single-threaded, serial, sequential 4KB reads with fio to a virtual disk on an Intel P3700 NVMe drive.

XenServer 7.0 performance improvements part 3: Parallelised plug and unplug VBD operations in xenopsd

The XenServer team has made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the third in a series of articles that will describe the principal improvements. For the first two, see here:


The topic of this post is control plane performance. XenServer 7.0 achieves significant performance improvements through the support for parallel VBD operations in xenopsd. With the improvements, xenopsd is able to plug and unplug many VBDs (virtual block devices) at the same time, substantially improving the duration of VM lifecycle operations (start, migrate, shutdown) for VMs with many VBDs, and making it practical to operate VMs with up to 255 VBDs.

Background of the VM lifecycle operations

In XenServer, xenopsd is the dom0 component responsible for VM lifecycle operations:

  • during a VM start, xenopsd creates the VM container and then plugs the VBDs before starting the VCPUs;
  • during a VM shutdown, xenopsd stops the VCPUs and then unplugs the VBDs before destroying the VM container;
  • during a VM migrate, xenopsd creates a new VM container, unplugs the VBDs of the old VM container, and plugs the VBDs for the new VM before starting its VCPUs; while the VBDs are being unplugged and plugged on the other VM container, the user experiences a VM downtime when the VM is unresponsive because both old and new VM containers are paused.

Measurements have shown that a large part, and usually most, of the duration of these VM lifecycle operations is due to plugging and unplugging the VBDs, especially on slow or contended storage backends.



Why does xenopsd take some time to plug and unplug the VBDs?

The completion of a xenopsd VBD plug operation involves the execution of two storage layer operations, VDI attach and VDI activate (where VDI stands for virtual disk image). These VDI operations include control plane manipulation of daemons, block devices and disk metadata in dom0, which will take different amounts of time to execute depending on the type of the underlying Storage Repository (SR, such as LVM, NFS or iSCSI) used to hold the VDIs, and on the current load and type (SSDs or HDs) of the storage backend disks. Similarly, the completion of a xenopsd VBD unplug operation involves the execution of two storage layer operations, VDI deactivate and VDI detach, with the corresponding overhead of manipulating the control plane of the storage layer.

If the underlying physical disks are under high load, there may be contention preventing progress of the storage layer operations, and therefore xenopsd may need to wait many seconds before the requests to plug and unplug the VBDs can be served.

Originally, xenopsd would execute these VBD operations sequentially, and the total time to finish all of them for a single VM would depend on the number of VBDs in the VM. Essentially, it would be the sum of the times to operate each of the VBDs of this VM, which would result in several minutes of wait for a lifecycle operation of a VM that had, for instance, 255 VBDs.

What are the advantages of parallel VBD operations?

Plugging and unplugging the VBDs in parallel in xenopsd:

  • provides a total duration for the VM lifecycle operations that is independent of the number of VBDs in the VM. This duration will typically be the duration of the longest individual VBD operation amongst the parallel VBD operations for that VM;
  • provides a significant instantaneous improvement for the user, across all the VBD operations involving more than 1 VBD per VM. The more devices involved, the larger the noticeable improvement, up to the saturation of the underlying storage layer;
  • this single improvement is immediately applicable across all of VM start, VM shutdown and VM migrate lifecycle operations.
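The difference between the two strategies comes down to a sum versus a maximum. A minimal sketch, assuming unlimited parallelism and made-up per-VBD durations in seconds:

```python
def sequential_duration(vbd_times):
    """Total time when VBD operations run one after another."""
    return sum(vbd_times)


def parallel_duration(vbd_times):
    """Total time when all VBD operations run at once: the slowest one wins."""
    return max(vbd_times, default=0.0)


if __name__ == "__main__":
    # Hypothetical plug times for a VM with four VBDs.
    times = [0.5, 1.5, 1.0, 0.25]
    print("sequential:", sequential_duration(times))
    print("parallel:  ", parallel_duration(times))
```

This ignores the saturation of the underlying storage layer and the limits on parallelism, so it describes the best case rather than a prediction.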



Are there any disadvantages or limitations?

Plugging and unplugging VBDs uses dom0 memory. The main disadvantage of doing these in parallel is that dom0 needs more memory to handle all the parallel operations. To prevent situations where a large number of such operations would cause dom0 to run out of memory, we have added two limits:

  • the maximum number of global parallel operations that xenopsd can request is the same as the number of xenopsd worker-pool threads as defined by worker-pool-size in /etc/xenopsd.conf. This prevents regression in the maximum dom0 memory usage compared to when sequential VBD operations per VM were used in xenopsd. An increase in this value will increase the number of parallel VBD operations, at the expense of having to increase the dom0 memory by about 15MB for each extra parallel VBD operation.
  • the maximum number of per-VM parallel operations that xenopsd can request is currently fixed to 10, which covers a wide range of VMs and still provides a 10x improvement in lifecycle operation times for those VMs that have more than 10 VBDs.

Where do I find the changes?

The changes that implemented this feature are available in github at

What sort of theoretical improvements should I expect in XenServer 7.0?

The exact numbers depend on the SR type, storage backend load characteristics and the limits specified in the previous section. Given the limits in the previous section, the results for the duration of VBD plugs for a single VM will follow the pattern in the following table:

Number n of VBDs/VM      Improvement of VBD operations
<= 10 VBDs/VM            n times faster
> 10 VBDs/VM             10 times faster

The table above assumes that the maximum number of global parallel operations discussed in the section above is not reached. If you want to guarantee the improvement in the table above for x>1 simultaneous VM lifecycle operations, at the expense of using more dom0 memory in the worst case, you will probably want to set worker-pool-size = (n * x) in /etc/xenopsd.conf, where n is a number reflecting the average number of VBDs/VM amongst all VMs, up to a maximum of n=10.
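The sizing rule can be worked through with a few lines of Python. This is a sketch under the figures quoted in this article (a per-VM limit of 10 and roughly 15MB of dom0 memory per extra parallel operation). The function names are illustrative, and the current worker-pool-size must be supplied by you because its default is not stated here.

```python
PER_VM_LIMIT = 10      # per-VM parallel operation limit quoted above
MB_PER_EXTRA_OP = 15   # approximate dom0 memory per extra parallel operation


def expected_speedup(vbds_per_vm):
    """Theoretical speed-up of VBD operations for one VM."""
    return min(vbds_per_vm, PER_VM_LIMIT)


def worker_pool_size(avg_vbds_per_vm, simultaneous_vms):
    """worker-pool-size = n * x, with n capped at the per-VM limit."""
    n = min(avg_vbds_per_vm, PER_VM_LIMIT)
    return n * simultaneous_vms


def extra_dom0_memory_mb(new_pool_size, current_pool_size):
    """Rough extra dom0 memory needed when raising worker-pool-size."""
    return max(0, new_pool_size - current_pool_size) * MB_PER_EXTRA_OP


if __name__ == "__main__":
    size = worker_pool_size(avg_vbds_per_vm=8, simultaneous_vms=5)
    print("worker-pool-size:", size)
    # current_pool_size=4 is a hypothetical starting value.
    print("extra dom0 MB:", extra_dom0_memory_mb(size, current_pool_size=4))
```

For example, 5 simultaneous lifecycle operations on VMs averaging 8 VBDs suggests worker-pool-size = 40; starting from a hypothetical pool of 4 threads, that is 36 extra parallel operations, or about 540MB of additional dom0 memory in the worst case.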

What sort of practical improvements should I expect in XenServer 7.0?

The VBD plug and unplug operations are only part of the overall operations necessary to execute a VM lifecycle operation. The remaining parts, such as creation of the VM container and VIF plugs, will dilute the VBD improvements of the previous section, though they are still significant. Some examples of improvements, using an EXT SR on a local SSD storage backend:

VM lifecycle operation
Improvement with 8 VBDs/VM

Toolstack time to start a single VM



Toolstack time to bootstorm 125 VMs



The approximately 2s improvement in single VM start time was caused by plugging the 8 VBDs in parallel. As we see in the second row of the table, this can be a significant advantage in a bootstorm.

In XenServer 7.0, not only does xenopsd execute VBD operations in parallel, but the storage layer operations on VDIs are also faster. Compared to XenServer 6.5SP1, you may therefore observe VM lifecycle time improvements in your XenServer 7.0 environment beyond those expected from parallel VBD operations alone.



Resetting Lost Root Password in XenServer 7.0

XenServer 7.0, Grub2, and a Lost Root Password

In a previous article I detailed how one could reset a lost root password to XenServer 6.2.  While the article is not limited to 6.2 (it works just as well for 6.5, 6.1, and 6.0.2), this article is dedicated to XenServer 7.0 as grub2 has been brought in to replace extlinux.

As such, if the local root user's (LRU) password for a XenServer 7.0 host is forgotten, physical (or "lights out") access to the host and a reboot will be required.  The contrast with earlier versions comes from grub2: the method to boot the XenServer 7.0 host into single user mode, and how to reset the root password to a known token.

The Grub Boot Screen

Once physical or "lights out" access to the XenServer 7.0 host in question has been obtained, on reboot the following screen will appear:

It is important to note that once this screen appears, you only have four seconds to take action before the host proceeds to boot the kernel.

As should be default, the XenServer kernel is highlighted.  One will want to immediately press the "e" key (for edit).

This will then refresh the grub interface - stopping any count-down-to-boot timers - which will reveal the boot entry.  It is within this window (using up, down, left, and right) one will want to navigate to around line four or five and isolate "ro nolvm":


Next, one will want to remove (or backspace/delete) the "ro" characters and type in "rw init=/sysroot/bin/sh", or as illustrated:


Don't worry if the directive is not on one line!


With this change made, press both Control and X at the same time as this will boot the XenServer kernel into single user style mode, or better known as Emergency Mode:

How to Change Root's Password

From the Emergency Mode prompt, execute the following command:

chroot /sysroot

Now, one can execute the "passwd" command to change root's credentials:


Now that root's credentials have been changed, utilize Control+Alt+Delete to reboot the XenServer 7.0 host and one will find via SSH, XenCenter, or directly that the root password has been changed: the host is ready to be managed again.



XenServer 7.0 performance improvements part 2: Parallelised networking datapath

The XenServer team has made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the second in a series of articles that will describe the principal improvements. For the first, see

The topic of this post is network I/O performance. XenServer 7.0 achieves significant performance improvements through the support for multi-queue paravirtualised network interfaces. Measurements of one particular use-case show an improvement from 17 Gb/s to 41 Gb/s.

A bit of background about the PV network datapath

In order to perform network-based communications, a VM employs a paravirtualised network driver (netfront in Linux or xennet in Windows) in conjunction with netback in the control domain, dom0.


To the guest OS, the netfront driver feels just like a physical network device. When a guest wants to transmit data:

  • Netfront puts references to the page(s) containing that data into a "Transmit" ring buffer it shares with dom0.
  • Netback in dom0 picks up these references and maps the actual data from the guest's memory so it appears in dom0's address space.
  • Netback then hands the packet to the dom0 kernel, which uses normal routing rules to determine that it should go to an Open vSwitch device and then on to either a physical interface or the netback device for another guest on the same host.

When dom0 has a network packet it needs to send to the guest, the reverse procedure applies, using a separate "Receive" ring.

Amongst the factors that can limit network throughput are:

  1. the ring becoming full, causing netfront to have to wait before more data can be sent, and
  2. the netback process fully consuming an entire dom0 vCPU, meaning it cannot go any faster.

Multi-queue alleviates both of these potential bottlenecks.

What is multi-queue?

Rather than having a single Transmit and Receive ring per virtual interface (VIF), multi-queue means having multiple Transmit and Receive rings per VIF, and one netback thread for each:


Now, each TCP stream has the opportunity to be driven through a different Transmit or Receive ring. The particular ring chosen for each stream is determined by a hash of the TCP header (MAC, IP and port number of both the source and destination).

Crucially, this means that separate netback threads can work on each TCP stream in parallel. So where we were previously limited by the capacity of a single dom0 vCPU to process packets, now we can exploit several dom0 vCPUs. And where the capacity of a single Transmit ring limited the total amount of data in-flight, the system can now support a larger amount.
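The stream-to-ring mapping can be sketched in a few lines of Python. Note the assumptions: zlib.crc32 is a stand-in for whatever hash the real drivers use, and the MAC addresses, IP addresses and ports are invented. Only the idea (hash the header fields, take the result modulo the number of queues) comes from the description above.

```python
import zlib
from collections import Counter


def queue_for_stream(flow, num_queues):
    """Map a flow (tuple of header fields) to a queue index."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_queues


def streams_per_queue(flows, num_queues):
    """Count how many streams land on each queue."""
    counts = Counter(queue_for_stream(flow, num_queues) for flow in flows)
    return [counts[q] for q in range(num_queues)]


# Eight hypothetical TCP streams between two guests, differing only in source port.
flows = [
    ("aa:bb:cc:00:00:01", "10.0.0.1", 50000 + i,
     "aa:bb:cc:00:00:02", "10.0.0.2", 5001)
    for i in range(8)
]

if __name__ == "__main__":
    for queues in (1, 4, 5, 8):
        print(queues, "queues:", streams_per_queue(flows, queues))
```

Because the queue is a pure function of the header fields, every packet of a stream stays on the same ring, while different streams may or may not share one. The spread depends entirely on the hash and the port numbers in play, so some queues can end up carrying several streams while others sit idle.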

Which use-cases can take advantage of multi-queue?

Anything involving multiple TCP streams. For example, any kind of server VM that handles connections from more than one client at the same time.

Which guests can use multi-queue?

Since frontend changes are needed, the version of the guest's netfront driver matters. Although dom0 is geared up to support multi-queue, guests with old versions of netfront that lack multi-queue support are limited to single Transmit and Receive rings.

  • For Windows, the XenServer 7.0 xennet PV driver supports multi-queue.
  • For Linux, multi-queue support was added in Linux 3.16. This means that Debian Jessie 8.0 and Ubuntu 14.10 (or later) support multi-queue with their stock kernels. Over time, more and more distributions will pick up the relevant netfront changes.

How does the throughput scale with an increasing number of rings?

The following graph shows some measurements I made using iperf 2.0.5 between a pair of Debian 8.0 VMs both on a Dell R730xd host. The VMs each had 8 vCPUs, and iperf employed 8 threads each generating a separate TCP stream. The graph reports the sum of the 8 threads' throughputs, varying the number of queues configured on the guests' VIFs.


We can make several observations from this graph:

  • The throughput scales well up to four queues, with four queues achieving more than double the throughput possible with a single queue.
  • The blip at five queues probably arose when the hashing algorithm failed to spread the eight TCP streams evenly across the queues, and is thus a measurement artefact. With different TCP port numbers, this may not have happened.
  • While the throughput generally increases with an increasing number of queues, the throughput is not proportional to the number of rings. Ideally, the throughput would double when you double the number of rings. This doesn't happen in practice because the processing is not perfectly parallelisable: netfront needs to demultiplex the streams onto the rings, and there are some overheads due to locking and synchronisation between queues.

This graph also highlights the substantial improvement over XenServer 6.5, in which only one queue per VIF was supported. In this use-case of eight TCP streams, XenServer 7.0 achieves 41 Gb/s out-of-the-box where XenServer 6.5 could manage only 17 Gb/s – an improvement of 140%.

How many rings do I get by default?

By default the number of queues is limited by (a) the number of vCPUs the guest has and (b) the number of vCPUs dom0 has. A guest with four vCPUs will get four queues per VIF.

This is a sensible default, but if you want to manually override it, you can do so in the guest. In a Linux guest, add the parameter xen_netfront.max_queues=n, for some n, to the kernel command-line.
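As a quick sanity check of the default, the rule can be read as "the smaller of the two vCPU counts". The Python sketch below encodes that reading, which is an interpretation of the sentence above rather than a quote from the driver source, and also formats the kernel command-line argument mentioned for Linux guests. Both function names are illustrative.

```python
def default_queues(guest_vcpus, dom0_vcpus):
    """Queues per VIF, assuming the limit is the smaller vCPU count."""
    return min(guest_vcpus, dom0_vcpus)


def netfront_cmdline_arg(max_queues):
    """Kernel command-line argument that overrides the limit in a Linux guest."""
    if max_queues < 1:
        raise ValueError("max_queues must be at least 1")
    return f"xen_netfront.max_queues={max_queues}"


if __name__ == "__main__":
    print(default_queues(guest_vcpus=4, dom0_vcpus=16))
    print(netfront_cmdline_arg(2))
```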


XenServer 7.0 performance improvements part 1: Lower latency storage datapath

The XenServer team made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the first in a series of articles that will describe the principal improvements.

Our first topic is storage I/O performance. A performance improvement has been achieved through the adoption of a polling technique in tapdisk3, the component of XenServer responsible for handling I/O on virtual storage devices. Measurements of one particular use-case demonstrate a 50% increase in performance from 15,000 IOPS to 22,500 IOPS.

What is polling?

Normally, tapdisk3 operates in an event-driven manner. Here is a summary of the first few steps required when a VM wants to do some storage I/O:

  1. The VM's paravirtualised storage driver (called blkfront in Linux or xenvbd in Windows) puts a request in the ring it shares with dom0.
  2. It sends tapdisk3 a notification via an event-channel.
  3. This notification is delivered to domain 0 by Xen as an interrupt. If domain 0 is not running, it will need to be scheduled in order to receive the interrupt.
  4. When it receives the interrupt, the domain 0 kernel schedules the corresponding backend process, tapdisk3, to run.
  5. When tapdisk3 runs, it looks at the contents of the shared-memory ring.
  6. Finally, tapdisk3 finds the request which can then be transformed into a physical I/O request.

Polling is an alternative to this approach in which tapdisk3 repeatedly looks in the ring, speculatively checking for new requests. This means that steps 2–4 can be skipped: there is no need to wait for an event-channel interrupt or for the tapdisk3 process to be scheduled, because it is already running. tapdisk3 can therefore pick up the request much more promptly, avoiding the delays inherent in the event-driven approach.
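To make the saving concrete, here is a toy latency model of the six steps listed above. The per-step delays are invented purely for illustration (they are not measurements of XenServer); the only thing the sketch takes from the text is that polling removes steps 2 to 4 from the pickup path.

```python
# Hypothetical per-step delays in microseconds, keyed by the step numbers
# used in the list above. These values are illustrative assumptions only.
STEP_DELAYS_US = {
    1: 1.0,   # guest puts request in the shared ring
    2: 2.0,   # guest sends event-channel notification
    3: 5.0,   # Xen delivers the interrupt to domain 0
    4: 10.0,  # domain 0 kernel schedules tapdisk3
    5: 1.0,   # tapdisk3 looks at the ring
    6: 1.0,   # tapdisk3 finds the request
}

# Steps that polling makes unnecessary, as described in the text.
STEPS_SKIPPED_BY_POLLING = {2, 3, 4}


def pickup_latency_us(polling):
    """Sum the delays of the steps on the pickup path for the chosen mode."""
    return sum(
        delay
        for step, delay in STEP_DELAYS_US.items()
        if not (polling and step in STEPS_SKIPPED_BY_POLLING)
    )


event_driven = pickup_latency_us(polling=False)
polled = pickup_latency_us(polling=True)
print(f"event-driven pickup: {event_driven} us, polled pickup: {polled} us")
```

With real hardware the absolute numbers will differ, but the shape of the comparison is the same: the polled path only pays for the steps that touch the ring itself.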

The following diagram contrasts the timelines of these alternative approaches, showing how polling reduces the time until the request is picked up by the backend.


How does polling help improve storage I/O performance?

Polling is an established technique for reducing latency in event-driven systems. (One example of where it is used elsewhere to mitigate interrupt latency is in Linux networking drivers that use NAPI.)

Servicing I/O requests promptly is an essential part of optimising I/O performance. As I discussed in my talk at the 2015 Xen Project Developer Summit, reducing latency is the key to maintaining a low virtualisation overhead. As physical I/O devices get faster and faster, any latency incurred in the virtualisation layer becomes increasingly noticeable and translates into lower throughputs.

An I/O request from a VM has a long journey to physical storage and back again. Polling in tapdisk3 optimises one section of that journey.

Isn't polling really CPU intensive, and thus harmful?

Yes it is, so we need to handle it carefully. If left unchecked, polling could easily eat huge quantities of domain 0 CPU time, starving other processes and causing overall system performance to drop.

We have chosen to do two things to avoid consuming too much CPU time:

  1. Poll the ring only when there's a good chance of a request appearing. Of course, guest behaviour is totally unpredictable in general, but there are some principles that can increase our chances of polling at the right time. For example, one assumption we adopt is that it's worth polling for a short time after the guest issues an I/O request. It has issued one request, so there's a good chance that it will issue another soon after. And if this guess turns out to be correct, keep on polling for a bit longer in case any more turn up. If there are none for a while, stop polling and temporarily fall back to the event-based approach.
  2. Don't poll if domain 0 is already very busy. Since polling is expensive in terms of CPU cycles, we only enter the polling loop if we are sure that it won't starve other processes of CPU time they may need.
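The two rules above can be sketched as a small decision policy. This is a minimal sketch under stated assumptions, not tapdisk3's implementation: the class name, the 50-microsecond polling window, the 0.8 load threshold, and the idea of passing in a dom0 CPU load figure are all hypothetical.

```python
class PollingPolicy:
    """Decide whether to poll the ring or fall back to event-driven mode.

    Hypothetical model: window length and busy threshold are assumed values.
    """

    def __init__(self, poll_window_us=50.0, busy_threshold=0.8):
        self.poll_window_us = poll_window_us
        self.busy_threshold = busy_threshold
        self.poll_deadline_us = None  # None means event-driven mode

    def on_request(self, now_us, dom0_cpu_load):
        """A request just arrived: start or extend the polling window (rule 1)."""
        if dom0_cpu_load >= self.busy_threshold:
            # Rule 2: domain 0 is already busy, so do not poll at all.
            self.poll_deadline_us = None
        else:
            self.poll_deadline_us = now_us + self.poll_window_us

    def should_poll(self, now_us, dom0_cpu_load):
        """Return True while it is still worth spinning on the ring."""
        if self.poll_deadline_us is None:
            return False
        expired = now_us >= self.poll_deadline_us
        busy = dom0_cpu_load >= self.busy_threshold
        if expired or busy:
            # No requests for a while, or dom0 got busy: back to events.
            self.poll_deadline_us = None
            return False
        return True


policy = PollingPolicy()
policy.on_request(now_us=0.0, dom0_cpu_load=0.2)
print(policy.should_poll(now_us=10.0, dom0_cpu_load=0.2))
policy.on_request(now_us=40.0, dom0_cpu_load=0.2)   # another request: extend
print(policy.should_poll(now_us=80.0, dom0_cpu_load=0.2))
print(policy.should_poll(now_us=95.0, dom0_cpu_load=0.2))
```

Each new request pushes the deadline out, so a steady stream of I/O keeps the backend polling, while an idle guest or a busy domain 0 drops it back to the event-based path.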

How much faster does it go?

The benefit you will get from polling depends primarily on the latency of your physical storage device. If you are using an old mechanical hard-drive or an NFS share on a server on the other side of the planet, shaving a few microseconds off the journey through the virtualisation layer isn't going to make much of a difference. But on modern devices and low-latency network-based storage, polling can make a sizeable difference. This is especially true for smaller request sizes since these are the most latency-sensitive.

For example, the following graph shows an improvement of 50% in single-threaded sequential read I/O for small request sizes – from 15,000 IOPS to 22,500 IOPS. These measurements were made with iometer in a 32-bit Windows 7 SP1 VM on a Dell PowerEdge R730xd with an Intel P3700 NVMe drive.
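The figures above can be turned into per-request latencies with a little arithmetic. The sketch below assumes that "single-threaded" means only one request is outstanding at a time, so that the time per request is simply the reciprocal of the IOPS figure; that assumption is not stated in the measurements themselves, and the function names are illustrative.

```python
def percent_improvement(before, after):
    """Relative improvement of `after` over `before`, as a percentage."""
    return (after - before) / before * 100.0


def per_request_latency_us(iops):
    """Time per request in microseconds, assuming one outstanding request."""
    return 1_000_000 / iops


before_iops, after_iops = 15_000, 22_500
saving_us = per_request_latency_us(before_iops) - per_request_latency_us(after_iops)

print(f"improvement: {percent_improvement(before_iops, after_iops):.0f}%")
print(f"latency before: {per_request_latency_us(before_iops):.1f} us")
print(f"latency after:  {per_request_latency_us(after_iops):.1f} us")
print(f"saved per request: {saving_us:.1f} us")
```

Under that assumption, the 50% gain corresponds to roughly 22 microseconds saved on each request, which is why the effect is only visible on storage whose own latency is of a similar order.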


How was polling implemented?

The code to add polling to tapdisk3 can be found in the following set of commits:


About XenServer

XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world's largest clouds and enterprises.
Commercial support for XenServer is available from Citrix.