Virtualization Blog

Discussions and observations on virtualization.

CPU Feature Levelling in Dundee

As anyone with more than one type of server is no doubt aware, CPU feature levelling is the method by which we try to make it safe for VMs to migrate. If each host in a pool is identical, this is easy. However if the pool has non-identical hardware, we must hide the differences so that a VM which is migrated continues without crashing.

I don't think I am going to surprise anyone by admitting that the old way XenServer did feature levelling was clunky and awkward to use. Because of a step change introduced in Intel Ivy Bridge CPUs, feature levelling also ceased to work correctly. As a result, we took the time to redesign it, and the results are available for use in Dundee beta 2.

Background

When a VM boots, the kernel looks at the available featureset and, in general, turns on as much as it knows how to. Linux will go so far as to binary patch some of the hotpaths for performance reasons. Userspace libraries frequently have the same algorithm compiled multiple times, and will select the best one to use at startup, based on which CPU instructions are available.
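
To make this concrete, here is a minimal sketch, in Python, of the kind of startup dispatch userspace libraries perform. It reads the flags line from /proc/cpuinfo on Linux; the function and implementation names are purely illustrative:

    # Illustrative startup dispatch: pick the best implementation
    # available on this CPU. Names are hypothetical.
    def cpu_flags():
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    def pick_memcpy(flags):
        if "avx2" in flags:
            return "memcpy_avx2"     # widest vectors available
        if "sse2" in flags:
            return "memcpy_sse2"     # x86-64 baseline
        return "memcpy_generic"

    print(pick_memcpy(cpu_flags()))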

On a bare metal system, the featureset will not change while the OS is running, and the same expectation exists in the virtual world. Migration introduces a problem with this expectation; it is literally like unplugging the hard drive and RAM from one computer, plugging it into another, and letting the OS continue from where it was. For the VM not to crash, all the features it is using must continue to work at the destination; in other words, features which are in use must not disappear at any point.

Therefore, the principle of feature levelling is to calculate the common subset of features available across the pool, and restrict the VM to this featureset. That way the VM's featureset will always work, no matter which pool member it is currently running on.
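
As a sketch of the principle (with made-up feature words), treating each host's featureset as an array of 32-bit CPUID feature words, the pool featureset is simply the bitwise AND across all hosts:

    # Pool featureset = intersection (bitwise AND) of every host's
    # featureset. The feature words below are invented for illustration.
    def pool_featureset(host_featuresets):
        words = list(host_featuresets[0])
        for fs in host_featuresets[1:]:
            words = [a & b for a, b in zip(words, fs)]
        return words

    hosts = [
        [0x178bfbff, 0x7ffafbbf],  # newer CPU: more bits set
        [0x0408e3bd, 0x00000001],  # older CPU
    ]
    print(["%08x" % w for w in pool_featureset(hosts)])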

Hardware

There are many factors affecting the available featureset for VMs to use:

  • The CPU itself
  • The BIOS/firmware settings
  • The hypervisor command line parameters
  • The restrictions which the toolstack chooses to apply

It is also worth noting that firmware/software updates can reduce the featureset (e.g. Intel releasing a microcode update to disable TSX on Haswell/Broadwell processors).

Hiding features is also tricky; x86 provides no architectural means to do so. Feature levelling is therefore implemented using vendor-specific extensions, which are documented as unstable interfaces and liable to change at any point in the future (as happened with Ivy Bridge).

XenServer Pre Dundee

Older versions of XenServer required a new pool member to be identical before it would be permitted to join. Making this happen involved consulting the features field of xe host-cpu-info, divining some command-line parameters for Xen, and rebooting.

If everything went to plan, the new slave would be permitted to join the pool. Once a pool had been created, it was assumed to be homogeneous from that point on. The command line parameters affected the entire host, including Xen itself. In a slightly heterogeneous case, the differing features tended to be ones which only Xen would care to use, so Xen was needlessly penalised along with dom0.
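
For the record, the arithmetic an administrator effectively had to perform by hand looked something like the sketch below: AND together the dash-separated hex words of each host's features field, and feed the result to parameters along the lines of Xen's cpuid_mask_* options. The field layout varies by release, and the values here are invented:

    from functools import reduce

    def parse_features(s):
        return [int(word, 16) for word in s.split("-")]

    def common_mask(feature_strings):
        """AND the corresponding words of each host's features field."""
        parsed = [parse_features(s) for s in feature_strings]
        return "-".join("%08x" % reduce(lambda a, b: a & b, words)
                        for words in zip(*parsed))

    print(common_mask(["0408e3bd-bfebfbff-00000001-28100800",
                       "000ce3bd-bfebfbff-00000001-20100800"]))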

XenServer Dundee

In Dundee, we have made some changes:

There are two featuresets rather than one. PV and HVM guests are fundamentally different types of virtualisation, and come with different restrictions and abilities. HVM guests will necessarily have a larger potential featureset than PV, and having a single featureset which is safe for PV guests would apply unnecessary restrictions to HVM guests.

The featuresets are recalculated and updated every boot. Assuming that servers stay the same after initial configuration is unsafe, and incorrect.

So long as the CPU vendor is the same (i.e. all Intel or all AMD), a pool join will be permitted, irrespective of the available features. The pool featuresets are dynamically recalculated every time a slave joins or leaves a pool, and every time a pool member reconnects after a reboot.
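
A sketch of that bookkeeping, with hypothetical structures, combining this with the two-featureset point above: the pool keeps separate PV and HVM featuresets and recomputes both from the current membership whenever a host joins, leaves, or reconnects:

    from functools import reduce

    def intersect(sets):
        return reduce(lambda a, b: a & b, sets)

    class Pool:
        def __init__(self):
            self.hosts = {}        # host uuid -> {"PV": set, "HVM": set}
            self.featuresets = {}  # "PV"/"HVM" -> pool-wide featureset

        def recalculate(self):
            for kind in ("PV", "HVM"):
                self.featuresets[kind] = intersect(
                    [h[kind] for h in self.hosts.values()])

        def join(self, uuid, featuresets):
            self.hosts[uuid] = featuresets
            self.recalculate()     # likewise on leave and reconnect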

When a VM is started, it will be given the current pool featureset. This permits it to move anywhere in the pool, as the pool existed when it started. Changes in pool level have no effect on running VMs. Their featureset is fixed at boot (and is fixed across migrate, suspend/resume, etc.), which matches the expectation of the OS. (One release note is that to update the VM featureset, it must be shut down fully and restarted. This is contrary to what would be expected with a plain reboot, and exists because of some metadata caching in the toolstack which is proving hard to untangle.)

Migration safety checks are performed between the VM's fixed featureset and the destination host's featureset. This way, even if the pool level drops because a less capable slave joined, an already-running VM will still be able to move anywhere except the new, less capable slave.
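
A minimal sketch of that check, again with invented feature words: the VM's featureset, fixed when it started, must be a subset of the destination host's featureset.

    # Safe iff no bit the VM was given is missing on the destination.
    def can_migrate(vm_featureset, dest_featureset):
        return all(vm & ~host == 0
                   for vm, host in zip(vm_featureset, dest_featureset))

    vm  = [0x0408e3bd, 0x00000001]  # fixed at VM start
    old = [0x178bfbff, 0x00000003]  # existing, capable host
    new = [0x0008c3bd, 0x00000001]  # less capable new slave
    print(can_migrate(vm, old), can_migrate(vm, new))  # True False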

The method of hiding features from VMs now has no effect on Xen and dom0. They never migrate, and will have access to all the available features (albeit with dom0 subject to being a VM in the first place).

Conclusion

We hope that the changes listed above will make Dundee far easier to use in a heterogeneous setup.

All of this information applies equally to inter-pool migration (Storage XenMotion) and to intra-pool migration. In the case of upgrade from older versions of XenServer (Rolling Pool Upgrade, Storage XenMotion again), there is a complication because of having to fill in some gaps in the older toolstack's idea of the VM's featureset. In such a case, an incoming VM is assumed to have the featureset of the host it lands on, rather than the pool level. This matches the older logic (and is the only safe course of action), but does result in the VM possibly having a higher featureset than the pool level, and being less mobile as a result. Once the VM has been shut down and started back up again, it will be given the regular pool level and behave normally.
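
In pseudocode terms, that upgrade rule amounts to the following sketch (the fields are hypothetical):

    # A VM arriving from a pre-Dundee toolstack has no recorded
    # featureset, so it pessimistically inherits that of the host it
    # lands on; a full shutdown and start replaces this with the
    # pool level.
    def featureset_for_incoming(vm_recorded_featureset, landing_host):
        if vm_recorded_featureset is None:   # migrated in from pre-Dundee
            return landing_host.featureset
        return vm_recorded_featureset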

To summarise the two key points:

  • We expect feature levelling to work between any combination of CPUs from the same vendor (with the exception of PV guests on pre-Nehalem CPUs, which lacked any masking capabilities whatsoever).
  • We expect there to be no need for any manual feature configuration to join a pool together.

That being said, this is a beta release and I highly encourage you to try it out and comment/report back.

Storage XenMotion in Dundee

Before we dive into the enhancements to the Storage XenMotion (SXM) feature in the XenServer Dundee Alpha 2 release, here is a refresher on the various VM migrations supported in XenServer and how users can leverage them for different use cases.

XenMotion refers to the live migration of a VM (whose disks reside on shared storage) from one host to another within a pool, with minimal downtime. Hence, with XenMotion we are moving the VM without touching its disks. For example, XenMotion is very helpful during host and pool maintenance, and is used by XenServer features such as Workload Balancing (WLB) and Rolling Pool Upgrade (RPU), where VMs residing on shared storage can be moved to another host within a pool.

Storage XenMotion is the marketing name given to two distinct XenServer features: live storage migration and VDI move. Both features refer to the movement of a VM's disk (VDI) from one storage repository to another; live storage migration also includes the movement of the running VM from one host to another. In the initial state, the VM's disk can reside either on local storage of the host or on shared storage. It can then be motioned either to local storage of another host or to shared storage of a pool (or standalone host). The following classifications exist for SXM (a sketch after the list makes the distinctions concrete):

  • When the source and destination hosts happen to be part of the same pool, we refer to it as IntraPool SXM. You can choose to migrate the VM's VDIs to local storage of the destination host or to another shared storage in the pool. For example, leverage it when you need to live migrate VMs residing on a slave host to the master of a pool.
  • When the source and destination hosts happen to be part of different pools, we refer to it as CrossPool SXM. VM migration between two standalone hosts can also be regarded as CrossPool SXM. You can choose to migrate the VM's VDIs to local storage of the destination host or to shared storage of the destination pool. For example, I often leverage CrossPool SXM when I need to migrate VMs from a pool running an old version of XenServer to a pool running the latest XenServer.
  • When the source and destination hosts are the same, we refer to it as LiveVDI Move. For example, leverage LiveVDI Move when you want to upgrade your storage arrays but don't want to move the VMs to another host. In such cases, you can LiveVDI move a VM's VDI, say, from the shared storage which is planned to be upgraded to another shared storage in the pool, without taking down your VMs.
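
Here is the sketch promised above, classifying a migration request; the record fields are hypothetical:

    # Decide which kind of SXM a request is, from source and
    # destination host records.
    def classify(src_host, dst_host):
        if src_host["uuid"] == dst_host["uuid"]:
            return "LiveVDI Move"    # same host: only the disks move
        if src_host["pool"] == dst_host["pool"]:
            return "IntraPool SXM"
        return "CrossPool SXM"       # includes standalone-to-standalone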

When SXM was first rolled out in XenServer 6.1, there were some restrictions on VMs before they could be motioned: a limit on the number of snapshots a VM could have while undergoing SXM, VMs with checkpoints could not be motioned, the VM had to be running (otherwise it is not live migration), and so on. For the XenServer Dundee Alpha 2 release, the XenServer team has removed some of those constraints. Below is the list of enhancements brought to SXM.

1. A VM can be motioned regardless of its power state. Therefore I can successfully migrate a suspended VM, or move a halted VM, within a pool (intra-pool) or across pools (cross-pool).

2. A VM can have more than one snapshot and checkpoint during an SXM operation. Thus a VM can have a mix of snapshots and checkpoints and still be successfully motioned.

3. A halted VM can be copied from, say, pool A to pool B (cross-pool copy). Note that VMs that are not in the halted state cannot be cross-pool copied.

4. User-created templates can also be copied from, say, pool A to pool B (cross-pool copy). Note that system templates cannot be cross-pool copied.
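
The cross-pool copy preconditions in points 3 and 4 amount to a check like this sketch (field names hypothetical):

    # Cross-pool copy is only allowed for halted VMs and for
    # user-created (not system) templates.
    def can_cross_pool_copy(obj):
        if obj["is_template"]:
            return not obj["is_system_template"]
        return obj["power_state"] == "Halted"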

Well, this is not the end of the SXM improvements! Stay tuned: in the upcoming release we aim to reduce VM downtime further during migration operations. In the meantime, do download the Dundee preview and try it out for yourself.
