Virtualization Blog

Discussions and observations on virtualization.

Creating backups with XenServer

Backup is an essential part of the business workflow for many of our customers, be it SMB, Enterprise Server Virtualisation or Virtual Desktop Infrastructure. Making the backup experience smoother is high on our wishlist at XenServer Engineering, and the improved VM import/export performance delivered in XS 7.1 shows our commitment to that end. To continue improving the services that support the backup ecosystem, we would like to better understand how you use backup with XenServer:

 

  • How often do you back up? Do you have multiple jobs for monthly, weekly and daily backups?

  • How do you create your backups?

    • Use VM Export to back up VM metadata + disks

    • Snapshot at the VM level and use transfer/service VM to read off the snapshots

    • Use vdi-export to create differential disks (.vhd)

  • Do you use a third-party vendor for handling your backups?

  • Would support for incremental backups be useful for your use case?

Please leave a comment with your answers and any issues you may have with your backup experience today. We look forward to hearing from you!

Thank you,

Chandrika

 


XenCenter 7.1 update now available!

A hotfix (XS71E001) has been released for customers using XenCenter as the management console for their XenServer 7.1 virtual environments. 

This hotfix offers improvements in XenCenter UI responsiveness, as well as several fixes associated with host health check analysis, status reports and updates. Additional information pertaining to this hotfix can be found here.

As always, we encourage customers to read the hotfix release notes and install the hotfix to avoid any of the issues described in the notes.

 


Introducing... XenServer 7.1!

We are pleased to announce the release of XenServer 7.1!

Click here to learn about the new features and enhancements available in 7.1.

As is customary with every new release, we encourage you to give v7.1 a spin and report any issues via https://bugs.xenserver.org.

Note: We ask that you target this release exclusively for new defect reports[*].

Thank you and enjoy the latest release!

[*] If you have problems with a release earlier than XS v7.0 and are not covered by paid support, we recommend upgrading to the XS v7.x series.

 

 

 

Recent Comments
Andrew Halley
See here for which features are available in which versions : https://www.citrix.com/content/dam/citrix/en_us/documents/product-o... Read More
Friday, 24 February 2017 17:41
Christian
Hey, Great news! The Download links still reflect 7.0 release, tho. Any chance to get a download link? -Chris.... Read More
Friday, 24 February 2017 22:07
Andrew Halley
We're working on it - now done!
Monday, 27 February 2017 16:51

Staying Ahead of the Curve

Are you looking to improve the performance of your virtual servers and desktops?

Could your hypervisor use a boost when it comes to supporting graphics-intense applications?

Are you in need of an advanced security technology that offers a unique way of detecting and blocking sophisticated attacks against your data center before they cause any damage to your business?

Would you like to simplify the maintenance of your hosting infrastructure?

Does the idea of optimizing the performance, scalability, management and cost-savings of your application and desktop delivery solutions through the combination of an industry-leading hypervisor and industry-leading HCI platforms sound interesting to you?

Would you feel more comfortable knowing your hosting infrastructure was fully-supported for the next 10 years?

If you answered "yes" to any of the above, click here to learn more!

Until next time,

Andy

 


XenServer High-Availability Alternative HA-Lizard

WHY HA AND WHAT IT DOES

XenServer (XS) contains a native high-availability (HA) option which allows quite a bit of flexibility in determining the state of a pool of hosts and under what circumstances Virtual Machines (VMs) are to be restarted on alternative hosts in the event of the loss of the ability of a host to be able to serve VMs. HA is a very useful feature that protects VMs from staying failed in the event of a server crash or other incident that makes VMs inaccessible. Allowing a XS pool to help itself maintain the functionality of VMs is an important feature and one that plays a large role in sustaining as much uptime as possible. Permitting the servers to automatically deal with fail-overs makes system administration easier and allows for more rapid reaction times to incidents, leading to increased up-time for servers and the applications they run.

XS allows for the designation of three different treatments of Virtual Machines: (1) always restart, (2) restart if possible, and (3) do not restart. The VMs designated with the highest restart priority will be the first to be restarted, and all of them will be handled provided adequate resources (primarily host memory) are available. A specific start order, in which some VMs are checked to be running before others are started, can also be established. VMs will be automatically distributed among whatever remaining XS hosts are considered active. Where necessary, VMs configured with dynamic memory will be shrunk to make room for additional VMs, and the VMs being restarted will also be run with reduced memory. If additional capacity exists to run more VMs, those designated as “restart if possible” will be brought online. VMs that are not considered essential will typically be marked as “do not restart” and hence will be left off even if they had been running before; any of those that are wanted have to be restarted manually, resources permitting.
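To make the ordering concrete, here is a minimal Python sketch of how a restart plan could be derived from the three treatments described above. It is an illustration only, not XenServer's actual HA planner: the `VM` record, the priority labels and the single pooled free-memory figure are all assumptions, and real HA also has to consider per-host placement and dynamic memory.

```python
from dataclasses import dataclass

# Hypothetical labels mirroring the three treatments in the text.
ALWAYS = "always restart"
IF_POSSIBLE = "restart if possible"
NEVER = "do not restart"


@dataclass
class VM:
    name: str
    priority: str
    memory_mib: int
    order: int = 0  # lower values start first within a priority class


def plan_restarts(vms, free_memory_mib):
    """Return (to_restart, left_off) given the free memory left in the pool."""
    rank = {ALWAYS: 0, IF_POSSIBLE: 1}
    candidates = sorted(
        (vm for vm in vms if vm.priority in rank),
        key=lambda vm: (rank[vm.priority], vm.order),
    )
    to_restart = []
    left_off = [vm.name for vm in vms if vm.priority == NEVER]
    for vm in candidates:
        if vm.memory_mib <= free_memory_mib:
            free_memory_mib -= vm.memory_mib
            to_restart.append(vm.name)
        else:
            left_off.append(vm.name)  # not enough memory left for this VM
    return to_restart, left_off


vms = [
    VM("web", IF_POSSIBLE, 2048),
    VM("db", ALWAYS, 4096, order=0),
    VM("app", ALWAYS, 4096, order=1),
    VM("test", NEVER, 1024),
]
restarted, skipped = plan_restarts(vms, free_memory_mib=9000)
print(restarted, skipped)
```

In this made-up pool the two "always restart" VMs consume most of the free memory, so the "restart if possible" VM is left off along with the one marked "do not restart".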

XS also allows for specifying the minimum number of active hosts to remain to accommodate failures; larger pools that are not overly populated with VMs can readily accommodate even two or more host failures.

The election of which hosts are “live” and should be considered active members of the pool follows a rather involved process that combines network accessibility with access to an independent, designated pooled Storage Repository (SR), which serves as an additional metric. The pooled SR can also be a Fibre Channel device, making it independent of Ethernet connections. A quorum-based algorithm is applied to establish which servers are up and active as members of the pool and which -- in the event of a pool master failure -- should be elected the new pool master.

 

WHEN HA WORKS, IT WORKS GREAT

Without going into more detail, suffice it to say that this methodology works very well, but it has a few prerequisites that need to be taken into consideration. First of all, the mandate that a pooled storage device be available clearly means that a pool consisting of hosts that only make use of local storage is precluded. Second, for a quorum to be possible, the pool must contain a minimum of three hosts; otherwise HA results will be unpredictable, as the election of a pool master can become ambiguous. This comes about because of the so-called “split brain” issue (http://linux-ha.org/wiki/Split_Brain), which is endemic in many different operating system environments that employ a quorum as a means of making such a decision. Furthermore, while fencing (the process of isolating the host; see for example http://linux-ha.org/wiki/Fencing) is the typical recourse, the lack of intercommunication can result in a wrong decision being made and hence in loss of access to VMs. Having experimented with two-host pools and the native XenServer HA, I would say that an estimate of it working about half the time is about right and, from a statistical viewpoint, pretty much what you would expect.

This limitation is, however, still of immediate concern to those with either no pooled storage and/or only two hosts in a pool. With a little bit of extra network connectivity, a relatively simple and inexpensive solution to the external SR can be provided by making a very small NFS-based SR available. The second condition, however, is not readily rectified without the expense of at least one additional host and all the connectivity associated with it. In some cases, this may simply not be an affordable option.
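The three-host minimum follows from simple majority arithmetic, which the short Python sketch below illustrates. It shows a generic strict-majority rule, not the actual XenServer HA algorithm (which also takes the heartbeat SR into account); the function name is made up for the example.

```python
def has_quorum(visible_hosts: int, pool_size: int) -> bool:
    """A partition may act only if it sees a strict majority of the pool."""
    return visible_hosts > pool_size // 2


# Two-host pool split down the middle: each side sees only itself.
two_host_split = has_quorum(1, 2)

# Three-host pool split 2/1: exactly one side keeps a majority.
three_host_split = (has_quorum(2, 3), has_quorum(1, 3))

print(two_host_split, three_host_split)
```

With two hosts, neither side of a split can ever hold a majority, so any decision about who becomes master is a guess; with three hosts, one side always can.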

 

ENTER HA-LIZARD

For a number of years now, an alternative method of providing HA has been available through HA-Lizard (http://www.halizard.com/), a free community project that neither depends on external SRs nor requires a minimum of three hosts within a pool. This blog focuses on the standard HA-Lizard version and, because it is the particularly hard-to-handle case, on the two-node pool.

I had been experimenting for some time with HA-Lizard and found in particular that I was able to create failure scenarios that needed some improvement. HA-Lizard’s Salvatore Costantino was more than willing to lend an ear to the cases I had found and this led further to a very productive collaboration on investigating and implementing means to deal with a number of specific cases involving two-host pools. The result of these several months of efforts is a new HA-Lizard release that manages to address a number of additional scenarios above and beyond its earlier capabilities.

It is worthwhile mentioning that there are two ways of deploying HA-Lizard:

1) Most use cases combine HA-Lizard and iSCSI-HA, which creates a two-node pool using local storage while maintaining full VM agility, with VMs being able to run on either host. This type of deployment uses DRBD (http://www.drbd.org/) for real-time storage replication, and it works very well.

2) HA-Lizard, only, is used with an external Storage Repository (as in this particular case).

Before going into details of the investigation, a few words should go towards a brief explanation of how this works. Note that there is only Internet connectivity (the use of a heuristic network node) and no external SR, so how is a split brain situation then avoidable?

This is how I'd describe the course of action in this two-node situation:

If a node sees the gateway, assume it's alive. If it cannot, assume it's a good candidate for fencing. If the node that cannot see the gateway is the master, it should internally kill any running VMs and surrender its ability to be the master and fence itself. The slave node should promote itself to master and attempt to restart any missing VMs. Any that are on the previous master will probably fail though, because there is no communication to the old master. If the old VMs cannot be restarted, eventually the new master will be able to restart them regardless after a toolstack restart. If the slave node fails by not being able to communicate with the network, as long as the master still sees the network and not the slave’s network, it can assume the slave needs to fence itself, kill off its VMs and assume that they will be restarted on the current master. The slave needs to realize it cannot communicate out, and therefore should kill off any of its VMs and fence itself.

Naturally, the trickier part comes with the timing of the various actions, since each node has to blindly assume the other is going to conduct a sequence of events. The key here is that these are all agreed on ahead of time and as long as each follows its own specific instructions, it should not matter that each of the two nodes cannot see the other node. In essence, the lack of communication in this case allows for creating a very specific course of action! If both nodes fail, obviously the case is hopeless, but that would be true of any HA configuration in which no node is left standing.
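The course of action described above can be sketched as a small decision function. This is only my reading of the prose, written as a minimal Python illustration; it is not HA-Lizard's code, the action strings are made-up labels rather than real commands, and all timing concerns are ignored.

```python
def two_node_actions(role: str, sees_gateway: bool, sees_peer: bool) -> list:
    """Actions one node takes on its own, without talking to the other node.

    role is "master" or "slave". The returned strings are illustrative
    labels only, not HA-Lizard commands.
    """
    if role not in ("master", "slave"):
        raise ValueError(f"unknown role: {role!r}")
    if not sees_gateway:
        # An isolated node assumes the other side will take over its VMs.
        actions = ["kill local VMs"]
        if role == "master":
            actions.append("surrender master role")
        actions.append("fence self")
        return actions
    if sees_peer:
        return ["continue normally"]
    # Gateway reachable but peer silent: this node is the survivor.
    if role == "slave":
        return ["promote self to master", "restart missing VMs"]
    return ["restart missing VMs"]


print(two_node_actions("master", sees_gateway=False, sees_peer=False))
print(two_node_actions("slave", sees_gateway=True, sees_peer=False))
```

The point of the sketch is that every branch depends only on what the node itself can observe, which is what allows both nodes to follow a pre-agreed script while unable to see each other.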

Various test plans were worked out for various cases and the table below elucidates the different test scenarios, what was expected and what was actually observed. It is very encouraging that the vast majority of these cases can now be properly handled.

 

Particularly tricky here was the case of rebooting the master server from the shell, without first disabling HA-Lizard (something one could readily forget to do). Since the fail-over process takes a while, a large number of VMs cannot be handled before the communication breakdown takes place, hence one is left with a bit of a mess to clean up in the end. Nevertheless, it’s still good to know what happens if something takes place that rightfully shouldn’t!

The other cases, whether intentional or not, are handled predictably and reliably, which is of course the intent. Typically, a two-node pool isn’t going to have a lot of complex VM dependencies, so the lack of a start order of VMs should not be perceived as a big shortcoming. Support for this feature may even be added in a future release.

 

CONCLUSIONS

HA-Lizard is a viable alternative to the native Citrix HA configuration. It’s straightforward to set up and can handle standard failover cases with a selective “restart/do not restart” setting for each VM, or it can be configured globally. There are quite a number of configuration parameters, which the reader is encouraged to research in the extensive HA-Lizard documentation. There is also an on-line forum which serves as a source for information and prompt assistance with issues. The most recent release, 2.1.3, is supported on both XenServer 6.5 and 7.0.

Above all, HA-Lizard shines when it comes to handling a non-pooled storage environment and, in particular, all configurations of the dreaded two-node pool. From my direct experience, HA-Lizard now handles the vast majority of issues involved in a two-node pool and can do so more reliably than the non-supported two-node pool using Citrix's own HA application. It has been possible to conduct a lot of tests with various cases and, importantly, to do so multiple times to ensure the actions are predictable and repeatable.

I would encourage taking a look at HA-Lizard and giving it a good test run. The software is free (contributions are accepted) and it is in extensive use and has a proven track record.  For a two-host pool, I can frankly not think of a better alternative, especially with these latest improvements and enhancements.

I would also like to thank Salvatore Costantino for the opportunity to participate in this investigation and am very pleased to see the fruits of this collaboration. It has been one way of contributing to the Citrix XenServer user community that many can immediately benefit from.

 

 

 

 

 

 

Recent comment in this post
JK Benedict
I hath no idea why more have not read this intense article! As always: bravo, sir! BRAVO!
Wednesday, 04 January 2017 12:43

PCI Pass-Through on XenServer 7.0

Plenty of people have asked me over the years how to pass-through generic PCI devices to virtual machines running on XenServer. Whilst it isn't officially supported by Citrix, it's none the less perfectly possible to do; just note that your mileage may vary, because clearly it's not rigorously tested with all the possible different types of device people might want to pass-through (from TV cards, to storage controllers, to USB hubs...!).

The process on XenServer 7.0 differs somewhat from previous releases, in that the Dom0 control domain is now CentOS 7.0-based, and UEFI boot (in addition to BIOS boot) is supported. Hence, I thought it would be worth writing up the latest instructions, for those who are feeling adventurous.

Of course, XenServer officially supports pass-through of GPUs to both Windows and Linux VMs, hence this territory isn't as uncharted as might first appear: pass-through in itself is fine. The wrinkles will be to do with a particular given piece of hardware.

A Short Introduction to PCI Pass-Through

Firstly, a little primer on what we're trying to do.

Your host will have a PCI bus, with multiple devices hosted on it, each with its own unique ID on the bus (more on that later; just remember this as "B:D.f"). In addition, each device has a globally unique vendor ID and device ID, which allows the operating system to look up what its human-readable name is in the PCI IDs database text file on the system. For example, vendor ID 10de corresponds to the NVIDIA Corporation, and device ID 11b4 corresponds to the Quadro K4200. Each device can then (optionally) have multiple sub-vendor and sub-device IDs, e.g. if an OEM has its own branded version of a supplier's component.

Normally, XenServer's control domain, Dom0, is given all PCI devices by the Xen hypervisor. Drivers in the Linux kernel running in Dom0 each bind to particular PCI device IDs, and thus make the hardware actually do something. XenServer then provides synthetic devices (emulated or para-virtualised) such as SCSI controllers and network cards to the virtual machines, passing the I/O through Dom0 and then out to the real hardware devices.

This is great, because it means the VMs never see the real hardware, and thus we can live migrate VMs around, or start them up on different physical machines, and the virtualised operating systems will be none the wiser.

If, however, we want to give a VM direct access to a piece of hardware, we need to do something different. The main reason one might want to is because the hardware in question isn't easy to virtualise, i.e. the hypervisor can't provide a synthetic device to a VM, and somehow then "share out" the real hardware between those synthetic devices. This is the case for everything from an SSL offload card to a GPU.

Aside: Virtual Functions

There are three ways of sharing out a PCI device between VMs. The first is what XenServer does for network cards and storage controllers, where a synthetic device is given to the VM, but then the I/O streams can effectively be mixed together on the real device (e.g. it doesn't matter that traffic from multiple VMs is streamed out of the same physical network card: that's what will end up happening at a physical switch anyway). That's fine if it's I/O you're dealing with.

The second is to use software to share out the device. Effectively you have some kind of "manager" of the hardware device that is responsible for sharing it between multiple virtual machines, as is done with NVIDIA GRID GPU virtualisation, where each VM still ends up with a real slice of GPU hardware, but controlled by a process in Dom0.

The third is to virtualise at the hardware device level, and have a PCI device expose multiple virtual functions (VFs). Each VF provides some subset of the functionality of the device, isolated from other VFs at the hardware level. Several VMs can then each be given their own VF (using exactly the same mechanism as passing through an entire PCI device). A couple of examples are certain Intel network cards, and AMD's MxGPU technology.

OK, So How Do I Pass-Through a Device?

Step 1

Firstly, we have to stop any driver in Dom0 claiming the device. In order to do that, we'll need to ascertain what the ID of the device we're interested in passing through is. We'll use B:D.f (Bus, Device, function) numbering to specify it.

Running lspci will tell you what's in your system:

davidcot@helical:~$ lspci
00:00.0 Host bridge: Intel Corporation 82X38/X48 Express DRAM Controller
00:01.0 PCI bridge: Intel Corporation 82X38/X48 Express Host-Primary PCI Express Bridge
00:06.0 PCI bridge: Intel Corporation 82X38/X48 Express Host-Secondary PCI Express Bridge
00:1a.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 290] (rev a1)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express (rev 02)

Once you've found the device you're interested in, say 04:00.0 for my network card, tell Dom0 not to let its normal drivers bind to it. You can add this to the Dom0 boot line as follows:

/opt/xensource/libexec/xen-cmdline --set-dom0 "xen-pciback.hide=(04:00.0)"

(What this does is edit /boot/grub/grub.cfg for you, or if you're booting using UEFI, /boot/efi/EFI/xenserver/grub.cfg instead!)
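If you have several candidate devices, picking the B:D.f out of the lspci listing and assembling the hide option can be scripted. The Python sketch below is one way to do it under a few assumptions: the helper names are made up, it only handles lspci lines without a PCI domain prefix (as in the listing above), and chaining several devices as "(a)(b)" is my understanding of the xen-pciback option syntax rather than something stated in this post.

```python
import re
import shlex

# Matches the "bus:device.function" prefix of an lspci line, e.g. "04:00.0".
BDF_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$", re.IGNORECASE)


def find_devices(lspci_output: str, keyword: str) -> list:
    """Return the B:D.f of every lspci line whose description contains keyword."""
    matches = []
    for line in lspci_output.splitlines():
        m = BDF_RE.match(line.strip())
        if m and keyword.lower() in m.group(2).lower():
            matches.append(m.group(1))
    return matches


def hide_command(bdfs) -> str:
    """Build the xen-cmdline invocation for one or more devices."""
    hide = "".join(f"({bdf})" for bdf in bdfs)
    option = shlex.quote(f"xen-pciback.hide={hide}")
    return f"/opt/xensource/libexec/xen-cmdline --set-dom0 {option}"


sample = """\
01:00.0 VGA compatible controller: NVIDIA Corporation G86 [Quadro NVS 290] (rev a1)
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express (rev 02)
"""
nics = find_devices(sample, "Ethernet controller")
print(hide_command(nics))
```

The script only prints the command; review it before running it on the host, since hiding the wrong device (for example the management NIC or the boot storage controller) will leave Dom0 without it after the reboot.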

Step 2

Reboot! At the moment, a driver in Dom0 probably still has hold of your device, hence you need to reboot the host to get it relinquished.

Step 3

The easy bit: tell the toolstack to assign the PCI device to the VM. Run:

xe vm-list

And note the UUID of the VM you're interested in, then:

xe vm-param-set other-config:pci=0/0000:<B:D.f> uuid=<vm uuid>

Where, of course, <B:D.f> is the ID of the device you found in step 1 (like 04:00.0), and <vm uuid> corresponds to the VM you care about.
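Because a typo in either value silently produces a VM that starts without the device, it can be worth validating both before calling xe. The following Python sketch builds the argument list for the command above; the function name is made up, and the "0/0000:" prefix is copied unchanged from the command shown in this step, so it assumes the device sits in PCI domain 0000.

```python
import re
import uuid


def pci_assign_args(bdf: str, vm_uuid: str) -> list:
    """Return the argv for the xe vm-param-set call in step 3."""
    if not re.fullmatch(r"[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]", bdf):
        raise ValueError(f"not a B:D.f address: {bdf!r}")
    uuid.UUID(vm_uuid)  # raises ValueError if the UUID is malformed
    return [
        "xe",
        "vm-param-set",
        f"other-config:pci=0/0000:{bdf}",
        f"uuid={vm_uuid}",
    ]


# Hypothetical VM UUID, for illustration only.
args = pci_assign_args("04:00.0", "01234567-89ab-cdef-0123-456789abcdef")
print(" ".join(args))
```

The list form can be handed directly to `subprocess.run(args, check=True)` on the XenServer host, which avoids any shell quoting issues.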

Step 4

Start your VM. At this point if you run lspci (or equivalent) within the VM, you should now see the device. However, that doesn't mean it will spring into life, because...

Step 5

Install a device driver for the piece of hardware you passed through. The operating system within the VM may already ship with a suitable device driver, but if not, you'll need to go and get the appropriate one from the device manufacturer. This will normally be the standard Linux/Windows/other one that you would use for a physical system; the only difference occurs when you're using a virtual function, where the VF driver is likely to be a special one.

Health Warnings

As indicated above, pass-through has advantages and disadvantages. You'll get direct access to the hardware (and hence, for some functions, higher performance), but you'll forgo luxuries such as the ability to live migrate the virtual machine around (there's state now sitting on real hardware, versus virtual devices), and the ability to use high availability for that VM (because HA doesn't take into account how many free PCI devices of the right sort you have in your resource pool).

In addition, not all PCI devices take well to being passed through, and not all servers like doing so (e.g. if you're extending the PCI bus in a blade system to an expansion module, this can sometimes cause problems). Your mileage may therefore vary.

If you do get stuck, head over to the XenServer discussion forums and people will try to help out, but just note that Citrix doesn't officially support generic PCI pass-through, hence you're in the hands of the (very knowledgeable) community.

Conclusion

Hopefully this has helped clear up how pass-through is done on XenServer 7.0; do comment and let us know how you're using pass-through in your environment, so that we can learn what people want to do, and think about what to officially support on XenServer in the future!

Recent Comments
Tobias Kreidl
Yay, great to see this published in clear, concise steps! This is one for the XenServer forum to point to! ... Read More
Saturday, 05 November 2016 03:38
David Cottingham
If you want both the audio and GPU devices given to your VM, then yes, you need to use the procedure once for each device. Howeve... Read More
Monday, 07 November 2016 10:18
David Cottingham
Understood: it would be a performance gain for at least some use cases, as you're getting raw access to the NIC. The downside is t... Read More
Monday, 07 November 2016 10:06

Better Together: PVS and XenServer!

XenServer adds new functionality to further simplify and enhance the secure and on-demand delivery of applications and desktops to enterprise environments.

If you haven't visited the Citrix blogs recently, we encourage you to visit https://www.citrix.com/blogs/2016/10/31/pvs-and-xenserver-better-together/ to read about the latest integration efforts between PVS and XS.

If you're a Citrix customer, this article is a must read!

Andy Melmed, Senior Solutions Architect, XenServer PM

 


Set Windows guest VM Static IP address in XenServer

A Bit of Why

For a XenServer Virtual Machine (VM) administrator, setting a static IP address on a VM has traditionally not been very direct, because XenServer has historically provided no API for setting a VM's IP address from a management tool. To change the IP settings of a VM, you had to email the VM's user and ask them to make the change manually, or install third-party tools to set the IP address for you. When creating new VMs for users by cloning, setting the IP address could mean multiple reboots.

To provide a better user experience, XenServer now provides an easier way to set a static IP address on a guest VM.

 

Set static IP for XenServer 7.0 Windows guest VM

XenServer 7.0 can now set the IP address of a Windows guest VM through the following interfaces:

  • IPv4
    • Set the VM IPv4 address from the command line interface (CLI):
      xe vif-configure-ipv4 address=<IP address>/<prefix length> gateway=<gateway address> mode=[static/none] uuid=<VIF UUID>
    • Set the VM IPv4 address through XAPI:
      VIF.configure_ipv4(vifObject, "static/none", "some IP address", "some gateway address")
  • IPv6
    • Set the VM IPv6 address from the command line interface (CLI):
      xe vif-configure-ipv6 address=<IP address>/<prefix length> gateway=<gateway address> mode=[static/none] uuid=<VIF UUID>
    • Set the VM IPv6 address through XAPI:
      VIF.configure_ipv6(vifObject, "static/none", "some IP address", "some gateway address")

Note:

The mode "none" removes the current static IP setting and returns the VIF to DHCP mode. If no static IP was set through the new interface, setting the mode to "none" does nothing.
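Since a mistyped address only shows up later as an error from the guest agent, a quick check before calling the CLI can save a round trip. The sketch below uses Python's standard ipaddress module to validate the values and build the argument list for xe vif-configure-ipv4. The function name is made up, and two behaviours are my own assumptions rather than documented rules: the gateway is treated as optional, and it is required to lie inside the VIF's subnet.

```python
import ipaddress


def vif_configure_ipv4_args(vif_uuid, mode, address=None, gateway=None):
    """Build the argv for xe vif-configure-ipv4, validating the inputs first."""
    if mode not in ("static", "none"):
        raise ValueError("mode must be 'static' or 'none'")
    args = ["xe", "vif-configure-ipv4", f"uuid={vif_uuid}", f"mode={mode}"]
    if mode == "static":
        if address is None:
            raise ValueError("static mode needs an address")
        # Accepts "a.b.c.d/prefix"; raises ValueError on malformed input.
        iface = ipaddress.IPv4Interface(address)
        args.append(f"address={iface.with_prefixlen}")
        if gateway is not None:
            gw = ipaddress.IPv4Address(gateway)
            if gw not in iface.network:  # sanity check added for this sketch
                raise ValueError("gateway is outside the VIF's subnet")
            args.append(f"gateway={gw}")
    return args


static_args = vif_configure_ipv4_args(
    "0f59a97b-afcf-b6db-582d-2411d5bbc449", "static", "10.64.5.6/24", "10.64.5.1"
)
print(" ".join(static_args))
```

An address given without a prefix length is treated as a /32 by `IPv4Interface`, so the subnet check will then reject any gateway; always pass the prefix explicitly.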

Dive into details

The diagram below shows how the configuration flows:

[Figure: workflow diagram]

By using the interface:

1. XAPI will first store the IP configuration to XenStore as:

/local/domain/<domid>/xenserver/device/vif/<device> = ""
  static-ip-setting = ""
    mac = "some mac address"
    error-code = "some error code"
    error-msg = "some error message"
    address = "some IP address"
    gateway = "some gateway address"

2. XenStore notifies the XenServer guest agent of the configuration change.

3. The XenServer guest agent receives the notification and sets the IP address using netsh.

4. After setting the IP address, the XenServer guest agent writes the result of the operation to the XenStore keys error-code and error-msg.

Example

1. Install the XenServer PV tools in the Windows guest VM.

2. From the command line interface (CLI), identify the Virtual Interface (VIF) whose IP address you want to set:

[root@dt65 ~]# xe vm-vif-list vm="Windows 7 (32-bit) (1)"
uuid ( RO)                         : 7dc56d5b-492c-bcf5-2549-b580dc928274
        vm-name-label ( RO): Windows 7 (32-bit) (1)
                     device ( RO): 1
                        MAC ( RO): 3e:aa:c3:dd:a7:ba
           network-uuid ( RO): 98f9a3b6-ad3f-14b3-da59-e3abc888e58e
network-name-label ( RO): Pool-wide network associated with eth1


uuid ( RO)                         : 0f59a97b-afcf-b6db-582d-2411d5bbc449
        vm-name-label ( RO): Windows 7 (32-bit) (1)
                     device ( RO): 0
                        MAC ( RO): 62:a1:03:31:a3:ee
           network-uuid ( RO): 41dac7d6-4a11-c9e6-cc48-ded78ceaf446
network-name-label ( RO): Pool-wide network associated with eth0

3. Call the new interface to set the IP address:

[root@dt65 ~]# xe vif-configure-ipv4 uuid=0f59a97b-afcf-b6db-582d-2411d5bbc449 mode=static address=10.64.5.6/24 gateway=10.64.5.1

4. Check the result in the XenStore keys "error-code" and "error-msg":

[root@XenServer ~]# xenstore-ls /local/domain/13/xenserver/device/vif
0 = ""
  static-ip-setting = ""
    mac = "62:a1:03:31:a3:ee"
    error-code = "0"
    error-msg = ""
    address = "10.64.5.6/24"
    gateway = "10.64.5.1"
1 = ""
  static-ip-setting = ""
    mac = "3e:aa:c3:dd:a7:ba"
    error-code = "0"
    error-msg = ""
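When many VIFs are configured at once, reading this output by eye gets tedious. The Python sketch below parses xenstore-ls output of the shape shown above and lists the VIFs whose error-code is non-zero. The function names are made up, the parser assumes one `key = "value"` pair per line as in the example, and this post does not say what the individual non-zero codes mean, so the sketch only flags them.

```python
import re

LINE_RE = re.compile(r'^\s*(\S+) = "(.*)"\s*$')


def parse_vif_results(xenstore_ls_output: str) -> dict:
    """Map each VIF device number to the keys under its static-ip-setting."""
    results, current = {}, None
    for line in xenstore_ls_output.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        key, value = m.groups()
        if key.isdigit():  # a VIF device node such as 0 or 1
            current = results.setdefault(key, {})
        elif current is not None and key != "static-ip-setting":
            current[key] = value
    return results


def failed_vifs(results: dict) -> list:
    """Return device numbers whose error-code is present and non-zero."""
    return [dev for dev, keys in results.items()
            if keys.get("error-code", "0") != "0"]


sample = '''\
0 = ""
  static-ip-setting = ""
    mac = "62:a1:03:31:a3:ee"
    error-code = "0"
    error-msg = ""
    address = "10.64.5.6/24"
    gateway = "10.64.5.1"
'''
parsed = parse_vif_results(sample)
print(parsed["0"]["address"], failed_vifs(parsed))
```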

Recent comment in this post
yao
How use netsh.exe set IPs?
Monday, 07 November 2016 02:13

Enable XSM on XenServer 6.5 SP1

1 Introduction

Certain virtualization environments require the extra security provided by XSM and FLASK (https://wiki.xenproject.org/wiki/Xen_Security_Modules_:_XSM-FLASK). XenServer 7 benefits from its upgrade of the control domain to CentOS 7, which includes support for enabling XSM and FLASK. But what about legacy XenServer 6.5 installations that also require the added security? XSM and FLASK may be enabled on XenServer 6.5 as well, but it requires a bit more work.

Note that XSM is not currently a user-visible feature in XenServer, or a supported technology.

This article describes how to enable XSM and FLASK in XenServer 6.5 SP1. It makes the assumption that the reader is familiar with accessing, building, and deploying XenServer's Xen RPMs from source. While this article pertains to resources from SP1 source RPMs (XS65ESP1-src-pkgs.tar.bz2 included with SP1, http://support.citrix.com/article/CTX142355), a similar approach can be followed for other XenServer 6.5 hotfixes.

2 Patching Xen and xen.spec

XenServer issues some hypercalls not handled by Xen's XSM hooks. The following patch shows one possible way to handle these operations and commands, which is to always permit them.

diff --git a/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c b/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c
index 0cf7daf..a41fcc4 100644
--- a/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c
+++ b/xs6.5sp1/xen/xen-4.4.1/xen/xsm/flask/hooks.c
@@ -727,6 +727,12 @@ static int flask_domctl(struct domain *d, int cmd)
     case XEN_DOMCTL_cacheflush:
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__CACHEFLUSH);

+    case XEN_DOMCTL_get_runstate_info:
+        return 0;
+
+    case XEN_DOMCTL_setcorespersocket:
+        return 0;
+
     default:
         printk("flask_domctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -782,6 +788,9 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_numainfo:
         return domain_has_xen(current->domain, XEN__PHYSINFO);

+    case XEN_SYSCTL_consoleringsize:
+        return 0;
+
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
         return -EPERM;
@@ -1299,6 +1308,9 @@ static int flask_platform_op(uint32_t op)
     case XENPF_get_cpuinfo:
         return domain_has_xen(current->domain, XEN__GETCPUINFO);

+    case XENPF_get_cpu_features:
+        return 0;
+
     default:
         printk("flask_platform_op: Unknown op %d\n", op);
         return -EPERM;

The only other file that needs patching is Xen's RPM spec file, xen.spec. Modify HV_COMMON_OPTIONS as shown below.  Change this line:

% define HV_COMMON_OPTIONS max_phys_cpus=256

to:

% define HV_COMMON_OPTIONS max_phys_cpus=256 XSM_ENABLE=y FLASK_ENABLE=y
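
For anyone who rebuilds often, the same one-line change can be scripted. What follows is only a sketch: the function name enable_xsm_in_spec is hypothetical, and it assumes the spec file defines HV_COMMON_OPTIONS on a single line as shown above. It prints the modified spec rather than editing it in place, so the result can be reviewed first.

```shell
# enable_xsm_in_spec SPEC_FILE   (hypothetical helper, not part of XenServer)
# Prints SPEC_FILE with "XSM_ENABLE=y FLASK_ENABLE=y" appended to the
# HV_COMMON_OPTIONS definition. A line that already mentions XSM_ENABLE
# is left alone, so running it twice does not duplicate the options.
enable_xsm_in_spec() {
    awk '
        /define HV_COMMON_OPTIONS/ && !/XSM_ENABLE=/ {
            # Drop any trailing blanks, then append the two build options.
            sub(/[ \t]*$/, " XSM_ENABLE=y FLASK_ENABLE=y")
        }
        { print }
    ' "$1"
}

# Example: enable_xsm_in_spec xen.spec > xen.spec.new
```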

3 Compiling and Loading a Policy

To build a security policy, navigate to tools/flask/policy in Xen's source tree. Run make to compile the default security policy. It will have a name like xenpolicy.24, depending on your version of checkpolicy.

Copy xenpolicy.24 over to Dom0's /boot directory. Open /boot/extlinux.conf and modify the default section's append /boot/xen.gz ... line so it has --- /boot/xenpolicy.24 at the end. For example:

append /boot/xen.gz dom0_mem=752M,max:752M [.. snip ..] splash --- /boot/initrd-3.10-xen.img --- /boot/xenpolicy.24
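
If you would rather not edit the file by hand, the change can be sketched with awk. This is an illustration under assumptions, not a tested deployment tool: the function name add_policy_to_default is hypothetical, and it assumes extlinux.conf has a "default NAME" line that appears before the matching "label NAME" block. It prints the modified configuration so it can be reviewed before replacing the original.

```shell
# add_policy_to_default CONF POLICY   (hypothetical helper)
# Prints CONF with " --- POLICY" appended to the append line of the
# default boot entry only. Other entries are left untouched, and an
# append line that already names POLICY is not modified again.
add_policy_to_default() {
    awk -v policy="$2" '
        $1 == "default" { dflt = $2 }      # remember the default label name
        $1 == "label"   { current = $2 }   # track which entry we are inside
        $1 == "append" && current == dflt && index($0, policy) == 0 {
            $0 = $0 " --- " policy
        }
        { print }
    ' "$1"
}

# Example:
#   add_policy_to_default /boot/extlinux.conf /boot/xenpolicy.24 > /tmp/extlinux.conf.new
```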

After making this change, reboot.

While booting (or afterwards, via xl dmesg), you should see messages indicating XSM and FLASK initialized, read the security policy, and started in permissive mode. For example:

(XEN) XSM Framework v1.0.0 initialized
(XEN) Policy len  0x1320, start at ffff830117ffe000.
(XEN) Flask:  Initializing.
(XEN) AVC INITIALIZED
(XEN) Flask:  Starting in permissive mode.
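
To check the mode without reading the whole log, a small filter can pick out the relevant line. This is a sketch only: flask_boot_mode is a hypothetical name, and the pattern simply assumes the message has the same wording as the example output above.

```shell
# flask_boot_mode   (hypothetical helper)
# Reads Xen console output on stdin and prints the mode named in the
# "Flask:  Starting in ... mode." line, if such a line is present.
flask_boot_mode() {
    sed -n 's/^(XEN) Flask: *Starting in \([a-z]*\) mode\.$/\1/p'
}

# Example on a host: xl dmesg | flask_boot_mode
```

With the log shown above this prints "permissive"; no output means the message was not found.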

4 Exercises for the Reader

  1. Create a more sophisticated implementation for handling XenServer hypercalls in xen/xsm/flask/hooks.c.
  2. Write (and load) a custom policy.
  3. Boot with flask_enforcing=1 set, and study any violations that occur (see xl dmesg output).

Making a difference to the next release

With the first alpha build of the upcoming XenServer release (codenamed "Ely") announced on Andy Melmed's blog on 11th October, I thought it would be useful to provide a little retrospective on how you, the xenserver.org community, have helped get it off to a great start by providing feedback on previous alphas, betas and releases, and how that feedback has been used to strengthen the codebase for Ely.
 
As I am sure you are well aware, community users of xenserver.org can use the incident tracking database at bugs.xenserver.org to raise issues or problems they've found on their hardware or configuration with recent alphas, betas and XenServer releases.  These incidents are raised in the form of XSO tickets, which can then be commented upon by other members of the community and folks who work on the product.
 
We listened
Looking back on all of the XSO tickets raised on the latest 7.0 release - these total more than 200 individual incident reports.  I want to take the time to thank everyone who contributed to these, often detailed, specific, constructive reports, and for working iteratively to understand more of the underlying issues.  Some of these investigations are ongoing, and need further feedback, but many of them are sufficiently clear to move forward to the next step.  
 
We understood
The incident reports were triaged and, by working with the user community, more than 80% of them have been processed.  Frequently this involved questions and answers to get a better handle on what was the underlying problem.  Then trying a change to the configuration or even a private fix to see and confirm if it related to the problem or resolved it.  The enthusiasm and skill of the reporters has been amazing, and continually useful.  At this point - we've separated the incidents into those which can be fixed as bugs, and those which are requests for features.  The latter have been provided to Citrix product management for consideration.  
 
We did
Of those which can be fixed as bugs, we raised or updated 45 acknowledged defects in XenServer.  More than 70% of these are already fixed, with another 20% being actively worked on.  The small remainder are blocked for some reason and awaiting a change elsewhere in the product, upstream, or in our ability to test.  The fixes have now either become part of hotfixes released for 7.0, or are already in the codebase and are being progressively released as part of the Ely alpha programme for the community to try.
 
So what's next?  With work continuing apace on Ely, we have now opened "Ely alpha" as an affects-version in the incident database so that issues can be raised against this latest build.  At the same time, in the spirit of continuing to progressively improve the actively developing codebase, we have removed the 6.5 SP1 affects-version so folks can focus on the new release.
 
Finally - on behalf of all xenserver.org users - my personal thanks to everyone who has helped improve both Dundee and Ely - both by reporting incidents, triaging and fixing them and by continuing to give your feedback on the latest new version.  This really makes a difference to all members of the community.

XenServer Ely Alpha 1 Available

Hear ye, hear ye… we are pleased to announce that an alpha release of XenServer Project Ely is now available for download! After Dundee (7.0), we've come a little closer to Cambridge (the birthplace of Xen) for our codename, as the city of Ely is just up the road.

 
Since releasing version 7.0 in May, the XenServer engineering team has been working fervently to prepare the platform with the latest innovations in server virtualization technology. As a precursor, a pre-release containing the prerequisites for enabling a number of powerful (and really cool!) new features has been made available for download from the pre-release page.
 

What's In it?

 

The following is a brief description of some of the feature-prerequisites included in this pre-release:

 

Xen 4.7:  This release of Xen adds support for "live-patching" of the Xen hypervisor, allowing issues to be patched without requiring a host reboot. In this alpha release there is no functionality for you to test in this area, but we thought it was worth telling you about nonetheless. Xen 4.7 also includes various performance improvements, and updates to the virtual machine introspection code (surfaced in XenServer as Direct Inspect).

 

Kernel 4.4: Updated kernel to support future feature considerations. All device drivers will be at the upstream versions; we'll be updating these with drops direct from the hardware vendors as we go through the development cycle.

 

VM import/export performance: a longstanding request from our user community, we've worked to improve the import/export speeds of VMs, and Ely alpha 1 now averages 2x faster than the previous version.

 

What We'd Like Help With

 

The purpose of this alpha release is really to make sure that a variety of hardware works with Project Ely. Because we've updated core platform components (Xen and the Dom0 kernel), it's important to check that all is well on hardware we don't have in our QA labs. Thus, the more people who can download this build, install it, and run a couple of VMs to check all is well, the better.

 

Additionally, we've been working with the community (over on XSO-445) on improving VM import/export performance: we'd like to see whether the improvements we've seen in our tests are what you see too. If they're not, we can figure out why and fix it :-).

 

Upgrading

 

This is pre-release software, not for production use. Upgrades from XenServer 7.0 should work fine, but it goes without saying that you should ensure you back up any critical data.

 

Reporting Bugs

 

We encourage visitors to download the pre-release and provide us with your feedback. If you do find a problem, please head over to the bug tracker and file a ticket. Please be sure to include a server status report!

 

Now that we've moved up to a new pre-release project, it's time to remove the XS 6.5 SP1 fix version from the bug tracker, in order that we keep it tidy. You'll see an "Ely alpha" affects version is now present instead.

 

What Next?

 

Stay tuned for another pre-release build in the near future: as you may have heard, we've been keeping busy!

 
As always, we look forward to working with the XenServer community to make the next major release of XenServer the best version ever!

 

Cheers!

 

Andy M.

Senior Solutions Architect - XenServer PM

 


XenServer Hotfix XS65ESP1035 Released

News Flash: XenServer Hotfix XS65ESP1035 Released

Indeed, I was alerted early this morning (06:00 EST) via email that Citrix has released hotfix XS65ESP1035 for XenServer 6.5 SP1.  The official release and content is filed under CTX216249, which can be found here: http://support.citrix.com/article/CTX216249

As of the writing of this article, this hotfix has not yet been added to CTX138115 (entitled "Recommended Updates for XenServer Hotfixes") or, as we like to call it "The Fastest Way to Patch A Vanilla XenServer With One or Two Reboots!"  I imagine that resource will be updated to reflect XS65ESP1035 soon.

Personally/Professionally, I will be installing this hotfix as, per CTX216249, I am excited to read what is addressed/fixed:

  • Duplicate entry for XS65ESP1021 was created when both XS65ESP1021 and XS65ESP1029 were applied.
  • When BATMAP (Block Allocation Map) in Xapi database contains erroneous data, the parent VHD (Virtual Hard Disk) does not get inflated causing coalesce failures and ENOSPC errors.
  • After deleting a snapshot on a pool member that is not the pool master, a coalesce operation may not succeed. In such cases, the coalesce process can constantly retry to complete the operation, resulting in the creation of multiple RefCounts that can consume a lot of space on the pool member.
In addition, this hotfix contains the following improvement:
  • This fix lets users set a custom retrans value for their NFS SRs thereby giving them more fine-grained control over how they want NFS mounts to behave in their environment.

(Source: http://support.citrix.com/article/CTX216249)

So....

This is a storage-based hotfix, and while we can create VMs all day, we rely on the storage substrate to hold our precious VHDs, so plan accordingly to deploy it!

Applying The Patch Manually

As a disclaimer of sorts, always plan your patching during a maintenance window to prevent any production outages.  For me, I am currently up-to-date and will be rebooting my XenServer host(s) in a few hours, so I manually applied this patch.

Why?  If you look in XenCenter for updates, you won't see this hotfix listed (yet).  If it was available in XenCenter, checks and balances would inform me I need to suspend, migrate, or shutdown VMs.  For a standalone host, I really can't do that.  In my pool, I can't reboot for a few hours, but I need this patch installed, so I simply do the following on my XenServer stand-alone server OR XenServer primary/master server:

Using the command line in XenCenter, I make a directory in /root/ called "ups" and then descend into that directory because I plan to use wget (Web Get) to download the patch via its link in http://support.citrix.com/article/CTX216249:

[root@colossus ~]# mkdir ups
[root@colossus ~]# cd ups

Now, using wget I specify what to download over port 80 and to save it as "hf35.zip":

[root@colossus ups]# wget "http://support.citrix.com/supportkc/filedownload?uri=/filedownload/CTX216249/XS65ESP1035.zip" -O hf35.zip

We then see the usual wget progress bar and once it is complete, I can unzip the file "hf35.zip":

HTTP request sent, awaiting response... 200 OK
Length: 110966324 (106M) [application/zip]
Saving to: `hf35.zip'

100%[======================================>] 110,966,324 1.89M/s   in 56s    
2016-08-25 11:06:32 (1.90 MB/s) - `hf35.zip' saved [110966324/110966324]
[root@colossus ups]# unzip hf35.zip 
Archive:  hf35.zip
  inflating: XS65ESP1035.xsupdate   
  inflating: XS65ESP1035-src-pkgs.tar.bz2

I'm a big fan of using shortcuts - especially where UUIDs are involved.  Now that I have the patch ready to expand onto my XenServer master/stand-alone server, I want to create some kind of variable so I don't have to remember my host's UUID or the patch's UUID. 

For the host, I can simply source in a file that contains the XenServer primary/master server's INSTALLATION_UUID (better known as the host's UUID):

[root@colossus ups]# source /etc/xensource-inventory 
[root@colossus ups]# echo $INSTALLATION_UUID
207cd7c1-da20-479b-98bc-e84cac64d0c0

With the variable $INSTALLATION_UUID set, I can now expand the patch and capture its own UUID:

[root@colossus ups]# patchUUID=`xe patch-upload file-name=XS65ESP1035.xsupdate`
[root@colossus ups]# echo $patchUUID
cdf9eb54-c3da-423d-88ca-841b864f926b

NOW, I apply the patch to the host (yes, it still needs to be rebooted, but within a few hours) using both variables in the following command:

[root@colossus ups]# xe patch-apply uuid=$patchUUID host-uuid=$INSTALLATION_UUID
   
Preparing...                ##################################################
kernel                      ##################################################
unable to stat /sys/class/block//var/swap/swap.001: No such file or directory
Preparing...                ##################################################
sm                          ##################################################
Preparing...                ##################################################
blktap                      ##################################################
Preparing...                ##################################################
kpartx                      ##################################################
Preparing...                ##################################################
device-mapper-multipath-libs##################################################
Preparing...                ##################################################
device-mapper-multipath     ##################################################

At this point, I can back out of the "ups" directory and remove it.  Likewise, I can also check to see if the patch UUID is listed in the XAPI database:

[root@colossus ups]# cd ..
[root@colossus ~]# rm -rf ups/
[root@colossus ~]# ls
support.tar.bz2
[root@colossus ~]# xe patch-list uuid=$patchUUID
uuid ( RO)                    : cdf9eb54-c3da-423d-88ca-841b864f926b
              name-label ( RO): XS65ESP1035
        name-description ( RO): Public Availability: fixes to Storage
                    size ( RO): 21958176
                   hosts (SRO): 207cd7c1-da20-479b-98bc-e84cac64d0c0
    after-apply-guidance (SRO): restartHost

So, nothing really special -- just a quick way to apply patches to a XenServer primary/master server.  In the same manner, you can substitute the $INSTALLATION_UUID with other host UUIDs in a pool configuration, etc.
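
For repeat use, the steps above can be gathered into one function. Treat this as a sketch rather than a supported tool: apply_hotfix is a hypothetical name, and the XE and INVENTORY variables are my own additions so the flow can be dry-run away from a host. The xe sub-commands are the same ones used above; the params= filter on patch-list only trims the output.

```shell
# XE and INVENTORY are assumptions added for this sketch; on a real host
# the defaults point at the xe CLI and the standard inventory file.
XE=${XE:-xe}
INVENTORY=${INVENTORY:-/etc/xensource-inventory}

# apply_hotfix UPDATE_FILE   (hypothetical helper)
# Uploads one .xsupdate file, applies it to the local host, and prints
# the patch name and after-apply guidance. The host still needs its reboot.
apply_hotfix() {
    update_file=$1
    [ -r "$update_file" ] || { echo "cannot read $update_file" >&2; return 1; }
    [ -r "$INVENTORY" ] || { echo "cannot read $INVENTORY" >&2; return 1; }
    . "$INVENTORY"      # sets INSTALLATION_UUID, the host UUID
    patch_uuid=$("$XE" patch-upload file-name="$update_file") || return 1
    "$XE" patch-apply uuid="$patch_uuid" host-uuid="$INSTALLATION_UUID" || return 1
    "$XE" patch-list uuid="$patch_uuid" params=name-label,after-apply-guidance
}

# Example: apply_hotfix /root/ups/XS65ESP1035.xsupdate
```

Stopping at the first failed step means a bad upload never reaches patch-apply.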

Well, off to reboot and thanks for reading!

 

-jkbs | @xenfomation | My Citrix Blog

To receive updates about the latest XenServer Software Releases, login or sign-up to pick and choose the content you need from http://support.citrix.com/customerservice/

 


Sources

Citrix Support Knowledge Center: http://support.citrix.com/article/CTX216249

Citrix Support Knowledge Center: http://support.citrix.com/customerservice/

Citrix Profile/RSS Feeds: http://support.citrix.com/profile/watches/

Original Image Source: http://www.gimphoto.com/p/download-win-zip.html


XenServer 7.0 performance improvements part 4: Aggregate I/O throughput improvements

The XenServer team has made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the fourth in a series of articles that will describe the principal improvements. For the previous ones, see:

  1. http://xenserver.org/blog/entry/dundee-tapdisk3-polling.html
  2. http://xenserver.org/blog/entry/dundee-networking-multi-queue.html
  3. http://xenserver.org/blog/entry/dundee-parallel-vbd-operations.html

In this article we return to the theme of I/O throughput. Specifically, we focus on improvements to the total throughput achieved by a number of VMs performing I/O concurrently. Measurements show that XenServer 7.0 enjoys aggregate network throughput over three times faster than XenServer 6.5, and also has an improvement to aggregate storage throughput.

What limits aggregate I/O throughput?

When a number of VMs are performing I/O concurrently, the total throughput that can be achieved is often limited by dom0 becoming fully busy, meaning it cannot do any additional work per unit time. The I/O backends (netback for network I/O and tapdisk3 for storage I/O) together consume 100% of available dom0 CPU time.

How can this limit be overcome?

Whenever there is a CPU bottleneck like this, there are two possible approaches to improving the performance:

  1. Reduce the amount of CPU time required to perform I/O.
  2. Increase the processing capacity of dom0, by giving it more vCPUs.

Surely approach 2 is easy and will give a quick win...? Intuitively, we might expect the total throughput to increase proportionally with the number of dom0 vCPUs.

Unfortunately it's not as straightforward as that. The following graph shows what happens to the aggregate network throughput on XenServer 6.5 when the number of dom0 vCPUs is artificially increased. (In this case, we are measuring the total network throughput of 40 VMs communicating amongst themselves on a single Dell R730 host.)

[Graph: aggregate network throughput on XenServer 6.5 as the number of dom0 vCPUs increases]

Counter-intuitively, the aggregate throughput decreases as we add more processing power to dom0! (This explains why the default was at most 8 vCPUs in XenServer 6.5.)

So is there no hope for giving dom0 more processing power...?

The explanation for the degradation in performance is that certain operations run more slowly when there are more vCPUs present. In order to make dom0 work better with more vCPUs, we needed to understand what those operations are, and whether they can be made to scale better.

Three such areas of poor scalability were discovered deep in the innards of Xen by Malcolm Crossley and David Vrabel, and improvements were made for each:

  1. Maptrack lock contention – improved by http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=dff515dfeac4c1c13422a128c558ac21ddc6c8db
  2. Grant-table lock contention – improved by http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=b4650e9a96d78b87ccf7deb4f74733ccfcc64db5
  3. TLB flush on grant-unmap – improved by https://github.com/xenserver/xen-4.6.pg/blob/master/master/avoid-gnt-unmap-tlb-flush-if-not-accessed.patch

The result of improving these areas is dramatic – see the green line in the following graph:

[Graph: aggregate network throughput versus number of dom0 vCPUs, with the improved result shown as the green line]

Now, throughput scales very well as the number of vCPUs increases. This means that, for the first time, it is now beneficial to allocate many vCPUs to dom0 – so that when there is demand, dom0 can deliver. Hence we have given XenServer 7.0 a higher default number of dom0 vCPUs.

How many vCPUs are now allocated to dom0 by default?

Most hosts will now get 16 vCPUs by default, but the exact number depends on the number of CPU cores on the host. The following graph summarises how the default number of dom0 vCPUs is calculated from the number of CPU cores on various current and historic XenServer releases:

[Graph: default number of dom0 vCPUs versus number of host CPU cores, by XenServer release]

Summary of improvements

I will conclude with some aggregate I/O measurements comparing XenServer 6.5 and 7.0 under default settings (no dom0 configuration changes) on a Dell R730xd.

  1. Aggregate network throughput – twenty pairs of 32-bit Debian 6.0 VMs sending and receiving traffic generated with iperf 2.0.5.
    [Graph: aggregate intra-host network throughput, XenServer 6.5 versus 7.0]
  2. Aggregate storage IOPS – twenty 32-bit Windows 7 SP1 VMs each doing single-threaded, serial, sequential 4KB reads with fio to a virtual disk on an Intel P3700 NVMe drive.
    [Graph: aggregate storage IOPS, XenServer 6.5 versus 7.0]

XenServer 7.0 performance improvements part 3: Parallelised plug and unplug VBD operations in xenopsd

The XenServer team has made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the third in a series of articles that will describe the principal improvements. For the first two, see here:

  1. http://xenserver.org/blog/entry/dundee-tapdisk3-polling.html
  2. http://xenserver.org/blog/entry/dundee-networking-multi-queue.html

The topic of this post is control plane performance. XenServer 7.0 achieves significant performance improvements through the support for parallel VBD operations in xenopsd. With the improvements, xenopsd is able to plug and unplug many VBDs (virtual block devices) at the same time, substantially reducing the duration of VM lifecycle operations (start, migrate, shutdown) for VMs with many VBDs, and making it practical to operate VMs with up to 255 VBDs.

Background of the VM lifecycle operations

In XenServer, xenopsd is the dom0 component responsible for VM lifecycle operations:

  • during a VM start, xenopsd creates the VM container and then plugs the VBDs before starting the VCPUs;
  • during a VM shutdown, xenopsd stops the VCPUs and then unplugs the VBDs before destroying the VM container;
  • during a VM migrate, xenopsd creates a new VM container, unplugs the VBDs of the old VM container, and plugs the VBDs for the new VM before starting its VCPUs; while the VBDs are being unplugged and plugged on the other VM container, the user experiences a VM downtime when the VM is unresponsive because both old and new VM containers are paused.

Measurements have shown that a large part, usually most, of the duration of these VM lifecycle operations is due to plugging and unplugging the VBDs, especially on slow or contended storage backends.

[Diagram: sequential VBD plugs]

 

Why does xenopsd take some time to plug and unplug the VBDs?

The completion of a xenopsd VBD plug operation involves the execution of two storage layer operations, VDI attach and VDI activate (where VDI stands for virtual disk image). These VDI operations include control plane manipulation of daemons, block devices and disk metadata in dom0, which will take different amounts of time to execute depending on the type of the underlying Storage Repository (SR, such as LVM, NFS or iSCSI) used to hold the VDIs, and on the current load on the storage backend disks and their types (SSDs or HDs). Similarly, the completion of a xenopsd VBD unplug operation involves the execution of two storage layer operations, VDI deactivate and VDI detach, with the corresponding overhead of manipulating the control plane of the storage layer.

If the underlying physical disks are under high load, there may be contention preventing progress of the storage layer operations, and therefore xenopsd may need to wait many seconds before the requests to plug and unplug the VBDs can be served.

Originally, xenopsd would execute these VBD operations sequentially, and the total time to finish all of them for a single VM would depend on the number of VBDs in the VM. Essentially, it would be the sum of the time to operate each of the VBDs of this VM, which would result in several minutes of wait for a lifecycle operation of a VM that had, for instance, 255 VBDs.

What are the advantages of parallel VBD operations?

Plugging and unplugging the VBDs in parallel in xenopsd:

  • provides a total duration for the VM lifecycle operations that is independent of the number of VBDs in the VM. This duration will typically be the duration of the longest individual VBD operation amongst the parallel VBD operations for that VM;
  • provides a significant instantaneous improvement for the user, across all the VBD operations involving more than 1 VBD per VM. The more devices involved, the larger the noticeable improvement, up to the saturation of the underlying storage layer;
  • this single improvement is immediately applicable across all of VM start, VM shutdown and VM migrate lifecycle operations.
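
The difference between the two schemes comes down to a sum versus a maximum. The sketch below is a toy model only: the function names and the example durations are invented for illustration, and it ignores the parallelism limits that xenopsd applies.

```shell
# Durations are whole milliseconds, one argument per VBD.

# Sequential plugging: the total is the sum of every VBD operation.
sequential_ms() {
    total=0
    for d in "$@"; do
        total=$((total + d))
    done
    echo "$total"
}

# Fully parallel plugging: the total is the longest single VBD operation.
parallel_ms() {
    longest=0
    for d in "$@"; do
        if [ "$d" -gt "$longest" ]; then
            longest=$d
        fi
    done
    echo "$longest"
}

# Example with three made-up durations:
#   sequential_ms 300 250 400   prints 950
#   parallel_ms   300 250 400   prints 400
```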

[Diagram: parallel VBD plugs]

 

Are there any disadvantages or limitations?

Plugging and unplugging VBDs uses dom0 memory. The main disadvantage of doing these in parallel is that dom0 needs more memory to handle all the parallel operations. To prevent situations where a large number of such operations would cause dom0 to run out of memory, we have added two limits:

  • the maximum number of global parallel operations that xenopsd can request is the same as the number of xenopsd worker-pool threads as defined by worker-pool-size in /etc/xenopsd.conf. This prevents regression in the maximum dom0 memory usage compared to when sequential VBD operations per VM were used in xenopsd. An increase in this value will increase the number of parallel VBD operations, at the expense of having to increase the dom0 memory by about 15MB for each extra parallel VBD operation.
  • the maximum number of per-VM parallel operations that xenopsd can request is currently fixed to 10, which covers a wide range of VMs and still provides a 10x improvement in lifecycle operation times for those VMs that have more than 10 VBDs.
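
A rough way to reason about these two limits is sketched below. It is a simplification, not a description of xenopsd's scheduler: it assumes the VBDs of one VM are handled in rounds of up to 10, and it uses the approximate 15MB figure quoted above. The function names are hypothetical.

```shell
PER_VM_LIMIT=10       # fixed per-VM limit quoted above
MB_PER_EXTRA_OP=15    # approximate dom0 memory per extra parallel operation

# vbd_rounds N
# Rounds needed for one VM with N VBDs when at most PER_VM_LIMIT run at
# once (ceiling division; assumes simple batching, which is a guess).
vbd_rounds() {
    echo $(( ($1 + PER_VM_LIMIT - 1) / PER_VM_LIMIT ))
}

# extra_dom0_mb OLD NEW
# Approximate extra dom0 memory in MB when worker-pool-size is raised
# from OLD to NEW.
extra_dom0_mb() {
    echo $(( ($2 - $1) * MB_PER_EXTRA_OP ))
}

# Examples:
#   vbd_rounds 8          prints 1
#   vbd_rounds 255        prints 26
#   extra_dom0_mb 4 16    prints 180
```

Under this model a 255-VBD VM needs 26 rounds instead of 255 sequential operations, which is in line with the roughly 10x figure above.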

Where do I find the changes?

The changes that implemented this feature are available in github at https://github.com/xapi-project/xenopsd/pull/250

What sort of theoretical improvements should I expect in XenServer 7.0?

The exact numbers depend on the SR type, storage backend load characteristics and the limits specified in the previous section. Given the limits in the previous section, the results for the duration of VBD plugs for a single VM will follow the pattern in the following table:

Number n of VBDs/VM     Improvement of VBD operations
<= 10 VBDs/VM           n times faster
> 10 VBDs/VM            10 times faster

The table above assumes that the maximum number of global parallel operations discussed in the section above is not reached. If you want to guarantee the improvement in the table above for x > 1 simultaneous VM lifecycle operations, at the expense of using more dom0 memory in the worst case, you will probably want to set worker-pool-size = (n * x) in /etc/xenopsd.conf, where n is a number reflecting the average number of VBDs/VM amongst all VMs, up to a maximum of n = 10.
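
As a worked example of that rule of thumb, the calculation can be written as a small function. The name worker_pool_size is hypothetical; the only inputs are the average number of VBDs per VM and the number of simultaneous lifecycle operations, with the per-VM cap of 10 applied.

```shell
# worker_pool_size AVG_VBDS_PER_VM SIMULTANEOUS_OPS   (hypothetical helper)
# Suggests a worker-pool-size of n * x, with n capped at 10.
worker_pool_size() {
    n=$1
    if [ "$n" -gt 10 ]; then
        n=10
    fi
    echo $(( n * $2 ))
}

# Examples:
#   worker_pool_size 8 4    prints 32
#   worker_pool_size 25 4   prints 40
```

For 8 VBDs per VM and 4 simultaneous operations, the corresponding line in /etc/xenopsd.conf would be worker-pool-size = 32.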

What sort of practical improvements should I expect in XenServer 7.0?

The VBD plug and unplug operations are only part of the overall operations necessary to execute a VM lifecycle operation. The remaining parts, such as creation of the VM container and VIF plugs, will dilute the VBD improvements of the previous section, though they are still significant. Some examples of improvements, using an EXT SR on a local SSD storage backend:

VM lifecycle operation                  Improvement with 8 VBDs/VM
Toolstack time to start a single VM     [Graph: VM start time, 8 VBDs, 1 VM]
Toolstack time to bootstorm 125 VMs     [Graph: bootstorm time, 8 VBDs, 125 VMs]

The approximately 2s improvement in single VM start time was caused by plugging the 8 VBDs in parallel. As we see in the second row of the table, this can be a significant advantage in a bootstorm.

In XenServer 7.0, not only does xenopsd execute VBD operations in parallel, but the storage layer operations on VDIs are also faster. Compared to XenServer 6.5 SP1, you may therefore see VM lifecycle time improvements in your XenServer 7.0 environment beyond those expected from parallel VBD operations alone.

 


Resetting Lost Root Password in XenServer 7.0

XenServer 7.0, Grub2, and a Lost Root Password

In a previous article I detailed how one could reset a lost root password to XenServer 6.2.  While the article is not limited to 6.2 (it works just as well for 6.5, 6.1, and 6.0.2), this article is dedicated to XenServer 7.0 as grub2 has been brought in to replace extlinux.

As such, if the local root user's (LRU) password for a XenServer 7.0 host is forgotten, physical (or "lights out") access to the host and a reboot will be required.  The contrast comes with grub2, the methods to boot the XenServer 7.0 host into single user mode, and how to reset the root password to a known token.

The Grub Boot Screen

Once physical or "lights out" access to the XenServer 7.0 host in question has been obtained, the following screen will appear on reboot:

It is important to note that once this screen appears, you only have four seconds to take action before the host proceeds to boot the kernel.

As should be the default, the XenServer kernel is highlighted.  One will want to immediately press the e key (for edit).

This will then refresh the grub interface - stopping any count-down-to-boot timers - which will reveal the boot entry.  It is within this window (using up, down, left, and right) one will want to navigate to around line four or five and isolate "ro nolvm":

 

Next, one will want to remove (or backspace/delete) the "ro" characters and type in "rw init=/sysroot/bin/sh", or as illustrated:

 

Don't worry if the directive is not on one line!

 

With this change made, press both Control and X at the same time as this will boot the XenServer kernel into single user style mode, or better known as Emergency Mode:

How to Change Root's Password

From the Emergency Mode prompt, execute the following command:

chroot /sysroot

Now, one can execute the "passwd" command to change root's credentials:

Finally....

Now that root's credentials have been changed, utilize Control+Alt+Delete to reboot the XenServer 7.0 host and one will find via SSH, XenCenter, or directly that the root password has been changed: the host is ready to be managed again.

 


XenServer 7.0 performance improvements part 2: Parallelised networking datapath

The XenServer team has made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the second in a series of articles that will describe the principal improvements. For the first, see http://xenserver.org/blog/entry/dundee-tapdisk3-polling.html.

The topic of this post is network I/O performance. XenServer 7.0 achieves significant performance improvements through the support for multi-queue paravirtualised network interfaces. Measurements of one particular use-case show an improvement from 17 Gb/s to 41 Gb/s.

A bit of background about the PV network datapath

In order to perform network-based communications, a VM employs a paravirtualised network driver (netfront in Linux or xennet in Windows) in conjunction with netback in the control domain, dom0.

[Diagram: single-queue PV network datapath]

To the guest OS, the netfront driver feels just like a physical network device. When a guest wants to transmit data:

  • Netfront puts references to the page(s) containing that data into a "Transmit" ring buffer it shares with dom0.
  • Netback in dom0 picks up these references and maps the actual data from the guest's memory so it appears in dom0's address space.
  • Netback then hands the packet to the dom0 kernel, which uses normal routing rules to determine that it should go to an Open vSwitch device and then on to either a physical interface or the netback device for another guest on the same host.

When dom0 has a network packet it needs to send to the guest, the reverse procedure applies, using a separate "Receive" ring.

Amongst the factors that can limit network throughput are:

  1. the ring becoming full, causing netfront to have to wait before more data can be sent, and
  2. the netback process fully consuming an entire dom0 vCPU, meaning it cannot go any faster.

Multi-queue alleviates both of these potential bottlenecks.

What is multi-queue?

Rather than having a single Transmit and Receive ring per virtual interface (VIF), multi-queue means having multiple Transmit and Receive rings per VIF, and one netback thread for each:

[Figure: multi-queue PV network datapath]

Now, each TCP stream has the opportunity to be driven through a different Transmit or Receive ring. The particular ring chosen for each stream is determined by a hash of the packet headers (the MAC address, IP address and port number of both the source and the destination).

Crucially, this means that separate netback threads can work on each TCP stream in parallel. So where we were previously limited by the capacity of a single dom0 vCPU to process packets, now we can exploit several dom0 vCPUs. And where the capacity of a single Transmit ring limited the total amount of data in-flight, the system can now support a larger amount.
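As a rough illustration of how a header hash pins each stream to one ring, here is a minimal sketch. It is not the netfront implementation: the SHA-256 hash, the function name select_queue and the addresses are all stand-ins chosen for this example. The point is only that the same stream always lands on the same queue, while different streams may land on different ones.

```python
import hashlib


def select_queue(src_mac, dst_mac, src_ip, dst_ip, src_port, dst_port, num_queues):
    """Map one stream to a queue index in the range [0, num_queues).

    SHA-256 is a stand-in hash for illustration only; the real driver
    uses its own flow hash.
    """
    fields = (src_mac, dst_mac, src_ip, dst_ip, str(src_port), str(dst_port))
    digest = hashlib.sha256("|".join(fields).encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % num_queues


# Hypothetical addresses: three streams that differ only in source port.
for port in (50000, 50001, 50002):
    queue = select_queue(
        "02:00:00:00:00:01", "02:00:00:00:00:02",
        "10.0.0.1", "10.0.0.2",
        port, 5001,
        num_queues=4,
    )
    print(f"source port {port} -> queue {queue}")
```

Because the queue is a pure function of the headers, packets of one stream are never reordered across rings, but nothing guarantees that several streams will not collide on the same ring.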

Which use-cases can take advantage of multi-queue?

Anything involving multiple TCP streams. For example, any kind of server VM that handles connections from more than one client at the same time.

Which guests can use multi-queue?

Since frontend changes are needed, the version of the guest's netfront driver matters. Although dom0 is geared up to support multi-queue, guests with old versions of netfront that lack multi-queue support are limited to single Transmit and Receive rings.

  • For Windows, the XenServer 7.0 xennet PV driver supports multi-queue.
  • For Linux, multi-queue support was added in Linux 3.16. This means that Debian Jessie 8.0 and Ubuntu 14.10 (or later) support multi-queue with their stock kernels. Over time, more and more distributions will pick up the relevant netfront changes.

How does the throughput scale with an increasing number of rings?

The following graph shows some measurements I made using iperf 2.0.5 between a pair of Debian 8.0 VMs both on a Dell R730xd host. The VMs each had 8 vCPUs, and iperf employed 8 threads each generating a separate TCP stream. The graph reports the sum of the 8 threads' throughputs, varying the number of queues configured on the guests' VIFs.

[Graph: aggregate iperf throughput against the number of queues per VIF]

We can make several observations from this graph:

  • The throughput scales well up to four queues, with four queues achieving more than double the throughput possible with a single queue.
  • The blip at five queues probably arose when the hashing algorithm failed to spread the eight TCP streams evenly across the queues, and is thus a measurement artefact. With different TCP port numbers, this may not have happened.
  • While the throughput generally increases with an increasing number of queues, the throughput is not proportional to the number of rings. Ideally, the throughput would double when you double the number of rings. This doesn't happen in practice because the processing is not perfectly parallelisable: netfront needs to demultiplex the streams onto the rings, and there are some overheads due to locking and synchronisation between queues.
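A quick back-of-the-envelope check helps put the five-queue result in context. The sketch below is plain arithmetic rather than an analysis of the measured data: it computes the smallest number of streams that the busiest queue must carry when eight streams are spread as evenly as possible. The function name is made up for this example.

```python
import math


def busiest_queue_streams(num_streams, num_queues):
    """Best case: streams carried by the busiest queue under a perfectly even spread."""
    return math.ceil(num_streams / num_queues)


for queues in range(1, 9):
    print(f"{queues} queue(s): busiest queue carries at least "
          f"{busiest_queue_streams(8, queues)} of 8 streams")
```

Even with an ideal spread, the busiest queue still carries two streams at five, six and seven queues, exactly as it does at four. Any hash that spreads the streams less evenly than this best case leaves one queue with three or more streams, which is consistent with the dip being an artefact of the particular port numbers used.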

This graph also highlights the substantial improvement over XenServer 6.5, in which only one queue per VIF was supported. In this use-case of eight TCP streams, XenServer 7.0 achieves 41 Gb/s out-of-the-box where XenServer 6.5 could manage only 17 Gb/s – an improvement of 140%.

How many rings do I get by default?

By default the number of queues is limited by (a) the number of vCPUs the guest has and (b) the number of vCPUs dom0 has. A guest with four vCPUs will get four queues per VIF.

This is a sensible default, but if you want to manually override it, you can do so in the guest. In a Linux guest, add the parameter xen_netfront.max_queues=n, for some n, to the kernel command-line.
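The default rule can be written down in one line. This sketch assumes the two limits combine as a simple minimum, which is the natural reading of the description above but is not taken from the driver source; the dom0 vCPU counts used are hypothetical.

```python
def default_queue_count(guest_vcpus, dom0_vcpus):
    """Queues per VIF, assuming the default is the smaller of the two vCPU counts."""
    return min(guest_vcpus, dom0_vcpus)


# A four-vCPU guest on a host whose dom0 has 16 vCPUs (hypothetical figure).
print(default_queue_count(guest_vcpus=4, dom0_vcpus=16))

# An eight-vCPU guest on a host whose dom0 has only 4 vCPUs (hypothetical figure).
print(default_queue_count(guest_vcpus=8, dom0_vcpus=4))
```

To override the default in a Linux guest, the parameter goes on the kernel command-line as described, for example xen_netfront.max_queues=2.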

Recent Comments
Tobias Kreidl
Hi, Jonathan: Thanks for the insightful pair of articles. It's interesting how what appear to be nuances can make large performan... Read More
Tuesday, 21 June 2016 04:54
Jonathan Davies
Thanks for sharing your thoughts, Tobias. You ask about queue polling. In fact, netback already does this! It achieves this by us... Read More
Wednesday, 22 June 2016 08:40
Sam McLeod
Interesting post Jonathan, I've tried adjusting `xen_netfront.max_queues` amongst other similar values on both guests and hosts h... Read More
Tuesday, 21 June 2016 13:01

XenServer 7.0 performance improvements part 1: Lower latency storage datapath

The XenServer team made a number of significant performance and scalability improvements in the XenServer 7.0 release. This is the first in a series of articles that will describe the principal improvements.

Our first topic is storage I/O performance. A performance improvement has been achieved through the adoption of a polling technique in tapdisk3, the component of XenServer responsible for handling I/O on virtual storage devices. Measurements of one particular use-case demonstrate a 50% increase in performance from 15,000 IOPS to 22,500 IOPS.

What is polling?

Normally, tapdisk3 operates in an event-driven manner. Here is a summary of the first few steps required when a VM wants to do some storage I/O:

  1. The VM's paravirtualised storage driver (called blkfront in Linux or xenvbd in Windows) puts a request in the ring it shares with dom0.
  2. It sends tapdisk3 a notification via an event-channel.
  3. This notification is delivered to domain 0 by Xen as an interrupt. If domain 0 is not running, it will need to be scheduled in order to receive the interrupt.
  4. When it receives the interrupt, the domain 0 kernel schedules the corresponding backend process to run, tapdisk3.
  5. When tapdisk3 runs, it looks at the contents of the shared-memory ring.
  6. Finally, tapdisk3 finds the request which can then be transformed into a physical I/O request.

Polling is an alternative to this approach in which tapdisk3 repeatedly looks in the ring, speculatively checking for new requests. This means that steps 2–4 can be skipped: there's no need to wait for an event-channel interrupt, nor to wait for the tapdisk3 process to be scheduled: it's already running. This enables tapdisk3 to pick up the request much more promptly as it avoids these delays inherent to the event-driven approach.

The following diagram contrasts the timelines of these alternative approaches, showing how polling reduces the time until the request is picked up by the backend.

[Figure: timelines of the event-driven and polling approaches]

How does polling help improve storage I/O performance?

Polling is an established technique for reducing latency in event-driven systems. (One example of where it is used elsewhere to mitigate interrupt latency is in Linux networking drivers that use NAPI.)

Servicing I/O requests promptly is an essential part of optimising I/O performance. As I discussed in my talk at the 2015 Xen Project Developer Summit, reducing latency is the key to maintaining a low virtualisation overhead. As physical I/O devices get faster and faster, any latency incurred in the virtualisation layer becomes increasingly noticeable and translates into lower throughputs.

An I/O request from a VM has a long journey to physical storage and back again. Polling in tapdisk3 optimises one section of that journey.

Isn't polling really CPU intensive, and thus harmful?

Yes it is, so we need to handle it carefully. If left unchecked, polling could easily eat huge quantities of domain 0 CPU time, starving other processes and causing overall system performance to drop.

We have chosen to do two things to avoid consuming too much CPU time:

  1. Poll the ring only when there's a good chance of a request appearing. Of course, guest behaviour is totally unpredictable in general, but there are some principles that can increase our chances of polling at the right time. For example, one assumption we adopt is that it's worth polling for a short time after the guest issues an I/O request. It has issued one request, so there's a good chance that it will issue another soon after. And if this guess turns out to be correct, keep on polling for a bit longer in case any more turn up. If there are none for a while, stop polling and temporarily fall back to the event-based approach.
  2. Don't poll if domain 0 is already very busy. Since polling is expensive in terms of CPU cycles, we only enter the polling loop if we are sure that it won't starve other processes of CPU time they may need.
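The two rules above can be sketched as a small decision function. This is only an illustration of the idea, not the tapdisk3 code: the class name, the 50-microsecond polling window and the 80% busy threshold are all hypothetical values picked for the example.

```python
from dataclasses import dataclass


@dataclass
class PollPolicy:
    # Hypothetical tunables; the real thresholds are not given in this post.
    poll_window_us: float = 50.0   # keep polling this long after the last request
    busy_threshold: float = 0.8    # dom0 CPU utilisation above which we never poll

    def should_poll(self, us_since_last_request, dom0_utilisation):
        """Return True if polling the ring looks worthwhile right now."""
        # Rule 2: don't poll if domain 0 is already very busy.
        if dom0_utilisation >= self.busy_threshold:
            return False
        # Rule 1: poll only shortly after the guest issued a request,
        # since another request is likely to follow soon.
        return us_since_last_request <= self.poll_window_us


policy = PollPolicy()
print(policy.should_poll(us_since_last_request=10.0, dom0_utilisation=0.2))
print(policy.should_poll(us_since_last_request=10.0, dom0_utilisation=0.95))
print(policy.should_poll(us_since_last_request=500.0, dom0_utilisation=0.2))
```

Each new request resets the time since the last request, so a busy guest keeps the backend in the polling loop, while an idle one lets it fall back to the event-based approach.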

How much faster does it go?

The benefit you will get from polling depends primarily on the latency of your physical storage device. If you are using an old mechanical hard-drive or an NFS share on a server on the other side of the planet, shaving a few microseconds off the journey through the virtualisation layer isn't going to make much of a difference. But on modern devices and low-latency network-based storage, polling can make a sizeable difference. This is especially true for smaller request sizes since these are most latency-sensitive.

For example, the following graph shows an improvement of 50% in single-threaded sequential read I/O for small request sizes – from 15,000 IOPS to 22,500 IOPS. These measurements were made with iometer in a 32-bit Windows 7 SP1 VM on a Dell PowerEdge R730xd with an Intel P3700 NVMe drive.
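To get a feel for what those IOPS figures mean in terms of latency, here is a small worked calculation. It assumes only one request is outstanding at a time, so that IOPS is simply the reciprocal of the per-request round-trip time; the post does not state the queue depth, so treat that as an assumption.

```python
def per_request_latency_us(iops):
    """Round-trip time per request in microseconds, assuming one request in flight."""
    return 1_000_000 / iops


before_us = per_request_latency_us(15_000)
after_us = per_request_latency_us(22_500)

print(f"before polling: {before_us:.1f} us per request")
print(f"with polling:   {after_us:.1f} us per request")
print(f"saved:          {before_us - after_us:.1f} us per request")
print(f"IOPS increase:  {(22_500 / 15_000 - 1) * 100:.0f}%")
```

Under that assumption, a saving of a little over 20 microseconds per request is enough to account for the whole 50% improvement, which is why the benefit shrinks on storage whose own latency is measured in milliseconds.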

[Graph: single-threaded sequential read I/O performance]

How was polling implemented?

The code to add polling to tapdisk3 can be found in the following set of commits: https://github.com/xapi-project/blktap/pull/179/commits.


XenServer Dundee Released

It was a little over a year ago when I introduced a project code named Dundee to this community. In the intervening year, we've had a number of pre-release builds, all introducing ever greater capabilities into what I'm now happy to announce as XenServer 7. As you would expect from a major version number, XenServer 7 makes some rather significant strides forward, and defines a significant new capability.

Let's start first with the significant new capability. Some of you may have noted an interesting new security effort appear in upstream Xen a few years ago. Leading this effort was Bitdefender, and at the time it was known by the catchy title of "virtual machine introspection". This effort takes full advantage of the Intel EPT virtualization extensions to permit a true agentless anti-malware solution, where the anti-malware engine is placed in a service VM which is inaccessible from the guest VMs. XenServer 7 officially supports this technology with the Direct Inspect API set, and is platform ready for Bitdefender GravityZone HVI. For virtualization users, the combination of Direct Inspect and GravityZone HVI reduces the attack surface for malware by both removing in-guest agents, and by actively monitoring memory usage from the hypervisor to detect malicious memory accesses and flag questionable activity for remediation. When combined with support for Intel SMAP and PML, XenServer 7 offers significantly increased security compared to previous versions. Since secure operation extends to secure access to the host management APIs, XenServer 7 fully supports TLS 1.2, and can optionally mandate the use of TLS 1.2.

XenServer 7 extends the vGPU market initially defined in 2013 to include both increased scalability with NVIDIA GRID Maxwell M10 and the latest Intel Iris Pro virtual graphics. When combined, these vGPU extensions open the door to greater adoption of virtualized graphics by both increasing the number of GPU enabled VMs per host, as well as potentially removing the requirement for a dedicated GPU add-in card.

Operating virtual infrastructure at any level of scale requires an understanding of the overall health of the environment. While recent XenServer versions have included the ability to upload server status information to the free Citrix Insight Services, this operation was completely manual. With XenServer 7, we're introducing Health Check, a proactive service that works in concert with Insight Services to monitor the operational health of a XenServer environment and alert you to any issues. The best part of Health Check is that it's completely free and open to any user of XenServer 7.

No major release would be complete without a requisite bump in performance, and XenServer 7 is no exception. Host memory limits have been bumped to 5TB per host, with a corresponding bump to 1.5TB per VM; OS willing of course. Host CPU count has been increased to 288 cores, and guest virtual CPU count has increased to 32; again OS willing. Disk scalability has also increased with support for up to 255 virtual block devices per VM and 4096 VBDs per host, all while supporting up to 20,000 VDIs per SR. Since XenServer often is deployed in Microsoft Windows environments, Active Directory support for role based authentication is a key requirement, and with XenServer 7, we've improved overall AD performance to support very large AD forests with a resulting improvement in login times.

 

XenServer 7 is available for download today, and can be obtained for free from the XenServer download page.

Recent Comments
Willem Boterenbrood
Congrats on the new release! We were waiting for it to arrive, finally XenServer support for Xeon v3/4 CPU masking and much more i... Read More
Tuesday, 24 May 2016 18:22
David Cottingham
Fixed -- thanks for catching that :-). Upgrades: please see http://docs.citrix.com/content/dam/docs/en-us/xenserver/xenserver-7-0... Read More
Tuesday, 07 June 2016 14:59
David Cottingham
DVSC is supported on 7.0. We're working on getting the downloads on citrix.com accessible.
Tuesday, 07 June 2016 15:07

XenServer Administrators Handbook Published

Last year, I announced that we were working on a XenServer Administrators Handbook, and I'm very pleased to announce that it's been published. Not only have we been published, but based on the Amazon reviews to date we've done a pretty decent job. In part, I suspect that has a ton to do with the book being focused on what information you, XenServer administrators, need to be successful when running a XenServer environment regardless of scale or workload.

The handbook is formatted following a simple premise; first you need to plan your deployment and second you need to run it. With that in mind, we start with exactly what a XenServer is, define how it works and what expectations it has on infrastructure. After all, it's critical to understand how a product like XenServer interfaces with the real world, and how its virtual objects relate to each other. We even cover some of the misunderstandings those new to XenServer might have.

While it might be tempting to go deep on some of this stuff, Jesse and I both recognized that virtualization SREs have a job to do and that's to run virtual infrastructure. As interesting as it might be to dig into how the product is implemented, that's not the role of an administrators handbook. That's why the second half of the book provides some real world scenarios, and how to go about solving them.

We had an almost limitless list of scenarios to choose from, and what you see in the book represents real world situations which most SREs will face at some point. The goal of this format is to have a handbook which can be actively used, not something which is read once and placed on some shelf (virtual or physical). During the technical review phase, we sent copies out to actual XenServer admins, all of whom stated that we'd presented some piece of information they hadn't previously known. I for one consider that to be a fantastic compliment.

Lastly, I want to finish off by saying that like all good works, this is very much a "we" effort. Jesse did a top notch job as co-author and brings the experience of someone whose job it is to help solve customer problems. Our technical reviewers added tremendously to the polish you'll find in the book. The O'Reilly Media team was a pleasure to work with, pushing when we needed to be pushed but understanding that day jobs and family take precedence.

So whether you're looking at XenServer out of personal interest, have been tasked with designing a XenServer installation to support Citrix workloads, clouds, or general purpose virtualization, or have a XenServer environment to call your own, there is something in here for you. Jesse and I hope that everyone who gets a copy finds it valuable. The XenServer Administrators Handbook is available from booksellers everywhere, including:

Amazon: http://www.amazon.com/XenServer-Administration-Handbook-Successful-Deployments/dp/149193543X/

Barnes and Noble: http://www.barnesandnoble.com/w/xenserver-administration-handbook-tim-mackey/1123640451

O'Reilly Media: http://shop.oreilly.com/product/0636920043737.do

If you need a copy of XenServer to work with, you can obtain that for free from: http://xenserver.org/download

Recent Comments
Tobias Kreidl
A timely publication, given all the major recent enhancements to XenServer. It's packed with a lot of hands-on, practical advice a... Read More
Tuesday, 03 May 2016 03:37
Eric Hosmer
Been looking forward to getting this book, just purchased it on Amazon. Now I just need to find that mythical free time to read ... Read More
Friday, 06 May 2016 22:41

Implementing VDI-per-LUN storage

With storage providers adding better functionality to provide features like QoS, fast snapshot & clone and with the advent of storage-as-a-service, we are interested in the ability to utilize these features from XenServer. VMware’s VVols offering already allows integration of vendor provided storage features into their hypervisor. Since most storage allows operations at the granularity of a LUN, the idea is to have a one-to-one mapping between a LUN on the backend and a virtual disk (VDI) on the hypervisor. In this post we are going to talk about the supplemental pack that we have developed in order to enable VDI-per-LUN.

XenServer Storage

To understand the supplemental pack, it is useful to first review how XenServer storage works. In XenServer, a storage repository (SR) is a top-level entity which acts as a pool for storing VDIs which appear to the VMs as virtual disks. XenServer provides different types of SRs (File, NFS, Local, iSCSI). In this post we will be looking at iSCSI based SRs as iSCSI is the most popular protocol for remote storage and the supplemental pack we developed is targeted towards iSCSI based SRs. An iSCSI SR uses LVM to store VDIs over logical volumes (hence the type is lvmoiscsi). For instance:

[root@coe-hq-xen08 ~]# xe sr-list type=lvmoiscsi
uuid ( RO)                : c67132ec-0b1f-3a69-0305-6450bfccd790
          name-label ( RW): syed-sr
    name-description ( RW): iSCSI SR [172.31.255.200 (iqn.2001-05.com.equallogic:0-8a0906-c24f8b402-b600000036456e84-syed-iscsi-opt-test; LUN 0: 6090A028408B4FC2846E4536000000B6: 10 GB (EQLOGIC))]
                host ( RO): coe-hq-xen08
                type ( RO): lvmoiscsi
        content-type ( RO):

The above SR is created from a LUN on a Dell EqualLogic. The VDIs belonging to this SR can be listed by:

[root@coe-hq-xen08 ~]# xe vdi-list sr-uuid=c67132ec-0b1f-3a69-0305-6450bfccd790 params=uuid
uuid ( RO)    : ef5633d2-2ad0-4996-8635-2fc10e05de9a

uuid ( RO)    : b7d0973f-3983-486f-8bc0-7e0b6317bfc4

uuid ( RO)    : bee039ed-c7d1-4971-8165-913946130d11

uuid ( RO)    : efd5285a-3788-4226-9c6a-0192ff2c1c5e

uuid ( RO)    : 568634f9-5784-4e6c-85d9-f747ceeada23

[root@coe-hq-xen08 ~]#

This SR has five VDIs. From LVM’s perspective, an SR is a volume group (VG) and each VDI is a logical volume (LV) inside that volume group. This can be seen via the following commands:

[root@coe-hq-xen08 ~]# vgs | grep c67132ec-0b1f-3a69-0305-6450bfccd790
  VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790   1   6   0 wz--n-   9.99G 5.03G
[root@coe-hq-xen08 ~]# lvs VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790
  LV                                       VG                                                 Attr   LSize 
  MGT                                      VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790 -wi-a-   4.00M                                 
  VHD-568634f9-5784-4e6c-85d9-f747ceeada23 VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790 -wi-ao   8.00M                               
  VHD-b7d0973f-3983-486f-8bc0-7e0b6317bfc4 VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790 -wi-ao   2.45G                               
  VHD-bee039ed-c7d1-4971-8165-913946130d11 VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790 -wi---   8.00M                                
  VHD-ef5633d2-2ad0-4996-8635-2fc10e05de9a VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790 -ri-ao   2.45G
  VHD-efd5285a-3788-4226-9c6a-0192ff2c1c5e VG_XenStorage-c67132ec-0b1f-3a69-0305-6450bfccd790 -ri-ao  36.00M

Here c67132ec-0b1f-3a69-0305-6450bfccd790 is the UUID of the SR. Each VDI is represented by a corresponding LV whose name has the format VHD-<VDI UUID>. Some of the LVs have a small size of 8MB. These are snapshots taken on XenServer. There is also an LV named MGT which holds metadata about the SR and the VDIs present in it. Note that all of this is present in an SR which is a LUN on the backend storage.
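The naming convention visible in the listings above can be captured in a couple of helper functions. This is a minimal sketch based only on the names shown in the command output; the function names are made up for this example and are not part of any XenServer API.

```python
VG_PREFIX = "VG_XenStorage-"
LV_PREFIX = "VHD-"


def vg_name_for_sr(sr_uuid):
    """Volume group name that backs the SR with the given UUID."""
    return VG_PREFIX + sr_uuid


def vdi_uuid_from_lv(lv_name):
    """VDI UUID encoded in an LV name, or None for non-VDI volumes such as MGT."""
    if lv_name.startswith(LV_PREFIX):
        return lv_name[len(LV_PREFIX):]
    return None


print(vg_name_for_sr("c67132ec-0b1f-3a69-0305-6450bfccd790"))
print(vdi_uuid_from_lv("VHD-568634f9-5784-4e6c-85d9-f747ceeada23"))
print(vdi_uuid_from_lv("MGT"))
```

Because the SR and VDI UUIDs are embedded in the LVM names themselves, a block-level copy of the LUN carries exactly the same names with it.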

Now XenServer can attach a LUN at the level of an SR but we want to map a LUN to a single VDI. In order to do that, we restrict an SR to contain a single VDI. Our new SR has the following LVs:

[root@coe-hq-xen09 ~]# lvs VG_XenStorage-1fe527a4-7e96-cdd9-f347-a15c240f26e9
LV                                       VG                                                 Attr   LSize
MGT                                      VG_XenStorage-1fe527a4-7e96-cdd9-f347-a15c240f26e9 -wi-a- 4.00M
VHD-09b14a1b-9c0a-489e-979c-fd61606375de VG_XenStorage-1fe527a4-7e96-cdd9-f347-a15c240f26e9 -wi--- 8.02G
[root@coe-hq-xen09 ~]#

[Figure: VDI-per-LUN mapping]

If a snapshot or clone of the LUN is taken on the backend, all the unique identifiers associated with the different entities in the LUN also get cloned and any attempt to attach the LUN back to XenServer will result in an error because of conflicts of unique IDs.

Resignature and supplemental pack

In order for the cloned LUN to be re-attached, we need to resignature the unique IDs present in the LUN. The following IDs need to be resignatured:

  • LVM UUIDs (PV, VG, LV)
  • VDI UUID
  • SR metadata in the MGT logical volume
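Conceptually, resignaturing boils down to giving every identifier in the cloned LUN a fresh value and remembering the mapping, so that names and metadata that embed the old IDs can be rewritten consistently. The sketch below shows only that bookkeeping step. It is not the supplemental pack's code: the function name is hypothetical, and it treats every identifier as an opaque string and replaces it with a standard UUID regardless of the format the real identifier uses.

```python
import uuid


def plan_resignature(old_ids):
    """Return a mapping from each old identifier to a freshly generated one."""
    return {old: str(uuid.uuid4()) for old in old_ids}


# Identifiers taken from the single-VDI SR shown earlier in this post.
old_sr = "1fe527a4-7e96-cdd9-f347-a15c240f26e9"
old_vdi = "09b14a1b-9c0a-489e-979c-fd61606375de"

mapping = plan_resignature([old_sr, old_vdi])
print("new VG name:", "VG_XenStorage-" + mapping[old_sr])
print("new LV name:", "VHD-" + mapping[old_vdi])
```

Once every ID has a new value, the clone no longer collides with the original and can be attached alongside it.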

We at CloudOps have developed an open-source supplemental pack which solves the resignature problem. You can find it here. The supplemental pack adds a new type of SR (relvmoiscsi) and you can use it to resignature your lvmoiscsi SRs. After installing the supplemental pack, you can resignature a clone using the following command:

[root@coe-hq-xen08 ~]# xe sr-create name-label=syed-single-clone type=relvmoiscsi \
device-config:target=172.31.255.200 \
device-config:targetIQN=$IQN \
device-config:SCSIid=$SCSIid \
device-config:resign=true \
shared=true
Error code: SR_BACKEND_FAILURE_1
Error parameters: , Error reporting error, unknown key The SR has been successfully resigned. Use the lvmoiscsi type to attach it,
[root@coe-hq-xen08 ~]#

Here, instead of creating a new SR, the supplemental pack re-signatures the provided LUN and detaches it (the error is expected as we don’t actually create an SR). You can see from the error message that the SR has been re-signed successfully. Now the cloned SR can be introduced back to XenServer without any conflicts using the following commands:

[root@coe-hq-xen09 ~]# xe sr-probe type=lvmoiscsi device-config:target=172.31.255.200 device-config:targetIQN=$IQN device-config:SCSIid=$SCSIid

   		 5f616adb-6a53-7fa2-8181-429f95bff0e7
   		 /dev/disk/by-id/scsi-36090a028408b3feba66af52e0000a0e6
   		 5364514816

[root@coe-hq-xen09 ~]# xe sr-introduce name-label=vdi-test-resign type=lvmoiscsi \
uuid=5f616adb-6a53-7fa2-8181-429f95bff0e7
5f616adb-6a53-7fa2-8181-429f95bff0e7

This supplemental pack can be used in conjunction with an external orchestrator like CloudStack or OpenStack which can manage both the storage and compute. Working with SolidFire we have implemented this functionality, available in the next release of Apache CloudStack. You can check out a preview of this feature in a screencast here.

Recent Comments
Nick
If I am reading this correctly, this is just basically setting up XS to use 1 SR per VM, this isn't scalable as the limits for LUN... Read More
Tuesday, 26 April 2016 14:57
Syed Ahmed
Hi Nick, The limit of 256 SRs is when using Multipating. If no multipath is used, the number of SRs that can be created are well... Read More
Tuesday, 26 April 2016 17:19
Syed Ahmed
There is an initial overhead when creating SRs. However, we did not find any performance degradation in our tests once the SR is s... Read More
Wednesday, 27 April 2016 09:21

About XenServer

XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world's largest clouds and enterprises.
 
Commercial support for XenServer is available from Citrix.