Virtualization Blog

Discussions and observations on virtualization.

Dundee Alpha 2 Released

I am pleased to announce that today we have made available the second alpha build for XenServer Dundee. For those of you who missed the first alpha, it was focused entirely on the move to CentOS 7 for dom0. This important operational change is one long time XenServer users and those who have written management tooling for XenServer should be aware of throughout the Dundee development cycle. At the time of Alpha 1, no mention was made for feature changes, and with Alpha 2 we're going to talk about some features. So here are some of the important items to be aware of.

Thin Provisioning on block storage

For those who aren't aware, when a XenServer SR is using iSCSI or an HBA, the virtual disks have always consumed their entire allocated space regardless of how utilized the actual virtual disk was. With Dundee we now have full thin provisioning for all block storage independent of storage vendor. In order to take advantage of this, you will need to indicate during SR creation that thin provisioning is required. You will also be given the opportunity to specify the default vdi allocation which allows users to optimize vdi utilization against storage performance. We do know about a number of areas still needing attention, but are providing early access such that the community can further identify issues our testing hasn't yet encountered.

NFS version 4

While a simple enhancement, this was identified as a priority item during the Creedence previews last year. We didn't really have the time then to fully implement it, but as of Dundee Alpha 2 you can specify NFS 4 for SR creation in XenCenter.

Intel GVT-d

XenServer 6.5 SP1 introduced support for Intel GVT-d graphics in Haswell and Broadwell chips. This support has been ported to Dundee and is now present in Alpha 2. At this point GPU operations in Dundee should have feature parity to XenServer 6.5 SP1.

CIFS for virtual disk storage

For some time we've had CIFS as an option for ISO storage, but lacked it for virtual disk storage. That has been remedied and if you are running CIFS you can now use it for all your XenServer storage needs.

Changed dom0 disk size

During installation of XenServer 6.5 and prior, a 4GB partition is created for dom0 with an additional 4GB partition created as a backup. For some users, the 4GB partition was too limiting, particularly if remote SYSLOG wasn't used or when third party monitoring tools were installed in dom0. With Dundee we've completely changed the local storage layout for dom0, and this has significant implications for all users wishing to upgrade to Dundee.

New layout

The new partition layout will consume 46GB from local storage. If there is less than 46 GB available, then a fresh install will fail. The new partition layout will be as follows:

  • 512 MB UEFI boot partition
  • 18 GB dom0 partition
  • 18 GB backup partition
  • 4 GB logs partition
  • 1 GB SWAP partition

As you can see from this new partition layout that we've separated logs and SWAP out of the main operating partition, and that we're now supporting UEFI boot.

Upgrades

During upgrade, if there is at least 46 GB available, we will recreate the partition layout to match that of a fresh install. In the event 46GB isn't available, we will shrink the existing dom0 partition from 4 GB to 3.5 GB and create the 512 MB UEFI boot partition.

Downloading Alpha 2

 

Dundee Alpha 2 is available for download from xenserver.org/prerelease

Recent Comments
Tim Stephenson
46Gb+ seems very large for Dom0 :/ We're running XenServer on a set of Dell Poweredge R630's s equipped with dual 16Gb SD card bo... Read More
Tuesday, 14 July 2015 20:17
Tim Mackey
It's not 46GB for dom0, but 46GB for all storage dedicated to XenServer itself (dom0 is only 18GB). I hear you about the SD card ... Read More
Tuesday, 14 July 2015 20:26
Tobias Kreidl
First off, this is fantastic news, regarding what's mentioned (NFSv4, thin provisioning on LVM, bigger partition space, CIFS stora... Read More
Tuesday, 14 July 2015 20:45
Continue reading
26542 Hits
42 Comments

Configuring XenApp to use two NVIDIA GRID engines

SUMMARY

The configuration of a XenApp virtual machine (VM) hosted on XenServer that supports two concurrent graphics processing engines in passthrough mode is shown to work reliably and provide the opportunity to give more flexibility to a single XenApp VM rather than having to spread the access to the engines over two separate XenApp VMs. This in turn can provide more flexibility, save operating system licensing costs and ostensibly, could be extended to incorporate additional GPU engines.

INTRODUCTION

A XenApp virtual machine (VM) that supports two or more concurrent graphics processing units (GPUs) has a number of advantages over running separate VM instances, each with its own GPU engine. For one, if users happen to be unevenly relegated to particular XenApp instances, some XenApp VMs may idle while other instances are overloaded, to the detriment of users associated with busy instances. It is also simpler to add capacity to such a VM as opposed to building and licensing yet another Windows Server VM.  This study made use of an NVIDIA GRID K2 (driver release 340.66), comprised of two Kepler GK104 engines and 8 GB of GDDR5 RAM (4 GB per GPU). It is hosted in a base system that consists of a Dell R720 with dual Intel Xeon E5-2680 v2 CPUs (40 VCPUs, total, hyperthreaded) hosting XenServer 6.2 SP1 running XenApp 7.6 as a VM with 16 VCPUs and 16 GB of memory on Windows 2012 R2 Datacenter.

PROCEDURE

It is important to note that these steps constitute changes that are not officially supported by Citrix or NVIDIA and are to be regarded as purely experimental at this stage.

Registry changes to XenApp were made according to these instructions provided in the Citrix Product Documentation.

On the XenServer, first list devices and look for GRID instances:
# lspci|grep -i nvid
44:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K2] (rev a1)
45:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K2] (rev a1)

Next, get the UUID of the VM:
# xe vm-list
uuid ( RO)           : 0c8a22cf-461f-0030-44df-2e56e9ac00a4
     name-label ( RW): TST-Win7-vmtst1
    power-state ( RO): running
uuid ( RO)           : 934c889e-ebe9-b85f-175c-9aab0628667c
     name-label ( RW): DEV-xapp
    power-state ( RO): running

Get the address of the existing GPU engine, if one is currently associated:
# xe vm-param-get param-name=other-config uuid=934c889e-ebe9-b85f-175c-9aab0628667c
vgpu_pci: 0/0000:44:00.0; pci: 0/0000:44:0.0; mac_seed: d229f84d-73cc-e5a5-d105-f5a3e87b82b7; install-methods: cdrom; base_template_name: Windows Server 2012 (64-bit)
(Note: ignore any vgpu_pci parameters that are irrelevant now to this process, but may be left over from earlier procedures and experiments.)

Dissociate the GPU via XenCenter or via the CLI, set GPU type to “none”.
Then, add both GPU engines following the recommendations in assigning multiple GPUs to a VM in XenServer using the other-config:pci parameter:
# xe vm-param-set uuid=934c889e-ebe9-b85f-175c-9aab0628667c
   other-config:pci=0/0000:44:0.0,0/0000:45:0.0
In other words, do not use the vgpu_pci parameter at all.

Check if the new parameters took hold:
# xe vm-param-get param-name=other-config uuid=934c889e-ebe9-b85f-175c-9aab0628667c params=all
vgpu_pci: 0/0000:44:00.0; pci: 0/0000:44:0.0,0/0000:45:0.0; mac_seed: d229f84d-73cc-e5a5-d105-f5a3e87b82b7; install-methods: cdrom; base_template_name: Windows Server 2012 (64-bit)
Next, turn GPU passthrough back on for the VM in XenCenter or via the CLI and start up the VM.

On the XenServer you should now see no GPUs available:
# nvidia-smi
Failed to initialize NVML: Unknown Error
This is good, as both K2 engines now have been allocated to the XenApp server.
On the XenServer you can also run “xn –v pci-list  934c889e-ebe9-b85f-175c-9aab0628667c” (the UUID of the VM) and should see the same two PCI devices allocated:
# xn -v pci-list 934c889e-ebe9-b85f-175c-9aab0628667c
id         pos bdf
0000:44:00.0 2   0000:44:00.0
0000:45:00.0 1   0000:45:00.0
More information can be gleaned from the “xn diagnostics” command.

Next, log onto the XenApp VM and check settings using nvidia-smi.exe. The output will resemble that of the image in Figure 1.

 

 GRID-Fig-1.jpg
Figure 1. Output From the nvidia-smi utility, showing the allocation of both K2 engines.


Note the output shows correctly that 4096 MiB of memory are allocated for each of the two engines in the K2, totaling its full capacity of 8196 MiB. XenCenter will still show only one GPU engine allocated (see Figure 2) since it is not aware that both are allocated to the XenApp VM and has currently no way of making that distinction.

 

GRID-Fig-2.jpgFigure 2. XenCenter GPU allocation (showing just one engine – all XenServer is currently capable of displaying)

 

So, how can you tell if it is really using both GRID engines? If you run the nvidia-smi.exe program on the XenApp VM itself, you will see it has two GPUs configured in passthrough mode (see the earlier screenshot in Figure 1). Depending on how apps are launched, you will see one or the other or both of them active.  As a test, we ran two concurrent Unigine "Heaven" benchmark instances and both came out with the same metrics within 1% of each other as well as when just one instance was run, and both engines showed as being active. Displayed in Figure 3 is a sample screenshot of the Unigine ”Heaven” benchmark running with one active instance; note that it sees both K2 engines present, even though the process is making use of just one.


GRID-Fig-3.jpg
Figure 3. A sample Unigine “Heaven” benchmark frame. Note the two sets of K2 engine metrics displayed in the upper right corner.


It is evident from the display in the upper right hand corner that one engine has allocated memory and is working, as evidenced by the correspondingly higher temperature reading and memory frequency. The result of a benchmark using openGL and a 1024x768 pixel resolution is seen in Figure 4. Note again the difference between what is shown for the two engines, in particular the memory and temperature parameters.

 GRID-Fig-4.jpg

Figure 4. Outcome of the benchmark. Note the higher memory and temperature on the second K2 engine.

 

When another instance is running concurrently, you see its memory and temperature also rise accordingly in addition to the load evident on the first engine, as well as activity on both engines in the output from the nvidia-smi.exe utility (Figure 5).


CaptureDualK2c.JPG
Figure 5. Two simultaneous benchmarks running, using both GRID K2 engines, and the nvidia-smi output.

You can also see with two instances running concurrently how the load is affected. Note in the performance graphs from XenCenter shown in Figure 6 how one copy of the “Heaven” benchmark impacts the server and then about halfway across the graphs, a second instance is launched.

 GRID-Fig-6.jpg

Figure 6. XenCenter performance metrics of first one, then a second concurrent Unigine “Heaven” benchmark.


CONCLUSIONS

The combination of two GRID K2 engines associated with a single, hefty XenApp VM works well for providing adequate capacity to support a number of concurrent users in GPU passthrough mode without the need of hosting additional XenApp instances. As there is a fair amount of leeway in the allocation of CPUs and memory to a virtualized instance under XenServer (up to 16 vCPUs and 128 GB of memory under XenServer 6.2 when these tests were run), one XenApp VM should be able to handle a reasonably large number of tasks.  As many as six concurrent sessions of this high-demand benchmark with 800x600 high-resolution settings have been tested with the GPUs still not saturating. A more typical application, like Google Earth, consumes around 3 to 5% of the cycles of a GRID K2 engine per instance during active use, depending on the activity and size of the window, so fairly minimal. In other words, twenty or more sessions could be handled by each engine, or potentially 40 or more for the entire GRID K2 with a single XenApp VM, provided of course that the XenApp’s memory and its own CPU resources are not overly taxed.

XenServer 6.2 already supports as many as eight physical GPUs per host, so as servers expand, one could envision having even more available engines that could be associated with a particular VM. Under some circumstances, passthrough mode affords more flexibility and makes better use of resources compared to creating specific vGPU assignments. Windows Server 2012 R2 Datacenter supports up to 64 sockets and 4 TB of memory, and hence should be able to support a significantly larger number of associated GPUs. XenServer 6.2 SP1 has a processor limit of 16 VCPUs and 128 GB of virtual memory. XenServer 6.5, officially released in January 2015, supports up to four K2 GRID cards in some physical servers and up to 192 GB of RAM per VM for some guest operating systems as does the newer release documented in the XenServer 6.5 SP1 User's Guide, so there is a lot of potential processing capacity available. Hence, a very large XenApp VM could be created that delivers a lot of raw power with substantial Microsoft server licensing savings. The performance meter shown above clearly indicates that VCPUs are the primary limiting factor in the XenApp configuration and with just two concurrent “Heaven” sessions running, about a fourth of the available CPU capacity is consumed compared to less than 3 GB of RAM, which is only a small additional amount of memory above that allocated by the first session.

These same tests were run after upgrading to XenServer 6.5 and with newer versions of the NVIDIA GRID drivers and continue to work as before. At various times, this configuration was run for many weeks on end with no stability issues or errors detected during the entire time.

ACKNOWLEDGEMENTS

I would like to thank my co-worker at NAU, Timothy Cochran, for assistance with the configuration of the Windows VMs used in this study. I am also indebted to Rachel Berry, Product Manager of HDX Graphics at Citrix and her team, as well as Thomas Poppelgaard and also Jason Southern of the NVIDIA Corporation for a number of stimulating discussions. Finally, I would like to greatly thank Will Wade of NVIDIA for making available the GRID K2 used in this study.

Continue reading
16645 Hits
0 Comments

About XenServer

XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world's largest clouds and enterprises.
 
Commercial support for XenServer is available from Citrix.