Virtualization Blog

Discussions and observations on virtualization.

Citrix Joins OpenStack Foundation

Some of you might have noticed that Citrix joined the OpenStack Foundation yesterday and may be wondering what this means for two key technologies I've been closely involved with: Apache CloudStack and XenServer. The first, and arguably most important, thing to note is that, as Steve Wilson has stated, we're embracing both OpenStack and CloudStack to help further innovation. Nand Mulchandani also highlights that a culture of “anyness” is a core part of Citrix. With all the noise in the market about the various IaaS cloud solutions, supporting user choice is an important point to be clear about. So with that as a backdrop, what does this really mean?

The XenServer Perspective on OpenStack

As I mentioned in my blog about OpenStack Summit, I really want XenServer to be a first class citizen within OpenStack. I tried to further that objective through submission of presentations to OpenStack Summit, but if you look at the schedule you'll note that no XenServer related talks were accepted. That's unfortunate, and really speaks to the challenge we face within a community when we're not the obvious or default choice. Obviously we can raise our profile through contributions and simply showing up at OpenStack events, but there is also a pretty important and easy thing we can change.

When a vendor evaluates a technology, they look at the ecosystem around it. OpenStack has a ton of buzz: look at the job boards and you'll see many postings for OpenStack positions, and search for cloud technologies and OpenStack's key supporters will be listed. Importantly, when selecting a technology suite, you look at who supports their technology with that suite and use them to build your short list. Until today, it was unclear whether Citrix actively supported the use of XenServer within OpenStack. Joining the OpenStack Foundation is one way of signaling to those who prefer OpenStack that Citrix is supportive of their efforts. So if you've been quietly using XenServer in an OpenStack environment, I want to learn more about it. I want to learn what works, and where the pain points are so they might be addressed. And if you've ever wondered whether XenServer can be commercially supported when used with OpenStack, the answer is yes, and here's a link to buy support (hard sell over)!

The XenServer Perspective on CloudStack

For those of you who have adopted XenServer for your CloudStack clouds, nothing has changed, and nothing will change. XenServer will remain a first class citizen in CloudStack, and we'll continue to improve all aspects of XenServer operation within CloudStack so that XenServer remains an obvious choice. You'll continue to see XenServer content proposed for CloudStack events, and I hope you'll continue to accept those talks. I promise to keep working on cool things like the Packer work I presented at CloudStack Day Austin, which showed a method to migrate legacy infrastructure running on XenServer to a CloudStack cloud powered by XenServer, all without users even noticing the migration happened. My hope is that the OpenStack community will want some of those same cool things, but that will take time and can't be forced.

So in the end this really isn't a commentary about which cloud solution is better, but a case of allowing customer choice. OpenStack has mindshare, and it only makes sense for Citrix and its technology suite to have a seat at the table. With Citrix openly supporting its technologies when deployed with OpenStack, everyone has the freedom to choose which solution works best.     


XenServer 6.5 and Asymmetric Logical Unit Access (ALUA) for iSCSI Devices

INTRODUCTION

There are a number of ways to connect storage devices to XenServer hosts and pools, including local storage, SAS and Fibre Channel HBAs, NFS and iSCSI. With iSCSI, there are a number of implementation variations, including support for multipathing in both active/active and active/passive configurations, plus the ability to support so-called “jumbo frames”, where the MTU is increased from 1500 to typically 9000 to optimize frame transmissions. One of the lesser-known and somewhat esoteric iSCSI options available on many modern iSCSI-based storage devices is Asymmetric Logical Unit Access (ALUA), a protocol that has been around for a decade and that can be used not only with iSCSI, but also with Fibre Channel storage. The purpose of this article is to clarify and outline how ALUA can now be used more flexibly with iSCSI on XenServer 6.5.

HISTORY

ALUA support on XenServer goes back to XenServer 5.6, initially only for Fibre Channel devices. Support for iSCSI ALUA connectivity started with XenServer 6.0 and was initially limited to specific ALUA-capable devices, which included the EMC Clariion and NetApp FAS as well as the EMC VMAX and VNX series. Each device required specific multipath.conf file configurations to integrate properly with the server used to access it, XenServer being no exception. The upstream XenServer code also required customizations. The "How to Configure ALUA Multipathing on XenServer 6.x for Enterprise Arrays" article CTX132976 (March 2014, revised March 2015) currently only discusses ALUA support through XenServer 6.2 and only for specific devices, stating: “Most significant is the usability enhancement for ALUA; for EMC™ VNX™ and NetApp™ FAS™, XenServer will automatically configure for ALUA if an ALUA-capable LUN is attached”.

The XenServer 6.5 Release Notes state that XenServer will automatically configure connections to the aforementioned documented devices, and that it now runs an updated device mapper multipath (DMMP) version, 0.4.9-72. This rekindled my interest in ALUA connectivity, and after some research and discussions with Citrix and Dell about support, it appeared this might now be possible specifically for the Dell MD3600i units we have used on XenServer pools for some time. What is not stated in the release notes is that XenServer 6.5 now has the ability to connect generically to a large number of ALUA-capable storage arrays; this is covered in more detail later. It is also of note that MPP-RDAC support is no longer available in XenServer 6.5 and DMMP is the only supported multipath mechanism. This was in part because of support and vendor-specific issues (see, for example, the XenServer 6.5 Release Notes or this document from Dell, Inc.).

But first, how are ALUA connections even established? And perhaps of greater interest, what are the benefits of ALUA in the first place?

ALUA DEFINITIONS AND SETTINGS

As the name suggests, ALUA is intended to optimize storage traffic by making use of optimized paths. With multipathing and multiple controllers, there are a number of paths a packet can take to reach its destination. With two controllers on a storage array and two NICs dedicated to iSCSI traffic on a host, there are four possible paths to a storage Logical Unit Number (LUN). On the XenServer side, LUNs are then associated with storage repositories (SRs). ALUA recognizes that once an initial path is established to a LUN, any multipathing activity destined for that same LUN is better served if routed through the same storage array controller, and it attempts to do so as much as possible, unless of course a failure forces the connection to take an alternative path. ALUA connections fall into five self-explanatory categories (listed along with their associated hex codes):

  • Active/Optimized : 0x0
  • Active/Non-Optimized : 0x1
  • Standby : 0x2
  • Unavailable : 0x3
  • Transitioning : 0xf

For ALUA to work, an active/active storage path is required, and more specifically an asymmetrical active/active mechanism. The advantage of ALUA comes from less fragmentation of packet traffic by routing, where possible, both paths of the multipath connection via the same storage array controller, since the extra path through a different controller is less efficient. It is very difficult to locate specific metrics on the overall gains, but hints of up to 20% can be found in on-line articles (e.g., this openBench Labs report on Nexsan), hence this is not an insignificant amount and is potentially more significant than the gains reached by implementing jumbo frames. It should be noted that the debate continues to this day regarding the benefits of jumbo frames and to what degree, if any, they are beneficial. Among the numerous articles to be found are: The Great Jumbo Frames Debate from Michael Webster, Jumbo Frames or Not - Purdue University Research, Jumbo Frames Comparison Testing, and MTU Issues from ESNet. Each installation environment will have its idiosyncrasies, and it is best to conduct tests within one's unique configuration to evaluate such options.

The SCSI Primary Commands (SPC-3) standard of the SCSI Architecture Model defines the commands used to determine and set paths. The mechanism by which this is accomplished is target port group support (TPGS). The characteristics of a path can be read via an RTPG command or set with an STPG command. With ALUA, non-preferred controller paths are used only for fail-over purposes. This is illustrated in Figure 1, where an optimized network connection is shown in red, taking advantage of routing all the storage network traffic via Node A (e.g., storage controller module 0) to LUN A (e.g., 2).

 

Figure 1.  ALUA connections, with the active/optimized paths to Node A shown as red lines and the active/non-optimized paths shown as dotted black lines.

Various SPC commands are provided as utilities within the sg3_utils (SCSI generic) Linux package.
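
The package itself is not part of a stock XenServer installation (more on the support implications of that later), but on a test host it can typically be pulled in with yum. This is only a hedged sketch and assumes a suitable CentOS base repository is reachable from dom0:

yum install sg3_utils
# Quick check that the utilities are now on the PATH
sg_inq --version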

There are other ways to make such queries; for example, VMware has an “esxcli nmp device list” command, and NetApp appliances support “igroup” commands that will provide direct information about ALUA-related connections.

Let us first examine a generic Linux server with ALUA support connected to an ALUA-capable device. In general, this will entail a specific configuration in the /etc/multipath.conf file, and typical entries, especially for some older arrays or XenServer versions, will use one or more explicit configuration parameters such as:

  • hardware_handler "1 alua"
  • prio "alua"
  • path_checker "alua"

Consulting the Citrix knowledge base article CTX132976, we see for example the EMC Corporation DGC Clariion device makes use of an entry configured as:

        device{
                vendor "DGC"
                product "*"
                path_grouping_policy group_by_prio
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout "/sbin/mpath_prio_emc /dev/%n"
                hardware_handler "1 alua"
                no_path_retry 300
                path_checker emc_clariion
                failback immediate
        }

To investigate the multipath configuration in more detail, we can make use of the TPGS setting, which can be read using the sg_rtpg command. By using multiple “v” flags to increase verbosity and “d” to specify the decoding of the status code descriptor returned for the asymmetric access state, we might see something like the following for one of the paths:

# sg_rtpg -vvd /dev/sde
open /dev/sde with flags=0x802
    report target port groups cdb: a3 0a 00 00 00 00 00 00 04 00 00 00
    report target port group: requested 1024 bytes but got 116 bytes
Report list length = 116
Report target port groups:
  target port group id : 0x1 , Pref=0
    target port group asymmetric access state : 0x01 (active/non optimized)
    T_SUP : 0, O_SUP : 0, U_SUP : 1, S_SUP : 0, AN_SUP : 1, AO_SUP : 1
    status code : 0x01 (target port asym. state changed by SET TARGET PORT GROUPS command)
    vendor unique status : 0x00
    target port count : 02
    Relative target port ids:
      0x01
      0x02
(--snip--)

From the output above, we see specifically that target port group 1 is reported as an active/non-optimized ALUA path, both on the asymmetric access state line and in the “status code”. We also see there are two paths identified within this group, with relative target port IDs 0x01 and 0x02.

There is a slew of additional “sg” commands, such as the sg_inq command, often used with the flag “-p 0x83” to get the VPD (vital product data) page of interest, sg_rdac, and so on. The sg_inq command will in general return TPGS > 0 for devices that support ALUA; more on that later in this article. One additional command of particular interest, because not all storage arrays in fact support target port group queries (more on this important point later as well!), is sg_vpd (the sg vital product data fetcher), as it does not require TPG access. The basic syntax of interest here is:

sg_vpd -p 0xc9 --hex /dev/…

where “/dev/…” should be the full path to the device in question. Looking at example output from a real device, we get:

# sg_vpd -p 0xc9 --hex /dev/mapper/mpathb1
Volume access control (RDAC) VPD Page:
00     00 c9 00 2c 76 61 63 31  f1 01 00 01 01 01 00 00    ...,vac1........
10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00    ................
20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00    ................

If one reads the source code for various device handlers (see the multipath tools hardware table for an extensive list of hardware profiles, as well as the Linux SCSI device handler code regarding how the data are interpreted), one can determine that the value of interest here is avte_cvp, part of the RDAC c9_inquiry structure and the sixth hex value in the output. It indicates whether the connected device is using ALUA (if the value shifted right five bits, ANDed with 0x1, equals 1; in the RDAC world this is known as IOSHIP mode), AVT, or Automatic Volume Transfer mode (if the value shifted right seven bits, ANDed with 0x1, equals 1), or otherwise defaults to basic RDAC (legacy) mode. In the case above, the sixth value is 61, so (0x61 >> 5) & 0x1 equals 1, and hence the above connection is indeed an ALUA RDAC-based connection.
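
As a quick sanity check of that bit arithmetic, the test can be reproduced directly in a bash shell. This is purely an illustrative sketch using the 0x61 value from the example output above, not part of any vendor tooling:

avte_cvp=0x61
if (( (avte_cvp >> 5) & 1 )); then
    echo "IOSHIP (ALUA) mode"
elif (( (avte_cvp >> 7) & 1 )); then
    echo "AVT mode"
else
    echo "legacy RDAC mode"
fi
# Prints "IOSHIP (ALUA) mode" for 0x61, matching the conclusion above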

I will revisit sg commands once again later on. Do note that the sg3_utils package is not installed on stock XenServer distributions, and as with any external package, installing it may void official Citrix support.

MULTIPATH CONFIGURATIONS AND REPORTS

In addition to all the information that various sg commands provide, there is also an abundance of information available from the standard multipath command. We saw a sample multipath.conf file earlier, and at least with many standard Linux OS versions and ALUA-capable arrays, information on the multipath status can be more readily obtained using stock multipath commands.

For example, on an ALUA-enabled connection we might see output similar to the following from a “multipath –ll” command (there will be a number of variations in output, depending on the version, verbosity and implementation of the multipath utility):

mpath2 (3600601602df02d00abe0159e5c21e111) dm-4 DGC,VRAID
[size=100G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
_ round-robin 0 [prio=50][active]
 _ 1:0:3:20  sds   70:724   [active][ready]
 _ 0:0:1:20  sdk   67:262   [active][ready]
_ round-robin 0 [prio=10][enabled]
 _ 0:0:2:20  sde   8:592    [active][ready]
 _ 1:0:2:20  sdx   128:592  [active][ready]

Recalling the device sde from the section above, note that it falls under a path group with a lower priority of 10, indicating it is part of an active/non-optimized connection, versus 50, which indicates the active/optimized group; a priority of 1 would indicate the device is in the standby group. Depending on what mechanism is used to generate the priority values, be aware that these values will vary considerably; the most important point is that whichever path group has the higher “prio” value is the optimized one. In some newer versions of the multipath utility, the string “hwhandler=1 alua” shows clearly that the controller is configured to allow the hardware handler to help establish the multipathing policy, as well as that ALUA is established for this device. I have read that the path priority will typically be elevated to a value between 50 and 80 for optimized ALUA-based connections (cf. mpath_prio_alua in this SUSE article), but have not seen this consistently.
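
If you simply want to watch those per-path priorities without reading the full tree output, the interactive multipathd console can also be queried. This is a hedged sketch and assumes the DMMP build in use exposes the console socket:

echo "show paths" | multipathd -k
# The "pri" column should mirror the prio values reported by "multipath -ll"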

The multipath.conf file itself has traditionally needed tailoring to each specific device. It is particularly convenient, however, that a generic configuration is now possible for a device that makes use of the internal hardware handler, is rdac-based, and can auto-negotiate an ALUA connection. The vendor and product entries below identify the specific device itself, but other devices should now work using this same generic sort of configuration:

device {
                vendor                  "DELL"
                product                 "MD36xx(i|f)"
                features                "2 pg_init_retries 50"
                hardware_handler        "1 rdac"
                path_selector           "round-robin 0"
                path_grouping_policy    group_by_prio
                failback                immediate
                rr_min_io               100
                path_checker            rdac
                prio                    rdac
                no_path_retry           30
                detect_prio             yes
                retain_attached_hw_handler yes
        }

Note how this differs from the “stock” version (in XenServer 6.5) of the MD36xx multipath configuration, which lacks the additional detect_prio and retain_attached_hw_handler entries shown above:

device {
                vendor                  "DELL"
                product                 "MD36xx(i|f)"
                features                "2 pg_init_retries 50"
                hardware_handler        "1 rdac"
                path_selector           "round-robin 0"
                path_grouping_policy    group_by_prio
                failback                immediate
                rr_min_io               100
                path_checker            rdac
                prio                    rdac
                no_path_retry           30
        }

THE CURIOUS CASE OF DELL MD32XX/36XX ARRAY CONTROLLERS

The LSI controllers incorporated into Dell’s MD32xx and MD36xx series of iSCSI storage arrays represent an unusual and interesting case. As promised earlier, we will get back to looking at the sg_inq command, which queries a storage device for several pieces of information, including TPGS. Typically, an array that supports ALUA will return a value of TPGS > 0, for example:

# sg_inq /dev/sda
standard INQUIRY:
PQual=0 Device_type=0 RMB=0 version=0x04 [SPC-2]
[AERC=0] [TrmTsk=0] NormACA=1 HiSUP=1 Resp_data_format=2
SCCS=0 ACC=0 TPGS=1 3PC=1 Protect=0 BQue=0
EncServ=0 MultiP=1 (VS=0) [MChngr=0] [ACKREQQ=0] Addr16=0
[RelAdr=0] WBus16=0 Sync=0 Linked=0 [TranDis=0] CmdQue=1
[SPI: Clocking=0x0 QAS=0 IUS=0]
length=117 (0x75) Peripheral device type: disk
Vendor identification: NETAPP
Product identification: LUN
Product revision level: 811a

In the output above, we see that TPGS is reported to have a value of 1. The MD36xx has supported ALUA since RAID controller firmware 07.84.00.64 and NVSRAM N26X0-784890-904; however, even with that (or a newer) revision level, sg_inq returns the following for this particular storage array:

# sg_inq /dev/mapper/36782bcb0002c039d00005f7851dd65de
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
  [AERC=0]  [TrmTsk=0]  NormACA=1  HiSUP=1  Resp_data_format=2
  SCCS=0  ACC=0  TPGS=0  3PC=1  Protect=0  BQue=0
  EncServ=1  MultiP=1 (VS=0)  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=1  Sync=1  Linked=0  [TranDis=0]  CmdQue=1
  [SPI: Clocking=0x0  QAS=0  IUS=0]
    length=74 (0x4a)   Peripheral device type: disk
 Vendor identification: DELL
 Product identification: MD36xxi
 Product revision level: 0784
 Unit serial number: 142002I

Various attempts to modify the multipath.conf file to try to force TPGS to report a value greater than zero all failed. Above all, it seemed that without access to the TPGS command, there was no way to query the device for ALUA-related information. Furthermore, the mpath_prio_alua command and similar commands appear to have been deprecated in newer versions of the device-mapper-multipath package, and so offer no help.

This proved to be a major roadblock in making any progress. Ultimately it turned out that the key to identifying ALUA connectivity in this particular case comes, oddly, from ignoring what TPGS reports and instead focusing on what the MD36xx controller is doing. What is going on here is that the hardware handler is taking over control, and the clue comes from the sg_vpd output shown above. To see how a LUN is mapped for these particular devices, one needs to hunt back through the /var/log/messages file for entries that appeared when the LUN was first attached. Since the MD36xx array uses the internal “rdac” mechanism for its hardware handler, a grep for “rdac” in /var/log/messages around the time the connection to a LUN was established should reveal how it was established, as illustrated below.
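
As a concrete illustration of that search (the timestamp in the second command is only an example), something like the following pulls out the rdac handler messages logged when the LUNs were attached:

grep -i rdac /var/log/messages | tail -20
# Or narrow the search to the approximate time the LUN was attached
grep rdac /var/log/messages | grep "Mar 11 09:45"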

Sure enough, if one looks at a case where the connection is known to not be making use of ALUA, you might see entries such as these:

[   98.790309] rdac: device handler registered
[   98.796762] sd 4:0:0:0: rdac: AVT mode detected
[   98.796981] sd 4:0:0:0: rdac: LUN 0 (owned (AVT mode))
[   98.797672] sd 5:0:0:0: rdac: AVT mode detected
[   98.797883] sd 5:0:0:0: rdac: LUN 0 (owned (AVT mode))
[   98.798590] sd 6:0:0:0: rdac: AVT mode detected
[   98.798811] sd 6:0:0:0: rdac: LUN 0 (owned (AVT mode))
[   98.799475] sd 7:0:0:0: rdac: AVT mode detected
[   98.799691] sd 7:0:0:0: rdac: LUN 0 (owned (AVT mode))

In contrast, an ALUA-based connection to LUNs on an MD3600i with firmware new enough to support ALUA, accessed from a client that also supports ALUA and has a properly configured entry in /etc/multipath.conf, will instead show the IOSHIP connection mechanism (see p. 124 of this IBM System Storage manual for more on I/O Shipping):

Mar 11 09:45:45 xs65test kernel: [   70.823257] scsi 8:0:0:1: rdac: LUN 1 (IOSHIP) (owned)
Mar 11 09:45:46 xs65test kernel: [   71.385835] scsi 9:0:0:0: rdac: LUN 0 (IOSHIP) (unowned)
Mar 11 09:45:46 xs65test kernel: [   71.389345] scsi 9:0:0:1: rdac: LUN 1 (IOSHIP) (owned)
Mar 11 09:45:46 xs65test kernel: [   71.957649] scsi 10:0:0:0: rdac: LUN 0 (IOSHIP) (owned)
Mar 11 09:45:46 xs65test kernel: [   71.961788] scsi 10:0:0:1: rdac: LUN 1 (IOSHIP) (unowned)
Mar 11 09:45:47 xs65test kernel: [   72.531325] scsi 11:0:0:0: rdac: LUN 0 (IOSHIP) (owned)

Hence, we happily recognize that indeed, ALUA is working.

The even better news is that not only is ALUA functional in XenServer 6.5, it should in fact now work with a large number of ALUA-capable storage arrays, some with custom configuration needs and potentially many that work generically. Another surprising find was that for the MD3600i arrays tested, even the “stock” version of the MD36xxi multipath configuration entry provided with XenServer 6.5 creates ALUA connections. The reason for this is that the hardware handler is used consistently, provided no specific profile overrides are intercepted, so it is primarily the storage device doing the negotiation itself rather than being driven by the file-based configuration. This is what made the determination of ALUA connectivity more difficult, namely that the TPGS setting was never changed from zero and consequently could not be used to query for the group settings.

CONCLUSIONS

First off, it is really nice to know that many modern storage devices support ALUA and that XenServer 6.5 now provides an easier means to leverage this protocol. It is also a lesson that documentation can be hard to find and, in some cases, in need of updating to reflect the current state. Individual vendors will generally provide specific instructions regarding iSCSI connectivity, and these should of course be followed. Experimentation is best carried out on non-production servers, where a major faux pas will not have catastrophic consequences.

To me, this was also a lesson in persistence as well as an opportunity to share the curiosity and knowledge among a number of individuals who were helpful throughout this process. Above all, among many who deserve thanks, I would like to thank in particular Justin Bovee from Dell and Robert Breker of Citrix for numerous valuable conversations and information exchanges.


Is it really “containers vs. VMs”?

There are some in the Docker and container world that believe that there is some kind of competition between Docker and hypervisors; they would have us believe that containers render VMs, and therefore hypervisors, redundant. Is that really true? I think not. I believe that containers and VMs perform complementary roles and add value to each other.

Let's look at what VMs and containers are really all about. Firstly, let's consider what they have in common: they can both be used to encapsulate an application, and therefore both use images containing the application, its libraries and other runtime dependencies (in fact you could argue that a Docker image is conceptually just a VM image without a kernel and init scripts). Hypervisor vendors have been telling us for years to have just one application per OS instance; that's the normal model with AWS AMIs too, and again, this looks just like a Docker image.

But that's a top-down, application-centric view. Let's now look at it from the infrastructure perspective. The boundary of a container is the boundary of an application: the separation between the internal workings of the application and its external interface. The boundary of a VM is the boundary of the resource allocation, ownership, trust and availability of a piece of abstracted infrastructure.

By separating these application and infrastructure boundaries we get more flexibility than if a single entity tries to implement both:

  • I can put multiple application containers within one VM where they share the same level of trust and only have to worry about protecting the trust boundary once, rather than multiple times for individual applications. VMs' trust and ownership boundaries have long been used to provide multi-tenancy – this isn't just important in public clouds but matters for enterprises that increasingly see applications being provided by individual departments or individual employees.
  • Applications often work together with other applications, that's why Docker has inter-container communication mechanisms such as "links". I can use the application container to keep each app nicely encapsulated and I can use the VM boundary to put a hard shell around the set of cooperating applications. I can also use this VM boundary to define the unit of resource accounting and reporting.
  • I can put cooperating application containers in a VM to share a common availability boundary; if they're working together then I probably want them to fail and succeed together. Resource isolation boundaries are good for containing faults – I'd rather have the "blast radius" of a faulty container being the VM which contains that container and its collaborating applications rather than an entire server.

So am I arguing that VMs are better than containers? Absolutely not. I believe that both mechanisms have a valuable part to play in the deployment of scalable, efficient, secure and flexible systems. That's why we're looking at ways to enhance XenServer to make it a great platform for running containers within VMs. Our recent preview of Docker integration is just the start. As well as requests to support other Docker-optimized Linux distributions (the preview supports CoreOS), we heard that you want to see infrastructure-level information made available to higher-level management tools for audit and reporting. Stay tuned for more.


iSCSI and Jumbo Frames

So, you either just set up iSCSI or are having performance issues with your current iSCSI device. Here are some pointers to ensure "networking" is not the limiting factor:

1. Are my packets even making it to the iSCSI target?
Always check in XenCenter that the NICs responsible for storage are pointing to the correct target IPs. If they are, ensure you can ping these targets from within XenServer's command line:

ping x.x.x.x

If you cannot ping the target, that may be the issue.

Use the 'route' command to show whether XenServer has a device and route to reach the iSCSI target's subnet. If route shows nothing related to your iSCSI target IPs, or takes a long time to show the target's IP/route information, revisit your network configuration: working from the iSCSI device configuration, through the switch ports, and all the way up to the storage interface defined for your XenServer(s). A quick check is shown below.
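
As a hedged illustration (the target address here is hypothetical), the following confirms which interface dom0 would use to reach a given iSCSI target:

route -n
ip route get 10.20.30.40
# The device reported by "ip route get" should be the NIC or bond dedicated to storage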

Odds are the packets are trying to route out via another interface, or there is a cable mismatch/VLAN tag mismatch. Or, at worst, the network cable is bad!

2. Is your network really setup for Jumbo Frames?
If you can ping your iSCSI targets but are having performance issues with jumbo frames (9000 or 4500 MTU size, depending on the vendor), ensure your storage interface on XenServer is configured to use this MTU size, as sketched below.
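
To confirm or set the MTU on the XenServer side, the MTU lives on the network object that the storage PIFs attach to. This is a hedged sketch, assuming you have already identified the storage network's UUID (shown as a placeholder below), and noting that PIFs typically pick up a changed MTU only after being re-plugged or the host being restarted:

xe network-list params=uuid,name-label,MTU
xe network-param-set uuid=<storage-network-uuid> MTU=9000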

One can also execute a ping command to see if there is fragmentation or support enabled for the larger MTUs:

ping x.x.x.x -M do -s 8972

This tells XenServer to ping your iSCSI target without fragmenting frames at a full 9000-byte MTU (the 8972-byte payload plus 20 bytes of IP header and 8 bytes of ICMP header add up to 9000, so use 8972).

If this returns fragmentation or other errors, check the cabling from XenServer along with the switch settings AND the iSCSI setup. Sometimes these attributes can be reset after firmware updates to the iSCSI-enabled, managed storage device.

3. Always make sure your network firmware and drivers are up to date!

And these are but three simple ways to isolate issues with iSCSI connectivity/performance.  The rest, well, more to come...



--jkbs | @xenfomation | XenServer.org Blog


Preview of XenServer support for Docker and Container Management

I'm excited to be able to share with you a preview of our new XenServer support for Docker and Container Management. Downloads can be found on the preview page, read on for installation instructions and more details.

Today many Docker applications run in containers within VMs hosted on hypervisors such as XenServer and other distributions of Xen. The synergy between containers as an application isolation mechanism and hypervisors as a secure physical infrastructure virtualization mechanism is something that I'll be blogging more about in the future. I firmly believe that these two technologies add value to each other, especially if they are aware of each other and designed to work together for an even better result.

That's why we've been looking at how we can enhance XenServer to be a great platform for Docker applications and how we can contribute to the Docker ecosystem to best leverage the capabilities and services available from the hypervisor. As a first step in this initiative I'm pleased to announce a preview of our new XenServer support for Docker applications. Those who attended Citrix Summit in January or FOSDEM in February may have seen an earlier version of this support being demo'd.

The preview is designed to work on top of XenServer 6.5 and comes in two parts: a supplemental pack for the servers and a build of XenCenter with the UI changes. XenCenter is installed in the normal Windows manner. The supplemental pack is installed in the same way as other XenServer supp-packs by copying the ISO file to each server in the pool and executing the following command in domain 0:

xe-install-supplemental-pack xscontainer-6.5.0-100205c.iso
mount: xscontainer-6.5.0-100205c.iso is write-protected, mounting read-only
Installing 'XenServer Container Management'...

Preparing...                ########################################### [100%]
   1:guest-templates        ########################################### [ 50%]
Waiting for xapi to signal init complete
Removing any existing built-in templates
Regenerating built-in templates
   2:xscontainer            ########################################### [100%]
Pack installation successful.

So what do you get with this preview? First off you get support for running CoreOS Linux VMs - CoreOS is a minimal Linux distribution popular for hosting Docker apps. The XenCenter VM installation wizard now includes a template for CoreOS and additional dialogs for setting the VM up (that's setting up a cloud config drive under the hood). This process also prepares the VM to be managed, to enable the main part of the preview's functionality to interact with it.


Secondly, and most importantly, XenServer becomes aware of “Container managed” VMs running Docker containers. It queries the VMs to enumerate the application containers running on each and then displays these within XenCenter's infrastructure view. XenCenter also allows interaction with the containers to start, stop and pause them. We want XenServer to be a platform for Docker and complement, not replace, the core part of the Docker application ecosystem, and therefore we expect that the individual Docker Engine instances in the VMs will be managed by one of the many Docker management tools such as Kubernetes, Docker Compose or ShipYard.


So what can you do with this preview?

Monitoring and visibility - knowing which VMs are in use for Docker hosting and which containers on them are actually running. Today's interface is more of a "pets" than "cattle" one, but we've got experience in showing what's going on at greater scale.

Diagnostics - easy access to basic container information such as forwarded network ports and the originating Docker image name. This can help accelerate investigations into problems where either or both of the infrastructure and application layers may be implicated. Going forward, we'd also like to provide easy access to the container console.

Performance - spotted a VM that's using a lot of resource? This functionality allows you to see which containers are running on that VM, what processes run inside, and how much CPU time each has consumed, to help identify the one consuming the resource. In the future we'd like to add per-container resource usage reporting for correlation with the VM-level metrics.

Control applications - using XenCenter you can start, stop and pause application containers. This feature has a number of use cases in both evaluation and deployment scenarios including rapidly terminating problematic applications.

We'd love to hear your feedback on this preview: what was useful, what wasn't? What would you like to see that wasn't there? Did you encounter problems or bugs? Please share your feedback using our normal preview feedback mechanism by creating a ticket in the "XenServer Org" (XSO) project at bugs.xenserver.org

This preview is a first step towards a much richer Docker-XenServer mutual awareness and optimization to help bridge the gap between the worlds of the infrastructure administrator and the application developer/administrator. This is just the beginning; we expect to keep improving, extending and enhancing the overall XenServer-Docker experience beyond this. Look out for more blog posts on this topic...

For a detailed guide to using this preview please see this article.


Join us at XenServer Day Fortaleza 2015


Fortaleza | Ceará | Brazil | 27/02/2015 | 14:00h
UFC - Universidade Federal do Ceará Campus do Pici | Bloco 902 |
Auditório Reitor Ícaro de Souza Moreira (Auditório do Centro de Ciências)

XenServer Day is an event created for corporate users, developers, service providers and enthusiasts of Citrix™ XenServer™. The event will be held in the city of Fortaleza/CE, at the Universidade Federal do Ceará, Auditório Reitor Ícaro de Souza Moreira (Auditório do Centro de Ciências), on 27/02/2015. This edition of XenServer Day will be held jointly with the XenServer Creedence World Tour, an initiative of the XenServer.org open source community to launch XenServer 6.5 (codename Creedence).

The talk agenda is below:

14:00h - Event opening
14:20h - What's new in Citrix XenServer 6.5 - Lorscheider Santiago (Quales Tecnologia)
15:30h - Unitrends talk - André Favoretto (Globix)
16:10h - Managing virtual cloud, server and desktop infrastructures with Citrix XenServer 6.5 - Lorscheider Santiago (Quales Tecnologia)
17:00h - IT Infrastructure with Security: what it means for your business - Vinícius Minneto (Ascenty)
17:30h - Closing

The first attendees to arrive at the event will receive the official XenServer Creedence World Tour shirt (limited stock).

For more information about the event and to register, click here

Are you on social media? Share the hashtag: #XenServerDayFortaleza

Lorscheider Santiago - @lsantiagos


XenServer at OpenStack Summit

It's coming up on time for OpenStack Summit Vancouver where OpenStack developers and administrators will come together to discuss what it means and takes to run a successful cloud based on OpenStack technologies. As in past Summits, there will be a realistic focus on KVM based deployments due to KVM, or more precisely libvirt, having "Group A" status within the compute driver test matrix. XenServer currently has "Group B" status, and when you note that the distinction between A and B really boils down to which can gate a commit, there is no logical reason why XenServer shouldn't be a more prevalent option.

Having XenServer be thought of as completely appropriate for OpenStack deployments is something I'm looking to increase, and I'm asking for your help. The OpenStack Summit organizers want to ensure the content matches the needs of the community. In order to help ensure this, they invite their community to vote on the potential merit of all proposals. This is pretty cool since it helps ensure that the audience gets what they want, but it also makes it a bit harder if you're not part of the "mainstream". That's where I reach out to you in the XenServer community. If you're interested in seeing XenServer have greater mindshare within OpenStack, then please vote for one or both of my submissions. If your personal preference is for another cloud solution, I hope that you agree with me that increasing our install base strengthens both our community and XenServer, and will still take the time to vote. Note that you may be required to create an account, and that voting closes on February 23rd.

Packaging GPU intensive applications for OpenStack

If you'd like to see the GPU capabilities of XenServer materialize within OpenStack, please vote for this session using this link: https://www.openstack.org/vote-vancouver/Presentation/packaging-gpu-intensive-applications-for-openstack. The session will encompass some of the Packer work I've been involved with, and also the GPU work XenServer is leading on with NVIDIA.

Avoiding the 1000 dollar VM in your first cloud

This session covers the paradigm shifts involved when an organization decides to move from traditional data center operations to "the cloud". Since this is a technology talk, it's not strictly XenServer oriented, but XenServer examples are present. To vote for this session, use this link: https://www.openstack.org/vote-vancouver/Presentation/avoiding-the-1000-dollar-vm-in-your-first-cloud

Thank you to everyone who decides to support this effort.


xenserver.org gets a refresh

Now that Creedence has shipped as XenServer 6.5, and we've even addressed some early issues with hotfixes (in record time no less), it was time to give xenserver.org a bit of an update as well. All of the content you've known to be on xenserver.org is still here, but this face lift is the first in a series of changes you'll see coming over the next few months.

Our Role

The role of xenserver.org will be shifting slightly from what we did in 2014, with the objective that by the end of 2015 it is the portal virtualization administrators use to find the information they need to be successful with XenServer. That's everything from development blogs and pre-release information to deeper technical content. Not everything will be hosted on xenserver.org, but we'll be looking for the most complete and accurate content available. Recognizing that commercial support is a critical requirement for production use of any technology, if we list a solution we'll also state clearly whether its use is commercially supportable by Citrix or whether it could invalidate your support contract. In the end, this is about successfully running a XenServer environment, so some practices presented might not be "officially sanctioned" and tested to the same level as commercially supported features, but are known by the community to work.

Community Content

The new xenserver.org will also have prominent community content. By its very nature, XenServer lives in a data center ecosystem populated by third party solutions. Some of those solutions are commercial in nature, and because commercial solutions should always retain "supported environment" status for a product, we've categorized them all under the "Citrix Ready" banner. Details on Citrix Ready requirements can be found on their FAQ page. Other solutions can be found within open source projects. We on the XenServer team are active in many, and we're consolidating information you'll need to be successful with various projects under the "Community" banner.

Commercial Content

We've always promoted commercial support on xenserver.org, and that's not changing. If anything, you'll see us bias a bit more towards promoting both support and some of the premium features within XenServer. After all, there is only one XenServer, and the only difference between the installer you get from xenserver.org and from citrix.com is the EULA. Once you apply a commercial license, or use XenServer as part of an entitlement within XenDesktop, you are bound by the same commercial EULA regardless of where the installation media originated.

Contributing Content

Public content contributions to xenserver.org have always been welcome, and with our new focus on technical information to run a successful XenServer installation, we're actively seeking more content. This could be in the form of article or blog submissions, but I'm willing to bet the most efficient way will be just letting us know about content you discover. If you find something, tweet it to me @XenServerArmy and we'll take a look at the content. If it is something we can use, we'll write a summary blog or article and link to it. Of course before that can happen we'll need to verify if the content could create an unsupported configuration and warn users upfront if it does.

 

What kind of content are we looking for? That's simple, anything you find useful to manage your XenServer installation. It doesn't matter how big or small that might be, or what tooling you have in place, if it helps you to be productive, we think that's valuable stuff for the community at large.     


History and Syslog Tweaks

Introduction

As XenServer administrators already know (or will know), there is one user "to rule them all"... and that user is root.  Be it an SSH connection or command-line interaction with dom0 via XenCenter, while you may be typing commands in ring 3 (user space), you are doing so as the root user.

This is quite appropriate for XenServer's architecture: once the bare metal is powered on, one is not booting into the latest "re-spin" of some well-known (or completely obscure) Linux distribution.  Quite the opposite.  One is actually booting into the virtualization layer: dom0, or the Control Domain.  This is where the separation of guest VMs (domUs) and user space programs (ping, fsck, and even xe) begins... even at the command line for root.

In summary, it is not uncommon for several administrators to require root access to a XenServer at any one time.  Thus, this article will show my own means of adding granularity to the HISTORY command, as well as logging (via syslog) each and every root user session.

Assumptions

As BASH is the default shell, this article assumes that one has knowledge of BASH, things "BASH", Linux-based utilities, and so forth.  If one isn't familiar with BASH, or with how BASH leverages global and local scripts to set up a user environment, I have provided the following resources:

  • BASH login scripts : http://www.linuxfromscratch.org/blfs/view/6.3/postlfs/profile.html
  • Terminal Colors : http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x329.html
  • HISTORY command : http://www.tecmint.com/history-command-examples/

Purpose

The purpose I wanted to achieve was not just a cleaner way to look at the history command, but also to log the root user's session information: recording their access method, what commands they ran, and when.


In short, we go from the stock history output to time-stamped, user-tagged history entries (plus a record of each command in /var/log/user.log | /var/log/messages).

What To Do?

First, we want to back up /etc/bashrc to /etc/backup.bashrc in the event one would like to revert to the original HISTORY behavior.  This can be done from the XenServer command line:

cp /etc/bashrc /etc/backup.bashrc

Secondly, the following addition should be added to the end of /etc/bashrc:

##[ HISTORY LOGGING ]#######################################################
#
# ADD USER LOGGING AND HISTORY COMMAND CONTEXT FOR SOME AUDITING
# DEC 2014, JK BENEDICT
# @xenfomation
#
#########################################################################

# Grab current user's name
export CURRENT_USER_NAME=`id -un`

# Grab current user's level of access: pts/tty/or SSH
export CURRENT_USER_TTY="local `tty`"
checkSSH=`set | grep "^SSH_CONNECTION" | wc -l`

# SET THE PROMPT
if [ "$checkSSH" == "1" ]; then
     export CURRENT_USER_TTY="ssh `set | grep "^SSH_CONNECTION" | awk {' print $1 '} | sed -rn "s/.*?='//p"`"
     export PROMPT_COMMAND='history -a >(tee -a ~/.bash_history | logger -t "HISTORY for $CURRENT_USER_NAME[$$] via $SSH_CONNECTION : ")'
else
     export CURRENT_USER_TTY
     export PROMPT_COMMAND='history -a >(tee -a ~/.bash_history | logger -t "HISTORY for $CURRENT_USER_NAME[$$] via $CURRENT_USER_TTY : ")'
fi

# SET HISTORY SETTINGS
# Lines to retain, ignore dups, time stamp, and user information
# For date variables, check out http://www.computerhope.com/unix/udate.htm
export HISTSIZE=5000
export HISTCONTROL=ignoredups
export HISTTIMEFORMAT=`echo -e "\e[1;31m$CURRENT_USER_NAME\e[0m[$$] via \e[1;35m$CURRENT_USER_TTY\e[0m on \e[0;36m%d-%m-%y %H:%M:%S%n\e[0m       "`

A file providing this addition can be downloaded from https://github.com/xenfomation/bash-history-tweak

What Next?

Well, with the changes added and saved to /etc/bashrc, exit the command-line prompt or SSH session, then log back in to test the changes:

exit

hostname
whoami
history
tail -f /var/log/user.log

... And that is that.  So, while there are 1,000,000 more sophisticated ways to achieve this, I thought I'd share what I have used for a long time... have fun and enjoy!

--jkbs | @xenfomation


Advisory for users of Space Reclamation (TRIM) in XenServer 6.5

We have an important advisory for users who are trying the new XenServer 6.5 release and would like to use the new Space Reclamation feature (TRIM) on LUNs bigger than 2 TiB: we have found an issue (inherited from upstream) in the original release which might result in data corruption of virtual disks under certain conditions, such as high I/O.


This issue has now been fixed, so please apply Hotfix XS65E005, which has been released to address it, as soon as possible. More information is available at http://support.citrix.com/article/CTX142141


In XenCenter, this functionality is labeled "Reclaim Freed Space" under the host's storage tab.

Note: this doesn't affect other functionality in the product, and the issue is resolved by applying this hotfix.


XenServer at FOSDEM

Having just released Creedence as XenServer 6.5, 2015 has definitely started off with a bang. In 2014 the focus for XenServer was on a platform refresh, and creating a solid platform for future work. For me, 2015 is about enabling the ecosystem to be successful with XenServer, and that's where FOSDEM comes in. For those unfamiliar with FOSDEM, it's the Free and Open Source Software Developers' European Meeting, and many of the most influential projects will have strong representation. Many of those same projects have strong relationships with other hypervisors, but not necessarily with XenServer. For those projects, XenServer needs to demonstrate its relevance, and I hope through a set of demos within the Xen Project stand to provide exactly that.

Demo #1 - Provisioning Efficiency

XenServer is a hypervisor, and as such is first and foremost a provisioning target. That means it needs to work well with provisioning solutions and their respective template paradigms. Some of you may have seen me present at various events on the topic of hypervisor selection in various cloud provisioning tools. One of the core workflow items for all cloud solutions is the ability to take a template and provision it consistently to the desired hypervisor. In Apache CloudStack with XenServer, for example, those templates are VHD files. Unfortunately, XenServer by default exports XVA files, not native VHD, which makes the template process for CloudStack needlessly difficult.

This is where a technology like Packer comes in. Some of the XenServer engineers have been working on a Packer integration to support Vagrant. That's cool, but I'm also looking at this from the perspective of other tools, and so will be showing Packer creating a CentOS 7 template which could be used anywhere. That template would then be provisioned and, as part of the post-provisioning configuration management, become a "something" with the addition of applications.

Demo #2 - Application Containerization

Once I have my template from Packer, and have provisioned it into a XenServer 6.5 host, the next step is application management. For this I'm going to use Ansible to personalize the VM, and to add in some applications which are containerized by Docker. There has been some discussion in the marketplace about containers replacing VMs, and I really see proper use of containers as efficient use of VMs, not as a replacement for them. Proper container usage is really proper application management, and understanding when to use which technology. For me this means that a host is a failure point which contains VMs. A VM represents a security and performance wrapper for a given tenant and their applications. Within a VM, applications are provisioned, and where containerization of the applications makes sense, it should be used.

System administrators should be able to directly manage each of these three "containers" from the same pane of glass, and as part of my demo, I'll be showing just that using XenCenter. XenCenter has a simple GUI from which host and VM level management can be performed, and which is in the process of being extended to include Dockerized containers.

With this as the demo backdrop, I encourage anyone planning on attending FOSDEM to please stop by and ask about the work we've done with Creedence and also where we're thinking of going. If you're a contributor to a project and would like to talk more about how integrating with XenServer might make sense, either for your project or as something we should be thinking about, please do feel free to reach out to me. Of course if you're not planning on being at FOSDEM, but know folks who are, please do feel free to have them seek me out. We want XenServer to be a serious contender in every data center, but if we don't know about issues facing your favorite projects, we can't readily work to resolve them.

btw, if you'd like to plan anything around FOSDEM, please either comment on this blog, or contact me on Twitter as @XenServerArmy.

-tim     


XenServer Support Options

Now that Creedence has been released as XenServer 6.5, I'd like to take this opportunity to highlight where to obtain what level of support for your installation.

Commercial Support

Commercial support is available from Citrix and many of its partners. A commercial support contract is appropriate if you're running XenServer in a production environment, particularly if minimizing downtime is a critical component of your SLA. It's important to note that commercial support is only available if the deployment follows the Citrix deployment guidelines, uses third party components from the Citrix Ready Marketplace, and is operated in accordance with the terms of the commercial EULA. Of course, since your deployment might not precisely follow these guidelines, commercial support may not be able to resolve all issues, and that's where community support comes in.

Community Support

Community support is available from the Citrix support forums. The people on the forum are both Citrix support engineers and your fellow system administrators. They are generally quite knowledgeable and enthusiastic to help someone be successful with XenServer. It's important to note that while the product and engineering teams may monitor the support forums from time to time, engineering level support should not be expected on the community forums.

Developer Support

Developer level support is available from the xs-devel list. This is your traditional development mailing list and really isn't appropriate for general support questions. Many of the key engineers are part of this list, and do engage on topics related to performance, feature development and code level issues. It's important to remember that the XenServer software is actually built from many upstream components, so the best source of information might be an upstream developer list and not xs-devel.

Self-support tool

Citrix maintains a self-support tool called Citrix Insight Services (CIS), formerly known as Tools-as-a-Service (TaaS). Insight Services takes a XenServer status report and analyzes it to determine if there are any operational issues present in the deployment. A best practice is to upload a report after installing a XenServer host to determine if any issues are present which could result in latent performance or stability problems. CIS is used extensively by the Citrix support teams, but doesn't require a commercial support contract for end users.
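Generating the status report that CIS analyzes is a one-liner from dom0 (the output location below is typical for 6.x releases but may vary):

# Run in dom0: collect logs and configuration into a server status report.
xen-bugtool --yestoall
# The resulting archive lands under /var/opt/xen/bug-report/ (location may
# vary by release); upload that file through the Citrix Insight Services site.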

Submitting Defects

If you believe you have encountered a defect or limitation in the XenServer software, simply using one of these support options isn't sufficient for the incident to be added to the defect queue for evaluation. Commercial support users will need to have their case triaged and potentially escalated, with the result potentially being a hotfix. All other users will need to submit an incident report via bugs.xenserver.org. Please be as detailed as possible with any defect report so the issue can be reproduced, and it doesn't hurt to include the URL of any forum discussion or the TaaS ID in your report. Also, please be aware that while the issue may be urgent for you, any potential fix may take some time to be created. If your issue is urgent, you are strongly encouraged to follow the commercial support route as Citrix escalation engineers have the ability to prioritize customer issues.

Additionally, it's important to point out that submitting a defect or incident report doesn't guarantee it'll be fixed. Some things simply work the way they do for very important reasons, while other things may behave the way they do because of how components interact. XenServer is tuned to provide a highly scalable virtualization platform, and if addressing an incident would require destabilizing that platform, it's unlikely to be changed.


Creedence launches as XenServer 6.5

Today the entire XenServer team is very proud to announce that Creedence has officially been released as XenServer 6.5. It is available for download from xenserver.org, and is recommended for all new XenServer installs. We're so confident in what has been produced that I'm encouraging all XenServer 6.2 users to upgrade at their earliest convenience. So what have we actually accomplished?

The headline features

Every product release I've ever done, and there have been quite a large number over the years, has had some headline features, but Creedence is a bit different. Creedence wasn't about new features, and Creedence wasn't about chasing some perceived competitor. Creedence very much was about getting the details right for XenServer. It was about creating a very solid platform upon which anyone can comfortably, and successfully, build a virtualized data center regardless of workload. Creedence consisted of a lot of mundane improvements whose combination made for one seriously incredible outcome: Creedence restored the credibility of XenServer within the entire virtualization community. We even made up some t-shirts that the cool kids want ;)

So let's look at some of those mundane improvements, and see just how significant they really are.

  • 64 bit dom0 freed us from the limitations of dreaded Linux low memory, and also allowed us to use modern drivers and work better with modern servers. From personal experience, when I took alpha.2 and installed it on some of my test Dell servers, it automatically detected my hardware RAID without my having to jump through any driver disk hoops. That was huge for me. (A quick check to confirm the 64 bit dom0 and new kernel is shown after this list.)
  • The move to a 3.10 kernel from kernel.org meant that we were out of the business of having a completely custom kernel and corresponding patch queue. Upstream is goodness.
  • The move to the Xen Project hypervisor 4.4 meant that we're now consuming the most stable version of the core hypervisor available to us.
  • We've updated to an Open vSwitch 2.1 virtual switch, giving us improved network stability when the virtual switch is under serious pressure. While we introduced the ovs way back in December of 2010, there remained cases where the legacy Linux bridge worked best. With Creedence, those situations should be very few and far between.
  • A thread per vif model was introduced to better ensure network hogs don't impact adjacent VM performance.
  • Network datapath optimizations allow us to drive line rate for 10Gbps NICs, and we're doing pretty well with 40Gbps NICs.
  • Storage was improved through an update to tapdisk3, and the team did a fantastic job of engaging with the community to provide performance details. Overall we've seen very significant improvements in aggregate disk throughput, and when you're virtualizing it's the aggregate which matters more than the single VM case.
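If you want to confirm a couple of these items on an upgraded host, a quick check from the dom0 console is all it takes:

# Run in dom0 on XenServer 6.5:
uname -m                    # should report x86_64, confirming the 64 bit dom0
uname -r                    # should report a 3.10 based kernel
cat /etc/redhat-release     # reports the XenServer release string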

What this really means for you is that XenServer 6.5 has a ton more headroom than 6.2 ever did. If you happen to be on even older versions, you'll likely find that while 6.5 looks familiar, it's not quite like any other XenServer you've seen. As has been said multiple times in blog comments, and by multiple people, this is going to be the best release ever. In his blog, Steve Wilson has a few performance graphs to share for those doubters. 

The future

While today we've officially released Creedence, much more work remains. There is a backlog of items we really want to accomplish, and you've already provided a pretty long list of features for us to figure out how to make. The next project will be unveiled very soon, and you can count on having access to it early and being able to provide feedback just as the thousands of pre-release participants did for Creedence. Creedence is as much a success of the community as it is an engineering success.

Thank you to everyone involved. The hard work doesn't go unnoticed.     


Understanding why certain Creedence builds don't work with certain features

Over the year end break, there were a couple of posts to the list which asked a very important question: "Does the DVSC work with the Release Candidate?" The answer was a resounding "maybe", and this post is intended to help clarify some of the distinction between what you get from xenserver.org, what you get from citrix.com, and how everything is related.

At this point most of us are already familiar with XenServer virtualization being "open source", and that with XenServer 6.2 there was no functional difference between the binary you could download from citrix.com and that from xenserver.org. Logically, when we started the Creedence pre-release program, many assumed that the same download would exist in both locations, and that everything which might be part of a "XenServer" would also always be open source. That would be really cool for many people, and potentially problematic for others.

The astute follower of XenServer technology might also have noticed that several things commonly associated with the XenServer product never had their source released. StorageLink is a perfect example of this. Others will have noticed that the XenServer Tech Preview run on citrix.com included a number of items which weren't present in any of the xenserver.org pre-release builds, and for which the sources aren't listed on xenserver.org. There is of course an easy explanation for this, but it goes to the heart of what we're trying to do with xenserver.org.

xenserver.org is first and foremost about the XenServer platform. Everyone associated with xenserver.org, and by extension the entire team, would love for the data centers of the world to standardize on this platform. The core platform deliverable is called main.iso, and that's the thing from which you install a XenServer host. The source for main.iso is readily available, and other than EULA differences, the XenServer host will look and behave identically regardless of whether main.iso came from xenserver.org or citrix.com. The beauty of this model is that when you grow your XenServer based data center to the point where commercial support makes sense, the software platform you'd want supported is the same.

All of which gets me back to the DVSC (and other similar components). DVSC, StorageLink and certain other "features" include source code which Citrix has access to under license. Citrix provides early access to these feature components to those with a commercial relationship. Because there is no concept of a commercial relationship with xenserver.org, we can't provide early access to anything which isn't part of the core platform. Now of course we do very much want everyone to obtain the same XenServer software from both locations, so when an official release occurs, we mirror it for your convenience.

I hope this longish explanation helps clarify why, when questions arise about "features" not present in main.iso, the response isn't as detailed as some might like. It should also help explain why components obtained from prior "Tech Preview" releases might not work with newer platform builds obtained as part of a public pre-release program.


Trading for Creedence shirts

Update: The t-shirt promotion has closed.

While there are still a couple of stops left on the Creedence world tour, we recognize it's impossible for us to cover all of our enthusiastic followers. That being said, we do want to give everyone an opportunity to cover themselves with one of our lovely Creedence World Tour t-shirts. Quite honestly, I've been rather impressed with how popular these have proven to be, and there is no good reason not to provide them to the community at large.


So this is where I offer these lovely shirts to you guys, and hope for something small in return. I'm in the process of revamping xenserver.org and want to add in some quotes about either XenServer or Creedence. In return for your quote, I'm willing to trade some Creedence shirts. Now of course, I do understand that you might not be in a position to speak on behalf of your employer, so before anything gets posted I'll contact you to verify what can be said, and how you'd be attributed.

If you're interested, all you need to do is fill out this small survey on Creedence, and select the option to provide a quote. The survey is very short, but the quote you might provide could go a long way to helping others believe in the effort we've put into Creedence, and that XenServer is a compelling platform for them to look at.  Of course, do feel free to forward this to those who might not be following this blog ;)


Creedence Release Candidate Available

Just in time for the holiday season - we're pleased to announce another tech toy for the geeks of the world to play with. Now of course XenServer is serious business, but just like many kids' toys, the arrival of Creedence is eagerly awaited by many. As I mentioned earlier this week, you'll have to wait a bit longer for the official release, but today you can download the release candidate and see exactly what the world of Creedence should look like. Andy also mentioned last week that we're closing out the alpha/beta program, and as part of that effort the nightly Creedence snapshot page has been removed. You can still access the final beta (beta.3) on the pre-release page, but all prior builds have been removed. The pre-release page is also where you can download the release candidate.

What's in the Release Candidate

Performance tuning

The release candidate contains a number of bug fixes, but also has had some performance tuning done on it. This performance tuning is a little bit different than what we normally talk about, so if you've been benchmarking Creedence, you'll want to double check with the release candidate. What we've done is take a look at the interaction of a variety of system components and put in some limits on how hard we'll let you push them. Our first objective is a rock solid system, and while this work doesn't result in any configuration limit changes (at least not yet - that comes later in our cycle), it could reduce some of the headroom you might have experienced with a prior build. It's also possible that you could experience better headroom due to an overall improvement in system stability, so doing a performance test or two isn't a bad idea.

Core bug fixes over beta.3

  • multipath.conf is now preserved as multipath.conf.bak on upgrade (see the quick check after this list)
  • The default cpufreq governor is now set to performance
  • Fixes for XSA-109 through XSA-114 inclusive
  • Increased the number of PIRQs to more than 256 to support large quantities of NICs per host
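Both the multipath and governor changes are easy to verify on an upgraded host (a quick sketch; xenpm output varies slightly between releases):

# Run in dom0 after upgrading:
ls -l /etc/multipath.conf /etc/multipath.conf.bak   # preserved copy from the upgrade
xenpm get-cpufreq-para | grep -i governor           # should show the performance governor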

What we'd like you to do with this build

The two core things we'd like you to do with this build are:

  1. If you've reported any issue at https://bugs.xenserver.org, please validate that we did indeed get the issue addressed.
  2. If you can, run this release candidate through its paces. We think it's nice and solid, and hope you do too.

Lastly, I'd like to take this opportunity to wish everyone in our community a festive end to 2014 and hope that whatever celebrating you might do is enjoyable. 2014 was an exciting year for XenServer, and that's due in large part to the contributions of everyone reading this blog and working with Creedence. Thank you.

 

-tim     


Status of Creedence

Over the past few weeks, and particularly as part of the Creedence World Tour, I've been getting questions about precisely when Creedence will be released. To the best of my ability, I've tried to take those questions head on, but the reality is we haven't been transparent about what happens when we release XenServer, and that's part of the problem. I'm going to try and address some of that in this post.

Now before I get into too much detail, it's important to note that XenServer is a packaged product which Citrix sells, and which is also available freely as open source. Citrix is a public company, so there is often a ton more detail I have, but which isn't appropriate for public disclosure. A perfect case in point is the release date. Since conceivably someone could change a PO based on this information, disclosing that can impact revenue and, well, I like my pay-cheque so I hope you understand when I'm quiet on some things.

So back to the question of what happens during finalization of a release, and how that can create a void. The first thing we do is take a look at all the defects coming in from all sources, with bugs.xenserver.org being one of many. We look at the nature of any open issues and determine what potential they have to give us a bad release. Next we create internal training to be delivered to the product support teams. These two tasks typically occur with either a final beta, or first release candidate. Concurrent to much of this work is finalization of documentation, and defining the performance envelope of the release. With each release, we have a "configuration limits" document, and the contents of that document represent both what Citrix is willing to deliver support on and what constitutes the limits of a stable XenServer configuration. For practical purposes, many of you have pushed Creedence in some way beyond what we might be comfortable defining as a "long term stable configuration", so it's entirely possible the final performance could differ from what you've experienced so far.

Those are the technical bits behind the release, but this is also something which needs to be sold, and that means we need to prepare for that as well. In the context of XenServer, selling means explaining both why XenServer is great with XenDesktop and why it's great for anyone who is tired of paying more for their core virtualization requirements than really necessary. Given how many areas of data center operations XenServer touches, and the magnitude of the changes in Creedence, getting this right is critical. Then of course there is all the marketing collateral, and you get a sense of how much work is involved in getting XenServer out the door.

Of course, it can be argued that much of this "readiness" stuff could be handled in parallel, and for another project you'd be right. The reality is XenServer has had its share of releases which should've had a bit more bake time. I hope you agree with me that Creedence is better because we haven't rushed it, and that with Creedence we have a solid platform upon which to build. So with that in mind, I'll leave you with this: we intend to make a big splash with Creedence. Such a splash can't occur if we release during a typical IT lockdown period, and it will need a bit larger stage than the one I'm currently on.

 

So stay tuned, my friends.  Good things are coming ;)


XenServer Pre-Release Programme

A very big thank you to everyone who participated in the Creedence Alpha/Beta programme!
The programme was very successful and raised a total of 177 issues, of which 138 were resolved during the Alpha/Beta period.  We are reviewing how the pre-release process can be improved and streamlined going forward. 

The Creedence Alpha/Beta programme has now come to an end with the focus of nightly snapshots moving on to the next version of XenServer.   

The Creedence Alpha/Beta source code remains available and can be accessed here: 
http://xenserver.org/component/content/article/24-product/creedence/143-xs-2014-development-snapshots.html

Creedence Alpha/Beta bugs may still be reported on https://bugs.xenserver.org

Work is already progressing on the next version of XenServer and the nightly snapshots are available here:
http://xenserver.org/component/content/article/2-uncategorised/115-development-snapshots.html

As this work is new and still expected to be unstable, please do not raise any Creedence Alpha/Beta bugs against it.


Average Queue Size and Storage IO Metrics

Introduction

There seems to be a bit of confusion around the metric "average queue size". This is a metric reported by iostat as "avgqu-sz". The confusion seems to arise when iostat reports a different avgqu-sz in dom0 and in domU for a single Virtual Block Device (VBD), while other metrics such as Input/Output Operations Per Second (IOPS) and Throughput (often expressed in MB/s) are the same. This article describes what all of this actually means and how it should be interpreted.

Background

On any modern Operating System (OS), it is possible to concurrently submit several requests to a single storage device. This practice normally helps several layers of the data path to perform better, allowing systems to achieve higher numbers in metrics such as IOPS and throughput. However, measuring the average of outstanding (or "inflight") requests for a given block device over a period of time can be a bit tricky. This is because the number of outstanding requests is an "instant metric". That is, when you look, there might be zero requests pending for that device. When you look again, there might be 28. Without a lot of accounting and some intrusiveness, it is not really possible to tell what happened in-between.

Most users, however, are not interested in everything that happened in-between. People are much more interested in the average of outstanding requests. This average gives a good understanding of the workload that is taking place (i.e. how applications are using storage) and helps with tuning the environment for better performance.

Calculating the Average Queue Size

To understand how the average queue size is calculated, consider the following diagram which presents a Linux system running 'fio' as a benchmarking user application issuing requests to a SCSI disk.


Figure 1. Benchmark issuing requests to a disk

The application issues requests to the kernel through libraries such as libc or libaio. In the simple case where the benchmark is configured with an IO Depth of 1, 'fio' will attempt to keep one request "flying" at all times. As soon as one request completes, 'fio' will send another. This can be achieved with the following configuration file (which runs for 10 seconds and considers /dev/xvdb as the benchmarking disk):

[global]
bs=4k
rw=read
iodepth=1
direct=1
ioengine=libaio
runtime=10
time_based

[job-xvdb]
filename=/dev/xvdb

Table 1. fio configuration file for a test workload

NOTE: In this experiment, /dev/xvdb was configured as a RAW VDI. Be sure to fully populate VHD VDIs before running experiments (especially if they are read-based).
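One simple way to fully populate a test VDI before running read benchmarks is to write it end to end from inside the guest; note that this destroys any data on the device:

# WARNING: overwrites the entire device. Only run against a dedicated
# benchmarking VDI such as /dev/xvdb.
dd if=/dev/zero of=/dev/xvdb bs=1M oflag=direct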

One of the metrics made available by the block layer for a device is the number of read and write "ticks" (see stat.txt in the Linux kernel documentation). This accounts, per request, for the amount of time the device has been occupied. The block layer starts this accounting immediately before shipping the request to the driver and stops it immediately after the request completes. The figure below represents this time in the RED and BLUE horizontal bars.


Figure 2. Diagram representing request accounting

It is important to understand that this metric can grow quicker than time. This will happen if more than one request has been submitted concurrently. In the example below, a new (green) request has been submitted before the first (red) request has been completed. It completed after the red request finished and after the blue request was issued. During the moments where requests overlapped, the ticks metric increased at a rate greater than time.


Figure 3. Diagram representing concurrent request accounting

Looking at this last figure, it is clear that there were moments where no request was present in the device driver. There were also moments where one or two requests were present in the driver. To calculate the average of inflight requests (or average queue size) between two moments in time, tools like iostat will sample "ticks" at moment one, sample "ticks" again at moment two, and divide the difference between these ticks by the time interval between these moments.

average queue size = (ticks at moment two - ticks at moment one) / (time interval between the two moments)

Figure 4. Formula to calculate the average queue size
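The same calculation can be reproduced by hand by sampling the read and write ticks counters from sysfs (a sketch following the accounting described above; adjust the device name and interval as needed):

# Approximate the average queue size for a block device over a fixed interval
# by sampling the read and write "ticks" counters (fields 4 and 8 of
# /sys/block/<dev>/stat, both in milliseconds) at the start and the end.
DEV=xvdb
INTERVAL=10
t1=$(awk '{print $4 + $8}' /sys/block/$DEV/stat)
sleep $INTERVAL
t2=$(awk '{print $4 + $8}' /sys/block/$DEV/stat)
# ticks are in milliseconds, so divide by the interval expressed in milliseconds
echo "scale=2; ($t2 - $t1) / ($INTERVAL * 1000)" | bc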

The Average Queue Size in a Virtualised Environment

In a virtualised environment, the datapath between the benchmarking application (fio) running inside a virtual machine and the actual storage is different. Considering XenServer 6.5 as an example, the figure below shows a simplification of this datapath. As in the examples of the previous section, requests start in a virtual machine's user space application. When moving through the kernel, however, they are directed to paravirtualised (PV) storage drivers (e.g. blkfront) instead of an actual SCSI driver. These requests are picked up by the storage backend (tapdisk3) in dom0's user space. They are submitted to dom0's kernel via libaio, pass through the block layer and reach the disk drivers for the corresponding storage infrastructure (in this example, a SCSI disk).


Figure 5. Benchmark issuing requests on a virtualised environment

The technique described above to calculate the average queue size will produce different values depending on where it is applied. Considering the diagram above, it could be used in the virtual machine's block layer, in tapdisk3 or in the dom0's block layer. Each of these would show a different queue size and actually mean something different. The diagram below extends the examples used in this article to include these layers.


Figure 6. Diagram representing request accounting in a virtualised environment

The figure above contains (almost) vertical arrows between the layers representing requests departing from and arriving at different system components. These arrows are slightly angled, suggesting that time passes as a request moves from one layer to another. There is also some elapsed time between an arrow arriving at a layer and a new arrow leaving from that layer.

Another detail of the figure is the horizontal (red and blue) bars. They indicate where requests are accounted at a particular layer. Note that this accounting starts some time after a request arrives at a layer (and some time before the request passes to another layer). These offsets, however, are merely illustrative. A thorough look at the output of specific performance tools is necessary to understand what the "Average Queue Size" is for certain workloads.

Investigating a Real Deployment

In order to place real numbers in this article, the following environment was configured:

  • Hardware: Dell PowerEdge R310
    • Intel Xeon X3450 2.67GHz (1 Socket, 4 Cores/socket, HT Enabled)
    • BIOS Power Management set to OS DBPM
    • Xen P-State Governor set to "Performance", Max Idle State set to "1"
    • 8 GB RAM
    • 2 x Western Digital WD2502ABYS
      • /dev/sda: XenServer Installation + guest's root disk
      • /dev/sdb: LVM SR with one 10 GiB RAW VDI attached to the guest
  • dom0: XenServer Creedence (Build Number 88873)
    • 4 vCPUs
    • 752 MB RAM
  • domU: Debian Wheezy x86_64
    • 2 vCPUs
    • 512 MB RAM

When issuing the fio workload as indicated in Table 1 (sequentially reading 4 KiB requests using libaio and with io_depth set to 1 during 10 seconds), an iostat within the guest reports the following:

root@wheezy64:~# iostat -xm | grep Device ; iostat -xm 1 | grep xvdb
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdb              0.00     0.00  251.05    0.00     0.98     0.00     8.00     0.04    0.18    0.18    0.00   0.18   4.47
xvdb              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdb              0.00     0.00 4095.00    0.00    16.00     0.00     8.00     0.72    0.18    0.18    0.00   0.18  72.00
xvdb              0.00     0.00 5461.00    0.00    21.33     0.00     8.00     0.94    0.17    0.17    0.00   0.17  94.40
xvdb              0.00     0.00 5479.00    0.00    21.40     0.00     8.00     0.96    0.18    0.18    0.00   0.18  96.40
xvdb              0.00     0.00 5472.00    0.00    21.38     0.00     8.00     0.95    0.17    0.17    0.00   0.17  95.20
xvdb              0.00     0.00 5472.00    0.00    21.38     0.00     8.00     0.97    0.18    0.18    0.00   0.18  97.20
xvdb              0.00     0.00 5443.00    0.00    21.27     0.00     8.00     0.96    0.18    0.18    0.00   0.18  95.60
xvdb              0.00     0.00 5465.00    0.00    21.34     0.00     8.00     0.96    0.17    0.17    0.00   0.17  95.60
xvdb              0.00     0.00 5467.00    0.00    21.36     0.00     8.00     0.96    0.18    0.18    0.00   0.18  96.00
xvdb              0.00     0.00 5475.00    0.00    21.39     0.00     8.00     0.96    0.18    0.18    0.00   0.18  96.40
xvdb              0.00     0.00 5479.00    0.00    21.40     0.00     8.00     0.97    0.18    0.18    0.00   0.18  96.80
xvdb              0.00     0.00 1155.00    0.00     4.51     0.00     8.00     0.20    0.17    0.17    0.00   0.17  20.00
xvdb              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdb              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
xvdb              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

The value of interest is reported in the column "avgqu-sz". It is about 0.96 on average while the benchmark was running. This means that the guest's block layer (referring to Figure 6) is handling requests almost the entire time.

The next layer of the storage subsystem that accounts for utilisation is tapdisk3. This value can be obtained running /opt/xensource/debug/xsiostat in dom0. For the same experiment, it reports the following:

[root@dom0 ~]# /opt/xensource/debug/xsiostat | head -2 ; /opt/xensource/debug/xsiostat | grep 51728
--------------------------------------------------------------------
  DOM   VBD         r/s        w/s    rMB/s    wMB/s rAvgQs wAvgQs
    1,51728:       0.00       0.00     0.00     0.00   0.00   0.00
    1,51728:    1213.04       0.00     4.97     0.00   0.22   0.00
    1,51728:    5189.03       0.00    21.25     0.00   0.71   0.00
    1,51728:    5196.95       0.00    21.29     0.00   0.71   0.00
    1,51728:    5208.94       0.00    21.34     0.00   0.71   0.00
    1,51728:    5208.10       0.00    21.33     0.00   0.71   0.00
    1,51728:    5194.92       0.00    21.28     0.00   0.71   0.00
    1,51728:    5203.08       0.00    21.31     0.00   0.71   0.00
    1,51728:    5245.00       0.00    21.48     0.00   0.72   0.00
    1,51728:    5482.02       0.00    22.45     0.00   0.74   0.00
    1,51728:    5474.02       0.00    22.42     0.00   0.74   0.00
    1,51728:    3936.92       0.00    16.13     0.00   0.53   0.00
    1,51728:       0.00       0.00     0.00     0.00   0.00   0.00
    1,51728:       0.00       0.00     0.00     0.00   0.00   0.00

Analogously to what was observed within the guest, xsiostat reports on the amount of time that it had outstanding requests. At this layer, this figure is reported at about 0.71 while the benchmark was running. This gives an idea of the time that passes between a request being accounted in the guest's block layer and at dom0's backend system. Going further, it is possible to run iostat in dom0 and find out what the perceived utilisation is at the last layer before the request is issued to the device driver.

[root@dom0 ~]# iostat -xm | grep Device ; iostat -xm 1 | grep dm-3
Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
dm-3              0.00     0.00 102.10  0.00     0.40     0.00     8.00     0.01    0.11   0.11   1.16
dm-3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00 281.00  0.00     1.10     0.00     8.00     0.06    0.20   0.20   5.60
dm-3              0.00     0.00 5399.00  0.00    21.09     0.00     8.00     0.58    0.11   0.11  58.40
dm-3              0.00     0.00 5479.00  0.00    21.40     0.00     8.00     0.58    0.11   0.11  57.60
dm-3              0.00     0.00 5261.00  0.00    20.55     0.00     8.00     0.61    0.12   0.12  61.20
dm-3              0.00     0.00 5258.00  0.00    20.54     0.00     8.00     0.61    0.12   0.12  61.20
dm-3              0.00     0.00 5206.00  0.00    20.34     0.00     8.00     0.57    0.11   0.11  56.80
dm-3              0.00     0.00 5293.00  0.00    20.68     0.00     8.00     0.60    0.11   0.11  60.00
dm-3              0.00     0.00 5476.00  0.00    21.39     0.00     8.00     0.64    0.12   0.12  64.40
dm-3              0.00     0.00 5480.00  0.00    21.41     0.00     8.00     0.61    0.11   0.11  60.80
dm-3              0.00     0.00 5479.00  0.00    21.40     0.00     8.00     0.66    0.12   0.12  66.40
dm-3              0.00     0.00 5047.00  0.00    19.71     0.00     8.00     0.56    0.11   0.11  56.40
dm-3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

At this layer, the block layer reports about 0.61 for the average queue size.

Varying the IO Depth

The sections above clarified why users might see a lower queue utilisation in dom0 when comparing the output of performance tools in different layers of the storage subsystem. The examples shown so far, however, mostly covered the case where IO Depth is set to "1". This means that the benchmark tool run within the guest (e.g. fio) will attempt to keep one request inflight at all times. This tool's perception, however, might be incorrect given that it takes time for the request to actually reach the storage infrastructure.

Using the same environment described in the previous section and gradually increasing the IO Depth in the benchmark configuration, the following data can be gathered:


Figure 7. Average queue size vs. io depth as configured in fio
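The sweep behind Figure 7 can be reproduced with a small loop that rewrites the iodepth line of the job file from Table 1 (a sketch; job.fio is assumed to contain that configuration):

# Re-run the Table 1 job at increasing IO depths, saving each result.
for depth in 1 2 4 8 16 32; do
    sed -i "s/^iodepth=.*/iodepth=$depth/" job.fio
    fio job.fio > result-iodepth-$depth.txt
done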

Conclusion

This article explained what the average queue size is and how it is calculated. As examples, it included real data from specific server and disk types. This should clarify why certain workloads cause different queue utilisations to be perceived from the guest and from dom0.


Basic Network Testing with IPERF

Purpose

I am often asked how one can perform simple network testing within, outside, and into XenServer.  This is a great question as – by itself – it is simple enough to answer.  However, depending on what one desires out of “network testing” the answer can quickly become more complex.

As such, I have decided to answer this question using a long-standing, free utility called IPERF (well, IPERF2).  It is a rather simple, straight-forward, but powerful utility I have used over many, many years.  Links to IPERF will be provided - along with documentation on its use - as it will serve in this guide as a way to:


- Test bandwidth between two or more points

- Determine bottlenecks

- Assist with black box testing or “what happens if” scenarios

- Use a tool that runs on both Linux and Windows

- And more…

IPERF: A Visual Breakdown

IPERF has to be installed on at least two separate end points.  One point acts as a server/receiver and the other acts as a client/transmitter.  This is so network testing can be done end-to-end, on anything from a simple subnet to a complex, routed network, using TCP or UDP generated traffic:

The visual shows an IPERF client transmitting data over IPv4 to an IPERF receiver.  Packets traverse the network - from wireless routers and through firewalls - from the client side to the server side over port 5001.

IPERF and XenServer

The key to network testing is in remembering that any device which is connected to a network infrastructure – Virtual or Physical – is a node, host, target, end point, or just simply … a networked device.

With regards to virtual machines, XenServer obviously supports Windows and Linux operating systems.  IPERF can be used to test virtual-to-virtual networking as well as virtual-to-physical networking.  If we stack virtual machines in a box to our left and stack physical machines in a box to our right – whether on a common subnet or a routed network – we can quickly see the permutations of how "Virtual and Physical Network Testing" can be achieved with IPERF transmitting data from one point to another:

And if one wanted, they could just as easily test networking for this:

Requirements

To illustrate a basic server/client model with IPERF, the following will be required:

- A Windows 7 VM that will act as an IPERF client

- A CentOS 5.x VM that will act as a receiver.

- IPERF2 (the latest version of IPERF, "IPERF3", can be found at https://github.com/esnet/iperf or, more specifically, at http://downloads.es.net/pub/iperf/)

The reason for using IPERF2 is quite simple: portability and compatibility on two of the most popular operating systems that I know are virtualized.  In addition, the same steps to installing IPERF2 on these hosts can be carried out on physical systems running similar operating systems, as well. 

The remainder of this article - regarding IPERF2 - will require use of the MS-DOS command-line as well as the Linux shell (of choice).  I will carefully explain all commands, so if you are “strictly a GUI” person, you should fit right in.

Disclaimer

When utilizing IPERF2, keep in mind that this is a traffic generator.  While one can control the quantity and duration of traffic, it is still network traffic.

So, consider testing during non-peak hours or after hours so as not to interfere with production-based network activity.

Windows and IPERF

The Windows port of IPERF 2.0.5 requires Windows XP (or greater) and can be downloaded from:

http://sourceforge.net/p/iperf/patches/_discuss/thread/20d4a4b0/5c44/attachment/Iperf.zip

Within the .zip file you will find two directories.  One is labeled DEBUG and the other is labeled RELEASE.  Extract the Iperf.exe program to a directory you will remember, such as C:\iperf\

Now, accessing the command line (cmd.exe), navigate to C:\iperf\ and execute:

iperf

The following output should appear:

Linux and IPERF

If you have additional repos already configured for CentOS, you can simply execute (as root):

yum install iperf

If that fails, one will need to download the Fedora/RedHat EPEL-Release RPM file for the version of CentOS being used.  To do this (as root), execute:

wget  http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
rpm -Uvh epel-release-5-4.noarch.rpm

 

*** Note that the above EPEL-Release RPM file is just an example (a working one) ***

 

Once epel-release-5-4.noarch.rpm is installed, execute:

yum install iperf

And once complete, as root execute iperf and one should see the following output:


Notice that it is the same output as what is being displayed from Windows.  IPERF2 is expecting a "-s" (server) or "-c" (client) command-line option with additional arguments.

IPERF Command-Line Arguments

On either Windows or Linux, a complete list of options for IPERF2 can be listed by executing:

iperf --help

A few good resources with examples of IPERF2 options for the server or client can be found at:

http://www.slashroot.in/iperf-how-test-network-speedperformancebandwidth

http://samkear.com/networking/iperf-commands-network-troubleshooting

http://www.techrepublic.com/blog/data-center/handy-iperf-commands-for-quick-network-testing/

For now, we will focus on the options needed for our server and client:

-f, --format    [kmKM]   format to report: Kbits, Mbits, KBytes, MBytes
-m, --print_mss          print TCP maximum segment size (MTU - TCP/IP header)
-i, --interval  #        seconds between periodic bandwidth reports
-s, --server             run in server mode
-c, --client    <host>   run in client mode, connecting to <host>
-t, --time      #        time in seconds to transmit for (default 10 secs)

Lastly, there is a TCP/IP Window setting.  This goes beyond the scope of this document as it relates to the TCP frame/windowing of data.  I highly recommend reading either of the two following links – especially for Linux – as there has always been some debate as to what is “best to be used”:

https://kb.doit.wisc.edu/wiscnet/page.php?id=11779

http://kb.pert.geant.net/PERTKB/IperfTool

Running An IPERF Test

So, we have IPERF2 installed on Windows 7 and on CentOS 5.10.  Before one performs any testing, ensure that antivirus software does not block iperf.exe from running and that port 5001 is open across the network.

Again, another port can be specified, but the default port IPERF2 uses for both client and server is 5001.
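If the default port is in use, or if UDP traffic is preferred, the same pair of commands only needs a couple of extra flags (the address, port and rate below are examples):

# TCP test on an alternate port (start the server/receiver first):
iperf -s -p 5002 -f M -m -i 10
iperf -c x.x.x.48 -p 5002 -t 30 -f M

# UDP test at a target rate of 100 Mbit/s:
iperf -s -u -i 10
iperf -c x.x.x.48 -u -b 100M -t 30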

Server/Receiver Side

The Server/Receiver side will be on the CentOS VM.

Following the commands above, we want to execute the following on the CentOS VM to run IPERF2 as a server/receiver, ready to accept traffic from our Windows 7 client machine:

iperf -s -f M -m -i 10

The output should show:

————————————————————
Server listening on TCP port 5001
TCP window size: 0.08 MByte (default)
————————————————————

The TCP window size has been previously commented on and the server is now ready to accept connections (press Control+C or Control+Z to exit).

Client/Transmission Side

Let us now focus on the client side to start sending data from the Windows 7 VM to the CentOS VM.

From Windows 7, the command line to start transmitting data for 30 seconds to our CentOS host (x.x.x.48) is:

iperf -c x.x.x.48 -t 30 -f M

Pressing enter, the traffic flow begins and the output from the client side looks like this:

From the server side, the output looks something like this:

And there we have it – a first successful test from a Windows 7 VM (located on one XenServer) to a CentOS 5.10 VM (located on another XenServer).

Understanding the Results

From either the client side or server side, results are shown by time and average.  The key item to look for from either side is:

0.0-30.0 sec  55828 MBytes  1861 MBytes/sec

Why?  This shows the average over the course of 0.0 to 30.0 seconds in terms of total megabytes transmitted as well as average megabytes of data sent per second.  In addition, since the "-f M" argument was passed as a command-line option, the output is calculated in megabytes accordingly.

In this particular case, we simply illustrated that from one VM to another VM, we transferred data at 1861 megabytes per second.

*** Note that this test was performed in a local lab with lower-end hardware than what you probably have! ***

--jkbs | @xenfomation

 


About XenServer

XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world's largest clouds and enterprises.
 
Technical support for XenServer is available from Citrix.