Virtualization Blog

Discussions and observations on virtualization.

XenServer High-Availability Alternative HA-Lizard


WHY HA AND WHAT IT DOES

XenServer (XS) includes a native high-availability (HA) option which allows quite a bit of flexibility in determining the state of a pool of hosts and the circumstances under which Virtual Machines (VMs) are restarted on alternative hosts when a host is no longer able to serve its VMs. HA is a very useful feature that keeps VMs from remaining down after a server crash or other incident that makes them inaccessible. Allowing a XS pool to maintain the functionality of its VMs on its own is an important capability and one that plays a large role in sustaining as much uptime as possible. Permitting the servers to deal with fail-overs automatically makes system administration easier and allows for more rapid reaction to incidents, leading to increased uptime for servers and the applications they run.

XS allows for the designation of three different treatments of Virtual Machines: (1) always restart, (2) restart if possible, and (3) do not restart. The VMs designated with the highest restart priority are attempted first, and all will be handled provided adequate resources (primarily, host memory) are available. A specific start order can also be established, allowing some VMs to be confirmed as running before others are started. VMs will be automatically distributed among whatever remaining XS hosts are considered active. Note that, where necessary, VMs configured with expandable (dynamic) memory will be shrunk down to accommodate additional VMs, and those VMs designated to be restarted will likewise run with reduced memory if need be. If additional capacity exists to run more VMs, those designated as "restart if possible" will be brought online. VMs that are not considered essential are typically marked as "do not restart" and hence will be left off even if they had been running before; any of those desired back must be restarted manually, resources permitting.

XS also allows specifying how many host failures the pool should be able to tolerate while still leaving enough capacity to run the protected VMs; larger pools that are not overly populated with VMs can readily accommodate even two or more host failures.
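For reference, these native HA settings correspond to a handful of xe commands. What follows is only a minimal sketch; the UUIDs and the failures-to-tolerate value are placeholders and will differ for every pool:

# Enable HA against a designated heartbeat SR and tolerate up to two host failures
xe pool-ha-enable heartbeat-sr-uuids=<heartbeat-sr-uuid>
xe pool-param-set uuid=<pool-uuid> ha-host-failures-to-tolerate=2

# Per-VM treatment: always restart (with a start order), restart if possible, or do not restart
xe vm-param-set uuid=<vm1-uuid> ha-restart-priority=restart order=1
xe vm-param-set uuid=<vm2-uuid> ha-restart-priority=best-effort
xe vm-param-set uuid=<vm3-uuid> ha-restart-priority=""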

The determination of which hosts are "live" and should be considered active members of the pool follows a rather involved process that combines network accessibility with access to an independent, designated pooled Storage Repository (SR) that serves as a heartbeat. The heartbeat SR can also be a fiber channel device and hence independent of Ethernet connections. A quorum-based algorithm is applied to establish which servers are up and active as members of the pool and which -- in the event of a pool master failure -- should be elected the new pool master.

 

WHEN HA WORKS, IT WORKS GREAT

Without going into more detail, suffice it to say that this methodology works very well; however, it carries a few prerequisite conditions that need to be taken into consideration. First of all, the mandate that a pooled storage device be available clearly precludes a pool consisting of hosts that only make use of local storage. Second, there is also a constraint that for a quorum to be possible, a minimum of three hosts is required in the pool, or HA results will be unpredictable since the election of a pool master can become ambiguous. This comes about because of the so-called "split brain" issue (http://linux-ha.org/wiki/Split_Brain), which is endemic in many different operating system environments that employ a quorum as the means of making such a decision. Furthermore, while fencing (the process of isolating the host; see for example http://linux-ha.org/wiki/Fencing) is the typical recourse, the lack of intercommunication can result in the wrong decision being made and hence loss of access to VMs. Having experimented with two-host pools and the native XenServer HA, I would estimate that it works about half the time -- which, from a statistical viewpoint, is pretty much what you would expect.

These limitations are, however, of immediate concern to those with either no pooled storage and/or only two hosts in a pool. With a little bit of extra network connectivity, a relatively simple and inexpensive solution to the external SR requirement can be provided by making a very small NFS-based SR available. The second condition, however, is not readily rectified without the expense of at least one additional host and all the connectivity associated with it. In some cases, this may simply not be an affordable option.

 

ENTER HA-LIZARD

For a number of years now, an alternative method of providing HA has been available through the program package provided by HA-Lizard (http://www.halizard.com/), a community project that provides a free alternative which neither depends on an external SR nor requires a minimum of three hosts within a pool. This blog focuses on the standard HA-Lizard version and, because it is the particularly harder-to-handle situation, on the case of a two-node pool.

I had been experimenting for some time with HA-Lizard and found in particular that I was able to create failure scenarios that needed some improvement. HA-Lizard’s Salvatore Costantino was more than willing to lend an ear to the cases I had found, and this led to a very productive collaboration on investigating and implementing means to deal with a number of specific cases involving two-host pools. The result of these several months of effort is a new HA-Lizard release that manages to address a number of additional scenarios above and beyond its earlier capabilities.

It is worthwhile mentioning that there are two ways of deploying HA-Lizard:

1) Most use cases combine HA-Lizard and iSCSI-HA, which creates a two-node pool using local storage while maintaining full VM agility, with VMs able to run on either host. DRBD (http://www.drbd.org/) is implemented in this type of deployment and works very well, providing real-time storage replication.

2) HA-Lizard alone is used with an external Storage Repository (as in this particular case).

Before going into details of the investigation, a few words should go towards a brief explanation of how this works. Note that there is only Internet connectivity (the gateway is used as a heuristic network node) and no external SR, so how is a split-brain situation then avoidable?

This is how I'd describe the course of action in this two-node situation:

1) If a node can see the gateway, it assumes it is alive. If it cannot, it assumes it is a candidate for fencing.

2) If the node that cannot see the gateway is the master, it internally kills any running VMs, surrenders its ability to be the master, and fences itself. The slave node then promotes itself to master and attempts to restart any missing VMs. Restarts of VMs that still reside on the previous master will probably fail at first, because there is no communication with the old master; eventually the new master will be able to restart them regardless after a toolstack restart.

3) If it is the slave that loses network connectivity, then as long as the master still sees the network but not the slave, the master can assume the slave will fence itself and that the slave's VMs are to be restarted on the current master. The slave, realizing it cannot communicate out, kills off its VMs and fences itself.
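To make the sequence concrete, here is a rough, hypothetical sketch of the decision each node makes. This is illustrative pseudocode only, not HA-Lizard's actual implementation; the gateway address is a placeholder and the actions are summarized in comments:

#!/bin/bash
# Hypothetical outline of the two-node decision logic described above.
GATEWAY="192.168.1.1"    # the heuristic network node both hosts check

if ping -c 3 -W 2 "$GATEWAY" >/dev/null 2>&1; then
    # Gateway reachable: this node considers itself healthy. If it is the
    # slave and the master has gone silent, it promotes itself (e.g., via
    # xe pool-emergency-transition-to-master) and restarts missing VMs.
    echo "gateway reachable: remain active / promote if the master is gone"
else
    # Gateway unreachable: this node assumes it is the failed one. It shuts
    # down its local VMs, gives up the master role if it holds it, and fences
    # itself, trusting the surviving node to restart the VMs.
    echo "gateway unreachable: stop local VMs and self-fence"
fi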

Naturally, the trickier part comes with the timing of the various actions, since each node has to blindly assume the other is going to conduct its own sequence of events. The key here is that these sequences are all agreed upon ahead of time, and as long as each node follows its own specific instructions, it does not matter that the two nodes cannot see each other. In essence, the lack of communication in this case allows for prescribing a very specific course of action! If both nodes fail, the case is obviously hopeless, but that would be true of any HA configuration in which no node is left standing.

Test plans were worked out for a variety of cases, and the table below elucidates the different test scenarios, what was expected, and what was actually observed. It is very encouraging that the vast majority of these cases can now be properly handled.

 

Particularly tricky here was the case of rebooting the master server from the shell, without first disabling HA-Lizard (something one could readily forget to do). Since the fail-over process takes a while, a large number of VMs cannot be handled before the communication breakdown takes place, hence one is left with a bit of a mess to clean up in the end. Nevertheless, it’s still good to know what happens if something takes place that rightfully shouldn’t!

The other cases, whether intentional or not, are handled predictably and reliably, which is of course the intent. Typically, a two-node pool isn’t going to have a lot of complex VM dependencies, so the lack of a start order of VMs should not be perceived as a big shortcoming. Support for this feature may even be added in a future release.

 

CONCLUSIONS

HA-Lizard is a viable alternative to the native Citrix HA configuration. It’s straightforward to set up and can handle standard failover cases with a selective “restart/do not restart” setting for each VM, or it can be configured globally. There are quite a number of configuration parameters which the reader is encouraged to research in the extensive HA-Lizard documentation. There is also an on-line forum which serves as a source for information and prompt assistance with issues. The most recent release, 2.1.3, is supported on both XenServer 6.5 and 7.0.

Above all, HA-Lizard shines when it comes to handling a non-pooled storage environment and, in particular, all configurations of the dreaded two-node pool. From my direct experience, HA-Lizard now handles the vast majority of issues involved in a two-node pool and can do so more reliably than the non-supported two-node pool using Citrix’ own HA application. It has been possible to conduct a lot of tests covering various cases and, importantly, to do so multiple times to ensure the actions are predictable and repeatable.

I would encourage taking a look at HA-Lizard and giving it a good test run. The software is free (contributions are accepted), is in extensive use, and has a proven track record. For a two-host pool, I can frankly not think of a better alternative, especially with these latest improvements and enhancements.

I would also like to thank Salvatore Costantino for the opportunity to participate in this investigation and am very pleased to see the fruits of this collaboration. It has been one way of contributing to the Citrix XenServer user community that many can immediately benefit from.

Recent comment in this post
JK Benedict
I hath no idea why more have not read this intense article! As always: bravo, sir! BRAVO!
Wednesday, 04 January 2017 12:43

Making a difference to the next release

With the first alpha build of the new, upcoming XenServer release (codenamed "Ely") announced in Andy Melmed's blog on the 11th of October, I thought it'd be useful to provide a little retrospective on how you, the xenserver.org community, have helped get it off to a great start by providing great feedback on previous alphas, betas and releases - and how this has been used to strengthen the codebase for Ely as a whole based on your experiences.
 
As I am sure you are well aware, community users of xenserver.org can make use of the incident tracking database at bugs.xenserver.org to report issues or problems they've found on their hardware or configuration with recent alphas, betas and XenServer releases.  These incidents are raised in the form of XSO tickets which can then be commented upon by other members of the community and folks who work on the product.
 
We listened
Looking back on all of the XSO tickets raised on the latest 7.0 release - these total more than 200 individual incident reports.  I want to take the time to thank everyone who contributed these often detailed, specific, constructive reports, and for working iteratively to understand more of the underlying issues.  Some of these investigations are ongoing and need further feedback, but many of them are sufficiently clear to move forward to the next step.
 
We understood
The incident reports were triaged and, by working with the user community, more than 80% of them have been processed.  Frequently this involved questions and answers to get a better handle on what the underlying problem was, then trying a change to the configuration or even a private fix to confirm whether it related to the problem or resolved it.  The enthusiasm and skill of the reporters has been amazing, and continually useful.  At this point we've separated the incidents into those which can be fixed as bugs and those which are requests for features.  The latter have been provided to Citrix product management for consideration.
 
We did
Of those which can be fixed as bugs, we raised or updated 45 acknowledged defects in XenServer.  More than 70% of these are already fixed, with another 20% being actively worked on.  The small remainder are blocked for some reason, awaiting a change elsewhere in the product, upstream, or in our ability to test.  The fixed 70% have now successfully either become part of some of the hotfixes which have been released for 7.0, or are in the codebase already and are being progressively released as part of the Ely alpha programme for the community to try.
 
So what's next?  With work continuing apace on Ely, we have now opened "Ely alpha" as an affects-version in the incident database to raise issues with this latest build.  At the same time, in the spirit of continuing to progressively improve the actively developing codebase, we have removed the 6.5 SP1 affects-version so folks can focus on the new release.
 
Finally - on behalf of all xenserver.org users - my personal thanks to everyone who has helped improve both Dundee and Ely - both by reporting incidents, triaging and fixing them and by continuing to give your feedback on the latest new version.  This really makes a difference to all members of the community.

Resetting Lost Root Password in XenServer 7.0

XenServer 7.0, Grub2, and a Lost Root Password

In a previous article I detailed how one could reset a lost root password for XenServer 6.2.  While that article is not limited to 6.2 (it works just as well for 6.5, 6.1, and 6.0.2), this article is dedicated to XenServer 7.0, as grub2 has been brought in to replace extlinux.

As such, if the local root user's (LRU) password for a XenServer 7.0 host is forgotten, physical (or "lights out") access to the host and a reboot will be required.  The contrast with earlier releases comes from grub2: the method of booting the XenServer 7.0 host into single-user mode, and how the root password is then reset to a known value.

The Grub Boot Screen

Once physical or "lights out" access to the XenServer 7.0 host in question has been obtained, the following screen will appear on reboot:

It is important to note that once this screen appears, you only have four seconds to take action before the host proceeds to boot the kernel.

As should be the default, the XenServer kernel entry is highlighted.  One will want to immediately press the "e" key (for edit).

This will then refresh the grub interface - stopping any count-down-to-boot timers - and reveal the boot entry.  Within this window (using the up, down, left, and right keys) one will want to navigate to around line four or five and locate "ro nolvm":

 

Next, one will want to remove (or backspace/delete) the "ro" characters and type in "rw init=/sysroot/bin/sh", as illustrated below:
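The relevant line of the boot entry changes roughly as follows; the exact entry varies by installation, so everything other than the "ro" / "rw init=/sysroot/bin/sh" portion should be treated as a placeholder:

module2 /boot/vmlinuz-... root=LABEL=root-xxxxx ro nolvm ...                           (before)
module2 /boot/vmlinuz-... root=LABEL=root-xxxxx rw init=/sysroot/bin/sh nolvm ...      (after)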

 

Don't worry if the directive is not on one line!

 

With this change made, press Control and X at the same time; this will boot the XenServer kernel into a single-user style mode, better known as Emergency Mode:

How to Change Root's Password

From the Emergency Mode prompt, execute the following command:

chroot /sysroot

Now, one can execute the "passwd" command to change root's credentials:
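Putting the steps together, the whole sequence from the Emergency Mode prompt looks like this:

chroot /sysroot
passwd          # enter and confirm the new root password at the prompts
exit            # leave the chroot before rebooting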

Finally....

Now that root's credentials have been changed, utilize Control+Alt+Delete to reboot the XenServer 7.0 host and one will find via SSH, XenCenter, or directly that the root password has been changed: the host is ready to be managed again.

 

Recent Comments
Tobias Kreidl
Many thanks for this update, Jesse! It should be turned into a KB article, as well, if not already.
Friday, 24 June 2016 10:52
JK Benedict
Jordan -- Thanks for the compliments! However, it seems more apropos to say "Sorry to hear about your situation!" So, the steps... Read More
Monday, 27 June 2016 10:11

XenServer Administrators Handbook Published

Last year, I announced that we were working on a XenServer Administrators Handbook, and I'm very pleased to announce that it's been published. Not only have we been published, but based on the Amazon reviews to date we've done a pretty decent job. In part, I suspect that has a ton to do with the book being focused on what information you, XenServer administrators, need to be successful when running a XenServer environment regardless of scale or workload.

The handbook is formatted following a simple premise; first you need to plan your deployment and second you need to run it. With that in mind, we start with exactly what a XenServer is, define how it works and what expectations it has on infrastructure. After all, it's critical to understand how a product like XenServer interfaces with the real world, and how its virtual objects relate to each other. We even cover some of the misunderstandings those new to XenServer might have.

While it might be tempting to go deep on some of this stuff, Jesse and I both recognized that virtualization SREs have a job to do and that's to run virtual infrastructure. As interesting as it might be to dig into how the product is implemented, that's not the role of an administrators handbook. That's why the second half of the book provides some real world scenarios, and how to go about solving them.

We had an almost limitless list of scenarios to choose from, and what you see in the book represents real world situations which most SREs will face at some point. The goal of this format being to have a handbook which can be actively used, not something which is read once and placed on some shelf (virtual or physical). During the technical review phase, we sent copies out to actual XenServer admins, all of whom stated that we'd presented some piece of information they hadn't previously known. I for one consider that to be a fantastic compliment.

Lastly, I want to finish off by saying that like all good works, this is very much a "we" effort. Jesse did a top notch job as co-author and brings the experience of someone whose job it is to help solve customer problems. Our technical reviewers added tremendously to the polish you'll find in the book. The O'Reilly Media team was a pleasure to work with, pushing when we needed to be pushed but understanding that day jobs and family take precedence.

So whether you're looking at XenServer out of personal interest, have been tasked with designing a XenServer installation to support Citrix workloads, clouds, or for general purpose virtualization, or have a XenServer environment to call your own, there is something in here for you. On behalf of Jesse, we hope that everyone who gets a copy finds it valuable. The XenServer Administrator's handbook is available from book sellers everywhere including:

Amazon: http://www.amazon.com/XenServer-Administration-Handbook-Successful-Deployments/dp/149193543X/

Barnes and Noble: http://www.barnesandnoble.com/w/xenserver-administration-handbook-tim-mackey/1123640451

O'Reilly Media: http://shop.oreilly.com/product/0636920043737.do

If you need a copy of XenServer to work with, you can obtain that for free from: http://xenserver.org/download

Recent Comments
Tobias Kreidl
A timely publication, given all the major recent enhancements to XenServer. It's packed with a lot of hands-on, practical advice a... Read More
Tuesday, 03 May 2016 03:37
Eric Hosmer
Been looking forward to getting this book, just purchased it on Amazon. Now I just need to find that mythical free time to read ... Read More
Friday, 06 May 2016 22:41

NAU VMbackup 3.0 for XenServer


By Tobias Kreidl and Duane Booher

Northern Arizona University, Information Technology Services

Over eight years ago, back in the days of XenServer 5, not a lot of backup and restore options were available, either as commercial products or as freeware, and we quickly came to the realization that data recovery was a vital component of a production environment and hence we needed an affordable and flexible solution. The conclusion at the time was that we might as well build our own, and though the availability of options has grown significantly over the years, we’ve stuck with our own home-grown solution, which leverages the Citrix XenServer SDK and XenAPI (http://xenserver.org/partners/developing-products-for-xenserver.html). Early versions were created from the contributions of Douglas Pace, Tobias Kreidl and David McArthur. During the last several years, the lion’s share of development has been performed by Duane Booher. This article discusses the latest 3.0 release.

A Bit of History

With the many alternatives now available, one might ask why we have stuck with this rather un-flashy script and CLI-based mechanism. There are clearly numerous reasons. For one, in-house products allow total control over all aspects of their development and support. The financial outlay is entirely people’s time, and since there are no contracts or support fees, it’s very controllable and predictable. We also found from time to time that various features were not readily available in the other sources we looked at. Furthermore, we felt early on that, as an educational institution, we could give back to the community by freely providing the product along with its source code; the most recent version is available via GitHub at https://github.com/NAUbackup/VmBackup for free under the terms of the GNU General Public License. There was a write-up at https://www.citrix.com/blogs/2014/06/03/another-successful-community-xenserver-sdk-project-free-backup-tools-and-scripts-naubackup-restore-v2-0-released/ when the first GitHub version was published. Earlier versions were made available via the Citrix community site (Citrix Developer Network), sometimes referred to as the Citrix Code Share, where community contributions were published for a number of products. When that site was discontinued in 2013, we relocated the distribution to GitHub.

Because we “eat our own dog food,” VMbackup gets extensive and constant testing because we rely on it ourselves as the means to create backups and provide for restores for cases of accidental deletion, unexpected data corruption, or in the event that disaster recovery might be needed. The mechanisms are carefully tested before going into production and we perform frequent tests to ensure the integrity of the backups and that restores really do work. A number of times, we have relied on resorting to recovering from our backups and it has been very reassuring that these have been successful.

What VMbackup Does

Very simply, VMbackup provides a framework for backing up virtual machines (VMs) hosted on XenServer to an external storage device, as well as the means to recover such VMs for whatever reason that might have resulted in loss, be it disaster recovery, restoring an accidentally deleted VM, recovering from data corruption, etc.

The VMbackup distribution consists of a script written in Python, a configuration file, and a README document; the only other requirement is the XenServer SDK components, which need to be downloaded separately (see http://xenserver.org/partners/developing-products-for-xenserver.html for details). There is no fancy GUI to become familiar with; instead, just a few simple things need to be configured, plus a destination for the backups needs to be made accessible (this is generally an NFS share, though SMB/CIFS will work, as well). Using cron job entries, a single host or an entire pool can be set up to perform periodic backups. The configuration is needed on each host in a pool because the pool master, which performs the majority of the work, can readily change to a different XenServer; host-based instances are also needed when local storage is in use, since local SRs can only be accessed from their own individual XenServer. A cron entry and numerous configuration examples are given in the documentation.

To avoid interruptions of any running VMs, the process of backing up a VM follows these basic steps:

  1. A snapshot of the VM and its storage is made
  2. Using the xe utility vm-export, that snapshot is exported to the target external storage
  3. The snapshot is deleted, freeing up that space
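For orientation, a rough hand-run approximation of these steps with the xe CLI might look like the following; the VM name and paths are placeholders, and snapshot-export-to-template is used here simply as a convenient way to export a snapshot directly from the command line (the script's own invocation of the export differs in its details):

SNAP_UUID=$(xe vm-snapshot vm="MyVM" new-name-label="MyVM-backup")
xe snapshot-export-to-template snapshot-uuid="$SNAP_UUID" filename=/snapshots/BACKUPS/MyVM/backup.xva
xe snapshot-uninstall snapshot-uuid="$SNAP_UUID" force=true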

In addition, some VM metadata are collected and saved, which can be very useful in the event a VM needs to be restored. The metadata include:

  • vm.cfg - includes name_label, name_description, memory_dynamic_max, VCPUs_max, VCPUs_at_startup, os_version, orig_uuid
  • DISK-xvda (for each attached disk)
    • vbd.cfg - includes userdevice, bootable, mode, type, unplugable, empty, orig_uuid
    • vdi.cfg - includes name_label, name_description, virtual_size, type, sharable, read_only, orig_uuid, orig_sr_uuid
  • VIFs (for each attached VIF)
    • vif-0.cfg - includes device, network_name_label, MTU, MAC, other_config, orig_uuid

An additional option is to create a backup of the entire XenServer pool metadata, which is essential in dealing with the aftermath of a major disaster that affects the entire pool. This is accomplished via the “xe pool-dump-database” command.
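For example, run by hand (the output path is arbitrary):

xe pool-dump-database file-name=/snapshots/BACKUPS/METADATA/pool-db-backup.dump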

In the event of errors, there are automatic clean-up procedures in place that will remove any remnants plus make sure that earlier successful backups are not purged beyond the specified number of “good” copies to retain.

There are numerous configuration options that allow one to specify which VMs get backed up, how many backup versions are to be retained, and whether the backups should be compressed (1) as part of the process, as well as optional report generation and notification setups.

New Features in VMbackup 3.0

A number of additional features have been added to this latest 3.0 release, adding flexibility and functionality. Some of these came about because of the sheer number of VMs that needed to be dealt with and SR space issues, as well as changes coming in the next XenServer release. These additions include:

  • VM “preview” option: To be able to look for syntax errors and ensure parameters are being defined properly, a VM can have a syntax check performed on it and if necessary, adjustments can then be made based on the diagnosis to achieve the desired configuration.
  • Support for VMs containing spaces: By surrounding VM names in the configuration file with double quotes, VM names containing spaces can now be processed. 
  • Wildcard suffixes: This very versatile option permits groups of VMs to be configured to be handled similarly, eliminating the need to create individual settings for every desired VM. Examples include “PRD-*”, “SQL*” and in fact, if all VMs in the pool should be backed up, even “*”. There are however, a number of restrictions on wildcard usage (2).
  • Exclude VMs: Along with the wildcard option to select which VMs to back up, clearly a need arises to provide the means to exclude certain VMs (in addition to the other alternative, which is simply to rename them such that they do not match a certain backup set). Currently, each excluded VM must be named separately and any such VMs should be defined at the end of the configuration file. 
  • Export the OS disk VDI, only: In some cases, a VM may contain multiple storage devices (VDIs) that are so large that it is impractical or impossible to take a snapshot of the entire VM and its storage. Hence, we have introduced the means to back up and restore only the operating system device (assumed to be Disk 0). In addition to space limitations, some storage, such as DB data, may not be able to be reliably backed up using a full VM snapshot. Furthermore, the next XenServer release (Dundee) will likely support as many as 255 storage devices per VM, making a vm-export even more involved under such circumstances. Another big advantage here is that currently, this process is much more efficient and faster than a VM export by a factor of three or more!
  • Root password obfuscation: So that clear-text passwords associated with the XenServer pool are not embedded in the scripts themselves, the password can be basically encoded into a file.

The mechanism for a running VM from which only the primary disk is to be backed up is similar to the full VM backup. The process of backing up such a VM follows these basic steps:

  1. A snapshot of just the VM's Disk 0 storage is made
  2. Using the xe utility vdi-export, that snapshot is exported to the target external storage
  3. The snapshot is deleted, freeing up that space
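Again for orientation, a rough hand-run approximation with the xe CLI might look like this; the VDI UUID and path are placeholders, and the script's own handling differs in its details:

SNAP_VDI=$(xe vdi-snapshot uuid=<disk0-vdi-uuid>)
xe vdi-export uuid="$SNAP_VDI" filename=/snapshots/BACKUPS/MyVM/xvda.vhd format=vhd
xe vdi-destroy uuid="$SNAP_VDI"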

As with the full VM export, some metadata for the VM are also collected and saved for this VDI export option.

These added features are of course subject to change in future releases, though later editions generally retain support for previous versions' behavior to preserve backwards compatibility.

Examples

Let’s look at the configuration file weekend.cfg:

# Weekend VMs
max_backups=4
backup_dir=/snapshots/BACKUPS
#
vdi-export=PROD-CentOS7-large-user-disks
vm-export=PROD*
vm-export=DEV-RH*:3
exclude=PROD-ubuntu12-benchmark
exclude=PRODtestVM

Comment lines start with a hash mark and may be contained anywhere within the file. The hash mark must appear as the first character in the line. Note that the default number of retained backups is set here to four. The destination directory is set next, indicating where the backups will be written. We then see a case where only the OS disk is being backed up for the specific VM "PROD-CentOS7-large-user-disks" and, below that, all VMs beginning with “PROD” are backed up using the default settings. Just below that, a definition is created for all VMs starting with "DEV-RH", and the number of backups retained for these is reduced from the global default of four down to three. Finally, we see two excludes for specific VMs that fall into the “PROD*” group and should not be backed up at all.

To launch the script manually, you would issue from the command line:

./VmBackup.py password weekend.cfg

To launch the script via a cron job, you would create a single-line entry like this:

10 0 * * 6 /usr/bin/python /snapshots/NAUbackup/VmBackup.py password
/snapshots/NAUbackup/weekend.cfg >> /snapshots/NAUbackup/logs/VmBackup.log 2>&1

This would run the task at ten minutes past midnight on Saturday and create a log entry called VmBackup.log. This cron entry would need to be installed on each host of a XenServer pool.

Additional Notes

It can be helpful to stagger when backups are run so that they don’t all have to be done at once, which may be impractical or take so long as to possibly impact performance during the day; the timing may also need to be coordinated with what is best for specific VMs (such as before or after patches are applied). These situations are best dealt with by creating separate cron jobs for each subset.

There is a fair load on the server, comparable to any vm-export, and hence the queue is processed linearly with only one active snapshot and export sequence for a VM being run at a time. This is also why we suggest you perform the backups and then asynchronously perform any compression on the files on the external storage host itself to alleviate the CPU load on the XenServer host end.

For even more redundancy, you can readily duplicate or mirror the backup area to another storage location, perhaps in another building or even somewhere off-site. This can readily be accomplished using various copy or mirroring utilities, such as rcp, sftp, wget, nsync, rsync, etc.
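As one example among many, a nightly rsync of the backup area to a second storage host could be as simple as the following; the host name and paths are placeholders:

rsync -av --delete /snapshots/BACKUPS/ backuphost.example.edu:/mirror/BACKUPS/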

This latest release has been tested on XenServer 6.5 (SP1) and various beta and technical preview versions of the Dundee release. In particular, note that the vdi-export utility, while dating back a while, is not well documented and we strongly recommend not trying to use it on any XenServer release before XS 6.5. Doing so is clearly at your own risk.

The NAU VMbackup distribution can be found at: https://github.com/NAUbackup/VmBackup

In Conclusion

This is a misleading heading, as there is not really a conclusion: the project continues to be active, and as long as there is a perceived need for it, we plan to keep it running on future XenServer releases and to add functionality as needs and resources dictate. Our hope is naturally that the community can make at least as good use of it as we have ourselves.

Footnotes:

  1. Alternatively, to save time and resources, the compression can potentially be handled asynchronously by the host onto which the backups are written, hence reducing overhead and resource utilization on the XenServer hosts, themselves.
  2. Certain limitations exist currently with how wildcards can be utilized. Leading wildcards are not allowed, nor are multiple wildcards within a string. This may be enhanced at a later date to provide even more flexibility.

This article was written by Tobias Kreidl and Duane Booher, both of Northern Arizona University, Information Technology Services. Tobias' biography is available at this site, and Duane's LinkedIn profile is at https://www.linkedin.com/in/duane-booher-a068a03 while both can also be found on http://discussions.citrix.com primarily in the XenServer forum.     

Recent Comments
Lorscheider Santiago
Tobias Kreidl and Duane Booher, Greart Article! you have thought of a plugin for XenCenter?
Saturday, 09 April 2016 13:28
Tobias Kreidl
Thank you, Lorscheider, for your comment. Our thoughts have long been that others could take this to another level by developing a... Read More
Thursday, 14 April 2016 01:34
Niklas Ahden
Hi, First of all I want to thank you for this great article and NAUBackup. I am wondering about the export-performance while usin... Read More
Sunday, 17 April 2016 19:14

Preview of XenServer Administrators Handbook

Administering any technology can be both fun and challenging at times. For many, the fun part is designing a new deployment while for others the hardware selection process, system configuration and tuning and actual deployment can be a rewarding part of being an SRE. Then the challenging stuff hits where the design and deployment become a real part of the everyday inner workings of your company and with it come upgrades, failures, and fixes. For example, you might need to figure out how to scale beyond the original design, deal with failed hardware or find ways to update an entire data center without user downtime. No matter how long you've been working with a technology, the original paradigms often do change, and there is always an opportunity to learn how to do something more efficiently.

That's where a project JK Benedict and I have been working on with the good people of O'Reilly Media comes in. The idea is a simple one. We wanted a reference guide which would contain valuable information for anyone using XenServer - period. If you are just starting out, there would be information to help you make that first deployment a successful one. If you are looking at redesigning an existing deployment, there are valuable time-saving nuggets of info, too. If you are a longtime administrator, you would find some helpful recipes to solve real problems that you may not have tried yet. We didn't focus on long theoretical discussions, and we've made sure all content is relevant in a XenServer 6.2 or 6.5 environment. Oh, and we kept it concise because your time matters.

I am pleased to announce that attendees of OSCON will be able to get their hands on a preview edition of the upcoming XenServer Administrators Handbook. Not only will you be able to thumb through a copy of the preview book, but I'll have a signing at the O'Reilly booth on Wednesday July 22nd at 3:10 PM. I'm also told the first 25 people will get free copies, so be sure to camp out ;)

Now of course everyone always wants to know which animal gets featured on the book cover. As you can see below, we have a bird. Not just any bird mind you, but a xenops. Now I didn't do anything to steer O'Reilly towards this, but find it very cool that we have an animal which also shares its name with a very core component in XenServer: the xenopsd. For me, that's a clear indication we've created the appropriate content, and I hope you'll agree.

 

             

Recent Comments
prashant sreedharan
cool ! cant wait to get my hands on the book :-)
Tuesday, 07 July 2015 19:32
Tobias Kreidl
Congratulations, Tim and Jesse, as an update in this area is long overdue and in very good hands with you two. The XenServer commu... Read More
Tuesday, 07 July 2015 19:42
JK Benedict
Ah, Herr Tobias -- Danke freund. Danke fur ihre unterstutzung! Guten abent!
Thursday, 23 July 2015 09:26

Security bulletin covering VENOM

Last week a vulnerability in QEMU was reported with the marketing name of "VENOM", but which is more correctly known as CVE-2015-3456.  Citrix have released a security bulletin covering CVE-2015-3456 which has been updated to include hotfixes for XenServer 6.5, 6.5 SP1 and XenServer 6.2 SP1.

Learning about new XenServer hotfixes

When a hotfix is released for XenServer, it will be posted to the Citrix support web site. You can receive alerts from the support site by registering at http://support.citrix.com/profile/watches and following the instructions there. You will need to create an account if you don't have one, but the account is completely free. Whenever a security hotfix is released, there will be an accompanying security advisory in the form of a CTX knowledge base article for it, and those same KB articles will be linked on xenserver.org in the download page.

Patching XenServer hosts

XenServer admins are encouraged to schedule patching of their XenServer installations at their earliest opportunity. Please note that this bulletin does impact XenServer 6.2 hosts; to apply the patch, all XenServer 6.2 hosts will first need to be patched to Service Pack 1, which can be found on the XenServer download page.
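As a reminder of the mechanics, applying a hotfix from the command line follows the usual pattern shown below; the hotfix filename is a placeholder (use the one referenced in the security bulletin for your version), and XenCenter can of course be used instead:

PATCH_UUID=$(xe patch-upload file-name=XS65ESP1XXX.xsupdate)
xe patch-pool-apply uuid="$PATCH_UUID"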


XenServer 6.5 and Asymmetric Logical Unit Access (ALUA) for iSCSI Devices

INTRODUCTION

There are a number of ways to connect storage devices to XenServer hosts and pools, including local storage, HBA SAS and fiber channel, NFS and iSCSI. With iSCSI, there are a number of implementation variations including support for multipathing with both active/active and active/passive configurations, plus the ability to support so-called “jumbo frames” where the MTU is increased from 1500 to typically 9000 to optimize frame transmissions. One of the lesser-known and somewhat esoteric iSCSI options available on many modern iSCSI-based storage devices is Asymmetric Logical Unit Access (ALUA), a protocol that has been around for a decade and is all the more intriguing because it can be used not only with iSCSI, but also with fiber channel storage. The purpose of this article is to clarify and outline how ALUA can now be used more flexibly with iSCSI on XenServer 6.5.

HISTORY

ALUA support on XenServer goes way back to XenServer 5.6 and initially only with fiber channel devices. The support of iSCSI ALUA connectivity started on XenServer 6.0 and was initially limited to specific ALUA-capable devices, which included the EMC Clariion, NetApp FAS as well as the EMC VMAX and VNX series. Each device required specific multipath.conf file configurations to properly integrate with the server used to access them, XenServer being no exception. The upstream XenServer code also required customizations. The "How to Configure ALUA Multipathing on XenServer 6.x for Enterprise Arrays" article CTX132976 (March 2014, revised March 2015) currently only discusses ALUA support through XenServer 6.2 and only for specific devices, stating: “Most significant is the usability enhancement for ALUA; for EMC™ VNX™ and NetApp™ FAS™, XenServer will automatically configure for ALUA if an ALUA-capable LUN is attached”.

The XenServer 6.5 Release Notes announced that XenServer will automatically configure connections to one of these aforementioned documented devices and that it now runs an updated device mapper multipath (DMMP), version 0.4.9-72. This rekindled my interest in ALUA connectivity, and after some research and discussions with Citrix and Dell about support, it appeared this might now be possible specifically for the Dell MD3600i units we have used on XenServer pools for some time now. What is not stated in the release notes is that XenServer 6.5 now has the ability to connect generically to a large number of ALUA-capable storage arrays. This will be gone into in detail later. It is also of note that MPP-RDAC support is no longer available in XenServer 6.5 and DMMP is the exclusive multipath mechanism supported. This was in part because of support and vendor-specific issues (see, for example, the XenServer 6.5 Release Notes or this document from Dell, Inc.).

But first, how are ALUA connections even established? And perhaps of greater interest, what are the benefits of ALUA in the first place?

ALUA DEFINITIONS AND SETTINGS

As the name suggests, ALUA is intended to optimize storage traffic by making use of optimized paths. With multipathing and multiple controllers, there are a number of paths a packet can take to reach its destination. With two controllers on a storage array and two NICs dedicated to iSCSI traffic on a host, there are four possible paths to a storage Logical Unit Number (LUN). On the XenServer side, LUNs are then associated with storage repositories (SRs). ALUA recognizes that once an initial path is established to a LUN, any multipathing activity destined for that same LUN is better served if routed through the same storage array controller. It attempts to do so as much as possible, unless of course a failure forces the connection to take an alternative path. ALUA connections fall into five self-explanatory categories (listed along with their associated hex codes):

  • Active/Optimized : x0
  • Active/Non-Optimized : x1
  • Standby : x2
  • Unavailable : x3
  • Transitioning : xf

For ALUA to work, it is understood that an active/active storage path is required and furthermore that an asymmetrical active/active mechanism is involved. The advantage of ALUA comes from less fragmentation of packet traffic, by routing both paths of the multipath connection, if at all possible, through the same storage array controller, as the extra path through a different controller is less efficient. It is very difficult to locate specific metrics on the overall gains, but hints of up to 20% can be found in on-line articles (e.g., this openBench Labs report on Nexsan), hence this is not an insignificant amount and potentially more significant than the gains reached by implementing jumbo frames. It should be noted that the debate continues to this day regarding the benefits of jumbo frames and to what degree, if any, they are beneficial. Among numerous articles to be found are: The Great Jumbo Frames Debate from Michael Webster, Jumbo Frames or Not - Purdue University Research, Jumbo Frames Comparison Testing, and MTU Issues from ESNet. Each installation environment will have its idiosyncrasies and it is best to conduct tests within one's unique configuration to evaluate such options.

The SCSI Primary Commands standard (SPC-3) of the SCSI Architecture Model defines the commands used to determine paths. The mechanism by which this is accomplished is target port group support (TPGS). The characteristics of a path can be read via an RTPG command or set with an STPG command. With ALUA, non-preferred controller paths are used only for fail-over purposes. This is illustrated in Figure 1, where an optimized network connection is shown in red, taking advantage of routing all the storage network traffic via Node A (e.g., storage controller module 0) to LUN A (e.g., 2).

 


Figure 1.  ALUA connections, with the active/optimized paths to Node A shown as red lines and the active/non-optimized paths shown as dotted black lines.

 

Various SPC commands are provided as utilities within the sg3_utils (SCSI generic) Linux package.

There are other ways to make such queries, for example, VMware has a “esxcli nmp device list” command and NetApp appliances support “igroup” commands that will provide direct information about ALUA-related connections.

Let us first examine a generic Linux server containing ALUA support connected to an ALUA-capable device. In general, note that this will entail a specific configuration to the /etc/multipath.conf file and typical entries, especially for some older arrays or XenServer versions, will use one or more explicit configuration parameters such as:

  • hardware_handler "1 alua"
  • prio "alua"
  • path_checker "alua"

Consulting the Citrix knowledge base article CTX132976, we see for example the EMC Corporation DGC Clariion device makes use of an entry configured as:

        device{
                vendor "DGC"
                product "*"
                path_grouping_policy group_by_prio
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout "/sbin/mpath_prio_emc /dev/%n"
                hardware_handler "1 alua"
                no_path_retry 300
                path_checker emc_clariion
                failback immediate
        }

To investigate the multipath configuration in more detail, we can make use of the TPGS setting. The TPGS setting can be read using the sg_rtpg command. By using multiple “v” flags to increase verbosity and “d” to specify the decoding of the status code descriptor returned for the asymmetric access state, we might see something like the following for one of the paths:

# sg_rtpg -vvd /dev/sde
open /dev/sde with flags=0x802
    report target port groups cdb: a3 0a 00 00 00 00 00 00 04 00 00 00
    report target port group: requested 1024 bytes but got 116 bytes
Report list length = 116
Report target port groups:
  target port group id : 0x1 , Pref=0
    target port group asymmetric access state : 0x01 (active/non optimized)
    T_SUP : 0, O_SUP : 0, U_SUP : 1, S_SUP : 0, AN_SUP : 1, AO_SUP : 1
    status code : 0x01 (target port asym. state changed by SET TARGET PORT GROUPS command)
    vendor unique status : 0x00
    target port count : 02
    Relative target port ids:
      0x01
      0x02
(--snip--)

Noting the boldfaced characters above, we see here specifically that target port ID 1 is an active/non-optimized ALUA path, both from the “target port group id” line as well as from the “status code”. We also see there are two paths identified, with target port IDs 1,1 and 1,2.

There are a slew of additional “sg” commands, such as the sg_inq command, often used with the flag “-p 0x83” to get the VPD (vital product data) page of interest, sg_rdac, etc. The sg_inq command will in general return, in fact, TPGS > 0 for devices that support ALUA. More on that will be discussed later on in this article. One additional command of particular interest, because not all storage arrays in fact support target port group queries (more also on this important point later!), is sg_vpd (sg vital product data fetcher), as it does not require TPG access. The base syntax of interest here is:

sg_vpd -p 0xc9 --hex /dev/…

where “/dev/…” should be the full path to the device in question. Looking at an example of the output of a real such device, we get:

# sg_vpd -p 0xc9 --hex /dev/mapper/mpathb1
Volume access control (RDAC) VPD Page:
00     00 c9 00 2c 76 61 63 31  f1 01 00 01 01 01 00 00    ...,vac1........
10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00    ................
20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00    ................

If one reads the source code for various device handlers (see the multipath tools hardware table for an extensive list of hardware profiles as well as the Linux SCSI device handler regarding how the data are interpreted through the device handler), one can determine that the value of interest here is that of avte_cvp (part of the RDAC c9_inquiry structure), which is the sixth hex value. It indicates that the connected device is using ALUA (known in the RDAC world as IOSHIP mode) if the value shifted right five bits ANDed with 0x1 is 1, or AVT (Automatic Volume Transfer mode) if the value shifted right seven bits ANDed with 0x1 is 1; otherwise the device defaults in general to basic RDAC (legacy) mode. In the case above we see “61” returned (indicated in boldface), so (0x61 >> 5 & 0x1) is equal to 1, and hence the above connection is indeed an ALUA RDAC-based connection.
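As a quick sanity check of that arithmetic, the shell can evaluate the same expressions directly:

# 0x61 = 0110 0001 binary
echo $(( (0x61 >> 5) & 0x1 ))   # prints 1 -> IOSHIP (ALUA) mode
echo $(( (0x61 >> 7) & 0x1 ))   # prints 0 -> not AVT mode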

I will revisit sg commands once again later on. Do note that the sg3_utils package is not installed on stock XenServer distributions, and as with any external package, installing it may void official Citrix support.

MULTIPATH CONFIGURATIONS AND REPORTS

In addition to all the information that various sg commands provide, there is also an abundance of information available from the standard multipath command. We saw a sample multipath.conf file earlier, and at least with many standard Linux OS versions and ALUA-capable arrays, information on the multipath status can be more readily obtained using stock multipath commands.

For example, on an ALUA-enabled connection we might see output similar to the following from a “multipath –ll” command (there will be a number of variations in output, depending on the version, verbosity and implementation of the multipath utility):

mpath2 (3600601602df02d00abe0159e5c21e111) dm-4 DGC,VRAID
[size=100G][features=1 queue_if_no_path][hwhandler=1 alua][rw]
_ round-robin 0 [prio=50][active]
 _ 1:0:3:20  sds   70:724   [active][ready]
 _ 0:0:1:20  sdk   67:262   [active][ready]
_ round-robin 0 [prio=10][enabled]
 _ 0:0:2:20  sde   8:592    [active][ready]
 _ 1:0:2:20  sdx   128:592  [active][ready]

Recalling the device sde from the section above, note that it falls under a path with a lower priority of 10,  indicating it is part of an active, non-optimized network connection vs. 50, which indicates being in an active, optimized group; a priority of “1” would indicate the device is in the standby group. Depending on what mechanism is used to generate the priority values, be aware that these priority values will vary considerably; the most important point is that whatever path has a higher “prio” value will be the optimized path. In some newer versions of the multipath utility, the string “hwhandler=1 alua” shows clearly that the controller is configured to allow the hardware handler to help establish the multipathing policy as well as that ALUA is established for this device. I have read that the path priority will be elevated to typically a value of between 50 and 80 for optimized ALUA-based connections (cf. mpath_prio_alua in this Suse article), but have not seen this consistently.

The multipath.conf file itself has traditionally needed tailoring to each specific device. It is particularly convenient, however, that using a generic configuration is now possible for a device that makes use of the internal hardware handler and is rdac-based and can auto-negotiate an ALUA connection. The italicized entries below represent the specific device itself, but others should now work using this generic sort of connection:

device {
                vendor                  "DELL"
                product                 "MD36xx(i|f)"
                features                "2 pg_init_retries 50"
                hardware_handler        "1 rdac"
                path_selector           "round-robin 0"
                path_grouping_policy    group_by_prio
                failback                immediate
                rr_min_io               100
                path_checker            rdac
                prio                    rdac
                no_path_retry           30
                detect_prio             yes
                retain_attached_hw_handler yes
        }

Note how this differs (the additional entries above are in boldface type) from the “stock” version (in XenServer 6.5) of the MD36xx multipath configuration:

device {
                vendor                  "DELL"
                product                 "MD36xx(i|f)"
                features                "2 pg_init_retries 50"
                hardware_handler        "1 rdac"
                path_selector           "round-robin 0"
                path_grouping_policy    group_by_prio
                failback                immediate
                rr_min_io               100
                path_checker            rdac
                prio                    rdac
                no_path_retry           30
        }

THE CURIOUS CASE OF DELL MD32XX/36XX ARRAY CONTROLLERS

The LSI controllers incorporated into Dell’s MD32xx and MD36xx series of iSCSI storage arrays represent an unusual and interesting case. As promised earlier, we will get back to looking at the sg_inq command, which queries a storage device for several pieces of information, including TPGS. Typically, an array that supports ALUA will return a value of TPGS > 0, for example:

# sg_inq /dev/sda
standard INQUIRY:
PQual=0 Device_type=0 RMB=0 version=0x04 [SPC-2]
[AERC=0] [TrmTsk=0] NormACA=1 HiSUP=1 Resp_data_format=2
SCCS=0 ACC=0 TPGS=1 3PC=1 Protect=0 BQue=0
EncServ=0 MultiP=1 (VS=0) [MChngr=0] [ACKREQQ=0] Addr16=0
[RelAdr=0] WBus16=0 Sync=0 Linked=0 [TranDis=0] CmdQue=1
[SPI: Clocking=0x0 QAS=0 IUS=0]
length=117 (0x75) Peripheral device type: disk
Vendor identification: NETAPP
Product identification: LUN
Product revision level: 811a

Highlighted in boldface, we see in this case above that TPGS is reported to have a value of 1. The MD36xx has supported ALUA since RAID controller firmware 07.84.00.64 and NVSRAM N26X0-784890-904; however, even with that (or a newer) revision level, sg_inq returns the following for this particular storage array:

# sg_inq /dev/mapper/36782bcb0002c039d00005f7851dd65de
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
  [AERC=0]  [TrmTsk=0]  NormACA=1  HiSUP=1  Resp_data_format=2
  SCCS=0  ACC=0  TPGS=0  3PC=1  Protect=0  BQue=0
  EncServ=1  MultiP=1 (VS=0)  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=1  Sync=1  Linked=0  [TranDis=0]  CmdQue=1
  [SPI: Clocking=0x0  QAS=0  IUS=0]
    length=74 (0x4a)   Peripheral device type: disk
 Vendor identification: DELL
 Product identification: MD36xxi
 Product revision level: 0784
 Unit serial number: 142002I

Various attempts to modify the multipath.conf file to try to force TPGS to appear with any value greater than zero all failed. Above all, it seemed that without access to the TPGS command, there was no way to query the device for ALUA-related information.  Furthermore, the command mpath_prio_alua and similar commands appear to have been deprecated in newer versions of the device-mapper-multipath package, and so offer no help.

This proved to be a major roadblock in making any progress. Ultimately it turned out that the key to looking for ALUA connectivity in this particular case comes oddly from ignoring what TPGS reports, and rather focusing on what the MD36xx controller is doing. What is going on here is that the hardware handler is taking over control and the clue comes from the sg_vpd output shown above. To see how a LUN is mapped for these particular devices, one needs to hunt back through the /var/log/messages file for entries that appear when the LUN was first attached. To investigate this for the MD36xx array, we know it uses the internal “rdac” connection mechanism for the hardware handler, so a Linux grep command for “rdac” in the /var/log/messages file around the time the connection was established to a LUN should reveal how it was established.
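For example, on the XenServer host in question:

grep rdac /var/log/messages
# or, to include rotated logs if the LUNs were attached some time ago:
zgrep rdac /var/log/messages*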

Sure enough, in a case where the connection is known not to be making use of ALUA, one might see entries such as these:

[   98.790309] rdac: device handler registered
[   98.796762] sd 4:0:0:0: rdac: AVT mode detected
[   98.796981] sd 4:0:0:0: rdac: LUN 0 (owned (AVT mode))
[   98.797672] sd 5:0:0:0: rdac: AVT mode detected
[   98.797883] sd 5:0:0:0: rdac: LUN 0 (owned (AVT mode))
[   98.798590] sd 6:0:0:0: rdac: AVT mode detected
[   98.798811] sd 6:0:0:0: rdac: LUN 0 (owned (AVT mode))
[   98.799475] sd 7:0:0:0: rdac: AVT mode detected
[   98.799691] sd 7:0:0:0: rdac: LUN 0 (owned (AVT mode))

In contrast, an ALUA-based connection to LUNs on an MD3600i - one with firmware new enough to support ALUA, talking to a client that also supports ALUA and has a properly configured entry in /etc/multipath.conf - will instead show the IOSHIP connection mechanism, as in the entries below (see p. 124 of this IBM System Storage manual for more on I/O Shipping):

Mar 11 09:45:45 xs65test kernel: [   70.823257] scsi 8:0:0:1: rdac: LUN 1 (IOSHIP) (owned)
Mar 11 09:45:46 xs65test kernel: [   71.385835] scsi 9:0:0:0: rdac: LUN 0 (IOSHIP) (unowned)
Mar 11 09:45:46 xs65test kernel: [   71.389345] scsi 9:0:0:1: rdac: LUN 1 (IOSHIP) (owned)
Mar 11 09:45:46 xs65test kernel: [   71.957649] scsi 10:0:0:0: rdac: LUN 0 (IOSHIP) (owned)
Mar 11 09:45:46 xs65test kernel: [   71.961788] scsi 10:0:0:1: rdac: LUN 1 (IOSHIP) (unowned)
Mar 11 09:45:47 xs65test kernel: [   72.531325] scsi 11:0:0:0: rdac: LUN 0 (IOSHIP) (owned)

Hence, we happily recognize that indeed, ALUA is working.

The even better news is that ALUA is not only functional in XenServer 6.5 but should work with a large number of ALUA-capable storage arrays, both those with custom configuration needs and potentially many that work generically. Another surprising find was that for the MD3600i arrays tested, even the "stock" MD36xxi multipath configuration entry provided with XenServer 6.5 creates ALUA connections. The reason is that the hardware handler is used consistently, provided no specific profile overrides intervene, so the storage device largely negotiates the connection itself rather than being driven by the file-based configuration. This is also what made determining ALUA connectivity more difficult: the TPGS setting was never changed from zero and consequently could not be used to query for the target port group settings.
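As an additional cross-check - not specific to this array - the multipath tooling itself can be asked how the paths ended up grouped.  Running the following on the host should list each multipath device along with its path groups, priorities, and the hardware handler in use:

multipath -ll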

CONCLUSIONS

First off, it is really nice to know that many modern storage devices support ALUA and that XenServer 6.5 provides an easier means to leverage this protocol. It is also a lesson that documentation can be hard to find and, in some cases, is in need of updating to reflect the current state of the software. Individual vendors will generally provide specific instructions regarding iSCSI connectivity, and those should of course be followed. Experimentation is best carried out on non-production servers, where a major faux pas will not have catastrophic consequences.

To me, this was also a lesson in persistence as well as an opportunity to share the curiosity and knowledge among a number of individuals who were helpful throughout this process. Above all, among many who deserve thanks, I would like to thank in particular Justin Bovee from Dell and Robert Breker of Citrix for numerous valuable conversations and information exchanges.


History and Syslog Tweaks

Introduction

As XenServer Administrators already know (or will know), there is one user "to rule them all"... and that user is root.  Be it an SSH connection or command-line interaction with DOM0 via XenCenter, while you may be typing commands in RING3 (user space), you are doing it as the root user.

This is quite appropriate for XenServer's architecture: once the bare metal is powered on, one is not booting into the latest "re-spin" of some well-known (or completely obscure) Linux spin.  Quite the opposite.  One is actually booting into the virtualization layer: dom0, or the Control Domain.  This is where the separation of Guest VMs (domUs) and user space programmes (ping, fsck, and even XE) begins... even at the command line for root.

In summary, it is not uncommon for several Administrators to require root access to a XenServer at one time or another.  Thus, this article will show my own means of adding granularity to the HISTORY command as well as logging (via Syslog) each and every root user session.

Assumptions

As BASH is the default shell, this article assumes that one has knowledge of BASH, things "BASH", Linux-based utilities, and so forth.  If one isn't familiar with BASH, how it leverages global and local scripts to set up a user environment, and the like, I have provided the following resources:

  • BASH login scripts : http://www.linuxfromscratch.org/blfs/view/6.3/postlfs/profile.html
  • Terminal Colors : http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x329.html
  • HISTORY command : http://www.tecmint.com/history-command-examples/

Purpose

The purpose I wanted to achieve was not just a cleaner way to look at the history command, but also to log the root user's session information: recording how they accessed the host, what commands they ran, and WHEN.


In short, we go from this:

To this (plus record of each command in /var/log/user.log | /var/log/messages):

What To Do?

First, we want to backup /etc/bashrc to /etc/backup.bashrc in the event one would like to revert to the original HISTORY method, etc.  This can be done via the command-line of the XenServer:

cp /etc/bashrc /etc/backup.bashrc

Secondly, the following addition should be added to the end of /etc/bashrc:

##[ HISTORY LOGGING ]#######################################################
#
# ADD USER LOGGING AND HISTORY COMMAND CONTEXT FOR SOME AUDITING
# DEC 2014, JK BENEDICT
# @xenfomation
#
#########################################################################

# Grab current user's name
export CURRENT_USER_NAME=`id -un`

# Grab current user's level of access: pts/tty/or SSH
export CURRENT_USER_TTY="local `tty`"
checkSSH=`set | grep "^SSH_CONNECTION" | wc -l`

# SET THE PROMPT
if [ "$checkSSH" == "1" ]; then
     export CURRENT_USER_TTY="ssh `set | grep "^SSH_CONNECTION" | awk {' print $1 '} | sed -rn "s/.*?='//p"`"
     export PROMPT_COMMAND='history -a >(tee -a ~/.bash_history | logger -t "HISTORY for $CURRENT_USER_NAME[$$] via $SSH_CONNECTION : ")'
else
     export CURRENT_USER_TTY
     export PROMPT_COMMAND='history -a >(tee -a ~/.bash_history | logger -t "HISTORY for $CURRENT_USER_NAME[$$] via $CURRENT_USER_TTY : ")'
fi

# SET HISTORY SETTINGS
# Lines to retain, ignore dups, time stamp, and user information
# For date variables, check out http://www.computerhope.com/unix/udate.htm
export HISTSIZE=5000
export HISTCONTROL=ignoredups
export HISTTIMEFORMAT=`echo -e "\e[1;31m$CURRENT_USER_NAME\e[0m[$$] via \e[1;35m$CURRENT_USER_TTY\e[0m on \e[0;36m%d-%m-%y %H:%M:%S%n\e[0m       "`

A file providing this addition can be downloaded from https://github.com/xenfomation/bash-history-tweak

What Next?

Well, with the changes added and saved to /etc/bashrc, exit the command-line prompt or SSH session, then log back in to test the changes:

exit

hostname
whoami
history
tail -f /var/log/user.log
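To confirm that each command is also reaching Syslog, one can search for the tag applied by the addition above; a quick check, assuming the root user and the default local log destination:

grep "HISTORY for root" /var/log/user.log | tail -n 5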

... And that is that.  So, while there are 1,000,000 more sophisticated ways to achieve this, I thought I'd share what I have used for a long time... have fun and enjoy!

--jkbs | @xenfomation


XenServer Support Options

Now that Creedence has been released as XenServer 6.5, I'd like to take this opportunity to highlight where to obtain what level of support for your installation.

Commercial Support

Commercial support is available from Citrix and many of its partners. A commercial support contract is appropriate if you're running XenServer in a production environment, particularly if minimizing downtime is a critical component of your SLA. It's important to note that commercial support is only available if the deployment follows the Citrix deployment guidelines, uses third party components from the Citrix Ready Marketplace, and is operated in accordance with the terms of the commercial EULA. Of course, since your deployment might not precisely follow these guidelines, commercial support may not be able to resolve all issues, and that's where community support comes in.

Community Support

Community support is available from the Citrix support forums. The people on the forum are both Citrix support engineers and also your fellow system administrators. They are generally quite knowledgeable and enthusiastic to help someone be successful with XenServer. It's important to note that while the product and engineering teams may monitor the support forums from time to time, engineering level support should not be expected on the community forums.

Developer Support

Developer level support is available from the xs-devel list. This is your traditional development mailing list and really isn't appropriate for general support questions. Many of the key engineers are part of this list, and do engage on topics related to performance, feature development and code level issues. It's important to remember that the XenServer software is actually built from many upstream components, so the best source of information might be an upstream developer list and not xs-devel.

Self-support tool

Citrix maintains a self-support tool called Citrix Insight Services, formerly known as Tools-as-a-Service (TaaS). Insight Services takes a XenServer status report and analyzes it to determine if there are any operational issues present in the deployment. A best practice is to upload a report after installing a XenServer host to determine if any issues are present which can result in latent performance or stability problems. CIS is used extensively by the Citrix support teams, but doesn't require a commercial support contract for end users.

Submitting Defects

If you believe you have encountered a defect or limitation in the XenServer software, simply using one of these support options isn't sufficient for the incident to be added to the defect queue for evaluation. Commercial support users will need to have their case triaged and potentially escalated, with the result potentially being a hotfix. All other users will need to submit an incident report via bugs.xenserver.org. Please be as detailed as possible with any defect reports so that they can be reproduced, and it doesn't hurt to include the URL of any forum discussion or the TaaS ID in your report. Also, please be aware that while the issue may be urgent for you, any potential fix may take some time to be created. If your issue is urgent, you are strongly encouraged to follow the commercial support route, as Citrix escalation engineers have the ability to prioritize customer issues.

Additionally, it's important to point out that submitting a defect or incident report doesn't guarantee it'll be fixed. Some things simply work the way they do for very important reasons; other things may behave the way they do due to the way components interact. XenServer is tuned to provide a highly scalable virtualization platform, and if an incident would require destabilizing that platform, it's unlikely to be changed.


Basic Network Testing with IPERF

Purpose

I am often asked how one can perform simple network testing within, outside, and into XenServer.  This is a great question as – by itself – it is simple enough to answer.  However, depending on what one desires out of “network testing” the answer can quickly become more complex.

As such, I have decided to answer this question using a long-standing, free utility called IPERF (well, IPERF2).  It is a rather simple, straightforward, but powerful utility I have used over many, many years.  Links to IPERF will be provided - along with documentation on its use - as it will serve in this guide as a way to:


- Test bandwidth between two or more points

- Determine bottlenecks

- Assist with black box testing or “what happens if” scenarios

- Use a tool that runs on both Linux and Windows

- And more…

IPERF: A Visual Breakdown

IPERF has to be installed on at least two separate end points.  One point acts as a server/receiver and the other acts as a client/transmitter.  This way, network testing can be done end-to-end - on anything from a simple subnet to a complex, routed network - using TCP or UDP generated traffic:

The visual shows an IPERF client transmitting data over IPv4 to an IPERF receiver.  Packets traverse the network - from wireless routers and through firewalls - from the client side to the server side over port 5001.

IPERF and XenServer

The key to network testing is in remembering that any device which is connected to a network infrastructure – Virtual or Physical – is a node, host, target, end point, or just simply … a networked device.

With regards to virtual machines, XenServer obviously supports Windows and Linux operating systems.  IPERF can be used to test virtual-to-virtual networking as well as virtual-to-physical networking.  If we stack virtual machines in a box to our left and stack physical machines in a box to our right – despite a common subnet or routed network – we can quickly see the permutations of how "Virtual and Physical Network Testing" can be achieved with IPERF transmitting data from one point to another:

And if one wanted, they could just as easily test networking for this:

Requirements

To illustrate a basic server/client model with IPERF, the following will be required:

- A Windows 7 VM that will act as an IPERF client

- A CentOS 5.x VM that will act as a receiver.

- IPERF2 (the latest version of IPERF, or "IPERF3", can be found at https://github.com/esnet/iperf or, more specifically, http://downloads.es.net/pub/iperf/)

The reason for using IPERF2 is quite simple: portability and compatibility on two of the most popular operating systems that I know are virtualized.  In addition, the same steps for installing IPERF2 on these hosts can be carried out on physical systems running similar operating systems.

The remainder of this article - regarding IPERF2 - will require use of the MS-DOS command-line as well as the Linux shell (of choice).  I will carefully explain all commands, so if you are “strictly a GUI” person, you should fit right in.

Disclaimer

When utilizing IPERF2, keep in mind that this is a traffic generator.  While one can control the quantity and duration of traffic, it is still network traffic.

So, consider testing during non-peak hours or after hours so as not to interfere with production-based network activity.

Windows and IPERF

The Windows port of IPERF 2.0.5 requires Windows XP (or greater) and can be downloaded from:

http://sourceforge.net/p/iperf/patches/_discuss/thread/20d4a4b0/5c44/attachment/Iperf.zip

Within the .zip file you will find two directories.  One is labeled DEBUG and the other is labeled RELEASE.  Extract the Iperf.exe program to a directory you will remember, such as C:\iperf\

Now, accessing the command line (cmd.exe), navigate to C:\iperf\ and execute:

iperf

The following output should appear:

Linux and IPERF

If you have additional repos already configured for CentOS, you can simply execute (as root):

yum install iperf

If that fails, one will need to download the Fedora/RedHat EPEL-Release RPM file for the version of CentOS being used.  To do this (as root), execute:

wget  http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
rpm -Uvh epel-release-5-4.noarch.rpm

 

*** Note that the above EPEL-Release RPM file is just an example (a working one) ***

 

Once epel-release-5-4.noarch.rpm is installed, execute:

yum install iperf

And once complete, as root execute iperf and one should see the following output:


Notice that it is the same output as what is being displayed from Windows.  IPERF2 is expecting a "-s" (server) or "-c" (client) command-line option with additional arguments.

IPERF Command-Line Arguments

On either Windows or Linux, a complete list of options for IPERF2 can be listed by executing:

iperf --help

A few good resources of examples to use IPERF2 options for the server or client can be referenced at:

http://www.slashroot.in/iperf-how-test-network-speedperformancebandwidth

http://samkear.com/networking/iperf-commands-network-troubleshooting

http://www.techrepublic.com/blog/data-center/handy-iperf-commands-for-quick-network-testing/

For now, we will focus on the options needed for our server and client:

-f, --format    [kmKM]   format to report: Kbits, Mbits, KBytes, MBytes
-m, --print_mss          print TCP maximum segment size (MTU - TCP/IP header)
-i, --interval  #        seconds between periodic bandwidth reports
-s, --server             run in server mode
-c, --client    <host>   run in client mode, connecting to <host>
-t, --time      #        time in seconds to transmit for (default 10 secs)

Lastly, there is a TCP/IP window setting.  This goes beyond the scope of this document as it relates to TCP windowing of data.  I highly recommend reading either of the two following links - especially for Linux - as there has always been some debate as to what is "best to be used":

https://kb.doit.wisc.edu/wiscnet/page.php?id=11779

http://kb.pert.geant.net/PERTKB/IperfTool
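Should one wish to experiment, both the server and client sides of IPERF2 accept a "-w" option to set the TCP window size; a quick sketch, with the 256K value chosen purely for illustration:

iperf -s -w 256K
iperf -c x.x.x.48 -w 256K -t 30 -f M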

Running An IPERF Test

So, we have IPERF2 installed on Windows 7 and on CentOS 5.10.  Before performing any testing, ensure that antivirus software does not block iperf.exe from running and that port 5001 is open across the network.

Again, another port can be specified, but the default port IPERF2 uses for both client and server is 5001.
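If port 5001 is blocked or already in use, both sides can be pointed at another port with the "-p" option; for example (5002 is arbitrary):

iperf -s -p 5002
iperf -c x.x.x.48 -p 5002 -t 30 -f M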

Server/Receiver Side

The Server/Receiver side will be on the CentOS VM.

Following the options above, we want to execute the following on the CentOS VM to run IPERF2 as a server/receiver, listening for our Windows 7 client machine:

iperf -s -f M -m -i 10

The output should show:

------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 0.08 MByte (default)
------------------------------------------------------------

The TCP window size has been previously commented on and the server is now ready to accept connections (press Control+C to exit).

Client/Transmission Side

Let us now focus on the client side to start sending data from the Windows 7 VM to the CentOS VM.

From Windows 7, the command line to start transmitting data for 30 seconds to our CentOS host (x.x.x.48) is:

iperf -c x.x.x.48 -t 30 -f M

Pressing enter, the traffic flow begins and the output from the client side looks like this:

From the server side, the output looks something like this:

And there we have it – a first successful test from a Windows 7 VM (located on one XenServer) to a CentOS 5.10 VM (located on another XenServer).

Understanding the Results

From either the client side or server side, results are shown by time and average.  The key item to look for from either side is:

0.0-30.0 sec  55828 MBytes  1861 MBytes/sec

Why?  This shows the average over the course of 0.0 to 30.0 seconds in terms of total megabytes transmitted as well as average megabytes of data sent per second.  In addition, since the "-f M" argument was passed as a command-line option, the output is calculated in megabytes accordingly.
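As a quick sanity check of the math: 55828 MBytes transferred over 30 seconds works out to 55828 / 30, or roughly 1861 MBytes per second, which matches the reported average.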

In this particular case, we simply illustrated that from one VM to another VM, we transferred data at 1861 megabytes per second.

*** Note that this test was performed in a local lab with lower-end hardware than what you probably have! ***

--jkbs | @xenfomation

 


Before Electing a New Pool Master

Overview

The following is a reminder of specific steps to take before electing a new pool master - especially in High Availability-enabled deployments.  There are circumstances where this will happen automatically due to High Availability (by design) or in an emergency situation, but nevertheless, the following steps should be taken when deliberately electing a new pool master where High Availability is enabled.

Disable High Availability

Before electing a new master one must disable High Availability.  The reason is quite simple:

If a new host is designated as master with HA enabled, the subsequent processes and transition time can lead HA to see a pool member as down.  It is doing what it is supposed to do in the "mathematical" sense, but in "reality" it is simply confused.

The end result is that HA could either recover after some time or fence hosts as it attempts to apply fault tolerance, in contradiction to the desire to "simply elect a new master".

It is also worth noting that upon recovery - if any Guests which had a mounted ISO are rebooted on another host - "VDI not found" errors can appear even though the VDI itself is fine.  The mounted ISO image is seen as a VDI, and if that resource is not available on the other host, the Guest VM will fail to resume: presenting the generic VDI error.

Steps to Take

HA must be disabled and for safe practice, I always recommend ejecting all mounted ISO images.  The latter can be accomplished by executing the following from the pool master:

xe vm-cd-eject --multiple

As for HA it can be disabled in two ways: via the command-line or from XenCenter.

From the command line of the current pool master, execute:

xe pool-ha-disable
xe pool-sync-database

If desired - just for safe guarding one's work - those commands can be executed on every other pool member.

As for XenCenter, one can select the Pool/Pool Master icon in question and, from the "HA" tab, select the option to disable HA for the pool.
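For reference, the switch itself - once HA is safely disabled - can be performed either in XenCenter or from the command line with the pool-designate-new-master command.  A minimal example, where the UUID is a placeholder for the new master's host UUID:

xe pool-designate-new-master host-uuid=<uuid-of-new-master>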

Workload Balancing

For versions of XenServer utilizing Workload Balancing, it is not necessary to halt Workload Balancing before switching masters.

Now that HA is disabled, switch Pool Masters and, when all servers are in an active state, re-enable HA from XenCenter or from the command line:

xe pool-recover-slaves
xe pool-ha-enable

I hope this is helpful and as always: questions and comments are welcomed!

 

--jkbs | @xenfomation


PowerShell SDK examples

Santiago Cardenas from the Citrix Solutions Lab has written a blog post that caught my eye. It's entitled Scripting: Automating VM operations on XenServer using PowerShell, and in it he describes how the Solutions Lab has been using the XenServer PowerShell SDK to automate XenServer deployments at scale. The thing I found most interesting was that he's included several example scripts for common operations, which could be very useful to other people.

If anyone else has example scripts in any of our five SDK languages (C, C#, Java, Python and PowerShell), and would like to share them with the community, please put a note in the comments below. We would love to link to more examples, and maybe even include them in the distribution.

PS If you're interested in the PowerShell SDK, also check out the blog post that Konstantina Chremmou wrote here in May describing improvements in the SDK since the 6.2 release.


Log Rotation and Syslog Forwarding

A Continuation of Root Disk Management

First, this article is applicable to a XenServer deployment of any size and, secondly, it is a continuation of my previous article regarding XenServer root disk maintenance.  The difference is that - for all XenServer deployments - the topic revolves specifically around Syslog: from tuning log rotation, specifying the number of logs to retain, and leveraging compression, to of course... Syslog forwarding.

All of this is an effort to share tips with new (or seasoned) XenServer Administrators on the options available to ensure necessary Syslog data does not fill a XenServer root disk while ensuring - for certain industry-specific requirements - that log data is retained without sacrifice.

Syslog: A Quick Introduction

So, what is this Syslog?  In short, it is the Unix/Linux equivalent of the Windows Event Log (alongside other logging mechanisms specific to particular applications and Operating Systems).

The slightly longer explanation is that Syslog is not only a daemon, but also a protocol: established long ago for Unix systems to record system and application messages to local disk, as well as offering the ability to forward the same log information to its peers for redundancy, concentration, and to conserve disk space on highly active systems.  For the finer details of the Syslog protocol and daemon, one can review the IETF's specification at http://tools.ietf.org/html/rfc5424.

On a stand-alone XenServer, the Syslog daemon is started on boot; the source, severity, and type of logs it handles - and where to store them - are defined in /etc/syslog.conf.  It is highly recommended that one does not alter this file unless it is necessary and one knows what one is doing.  From boot to reboot, information is stored in various files found under the root disk's /var/log directory.

Taken from a fresh installation of XenServer, the following shows various log files that store information specific to a purpose.  Note that the items in "brown" are sub-directories:

For those seasoned in administering XenServer, it is apparent that between the kernel level and user space there are not that many log files.  However, XenServer is verbose about logging for a very simple reason: collection, analysis, and troubleshooting should an issue arise.

So for a lone XenServer (by default) logs are essentially received by the Syslog daemon and based on /etc/syslog.conf - as well as the source and type of message - stored on the local root file system as discussed:

Within a pooled XenServer environment, things are - for the most part - pretty much the same.  As a pool has a master server, log data for the Storage Manager (as a quick example) is trickled up to the master.  This ensures that while each pool member records log data specific to itself, the master server has the aggregate log data needed to troubleshoot the entire pool from one point.

Log Rotation

Log rotation, or "logrotate", is what ensures that Syslog files in /var/log do not grow out of hand.  Much like Syslog, logrotate utilizes a configuration file to dictate how often, at what size, and whether compression should be used when archiving a particular Syslog file.  The term "archive" here simply means rotating the current log out so that a fresh log can take its place.

Post XenServer installation and before usage, one can measure the amount of free root disk space by executing the following command:

df -h

The output will be similar to the following and the line one should be most concerned with is in bold font:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.0G  1.9G  2.0G  49% /
none                  381M   16K  381M   1% /dev/shm
/opt/xensource/packages/iso/XenCenter.iso
                       52M   52M     0 100% /var/xen/xc-install

One can see from the example that only 49% of the root disk on this XenServer host has been used.  By repeating this process as the implementation ramps up, an administrator can measure how best to tune logrotate's configuration file.  After installation, /etc/logrotate.conf should resemble the following:

# see "man logrotate" for details
# rotate log files weekly
weekly
# keep 4 weeks worth of backlogs
rotate 4
# create new (empty) log files after rotating old ones
create
# uncomment this if you want your log files compressed
#compress
# RPM packages drop log rotation information into this directory
include /etc/logrotate.d
# no packages own wtmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    minsize 1M
    create 0664 root utmp
    rotate 1
}
/var/log/btmp {
    missingok
    monthly
    minsize 1M
    create 0600 root utmp
    rotate 1
}
# system-specific logs may be also be configured here.

In previous versions, /etc/logrotate.conf was set up to retain 999 archived/rotated logs, but as of 6.2 the configuration above is standard.

Before covering the basic premise and purpose of this configuration file, one can see this exact configuration file explained in more detail at http://www.techrepublic.com/article/manage-linux-log-files-with-logrotate/

The options declared in the default configuration are conditions that, when met, rotate logs accordingly:

  1. The first option specifies when to invoke log rotation.  By default this is set to weekly and may need to be adjusted to "daily".  This only swaps log files out for new ones and does not delete any log files.
  2. The second option specifies how long to keep archived/rotated log files on disk.  With "rotate 4" and a weekly schedule, the default is to remove archived/rotated log files after four weeks.  This does delete log files that reach that age.
  3. The third option specifies what to do after rotating a log file out.  The default - which should not be changed - is to create a new/fresh log after rotating out its older counterpart.
  4. The fourth option - which is commented out - also specifies what to do, but this time for the archived log files.  It is highly recommended to remove the comment mark so that archived log files are compressed: saving on disk space.
  5. A fifth option, which is not present in the default configuration, is the "size" option.  This specifies how to handle logs that reach a certain size, such as "size 15M".  This option should be employed: especially if an administrator has SNMP logs that grow exponentially or notices that a particular XenServer's Syslog files are growing faster than logrotate can rotate and dispose of archived files.
  6. The "include" option specifies a sub-directory wherein unique logrotate configurations can be specified for individual log files (a sketch of such a file follows this list).
  7. The remaining portion should be left as is.
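As a point of reference only, a per-log file dropped into /etc/logrotate.d might look like the following.  This is a minimal sketch: the file name, target log, size threshold, and retention count are purely illustrative and should be tuned to one's own environment:

# /etc/logrotate.d/xensource-custom (illustrative example)
/var/log/xensource.log {
    size 15M
    rotate 4
    compress
    missingok
    notifempty
}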


In summary for logrotate, one is advised to measure use of the root disk using "df -h" and to tune logrotate.conf as needed to ensure Syslog does not inadvertently consume available disk space.

And Now: Syslog Forwarding

Again, this is a long-standing feature and one I have been looking forward to explaining, highlighting, and providing examples for.  However, I have had a kind of writer's block for many reasons: mainly that it ties into Syslog, logrotate, and XenCenter, but also that there is a tradeoff.

I mentioned before that Syslog can forward messages to other hosts.  Furthermore, it can forward Syslog messages to other hosts without writing a copy of the log to local disk.  What this means is that a single XenServer or a pool of XenServers can send their log data to a "Syslog Aggregator".

The tradeoff is that one cannot generate a server status report via XenCenter, but must instead gather the logs from the Syslog aggregation server and manually submit them for review.  That being said, one can ensure that low root disk space is not nearly as high a concern on the "Admin Todo List" and can retain vast amounts of log data for a deployment of any size: whether based on industry requirements or, sarcastically, for nostalgic purposes.

The same Syslog and logrotate.conf principles apply to the Syslog aggregator, for what good is a Syslog server if it is not configured properly to ensure it does not fill itself up?  The requirements to instantiate a Syslog aggregation server, configure the forwarding of Syslog messages, and so forth are quite simple:

  1. UDP port 514 must be open on the network
  2. The Syslog aggregation server must be reachable - either by being on the same network segment or not - by each XenServer host
  3. The Syslog aggregation server can be a virtual or physical machine; Windows or Linux-based with either a native Syslog daemon configured to receive external host messages or using a Windows-based Syslog solution offering the same "listening" capabilities.
  4. The Syslog aggregation server must have a static IP assigned to it
  5. The Syslog aggregation server should be monitored and tuned just as if it were Syslog/logrotate on a XenServer
  6. For support purposes, logs should be easily copied/compressed from the Syslog aggregation server - such as using WinSCP, scp, or other tools to copy log data for support's analysis

The quickest means to establish a simple virtual or physical Syslog aggregation server - in my opinion - is to reference the following two links.  These describe the installation of a base Debian-based system with specific intent to leverage Rsyslog for the recording of remote Syslog messages sent to it over UDP port 514 from one's XenServers:

http://www.aboutdebian.com/syslog.htm

http://www.howtoforge.com/centralized-rsyslog-server-monitoring

Alternatively, the following is an all-in-one guide (using Debian) with Syslog-NG:

http://www.binbert.com/blog/2010/04/syslog-server-installation-configuration-debian/
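Whichever guide is followed, the heart of an rsyslog-based aggregator is simply loading the UDP input module and listening on port 514.  A minimal sketch of the relevant lines (using rsyslog's legacy directive syntax, with the rest of the configuration left at its defaults):

# /etc/rsyslog.conf (aggregator side): accept remote Syslog messages over UDP 514
$ModLoad imudp
$UDPServerRun 514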

Once the server is instantiated and ready to record remote Syslog messages, it is time to open XenCenter.  Right click on a pool master or stand-alone XenServer and select "Properties":


In the window that appears - in the lower left-hand corner - is an option for "Log Destination":

To the right, one should notice the default option selected is "Local".  From there, select the "Remote" option and enter the IP address (or FQDN) of the remote Syslog aggregate server as follows:

Finally, select "OK" and the stand-alone XenServer (or pool) will update its Syslog configuration, or more specifically, /var/lib/syslog.conf.  The reason for this is so Elastic Syslog can take over the normal duties of Syslog: forwarding messages to the Syslog aggregator accordingly.

For example, once configured, the local /var/log/kern.log file will state:

Sep 18 03:20:27 bucketbox kernel: Kernel logging (proc) stopped.
Sep 18 03:20:27 bucketbox kernel: Kernel log daemon terminating.
Sep 18 03:20:28 bucketbox exiting on signal 15

Certain logs will still continue to be recorded on the host, so it may be desirable to edit /var/lib/syslog.conf and comment out lines where a "-/var/log/some_filename" destination is specified; lines with "@x.x.x.x" dictate forwarding to the Syslog aggregator.  In the example below, the commented-out lines show where local logging to disk has been suppressed:

# Save boot messages also to boot.log
local7.*             @10.0.0.1
# local7.*         /var/log/boot.log

# Xapi rbac audit log echoes to syslog local6
local6.*             @10.0.0.1
# local6.*         -/var/log/audit.log

# Xapi, xenopsd echo to syslog local5
local5.*             @10.0.0.1
# local5.*         -/var/log/xensource.log

After one - the Administrator - has decided what logs to keep and what logs to forward, Elastic Syslog can be restarted so the changes take effect by executing:

/etc/init.d/syslog restart

Since Elastic Syslog - a part of XenServer - is being utilized, the init script will ensure that Elastic Syslog is bounced and that it is responsible for handling Syslog forwarding, etc.

 

So, with this - I hope you find it useful and as always: feedback and comments are welcomed!

 

--jkbs | @xenfomation

 

 

 


XenServer Root Disk Maintenance

The Basis for a Problem

UPDATE 21-MAR-2015: Thanks to feedback from our community, I have added key notes and additional information to this article.

For all that it does, XenServer has a tiny installation footprint: 1.2 GB (roughly).  That is the modern day equivalent of a 1.44" disk, really.  While the installation footprint is tiny, well, so is the "root/boot" partition that the XenServer installer creates: 4GB in size - no more, no less, and don't alter it! 

The same is also true - during the install process - for the secondary partition that XenServer uses for upgrades and backups:

The point is that this amount of space does not leave much room for log retention, patch files, and other content.  As such, it is highly important to tune, monitor, and perform clean-up operations on a periodic basis.  Without attention, over time hotfix files, syslog files, temporary log files, and other forms of data can accumulate to the point where the root disk becomes full.

UPDATE: If you are wondering where the swap partition is, wonder no more.  For XenServer, swap is file-based and is instantiated during the boot process.  As for the 4GB partitions, never alter their size; upgrades and the like will re-align the partitions to match upstream XenServer release specifications.

One does not want a XenServer (or any server, for that matter) to have a full root disk, as this will lead to a halt of processes - including virtualization - because the full disk will go "read only".  Common symptoms are:

  • VMs appear to be running, but one cannot manage a XenServer host with XenCenter
  • One can ping the XenServer host, but cannot SSH into it
  • If one can SSH into the box, one cannot write or create files: "read only file system" is reported
  • xsconsole can be used, but it returns errors when "actions" are selected

So, while there is a basis for a problem, the following article offers the basis for a solution (with emphasis on regular administration).

Monitoring the Root Disk

Shifting into the first person, I am often asked how I monitor my XenServer root disks.  In short, I utilize tools that are built into XenServer along with my own "Administrative Scripts".  The most basic way to see how much space is available on a XenServer's root disk is to execute the following:

df -h

This command will show the "disk file systems" and the "-h" means "human readable", i.e. Gigs, Megs, etc.  The output should resemble the following; the line of interest is the one mounted on "/":

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.0G  1.9G  1.9G  51% /
none                  299M   28K  299M   1% /dev/shm
/opt/xensource/packages/iso/XenCenter.iso
                       56M   56M     0 100% /var/xen/xc-install

A more "get to the point" way is to run:

df -h | grep "/$" | head -n 1

Which produces the line we are concerned with:

/dev/sda1             4.0G  1.9G  1.9G  51% /

The end result is that we know 51% of the root partition is used.  Not bad, really.  Still, I am a huge fan of automation and will now discuss a simple way that this task can be run - automatically - for each of your XenServers.

What I am providing is essentially a simple BASH script that checks a XenServer's local disk.  If the local disk use exceeds a threshold (which you can change), it will send an alert to XenCenter so the tactics described further in this document can be employed to free up as much space as possible.

Using nano or VI, create a file in the /root/ (root's home) directory called "diskmonitor" and paste in the following content:

#!/bin/bash
# Quick And Dirty Disk Monitoring Utility
# Get this host's UUID
thisUUID=`xe host-list name-label=$HOSTNAME params=uuid --minimal`
# Threshold of disk usage to report on
threshold=75    # an example of how much disk can be used before alerting
# Get disk usage
diskUsage=`df -h | grep "/$" | head -n 1 | awk {' print $5 '} | sed -n -e "s/%//p"`
# Check
if [ $diskUsage -gt $threshold ]; then
     xe message-create host-uuid=$thisUUID name="ROOT DISK USAGE" body="Disk space use has exceeded ${threshold}% (now at ${diskUsage}%) on $HOSTNAME!" priority="1"
fi

After saving this file be sure to make it executable:

chmod +x /root/diskmonitor

The "#!/bin/bash" at the start of this script now becomes imperative as it tells the user space (when called upon) to use the BASH interpreter.

UPDATE: To execute this script manually, one can execute the following command if in the same directory as this script:

./diskmonitor

This convention is used so that scripts can be executed just as if they were a binary/compiled piece of code.  If the "./" prefix is an annoyance, move /root/diskmonitor to /sbin/ -- this will ensure that one can execute diskmonitor without the "dot forward-slash" prefix while in other directories:

mv /root/diskmonitor /sbin/
# Now you should be able to execute diskmonitor from anywhere
diskmonitor

If you move the diskmonitor script make note of where you placed it as this directory will be needed for the cron entry.

For automation of the diskmonitor script one can now leverage cron: adding an entry to root's "crontab" and specify a recurring time diskmonitor should be executed (behind the scenes). 

The following is a basic outline as how to leverage cron so that diskmonitor will be executed four times per day.  Now, if you are looking for more information regarding cron, what it does, and how to configure it for other automation-based task then visit http://www.thegeekstuff.com/2009/06/15-practical-crontab-examples/ for more detailed examples and explanations.

1.  From the XenServer host command-line execute the following to add an entry to crontab for root:

crontab -e

2.  This will open root's crontab in VI or nano (text editors) where one will want to add one of the following lines based on where diskmonitor has been moved to or if it is still located in the /root/ directory:

# If diskmonitor is still located in /root/
00 00,06,12,18 * * * /root/diskmonitor
# OR if it has been moved to the /sbin/ directory
00 00,06,12,18 * * * /sbin/diskmonitor

3.  After saving this, we now have a cron entry that runs diskmonitor at midnight, six in the morning, noon, and six in the evening (military time) every day of every week of every month.  If the script detects that the root drive on a XenServer is > 75% "used" (you can adjust this), it will send an alert to XenCenter, where one can further leverage built-in tools for email notifications, etc.

The following is an example of the output of diskmonitor, but it is apropos to note that the following test was done using a threshold of 50% -- yes, in Creedence there is a bit more free space!  Kudos to Dev!

One can expand upon the script (and XenCenter), but let's focus on a few areas where root disk space can be slowly consumed.

Removing Old Hotfixes

After applying one or more hotfixes to XenServer, copies of each decompressed hotfix are stored in /var/patch.  The main reason for this - in short - is that in pooled environments, hotfixes are distributed from the pool master to each slave to eliminate the need to repeatedly download one hotfix multiplied by the number of hosts in the pool.

The more complex reason is for consistency, for if a host becomes the master of the pool, it must reflect the same content and configuration as its predecessor did and this includes hotfixes.

The following is an example of what the /var/patch/ directory can look like after the application of one or more hotfixes:

Notice the /applied sub-directory?  We never want to remove that. 

UPDATE 21-MAR-2015:  Thanks to Tim, the Community Comments, and my Senior Lead for validating I was not "crazy" in my findings before composing this article: "xe patch-destroy" did not do its job as many commented.  It has been resolved post 6.2, so I thank everyone - especially Dev - for addressing this.

APPROPRIATE REMOVAL:

To appropriately remove these patch files, one should utilize the "xe patch-destroy" command.  While I do not have a "clever" command-line example to take care of all files at once, the following should be run against each file that has a UUID-based naming convention:

cd /var/patch/

xe patch-destroy uuid=<FILENAME, SUCH AS 4d2caa35-4771-ea0e-0876-080772a3c4a7>
(repeat "xe patch-destroy uuid=" command for each file with the UUID convention)

While this is not optimum, especially to run per-host in a pool, it is the prescribed method and as I have a more automated/controlled solution, I will naturally document it.
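That said, the repetition itself can be wrapped in a simple shell loop.  The following is a sketch only - it assumes each UUID-named file in /var/patch corresponds to a patch record - and should be tried on a non-production host first:

cd /var/patch
# act on UUID-named files only; directories such as "applied" are skipped
for f in *-*-*-*-*; do
    [ -f "$f" ] && xe patch-destroy uuid="$f"
done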

EMERGENCY SITUATIONS:

In the event that removal of other contents discussed in this article does not resolve a full root disk issue, the following can be used to remove these patch files.  However, it must be emphasized that a situation could arise wherein the lack of these files will require a re-download and install of said patches:

find /var/patch -maxdepth 1 | grep "[0-9]" | xargs rm -f

Finally, if you are in the middle of applying hotfixes do not perform the removal procedure (above) until all hosts are rebooted, fully patched, and verified as in working order.  This applies for pools - especially - where a missing patch file could throw off XenCenter's perspective of what hotfixes have yet to be installed and for which host.

The /tmp Directory

Plain and simple, the /tmp directory is truly meant for just that: holding temporary data.  Pre-Creedence, one can access a XenServer's command-line and execute the following to see a quantity of ".log" files:

cd /tmp
ls

As visualized (and over time), one can see an accumulation of many, many log files.  Individually these files are small, but collectively... they take up space.

UPDATE 21-MAR-2015:  Again, thanks to everyone, as these logs were always intended to be removed automatically once a Guest VM was started.  As of 6.5 and beyond, this section is irrelevant; on earlier releases, the files can still be cleaned up manually:

cd /tmp/
rm -rf *.log

This will remove only ".log" files so any driver ISO images stored in /tmp (or elsewhere) should be manually addressed.

Compressed Syslog Files

The last item is to remove all compressed Syslog files stored under /var/log.  These usually consume the most disk space and as such, I will be authoring an article shortly to explain how one can tune logrotate and even forward these messages to a Syslog aggregator.

UPDATE:  As a word of advice, we are only looking to clear "*.gz" (compressed/archived) log files.  Once these are deleted, they are gone.  Naturally this means a server status report gathered for collection will lack historical information, so one may consider copying these off to another host (using scp or WinSCP) before following the next steps to remove them under a full root disk scenario.
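If copying them off first is desired, something as simple as scp will do; a hypothetical example, where the destination host and path are placeholders to be substituted with one's own:

scp /var/log/*.gz admin@loghost:/srv/xenserver-log-archive/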

In the meantime, just as before, one can execute the following commands to keep current Syslog files intact but remove old, compressed log files:

cd /var/log/
rm -f *.gz

So For Now...

At this point one has a tool to know when a disk is approaching capacity, as well as methods with which to clean up specific items.  These can be run in an automated or a manual fashion; it is truly up to the admin's style of work.

Please be on the lookout for my next article involving Syslog forwarding, log rotation, and so forth, as this will help a XenServer deployment of any size: especially where regulations for log retention are a strict requirement.

Feel free to post any questions, suggestions, or methods you may even use to ensure XenServer's root disk does not fill up.

 

--jkbs | @xenfomation

 

 

