By Tobias Kreidl and Duane Booher
Northern Arizona University, Information Technology Services
Over eight years ago, back in the days of XenServer 5, not a lot of backup and restore options were available, either as commercial products or as freeware, and we quickly came to the realization that data recovery was a vital component to a production environment and hence we needed an affordable and flexible solution. The conclusion at the time was that we might as well build our own, and though the availability of options has grown significantly over the last number of years, we’ve stuck with our own home-grown solution which leverages Citrix XenServer SDK and XenAPI (http://xenserver.org/partners/developing-products-for-xenserver.html). Early versions were created from the contributions of Douglas Pace, Tobias Kreidl and David McArthur. During the last several years, the lion’s share of development has been performed by Duane Booher. This article discusses the latest 3.0 release.
A Bit of History
With the many alternatives now available, one might ask why we have stuck with this rather un-flashy script and CLI-based mechanism. There are clearly numerous reasons. For one, in-house products allow total control over all aspects of their development and support. The financial outlay is all people’s time and since there are no contracts or support fees, it’s very controllable and predictable. We also found from time-to-time that various features were not readily available in other sources we looked at. We also felt early on as an educational institution that we could give back to the community by freely providing the product along with its source code; the most recent version is available via GitHub at https://github.com/NAUbackup/VmBackup for free under the terms of the GNU General Public License. There was a write-up at https://www.citrix.com/blogs/2014/06/03/another-successful-community-xenserver-sdk-project-free-backup-tools-and-scripts-naubackup-restore-v2-0-released/ when the first GitHub version was published. Earlier versions were made available via the Citrix community site (Citrix Developer Network), sometimes referred to as the Citrix Code Share, where community contributions were published for a number of products. When that site was discontinued in 2013, we relocated the distribution to GitHub.
Because we “eat our own dog food,” VMbackup gets extensive and constant testing because we rely on it ourselves as the means to create backups and provide for restores for cases of accidental deletion, unexpected data corruption, or in the event that disaster recovery might be needed. The mechanisms are carefully tested before going into production and we perform frequent tests to ensure the integrity of the backups and that restores really do work. A number of times, we have relied on resorting to recovering from our backups and it has been very reassuring that these have been successful.
What VMbackup Does
Very simply, VMbackup provides a framework for backing up virtual machines (VMs) hosted on XenServer to an external storage device, as well as the means to recover such VMs for whatever reason that might have resulted in loss, be it disaster recovery, restoring an accidentally deleted VM, recovering from data corruption, etc.
The VMbackup distribution consists of a script written in Python and a configuration file. Other than a README document file, that’s it other than the XenServer SDK components which one needs to download separately; see http://xenserver.org/partners/developing-products-for-xenserver.html for details. There is no fancy GUI to become familiar with, and instead, just a few simple things that need to be configured, plus a destination for the backups needs to be made accessible (this is generally an NFS share, though SMB/CIFS will work, as well). Using cron job entries, a single host or an entire pool can be set up to perform periodic backups. Configurations on individual hosts in a pool are needed in that the pool master performs the majority of the work and it can readily change to a different XenServer, while individual host-based instances are also needed when local storage is also made use of, since access to any local SRs can only be performed from each individual XenServer. A cron entry and numerous configuration examples are given in the documentation.
To avoid interruptions of any running VMs, the process of backing up a VM follows these basic steps:
- A snapshot of the VM and its storage is made
- Using the xe utility vm-export, that snapshot is exported to the target external storage
- The snapshot is deleted, freeing up that space
In addition, some VM metadata are collected and saved, which can be very useful in the event a VM needs to be restored. The metadata include:
- vm.cfg - includes name_label, name_description, memory_dynamic_max, VCPUs_max, VCPUs_at_startup, os_version, orig_uuid
- DISK-xvda (for each attached disk)
- vbd.cfg - includes userdevice, bootable, mode, type, unplugable, empty, orig_uuid
- vdi.cfg - includes name_label, name_description, virtual_size, type, sharable, read_only, orig_uuid, orig_sr_uuid
- VIFs (for each attached VIF)
- vif-0.cfg - includes device, network_name_label, MTU, MAC, other_config, orig_uuid
An additional option is to create a backup of the entire XenServer pool metadata, which is essential in dealing with the aftermath of a major disaster that affects the entire pool. This is accomplished via the “xe pool-dump-database” command.
In the event of errors, there are automatic clean-up procedures in place that will remove any remnants plus make sure that earlier successful backups are not purged beyond the specified number of “good” copies to retain.
There are numerous configuration options that allow to specify which VMs get backed up, how many backup versions are to be retained, whether the backups should be compressed (1) as part of the process, as well as optional report generation and notification setups.
New Features in VMbackup 3.0
A number of additional features have been added to this latest 3.0 release, adding flexibility and functionality. Some of these came about because of the sheer number of VMs that needed to be dealt with, SR space issues as well as with changes coming to the next XenServer release. These additions include:
- VM “preview” option: To be able to look for syntax errors and ensure parameters are being defined properly, a VM can have a syntax check performed on it and if necessary, adjustments can then be made based on the diagnosis to achieve the desired configuration.
- Support for VMs containing spaces: By surrounding VM names in the configuration file with double quotes, VM names containing spaces can now be processed.
- Wildcard suffixes: This very versatile option permits groups of VMs to be configured to be handled similarly, eliminating the need to create individual settings for every desired VM. Examples include “PRD-*”, “SQL*” and in fact, if all VMs in the pool should be backed up, even “*”. There are however, a number of restrictions on wildcard usage (2).
- Exclude VMs: Along with the wildcard option to select which VMs to back up, clearly a need arises to provide the means to exclude certain VMs (in addition to the other alternative, which is simply to rename them such that they do not match a certain backup set). Currently, each excluded VM must be named separately and any such VMs should de defined at the end of the configuration file.
- Export the OS disk VDI, only: In some cases, a VM may contain multiple storage devices (VDIs) that are so large that it is impractical or impossible to take a snapshot of the entire VM and its storage. Hence, we have introduced the means to backup and restore only the operating system device (assumed to be Disk 0). In addition to space limitations, some storage, such as DB data, may not be able to be reliably backed up using a full VM snapshot. Furthermore, the next XenServer release (Dundee) will likely support up to as many as perhaps 255 storage devices per VM, making a vm-export even more involved under such circumstances. Another big advantage here is that currently, this process is much more efficient and faster than a VM export by a factor of three or more!
- Root password obfuscation: So that clear-text passwords associated with the XenServer pool are not embedded in the scripts themselves, the password can be basically encoded into a file.
The mechanism for a running VM from which only the primary disk is to be backed up is similar to the full VM backup. The process of backing up such a VM follows these basic steps:
- A snapshot of just the VM's Disk 0 storage is made
- Using the xe utility vdi-export, that snapshot is exported to the target external storage
- The snapshot is deleted, freeing up that space
As with the full VM export, some metadata for the VM are also collected and saved for this VDI export option.
These added features are of course subject to change in future releases, though typically later editions generally encompass the support of previous versions to preserve backwards compatibility.
Let’s look at the configuration file weekend.cfg:
# Weekend VMs
Comment lines start with a hash mark and may be contained anywhere with the file. The hash mark must appear as the first character in the line. Note that the default number of retained backups is set here to four. The destination directory is set next, indicating where the backups will be written to. We then see a case where only the OS disk is being backed up for the specific VM "PROD-CentOS7-large-user-disks" and below that, all VMs beginning with “PROD” are backed up using the default settings. Just below that, a definition is created for all VMs starting with "DEV-RH" and the default number of backups is reduced for all of these from the global default of four down to three. Finally, we see two excludes for specific VMs that fall into the “PROD*” group that should not be backed up at all.
To launch the script manually, you would issue from the command line:
./VmBackup.py password weekend.cfg
To launch the script via a cron job, you would create a single-line entry like this:
10 0 * * 6 /usr/bin/python /snapshots/NAUbackup/VmBackup.py password
/snapshots/NAUbackup/weekend.cfg >> /snapshots/NAUbackup/logs/VmBackup.log 2>&1
This would run the task at ten minutes past midnight on Saturday and create a log entry called VmBackup.log. This cron entry would need to be installed on each host of a XenServer pool.
It can be helpful to break up when backups are run so that they don’t all have to be done at once, which may be impractical, take so long as to possibly impact performance during the day, or need to be coordinated with when is best for specific VMs (such as before or after patches are applied). These situations are best dealt with by creating separate cron jobs for each subset.
There is a fair load on the server, comparable to any vm-export, and hence the queue is processed linearly with only one active snapshot and export sequence for a VM being run at a time. This is also why we suggest you perform the backups and then asynchronously perform any compression on the files on the external storage host itself to alleviate the CPU load on the XenServer host end.
For even more redundancy, you can readily duplicate or mirror the backup area to another storage location, perhaps in another building or even somewhere off-site. This can readily be accomplished using various copy or mirroring utilities, such as rcp, sftp, wget, nsync, rsync, etc.
This latest release has been tested on XenServer 6.5 (SP1) and various beta and technical preview versions of the Dundee release. In particular, note that the vdi-export utility, while dating back a while, is not well documented and we strongly recommend not trying to use it on any XenServer release before XS 6.5. Doing so is clearly at your own risk.
The NAU VMbackup distribution can be found at: https://github.com/NAUbackup/VmBackup
This is a misleading heading, as there is not really a conclusion in the sense that this project continues to be active and as long as there is a perceived need for it, we plan to continue working on keeping it running on future XenServer releases and adding functionality as needs and resources dictate. Our hope is naturally that the community can make at least as good use of it as we have ourselves.
- Alternatively, to save time and resources, the compression can potentially be handled asynchronously by the host onto which the backups are written, hence reducing overhead and resource utilization on the XenServer hosts, themselves.
- Certain limitations exist currently with how wildcards can be utilized. Leading wildcards are not allowed, nor are multiple wildcards within a string. This may be enhanced at a later date to provide even more flexibility.
This article was written by Tobias Kreidl and Duane Booher, both of Northern Arizona University, Information Technology Services. Tobias' biography is available at this site, and Duane's LinkedIn profile is at https://www.linkedin.com/in/duane-booher-a068a03 while both can also be found on http://discussions.citrix.com primarily in the XenServer forum.