All Things Xen

General ramblings regarding Citrix XenServer & its open source counterpart.

XenServer Root Disk Maintenance

The Basis for a Problem

UPDATE 21-MAR-2015: Thanks to feedback from our community, I have added key notes and additional information to this article.

For all that it does, XenServer has a tiny installation footprint: roughly 1.2 GB.  That is the modern-day equivalent of a 1.44 MB floppy disk, really.  While the installation footprint is tiny, so is the "root/boot" partition that the XenServer installer creates: 4GB in size - no more, no less, and don't alter it!

The same is true of the secondary partition that XenServer creates - during the install process - for upgrades and backups: it, too, is 4GB in size.

The point is that this amount of space does not leave much room for log retention, patch files, and other content.  As such, it is highly important to tune, monitor, and perform clean-up operations on a periodic basis.  Without attention, hotfix files, syslog files, temporary log files, and other forms of data will accumulate over time until the root disk becomes full.

UPDATE: If you are wondering where the swap partition is, wonder no more.  For XenServer, swap is file-based and is instantiated during the boot process of XenServer.  As for the 4GB partitions, never alter their size, as upgrades, etc. will re-align the partitions to match upstream XenServer release specifications.

One does not want a XenServer (or any server for that matter) to have a full root disk, as this will lead to a full stop of processes as well as virtualization, for the full disk will go "read only".  Common symptoms are:

  • VMs appear to be running, but one cannot manage a XenServer host with XenCenter
  • One can ping the XenServer host, but cannot SSH into it
  • If one can SSH into the box, one cannot write or create files: "read only file system" is reported
  • xsconsole can be used, but it returns errors when "actions" are selected

So, while there is a basis for a problem, the following article offers the basis for a solution (with emphasis on regular administration).

Monitoring the Root Disk

Shifting into the first person, I am often asked how I monitor my XenServer root disks.  In short, I utilize tools that are built into XenServer along with my own "Administrative Scripts".  The most basic way to see how much space is available on a XenServer's root disk is to execute the following:

df -h

This command reports the free space on each mounted file system ("df" is short for "disk free"), and the "-h" means "human readable", i.e. Gigs, Megs, etc.  The output should resemble the following; the line we care about is the one mounted on "/":

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.0G  1.9G  1.9G  51% /
none                  299M   28K  299M   1% /dev/shm
/opt/xensource/packages/iso/XenCenter.iso
                       56M   56M     0 100% /var/xen/xc-install

A more "get to the point" way is to run:

df -h | grep "/$" | head -n 1

Which produces the line we are concerned with:

/dev/sda1             4.0G  1.9G  1.9G  51% /
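
If only the number is wanted (for scripting, say), df can be asked about "/" directly.  The following is a minimal sketch rather than anything that ships with XenServer: the function name root_usage_percent is my own invention, and it assumes a df that supports the POSIX "-P" flag, which keeps every file system on one line even when the device name is long:

```shell
# Sketch: print the root file system's usage as a bare integer (e.g. 51).
# root_usage_percent is a hypothetical helper name, not a XenServer tool.
root_usage_percent() {
    # -P forces the POSIX output format so each file system stays on one line;
    # passing / limits the report to the file system mounted on /.
    # Line 1 is the header, line 2 is the data; column 5 is the "Use%" value.
    df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

root_usage_percent
```

Because the percent sign is stripped, the result can be compared against a threshold with a plain integer test.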

The end result is that we know 51% of the root partition is used.  Not bad, really.  Still, I am a huge fan of automation and will now discuss a simple way that this task can be run - automatically - for each of your XenServers.

What I am providing is essentially a simple BASH script that checks a XenServer's local disk.  If the local disk use exceeds a threshold (which you can change), it will send an alert to XenCenter so that the tactics described further in this document can be employed to free as much space as possible.

Using nano or VI, create a file in the /root/ (root's home) directory called "diskmonitor" and paste in the following content:

#!/bin/bash
# Quick And Dirty Disk Monitoring Utility
# Get this host's UUID
thisUUID=$(xe host-list name-label="$HOSTNAME" params=uuid --minimal)
# Threshold of disk usage (in percent) to report on
threshold=75    # an example of how much disk can be used before alerting
# Get the root disk usage as a bare number (the "%" sign is stripped)
diskUsage=$(df -h | grep "/$" | head -n 1 | awk '{ print $5 }' | sed -e 's/%//')
# Alert XenCenter if the usage is above the threshold
if [ "$diskUsage" -gt "$threshold" ]; then
     xe message-create host-uuid="$thisUUID" name="ROOT DISK USAGE" body="Root disk usage is at ${diskUsage}% on $HOSTNAME (threshold: ${threshold}%)" priority="1"
fi

After saving this file be sure to make it executable:

chmod +x /root/diskmonitor

The "#!/bin/bash" at the start of this script now becomes imperative, as it tells the system to run the script with the BASH interpreter when the file is executed directly.

UPDATE: To execute this script manually, one can execute the following command if in the same directory as this script:

./diskmonitor

This convention is used so that scripts can be executed just as if they were a binary/compiled piece of code; the "./" prefix is needed because the current directory is not normally part of the command search path (PATH).  If the "./" prefix is an annoyance, move /root/diskmonitor to /sbin/ -- this will ensure that one can execute diskmonitor without the "dot forward-slash" prefix while in other directories:

mv /root/diskmonitor /sbin/
# Now you should be able to execute diskmonitor from anywhere
diskmonitor

If you move the diskmonitor script make note of where you placed it as this directory will be needed for the cron entry.

For automation of the diskmonitor script one can now leverage cron by adding an entry to root's "crontab" that specifies the recurring times at which diskmonitor should be executed (behind the scenes).

The following is a basic outline of how to leverage cron so that diskmonitor will be executed four times per day.  Now, if you are looking for more information regarding cron, what it does, and how to configure it for other automation-based tasks, then visit http://www.thegeekstuff.com/2009/06/15-practical-crontab-examples/ for more detailed examples and explanations.

1.  From the XenServer host command-line execute the following to add an entry to crontab for root:

crontab -e

2.  This will open root's crontab in VI or nano (text editors) where one will want to add one of the following lines based on where diskmonitor has been moved to or if it is still located in the /root/ directory:

# If diskmonitor is still located in /root/
00 00,06,12,18 * * * /root/diskmonitor
# OR if it has been moved to the /sbin/ directory
00 00,06,12,18 * * * /sbin/diskmonitor

3.  After saving this, we now have a cron entry that runs diskmonitor at midnight, six in the morning, noon, and six in the evening (cron uses 24-hour time) for every day of every week of every month.  If the script detects that the root drive on a XenServer is > 75% "used" (you can adjust this), it will send an alert to XenCenter, where one can further leverage built-in tools for email notifications, etc.
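
For those who would rather not edit the crontab by hand on every host, the entry can also be added from the command line.  What follows is only a sketch under a few assumptions: add_cron_entry is a hypothetical helper name of my own, the script is assumed to live at /root/diskmonitor, and the host's crontab command is assumed to accept "-" to read a new crontab from standard input.  The helper only filters text, so it appends the entry once and leaves an existing entry alone:

```shell
# Sketch: append a cron entry to crontab text read from stdin, unless it is already there.
# add_cron_entry is a hypothetical helper name used for illustration only.
add_cron_entry() {
    entry=$1
    existing=$(cat)                      # the current crontab, as text
    if printf '%s\n' "$existing" | grep -Fqx -- "$entry"; then
        printf '%s\n' "$existing"        # entry already present: change nothing
    else
        [ -n "$existing" ] && printf '%s\n' "$existing"
        printf '%s\n' "$entry"           # append the new entry at the end
    fi
}

# Example use on a host (assumes diskmonitor is still in /root/):
#   crontab -l 2>/dev/null | add_cron_entry '0 0,6,12,18 * * * /root/diskmonitor' | crontab -
```

Running the example a second time leaves root's crontab unchanged, which makes it reasonably safe to push out to several hosts.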

The following is an example of the output of diskmonitor, but it is apropos to note that the following test was done using a threshold of 50% -- yes, in Creedence there is a bit more free space!  Kudos to Dev!

One can expand upon the script (and XenCenter), but let's focus on a few areas where root disk usage can be slowly consumed.

Removing Old Hotfixes

After applying one or more hotfixes to XenServer, copies of each decompressed hotfix are stored in /var/patch.  The main reason for this - in short - is that in pooled environments, hotfixes are distributed from a host master to each host slave to eliminate the need to repetitively download one hotfix multiplied by the number of hosts in a pool. 

The more complex reason is for consistency, for if a host becomes the master of the pool, it must reflect the same content and configuration as its predecessor did and this includes hotfixes.

The following is an example of what the /var/patch/ directory can look like after the application of one or more hotfixes:

Notice the /applied sub-directory?  We never want to remove that. 

UPDATE 21-MAR-2015:  Thanks to Tim, the Community Comments, and my Senior Lead for validating I was not "crazy" in my findings before composing this article: "xe patch-destroy" did not do its job as many commented.  It has been resolved post 6.2, so I thank everyone - especially Dev - for addressing this.

APPROPRIATE REMOVAL:

To appropriately remove these patch files, one should utilize the "xe patch-destroy" command.  While I do not have a "clever" command-line example to take care of all files at once, the following should be run against each file that has a UUID-based naming convention:

cd /var/patch/

xe patch-destroy uuid=<FILENAME, SUCH AS 4d2caa35-4771-ea0e-0876-080772a3c4a7>
(repeat "xe patch-destroy uuid=" command for each file with the UUID convention)
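
The repetition can be wrapped in a loop.  Treat the following as an untested-in-production sketch, not a prescribed procedure: destroy_patches and DRY_RUN are names I made up, the UUID pattern is inferred from the example file name above, and by default the function only prints the commands it would run so they can be reviewed first:

```shell
# Sketch: issue "xe patch-destroy" for every UUID-named file in a patch directory.
# destroy_patches and DRY_RUN are hypothetical names used for illustration only.
destroy_patches() {
    patch_dir=${1:-/var/patch}
    for f in "$patch_dir"/*; do
        [ -f "$f" ] || continue          # regular files only; skips the applied/ sub-directory
        name=$(basename "$f")
        # Only act on names shaped like 4d2caa35-4771-ea0e-0876-080772a3c4a7
        if printf '%s\n' "$name" | grep -Eq '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
            if [ "${DRY_RUN:-1}" = "1" ]; then
                echo "xe patch-destroy uuid=$name"      # default: print, do not execute
            else
                xe patch-destroy uuid="$name"
            fi
        fi
    done
}
```

Calling "destroy_patches" prints one command per patch file; "DRY_RUN=0 destroy_patches" would actually run them.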

While this is not optimum, especially to run per-host in a pool, it is the prescribed method, and once I have a more automated/controlled solution, I will naturally document it.

EMERGENCY SITUATIONS:

In the event that removal of other contents discussed in this article does not resolve a full root disk issue, the following can be used to remove these patch files.  However, it must be emphasized that a situation could arise wherein the lack of these files will require a re-download and install of said patches:

find /var/patch -maxdepth 1 | grep "[0-9]" | xargs rm -f

Finally, if you are in the middle of applying hotfixes do not perform the removal procedure (above) until all hosts are rebooted, fully patched, and verified as in working order.  This applies for pools - especially - where a missing patch file could throw off XenCenter's perspective of what hotfixes have yet to be installed and for which host.
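
Before running the emergency removal, it can be worth looking at exactly what would be deleted.  Here is a small sketch for that purpose; preview_patch_removal is a hypothetical name, and I have assumed a find that supports -maxdepth (as the command above already does).  It matches regular files whose names contain a digit, so the applied sub-directory is never listed:

```shell
# Sketch: list the files the emergency removal would delete, without deleting anything.
# preview_patch_removal is a hypothetical helper name used for illustration only.
preview_patch_removal() {
    patch_dir=${1:-/var/patch}
    # -maxdepth 1 stays at the top level of the directory;
    # -type f restricts the match to regular files, so directories are ignored;
    # -name '*[0-9]*' keeps only names that contain at least one digit.
    find "$patch_dir" -maxdepth 1 -type f -name '*[0-9]*' -print
}
```

If the list looks right, the same find expression can be given "-exec rm -f {} +" in place of "-print".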

The /tmp Directory

Plain and simple, the /tmp directory is truly meant for just that: holding temporary data.  Pre-Creedence, one can access a XenServer's command-line and execute the following to see a quantity of ".log" files:

cd /tmp
ls

As visualized (and over time), one can see an accumulation of many, many log files.  Granted, these are small individually, but collectively... they take up space.

UPDATE 21-MAR-2015:  Again, thanks to everyone, as these logs were always intended to be "removed" automatically once a Guest VM was started.  So, as of 6.5 and beyond -- this section is irrelevant!  On earlier releases, the log files can be removed as follows:

cd /tmp/
rm -f *.log

This will remove only ".log" files so any driver ISO images stored in /tmp (or elsewhere) should be manually addressed.
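
A gentler variation is to remove only the ".log" files that have not been touched for a while, so that logs belonging to a recently started Guest VM are left alone.  This is a sketch of the idea only: clean_old_logs is a hypothetical name, the seven-day default is an arbitrary example, and a find with -maxdepth and -mtime support is assumed:

```shell
# Sketch: delete *.log files in a directory that are older than a given number of days.
# clean_old_logs is a hypothetical helper name; the 7-day default is an arbitrary example.
clean_old_logs() {
    dir=${1:-/tmp}
    days=${2:-7}
    # -maxdepth 1 keeps find out of sub-directories; -type f ignores directories;
    # -mtime +N matches files whose last modification is more than N days old.
    find "$dir" -maxdepth 1 -type f -name '*.log' -mtime +"$days" -exec rm -f {} +
}
```

For example, "clean_old_logs /tmp 14" would keep two weeks of log files and leave ISO images and everything else in /tmp untouched.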

Compressed Syslog Files

The last item is to remove all compressed Syslog files stored under /var/log.  These usually consume the most disk space and as such, I will be authoring an article shortly to explain how one can tune logrotate and even forward these messages to a Syslog aggregator.

UPDATE:  As a word of advice, we are only looking to clear "*.gz" (compressed/archived) log files.  Once these are deleted, they are gone.  Naturally this means a server status report gathered afterwards will lack historical information, so one may consider copying these off to another host (using scp or WinSCP) before following the next steps to remove them under a full root disk scenario.

In the meantime, just as before, one can execute the following command to keep current syslog files intact, but remove old, compressed log files:

cd /var/log/
rm -f *.gz
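
If the history matters, the copy-then-delete advice above can be scripted so that nothing is removed unless it was copied first.  The following is a sketch under assumptions of my own: archive_gz_logs is a hypothetical name, and the destination is assumed to be a directory that does not live on the root disk (a mounted share, for instance); copying to another host with scp would follow the same pattern:

```shell
# Sketch: copy compressed logs elsewhere, then delete only the ones that copied successfully.
# archive_gz_logs is a hypothetical helper name; the destination is supplied by the caller.
archive_gz_logs() {
    src=${1:-/var/log}
    dest=$2
    [ -d "$dest" ] || { echo "destination directory not found: $dest" >&2; return 1; }
    for f in "$src"/*.gz; do
        [ -f "$f" ] || continue          # nothing to do when no .gz files exist
        # The original is removed only if cp reports success
        cp -p "$f" "$dest"/ && rm -f "$f"
    done
}
```

As an example, "archive_gz_logs /var/log /mnt/logarchive" would move the compressed logs to a share mounted at /mnt/logarchive (a made-up path) while leaving the current, uncompressed logs in place.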

So For Now...

At this point one has a tool to know when a disk is approaching capacity and methods with which to clean up specific items.  These can be run by the admin in an automated fashion or a manual fashion.  It is truly up to the admin's style of work.

Please be on the lookout for my next article involving Syslog forwarding, logrotation, and so forth as this will help any size deployment of XenServer: especially where regulations for log retention are a strict requirement.

Feel free to post any questions, suggestions, or methods you may even use to ensure XenServer's root disk does not fill up.

 

--jkbs | @xenfomation

 

 

 

Comments 51

Guest - Heinrich Huber on Tuesday, 16 September 2014 09:14

Thanks, that really helped me.
I almost ran out of space, not knowing what can be deleted safely (except compressed update files). Now I'm at least at 65% disk usage - not so bad.

Best regards
HHU

JK Benedict on Saturday, 27 September 2014 08:55

Just as an update, Heinrich - my Beta 3 system is at 48% post-install and with a PV/HVM Debian 7 Guest running (I will be posting how to achieve that today)!

--jkbs | @xenfomation

JK Benedict on Tuesday, 16 September 2014 13:05

Heinrich,

Quite welcome, sir!! Different versions of XenServer naturally leave different footprints, but 60-65% is where my systems hover at. This provides me ample room for monitoring scripts to alert me if, say a specific log begins growing fast.

I will be working with Mr. Mackey to throw my Syslog Forwarding article up as soon as I populate the screenshots and explain the dependency of a remote Syslog server/aggregator!

Thanks for the comments and if you have any questions, just ask!

Cheers!
-jkbs @xenfomation

Guest - David Reade on Tuesday, 16 September 2014 10:20

When I run that script, it claims the UUID is invalid?

JK Benedict on Tuesday, 16 September 2014 13:16

Yup: my error. I grew this simple script to be modular: for pools and other data.

The correct syntax for the "xe message" line needs to be:

     xe message-create host-uuid=$thisUUID name="ROOT DISK USAGE" body="Disk space use has exceeded $diskUsage on `echo $HOSTNAME`!" priority="1"

Good catch, David! I salute you!

-jkbs @xenfomation

Guest - David Reade on Tuesday, 16 September 2014 15:41

You're welcome, I'll try again. Thanks for the other tips; whenever I had a 96% warning in XenCenter, I only deleted the .gz files in /var/log. However deleting the files under /var/patch and /tmp has earned me another 0.5GB! Much appreciated! :)

JK Benedict on Saturday, 27 September 2014 08:56

You are very welcome and to you, David, as well as everyone else - please pardon the edits I had to make initially. Life is fast, 6.5 is only becoming more epic every day and as such - it is comments that each of you leave that make my day!

--jkbs | @xenfomation

JK Benedict on Tuesday, 16 September 2014 13:01

David,

As soon as I hit the office I will triple check it as it stems from my own monitoring tools I have. I will be certain to update the post and solution here! Thanks for the heads up!!!

Guest - Christof Giesers on Tuesday, 16 September 2014 16:36

Nice one - with a little mistake:
00 00,06,12,18 * * * ./root/diskcleanup
That should be:
00 00,06,12,18 * * * ./root/diskmonitor
;-)

Regards
- Christof

JK Benedict on Wednesday, 17 September 2014 14:49

And this, Christof is a fine example of time compression and having toooooo many drafts based. Well, that and flat out typos in converting what I test locally into minimalistic public consumption at their own doom! :D

Thanks for catching this as I am updating this document now and again, I greatly appreciate your feedback and the ability to reach out to the community via XenServer.org (thanks, Mr. Mackey!).

--jkbs | @xenfomation

Tobias Kreidl on Tuesday, 16 September 2014 18:15

Another good article, Jesse, thank you!

One option to reduce space is to also change the pertinent rotatelog script to either retain fewer versions or to force compression. That way, you don't have to manually do the gzips.
-=Tobias

JK Benedict on Wednesday, 17 September 2014 14:47

Thanks Tobias and again, SPOT ON! I have been working on the follow up to this -- with screenshots, etc -- to tune logrotate as well as leverage Syslog Forwarding from XenServer/XenCenter. I have to add screen shots as well as a few updates to this article alluding to this as so, only under emergency situations, the community should have to nuke *.gz files from /var/log/

The trade off with Syslog forwarding is, well, gathering logs for analysis. Still, I find that trivial under the premise of regulatory compliance and the need to retain logs for auditing, etc.

As always, you are the scholar and I greatly appreciate the feedback from everyone as again, I am trying to convert previous blogs, Alpha/Beta content, and methods (unsupported, but never-the-less useful for the communities own testing) to run HVMPVs in 6.2 and much, much more!

--jkbs | @xenfomation

Tobias Kreidl on Wednesday, 17 September 2014 21:00

That's very true about syslog forwarding. You can, of course, get the best of both worlds by both forwarding and retaining a local copy of logs and just compressing the heck out of them on your XenServer. If you ever need to do a serious analysis, you have the means to show that both sets of logs are identical, as well as the potential to do the analysis on an external machine so you don't have to chew up CPU cycles on the XenServer itself. It's a win-win situation for all.
-=Tobias

JK Benedict on Wednesday, 22 October 2014 00:08

I don't know how I missed this as there are alternative methods for compression -- if I am not mistaken -- and 7-Zip is the first one that comes to mind, good sir!

JK Benedict on Wednesday, 17 September 2014 01:44

Thanks everyone for the feedback -- tons of information in my head trying hash out. I will have an update to this article ASAP for corrections, etc.

--jkbs | @xenfomation

Guest - lp on Wednesday, 17 September 2014 15:39

My tmp directory also has metadata.new and metadata.old files. What are these used for and are the ones with .old designation safe to delete?

Thanks,
LP

JK Benedict on Monday, 22 September 2014 05:34

LP -

I have not determined the source of these -- what version of XenServer are you working with currently? Also, are you on Twitter as I'd like to arrange a means to view these files.

In short - don't delete them :)

--jkbs | @xenfomation

JK Benedict on Thursday, 18 September 2014 01:13

LP,

Much of the items in the /tmp directory (specifically logs) are not meant to stick around, but as you can see -- sometimes they do. I've been talking to senior colleagues about this - not that it is an error or issue, but why they clear themselves at arbitrary times.

I have not seen this in Creedence, but I have seen it for a long time. My curiosity finally has gotten the best of me, so I have been pinging colleagues as to why the files appear, linger, and go away at times. These files are simply related to domU/Guest VM being established (along with stunnel logs).

As for the metadata files, please do not delete these until I find out what version you have as well as testing on my system first! I can't believe I am saying this, but I had never noticed those before!

Keep me posted on the version of XenServer you are running so I can check my /tmp directory.

--jkbs | @xenfomation

Guest - j jeffries on Friday, 19 September 2014 11:07

Attempts to run 'xe patch-destroy uuid=' give me this error:

The uuid you supplied was invalid.
type: pool_patch


About XenServer

XenServer is the leading open source virtualization platform, powered by the Xen Project hypervisor and the XAPI toolstack. It is used in the world's largest clouds and enterprises.
 
Commercial support for XenServer is available from Citrix.