How do I store ganglia's gmetad's RRDs in a tmpfs?

With a large farm being monitored by ganglia, the central gmetad collector puts an unusual load on its disk.

For instance, at RAL we broke three hard disks in six months. The reason is that gmetad constantly modifies fixed-size RRD files on the hard disk, so the same few disk blocks are constantly thrashed.

To alleviate the problem we moved the RRD databases into a tmpfs filesystem.

To do this:

  • Set up gmetad for ganglia in the usual fashion, with the RRDs being written to /var/lib/ganglia.
  • Make a note of how big the RRDs are becoming. At RAL, for 650 monitored nodes, the RRDs total 300 MB.

  • To your /etc/fstab add a line:
    tmpfs /var/lib/ganglia tmpfs defaults 0 0
    and mount the filesystem:
    mount /var/lib/ganglia

    By default the filesystem will grow as needed, up to half of your physical memory. A size option can also be specified.

  • Now that the data is all in memory, we must protect ourselves from system crashes. To do this we created a directory on disk, /var/lib/ganglia-p, and two scripts:
    1. ganglia-memory2permanent
      rsync -a --delete /var/lib/ganglia/rrds /var/lib/ganglia-p
    2. ganglia-permanent2memory
      rsync -a --delete /var/lib/ganglia-p/rrds /var/lib/ganglia/
  • The first script copies from memory to disk. The second script copies from disk to memory. These were patched into the gmetad init.d script so that the disk copy is restored to memory before gmetad starts up and the memory copy is saved to disk after gmetad shuts down. Our SysV gmetad script.
  • Since the machine may crash at any time, we also added a cron job that runs every 10 minutes and rsyncs the memory to disk again. The cron job must check that gmond is running, or else you risk overwriting the data on disk with blank memory. Our cron entry.
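
The cron safeguard described in the last step could be sketched roughly as follows. This is a minimal illustration, not the RAL cron entry itself: the function and variable names are hypothetical, the paths are the ones used above, and the extra check that the source directory is non-empty is an added assumption rather than something the original setup is known to do.

```shell
#!/bin/sh
# Hypothetical guarded sync from tmpfs to disk.
# Paths follow the FAQ; all names here are illustrative only.
SRC="${SRC:-/var/lib/ganglia/rrds}"
DEST="${DEST:-/var/lib/ganglia-p}"
DAEMON="${DAEMON:-gmond}"   # process whose presence gates the sync

# True when a process with exactly this name is running.
daemon_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# True when the directory exists and contains at least one entry.
dir_has_data() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Copy memory to disk only when both guards pass, so that an empty
# tmpfs (for example just after a reboot) never wipes the disk copy.
sync_to_disk() {
    daemon_running "$DAEMON" || { echo "skip: $DAEMON not running"; return 1; }
    dir_has_data "$SRC" || { echo "skip: $SRC is empty"; return 1; }
    rsync -a --delete "$SRC" "$DEST"
}
```

To use something like this from cron, save it as a script, add a final line that calls sync_to_disk, and schedule it every 10 minutes. Because DAEMON is a variable, the same guard can be pointed at a different process name if that suits your setup better.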

With this in place at RAL, the disk on the ganglia server has been fine, we have not lost any data, and the box is now only CPU bound when displaying pages.
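
For the sizing step, the measured RRD usage can be turned into an explicit size option for the fstab line. The sketch below is only an illustration: the helper names and the 2x headroom factor are assumptions, not RAL's settings. It relies on size= being a standard tmpfs mount option and on du -sk reporting usage in kilobytes.

```shell
#!/bin/sh
# Hypothetical helper: suggest an fstab line for a tmpfs sized from
# the current RRD usage. Names and the headroom factor are illustrative.
RRD_DIR="${RRD_DIR:-/var/lib/ganglia/rrds}"
MOUNT_POINT="${MOUNT_POINT:-/var/lib/ganglia}"
HEADROOM="${HEADROOM:-2}"   # assumption: allow twice the current usage

# Current usage of a directory, rounded up to whole megabytes.
usage_mb() {
    kb=$(du -sk "$1" 2>/dev/null | cut -f1)
    [ -n "$kb" ] || return 1
    echo $(( (kb + 1023) / 1024 ))
}

# Build an fstab line for a tmpfs of the given size in megabytes.
fstab_line() {
    printf 'tmpfs %s tmpfs defaults,size=%sm 0 0\n' "$1" "$2"
}

# Print a candidate line based on what the RRDs occupy right now.
suggest_fstab() {
    used=$(usage_mb "$RRD_DIR") || return 1
    fstab_line "$MOUNT_POINT" $(( used * HEADROOM ))
}
```

With RAL's 300 MB of RRDs and a headroom factor of 2, calling fstab_line /var/lib/ganglia 600 prints the line "tmpfs /var/lib/ganglia tmpfs defaults,size=600m 0 0". The output is only a suggestion to review before editing /etc/fstab.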



Last modified Fri 17 September 2004