GridPP dCache GIP plugin

Steve Traylen and Derek Ross at the RAL Tier-1 created a dCache GIP plugin as an alternative to the one that comes with a standard dCache installation. It was written before a plugin was supplied with dCache. It provides a more accurate representation of the storage used by each VO: current restrictions in dCache mean that if VOs share pools or pool groups within a dCache, the used storage is double (or triple...) counted. By performing a du on the /pnfs filesystem it is possible to calculate the per-VO usage, and by comparing this against a file containing the space allocated to each VO (e.g. following recommendations from the GridPP UB), the remaining available space can be obtained. The scripts below generate the storage information and publish it via a GIP plugin, so that the published information can be accounted properly, per VO.
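For example (the used figure here is invented): if available-space.dat allocates 2099240960 KB to dteam and the du reports 500000000 KB used, the plugin publishes the difference, 1599240960 KB, as the space still available to dteam.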

Note that performing a du on /pnfs has been shown to take up to 1 hour on the Tier-1 node running PNFS, simultaneously generating high load on the machine. Tier-2 installations are generally smaller, but you have been warned!

Note also that these are the files straight from the Tier-1; they contain a number of assumptions about the setup which may not be relevant to other sites. A partial list follows:

  • lcg-dynamic-dcache uses dcache.gridpp.rl.ac.uk as the SRM
  • /pnfs/gridpp.rl.ac.uk is used in several places as the storage path
  • lcg-dynamic-dcache assumes that the path for the biomed VO is bio rather than biomed
  • the cron job is scheduled to run once a day at 3:50am


  • lcg-dynamic-dcache goes in the GIP plugin directory /opt/lcg/var/gip/plugin. You will need to remove the link to the standard dCache plugin that exists there (you could also remove the link in the provider directory). Remember to chmod a+x the new plugin; a sketch of these steps follows.
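A minimal sketch of those steps, assuming the standard LCG layout and that the plugin below has been saved as lcg-dynamic-dcache (the name of the old link is a placeholder; check what actually exists at your site):

cp lcg-dynamic-dcache /opt/lcg/var/gip/plugin/
chmod a+x /opt/lcg/var/gip/plugin/lcg-dynamic-dcache
rm /opt/lcg/var/gip/plugin/<old-dcache-plugin-link>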

#!/usr/bin/perl -w
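#
# lcg-dynamic-dcache: GIP plugin that reads the per-VO used space from
# used-space.dat (output of du) and the per-VO allocations from
# available-space.dat, then prints the GlueSA used/available space per VO.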

use strict ;
use File::Basename ;

my $used  = '/var/lib/edginfo/used-space.dat' ;
my $total = '/var/lib/edginfo/available-space.dat' ;

my %space ;

open(USED,$used) or die "Could not open $used: $!\n" ;
while(<USED>) {
  if (/^(\d+)\s+(\S+)\s*/) {
      my $kb    = $1  ;
      my $path  = &basename($2)  ;
      $space{$path}{'used'} =  $kb ;
  }
}
close(USED) ;

open(TOTAL,$total) or die "Could not open $total: $!\n" ;
while(<TOTAL>) {
  if (/^(\d+)\s+(\S+)\s*/) {
      my $kb    = $1  ;
      my $path  = &basename($2)  ;
      $space{$path}{'total'} =  $kb ;
  }
}
close(TOTAL) ;

foreach ( qw/cms minos dteam atlas lhcb biomed pheno zeus hone ilc esr t2k magic babar cedar geant4 fusion ops/ ){
 my $vo = $_ ;
 my $path = $vo ;

 if ( $path eq 'biomed' ) {
     $path = 'bio' ;
 }
 print "dn: GlueSALocalID=$vo,GlueSEUniqueID=dcache.gridpp.rl.ac.uk,Mds-Vo-name=local,o=grid\n" ;
#  print "GlueSAStateAvailableSpace: ".$space{$path}{'total'}."\n" ;
 my $temp = $space{$path}{'total'}-$space{$path}{'used'};
#    if ($temp < 0) { $temp=0;};
 print "GlueSAStateAvailableSpace: ".$temp."\n" ;

 print "GlueSAStateUsedSpace: ".$space{$path}{'used'}."\n\n" ;
}
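When run, the plugin prints one LDIF stanza of this form per VO (the space figures here are invented):

dn: GlueSALocalID=dteam,GlueSEUniqueID=dcache.gridpp.rl.ac.uk,Mds-Vo-name=local,o=grid
GlueSAStateAvailableSpace: 1599240960
GlueSAStateUsedSpace: 500000000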

  • used-space goes in /etc/cron.d/ (note that the whole entry must be on a single line):
50 3 * * *  edginfo /usr/bin/du -s /pnfs/gridpp.rl.ac.uk/data/* > /var/lib/edginfo/used-space.dat.tmp; mv -f /var/lib/edginfo/used-space.dat.tmp /var/lib/edginfo/used-space.dat
  • available-space.dat, which contains the maximum space available to each VO (in KB), goes in /var/lib/edginfo/ (see the note on units after the listing):
10654049858     /pnfs/gridpp.rl.ac.uk/data/atlas
18733858816     /pnfs/gridpp.rl.ac.uk/data/cms
2099240960      /pnfs/gridpp.rl.ac.uk/data/dteam
14115930112     /pnfs/gridpp.rl.ac.uk/data/lhcb
104859648       /pnfs/gridpp.rl.ac.uk/data/pheno
104859648       /pnfs/gridpp.rl.ac.uk/data/zeus
104859648       /pnfs/gridpp.rl.ac.uk/data/bio
1070204928      /pnfs/gridpp.rl.ac.uk/data/hone
104859648       /pnfs/gridpp.rl.ac.uk/data/ilc
104859648       /pnfs/gridpp.rl.ac.uk/data/esr
104859648       /pnfs/gridpp.rl.ac.uk/data/t2k
104859648       /pnfs/gridpp.rl.ac.uk/data/magic
104859648       /pnfs/gridpp.rl.ac.uk/data/babar
104859648       /pnfs/gridpp.rl.ac.uk/data/minos
104859648       /pnfs/gridpp.rl.ac.uk/data/cedar
104859648       /pnfs/gridpp.rl.ac.uk/data/geant4
104859648       /pnfs/gridpp.rl.ac.uk/data/fusion
2099240960      /pnfs/gridpp.rl.ac.uk/data/ops
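The figures are in 1 KB units, matching the du output in used-space.dat; for example, the 104859648 KB entries above correspond to roughly 100 GiB (100 × 1024 × 1024 KB = 104857600 KB), and 2099240960 KB is just under 2 TiB.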

The used-space cron job will create a file named used-space.dat in the same place, so /var/lib/edginfo/ has to be writable by the edginfo user.
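A minimal sketch of preparing that directory (assuming the edginfo user already exists on the node):

mkdir -p /var/lib/edginfo
chown edginfo /var/lib/edginfo
chmod u+rwx /var/lib/edginfo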

You will need to edit lcg-dynamic-dcache and available-space.dat for the VOs you support (and when you add new ones in the future).

Quarterly reports

If dCache sites do not operate with this GIP plugin, they may be required to run the command from the used-space cron entry in order to get per-VO used storage for the GridPP quarterly reports.
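In practice this is just the du from the cron entry above, run by hand with the /pnfs path adjusted for your site:

/usr/bin/du -s /pnfs/gridpp.rl.ac.uk/data/*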

Known issues

Since this GIP plugin performs a du on the PNFS namespace, it does not take into account file replicas that have been created by dCache (or by the admin) for load-balancing purposes. It is not clear whether such replicated files 'belong' to the VO or not.

Since PNFS uses NFSv2, files over 2 GB are not correctly accounted for (AFAIK they appear to be 1 byte, which is quite a large discrepancy).

used-space.pl

I (Chris Brew) wrote this to get round the 2 GB file issue after noticing that my reported usage was less than ten percent of the actual usage. It uses the fact that the first four characters of the PNFSid (or of the filename in the pool data directory) tell you which database the file is in, and so which VO it belongs to (you are using a separate database for each VO, aren't you?). It requires that you mount the pools on the accounting node (I use the automounter for that).
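For example (this PNFSid is made up): a file in a pool's data directory named 000400000000000000001060 starts with 0004, i.e. database 4 in hex, so its size is credited to whichever VO owns database 4 according to mdb show.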

#!/usr/bin/perl -w
# Get the VO disk usage info out of dCache
# 

use strict;

my $info;
my $store = "/pnfs/pp.rl.ac.uk/data/";

# First ask PNFS (via 'mdb show') for the database to VO mappings
# This will be used to assign the files to the correct VOs
#

open(LIST, "/opt/pnfs/tools/mdb show |" ) || die;

while (<LIST>) {
    if ( /enabled/ ){
        my ( $id, $vo ) = (split)[0,1];
        $info->{"vos"}->{$id}->{"disk"} = ${store}.${vo};
        $info->{"vos"}->{$id}->{"vo"} = $vo;
        $info->{"vos"}->{$id}->{"size"} = 0;
    }
}
close ( LIST );

exit 1 unless keys %{$info->{"vos"}};

# Now read the auto mounted config to get the list of pools

open(POOLS, "</etc/auto.pools" ) || die;
while (<POOLS> ) {
    if ( /pool/ ) {
        push @{$info->{"pools"}}, (split)[0];
    }
}
close ( POOLS );

exit 1 unless $info->{"pools"}->[0];
    
# 
# Now do the real work
# Loop over the files in the data directory of the pool
# use the first four characters (hex!) to work out which VO 
# owns the file then add the size to that VOs total

for my $pool ( @{$info->{"pools"}} ) {
    chdir "/pools/$pool/data/" or next;   # skip pools whose mount is unavailable
    for my $file ( glob "*" ) {
        my $id = hex ( substr ( $file, 0, 4 ) );
        $info->{"vos"}->{$id}->{"size"} += -s "$file";
    }
}

# Now loop over the VO partitions and write out the per-VO used space

open (USED,">/var/lib/edginfo/used-space.dat") || die;
map { 
    my $dfout = $info->{"vos"}->{$_}->{"size"};
    print USED int(($dfout/1024)+0.5)."\t$store$info->{'vos'}->{$_}->{'vo'}\n";
#    print int(($dfout/1024)+0.5)."\t$store$info->{'vos'}->{$_}->{'vo'}\n";
 } keys %{$info->{"vos"}};
close (USED);

exit 0;
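If you use used-space.pl in place of the plain du, the cron entry given earlier can run the script instead; a sketch, assuming it is installed as /usr/local/bin/used-space.pl and run as a user that can read the pool mounts, run mdb show and write to /var/lib/edginfo:

50 3 * * * root /usr/local/bin/used-space.pl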

The Automounter needs to be configured with:

/pools       /etc/auto.pools

in /etc/auto.master and a line like:

poolnode_1 -ro,soft poolnode:/raid/pool1

in /etc/auto.pools for each pool.
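Once autofs has picked up the map, it is worth triggering one of the mounts by hand to check it works (the pool name is the placeholder from the example above):

ls /pools/poolnode_1/data | head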