DCache Administration Scripts

From GridPP Wiki

This page is used as a place to store scripts that should help dCache system administrators with everyday tasks.

Deleting items from trash

It is sometimes necessary to force dCache to free up space after a delete command has been issued. In that case, the following script can be useful. On your admin node, the name of every file in /opt/pnfsdb/pnfs/trash/2 is the pnfsid of a file that has been deleted from pnfs. Running rep rm -force <pnfsid> in the pool's cell on the admin node should delete the file.

The script can be found in the HEPiX system administration subversion repository - http://www.sysadmin.hep.ac.uk/svn/fabric-management/dcache/pnfs/emptytrash.sh

Thanks to Kostas Georgiou for this script.
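
As a rough illustration of the idea described above (not the contents of emptytrash.sh, which may work differently), the sketch below turns the file names in a trash directory into rep rm -force commands. The function name trash_to_commands is hypothetical, and the sketch does not work out which pool holds each file or send anything to dCache; it only prints the commands.

```shell
# Sketch only: print one "rep rm -force <pnfsid>" line per entry in a
# trash directory. The default path is the one quoted in the text above;
# the function name is made up for this example.
trash_to_commands() {
    trash_dir=${1:-/opt/pnfsdb/pnfs/trash/2}
    for entry in "$trash_dir"/*; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -e "$entry" ] || continue
        # Each file name in the trash directory is a pnfsid.
        printf 'rep rm -force %s\n' "$(basename "$entry")"
    done
}
```

Calling trash_to_commands with no argument reads the default trash directory; the printed lines would still have to be issued in the cell of the pool that holds each file.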

Setting list of PNFSids to precious

The situation may arise where you want to convert all atlas files (for example) in a pool to precious. getCached below looks for the cached files in a pool and then passes this list of files to makeComms which sets the pnfsids to precious.

The script can be found in the HEPiX system administration subversion repository - http://www.sysadmin.hep.ac.uk/svn/fabric-management/dcache/pnfs/pnfsids2precious.sh

Thanks to Chris Brew for the script.
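
The second step described above (building the commands that set each pnfsid to precious) might look roughly like the sketch below. It is not taken from pnfsids2precious.sh. It assumes that the pool cell accepts rep set precious <pnfsid> and that the admin shell uses cd <cell>, .. and logoff to enter a cell, leave it and end the session; check both against your dCache version. The function name make_precious_batch is hypothetical.

```shell
# Sketch only: read pnfsids (one per line) on stdin and print a batch of
# admin-shell commands that would mark them precious in one pool.
# Assumed, not taken from the original script: "rep set precious",
# "cd <pool>", ".." and "logoff".
make_precious_batch() {
    pool=$1
    printf 'cd %s\n' "$pool"
    while IFS= read -r pnfsid; do
        [ -n "$pnfsid" ] || continue    # ignore blank lines
        printf 'rep set precious %s\n' "$pnfsid"
    done
    printf '..\nlogoff\n'
}
```

The output can be reviewed in a file before it is passed to the admin shell, which is safer than piping it straight in.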

Finding list of PNFSids with no corresponding file

Sometimes PNFSids pop up that have no corresponding file in the PNFS namespace. Running the pathfinder on these IDs returns "File not found". Below, getPnfsIds performs a query of the companion database, looking for all IDs for a particular VO database (in this case, 000E and 000F are dteam). The script then runs pathfinder on all of the relevant PNFSids and returns those that do not have an entry in the PNFS namespace. The script also allows for all traces of these files to be removed from the dCache completely by creating a temp file with all of the rep rm -force <pnfsid> commands that are required. This should only be used if really necessary.

The script can be found in the HEPiX system administration subversion repository - http://www.sysadmin.hep.ac.uk/svn/fabric-management/dcache/pnfs/find-orphan-pnfsids.sh

Obviously the script could do some more advanced input error checking, but it gives an idea of what is possible.
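
The filtering step described above can be sketched as follows. This is an illustration rather than the contents of find-orphan-pnfsids.sh: the function name filter_orphans is hypothetical, and the lookup command passed to it stands in for whatever wrapper you use to run pathfinder on a single pnfsid. The only thing taken from the text is that an orphaned ID makes pathfinder report "File not found".

```shell
# Sketch only: read pnfsids (one per line) on stdin and print those for
# which the lookup command reports "File not found".
# $1 is the name of a command or shell function that takes one pnfsid
# and prints the pathfinder result (a hypothetical wrapper).
filter_orphans() {
    lookup=$1
    while IFS= read -r pnfsid; do
        [ -n "$pnfsid" ] || continue
        # Redirect stdin so the lookup cannot swallow the remaining IDs.
        result=$("$lookup" "$pnfsid" </dev/null 2>&1)
        case $result in
            *"File not found"*) printf '%s\n' "$pnfsid" ;;
        esac
    done
}
```

The list of IDs would come from the companion database query, and the output could be turned into rep rm -force <pnfsid> commands once it has been checked by hand.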

Removing orphaned files from dCache

Occasionally pools will contain files (and their corresponding control/<pnfsid>, control/SI-<pnfsid> files) that do not correspond to any entry in the PNFS namespace. It is not immediately clear what causes this situation to arise, but it has been seen on numerous occasions. The following script collects all of the pnfsids on a particular pool and then checks to see if they are in the namespace (by looking for the file metadata .(puse)(pnfsid)(0) ). If not, then the script creates a file with all of the relevant rep rm -force <pnfsid> commands that are subsequently passed to the admin shell in order to remove the orphaned files.

This script is related to the script above (Finding list of PNFSids with no corresponding file) but uses the admin shell and then the metadata of the namespace rather than using the PNFS companion database to find the list of orphaned files. From tests that I have done, it appears that both methods pick out the same pnfsids as being orphaned, but there may be some circumstances where this is not the case.

The script can be found in the HEPiX system administration subversion repository - http://www.sysadmin.hep.ac.uk/svn/fabric-management/dcache/pnfs/remove-orphan-files.sh

Thanks to Patrick Fuhrmann for the script.

This script will also delete files that are being transferred at the time it runs. In addition, if the "ls .(puse)(pnfsid)(0)" check keeps failing (for example, because pnfs is down), it might delete everything in your pools.
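
One way to reduce the second risk is sketched below. It is not part of remove-orphan-files.sh; it is an assumption about how a guard could be added. Before any pnfsid is treated as orphaned, the same lookup is tried on an ID that is known to exist, and nothing is printed if that lookup fails. The function names ns_has_entry and orphan_commands are hypothetical, and ns_has_entry assumes it is run from a directory inside the mounted PNFS namespace. The sketch does nothing about files that are being transferred while it runs.

```shell
# Sketch only: true if the namespace has metadata for the given pnfsid.
# Assumes the current directory is inside the mounted PNFS namespace.
ns_has_entry() {
    ls ".(puse)($1)(0)" >/dev/null 2>&1
}

# orphan_commands KNOWN_GOOD_ID [CHECK_CMD]
# Reads pnfsids (one per line) on stdin and prints "rep rm -force <pnfsid>"
# for each one that the check cannot find in the namespace.
orphan_commands() {
    known_good=$1
    check=${2:-ns_has_entry}
    # Guard: if an ID that should exist cannot be found, the namespace is
    # probably unavailable, so refuse to report anything as orphaned.
    if ! "$check" "$known_good"; then
        echo "lookup failed for known-good ID $known_good; aborting" >&2
        return 1
    fi
    while IFS= read -r pnfsid; do
        [ -n "$pnfsid" ] || continue
        "$check" "$pnfsid" </dev/null || printf 'rep rm -force %s\n' "$pnfsid"
    done
}
```

The guard only runs once at the start, so pnfs going down part-way through a run would still be missed; reviewing the generated command file before passing it to the admin shell remains advisable.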