Imperial Dirac Maintenance



The Imperial dirac server consists of 4 machines:
dirac01 (configuration server, main dirac server)
dirac02 (second dirac server, for load balancing etc)
diracdb (hosts the databases)
diracweb (hosts the webserver)

Restarting dirac

To restart dirac:
as 'dirac' in /opt/dirac/:

killall -SIGHUP runsvdir
source bashrc
runsvdir /opt/dirac/startup &
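
The three steps above could be collected in a small wrapper. This is only a sketch: the names restart_dirac, run and DRY_RUN are made up for illustration and are not part of dirac, and it assumes it is run as 'dirac' with the paths given above. With DRY_RUN=1 it only prints what it would do.

```shell
#!/bin/bash
# Sketch only: restart_dirac, run and DRY_RUN are illustrative names.
DIRAC_ROOT="${DIRAC_ROOT:-/opt/dirac}"

run() {
    # Print the command in dry-run mode, otherwise execute it in this shell.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_dirac() {
    run cd "$DIRAC_ROOT" || return 1
    # Stop the running supervisor, as in the manual steps.
    run killall -SIGHUP runsvdir
    run source bashrc
    # Start the supervisor again in the background.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: runsvdir $DIRAC_ROOT/startup &"
    else
        runsvdir "$DIRAC_ROOT/startup" &
    fi
}
```

Calling 'DRY_RUN=1 restart_dirac' first is a cheap way to check the paths before doing it for real.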

Updating the dirac install

Note: dirac01 and dirac02 should always run the same version of the dirac software. Whatever you do, it helps to work in a clean shell in which the cursed bashrc has not been sourced.

cd /opt/dirac/DIRAC
git status
git pull
find . -iname '*.pyo' -delete
[restart dirac]

and the GridPP module

cd /opt/dirac/GridPPDIRAC
git status
git pull
find . -iname '*.pyo' -delete
[restart dirac]
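
Since both checkouts get the same treatment, the steps could be looped. The sketch below is not an official procedure: purge_pyo, update_repo and update_all are hypothetical helper names, and stopping when git reports modified tracked files is an extra precaution added here, not something the steps above require.

```shell
#!/bin/bash
# Sketch only: purge_pyo, update_repo and update_all are illustrative names.

purge_pyo() {
    # Remove stale .pyo bytecode below the given directory.
    find "$1" -iname '*.pyo' -delete
}

update_repo() {
    # Pull one checkout, stopping first if tracked files were modified locally.
    local repo="$1"
    if [ -n "$(cd "$repo" && git status --porcelain --untracked-files=no)" ]; then
        echo "local changes in $repo, check them with git diff first" >&2
        return 1
    fi
    (cd "$repo" && git pull) || return 1
    purge_pyo "$repo"
}

update_all() {
    # Assumption: both checkouts live under /opt/dirac, as in the text above.
    local repo
    for repo in /opt/dirac/DIRAC /opt/dirac/GridPPDIRAC; do
        update_repo "$repo" || return 1
    done
    echo "updates done, now restart dirac"
}
```

Nothing runs until update_all is called; the restart itself is left as a manual step.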

Note that for new dirac commands (e.g. 'dirac-populate-component-db'), git pull only downloads the relevant python files. To make the actual commands run:


Check the logs

Looking at a specific SiteDirector:
tail -f startup/WorkloadManagement_SiteDirectorGridPP/log/current
Local dirac code:
tail -f /opt/dirac/startup/Configuration_AutoBdii2CSAgent/log/current
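
Both examples follow the same pattern, startup/[System]_[Component]/log/current, so the path can be built from the two names. A minimal sketch, assuming that pattern holds for the other components too; dirac_log is a made-up helper name, not a dirac command.

```shell
#!/bin/bash
# Sketch only: dirac_log is an illustrative helper, not a DIRAC command.
# Assumption: logs live in <root>/startup/<System>_<Component>/log/current,
# as in the two examples above.
dirac_log() {
    local system="$1" component="$2"
    echo "${DIRAC_ROOT:-/opt/dirac}/startup/${system}_${component}/log/current"
}

# Example: follow the AutoBdii2CSAgent log (uncomment on a dirac server).
# tail -f "$(dirac_log Configuration AutoBdii2CSAgent)"
```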

Best of Github

The ic-hep repo

Merge requests: Being logged into github really helps :-)

To check what changed if 'git status' complains about modified files:
git diff [file it's upset about]

Adding a new VO

(example taken from the LZ VO, from memory, so probably incomplete)
on dirac01:
add VO to /etc/vomses and /etc/grid-security/vomsdir (as root)

on the web interface (as dirac_admin):
Registry -> Groups: create lz_pilot and lz_user with all options
Registry -> VO: add lz with all options and subfolder VOMSServers
Registry -> VOMS -> Mapping: add lz_user and lz_pilot
Registry -> VOMS -> URLs: add lz folder with all options
Operations: add lz folder with all options
Systems -> WorkLoadManagement -> Production -> Agents: add folder SiteDirectorLz with all options
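
The names used in these steps all derive from the VO name. The sketch below just prints them for a given VO, so that nothing gets mistyped in the web interface. The pattern is inferred from the LZ example only, and vo_names is a hypothetical helper, not a dirac command.

```shell
#!/bin/bash
# Sketch only: vo_names is an illustrative helper.
# Assumption: groups are <vo>_user and <vo>_pilot, and the SiteDirector
# folder is SiteDirector<Vo> with the first letter of the VO capitalised.
vo_names() {
    local vo="$1" first rest
    first="$(printf '%s' "$vo" | cut -c1 | tr '[:lower:]' '[:upper:]')"
    rest="$(printf '%s' "$vo" | cut -c2-)"
    echo "${vo}_user"
    echo "${vo}_pilot"
    echo "SiteDirector${first}${rest}"
}

vo_names lz
```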

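Make a SiteDirector (from a dirac ui; the transcript shows SiteDirectorLsst, so substitute the agent name for your VO, e.g. SiteDirectorLz):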

source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-sysadmin-cli --host
[]> install agent WorkloadManagement SiteDirectorLsst -m SiteDirector
agent WorkloadManagement_SiteDirectorLsst is installed, runit status: Run 

Restart the UsersAndGroups Agent to populate the new groups.

Upload a pilot proxy:
dirac-proxy-init -C /etc/grid-security/pilotcert.pem -K /etc/grid-security/pilotkey.pem -g lz_pilot -M -P

Make a top level directory on the file catalogue:
source bashrc
dirac-proxy-init -g dirac_admin
mkdir [voname]
chgrp [voname]_user [voname]

Registering a file in the dirac file catalogue

For situations where a file is on an SE but not in the dirac file catalogue.

  • File copied to SE without dirac involvement:

lcg-cp -vvv -b -D srmv2 somefile srm://

streams: 1
941720 bytes   3628.98 KB/sec avg   3628.98 KB/sec inst
  • Register the file by hand

[on dirac ui]
FC:/> register file / srm:// 941720 UKI-LT2-IC-HEP-disk
File successfully added to the catalog

  • To check if it worked, on the dirac ui:

dirac-dms-get-file /
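
The register command above takes the file size in bytes as its third argument (941720 in the example). If the original file is still around locally, the size can be read off without relying on the lcg-cp output. A minimal sketch; file_size_bytes is a made-up helper name.

```shell
#!/bin/bash
# Sketch only: file_size_bytes is an illustrative helper.
file_size_bytes() {
    # wc -c counts bytes; reading from stdin keeps the file name out of the
    # output, and tr strips the padding some wc versions add.
    wc -c < "$1" | tr -d '[:space:]'
}

# Example (somefile is the local copy that was uploaded):
# file_size_bytes somefile
```

The printed number is what goes into the size position of 'register file'.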

Switching glexec on/off

Official documentation.

  • Resources -> Computing -> glexec: RescheduleOnError = True (this should also enable glexec in logging-only mode)
  • Systems -> WorkLoadManagement -> Production -> Agent -> JobAgent -> CEtype = glexec or InProcess (to turn glexec off)

Enable/Ban a site

On a dirac ui:
source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-allow-site "Test"
dirac-admin-ban-site "Test"

Ban a CE, rather than a site

  • Systems -> Configuration -> Agents -> AutoBdii2CSAgent -> BannedCEs
  • Restart the AutoBdii2CSAgent

Install a new database/service etc

On a dirac UI of your choice...

source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-sysadmin-cli --host
[]> install db InstalledComponentsDB
MySQL root password: [go find password in config file]
Adding to CS Framework/InstalledComponentsDB
Database InstalledComponentsDB from DIRAC/FrameworkSystem installed successfully 

Edit the automatically generated 'Host' entry in the config file (Systems -> Framework -> Production -> Databases) so that it contains the actual database machine (diracdb).
Now install the service:

[]> install service Framework ComponentMonitoring