Imperial Dirac Maintenance
Hardware
The Imperial dirac server consists of 4 machines:
dirac01 (configuration server, main dirac server)
dirac02 (second dirac server, for load balancing etc)
diracdb (hosts the databases)
diracweb (hosts the webserver)
Restarting dirac
To restart dirac:
as 'dirac' in /opt/dirac/:
killall -SIGHUP runsvdir
source bashrc
runsvdir /opt/dirac/startup &
Updating the dirac install
Note: dirac01 and dirac02 should always run the same version of the dirac software. Whatever you do, it helps to work in a clean shell in which the cursed bashrc has not been sourced.
cd /opt/dirac/DIRAC
git status
git pull
find . -iname '*.pyo' -delete
[restart dirac]
and the GridPP module
cd /opt/dirac/GridPPDIRAC
git status
git pull
find . -iname '*.pyo' -delete
[restart dirac]
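Both checkouts need the same cleanup of stale compiled files after a pull. A minimal sketch of a helper for that step follows; the function name purge_pyo is hypothetical and not part of dirac. It uses the same find invocation as above, but takes the checkout directory as an argument so it does not depend on the current directory.

```shell
# Hypothetical helper (not part of dirac): remove stale *.pyo files below a checkout.
# Example: purge_pyo /opt/dirac/DIRAC
purge_pyo() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "purge_pyo: no such directory: $dir" >&2
        return 1
    fi
    # Same find options as in the update steps, anchored at "$dir" instead of "."
    find "$dir" -iname '*.pyo' -delete
}
```

After a git pull in each checkout, this could be called once for /opt/dirac/DIRAC and once for /opt/dirac/GridPPDIRAC, followed by the usual restart.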
Check the logs
Looking at a specific SiteDirector:
tail -f startup/WorkloadManagement_SiteDirectorGridPP/log/current
Local dirac code:
tail -f /opt/dirac/startup/Configuration_AutoBdii2CSAgent/log/current
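The two log paths above follow the same pattern, startup/[System]_[Component]/log/current. A small sketch that builds the path from its parts is shown below. The function name dirac_log_path and the DIRAC_BASE override are hypothetical, and it assumes every component follows the pattern of the two examples.

```shell
# Hypothetical helper: print the log path for a component.
# Assumption: logs live in <base>/startup/<System>_<Component>/log/current,
# as in the two examples above. DIRAC_BASE (hypothetical) overrides /opt/dirac.
dirac_log_path() {
    system="$1"
    component="$2"
    echo "${DIRAC_BASE:-/opt/dirac}/startup/${system}_${component}/log/current"
}
```

For example, tail -f "$(dirac_log_path Configuration AutoBdii2CSAgent)" would follow the same log as the command above.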
Best of Github
The ic-hep repo
Merge requests: being logged into GitHub really helps :-)
To check what changed if 'git status' complains about modified files:
git diff [file it's upset about]
Adding a new VO
(example taken from the LZ VO, from memory, so probably incomplete)
on dirac01:
add VO to /etc/vomses and /etc/grid-security/vomsdir (as root)
on the web interface (as dirac_admin):
Registry -> Groups: create lz_pilot and lz_user with all options
Registry -> VO: add lz with all options and subfolder VOMSServers
Registry -> VOMS -> Mapping: add lz_user and lz_pilot
Registry -> VOMS -> URLs: add lz folder with all options
Operations: add lz folder with all options
Systems -> WorkloadManagement -> Production -> Agents: add folder SiteDirectorLz with all options
Make a SiteDirector (from a dirac ui):
source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-sysadmin-cli --host dirac01.grid.hep.ph.ic.ac.uk
[dirac01.grid.hep.ph.ic.ac.uk]> install agent WorkloadManagement SiteDirectorLsst -m SiteDirector
agent WorkloadManagement_SiteDirectorLsst is installed, runit status: Run
Restart the UsersAndGroups Agent to populate the new groups.
Upload a pilot proxy:
dirac-proxy-init -C /etc/grid-security/pilotcert.pem -K /etc/grid-security/pilotkey.pem -g lz_pilot -M -P
Registering a file in the dirac file catalogue
For those situations where a file is on an SE but not in the dirac file catalogue.
- File copied to SE without dirac involvement:
lcg-cp -vvv -b -D srmv2 somefile srm://gfe02.grid.hep.ph.ic.ac.uk:8443/srm/managerv2?SFN=/pnfs/hep.ph.ic.ac.uk/data/comet/comet.j-parc.jp/user/daniela.bauer/some.test.file
[...]
- streams: 1
941720 bytes 3628.98 KB/sec avg 3628.98 KB/sec inst
- Register the file by hand
[on dirac ui]
dirac-dms-filecatalog-cli
FC:/> register file /comet.j-parc.jp/user/daniela.bauer/some.test.file srm://gfe02.grid.hep.ph.ic.ac.uk:8443/srm/managerv2?SFN=/pnfs/hep.ph.ic.ac.uk/data/comet.j-parc.jp/user/daniela.bauer/some.test.file 941720 UKI-LT2-IC-HEP-disk
File successfully added to the catalog
exit
- To check if it worked, on the dirac ui:
dirac-dms-get-file /comet.j-parc.jp/user/daniela.bauer/some.test.file
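The register command takes the LFN, the PFN, the file size in bytes and the SE name, in that order; the size in the example (941720) matches the byte count reported by lcg-cp. If a local copy of the file is still available, the line to paste into the file catalogue CLI can be assembled as sketched below. The function name fc_register_line is hypothetical, and the sketch assumes the local copy is identical to the one on the SE.

```shell
# Hypothetical helper: print the 'register file' line for dirac-dms-filecatalog-cli.
# Arguments: local copy of the file, LFN, PFN (SRM URL), SE name.
# Assumption: the local copy has the same size as the file on the SE.
fc_register_line() {
    local_file="$1"
    lfn="$2"
    pfn="$3"
    se="$4"
    if [ ! -f "$local_file" ]; then
        echo "fc_register_line: no such file: $local_file" >&2
        return 1
    fi
    # Size in bytes; tr strips the padding some wc implementations add
    size=$(wc -c < "$local_file" | tr -d ' ')
    echo "register file $lfn $pfn $size $se"
}
```

The printed line is meant to be pasted at the FC:/> prompt; the helper itself does not contact the catalogue.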
Switching glexec on/off
Official documentation.
- Resources -> Computing -> glexec: RescheduleOnError = True (this should also enable glexec in logging-only mode)
- Systems -> WorkloadManagement -> Production -> Agents -> JobAgent -> CEtype = glexec, or InProcess to turn glexec off
Ban a CE, rather than a site
- Systems -> Configuration -> Agents -> AutoBdii2CSAgent -> BannedCEs
- Restart the AutoBdii2CSAgent
Install a new database/service etc
On a dirac UI of your choice...
source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-sysadmin-cli --host dirac01.grid.hep.ph.ic.ac.uk
[dirac01.grid.hep.ph.ic.ac.uk]> install db InstalledComponentsDB
MySQL root password: [go find password in config file]
Adding to CS Framework/InstalledComponentsDB
Database InstalledComponentsDB from DIRAC/FrameworkSystem installed successfully