Difference between revisions of "Imperial Dirac Maintenance"

From GridPP Wiki

Revision as of 11:39, 21 October 2022

Hardware

The Imperial dirac server consists of 4 machines:
dirac01 (configuration server, main dirac server)
dirac02 (second dirac server, for load balancing etc)
diracdb (hosts the databases)
diracweb (hosts the webserver)

Restarting dirac

To restart dirac, run the following as user 'dirac' in /opt/dirac/:

killall -SIGHUP runsvdir
source bashrc
runsvdir /opt/dirac/startup &

Updating the dirac install

Note: dirac01 and dirac02 should always run the same version of the dirac software. Whatever you do, it helps to start from a clean shell in which the cursed bashrc has not been sourced.

cd /opt/dirac/DIRAC
git status
git pull
find . -iname '*.pyo' -delete
[restart dirac]

and the GridPP module

cd /opt/dirac/GridPPDIRAC
git status
git pull
find . -iname '*.pyo' -delete
[restart dirac]
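
The two update sequences above differ only in the directory, so they can be folded into one helper. This is only a sketch, not part of the documented procedure: the function names purge_pyo and update_checkout are made up for illustration, and it assumes git and find are on the PATH. Restarting dirac remains a separate step.

```shell
# Hypothetical helpers; the names are illustrative and not part of DIRAC.

# Delete stale optimised bytecode (*.pyo, any case) below directory $1.
purge_pyo() {
    find "$1" -iname '*.pyo' -delete
}

# Show the state of the checkout in $1, pull, then clear stale bytecode.
# The subshell keeps the caller's working directory unchanged.
update_checkout() {
    ( cd "$1" && git status && git pull ) && purge_pyo "$1"
}
```

Possible use: update_checkout /opt/dirac/DIRAC followed by update_checkout /opt/dirac/GridPPDIRAC, then restart dirac. To confirm that dirac01 and dirac02 ended up on the same version, one option is to compare the output of git -C /opt/dirac/DIRAC rev-parse HEAD on both machines.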

Note that for new dirac commands (e.g. 'dirac-populate-component-db'), git pull only downloads the relevant python files. To make the actual commands available, run:

dirac-deploy-scripts

New versions of the Externals and lcgBundles can be found here: [1], [2]

Update the pilot and diracos version

The pilot version is set in the configuration under Operations -> GridPP -> Pilot.
To update diracos (Python 3):

1. stop DIRAC: killall -SIGHUP runsvdir
2. remove diracos dir: rm -rf diracos
3. download the installer (.sh file) again (note the "latest" in the URL; it can be replaced by a specific version):
   curl -LO https://github.com/DIRACGrid/DIRACOS2/releases/latest/download/DIRACOS-Linux-$(uname -m).sh
4. and reinstall diracos: bash DIRACOS-Linux-$(uname -m).sh
5. rm DIRACOS-Linux-$(uname -m).sh
6. source bashrc (on a UI, as a user: source diracos/diracosrc)
7. check that you are on the branch that you expect: cd DIRAC; git status; cd ..
8. install from your local branch: pip install -e DIRAC
9. install the rest of it: pip install -e GridPPDIRAC;  pip install -e WebAppDIRAC
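
Step 3 says that "latest" can be replaced by a version. A minimal sketch of what that looks like, assuming GitHub's usual release URL layout (releases/latest/download/ASSET for the newest release, releases/download/TAG/ASSET for a tagged one). The function name diracos_url is hypothetical, and any tag passed to it must be an existing DIRACOS2 release tag.

```shell
# Hypothetical helper: print the download URL of the DIRACOS2 installer.
# $1: release tag, or "latest" (default)
# $2: architecture (default: output of uname -m)
diracos_url() {
    local version="${1:-latest}"
    local arch="${2:-$(uname -m)}"
    local base="https://github.com/DIRACGrid/DIRACOS2/releases"
    if [ "$version" = "latest" ]; then
        echo "$base/latest/download/DIRACOS-Linux-$arch.sh"
    else
        # Assets of a tagged release live under download/<tag>/
        echo "$base/download/$version/DIRACOS-Linux-$arch.sh"
    fi
}
```

With this, step 3 becomes curl -LO "$(diracos_url)" for the latest release, or curl -LO "$(diracos_url TAG)" with TAG replaced by the release you want to pin.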

python2: diracos: quietly copy it from cvmfs on the lx machines across to the dirac node: /cvmfs/dirac.egi.eu/installSource/
python2: Which externals version goes with which dirac version can be found in releases.cfg: https://raw.githubusercontent.com/DIRACGrid/DIRAC/integration/releases.cfg

Check the logs

Looking at a specific SiteDirector:
tail -f startup/WorkloadManagement_SiteDirectorGridPP/log/current
Local dirac code:
tail -f /opt/dirac/startup/Configuration_AutoBdii2CSAgent/log/current
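
Both log paths above follow the same pattern, startup/SYSTEM_COMPONENT/log/current. A small sketch that builds the path from its parts; component_log is a hypothetical name, and /opt/dirac is assumed as the install root unless a third argument is given.

```shell
# Hypothetical helper: path of the current log of a dirac component.
# $1: system (e.g. WorkloadManagement)
# $2: component (e.g. SiteDirectorGridPP)
# $3: install root (default: /opt/dirac)
component_log() {
    echo "${3:-/opt/dirac}/startup/${1}_${2}/log/current"
}
```

For example: tail -f "$(component_log Configuration AutoBdii2CSAgent)"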


Best of Github

The ic-hep repo

Merge requests: Being logged into github really helps :-)

To check if code complies with the DIRAC python style: git diff -U0 upstream/integration | pycodestyle --diff

To check what changed if 'git status' complains about modified files:
git diff [file it's upset about]

Adding a new VO

(example taken from the LZ VO, from memory, so probably incomplete)
on dirac01:
add VO to /etc/vomses and /etc/grid-security/vomsdir (as root)

on the webinterface (as dirac_admin):
Registry -> Groups: create lz_pilot and lz_user with all options. Add dirac pilot user to the pilot group.
Registry -> VO: add lz with all options and subfolder VOMSServers
Registry -> VOMS -> Mapping: add lz_user (note: we fill the pilot group by hand as we don't want the userandgroups script to fill it automatically)
Registry -> VOMS -> URLs: add lz folder with all options
Operations: add lz folder with all options
Systems -> WorkloadManagement -> Production -> Agents: add folder SiteDirectorLz with all options

Make a SiteDirector (from a dirac ui):

source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-sysadmin-cli --host dirac01.grid.hep.ph.ic.ac.uk
[dirac01.grid.hep.ph.ic.ac.uk]> install agent WorkloadManagement SiteDirectorLsst -m SiteDirector
agent WorkloadManagement_SiteDirectorLsst is installed, runit status: Run 

Restart the UsersAndGroups Agent to populate the new groups.
Once the VO configuration is complete restart the BDII2CS agent.
In fact it's probably best to restart dirac and to be done with it...

(Obsolete in v7r3) Upload a pilot proxy (on dirac01 as 'dirac'):

dirac-proxy-init -C /etc/grid-security/pilotcert.pem -K /etc/grid-security/pilotkey.pem -g lz_pilot -M -U -r

Make a top level directory on the file catalogue (from a dirac UI):

source bashrc 
dirac-proxy-init -g dirac_admin 
dirac-dms-filecatalog-cli 
mkdir [voname]
chgrp [voname]_user [voname]
ls -l

If a new VO adds new sites, make sure to enable them (from a dirac ui): dirac-admin-allow-site LCG.MA-01-CNRST.ma "magrid1"

Registering a file in the dirac file catalogue

For those situations when a file is on an SE but not in the dirac file catalogue....

  • File copied to SE without dirac involvement:

gfal-copy -vvv -b -D srmv2 somefile srm://gfe02.grid.hep.ph.ic.ac.uk:8443/srm/managerv2?SFN=/pnfs/hep.ph.ic.ac.uk/data/comet/comet.j-parc.jp/user/daniela.bauer/some.test.file
[...]

# streams: 1
941720 bytes   3628.98 KB/sec avg   3628.98 KB/sec inst
  • Register the file by hand

[on dirac ui]
dirac-dms-filecatalog-cli
FC:/> register file /comet.j-parc.jp/user/daniela.bauer/some.test.file srm://gfe02.grid.hep.ph.ic.ac.uk:8443/srm/managerv2?SFN=/pnfs/hep.ph.ic.ac.uk/data/comet.j-parc.jp/user/daniela.bauer/some.test.file 941720 UKI-LT2-IC-HEP-disk
File successfully added to the catalog
exit

  • To check if it worked, on the dirac ui:

dirac-dms-get-file /comet.j-parc.jp/user/daniela.bauer/some.test.file
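
The register command in the example takes the LFN, the PFN, the file size in bytes and the storage element name, in that order. The size has to be supplied by hand, so a helper that measures a local copy of the file and prints the finished line can save a typo. This is only a sketch: fc_register_line is a hypothetical name, and it assumes a local copy identical to the file on the SE.

```shell
# Hypothetical helper: print the "register file" line to paste into
# dirac-dms-filecatalog-cli.
# $1: local copy of the file (used only to measure its size)
# $2: LFN
# $3: PFN (full SRM URL)
# $4: storage element name
fc_register_line() {
    if [ ! -f "$1" ]; then
        echo "fc_register_line: no such file: $1" >&2
        return 1
    fi
    local size
    # Arithmetic expansion strips any padding wc adds around the number.
    size=$(( $(wc -c < "$1") ))
    echo "register file $2 $3 $size $4"
}
```

For the example above this would be called as fc_register_line somefile LFN PFN UKI-LT2-IC-HEP-disk, with LFN and PFN replaced by the values used in the copy step.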

Switching glexec on/off (obsolete)

Official documentation.

  • Resources -> Computing -> glexec: RescheduleOnError = True (this should also enable glexec in logging only mode)
  • Systems -> WorkloadManagement -> Production -> Agents -> JobAgent -> CEtype = glexec or InProcess (to turn glexec off)

Enable/Ban a site

On a dirac ui:
source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-allow-site LCG.RAL-LCG2.uk "Test"
dirac-admin-ban-site LCG.RAL-LCG2.uk "Test"

Ban a CE, rather than a site

  • Systems -> Configuration -> Agents -> AutoBdii2CSAgent -> BannedCEs
  • Restart the AutoBdii2CSAgent


Install a new database/service etc

On a dirac UI of your choice...

source bashrc
dirac-proxy-init -g dirac_admin
dirac-admin-sysadmin-cli --host dirac01.grid.hep.ph.ic.ac.uk
[dirac01.grid.hep.ph.ic.ac.uk]> install db InstalledComponentsDB
MySQL root password: [go find password in config file]
Adding to CS Framework/InstalledComponentsDB
Database InstalledComponentsDB from DIRAC/FrameworkSystem installed successfully 

Edit the automatic entry under 'Host' in the config file (Systems -> Framework -> Production -> Databases) to contain the actual database machine.
Now install the service:

[dirac01.grid.hep.ph.ic.ac.uk]> install service Framework ComponentMonitoring

To install something from the GridPP package, the package needs to be present on the UI: git clone https://github.com/ic-hep/GridPPDIRAC.git, then edit etc/dirac.cfg and add ExtraModules = GridPP

[diracdev.grid.hep.ph.ic.ac.uk]> install agent Configuration UsersAndGroupsAgent

Ports

9152 Framework/ProxyManager
9162 Framework/SystemAdministrator


Enabling Unit Tests

Needs: .travis.yml, pytest.ini, requirements.txt and test files ending in _test, e.g. HelloWorldAgent_test.py

GridPPDIRAC on travis (use GitHub to log in).



The Transformation System (automated file processing)

Transformation System

VMDIRAC (Cloud)

VMDIRAC

WebApp

WebApp

Dirac: Back to the dirac overview page.