Imperial Dirac server

From GridPP Wiki
Latest revision as of 11:27, 2 July 2020

Prerequisites

  • A host machine with SL7.
  • > 20GB free in /opt.
  • A host certificate.
  • Ports 8080, 8443 & 9130-9200 TCP open on any firewalls.
  • [CHECK] No mysql or mysql-libs package on the machine (/etc/my.cnf conflicts with dirac settings).
  • A link to the documentation.
  • Sign up with the diracgrid-forum.
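
The machine-checkable items in the list above can be tested before starting. This is only a rough bash sketch: the function names are made up for illustration, the 20GB threshold and the package names come from the list, and the certificate path /etc/grid-security/hostcert.pem and the use of rpm are assumptions about a standard SL7 host.

```shell
# Pre-flight sketch for the prerequisites listed above (bash).
# All function names are illustrative; none of this is part of DIRAC.

# True if free_kb (kilobytes) is strictly more than required_gb gigabytes.
has_enough_space() {
  local free_kb=$1 required_gb=$2
  [ "$free_kb" -gt $(( required_gb * 1024 * 1024 )) ]
}

# Print the free space, in kilobytes, of the filesystem holding a directory.
free_kb_for() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Run all checks; prints one line per problem, returns non-zero if any fail.
check_prereqs() {
  local target=${1:-/opt} status=0 pkg
  if ! has_enough_space "$(free_kb_for "$target")" 20; then
    echo "not more than 20GB free in $target"; status=1
  fi
  # Assumption: the host certificate lives in the usual place.
  if [ ! -e /etc/grid-security/hostcert.pem ]; then
    echo "no host certificate found"; status=1
  fi
  # Assumption: an RPM-based host, as on SL7.
  for pkg in mysql mysql-libs; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
      echo "package $pkg is installed"; status=1
    fi
  done
  return "$status"
}
```

Run `check_prereqs /opt` as root. The firewall ports are not checked here.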

Now you need to work out which version to install; this can be found in the tags list on GitHub.
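
To avoid scrolling through the tags list by hand, something like the following could pick out the highest release tag. It is a sketch, assuming GNU sort (for -V) and assuming release tags look like v6r14p23, optionally without the pN part; both function names are made up. The highest tag is not necessarily the production version.

```shell
# Print the highest release tag from a list of tag names read on stdin.
# Assumption: release tags look like v6r14p23 or v6r14; anything else
# (pre-releases, peeled "^{}" entries) is ignored.
latest_release_tag() {
  grep -E '^v[0-9]+r[0-9]+(p[0-9]+)?$' | sort -V | tail -n 1
}

# List the tag names of a remote repository without cloning it.
remote_tags() {
  git ls-remote --tags "$1" | sed 's|.*refs/tags/||'
}
```

Usage: `remote_tags https://github.com/DIRACGrid/DIRAC.git | latest_release_tag`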

Installing the server on a cloud node

Following the steps here to some degree.
To find the current production version of Dirac, check the Dirac wiki. Sort out the firewall (talk to Simon ;-)

[root ~]# wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo -O /etc/yum.repos.d/EGI-trustanchors.repo
[root ~]# yum install ca-policy-egi-core fetch-crl
[root ~]# chkconfig fetch-crl-cron on
[root ~]# service fetch-crl-cron start
[root ~]# useradd -s /bin/bash -d /opt/dirac dirac
[root ~]# mkdir -p /opt/dirac/etc/grid-security/
[root ~]# cp /etc/grid-security/host*.pem /opt/dirac/etc/grid-security
[root ~]# chown -R dirac:dirac /opt/dirac
[root ~]# su - dirac
[dirac ~]$ ln -s /etc/grid-security/certificates  /opt/dirac/etc/grid-security/certificates
[dirac ~]$ ln -s /etc/grid-security/vomsdir  /opt/dirac/etc/grid-security/
[dirac ~]$ git clone https://github.com/DIRACGrid/DIRAC.git
[dirac ~]$ cd DIRAC
[dirac ~]$ git checkout v6r14p23 # Pick a tag or a branch
[dirac ~]$ cd ..
[dirac ~]$ git clone https://github.com/DIRACGrid/WebAppDIRAC.git
[dirac ~]$ cd WebAppDIRAC
[dirac ~]$ git checkout v1r6p26 # Pick a tag or a branch (right now we pick the master branch, i.e. don't do anything here)
[dirac ~]$ cd ..
[dirac ~]$ git clone https://github.com/ic-hep/GridPPDIRAC.git
[dirac ~]$ DIRAC/Core/scripts/dirac-deploy-scripts.py

[root ~]# yum -y install mysql-server
[root ~]# chkconfig mysqld on
[root ~]# service mysqld start
[root ~]# mysql_secure_installation

[root ~]# yum install java-1.8.0-openjdk
[root ~]# yum install xmlsec1-openssl glibmm24 db4-cxx c-ares log4cpp boost-date-time

# Edit install.cfg for what you want
[dirac ~]$ scripts/dirac-install -X install.cfg  # (stolen from elsewhere)
[dirac ~]$ source bashrc
[dirac ~]$ dirac-proxy-init -x  # needs my own usercert in /opt/dirac/.globus
[dirac ~]$ scripts/dirac-configure -F install.cfg
[dirac ~]$ dirac-proxy-init -g dirac_admin
[dirac ~]$ scripts/dirac-setup-site

[dirac@diracdev WebAppDIRAC]$ ./dirac-postInstall.py

To deal with stuff that didn't install the first time around:
[dirac ~]$ dirac-admin-sysadmin-cli --host diracdev.grid.hep.ph.ic.ac.uk
[diracdev.grid.hep.ph.ic.ac.uk]> install agent Configuration UsersAndGroupsAgent

That completes the initial preparation; the next steps actually install the server components. To get you started, you can find a sanitized version of full.cfg here. Please note that this is not the final version: we are still working on it, so bits of it might just be plain wrong.

# This step just grabs the config file
[dirac DIRAC]$ wget www.hep.ph.ic.ac.uk/~dbauer/dirac/full_sanitized.cfg
[dirac DIRAC]$ mv full_sanitized.cfg full.cfg
[dirac DIRAC]$ chmod +x install_site.sh

STOP! Check your full.cfg. If there are mistakes, you are best off removing the contents of the target /opt/dirac folder and starting again (remember to leave /opt/dirac/etc/grid-security alone).

If the install step below fails, the best remedy is to remove all evidence of the previous install attempt from /opt/dirac and restart the machine, to make sure all processes from the (now deleted) old installation have been killed.
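
The clean-up described above could look roughly like this in bash. The function name clean_install_dir is made up for illustration. It keeps etc/grid-security and, as an assumption on my part, any top-level dotfiles such as .globus. Try it on a scratch directory before pointing it at /opt/dirac.

```shell
# Remove the contents of a DIRAC install directory, keeping
# etc/grid-security and top-level dotfiles (e.g. .globus).
# Refuses to run if etc/grid-security is missing, as a guard against typos.
clean_install_dir() {
  local root=$1
  if [ ! -d "$root/etc/grid-security" ]; then
    echo "no etc/grid-security under '$root', refusing to clean" >&2
    return 1
  fi
  # Top level: delete everything except etc and dotfiles.
  find "$root" -mindepth 1 -maxdepth 1 ! -name etc ! -name '.*' \
    -exec rm -rf {} +
  # Inside etc: delete everything except grid-security.
  find "$root/etc" -mindepth 1 -maxdepth 1 ! -name grid-security \
    -exec rm -rf {} +
}
```

Usage: `clean_install_dir /opt/dirac`, followed by the reboot mentioned above.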

# This step takes quite a while (~10 minutes)
[dirac DIRAC]$ ./install_site.sh full.cfg
# Eventually it fails with a python error ending with: Requirement.parse('WebOb>=1.2'))
# Edit /opt/dirac/versions/v6r11p8_*/Linux_x86_64_glibc-2.12/lib/python2.6/site-packages/WebTest-2.0.14-py2.6.egg/EGG-INFO/requires.txt to erase the WebOb line.
# Then start it again...
[dirac DIRAC]$ ./install_site.sh full.cfg
# This should eventually finish and print a list of component statuses.
# You now have to edit the above requires.txt _again_ or it won't start properly in the future. (*)

Now open the web interface and check that it appears to work.

These instructions are for the *old* web interface only and are not relevant to the cloud install.

(*) If you do lots of installs in a row, i.e. dirac-install is unlikely to change while you are doing this, you can edit dirac-install by inserting the following lines at line 948:

  # Tidy up here...
  target_file = "%s/Linux_x86_64_glibc-2.12/lib/python2.6/site-packages/WebTest-2.0.14-py2.6.egg/EGG-INFO/requires.txt" % cliParams.targetPath
  if os.path.exists(target_file):
    sedCmd = "sed -i -e 's/^WebOb/#WebOb/' %s" % target_file
    os.system( sedCmd )

and then comment out the line in install_site.sh where it downloads dirac-install. (End of old instructions...)
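
If you would rather not patch dirac-install, the same edit to requires.txt can be made from the shell after each run. A minimal sketch, assuming GNU sed; the function name is made up. It comments the WebOb line out, using the same sed expression as the snippet above, rather than erasing it.

```shell
# Comment out the WebOb requirement in a requires.txt file (GNU sed assumed).
# Safe to run repeatedly: a line already commented out no longer matches.
disable_webob_requirement() {
  local requires=$1
  [ -f "$requires" ] || return 0   # nothing to do if the file is absent
  sed -i -e 's/^WebOb/#WebOb/' "$requires"
}
```

Call it with the full path to the requires.txt file named in the install steps above.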

Adding a new (admin) user

As the admin user specified in the config file (note: needs user*.pem in ~/.globus) do:

[dirac DIRAC]$ source /opt/dirac/bashrc
[dirac DIRAC]$ dirac-proxy-init -g dirac_admin
[dirac DIRAC]$ dirac-admin-add-user -N newusername -D "/C=UK/O=eScience/OU=Imperial/L=Physics/CN=new user DN" -M "user@maildomain.ac.uk" -G dirac_admin

Adding a VO

We keep the VO config separately so that we can just merge in a new VO as needed. Unfortunately if you do this, the plugin which retrieves the usernames will only work with one of them. Note: You can either issue these commands directly on the dirac node, or from the comfort of your dirac UI installed elsewhere.
Here is an example for a dteam.cfg and resources.cfg. Minor modifications will be required. The configuration is split into two files to make it easier to edit; if you prefer, you can have one config file to rule them all.

[dirac DIRAC]$ source /opt/dirac/bashrc
[dirac DIRAC]$ dirac-proxy-init -g dirac_admin

[dirac DIRAC]$ dirac-configuration-cli
(dips://...)-Connected> mergeFromFile dteam.cfg
(dips://...)-Connected> mergeFromFile resources.cfg
(dips://...)-Connected> writeToServer
Data sent to server.
(dips://...)-Connected> quit

# Now create a SiteDirector instance for the VO. It seems that you have to run this command twice (**),
# or maybe I was too impatient after updating the config file, so if you get an error, wait two minutes and try again.
[dirac DIRAC]$ dirac-admin-sysadmin-cli --host dwms00.grid.hep.ph.ic.ac.uk
[dirachost]> install agent WorkloadManagement SiteDirectorDteam -m SiteDirector
[dirachost]> quit

[dirac DIRAC]$ dirac-admin-sysadmin-cli --host dwms00.grid.hep.ph.ic.ac.uk
[dirachost]> restart *

# Enable the site
dirac-admin-allow-site LCG.UKI-LT2-IC-HEP.uk "Go"

# Have a look in /opt/dirac/startup/WorkloadManagement_SiteDirectorDteam/log/current for errors(*).

(*) To increase the debug level, on the web interface go to "System" -> "Configuration" -> "ManageRemoteConfiguration" -> "Systems" -> "WorkLoadManagement" -> "Production" -> "Agents". Right click on "SiteDirectorDteam" and select "Create an Option". Name is "LogLevel", value is "DEBUG". Commit the change. Restart the SiteDirectorDteam from "Systems" -> "SystemAdministration" -> Hostname tab -> Select Component and Restart.
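
For a quick look at the SiteDirector log mentioned above, a small helper like this may do. It is a sketch: the function name is made up, and it simply assumes that problem lines contain the word "error" in some capitalisation.

```shell
# Print the last few log lines that mention an error (case-insensitive).
# Arguments: log file, optional number of lines to show (default 20).
recent_errors() {
  local log=$1 count=${2:-20}
  grep -i 'error' "$log" | tail -n "$count"
}
```

Usage: `recent_errors /opt/dirac/startup/WorkloadManagement_SiteDirectorDteam/log/current`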

(**) As this snippet shows:

[diractest2.cern.ch]> install agent WorkloadManagement SiteDirectorDteam -m SiteDirector
[ERROR] Software for agent WorkloadManagement/SiteDirectorDteam is not installed
[diractest2.cern.ch]> install agent WorkloadManagement SiteDirectorDteam -m SiteDirector
agent WorkloadManagement_SiteDirectorDteam is installed, runit status: Run
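
The "wait and try again" advice is a plain retry loop. Here is a generic bash sketch; the function name and its arguments are my own. The sysadmin CLI is used interactively above, so the helper only becomes useful if you wrap the install step in a script or function of your own.

```shell
# Run a command up to max_tries times, sleeping delay seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  local max_tries=$1 delay=$2 attempt=1
  shift 2
  while true; do
    "$@" && return 0
    [ "$attempt" -ge "$max_tries" ] && return 1
    attempt=$(( attempt + 1 ))
    sleep "$delay"
  done
}
```

For example, `retry 2 120 install_sitedirector` would try twice, two minutes apart, where install_sitedirector is a hypothetical wrapper you would have to write.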

Now we have to upload a pilot proxy for the VO we want to use (here: comet.j-parc.jp)

[dirac DIRAC]$  dirac-proxy-init -C /etc/grid-security/pilotcert.pem -K /etc/grid-security/pilotkey.pem -g comet.j-parc.jp_pilot -P

File management
If you are trying to add a file (from a dirac UI) to your newly configured disk, e.g.:

dirac-dms-add-file -ddd /dteam/user/d/dbauer/testfile_imp1.txt testfile.txt IMPERIAL-disk

you might see the error: ERROR: FileCatalog._getCatalogConfigDetails: Failed to get catalog options. FileCatalog
This means you need to add a FileCatalog in the resources section (dirac-admin-sysadmin-cli) if you haven't done so already.

Installing VMDirac

We are roughly following these steps: VMDIRAC Wiki

The VMDIRAC code was already included in our initial dirac-install config, so the base code is already present.

Enable the module:

$ dirac-configuration-shell
[StealthConfig:/ ]% cd DIRAC
[StealthConfig:/ ]% set Extensions VM
[StealthConfig:/ ]% commit
# Now restart DIRAC.

Next install the extra python modules which are needed (any missing modules will trigger a traceback in the Framework_SystemAdministrator log and a mysterious "Software not installed" message.)

$ pip install apache-libcloud
$ pip install paramiko
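
To see in advance which modules are still missing, rather than waiting for the traceback, a check along these lines may help. It is a sketch: the function name and the PYTHON variable are made up, and it assumes the python on your PATH is the one DIRAC uses (e.g. after sourcing /opt/dirac/bashrc). Note that the apache-libcloud package is imported as libcloud.

```shell
# Print the names of Python modules that the chosen interpreter cannot import.
# PYTHON selects the interpreter; it defaults to "python" from PATH.
missing_modules() {
  local module
  for module in "$@"; do
    "${PYTHON:-python}" -c "import $module" >/dev/null 2>&1 || echo "$module"
  done
}
```

Usage: `missing_modules libcloud paramiko` prints nothing when both modules can be imported.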

Now the Dirac modules can actually be installed:

$ dirac-admin-sysadmin-cli --host dwms00.grid.hep.ph.ic.ac.uk
[dirachost]> install db VirtualMachineDB
[dirachost]> install service WorkloadManagement VirtualMachineManager
[dirachost]> install agent WorkloadManagement VirtualMachineScheduler
[dirachost]> install agent WorkloadManagement VirtualMachineContextualization

Enable your new site:

dirac-admin-allow-site CLOUD.gridpp.ac.uk "Go"

Restarting Dirac

  • Use the web interface
  • Try our handy stop and start scripts, run as the dirac user (at your own risk, no guarantees!)

Testing your dirac server

Is it working? Hints on how to find out can be found here.

Random notes

The Dirac version is set in /opt/dirac/DIRAC/__init__.py.
Available versions (esp. for the dirac UI): [1]

Reinstalling diracdev

  • in /opt/dirac: rm -rf bashrc control cshrc data DIRAC mysql runit sbin scripts startup storage storageElement
  • Not sure about Linux_x86_64_glibc-2.12, I deleted and reinstalled it afterwards
  • mysql: remove /var/lib/mysql/* and /etc/my.cnf
  • do not remove /etc/grid-security
  • dirac_admin: in DevelConfig.cfg, add the user to dirac_admin by hand (the initial install assigns the admin DN to user 'dirac' and then 'dirac' to dirac_admin, but gets confused if the admin DN is then also registered as a user)

How to make a pull request against the DIRAC repo

I made a fork of the main DIRAC repo a long time ago.

git clone git@github.com:marianne013/DIRAC.git

To be able to update my fork, I need to set up a remote pointing to the main DIRAC repo. (Only do this once.)

git remote add upstream git@github.com:DIRACGrid/DIRAC.git

Update from the main repo:

git fetch upstream

See what branches there are:

git branch -r

Now I want to make a branch off the v6r21 release branch (rel-v6r21). It's going to be called 'magicRunes'. Note the 'upstream'.

git checkout -b magicRunes upstream/rel-v6r21

Do some changes.

git status
git add whatever_i_changed
git commit -m "some matching message"
git push -u origin magicRunes

Now go to the GitHub web page and create a pull request. Double check that you are making the pull request against the correct base branch, as it defaults to integration and you probably don't want this.
