Imperial Dirac server

From GridPP Wiki
Revision as of 15:42, 19 November 2014 by Rcurrie 83080d9ccc (Talk | contribs)

Prerequisites

  • A host machine with SL6.
  • >3GB free in /opt.
  • A host certificate.
  • Possibly an EMI2 UI (or EMI3, but with voms-clients-2 (*)); we haven't quite worked out which bits we need.
  • Ports 8080, 8443 & 9130-9200 TCP open on any firewalls.
  • No mysql or mysql-libs package on the machine (/etc/my.cnf conflicts with dirac settings).
  • A link to the documentation.
  • Sign up with the diracgrid-forum.
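
Some of the prerequisites above can be checked mechanically before starting. The following is a minimal sketch, not part of DIRAC: the helper name preflight_problems, the default certificate path /etc/grid-security/hostcert.pem and the reading of "3GB" as 3 GiB are assumptions made for illustration. It only uses calls that exist in both Python 2.6 and Python 3.

```python
import os

GIB = 1024.0 ** 3


def preflight_problems(opt_dir="/opt",
                       my_cnf="/etc/my.cnf",
                       host_cert="/etc/grid-security/hostcert.pem",
                       min_free_bytes=3 * GIB):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # More than 3GB should be free in /opt.
    stats = os.statvfs(opt_dir)
    free = stats.f_bavail * stats.f_frsize
    if free <= min_free_bytes:
        problems.append("only %.1f GiB free in %s" % (free / GIB, opt_dir))

    # An existing /etc/my.cnf usually means mysql or mysql-libs is installed,
    # which conflicts with the dirac settings.
    if os.path.exists(my_cnf):
        problems.append("%s exists; remove mysql/mysql-libs first" % my_cnf)

    # A host certificate is required (the default path is an assumption).
    if not os.path.exists(host_cert):
        problems.append("no host certificate at %s" % host_cert)

    return problems
```

Calling preflight_problems() with no arguments checks the default locations; anything it returns should be fixed before running the installer. It does not check the firewall ports or the UI packages.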

(*) If you see errors like "Errors in the job submission: Cannot append voms extension" in /opt/dirac/startup/WorkloadManagement_SiteDirectorDteam/log/current, you forgot to downgrade voms.

Now you need to work out which version to install; this can be found in the tags list on GitHub.

Installing the server

We follow the steps here, to some degree.

[root ~]# wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo -O /etc/yum.repos.d/EGI-trustanchors.repo
[root ~]# yum install ca-policy-egi-core
[root ~]# useradd -s /bin/bash -d /home/dirac dirac
[root ~]# mkdir -p /opt/dirac/etc/grid-security/
[root ~]# cp /etc/grid-security/host*.pem /opt/dirac/etc/grid-security
[root ~]# chown -R dirac:dirac /opt/dirac
[root ~]# su - dirac
[dirac ~]$ ln -s /etc/grid-security/certificates  /opt/dirac/etc/grid-security/certificates
[dirac ~]$ ln -s /etc/grid-security/vomsdir  /opt/dirac/etc/grid-security/
[dirac ~]$ mkdir ~/DIRAC
[dirac ~]$ cd ~/DIRAC
[dirac DIRAC]$ wget https://github.com/DIRACGrid/DIRAC/raw/integration/Core/scripts/install_site.sh

That completes the initial preparation; the next steps actually install the server components. To get you started, you can find a sanitized version of full.cfg here. Please note that this is not the final version: we are still working on it, so bits of it might just be plain wrong.

# This step just grabs the config file
[dirac DIRAC]$ wget www.hep.ph.ic.ac.uk/~dbauer/dirac/full_sanitized.cfg
[dirac DIRAC]$ mv full_sanitized.cfg full.cfg
[dirac DIRAC]$ chmod +x install_site.sh

STOP! Check your full.cfg. If there are mistakes, you are best off removing the contents of the target /opt/dirac folder and starting again (remember to leave /opt/dirac/etc alone).
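
One mistake that is easy to miss in a hand-edited full.cfg is an unbalanced brace. The sketch below is a crude check only, under the assumption that the file uses nested "Section { ... }" blocks with "#" starting a comment. It is not DIRAC's own parser, find_brace_errors is a hypothetical helper name, and it says nothing about whether the option values themselves are right.

```python
def find_brace_errors(text):
    """Return a list of (line_number, message) for unbalanced braces."""
    errors = []
    open_lines = []  # line numbers of '{' that have not been closed yet
    for number, line in enumerate(text.splitlines(), 1):
        code = line.split("#", 1)[0]  # assumption: '#' starts a comment
        for char in code:
            if char == "{":
                open_lines.append(number)
            elif char == "}":
                if open_lines:
                    open_lines.pop()
                else:
                    errors.append((number, "'}' without matching '{'"))
    for number in open_lines:
        errors.append((number, "'{' never closed"))
    return errors
```

To use it, read full.cfg into a string and pass it in; an empty list means the braces balance, otherwise each entry gives the line to look at.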

If the installation fails, the best remedy is to remove all evidence of the previous dirac install attempt in /opt/dirac and restart the machine, to make sure all processes from the (now deleted) old installation have been killed.

# This step takes quite a while (~10 minutes)
[dirac DIRAC]$ ./install_site.sh full.cfg
# Eventually it fails with a python error ending with: Requirement.parse('WebOb>=1.2'))
# Edit /opt/dirac/versions/v6r11p8_*/Linux_x86_64_glibc-2.12/lib/python2.6/site-packages/WebTest-2.0.14-py2.6.egg/EGG-INFO/requires.txt to erase the WebOb line.
# Then start it again...
[dirac DIRAC]$ ./install_site.sh full.cfg
# This should eventually finish and print a list of component statuses.
# You now have to edit the above requires.txt _again_ or it won't start properly in the future. (*)

Now open the web interface and check that it appears to work; here is ours.
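
If the web interface does not load, it can help to check from a client machine whether the ports listed in the prerequisites are reachable at all. A minimal sketch using only the Python standard library; port_is_open is a hypothetical helper name. The check only shows that something accepts TCP connections on the port, not that DIRAC behind it is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, filtered by a firewall, unresolvable host name, or timed out.
        return False
    connection.close()
    return True
```

For example, port_is_open("dwms00.grid.hep.ph.ic.ac.uk", 8443) tests one of the ports from the prerequisites against the host name used on this page; substitute your own server.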

(*) If you do lots of installs in a row (i.e. dirac-install is unlikely to change while you are doing this), you can edit dirac-install by inserting the following lines at line 948:

  # Tidy up here...
  target_file = "%s/Linux_x86_64_glibc-2.12/lib/python2.6/site-packages/WebTest-2.0.14-py2.6.egg/EGG-INFO/requires.txt" % cliParams.targetPath
  if os.path.exists(target_file):
    sedCmd = "sed -i -e 's/^WebOb/#WebOb/' %s" % target_file
    os.system( sedCmd )

and then comment out the line in install_site.sh where it downloads dirac-install.
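
For a one-off install, the manual edit of requires.txt can also be scripted. The following sketch does in plain Python what the sed command above does, i.e. it comments out the WebOb line rather than erasing it. comment_out_webob is a hypothetical helper name, and the path passed in has to match the version you installed.

```python
def comment_out_webob(requires_path):
    """Prefix every line starting with 'WebOb' with '#'.

    Returns the number of lines changed; running it twice is harmless.
    """
    with open(requires_path) as handle:
        lines = handle.readlines()

    changed = 0
    for index, line in enumerate(lines):
        if line.startswith("WebOb"):
            lines[index] = "#" + line
            changed += 1

    # Only rewrite the file if something actually changed.
    if changed:
        with open(requires_path, "w") as handle:
            handle.writelines(lines)
    return changed
```

Call it with the requires.txt path from the install step. Since install_site.sh needs the edit both after the first failure and after the final run, the return value shows whether the file had been reset in between.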

Adding a new (admin) user

As the admin user specified in the config file (note: needs user*.pem in ~/.globus) do:

[dirac DIRAC]$ source /opt/dirac/bashrc
[dirac DIRAC]$ dirac-proxy-init -g dirac_admin
[dirac DIRAC]$ dirac-admin-add-user -N newusername -D "/C=UK/O=eScience/OU=Imperial/L=Physics/CN=new user DN" -M "user@maildomain.ac.uk" -G dirac_admin


Adding a VO

We keep the VO config separately so that we can just merge in a new VO as needed. Unfortunately, if you do this, the plugin which retrieves the usernames will only work with one of them. Note: you can issue these commands either directly on the dirac node or from the comfort of a dirac UI installed elsewhere.
Here is an example for a dteam.cfg and resources.cfg; minor modifications will be required. The configuration is split into two files to make it easier to edit; if you prefer, you can have one config file to rule them all.

[dirac DIRAC]$ source /opt/dirac/bashrc
[dirac DIRAC]$ dirac-proxy-init -g dirac_admin

[dirac DIRAC]$ dirac-configuration-cli
(dips://...)-Connected> mergeFromFile dteam.cfg
(dips://...)-Connected> mergeFromFile resources.cfg
(dips://...)-Connected> writeToServer
...
Data sent to server.
(dips://...)-Connected> quit

# Now create a SiteDirector instance for the VO. It seems that you have to run this command twice (**),
# or maybe I was too impatient after updating the config file. If you get an error, wait two minutes and try again.
[dirac DIRAC]$ dirac-admin-sysadmin-cli --host dwms00.grid.hep.ph.ic.ac.uk
[dirachost]> install agent WorkloadManagement SiteDirectorDteam -m SiteDirector
[dirachost]> quit

[dirac DIRAC]$ dirac-admin-sysadmin-cli --host dwms00.grid.hep.ph.ic.ac.uk
[dirachost]> restart *

# Enable the site
[dirac DIRAC]$ dirac-admin-allow-site LCG.UKI-LT2-IC-HEP.uk "Go"

# Have a look in /opt/dirac/startup/WorkloadManagement_SiteDirectorDteam/log/current for errors(*).

(*) To increase the debug level, on the web interface go to "System" -> "Configuration" -> "ManageRemoteConfiguration" -> "Systems" -> "WorkloadManagement" -> "Production" -> "Agents". Right click on "SiteDirectorDteam" and select "Create an Option". Name is "LogLevel", value is "DEBUG". Commit the change. Restart the SiteDirectorDteam from "Systems" -> "SystemAdministration" -> Hostname tab -> Select Component and Restart.
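
Reading the log by eye gets tedious when restarting repeatedly. Here is a small sketch that pulls out the lines worth a look. It simply matches the substring "ERROR", which is an assumption about how the log marks errors, plus the voms message quoted in the prerequisites; error_lines is a hypothetical helper name.

```python
VOMS_HINT = "Cannot append voms extension"


def error_lines(log_lines):
    """Return (errors, voms_problem) for an iterable of log lines.

    errors is the list of lines containing "ERROR" or the voms message;
    voms_problem is True if the voms message was seen at all.
    """
    errors = []
    voms_problem = False
    for line in log_lines:
        if VOMS_HINT in line:
            voms_problem = True
            errors.append(line.rstrip("\n"))
        elif "ERROR" in line:
            errors.append(line.rstrip("\n"))
    return errors, voms_problem
```

Pass it the open file object for /opt/dirac/startup/WorkloadManagement_SiteDirectorDteam/log/current. If voms_problem comes back True, the voms-clients downgrade mentioned in the prerequisites is the first thing to check.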

(**) As this snippet shows:

[diractest2.cern.ch]> install agent WorkloadManagement SiteDirectorDteam -m SiteDirector
[ERROR] Software for agent WorkloadManagement/SiteDirectorDteam is not installed
[diractest2.cern.ch]> install agent WorkloadManagement SiteDirectorDteam -m SiteDirector
agent WorkloadManagement_SiteDirectorDteam is installed, runit status: Run

Now we have to upload a pilot proxy for the VO we want to use:

[dirac DIRAC]$ dirac-proxy-init -P

File management

If you are trying to add a file (from a dirac UI) to your newly configured disk, e.g.:

dirac-dms-add-file -ddd /dteam/user/d/dbauer/testfile_imp1.txt testfile.txt IMPERIAL-disk

you might see the error: ERROR: FileCatalog._getCatalogConfigDetails: Failed to get catalog options. FileCatalog
This means you need to add a FileCatalog to the resources section (dirac-admin-sysadmin-cli) if you haven't done so already.

Installing VMDirac

We are roughly following these steps: VMDIRAC Wiki

The VMDIRAC code was already included in our initial dirac-install config, so the base code is already present.

Enable the module:

$ dirac-configuration-shell
[StealthConfig:/ ]% cd DIRAC
[StealthConfig:/ ]% set Extensions VM
[StealthConfig:/ ]% commit
# Now restart DIRAC.

Next, install the extra python modules which are needed. Any missing module will trigger a traceback in the Framework_SystemAdministrator log and a mysterious "Software not installed" message.

$ pip install apache-libcloud
$ pip install paramiko

Now the Dirac modules can actually be installed:

$ dirac-admin-sysadmin-cli --host dwms00.grid.hep.ph.ic.ac.uk
[dirachost]> install db VirtualMachineDB
[dirachost]> install service WorkloadManagement VirtualMachineManager
[dirachost]> install agent WorkloadManagement VirtualMachineScheduler
[dirachost]> install agent WorkloadManagement VirtualMachineContextualization

Enable your new site:

$ dirac-admin-allow-site CLOUD.gridpp.ac.uk "Go"

Restarting Dirac

  • Use the web interface.
  • Try our handy stop and start scripts, run as the dirac user (at your own risk, no guarantees!).

Testing your dirac server

Is it working? Hints on how to find out can be found here.

Things to test (for us, not for you ;-)

  • Tarball UI
  • Different Site naming
  • ARC-CE pilots
  • Automatic CE configuration



Return to Dirac overview page.