Imperial glideinwms
Setting up a glideinwms (preliminaries)
We are going for the all-in-one solution here, only "lightly tested" according to the developers.
Documentation:
The glideinwms webpages.
Andrew Lahiff's glideinwms setup page.
Setup
For cloud work, we need v3_2 or higher.
Preparations
The node needs a hostcert, plus an additional (host)cert for the frontend.
There are three distinct pieces of software:
a) condor (condor-8.0.2-x86_64_RedHat6-unstripped, which I got from htcondor; leave the tarball in /opt/tarballs)
b) javascript (javascriptrrd-0.6.4 from javascriptrrd, unpack the tarball in /opt)
c) glideinwms (from glideinWMS, unpack the tarball in install dir (here: /opt/raincloud) ):
tar -zxvf /opt/tarballs/glideinWMS_v3_2.tgz; chown -R root:root glideinwms
Install some missing packages:
wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo -O /etc/yum.repos.d/EGI-trustanchors.repo
wget http://www.mirrorservice.org/sites/dl.fedoraproject.org/pub/epel/6/x86_64/fetch-crl-3.0.11-1.el6.noarch.rpm
rpm -iv fetch-crl-3.0.11-1.el6.noarch.rpm
chkconfig fetch-crl-cron on
yum install ca-policy-egi-core
yum install m2crypto
yum install rrdtool-python
yum install httpd
groupadd raincloud
useradd -m -g raincloud raincloud
Make a copy of the hostcert and key that is owned by this user (raincloud).
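The copy step can be sketched as a small helper. This is only a sketch: the source directory /etc/grid-security is the usual location for host credentials but is an assumption here, as is putting the copies in the user's home directory, and `copy_host_credentials` is a made-up name.

```shell
#!/bin/bash
# copy_host_credentials SRC_DIR DEST_DIR [OWNER]
# Copy hostcert.pem and hostkey.pem and set the usual modes
# (certificate world-readable, key readable by the owner only).
copy_host_credentials() {
    local src="$1" dest="$2" owner="${3:-}"
    install -m 0644 "$src/hostcert.pem" "$dest/hostcert.pem" || return 1
    install -m 0600 "$src/hostkey.pem"  "$dest/hostkey.pem"  || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dest/hostcert.pem" "$dest/hostkey.pem"
    fi
}

# Example (as root); both paths are assumptions, adjust to your setup:
# copy_host_credentials /etc/grid-security /home/raincloud raincloud:raincloud
```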
Validate the ini file for all components
./manage-glideins --validate [component, e.g. wmscollector, in all lower case letters] --ini [inifile]
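To avoid typing the validate command once per component, a loop along these lines might help. It is a sketch that assumes the install and ini paths used elsewhere on this page; `validate_all` is a made-up helper name, and the component list is simply the five modules installed below.

```shell
#!/bin/bash
# Defaults are the paths used on this page; override them in the environment.
MANAGE_GLIDEINS="${MANAGE_GLIDEINS:-/opt/raincloud/glideinwms/install/manage-glideins}"
INI="${INI:-/opt/config/glideinWMS.ini-raincloud}"

# Validate the ini file for every component; keep going after a failure
# so that all problems are reported in one pass.
validate_all() {
    local component failed=0
    for component in wmscollector factory usercollector submit vofrontend; do
        echo "== validating ${component}"
        if ! "$MANAGE_GLIDEINS" --validate "$component" --ini "$INI"; then
            echo "validation failed for ${component}" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Usage: validate_all
```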
Configuration
The ini file
Note: You cannot have too many condors. Don't make your life difficult by skimping on condor instances!
The working configuration file: glideinWMS.ini-raincloud
(Second try) glideinWMS.ini-raincloud
(First try) glideinWMS.ini-raincloud
Note: I don't really need privilege separation, but support for the 'no' option might be dropped soon.
Create some subdirectories.
Some bits of the glideinwms are very touchy about directory ownership. Right now I have:
[raincloud@gwms00 raincloud]$ pwd
/opt/raincloud
[raincloud@gwms00 raincloud]$ ls -l
drwxr-xr-x. 4 root root 4096 Oct 23 10:52 factory_client_files
drwxr-xr-x. 12 root root 4096 Oct 22 13:33 glideinwms
drwxr-xr-x. 4 raincloud root 4096 Oct 23 11:14 gwms
glideinwms contains the unpacked glideinwms tarball and nothing else.
Configuration
The raw log of the first run through can be found here.
- WMSCollector
This has to be run as root to enable privilege separation. Answer 'y' to any questions the setup script throws at you.
If this is a reconfiguration you need to remove the following directories first:
/opt/raincloud/factory_client_files/clientlog/user_raincloud
and
/opt/raincloud/factory_client_files/clientproxies/user_raincloud
[root@gwms00 ~]# /opt/raincloud/glideinwms/install/manage-glideins --install wmscollector --ini /opt/config/glideinWMS.ini-raincloud
At the end you should see:
You will need to have the WMSCollector service running if you intend
to install the other glideinWMS components.
... would you like to start it now? (y/n): y
... running: /opt/raincloud/glideinwms/install/manage-glideins --start wmscollector --ini /opt/config/glideinWMS.ini-raincloud
... requested action completed
Note that the WMSCollector also writes a file to /etc/condor (whose idea was that ?).
You may want more than 20 VMs running at once...
echo "GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_EC2 = 1000" >> /opt/raincloud/gwms/condor-wms/config.d/03_gwms_local.config
source /opt/raincloud/gwms/condor-wms/condor.sh
condor_reconfig
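One caveat with the `echo ... >>` approach is that re-running it appends a second copy of the line. A sketch of an idempotent alternative is below; `set_condor_knob` is a made-up helper, it relies on GNU sed, and it assumes the value contains no `|` or `&` characters.

```shell
#!/bin/bash
# set_condor_knob FILE NAME VALUE
# Replace an existing "NAME = ..." line in FILE, or append one if absent,
# so that repeating the step does not pile up duplicate entries.
set_condor_knob() {
    local file="$1" name="$2" value="$3"
    touch "$file"
    if grep -q "^[[:space:]]*${name}[[:space:]]*=" "$file"; then
        sed -i "s|^[[:space:]]*${name}[[:space:]]*=.*|${name} = ${value}|" "$file"
    else
        echo "${name} = ${value}" >> "$file"
    fi
}

# Usage, followed by the same condor_reconfig as above:
# set_condor_knob /opt/raincloud/gwms/condor-wms/config.d/03_gwms_local.config \
#     GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE_EC2 1000
```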
- Factory
This needs to be done as the factory unix account (raincloud).
Also the following directory needs to exist and be owned by raincloud:
[root@gwms00 opt]# mkdir /var/www/html/cfactory
[root@gwms00 opt]# chown raincloud:raincloud /var/www/html/cfactory
[raincloud@gwms00 ~]# /opt/raincloud/glideinwms/install/manage-glideins --install factory --ini /opt/config/glideinWMS.ini-raincloud
[...]
Collecting configuration file data. It will be question/answer time.
Using /opt/raincloud/condor-wms/etc/condor_config
Do you want to fetch entries from RESS? (y/n): n
Do you want to add manual entries? (y/n): y
Please list all additional glidein entry points,
Entry name (leave empty when finished): Imperial_GridPP_1
Gatekeeper for 'Imperial_GridPP_1': http://gridppcl02.grid.hep.ph.ic.ac.uk:8773/services/Cloud
RSL for 'Imperial_GridPP_1':
Work dir for 'Imperial_GridPP_1': [.]
Site name for 'Imperial_GridPP_1': [Imperial_GridPP_1]
...
======== Factory install complete ==========
Do you want to create the glideins now? (y/n) [n]: n
At this point edit /opt/raincloud/gwms/factory/glidein_c7.cfg/glideinWMS.xml to insert the cloud configuration (remove the auto-generated part belonging to Imperial_GridPP_1.)
And change the CCB attributes to this:
<attr name="USE_CCB" value="True" const="True" type="string" glidein_publish="True" publish="True" job_publish="False" parameter="True"/>
Also add/replace the following tags if you want to enable gLExec:
<attr name="GLEXEC_JOB" const="True" glidein_publish="False" job_publish="False" parameter="True" publish="True" type="string" value="True"/>
<attr name="GLEXEC_BIN" const="True" glidein_publish="False" job_publish="False" parameter="True" publish="False" type="string" value="/usr/sbin/glexec"/>
If you want to use 1024-bit proxies rather than the HTCondor default of 512, now is also a good time to change it. You have to include a script that will change the setting on the WN as the glidein starts up. To do this, create a new file at /opt/raincloud/gwms/factory/glidein_c7.cfg/proxy_length.sh. Once you've done this you can add the following line into the outermost <file> section in the XML config file:
<file absfname="/opt/raincloud/gwms/factory/glidein_c7.cfg/proxy_length.sh" executable="True" comment="fix proxy length"/>
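Since these XML edits are done by hand, it may be worth checking them before creating the glideins. The helper below is a rough sketch using plain grep rather than an XML parser, so it assumes each `<attr .../>` tag sits on a single line; `attr_has_value` is a made-up name.

```shell
#!/bin/bash
# attr_has_value FILE NAME VALUE
# Succeeds if FILE contains an <attr ...> tag carrying both
# name="NAME" and value="VALUE". Each tag is extracted separately,
# so several tags sharing one line are handled correctly.
attr_has_value() {
    local file="$1" name="$2" value="$3"
    grep -o '<attr [^>]*>' "$file" \
        | grep -F "name=\"${name}\"" \
        | grep -qF "value=\"${value}\""
}

# Usage:
# XML=/opt/raincloud/gwms/factory/glidein_c7.cfg/glideinWMS.xml
# attr_has_value "$XML" USE_CCB True || echo "USE_CCB is not set to True" >&2
```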
Before starting the factory I need to make two directories and change their owner to raincloud:root. As far as I can tell this is the least invasive procedure... (Note that the validation/install will work without these directories, but you won't be able to create the glideins.)
(as root do)
mkdir /opt/raincloud/factory_client_files/clientlog/user_raincloud
chown raincloud:root /opt/raincloud/factory_client_files/clientlog/user_raincloud
mkdir /opt/raincloud/factory_client_files/clientproxies/user_raincloud
chown raincloud:root /opt/raincloud/factory_client_files/clientproxies/user_raincloud
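The same two directories have to be removed before a WMSCollector reconfiguration and recreated before creating the glideins, so a pair of helpers can save some typing. This is a sketch only: the function names are made up, and `mkdir -p` is used here instead of the plain `mkdir` above, which also creates missing parent directories.

```shell
#!/bin/bash
# Base directory as used on this page; override via the environment.
CLIENT_BASE="${CLIENT_BASE:-/opt/raincloud/factory_client_files}"

# remove_client_dirs USER: drop the per-user directories before a reinstall.
remove_client_dirs() {
    local user="$1" sub
    for sub in clientlog clientproxies; do
        rm -rf "${CLIENT_BASE:?}/${sub}/user_${user}"
    done
}

# make_client_dirs USER OWNER: create both directories and set their owner.
make_client_dirs() {
    local user="$1" owner="$2" sub dir
    for sub in clientlog clientproxies; do
        dir="${CLIENT_BASE}/${sub}/user_${user}"
        mkdir -p "$dir" && chown "$owner" "$dir" || return 1
    done
}

# Usage (as root):
# make_client_dirs raincloud raincloud:root
```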
Then create the glideins (as raincloud) and start the factory:
. /opt/raincloud/gwms/factory/factory.sh
/opt/raincloud/glideinwms/creation/create_glidein /opt/raincloud/gwms/factory/glidein_c7.cfg/glideinWMS.xml
/opt/raincloud/glideinwms/install/manage-glideins --start factory --ini /opt/config/glideinWMS.ini-raincloud
- Usercollector
(as raincloud - on installing as root see Usercollector_as_root)
[raincloud@gwms00 ~]# /opt/raincloud/glideinwms/install/manage-glideins --install usercollector --ini /opt/config/glideinWMS.ini-raincloud
Answer 'y' to any questions.
- Submit
Note: When running in combination with a CE, the Submit module is installed on the ARC CE, not on the glideinWMS node, so this bit can be ignored when installing the glideinWMS.
[root@gwms00 ~]# /opt/raincloud/glideinwms/install/manage-glideins --install submit --ini /opt/config/glideinWMS.ini-raincloud
Note: previously we tried to share a condor with the user collector. This required a) editing the condor_mapfile by hand and b) restoring 11_gwms_secondary_collectors.config from backup, as the user collector config gets overwritten by the submit install. Not recommended.
You may also want to stop this node sending out e-mail on every successful job. This is simple:
echo "MAIL = /bin/true" > /opt/condor-submit/config.d/99_gwms_nomail.conf
condor_reconfig
If you want to copy an X509 proxy to the WN (which you probably do), you should add the following to /opt/condor-submit/config.d/03_gwms_local.config (or any other file in this directory) and re-run condor_reconfig (thanks to Andrew L. for the simple recipe to do this!):
use_x509userproxy = True
SUBMIT_EXPRS = $(SUBMIT_EXPRS) use_x509userproxy
- VOFrontend
The frontend has its own hostcert and key with a different DN to the glideinWMS:
[raincloud@gwms00 ~]$ pwd
/home/raincloud
[raincloud@gwms00 ~]$ ls -l
-rw-r--r--. 1 raincloud raincloud 1814 Jun 11 17:18 frontend-cert.pem
-rw-------. 1 raincloud raincloud 1679 Jun 11 17:18 frontend-key.pem
-rw-------. 1 raincloud raincloud 3873 Jun 11 17:21 raincloud.proxy
(as root)
mkdir /var/www/html/cfrontend
chown raincloud:raincloud /var/www/html/cfrontend
(as raincloud)
/opt/raincloud/glideinwms/install/manage-glideins --install vofrontend --ini /opt/config/glideinWMS.ini-raincloud
Do you want to create the frontend now? (y/n) [n]: n
Then you should edit frontend.xml with the following: Frontend_xml_imperial_cloud
If you want gLExec enabled, replace the GLIDEIN_Glexec_Use attr with the following:
<attr name="GLIDEIN_Glexec_Use" glidein_publish="True" job_publish="True" parameter="False" type="string" value="OPTIONAL"/>
And after that create the frontend:
. /opt/raincloud/gwms/frontend/frontend.sh
/opt/raincloud/glideinwms/creation/create_frontend /opt/raincloud/gwms/frontend/instance_c7.cfg/frontend.xml
selinux
semanage port -a -t http_port_t -p tcp 8319
edit /etc/httpd/conf/httpd.conf to change "Listen 80" to "Listen 8319"
chkconfig httpd on; service httpd start
Open ports 8319 & 9618 in iptables.
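The httpd.conf edit and the firewall step can be sketched as below. Assumptions: the web server port is the 8319 set in httpd.conf above, the firewall is the stock RHEL6 iptables service, and both function names are made up. The firewall helper only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/bash
# set_listen_port CONF PORT
# Change "Listen 80" to "Listen PORT" in an httpd config file and
# confirm that the new directive is present afterwards.
set_listen_port() {
    local conf="$1" port="$2"
    sed -i "s/^Listen 80[[:space:]]*$/Listen ${port}/" "$conf"
    grep -q "^Listen ${port}$" "$conf"
}

# print_firewall_rules PORT...
# Print (do not execute) iptables commands that accept TCP on each port.
print_firewall_rules() {
    local port
    for port in "$@"; do
        echo "iptables -I INPUT -p tcp --dport ${port} -j ACCEPT"
    done
    echo "service iptables save"
}

# Usage (as root):
# set_listen_port /etc/httpd/conf/httpd.conf 8319
# print_firewall_rules 8319 9618
```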
Proxies
We run an hourly cron job that renews the frontend proxy.
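The cron job itself is not shown on this page, so the following is only a guess at its shape. It assumes the proxy is made with `grid-proxy-init` from the frontend certificate and key listed in the VOFrontend section, with an unencrypted key and a 24 hour lifetime; if you need VOMS attributes a different tool would be required. The script path in the crontab line is hypothetical.

```shell
#!/bin/bash
# File names follow the VOFrontend listing; PROXY_INIT is an assumption.
PROXY_INIT="${PROXY_INIT:-grid-proxy-init}"
CERT="${CERT:-/home/raincloud/frontend-cert.pem}"
KEY="${KEY:-/home/raincloud/frontend-key.pem}"
PROXY="${PROXY:-/home/raincloud/raincloud.proxy}"

renew_proxy() {
    local tmp="${PROXY}.new.$$"
    # Write to a temporary file first so a failed renewal keeps the old proxy.
    if "$PROXY_INIT" -cert "$CERT" -key "$KEY" -out "$tmp" -valid 24:00; then
        chmod 600 "$tmp" && mv -f "$tmp" "$PROXY"
    else
        rm -f "$tmp"
        echo "proxy renewal failed, keeping existing ${PROXY}" >&2
        return 1
    fi
}

# Hypothetical crontab entry for the raincloud user:
# 0 * * * * /home/raincloud/bin/renew_frontend_proxy.sh
```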
Checks, Starting and Stopping
Have a look:
glidein_c7
Submitting a job
Use an innocent user (e.g. 'cloud').
su - cloud
source /opt/raincloud/gwms/condor-submit/condor.sh
condor_submit test2.jdl
condor_q
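The contents of test2.jdl are not reproduced here. If you need something to start from, a minimal vanilla-universe submit description along these lines should do; the file contents and the `write_test_jdl` helper are illustrative assumptions, not the file used above.

```shell
#!/bin/bash
# write_test_jdl FILE
# Write a minimal HTCondor submit description that runs /bin/hostname once
# and brings stdout, stderr and the job log back to the submit directory.
write_test_jdl() {
    cat > "$1" <<'EOF'
universe   = vanilla
executable = /bin/hostname
output     = test.$(Cluster).$(Process).out
error      = test.$(Cluster).$(Process).err
log        = test.$(Cluster).log
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue 1
EOF
}

# Usage:
# write_test_jdl test2.jdl && condor_submit test2.jdl
```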
For extra debugging config see here.
List of relevant log files:
- /opt/raincloud/gwms/frontend/log/frontend_frontend_service-c7/group_main
main.info.log: ERROR: Runtime Error. Failed to talk to schedd: -> check if submit module on cetest02 is running
- /opt/raincloud/gwms/condor-user/condor_local/log/SchedLog
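When something goes wrong it can be quicker to pull the recent ERROR lines out of a whole log directory than to open each file. A small sketch, with a made-up helper name; it looks at files ending in `.log` or `Log` (which covers SchedLog) and simply greps for the string ERROR.

```shell
#!/bin/bash
# recent_errors DIR [N]
# Print the last N (default 20) lines containing ERROR from log files
# under DIR, each prefixed with the name of the file it came from.
recent_errors() {
    local dir="$1" n="${2:-20}"
    find "$dir" -type f \( -name '*.log' -o -name '*Log' \) \
        -exec grep -H 'ERROR' {} + 2>/dev/null | tail -n "$n"
}

# Usage:
# recent_errors /opt/raincloud/gwms/frontend/log
# recent_errors /opt/raincloud/gwms/condor-user/condor_local/log 50
```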
Reconfiguring Things
Now you've got it all working, no doubt you want to change things like the image AMI number without going through the trauma of re-installing everything. You can do this with the sections below.
Reconfiguring the factory
As raincloud:
- /opt/raincloud/glideinwms/install/manage-glideins --stop factory --ini /opt/config/glideinWMS.ini-raincloud
- cd /opt/raincloud/gwms/factory
- source factory.sh
- /opt/raincloud/glideinwms/creation/reconfig_glidein -xml glidein_c7.cfg/glideinWMS.xml
- (Ignore the error about the monitor directory not existing)
- /opt/raincloud/glideinwms/install/manage-glideins --start factory --ini /opt/config/glideinWMS.ini-raincloud
Reconfiguring the frontend
As raincloud:
- /opt/raincloud/glideinwms/install/manage-glideins --stop vofrontend --ini /opt/config/glideinWMS.ini-raincloud
- cd /opt/raincloud/gwms/frontend
- source frontend.sh
- /opt/raincloud/glideinwms/creation/reconfig_frontend instance_c7.cfg/frontend.xml
- /opt/raincloud/glideinwms/install/manage-glideins --start vofrontend --ini /opt/config/glideinWMS.ini-raincloud
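Both reconfiguration recipes follow the same stop, reconfigure, start pattern, so they can be wrapped in one function. This is a sketch using the paths from this page; `reconfig_component` is a made-up name. Because the factory reconfig reports an error about the monitor directory that can be ignored, the service is started again regardless and the status of the reconfig step is returned for you to judge.

```shell
#!/bin/bash
# Paths as used on this page; override via the environment if needed.
GWMS_SRC="${GWMS_SRC:-/opt/raincloud/glideinwms}"
GWMS_HOME="${GWMS_HOME:-/opt/raincloud/gwms}"
INI="${INI:-/opt/config/glideinWMS.ini-raincloud}"

# reconfig_component factory|vofrontend   (run as raincloud)
reconfig_component() {
    local component="$1" workdir envfile tool args rc
    case "$component" in
        factory)
            workdir="$GWMS_HOME/factory"; envfile=factory.sh
            tool=reconfig_glidein; args="-xml glidein_c7.cfg/glideinWMS.xml" ;;
        vofrontend)
            workdir="$GWMS_HOME/frontend"; envfile=frontend.sh
            tool=reconfig_frontend; args="instance_c7.cfg/frontend.xml" ;;
        *)
            echo "usage: reconfig_component factory|vofrontend" >&2
            return 2 ;;
    esac
    "$GWMS_SRC/install/manage-glideins" --stop "$component" --ini "$INI" || return 1
    # Subshell: the cd and the sourced environment do not leak into the caller.
    # $args is left unquoted on purpose so it splits into separate words.
    ( cd "$workdir" && . "./$envfile" && "$GWMS_SRC/creation/$tool" $args )
    rc=$?
    "$GWMS_SRC/install/manage-glideins" --start "$component" --ini "$INI" || return 1
    return "$rc"
}

# Usage:
# reconfig_component factory
# reconfig_component vofrontend
```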