ARC HTCondor Basic Install
Revision as of 20:16, 8 May 2014
This page explains how to set up a minimal ARC CE and HTCondor pool. To keep things as simple as possible, the CE, the HTCondor central manager and the worker node are all set up on a single machine.
Prerequisites
Prepare an SL6 VM with a host certificate.
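Before going further it can help to confirm that the host certificate and key are in place. The following is a minimal sketch, not part of the original procedure: check_host_credentials is a hypothetical helper, the default directory /etc/grid-security is taken from the arc.conf example on this page, and the assumption that the key should not be accessible by group or others reflects common grid practice rather than anything stated here.

```shell
# Hypothetical helper: check that hostcert.pem and hostkey.pem exist in the
# given directory (default /etc/grid-security) and that the key is private.
check_host_credentials() {
    local dir="${1:-/etc/grid-security}" status=0 perms f
    for f in hostcert.pem hostkey.pem; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            status=1
        fi
    done
    if [ -e "$dir/hostkey.pem" ]; then
        # Characters 5-10 of the mode string are the group and other bits.
        perms=$(ls -lLd "$dir/hostkey.pem" | cut -c5-10)
        if [ "$perms" != "------" ]; then
            echo "hostkey.pem is accessible by group or others"
            status=1
        fi
    fi
    return "$status"
}
```

Calling check_host_credentials with no argument checks /etc/grid-security; it prints one line per problem found and returns non-zero if there were any.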
ARC CE installation
YUM repository configuration for EPEL and NorduGrid:
rpm -Uvh https://anorien.csc.warwick.ac.uk/mirrors/epel/6/x86_64/epel-release-6-8.noarch.rpm
rpm -Uvh http://download.nordugrid.org/packages/nordugrid-release/releases/13.11/centos/el6/x86_64/nordugrid-release-13.11-1.el6.noarch.rpm
Install the ARC CE meta-package:
yum install nordugrid-arc-compute-element
HTCondor installation
Set up the YUM repository:
cd /etc/yum.repos.d/
wget http://research.cs.wisc.edu/htcondor/yum/repo.d/htcondor-stable-rhel6.repo
Install the most recent stable version of HTCondor:
yum install condor
HTCondor configuration
Configure HTCondor to use partitionable slots. Create a file /etc/condor/config.d/00slots.config containing the following:
NUM_SLOTS = 1
SLOT_TYPE_1 = cpus=100%,mem=100%,auto
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1_PARTITIONABLE = TRUE
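If you prefer to script this step, the file can be written with a here-document. This is a minimal sketch: write_slot_config is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory before touching /etc/condor/config.d.

```shell
# Hypothetical helper: write the partitionable-slot settings shown above
# into 00slots.config in the given directory (default /etc/condor/config.d).
write_slot_config() {
    local config_dir="${1:-/etc/condor/config.d}"
    mkdir -p "$config_dir" || return 1
    cat > "$config_dir/00slots.config" <<'EOF'
NUM_SLOTS = 1
SLOT_TYPE_1 = cpus=100%,mem=100%,auto
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1_PARTITIONABLE = TRUE
EOF
}
```

Running write_slot_config as root with no argument writes /etc/condor/config.d/00slots.config, overwriting any existing file of that name.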
Start HTCondor by running:
service condor start
Check that HTCondor is working correctly:
[root@lcgvm21 ~]# condor_status -any
MyType          TargetType   Name
Collector       None         Personal Condor at lcgvm21.gridpp.rl.ac.u
Scheduler       None         lcgvm21.gridpp.rl.ac.uk
DaemonMaster    None         lcgvm21.gridpp.rl.ac.uk
Negotiator      None         lcgvm21.gridpp.rl.ac.uk
Machine         Job          slot1@lcgvm21.gridpp.rl.ac.uk
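The same check can be automated by looking for each expected ad type in the first column of the output. This is a rough sketch under the assumption that the five types listed above (Collector, Scheduler, DaemonMaster, Negotiator, Machine) are exactly the ones you expect on a single-machine pool; check_condor_ads is a hypothetical helper that reads the condor_status output from standard input.

```shell
# Hypothetical helper: read `condor_status -any` output on stdin and report
# any expected ad type that does not appear in the first column.
check_condor_ads() {
    local output ad missing=0
    output=$(cat)
    for ad in Collector Scheduler DaemonMaster Negotiator Machine; do
        if ! printf '%s\n' "$output" |
            awk -v t="$ad" '$1 == t { found = 1 } END { exit !found }'; then
            echo "missing ad type: $ad"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical invocation would be condor_status -any | check_condor_ads, which prints nothing and returns zero when all five ad types are present.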
ARC CE configuration
Create the required control and session directories:
mkdir -p /var/spool/arc/jobstatus
mkdir -p /var/spool/arc/grid
Create a simple grid-mapfile for testing, for example /etc/grid-security/grid-mapfile containing:
"/C=UK/O=eScience/OU=CLRC/L=RAL/CN=andrew lahiff" pcms001
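Each grid-mapfile line pairs a quoted certificate DN with a local account name, as in the entry above. If you need to add several test users, a small helper can append entries without creating duplicates. This is only a sketch: add_gridmap_entry is a hypothetical name, and the account passed to it is assumed to exist already on the machine.

```shell
# Hypothetical helper: append a '"DN" account' line to a grid-mapfile
# (default /etc/grid-security/grid-mapfile) unless it is already present.
add_gridmap_entry() {
    local dn="$1" account="$2" gridmap="${3:-/etc/grid-security/grid-mapfile}"
    local entry="\"$dn\" $account"
    touch "$gridmap" || return 1
    # -x matches the whole line, -F treats the entry as a fixed string.
    grep -qxF -- "$entry" "$gridmap" || printf '%s\n' "$entry" >> "$gridmap"
}
```

For example, add_gridmap_entry "/C=UK/O=eScience/OU=CLRC/L=RAL/CN=andrew lahiff" pcms001 reproduces the entry shown above.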
Create a minimal /etc/arc.conf, for example:
[common]
x509_user_key="/etc/grid-security/hostkey.pem"
x509_user_cert="/etc/grid-security/hostcert.pem"
x509_cert_dir="/etc/grid-security/certificates"
gridmap="/etc/grid-security/grid-mapfile"
lrms="condor"

[grid-manager]
user="root"
controldir="/var/spool/arc/jobstatus"
sessiondir="/var/spool/arc/grid"
runtimedir="/etc/arc/runtime"
logfile="/var/log/arc/grid-manager.log"
pidfile="/var/run/grid-manager.pid"
joblog="/var/log/arc/gm-jobs.log"
shared_filesystem="no"

[gridftpd]
user="root"
logfile="/var/log/arc/gridftpd.log"
pidfile="/var/run/gridftpd.pid"
port="2811"
allowunknown="no"

[gridftpd/jobs]
path="/jobs"
plugin="jobplugin.so"
allownew="yes"

[infosys]
user="root"
overwrite_config="yes"
port="2135"
registrationlog="/var/log/arc/inforegistration.log"
providerlog="/var/log/arc/infoprovider.log"

[cluster]
cluster_alias="MINIMAL Computing Element"
comment="This is a minimal out-of-box CE setup"
homogeneity="True"
architecture="adotf"
nodeaccess="outbound"
authorizedvo="cms"

[queue/grid]
name="grid"
homogeneity="True"
comment="Default queue"
nodecpu="adotf"
architecture="adotf"
defaultmemory="1000"
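A quick sanity check before starting the services is to confirm that every section used in the example above is present in the file. This is a minimal sketch, not ARC's own validation: check_arc_conf_sections is a hypothetical helper, it only looks for the seven section headers from this page, and it expects each header alone on its line with no surrounding whitespace.

```shell
# Hypothetical helper: report any of the section headers used in this page's
# example that are missing from the given file (default /etc/arc.conf).
check_arc_conf_sections() {
    local conf="${1:-/etc/arc.conf}" section missing=0
    for section in common grid-manager gridftpd gridftpd/jobs \
                   infosys cluster queue/grid; do
        # -x matches the whole line, -F treats the brackets literally.
        if ! grep -qxF "[$section]" "$conf"; then
            echo "missing section: [$section]"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_arc_conf_sections with no argument checks /etc/arc.conf and returns non-zero if any section is missing.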
Start the GridFTP server, A-REX service and LDAP information system:
service gridftpd start
service a-rex start
service nordugrid-arc-ldap-infosys start
Testing
From a standard UI, check the status of the newly-installed ARC CE:
-bash-4.1$ arcinfo -c lcgvm21.gridpp.rl.ac.uk
Computing service: MINIMAL Computing Element (production)
  Information endpoint: ldap://lcgvm21.gridpp.rl.ac.uk:2135/Mds-Vo-Name=local,o=grid
  Information endpoint: ldap://lcgvm21.gridpp.rl.ac.uk:2135/o=glue
  Submission endpoint: gsiftp://lcgvm21.gridpp.rl.ac.uk:2811/jobs (status: ok, interface: org.nordugrid.gridftpjob)
Try submitting a test job:
-bash-4.1$ arctest -c lcgvm21.gridpp.rl.ac.uk -J 1
Test submitted with jobid: gsiftp://lcgvm21.gridpp.rl.ac.uk:2811/jobs/Um0NDmEkj2jnvMODjqAWcw5nABFKDmABFKDmOpOKDmABFKDmTCef4m
Check the status of the job. If you do this before the information system has been updated, you will see a response like this:
-bash-4.1$ arcstat gsiftp://lcgvm21.gridpp.rl.ac.uk:2811/jobs/Um0NDmEkj2jnvMODjqAWcw5nABFKDmABFKDmOpOKDmABFKDmTCef4m
WARNING: Job information not found in the information system: gsiftp://lcgvm21.gridpp.rl.ac.uk:2811/jobs/Um0NDmEkj2jnvMODjqAWcw5nABFKDmABFKDmOpOKDmABFKDmTCef4m
WARNING: This job was very recently submitted and might not yet have reached the information system
No jobs
When the job has finished running, arcstat reports its final state:
-bash-4.1$ arcstat gsiftp://lcgvm21.gridpp.rl.ac.uk:2811/jobs/Um0NDmEkj2jnvMODjqAWcw5nABFKDmABFKDmOpOKDmABFKDmTCef4m
Job: gsiftp://lcgvm21.gridpp.rl.ac.uk:2811/jobs/Um0NDmEkj2jnvMODjqAWcw5nABFKDmABFKDmOpOKDmABFKDmTCef4m
Name: arctest1
State: Finished (FINISHED)
Exit Code: 0
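Rather than re-running arcstat by hand, the job can be polled until the State line shows Finished. This is a rough sketch: job_state and wait_for_job are hypothetical helpers, the interval and attempt defaults are arbitrary, and only the Finished state seen above is treated as completion, so a job that ends in any other state simply runs into the timeout.

```shell
# Hypothetical helper: print the value of the first "State:" line found in
# arcstat output read from stdin (empty if there is none, e.g. "No jobs").
job_state() {
    sed -n 's/^[[:space:]]*State:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: poll arcstat for a job until it reports Finished.
# Arguments: job ID, seconds between polls (default 30), attempts (default 20).
wait_for_job() {
    local jobid="$1" interval="${2:-30}" attempts="${3:-20}" state i=0
    while [ "$i" -lt "$attempts" ]; do
        state=$(arcstat "$jobid" 2>/dev/null | job_state)
        case "$state" in
            Finished*) echo "$state"; return 0 ;;
        esac
        sleep "$interval"
        i=$((i + 1))
    done
    echo "timed out waiting for $jobid" >&2
    return 2
}
```

Pass the job ID printed by arctest as the first argument; the function prints the final State value on success and returns 2 if the job has not finished within the allowed attempts.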