Difference between revisions of "Batch system status"

From GridPP Wiki

Revision as of 11:17, 22 January 2019

Sites batch system status

This page was set up in February 2014 to collect information from GridPP sites regarding their batch systems. The information will help with wider considerations and strategy. The table seeks the following:

  1. Current product (local/shared) - what is the current batch system at the site. Is it locally managed or shared with other groups?
  2. Concerns - has your site experienced any problems with the batch system in operation?
  3. Interest/Investigating/Testing - Does your site already have plans to change and, if so, to what? If not, are you actively investigating or testing any alternatives?
  4. CE type(s) - What CE type (gLite, ARC...) do you currently run and do you plan to change this, perhaps in conjunction with a batch system move?
  5. glExec/pilot support for all VOs - do you have glExec and pilot pool accounts for all VOs, as opposed to just the LHC VOs? Used for the move to a DIRAC WMS.
  6. Multicore status for ATLAS and CMS
    1. ATLAS multicore jobs history for UK sites
  7. Machine/Job Features (MJF) enabled: - = not started; Fail = failing SAM tests; Warn = warnings from SAM tests; Pass = passing SAM tests
  8. Notes - Any other information you wish to share on this topic.

See Cloud & VM status for status of Vac/Cloud deployment by site.

Site | Current product (local/shared) | Concerns and observations | Interest/Investigating/Testing | CE type(s) & plans at site | Pilots for all | cgroups | Multicore Atlas/CMS | MJF | CentOS7 WN | Notes | Date last reviewed or updated
RAL Tier-1 HTCondor (local) None No reason ARC-CE Yes Yes Yes Pass Yes 08-Jan-2019
UKI-LT2-Brunel HTCondor ArcCE info system ARC-CE Yes Yes Yes - Yes CEs and WNs on C7 since Jan 2018. Storage being moved to C7. All other services on C7 2019-01-22

UKI-LT2-IC-HEP Gridengine (local) ARC-CE CREAM, ARC-CE Yes No Yes - Yes Yes

UKI-LT2-QMUL SLURM SLURM does support MaxCPUTime for queues but it's complicated Spark and Hadoop integration with SLURM and Lustre CREAM Yes Yes Yes No In local testing GPU and preempt queues also supported on the grid 13-April-18
UKI-LT2-RHUL Torque/Maui (local) Torque/Maui support non-existent Will follow the consensus CREAM Yes No Yes - Testing Setting up CC7 ArcCondor cluster 21-Nov-17
UKI-NORTHGRID-LANCS-HEP Son of Gridengine (HEC) CREAM, looking at HTCondorCE over ARC now Yes No Yes - Yes Almost all resources CentOS7, small amount of SL6 for smaller VO use. Singularity deployed (local build) 16/10/18
UKI-NORTHGRID-LIV-HEP HTCondor/VAC (local) We run an HTCondor-CE prototype ARC-CE (C7 and SL6), HTCondor-CE - C7 (prod) Yes Yes Yes Yes Yes None 22 Jan 2019

UKI-NORTHGRID-MAN-HEP Torque/Maui (local)/ HTCondor (local) singularity Started migration to ARC-CE/HTCondor Yes Yes Yes Pass Yes
UKI-NORTHGRID-SHEF-HEP Torque/Maui (local) Torque/Maui support non-existent HTCondor is in testing mode CREAM CE, ARC CE is in test Yes No Yes -
UKI-SCOTGRID-DURHAM SLURM (local) No reason ARC-CE Yes Yes Yes -
UKI-SCOTGRID-ECDF Gridengine ARC-CE No Yes - Yes
UKI-SCOTGRID-GLASGOW HTCondor (local) Containers (Singularity, Docker) ARC-CE (investigating HTCondor-CE) Yes Yes Yes - No CentOS7 was waiting for move to DC, with June 1st deadline now re-evaluating to complete before move. 22/1/2019
UKI-SOUTHGRID-BHAM-HEP Torque/Maui Maui sometimes fails to see new jobs and so nothing is scheduled HTCondor CREAM No No -
UKI-SOUTHGRID-BRIS HTCondor (shared) Cannot run modern workflows (e.g. Apache Spark) Kubernetes, Mesos ARC-CE, plan to add HTCondor CE once accounting is sorted. On roadmap Yes Yes - In local testing 11 Dec 2018
UKI-SOUTHGRID-CAM-HEP VAC, small legacy Torque/Maui (local) SAM tests onto VAC painfully slow VAC CREAM CE, almost completely moved to VAC Yes N/A Yes Pass VAC all CS7, CREAM-CE never will be Completely migrated to VAC 16/10/2018
UKI-SOUTHGRID-OX-HEP HTCondor (local) ARC-CE Yes Yes Yes Yes Moved some WN to CentOS7 16/10/2018

UKI-SOUTHGRID-SUSX (Shared) Gridengine - (Univa Grid Engine) CREAM Yes Yes