Difference between revisions of "Batch system status"
From GridPP Wiki
Revision as of 14:17, 11 December 2018
Sites batch system status
This page was set up in February 2014 to collect information from GridPP sites about their batch systems. The information will help with wider considerations and strategy. The table seeks the following:
- Current product (local/shared) - What is the current batch system at the site? Is it locally managed or shared with other groups?
- Concerns - Has your site experienced any problems with the batch system in operation?
- Interest/Investigating/Testing - Does your site already have plans to change and, if so, to what? If not, are you actively investigating or testing any alternatives?
- CE type(s) - What CE type (gLite, ARC...) do you currently run, and do you plan to change this, perhaps in conjunction with a batch system move?
- glExec/pilot support for all VOs - Do you have glExec and pilot pool accounts for all VOs, as opposed to just the LHC VOs? Used for the move to a DIRAC WMS.
- Multicore status for ATLAS and CMS
- Machine/Job Features (MJF) enabled - "-" = not started; Fail = failing SAM tests; Warn = warnings from SAM tests; Pass = passing SAM tests
- Notes - Any other information you wish to share on this topic.
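The MJF legend in the list above amounts to a small lookup table. As a minimal sketch (the names `MJF_LEGEND` and `describe_mjf` are illustrative and not part of any GridPP tooling), a table cell could be decoded like this, with any value that is not in the legend reported as such rather than guessed at:

```python
# Legend copied from the MJF entry in the list above.
# MJF_LEGEND and describe_mjf are hypothetical names used only for illustration.
MJF_LEGEND = {
    "-": "not started",
    "Fail": "failing SAM tests",
    "Warn": "warnings from SAM tests",
    "Pass": "passing SAM tests",
}


def describe_mjf(cell: str) -> str:
    """Return the legend meaning of an MJF cell, or flag it as outside the legend."""
    value = cell.strip()
    if not value:
        # An empty cell means the site has not filled in the column.
        return "no entry"
    # Values the legend does not define are reported verbatim, not interpreted.
    return MJF_LEGEND.get(value, f"outside the legend: {value!r}")
```

For example, `describe_mjf("Warn")` gives the legend text for a warning state, while a cell containing a value the legend does not define is returned as "outside the legend" so that it can be queried with the site.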
See Cloud & VM status for status of Vac/Cloud deployment by site.
Site | Current product (local/shared) | Concerns and observations | Interest/Investigating/Testing | CE type(s) & plans at site | Pilots for all | cgroups | Multicore ATLAS/CMS | MJF | CentOS7 WN | Notes | Date last reviewed or updated |
RAL Tier-1 | HTCondor (local) | None | No reason | ARC-CE | Yes | Yes | Yes | Warn | Yes | ||
UKI-LT2-Brunel | HTCondor | ArcCE info system | Spark cluster in test | ARC-CE | Yes | Yes | Yes | - | |||
UKI-LT2-IC-HEP | Gridengine (local) | ARC-CE | CREAM, ARC-CE | Yes | No | Yes | - | Yes | |||
UKI-LT2-QMUL | SLURM | SLURM does support MaxCPUTime for queues but it's complicated | Spark and Hadoop integration with SLURM and Lustre | CREAM | Yes | Yes | Yes | No | In local testing | GPU and preempt queues also supported on the grid | 13-April-18 |
UKI-LT2-RHUL | Torque/Maui (local) | Torque/Maui support non-existent | Will follow the consensus | CREAM | Yes | No | Yes | - | Testing | Setting up CC7 ArcCondor cluster | 21-Nov-17 |
UKI-NORTHGRID-LANCS-HEP | Son of Gridengine (HEC) | CREAM, looking at HTCondorCE over ARC now | Yes | No | Yes | - | Yes | Almost all resources CentOS7, small amount of SL6 for smaller VO use. Singularity deployed (local build) | 16/10/18 | ||
UKI-NORTHGRID-LIV-HEP | HTCondor/VAC (local) | We run a HTCondor-CE prototype | ARC-CE | Yes | Yes | Yes | Yes | Yes | None | 11 Dec 2018 | |
UKI-NORTHGRID-MAN-HEP | Torque/Maui (local)/ HTCondor (local) | singularity | Started migration to ARC-CE/HTCondor | Yes | Yes | Yes | Pass | Yes | |||
UKI-NORTHGRID-SHEF-HEP | Torque/Maui (local) | Torque/Maui support non-existent | HTCondor is in testing mode | CREAM CE, ARC-CE is in test | Yes | No | Yes | - | |||
UKI-SCOTGRID-DURHAM | SLURM (local) | No reason | ARC-CE | Yes | Yes | Yes | - | ||||
UKI-SCOTGRID-ECDF | Gridengine | ARC-CE | No | Yes | - | Yes | |||||
UKI-SCOTGRID-GLASGOW | HTCondor (local) | Containers (Singularity, Docker) | ARC-CE (investigating HTCondor-CE) | Yes | Yes | Yes | - | No | 21/11/2017 | ||
UKI-SOUTHGRID-BHAM-HEP | Torque/Maui | Maui sometimes fails to see new jobs and so nothing is scheduled | HTCondor | CREAM | No | No | - | ||||
UKI-SOUTHGRID-BRIS | HTCondor (shared) | Cannot run modern workflows (e.g. Apache Spark) | Kubernetes, Mesos | ARC-CE, plan to add HTCondor-CE once accounting is sorted | On roadmap | Yes | Yes | - | In local testing | 11 Dec 2018 | |
UKI-SOUTHGRID-CAM-HEP | VAC, small legacy Torque/Maui (local) | SAM tests onto VAC painfully slow | VAC | CREAM CE, almost completely moved to VAC | Yes | N/A | Yes | Pass | VAC all CS7, CREAM-CE never will be | Completely migrated to VAC | 16/10/2018 |
UKI-SOUTHGRID-OX-HEP | HTCondor (local) | ARC-CE | Yes | Yes | Yes | Yes | Moved some WNs to CentOS7 | 16/10/2018 | |||
UKI-SOUTHGRID-RALPP | HTCondor | ARC-CE | Yes | Yes | Yes | Warn | |||||
UKI-SOUTHGRID-SUSX | (Shared) Gridengine - (Univa Grid Engine) | CREAM | Yes | Yes |