Batch system status
This page has been set up to collect information from GridPP sites regarding their batch systems in February 2014. The information will help with wider considerations and strategy. The table seeks the following:
1) Current product (local/shared) - What is the current batch system at the site? Is it locally managed or shared with other groups?
2) Concerns - Has your site experienced any problems with the batch system in operation?
3) Interest/Investigating/Testing - Does your site already have plans to change and, if so, to what? If not, are you actively investigating or testing any alternatives?
4) CE type(s) - What CE type (gLite, ARC...) do you currently run and do you plan to change this, perhaps in conjunction with a batch system move?
5) Cloud interface(s)? - Does your site offer access to resources in ways other than via a CE?
6) Notes - Any other information you wish to share on this topic.
|Site||Current product (local/shared)||Concerns and observations||Interest/Investigating/Testing||CE type(s) & plans at site||Cloud interface available/plans||Notes|
|RAL Tier-1||HTCondor (local)||None||No reason to change||ARC & CREAM CEs, but would like to decommission CREAM CEs eventually|
|UKI-LT2-Brunel||Torque/Maui||No support for Torque/Maui||Slurm and HTCondor in test||ARC in test||OpenVZ in production, Docker in test|
|UKI-LT2-IC-HEP||Gridengine||None||None||CREAM, ARC||GridPP Cloud Tests||
|UKI-LT2-QMUL||Gridengine (local)||None||Son of Gridengine||CREAM||scalable solution to get our storage usable in the cloud|
|UKI-LT2-RHUL||Torque/Maui (local)||Torque/Maui support non-existent||Will follow the consensus||CREAM||
|UKI-NORTHGRID-LANCS-HEP||Son of Gridengine (HEC), Torque/Maui (local)||Disillusioned with Torque/Maui.||Slurm or HTCondor.||CREAM, interested in ARC||VMware testing.|
|UKI-NORTHGRID-LIV-HEP||Torque/Maui||Poor support, Maui intrinsically broken||Slurm (Condor?)||CREAM||None|
|UKI-NORTHGRID-MAN-HEP||Torque/Maui (local)||Maui is unsupported. It had memory leaks. Robert wrote a patch and there was nowhere to feed it back into.||Slurm||Currently CREAM CE, investigating ARC CE||Vac in production on testbed||
|UKI-NORTHGRID-SHEF-HEP||Torque/Maui (local)||Torque/Maui support non-existent||Will follow the consensus||CREAM CE||
|UKI-SCOTGRID-DURHAM||Torque/Maui - Local||Becomes unresponsive and unstable. Doesn't behave particularly well if it loses nodes.||SLURM||Currently CREAM CE, would like to use ARC as a replacement||N/A||
|UKI-SCOTGRID-ECDF||Gridengine||None||No plans to change||CREAM CE for standard production, ARC CE for exploratory HPC work||
|UKI-SCOTGRID-GLASGOW||Torque/Maui - Local||Becomes unresponsive at times of high load or when nodes are uncontactable.||Investigating HTCondor/SoGE/SLURM as a replacement.||Currently CREAM CE, investigating ARC CE as replacement.||N/A|
|UKI-SOUTHGRID-BHAM-HEP||Torque/Maui||Maui sometimes fails to see new jobs and so nothing is scheduled||Will follow the consensus||CREAM||
|UKI-SOUTHGRID-BRIS||HTCondor (shared), Torque/Maui (local)||None||No reason to change||ARC & CREAM CEs||
|UKI-SOUTHGRID-CAM-HEP||Torque/Maui (local)||Torque/Maui support non-existent||Will follow the consensus||CREAM CE|
|UKI-SOUTHGRID-OX-HEP||Torque/Maui||Becomes unresponsive and unstable.||Investigating HTCondor||CREAM CE, investigating ARC CE||OpenStack in production. Testing Vac||
|UKI-SOUTHGRID-RALPP||HTCondor (legacy Torque/Maui will be switched off soon)||None||None, just migrated from Torque/Maui||ARC CE (legacy CREAM CEs will be switched off soon)||
|UKI-SOUTHGRID-SUSX||(Shared) Gridengine - (Univa Grid Engine)||None||No reason to change||CREAM CE||