Difference between revisions of "Batch system status"

From GridPP Wiki

Revision as of 12:14, 19 April 2016

Other links

Sites batch system status

This page has been set up to collect information from GridPP sites regarding their batch systems in February 2014. The information will help with wider considerations and strategy. The table seeks the following:

  1. Current product (local/shared) - What is the current batch system at the site? Is it locally managed or shared with other groups?
  2. Concerns - Has your site experienced any problems with the batch system in operation?
  3. Interest/Investigating/Testing - Does your site already have plans to change and, if so, to what? If not, are you actively investigating or testing any alternatives?
  4. CE type(s) - What CE type (gLite, ARC...) do you currently run, and do you plan to change this, perhaps in conjunction with a batch system move?
  5. glExec/pilot support for all VOs - Do you have glExec and pilot pool accounts for all VOs, as opposed to just the LHC VOs? This is used for the move to a DIRAC WMS.
  6. Cloud interface(s)? - Does your site offer access to resources in ways other than via a CE? (See Cloud & VM status for more up-to-date and detailed information.)
  7. Multicore status for ATLAS and CMS
    1. ATLAS multicore jobs history for UK sites
  8. Notes - Any other information you wish to share on this topic.



Site | Current product (local/shared) | Concerns and observations | Interest/Investigating/Testing | CE type(s) & plans at site | Pilots for all | cgroups | Multicore Atlas/CMS | Cloud interface available/plans | Notes
RAL Tier-1 | HTCondor (local) | None | No reason | ARC | Yes | Yes | Yes | OpenNebula
UKI-LT2-Brunel | Arc/Condor | ArcCE info system | Spark cluster in test | | | Yes | Yes | OpenVZ being retired, LXD and Docker in test
UKI-LT2-IC-HEP | Gridengine (local) | None | No reason | CREAM, ARC | Yes | No | Yes | GridPP Cloud Tests
UKI-LT2-QMUL | Gridengine (local) | None | SLURM | CREAM | Yes | No | Yes | Local VM management system (Proxmox/oVirt)
UKI-LT2-RHUL | Torque/Maui (local) | Torque/Maui support non-existent | Will follow the consensus | CREAM | Yes | No | Yes
UKI-LT2-UCL-HEP | Torque/Maui (local) | Torque/Maui support non-existent | HTCondor | CREAM CE | No X
UKI-NORTHGRID-LANCS-HEP | Son of Gridengine (HEC) | Torque/Maui cluster decommissioned, for grid and local (tier 3) | Sticking with grid engine | CREAM, moving to ARC eventually | Yes | No | Yes | VMWare testing; Vac in production
UKI-NORTHGRID-LIV-HEP (Single core cluster) | Torque/Maui (local) | Poor support, Maui intrinsically broken | | CREAM | Yes | No | No | None
UKI-NORTHGRID-LIV-HEP (Multi core cluster) | HTCondor (local) | None | | ARC | Yes | Looking into it | Yes | None
UKI-NORTHGRID-MAN-HEP | Torque/Maui (local) | Maui is unsupported. It had memory leaks. Robert wrote a patch and there was nowhere to feed it back into. | HTCondor | Currently CREAM, investigating ARC-CE | Yes | Looking into it | Yes | Vac in production
UKI-NORTHGRID-SHEF-HEP | Torque/Maui (local) | Torque/Maui support non-existent | HTCondor is in testing mode | CREAM CE, ARC CE is in test | Yes | No | Yes | None
UKI-SCOTGRID-DURHAM | SLURM (local) | | No reason | ARC CE | Yes Yes N/A
UKI-SCOTGRID-ECDF | Gridengine | None | No reason | CREAM CE for standard production, ARC CE for exploratory HPC work | No Yes
UKI-SCOTGRID-GLASGOW | HTCondor (local), Torque/Maui (local) | Becomes unresponsive at times of high load or when nodes are uncontactable. | Investigating HTCondor/SoGE/SLURM as a replacement. | ARC, CREAM | Yes Yes N/A
UKI-SOUTHGRID-BHAM-HEP | Torque/Maui | Maui sometimes fails to see new jobs and so nothing is scheduled | HTCondor | CREAM | No No Testing Vac setup
UKI-SOUTHGRID-BRIS | HTCondor (shared), Torque/Maui (local) | None | No reason | ARC & CREAM CEs, plan to move to HTCondor CE | On roadmap | No | No | Docker in pre-testing
UKI-SOUTHGRID-CAM-HEP | Torque/Maui (local) | Torque/Maui support non-existent | Will follow the consensus | CREAM CE | Yes | No | Yes | None at present
UKI-SOUTHGRID-OX-HEP | HTCondor (local) | None | No reason | ARC CE in production | Yes | Yes | Yes | OpenStack in production. Testing Vac
UKI-SOUTHGRID-RALPP | HTCondor | None | No reason | ARC CE | Yes | Yes | Yes
UKI-SOUTHGRID-SUSX | (Shared) Gridengine - (Univa Grid Engine) | None | No reason | CREAM CE | Looking into it Yes