Site status and plans

Page for tracking site status and plans

This page has been created to provide a central GridPP reference page that allows the project to understand the status and plans of each site for pending and future middleware upgrades. It can be updated by the site administrators concerned or the Tier-2 (deputy) coordinators.

Background to batch system memory request for details:

SL5 worker nodes

Put here the percentage of your cluster on SL5 and/or an indication when nodes will be moved to SL5.
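
For sites unsure how to report the figure, here is a minimal sketch of the percentage calculation. The function name and the node counts are hypothetical and only illustrate the arithmetic; counting by cores rather than nodes works the same way.

```python
def sl5_percentage(sl5_nodes: int, total_nodes: int) -> float:
    """Return the share of worker nodes on SL5 as a percentage (one decimal)."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= sl5_nodes <= total_nodes:
        raise ValueError("sl5_nodes must be between 0 and total_nodes")
    return round(100.0 * sl5_nodes / total_nodes, 1)


if __name__ == "__main__":
    # Hypothetical cluster: 40 of 160 worker nodes already on SL5.
    print(f"{sl5_percentage(40, 160)}% of WNs on SL5")
```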

SRM upgrades

Record in this area the current version of the SRM and dates of any expected upgrades.


London Tier2

UKI-LT2-Brunel

Christmas 2009 From Raul: Brunel site will be kept online during the Christmas period. I'll be watching it from the Netherlands and Paul from London. Paul will be online if things need to be powered on/off. I'll do the admin bit.

SL5 WNs

Current status (10 Nov 2009): SL4 in all worker nodes

Planned upgrade: Building a new cluster with SL5 to be deployed in December. Other clusters to be upgraded as soon as new cluster is stable.

Comments:

SRM

Current status (10 Nov 2009): Upgraded to 1.7.2 on Nov 6

Planned upgrade:

Comments:

SCAS/glexec

Current status (10 Nov 2009):

Planned deployment:

Comments: Evaluating the glexec documentation. Might deploy it with the new cluster mentioned above.

CREAM CE

Current status (date):

Planned deployment:

Comments:

UKI-LT2-IC-HEP

Christmas 2009 Best effort. Daniela and Duncan will keep an eye on the site, but will not have physical access to the machines (there is a good chance the site will survive without it).

SL5 WNs

Current status (date):

Planned upgrade:

Comments:

SRM

Current status (date):

Planned upgrade:

Comments:

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments:

UKI-LT2-IC-LESC

Site is decommissioned.

UKI-LT2-QMUL

SL5 WNs

Current status (date): (5 Jan 2010) 40 SL5 WN test machines. ce02 has queues for these machines. ATLAS software installed; CMS and LHCb to be installed later.

Planned upgrade: Wait for the OK from ATLAS, then roll out across the cluster.

Comments: A few minor issues remain with the install.

SRM

Current status (date): (10 Nov 2009) StoRM v1.4.0 (with info system patch).

Planned upgrade: StoRM v1.5 when released.

Comments: Legacy SE (se02) shortly to be switched off. se01 was switched off in December 2009.

SCAS/glexec

Current status (date):

Planned deployment: under discussion

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments:

UKI-LT2-RHUL

Christmas 2009 Best effort (remote monitoring).

SL5 WNs

Current status (3 Nov 09): One test node provided by vendor. Still needs to have middleware installed.

Planned upgrade: Tentatively scheduled for January.

Comments: Upgrade is coupled to cluster move which depends on network provision which is somewhat uncertain and delayed.

SRM

Current status (3 Nov 09): Upgraded to DPM 1.7.2-4 on 13/11/09

Planned upgrade:

Comments:

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments:

UKI-LT2-UCL-CENTRAL

Christmas 2009 No Support.

SL5 WNs

Current status (date): 06/11/2009 no SL5 WNs

Planned upgrade: January 2010

Comments: Our SL5 nodes will be available through UCL-HEP

SRM

Current status (date): 12/11/2009 DPM 1.7 (latest)

Planned upgrade:

Comments:

SCAS/glexec

Current status (date): 06/11/2009 not deployed

Planned deployment: under discussion

Comments: Is glexec able to use sudo for identity switching and program execution?

CREAM CE

Current status (date): 06/11/2009 not deployed

Planned deployment: will be deployed by UCL-HEP

Comments:

UKI-LT2-UCL-HEP

Christmas 2009 Best effort, monitored remotely. No physical access to machines.

SL5 WNs

Current status (date): Test system installation in progress

Planned upgrade: aim for December 2009

Comments: UCL-CENTRAL cluster will become available through UCL-HEP as SL5 nodes

SRM

Current status (date): DPM 1.7.2-4 (latest on x86_64) 12/11/2009

Planned upgrade:

Comments:

SCAS/glexec

Current status (date): not deployed 06/11/2009

Planned deployment: under discussion

Comments: no instructions yet for tarball installations

CREAM CE

Current status (date): not deployed 06/11/2009

Planned deployment: early 2010

Comments:

NorthGrid

UKI-NORTHGRID-LANCS-HEP


SL5 WNs

Current status (date): 19/11/09 Installed new nodes on SL5 with new software area. Some trouble understanding the publishing for a separate subcluster and a different software area for that subcluster. We wish to test the new nodes extensively before putting them fully into production.

Planned upgrade: Will upgrade older nodes to SL5 after new nodes are tested. Rather than a full downtime we'll do a rolling upgrade.

Comments:

SRM

Current status (date): 27/10/09 DPM 1.7.4-4 on our SL4 pools and head nodes, DPM 1.7.4-5 on our new SL5 pools.

Planned upgrade: No plan to migrate existing SL4 pools to SL5 at this time, newer hardware has been installed on SL5 and we will continue in this fashion. We plan within the next 3 months to upgrade our DPM headnode hardware and we will install the replacement head on SL5. This work will not take place till January however.

Comments:

SCAS/glexec

Current status (date): 19/11/09 Testing SCAS/glexec on our cluster.

Planned deployment:

Comments:

CREAM CE

Current status (date): 19/11/09 No CREAM CE yet.

Planned deployment: The CREAM CE is our next project, after commissioning the SL5 WNs. We hope to have a functioning test box by Christmas.

Comments:

UKI-NORTHGRID-LIV-HEP


SL5 WNs

Current status (04/01/2010): Node installation system done, and new CE is online


Comments: All 64bit nodes are now upgraded.

SRM

Current status (date): DPM 1.7.2-4 + patched srmv2.2 (13/10/2009)

Planned upgrade: No plans at present (21/10/2009)

Comments: Head node SL4, pool nodes SL4+SL5

SCAS/glexec

Current status (date): Not deployed (10/11/2009)

Planned deployment: Under discussion

Comments: We have two clusters, a local HEP Torque cluster, and a central computing SGE cluster. There may be problems deploying on the central SGE cluster, depending on the requirements.

CREAM CE

Current status (date): Not deployed (10/11/2009)

Planned deployment: Early 2010

Comments:

UKI-NORTHGRID-MAN-HEP


SL5 WNs

Current status (date): SL5 (21/10/09)

Planned upgrade: Upgrade to SL5 on all the nodes completed on 15/10/09

Comments:

SRM

Current status (date): DPM 1.7.2 (21/10/09)

Planned upgrade: Upgrade to DPM 1.7.2 on both SEs completed on the 16/10/09

Comments: Currently proceeding to unify the two DPM instances as requested by ATLAS. Head node and pools all SL4.

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date): Deployed 2009-11-17 and in production, but no experiment software area as yet.

Planned deployment:

Comments:

UKI-NORTHGRID-SHEF-HEP


SL5 WNs

Current status (date): SL5 (16/11/09)

Comments:

SRM

Current status (date): DPM 1.7.2 (DPM1.7.06 was installed in July) (21/10/09)

Planned upgrade:

Comments: Head node SL4, pool nodes SL4+SL5 (21/10/09)

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments:

ScotGrid

UKI-SCOTGRID-DURHAM


SL5 WNs

Current status (date): 10/11/2009

Planned upgrade: Upgrade all WNs and UI to SL5.4 completed 6/11/2009.

Comments:

SRM

Current status (date): 10/11/2009 - Running DPM 1.7.2.4 since 7/9/2009

Planned upgrade:

Comments:

SCAS/glexec

Current status (date): 10/11/2009 - Not deployed

Planned deployment: Could be deployed in future on request.

Comments:

CREAM CE

Current status (date): 10/11/2009 - No CREAM CE

Planned deployment: No current plans. Could install CREAM CE on request.

Comments:

UKI-SCOTGRID-ECDF


SL5 WNs

Current status (date): Upgraded on 29th Oct.

Planned upgrade:

Comments: Problem with LHCb SAM test (script looks in /etc/redhat-release); seemingly not affecting actual jobs (confirming). ATLAS pilot jobs issue (work in progress; SAM tests and SL test passing).
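
As an illustration of the kind of check mentioned in the comment above, here is a minimal sketch of reading the OS release from /etc/redhat-release. It is not the LHCb SAM script. The function name, the regular expression and the sample release strings are assumptions. A script that matches the whole string too strictly could fail on an SL5 node even when real jobs run fine.

```python
import re

# Assumed format: a line such as "Scientific Linux SL release 5.3 (Boron)".
RELEASE_RE = re.compile(r"release\s+(\d+)(?:\.(\d+))?")


def parse_release(text):
    """Return (major, minor) parsed from release text, or None if absent."""
    match = RELEASE_RE.search(text)
    if match is None:
        return None
    major = int(match.group(1))
    # Treat a missing minor version as 0.
    minor = int(match.group(2)) if match.group(2) else 0
    return major, minor


def read_release(path="/etc/redhat-release"):
    """Read the release file and parse it; returns None if unparseable."""
    with open(path, encoding="utf-8") as handle:
        return parse_release(handle.read())
```

Comparing only the major version (for example `parse_release(text)[0] >= 5`) is more tolerant of minor-release differences between sites than matching the full string.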

SRM

Current status (date): Running DPM 1.7.2-4 for a long time.

Planned upgrade:

Comments:

SCAS/glexec

Current status (date): Not deployed

Planned deployment: None planned. Systems team do not object to deployment.

Comments:

CREAM CE

Current status (date): Not deployed

Planned deployment: None planned. Low priority unless real demand from VOs.

Comments:

UKI-SCOTGRID-GLASGOW


SL5 WNs

Current status (date): Initial Migration Complete. 1912 cores total, 1848 SL5 on WN3.2.4-0, 48 SL4 on WN3.1.40-0

Planned upgrade: December move of remaining 48 SL4 cores to SL5.

Comments: Migration complete. Some SL4 capacity kept for local ATLAS users to run non-ported versions of Athena.

SRM

Current status (date): 2 DPMs migrated to SL5 DPM3.2.1-0

Planned upgrade: Possible upgrade from DPM-srm-server-mysql.x86_64 1.7.2-5 when available

Comments:

SCAS/glexec

Current status (date): 10/11/2009 SCAS & GLEXEC with CREAM and GLEXEC on WN deployed in UAT.

Planned deployment: SCAS, GLEXEC with CREAM, GLEXEC with WN in Production on request.

Comments: Documenting install and info on wiki.

CREAM CE

Current status (date): 10/11/2009 Deployed in Production currently running 3.1.22

Planned deployment: Completed. Migrated to svr014.gla.scotgrid.ac.uk.

Comments: In Production and open to all VOs

SouthGrid

UKI-SOUTHGRID-BHAM-HEP


SL5 WNs

Current status (10/02/10): All WNs now running SL5.3

Planned upgrade: Complete.

Comments: Ongoing problems running ATLAS pilot jobs on shared cluster, even after upgrade.

SRM

Current status (27/10/09): DPM 1.7.2-4 on SL 4.6

Planned upgrade: Complete.

Comments:

SCAS/glexec

Current status (date):

Planned deployment: Deployment after the test CREAM CE.

Comments:

CREAM CE

Current status (16/02/10): Deployed two virtual machines (epgr05 and epgr06, hosted on epgce4) for the purposes of installing and testing a CREAM CE (epgr05) submitting jobs to a WN (epgr06).

Planned deployment:

Comments:

UKI-SOUTHGRID-BRIS-HEP


SL5 WNs

Current status (date): (Dec 2009) VM CE in production with SL5 WN passing all OPS SAM tests. More WNs soon.

SRM

Current status (date): 1.6.11-3sec

Planned upgrade: No plans to upgrade, plan to retire DPM in Dec 2009.

StoRM SE must be upgra^H^H^H^H^H rebuilt (there's no upgrade path!) to 1.4 & enable other VO support on it.

SCAS/glexec

Current status (date): Waiting to hear how it goes elsewhere first.

Planned deployment: Waiting to hear how it goes elsewhere first.

Comments: If it takes any time+effort we'd decline, lack of staff

CREAM CE

Current status (date): No plans to deploy soon, lack of staff

Planned deployment: No plans to deploy soon, lack of staff

Comments:

UKI-SOUTHGRID-CAM-HEP


SL5 WNs

Current status (19/11/2009): SL4 on all WNs

Planned upgrade: The preparation of a very small test cluster with SL5 WNs in progress.

Comments: Site mostly busy fixing various bugs in the middleware related to the Condor batch system (not generally seen at any torque/pbs site). These affect site stability and usability and heavily slow down any new deployment.

SRM

Current status (19/11/2009): Presently at 1.6.11 on gLite 3.1 (glite-SE_dpm_mysql-3.1.10-0.x86_64)

Planned upgrade: Already attempted several times, but the upgrade fails with these errors:

Error: Missing Dependency: libapr-0.so.0()(64bit) is needed by package apr-util
Error: Missing Dependency: libapr-0.so.0()(64bit) is needed by package httpd
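
To summarise errors like those above when attaching them to a ticket, here is a minimal sketch that groups the affected packages by missing library. It assumes the error lines have exactly the form quoted here. The function name and regular expression are illustrative and not part of any package manager API.

```python
import re

# Matches lines of the form quoted in this section:
# "Error: Missing Dependency: <library> is needed by package <package>"
DEP_RE = re.compile(
    r"^Error: Missing Dependency: (\S+) is needed by package (\S+)$"
)


def missing_dependencies(output):
    """Map each missing library to the list of packages that need it."""
    result = {}
    for line in output.splitlines():
        match = DEP_RE.match(line.strip())
        if match:
            library, package = match.groups()
            result.setdefault(library, []).append(package)
    return result


if __name__ == "__main__":
    sample = (
        "Error: Missing Dependency: libapr-0.so.0()(64bit) is needed by package apr-util\n"
        "Error: Missing Dependency: libapr-0.so.0()(64bit) is needed by package httpd\n"
    )
    for library, packages in missing_dependencies(sample).items():
        print(f"{library}: needed by {', '.join(packages)}")
```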

Comments: There is already an open ticket for this: #52552 (https://gus.fzk.de/ws/ticket_info.php?ticket=52552)

SCAS/glexec

Current status (19/11/2009): Reviewing the compatibility issue with Condor at site.

Planned deployment:

Comments:

CREAM CE

Current status (19/11/2009): No plans to deploy soon, lack of support for Condor

Planned deployment:

Comments:

EFDA-JET


SL5 WNs

Current status (date): We have just upgraded to SL5/gLite 3.2 (19/11/09)

Planned upgrade:

Comments:

SRM

Current status (date):

Planned upgrade:

Comments:

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments:

UKI-SOUTHGRID-OX-HEP


SL5 WNs

Current status (date): All WNs at SL5 (19.10.09)

Planned upgrade:

Comments:

SRM

Current status (date): Running DPM 1.7.2-4 since August 2009

Planned upgrade:

Comments:

SCAS/glexec

Current status (date): t2ce02 is a CREAM CE using SCAS and a glexec-enabled WN. (Updated March 2010)

Planned deployment:

Comments:

CREAM CE

Current status (date): t2ce06 is a CREAM CE driving all the WNs in the production cluster. No SCAS or glexec. (Updated March 2010)

Planned deployment:

Comments:

UKI-SOUTHGRID-RALPP


SL5 WNs

Current status (date): All WNs running SL5 (was the first site to move across)

Planned upgrade:

Comments:

SRM

Current status (date): dcache 1.9.1-7

Planned upgrade:

Comments:

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments:

Tier1

RAL-LCG2-Tier-1


SL5 WNs

Current status (date): 19/11/2009 All LHC-accessible WNs are SL5; some SL4 capacity remains for non-LHC VOs

Planned upgrade: None

Comments: Expect to deploy a new SL5 CE for non-LHC VOs soon, but no date yet

SRM

Current status (date):

Planned upgrade:

Comments:

SCAS/glexec

Current status (date): 19/11/2009 SCAS deployed but not tested

Planned deployment:

Comments:

CREAM CE

Current status (date): 19/11/2009 CREAM CE deployed for ALICE, CMS and LHCb

Planned deployment: ATLAS have requested access

Comments:

Grid Ireland

csTCDie


SL5 WNs

Current status (date):

Planned upgrade:

Comments:

SRM

Current status (date):

Planned upgrade:

Comments:

SCAS/glexec

Current status (date):

Planned deployment:

Comments:

CREAM CE

Current status (date):

Planned deployment:

Comments: