GridPP PMB Meeting 678

GridPP PMB Meeting 678 (10.09.18)
=================================
Present: Dave Britton (Chair), Tony Cass, Pete Clarke, David Colling, Alastair Dewhurst, Tony Doyle, Pete Gronbech, Roger Jones, Dave Kelsey, Steve Lloyd, Gareth Roy, Andrew Sansum, Louisa Campbell (Minutes).

Apologies: Jeremy Coles, Andrew McNab.

GR was welcomed onto the PMB.

1. Background Papers Update
===========================
a) Storage Document (JC)
Link supplied via email in advance – first draft should be completed by end September. DB invited comments on the document and agreed the timescale to have all the documents in draft by end September seems achievable (except the Security document, see below).

b) Security Document (DK)
DK and David Crooks have discussed and agreed that the slides shown at GridPP41 are a good starting point from which to draft the document. David Crooks will draw up an executive summary by 8 October covering all aspects going into the document, then DK will work with him on the fuller document asap thereafter.

c) Support Document (RJ)
RJ will work on the early draft presented at GridPP41 and circulate it to the PMB for comment asap. RJ enquired about the aims and intended circulation of the documents – DB confirmed their purpose is to allow everyone to have input in advance of the GridPP6 proposal and also to provide a strategy and text for the GridPP6 proposal, which will have to be drafted before Christmas, allowing only 5-6 weeks of writing. The documents will be put into PMB Documents on the website and content should be appropriately written with semi-public access in mind.

d) Tier-1 Document (AD)
AD aims to have a draft completed by 22nd-29th September before his annual leave. DB confirmed the PMB will want to contribute before it is finalised, so asked for a draft as a Google Document beforehand.

e) Tier-2 Document (SL)
SL circulated a draft to the PMB as a Google Document containing comments combined with DB's previous Tier-2 Evolution document to ensure all relevant information is together and accessible. Talks from the GridPP41 Tier-2 session were also incorporated, where relevant. This is now in reasonable shape as a first draft; DC will have input regarding CMS requirements, and PC should comment on IRIS and new communities from a strategic perspective. This has also been sent to the technical experts who previously contributed so they can amend/comment, where appropriate. DB will review and comment on the draft later this week.
DB confirmed there is a 3-week window to converge these documents – he has written to Tony Medland to request a meeting to discuss and is awaiting a response.
ACTION 678.1: RJ to finalise the Experiment Support background document by end September.
ACTION 678.2: DK to finalise the Security, Trust and Identity background document by mid October.
ACTION 678.3: AD to finalise the Tier1 background document, including tape strategy, by end September.
ACTION 678.4: SL to finalise the Tier2 background document by end September.

ACTION 678.5: JC to finalise the Storage background document by end September.

2. RSE (Research Software Engineers) College Case (document from PC)
====================================================================
Cases were written against a Call from STFC for projects to be developed to match future funding opportunities, as discussed previously by PC. These have been circulated to the PMB – there is a small chance (possibly 5-10%) of success but it is worth participating. DB thanked PC for his work on this and advised we submitted cases covering High Luminosity and LHC software, so this is very useful for reference in other contexts in future. As Chair of CAP it is also extremely useful for DC to have this information.

3. eInfrastructure Case (document from PC)
==========================================
PC, DB, DK, AM, AS and others are part of this group who converged on 5 different projects submitted last week. This has a similarly slim chance for success but is important for reaffirmation of the importance of computing.

4. Collaboration Board Email
============================
DB drafted an email for the Collaboration Board and circulated it to the PMB for comment; there were no comments. It was agreed DB should send this to the CB.
ACTION 678.6: DB will send an email to the Collaboration Board.

5. Pledges
==========
Pledges are due by the end of September. This is normally spearheaded by PG, but he will discuss details with GR who will now take this forward. Initially, Rebus request levels should be checked to confirm whether we have sufficient capacity in Tier-1 and Tier-2. DB will discuss with GR and cover Rebus; he asked PG to send the spreadsheet with information. This relates to our Pledge to WLCG rather than what sites pledge to us and allows us to check that sites are comfortable with proposed pledges. Different methods are now used to define resource so this needs to be reviewed.
ACTION 678.7: DB, PG and GR will discuss how GR can take forward Pledges.

6. AOCB
=======
a) Quarterly reports – Tier-1, Atlas, CMS, Security Ops, and London grid were outstanding on 4 September – Atlas has since been received, with thanks, from RJ and DK confirmed this will be sent today, the others remain outstanding. AD has asked Gareth Smith to prepare the Tier-1 on this occasion due to his other pressing commitments so this is in hand. DC confirmed the CMS will be completed shortly and he will chase the London one this week as Duncan has been on leave. DB reminded the PMB that subsequent QRs will be sent to Matt and requested they are prepared timeously.
b) Tier-1 Review Agenda – AD shared the agenda outline (he has arranged taxis for DB and GR as well as TC). Draft agenda:
10.00-11.00 arrival and coffee;
DB 10 minute slot;
AD will give strategic vision (work with IRIS, Tier-2 in future and procurement, manpower and how arranged in future);
Networking (plan to join LHC1 and network issues experienced this year, other development – upgrade to 100GB external network and ability to set CMS entries without tickets, etc);
Alison (tape service/risks/costs/ Oracle database/ ATR for new person and forward planning/ savings by not running Oracle and benefits of sharing database);
Lunch;
Darren on operations (major incidents, improvements, metrics comparison with other Tier-1s, call-outs and review response to security incidents);
Fabric operations and projects, Nick Hill (issues in the past – funds and h/w delivery to groups, improvements made to project management);
AD on CPU and Disk and effort spent to run Fabric team;
Rob on Castor and Echo (ongoing effort and required for future, major incidents);
Grid Service: 1) other services (plans for decommissioning and building resilient services/proxies); and 2) batch file usage (James will cover) – CPU planned and delivered;
Coffee and end.
DB suggested all speakers should leave 25-30% of their slot for discussion and a final 15-minute general discussion slot should be built in for summing up.

7. Standing Items
===================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
Nothing significant to report, though DC noted the need to reinvigorate this and he is going to circulate an email this week.

SI-1 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
RAL data disk has been drained completely from the Rucio perspective and will soon be fully decommissioned. There were further storage matters: EOS migration, Birmingham moving output to Manchester and local storage being drained. They will no longer be a storage site for Atlas, though they have h/w which was partly funded for Atlas, but only a very small share. There was a longstanding issue with Edinburgh DRF which was not running cloud, now resolved. Rutherford RALPP is now operating properly.

SI-2 CMS Weekly Review and Plans (DC)
-------------------------------------
There have been various discussions about migrations, but nothing of significance.

SI-3 LHCb Weekly Review and Plans (PC)
--------------------------------------
Nothing significant to report.

SI-4 Production Manager’s report (JC)
-------------------------------------
JC not in attendance and no report submitted.

SI-5 Tier-1 Manager’s Report (AD)
---------------------------------
– The memory upgrades for production machines in Echo have been completed. We are still waiting on the ClusterVision delivery which we expect in the next few weeks. During this time we will be finishing the weighting up of the XMA nodes, so this memory upgrade shouldn’t further delay the ClusterVision deployment.

– There was a short (~2 hours) scheduled downtime on Thursday for Castor for urgent security patching for the database.

– New CMS Tier-1 Liaison (Katy Ellis) started today; she will be at the Tier-1 review.

– A plan to migrate to the Dirac File Catalogue will be emailed out to small VOs that are still using the LFC towards the end of September.

CMS have now established a working group for Rucio.
It was noted that Dave Newbold will soon take over from Stephen Heywood.

SI-6 LCG Management Board Report of Issues (DB)
-----------------------------------------------
There has been no management board meeting and nothing to report.

SI-7 External Contexts (PC)
---------------------------
PC and DB are attending the Dirac days in Swansea on 11 September.
PC noted a discussion with Lydia regarding Durham who believed they can now apply for IRIS, but PC clarified only GridPP can make such an application.
CERN SCF is next week and DB is on an NA62 shift at CERN.

REVIEW OF ACTIONS
=================
644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress). Ongoing.
667.2: PG will do h/w planning before next OC to provide OC with details of shortfall in funds. Ongoing.
671.2: PG and AM to check if there is a common requirement across the Grid that can be negotiated with Dell for a framework agreement (e.g. Storage, Compute, Configurations). UPDATE: Dell have not been in contact so action is closed. Done.
672.3: RJ, DC and AM to draft the Experiment Support background document. Done.
672.4: DK to draft the Security, Trust and Identity background document. Done.
672.5: AD to draft the Tier1 background document. Done.
672.6: JC, SL, AM and PG to draft the Tier2 background document. Done.
672.7: PG will consider the agenda for GridPP41 incorporating the GridPP6 Background Documents. Done.
673.2: AD will provide the PMB with an overview of strategy for tapes and drives for the remainder of GridPP5 and GridPP6. Ongoing.
675.1: DC to sign off report on Tier-1 LHC usage. Ongoing.
675.2: RJ to sign off report on Tier-1 LHC usage. Ongoing.

ACTIONS AS OF 10.09.18
======================
644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress). Ongoing.
667.2: PG will do h/w planning before next OC to provide OC with details of shortfall in funds. Ongoing.
673.2: AD will provide the PMB with an overview of strategy for tapes and drives for the remainder of GridPP5 and GridPP6. Ongoing.
675.1: DC to sign off report on Tier-1 LHC usage. Ongoing.
675.2: RJ to sign off report on Tier-1 LHC usage. Ongoing.
678.1: RJ to finalise the Experiment Support background document by end September.
678.2: DK to finalise the Security, Trust and Identity background document by mid October.
678.3: AD to finalise the Tier1 background document, including tape strategy, by end September.
678.4: SL to finalise the Tier2 background document by end September.

678.5: JC to finalise the Storage background document by end September.

678.6: DB will send an email to the Collaboration Board.
678.7: DB, PG and GR will discuss how GR can take forward Pledges.