GridPP PMB Meeting 666

GridPP PMB Meeting 666 (23.04.18)
=================================
Present: Dave Britton (Chair), Tony Cass, Pete Clarke, Jeremy Coles, David Colling, Alastair Dewhurst, Tony Doyle, Roger Jones, Steve Lloyd, Andrew McNab, Andrew Sansum, Pete Gronbech (Minutes).

Apologies: Dave Kelsey

1. DUNE
=======
DUNE computing is now on a firmer footing and a section has been added in the GridPP incubator. Need to trail this in the ops meeting.
In June ProtoDUNE will produce data: headline figures are of many PB, starting with ~2 PB. It is early days, but this could establish the UK as part of DUNE computing. Raja is also working partly on DUNE.
PC had mooted that the UK would like to host some of the ProtoDUNE data. Oxford are mainly DAQ for DUNE. Steve Jones at Liverpool is on board.

2. GDPR
=======
DK absent, so this cannot be covered fully this week. However, DK gave a presentation on GDPR at the recent MB. DB had also heard from somebody involved with GDPR formation that everybody needs to take GDPR seriously, as examples will be made! It has a clear agenda, part of which is self-funding, so early fines are likely. Making examples of different communities may well be one of their agenda items.

3. CMS Resource Use
===================
The discussion had come about after hearing how some other countries are dealing with lumpy usage patterns.
Some embarrassment from CMS in that they did not use their Tier-1 resources worldwide in the first half of last year.

4. F2F – June
=============
The F2F in June? One thing to discuss is this sort of resource allocation but the main thrust will be planning for GridPP6 etc.
We need to develop a strategy on clouds even if it is not in response to the CMS lumpy resource thing. With a physics hat on… there is a validation issue which is preventing people from running MC etc. when it may then need re-running.

So do we need a F2F meeting before the summer? There are issues around how we structure GridPP6 (for which we don’t know the scope), but it is almost certainly useful to have preliminary discussions.

Various issues in the recent thread about commercial clouds, HNSciCloud etc.

Action 666.1: DB to set up Doodle poll about F2F in June.

5. UKT0
=======
We don’t want PP to appear to dominate. Nor do we want to be under-represented, as we are the largest provider. PC and DB will discuss offline. There is no Astro Grid or Astro Compute body that represents them. There is a holding place for Astro compute for one representative from each of a few projects (~4), but this should be replaced at some point by a single representative for them all.
For PP do we just have GridPP or do we list the major projects….
(ATLAS, CMS, maybe LHCb, and then what about DUNE……)
DB: First suggestion would be LHC, Non-LHC, Tech rep, Managerial rep.
(NA62, T2K etc. need things like DIRAC and can be represented by Non-LHC)

PC: Technical people could come in as required.
PC will discuss offline with DB. Needs to be pushed forward this week. Representation should be balanced and pragmatic.

5. AOCB
=======
Procurement fall-out?
Any news on the money? Expecting delivery to start this week.
AS will deal with the money questions with TM and the 1.5 FTE carry forward.
WLCG MB:
https://indico.cern.ch/event/685868/

DB: Announcement about middleware officer.
Presentation by DK on GDPR.

Slides on resources for Run 3, by Ian Bird. Not a big step up; fits in with flat-cash ramp-up.
WLCG Strategy document. Their response to the HSF white paper.
Identifies those themes which are priorities for WLCG.
DB and others will need to digest this and see what relevance to GridPP6 there may be.
TC: This is the document in which Ian explains what he understands by data lakes.
OSG operations centre talk.

In July 2017 the issue of the WLCG MoU and the UK moving to UKRI came up. STFC signed the MoU and SPG are dealing with it… Where is the MoU? AS found it. Last week a list of agreements came back, and the MoU came back to AS. AS thinks it should not be SCD that now owns the MoU. DB: The value of the MoU to us is the commitment from STFC; the higher up the better. AS will follow up on whether the Programmes Directorate can own it.

6. Standing Items
=================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
ATLAS: Nothing substantive, other than with respect to the ATLAS liaison post. There are reservations about a split post.
Under the PPD proposal the post would be filled for 1 year and then be split. It could be hosted at an institution, but this would take time to fill and may also be difficult.
The post will be filled at the time we write the GridPP6 proposal. AD would appreciate the post being filled, as it will allow him to leave the ATLAS responsibilities.

SI-1 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
No report submitted.

SI-2 CMS Weekly Review and Plans (DC)
-------------------------------------
No report submitted.

SI-3 LHCb Weekly Review and Plans (PC)
--------------------------------------
No report submitted.

SI-4 Production Manager’s report (JC)
-------------------------------------
No report submitted.

SI-5 Tier-1 Manager’s Report (AD)
---------------------------------
Castor:
– gdss782 was out of production for 3 days (15th – 18th April). This contained ATLAS data and caused sufficient impact that we got an email from Rod Walker. (Joking aside, Rod is a busy person and if he needs to ask why production work is not progressing it is a pretty good indication that it is impacting ATLAS)

Echo:
– One of the ATLAS PGs was unavailable for parts of last week (Tuesday – today). A corrupt file was found to be causing the problems and has now been fixed. A thread has been started with the Ceph developers as a bad file should not cause instabilities in the underlying storage.
– Following the recent correction of the EC backfill bug we expect to be deploying the remaining 2016 generation of disk servers into Echo starting this week.
– There are ongoing problems with CMS SAM tests. The CMS AAA fallback test (our WNs getting data from other sites) is frequently failing, which we believe is firewall related and will review after the firewall replacement. There also appears to be a problem with normal XRootD access, which we believe is related to the WN gateways, but we have not identified the cause yet.

Networking:
– The upgrade (replacement) of the RAL firewall is scheduled to take place on the morning of the 25th April. Hopefully this will fix problems we have seen with data flows to/from our worker nodes.

Other:
– Rucio at RAL workshop is scheduled for the end of this week (timetable is taking shape here: https://indico.cern.ch/event/724632/timetable/#20180426).
– We are setting up Euclid at RAL (we set them up in 2016, but the setup has since gone stale) so that Mark Holliman can begin his testing for the scale run in August.

Data services team interviewing on Wednesday.
CMS liaison waiting on a visa.
ATLAS post can effectively start asap.
David Crooks will become the security officer.

SI-6 LCG Management Board Report of Issues (DB)
-----------------------------------------------
Nothing to report.

SI-7 External Contexts (PC)
---------------------------
Nothing further, just UKT0 has to get serious.
STFC needs to do something sensible with the money.
AS believes the default thing will be sensible.
AS has spoken with Neil Geddes.

REVIEW OF ACTIONS
=================
656.1: DK will report before the end of February on any actions GridPP should take to comply with GDPR. (UPDATE: DK circulated slides). Done.
663.2: PG will canvass sites to ascertain when they want to spend money and determine how disk will be phased out. Ongoing.
663.3: RJ and DC will advise how the experiments want disk divided for the start of Run 3 (ALICE and LHCb are resolved). Ongoing.
663.4: PC will publish our input to Balance of Programmes Review on GridPP website. Ongoing.
663.5: GS will respond on availability for proposed date of 13 September for Tier1 review. Done.
663.8: JC will examine GridPP staff roles/service/areas of expertise. Ongoing.
663.9: AM will share baseline of interfaces he will draw up for UKT0 participating sites before a F2F in June. (Update: AM advised the two documents in the last two actions now exist but are still being discussed with UKT0). Ongoing.
663.10: AM will share list of interfaces which experiments need to be able to participate in the UKT0 service. (Update: AM advised the two documents in the last two actions now exist but are still being discussed with UKT0). Ongoing.
665.1: AD will raise issues relating to (VENDOR) delivery of h/w with Lindsay and Martin.
665.2: AD will produce Procurement schedule for the coming FY to build in an additional month to buffer any delays in the future.
665.3: DB will follow up with RJ on the ATLAS post.

ACTIONS AS OF 23.04.18
======================
663.2: PG will canvass sites to ascertain when they want to spend money and determine how disk will be phased out. Ongoing.
663.3: RJ and DC will advise how the experiments want disk divided for the start of Run 3 (ALICE and LHCb are resolved). Ongoing.
663.4: PC will publish our input to Balance of Programmes Review on GridPP website. Ongoing.
663.8: JC will examine GridPP staff roles/service/areas of expertise. Ongoing.
663.9: AM will share baseline of interfaces he will draw up for UKT0 participating sites before a F2F in June. (Update: AM advised the two documents in the last two actions now exist but are still being discussed with UKT0). Ongoing.
663.10: AM will share list of interfaces which experiments need to be able to participate in the UKT0 service. (Update: AM advised the two documents in the last two actions now exist but are still being discussed with UKT0). Ongoing.
665.1: AD will raise issues relating to (VENDOR) delivery of h/w with Lindsay and Martin.
665.2: AD will produce Procurement schedule for the coming FY to build in an additional month to buffer any delays in the future.
665.3: DB will follow up with RJ on the ATLAS post.
666.1: DB to set up Doodle poll about F2F in June.