GridPP PMB Meeting 699

GridPP PMB Meeting 699 (25.02.19)
=================================
Present: Dave Britton (Chair), Tony Cass, Pete Clarke, Alastair Dewhurst, Tony Doyle, Pete Gronbech, Roger Jones, Dave Kelsey, Steve Lloyd, Andrew McNab, Gareth Roy, Andrew Sansum, Louisa Campbell (Minutes).

Apologies: Jeremy Coles, David Colling, Jon Hays.

1) STFC HW Spend
===================
Following on from DB’s email GR had requested:

  • Manchester to confirm spend. AM confirmed this, but Dell have been emailed regarding whether delivery timescales can be met. Manchester spent earlier and can use the same procurement – it may take one week for official notification from STFC but procurement can start before then.
  • IC to confirm Grant No. and spend. GR will follow up with DC.

2) GridPP42 confirm dates/registration
======================================

LC confirmed the date for the PMB F2F is 23 April (1-4pm) and the collaboration meeting is 24-25 April (starting 11am on 24th and ending 4pm on 25th). AD has reserved B&B accommodation and arranged catering at Coseners House and is firming up the Collaboration Dinner. Registrations are now open and set to close 1 April at midnight.

Sponsor is required (GridPP41 – XMA & Boston; GridPP40 – Dell; GridPP39 – BIOS IT; GridPP38 – Dell; GridPP37 – BIOS IT). RAL will organise, AD will discuss with LC. Themes and sessions will be considered after GridPP6 proposal has been submitted, perhaps a focus on Tier-1 service, or evolution and integration with other communities.

3) GridPP6 task update
======================

1) DC: Could you formally answer my question about the location of the LZ support post?

DONE – LZ POST TO BE AT IC UNLESS CG OUTCOME CHANGES THINGS.

2) JC: the task matrix is now extremely urgent (section 3.4) because updates to text in the subsequent section need to be consistent with matrix.

THERE IS A NEW VERSION OF TASK MATRIX – NEEDS CRITICAL REVIEW – VOLUNTEER (ED GROUP)

3) AD: section 3.5.1. needs review and possible expansion.

4) JC: need text/updates for 3.5.2 and 3.5.3 – note my margin comments.

DONE BUT ED-GROUP ASKED TO BALANCE SEC-3 VS SEC-5 (PG AND DB EMAILS)

5) RJ: need to review 3.6 that was extracted from your previous input (WP2)

6) DC/AM: need to review 3.6 from your side.

DONE FROM AM

7) JC: need text for 3.7 based on matrix.

DONE BUT ED-GROUP ASKED TO BALANCE AS PART OF POINT-4 ABOVE

8) PC: review 3.8- I think it’s straight from your input – the rest of which is in section 5.

DONE AND SIGNED OFF.

9) GR: Please see comment at end of 3.9 and consult with JC if necessary.

10) PC: Review comments in 4.1.

DONE

11) DC: You offered to draft 4.3 by today (Friday)….

STILL OUTSTANDING

12) PC: Review 4.4.

DONE – SIGNED OFF

13) AD: To work on 4.5 I need Tape-costing; and I need non-capital operational spend including networking as previously requested by email.

TAPE PLAN DONE BUT NEED TO DISCUSS STRATEGIC OPTIONS FOR PROPOSAL. NON-CAPITAL SPEND ITEMS ARE OUTSTANDING.

13A) AD to provide non-capital spend profile.

13B) AD to review tape paragraph in 4.5 once we have agreed strategy. AD WILL REVIEW DB’S REVISED TEXT.

13C) AD to revise non-capital operational paragraph in 4.5.

13D) DB to address comment on summary of 4.5 about quantifying the various effects.

14) AD: You had red text that is now in 5.1.1 and 5.1.3- you need to resolve.

15) AD: I think we should shorten the title of 5.1.4 (remove red bits).

16) AD: I need mapping of the Tier-1 14.5 posts to WP so that I can complete the full matrix which underlies all the tables that need updating in the proposal (see request in previous email).

INPUT RECEIVED. DB NEEDS TO DIGEST. ISSUES OF TIER-1 MANAGEMENT IN WP1.

17) AS/AD 5.1.5 is outstanding (in one sense of the word, only!).

STILL OUTSTANDING

18) DB probably needs to move 5.2.3 into 5.2.1 and address grey text at end of 5.2.2.

DB NEEDS TO GET BACK TO THIS.

19) RJ: need to review 5.3, which is the rest of your input after I had a go at it.

20) DC/AM need to review 5.3.

21) PC needs to finish 5.4 and discuss descoping options with IC.

COMMENT ON P29 OF PROPOSAL. NEED TO CLARIFY TEXT TO SAY “0.5FTE NEEDS TO BE ASSIGNED TO ENSURE SUPPORT FOR APEL AND GOCDB OVER THE DURATION OF THE PROJECT”. DESCOPING DISCUSSION NEEDS TO BE REVISITED/STARTED MORE GENERALLY.

22) AD/PC: in section 5.5 we need to motivate 6 FTE (2 FTE at Tier-1 roughly) for WP4. In the baseline FC version it is 2 FTE (notionally 1 FTE at Tier-1). At the moment the content does not go that far. See next action.

NEW DRAFT FROM PC – IN HANDS OF EDITOR-GROUP. ACTION NOW FOR PMB TO PROVIDE ANY OTHER INPUT

23) GR to draft possible case for WP4 post for storage.

DONE – CIRCULATED – AGREED TO TRY AND GENERALISE DATALAKES PARAGRAPH TO COVER.

DIFFICULTY WITH WP4 IS THAT THEY ALSO WILL PROVIDE ON-GOING SUPPORT WHICH IS NOT WP4… DO WE MENTION THIS OR NOT?

24) DB has to work on 5.6

PARTLY DONE – SUGGESTION THAT WE SHOULD UP MANAGEMENT TO 2 FTE AT LEAST AS NEGOTIATING POSITION.

25) PG/GR to provide milestones and metrics for 5.6.4.

PG HAS DONE SOMETHING ON THIS BUT NOT CIRCULATED?

26) GR to provide risk register (long) and risk summary (shorter for proposal).

GR HAS CIRCULATED – NEEDS SOMEBODY (SMALL GROUP) TO REVIEW?

27) All the financial tables and all the scenario planning need to be done.

DB HAS TO GET ONTO THIS.

28) All tables need to be finalised based on final spreadsheets, and checked.

DB and GR.

29) Proposal needs to be edited and reviewed by the ED-group.

30) JeS forms need to be initiated – sign off table-6 on P25 – plus WP4/WP5 posts.

GR will send out instructions for JeS forms tomorrow.

31) JH to provide 5.6.3 and pathways to impact document.

4) AOCB
=======
PC summarised that DUNE has been unable to write to RAL for a while due to a 5 GB limit and invited comment from the PMB. AD confirmed they can write but it takes time – he will be at the Rucio workshop next week and will discuss.

5) Standing Items
===================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
No report submitted.

SI-1 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
No report submitted.

SI-2 CMS Weekly Review and Plans (DC)
-------------------------------------
No report submitted.

SI-3 LHCb Weekly Review and Plans (PC)
--------------------------------------
No report submitted.

SI-4 Production Manager’s report (JC)
-------------------------------------
No report submitted.

SI-5 Tier-1 Manager’s Report (AD)
---------------------------------
– Castor disk servers were physically moved to make room for new procurements. This was done in a rolling manner during Tuesday 19th February. The data on a disk server was unavailable while it was being moved, but relatively little impact on LHCb was observed.

– We have had two Castor disk server crashes since the move: gdss776 and gdss783, both lhcbDst disk servers.

– All but one of the ARC CEs have been upgraded. We are observing significantly less load on the machines; we believe the load was the cause of most of the other issues observed.

– CPU efficiency is improving. There is an ongoing discussion with CMS regarding some of their jobs. Their most problematic jobs involve large input files (> 10GB) which have metadata spread across the file (this is being worked on by CMS).

– Procurement

XMA engineers are on site and CPU is being delivered as we speak!

Dell storage: delivery has been booked in for 4th March.

XMA storage will be delivered this Friday, 1st March. They will not have time to finish cabling it on Friday and will finish that after Dell have delivered.

There has been a delay getting the 3 network switches from Dell for all the storage. This is due to a global shortage of a certain type of chip. They are expected to be delivered on the 27th March. The total cost is ~£10k if they were later. The main problem is that it could potentially delay the start of acceptance testing of the storage. Martin Bly is trying to get the cables delivered on time and then find an old switch for the testing.

There is a problem with the combined resource/extra capital spend. This is for a total of £400k (inc VAT). It appears that SBS used the wrong contract which Dell have signed. Lindsay Glover is trying to amend the contract. It is still possible for delivery to happen in time.

SI-6 LCG Management Board Report of Issues (DB)
-----------------------------------------------
Nothing to report.

SI-7 External Contexts (PC)
---------------------------
Nothing to report.

REVIEW OF ACTIONS
=================

644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress. 08/10/18 – Leicester are producing a PO for tapes and will send to AD to produce an invoice). Ongoing.

696.2: RJ to provide ATLAS’ guidance for 2 FTE location at Tier-2 sites. Ongoing.

696.3: RJ to draft 4c(iii) in the Plan2 document: A description of WP2. Ongoing.

696.5: JC to draft 4c(iv) in the Plan2 document and work with DB on 4c(i). Ongoing.

696.7: JH to draft pathways-to-impact document and extract 1 page for proposal. (Update: JH is seeking clarification between provision in GridPP5 and requirements for GridPP6). Ongoing.

696.9: PC to coordinate development of 4c(v) WP4 description. Ongoing.

696.10: AD to provide draft of Tier-1 section 6b. Ongoing.

696.11: AD to contribute via PC to 4c(v). Ongoing.

696.13: DC to provide assistance to RJ with 4c(iii). Ongoing.

696.15: DB to draft 4c(ii) with help from JC. Ongoing.

696.16: DB to coordinate 4c(vi). Ongoing.

696.17: DB to continue to develop effort matrix once Experiment site preferences are known. Ongoing.

696.18: GR to continue to gather resource requirements. (Update – emails have gone out and responses awaited). Ongoing.

696.19: GR to liaise with PG on 4c(vi)2&3. Ongoing.

ACTIONS AS OF 25.02.19
======================

644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress. 08/10/18 – Leicester are producing a PO for tapes and will send to AD to produce an invoice). Ongoing.

696.2: RJ to provide ATLAS’ guidance for 2 FTE location at Tier-2 sites. Ongoing.

696.3: RJ to draft 4c(iii) in the Plan2 document: A description of WP2. Ongoing.

696.5: JC to draft 4c(iv) in the Plan2 document and work with DB on 4c(i). Ongoing.

696.7: JH to draft pathways-to-impact document and extract 1 page for proposal. (Update: JH is seeking clarification between provision in GridPP5 and requirements for GridPP6). Ongoing.

696.9: PC to coordinate development of 4c(v) WP4 description. Ongoing.

696.10: AD to provide draft of Tier-1 section 6b. Ongoing.

696.11: AD to contribute via PC to 4c(v). Ongoing.

696.13: DC to provide assistance to RJ with 4c(iii). Ongoing.

696.15: DB to draft 4c(ii) with help from JC. Ongoing.

696.16: DB to coordinate 4c(vi). Ongoing.

696.17: DB to continue to develop effort matrix once Experiment site preferences are known. Ongoing.

696.18: GR to continue to gather resource requirements. (Update – emails have gone out and responses awaited). Ongoing.

696.19: GR to liaise with PG on 4c(vi)2&3. Ongoing.