GridPP PMB Meeting 607

GridPP PMB Meeting 607 (19.09.16)
=================================
Present: Dave Britton (Chair), Tony Cass, Pete Clarke, Jeremy Coles, Pete Gronbech, Roger Jones, Steve Lloyd, Andrew McNab, Gareth Smith, Louisa Campbell (Minutes).

Apologies: Andrew Sansum, Tony Doyle, David Colling, Dave Kelsey.

1. WLCG Pledges (Tier-1 and Tier-2 position) – STFC advice
==========================================================
Following on from DB’s email and a meeting with Tony Medland on Friday, it has been made clear that there should be no underspend on the original GridPP5 profile. DB and AS will look at the implications and options. A full, up-to-date financial figure is not yet clear, but Tier1 monies should be categorised into Capital and Resource – PG will produce a new re-categorised spreadsheet to facilitate decision-making. Recapitalising tape media should be commenced and the archiving of Dirac data may be of assistance in making this case.

PG has updated Tier2 pledges independent of Tier1 pledges and re-circulated the spreadsheet to the PMB for onward sharing with Tier2 sites. Pledges for both Tier1 and Tier2 are required in the next 10 days and these should include updated LHC requirements in the spreadsheet.

Action 607.1: AS will provide figures on Tier1 and Tier2 pledges and the available spend this week.

Action 607.2: PG will produce a spreadsheet containing explicit detail on Capital and Resource for Tier1, as well as Tier1 and Tier2 pledges, to include LHC requirements.

2. Requirements for the next OSC
================================
The usual documents need to be prepared for the OSC. DB looked over the previous set and will forward the feedback received from the OSC to the PMB. In summary, the OSC complimented the bringing on of new users and the simplifications, and encouraged continuation of that for new mid- and small-scale users – this should be referred to in reports for the OSC. One suggestion was that we explore ways to bring in staff and refer to the Global Research Fund, which we have been doing. The OSC wish to review plans for the next phase of operations and for GridPP5. The documents must be submitted by 8 November with final polishing completed by 1 November. Therefore, all draft documents should be completed by 21 OCTOBER, for finalising during the F2F. Taking account of numerous conferences and other PMB commitments, the F2F is likely to take place on 24 October, probably 1130-1600, perhaps at Queen Mary for convenience – SL has arranged a meeting room.

Financial document – PG is progressing this and requires some issues to be resolved. Other elements to be completed include: Introduction (DB); Wider Context (DB & PC); Tier1 Status (GS & AM) – fabric, infrastructure, management, etc; Deployment Status (JC); User Reports from Major Experiments (RJ, SL, AM, DC); Impact & Dissemination (Tom Whytnie, SL to pass on).

DB, SL, PG & PC will attend the OSC, possibly along with experiment reps – ATLAS and CMS plans for GridPP5 can be covered and won’t require explicit representation.

ACTION 607.3: PG and SL will circulate details of F2F meeting on 24 October at Queen Mary.

ACTION 607.4: ALL to contribute to the OSC Project Status Report.

ACTION 607.5: DB to contribute Introduction for OSC report.

ACTION 607.6: DB & PC to contribute Wider Context for OSC report.

ACTION 607.7: GS and AS to contribute Tier1 Status Report for OSC report.

ACTION 607.8: JC to contribute Deployment Status for OSC Report.

ACTION 607.9: RJ to contribute ATLAS User Report for OSC Report.

ACTION 607.10: DC to contribute LHCb User Report for OSC Report.

ACTION 607.11: SL (and Tom Whytnie) to contribute Impact and Dissemination Report for OSC Report.

3. Tier-1 review date?
=====================
AS is currently undertaking an internal poll on this – GS will progress this to ensure a meeting can be scheduled in good time, taking account of other PMB commitments.

4. AOCB
=======
a) F2F Doodle Poll – this is currently incomplete; TC cannot attend. It was agreed that 24 October is the preferred date for most members – JC cannot attend on any of the dates and RJ is checking whether another meeting can be moved to allow attendance. SL has reserved a meeting room.

5. Standing Items
===================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
Nothing of significance to report.

SI-1 Dissemination Report (SL)
------------------------------
Nothing of significance to report.

SI-2 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
Nothing of significance to report.

SI-3 CMS Weekly Review and Plans (DC)
-------------------------------------
Nothing of significance to report.

SI-4 LHCb Weekly Review and Plans (PC)
--------------------------------------
Nothing of significance to report.

SI-5 Production Manager’s report (JC)
-------------------------------------
1. The WLCG T2 availability/reliability figures for August show few problems:

* ALICE: http://wlcg-sam.cern.ch/reports/2016/201608/wlcg/WLCG_All_Sites_ALICE_Aug2016.pdf
All okay

* ATLAS: http://wlcg-sam.cern.ch/reports/2016/201608/wlcg/WLCG_All_Sites_ATLAS_Aug2016.pdf

Glasgow: 86%:97%
Oxford: 82%:82%

* CMS: http://wlcg-sam.cern.ch/reports/2016/201608/wlcg/WLCG_All_Sites_CMS_Aug2016.pdf
All okay

* LHCb: http://wlcg-sam.cern.ch/reports/2016/201608/wlcg/WLCG_All_Sites_LHCB_Aug2016.pdf
All okay (but note ECDF as N/A).

Glasgow: Glasgow availability was down due to a power cut in their machine room at the beginning of the month. It took a few days to recover from it.
Oxford: Oxford was down for a few days due to an A/C failure on Friday 12th August. The cluster was shut down and restored on Monday 15th.

2. UK eScience CA – certificate issuance problems. Jens reported that on the 15th a partial but significant database corruption occurred on the signing system for the CA. It was unclear if this contributed to a problem encountered with signing in that period. Data was restored from (offline) backups but the rebuild was not correctly configured and this was only resolved today. No data was lost but signing was not possible for a few days. A post-mortem report is being written.

3. We are down to 4 contributors again for the ROD work. Ideally the Tier-1 will replace Gareth’s contribution in due course (he is now working <0.5FTE).

4. A large number of site admins and other GridPP supporters appeared to be suspended from the dteam VO last week. The GRNET maintainers of the VOMS responded “During a planned [1] upgrade operation of VOMS service, a system malfunction occurred. As a result, some users received false notification about membership expiration. We are in contact with the software development team in order to identify the cause.”

5. Matt Doidge will join the GridPP security team. Others have become less active or left the team leading to a membership review.

SI-6 Tier-1 Manager's Report (GS)
---------------------------------
Staffing:
- Looking at how to run the Tier1 Production Team with one member of staff leaving (effectively at the end of this week), another being temporarily off work and my reduced hours.

Castor:
- Testing of Castor 2.1.15 is proceeding OK. Progress was made on what looked like problems to do with draining and tape access. We anticipate doing the scheduling of the change in the next week or so.

Tape System:
- The 'preventative maintenance' on the tape libraries took place OK on Tuesday 13th September. Oracle now want to come and make some changes to the tape libraries and we are looking to schedule this early November.

Batch System:
- HPE Worker nodes: The re-cabling has been done by HPE. The systems are now ready for us. They are being used to test a configuration whereby they will run SL7 with a SL6 WN running in a container.
- We had a problem with the cvmfs squids yesterday between around 11am and 7pm. This is a repeat of a (shorter lived) problem we had a couple of weeks ago. Investigations have not concluded but it looks like the tcp parameters on our cvmfs stratum1 server need adjusting.

SI-7 LCG Management Board Report of Issues (DB)
-----------------------------------------------
Meeting is tomorrow and DB will report next week.

SI-8 External Contexts (PG)
---------------------------
UKTO phone calls are awaited.

REVIEW OF ACTIONS
=================
600.1: DC to contact Julia Sedgebeer at Imperial to informally discuss and address SuperNemo’s computing needs and request Daniella and Tom to await outcome of these discussions before progressing further. Ongoing.
602.3: AS will request that Jens make a presentation to the PMB supported by a written report on plans for AAAI project with Dirac as well as proposed reporting. (UPDATE: AS will try to set up date for Jens to present remotely at a PMB soon) Ongoing.
603.1: AS will discuss with Jens and confirm agreement for 150 TB tape storage capacity for the Nuclear Physics request on the provision it can be accessed by existing mechanisms in the GridPP suite of tools. Ongoing.
605.1: DK will investigate costs and timescales of upgrading the OPN Link to 30 and report back to PMB. Ongoing.
605.2: PG will go through data required with PMB members concerned this week to agree inclusion in the project map and reports. Ongoing.
605.3: ALL members to review risks on the register to which their names are attached, and provide interim feedback. (UPDATE: PG will provide the most up to date re-ordered version) Ongoing.
606.1: DB will discuss with Tony Medland an OSC strategy for delivery vs pledges. Done.
606.2: PG will set up a doodle for F2F in advance of OSC. Done.
606.3: AS will propose a convenient date for Tier1 review and circulate to PMB for consideration. Ongoing.

ACTIONS AS OF 19.09.16
======================
600.1: DC to contact Julia Sedgebeer at Imperial to informally discuss and address SuperNemo’s computing needs and request Daniella and Tom to await outcome of these discussions before progressing further. Ongoing.
602.3: AS will request that Jens make a presentation to the PMB supported by a written report on plans for AAAI project with Dirac as well as proposed reporting. (UPDATE: AS will try to set up date for Jens to present remotely at a PMB soon) Ongoing.
603.1: AS will discuss with Jens and confirm agreement for 150 TB tape storage capacity for the Nuclear Physics request on the provision it can be accessed by existing mechanisms in the GridPP suite of tools. Ongoing.
605.1: DK will investigate costs and timescales of upgrading the OPN Link to 30 and report back to PMB. Ongoing.
605.2: PG will go through data required with PMB members concerned this week to agree inclusion in the project map and reports. Ongoing.
605.3: ALL members to review risks on the register to which their names are attached, and provide interim feedback. (UPDATE: PG will provide the most up to date re-ordered version) Ongoing.
606.3: AS will propose a convenient date for Tier1 review and circulate to PMB for consideration. Ongoing.
607.1: AS will provide figures on Tier1 and Tier2 pledges and available spend soon.
607.2: PG will produce a spreadsheet containing explicit detail on Capital and Resource for Tier1, as well as Tier1 and Tier2 pledges, to include LHC requirements.
607.3: PG and SL will circulate details of F2F meeting on 24 October at Queen Mary.
607.4: ALL to contribute to the OSC Project Status Report.
607.5: DB to contribute Introduction for OSC report.
607.6: DB & PC to contribute Wider Context for OSC report.
607.7: GS and AS to contribute Tier1 Status Report for OSC report.
607.8: JC to contribute Deployment Status for OSC Report.
607.9: RJ to contribute ATLAS User Report for OSC Report.
607.10: DC to contribute LHCb User Report for OSC Report.
607.11: SL (and Tom Whytnie) to contribute Impact and Dissemination Report for OSC Report.