GridPP PMB Meeting 625


GridPP PMB Meeting 625 (27.02.17)
=================================
Present: Pete Gronbech (Chair), Jeremy Coles, David Colling, Roger Jones, Steve Lloyd, Andrew McNab, Gareth Smith, Louisa Campbell (Minutes).

Apologies: Dave Britton, Tony Cass, Tony Doyle, Dave Kelsey, Pete Clarke, Andrew Sansum.

1. WMS Service and other services (Frontier) at the Tier-1
==========================================================
AS was unable to attend the meeting today – item deferred until next week.

2. LZ storage requirements
==========================
DC gave an update on a centre for LZ. At the moment their requirements are a modest 200TB, but by 2020 they will begin collecting data and their requirements will increase significantly. DC advised they will be slotted into the project following GridPP5; 670TB is the projected figure by the end of GridPP5. They have a US data centre and a European centre at Imperial. The requirement is for disk only, so it is not a Tier1 requirement but a small uplift in the equipment grant: not significant, but not easily absorbed otherwise. DB will comment on this, but they were not listed explicitly in the GridPP5 project proposal. In the LZ proposal they noted a desire to have support from the Grid and to host equipment in a Grid site, which was well received. They were awarded the grant but did not receive any support for computing and were advised to approach the Grid for support. There is a possibility something will be included in a second tranche. DC is checking whether they will be involved in a WLCG workshop in April to share information about current programmes/technologies and check for any overlaps.

3. Report from the PDG Meeting at Daresbury
===========================================
DC attended this meeting and has circulated minutes. In summary, there were discussions on infrastructures, on costing resources using cloud instead, and on Jasmine. AS also attended and did a costing for RAL.

4. ResearchFish
===============
PC, who is not attending today, has been in discussion with Ian Fuller and contacts at STFC and has managed to link many of the consolidated grants and Grid grants. PG has linked some and is continuing to work on this; he has done the ATLAS ones and PC has done the LHCb ones. PG will now look at some of the other projects.

5. AOCB
=======
a) Quarterly Report Status
PG put out a reminder asking for submission of reports from anyone concerned.

6. Standing Items
=================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
DC missed the last meeting so there is nothing to report.

SI-1 Dissemination Report (SL)
------------------------------
Nothing of significance to report.

SI-2 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
Nothing of significance to report.

SI-3 CMS Weekly Review and Plans (DC)
-------------------------------------
Nothing of significance to report.

SI-4 LHCb Weekly Review and Plans (PC)
--------------------------------------
Nothing of significance to report. JC noted switching over from Castor to Echo – looking at using some redirection features to hide changes in the background.

SI-5 Production Manager’s report (JC)
-------------------------------------
Low availability at Birmingham and RAL. EGI have just moved the Dark Side into production. Apart from that there is nothing of significance to report.

SI-6 Tier-1 Manager’s Report (GS)
---------------------------------
Castor:
– There was a problem with the LHCb SRMs overnight last Wednesday – Thursday. This resolved itself in the morning.
– There has been a problem over the weekend with the SRM SAM tests for ATLAS, which are failing most of the time. We have confirmed this is not affecting ATLAS operationally – it is just the tests that fail. Investigations are ongoing.

Batch:
– We have two batches of worker nodes in a new configuration with SL7 and the jobs themselves running in SL6 containers. These nodes are currently running jobs from (only) the LHC VOs.

Networking:
– IPv6 was enabled on the Tier1 OPN Router last Wednesday (22nd Feb). This went well, although a reload of the router was required. The next step is enabling IPv6 on the Tier1 router pair (the Extreme x670 systems), which will take place this Wednesday (1st March).

ECHO:
– There was a problem with one ‘placement group’ that resulted in a loss of data (2000 ATLAS files). This has been followed up and understood, in conjunction with the Ceph developers, and is being presented in a Ceph forum. The understanding gained means that should this recur there would be no data loss.

JC enquired about SL5 in relation to Castor – GS confirmed this will happen, but the only frontline service on SL5 is the SRM nodes. Now that we have moved to 215 we are working on the update by the end of March. There are a number of nodes on database machines that are still SL5, which are being supported. Internally, part of this work will be to remove an Oracle layer which is no longer supported.

JC asked about nominations for ROD work; GS is progressing this.

SI-7 LCG Management Board Report of Issues (DB)
-----------------------------------------------
Nothing to report.

SI-8 External Contexts (PC)
---------------------------
PC is absent – no report submitted.

REVIEW OF ACTIONS
=================
616.3: DB and SL will discuss how best to progress replacement of TW’s role. (Update: DB has reviewed and now awaits admin). Ongoing.
620.1: DB to contact DK re the procedure to deal with a security incident and the media. (Update: DK had devised an interim statement which involved TW as dissemination officer and he is no longer in post – there is no prescriptive full response as this would be dependent on circumstances and probably involve an emergency PMB and communication with relevant PR representatives). DK will send the statement to PMB in case required in future – spokesman SL as head of board or DB as project leader. Ongoing.
623.3: DB and AS will discuss how best to summarise the Tier1 review. (Update: a brief summary will be written up and presented). Ongoing.
623.4: GS will upload talks from the Tier1 review to the Agenda. Done.
624.1: AS will rework tape modelling taking account of recent changes. Ongoing.
624.2: PG will firm up the agenda and alternative topics in Session 4 and 5. Ongoing.
624.3: PG and AM will work up a SurveyMonkey for sites to outline what they are currently working on and future plans. Ongoing.

ACTIONS AS OF 27.02.17
======================
616.3: DB and SL will discuss how best to progress replacement of TW’s role. (Update: DB has reviewed and now awaits admin). Ongoing.
620.1: DB to contact DK re the procedure to deal with a security incident and the media. (Update: DK had devised an interim statement which involved TW as dissemination officer and he is no longer in post – there is no prescriptive full response as this would be dependent on circumstances and probably involve an emergency PMB and communication with relevant PR representatives). DK will send the statement to PMB in case required in future – spokesman SL as head of board or DB as project leader. Ongoing.
623.3: DB and AS will discuss how best to summarise the Tier1 review. (Update: a brief summary will be written up and presented). Ongoing.
624.1: AS will rework tape modelling taking account of recent changes. Ongoing.
624.2: PG will firm up the agenda and alternative topics in Session 4 and 5. Ongoing.
624.3: PG and AM will work up a SurveyMonkey for sites to outline what they are currently working on and future plans. Ongoing.