GridPP PMB Meeting 627

GridPP PMB Meeting 627 (13.03.17)
=================================
Present: Pete Gronbech (Chair), Dave Britton, Tony Cass, Jeremy Coles, Tony Doyle, Roger Jones, Dave Kelsey, Andrew McNab, Gareth Smith, Louisa Campbell (Minutes).

Apologies: Andrew Sansum, Pete Clarke, David Colling, Steve Lloyd.

1. Researchfish
===============
The deadline is Thursday and PG intends to submit by tomorrow – papers have been linked in. We have some 700 papers listed against projects. DB will look at Researchfish tomorrow morning before PG formally submits. PG will check whether everything is updated when a linked project is updated.

2. GridPP Web Page Editors
==========================
There was a request from Edinburgh to remove an old job advert and post a new one. This reignited the question of who is now in control of the web pages – AM is in full technical control, but TW was previously dealing with content. JC, PG and SL should be added to all areas for editing. SL is now advertising for TW’s replacement, so this will be revisited thereafter.

3. GridPP38 Agenda
==================
The agenda is progressing slowly, with some talks from QMUL, and DC is making arrangements for Imperial. Birmingham will give a talk remotely and there are various technical talks from AM, Sam Skipsey and Andrew Lahiff. Gavin has agreed to give the Tier-0 talk and the Echo talk may be moved forward.
QMUL will talk, but no small sites have been approached as yet. As the host institution, Sussex should speak and possibly Durham due to recent changes to the way the site operates. Cambridge may wish to contribute, perhaps John Hill? – JC will approach John. Jens will also speak. There is a non-HEP section, but this has not as yet been worked up.
Mid-size sites, e.g. Holloway, Oxford and possibly Liverpool, need to be considered.
The Tier-2 evolution topic raised at a previous PMB, at the F2F meeting, should be discussed on the afternoon of Day 1 of the collaboration meeting. DB will give this some consideration. The SurveyMonkey survey being arranged by PG and AM may lend some insight into this and help in deciding the placement of discussion slots. The sponsor talk will be limited to 10-15 minutes.
ACTION 627.1: DB will give consideration to an appropriate slot for a discussion of Tier-2 Evolution at GridPP38.
ACTION 627.2: PG will ask Jeremy Marris at Sussex to talk.

4. AOCB
=======
a) Quarterly report status
PG is chasing this up more firmly now and has sent out reminder emails.

5. Standing Items
=================

SI-0 Bi-Weekly Report from Technical Group (DC)
———————————————–
DC did not attend. No Report submitted.

SI-1 Dissemination Report (SL)
——————————
SL did not attend. No Report submitted.

SI-2 ATLAS Weekly Review and Plans (RJ)
—————————————
ATLAS week – RJ will have an update next week.

SI-3 CMS Weekly Review and Plans (DC)
————————————-
Nothing significant to report.

SI-4 LHCb Weekly Review and Plans (PC)
————————————–
Nothing significant to report.

SI-5 Production Manager’s report (JC)
————————————-
1. VO Nagios is in use in GridPP for running tests automatically on behalf of several regional VOs. Due to OS support ending at the end of the month for the current implementation, we will shortly have to decommission the service with no replacement.

2. A MySQL sleep injection was detected on a web server associated with one of the VOs supported by GridPP.

3. The agenda for last Wednesday’s GDB can be found at http://indico.cern.ch/event/578984/. Topics covered: ASGC report; Asian Tier forum report; an update from the Traceability & Isolation WG; IPv6 rollout status; incident response of identity federations; workload management trends in WLCG; and T1 Configuration Evolution and Options.

SI-6 Tier-1 Manager’s Report (GS)
———————————
General:
– Work has started on replacing two of the chillers for the R89 machine room air-conditioning.
– There was a problem with the Microsoft Hyper-V 2012 hypervisor cluster on Thursday afternoon. Two out of the five nodes appear to have updated a particular component – and attempted to move VMs to other nodes. It took a few hours for the system to recover. This affected a number of services including BDIIs, FTS nodes and CEs. However, the resilience built into these services meant that the operational effect was small.

Castor:
– We have an ongoing problem with the SRM SAM tests for ATLAS, which are failing a lot of the time. We have confirmed this is not affecting ATLAS operationally; it is just the tests that fail. We still have a GGUS ticket open with ATLAS as the test appears to be problematic.
– There was a very large FTS transfer queue for ATLAS during the middle of last week. This was resolved by ATLAS.
– We continue to fail CMS SRM SAM tests sporadically with timeouts.
– Some newer disk servers (’14 generation) are being brought into service. These will replace some older (’12 generation) servers.

ECHO:
Two additional ‘MON’ boxes are being set up, bringing the total to five. The existing three can cope with normal activity, but the additional ones would speed up recoveries and starts. Two additional gateway nodes are also being set up (also bringing the total to five), which will improve access bandwidth.

Batch:
– Testing continues on two batches of worker nodes in a new configuration with SL7, with the jobs themselves running in SL6 containers.
As stated before, these nodes are currently running jobs from (only) the LHC VOs.

Networking:
On Wednesday IPv6 was enabled in the Tier1 router pair (the Extreme x670 systems). On the day before, IPv6 was disabled across our systems. The plan is to re-enable it on a case-by-case basis as required. The next step is to agree our IPv6 addressing scheme ahead of getting the production perfSONAR nodes IPv6 enabled.

Scotgrid coolers are down today due to an issue over the weekend.

SI-7 LCG Management Board Report of Issues (DB)
———————————————–
This has been rescheduled to 21.03.17. DB cannot attend, but PC or AS will coordinate, attend and report.

SI-8 External Contexts (PC)
———————————
PC did not attend. No report submitted.

REVIEW OF ACTIONS
=================
616.3: DB and SL will discuss how best to progress replacement of TW’s role. (Update: DB and SL have amended and are awaiting DB’s final comments) Ongoing.
620.1: DB to contact DK re the procedure to deal with a security incident and the media. (Update: DK had devised an interim statement which involved TW as dissemination officer and he is no longer in post – there is no prescriptive full response as this would be dependent on circumstances and probably involve an emergency PMB and communication with relevant PR representatives). DK will send the statement to PMB in case required in future – spokesman SL as head of board or DB as project leader. Ongoing.
623.3: DB and AS will discuss how best to summarise the Tier1 review. (Update: a brief summary will be written up and presented). Ongoing.
624.1: AS will rework tape modelling taking account of recent changes. Ongoing.
624.2: PG will firm up the agenda and alternative topics in Session 4 and 5. Ongoing.
624.3: PG and AM will work up a SurveyMonkey for sites to outline what they are currently working on and future plans. (Update: AM will create a wiki page instead of a SurveyMonkey survey). Done.

ACTIONS AS OF 13.03.17
======================
616.3: DB and SL will discuss how best to progress replacement of TW’s role. (Update: DB and SL have amended and are awaiting DB’s final comments) Ongoing.
620.1: DB to contact DK re the procedure to deal with a security incident and the media. (Update: DK had devised an interim statement which involved TW as dissemination officer and he is no longer in post – there is no prescriptive full response as this would be dependent on circumstances and probably involve an emergency PMB and communication with relevant PR representatives). DK will send the statement to PMB in case required in future – spokesman SL as head of board or DB as project leader. Ongoing.
623.3: DB and AS will discuss how best to summarise the Tier1 review. (Update: a brief summary will be written up and presented). Ongoing.
624.1: AS will rework tape modelling taking account of recent changes. Ongoing.
624.2: PG will firm up the GridPP38 agenda and alternative topics in Session 4 and 5. Ongoing.
627.1: DB will give consideration to an appropriate slot for a discussion of Tier-2 Evolution at GridPP38.
627.2: PG will ask Jeremy Marris at Sussex to talk.