GridPP PMB Meeting 626 (06.03.17)
Present: Pete Gronbech (Chair & Minutes), Tony Cass, Pete Clarke, Jeremy Coles, David Colling, Roger Jones, Steve Lloyd, Andrew Sansum, Gareth Smith.

Apologies: Dave Britton, Tony Doyle, Dave Kelsey, Andrew McNab.

1. Services at the Tier-1
AS had been looking at some of the services run at the Tier-1 that are little used by the LHC community. Decommissioning of the WMS will start, possibly later this year.
Although the LHC experiments are not using it, T2K is, so an alternative solution would be required for them.
Other such services include the LFC and the ATLAS Frontier service.

2. ResearchFish
PC pointed out that the GridPP Coordination grant is now linked to the various Particle Physics Consolidated Grants and directly to the LHC experiment grants. PIs can check this in the linked section.
GridPP PIs should just submit their GridPP grant, and they should automatically get the outcomes that were submitted centrally.
They should see that their Staff grant is linked to the GridPP Coordination grant.
Look for the ‘My linked awards’ section and check each grant listed there.

3. AS reported on AENEAS
A kick-off meeting was held recently. It is an H2020 project, funded by the EU, to plan UK regional computing for SKA.
It was a planning meeting and an opportunity for participants to meet each other.
The vision for the Work Packages was presented.
AS will send a summary.
STFC was funded for 2 months/year to work on planning infrastructure, on the storage side.
AS was asked by Rosie and Anna whether AS would run one of the tasks.
AS is unfunded on the project, but the involvement could be useful for GridPP.
A Work Package around testing, with small capability tests, might be an option.
The difficulty is squaring the circle of having no funds.
AS may discuss with PC.

DC noted that ATLAS are having a containers workshop on Wednesday 8th March.

5. Standing Items

SI-0 Bi-Weekly Report from Technical Group (DC)
Migration to C7 (CentOS 7) or other variants had been discussed at the last meeting.
What can we do to help?
DC took notes, and they are on the Technical Group's agenda page.
Normal catch up on VAC and other items.

SI-1 Dissemination Report (SL)
Nothing of significance to report.

SI-2 ATLAS Weekly Review and Plans (RJ)
Nothing of significance to report.

SI-3 CMS Weekly Review and Plans (DC)
Problems with databases for CMS. Not UK specific.

SI-4 LHCb Weekly Review and Plans (PC)
Nothing of significance to report.

SI-5 Production Manager’s report (JC)
Some operations updates/news:

1) We have received the WLCG T2 R/A figures for February.

– All okay

– Oxford 84%:84%

– All okay

– QMUL 84%:84%
– Durham 88%:88%

QMUL: No known issues. Following up with LHCb.
Durham: A batch system upgrade led to one outage, and a University-wide internet connection loss led to another.

2) This year’s WLCG workshop will be held June 19-22 in Manchester.

3) There has been a kick-off meeting of the WLCG Data Management steering group.

4) The CERN based VOMS has experienced some issues with AUP re-signing. A bug was identified.

5) The agenda for this month’s WLCG GDB is available. I believe the focus will be on sites/issues in the Asia Pacific region.

6) Registration is open for the EGI Conference 2017 and INDIGO Summit 2017.

7) Steady progress is now being made with IPv6. LHCb can now run on pure IPv6 resources.

8) We have seen increased activity from several of our “incubator” VOs, though several are now more correctly “production” rather than in incubation. DUNE for example is undertaking a large run which is stretching the GridPP DIRAC service. HPC DIRAC has several sites in transfer testing mode. LSST in the US has recently moved towards larger scale testing of Grid resources (via Panda on OSG) and the LSST VO. The skatelescope VO users have been looking at options to import LOFAR data.

SI-6 Tier-1 Manager’s Report (GS)
– Successful UPS/Generator load test last Tuesday.

– We have an ongoing problem with the SRM SAM tests for ATLAS, which are failing a lot of the time. We have confirmed this is not affecting ATLAS operationally; it is just the tests that fail. We have a GGUS ticket open with ATLAS as the test appears to be problematic.
– We continue to fail CMS SRM SAM tests sporadically with timeouts.

– Ongoing testing of two batches of worker nodes in a new configuration with SL7 and the jobs themselves running in SL6 containers.
These nodes are currently running jobs from (only) the LHC VOs.

– We saw a higher rate of packet loss (via perfSONAR) between the 14th and 24th February. However, we did not find the cause of this.
– The next step is enabling IPv6 on the Tier1 router pair (the Extreme x670 systems). This was scheduled for last week but was delayed until this Wednesday (8th March). We will disable IPv6 across our systems and then plan to re-enable it on a case-by-case basis.

SI-7 LCG Management Board Report of Issues (DB)
Nothing to report.

SI-8 External Contexts (PC)
No further progress.
Lots of awareness but nothing concrete to report.
DC reported that at the Cloud workshop a group had talked about setting up an HTC taskforce.
David Salmon and DC would be the main contacts, and Andrew McNab also.
It could be useful, but only if we have the time, effort and inclination.
The purpose is to set objectives that can be realized in a reasonable timescale.
These would be short term, perhaps a continuation of some of the work Andrew Lahiff had been doing.
Without specific targets it would not be useful.

PC may be having an informal chat with Catapult.
Possible interaction with industry, very informal. Who in London would be able to talk to any interested industrial partners?
DC would be happy to talk to them.
Bit stretched in this area.

ACTIONS AS OF 06.03.17
616.3: DB and SL will discuss how best to progress replacement of TW’s role. (Update: DB and SL have amended and are awaiting DB’s final comments). Ongoing.
620.1: DB to contact DK re the procedure to deal with a security incident and the media. (Update: DK had devised an interim statement which involved TW as dissemination officer, who is no longer in post – there is no prescriptive full response as this would be dependent on circumstances and probably involve an emergency PMB and communication with relevant PR representatives). DK will send the statement to PMB in case required in future – spokesman SL as head of board or DB as project leader. Ongoing.
623.3: DB and AS will discuss how best to summarise the Tier1 review. (Update: a brief summary will be written up and presented). Ongoing.
623.4: GS will upload talks from the Tier1 review to the Agenda. Done.
624.1: AS will rework tape modelling taking account of recent changes. Ongoing.
624.2: PG will firm up the agenda and alternative topics in Session 4 and 5. Ongoing.
624.3: PG and AM will work up a SurveyMonkey for sites to outline what they are currently working on and future plans. Ongoing.