GridPP PMB Meeting 613

GridPP PMB Meeting 613 (14/11/16)
=================================
Present: Dave Britton(Chair), Tony Cass, Pete Clarke, Tony Doyle, Pete Gronbech, Steve Lloyd, Andrew McNab, Andrew Sansum, Gareth Smith, Louisa Campbell (Minutes).

Apologies: Jeremy Coles, Dave Kelsey, David Colling, Roger Jones

1. CMS issue at Tier-1
=========================
The issue of CMS usage at the Tier-1 was discussed. Full information is not yet available, but the issue was raised to the OSC for full disclosure. PG summarised that the UK provided a total resource of only 5.3% against the 8% planned. Looking at the global map of CMS, we provided c. 8% in 2014 and 2015 and 4% so far this year. This is, therefore, an ongoing issue, and figures were presented at several resource meetings with no serious issues raised by CMS. Low CMS job efficiency has been regularly mentioned during monthly summaries, and this inefficiency is one reason the 2015 delivery in HEPSPEC was less than 8%. Additionally, the CMS liaison person at RAL has, at times, restricted the number of jobs that could run in order to protect the infrastructure, which also had an impact.

GS confirmed that CMS efficiencies regarding disk access were being examined. The changes made recently were twofold: the number of multi-core jobs to be run, and an increase in jobs draining out so that slots are available to meet demand. The issue was noted as resolved by moving to CEPH. Full information will not be available until Andrew Lahiffe returns; in the meantime we must work with factual data.

At the resource usage meetings (which PG and AS attend), CMS has come out low on the scale throughout the past year; it was assumed that CMS was using what it required, rather than that there were any concerns over availability. There may be a disconnect between CMS internationally, if it believed there was an issue, and CMS UK, which did not communicate any concerns. A post mortem will be undertaken by AS to understand what occurred and why this was not picked up on or raised, in order to ensure that this does not recur and that such issues are effectively communicated in future. Monthly liaison meetings and other meetings and reports should show up inefficiencies or low performance.
ACTION 613.1: AS will undertake a post mortem on CMS issues at Tier-1.

2. OC Meeting Talk to prepare
=============================
The agenda has allocated a 45-minute slot, which will include questions. DB will prepare a talk to ensure the risks are discussed as a priority; these are best presented early to pre-empt most questions. PG noted the lack of a financial report, which should be produced for information in case of related questions – he will provide slides with tables for information (e.g. capital vs resource). DK has provided some additional information for FY17 re the £390K – this has not been included in the OSC documents as Tony Medland’s guidance would be useful in advance of a final decision. The OSC meeting commences at 11.00 and the presentation slot is at 11.45 – DB, PG and AS will meet beforehand.
ACTION 613.2: DB will prepare one talk to present to the OC Meeting.
ACTION 613.3: PG will create slides containing tables and notes for information with resource, capital and financials.

3. Tier-2 HW grants – status
============================
PG summarised that every university has completed a JES form; 10 have been received by STFC, including Imperial, and are already with Council. Some remain in JES and are not yet with Council: Brunel, Queen Mary, Royal Holloway, Glasgow, Sheffield and Bristol. PG is determining which sites can spend this FY. Imperial and Glasgow may be able to spend, and DB has initiated a procurement at Glasgow. STFC wish £500K-£600K to be spent across different sites; each site must spend at least £10K and most should spend at least £50K – Glasgow will spend £190K. PG will follow up and remind sites they must spend £10K initially on capital, e.g. network switches, or disk and network switches; this will very much depend on the procurement rules of individual institutions. PG will thank sites for submitting JES forms, remind them to commence procurement, and provide guidance.
ACTION 613.4: PG will write to sites to provide instructions and request they commence procurement of minimum £10K for capital.

4. Q3 reports status
========================
Three reports have been received and PG has sent out reminders. Much of the information can be copied & pasted from previous reports with elements updated as necessary, e.g. narratives.
ACTION 613.5: ALL submit Q3 reports to PG.

5. AOCB
=======
a) CDT bids have been submitted to STFC. Only 2 bids requested a letter of support from GridPP – DB sent round a draft letter of support to PMB and did not receive any objections or suggestions for change. The Dirac letter of support was signed by Jeremy Yates as director.

6. Standing Items
===================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
DC not present – no report submitted.

SI-1 Dissemination Report (SL)
------------------------------
## GridPP Engagement Officer Notes for PMB

### Piloting secure CVMFS repositories at RAL

According to Catalin, it is now possible to create secure (i.e. non-world-readable) CVMFS repositories, and he would be happy to host some on the RAL Stratum-0. This would be useful for commercial users or those with software licensing use cases. However, some development work would be required, so if anyone knows any friendly users who might have such a need, please let him know. (Both CERN@school and MoEDAL are completely Open Source.)

DB noted this could be very useful for non-Physics users but questioned how it is planned to be implemented, adding that the method of access should not be highly bespoke. SL confirmed it is likely to be associated with a proxy. SL will advise Catalin that the PMB supports this, and that it should perhaps be considered for the next Horizon 2020 or EGI bid in due course.

ACTION 613.6: SL will advise Catalin that the PMB supports development work on creating secure CVMFS repositories hosted on the RAL Stratum-0 and may consider wider bids once the scope is better understood.

SI-2 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
The site move in ATLAS has been actioned. Some aspects were not fully tested, which caused some issues at RAL. Other issues are also being addressed (e.g. the QMUL breakage because it uses cluster – this was the result of a global issue, not a UK-based one).

SI-3 CMS Weekly Review and Plans (DC)
-------------------------------------
DC not present – no report submitted.

SI-4 LHCb Weekly Review and Plans (PC)
--------------------------------------
Nothing of significance to report.

SI-5 Production Manager’s report (JC)
-------------------------------------
JC not present – no report submitted.

SI-6 Tier-1 Manager’s Report (GS)
---------------------------------
General:
Still some mopping up from the “dirty cow” vulnerability to be done. We are now looking at what we need to do regarding the recently announced CVE-2016-7117 vulnerability.

Castor:
– Disk servers being prepared to go into both LHCb (12 servers, each 120TB) and Alice (5 servers, each 100TB). These are servers from the 2014 purchases. This will both enable the retirement of some old (’11) servers and provide an increase in disk space.
– As reported before the testing of Castor 2.1.15 is largely complete. Owing to staff availability this update will be carried out in the New Year, with the intention of completing it by the end of January.
– We are looking to merge smaller disk pools into larger ones for both LHCb and Atlas.
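
As a rough check on the Castor disk server figures above, the gross capacity being added can be tallied with a short sketch. This uses only the server counts and sizes quoted in the report; the variable and function names are illustrative assumptions and not part of any Tier-1 tooling. The figures are gross only: the net increase will be smaller once the old (’11) servers are retired, and that amount is not stated here.

```python
# Gross capacity of the disk servers being prepared, from the figures quoted above.
# The data layout and names are hypothetical, chosen only for this illustration.
new_servers = {
    "LHCb": {"count": 12, "tb_each": 120},
    "Alice": {"count": 5, "tb_each": 100},
}


def gross_capacity_tb(servers):
    """Return gross TB per experiment: server count times TB per server."""
    return {name: spec["count"] * spec["tb_each"] for name, spec in servers.items()}


capacity = gross_capacity_tb(new_servers)
total_tb = sum(capacity.values())

for name, tb in capacity.items():
    print(f"{name}: {tb} TB gross")
print(f"Total: {total_tb} TB gross")
```

This gives 1440 TB for LHCb and 500 TB for Alice, 1940 TB gross in total, before any retirements are taken into account.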

Tape:
– There will be a short interruption to tape mounts tomorrow morning while the library control server (that runs the “ACSLS software”) is swapped.
– Migration of LHCb data from ‘C’ to ‘D’ tapes ongoing.

Services:
– We have had a problem with the Atlas Frontier service that was fixed this morning.

SI-7 LCG Management Board Report of Issues (DB)
-----------------------------------------------
MB is tomorrow so no report is available today.

SI-8 External Contexts (PC)
---------------------------
There has been an iteration of the BEIS document for UKT0 bids; AS and DC made changes and these have been received from Charlotte today. Charlotte was very positive about this, and the questions that have come back so far were not conceptual. AS noted that the re-profiling request regarding the planned requirements over the next 4 years required some work – but the original table was contained in the final iteration. Some clarity may be required as to the finance and re-profiling for FY16, and there may ultimately be some other high-level questions. The autumn statement is expected in 9 days, which may bring clarity.

REVIEW OF ACTIONS
=================
605.1: DK will investigate costs and timescales of upgrading the OPN Link to 30 and report back to PMB. Ongoing.
606.3: AS will propose a convenient date for Tier1 review and circulate to PMB for consideration. Done.
607.2: PG will produce a spreadsheet containing explicit detail on Capital and Resource for Tier1, as well as Tier1 and Tier2 pledges, to include LHC requirements. Done.
607.4: ALL to contribute to the OSC Project Status Report. (Almost complete) Done.
607.8: JC to contribute Deployment Status for OSC Report. Done.
610.1: AS/GS Produce suggestions for one or more metrics that will summarise the Tier1 network availability/performance. Ongoing.

610.2: ALL – review Pete’s comments in the metrics spreadsheet and act accordingly. Done.
610.3: AS Attempt to get tape media re-classified from resource to capital. Ongoing.
610.4: AS/DB Contact Tony Medland to get new budget allocation (regarding extra capital) in writing so we can start procurement. Done.
610.5: AS Provide numbers/details for H2020 bids. DB will contextualize them. Done.
610.6: GS Produce report on how Tier1 missed that a very low number of CMS jobs were running and therefore fell significantly behind running the CMS re-reco jobs. ACTION (613.1) ON AS TO PERFORM POSTMORTEM

612.1: DB will draft text for Section 3 of the OSC project management plan by tomorrow morning. Done.
612.2: PG will finalise and submit OSC documents by tomorrow. Done.
612.3: PG will determine which small sites can undertake procurement this FY. Ongoing.
612.4: DC will circulate minutes from latest Technical Group meeting. Done.

ACTIONS AS OF 14.11.16
======================
605.1: DK will investigate costs and timescales of upgrading the OPN Link to 30 and report back to PMB. Ongoing.
610.1: AS/GS Produce suggestions for one or more metrics that will summarise the Tier1 network availability/performance. Ongoing.
610.3: AS Attempt to get tape media re-classified from resource to capital. Ongoing.
612.3: PG will determine which small sites can undertake procurement this FY. Ongoing.
613.1: AS will undertake a post mortem on CMS issues at Tier-1.
613.2: DB will prepare one talk to present to the OC Meeting.
613.3: PG will create slides containing tables and notes for information with resource, capital and financials.
613.4: PG will write to sites to provide instructions and request they commence procurement of minimum £10K for capital.
613.5: ALL submit Q3 reports to PG.
613.6: SL will advise Catalin that the PMB supports development work on creating secure CVMFS repositories hosted on the RAL Stratum-0 and may consider wider bids once the scope is better understood.