GridPP PMB Meeting 690 (10.12.18)
=================================
Present: Dave Britton (Chair), Pete Clarke, Jeremy Coles, David Colling, Alastair Dewhurst, Tony Doyle, Pete Gronbech, Jon Hays, Roger Jones, Steve Lloyd, Andrew McNab, Gareth Roy (Minutes), Andrew Sansum.

Apologies: Tony Cass, Dave Kelsey.

1. Authorship Fractions
=======================
DB had previously calculated the authorship fractions based on the CERN Greybook. DC had previously suggested that M&O numbers would be a better basis. On investigation these gave numbers that were similar to, but not the same as, those calculated from authorship, and a concern was raised that they may be skewed by the inclusion of PhD students. DB hoped for a consistent set of numbers with which to allocate and request resources for each experiment. DC will check the M&O figures from CMS and ensure the difference is understood.
ACTION 690.1: DC to confirm M&O numbers and identify any mismatch due to inclusion of PhD students.

2. EOSC-Hub request
===================
AM forwarded a proposal request from the BBC to EGI Federated Cloud:
“This is to inform you that we have received a proposal from the BBC R&D Video Coding team [1] in London. The team is interested to improve the current video coding standards using machine learning algorithms. To do so, they would like to use a Linux-based VM with at least 2 GPU (possibly: Quadro, Tesla or RTX cards that can work in FP16, half precision floating points) and a couple of Terabyte of storage.”
There was some discussion on this and it was felt that both QMUL & IC have resources that would fit the BBC’s requirements and that GridPP should respond to the proposal in a positive fashion.
ACTION 690.2: JH to follow up with BBC contacts on further potential collaboration regarding the proposal.
ACTION 690.3: AD to formally respond to EOSC-Hub on behalf of GridPP and express a willingness to provide resources.

3. AOCB
=======
a) GridPP42
Betws-y-Coed was ruled out as a possible location for GridPP42 as the conference suite was not large enough to accommodate the number of attendees. Cosners House was suggested as an alternative location, with the dates 23-25th April confirmed as available by AD. After some discussion it was felt that the cost and location would be appropriate, but that a better set of dates would be 24-26th April. AD will check availability and book if appropriate.
ACTION 690.4: AD to check availability of Cosners House for the dates of 24, 25, 26 April and to book if appropriate.

b) GridPP6
A holding email was received from STFC regarding the final guidelines for the GridPP6 proposal. Some concern was raised about the timescales for completing the proposal; it was understood that STFC was aware of the concern but wanted to ensure that a proper scope was given.

c) Data Infrastructure Roadmap
AS gave a report on this meeting, which aimed to create a document providing the capacity requirements for HTC & Data for eInfrastructure to UKRI. From the discussion it appeared that there was some confusion about how the numbers were to be pulled together, but it was important that GridPP and IRIS were accurately reported. PC, AS and DC were all involved and would work to ensure this happened correctly.

5. Standing Items
=================

SI-0 Bi-Weekly Report from Technical Group (DC)
-----------------------------------------------
A Technical meeting took place to examine GridPP’s approach to working with other VOs. PC & AD stated that much good discussion took place and that all attendees at the meeting were on board with the proposals and attitudes outlined in DB’s “To grid or not to grid” email.

SI-1 ATLAS Weekly Review and Plans (RJ)
---------------------------------------
– Migration to Harvester is complete. There are some problems with queues in the UK that may be related.
– Peter Love is working to improve monitoring of Harvester resources.

SI-2 CMS Weekly Review and Plans (DC)
-------------------------------------
Nothing significant to report.

SI-3 LHCb Weekly Review and Plans (PC)
--------------------------------------
Nothing significant to report.

SI-4 Production Manager’s report (JC)
-------------------------------------
1. The main collaboration-wide discussions over the last week or so have been around our approach to other VOs. There was the clarification email sent to TB-SUPPORT by DB on 5th December (“To grid or not to grid”) and also a Technical Meeting last Friday looking at the technical considerations and options.

2. There is a GDB this week – https://indico.cern.ch/event/651360/. Topics will include a data privacy update, reports from the recent LHCONE/LHCOPN workshop and the most recent Asia Tier Center Forum, and an update on containerised benchmarking tools. The pre-GDB on Tuesday is a meeting of the AuthZ Working Group.

3. We heard last week that CREAM-CE support will end in December 2020. We now have only 4 sites with CREAM that have not indicated that they are evaluating/launching alternatives, so this is not a large risk for us.

4. The WLCG ops coordination meeting last week focused on IPv6 progress. Good progress is being made but the plan for all T2s to have dual-stacked perfSONAR and storage by LS2 is some way off target. Approaches to further encourage sites are being considered, but it was acknowledged that lack of progress is often outside of the WLCG site admin control.

5. A WLCG/HSF/OSG workshop will be held at JLab March 18th-22nd 2019 (https://indico.cern.ch/e/how2019).

6. The EGI RP/RC A/R Report for November 2018 (http://argo.egi.eu/ar-ngi?month=2018-11) has the UK looking fine, though there was a noted drop in availability in mid-November due to issues at ECDF and Birmingham (https://tinyurl.com/yc9lpvqv).

7. As of last week UKI had 39 open GGUS tickets, which is only slightly higher than our background figure due to (internal) GridPP IPv6 tickets. The nature of the tickets indicates business as normal.

SI-5 Tier-1 Manager’s Report (AD)
---------------------------------
– Argo tests for CMS Castor were failing on Monday and Tuesday last week. This was the result of a BDII problem (it stopped publishing the information).

– There was a successful load test of the generator on Wednesday.

– ~5% of SAM tests via GridFTP against Echo have been failing due to the “Address already in use” problem. We are investigating the problem and have disabled NIS on the gateways, as it is not needed and was using up ~13000 ports. We are monitoring to see if the situation improves.

– CMS successfully migrated to the new consolidated Castor tape instance on Thursday.

– The physical machine hosting MySQL databases for the Tier-1 (RT ticket system + LFC) died on Thursday. The service was restored from backup on Friday on a VM.

– Procurement is ongoing. CPU procurement is waiting on XMA (who had to delay things as they were finishing their Disk tender for us).

– We received a request last week from LHCb, that they would like to reprocess all of their Run 1 + 2 data. This would involve a 4.7PB recall from Tape! We are in discussions with Raja, but my current feeling is we can’t accommodate this on their old Castor instance. They will need to migrate to Echo + new Tape instance and even then, this will need to be broken into blocks of say 500TB at a time.
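
As a rough illustration of the block-wise recall described above, the following minimal sketch splits the 4.7PB total into blocks of the indicative 500TB size. It assumes decimal units (1PB = 1000TB) and a fixed block size; the function name `recall_blocks` is hypothetical and the actual campaign sizes remain to be agreed with LHCb.

```python
def recall_blocks(total_tb: int, block_tb: int) -> list:
    """Split a recall of total_tb into consecutive blocks of at most block_tb (sizes in TB)."""
    if total_tb < 0 or block_tb <= 0:
        raise ValueError("total_tb must be >= 0 and block_tb must be > 0")
    n_full, remainder = divmod(total_tb, block_tb)
    blocks = [block_tb] * n_full
    if remainder:
        # The final, partial block carries whatever is left over.
        blocks.append(remainder)
    return blocks


TOTAL_TB = 4700  # 4.7PB, assuming 1PB = 1000TB
BLOCK_TB = 500   # indicative block size quoted in the report

blocks = recall_blocks(TOTAL_TB, BLOCK_TB)
print(f"{len(blocks)} blocks, last block {blocks[-1]}TB")
```

Under these assumptions the recall would take ten passes: nine full 500TB blocks and a final block of 200TB.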

AM was concerned that LHCb was focussed on the big picture and did not want to make a special case of RAL. DB felt that LHCb needed to be aware that the old system may not deliver the performance that LHCb required.

ACTION 690.5: AM to discuss with LHCb and identify requirements for Tape recall, as well as start dates and durations of the campaign.

SI-6 LCG Management Board Report of Issues (DB)
-----------------------------------------------
PC attended the WLCG MB meeting and noted:
– A joint HSF meeting was announced.
– Iain Bird presented slides on Open Access and Data Preservation, and a meeting will be organised accordingly.
– PC & DB had added wording to the EPPS Document; Iain Bird was very welcoming of the changes. DB suggested circulating the document within the UK.

SI-7 External Contexts (PC)
---------------------------
Nothing to report.

REVIEW OF ACTIONS
=================
644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress. 08/10/18 – Leicester are producing a PO for tapes and will send to AD to produce an invoice). Ongoing.
678.3: AD to finalise the Tier1 background document, including tape strategy by end September. (Update: Almost complete and will circulate current iteration for comment). Ongoing.
678.5: JC to finalise the Storage background document by end September.
(UPDATE: 17 October meeting with Tony Medland & DB and PC will attend. This is almost complete and awaiting a few minor elements to be worked in – GR will upload into Googledocs for info). Ongoing.

ACTIONS AS OF 10.12.18
======================
644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress. 08/10/18 – Leicester are producing a PO for tapes and will send to AD to produce an invoice). Ongoing.

678.3: AD to finalise the Tier1 background document, including tape strategy by end September. (Update: Almost complete and will circulate current iteration for comment). Ongoing.

678.5: JC to finalise the Storage background document by end September.
(UPDATE: 17 October meeting with Tony Medland & DB and PC will attend. This is almost complete and awaiting a few minor elements to be worked in – GR will upload into Googledocs for info). Ongoing.

690.1: DC to confirm M&O numbers and identify any mismatch due to inclusion of PhD students.

690.2: JH to follow up with BBC contacts on further potential collaboration regarding the proposal.

690.3: AD to formally respond to EOSC-Hub on behalf of GridPP and express a willingness to provide resources.

690.4: AD to check availability of Cosners House for the dates of 24, 25, 26 April and to book if appropriate.

690.5: AM to discuss with LHCb and identify requirements for Tape recall, as well as start dates and durations of the campaign.