GridPP PMB Meeting 589

GridPP PMB Meeting 589 (15.02.16)
=================================
Present: Pete Gronbech (Chair), Dave Kelsey, Andrew Sansum, Jeremy Coles, Gareth Smith, Roger Jones, Steve Lloyd, Pete Clarke, Louisa Campbell (Minutes).

Apologies: Dave Britton, Claire Devereux, David Colling, Tony Doyle, Tony Cass, Andrew McNab.

1. GridPP36 Agenda
==================
Registration is now open. LC will make an announcement on the UKHEPGRID mailing list and provide a url link to the conference webpage and information on travel from Edinburgh.

STFC will make direct payment for delegate bed and breakfast accommodation and all conference catering. The Conference Dinner on Tuesday 12th April is sponsored by Dell who will make the necessary arrangements with the venue for payment. LC has reserved overnight accommodation for PMB members on Sunday 10th.

Format:

Monday 11th April – PMB will take place from 12.00-16.00.
– Delegates arrive late afternoon/evening

Tuesday 12th April – 09.00 Day 1 of meeting commences
– 18.00 Day 1 concludes
– 19.30 Conference Dinner (hotel restaurant)

Wednesday 13th April – 09.00 Day 2 of meeting commences
– 12.30 Meeting concludes – lunch provided

Previously agreed session themes:

1) Evolution of Tier-2 and other sites
2) New methods of working in GridPP5

Suggestions received:

1) ‘New Technologies for GridPP5’ – Agreed.
2) ‘New User Case Studies’
3) ‘What’s New from the Experiments for 2016’
4) ‘UK-T0’
5) ‘Site-orientated new technologies’ (Vac, cloud, cgroups/namespaces, HTCondorCE (not just HTCondor), Ceph, …)
6) Reporting, e.g. security
7) Hardware and expiry – Priming sites for network

Contribution (not a full session) – Other non-HEP VO support to summarise current position and future expectation.

ACTION 589.1: PC and PG will discuss and agree the GridPP36 agenda.

ACTION 589.2: LC will announce GridPP36 registration is open to UKHEPGRID mailing list and provide url link as well as information on travel to the venue from Edinburgh (https://indico.cern.ch/event/477023/).

2. Non-HEP PPAN VO Policy
=========================

As previously reported, there are many reasons to engage other sectors within PPAN (PP, Astro and Nuclear) to make use of the GridPP infrastructure.

The recent efforts to get LSST going have been very instructive. Significant effort has been put in by Manchester staff (the PMB thanks Manchester), but it has taught us that we cannot necessarily expect these communities to use the canonical HEP way of doing things at the outset: the overhead may be too great, and the resulting delays run the risk of putting such groups off. It is therefore important that we can present a “quick-start” process to establish worth, and then work with them later in a more canonical way. In fact things are moving in a much more positive direction now for LSST.

PC gave a talk on GridPP opportunities to astronomers at the Dark Energy Strategy meeting, and many appear keen to collaborate. We are about to start working with EUCLID. PC suggests that, following the LSST experience, we aim for a “quick-start” to establish worth, but that this be managed on a more formal basis. After demonstrating the benefits we can explain that they need to invest some effort to get access to more benefits in the longer term.

It was agreed that a champion be identified for each such experiment who can take a mini-project management role to ensure timely progression, plus a “few” GridPP members who are naturally associated with the new VO. This set of people will be the primary contacts. Following the request from the Ops team members, it is also strongly encouraged that the contact team post all queries, issues, etc to the normal mailing list, as there are many keen and experienced people in GridPP who are eager to help.

As we are not currently in an accounting period (the next period is likely to start on 1st April), now is a good window to try supporting these new VOs.

The PMB needs to evolve a policy in respect of GridPP members spending time helping non-HEP VOs. To start this the PMB agrees that:
– It is in the long-term interests of GridPP to be more inclusive, to be able to help other PPAN science domains as per the UK-T0 vision, and to respond to the external pressures in this direction.
– It is therefore understood that members of GridPP will need to invest some time to help non-HEP VOs now for future benefit. This may mean that it is necessary to delay some work for, say, an LHC experiment. It is agreed that within reasonable limits, and for a limited number of new VOs, that this is acceptable.
– The experiment reps on the PMB, aided by the Ops meeting Chair, should monitor the situation and should raise to the PMB if this leads to any unexpected problems.
– The situation should be reviewed in the light of experience after ~ 6 months.

Finally, it is agreed that in the longer term it is important that we look towards establishing a defined strategy rather than short-term gains. This is possibly a discussion topic for Pitlochry and a new GridPP5 action, as we need to manage more actively such issues that require more dedicated time.

ACTION 589.3: PC will report at the next Ops meeting that the PMB is happy to support a bit of effort into short term successes with Astronomy projects and keep this monitored/under review.

3. AOCB
=======
a) Hardware and expiry: The infrastructure equipment bought by the DRI grant will be 5 years old next year, and some sites' networking switches are likely to run out of warranty. We are already pushing Tier-1 network capacity, so big upgrades to networking bandwidth will be needed. Planning therefore needs to be put in place for networking. The larger sites may need upgrades to cope with increased remote access to data from the smaller diskless sites. Limits and restrictions need to be established for other sites.

4. Standing Items
===================

SI-0 Bi-Weekly Report from Technical Group (DC)
———————————————–
– There is expected to be a new release of Vac later this week (version .0021) which will be a candidate for release 1.0. This contains several updates/improvements (handling of storage devices, overloaded hypervisors etc).

– Liverpool is doing a fine job as the significant release site and currently has ~200 cores running Vac. I think that the PMB should thank them at some point for their efforts.

– Currently Liverpool are running LHCb jobs but the ATLAS VM is on its way soon. Andrew L. is going to prepare a CMS VM for testing.

– There was general agreement that the Monitoring Portal for accounting is rubbish which makes it hard to debug things. Is there anything that GridPP can do about this?

– Sam gave an update as to how storage is evolving. He hadn’t had time to prepare slides but will do so and distribute them as a record. Verbally, however, he covered the potential use of caching by Rucio (not in production yet) and discussed how this could be integrated with ARC CE caching (nobody knew whether any of this was implemented on the ARC CEs that we currently have installed; Glasgow had done so in the past but are not sure now).

– There was also some discussion about Dynafed which may be interesting.

– There will be a (short) technical meeting next Friday in order to get the meetings back to their original timetable.

SI-1 Dissemination Report (SL)
——————————
##GridPP Dissemination Officer Notes for PMB

###New User Engagement Programme – New User MoU

Attached is a draft MoU for New Users covering the “First Contact” with GridPP. As explained in the MoU, it covers the use case of “an individual who wishes to investigate using GridPP’s resources for research and/or development purposes” – i.e. a single individual using the UserGuide and a CernVM to get on the grid.

We will need further MoUs for Users who have decided that GridPP is for them and so need a VO, resource allocation discussions, etc. and one for “Very Important New Users” e.g. EUCLID where dedicated GridPP resource will be required to ensure “quick wins” and “happy users”.

The aim is to make sure, at the very least, that new users are plugged in to all of the appropriate communication and support channels that are available.

Regarding post-UserGuide interactions with GridPP, PMB will therefore need to decide:

* What can and should be allocated in terms of GridPP resource (i.e. staff time) to guiding (VI) New Users through the more complicated processes of creating and supporting new VOs, especially where PMB has decided that the engagement of these New Users is a priority;

* How the contribution of GridPP staff in these situations should be recorded and acknowledged so that the contributing sites receive appropriate credit and support.

ACTION 589.4: ALL read the MoU for New Users and discuss at next week’s PMB.

SI-2 ATLAS Weekly Review and Plans (RJ)
—————————————
RJ not present – no report presented.

SI-3 CMS Weekly Review and Plans (DC)
————————————-
DC not present – no report presented.

SI-4 LHCb Weekly Review and Plans (PC)
————————————–
PC left the meeting to meet with Tony Hey, no report submitted.

SI-5 Production Manager’s report (JC)
————————————-
1. In the ops meeting it is becoming apparent again that a clear policy steer would help focus people’s efforts. We are trying to support new VOs in several areas and a question that is (again) arising is how this work should be prioritised versus ongoing LHC experiment requests? When the VO is not strongly engaging GridPP do we put in more effort to compensate and look for some quick wins (there are new problems being identified in supporting each new community as their use-cases are all slightly different)?

2. In producing a forward look ROD rota for the next 6 months it is becoming apparent that some of the current contributors are not guaranteed to still be working on GridPP for the whole period (due to changing personal circumstances or related to GridPP5 funding). How do we want to tackle this issue?

3. Discussion (at our ops meeting last week) of the outcomes of the WLCG collaboration workshop at the start of February suggested that not as much progress was made on defining future directions as we had hoped. Is there a GridPP36 discussion to be had on how to support the medium and longer term planning?

4. We are seeing an accumulation of problems at Sussex since Matt RB left the site. We will need to review how we handle sites in the transition from grid to Vac.

5. WLCG Tier-2 site reliability/availability looks generally fine for January. I am in the process of checking the background to problems that did occur.

There was some discussion of the accounting portal, following the PMB information emails earlier. Several operational issues have been raised, as this is an EGI component that we are less than satisfied with.

ACTION 589.5: DC to update the PMB and explain the problem issues with the accounting portal.

SI-6 Tier-1 Manager’s Report (GS)
———————————
Tier1 report for the PMB meeting on 15th February 2016.

Castor:
– Testing of the 2.1.15 version is ongoing. Aiming to upgrade on a timescale of weeks.

Networking:
– No change to report. The traffic on the OPN link this last week was high – but not quite as high as the previous couple of weeks.

Batch:
– No changes to report apart from resolving the cvmfs problem that affected Atlas analysis jobs and Hammercloud tests.

Procurement:
– No change (Disk and CPU capacity orders in place).

Action 588.7: Tier1 Availabilities for last three months:

Availability (%)   Nov ’15   Dec ’15   Jan ’16
OPS                  100       100       100
Alice                100        99       100
Atlas                100        99        92   (Awaiting result of recalculation request for Atlas Jan ’16)
CMS                  100        99        94
LHCb                  99       100        97

SI-7 LCG Management Board Report of Issues (DB)
———————————————–
Next meeting is tomorrow; a summary will be provided next week.

REVIEW OF ACTIONS
=================
582.4: DC to insert an update in the wiki page regarding communication with LZ. Ongoing.

586.2: AS will contact MoBrain and discuss resources for their EGI project. Next step is to make allocations. The ticket is now with Catalin – AS will check and report to the PMB next week. Ongoing.

587.2: AM will invite selected small, medium and large sites to contribute presentations at GridPP36 on their plans for site evolution over the next few years and construct a session around this. Ongoing.

588.1: PG will discuss with DB Oversight meeting re concentrating efforts on paperwork for closing down GridPP4 and leave GridPP5 until later and generate emails to update PMB later in the week. Ongoing.

588.2: ALL to make email suggestions for GridPP36 themes over the next few days. Ongoing.

588.3: SL will provide PG with a script for importing lists of publications from Inspire. Done.

588.4: ALL to inform PG of any new roles and other items that need to be inserted into different categories and grants on Researchfish so that he can ensure all are included and circulate to PMB to check. Ongoing.

588.5: PG will email Ian Puller at STFC to update the lists of current grants associated with PIs. Ongoing.

588.6: GS will investigate reasons for saturation on OPN and report back to the PMB with findings. Ongoing.

588.7: GS to circulate to PMB availability for all 4 VOs for last 3 months then discuss at next week’s meeting. He currently produces a monthly report for DB. Done.

588.8: GS to report on ongoing disc server issues in general. Ongoing.

ACTIONS AS OF 15.02.16
======================
582.4: DC to insert an update in the wiki page regarding communication with LZ. Ongoing.

586.2: AS will contact MoBrain and discuss resources for their EGI project. Next step is to make allocations. The ticket is now with Catalin – AS will check and report to the PMB next week. Ongoing.

587.2: AM will invite selected small, medium and large sites to contribute presentations at GridPP36 on their plans for site evolution over the next few years and construct a session around this. Ongoing.

588.1: PG will discuss with DB Oversight meeting re concentrating efforts on paperwork for closing down GridPP4 and leave GridPP5 until later and generate emails to update PMB later in the week. Ongoing.

588.2: ALL to make email suggestions for GridPP36 themes over the next few days. Ongoing.

588.4: ALL to inform PG of any new roles and other items that need to be inserted into different categories and grants on Researchfish so that he can ensure all are included and circulate to PMB to check. Ongoing.

588.5: PG will email Ian Puller at STFC to update the lists of current grants associated with PIs. Ongoing.

588.6: GS will investigate reasons for saturation on OPN and report back to the PMB with findings. Ongoing.

588.8: GS to report on ongoing disc server issues in general. Ongoing.

589.1: PC and PG will discuss and agree the GridPP36 Agenda.

589.2: LC will announce GridPP36 registration is open to UKHEPGRID mailing list and provide url link as well as information on travel to the venue from Edinburgh (https://indico.cern.ch/event/477023/).

589.3: PC will report at the next Ops meeting that the PMB is happy to support a bit of effort into short term successes with Astronomy projects and keep this monitored/under review.

589.4: ALL read the MoU for New Users and discuss at next week’s PMB.

589.5: DC to update the PMB and explain the problem issues with the accounting portal.