GDB 9th May 2012

Welcome

Speaker: Michel Jouvin

Short minutes will be taken at each meeting. The August meeting will probably be cancelled. One meeting will be held outside CERN in the autumn. June pre-GDB on glexec on the WN. June GDB on future TEG work (post-CHEP) and on middleware/support after EMI. The experiment session will probably be postponed to July. An IPv6-related session is planned for September.

perfSONAR: encourage deployment at Tier-2s of VOs other than ATLAS.

Forthcoming meetings:

EGI Technical Forum, Prague, 17-21 September
HEPiX Fall Meeting, IHEP, 15-19 October

TEGs: status and what next

Speaker: Ian Bird (CERN)

Reports have been delivered and the TEGs (as large working groups) may be regarded as finished - many thanks to those involved. Some work is still ongoing. There will be discussion at the CHEP workshop and an initial proposal of working groups that WLCG should set up, to be followed up in pre-GDB and GDB slots (e.g. the future of DPM). CMS would like to deploy glexec ASAP and asked whether WLCG could support that.

Jeremy asked whether a relocatable glexec was available - sites need to build it themselves (there are detailed notes from Nikhef on how to do it) since the configuration file path needs to be hard-coded into the executable. glexec is more like part of the OS than an application. Most sites are happy without a relocatable version.

Experiment Resources in the Coming Years

C-RSG and CCRB summary. Speaker: Ian Bird (CERN)

This year three experiments intend to take additional data to be processed during the long shutdown. This translates to a 20% increase in required resources. Pile-up is also increasing, which increases processing time.

ATLAS: plans to record data at 400 Hz and 'park' the less relevant part for later analysis. Good at using resources - they have been able to do more simulation than originally envisaged. The Tier-2 CPU requirement goes from 281 to 319 kHS06 and disk drops from 53 to 49 PB.

CMS: Tier-2 CPU from 306 to 350 kHS06 and disk unchanged at 26 PB.

LHCb: Tier-2 CPU stays at 47 kHS06.
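
For illustration only, the relative changes implied by the Tier-2 figures quoted above can be computed directly. This is a minimal sketch (Python) using the numbers from these minutes, not an official requirements table:

 # Tier-2 CPU (kHS06) and disk (PB) figures as quoted in the minutes above.
 requests = {
     "ATLAS": {"CPU": (281, 319), "disk": (53, 49)},
     "CMS":   {"CPU": (306, 350), "disk": (26, 26)},
     "LHCb":  {"CPU": (47, 47)},
 }

 for expt, resources in requests.items():
     for kind, (old, new) in resources.items():
         change = 100.0 * (new - old) / old
         print(f"{expt} Tier-2 {kind}: {old} -> {new} ({change:+.1f}%)")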

Comment from ATLAS computing: "we are grateful to the sites for providing these resources".

Need to progress with storage accounting. Agreement to use EMI storage accounting record.

LHCOPN/LHCONE Status and Future Directions (Stockholm meeting summary)

Speaker: John Shade (CERN)

The LHCOPN is functioning well, with few problems. perfSONAR is starting to be equipped with email alerts. Jason Zurawski has offered to hold a workshop for site managers on how to use perfSONAR - when/where is still to be decided.

LHCONE - the layer-3 VPN service is now operational. Several European sites are now connected, fewer in the US. There was a presentation on the LHCONE Diagnostic Service (perfSONAR-based). DANTE is upgrading the European backbone to 100 Gbps.

Lots of other work is going on - see the talks:

https://indico.cern.ch/conferenceDisplay.py?confId=179710

Federated Identity Management

Speaker: Dr. David Kelsey

This is a sub-task of the Security TEG. Federation: a common trust and policy framework between multiple organisations, IdPs and SPs. Many research communities face common problems of identity management and access to resources, e.g. photon & neutron facilities, social science & humanities, high energy physics, climate science, life sciences and fusion energy. There have been three workshops: https://indico.cern.ch/conferenceDisplay.py?confId=177418 There is now a draft document with recommendations, and it is no longer just HEP. The WLCG Security TEG has agreed in principle to the vision and recommendations in the draft paper. A good pilot project (e.g. TWiki access) still needs to be found for WLCG/HEP/CERN. WLCG MB endorsement is also required.

KISTI, a new T1 for ALICE

Policy for accepting new T1s, next milestones for KISTI. Speaker: Ian Bird (CERN)

What does it take to become a Tier-1? Support from the experiments, a balance of resources, and reaching the standards of existing Tier-1s - the timeline to achieve this is of the order of one year, during which the site signs up as an "Associate Tier-1". KISTI is in South Korea. Russia has also proposed a Tier-1 for all four experiments, with the possibility also of Mexico for ALICE and India for ALICE/CMS.

HEPiX Prague Summary

Speaker: Dr. Helge Meinhard (CERN)

Busy agenda; new track on business continuity, convener: Alan. http://indico.cern.ch/conferenceDisplay.py?confId=160737 Helge Meinhard was elected as co-chair (Michel Jouvin stepped down following his move to GDB chair). Topical highlights: business continuity; infrastructure; energy efficiency; fabric management/agile infrastructure (AI) (a move to Puppet, some moving away from Nagios); batch schedulers (some scaling issues with PBS, Condor & SLURM on the rise, a forum created for SGE sites); storage - what comes after RAID?

Autumn 2012 in Beijing.

Spring 2013 in Bologna.

There is interest from LCG in using HEPiX as an advisory body for site matters.

WLCG Workshop at CHEP 2012 in NY

Speaker: Dr. Jamie Shiers (CERN)

CHEP 2013 will be in October in Amsterdam. CHEP 2015 - possibly in Asia-Pacific (around the time of 6.5 + 6.5 TeV collisions).

Jamie went over the agenda. This workshop needs to be used for Run 2 as was done for Run 1. "Whilst we have the existence proof of the current (imperfect) service, many changes and challenges remain, not least regarding resources" "Output of the WS: a draft Gantt chart for RUN2?"

HEPiX WG Report

Update on HEPiX WG work on trusted virtual images. Speaker: Tony Cass (CERN)

"The endorsement policy is agreed." "It should be possible for trusted user generated images to be safely instantiated at Hepix and WLCG sites." "The creation and use of images which connect directly to an experiment-specific job framework is entirely feasible"

WNoDeS: CNAF experience with virtualized WNs

Speaker: Davide Salomoni (Universita e INFN (IT))

WNoDeS = Worker Nodes on Demand Service, created by INFN to integrate grid and cloud provisioning through virtualisation; scalable and reliable. WNoDeS version 2 is part of EMI-2, to be released on May 18, 2012. Mixed mode (optional) allows worker nodes to run as both traditional batch nodes and hypervisors for VMs on the same hardware at the same time. http://web.infn.it/wnodes

Cloud Resources in EGI

EGI Federated Cloud TF report and clouds in future EGI resources. Speaker: Matteo Turilli (Oxford e-Science Centre)

TF objectives: integration, end-user requirements, technical feedback, early adopters, recommendations. 18-month mandate, 23 member institutes (including STFC).

Discussion (Michel)

Sites need to understand the VO viewpoint.

ATLAS: VM contextualization and image cataloguing in ATLAS

Speaker: Fernando H. Barreiro Megino (CERN IT-ES-VOS)

ATLAS has been running MC on PanDA queues on a number of different cloud infrastructures: LxCloud, HelixNebula (CloudSigma), StratusLab, FutureGrid and Canadian clouds. This is mostly in the development and proof-of-concept phase, with plans to graduate services to production in the next months, relying on CVMFS for applications. The long-term goal is an automatic factory for making images, cataloguing them, storing site parameters, publishing and monitoring.

LHCb (Philippe) - not yet at the level of what ATLAS is doing. Work will be based on CernVM. Should we consider the clouds as just another batch system? It is not worth having single-core VMs - probably whole-node VMs only. VOs should be accounted on wall-clock time rather than CPU time.
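
To make the accounting point concrete, here is a small hedged sketch (Python); the numbers are invented for illustration, not from the talk. A whole-node VM held by a VO is charged for its full reservation under wall-clock accounting, but only for the cycles actually used under CPU-time accounting:

 # Hypothetical whole-node VM held by a VO; all values are example assumptions.
 cores = 8              # cores in the whole-node VM
 wall_hours = 24.0      # hours the VM was held
 cpu_efficiency = 0.60  # fraction of the reserved core-hours actually used

 wall_clock_charge = cores * wall_hours                  # 192.0 core-hours
 cpu_time_charge = cores * wall_hours * cpu_efficiency   # 115.2 core-hours

 print(f"wall-clock accounting: {wall_clock_charge:.1f} core-hours")
 print(f"CPU-time accounting:   {cpu_time_charge:.1f} core-hours")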

Michel: We have some sites ready to run jobs on cloud/VM infrastructure. So how can we do a small-scale experiment to understand the issues and make progress?