GDB 8th February 2012


Pre-GDB, Tuesday 7th February 2012. Agenda: http://indico.cern.ch/conferenceDisplay.py?confId=158775

Data Management report Speakers: Dirk Duellmann (CERN), Brian Paul Bockelman (University of Nebraska (US))


Still work in progress. LHCb has stopped using POOL; ATLAS is still using it, but from LCG 62 (using ROOT 5.32) will not require support from IT-ES. The Data Management and Storage TEGs worked together for most of the time, which had good and bad points. Dark and grey data were discussed. Most experiments are already federating data (with xrootd). WLCG has no long-term need for the LFC. On protocols: gridftp and xrootd are the current core protocols, http(s) is promising, and NFS 4.1 and S3 will be watched.
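Since most experiments already federate data with xrootd, a minimal sketch of what a federated read looks like from the client side may help. It assumes a PyROOT installation built with xrootd support; the redirector hostname and file path are hypothetical.

```python
# Minimal sketch of a federated xrootd read via PyROOT; assumes ROOT
# was built with xrootd support. Hostname and path are hypothetical.
import ROOT

# TFile.Open follows xrootd redirection: the redirector points the
# client at whichever federated site actually holds a copy of the file.
f = ROOT.TFile.Open("root://redirector.example.org//store/data/example.root")
if f and not f.IsZombie():
    f.ls()       # list the file's contents
    f.Close()
```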


Storage Management report Speakers: Wahid Bhimji (University of Edinburgh (GB)), Dr. Daniele Bonacorsi (University of Bologna)

SRM for disk may not be providing any value. For tape it is implemented to provide abstraction, but may not actually be providing that. WebDAV was mentioned; http could be used as a protocol. The hope with http was to be able to use existing clients, but that may not be the case. Not having SRM pushes the onus onto the middleware and FTS to support the alternatives. There is worry about a lack of convergence: it would be dangerous for the four experiments to diverge. The group intends to have various meetings over the next few weeks; Wahid suggests it will take until March (end of ?) before conclusions come. The experiments have provided input, but in different formats; the group needs to write statements and get the experiments to comment on and approve them. IB would like a stream of output, not to wait three months for it. Input is specific recommendations; output is agreed recommendations.
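On the hope of using existing HTTP clients against WebDAV-capable storage, a minimal sketch with the Python requests library; the endpoint URL and the proxy path used as the client certificate are hypothetical.

```python
# Minimal sketch of listing a storage directory over WebDAV with a
# stock HTTP client; endpoint URL and proxy path are hypothetical.
import requests

resp = requests.request(
    "PROPFIND",                                   # WebDAV listing method
    "https://se.example.org/dpm/example.org/home/myvo/",
    headers={"Depth": "1"},                       # immediate children only
    cert="/tmp/x509up_u1000",                     # grid proxy as client cert
    verify="/etc/grid-security/certificates",     # CA certificates
)
print(resp.status_code)    # 207 Multi-Status on success
print(resp.text)           # XML body with one entry per file/directory
```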


Workload Management report Speakers: Dr. Torre Wenaus (Brookhaven National Laboratory (US)), Davide Salomoni (Universita e INFN (IT))

The mandate is what we need in one and in two-to-four years' time. Lots of discussion on the mailing list. Areas of interest: commonalities between pilots and frameworks; support for whole-node scheduling; CPU affinity; I/O- vs CPU-intensive jobs; cloud and virtualisation. Of the four LHC experiments, ALICE and LHCb use direct submission to the CREAM CE, CMS uses glideinWMS, and ATLAS is also testing this. All VOs would like to be able to keep n jobs queued. gLExec was discussed. New JDL is required to request memory and number of cores (a sketch follows below); sites will need to advertise the maximum number of cores via the info system. Some CPU affinity options require SL6.
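As a sketch of the JDL additions discussed, the following writes a multi-core request using the CREAM-style attribute names of the time (WholeNodes, SMPGranularity, CPUNumber); the memory clause via CERequirements is an assumption here, not an agreed syntax.

```python
# Sketch of a multi-core JDL of the kind under discussion; attribute
# names follow CREAM CE conventions, the memory clause is an assumption.
jdl = """[
  Executable     = "run_analysis.sh";
  WholeNodes     = false;        // not a whole node...
  CPUNumber      = 8;            // ...but eight cores,
  SMPGranularity = 8;            // all on the same host
  // hypothetical memory request against the published info system:
  CERequirements = "other.GlueHostMainMemoryRAMSize >= 16000";
]"""

with open("multicore.jdl", "w") as f:
    f.write(jdl)
```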

Operations & tools report

Speakers: Dr. Maria Girone (CERN), Jeff Templon (NIKHEF (NL)) Organised into 5 WGs. Things work, but are not optimal or sustainable. Want a small number of well-defined common services which are easy to install, configure and upgrade, and which should be resistant to glitches (is this ever likely?). Reduce ops effort, complexity and dependencies between sites and services. Global recommendations R1-8. Availability monitoring: surprising how few sites are involved (both ops and security). Want feedback on the document: https://twiki.cern.ch/twiki/bin/view/LCG/WLCGTEGOPerations

Security report Speakers: Steffen Schreiner (Technische Universitaet Darmstadt (DE)), Mr. Romain Wartel (CERN)

Security of pilot jobs and of grid jobs, plus traceability and accountability. The only serious option right now is gLExec (see the sketch below). Will work on a basic trust model and on requirements for a grid-job delegation credential, and discuss different frameworks.
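Since gLExec is named as the only serious option, a minimal sketch of how a pilot might hand a payload to it so that the payload runs under its owner's identity; the environment variable names and path follow the gLExec documentation of the era, but treat the details as assumptions.

```python
# Minimal sketch of a pilot handing a payload to gLExec; proxy paths
# are hypothetical, variable names as in gLExec docs of the time.
import os
import subprocess

env = dict(os.environ)
env["GLEXEC_CLIENT_CERT"] = "/tmp/payload_proxy.pem"   # payload owner's proxy
env["GLEXEC_SOURCE_PROXY"] = "/tmp/payload_proxy.pem"  # copied for the payload

# gLExec maps the payload proxy onto a local account and runs the
# command as that account, giving the traceability discussed above.
subprocess.call(["/usr/sbin/glexec", "/bin/sh", "./payload.sh"], env=env)
```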

Risk assessment document: http://cern.ch/go/dt9S Important to keep the work of the security TEG focused.

Databases report Speakers: Dr. Dario Barberis (Universita e INFN (IT)), Dave Dykstra (Fermi National Accelerator Lab. (US))

ROOT files, COOL SQL, Oracle. CERN IT should deploy a Hadoop cluster and test NoSQL products (a toy illustration of the access pattern follows below). Need to complete and harmonise the report, especially the section on NoSQL, and submit it to the WLCG MB and GDB; aim for two weeks' time.
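As a toy illustration (not COOL's actual schema) of the interval-of-validity lookup that conditions databases serve, here in plain SQL, which any NoSQL candidate would have to match:

```python
# Toy interval-of-validity lookup in plain SQL; the schema is invented
# for illustration and is not COOL's.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE conditions (since INTEGER, until INTEGER, payload TEXT)")
db.executemany("INSERT INTO conditions VALUES (?, ?, ?)",
               [(0, 100, "calib-v1"), (100, 200, "calib-v2")])

run = 150
(payload,) = db.execute(
    "SELECT payload FROM conditions WHERE since <= ? AND ? < until",
    (run, run)).fetchone()
print(payload)   # -> calib-v2
```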

Summary and Conclusions (17:30 - 18:00): SRM discussion, WMS etc.

GDB Agenda: http://indico.cern.ch/conferenceDisplay.py?confId=155065 People interested in Grid Engine should contact philippe.olivero@in2p3.fr. CERN is replacing its WMS servers with ones running EMI 1. Vidyo seems to be OK. To be invited to the next GDB. The MB has been reduced to once a month. We have the TEGs and need somewhere for them to follow up. What should the GDB look like in the future? Coordination roles. Marcus often leaves without a clear idea of whether a consensus was reached. Minutes and actions could be more structured. EGI ends Q2 2013 and is unlikely to attract generic funding in future. Complexity has moved from the middleware to experiment-specific layers. With less effort from CERN, can we get help from INFN or GridPP?

TEG Summaries: WLCG has made use of projects such as EDG, EGEE, EGI, PPDG and OSG; these are coming to an end or their funding is not clear. What does WLCG become? Sites need to concentrate on providing an extremely robust infrastructure and key services. Effort is limited: there will be some CERN IT effort, but help is needed from other large national grid projects (INFN, GridPP?). Need to collaborate across all of WLCG; IB doubts that supporting software for a single experiment will be easy to justify or fund in the future. The TEGs have been information gathering (synthesis/exploration/orientation); refinement is now required. Need an overall architecture diagram and concrete proposals. At the next MB, IB will propose a small editing team to bring the work of the TEGs into an overall strategy, and to work out which areas require further work (maybe by continuation of the TEGs, or by a small team/WG). Future MB/GDB meetings will review the recommendations and plan.

Site Dashboards: there are too many experiment information sources. Information published by VO-specific monitoring systems should be integrated in a high-level tool. SiteView started in 2009, and now has a common implementation with the Site Status Board. It collects global site status, job monitoring and transfer monitoring information. So far it has mainly been used for dissemination purposes, and runs behind WLCG GoogleEarth. Work in progress; needs feedback. GridMap UI: http://dashb-siteview.cern.ch SSB-like UI: http://dashb-siteview.cern.ch/dashboard/request.py/siteview

EMI gave an update on the latest releases. https://twiki.cern.ch/twiki/bin/view/EMI/EMIUmdStatus

LHCOPN and LHCONE: most T1s now have two perfSONAR-PS servers. LHCOPN monitoring dashboard: https://perfsonar.usatlas.bnl.gov:8443/exda/?page=25&cloudName=LHCOPN New LHCONE multipoint architecture: will make a solution for now, and a long-term plan for the LHC shutdown. Dynamic network services: the vision is the ability to partition the network, to enable isolation of large flows from small flows, predictable performance, virtual privacy, and incremental scalability of the underlying resources. Want to have monitoring in place before and after LHCONE comes online.

Security: Romain Wartel gave a very interesting talk on what hackers are looking for and why.

Storage Accounting: the EMI StAR proposal defines a storage accounting record (a hypothetical sketch follows below). The EGI infrastructure is changing. Gstat is good for single-VO storage reporting; for multi-VO it is more problematic. DGAS has a storage accounting system. Does WLCG wish for an interim solution based on Gstat or IGI?
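For flavour, a hypothetical record along the lines of the StAR proposal; the element names and namespace below are assumptions for illustration, not the published schema.

```python
# Hypothetical storage-usage record in the spirit of EMI StAR; element
# names and namespace are assumed for illustration only.
import xml.etree.ElementTree as ET

NS = "{http://eu-emi.eu/namespaces/2011/02/storagerecord}"  # assumed
rec = ET.Element(NS + "StorageUsageRecord")
ET.SubElement(rec, NS + "StorageSystem").text = "se.example.org"
ET.SubElement(rec, NS + "Group").text = "myvo"
ET.SubElement(rec, NS + "StartTime").text = "2012-02-01T00:00:00Z"
ET.SubElement(rec, NS + "EndTime").text = "2012-02-02T00:00:00Z"
ET.SubElement(rec, NS + "ResourceCapacityUsed").text = str(42 * 10**12)  # bytes
print(ET.tostring(rec).decode())
```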

RFC/SHA-2 proxies: IGTF would like CAs to move from SHA-1 to SHA-2 signatures. For WLCG this means using RFC proxies instead of Globus legacy proxies, with 10 months to get ready. dCache and BeStMan already support RFC proxies; for several other products (Argus, CREAM, WMS, DIRAC) SHA-2 should work but has not been tested. All software needs to support RFC proxies by autumn 2012, and ideally SHA-2 as well (except dCache and BeStMan). Switch over to RFC proxies in January 2013. A sketch of checking a proxy's signature algorithm follows below.
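A minimal sketch of checking which hash algorithm signed a certificate or proxy file, using the pyca/cryptography library; the proxy path is hypothetical.

```python
# Minimal sketch: report the signature hash algorithm of a proxy file,
# the property the SHA-1 to SHA-2 migration is about. The path is
# hypothetical; requires the pyca/cryptography library.
from cryptography import x509

with open("/tmp/x509up_u1000", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

print(cert.signature_hash_algorithm.name)   # e.g. "sha1" or "sha256"
```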