GDB 8th December 2010
- 1 Introduction - John Gordon
- 2 HEPiX IPv6 Group status update - David Kelsey (RAL)
- 3 Security Policy Update - David Kelsey (RAL)
- 4 Middleware - Andrew Elwell (CERN)
- 5 "Middleware in Limbo" (components not funded by EMI) - Markus
- 6 Extra (John Gordon)
- 7 HEPiX Virtualisation WG - Tony Cass
- 8 CernVM-FS Status - Ian Collier
- 9 CMS report - Ian Fisk
- 10 ATLAS report - I Ueda
- 11 LHCb report - Roberto Santinelli
- 12 Conclusions - John Gordon
Introduction - John Gordon
- VOMS/VOMRS, CERN
- EGI User Forum, Vilnius, 11-15th April, 2011 (http://uf2011.egi.eu/)
- WLCG Workshop
Proposal to remove support for LCG-CE before data-taking in Jan 2011.
WLCG middleware - CERN and EGEE were tightly linked, whereas EGI/EMI is more arm's-length, so WLCG/GDB needs to be sure of the support models and expectations for all software that is important to us.
Plans for the Jan GDB:
- DAaM demonstrator review
- EMI release 1
HEPiX IPv6 Group status update - David Kelsey (RAL)
Dave Kelsey and the US federal CIO have both sent out memos addressing the issue of moving to IPv6. By the end of FY2012, public/external-facing servers should be upgraded to IPv6. IPv6 deployment has been too slow. Applications are a big problem, both HEP and third-party, e.g. OpenAFS. Reminiscent of the Y2K problem. Technical transition details are immature or missing. When will we (HEP) have to support IPv6 only? Lessons from the HEP DECnet/Phase V coordination 20 years ago: "But we did learn that analysis and planning is essential and takes lots of time!" Agreed to set up a HEPiX IPv6 sub-group - volunteers wanted!
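Not from the talk, but to illustrate why applications are a stumbling block: a minimal sketch of a protocol-agnostic TCP client in Python. Code written against IPv4-only calls such as gethostbyname() or a hard-coded AF_INET has to be reworked along these lines before it can work on an IPv6-only network. The host name and port in the example call are placeholders.

    import socket

    def connect_dual_stack(host, port):
        """Try every address returned for the host, IPv6 and IPv4 alike.

        AF_UNSPEC asks getaddrinfo() for both address families; legacy code
        that calls gethostbyname() or hard-codes AF_INET never sees the
        IPv6 records at all.
        """
        last_error = None
        for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
                host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
            sock = None
            try:
                sock = socket.socket(family, socktype, proto)
                sock.connect(sockaddr)
                return sock  # first address that works wins
            except OSError as exc:
                if sock is not None:
                    sock.close()
                last_error = exc
        raise last_error or OSError("no usable address for %s" % host)

    # Example call (placeholder host name and port):
    # sock = connect_dual_stack("se.example-site.org", 2811)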
Security Policy Update - David Kelsey (RAL)
EGI security policy group (SPG) https://wiki.egi.eu/wiki/SPG
- Develop and maintain Security Policy
WLCG security meeting FNAL Nov 2010
- Updated Grid User AUP - New version EGEE/WLCG in April/May 2010
Middleware - Andrew Elwell (CERN)
- gLite 3.2 (update 20) - 2010-11-10 - ARGUS 1.2 - CREAM 1.6.3 - LB 2.1.16
- gLite 3.1-i386 (update 67) - 2010-11-16 - WMS 3.2.15
glite-CLUSTER on hold
- GLEXEC 0.8
- DPM/LFC 1.8.0
- WN/UI/VOBOX 3.2.10
GFAL / lcg_utils were delayed but are now OK; lcg-vomscerts 6.3.0 (after 10 Dec)
Retirement calendar now public (http://glite.web.cern.ch/glite/packages/R3.1/)
"Middleware in Limbo" (components not funded by EMI) - Markus
- Info providers, especially for the CREAM-CE
- Batch system support: all except Torque & LSF are best effort
- lcg-tags and lcg-ManageVOTags - Maarten volunteered to support
- glite-CLUSTER not part of EMI (Stephen Burke)
- for publishing complex clusters correctly
- runs on lcg-CE
- aimed at sites with multiple head nodes/cluster types etc
- WN/UI - currently funded by CERN
- Data Management components related to XROOTD
Extra (John Gordon)
UK sites not running a CREAM CE: UKI-LT2-Brunel UKI-LT2-RHUL UKI-SCOTGRID-DURHAM UKI-SCOTGRID-ECDF UKI-SOUTHGRID-CAM-HEP
HEPiX Virtualisation WG - Tony Cass
The HEPiX Virtualisation Working Group has established a framework which should enable the free interchange of virtual machine images between HEP sites. VO-supplied images could connect to the pilot job infrastructure directly, rather than to the local workload management system. Software is distributed by CernVM-FS. It should be easier to apply security updates. The StratusLab project has very similar ideas; collaboration looks to be a possibility.
CernVM-FS Status - Ian Collier
There has been a marked increase in interest in CernVM-FS, especially from Tier-1 sites, in the last few weeks. RAL have been testing, looking at the load on squid, and are about to scale-test with LHCb. Until the service is fully supported at CERN (and the repository is mirrored elsewhere) it should, for Tier-1s at least, be regarded as experimental. A security audit is in progress (as an offshoot of the HEPiX virtualisation working group). Grid sites are a very different use case from CernVM. RAL are working with the developers to test and document the process for mirroring the CernVM-FS web repository. See the final slide for links & further info. ATLAS are very keen on it.
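Not from the talk, but as a rough illustration of the squid testing mentioned above: a minimal sketch of fetching a repository's .cvmfspublished manifest through a site squid, as a sanity check before scale testing. The proxy and stratum URLs are placeholders, and the repository path layout should be checked against the CernVM-FS documentation for the server version in use.

    import urllib.request

    # Placeholder endpoints - substitute the site squid and the real
    # CernVM-FS stratum URL for the repository being tested.
    SITE_PROXY = "http://squid.example-site.org:3128"
    MANIFEST_URL = "http://cvmfs-stratum.example.org/cvmfs/atlas.cern.ch/.cvmfspublished"

    # Route the request through the site squid, as a CernVM-FS client would.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": SITE_PROXY}))

    with opener.open(MANIFEST_URL, timeout=10) as response:
        # Squid normally reports cache hits/misses in the X-Cache header.
        print(response.status, response.getheader("X-Cache"))
        print(response.read(200))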
CMS report - Ian Fisk
Tier-2s: about 20% of the sites are in a not-ready state. CMS is able to replicate data from most Tier-1s to most Tier-2s. The majority of links between Tier-2s are also tested. The full mesh allows flexibility and speed for replicating samples. Good rates across the Atlantic. Need to work in 2011 on how we allocate space. Big variation in how much data is accessed. Analysis activity continues to increase - 150k analysis jobs per day (25k running at any one time). The number of stage-out errors is falling. Migration to AOD (a factor of 4 smaller in size than RECO) has started. Tier-2 resources hit 100% in September and are staying there.
ATLAS report - I Ueda
- Auto-cleaning now in place.
- Dynamic data placement now operating in most clouds:
- DaTRI: Data Transfer Request Interface
- PD2P: PanDA Dynamic Data Placement - needs tuning
- In terms of the number of files moved, user data movement is a major component - many small files.
- BDII - reconfirmation of the ATLAS position
- we are relying on the BDII for ATLAS distributed computing operations
- for the moment we need the BDII supported
- ATLAS would like to have a place to get information about how much data is stored on disk.
- If the BDII is the place, we need this implemented and supported (see the query sketch after this list).
- CVMFS - Validated for production and analysis in ATLAS. Looking for volunteer sites.
- ATLAS is foreseeing much more data in 2011 than in 2010.
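Not from the report, but to make the BDII request concrete: the GLUE 1.3 schema published in the BDII already carries per-storage-area size attributes, so the kind of query ATLAS has in mind might look like the sketch below (python-ldap). The BDII host is a placeholder, and attribute coverage varies by site and storage implementation.

    import ldap  # python-ldap

    # Placeholder top-level BDII endpoint; the BDII listens on port 2170.
    BDII_URL = "ldap://bdii.example.org:2170"

    conn = ldap.initialize(BDII_URL)

    # GlueSA objects describe storage areas; the *OnlineSize attributes are in GB.
    results = conn.search_s(
        "o=grid",
        ldap.SCOPE_SUBTREE,
        "(objectClass=GlueSA)",
        ["GlueSALocalID", "GlueSATotalOnlineSize", "GlueSAUsedOnlineSize"],
    )

    for dn, attrs in results:
        name = attrs.get("GlueSALocalID", [b"?"])[0].decode()
        total = attrs.get("GlueSATotalOnlineSize", [b"?"])[0].decode()
        used = attrs.get("GlueSAUsedOnlineSize", [b"?"])[0].decode()
        print("%s: %s/%s GB used" % (name, used, total))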
LHCb report - Roberto Santinelli
- Space limits being hit everywhere
- Introduced a throttling mechanism in DIRAC
- CREAM used in production - big improvement
- Looking at CernVM-FS for the experiment software - shared areas are still a problem
- Looking at alternatives to Oracle for the conditions DB: candidates are Frontier/Squid, SQLDDDB, and condition information embedded in the ROOT file
- Smooth data access is still a must: "Spindles Vs Space"
Conclusions - John Gordon
- John Gordon has never heard 4 such happy experiments
- Commonality: CernVM-FS
- CMS: room for improvement in Tier-1 availability. Also 20% of Tier-2s not ready
- ATLAS: request for getting space token info from BDII, much more tape use next year
- LHCb: not enough disk (JG: i.e. too much data, since pledges are being met)
- ATLAS, LHCb, ALICE confirm they are ready for CREAM CE-only sites