GDB 12th October 2011

From GridPP Wiki

Latest revision as of 15:38, 13 October 2011

  • Work in progress

If a theme had been put forward for October's GDB it would have been "A Call to Arms". The meeting was dominated by presentations from the Technical Evaluation Groups, many of whom asked for possible contributors - as did Mario David after his talk about the work with the EMI Early Adopters.

Link to the agenda and presentations: http://indico.cern.ch/conferenceDisplay.py?confId=106649


  • Introduction - John Gordon
   I missed John Gordon's introduction talk, but the highlight from the slides appears to be that the removal of the lcg-CE from availability calculations has been postponed for a while. On the gLexec front there is a "lower priority" for sites to roll this out pending TEG review.
  • Networking - John Shade
   It seems LHCOne is setting up something like our old GridMon boxes. I admit to not paying as much attention as I should have; there seems to be some debate about the basic underlying infrastructure for this (Routing vs Layer 2 rules vs other stuff I didn't understand). An interesting note is that the endeavour seems to be very VO driven.
  • Staged Rollout - Mario David
   Mainly an advertisement for the EA scheme.
   UK “middle of the pack” for number of EA sites. Doubt we want more.
   Notable “product” missing an EA is FTS - not sure we’d want to step forward.
   EMI does also have a dedicated testbed.
   John Gordon reminded us of past problems with the PPS. Troubles with scale, and the simple wisdom that software developers shouldn’t be the ones to test that their software works in production.
   Concern over frequency of EMI releases.

TEG Program

  • Databases - Dario Barberis
       Just appointed, work only starting
       Large remit: comparing Oracle, MySQL etc, looking at squids & alternatives, and a framework for rolling out databases for applications that need them.
       Workshop on the afternoons of 7th-9th of November, people might want to EVO in.
       https://indico.cern.ch/conferenceDisplay.py?confId=158091
       wlcg-teg-database@cern.ch
       Everyone welcome to the mailing list.
  • Operations & Tools - Jeff Templon & Maria Girone
       Ridiculous scope of work
       Example question: Will batch systems scale?
       Highlights in green on Jeff’s slides are where they hope to start.
       First meeting Monday 17th (a phone conference given the short notice). Regular F2F meetings, where subgroups come together.
       24 names down now but more might be wanted, especially from Tier 2s. Another case of something the UK might want to be involved in?
  • Security - Romain Wartel & Steffen Schreiner
       Members only just contacted, so no firm plan yet.
       wlcg-security-teg@cern.ch
       gLexec on their remit.
       Another invitation to contact if you want to get involved.

Middleware Section

  • EMI Update - Doina Cristina Aiftimiei
   Next update 13.10.2011
       -Of interest to me: APEL LSF and SGE parsers are getting a fix.
       -glite-MPI and ARC products getting some love.
   Incoming releases:
       -Major - CREAM SGE module v. 1.0.0, first release of HYDRA
       -DPM & LFC v. 1.8.2
           • support for Catalog Synchronization (SEmsg)
           • improved scalability of all frontend daemons
           • faster DPM drain & better balancing of data among disk nodes
           • log to syslog & GLUE2 support
       -Storm 1.8.0
       -BLAH v. 1.16.3, improved SGE support and fixes for some BLAH scale problems
       -WMS v. 3.3.4, fixing a number of bugs
   -Details on how to submit requirements to EMI: http://www.eu-emi.eu/services
   -There’s a lot of heated debate that’s hard to follow, concerning early availability and things like tarball releases.
   -One chap mentioned a one line improvement to LFC code that has undergone robust testing and asked how long it would take to show up in a release.
   -There wasn’t a solid answer: it would need to go through a GGUS ticket, and then it would depend on how high a priority it was given.
   -Changes need to be announced.
   -Could be in testing “next day”, but much dancing around.
   -Challenge to time the next example.


  • Storm - Luca dell'Agnello
   -1.8.0 in EMI 1.9 (expected Nov. 3 2011) (code frozen, all looks good)
   -Storm has had a big rejig of its internal testing infrastructure and has a new coordinator. Hopefully this will stop nasty surprises sneaking through.
   -If I understand it correctly, the current Storm in UMD is a rubbish, old, buggy version.
  • User Support: GGUS news: fields, workflows, reports - Maria Dimou
   -Reminder that you can open a ticket for a third party
   -Alarm ticket functionality (sends SMSs to experts) + more coming (if you want something new, submit a ticket!).
   -Push to move from e-mail threads to GGUS (GGUS wants to rule the world)
   Slide of useful GGUS links:
   1. EGI metrics:
   https://twiki.cern.ch/twiki/bin/view/LCG/VoUserSupport#EGI_SA3_Metrics
   2. GGUS tickets for LHC experiment VOs per week:
   https://ggus.eu/pages/metrics/download_escalation_reports_wlcg.php
   3. Did you know?: https://ggus.eu/pages/didyouknow.php
   4. Workflow for the T0 (prepared for CMS, applicable to all LHC experiment VOs):
   https://twiki.cern.ch/twiki/pub/LCG/VoUserSupport/IT-CMSTickets-20110907.pdf
   5. Tailored columns of search results:
   https://ggus.eu/ws/ticket_search.php
  • "Tier 3" talks
   This appeared to be a US-dominated section. Motivated by the MB wanting a better handle on what a Tier 3 was and how they were being used.
   CMS - traditionally anything not controlled by CMS was a Tier 3.
       -Now many of them act like small T2s
       -Trying to increase support as simply as possible to maximise cost/benefit ratio
       -One of the problems is lack of staffing at a Tier 3.
   ATLAS - Tiered Tier 3s; every ATLAS institute got some Tier 3 funding. The concept didn’t export very well to other countries.
       -Twiki pages on it
       -Have a Tier 3 coordinator.
   LHCb - No WLCG Tier 3 definition
       -Tier 3s, as with CMS, are not Tier 2s and have not signed MoUs.
           -no disk needed.
           -plans to increase opportunistic use

TEG Part II

  • Data Management - Brian Bockelman
   Like the other TEGs this one had yet to kick off properly, but Brian had a few things to say on the matter of Data Management:
       -It’s hard
       -false assumption that storage is reliable and data loss is an exception
       -Group still needs to kick off properly    
  • Data Storage - Daniele Bonacorsi & Wahid
   Again just getting off the ground
       Site participation is a bit thin (Wahid or Jens might try to recruit some UK chaps).


  • CVMFS: Production Ready? - Steve Traylen
   CVMFS “Stratums” in place.
   Stratum 0 in the midst of being streamlined.
   Will be ready for production after load testing.
   Some more streamlining needs to be done, particularly with the installer boxes.
   -Responsibilities for future support and development need to be hammered out
   -Recent release all bug fixes
   -VOs have a wishlist (faster publishing, export cvmfs over nfs (??), Mac OS support), and so do sites (encryption, monitoring, extended documentation).
   -New deployment advice - do what CERN does.
   -JG - how does a site know what CERN is doing?
   -ST - the information is on the website
   • CVMFS discussion list.
   – cvmfs-talk@cern.ch
   • CVMFS Releases, including release notes.
   – http://cernvm.cern.ch/portal/downloads
   • CERN-IT CVMFS Central Services
   – A new homepage for central CVMFS service
   – https://twiki.cern.ch/twiki/bin/view/CvmFS/


   -Diskless batch workers and nfs-like only installs still a long way off
   -Don’t hold off (despite there still being some Stratum 0 stuff that needs to be rejigged)
   -CVMFS production ready
   -Two thumbs up from people.
  

Next meeting in November.