GDB 8th June 2011
Minutes of GDB - June 2011
https://indico.cern.ch/conferenceDisplay.py?confId=106645
Introduction: John Gordon
- Happy IPv6 day (http://test-ipv6.com/).
- The old SAM database will close at the end of August.
Can sites remove LCG-CE?
- No problem with any of the experiments.
- Might be a problem for portals which submit directly, but not a problem for WMS.
- New CREAM CE 1.6.6 coming soon with fixes for SGE.
- Availability is now based on lcg-CE || CREAM CE.
  * Will move to being based on CREAM only at some point, possibly September (date not confirmed).
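The availability rule noted above (lcg-CE || CREAM CE now, CREAM only later) can be sketched as follows. This is only an illustration of the logic as minuted: the function name, its parameters and the `cream_only` switch are hypothetical and are not taken from the actual availability calculation.

```python
def ce_available(lcg_ce_ok: bool, cream_ce_ok: bool, cream_only: bool = False) -> bool:
    """Return True if a site's CE service counts as available.

    Hypothetical sketch: with cream_only=False the site passes if either
    CE flavour passes (the current rule); with cream_only=True only the
    CREAM CE result counts (the planned rule).
    """
    if cream_only:
        return cream_ce_ok
    return lcg_ce_ok or cream_ce_ok


# A site that has removed its CREAM CE test but still runs a working lcg-CE
print(ce_available(lcg_ce_ok=True, cream_ce_ok=False))
print(ce_available(lcg_ce_ok=True, cream_ce_ok=False, cream_only=True))
```

Under this sketch, a site running only an lcg-CE passes today but would fail once the calculation moves to CREAM only.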
Virtualisation
Summary of EGI virtualisation and clouds workshop.
- Introduce virtualised resources alongside current grid ones to increase flexibility while retaining the current federated model.
- Resource providers to set aside resources for testbed and investigations.
- Virtualisation workshop
  * no users there, only site managers and developers
  * what people seemed to want was persistent services - didn't seem to understand submitting a machine as a job.
WLCG workshop, DESY, 11-13 July.
- Please register if you will be attending.
- Please volunteer for talks - discussion encouraged.
WLCG stickers left.
GGUS - towards a fail-safe system: Oleg Dulov
* Used ITIL methodology to work out what to improve in GGUS reliability.
* Migrating to high-availability technology.
  - Using VMware platform with HA support.
Glexec: Maarten Litmaath
* Ops - basically OK.
* Going back to testing all CEs unconditionally (was testing only those advertising in GOCDB). See links in Maarten's talk.
* Some issue at ASGC, now fixed.
* CERN - some issue with the LDAP infrastructure means that jobs get mapped incorrectly - but still sensible. Shouldn't be a problem for ARGUS.
* CMS tests - open for all sites supporting CMS - 40% responded so far, many aiming for end of June.
* Real workflow tests with Condor glideins fail on EGI sites.
  - Glexec configured differently in OSG and EGI: linger mode.
* LHCb - code to report back glexec failures not yet in production - will be soon.
* ATLAS - glexec in production version of pilots.
  - Works OK at TRIUMF.
  - Got stuck at CERN.
  - Continue debugging T1 tests.
* Dedicated T2 mailing list so far.
* 41 sites (26 EGI + 15 OSG) - many not OK in Nagios
Still some way to go.
* 2 bugs - neither a showstopper
  - "CN=host/"
  - Random timeouts (one site in UK discovered this).
* Relocatable install.
  - config file needs to be hard coded.
  - can't rely on load library path.
* Maarten and Jeremy will talk offline about relocatable release.
Database futures summary: Tony Cass
* Oracle
  - many mission critical applications
  - relatively well understood and relatively stable
  - Trivial volume, but many users
- Experiments: Large data volumes but growth linear with physics data volume. In some cases hardware capability growth outstrips system requirements.
- Accelerator: O(10PB) by 2020.
  - O(Exabyte) for CLIC - but out by factor 50
  - Live long and prosper.
* Other SQL (mostly MySQL, some SQLite).
* NoSQL
  - key issue seems to be difficulty of providing efficient read performance for essentially random queries - databases have been optimised for inserts and production queries.
  - seem to be able to put things together quickly with reasonable performance.
  - No real comparison between optimised performance.
  - requirements still somewhat unclear.
  - Ease of setup at the cost of future maintenance woes.
  - Application developer productivity much greater in NoSQL - but possibly at cost of future maintenance.
EMI - UMD 1.0
* UMD 1.0
  - SL5/64-bit
  - planned publication 4 July
  - most EMI 1.0 contents
* UMD 1.1
  - planned publication 5 September
  - include remaining EMI 1.0 components
  - Globus
* Prioritisation - based on request from OMB and UCB
* respects end of support schedules
* UMD 1.0
- see slide for list of proposed components
- Unverified
- StoRM
- pending EMI delivery
- Rejected
- gLite MPI
- WMS
- Source is provided - EGI don't rebuild from source.
Security Discussion
* Lively discussion about need for glexec - whether this was the best way forward.
BDII
Short term
- improve caching
- improve info providers
Long term
- rocky road of separating short term info from long term info
- gathering requirements about what is needed.