GDB 9th November


GDB Summary

Author: Steve Jones
Date: 9th Nov 2011
Slides: Slides for all talks are to be found on http://tinyurl.com/buyvsz9
Minutes: https://indico.cern.ch/getFile.py/access?resId=minutes&materialId=minutes&confId=106650


Foreword: I attended this meeting via EVO and captured these points from the discussions. I've represented the points as faithfully as I can, but there was plenty of scope for mistakes, so don't take this too literally. Cheers, Steve


Introduction: Dr. John Gordon


The EMI/EGI/UMD version of CREAM for SGE has been derailed by some sort of show-stopper. It is hoped that the problems will be “revisited” (solved?) by the end of the year. There are general worries about disk prices due to the Asian floods.

Technical Evolution Groups

Workload Management: Davide Salomoni , Dr. Torre Wenaus


An invitation was extended for more T1 sites to join the TEG. Davide remarked that the information system is not trusted by the experiments, so ATLAS (e.g.) uses its own.

Data Management: Dr. Brian Bockelman , Dirk Duellmann


The TEG is currently creating a new strategy. There will be a model for data management which will be simple and pragmatic. It will use HTTP. There was a short discussion on push model versus on demand, federation, protocols etc. Caching is out of scope at this time.

Operational Tools: Jeff Templon , Dr. Maria Girone


There was a brief discussion on BDII. There were concerns raised about what's in it! Also discussed were SAM, Nagios, APEL, GGUS. Presenters appealed for sites to engage with TEG. The overall goal of this TEG is to reduce manpower (ideally to zero!). There was a long discussion on accounting, and how to publish well.

Accounting: Dr. John Gordon


ActiveMQ is now being used; it seems to be very reliable and simple. A before-and-after data flow picture was presented (see slides). New technologies that will come in shortly were mentioned, i.e. STOMP and APEL SSM (producer/consumer, supposedly reliable).

There was a short discussion of supposed flaws of the consumer/producer paradigm. It seems that, in certain circumstances, rogue consumers could consume data, which is then missing when a real consumer arrives, i.e. data could “go missing” when read. Solutions include making records persist longer, or authenticating consumers. Several of the options were briefly discussed, e.g. digital signatures/encryption for traceability. No recommendations, AFAICT.
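To make the "rogue consumer" worry concrete, here is a minimal sketch of my own (not from the slides, and not how APEL SSM or ActiveMQ is actually implemented). It uses an in-memory Python queue as a stand-in for a message broker; the record names, the consumer names and the allow-list are all hypothetical. It shows that a destructive read by whoever arrives first leaves nothing for the real consumer, and that gating reads on consumer identity (one of the solutions mentioned) avoids that.

```python
import queue


def drain(q):
    """Destructively read every record currently on the queue."""
    records = []
    while True:
        try:
            records.append(q.get_nowait())
        except queue.Empty:
            return records


def drain_if_authorised(q, consumer_id, allowed):
    """Only consumers on the (hypothetical) allow-list may remove records."""
    if consumer_id not in allowed:
        return []
    return drain(q)


def demo():
    # Open queue: whoever reads first takes the records away.
    open_q = queue.Queue()
    for rec in ("job-1", "job-2"):
        open_q.put(rec)
    taken_by_rogue = drain(open_q)   # rogue consumer arrives first
    seen_by_real = drain(open_q)     # real consumer finds nothing left

    # Same records, but reads are gated on consumer identity.
    guarded_q = queue.Queue()
    for rec in ("job-1", "job-2"):
        guarded_q.put(rec)
    allowed = {"accounting-repository"}  # assumed name, for illustration only
    rogue_got = drain_if_authorised(guarded_q, "rogue", allowed)
    real_got = drain_if_authorised(guarded_q, "accounting-repository", allowed)
    return taken_by_rogue, seen_by_real, rogue_got, real_got
```

In a real deployment the identity check would presumably rest on certificates or signatures rather than a plain string, which is where the digital signature discussion comes in.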

Accounting: Jérôme Belleman

CERN uses LSF as its LRMS. Jérôme illustrated a new, proposed data flow (see slides). It will be easier, less complex, more consistent and real time. It uses new software, with a node normalisation factor for scaling (not an average).
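As I understood it, "node normalisation factor (not average)" means each job's CPU time is scaled by the factor of the node it actually ran on, rather than by one cluster-wide average. The sketch below is my own illustration of that reading, with made-up node names and factors; it is not the CERN software.

```python
def normalised_cpu(jobs, node_factor):
    """Scale each job's CPU seconds by the factor of the node it ran on."""
    return sum(cpu * node_factor[node] for node, cpu in jobs)


def averaged_cpu(jobs, node_factor):
    """Scale total CPU seconds by the plain mean of all node factors."""
    mean = sum(node_factor.values()) / len(node_factor)
    return sum(cpu for _, cpu in jobs) * mean


# Hypothetical factors and jobs, chosen only to show the difference.
node_factor = {"fast": 2.0, "slow": 1.0}
jobs = [("fast", 100.0), ("fast", 100.0), ("slow", 100.0)]

per_node_total = normalised_cpu(jobs, node_factor)
average_total = averaged_cpu(jobs, node_factor)
```

When jobs are not spread evenly over node types, the two totals differ, which is presumably why the per-node factor is preferred.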

Question from the floor: why implement a new LSF parser/collector? Answer: it is simplified. The new LSF parser could be reused at other sites (NOTE: Lancaster).

Further points: Condor is going to be phased out, due to no demand. SGE support is still weak. Jérôme says the new accounting subsystem will support any batch system, but needs LRMS expertise to make it happen. Sites must be aware of the coupling between batch logs and APEL parsing. glite-CLUSTER may help with consistent publication (heterogeneous sub-clusters).

Middleware


EMI: Doina Cristina Aiftimiei


There was a long, inconclusive discussion/dispute about the meaning of “staged roll-out”. There was a discussion on roll-out schedules (very detailed) which covered EMI 1 (Kebnekaise) – SL6 porting proceeding. This was detailed information that should be available to all sites, IMHO.

gLite: Maria Alandes Pradillo


gLite will continue to receive security patches until April.

VDT rpms:


Nothing captured

WLCG Client Distribution: Oliver Keeble


Nothing captured

HEPiX Summary of 20th Anniversary Meeting : Michel Jouvin


This will occur in Vancouver. There was a discussion about some IPv6 test beds; CERN intends to convert by 2013. There will be working groups on Visualisation and Storage. Discussions on clouds and benchmarking.

Benchmarking: Dr. Helge Meinhard


Helge gave us a history of HS06 (see slides?). It still works well and is held in high regard by industry. A difference between 32-bit and 64-bit is emerging; it's approx 15%. There is a new benchmark coming: SPECcpu2012. They will rewrite HS06 as a new standard to match this. There was a discussion on sources of variation in benchmarking. Familiar discussions took place on benchmarking error bars, sub-clusters etc.

Virtualisation: Tony Cass


Discussion on policies about image repositories (creation/life cycle). In the end, this may resolve to just one CernVM image.

~16:00 Lost connection/meeting closed.