GDB December 2008
From GridPP Wiki
Revision as of 10:44, 11 December 2008 by Graeme stewart
- Next GDB meetings have been confirmed. Pre-GDBs will be confirmed soon (especially for January).
- GGUS will be able to ticket sites automatically - contact emails taken from GOC.
- GGUS have reached an agreement with OSG to be able to send tickets straight to OSG resource centres. This will push the ticket into the OSG system. As the ticket is handled in the OSG system, "public diary" entries will be pushed back into GGUS.
Both of these facilities are foreseen for the January release.
- JSPG met in October.
- VOs will not be asked to document their registration procedures (too heavyweight).
- Work on web portals run by VOs - there is a policy document for this.
- These should be run with robot certificates and all CAs are being pushed to support them. The DN of the user may or may not be exposed depending on the use case. (Similarities with pilot jobs.)
- PMA has continued to meet to approve new CAs and review existing ones. FNAL KCA should be properly approved soon.
- Data privacy seems to be mired in legal issues - not clear what is "personal data" (everything?). People have to consent to the publication of any personal data. Is this covered in the VO registration? (I found this a confusing discussion with many pathological examples being given.)
- Education federations are trying to hide personal data via persistent "targeted identity".
- Even IP addresses seem to be "personal data".
- On grids, people have an X.509 certificate with their name in it, which is definitely personal data.
- Sites and VOs need to know who individuals are.
- This has to be seen as a "contractual obligation" not as something they can meaningfully opt out of.
- But this will probably require further registration of sites/VOs as data managers.
- User AUP has been adopted by many grids, but often tweaked.
- Maria noted that the VOMS servers do not at all comply with security policies.
- Much more co-ordination between activities in security operations (within EGEE and with other grids).
- Security Service Challenge 3 was a successful test
- Tier-1s want another go.
- Preparing a kit to allow regions to run their own challenges.
- UK is organising training.
- 6 security incidents on the grid - though no breaches through the grid middleware itself.
- Attacks have spread across grids.
- Logging in middleware is still inadequate - not using syslog, not machine readable.
- Sites need to log for 90 days, but need to clarify what is relevant.
- Space for logs is not generally a problem, but the logs need to be filtered (much is irrelevant for security).
- Incident investigations usually require information going back 2 to 6 months (more than 90 days!)
- Want to decide what logs need to be kept for longer than 90 days - to be ratified by the GDB.
- Markus said that gLite services have 'service cards' which might be useful.
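The logging points above (syslog, machine-readable records, a 90 day retention floor) can be sketched roughly as follows. This is only an illustration under stated assumptions: the JSON layout, the field names, and the helpers `format_event`, `within_retention` and `make_syslog_logger` are hypothetical and do not describe what any gLite service actually does.

```python
import json
import logging
import logging.handlers
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # minimum retention period mentioned in the notes


def format_event(service, user_dn, action, when):
    """Render one security-relevant event as a single JSON line.

    The field names are illustrative assumptions, not an agreed schema.
    """
    return json.dumps(
        {
            "time": when.astimezone(timezone.utc).isoformat(),
            "service": service,
            "user_dn": user_dn,
            "action": action,
        },
        sort_keys=True,
    )


def within_retention(event_time, now, days=RETENTION_DAYS):
    """Return True if the event is still inside the retention window."""
    return now - event_time <= timedelta(days=days)


def make_syslog_logger(name="grid-security", address=("localhost", 514)):
    """Build a logger that forwards records to syslog over UDP.

    The logger name and the syslog address are assumptions.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.SysLogHandler(address=address))
    return logger


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Print one example record; a real service would pass it to the logger.
    print(format_event("ce", "/DC=org/DC=example/CN=Some User", "job-submit", now))
```

One record per line in a fixed format makes it straightforward to filter out entries that are irrelevant for security before deciding what to keep beyond 90 days.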
Installed Resource Collection
- The goal is to clarify what can be published via GLUE (1.3), to help account for what sites are providing.
- Current proposal would be to have a CE per logical sub-cluster.
- This will lead to a serious increase in the amount of information published.
- There will be support in YAIM for sites.
- For SEs, even though it's 'broken', UniqueID should be the FQDN of the SRM host.
- There are clients which need this.
- Need proper publishing of the control protocols though.
- Storage Areas will publish reserved space
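The SE publishing rules above could be checked by a site with something like the sketch below. It is a minimal illustration: the record is a plain dictionary rather than real LDAP output, and the key names (`GlueSEUniqueID`, `control_protocols`, `storage_areas`, `reserved_space_gb`) and the function `check_se_record` are assumptions made for the example.

```python
def check_se_record(record, srm_host):
    """Return a list of problems found in a published SE record.

    `record` is a plain dict of attribute name -> value. The key names
    are illustrative assumptions, not the exact published schema.
    """
    problems = []
    # UniqueID should be the FQDN of the SRM host.
    if record.get("GlueSEUniqueID") != srm_host:
        problems.append("UniqueID is not the FQDN of the SRM host")
    # Control protocols need to be published properly.
    if not record.get("control_protocols"):
        problems.append("no control protocol published")
    # Each storage area should publish its reserved space.
    for sa in record.get("storage_areas", []):
        if "reserved_space_gb" not in sa:
            problems.append(
                "storage area %s has no reserved space" % sa.get("name", "?")
            )
    return problems


if __name__ == "__main__":
    example = {
        "GlueSEUniqueID": "srm.example.org",
        "control_protocols": ["SRM 2.2"],
        "storage_areas": [{"name": "datadisk", "reserved_space_gb": 100}],
    }
    print(check_se_record(example, "srm.example.org"))
```

The check returns an empty list when the record follows all three rules, so it can be used directly as a pass/fail test.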
OSG Campus Grids
- Interesting talk on Condor-based campus grids.
- Using local, grid and BOINC backfill to keep the cluster busy.
- Have been able to integrate multiple clusters.
- Integrating new resources is too heavyweight for small sites
- Working on a CELite (live CD or pacman).
- Investigating using VMs to provide specialist environments for VOs - these are booted automatically by Condor when needed.
- Problems with certificates in this case.
ALICE Experience with CREAM
- ALICE have been testing the CREAM CE.
- Used for production after initial testing on PPS.
- They like it! 55k jobs done.
- Direct submission to CREAM CE (submission through the WMS is problematic).
- If a site has both an LCG-CE and a CREAM CE and wants to support ALICE submission to both, then it needs 2 VO boxes.
- If the site switches completely then the LCG-CE is no longer required (for ALICE).
- You will need to have a gridftp server somewhere.
- Known issues with proxy renewals - ALICE have an approval for long proxies in the meantime.
- This was just the best talk of the day!
- Reprocessing will happen before Christmas.
- DDM high rate tests have started to happen now (millions of tiny files - testing frequency, not volumes).
- There will be 'best effort' operations continuing over Christmas.
- Top priority is SCAS for pilot jobs.
- Need to be very sure of stability as this becomes a critical SPOF at the sites.
- Experiments will be able to opt in to SCAS/glexec during testing.
- Rollout in February.
- CREAM CE
- Patches for various bugs + proxy renewal.
- Huge patch (almost a new version).
- No submission to CREAM yet. January.
- 1.7.0 version almost certified. January release.
- New patches.
- Upgrade coming.
- x86/x86_64 SL5 WN based on VDT 1.8
- New version using VDT 1.10.
- No bundled SSL - uses system's SSL libraries.
- Python 2.5
- Will put 64bit libraries into lib64.
- There will be real issues with compatibility between different compilers/python versions/architectures.
- gLite 3.2
- SL5, VDT 1.10.
- Markus wants CREAM CEs to be seriously deployed by sites.
- CERN have a job efficiency instrumentation package which they are encouraging VOs to use.
- Looks like no change...
- John presented a summary slide on operations over Christmas.
- Looks like best efforts from everyone.