Wed Nov 10 2004

London Tier 2 Technical Meeting
-----------------------------------------------

Present:
-------
Brunel:  Henry Nebrensky
IC-HEP:  Owen Maroney
IC-LeSC: David McBride, Keith Sephton
RHUL:    Grigori Rybkine
QMUL:    Alex Martin
UCL:     Ben Waugh, Mark Dixon

Agenda:
------
1. Review of actions from previous meeting.
2. Matters arising from GridPP and LCG.
3. Site status review: problems and ways forward.
   - Brunel
   - Imperial HEP
   - Imperial LeSC
   - QMUL
   - RHUL
   - UCL
4. AOB

--------------------------------------------------------------------

1. Review of Actions
====================

> Submit Savannah bug report regarding passing queue parameters to
  the PBS scheduler.
  Alex to write this up.
  Status: ongoing

> SunGrid Engine
  Durham will no longer provide an information provider for SunGrid
  Engine. IC to develop one instead.
  Status: ongoing

> LCG2 SRPMS to be used to port to Fedora and Red Hat 9.
  Status: open

> Circulation of firewall (iptables) script for GridPP frontend nodes
  Plan to test on the IC HEP farm.
  Status: ongoing

> Circulate edgtool (for cleaning the RPMS repository)
  Owen has tested the script for compatibility with the current LCG2
  release and found it fragile.
  Status: closed

2. Matters arising from GridPP and LCG
======================================

The UK Grid Deployment Board was held, and a number of issues were
passed on to the sites:

- The CA RPMs update has been done by almost no sites.
- Sites are urged to upgrade to LCG2 2.2.0 as soon as possible, since
  it is the first release to include R-GMA, a UK project, and the
  release date of 2.3.0 is currently very unclear.
  Owen confirms that 2.3.0 will support manual, script-based
  installation on SLC3 for the frontends and WNs, as well as LCFG and
  manual installation on Red Hat 7.3.
- The Board would like non-HEP VOs to be supported and provided with
  1.5% of all resources (CPU time and storage).
- Only two sites have provided information on security contacts.
  Others please do so.

A discussion took place on how often sites would like new releases to
come out and how much time they would like for upgrades. Most would
like 3-4 weeks for small upgrades that do not require draining the
queues, and up to 2 months for upgrades to major releases. Some would
like advance notice; others would prefer more time for upgrades
instead. The option to skip minor releases (but not security patches)
in some circumstances was considered desirable.

3. Site Status Reports
======================

Brunel:
  Have set up a new CE. Details are to be entered into the GOC
  database.

Imperial HEP:
  At 2.2.0. Had to ask the Head of Department to intervene to reopen
  ports 8080 and 8088 (for R-GMA).

Imperial LeSC:
  Waiting for 2.3.0. Working on SunGrid Engine integration.

QMUL:
  Have written a Linux filesystem that aggregates multiple disk
  volumes, and have deployed it on their cluster to build a single
  20 TB filesystem.

RHUL:
  CE and SE have been installed on GridPP nodes. Another 2.8 TB file
  server has been added, giving a total of 3 x 2.8 = 8.4 TB. CA RPMs
  have been updated.

UCL-HEP:
  Ongoing problems with information intermittently disappearing from
  the GIIS. For once this does not seem to be a problem with PBS,
  since qstat returns promptly. The NIKHEF PBS caching utilities are
  being used, which considerably reduces the load on the PBS server
  (and the size of the server log files). However, the GIIS problems
  occur both with and without the caching version of qstat etc. This
  may just be an effect of the CE being overburdened. Should soon be
  able to find a more powerful machine to use as a CE, and hope this
  will improve matters. Still at 2.1.1 until another machine is
  available to use as a MON box.

UCL-CCC:
  The site is currently working very smoothly, with many jobs from
  ZEUS, LHCb and ATLAS. LCG 2.2.0 was installed some time ago.