RAL Tier1 weekly operations castor 14/06/2010
From GridPP Wiki
Matt viljoen (Talk | contribs) |
Latest revision as of 14:39, 14 June 2010
Summary of Previous Week
- Matthew:
- CoD + Depmon duties
- Write stager restarter
- Helping Kashif replace faulty RAID cards
- Planning facilities instance work with Tim
- Establishing MICE's requirements for duplicating data
- Fixing space problem on puppetmaster /var
- Testing new puppetmaster
- Debugging SL5 disk server problems
- Shaun:
- Analysis of problems on SL5 disk servers
- SRM development
- Upgrade testing
- Chris:
- Working on polymorphic servers
- Analysis of problems on SL5 disk servers
- Castor 2.1.8/2.1.9 tests
- Richard:
- Added a Nagios check to vet the LDIF emitted by the CIP
- Adding detailed metrics to wiki page on pre-prod benchmarks
- Upgraded central name server on pre-prod
- Ran functional tests on pre-prod
- Brian:
- ..
- Jens:
- ..
Developments for this week
- Matthew:
- WLCG Data Management Jamboree
- MICE support
- 2010 hardware spend proposal
- Shaun:
- WLCG Data Management Jamboree
- MICE set up
- Chris:
- Castor 2.1.8/2.1.9 tests
- CoD + Depmon duties
- Working on polymorphic servers
- Richard:
- Complete the metrics on pre-prod benchmarks
- Build a Quattorised CIP server for use with pre-prod
- 1 day A/L
- Brian:
- ..
- Jens:
- ..
Operations Issues
- Misconfiguration of rfiod on new disk servers was found to cause problems with gridftp, disk2disk and tape migrations. This was due to wrong entries in /etc/services causing rfiod to run on a non-standard port. The problem was discovered on Thursday and fixed on Friday morning.
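A check along the following lines could catch this kind of /etc/services problem before a disk server goes into production. It is only a sketch, not the check used at RAL: the function names are hypothetical, and the expected entry (rfio on 5001/tcp) is an assumption that should be replaced with whatever the site treats as standard.

```python
import sys

# Assumption: rfiod is expected on 5001/tcp. Adjust to the site's standard.
EXPECTED = {"rfio": (5001, "tcp")}


def parse_services(text):
    """Return {(name, proto): port} from /etc/services-style text."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port_str, proto = fields[1].split("/", 1)
        if not port_str.isdigit():
            continue
        # The service name and any aliases map to the same port;
        # the first entry found for a name wins.
        for name in [fields[0]] + fields[2:]:
            entries.setdefault((name, proto), int(port_str))
    return entries


def check_services(text, expected=EXPECTED):
    """Return a list of problems; an empty list means all entries match."""
    entries = parse_services(text)
    problems = []
    for name, (port, proto) in expected.items():
        found = entries.get((name, proto))
        if found is None:
            problems.append(f"{name}/{proto}: no entry found")
        elif found != port:
            problems.append(f"{name}/{proto}: port {found}, expected {port}")
    return problems


if __name__ == "__main__":
    with open("/etc/services") as fh:
        issues = check_services(fh.read())
    for issue in issues:
        print(issue)
    # Exit code 2 for any mismatch, 0 otherwise (Nagios-style CRITICAL/OK).
    sys.exit(2 if issues else 0)
```

Run on each new disk server, the script prints one line per mismatched or missing entry and exits non-zero, so it could be wrapped as a Nagios check or run as part of server deployment.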
Blocking issues
None
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
None
Advanced Planning
- Upgrade to Castor 2.1.8/2.1.9 in 2010
Staffing
- Castor on Call person: Chris
- Staff absences:
- Richard (Wed)