RAL Tier1 weekly operations castor 21/02/2011
From GridPP Wiki
Operations News
- Last disk servers (Gen) quattorized and upgraded to SL5 64bit
- WAN tuning rolled out to all remaining CMS disk servers
- srm0662 (ATLAS) repartitioned to give more space to logs. Two more to go.
- atlasSimStrip was successfully merged into atlasStripInput
Operations Issues
- Around 10k FTS transfers failed for ATLAS on Monday after the switch to a new robot certificate. The certificate was not correctly pushed out to the grid-mapfiles because of a misconfiguration made when upgrading to the new puppetmaster02.
- After the disk pool merging, ATLAS continued using SIMSTRIP and failed to modify their pilot jobs to use DATADISK.
- On 17/2 the xrootd redirector crashed, causing functional tests to fail. The crash was quickly noticed, and the redirector was restarted after 2 hours. An automatic restarter has been written and installed; it will kick in if the crash happens again.
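
The automatic restarter mentioned in the last item is not described further here. As an illustration only, a minimal watchdog of that kind could look like the sketch below. It is not the script that was installed: the function names, the use of `pgrep` for the liveness check, and the polling interval are all assumptions made for the example.

```python
import subprocess
import time
from typing import Callable, Optional


def process_is_running(name: str) -> bool:
    """Return True if a process with exactly this name exists.

    Assumption: the host provides `pgrep`, which exits with status 0
    when at least one process matches.
    """
    result = subprocess.run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def check_and_restart(is_alive: Callable[[], bool],
                      restart: Callable[[], None]) -> bool:
    """Run one watchdog check; return True if a restart was triggered."""
    if is_alive():
        return False
    restart()
    return True


def watch(is_alive: Callable[[], bool],
          restart: Callable[[], None],
          interval_s: float = 60.0,
          max_checks: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll repeatedly and return the number of restarts triggered.

    max_checks=None polls forever; a finite value is useful for testing.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if check_and_restart(is_alive, restart):
            restarts += 1
        checks += 1
        sleep(interval_s)
    return restarts
```

In use, `is_alive` could be `lambda: process_is_running("xrootd")` (the process name is an assumption) and `restart` a callable wrapping whatever site-specific command starts the redirector. Passing both as callables keeps the check logic independent of how the service is managed.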
Blocking Issues
- Lack of production-class hardware running ORACLE 10g needs to be resolved before CASTOR for Facilities goes into full production. The hardware has been ordered: servers arrive this week, and the RAID device in mid-March.
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Description | Start | End | Type | Affected VO(s) |
---|---|---|---|---|
Roll out WAN tuning changes to all remaining disk servers | 22/02/2011 09:00 | 22/02/2011 16:00 | At-Risk | ATLAS,LHCb,Gen |
Upgrade NS to 2.1.10 (STC) | mid March | mid March | Downtime | ALL |
Advanced Planning
- CASTOR certification and upgrade to 2.1.10, and upgrade of SRM to 2.10, which incorporates:
  - fix for gridftp-internal to support multiple service classes, enabling checksums for Gen
  - fix to report files on draining disk servers accessed by FTS as NEARLINE, not UNAVAILABLE
- Move Tier1 instances to new Database infrastructure with a Dataguard backup instance in R26
- Move Facilities instance to new Database hardware running 10g
- Start migrating from T10KA to T10KC media later this year
Staffing
- Castor on Call person: Chris
- Staff absence/out of the office:
  - Shaun, Richard (all week)