RAL Tier1 weekly operations castor 11/01/2010
From GridPP Wiki
Chris kruk
Latest revision as of 15:15, 11 January 2010
Summary of Previous Week
- Restarted Castor services after UPS intervention (Castor Team)
- Investigating RMmaster problem on GEN instance (Chris, Eter)
- Cleaning up stager GEN DB to resolve the problem with RMmaster (Shaun, Chris, Eter)
- SRM development (Shaun)
- Investigating BigID on Atlas (DB Team)
- Generated list of corrupted files on gdss70 and gdss79 (Chris)
- Problem with gc on repack instance (Castor Team)
- Set up ipmi bios configs on database servers (Cheney)
- Updating of twiki for list of servers & ssh sigs (Cheney)
- Writing of techwatch newsletter (Cheney)
- Set up Vulcan testing of EMC kit (Cheney)
- Set up database multipath ahead of EMC return to use (Cheney)
Developments for this week
- Test new kernel on certification before implementing it during next week's intervention (Chris)
- Test restricting user access on disk servers (Jonathan, Chris)
- Investigating BigID on Atlas (DB Team)
- Work on PreProduction instance (Richard, Chris, DB Team)
- Continue investigation to find out why gc doesn't work on repack instance (Castor Team)
- Return EMC kit to use on production servers (Cheney)
- Build replacement database server (Cheney)
- Install memory upgrade on castor databases (Cheney)
- Config bios for ipmi on castor head nodes (Cheney)
Ongoing work
- Investigate lhcbUser D2D copy problems (Matthew)
Operations Issues
- SCSI errors continued to appear on rack nodes connected to Overland, possibly power related. They have not reappeared since the reboot last week.
Blocking issues
- Lack of Quattor configuration files for SLC4.8 is stopping us evaluating Quattor alongside CASTOR 2.1.8. Preprod setup will initially proceed with a Kickstart-based deployment.
- Preprod DB can only be delivered after EMC testing is done (2nd week after Jan'10)
Planned, Scheduled and Cancelled Interventions
- 13/14 January - migrate the Castor DBs back to the EMC disk arrays [NOT READY YET]
- 19/20 January
  - FSCK Disk servers and pick up new kernels.
  - Add IPMI to Castor Head Nodes.
  - Replace cdbc08 and add new DB archive log destination.
  - Install NameServer CheckSum Trigger
  - Upgrade of memory to DB nodes
- The following have not been folded into the above schedule. They can be fitted around it because each is, at worst, an ‘At Risk’.
  - SRM Castor Client upgrade
  - Update fetch-crl rpm on disk servers
  - Restrict user login on disk servers
Advanced Planning
- Gen upgrade to 2.1.8 (2010 Q1)
- Install/enable gridftp-internal on Gen (This year/before 2.1.8 upgrade)
Staffing
- Castor on Call person: Shaun
- A/L: Matt - Monday