Operations Report 18/01/2010


Summary of Previous Week

Developments

  • Tier1: ASM script tested on Somnus
  • Tier1: Script to restore database backup complete; documentation still to be prepared
  • Tier1: ASM script tested on Vulcan

Operational Issues and Incidents

  • Tier1: Still some EMC/Oracle problems on Vulcan
  • FTS: Lock problems on memory segments; we had to reboot the database

Plans for Week(s) Ahead

Downtimes and At Risk

Description | Start | End | Affected VO(s) | Type
Migrate Castor back to EMC | 27/01/2010 | 18/01/2010 | All | Downtime
Migrate 3D back to EMC | 26/01/2010 | ?!?! | ATLAS, LHCb | At risk + 1 hour downtime
Migrate LFC/FTS back to Somnus | 27/01/2010 08:00 | 27/01/2010 19:00 | All | Downtime

Development Priorities

  • Deploy CASTOR Database Monitoring
  • Migrate ATLAS TAGs to 64-bit systems
  • Investigate Oracle replication technique for LFC/FTS resilience
  • Investigate hardware architecture, backup and recovery strategy, resilience and validation of restored backup.


Requirements and Blocking Issues

Description | Required By | Priority | Status
EMC kit | At least a week before going in production | High | Waiting
Hardware for Tag databases | | Medium | Waiting
Hardware to test LFC database replication | | Medium/high | Waiting

OnCall

  • Eter