Operations Report 09/11/2009

From GridPP Wiki

Summary of Previous Week

  • Meeting at Cosener's House
    • Architecture Review
    • Procedures Review (backup, disaster recovery, etc.)
    • Resilience Plans
    • Integrity Check

Developments

  • Castor: New repack 2.1.7
  • Castor: Tested Oracle Patch on Certdb
  • Castor: Implemented backup policy (two weeks on disk, 3 months on tape, 6 months full backup)
  • 3D: ATLAS Frontier schema has been set up
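
The Castor backup policy listed above (two weeks on disk, 3 months on tape, 6 months for full backups) could be expressed as a small retention check. This is only a sketch: the function name, the tier labels, the approximation of months as 90 and 180 days, and the reading that full backups are the ones held on tape for 6 months are all assumptions, not details taken from the actual Castor backup scripts.

```python
from datetime import date, timedelta

# Retention windows from the report. Month lengths are approximated (assumption).
DISK_RETENTION = timedelta(weeks=2)   # "two weeks on disk"
TAPE_RETENTION = timedelta(days=90)   # "3 months on tape"
FULL_RETENTION = timedelta(days=180)  # "6 months full backup"


def retention_tiers(backup_date, today, is_full=False):
    """Return the storage tiers on which a backup should still be kept.

    Hypothetical helper: assumes full backups stay on tape for the
    longer 6-month window and all other backups for 3 months.
    """
    age = today - backup_date
    tiers = []
    if age <= DISK_RETENTION:
        tiers.append("disk")
    tape_limit = FULL_RETENTION if is_full else TAPE_RETENTION
    if age <= tape_limit:
        tiers.append("tape")
    return tiers


if __name__ == "__main__":
    today = date(2009, 11, 9)
    # A backup taken about a week ago, and one taken in July.
    print(retention_tiers(date(2009, 11, 1), today))
    print(retention_tiers(date(2009, 7, 1), today, is_full=True))
```

A check of this kind could be run against the backup catalogue to flag copies that have outlived their window, or copies that are missing from a tier where they are still expected.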


Operational Issues and Incidents


Plans for Week(s) Ahead

  • Continued investigation into the new architecture
  • Start testing a script to prevent the database from connecting to the “old” ASM mirror (once hardware is in place)
  • Continued work on backup/recovery procedures (and documentation)
  • Updating disaster recovery procedures


Downtimes and At Risk

Description                           | Start          | End            | Affected VO(s)
Oracle security patch on Castor DBs   | 10/11/09 09:00 | 10/11/09 17:00 | All
Oracle security patch on 3D databases | 11/11/09 09:00 | 11/11/09 13:00 | ATLAS, LHCb
Investigate hardware problems         | 10/11/09 09:00 | 10/11/09 12:00 | LFC ATLAS, LFC non VOs, FTS

Development Priorities

  • CASTOR Database Monitoring
  • Migrate ATLAS TAGs to 64-bit systems
  • Investigate Oracle replication technique for LFC/FTS resilience
  • Investigate hardware architecture, backup and recovery strategy, resilience, and validation of restored backups

Requirements and Blocking Issues

Description | Required By | Priority | Status
Hardware for testing the ASM configuration and the script that prevents ASM from mounting the wrong ("old") ASM mirror | ASAP | High | Waiting
Hardware for TAG databases | - | Medium | Waiting
Hardware to test LFC database replication | - | Medium/High | Waiting

OnCall

  • Eter Pani