Operations Report 01/03/2010

Summary of Previous Week

  • Tested the PSU patch on 64-bit systems; not yet tested on 32-bit systems
  • Investigating backups to tape
  • Move backup process to more resilient node (CASTOR Neptune)
  • Clusterware reconfiguration (CASTOR)
  • Planned CASTOR Pre-Prod database restoration
  • Applied Java security fix to 3D databases

Operational Issues and Incidents

None

Plans for Week(s) Ahead

  • Apply Java security fix to CASTOR databases and Somnus
  • Document and test the procedure for adding and removing cluster nodes
  • Restore the Vulcan pre-prod system
  • Test quarterly Oracle patch update (waiting on Pre-Prod system Vulcan)

Downtimes and At Risk

Description | Start | End | Affected VO(s) | Type
Security Fix on Neptune, Pluto, Somnus | 2nd March 10:00 | 2nd March 11:00 | All | At risk

Development Priorities

  • Deploy CASTOR Database Monitoring
  • Migrate ATLAS TAGs to 64bit systems
  • Investigate Oracle replication techniques for LFC/FTS resilience
  • Investigate hardware architecture, backup and recovery strategy, resilience, and validation of restored backups

Requirements and Blocking Issues

Description | Required By | Priority | Status
Vulcan HW configuration | ASAP | Medium/high | Waiting

OnCall

  • Carmine

Absences

  • Keir (All Week)