Operations Report 14/09/2009

From GridPP Wiki

Summary of Previous Week

Developments

  • The ATLAS 3D database has been migrated to 64-bit.
  • This morning we met with the Grid services team (Catalin, Matt H.) to discuss FTS/LFC resilience.

Operational Issues and Incidents

  • On Thursday one of the Castor database disk arrays became unavailable and ASM hung (an Oracle bug). The service was down from ~15:00 until ~18:00.

Plans for Week(s) Ahead

  • Castor Name server upgrade
  • Change some Oracle configuration on Castor systems to improve performance.


Downtimes and At Risk

Description | Start | End | Affected VO(s)
Name server upgrade (Down Time) | 15/09/2009 09:00 | 15/09/2009 12:00 | All
Apply changes to Oracle server (At Risk) | 15/09/2009 12:00 | 17/09/2009 17:00 | All

Development Priorities

  • CASTOR Database Monitoring
  • Migrate LHCb 3D and ATLAS TAGs to 64-bit systems
  • Investigate Oracle replication techniques for LFC/FTS resilience

Requirements and Blocking Issues

Description Required By Priority Status
LHCb 3D and Tag databases migration to 64-bit Oracle Medium 64-bit cluster installation

OnCall

  • Keir Hawker