RAL Tier1 weekly operations castor 03/08/2009

From GridPP Wiki

Latest revision as of 12:54, 31 July 2009

Summary of Previous Week

  • CIP development on certification (Jens)
  • CASTOR Disaster Recovery document (Matt)
  • Investigate 2.1.8 NS client on 2.1.7 NS DB (Chris)
  • Tier1 Production Manager (Matt)
  • Interviewing for CASTOR Pre Production Service Manager post (Matt)
  • Established with experiments when to intervene with hardware on D0T1 disk servers (Matt)
  • Python 3 training course (Cheney)
  • Oracle firewall rules (Cheney)
  • Fixing problem with robot controllers (Cheney)
  • Liaising with CMS to delete data (Tim)


Developments for this week

  • CIP development on certification (Jens)
  • Configuring disk servers to use gridftp-internal (Chris)
  • CASTOR Disaster Recovery document (Matt)
  • Write CASTOR status for oversight committee (Matt)

Ongoing

  • Find out more about CERN's virtualized certification setup (Chris, Matt)
  • Deploying 2.1.8-8 on remaining tape servers (Chris)
  • Fix missing graphs on castormon - Atlas tape migration and aggregated Gen monitoring (Brian)
  • Cleaning up database for a future 2.1.8 upgrade (Shaun)
  • Setting up Preproduction (Matt)

Operations Issues

Disk failure plus errors on 3 other disks on gdss213 (AtlasScratchDisk)

Blocking issues

none

Scheduled and Cancelled Down Times

none

Changes to Production Milestones

none

Advanced Planning

  • CIP upgrade to include nearline publishing (August)
  • SRM 2.8 upgrade (August)
  • Work with Fabric to add extra RAID card in remaining Viglen'06 disk servers (Second half of August)
  • Database optimization tasks (September)
  • Upgrade nameserver to 2.1.8 (Possibly during September)
  • Black and White lists? (Possibly during September)
  • Improve resiliency to central services (This year)

Staffing

  • CASTOR on-call person: Matt (Mon-Thurs), Chris (Fri-Sun)
  • Cheney on A/L
  • Matt on A/L from Friday