RAL Tier1 weekly operations castor 14/01/2011

From GridPP Wiki

Operations News

  • Upgraded gridftp RPM rolled out to CMS and LHCb
  • WAN tuning rolled out to all remaining service classes in CMS
  • CERN has confirmed that we 'should' be able to upgrade the NS to 2.1.10 while the stagers remain on 2.1.9-6; this still needs testing

Operations Issues

  • Callouts for low free space on partitions on the SRM and NS; neither had a production impact, and both are being dealt with.

Blocking Issues

  • The lack of production-class hardware running ORACLE 10g needs to be resolved before CASTOR for Facilities goes into full production. The hardware has been ordered: the servers arrive this week and the RAID device in mid-March.

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

Description | Start | End | Type | Affected VO(s)
Upgrade and quattorize Gen disk servers to SL5 64-bit and upgrade gridftp RPMs | 15/02/2011 08:00 | 15/02/2011 16:00 | Downtime | Gen
Roll out WAN tuning changes to remaining CMS disk pools | 15/02/2011 10:00 | 15/02/2011 12:00 | At-Risk | CMS
Merge the DATADISK and MCDISK disk pools | 17/02/2011 11:00 | 17/02/2011 15:00 | At-Risk | ATLAS
Roll out WAN tuning changes to all remaining disk servers (STC) | 01/03/2011 09:00 | 01/03/2011 16:00 | At-Risk | ATLAS, LHCb, Gen
Upgrade NS to 2.1.10 (STC) | 01/03/2011 10:00 | 01/03/2011 11:00 | Downtime | ALL

Advanced Planning

  • Upgrade Gen disk servers to SL5 64-bit and quattorize the remaining non-quattorized disk servers
  • CASTOR certification and upgrade to 2.1.10 and upgrade of SRM to 2.10 which incorporates:
    • fix for gridftp-internal to support multiple service classes, enabling checksums for Gen
    • fix so that files on draining disk servers accessed by FTS are reported as NEARLINE rather than UNAVAILABLE

Staffing

  • CASTOR on-call person: Chris
  • Staff absence/out of the office:
    • Shaun (Mon)
    • Richard (Mon,Wed,Thu)
    • Jens (Mon,Tue,Fri)