RAL Tier1 weekly operations castor 09/08/2010

From GridPP Wiki

Work previous week

  • Matthew:
    • A/L
  • Shaun:
    • Operations: Two disk servers in LHCbUser showed heavy load because the other two were full.
    • Operations: Database problem on Neptune; resolved with Ian and Keir
    • Operations: Need to schedule firmware update for SL08 disk servers
    • PreProd: Some problems with grid-map file. Resolved by changing order of fetching info.
    • PreProd: Problems on VULCAN DB
    • PreProd: Problems with gdss154 as source for disk-2-disk copy
    • PreProd: CMS and ATLAS started testing. No response yet from ALICE and LHCb
  • Chris:
    • A/L
  • Richard:
    • PreProd: CIP now publishing correctly
    • PreProd: Finishing the running of 2.1.9 functional test suite
  • Brian:
    • ..
  • Jens:
    • ..

Operations Issues

  • lhcbUser ran out of space on 2 disk servers. Now in draining.
  • gdss417 (atlasMcDisk) failed due to an unknown problem with the array controller. The inaccessible data on this disk server resulted in RAL being blacklisted from running farm jobs.


PreProd


  • Vulcan DB disk array failed and was rebooted on Friday.

Blocking issues

  • None

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB: None

Advanced Planning

  • Upgrade to 2.1.8/2.1.9 in 2010

Staffing

  • Castor on Call person: Matthew
  • Staff absences:
    • Chris A/L
    • Shaun A/L