Difference between revisions of "RAL Tier1 weekly operations castor 31/03/2014"


Revision as of 15:58, 28 March 2014

Operations News

  • ..

Operations Problems

  • CMS load continues to cause problems; we had to restart the transfer/diskmanagers to get things working again (Monday 10:45 and Tuesday 17:30)
  • transfermanagerd restarted on fdscdlf02 Thursday
  • vcert SRM and name server are not accessible due to issues with the hypervisor after the rack move; some config may be required to bring them back. Dimitrios is looking into this
  • We had a node crash on Neptune, causing brief issues with the Atlas SRM. This is a known issue that has already been logged with Oracle

Blocking Issues

  • none

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB


Advanced Planning

Tasks

  • CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few problems.
  • iptables to be installed on lcgcviewer01 to harden the logging system against the injection of junk data by security scans.
  • Quattor cleanup process is ongoing.
  • Installation of new Preprod headnodes

Interventions

  • (Tue 1 Apr) Facilities CASTOR Upgrade. Downtime between 09:00 and 16:00

Staffing

  • CASTOR on-call person
    • Matthew
  • Staff absence/out of the office:
    • ..