RAL Tier1 weekly operations castor 31/03/2014

From GridPP Wiki

Revision as of 16:16, 28 March 2014

Operations News

  • Disk deployments: 1 CV’13 in lhcbDst / 10 CV’13 in lhcbNonProd waiting for blessing / 3 CV’13 on the way to cmsNonProd
  • Disk draining: 2 ATLAS servers drained and 1 in progress; 3 CMS servers drained and 1 in progress

Operations Problems

  • CMS load continues to cause problems; we had to restart the transfer and disk managers to get things working again (Monday 10:45 and Tuesday 17:30)
  • transfermanagerd was restarted on fdscdlf02 on Thursday
  • The vcert SRM and name server are not accessible because of hypervisor issues after the rack move; some configuration may be needed to restore service. Dimitrios is looking into this
  • A node crash on Neptune caused brief issues with the ATLAS SRM. This is a known issue that has already been logged with Oracle

Blocking Issues

  • none

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB


Advanced Planning

Tasks <<<<< REVIEW THIS >>>>>

  • CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few problems.
  • iptables to be installed on lcgcviewer01 to harden the logging system against the injection of junk data by security scans.
  • Quattor cleanup process is ongoing.
  • Installation of new Preprod headnodes

Interventions

  • (Tue 1 Apr) Facilities CASTOR upgrade. Downtime from 09:00 to 16:00

Staffing

  • CASTOR on-call person
    • Matthew
  • Staff absence/out of the office:
    • (Mon-Fri) Rob A/L
    • (Friday) Bruno possible A/L