Difference between revisions of "RAL Tier1 weekly operations castor 20/05/2016"

From GridPP Wiki
(CASTOR issues)
(CASTOR issues)
Line 9:
Heavy workload on the ATLAS scratch disk resulting in almost nothing being achieved
− Full recover from the tape robot and air condition problems
+ Full recovery from the tape robot and air condition problems
− Double put start on CASTOR facilities
+ Double put start issue on CASTOR facilities (BD)
Some work to be done on the improvement of the logic of the new draining script
Line 18:
GDSS727 (production D1T0 CMS disk server) FSProbe Error
− It has been removed from Production and Overwatch Updated (Gareth, [https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=172141 RT 172141]
+ Removed from Production and Overwatch Updated [https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=172141 RT 172141]
− Ongoing work on the upgrade to CASTOR 2.1.15 on preprod
+ Ongoing work on the upgrade to CASTOR 2.1.15 on preprod (RA)
GP and BD to chase the dteam for the GP membership request

Revision as of 10:41, 20 May 2016

Operation news

Automated workflow for disk server deployment has been disabled

New CASTOR functional testing using xrootd will be enabled on Monday 23/05/2016

CASTOR issues

Heavy workload on the ATLAS scratch disk resulting in almost nothing being achieved

Full recovery from the tape robot and air conditioning problems

Double put start issue on CASTOR facilities (BD)

The logic of the new draining script still needs some improvement

gdss664 was brought back to production on 18/05/2016 at ca. 15:00 following a successful rebuild

GDSS727 (production D1T0 CMS disk server) FSProbe error: removed from production and Overwatch updated (RT 172141)

Ongoing work on the upgrade to CASTOR 2.1.15 on preprod (RA)

GP and BD to chase the dteam for the GP membership request

GP and BD to perform stress testing of gdss596 to evaluate the new WAN parameters

GP to talk to Andrew Lahiff about an SL7 upgrade on the worker nodes

The SRM DB duplicate-removal script is being tested

BD and RA will test the newly created tape families for ATLAS