RAL Tier1 weekly operations castor 12/08/2016

From GridPP Wiki

Draft agenda

1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
4. Long-term project updates (if not already covered)
      1. Facilities drive reallocation
      2. CASTOR 2.1.15
      3. SL7 upgrade on tape servers
5. Special topics
6. Actions
7. Anything for CASTOR-Fabric?
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB

Operation problems

Callout for a large transfer manager queue on atlasScratchDisk. An SRM problem is possible but was not apparent. Restarting the transfer and disk managers did not help; killtransfers was used to clear all pending jobs from the queue.

Operation news

gdss748 (atlas d1t0) is back in production

Long-term projects

The gridFTP problem in CASTOR 2.1.15 has been fixed. The xroot problem remains to be fixed.

Actions

RA: disk servers requiring a RAID update - locate the servers and plan the update with Fabric

RA to decide what to do with the persistent data (for the daily test) that is still on GenScratch

RA to update the doc for xroot certificates

GP to present the stress test results of gdss596 configured with the WAN tuning parameters

For Castor-Fabric

Status of the OCF 2014 disk servers that will be handed over from Ceph to Castor (RT 173922)

Staffing

RA on call