RAL Tier1 weekly operations castor 16/12/2016

Draft agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

  1. Castor 2.1.15
  2. SL7 upgrade on tape servers

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB


Operation problems

gdss685 (atlasStripInput) failed. It was put back in prod after two drives were replaced and rebuilt

gdss677 (cmsTape) failed and was removed from prod

Heavy I/O load on the CV11 cmsTape disk servers due to lots of tape recalls and writes. SAM tests failed

Slow migration of Diamond data to tape. Fdscts09 was showing very slow performance when writing to tape. The issue was resolved after Tim changed a cable

Operation news

The firmware on all CV13 disk servers was upgraded to the latest version (RT177723)

The total number of transfer slots was increased from 4000 to 8000 on the Dell2015 cmsTape servers, which fixed the problem with the failing SAM tests

Putting the CV11 disk servers in cmsTape into read-only mode for a few hours cleared the load (e-log)