Difference between revisions of "RAL Tier1 weekly operations castor 25/11/2016"

From GridPP Wiki
 

Revision as of 10:19, 25 November 2016

Draft agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

1. Castor 2.1.15
2. SL7 upgrade on tape servers

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operation problems

gdss750 failed due to fsprobe errors and was removed from production

gdss651 (preProd) is still down, see RT177006 (https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=177006)

Operation news

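All CV14 disk servers have been deployed into full production in lhcbDst, see ticket 176041 (https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=176041&results=c43ec556d93f2209c7f403533b72d222)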

Long-term projects

Castor 2.1.15 upgrade has been postponed until January 2017

SL7 tapeserver: progress with organising the tape-server software repos in pan templates. Scheduled a meeting with Bruno to discuss what has been done and the organisation of tape-server features

Special topics

Remake transfer rate plots for larger files (> 0.5 GB) and covering longer time periods: these requirements have been implemented in the script. The script still needs to be modified to ignore transfers that finished on the day after they started.
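
The filtering described above could look roughly like the sketch below. This is not the actual plotting script: the `Transfer` record, its field names, and the helper functions are hypothetical, the 0.5 GB threshold is assumed to mean decimal gigabytes, and "finished on the next day" is approximated by dropping any transfer whose start and end fall on different calendar dates.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Threshold from the notes; decimal GB (10**9 bytes) is an assumption.
MIN_SIZE_BYTES = 0.5 * 10**9


@dataclass
class Transfer:
    """Hypothetical record of one transfer; field names are illustrative."""
    start: datetime
    end: datetime
    size_bytes: int


def keep_transfer(t: Transfer) -> bool:
    """Keep transfers larger than the threshold that start and end on the same date."""
    large_enough = t.size_bytes > MIN_SIZE_BYTES
    same_day = t.start.date() == t.end.date()
    return large_enough and same_day


def rate_mb_per_s(t: Transfer) -> float:
    """Average rate in MB/s (decimal megabytes) over the whole transfer."""
    seconds = (t.end - t.start).total_seconds()
    return t.size_bytes / 1e6 / seconds


def filtered_rates(transfers: List[Transfer]) -> List[float]:
    """Rates for the transfers that pass the filter; zero-length transfers are skipped."""
    return [rate_mb_per_s(t) for t in transfers if keep_transfer(t) and t.end > t.start]


if __name__ == "__main__":
    sample = [
        # 1 GB in 10 s on the same day: kept
        Transfer(datetime(2016, 11, 24, 10, 0, 0), datetime(2016, 11, 24, 10, 0, 10), 10**9),
        # 1 GB crossing midnight: ignored
        Transfer(datetime(2016, 11, 24, 23, 59, 55), datetime(2016, 11, 25, 0, 0, 5), 10**9),
        # 0.1 GB on the same day: below the size threshold, ignored
        Transfer(datetime(2016, 11, 24, 12, 0, 0), datetime(2016, 11, 24, 12, 0, 5), 10**8),
    ]
    print(filtered_rates(sample))
```

Comparing calendar dates rather than durations keeps the rule simple, but it also drops short transfers that happen to cross midnight; whether that is acceptable depends on why the overnight transfers are being excluded.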

Actions

Discuss with Khash the urgency of the RAID upgrade on the CV13 disk servers and plan the intervention

Present AL with two alternatives to choose from: 1) create a generic fileclass/tapepool; 2) limit the "unroutable file to tape" call to working hours

Delete empty dirs from CASTOR (prompted by BD)

Test DB upgrade to CASTOR 2.1.15

Schedule with AL a CASTOR upgrade of preprod from scratch

RA to talk to AL about merging CMS disk pools