Difference between revisions of "RAL Tier1 weekly operations castor 22/07/2016"

From GridPP Wiki

Revision as of 09:32, 29 July 2016


Draft Agenda

1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
   a. 2.1.15
4. Long-term project updates (if not already covered)
   a. Facilities drive reallocation
5. Special topics
6. Actions
7. Anything for CASTOR-Fabric?
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB


Minutes from the previous meeting

Operation problems

CMS external xroot test is failing

gdss619 showed hardware problems and had to be set to read-only mode for RAID verify tests.

"No route to tape" issues for CMS, caused by the way file classes are set up

A number of facilities tape drives were down

An LHCb tape containing 800 files has been physically lost. Tim is chasing this up.

Operation news

Tape system library is stable

Deployment of the Dell 2015 tape buffers has started. Three of them have been deployed to the atlasNonProd service class

Long-term projects

Not much success with fixing the 2.1.15 installation in liaison with CERN

Migration to aquilon and SL7 upgrade. Intermediate step: configure a VM as a tape server.

Staffing

CP on call next week

RA may take some time off in lieu

GP may leave earlier on certain days

Actions

CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / uid / gid

RA disk servers requiring RAID update - locate servers and plan for update with fabric

RA to decide what to do with the persistent data (for the daily test) that is still on GenScratch

RA to update the doc for xroot certificates

GP to review with RA the mailing lists he is on

GP/RA to look at the stress test results for gdss596 and evaluate the WAN tuning parameters

Operation problems

gdss650 (LHCb d1t0) and gdss634 (atlasTape) went out of production due to hard drive and RAID card problems, respectively

Operation news

Three Dell 2015 disk servers were deployed into atlasTape. Six more are under deployment into cmsTape and lhcbRawRdst

Long-term projects

xroot tests for the 2.1.15 upgrade are in progress. A phone meeting will be arranged with Giuseppe next week

Migration to aquilon and SL7 upgrade. GP has created a VM on the cloud and will start adding tape server features on aquilon

Staffing

CP on call next week

Actions

RA disk servers requiring RAID update - locate servers and plan for update with fabric

RA to decide what to do with the persistent data (for the daily test) that is still on GenScratch

RA to update the doc for xroot certificates

GP to present the stress test results of gdss596 configured with the WAN tuning parameters

Completed actions

CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / uid / gid

GP to review with RA the mailing lists he is on