RAL Tier1 weekly operations castor 23/11/2015

From GridPP Wiki
Hi Rob, I hope this is an accurate reflection of our meeting. Please check that I got the terminology and version numbers right.

• ACTION: RA to try stopping tapeserverd mid-migration to see if it breaks. RA to check with TF for a more detailed explanation of what he did and why.

• SL6 upgrade, CV11 problems – systems have been stable this week. RA did some stress testing, but the results were inconclusive: the tests did not produce a result comparable to the fault seen in production. It may simply not be possible to simulate the behaviour on the pre-production system. ACTIONS: RA to ensure the procedure for dealing with any recurrence of this issue is documented for on-call personnel. GS to ensure this is mentioned at the on-call meeting, and to check with CW and KM what they are testing on the machines that have had the fault, as we have seen machines returned to production and then show the fault again. RA – decision made to continue with the upgrade to SL6 in any case.

• GS/RA to revisit the CASTOR decommissioning process in light of the production team's updates to their decommissioning process.

• Today: RA/JS vcirt upgrade to 2.15.20

• Long-term projects:

• RA has produced a Python script to handle the SRM db duplication issue, which is causing callouts. There is a problem running the script because the version of Python on the SRM servers is still 2.4; RA will pursue this. SdW has reviewed the script and is confident that this is low risk.

• JJ – Glue 2 for CASTOR. This appears to relate to publishing information; it is not clear whether a specific action was associated with it.

• JS – replacing Tier1 CASTOR db hardware. ACTION: RA/JS to discuss disk requirements.

• Availability for next week: RA out from Wednesday 25th November until ??. BD out Thursday 26th and Friday 27th November.

• BD/RA – review tickets prior to RA going on holiday