RAL Tier1 weekly operations castor 11/08/2014

From GridPP Wiki

Revision as of 15:18, 8 August 2014 by Rob Appleyard

Contents

1 Operations News
2 Operations Problems
3 Blocking Issues
4 Planned, Scheduled and Cancelled Interventions
5 Advanced Planning
6 Staffing

Operations News

Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs.
Samneet's query tool is under development and we hope to have an alpha version available for use by the end of next week.
Plans to ensure PreProd represents production in terms of hardware generation are underway.
The remaining 2014 disk servers have been deployed into production.

Operations Problems

The problems with the draining tool are now understood, and a fix will go through change control on Monday.
A new service class called 'cedaRetrieve' has been created to allow CEDA users (aka Kevin) to manually stage files for retrieval.
The rebalancer has been tested and found to cause problematically large transfermanager queues even with low thresholds set. We will not be using it further until we have a fix.

Blocking Issues

Neither rebalancing nor draining currently works.

Planned, Scheduled and Cancelled Interventions

A Tier 1 Database cleanup is planned so as to eliminate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

Advanced Planning

Tasks

Possible future upgrade to CASTOR 2.1.14-15.
Resume draining on the ATLAS instance once the draining issues are resolved.
Switch admin machines from lcgccvm02 to lcgcadm05.
A new VM, configured to run against the standby CASTOR database, will be created as a front-end for dark data etc. queries.
Replace DLF with Elasticsearch.
Correct the partition alignment issue (3rd CASTOR partition) on the new CASTOR disk servers.

Interventions

None

Staffing

CASTOR on-call person
- Rob

Staff absence/out of the office:
- Chris, Matt and Shaun out all week
