Difference between revisions of "RAL Tier1 weekly operations castor 11/08/2014"

Revision as of 15:18, 8 August 2014

Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs.
Samneet's query tool is under development and we hope to have an alpha version available for use by the end of next week.
Plans to ensure PreProd represents production in terms of hardware generation are underway.
The remaining 2014 disk servers have been deployed into production.

The problems with the draining tool have been understood and a fix is being change-controlled on Monday.
A new service class called 'cedaRetrieve' has been created to allow CEDA users (aka Kevin) to manually stage files for retrieval.
The rebalancer has been tested and found to cause problematically large transfermanager queues even with low thresholds set. We will not be using it further until we have a fix.

A Tier 1 Database cleanup is planned so as to eliminate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

Tasks

Possible future upgrade to CASTOR 2.1.14-15.
Resume draining on the ATLAS instance once the draining issues are resolved.
Switch admin machines from lcgccvm02 to lcgcadm05.
A new VM, configured to run against the standby CASTOR database, will be created as a front-end for dark data etc. queries.
Replace DLF with Elasticsearch.
Correct the partition alignment issue (3rd CASTOR partition) on new CASTOR disk servers.

Interventions

@@ Line 8: / Line 8: @@
 * The problems with the draining tool have been understood and a fix is being change-controlled on Monday.
 * A new service class called 'cedaRetrieve' has been created to allow CEDA users (aka Kevin) to manually stage files for retrieval.
+* The rebalancer has been tested and found to cause problematically large transfermanager queues even with low thresholds set. We will not be using it further until we have a fix.
 == Blocking Issues ==
+* Neither rebalancing nor draining currently works.
 == Planned, Scheduled and Cancelled Interventions ==
+* A Tier 1 Database cleanup is planned so as to eliminate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.
@@ Line 28: / Line 29: @@
 '''Interventions'''
+* None
 == Staffing ==
 * Castor on Call person
-** Shaun
+** Rob
 * Staff absence/out of the office:
-** Chris and Matt out all week
+** Chris, Matt and Shaun out all week