Search results

Page title matches

RAL Tier1 weekly operations castor 09/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Tier 1 CASTOR stopped and rebooted for Ghost vulnerability (and CIP)

3 KB (449 words) - 16:58, 6 February 2015
RAL Tier1 weekly operations castor 15/09/2014

...ing - need to investigate if a fix is already available, if not discuss at castor face to face * Break in connectivity Monday 8th, it appears that this did not affect castor internally in any way however if transfers were in process they would have

3 KB (404 words) - 15:14, 12 September 2014
RAL Tier1 weekly operations castor

[[Category:CASTOR]] == Tier1 Castor at RAL Weekly Operations ==

31 KB (3,178 words) - 09:34, 2 August 2019
RAL Tier1 weekly operations castor 17/03/2014

* CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few * Castor on Call person

1 KB (181 words) - 13:58, 17 March 2014
RAL Tier1 CASTOR Experiments Completed Actions 2016

...transfers from one non-LHC VO from affecting others due to the use of a shared CASTOR instance - Able to set limits for each VO srm endpoint, need to decide and

1 KB (188 words) - 14:11, 21 December 2016
RAL Tier1 CASTOR Experiments Completed Actions 2013

...313-01 || Medium || ATLAS || Alastair || Make sure ATLAS GGUS ticket about CASTOR problems affecting FTS is up-to-date || Closed || 2013-05-01

2 KB (219 words) - 09:28, 20 May 2015
RAL Tier1 weekly operations castor 24/03/2014

* CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few * Castor on Call person

1 KB (164 words) - 15:18, 24 March 2014
RAL Tier1 weekly operations castor 21/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (333 words) - 10:24, 28 July 2017
RAL Tier1 weekly operations castor 31/03/2014

* CASTOR 2.1.14 Upgrade Progress - Reversion to 2.1.13-9 software and databases on p * (Tue 1 Apr) Facilities CASTOR Upgrade. Downtime between 0900-1600

2 KB (368 words) - 16:46, 28 March 2014
RAL Tier1 weekly operations castor 07/04/2014

* Facilities CASTOR was successfully upgraded to 2.1.14-11 ...rian to discuss with Alastair. Other tier 1s are not keen but RAL tier 1 / castor should be able to cope with this.

1,019 B (149 words) - 13:21, 4 April 2014
RAL Tier1 weekly operations castor 14/04/2014

* The NN_FILE_STAGERTIME constraint has been removed for the Facilities CASTOR database, completing the 2.1.14 upgrade. This upgrade was thought to be tra * The xrootd timeout in castor.conf is now set to 30s for all nodes.

1 KB (221 words) - 10:09, 15 April 2014
RAL Tier1 weekly operations castor 28/04/2014

* A new version of CASTOR 2.1.14 (2.1.14-12) has been released. This version makes no changes to the * CASTOR 2.1.14 upgrade for Tier 1.

1 KB (208 words) - 13:02, 25 April 2014
RAL Tier1 weekly operations castor 05/05/2014

* CASTOR 2.1.14 upgrade for Tier 1. Possible date for first stage of intervention (N * CASTOR 2.1.14 for Tier 1

1 KB (161 words) - 15:56, 2 May 2014
RAL Tier1 weekly operations castor 12/05/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * CASTOR 2.1.14 upgrade for Tier 1. Possible date for first stage of intervention (N

2 KB (245 words) - 10:07, 13 May 2014
RAL Tier1 weekly operations castor 19/05/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * CASTOR 2.1.14 upgrade for Tier 1. First stage of intervention (NS upgrade) is book

2 KB (294 words) - 15:03, 19 May 2014
RAL Tier1 weekly operations castor 02/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local ...n our issues was reported/fixed. These servers are now in acceptance test. Castor team will only deploy V13 servers to non prod until further notice.

2 KB (290 words) - 10:34, 30 May 2014
RAL Tier1 weekly operations castor 26/05/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local ...n our issues was reported/fixed. These servers are now in acceptance test. Castor team will only deploy V13 servers to non prod until further notice.

2 KB (276 words) - 13:46, 28 May 2014
RAL Tier1 weekly operations castor 09/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local ...have been upgraded need further configurations (James) before releasing to castor team. V13 machines in production should have firmware update, best approach

2 KB (267 words) - 15:00, 9 June 2014
RAL Tier1 weekly operations castor 16/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * A partitioning alignment issue (3rd CASTOR partition) has been identified, proposal is to resolve this for new machine

3 KB (412 words) - 13:13, 13 June 2014
RAL Tier1 weekly operations castor 23/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * A partitioning alignment issue (3rd CASTOR partition) has been identified, proposal is to resolve this for new machine

3 KB (423 words) - 12:45, 20 June 2014
RAL Tier1 weekly operations castor 30/06/2014

.... CERN provided a solution for SL5.9. We need to consider SL6 upgrade post CASTOR 2.1.14-13 upgrades. ...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local

2 KB (366 words) - 16:00, 27 June 2014
RAL Tier1 weekly operations castor 07/07/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * CMS db locking issue 3/7/14 early hours, resulted in lost CMS test file, castor current shows diskcopy_failed in stager logs. Proposal is to identify if th

2 KB (362 words) - 15:10, 12 August 2014
RAL Tier1 weekly operations castor 14/07/2014

...ek with the task of investigating visualisation and querying solutions for CASTOR use. * CASTOR 2.1.14-13 upgrade for Repack - planned for Tuesday or Wednesday this week.

2 KB (308 words) - 13:48, 14 July 2014
RAL Tier1 weekly operations castor 21/07/2014

...on with the task of investigating visualisation and querying solutions for CASTOR use. * Incorrect service classes in castor.conf on disk servers, Atlas issues resolved by Rob. Other non production is

2 KB (318 words) - 09:07, 21 July 2014
RAL Tier1 weekly operations castor 28/07/2014

...on with the task of investigating visualisation and querying solutions for CASTOR use. * Facilities castor error

2 KB (262 words) - 15:46, 25 July 2014
RAL Tier1 weekly operations castor 04/08/2014

* We have received word that a 2.1.14-15 version of CASTOR may be forthcoming. * Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs.

2 KB (279 words) - 16:48, 1 August 2014
RAL Tier1 weekly operations castor 11/08/2014

* Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs. ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (300 words) - 11:01, 15 August 2014
RAL Tier1 weekly operations castor 18/08/2014

* Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs. ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (313 words) - 10:40, 15 August 2014
RAL Tier1 weekly operations castor 25/08/2014

* passive draining produces file duplication - fixed in castor 2.1.14-14 * SL6 castor stalled due to resource limitations

2 KB (312 words) - 10:55, 22 August 2014
RAL Tier1 weekly operations castor 01/09/2014

* passive draining produces file duplication - fixed in castor 2.1.14-14 * SL6 castor stalled due to resource limitations & A/L

2 KB (338 words) - 15:06, 29 August 2014
RAL Tier1 weekly operations castor 08/09/2014

* Juan to patch castor dbs beginning of Nov PSU patches – standard change ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (279 words) - 12:37, 5 September 2014
RAL Tier1 Incident 20130628 Atlas Castor Outage

==RAL Tier1 Incident 20130628 Atlas Castor Outage== ===Description:=== The ATLAS CASTOR instance encountered a problem where large numbers of invalid subrequests g

10 KB (1,594 words) - 10:56, 1 May 2015
RAL Tier1 weekly operations castor 22/09/2014

* Juan to patch castor dbs beginning of Nov PSU patches – standard change ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (297 words) - 16:23, 19 September 2014
RAL Tier1 weekly operations castor 15/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * gdss659 is still but will be decommissioned out of CASTOR.

2 KB (383 words) - 10:53, 4 February 2015
RAL Tier1 weekly operations castor 29/09/2014

* useful breakout sessions at Castor face to face - deadlock analysis & bugs confirmed, discussions to simplify * Juan to patch castor dbs starting next week (PSU patches) – standard change

2 KB (274 words) - 15:25, 26 September 2014
RAL Tier1 weekly operations castor 19/01/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * SL6 name server upgrade postponed due to castor team resource - likely to be this week

2 KB (366 words) - 15:20, 16 January 2015
RAL Tier1 weekly operations castor 06/10/2014

* useful breakout sessions at Castor face to face - deadlock analysis & bugs confirmed, discussions to simplify ...nt on gdss720. Server currently in read only and will revisit post current castor issues.

3 KB (479 words) - 09:41, 7 October 2014
RAL Tier1 weekly operations castor 13/10/2014

* SL6 Headnode work progressing well - hoping for test in castor vcert next week ...h due to emc failure. Action Add CIP into instructions for castor failover. Castor team decided to wait until dbs rolled back.

3 KB (479 words) - 16:36, 10 October 2014
RAL Tier1 weekly operations castor 20/10/2014

* SL6 Headnode work progressing well - tested in vcert2, hoping for test in castor vcert next week and production end of Nov. * Successfully moved Castor atlas/gen stager/srm back to primary db following EMC cache battery replace

2 KB (378 words) - 12:09, 17 October 2014
RAL Tier1 weekly operations castor 27/10/2014

* 2-1-14-14 castor upgrade priority dropped as we have a draining workaround. Revisit once SL6 ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (270 words) - 14:50, 27 October 2014
RAL Tier1 weekly operations castor 3/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

2 KB (355 words) - 10:19, 3 November 2014
RAL Tier1 weekly operations castor 10/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

1 KB (226 words) - 13:51, 12 November 2014
RAL Tier1 weekly operations castor 17/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

2 KB (267 words) - 14:41, 14 November 2014
RAL Tier1 weekly operations castor 24/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

2 KB (265 words) - 14:14, 21 November 2014
RAL Tier1 weekly operations castor 01/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * gdss659 is still but will be decommissioned out of CASTOR.

2 KB (364 words) - 15:03, 2 December 2014
RAL Tier1 weekly operations castor 08/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * gdss659 is still but will be decommissioned out of CASTOR.

2 KB (331 words) - 10:53, 4 February 2015
RAL Tier1 weekly operations castor 22/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Kernel and errata upgrade on Castor SL6 headnodes (including reboot) - Tues 23rd 10:00 - 12:00

3 KB (386 words) - 11:33, 19 December 2014
RAL Tier1 weekly operations castor 11/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Testing CASTOR rebalancer on preproduction.

4 KB (574 words) - 15:27, 11 May 2015
RAL Tier1 weekly operations castor 12/01/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * SL6 name server upgrade postponed due to castor team resource - needs to be rescheduled

2 KB (368 words) - 13:40, 9 January 2015
RAL Tier1 weekly operations castor 26/01/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Redundant atlasHotdisk service class and disk pool from CASTOR

2 KB (358 words) - 14:00, 23 January 2015
RAL Tier1 weekly operations castor 02/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Facilities CASTOR patched for kernel/errata (not Ghost)

3 KB (502 words) - 14:28, 30 January 2015
RAL Tier1 weekly operations castor 16/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * storageD retrieval from castor problems - investigation ongoing

3 KB (429 words) - 15:30, 16 February 2015
RAL Tier1 weekly operations castor 23/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * storageD retrieval from castor problems - investigation ongoing

3 KB (491 words) - 09:48, 25 February 2015
RAL Tier1 weekly operations castor 09/03/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] ...while draining (had difficulties previously) - now back and draining final castor partition

3 KB (550 words) - 14:59, 9 March 2015
RAL Tier1 weekly operations castor 16/03/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] ...it current version - never seen by RAL. CERN have a workaround in place on castor 2.1.15

4 KB (574 words) - 12:14, 13 March 2015
RAL Tier1 weekly operations castor 23/03/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * storageD retrieval from castor problems - investigation ongoing

3 KB (537 words) - 17:53, 20 March 2015
RAL Tier1 weekly operations castor 04/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Upgrade of CASTOR DBs to Oracle DB version 11.2.0.4 complete.

3 KB (514 words) - 16:13, 1 May 2015
RAL Tier1 Incident 20150408 network intervention preceding Castor upgrade

==RAL-LCG2 Incident 20150408 network intervention preceding Castor upgrade== ... to resolve (and was not finally cleared until the following morning.) The Castor update had to be backed out and there were some problems in doing this.

15 KB (2,406 words) - 16:43, 17 August 2015
RAL Tier1 weekly operations castor 27/04/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * testing CASTOR rebalancer (new version in 2.1.14-15)

3 KB (520 words) - 09:25, 1 May 2015
RAL Tier1 weekly operations castor 20/04/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Tier 1 CASTOR 2.1.14-15 upgrade completed successfully

3 KB (542 words) - 13:38, 20 April 2015
RAL Tier1 weekly operations castor 18/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Testing CASTOR rebalancer on preproduction, and developing associated tools.

4 KB (566 words) - 14:12, 15 May 2015
RAL Tier1 weekly operations castor 25/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] ...e are examining options for running this in a slow-and-steady fashion with CASTOR up.

4 KB (657 words) - 12:54, 22 May 2015
RAL Tier1 CASTOR Experiments Completed Actions 2012

... || Medium || || Andrew S || Discuss strategy for funding LSF in 2012 with CASTOR team || No longer necessary, since an LSF license has been purchased for th | 20120321-01 || Medium || ALICE || Lee, Shaun || Find out about the load on CASTOR from Japan || Closed. No longer relevant. || 2012-04-25

4 KB (566 words) - 09:26, 20 May 2015
RAL Tier1 weekly operations castor 01/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Mice (Castor Gen) will be operating overnight and able to call pri oncall

5 KB (830 words) - 15:06, 29 May 2015
RAL Tier1 weekly operations castor 08/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * CASTOR rebalancing from Monday

6 KB (919 words) - 14:23, 5 June 2015
RAL Tier1 weekly operations castor 15/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * CASTOR Gen rebalancing underway

5 KB (750 words) - 11:09, 12 June 2015
RAL Tier1 weekly operations castor 22/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Facilities CASTOR - change to time to write to tape from 30 mins to 5 mins now

5 KB (799 words) - 09:33, 22 June 2015
RAL Tier1 weekly operations castor 06/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Juno (CASTOR Facilities) Oracle update to 11.2.0.4

6 KB (974 words) - 16:10, 3 July 2015
RAL Tier1 weekly operations castor 29/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Change to improve file open times on CASTOR (central db, subrequest todo procedure) - has now been deployed to LHCb and

6 KB (938 words) - 12:34, 1 July 2015
RAL Tier1 weekly operations castor 13/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Proposed CASTOR face to face W/C Oct 5th or 12th

6 KB (1,039 words) - 08:28, 14 July 2015
RAL Tier1 weekly operations castor 20/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Proposed CASTOR face to face W/C Oct 5th or 12th

3 KB (509 words) - 11:07, 24 July 2015
RAL Tier1 weekly operations castor 27/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Proposed CASTOR face to face W/C Oct 5th or 12th

3 KB (535 words) - 15:22, 27 July 2015
RAL Tier1 weekly operations castor 24/08/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

3 KB (488 words) - 11:29, 21 August 2015
RAL Tier1 weekly operations castor 03/08/2015

* Proposed CASTOR face to face W/C Oct 5th or 12th * Upgrade CASTOR disk servers to SL6

3 KB (569 words) - 15:00, 3 August 2015
RAL Tier1 weekly operations castor 10/08/2015

* Upgrade CASTOR disk servers to SL6 * Proposed CASTOR face to face W/C Oct 5th or 12th

3 KB (539 words) - 14:09, 7 August 2015
RAL Tier1 weekly operations castor 17/08/2015

* Upgrade CASTOR disk servers to SL6 * Proposed CASTOR face to face W/C Oct 5th or 12th

2 KB (336 words) - 13:26, 14 August 2015
RAL Tier1 weekly operations castor 31/08/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

4 KB (596 words) - 10:39, 28 August 2015
RAL Tier1 weekly operations castor 07/09/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

4 KB (617 words) - 10:35, 4 September 2015
RAL Tier1 weekly operations castor 21/09/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

4 KB (651 words) - 10:23, 18 September 2015
RAL Tier1 weekly operations castor 02/10/2015

...d from castor and back to fabric to gather spares cv11 spec – no further castor action. ** all VOs / all castor disks

5 KB (886 words) - 10:45, 2 October 2015
RAL Tier1 weekly operations castor 09/10/2015

* CASTOR 2.1.15 * Proposed CASTOR face to face W/C Oct 5th or 12th

4 KB (637 words) - 12:47, 9 October 2015
RAL Tier1 weekly operations castor 16/10/2015

* The checksum issue/tickets still present. These are thought to be due to a CASTOR bug fixed in 2.1.15. * CASTOR 2.1.15

2 KB (401 words) - 13:06, 16 October 2015
RAL Tier1 weekly operations castor 16/11/2015

* RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting * Disk servers name lookup issue (CV11's) - more system than CASTOR. Currently holding CV11 upgrades until understood.

3 KB (478 words) - 16:01, 16 November 2015
RAL Tier1 weekly operations castor 23/10/2015

* CASTOR 2.1.15 == Issues to bring up at CASTOR F2F ==

2 KB (345 words) - 15:23, 27 October 2015
RAL Tier1 weekly operations castor 30/10/2015

Castor ops 23/10/15 11-2-04 client updates – 2.1.15 prerequisite … has to go on castor headnodes

795 B (124 words) - 09:53, 9 November 2015
RAL Tier1 weekly operations castor 09/11/2015

* RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting * CASTOR 2.1.15

2 KB (374 words) - 17:37, 6 November 2015
RAL Tier1 weekly operations castor 30/11/2015

* LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc * RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting

5 KB (850 words) - 11:33, 27 November 2015
RAL Tier1 weekly operations castor 23/11/2015

• GS/RA to revisit the CASTOR decommissioning process in light of the production team updates to their de • JJ – Glue 2 for CASTOR, something to do with publishing information??? Not sure there was a speci

2 KB (306 words) - 16:22, 24 November 2015
RAL Tier1 weekly operations castor 07/12/2015

* LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc * RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting

6 KB (1,018 words) - 12:25, 4 December 2015
RAL Tier1 weekly operations castor 14/12/2015

* LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc * RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting

7 KB (1,141 words) - 15:00, 11 December 2015
RAL Tier1 weekly operations castor 15/01/2016

* Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo * LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc

7 KB (1,085 words) - 16:11, 18 January 2016
RAL Tier1 weekly operations CASTOR 18/01/2019

#REDIRECT [[RAL Tier1 weekly operations castor 18/01/2019]]

59 B (6 words) - 12:58, 8 February 2019
RAL Tier1 weekly operations castor 07/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (313 words) - 08:05, 14 July 2017
RAL Tier1 weekly operations castor 25/01/2016

...S tape no longer an issue, following disk server failure and test files in castor cache * Gfal-cat command failing for atlas reading of nsdumps from castor:

7 KB (1,203 words) - 17:47, 23 January 2016
RAL Tier1 weekly operations castor 01/02/2016

* Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo * LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc

4 KB (583 words) - 17:45, 29 January 2016
RAL Tier1 weekly operations castor 08/02/2016

* Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo * LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc

4 KB (565 words) - 10:29, 12 February 2016
RAL Tier1 weekly operations castor 15/02/2016

* castor 2.1.16 coming soon - SRM integration into CASTOR code base * Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo

4 KB (640 words) - 13:34, 17 February 2016
RAL Tier1 weekly operations castor 22/02/2016

* castor 2.1.15 update * castor 2.1.16 coming soon - SRM integration into CASTOR code base

4 KB (703 words) - 14:52, 19 February 2016
RAL Tier1 weekly operations castor 29/02/2016

* glibc updates applied, all CASTOR systems rebooted. initial issues with head nodes, 7 failed to reboot due t * CASTOR facilities patching scheduled for next week - detailed schedule to be agree

3 KB (557 words) - 11:32, 26 February 2016
RAL Tier1 weekly operations castor 07/03/2016

* glibc updates applied, all CASTOR systems rebooted. initial issues with head nodes, 7 failed to reboot due t * CASTOR facilities patching scheduled for next week - detailed schedule to be agree

4 KB (643 words) - 11:40, 4 March 2016
RAL Tier1 weekly operations castor 14/03/2016

* glibc updates applied, all CASTOR systems rebooted. initial issues with head nodes, 7 failed to reboot due t ...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t

5 KB (810 words) - 13:23, 14 March 2016
RAL Tier1 weekly operations castor 21/03/2016

...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t * Could not drain gdss702 (castor 2.1.15) in Preprod (all files failed according to draindiskserver -q) - doe

3 KB (493 words) - 16:23, 18 March 2016
RAL Tier1 weekly operations castor 01/04/2016

...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t * CASTOR 2.1.15

2 KB (380 words) - 16:05, 1 April 2016
RAL Tier1 weekly operations castor 08/04/2016

* CERN steered us not to move to SRM 2.1.14 before castor 2.1.15 ...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t

3 KB (459 words) - 10:47, 8 April 2016
RAL Tier1 weekly operations castor 15/04/2016

* 2.1.16 castor needs deployment to tape servers * CERN steered us not to move to SRM 2.1.14 before castor 2.1.15

3 KB (532 words) - 10:57, 15 April 2016
RAL Tier1 weekly operations castor 18/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (322 words) - 11:08, 21 November 2016
RAL Tier1 weekly operations castor 22/04/2016

7.Anything for CASTOR-Fabric? * gfalcat does not work with castor, underlying issue fixed for gfalcopy but not gfalcat (gfal developers respo

3 KB (429 words) - 09:34, 6 May 2016
RAL Tier1 weekly operations castor 13/05/2016

... There was a question as to how batch is turned back on, concerns swamping castor? * gfalcat does not work with castor, underlying issue fixed for gfalcopy but not gfalcat (gfal developers respo

3 KB (476 words) - 11:08, 13 May 2016
RAL Tier1 weekly operations castor 20/05/2016

New CASTOR functional testing using xrootd will be enabled on Monday 23/5/2016 == CASTOR issues ==

1 KB (237 words) - 13:03, 23 May 2016
RAL Tier1 weekly operations castor 27/05/2016

Automated workflow for disk server deployment has been disabled New CASTOR functional testing using xrootd will be enabled on Monday 23/5/2016 CASTOR issues

3 KB (466 words) - 12:06, 1 June 2016
RAL Tier1 weekly operations castor 03/06/2016

7.Anything for CASTOR-Fabric? 40 files in atlas scratch had zero size in CASTOR namespace, BD declare lost to Atlas

3 KB (522 words) - 09:30, 10 June 2016
RAL Tier1 weekly operations castor 17/06/2016

The CASTOR 2.1.15 upgrade seems to work apart from the part that deals with the SRM re BD to review outstanding RT tickets on CASTOR queue

3 KB (485 words) - 10:59, 24 June 2016
RAL Tier1 weekly operations castor 10/06/2016

Further progress has been made with CASTOR 2.1.15 upgrade BD and CP to find out about zero-sized files on CASTOR facilities

3 KB (477 words) - 11:58, 10 June 2016
RAL Tier1 weekly operations castor 24/06/2016

CASTOR will be replaced in CERN by 2022. Need to consider what will happen in RAL BD to review outstanding RT tickets on CASTOR queue

3 KB (602 words) - 12:30, 24 June 2016
RAL Tier1 weekly operations castor 01/07/2016

CASTOR will be replaced in CERN by 2022. Need to consider what will happen in RAL CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

4 KB (632 words) - 12:50, 8 July 2016
RAL Tier1 weekly operations castor 08/07/2016

CASTOR will be replaced in CERN by 2022. Need to consider what will happen in RAL CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

3 KB (614 words) - 13:12, 15 July 2016
RAL Tier1 weekly operations castor 12/08/2016

7. Anything for CASTOR-Fabric? The gridFTP problem in CASTOR 2.1.15 was fixed. Xroot remains to be fixed

1 KB (192 words) - 11:19, 12 August 2016
RAL Tier1 weekly operations castor 15/07/2016

Draining of gdss748 is complete. The server is out of castor and handed over to the fabric team to swap back drives with gdss755 ...upgrade continues liaising with CERN. Need to find the license under which CASTOR is distributed for the new users.

3 KB (482 words) - 13:25, 15 July 2016
RAL Tier1 weekly operations castor 22/07/2016

7. Anything for CASTOR-Fabric? CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

3 KB (438 words) - 09:33, 29 July 2016
RAL Tier1 weekly operations castor 29/07/2016

# Anything for CASTOR-Fabric? CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

2 KB (351 words) - 10:17, 12 August 2016
RAL Tier1 weekly operations castor 05/08/2016

All 9 new Dell tape-backed disk servers have been deployed into CASTOR Good progress has been made with the CASTOR 2.1.15 upgrade. The gridFTP transfer problem was fixed and a configuration

2 KB (377 words) - 09:28, 12 August 2016
RAL Tier1 weekly operations castor 26/08/2016

7. Anything for CASTOR-Fabric? Work on Castor 2.1.15 draining continues

1 KB (189 words) - 10:46, 26 August 2016
RAL Tier1 weekly operations castor 19/08/2016

7. Anything for CASTOR-Fabric? Work on Castor 2.1.15 draining continues

1 KB (144 words) - 09:00, 26 August 2016
RAL Tier1 weekly operations castor 05/04/2019

** Tape library for CASTOR-side testing in progress now * CASTOR metric reporting for GridPP

4 KB (562 words) - 09:54, 5 April 2019
RAL Tier1 weekly operations castor 09/09/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (346 words) - 11:09, 9 September 2016
RAL Tier1 weekly operations castor 23/09/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (186 words) - 15:04, 27 September 2016
RAL Tier1 weekly operations castor 30/09/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (312 words) - 09:05, 5 October 2016
RAL Tier1 weekly operations castor 07/10/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (188 words) - 09:25, 28 October 2016
RAL Tier1 weekly operations castor 14/10/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (230 words) - 09:31, 28 October 2016
RAL Tier1 weekly operations castor 28/10/2016

1. Castor 2.1.15 2. SL7 upgrade on tape servers 7. Anything for CASTOR-Fabric?

2 KB (255 words) - 10:01, 31 October 2016
RAL Tier1 weekly operations castor 04/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (314 words) - 12:34, 4 November 2016
RAL Tier1 weekly operations castor 11/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (311 words) - 12:43, 11 November 2016
RAL Tier1 weekly operations castor 02/12/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (330 words) - 16:34, 8 December 2016
RAL Tier1 weekly operations castor 25/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

3 KB (398 words) - 12:16, 25 November 2016
RAL Tier1 weekly operations castor 27/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (187 words) - 09:14, 3 February 2017
RAL Tier1 weekly operations castor 29/04/2016

Alice use farm (quite significant) but don't really use castor *2014 disk servers can be put into castor - poss cms .. for IO throughput

848 B (147 words) - 09:43, 8 December 2016
RAL Tier1 weekly operations castor 09/12/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

3 KB (458 words) - 09:16, 16 December 2016
RAL Tier1 weekly operations castor 16/12/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (312 words) - 14:05, 16 December 2016
RAL Tier1 weekly operations castor 09/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (218 words) - 14:21, 9 January 2017
RAL Tier1 CASTOR Experiments Completed Actions 2014

| 20131023-03 || Normal || ATLAS || Matthew || Report back about ATLAS CASTOR deletion problem after F2F discussion with developers || Closed. || 2014-01 | 20140827-02 || Normal || N/A || Rob || Report on plans for Castor 2.1.15 upgrade. || Done || 2014-10-28

2 KB (289 words) - 14:09, 21 December 2016
RAL Tier1 CASTOR Experiments Completed Actions 2015

...mal || CMS || Andrew L || Ensure the relevant people are looking into CMS CASTOR problems || Closed || 2015-02-11 ... || Normal || All || Rob Appleyard || Propagate information about upcoming CASTOR interventions || Done || 2015-08-26

1 KB (163 words) - 14:10, 21 December 2016
RAL Tier1 weekly operations castor 13/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (147 words) - 13:57, 17 January 2017
RAL Tier1 weekly operations castor 20/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (172 words) - 10:17, 25 January 2017
RAL Tier1 weekly operations castor 03/2/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (175 words) - 14:58, 9 February 2017
RAL Tier1 weekly operations castor 10/2/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 7. Anything for CASTOR-Fabric?

1 KB (216 words) - 12:18, 13 February 2017
RAL Tier1 weekly operations castor 17/2/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 7. Anything for CASTOR-Fabric?

1 KB (169 words) - 12:00, 17 February 2017
RAL Tier1 weekly operations castor 02/3/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (215 words) - 11:00, 6 March 2017
RAL Tier1 weekly operations castor 21/4/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (168 words) - 11:42, 26 April 2017
RAL Tier1 weekly operations castor 28/4/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (176 words) - 13:35, 3 May 2017
RAL Tier1 weekly operations castor 05/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (149 words) - 13:30, 11 May 2017
RAL Tier1 weekly operations castor 12/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (260 words) - 10:27, 12 May 2017
RAL Tier1 weekly operations castor 19/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (229 words) - 13:45, 25 May 2017
RAL Tier1 weekly operations castor 26/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (217 words) - 13:21, 31 May 2017
RAL Tier1 weekly operations castor 02/6/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (310 words) - 07:47, 8 June 2017
RAL Tier1 weekly operations castor 09/6/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (194 words) - 10:41, 15 June 2017
RAL Tier1 weekly operations castor 16/6/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (261 words) - 16:16, 19 June 2017
RAL Tier1 weekly operations castor 23/6/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

1 KB (175 words) - 08:18, 30 June 2017
RAL Tier1 weekly operations castor 30/6/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (360 words) - 14:43, 6 July 2017
RAL Tier1 weekly operations castor 14/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

3 KB (505 words) - 11:18, 14 July 2017
RAL Tier1 weekly operations castor 28/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (362 words) - 10:33, 28 July 2017
RAL Tier1 weekly operations castor 11/8/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (346 words) - 10:34, 11 August 2017
RAL Tier1 weekly operations castor 25/8/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (281 words) - 14:42, 25 August 2017
RAL Tier1 weekly operations castor 18/8/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (303 words) - 13:09, 18 August 2017
RAL Tier1 weekly operations castor 1/9/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (282 words) - 10:05, 4 September 2017
RAL Tier1 weekly operations castor 22/9/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (234 words) - 12:11, 22 September 2017
RAL Tier1 weekly operations castor 29/9/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

1 KB (214 words) - 15:18, 12 October 2017
RAL Tier1 weekly operations castor 06/10/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (230 words) - 09:28, 13 October 2017
RAL Tier1 weekly operations castor 13/10/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (251 words) - 16:12, 23 October 2017
RAL Tier1 weekly operations castor 20/10/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (230 words) - 15:00, 23 October 2017
RAL Tier1 weekly operations castor 27/10/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (268 words) - 15:05, 27 October 2017
RAL Tier1 weekly operations castor 03/11/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (276 words) - 17:19, 3 November 2017
RAL Tier1 weekly operations castor 10/11/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (317 words) - 13:14, 15 November 2017
RAL Tier1 weekly operations castor 17/11/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

3 KB (449 words) - 11:08, 17 November 2017
RAL Tier1 weekly operations castor 24/11/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (313 words) - 14:28, 1 December 2017
RAL Tier1 weekly operations castor 01/12/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (317 words) - 14:22, 1 December 2017
RAL Tier1 weekly operations castor 08/12/2017

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (261 words) - 10:39, 15 December 2017
RAL Tier1 weekly operations castor 02/03/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (348 words) - 14:44, 5 March 2018
RAL Tier1 weekly operations castor 05/01/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (263 words) - 11:42, 5 January 2018
RAL Tier1 CASTOR Experiments Completed Actions 2017

| 20151014-01 || Normal || LHCB || Rob Appleyard || LHCb Writes from WN to Castor failing || Closed || 2017-02-08 | 20170215-01 || Normal || LHCB || Raja || LHCb Writes from WN to Castor failing , and then also failing to other sites. Raja to follow up. || Close

2 KB (234 words) - 11:02, 8 January 2018
RAL Tier1 weekly operations castor 12/01/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (338 words) - 10:46, 12 January 2018
RAL Tier1 weekly operations castor 19/01/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (420 words) - 10:24, 26 January 2018
RAL Tier1 weekly operations castor 04/05/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (307 words) - 09:58, 4 May 2018
RAL Tier1 weekly operations castor 26/01/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (340 words) - 10:26, 29 January 2018
RAL Tier1 weekly operations castor 02/02/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (316 words) - 11:12, 2 February 2018
RAL Tier1 weekly operations castor 09/02/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (286 words) - 16:40, 9 February 2018
RAL Tier1 weekly operations castor 08/03/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (448 words) - 14:18, 8 March 2018
RAL Tier1 weekly operations castor 16/02/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (286 words) - 14:02, 16 February 2018
RAL Tier1 weekly operations castor 23/02/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (335 words) - 12:28, 23 February 2018
RAL Tier1 weekly operations castor 20/04/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (330 words) - 10:22, 20 April 2018
RAL Tier1 weekly operations castor 16/03/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (444 words) - 15:56, 16 March 2018
RAL Tier1 weekly operations castor 23/03/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (338 words) - 12:16, 23 March 2018
RAL Tier1 weekly operations castor 06/04/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (327 words) - 10:10, 6 April 2018
RAL Tier1 weekly operations castor 13/04/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (380 words) - 10:39, 13 April 2018
RAL Tier1 weekly operations castor 27/04/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (286 words) - 10:04, 27 April 2018
RAL Tier1 weekly operations castor 11/05/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (402 words) - 11:14, 11 May 2018
RAL Tier1 weekly operations castor 18/05/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (301 words) - 10:01, 18 May 2018
RAL Tier1 weekly operations castor 25/05/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (334 words) - 13:44, 25 May 2018
RAL Tier1 weekly operations castor 15/06/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (255 words) - 11:26, 15 June 2018
RAL Tier1 weekly operations castor 08/06/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (292 words) - 13:06, 8 June 2018
RAL Tier1 weekly operations castor 22/06/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (240 words) - 10:08, 22 June 2018
RAL Tier1 weekly operations castor 29/06/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (240 words) - 10:05, 29 June 2018
RAL Tier1 weekly operations castor 27/07/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (247 words) - 13:48, 27 July 2018
RAL Tier1 weekly operations castor 03/08/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (212 words) - 09:51, 3 August 2018
RAL Tier1 weekly operations castor 10/08/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (295 words) - 09:53, 10 August 2018
RAL Tier1 weekly operations castor 17/08/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (297 words) - 13:29, 17 August 2018
RAL Tier1 weekly operations castor 04/01/2019

* Oracle/kernel patching for CASTOR Facilities DB (23rd Jan) * Replacement of Facilities CASTOR d0t1 ingest nodes.

3 KB (417 words) - 11:01, 4 January 2019
RAL Tier1 weekly operations castor 24/08/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (278 words) - 09:51, 24 August 2018
RAL Tier1 weekly operations castor 07/09/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (221 words) - 15:15, 7 September 2018
RAL Tier1 weekly operations castor 14/09/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (451 words) - 11:27, 14 September 2018
RAL Tier1 weekly operations castor 12/10/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (259 words) - 10:06, 12 October 2018
RAL Tier1 weekly operations castor 05/10/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

2 KB (346 words) - 13:08, 5 October 2018
RAL Tier1 weekly operations castor 21/09/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (432 words) - 10:18, 21 September 2018
RAL Tier1 weekly operations castor 28/09/2018

2. CASTOR stress test improvement 3. Generic CASTOR headnode setup

3 KB (393 words) - 10:10, 28 September 2018
RAL Tier1 weekly operations castor 26/10/2018

Many checksum errors reported for files sent to CASTOR by Elastic tape for ingest (Kevin) All CASTOR d0t1 disk pools and atlasStripInput and cmsDisk patched

2 KB (212 words) - 10:22, 26 October 2018
RAL Tier1 weekly operations castor 19/10/2018

...doing: 1) Update of the CASTOR ldif file and 2) Xrootd functional test on castor-functional-test1 3) Fix a misconfiguration on the Eris disk array (cannot br

1 KB (162 words) - 09:52, 19 October 2018
RAL Tier1 weekly operations castor 30/11/2018

* Complete kernel patching on CASTOR hosts * Oracle/kernel patching for CASTOR Facilities DB

1 KB (163 words) - 10:28, 30 November 2018
RAL Tier1 weekly operations castor 02/11/2018

* castor-functional-test1 is still running tests that it shouldn't and called out on * New CASTOR WLCGTape instance. Things need doing: Create a separate xrootd redirector f

2 KB (221 words) - 16:27, 5 November 2018
RAL Tier1 weekly operations castor 09/11/2018

* New CASTOR WLCGTape instance. Things need doing: Create a separate xrootd redirector f

1 KB (144 words) - 10:59, 9 November 2018
RAL Tier1 weekly operations castor 16/11/2018

* New CASTOR WLCGTape instance. Things need doing: Create a separate xrootd redirector f * CASTOR disk server migration to Aquilon: gdss742 has been compiled with a draft aq

2 KB (243 words) - 14:54, 16 November 2018
RAL Tier1 weekly operations castor 23/11/2018

* Complete kernel patching on CASTOR hosts * Oracle/kernel patching for CASTOR Facilities DB

1 KB (164 words) - 10:54, 23 November 2018
RAL Tier1 weekly operations castor 14/12/2018

...o dbssql04. John has fixed the immediate problem but more may arise. Given CASTOR d1t0 is going away anyway, open questions: Can we retire this and do we nee * Oracle/kernel patching for CASTOR Facilities DB (January, precise date to be agreed with Martin)

2 KB (351 words) - 10:52, 14 December 2018
RAL Tier1 weekly operations castor 07/12/2018

* CMS migrated to the new CASTOR instance * Complete kernel patching on CASTOR hosts

1 KB (186 words) - 10:38, 7 December 2018
RAL Tier1 weekly operations castor 21/12/2018

* castor-stager01 failed on Wed evening causing some disruption in WLCGTape. Resolve * Oracle/kernel patching for CASTOR Facilities DB (January, precise date to be agreed with Martin)

3 KB (430 words) - 14:11, 21 December 2018
RAL Tier1 weekly operations castor 11/01/2019

* fdsdss55 and 56 have been handed over to the CASTOR team. * Kernel patching for CASTOR standbys on Tuesday 15th Jan

3 KB (486 words) - 10:33, 18 January 2019
RAL Tier1 weekly operations castor 18/01/2019

* Ready to go into production from a CASTOR perspective * Kernel patching for CASTOR standbys on Tuesday 15th Jan

3 KB (497 words) - 12:58, 8 February 2019
RAL Tier1 weekly operations castor 25/01/2019

* Ready to go into production from a CASTOR perspective * Oracle/kernel patching for CASTOR Facilities DB done

3 KB (459 words) - 10:48, 25 January 2019
RAL Tier1 weekly operations castor 28/06/2019

* CASTOR disk server migration to Aquilon. ** CASTOR team proposal is either:

4 KB (577 words) - 10:21, 28 June 2019
RAL Tier1 weekly operations castor 01/02/2019

* Again the issue of CASTOR marking unmounted tapes as BUSY. Migration backlog * Ready to go into production from a CASTOR perspective

3 KB (505 words) - 17:53, 1 February 2019
RAL Tier1 weekly operations castor 08/02/2019

** Ready to go into production from a CASTOR perspective * Examine further standardisation of CASTOR pool settings.

4 KB (592 words) - 16:10, 11 February 2019
RAL Tier1 weekly operations castor 15/02/2019

** Ready to go into production from a CASTOR perspective * Examine further standardisation of CASTOR pool settings.

4 KB (602 words) - 16:03, 18 February 2019
RAL Tier1 weekly operations castor 29/03/2019

** Tape library for CASTOR-side testing in progress now * CASTOR metric reporting for GridPP.

3 KB (531 words) - 11:00, 29 March 2019
RAL Tier1 weekly operations castor 22/02/2019

* LHCb currently have a problem reading some files on lhcbDst. The CASTOR team is investigating. ** Ready to go into production from a CASTOR perspective.

4 KB (570 words) - 12:04, 1 March 2019
RAL Tier1 weekly operations castor 01/03/2019

* LHCb currently have a problem reading some files on lhcbDst. The CASTOR team is investigating. ** Ready to go into production from a CASTOR perspective.

3 KB (515 words) - 12:04, 1 March 2019
RAL Tier1 weekly operations castor 08/03/2019

* Examine further standardisation of CASTOR pool settings. ** CASTOR team to generate a list of nonstandard settings and consider whether they a

3 KB (439 words) - 15:31, 11 March 2019
RAL Tier1 weekly operations castor 15/03/2019

** Expect it for CASTOR-side testing next week. * CASTOR metric reporting for GridPP.

3 KB (511 words) - 13:54, 15 March 2019
RAL Tier1 weekly operations castor 21/06/2019

*** CASTOR team asked for the machines to have the same Nagios config as Tier 1 headno ** Castor tape testing has started again for CEDA

3 KB (452 words) - 09:50, 21 June 2019
RAL Tier1 weekly operations castor 22/03/2019

** Tape library ready for CASTOR-side testing * CASTOR metric reporting for GridPP.

3 KB (436 words) - 10:48, 22 March 2019
RAL Tier1 weekly operations castor 10/05/2019

* Examine further standardisation of CASTOR pool settings. ** CASTOR team to generate a list of nonstandard settings and consider whether they a

3 KB (463 words) - 10:27, 10 May 2019
RAL Tier1 weekly operations castor 12/04/2019

* CASTOR metric reporting for GridPP ** Looking for clarity on precisely what metrics are relevant, and given CASTOR's changed role, what system RA should report on.

4 KB (573 words) - 16:39, 15 April 2019
RAL Tier1 weekly operations castor 26/04/2019

* Produced lots of stats on CASTOR ingest rates * Examine further standardisation of CASTOR pool settings.

3 KB (474 words) - 09:50, 3 May 2019
RAL Tier1 weekly operations castor 03/05/2019

* Migrated Facilities CASTOR from Juno to Bellona. * Examine further standardisation of CASTOR pool settings.

3 KB (431 words) - 14:17, 3 May 2019
RAL Tier1 weekly operations castor 17/05/2019

* Examine further standardisation of CASTOR pool settings. ** CASTOR team to generate a list of nonstandard settings and consider whether they a

3 KB (396 words) - 09:40, 17 May 2019
RAL Tier1 weekly operations castor 14/06/2019

* Migrated CASTOR gridmap-files generation away from castor-functional-test1 onto a system. * Castor tape testing to continue after the production tape robot networking is inst

3 KB (449 words) - 10:22, 14 June 2019
RAL Tier1 weekly operations castor 24/05/2019

* Examine further standardisation of CASTOR pool settings. ** CASTOR team to generate a list of nonstandard settings and consider whether they a

3 KB (469 words) - 09:49, 24 May 2019
RAL Tier1 weekly operations castor 31/05/2019

* Examine further standardisation of CASTOR pool settings. ** CASTOR team to generate a list of nonstandard settings and consider whether they a

3 KB (468 words) - 09:52, 31 May 2019
RAL Tier1 weekly operations castor 07/06/2019

* Last Thursday, a change to resolv.conf necessary to test the new Facilities CASTOR headnodes caused the production tape servers to go down. * Examine further standardisation of CASTOR pool settings.

3 KB (535 words) - 10:05, 7 June 2019
RAL Tier1 weekly operations castor 05/07/2019

* Sorting out personal proxy being used to support CASTOR functional test. * New CASTOR disk servers currently with Martin.

3 KB (441 words) - 14:44, 10 July 2019
RAL Tier1 weekly operations castor 12/07/2019

* Sorting out personal proxy being used to support CASTOR xrootd functional test. * New CASTOR disk servers currently with Martin.

3 KB (524 words) - 10:39, 12 July 2019
RAL Tier1 weekly operations castor 19/07/2019

* Facilities CASTOR DB (Bellona) has one RAC node out of production, being worked on by Fabric. ** This is like an old CASTOR bug we encountered where double-slashes would break transfers

3 KB (440 words) - 10:02, 19 July 2019
RAL Tier1 weekly operations castor 26/07/2019

** CASTOR downtime on Thursday due to this. 11-2. ** Minimum non-CASTOR staff needed for the intervention: Brian, Kevin.

2 KB (341 words) - 10:06, 26 July 2019
RAL Tier1 weekly operations castor 02/08/2019

* Decommissioned the LHCb CASTOR instance. * Upgraded the xrootd version on the ALICE CASTOR xrootd redirector to the 4.10.0-1

3 KB (377 words) - 10:02, 2 August 2019

Page text matches

Main Page

** [[Using Castor At RAL]]

8 KB (1,130 words) - 17:31, 17 April 2024
RAL Tier1 weekly operations castor 09/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Tier 1 CASTOR stopped and rebooted for Ghost vulnerability (and CIP)

3 KB (449 words) - 16:58, 6 February 2015
RAL Tier1 weekly operations castor 15/09/2014

...ing - need to investigate if a fix is already available, if not discuss at castor face to face * Break in connectivity Monday 8th, it appears that this did not affect castor internally in any way however if transfers were in process they would have

3 KB (404 words) - 15:14, 12 September 2014
RAL Tier1 weekly operations castor

[[Category:CASTOR]] == Tier1 Castor at RAL Weekly Operations ==

31 KB (3,178 words) - 09:34, 2 August 2019
RAL Tier1 weekly operations castor 17/03/2014

* CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few * Castor on Call person

1 KB (181 words) - 13:58, 17 March 2014
Tier1 Operations Report 2019-06-17

...; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Resolved Castor Disk Server Issues ...0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Ongoing Castor Disk Server Issues

14 KB (1,386 words) - 09:37, 19 June 2019
Tier1 Operations Report 2015-05-06

...en many zero-sized files created for Alice in Castor. This appears to be a Castor timeout affecting files that are written over a period of more than two hou * A start has been made on updating the Castor tape servers to SL6. (One server for each of the 'C' and 'D' drives was upd

16 KB (1,794 words) - 12:58, 6 May 2015
Operations Bulletin Latest

* LHCb Castor instance has been completely disabled for LHCb and will be decommissioned.

41 KB (5,018 words) - 14:09, 30 October 2019
Operations Bulletin 110416

...ind Castor. In the meantime we will carry out the (separate) update of the Castor SRMs to version 2.14. * "GEN Scratch" storage in Castor will be decommissioned.

40 KB (4,974 words) - 12:18, 11 April 2016
Tier1 Operations Report 2016-03-16

* We have uncovered a problem where draining of Castor disk servers is now going very slowly. We need to drain a few old servers t ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

13 KB (1,356 words) - 09:59, 16 March 2016
Tier1 Operations Report 2016-02-17

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The Castor repack instance was updated from version 2.1.14.13 to 2.1.14.15.

11 KB (1,098 words) - 09:42, 17 February 2016
Tier1 Operations Report 2015-12-09

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) ** This morning the link to the Castor headnodes was moved.

13 KB (1,411 words) - 08:55, 10 December 2015
Tier1 Operations Report 2014-06-25

...roblems with the special xroot configuration for Alice since the Castor 2.1.14 update on Tuesday (24th). These were resolved this morning (25th). ...e xroot settings were tuned. Significant improvements were made and the CMS Castor instance is now working OK but being closely monitored.

13 KB (1,342 words) - 16:20, 25 June 2014
Tier1 Operations Report 2014-05-07

* Problems with "CMSDisk" in Castor reported last week have been resolved. CMS deleted files freeing up space a | All Castor (SRM) endpoints

13 KB (1,357 words) - 12:47, 9 May 2014
Past Ticket Bulletins 2014

...ier 1 concerning not being able to get the gfal commands to work accessing Castor. Duncan has posted to the ticket that things are working for him now, along ... discussed recently in the Ops meeting, a conversation is ongoing with the Castor devs about this, but there wasn't much noise from them at last check. The t

184 KB (30,332 words) - 17:18, 16 December 2014
Tier1 Operations Report 2014-09-10

* A high rate of Atlas file access failures into/from Castor was seen during the day yesterday (9th Sep). A number of measures were take ... files. These are likely to be the results of partly failed transfers into Castor in the past. These are being checked and will be followed up with the appro

13 KB (1,367 words) - 13:30, 10 September 2014
Tier1 Operations Report 2018-07-09

...; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Resolved Castor Disk Server Issues ...0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Ongoing Castor Disk Server Issues

17 KB (1,646 words) - 09:31, 11 July 2018
Tier1 Operations Report 2017-01-04

* We have had load issues on the CMS Castor instance throughout the holiday period which has led to repeated SAM test f ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,476 words) - 14:02, 4 January 2017
RAL Tier1 CASTOR Experiments Completed Actions 2016

...transfers from one non-LHC VO from affecting others due to the use of a shared CASTOR instance - Able to set limits for each VO srm endpoint, need to decide and

1 KB (188 words) - 14:11, 21 December 2016
Tier1 Operations Report 2015-05-20

* Castor xroot performance problems seen by CMS - particularly in very long file ope * The Castor tape servers are being updated to SL6.

13 KB (1,442 words) - 11:25, 20 May 2015
RAL Tier1 CASTOR Experiments Completed Actions 2013

...313-01 || Medium || ATLAS || Alastair || Make sure ATLAS GGUS ticket about CASTOR problems affecting FTS is up-to-date || Closed || 2013-05-01

2 KB (219 words) - 09:28, 20 May 2015
Tier1 Operations Report 2014-03-19

* There have been problems with the CMS Castor instance through the last week. These are triggered by high load on CMS_Tap ...is significantly advanced and further investigations are ongoing using the Castor Preprod instance. Ideas for a workaround are being developed.

14 KB (1,553 words) - 11:36, 19 March 2014
Tier1 Operations Report 2016-01-20

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

13 KB (1,364 words) - 12:54, 20 January 2016
RAL Tier1 weekly operations castor 24/03/2014

* CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few * Castor on Call person

1 KB (164 words) - 15:18, 24 March 2014
RAL Tier1 weekly operations castor 21/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (333 words) - 10:24, 28 July 2017
Tier1 Operations Report 2017-07-26

* There have been problems with the Atlas Castor instance that appear to be within the SRM. AtlasScratch shows some high loa * Since yesterday morning there has been a problem with Castor file transfers for those transfers initiated by the CERN FTS3 service. This

18 KB (1,971 words) - 14:03, 26 July 2017
Operations Bulletin 090315

* Last Tuesday we moved a number of Castor disk servers physically within the machine room. This was required to make * The Castor Team plan to upgrade to version 2.1.14-15 ahead of migrating to the next ve

46 KB (5,846 words) - 07:57, 9 March 2015
RAL Tier1 weekly operations castor 31/03/2014

* CASTOR 2.1.14 Upgrade Progress - Reversion to 2.1.13-9 software and databases on p * (Tue 1 Apr) Facilities CASTOR Upgrade. Downtime between 0900-1600

2 KB (368 words) - 16:46, 28 March 2014
Operations Bulletin 310314

...the CMS Castor instance at the end of last week and the start of this. The Castor/Database teams have some ideas for the cause of this which looks to be loa * There have been problems with the CMS Castor instance caused by load issues through the disk cache in front of CMS_Tape.

48 KB (6,293 words) - 07:35, 31 March 2014
Operations Bulletin 240314

...the CMS Castor instance at the end of last week and the start of this. The Castor/Database teams have some ideas for the cause of this which looks to be loa * There have been problems with the CMS Castor instance caused by load issues through the disk cache in front of CMS_Tape.

48 KB (6,293 words) - 07:36, 31 March 2014
Tier1 Operations Report 2014-04-02

* There was a failover of an Atlas Castor Database early evening on Tuesday 25th March. The failover triggered a call * There have been problems with the CMS Castor instance in recent weeks. These are triggered by high load. Work is underwa

16 KB (1,769 words) - 14:16, 2 April 2014
RAL Tier1 weekly operations castor 07/04/2014

* Facilities CASTOR was successfully upgraded to 2.1.14-11 ...rian to discuss with Alastair. Other tier 1s are not keen but RAL tier 1 / castor should be able to cope with this.

1,019 B (149 words) - 13:21, 4 April 2014
Operations Bulletin 070414

* Load related problems with the CMS Castor instance have been ongoing. Plans to mitigate this are in place.

45 KB (5,701 words) - 09:21, 7 April 2014
Tier1 Operations Report 2014-04-16

* The load related problems reported for the CMS Castor instance have not been seen this last fortnight. However, work is underway ...is significantly advanced and further investigations are ongoing using the Castor Preprod instance. Ideas for a workaround are being developed.

13 KB (1,469 words) - 10:34, 16 April 2014
Tier1 Operations Report 2014-04-09

* The load related problems reported for the CMS Castor instance have not been seen this last week. However, work is underway to tac ...is significantly advanced and further investigations are ongoing using the Castor Preprod instance. Ideas for a workaround are being developed.

14 KB (1,599 words) - 11:33, 14 April 2014
RAL Tier1 weekly operations castor 14/04/2014

* The NN_FILE_STAGERTIME constraint has been removed for the Facilities CASTOR database, completing the 2.1.14 upgrade. This upgrade was thought to be tra * The xrootd timeout in castor.conf is now set to 30s for all nodes.

1 KB (221 words) - 10:09, 15 April 2014
Tier1 Operations Report 2014-04-23

* The load related problems reported for the CMS Castor instance have not been seen for a few weeks. However, work is underway to t ...is significantly advanced and further investigations are ongoing using the Castor Preprod instance. Ideas for a workaround are being developed.

13 KB (1,411 words) - 10:57, 23 April 2014
RAL Tier1 weekly operations castor 28/04/2014

* A new version of CASTOR 2.1.14 (2.1.14-12) has been released. This version makes no changes to the * CASTOR 2.1.14 upgrade for Tier 1.

1 KB (208 words) - 13:02, 25 April 2014
Tier1 Operations Report 2014-12-10

... continued through until the night of Thursday/Friday (4/5 December). With Castor very full there were very few disk servers available with any space on to r * CMS Castor headnodes were updated to SL6 on Tuesday 9th December and the Atlas ones th

14 KB (1,492 words) - 13:08, 10 December 2014
Tier1 Operations Report 2014-04-30

* There have been problems with "CMSDisk" in Castor caused by it becoming very full. * The load related problems reported for the CMS Castor instance have not been seen for a few weeks. However, work is underway to t

14 KB (1,557 words) - 13:24, 30 April 2014
RAL Tier1 weekly operations castor 05/05/2014

* CASTOR 2.1.14 upgrade for Tier 1. Possible date for first stage of intervention (N * CASTOR 2.1.14 for Tier 1

1 KB (161 words) - 15:56, 2 May 2014
RAL Tier1 weekly operations castor 12/05/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * CASTOR 2.1.14 upgrade for Tier 1. Possible date for first stage of intervention (N

2 KB (245 words) - 10:07, 13 May 2014
Operations Bulletin 120514

* In process of scheduling Castor 2.1.14 upgrade. * In process of scheduling Castor 2.1.14 upgrade. Proposed date for Nameserver upgrade: Wednesday 28th May.

37 KB (4,615 words) - 08:50, 12 May 2014
Tier1 Operations Report 2014-05-14

* Provisional dates for the Castor 2.1.14 upgrade delayed to: Nameserver: Tuesday 10th June; Stagers to follow * Castor:

13 KB (1,393 words) - 10:46, 14 May 2014
Tier1 Operations Report 2019-02-25

* Castor disk servers were physically moved to make room for new procurements. This * We have had two Castor disk server crashes since the move: gdss776 and gdss783, both lhcbDst disk se

17 KB (1,612 words) - 11:29, 27 February 2019
RAL Tier1 weekly operations castor 19/05/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * CASTOR 2.1.14 upgrade for Tier 1. First stage of intervention (NS upgrade) is book

2 KB (294 words) - 15:03, 19 May 2014
Operations Bulletin 190514

* LHCb: Incremental stripping campaign finished, all productions closed. CASTOR->EOS migration of LHCb user data finished. * In process of scheduling Castor 2.1.14 upgrade. (Now likely to be 10th June).

46 KB (6,091 words) - 11:47, 19 May 2014
Tier1 Operations Report 2014-05-21

* The checksum checker found a corrupt LHCb file in Castor which has been declared lost. * Provisional dates for the Castor 2.1.14 upgrade: Nameserver: Tuesday 10th June; Stagers: CMS- Tue 17th June;

14 KB (1,427 words) - 13:22, 21 May 2014
Tier1 Operations Report 2014-05-28

...rs were the OCF'12 batch, which are in AtlasDataDisk, CMSDisk and LHCbDst. Castor recovered OK from this. The network change itself was carried out to comple ...May) two network switches that provide connectivity to the 2012 batches of Castor disk servers were moved to the mesh network.

14 KB (1,452 words) - 12:18, 29 May 2014
RAL Tier1 weekly operations castor 02/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local ...n our issues was reported/fixed. These servers are now in acceptance test. Castor team will only deploy V13 servers to non prod until further notice.

2 KB (290 words) - 10:34, 30 May 2014
RAL Tier1 weekly operations castor 26/05/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local ...n our issues was reported/fixed. These servers are now in acceptance test. Castor team will only deploy V13 servers to non prod until further notice.

2 KB (276 words) - 13:46, 28 May 2014
Operations Bulletin 020614

* Castor 2.1.14 upgrade. Firming update of 10th June for nameserver with stagers CMS * Castor Nameserver 2.1.14 update on 10th June announced in GOC DB. Stager dates to

41 KB (5,148 words) - 09:38, 2 June 2014
Tier1 Operations Report 2015-12-02

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) | All Castor (All SRM endpoints)

13 KB (1,425 words) - 11:34, 2 December 2015
Operations Bulletin 090614

* Castor 2.1.14 upgrade. Firming update of 10th June for nameserver with stagers CMS * Castor Nameserver 2.1.14 update on 10th June announced in GOC DB. Stager dates to

41 KB (5,148 words) - 07:10, 9 June 2014
RAL Tier1 weekly operations castor 09/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local ...have been upgraded need further configurations (James) before releasing to castor team. V13 machines in production should have firmware update, best approach

2 KB (267 words) - 15:00, 9 June 2014
Tier1 Operations Report 2014-06-11

...k - D1T0) failed to restart after kernel/errata updates applied during the Castor update on 10th June. It was returned to production just before this meeting ...e firmware in some network switches and apply kernel/errata updates to the Castor disk servers.

15 KB (1,592 words) - 12:26, 11 June 2014
Operations Bulletin 020315

* A Castor nameserver box has been set up to enable queries against Castor metadata to be made without affecting the throughput of production work. * A system has been set up to provide Atlas with Castor information that is not supplied by the SRM.

42 KB (5,185 words) - 11:36, 2 March 2015
Tier1 Operations Report 2014-06-18

* The CMS Castor stager update to version 2.1.14-13 took place yesterday (Tuesday) as planne * Yesterday (17th June) the CMS Castor stager was updated to version 2.1.14-13.

12 KB (1,236 words) - 13:13, 18 June 2014
RAL Tier1 weekly operations castor 16/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * A partitioning alignment issue (3rd CASTOR partition) has been identified, proposal is to resolve this for new machine

3 KB (412 words) - 13:13, 13 June 2014
Operations Bulletin 160614

* Castor and batch services currently down for Castor Nameserver Upgrade (to version 2.1.14). If all goes well plan to upgrade sta * Castor Nameserver 2.1.14-13 updated successfully yesterday (10th June). Stager dat

39 KB (4,952 words) - 19:40, 13 June 2014
RAL Tier1 weekly operations castor 23/06/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * A partitioning alignment issue (3rd CASTOR partition) has been identified, proposal is to resolve this for new machine

3 KB (423 words) - 12:45, 20 June 2014
Operations Bulletin 230614

* Castor Nameserver Upgrade (to version 2.1.14) successful last week. CMS Stager upda * Castor CMS Stager 2.1.14-13 updated yesterday (17th June) although there were some

37 KB (4,591 words) - 09:54, 23 June 2014
RAL Tier1 weekly operations castor 30/06/2014

.... CERN provided a solution for SL5.9. We need to consider SL6 upgrade post CASTOR 2.1.14-13 upgrades. ...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local

2 KB (366 words) - 16:00, 27 June 2014
Operations Bulletin 300614

* Castor Stager Upgrade was carried out last week. 'GEN' stager update this morning. * Castor GEN Stager 2.1.14-13 updated yesterday (24th June). Some problems with xroo

35 KB (4,220 words) - 09:11, 30 June 2014
Tier1 Operations Report 2014-07-02

... SRM test failures during last week which were traced to load on the Atlas Castor system during searches for dark data. Less intrusive ways of carrying out t * We continue to monitor closely the performance of xroot access to CMS Castor following the upgrade on the 17th June. Performance is generally good altho

12 KB (1,212 words) - 10:15, 2 July 2014
WebDAV

|Castor |Test CASTOR WebDAV developed. Not production ready.

5 KB (692 words) - 08:28, 29 April 2016
RAL Tier1 weekly operations castor 07/07/2014

...een identified that may have contributed to the deletion problems on their CASTOR instance. However, the key test of running the ATLAS deletion scripts local * CMS db locking issue 3/7/14 early hours, resulted in lost CMS test file, castor current shows diskcopy_failed in stager logs. Proposal is to identify if th

2 KB (362 words) - 15:10, 12 August 2014
Tier1 Operations Report 2014-07-09

* There were problems with the SRM (not Castor) for the GEN instance on Thursday and Friday of last week (3/4 July). It wa * We are still investigating xroot access to CMS Castor following the upgrade on the 17th June.

11 KB (1,140 words) - 13:17, 9 July 2014
Operations Bulletin 070714

... was carried out successfully last Thursday. The final update is the Atlas Castor instance stager which is planned for the Atlas - Tue 1st July. The information publishing police have pointed out that the RAL Castor isn't publishing a sane version. Brian suspects a rogue ":" causing the pr

43 KB (5,584 words) - 12:52, 7 July 2014
RAL Tier1 weekly operations castor 14/07/2014

...ek with the task of investigating visualisation and querying solutions for CASTOR use. * CASTOR 2.1.14-13 upgrade for Repack - planned for Tuesday or Wednesday this week.

2 KB (308 words) - 13:48, 14 July 2014
Tier1 Operations Report 2014-07-16

* There have been recurring problems with the SRM processes for the castor GEN instance crashing since Friday (11th). This appears to be linked to a p * We are still investigating xroot access to CMS Castor following the upgrade on the 17th June.

13 KB (1,422 words) - 13:41, 16 July 2014
Tier1 Operations Report 2015-02-25

* Yesterday (Tuesday) there was an outage of part of Castor as some racks containing disk servers (the 2011 batches) were shut down whil ...a single file was reported lost to CMS. This file had been picked up by the Castor checksum checker.

12 KB (1,241 words) - 14:07, 25 February 2015
Operations Bulletin 210714

...e was carried out successfully last Tuesday (8th July). This completes the Castor 2.1.14 upgrades apart from some internal changes (E.g. the 'repack' instanc * All Castor instances have been updated to version 2.1.14-13. Some issues remain and ar

39 KB (4,936 words) - 09:03, 21 July 2014
RAL Tier1 weekly operations castor 21/07/2014

...on with the task of investigating visualisation and querying solutions for CASTOR use. * Incorrect service classes in castor.conf on disk servers, Atlas issues resolved by Rob. Other non production is

2 KB (318 words) - 09:07, 21 July 2014
Operations Bulletin 280714

* All Castor instances have been upgraded to version 2.1.14. The upgrade is complete apa

39 KB (4,833 words) - 10:09, 28 July 2014
Tier1 Operations Report 2014-07-23

* The recurring problems with the SRM processes for the castor GEN instance crashing has been solved. The problem started on Friday 11th J * On Thursday (17th) the Castor disk cache for AtlasTape filled up. This was traced to the garbage collecto

13 KB (1,382 words) - 13:24, 23 July 2014
RAL Tier1 weekly operations castor 28/07/2014

...on with the task of investigating visualisation and querying solutions for CASTOR use. * Facilities castor error

2 KB (262 words) - 15:46, 25 July 2014
Tier1 Operations Report 2014-07-30

...und (code to trap and fixup the mal-formed filename) was inserted into the Castor GEN instance. * There have been some problems with the Atlas SRM/Castor instance in the last couple of days that are under investigation.

13 KB (1,402 words) - 13:04, 30 July 2014
RAL Tier1 weekly operations castor 04/08/2014

* We have received word that a 2.1.14-15 version of CASTOR may be forthcoming. * Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs.

2 KB (279 words) - 16:48, 1 August 2014
Operations Bulletin 040814

* All Castor instances have been upgraded to version 2.1.14. The upgrade is complete inc ..._id=106655 GGUS 106655]. Cross-contamination of information due to the GEN-CASTOR SRMs sharing a database, and some VOs sharing service classes. In progress.

42 KB (5,191 words) - 14:37, 2 August 2014
Tier1 Operations Report 2014-08-06

...epancies were found in some of the Castor database tables and columns. The Castor team are considering options with regard to fixing these. The issue has no * There are problems with disk server draining for Atlas in Castor 2.1.14. This is under investigation.

12 KB (1,257 words) - 12:25, 6 August 2014
RAL Tier1 weekly operations castor 11/08/2014

* Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs. ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (300 words) - 11:01, 15 August 2014
Tier1 Operations Report 2014-08-13

...t triggered by attempts to use the disk server re-balancing feature now in Castor. ...oblems with disk server draining in Castor (and specifically for Atlas) in Castor 2.1.14. This is under investigation.

11 KB (1,152 words) - 10:26, 13 August 2014
RAL Tier1 weekly operations castor 18/08/2014

* Kashyap's Elasticsearch query script has been rolled out to CASTOR headnodes. Users are encouraged to test it and report any bugs. ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (313 words) - 10:40, 15 August 2014
Operations Team Completed Actions

...not part of RAL-LCG2. RAL APEL accounting of course includes both Echo and CASTOR jobs.

33 KB (5,297 words) - 10:13, 15 November 2017
Operations Bulletin 180814

* Ongoing investigations into problems with draining disk servers in Castor 2.1.14.

41 KB (5,255 words) - 08:52, 18 August 2014
Tier1 Operations Report 2019-12-11

| Decommission RAL's Castor Disk endpoint for ALICE

14 KB (1,558 words) - 11:37, 12 December 2019
RAL Tier1 weekly operations castor 25/08/2014

* passive draining produces file duplication - fixed in castor 2.1.14-14 * SL6 castor stalled due to resource limitations

2 KB (312 words) - 10:55, 22 August 2014
Operations Bulletin 250814

* Ongoing investigations into problems with draining disk servers in Castor 2.1.14.

42 KB (5,304 words) - 10:39, 25 August 2014
Tier1 Operations Report 2014-11-05

* Port opened up to allow external Castor WebDav access (requested by LHCb). * Castor:

12 KB (1,238 words) - 08:50, 11 November 2014
Tier1 Operations Report 2014-08-27

* Following some problems with disk server draining in Castor 2.1.14 a modified procedure has been tested on one disk server and been suc ...epancies were found in some of the Castor database tables and columns. The Castor team are considering options with regard to fixing these. The issue has no

14 KB (1,421 words) - 13:42, 27 August 2014
RAL Tier1 weekly operations castor 01/09/2014

* passive draining produces file duplication - fixed in castor 2.1.14-14 * SL6 castor stalled due to resource limitations & A/L

2 KB (338 words) - 15:06, 29 August 2014
Operations Bulletin 010914

* We have resumed draining disk servers after the Castor 2.1.14 upgrade. There were some problems with this that are now resolved.

42 KB (5,358 words) - 10:48, 1 September 2014
Tier1 Operations Report 2014-09-03

...t took some time to fix. Not all services were affected - the site (except Castor) was declared down for around 6 hours on Saturday. ...epancies were found in some of the Castor database tables and columns. The Castor team are considering options with regard to fixing these. The issue has no

13 KB (1,342 words) - 11:16, 3 September 2014
RAL Tier1 Incident 20140830 Network Related Problems

...ver. Storage (Castor) services were unaffected. The Tier1 site (apart from Castor storage) was declared down in the GOC DB for 5.5 hours from 09:00 on Saturd | Start of unscheduled 'outage' in the GOC DB for the whole Tier1 apart from Castor.

10 KB (1,628 words) - 09:31, 14 November 2014
RAL Tier1 weekly operations castor 08/09/2014

* Juan to patch castor dbs beginning of Nov PSU patches – standard change ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (279 words) - 12:37, 5 September 2014
Dirac on a vm at cern

BackendType = castor Path = /castor/cern.ch/grid/dteam/

8 KB (1,180 words) - 12:27, 1 February 2016
Tier1 Operations Report 2014-10-01

* There were problems with the Atlas Castor instance over the weekend which was linked to the draining of a disk server * Oracle patches (PSU) applied to the standby Neptune database (Castor Atlas & GEN) yesterday (Tuesday 30th Sep).

13 KB (1,429 words) - 10:06, 8 October 2014
RAL Tier1 Incident 20130628 Atlas Castor Outage

RAL Tier1 Incident 20130628 Atlas Castor Outage. Description: The ATLAS CASTOR instance encountered a problem where large numbers of invalid subrequests g

10 KB (1,594 words) - 10:56, 1 May 2015
Tier1 Operations Report 2014-09-17

* On Saturday (13th Sep) there was a problem with the Atlas Castor instance that persisted into the beginning of Sunday. A number of measures ** Apply latest Oracle patches (PSU) to the production database systems (Castor, LFC).

12 KB (1,195 words) - 14:07, 17 September 2014
RAL Tier1 weekly operations castor 22/09/2014

* Juan to patch castor dbs beginning of Nov PSU patches – standard change ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (297 words) - 16:23, 19 September 2014
Tier1 Operations Report 2014-09-24

* VO Pheno have been enabled to use Castor. ** Apply latest Oracle patches (PSU) to the production database systems (Castor, LFC).

12 KB (1,304 words) - 10:59, 24 September 2014
RAL Tier1 weekly operations castor 15/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * gdss659 is still but will be decommissioned out of CASTOR.

2 KB (383 words) - 10:53, 4 February 2015
RAL Tier1 weekly operations castor 29/09/2014

* useful breakout sessions at Castor face to face - deadlock analysis & bugs confirmed, discussions to simplify * Juan to patch castor dbs starting next week (PSU patches) – standard change

2 KB (274 words) - 15:25, 26 September 2014
Operations Bulletin 290914

* Access to Castor has been given to the Pheno VO.

49 KB (6,449 words) - 10:22, 29 September 2014
RAL Tier1 weekly operations castor 19/01/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * SL6 name server upgrade postponed due to castor team resource - likely to be this week

2 KB (366 words) - 15:20, 16 January 2015
Operations Bulletin 200317

* Castor SRM updates: Planning for LHCb SRMs to be the first to be updated on 22nd M

42 KB (5,079 words) - 18:37, 19 March 2017
RAL Tier1 weekly operations castor 06/10/2014

* useful breakout sessions at Castor face to face - deadlock analysis & bugs confirmed, discussions to simplify ...nt on gdss720. Server currently in read only and will revisit post current castor issues.

3 KB (479 words) - 09:41, 7 October 2014
Tier1 Operations Report 2014-10-08

...) was moved to the standby database system. This required an outage of the Castor Atlas and GEN instances which lasted around 2 hours. The standby is, under ...e declared lost to ALICE, both from AliceDisk. These were picked up by the Castor checksum checker.

15 KB (1,740 words) - 10:50, 15 October 2014
RAL Tier1 weekly operations castor 13/10/2014

* SL6 Headnode work progressing well - hoping for test in castor vcert next week ...h due to emc failure. Action: Add CIP into instructions for castor failover. Castor team decided to wait until dbs rolled back.

3 KB (479 words) - 16:36, 10 October 2014
Operations Bulletin 131014

... experiencing a problem with a disk array that holds the Castor databases. Castor performance may be degraded and we await an engineer to fix the faulty arra ...heck at RAL, and asking if things are alright. When looking over the whole Castor namespace it appears that all files are present and correct which doesn't e

46 KB (5,940 words) - 04:40, 13 October 2014
Suggestions for suitable hardware to run a Grid SE

|CASTOR |CASTOR

6 KB (928 words) - 16:45, 3 January 2019
Guide to Ganga

...f different file types for accessing Dirac files, Mass Storage Files (e.g. castor), LCG SE files and Google Drive files and all can accept wildcards.

15 KB (2,621 words) - 14:40, 27 May 2020
Tier1 Operations Report 2014-10-15

...y (Tuesday 14th October) there was a scheduled outage of the Atlas and GEN Castor instances while the database configuration was put back in its normal opera ** Apply latest Oracle patches (PSU) to the production database systems (Castor, LFC). (Underway).

14 KB (1,556 words) - 13:12, 15 October 2014
Tier1 Operations Report 2018-06-18

... | Resolved Castor Disk Server Issues ... | Ongoing Castor Disk Server Issues

16 KB (1,535 words) - 13:37, 20 June 2018
Tier1 Operations Report 2018-05-14

... | Resolved Castor Disk Server Issues ... | Ongoing Castor Disk Server Issues

16 KB (1,476 words) - 07:41, 16 May 2018
RAL Tier1 weekly operations castor 20/10/2014

* SL6 Headnode work progressing well - tested in vcert2, hoping for test in castor vcert next week and production end of Nov. * Successfully moved Castor atlas/gen stager/srm back to primary db following EMC cache battery replace

2 KB (378 words) - 12:09, 17 October 2014
Operations Bulletin 201014

.... The disk array has been fixed and this morning we have a downtime of the Castor Atlas and GEN instances to revert to the 'normal' database configuration. ...et to the first order (confirming that the files in question are indeed in castor), so the ticket could be solved - or at least CMS could be asked to see if

39 KB (4,711 words) - 03:39, 21 October 2014
Tier1 Operations Report 2014-10-22

... expected to be a transparent intervention. The 'Pluto' database hosts the Castor Nameserver as well as the CMS and LHCb stager databases. ** Apply latest Oracle patches (PSU) to the production database systems (Castor, LFC). (Underway).

14 KB (1,591 words) - 11:09, 22 October 2014
RAL Tier1 weekly operations castor 27/10/2014

* 2-1-14-14 castor upgrade priority dropped as we have a draining workaround. Revisit once SL6 ...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future.

2 KB (270 words) - 14:50, 27 October 2014
Tier1 Operations Report 2014-10-29

... expected to be a transparent intervention. The 'Pluto' database hosts the Castor Nameserver as well as the CMS and LHCb stager databases. ** Apply latest Oracle patches (PSU) to the production database systems (Castor, LFC). (Underway).

14 KB (1,569 words) - 13:13, 29 October 2014
Operations Bulletin 271014

* The intervention to put our Castor Oracle database configuration to its 'normal' state was completed OK last T

40 KB (4,976 words) - 10:25, 27 October 2014
RAL Tier1 weekly operations castor 3/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post christmas

2 KB (355 words) - 10:19, 3 November 2014
Operations Bulletin 031114

* The intervention to put our Castor Oracle database configuration to its 'normal' state was completed OK last T

42 KB (5,228 words) - 10:37, 4 November 2014
RAL Tier1 weekly operations castor 10/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

1 KB (226 words) - 13:51, 12 November 2014
Operations Bulletin 101114

...w castor reports read-only disk servers, Brian has put in a request to the Castor team for information on this. On hold (3/11) * Gfal-copy and castor issue

48 KB (6,138 words) - 09:19, 10 November 2014
Tier1 Operations Report 2014-11-12

* Some problems on Atlas Castor instance. At various times in the last couple of weeks the Atlas workload h * OS Errata rolled out to Castor GEN instance (headnodes & disk servers) this morning (12th Nov).

13 KB (1,376 words) - 15:37, 12 November 2014
Operations Bulletin 171114

* Gfal-copy and castor issue * Some problems on Atlas Castor instance. At various times in the last couple of weeks the Atlas workload h

39 KB (4,698 words) - 18:46, 16 November 2014
RAL Tier1 weekly operations castor 17/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

2 KB (267 words) - 14:41, 14 November 2014
Tier1 Operations Report 2014-11-19

* Some problems on Atlas Castor instance. At various times in the last couple of weeks the Atlas workload h * None, A proposed at risk to update errata on Atlas, CMS and LHCb castor instance machines was cancelled.

12 KB (1,302 words) - 14:14, 19 November 2014
RAL Tier1 weekly operations castor 24/11/2014

...inate a number of excess tables and other entities left over from previous CASTOR versions. This will be change-controlled in the near future. * Possible future upgrade to CASTOR 2.1.14-15 post Christmas

2 KB (265 words) - 14:14, 21 November 2014
Tier1 Operations Report 2014-11-26

...ause has been understood and a fix will be provided in a future version of Castor. * Some problems on Atlas Castor instance. At various times in recent weeks the Atlas workload has led to di

14 KB (1,484 words) - 15:28, 26 November 2014
RAL Tier1 weekly operations castor 01/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * gdss659 is still but will be decommissioned out of CASTOR.

2 KB (364 words) - 15:03, 2 December 2014
Operations Bulletin 011214

...M storage capacity mismatch. At last word Brian had submitted a request to Castor to find out how it reports read-only volumes. Any news? On Hold (3/11) * Gfal-copy and castor issue

41 KB (5,123 words) - 10:02, 1 December 2014
Tier1 Operations Report 2014-12-03

* CMS managed to fill up their Castor space allocation yesterday 2nd December. The VO was informed and they delet * Today CMS jobs have been overwhelming Castor with xrootd jobs. We hope to fix this by switching CMS to use RFIO and by r

13 KB (1,397 words) - 12:01, 3 December 2014
RAL Tier1 weekly operations castor 08/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * gdss659 is still but will be decommissioned out of CASTOR.

2 KB (331 words) - 10:53, 4 February 2015
Operations Bulletin 081214

* The headnodes of the LHCb instance of Castor are being upgraded to SL6 today 10:00-14:00. * Investigating problems on the CMS Castor instance.

50 KB (6,536 words) - 00:08, 7 December 2014
Operations Bulletin 151214

* The headnodes of the LHCb instance of Castor are being upgraded to SL6 today 10:00-14:00. * Investigating problems on the CMS Castor instance.

40 KB (4,992 words) - 19:01, 15 December 2014
Tier1 Operations Report 2015-01-21

* The problem with xroot access to the Castor GEN instance was fixed during the afternoon of Wednesday 14th Jan. (This pr ... were updated to SL6 on Monday 19th Jan. This operation was transparent to Castor users.

14 KB (1,514 words) - 17:14, 21 January 2015
Past Ticket Bulletins 2015

Castor not publishing glue2. Stephen Burke has offered a hand with the task. In pr ...z, with his mice (mouse?) hat on, cannot copy data from the Imperial SE to Castor via the FTS, but can do uploads/downloads using the regular tools. Early da

117 KB (18,736 words) - 11:05, 4 January 2016
Tier1 Operations Report 2014-12-17

* Completing Castor headnode upgrades to SL6: Tuesday 6th Jan - GEN; Wednesday 7th Jan - Namese * Castor:

14 KB (1,504 words) - 14:50, 17 December 2014
RAL Tier1 weekly operations castor 22/12/2014

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Kernel and errata upgrade on Castor SL6 headnodes (including reboot) - Tues 23rd 10:00 - 12:00

3 KB (386 words) - 11:33, 19 December 2014
Operations Bulletin 221214

* In addition upgrading the Castor Headnodes to SL6 needs to be completed:

44 KB (5,642 words) - 20:59, 21 December 2014
Operations Bulletin 291214

* In addition upgrading the Castor Headnodes to SL6 needs to be completed:

46 KB (5,865 words) - 23:34, 26 December 2014
Tier1 Operations Report 2015-01-07

* On Sunday 21st December there was a problem with the LHCb Castor instance when the transfer manager processes became unresponsive. These wer * On Tuesday 6th Jan the Castor GEN instance headnodes were successfully upgraded to SL6.

17 KB (1,780 words) - 12:56, 7 January 2015
RAL Tier1 weekly operations castor 11/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Testing CASTOR rebalancer on preproduction.

4 KB (574 words) - 15:27, 11 May 2015
Operations Bulletin 050115

...There is a 'warning' on the Tier1 Castor tomorrow (Wednesday 7th) for the Castor "nameserver" component to be likewise updated.

47 KB (5,942 words) - 15:24, 8 January 2015
RAL Tier1 weekly operations castor 12/01/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * SL6 name server upgrade postponed due to castor team resource - needs to be rescheduled

2 KB (368 words) - 13:40, 9 January 2015
Tier1 Operations Report 2015-01-14

* There is a problem with xroot access to the Castor GEN instance (not affecting ALICE). | All Castor (All SRM endpoints).

14 KB (1,559 words) - 10:52, 21 January 2015
Operations Bulletin 190115

...lace last week. This is being re-scheduled for next Monday (19th Jan). The Castor service will be At Risk during this work. * SL6 updates on Castor Nameservers announced for next Monday (19th Jan). (Castor services At Risk during this upgrade).

44 KB (5,523 words) - 22:20, 18 January 2015
RAL Tier1 weekly operations castor 26/01/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Redundant atlasHotdisk service class and disk pool from CASTOR

2 KB (358 words) - 14:00, 23 January 2015
Operations Bulletin 260115

* The delayed upgrade of the Castor Nameservers to SL6 took place successfully yesterday, Monday (19th Jan). * Some Castor updates done (SL6 updates on Nameservers; Addition of a third SRM node for L

44 KB (5,505 words) - 10:46, 26 January 2015
Tier1 Operations Report 2015-01-28

* On Monday (26th Jan) some redundant Castor stager schemas were cleaned up. | Castor CMS & GEN instances (srm-alice, srm-biomed, srm-cms, srm-cms-disk, srm-dtea

13 KB (1,340 words) - 12:21, 28 January 2015
RAL Tier1 weekly operations castor 02/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Facilities CASTOR patched for kernel/errata (not Ghost)

3 KB (502 words) - 14:28, 30 January 2015
Operations Bulletin 020215

* The delayed upgrade of the Castor Nameservers to SL6 took place successfully yesterday, Monday (19th Jan). * Kernel and errata updates (requiring a reboot) are being applied to Castor disk servers this week.

45 KB (5,609 words) - 09:48, 2 February 2015
Tier1 Operations Report 2015-02-04

...des within the RAC. After these were moved to their correct locations some Castor service restarts were carried out to ensure the connections to the database ...d not come back after being shutdown for a reboot during scheduled work on Castor on Monday (2nd Feb). It was found to have a faulty disk drive. After being

13 KB (1,435 words) - 11:48, 4 February 2015
Tier1 Operations Report 2015-02-18

* Castor: ** Fix discrepancies were found in some of the Castor database tables and columns. (The issue has no operational impact.)

12 KB (1,187 words) - 13:13, 18 February 2015
Operations Bulletin 090215

* Outage for reboot of Castor systems yesterday (to pick up latest OS patches).

46 KB (5,740 words) - 12:49, 8 February 2015
Tier1 Operations Report 2015-02-11

| All Castor (All SRMs) | Castor services At Risk during application of regular patches to back end database

13 KB (1,290 words) - 11:23, 11 February 2015
RAL Tier1 weekly operations castor 16/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * storageD retrieval from castor problems - investigation ongoing

3 KB (429 words) - 15:30, 16 February 2015
Operations Bulletin 160215

* Patching of Oracle databases behind Castor scheduled for Feb 11th, declared as WARNING in GOCDB.

48 KB (6,051 words) - 10:02, 16 February 2015
RAL Tier1 Experiments Liaison Meeting Presentations

...ridpp.ac.uk/tier1a/Tier1_Experiment_Liaisons_Meeting/CASTOR_Accounting.pdf CASTOR information provider and accounting] ...ww.gridpp.ac.uk/tier1a/Tier1_Experiment_Liaisons_Meeting/CASTOR_2.1.13.pdf CASTOR 2.1.13]

948 B (132 words) - 09:36, 25 February 2015
RAL Tier1 weekly operations castor 23/02/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * storageD retrieval from castor problems - investigation ongoing

3 KB (491 words) - 09:48, 25 February 2015
Operations Bulletin 240215

* Move of some racks containing Castor disk servers needed to make space for new deliveries expected to take plac

45 KB (5,650 words) - 15:12, 20 February 2015
Operations Bulletin 060415

** ALICE: High activity. VO boxes switching to RFC proxies. Some CERN CASTOR instabilities. * Castor update to 2.1.14-15 planned for next week.

43 KB (5,265 words) - 10:52, 6 April 2015
Tier1 Operations Reprot 2015-04-08

| All Castor (all SRM endpoints) | Upgrade of Castor storage to version 2.1.14-15

12 KB (1,197 words) - 08:36, 8 April 2015
Tier1 Operations Report 2015-03-04

...created a few weeks ago - which was the second occurrence of a rare bug in Castor. * Castor:

12 KB (1,175 words) - 14:56, 4 March 2015
Tier1 Operations Report 2015-03-11

* Update Castor to 2.1.14-15 (Proposed date - Wednesday 8th April). * Castor:

12 KB (1,242 words) - 14:15, 11 March 2015
RAL Tier1 weekly operations castor 09/03/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] ...while draining (had difficulties previously) - now back and draining final castor partition

3 KB (550 words) - 14:59, 9 March 2015
RAL Tier1 weekly operations castor 16/03/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] ...it current version - never seen by RAL. CERN have a workaround in place on castor 2.1.15

4 KB (574 words) - 12:14, 13 March 2015
Operations Bulletin 160315

* Proposed date for Castor upgrade to version 2.1.14-15 is the 8th April. (To be confirmed).

48 KB (6,074 words) - 10:36, 16 March 2015
Tier1 Operations Report 2015-03-18

* Five additional disk servers have been deployed into Castor storage for ALICE_Disk. This roughly doubles the disk space available to AL | All Castor (all SRMs)

12 KB (1,182 words) - 14:05, 18 March 2015
RAL Tier1 weekly operations castor 23/03/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * storageD retrieval from castor problems - investigation ongoing

3 KB (537 words) - 17:53, 20 March 2015
Operations Bulletin 230315

* Proposed date for Castor upgrade to version 2.1.14-15 is the 8th April. * Five additional disk servers have been deployed into Castor storage for ALICE_Disk. This roughly doubles the disk space available to AL

44 KB (5,438 words) - 08:48, 23 March 2015
Tier1 Operations Report 2015-04-01

| All Castor (all SRM endpoints) | Upgrade of Castor storage to version 2.1.14-15

13 KB (1,330 words) - 10:18, 1 April 2015
Operations Bulletin 300315

* Proposed date for Castor upgrade to version 2.1.14-15 is the 8th April. * Five additional disk servers have been deployed into Castor storage for ALICE_Disk. This roughly doubles the disk space available to AL

44 KB (5,487 words) - 08:53, 30 March 2015
Tier1 Operations Report 2015-04-08

| All Castor (all SRM endpoints) | Upgrade of Castor storage to version 2.1.14-15

12 KB (1,175 words) - 10:12, 8 April 2015
RAL Tier1 weekly operations castor 04/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Upgrade of CASTOR DBs to Oracle DB version 11.2.0.4 complete.

3 KB (514 words) - 16:13, 1 May 2015
RAL Tier1 Summary of Post Mortems

* UPS power for castor Disk Servers * Instigate regular check of Castor logs for tape access for this type of problem. (229)

8 KB (1,074 words) - 09:36, 18 September 2018
RAL Tier1 Incident 20150408 network intervention preceding Castor upgrade

==RAL-LCG2 Incident 20150408 network intervention preceding Castor upgrade== ... to resolve (and was not finally cleared until the following morning.) The Castor update had to be backed out and there were some problems in doing this.

15 KB (2,406 words) - 16:43, 17 August 2015
Operations Bulletin 130415

** T0 & T1 services: CERN CASTOR upgrades 6th and 7th April. T1s various SE upgrades. ** ALICE: High activity. Most VO Boxes now use RFC proxies. Various CERN CASTOR instabilities. CRL URL being firewalled for IPv6.

44 KB (5,515 words) - 13:09, 12 April 2015
RAL Tier1 weekly operations castor 27/04/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * testing CASTOR rebalancer (new version in 2.1.14-15)

3 KB (520 words) - 09:25, 1 May 2015
Tier1 Operations Report 2015-04-15

...e Tier1 network and caused significant disruption of services. The planned Castor upgrade for that day had to be abandoned. A post mortem is being generated * Castor was successfully upgraded to version 2.1.14-15 this morning.

13 KB (1,382 words) - 13:27, 15 April 2015
Operations Bulletin 200415

** T0 & T1 services: CERN CASTOR upgrades 6th and 7th April. T1s various SE upgrades. ** ALICE: High activity. Most VO Boxes now use RFC proxies. Various CERN CASTOR instabilities. CRL URL being firewalled for IPv6.

49 KB (6,238 words) - 08:07, 18 April 2015
RAL Tier1 weekly operations castor 20/04/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Tier 1 CASTOR 2.1.14-15 upgrade completed successfully

3 KB (542 words) - 13:38, 20 April 2015
RAL Tier1 Incident 20150417 ElasticTape truncation of input tarballs

Extensive loss/corruption of CEDA data stored in Facilities CASTOR instance ** This hardware is then deployed into CASTOR.

3 KB (435 words) - 13:57, 24 April 2015
Operations Bulletin 270415

** T0 & T1 services: CERN CASTOR upgrades 6th and 7th April. T1s various SE upgrades. ** ALICE: High activity. Most VO Boxes now use RFC proxies. Various CERN CASTOR instabilities. CRL URL being firewalled for IPv6.

43 KB (5,339 words) - 06:42, 27 April 2015
Operations Bulletin 040515

...E: New job record of 80k+. Taking advantage of resource availability. Some CASTOR file access instabilities for re-reco jobs. ...DIGI-RECO of Upgrade MC at T1s. At EOS unmerged area quota limit. Possible CASTOR connections saturation issue.

40 KB (4,831 words) - 21:51, 8 May 2015
Operations Bulletin 110515

...E: New job record of 80k+. Taking advantage of resource availability. Some CASTOR file access instabilities for re-reco jobs. ...DIGI-RECO of Upgrade MC at T1s. At EOS unmerged area quota limit. Possible CASTOR connections saturation issue.

40 KB (4,963 words) - 21:55, 8 May 2015
RAL Tier1 weekly operations castor 18/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Testing CASTOR rebalancer on preproduction, and developing associated tools.

4 KB (566 words) - 14:12, 15 May 2015
Tier1 Operations Report 2015-05-13

* Problems with Castor xroot response for CMS have been very acute. In particular the time require * Castor xroot performance problems seen by CMS - particularly in file open times.

14 KB (1,523 words) - 12:07, 13 May 2015
Operations Bulletin 180515

* ALICE: CASTOR at CERN - some re-reco job instabilities. * LHCb: Various operational issues reported - CASTOR/CERN SRM access problems; other data access issues.

46 KB (5,803 words) - 11:48, 16 May 2015
RAL Tier1 weekly operations castor 25/05/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] ...e are examining options for running this in a slow-and-steady fashion with CASTOR up.

4 KB (657 words) - 12:54, 22 May 2015
RAL Tier1 CASTOR Experiments Completed Actions 2012

... || Medium || || Andrew S || Discuss strategy for funding LSF in 2012 with CASTOR team || No longer necessary, since an LSF license has been purchased for th | 20120321-01 || Medium || ALICE || Lee, Shaun || Find out about the load on CASTOR from Japan || Closed. No longer relevant. || 2012-04-25

4 KB (566 words) - 09:26, 20 May 2015
Operations Bulletin 250515

* ALICE: CASTOR at CERN - some re-reco job instabilities. * LHCb: Various operational issues reported - CASTOR/CERN SRM access problems; other data access issues.

46 KB (5,732 words) - 18:32, 23 May 2015
Tier1 Operations Report 2015-05-27

...h an outage was added to the GOC DB - but removed before it became active. Castor was in a 'warning' state for some time. * Castor xroot performance problems seen by CMS - particularly in very long file ope

13 KB (1,470 words) - 11:04, 27 May 2015
RAL Tier1 weekly operations castor 01/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Mice (Castor Gen) will be operating overnight and able to call pri oncall

5 KB (830 words) - 15:06, 29 May 2015
Operations Bulletin 010615

* Tier-1 problems with secondary database system for Castor - resolved quickly. * T0 services: CASTOR updated to 2.1.15. SRM validation ongoing. xroot is the main access protoco

43 KB (5,271 words) - 22:18, 31 May 2015
Tier1 Operations Report 2015-06-10

* There have been some severe problems with the CMS Castor instance that are triggered by particular high loads. The amount of CMS wo * Castor xroot performance problems seen by CMS - particularly in very long file ope

16 KB (1,741 words) - 13:24, 10 June 2015
RAL Tier1 weekly operations castor 08/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * CASTOR rebalancing from Monday

6 KB (919 words) - 14:23, 5 June 2015
Operations Bulletin 080615

* Tier-1 problems with secondary database system for Castor - resolved quickly. * T0 services: CASTOR updated to 2.1.15. SRM validation ongoing. xroot is the main access protoco

43 KB (5,271 words) - 13:02, 6 June 2015
RAL Tier1 weekly operations castor 15/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * CASTOR Gen rebalancing underway

5 KB (750 words) - 11:09, 12 June 2015
Operations Bulletin 150615

* Tier-1 problems with secondary database system for Castor - resolved quickly. ...hts/summary of the Tier1 Monday operations meeting (Grid Services; Fabric; CASTOR and Other)

43 KB (5,391 words) - 15:50, 14 June 2015
RAL Tier1

! VO !! Batch !! CASTOR !! Echo !! LFC |[[castor|Castor]]|| [[RAL Tier1 CASTOR SRM]]

5 KB (623 words) - 08:48, 22 February 2021
Tier1 Operations Report 2015-08-26

...of migrations to tape for Atlas as the instance was seeing very high load. Castor caught up with the backlog overnight Thursday/Friday (20/21 Aug) - and ther ...ues for CMS. There is a problem with the Xroot (AAA) redirection accessing Castor and file open times using Xroot are slow. The poor batch job efficiencies h

14 KB (1,579 words) - 07:36, 27 August 2015
Tier1 Operations Report 2015-06-17

* The behaviour of the CMS Castor instance has been improved since a change made last week. * A problem with a couple of entries in the Castor gridmap file was flagged up by LHCb. This was chased to the relevant updati

15 KB (1,692 words) - 12:10, 17 June 2015
RAL Tier1 weekly operations castor 22/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Facilities CASTOR - change to time to write to tape from 30 mins to 5 mins now

5 KB (799 words) - 09:33, 22 June 2015
Operations Bulletin 220615

* Tier-1 problems with secondary database system for Castor - resolved quickly. ...hts/summary of the Tier1 Monday operations meeting (Grid Services; Fabric; CASTOR and Other)

45 KB (5,632 words) - 13:32, 21 June 2015
RAL Tier1 weekly operations castor 06/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Juno (CASTOR Facilities) Oracle update to 11.2.0.4

6 KB (974 words) - 16:10, 3 July 2015
Tier1 Operations Report 2015-07-01

... week we reported that the change had already been made for the LHCb & CMS Castor instances.) ...ersion of xrootd (upgraded from version 3.3.3 to 3.3.6) was applied to the Castor GEN instance to enable 3rd party transfers for Alice.

13 KB (1,370 words) - 13:22, 1 July 2015
Tier1 Operations Report 2015-06-24

* As reported at the last meeting. AtlasDataDisk in Castor became full on the morning of Wed 17th June. Four additional disk servers w ... this for some years). Stored checksums have been retrospectively added to Castor for these cases.

15 KB (1,738 words) - 13:30, 24 June 2015
RAL Tier1 weekly operations castor 29/06/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Change to improve file open times on CASTOR (central db, subrequest todo procedure) - has now been deployed to LHCb and

6 KB (938 words) - 12:34, 1 July 2015
Operations Bulletin 290615

* ALICE: High activity. CASTOR issue with xrd3cp. Request sites to plan for Xrootd v4.1. ...hts/summary of the Tier1 Monday operations meeting (Grid Services; Fabric; CASTOR and Other)

45 KB (5,792 words) - 21:55, 28 June 2015
Operations Bulletin 060715

* ALICE: High activity. CASTOR issue with xrd3cp. Request sites to plan for Xrootd v4.1. ...hts/summary of the Tier1 Monday operations meeting (Grid Services; Fabric; CASTOR and Other)

45 KB (5,742 words) - 21:15, 5 July 2015
Tier1 Operations Report 2015-07-15

...f a 'wait' to study the effect on file open times. Initial change for LHCb Castor instance this morning (Wed 15th July). ... Oracle 11.2.0.4. This will affect all services that use Oracle databases: Castor, Atlas Frontier (LFC done)

12 KB (1,261 words) - 11:39, 15 July 2015
Tier1 Operations Report 2015-07-08

* A test of using the Castor re-balancer more aggressively was made. However, this showed some internal ...Wednesday afternoon (1st July). This change has now been rolled out to all Castor instances.

13 KB (1,329 words) - 09:31, 8 July 2015
RAL Tier1 weekly operations castor 13/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Proposed CASTOR face to face W/C Oct 5th or 12th

6 KB (1,039 words) - 08:28, 14 July 2015
RAL Tier1 weekly operations castor 20/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Proposed CASTOR face to face W/C Oct 5th or 12th

3 KB (509 words) - 11:07, 24 July 2015
Operations Bulletin 200715

...i evening Jun 26 caused all VOBOXes for CERN to become unreachable; latest CASTOR versions are incompatible with old xrd3cp implementation. ...hts/summary of the Tier1 Monday operations meeting (Grid Services; Fabric; CASTOR and Other)

46 KB (5,777 words) - 09:41, 20 July 2015
Tier1 Operations Report 2015-07-22

* A race condition has been uncovered in Castor whereby some stored files do not have the correct location on disk recorded ...as creating multiple bad copies. This is not a problem seen by the VOs (as Castor will still use the original good copy) - but it needs to be resolved.

13 KB (1,372 words) - 11:46, 22 July 2015
RAL Tier1 weekly operations castor 27/07/2015

[https://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor List of CASTOR meetings] * Proposed CASTOR face to face W/C Oct 5th or 12th

3 KB (535 words) - 15:22, 27 July 2015
Tier1 Operations Report 2015-07-29

...es for CMS. These are a problem with the Xroot (AAA) redirection accessing Castor; Slow file open times using Xroot; and poor batch job efficiencies. ... Oracle 11.2.0.4. This will affect all services that use Oracle databases: Castor, Atlas Frontier (LFC done)

13 KB (1,341 words) - 12:17, 29 July 2015
RAL Tier1 weekly operations castor 24/08/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

3 KB (488 words) - 11:29, 21 August 2015
RAL Tier1 weekly operations castor 03/08/2015

* Proposed CASTOR face to face W/C Oct 5th or 12th * Upgrade CASTOR disk servers to SL6

3 KB (569 words) - 15:00, 3 August 2015
Tier1 Operations Report 2015-08-12

...es for CMS. These are a problem with the Xroot (AAA) redirection accessing Castor; Slow file open times using Xroot; and poor batch job efficiencies. * Investigative work into the ongoing issues for CMS Castor. We have now changed the I/O scheduler on the disk servers.

13 KB (1,380 words) - 13:20, 12 August 2015
Tier1 Operations Report 2015-08-19

...es for CMS. These are a problem with the Xroot (AAA) redirection accessing Castor; Slow file open times using Xroot; and poor batch job efficiencies. The cha * Upgrade of Castor disk servers to SL6. We plan to do this for the D1T0 Service Classes on the

14 KB (1,580 words) - 13:55, 19 August 2015
SL5 status

| Castor SRM systems. Small number of internal service machines. Database systems on

2 KB (318 words) - 15:36, 29 March 2016
Tier1 Operations Report 2015-08-05

...es for CMS. These are a problem with the Xroot (AAA) redirection accessing Castor; Slow file open times using Xroot; and poor batch job efficiencies. ...sues for CMS Castor. This included putting the CMS xroot reads through the Castor scheduler again.

15 KB (1,588 words) - 12:05, 5 August 2015
RAL Tier1 weekly operations castor 10/08/2015

* Upgrade CASTOR disk servers to SL6 * Proposed CASTOR face to face W/C Oct 5th or 12th

3 KB (539 words) - 14:09, 7 August 2015
Operations Bulletin 100815

* ALICE: High activity. Raw data copies in CASTOR timing out at CERN and job submissions have become slow a few times. NDGF r ...ading Castor disk servers OS during second half of August and updating the Castor Oracle database during September.

52 KB (6,730 words) - 22:58, 9 August 2015
Operations Bulletin 030815

* ALICE: High activity. Raw data copies in CASTOR timing out at CERN and job submissions have become slow a few times. NDGF r ...ading Castor disk servers OS during second half of August and updating the Castor Oracle database during September.

52 KB (6,730 words) - 23:00, 9 August 2015
RAL Tier1 weekly operations castor 17/08/2015

* Upgrade CASTOR disk servers to SL6 * Proposed CASTOR face to face W/C Oct 5th or 12th

2 KB (336 words) - 13:26, 14 August 2015
Operations Bulletin 170815

* ALICE: High activity. Raw data copies in CASTOR timing out at CERN and job submissions have become slow a few times. NDGF r ...ading Castor disk servers OS during second half of August and updating the Castor Oracle database during September.

48 KB (6,103 words) - 23:03, 16 August 2015
Operations Bulletin 240815

* ALICE: High activity. Raw data copies in CASTOR timing out at CERN and job submissions have become slow a few times. NDGF r ...p (provisionally for 26/27 August). Also finalizing plans for updating the Castor Oracle database during September/October. (See dates in report linked just

45 KB (5,578 words) - 19:59, 22 August 2015
RAL Tier1 weekly operations castor 31/08/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

4 KB (596 words) - 10:39, 28 August 2015
Operations Bulletin 310815

* CERN is going to disable write operations via RFIO v2 to Castor in a few weeks, in the context of the RFIO access decommission. * The Castor disk servers are being upgraded to SL6 (26/27 Aug).

41 KB (5,000 words) - 04:11, 1 September 2015
Tier1 Operations Report 2015-09-02

* Following the upgrade of Castor disk servers in the disk-only service classes on Wednesday and Thursday las ...ues for CMS. There is a problem with the Xroot (AAA) redirection accessing Castor - although this has been much better understood in the last week. The poor

14 KB (1,541 words) - 11:10, 2 September 2015
RAL Tier1 weekly operations castor 07/09/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

4 KB (617 words) - 10:35, 4 September 2015
Operations Bulletin 070915

* CERN is going to disable write operations via RFIO v2 to Castor in a few weeks, in the context of the RFIO access decommission. * The Castor disk servers are being upgraded to SL6 (26/27 Aug).

43 KB (5,351 words) - 16:49, 6 September 2015
Operations Bulletin 140915

* CERN is going to disable write operations via RFIO v2 to Castor in a few weeks, in the context of the RFIO access decommission. * The Castor disk servers are being upgraded to SL6 (26/27 Aug).

44 KB (5,604 words) - 10:22, 15 September 2015
Tier1 Operations Report 2015-09-16

* At the end of last week there was a problem with the Atlas Castor instance - which a few times became unresponsive for some tens of minutes. ...es). A new server has been provided for the CMS Xroot (AAA) redirection to Castor but the problems remain. File open times using Xroot remain slow but this i

15 KB (1,569 words) - 12:24, 16 September 2015
RAL Tier1 weekly operations castor 21/09/2015

** all VOs / all castor disks * Upgrade CASTOR disk servers to SL6

4 KB (651 words) - 10:23, 18 September 2015
Operations Bulletin 210915

* CERN is going to disable write operations via RFIO v2 to Castor in a few weeks, in the context of the RFIO access decommission. * The first step of the update of the Oracle databases behind Castor was made on Tuesday 15th. There are further steps to do - as announced in t

44 KB (5,552 words) - 22:25, 19 September 2015
Tier1 Operations Report 2015-09-23

...tember). This has re-enabled the Oracle DataGuard copy for the Atlas & GEN Castor stager databases. * Updating the first batch of the remaining Castor disk servers (those in tape-backed service classes) to SL6. This will be do

14 KB (1,604 words) - 12:01, 23 September 2015
Operations Bulletin 290915

* The first step of the update of the Oracle databases behind Castor was made on Tuesday 15th. There are further steps to do - as announced in t

45 KB (5,699 words) - 08:31, 28 September 2015
Tier1 Operations Report 2015-09-30

...ed a current problem that sometimes occurs when writing these files to our Castor storage. | All Castor

13 KB (1,415 words) - 12:15, 30 September 2015
RAL Tier1 weekly operations castor 02/10/2015

...d from castor and back to fabric to gather spares cv11 spec – no further castor action. ** all VOs / all castor disks

5 KB (886 words) - 10:45, 2 October 2015
Tier1 Operations Report 2015-10-14

* We have been seeing problems with very high load on the AtlasTape Castor instance. This was leading to a big backlog of files waiting to go to tape ...ed a current problem that sometimes occurs when writing these files to our Castor storage.

13 KB (1,416 words) - 09:37, 14 October 2015
Operations Bulletin 191015

...o version 11.2.0.4 is taking place '''today'''. At the time of the meeting Castor is down.

46 KB (5,818 words) - 10:21, 19 October 2015
Operations Bulletin 121015

...o version 11.2.0.4 is taking place '''today'''. At the time of the meeting Castor is down. This is the upgrade of the "Pluto" database which hosts the Namese

52 KB (6,786 words) - 16:22, 12 October 2015
Tier1 Operations Report 2015-10-07

...ed a current problem that sometimes occurs when writing these files to our Castor storage. ...roblem on this server (failed battery on the RAID card) during yesterday's Castor outage uncovered a further problem and the server has been kept out of serv

14 KB (1,524 words) - 14:55, 8 October 2015
RAL Tier1 weekly operations castor 09/10/2015

* CASTOR 2.1.15 * Proposed CASTOR face to face W/C Oct 5th or 12th

4 KB (637 words) - 12:47, 9 October 2015
RAL Tier1 weekly operations castor 16/10/2015

* The checksum issue/tickets still present. These are thought to be due to a CASTOR bug fixed in 2.1.15. * CASTOR 2.1.15

2 KB (401 words) - 13:06, 16 October 2015
RAL Tier1 weekly operations castor 16/11/2015

* RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting * Disk servers name lookup issue (CV11's) - more system than CASTOR. Currently holding CV11 upgrades until understood.

3 KB (478 words) - 16:01, 16 November 2015
Tier1 Operations Report 2015-10-21

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Upgrade of remaining Castor disk servers (those in tape-backed service classes) to SL6. This will be tr

12 KB (1,299 words) - 11:10, 21 October 2015
Operations Bulletin 261015

* The final step in the upgrade of the Castor Oracle databases to version 11.2.0.4 took place successfully last Wednesday ... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and they sometimes fail to write remotely as well).

47 KB (6,004 words) - 20:08, 25 October 2015
RAL Tier1 weekly operations castor 23/10/2015

* CASTOR 2.1.15 == Issues to bring up at CASTOR F2F ==

2 KB (345 words) - 15:23, 27 October 2015
Tier1 Operations Report 2015-10-28

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Upgrade of remaining Castor disk servers (those in tape-backed service classes) to SL6. This will be tr

12 KB (1,221 words) - 12:58, 28 October 2015
Operations Bulletin 021115

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and they sometimes fail to write remotely as well).

48 KB (6,098 words) - 09:45, 2 November 2015
Tier1 Operations Report 2015-11-11

* We are investigating a problem that has affected some Castor disk servers' ability to make DNS lookups. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

12 KB (1,234 words) - 14:05, 11 November 2015
RAL Tier1 weekly operations castor 30/10/2015

Castor ops 23/10/15 11-2-04 client updates – 2.1.15 prerequisite … has to go on castor headnodes

795 B (124 words) - 09:53, 9 November 2015
Tier1 Operations Report 2015-11-04

* We again saw high load on the AtlasTape Castor instance - exacerbated by the failure of some disk servers in the cache in ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

12 KB (1,253 words) - 10:56, 4 November 2015
RAL Tier1 weekly operations castor 09/11/2015

* RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting * CASTOR 2.1.15

2 KB (374 words) - 17:37, 6 November 2015
Operations Bulletin 091115

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and they sometimes fail to write remotely as well).

54 KB (7,032 words) - 12:38, 8 November 2015
RAL Tier1 weekly operations castor 30/11/2015

* LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc * RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting

5 KB (850 words) - 11:33, 27 November 2015
Operations Bulletin 161115

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and they sometimes fail to write remotely as well).

48 KB (6,095 words) - 12:37, 16 November 2015
RAL Tier1 weekly operations castor 23/11/2015

• GS/RA to revisit the CASTOR decommissioning process in light of the production team updates to their de • JJ – Glue 2 for CASTOR, something to do with publishing information??? Not sure there was a speci

2 KB (306 words) - 16:22, 24 November 2015
Tier1 Operations Report 2015-11-18

* We are investigating a problem that has affected some Castor disk servers' ability to make name (DNS & NIS) lookups. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

12 KB (1,287 words) - 09:15, 18 November 2015
Tier1 Operations Report 2015-11-25

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Upgrade of remaining Castor disk servers (those in tape-backed service classes) to SL6. This will be tr

12 KB (1,218 words) - 09:50, 25 November 2015
Operations Bulletin 301115

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp

48 KB (6,163 words) - 16:24, 29 November 2015
RAL Tier1 weekly operations castor 07/12/2015

* LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc * RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting

6 KB (1,018 words) - 12:25, 4 December 2015
Operations Bulletin 071215

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp

47 KB (5,899 words) - 20:49, 6 December 2015
Operations Bulletin 110116

...em during the day on Monday 4th January with timeouts occurring within the Castor system. This problem has now disappeared but we do not yet understand the c ... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp

47 KB (5,834 words) - 10:11, 11 January 2016
RAL Tier1 weekly operations castor 14/12/2015

* LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc * RA, SdW, GTF and AS have been to CERN for a CASTOR face-to-face meeting

7 KB (1,141 words) - 15:00, 11 December 2015
Operations Bulletin 141215

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp ...our external data path) this morning. There is another change that affects Castor tomorrow morning and a 'warning' has been declared in the GOC DB.

44 KB (5,454 words) - 14:43, 17 December 2015
Operations Bulletin 211215

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp Castor not publishing glue2. Stephen Burke has offered a hand with the task. In pr

53 KB (6,852 words) - 11:31, 21 December 2015
Tier1 Operations Report 2016-01-06

...ing the day on Monday 4th January when there were internal problems within Castor. The problem lasted roughly the length of the working day but is not yet un ...sDataDisk - D1T0) Was taken out of service on the 4th January - one of the Castor processes was failing to run. Following investigations it was returned to p

16 KB (1,824 words) - 12:29, 6 January 2016
Operations Bulletin 281215

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp Castor not publishing glue2. Stephen Burke has offered a hand with the task. In pr

53 KB (6,920 words) - 15:52, 4 January 2016
Deployment Team Completed Actions

| Speak to Andrew about the acceptance testing of CASTOR | Check CASTOR sticky bit support

68 KB (11,032 words) - 13:08, 16 September 2016
Tier1 Operations Report 2015-12-23

...ne of the disk servers in the disk cache and a parameter introduced in the Castor 2.1.15 tape servers that delayed the reporting of when files were read from ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

15 KB (1,710 words) - 11:39, 23 December 2015
Past Ticket Bulletins 2016

Castor not publishing glue2 - Jens updated in October with word that available eff Glue 2 publishing for Castor. Jens provided an update last month (thanks Jens!), citing the lack of reso

150 KB (23,740 words) - 12:54, 9 January 2017
Tier1 Operations Report 2016-01-27

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

13 KB (1,350 words) - 10:02, 27 January 2016
Tier1 Operations Report 2016-01-13

...noop') which it is planned to rollout everywhere, and adjusting one of the Castor parameters. Some throttling in the FTS was also done. The CMS SRM SAM test ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

13 KB (1,458 words) - 14:30, 13 January 2016
RAL Tier1 weekly operations castor 15/01/2016

* Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo * LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc

7 KB (1,085 words) - 16:11, 18 January 2016
Operations Bulletin 180116

...em during the day on Monday 4th January with timeouts occurring within the Castor system. This problem has now disappeared but we do not yet understand the c ... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp

51 KB (6,516 words) - 23:59, 18 January 2016
Tier1 Operations Report 2016-03-02

* There was a problem in the "CIP" machine that inputs Castor information to the BDIIs. The standby CIP was used between Thursday (25th F ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

11 KB (1,138 words) - 08:33, 2 March 2016
RAL Tier1 weekly operations CASTOR 18/01/2019

#REDIRECT [[RAL Tier1 weekly operations castor 18/01/2019]]

59 B (6 words) - 12:58, 8 February 2019
RAL Tier1 weekly operations castor 07/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (313 words) - 08:05, 14 July 2017
Operations Bulletin 281116

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January. * A problem on the Castor 'GEN' instance started on Monday (21st Nov) following the renewal of the ho

46 KB (5,834 words) - 12:12, 28 November 2016
RAL Tier1 weekly operations castor 25/01/2016

...S tape no longer an issue, following disk server failure and test files in castor cache * Gfal-cat command failing for atlas reading of nsdumps from castor:

7 KB (1,203 words) - 17:47, 23 January 2016
RAL Tier1 weekly operations castor 01/02/2016

* Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo * LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc

4 KB (583 words) - 17:45, 29 January 2016
Operations Bulletin 010216

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp

47 KB (5,867 words) - 21:18, 31 January 2016
Tier1 Operations Report 2016-02-10

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Update the Castor repack instance from version 2.1.14.13 to 2.1.14.15. (Proposed for 10/2/16)

15 KB (1,664 words) - 09:48, 10 February 2016
RAL Tier1 weekly operations castor 08/02/2016

* Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo * LHCb batch jobs failing to copy results into castor - changes made seem to have improved the situation but not fixed it (Raja). Inc

4 KB (565 words) - 10:29, 12 February 2016
RAL Tier1 weekly operations castor 15/02/2016

* castor 2.1.16 coming soon - SRM integration into CASTOR code base * Gfal-cat command failing for atlas reading of nsdumps from castor: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers lo

4 KB (640 words) - 13:34, 17 February 2016
Operations Bulletin 080216

... investigating why LHCb batch jobs sometimes fail to write results back to Castor (and sometimes fail to write remotely as well). A recent change has imp

46 KB (5,812 words) - 23:00, 8 February 2016
Operations Bulletin 150216

CMS AAA tests failing. Andrew L reports that the CASTOR headnode has received what sounds like a big fix which will hopefully impro Another publishing ticket. How we love those! This one about CASTOR not publishing GLUE 2. Code was written by Jens and Rob but not integrated,

54 KB (7,071 words) - 09:10, 15 February 2016
RAL Tier1 weekly operations castor 22/02/2016

* castor 2.1.15 update * castor 2.1.16 coming soon - SRM integration into CASTOR code base

4 KB (703 words) - 14:52, 19 February 2016
Tier1 Operations Report 2016-02-24

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) ...are hopeful of carrying out the Nameserver part of the update (requiring a Castor down) next week.

12 KB (1,237 words) - 09:21, 24 February 2016
RAL Tier1 weekly operations castor 29/02/2016

* glibc updates applied, all CASTOR systems rebooted. initial issues with head nodes, 7 failed to reboot due t * CASTOR facilities patching scheduled for next week - detailed schedule to be agree

3 KB (557 words) - 11:32, 26 February 2016
Operations Bulletin 290216

* Lots of patching ongoing (Castor, FTS today).

45 KB (5,688 words) - 11:06, 29 February 2016
RAL Tier1 weekly operations castor 07/03/2016

* glibc updates applied, all CASTOR systems rebooted. initial issues with head nodes, 7 failed to reboot due t * CASTOR facilities patching scheduled for next week - detailed schedule to be agree

4 KB (643 words) - 11:40, 4 March 2016
Operations Bulletin 070316

* Castor 2.1.15 update is waiting. This is pending resolution of a problem found dur

43 KB (5,347 words) - 17:58, 6 March 2016
Tier1 Operations Report 2016-03-09

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The Castor 2.1.15 update is pending. Testing has shown a database related performance

12 KB (1,208 words) - 15:53, 9 March 2016
Tier1 Operations Report 2017-04-26

...r although some significant problems have remained. The total load on LHCb Castor has had to be managed by LHCb and there have been times when the SRMs c * There was a problem with Castor for CMS overnight last night (Tuesday/Wednesday 25/26 April) when the datab

14 KB (1,457 words) - 10:30, 26 April 2017
Tier1 Operations Report 2016-10-05

...raised a GGUS alarm ticket. The problem was resolved during Sunday by the Castor on-call. Although there was space in the disk pool as a whole, some of the ...rnoon) and, whether by this or the particular jobs ending, AtlasScratch in Castor recovered. The limits on the Atlas pilot jobs were removed this morning (Wed

14 KB (1,519 words) - 08:06, 6 October 2016
RAL Tier1 weekly operations castor 14/03/2016

* glibc updates applied, all CASTOR systems rebooted. initial issues with head nodes, 7 failed to reboot due t ...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t

5 KB (810 words) - 13:23, 14 March 2016
Operations Bulletin 140316

* Castor 2.1.15 update is waiting. This is pending resolution of a problem found dur * "GEN Scratch" storage in Castor will be decommissioned.

47 KB (6,060 words) - 23:16, 13 March 2016
RAL Tier1 weekly operations castor 21/03/2016

...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t * Could not drain gdss702 (castor 2.1.15) in Preprod (all files failed according to draindiskserver -q) - doe

3 KB (493 words) - 16:23, 18 March 2016
Operations Bulletin 210316

... resolution of a problem around memory usage by the Oracle database behind Castor. * "GEN Scratch" storage in Castor will be decommissioned.

42 KB (5,278 words) - 01:55, 20 March 2016
Tier1 Operations Report 2016-03-23

* Security updates have been rolled out. There was an outage of Castor last Thursday during which all the component servers were successfully upda ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

12 KB (1,283 words) - 21:43, 22 March 2016
Operations Bulletin 280316

... resolution of a problem around memory usage by the Oracle database behind Castor. * "GEN Scratch" storage in Castor will be decommissioned.

44 KB (5,455 words) - 02:03, 29 March 2016
Tier1 Operations Report 2016-03-30

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The draining of Castor disk servers is very slow. We need to drain a few old servers to provide sp

13 KB (1,394 words) - 11:01, 30 March 2016
RAL Tier1 weekly operations castor 01/04/2016

...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t * CASTOR 2.1.15

2 KB (380 words) - 16:05, 1 April 2016
Operations Bulletin 040416

... resolution of a problem around memory usage by the Oracle database behind Castor. * "GEN Scratch" storage in Castor will be decommissioned.

44 KB (5,446 words) - 23:24, 3 April 2016
Tier1 Operations Report 2016-04-06

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The draining of Castor disk servers is very slow. We need to drain a few old servers to provide sp

12 KB (1,352 words) - 12:13, 6 April 2016
RAL Tier1 weekly operations castor 08/04/2016

* CERN steered us not to move to SRM 2.1.14 before castor 2.1.15 ...pen times - Andrey reports that CERN use 100GB of memory for DB servers in castor to run 2.1.15 (vs our 32GB), Oracle are not providing adequate support at t

3 KB (459 words) - 10:47, 8 April 2016
Tier1 Operations Report 2016-04-13

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The draining of Castor disk servers for Atlas is very slow. We need to drain a few old servers to

12 KB (1,266 words) - 09:46, 20 April 2016
RAL Tier1 weekly operations castor 15/04/2016

* 2.1.16 castor needs deployment to tape servers * CERN steered us not to move to SRM 2.1.14 before castor 2.1.15

3 KB (532 words) - 10:57, 15 April 2016
Operations Bulletin 180416

...ind Castor. In the meantime we will carry out the (separate) update of the Castor SRMs to version 2.14. * "GEN Scratch" storage in Castor will be decommissioned.

40 KB (4,974 words) - 12:23, 17 April 2016
RAL Tier1 weekly operations castor 18/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (322 words) - 11:08, 21 November 2016
Tier1 Operations Report 2016-04-20

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The draining of Castor disk servers for Atlas is very slow. We need to drain a few old servers to

13 KB (1,408 words) - 10:29, 20 April 2016
RAL Tier1 weekly operations castor 22/04/2016

7.Anything for CASTOR-Fabric? * gfalcat does not work with castor, underlying issue fixed for gfalcopy but not gfalcat (gfal developers respo

3 KB (429 words) - 09:34, 6 May 2016
Operations Bulletin 020516

...time. However, following advice from CERN this will not be done before the Castor update. * "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast.

46 KB (5,853 words) - 07:32, 9 May 2016
Tier1 Operations Report 2016-04-27

...his week we are concerned to ensure that the "gfal" commands fully support Castor. This has not been the case so far. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

13 KB (1,468 words) - 13:32, 27 April 2016
Tier1 Operations Report 2016-05-04

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The draining of Castor disk servers for Atlas using the inbuilt Castor tool is very slow. We need to drain a few old servers to provide spares - t

12 KB (1,309 words) - 10:33, 10 May 2016
Operations Bulletin 090516

...time. However, following advice from CERN this will not be done before the Castor update. * "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast.

46 KB (5,853 words) - 07:33, 9 May 2016
Tier1 Operations Report 2016-05-18

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * All Tapeservers now running Castor version 2.1.16-0

14 KB (1,576 words) - 08:37, 25 May 2016
Tier1 Operations Report 2016-05-25

...This can be seen in the HammerCloud results below. Problem traced to Atlas Castor instance which was fixed during the day. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,613 words) - 10:29, 25 May 2016
Tier1 Operations Report 2016-10-19

...e load on Castor. The weighting of gridftp jobs on atlasScratchDisk within Castor was changed on 10th Oct to increase the job capacity. ...ther sites by a particular ILC user. This has affected other VOs' access to Castor (for example ALICE). We are in contact with the user to tackle this.

16 KB (1,868 words) - 12:28, 19 October 2016
Tier1 Operations Report 2016-05-11

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * The draining of Castor disk servers for Atlas using the inbuilt Castor tool is very slow. We need to drain a few old servers to provide spares - t

15 KB (1,734 words) - 10:46, 11 May 2016
RAL Tier1 weekly operations castor 13/05/2016

... There was a question as to how batch is turned back on, concerns swamping castor? * gfalcat does not work with castor, underlying issue fixed for gfalcopy but not gfalcat (gfal developers respo

3 KB (476 words) - 11:08, 13 May 2016
RAL Tier1 weekly operations castor 20/05/2016

New CASTOR functional testing using xrootd will be enabled on Monday 23/5/2016 == CASTOR issues ==

1 KB (237 words) - 13:03, 23 May 2016
Operations Bulletin 230516

...time. However, following advice from CERN this will not be done before the Castor update. * "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast.

45 KB (5,695 words) - 00:36, 24 May 2016
Operations Bulletin 300516

...time. However, following advice from CERN this will not be done before the Castor update. * "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast. Write

43 KB (5,401 words) - 01:24, 30 May 2016
Tier1 Operations Report 2016-06-01

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Decommissioning of "GEN Scratch" storage in Castor. (Formally announced by EGI broadcast). Write access to this area has now b

13 KB (1,454 words) - 07:42, 7 June 2016
RAL Tier1 weekly operations castor 27/05/2016

Automated workflow for disk server deployment has been disabled New CASTOR functional testing using xrootd will be enabled on Monday 23/5/2016 CASTOR issues

3 KB (466 words) - 12:06, 1 June 2016
Operations Bulletin 060616

* "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast. Write ...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade.

44 KB (5,505 words) - 16:19, 3 June 2016
RAL Tier1 weekly operations castor 03/06/2016

7.Anything for CASTOR-Fabric? 40 files in atlas scratch had zero size in CASTOR namespace, BD declare lost to Atlas

3 KB (522 words) - 09:30, 10 June 2016
Tier1 Operations Report 2016-06-15

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Decommissioning of "GEN Scratch" storage in Castor. (Formally announced by EGI broadcast). Write access to this area has now b

13 KB (1,445 words) - 10:54, 15 June 2016
Tier1 Operations Report 2016-06-08

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Decommissioning of "GEN Scratch" storage in Castor. (Formally announced by EGI broadcast). Write access to this area has now b

14 KB (1,602 words) - 11:11, 8 June 2016
RAL Tier1 weekly operations castor 17/06/2016

The CASTOR 2.1.15 upgrade seems to work apart from the part that deals with the SRM re BD to review outstanding RT tickets on CASTOR queue

3 KB (485 words) - 10:59, 24 June 2016
RAL Tier1 weekly operations castor 10/06/2016

Further progress has been made with CASTOR 2.1.15 upgrade BD and CP to find out about zero-sized files on CASTOR facilities

3 KB (477 words) - 11:58, 10 June 2016
Operations Bulletin 130616

* "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast. Write ...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade.

51 KB (6,664 words) - 22:21, 13 June 2016
Operations Bulletin 200616

* "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast. Write ...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade.

44 KB (5,489 words) - 22:21, 19 June 2016
RAL Tier1 Incident 20160600 Tape Library Software Crashes

| Restarter cron added to all tape servers to cope with Castor mount "feature" and tape library problems.

4 KB (550 words) - 11:17, 22 June 2016
Operations Bulletin 040716

* All access to "GEN Scratch" storage in Castor has now been stopped. ...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade.

43 KB (5,335 words) - 16:28, 3 July 2016
RAL Tier1 weekly operations castor 24/06/2016

CASTOR will be replaced in CERN by 2022. Need to consider what will happen in RAL BD to review outstanding RT tickets on CASTOR queue

3 KB (602 words) - 12:30, 24 June 2016
Operations Bulletin 270616

* "GEN Scratch" storage in Castor will be decommissioned. This has been announced via an EGI broadcast. Write ...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade.

45 KB (5,697 words) - 11:17, 27 June 2016
Tier1 Operations Report 2016-06-29

* Yesterday afternoon (Tuesday 28th) there was a problem with the Atlas Castor instance that lasted for a few hours. This was resolved by a restart of pro ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

15 KB (1,605 words) - 09:56, 29 June 2016
RAL Tier1 weekly operations castor 01/07/2016

CASTOR will be replaced in CERN by 2022. Need to consider what will happen in RAL CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

4 KB (632 words) - 12:50, 8 July 2016
Operations Bulletin 110716

...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade. Getting CASTOR to publish GLUE2 information. No news for a while on this, could do with an

50 KB (6,425 words) - 22:25, 11 July 2016
Tier1 Operations Report 2016-07-06

... of some of the monitoring checking tasks. This includes a validation that Castor is in step with the tape drives. Initial results suggest this enables us to ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

12 KB (1,248 words) - 12:21, 6 July 2016
RAL Tier1 weekly operations castor 08/07/2016

CASTOR will be replaced in CERN by 2022. Need to consider what will happen in RAL CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

3 KB (614 words) - 13:12, 15 July 2016
Tier1 Operations Report 2016-07-20

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,173 words) - 12:29, 20 July 2016
Tier1 Operations Report 2016-07-13

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,293 words) - 14:21, 13 July 2016
RAL Tier1 weekly operations castor 12/08/2016

7. Anything for CASTOR-Fabric? The gridFTP problem in CASTOR 2.1.15 was fixed. Xroot remains to be fixed

1 KB (192 words) - 11:19, 12 August 2016
RAL Tier1 weekly operations castor 15/07/2016

Draining of gdss748 is complete. The server is out of castor and handed over to the fabric team to swap back drives with gdss755 ...upgrade continues liaising with CERN. Need to find the license under which CASTOR is distributed for the new users.

3 KB (482 words) - 13:25, 15 July 2016
Operations Bulletin 180716

...GO service are in place; RAL T1 Castor SRM systems. SRM upgrade waiting on Castor upgrade.

45 KB (5,664 words) - 20:46, 17 July 2016
Tier1 Operations Report 2016-07-27

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

13 KB (1,367 words) - 09:13, 27 July 2016
RAL Tier1 weekly operations castor 22/07/2016

7. Anything for CASTOR-Fabric? CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

3 KB (438 words) - 09:33, 29 July 2016
RAL Tier1 weekly operations castor 29/07/2016

# Anything for CASTOR-Fabric? CASTOR TEAM Durham / Leicester Dirac data - need to create separate tape pools / u

2 KB (351 words) - 10:17, 12 August 2016
RAL Tier1 weekly operations castor 05/08/2016

All 9 new Dell tape-backed disk servers have been deployed into CASTOR Good progress has been made with the CASTOR 2.1.15 upgrade. The gridFTP transfer problem was fixed and a configuration

2 KB (377 words) - 09:28, 12 August 2016
Tier1 Operations Report 2016-08-03

* There was a problem with the Castor GEN instance caused by a particular disk server late on Saturday evening, 3 ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

12 KB (1,328 words) - 15:58, 9 August 2016
RAL Tier1 weekly operations castor 26/08/2016

7. Anything for CASTOR-Fabric? Work on Castor 2.1.15 draining continues

1 KB (189 words) - 10:46, 26 August 2016
Operations Bulletin 220816

[https://ggus.eu/?mode=ticket_info&ticket_id=117683 117683] - Glue 2 for Castor 

45 KB (5,779 words) - 09:19, 22 August 2016
Tier1 Operations Report 2016-08-10

* We are seeing some errors in Castor relating to missing CAs. These started on the LHCb instance on the 25th Jul ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

13 KB (1,381 words) - 10:57, 10 August 2016
Operations Bulletin 150816

Castor not publishing glue2 - which requires some development slogging. Any news?

50 KB (6,392 words) - 07:47, 16 August 2016
Tier1 Operations Report 2016-08-17

...ntion on the server but the corresponding cleanup had not been done in the Castor database. This has since been resolved. ...ne of the nodes that make up the Oracle RAC cluster that hosts some of the Castor databases. The fail-over of the databases running on that node (which happe

12 KB (1,323 words) - 08:20, 24 August 2016
Vo.DiRAC.ac.uk archived actions

...eep data access separate between DIRAC sites. If needed new voms roles and castor setup may need to be enabled. ...e to assign specific DNs associated with each site to their own uid within castor.

3 KB (401 words) - 11:22, 3 October 2017
Tier1 Operations Report 2016-08-24

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,164 words) - 13:13, 24 August 2016
RAL Tier1 weekly operations castor 19/08/2016

7. Anything for CASTOR-Fabric? Work on Castor 2.1.15 draining continues

1 KB (144 words) - 09:00, 26 August 2016
Operations Bulletin 290816

[https://ggus.eu/?mode=ticket_info&ticket_id=117683 117683] - Glue 2 for Castor 

45 KB (5,728 words) - 23:02, 29 August 2016
RAL Tier1 weekly operations castor 05/04/2019

** Tape library for CASTOR-side testing in progress now * CASTOR metric reporting for GridPP

4 KB (562 words) - 09:54, 5 April 2019
Tier1 Operations Report 2016-09-07

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) | All Castor that uses tape

13 KB (1,414 words) - 14:58, 7 September 2016
RAL Tier1 weekly operations castor 09/09/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (346 words) - 11:09, 9 September 2016
Tier1 Operations Report 2016-09-28

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,257 words) - 13:15, 28 September 2016
Tier1 Operations Report 2016-09-14

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,290 words) - 11:29, 14 September 2016
Operations Bulletin 190916

Developing glue2 support for Castor. Any update will do?! On hold (5/4)

52 KB (6,769 words) - 11:13, 18 September 2016
RAL Tier1 weekly operations castor 23/09/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (186 words) - 15:04, 27 September 2016
Tier1 Operations Report 2016-09-21

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,312 words) - 13:22, 28 September 2016
RAL Tier1 weekly operations castor 30/09/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (312 words) - 09:05, 5 October 2016
Operations Bulletin 171016

...nday (2nd Oct) LHCb raised a GGUS alarm ticket for a problem with the LHCb Castor instance. This was resolved during that day. This was traced to some disk s * Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

47 KB (5,958 words) - 09:02, 15 October 2016
Operations Bulletin 101016

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January. glue2 publishing for castor. Really, really, really could do with an update - even a null one! On hold

54 KB (7,110 words) - 14:59, 10 October 2016
Operations Bulletin 300117

* There were some problems following the Castor 2.1.15 update of the LHCb instance on the 18th. These problems were resolve ...g a lot of CMS SRM test failures - which are attributed to load on the CMS Castor instance.

46 KB (5,756 words) - 13:12, 30 January 2017
Operations Bulletin 211116

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

47 KB (6,026 words) - 09:22, 21 November 2016
Tier1 Operations Report 2017-03-15

* Yesterday (Tuesday 14th) there were problems with Castor during a reconfiguration of the back-end database nodes. A GGUS ticket was * Update Castor SRMs. Propose LHCb SRMs first - target date 22nd March.

15 KB (1,598 words) - 12:18, 15 March 2017
Operations Bulletin 241016

...nday (2nd Oct) LHCb raised a GGUS alarm ticket for a problem with the LHCb Castor instance. This was resolved during that day. This was traced to some disk s * Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

50 KB (6,392 words) - 08:14, 23 October 2016
Tier1 Operations Report 2016-11-16

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,270 words) - 10:53, 17 November 2016
Tier1 Operations Report 2016-10-26

...ped. In essence we stopped the batch system on Monday (24th Oct). Storage (Castor) was able to continue running. At the time of the meeting we are testing th ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,523 words) - 13:14, 26 October 2016
RAL Tier1 weekly operations castor 07/10/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (188 words) - 09:25, 28 October 2016
RAL Tier1 weekly operations castor 14/10/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (230 words) - 09:31, 28 October 2016
RAL Tier1 weekly operations castor 28/10/2016

1. Castor 2.1.15 2. SL7 upgrade on tape servers 7. Anything for CASTOR-Fabric?

2 KB (255 words) - 10:01, 31 October 2016
Operations Bulletin 311016

* We have had load issues for both the Atlas Scratch Disk and on the Castor 'GEN' instance. This latter caused some access problems for all users of 'G * Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

50 KB (6,401 words) - 08:48, 31 October 2016
Operations Bulletin 071116

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

51 KB (6,619 words) - 11:34, 7 November 2016
Tier1 Operations Report 2016-11-02

... Wednesday afternoon (26th Oct) following patching. There was an outage of Castor yesterday (1st Nov) for all its systems to be rebooted to pick up the new k ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,581 words) - 17:12, 2 November 2016
RAL Tier1 weekly operations castor 04/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (314 words) - 12:34, 4 November 2016
Operations Bulletin 141116

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

52 KB (6,786 words) - 05:33, 14 November 2016
Tier1 Operations Report 2017-10-04

* Castor: ** Move to generic Castor headnodes.

15 KB (1,509 words) - 10:08, 11 October 2017
Tier1 Operations Report 2016-11-09

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

12 KB (1,334 words) - 12:48, 16 November 2016
RAL Tier1 weekly operations castor 11/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (311 words) - 12:43, 11 November 2016
Tier1 Operations Report 2016-11-23

* There was a problem on the Castor 'GEN' instance following the renewal of the host certificates on the SRMs. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

13 KB (1,436 words) - 14:51, 23 November 2016
Tier1 Operations Report 2017-06-07

* Following the upgrade of the Castor GEN instance last Wednesday (31st) there was a problem with the 'OPS' tests * There was a transitory problem with the CMS Castor instance for a couple of hours in the early morning of Friday (2nd June) ca

16 KB (1,833 words) - 13:11, 7 June 2017
Tier1 Operations Report 2016-12-14

* Since yesterday (Tuesday 13th Dec) we are seeing high load on CMSTape in Castor - and are failing SAM tests as a result. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,563 words) - 12:48, 21 December 2016
Tier1 Operations Report 2016-11-30

...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed) * Castor:

13 KB (1,400 words) - 14:23, 30 November 2016
Operations Bulletin 051216

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

44 KB (5,462 words) - 10:47, 6 December 2016
RAL Tier1 weekly operations castor 02/12/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (330 words) - 16:34, 8 December 2016
RAL Tier1 weekly operations castor 25/11/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

3 KB (398 words) - 12:16, 25 November 2016
Tier1 Operations Report 2016-12-07

* There were some problems with the CMS Castor instance during last week. A restart of the "transfermanager" on Friday cle ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,483 words) - 16:58, 12 December 2016
RAL Tier1 weekly operations castor 27/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (187 words) - 09:14, 3 February 2017
Operations Bulletin 121216

* Owing to staff availability the upgrade of Castor to version 2.1.15 is being scheduled to take place in January.

45 KB (5,745 words) - 13:04, 12 December 2016
Operations Bulletin 090117

* The upgrade of Castor to version 2.1.15 will cause a series of outages as announced via the GOC D

49 KB (6,219 words) - 11:44, 9 January 2017
RAL Tier1 weekly operations castor 29/04/2016

Alice use farm (quite significant) but don't really use castor *2014 disk servers can be put into castor - poss cms .. for IO throughput

848 B (147 words) - 09:43, 8 December 2016
RAL Tier1 weekly operations castor 09/12/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

3 KB (458 words) - 09:16, 16 December 2016
RAL Tier1 weekly operations castor 16/12/2016

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (312 words) - 14:05, 16 December 2016
Operations Bulletin 201216

* The upgrade of Castor to version 2.1.15 is being scheduled to take place in January and has been

46 KB (5,848 words) - 09:12, 20 December 2016
Past Ticket Bulletins 2017

Henry has tacked on a query on the health of Castor after noticing some MICE transfer failures last night, but this ticket coul ...ficate won't work as it's already associated with another VO (MICE) due to CASTOR reasons. Waiting for reply (28/11)

121 KB (19,081 words) - 12:04, 23 January 2018
RAL Tier1 weekly operations castor 09/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

2 KB (218 words) - 14:21, 9 January 2017
Tier1 Operations Report 2016-12-21

* We have had load issues on CMSTape in Castor which has led to some SAM test failures. ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,569 words) - 14:30, 21 December 2016
RAL Tier1 CASTOR Experiments Completed Actions 2014

| 20131023-03 || Normal || ATLAS || Matthew || Report back about ATLAS CASTOR deletion problem after F2F discussion with developers || Closed. || 2014-01 | 20140827-02 || Normal || N/A || Rob || Report on plans for Castor 2.1.15 upgrade. || Done || 2014-10-28

2 KB (289 words) - 14:09, 21 December 2016
RAL Tier1 CASTOR Experiments Completed Actions 2015

...mal || CMS || Andrew L || Ensure the relevant people are looking into CMS CASTOR problems || Closed || 2015-02-11 ... || Normal || All || Rob Appleyard || Propagate information about upcoming CASTOR interventions || Done || 2015-08-26

1 KB (163 words) - 14:10, 21 December 2016
Operations Bulletin 020117

* The upgrade of Castor to version 2.1.15 is being scheduled to take place in January and has been

49 KB (6,317 words) - 09:33, 3 January 2017
Tier1 Operations Report 2017-01-18

* Yesterday (17th Jan) there was a problem with the ALICE Castor instance with many xrootd connections to disk servers but not much activity ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,561 words) - 14:46, 18 January 2017
Tier1 Operations Report 2017-01-11

...he total load. On Friday an adjustment was made to the number of transfers Castor allocates to the newer three disk servers - which may help but not resolve ...w but persistent rate of failure when copying the results of batch jobs to Castor. There is also a further problem that sometimes occurs when these (failed)

14 KB (1,531 words) - 18:01, 17 January 2017
RAL Tier1 weekly operations castor 13/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (147 words) - 13:57, 17 January 2017
Operations Bulletin 160117

* The upgrade of Castor Nameserver to version 2.1.15 was successfully carried out on Tuesday (10th * We have had a Castor access problem for LHCB.

51 KB (6,530 words) - 08:56, 16 January 2017
Operations Bulletin 230117

* The LHCb Castor stager updates to version 2.1.15 took place on Wednesday (18th Jan). The ne ...g a lot of CMS SRM test failures - which are attributed to load on the CMS Castor instance.

47 KB (5,971 words) - 09:04, 23 January 2017
Tier1 Operations Report 2017-01-25

* There was a performance problem on the LHCb Castor instance following the upgrade last Wednesday. This was resolved by the end * On Monday the LHCb Castor instance was stopped while OS security patches were applied and nodes reboo

15 KB (1,614 words) - 14:30, 25 January 2017
RAL Tier1 weekly operations castor 20/1/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (172 words) - 10:17, 25 January 2017
Tier1 Operations Report 2017-02-15

* Since the castor upgrade we have seen a couple of further problems which are now mainly fixe ... for CMS xroot. This was fixed by changing the weighting for xrootd in the Castor transfermanagerd.

13 KB (1,310 words) - 11:47, 15 February 2017
Tier1 Operations Report 2017-02-08

... 'GEN' upgrade. ALICE require a special version of the xroot component for Castor. Checks that the xroot component would install under 2.1.15 had been made - * Since the castor upgrade we have seen a couple of further problems:

16 KB (1,827 words) - 13:03, 9 February 2017
Operations Bulletin 070217

... with the Alice SRM tests which were not fully resolved until Monday 30th. Castor CMS is being updated today (31st January). ...ng regular CMS SRM test failures - which are attributed to load on the CMS Castor instance.

44 KB (5,446 words) - 15:00, 8 February 2017
RAL Tier1 weekly operations castor 03/2/2017

1. Castor 2.1.15 7. Anything for CASTOR-Fabric?

1 KB (175 words) - 14:58, 9 February 2017
Operations Bulletin 130217

...blems with CMS xroot redirection tests following this. The next change for Castor is to update the SRMs. Glue 2 publishing for Castor ticket. Did Jens and Rob have any luck tackling this in the pre-Christmas g

49 KB (6,270 words) - 11:57, 13 February 2017
RAL Tier1 weekly operations castor 10/2/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 7. Anything for CASTOR-Fabric?

1 KB (216 words) - 12:18, 13 February 2017
RAL Tier1 weekly operations castor 17/2/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 7. Anything for CASTOR-Fabric?

1 KB (169 words) - 12:00, 17 February 2017
Operations Bulletin 200217

...blems with CMS xroot redirection tests following this. The next change for Castor is to update the SRMs.

43 KB (5,356 words) - 10:15, 20 February 2017
Tier1 Operations Report 2017-02-22

* There remain some issues following the Castor 2.1.15 upgrade - ... the instances. Investigations into this are ongoing. There is a bugfix to Castor in version 2.1.16 in this area.

14 KB (1,425 words) - 14:24, 22 February 2017
Tier1 Operations Report 2018-01-03

... stable although not completely quiet for the oncall team. There were some Castor disk server failures and staff did attend site over the holiday to replace ...; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Resolved Castor Disk Server Issues

17 KB (1,714 words) - 14:41, 3 January 2018
Operations Bulletin 270217

* Following completion of the Castor 2.1.15 update: The next change for Castor is to update the SRMs. We expect to be able to do this in late March.

44 KB (5,413 words) - 16:17, 27 February 2017
Tier1 Operations Report 2017-03-01

* Castor: * Castor:

15 KB (1,577 words) - 14:44, 1 March 2017
Tier1 Operations Report 2017-08-16

...use is not understood. We have previously had some problems with the Atlas Castor SRMs but the symptoms of this failure appeared different to those. ... "lazy-download". The CMS SRM SAM test success rate has improved since the Castor 2.1.16 upgrade on the 25th May, although is still not 100%. Our investigati

16 KB (1,690 words) - 13:58, 16 August 2017
RAL Tier1 weekly operations castor 02/3/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (215 words) - 11:00, 6 March 2017
Operations Bulletin 060317

* Following completion of the Castor 2.1.15 update: The next change for Castor is to update the SRMs. We expect to be able to do this in late March.

44 KB (5,399 words) - 12:16, 6 March 2017
Operations Bulletin 130317

* Castor SRM updates: Planning for LHCb SRMs to be the first to be updated on 22nd M

42 KB (5,126 words) - 10:15, 13 March 2017
Tier1 Operations Report 2017-03-08

* There was a problem reported for access to AtlasScratchDisk in Castor this morning (8th Mar). Atlas have reported a large backlog of outstanding * CMS PhEDEx debug transfers switched from CASTOR to CEPH ECHO.

14 KB (1,423 words) - 15:14, 8 March 2017
Operations Bulletin 220517

...ded to version 2.1.16 on Thursday 11th May. This follows the update of the Castor central services on the 9th. The plan is to update the Atlas instance on Tu

43 KB (5,301 words) - 10:00, 22 May 2017
Tier1 Operations Report 2017-03-22

...nstance on the evening of Wednesday 15th Mar. The oncall was contacted and Castor services restarted to resolve the problem. The cause was a known bug that c | Upgrade of Castor SRMs for LHCb to version 2.1.16-10

14 KB (1,476 words) - 15:53, 22 March 2017
Tier1 Operations Report 2017-03-29

| Castor CMS and GEN instances (SRMs) * Update Castor SRMs - ongoing.

16 KB (1,752 words) - 15:22, 29 March 2017
Operations Bulletin 270317

* Castor SRM updates to version 2.1.16. LHCb SRMs to be the first to be updated on 2

45 KB (5,641 words) - 09:51, 27 March 2017
Operations Bulletin 030417

* Castor SRM updates to version 2.1.16: The LHCb SRMs were successfully updated last

45 KB (5,539 words) - 09:16, 3 April 2017
Operations Bulletin 170417

* There have been problems with the LHCb Castor instance since the SRM upgrade on the 23rd March. On the 12th April this is

45 KB (5,594 words) - 03:25, 18 April 2017
Tier1 Operations Report 2017-04-12

...d job rate of the LHCb merging jobs which are a particular cause of load on Castor. I note that a GGUS alarm ticket was received from LHCb about this problem. ... (see availability report below). This Castor instance is running the same Castor SRM version as LHCb.

15 KB (1,663 words) - 13:18, 12 April 2017
Tier1 Operations Report 2017-04-05

* LHCb Castor instance has been running with problems all this last week. Initially it ap ... service on Thursday 30th March. The server seemed to be behaving badly in Castor and was taken out of production and checked over by Fabric Team although no

14 KB (1,455 words) - 09:30, 5 April 2017
Operations Bulletin 100417

* There have been problems with the LHCb Castor instance since the SRM upgrade on the 23rd March. On the 12th April this is

45 KB (5,594 words) - 03:26, 18 April 2017
Tier1 Operations Report 2017-04-19

...day 12th April when the SRM upgrade was reverted for LHCb. Since then LHCb Castor has performed much better and the backlog of work has largely been eliminat * LHCb Castor SRM downgraded to version 2.11.

15 KB (1,526 words) - 09:57, 23 April 2017
Operations Bulletin 240417

* There have been problems with the LHCb Castor instance since the SRM upgrade on the 23rd March. On the 12th April this is

45 KB (5,597 words) - 04:19, 23 April 2017
Operations Bulletin 100717

solidexperiment.org requesting tape support at RAL. Passed on to the castor team. In progress (16/6) Glue2 publishing for Castor. Some progress, but no news for a few months. On hold (10/5)

48 KB (6,069 words) - 09:31, 10 July 2017
Operations Bulletin 010517

...entified that is fixed in Castor 2.1.16. Work is now underway to test this Castor version with the aim of moving to it as soon as possible.

44 KB (5,391 words) - 21:07, 1 May 2017
RAL Tier1 weekly operations castor 21/4/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (168 words) - 11:42, 26 April 2017
Storage site status

|Castor/CEPH

3 KB (422 words) - 08:06, 12 September 2020
Tier1 Operations Report 2017-05-03

...tored at RAL. Nevertheless we still need to follow up with improvements to Castor to resolve throughput and stability issues. The SRMs required a restart on ...equently cleaned up (- i.e. a delete request). These deletions take all of Castor's capacity to delete files. The result is that once a disk area fills the p

16 KB (1,689 words) - 12:44, 4 May 2017
RAL Tier1 weekly operations castor 28/4/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (176 words) - 13:35, 3 May 2017
Operations Bulletin 080517

...articularly in light of the problems seen by LHCb. The proposed dates are: Castor central services (nameserver) and LHCb on Tuesday 9th May. Atlas and CMS in

44 KB (5,473 words) - 21:31, 6 May 2017
Tier1 Operations Report 2017-05-17

* There was a significant problem with the CMS Castor instance over the weekend that severely affected availabilities. Space was * Atlas and CMS were affected for a couple of hours yesterday when Castor was reporting disk pools as full. An update had unexpectedly caused process

16 KB (1,685 words) - 13:08, 17 May 2017
Tier1 Operations Report 2017-05-10

* Following the upgrade of the Castor central services to version 2.1.16 yesterday (Tuesday 9th May) there was a ...ed off "lazy-download". This will be re-addressed once we have upgraded to Castor 2.1.16.

17 KB (1,843 words) - 13:14, 10 May 2017
RAL Tier1 weekly operations castor 05/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (149 words) - 13:30, 11 May 2017
RAL Tier1 weekly operations castor 12/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (260 words) - 10:27, 12 May 2017
Operations Bulletin 150517

...s is being done on Thursday (11th). Remaining updates to bring the rest of Castor to this version likely to be in around two weeks' time.

45 KB (5,648 words) - 10:05, 15 May 2017
Operations Bulletin 290517

* The LHCb Castor instance was upgraded to version 2.1.16 on Thursday 11th May and Atlas on t

42 KB (5,118 words) - 09:19, 29 May 2017
RAL Tier1 weekly operations castor 19/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (229 words) - 13:45, 25 May 2017
Tier1 Operations Report 2017-05-24

* There is a specific problem with Castor affecting LHCb when a TURL returned by the SRM does not always work when us ...ed off "lazy-download". This will be re-addressed once we have upgraded to Castor 2.1.16.

15 KB (1,676 words) - 14:43, 24 May 2017
Tier1 Operations Report 2017-05-31

...although is still not 100%. It is still planned to re-visit this issue now Castor has been upgraded. ...Castor 2.1.16 upgrade was not seen. The main limitation encountered within Castor was load on the older disk servers in the instance. Following the 2.1.16 up

15 KB (1,685 words) - 13:14, 31 May 2017
RAL Tier1 weekly operations castor 26/5/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (217 words) - 13:21, 31 May 2017
RAL Tier1 weekly operations castor 02/6/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (310 words) - 07:47, 8 June 2017
Operations Bulletin 120617

* All Castor instances have now been upgraded to version 2.1.16. Overall it is working w

41 KB (5,091 words) - 19:49, 12 June 2017
Operations Bulletin 050617

* All Castor instances have now been upgraded to version 2.1.16. This includes upgrading

41 KB (5,069 words) - 16:29, 5 June 2017
Tier1 Operations Report 2017-06-14

* There were problems with the SRMs for the Castor GEN instance over the last weekend (10/11 June) with one of the processes fa ...although is still not 100%. It is still planned to re-visit this issue now Castor has been upgraded.

14 KB (1,456 words) - 10:07, 14 June 2017
RAL Tier1 weekly operations castor 09/6/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

1 KB (194 words) - 10:41, 15 June 2017
RAL Tier1 weekly operations castor 16/6/2017

2. SRM upgrade to SL6/CASTOR 2.1.16 3. SL5 elimination from CASTOR functional test boxes and tape verification server

2 KB (261 words) - 16:16, 19 June 2017
Tier1 Operations Report 2017-06-21

...although is still not 100%. It is still planned to re-visit this issue now Castor has been upgraded. * Castor:

14 KB (1,526 words) - 10:25, 21 June 2017
Operations Bulletin 200617

* All Castor instances have now been upgraded to version 2.1.16. Overall it is working w

41 KB (5,091 words) - 08:07, 20 June 2017
Operations Bulletin 260617

* All Castor instances have now been upgraded to version 2.1.16. Overall it is working w

41 KB (5,107 words) - 09:34, 26 June 2017
Operations Bulletin 030717

* There are still some issues for CMS access to Castor - these can be seen in timeouts affecting the CMS SRM SAM tests.

42 KB (5,273 words) - 09:35, 3 July 2017
Tier1 Operations Report 2017-06-28

...tried to query the status of files and transfers during the problem. Atlas Castor was declared down in the GOC DB from Friday afternoon to Sunday morning whe ...although is still not 100%. It is still planned to re-visit this issue now Castor has been upgraded.

15 KB (1,649 words) - 09:48, 5 July 2017
RAL Tier1 weekly operations castor 23/6/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

1 KB (175 words) - 08:18, 30 June 2017
RAL Tier1 weekly operations castor 30/6/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (360 words) - 14:43, 6 July 2017
Tier1 Operations Report 2017-07-12

...although is still not 100%. It is still planned to re-visit this issue now Castor has been upgraded. * Castor:

15 KB (1,670 words) - 15:00, 12 July 2017
Tier1 Operations Report 2017-07-05

* Castor Gen instance has been failing OPS SRM tests since Friday (30th June). This ...although is still not 100%. It is still planned to re-visit this issue now Castor has been upgraded.

15 KB (1,595 words) - 14:25, 5 July 2017
RAL Tier1 weekly operations castor 14/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

3 KB (505 words) - 11:18, 14 July 2017
Operations Bulletin 170717

* There will be upgrades to the RAID card firmware for some Castor disk servers in D1T0 service classes during next week (17-21 July).

43 KB (5,408 words) - 10:35, 17 July 2017
Tier1 Operations Report 2017-07-19

* There was a problem with the Atlas Castor instance for a few hours during the afternoon of Friday 14th July. A proble ...tion of CMS Castor problems (with 'unable to issue PrepareToPut request to Castor' errors) but at a much reduced rate.

16 KB (1,785 words) - 07:47, 26 July 2017
RAL Tier1 weekly operations castor 28/7/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (362 words) - 10:33, 28 July 2017
Operations BUlletin 070817

The most venerable ticket - glue 2 publishing for Castor. Slowly chugging along in the background. On hold (6/7)

41 KB (5,014 words) - 00:26, 8 August 2017
Operations Bulletin 070817

The most venerable ticket - glue 2 publishing for Castor. Slowly chugging along in the background. On hold (6/7)

41 KB (5,014 words) - 00:29, 8 August 2017
Tier1 Operations Report 2017-08-09

...problem with file transfers initiated by the CERN FTS3 service to/from our Castor storage was ongoing at the time of the last meeting (26th July). This was t ...f (old) disk servers in the AtlasScratch pool causing poor performance for Castor. The merger of this pool into the larger AtlasDataDisk pool may reduce the

19 KB (2,046 words) - 13:05, 9 August 2017
RAL Tier1 weekly operations castor 11/8/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (346 words) - 10:34, 11 August 2017
RAL Tier1 weekly operations castor 25/8/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (281 words) - 14:42, 25 August 2017
RAL Tier1 weekly operations castor 18/8/2017

2. SL5 elimination from CASTOR functional test boxes and tape verification server 3. CASTOR stress test improvement

2 KB (303 words) - 13:09, 18 August 2017
Tier1 Operations Report 2017-08-23

... "lazy-download". The CMS SRM SAM test success rate has improved since the Castor 2.1.16 upgrade on the 25th May, although is still not 100%. Our investigati ...dsay (17th Aug) six disk servers (from 2012 generations) were deployed into Castor CMSTape.

17 KB (1,707 words) - 09:27, 23 August 2017