Difference between revisions of "RAL Tier1 weekly operations castor 09/12/2016"
Revision as of 10:08, 9 December 2016
1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
4. Long-term project updates (if not already covered)
    1. Castor 2.1.15
    2. SL7 upgrade on tape servers
5. Special topics
6. Actions
7. Anything for CASTOR-Fabric?
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB
Operation problems
gdss650 (LHCbUser) failed on Saturday morning, 3rd Dec. It was returned to service on 6th Dec. A disk had failed, and the replacement for that disk also failed. During the RAID rebuild a further disk drive started reporting problems and was also swapped.
gdss701 (LHCbDst) was taken out of service on Saturday (3rd Dec) after it reported FSProbe errors following a disk replacement. It was returned to service on 5th Dec.
There was a problem with one of the Power Distribution Units supplying a rack in the UPS room during the early hours of Monday morning (5th Dec). This affected two network switches, which in turn affected some core services.
Operation news
The LHCbUser and LHCbDst disk pools have now been merged.