RAL Tier1 weekly operations castor 13/09/2010
From GridPP Wiki
Work previous week
- Matthew:
- Debugged gridftp problems on SL5 and found a fix by downgrading VDT
- Testing transfers on preprod
- 2.1.9 change control document
- Shaun:
- Testing firewall bypass rules
- xrootd testing (with and w/o ALICE security)
- Moved instances to local nameserver
- ASGC support
- Monitoring LHCb production instance
- Chris:
- ..
- Richard:
- Helped Cheney with Quattor issues when building head nodes for the facilities instance
- Brian:
- ..
- Jens:
- ..
Operations Issues
- Transfer problems with SL5 disk servers persist: for ATLAS, lcg-cp works but lcg-cr does not
- LHCb is still throttled
- Testing gridftp-internal externally showed that a configuration problem in the site firewall is causing some transfers to fail
- There are transfer problems to a new batch of disk servers at NDGF affecting only RAL
PreProd
- Incorrect firewall settings are preventing a number of new (ssv06-***) disk servers from transferring externally
Blocking issues
- Any ongoing production problems will jeopardize the timeline for starting 2.1.9 upgrades at the end of this month.
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Description | Start | End | Type | Affected VO(s) |
---|---|---|---|---|
Update LHCb to use local firewall | 13/03/2010 10:00 | 13/03/2010 12:00 | At-risk | LHCb |
Update Gen to use local firewall | 14/03/2010 10:00 | 14/03/2010 12:00 | At-risk | Gen |
Update CMS to use local firewall | 15/03/2010 10:00 | 15/03/2010 12:00 | At-risk | CMS |
Update ATLAS to use local firewall | 16/03/2010 10:00 | 16/03/2010 12:00 | At-risk | ATLAS |
Advanced Planning
- Upgrade to 2.1.9 2010
Staffing
- Castor on Call person: Shaun
- Staff absences:
- ..