Latest revision as of 14:25, 25 July 2012
UKI-SOUTHGRID-BHAM-HEP
Topic: HEPSPEC06
Correct as of April, 2012
Processor+cores | OS | Kernel | Kernel 32/64 | Compile 32/64 | Mem | gcc | Total HEPSPEC06 | HEPSPEC06 per core | Notes |
Dual Xeon 2.0GHz | SL4.5 | 2.6.9-78.0.1.ELsmp | 32 | 32 | 1GB | | 6.65 | 3.325 | |
Dual Xeon 3.0GHz | SL4.5 | 2.6.9-78.0.8.ELsmp | 32 | 32 | 2GB | | 10.1 | 5.05 | |
Dual 4-core Xeon E5450 3.0GHz | SL4.6 | 2.6.9-78.0.22.ELsmp | 32 | 32 | 16GB | | 72.8 | 9.1 | |
Dual 4-core Xeon E5450 3.0GHz | SL5.4 | 2.6.18-164.11.1.el5 | 64 | 32 | 16GB | 4.1.2 | 76.88 | 9.61 | |
Dual 2-core AMD2218 2.6GHz | SL4.7 | 2.6.18-92.1.13.el5 | 64 | 32 | 8GB | | 31.24 | 7.81 | |
Four 12-core AMD6234 | SL5.8 | 2.6.18-308.4.1.el5 | 64 | 32 | 96GB | 4.1.2 | 368.64 | 7.68 | Turbo disabled |
Four 12-core AMD6234 | SL5.8 | 2.6.18-308.4.1.el5 | 64 | 64 | 96GB | 4.1.2 | 453.6 | 9.45 | Turbo disabled, 64-bit |
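The per-core figures in the table appear to be the node total divided by the number of cores. The sketch below reproduces that arithmetic and also compares the 32-bit and 64-bit compiles on the AMD6234 nodes. The short labels are hypothetical names used only here, and the core counts are inferred from the processor descriptions (a "Dual Xeon" is assumed to have 2 cores in total), so treat this as an illustration rather than the method actually used to fill in the table.

```python
# HEPSPEC06 node totals copied from the table above, paired with core counts.
# Labels are hypothetical; core counts are inferred from the "Processor+cores"
# column (an assumption, e.g. "Dual Xeon" is taken to mean 2 cores in total).
benchmarks = {
    "xeon-2.0GHz-sl4.5": (6.65, 2),
    "xeon-3.0GHz-sl4.5": (10.1, 2),
    "e5450-sl4.6": (72.8, 8),
    "e5450-sl5.4": (76.88, 8),
    "amd2218-sl4.7": (31.24, 4),
    "amd6234-32bit": (368.64, 48),
    "amd6234-64bit": (453.6, 48),
}


def per_core(total, cores):
    """Return the HEPSPEC06 score per core, rounded to 3 decimal places."""
    return round(total / cores, 3)


per_core_scores = {
    name: per_core(total, cores) for name, (total, cores) in benchmarks.items()
}

# Ratio of the 64-bit to the 32-bit compile on the same AMD6234 hardware.
gain_64bit = benchmarks["amd6234-64bit"][0] / benchmarks["amd6234-32bit"][0]

if __name__ == "__main__":
    for name, score in per_core_scores.items():
        print(f"{name}: {score} HEPSPEC06 per core")
    print(f"64-bit / 32-bit ratio on AMD6234: {gain_64bit:.3f}")
```

With the table's numbers, the 64-bit compile scores roughly 23% higher than the 32-bit compile on the same AMD6234 node.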
Topic: Middleware_transition
- An overhaul is planned for the coming weeks (9/11 - 10/11), in which we hope to move all service nodes to a new set of hardware and retire the LCG CEs.
- All nodes run SL5 (5.4 or 5.5) except the LCG-CEs, which run SL4.
gLite3.2/EMI
ARGUS : gLite 3.2
BDII_site : gLite 3.2
CE (CREAM) : 2 x gLite 3.2
CE (LCG) : 2 x gLite 3.1
glexec : gLite 3.2
SE : gLite 3.2, DPM 1.8
UI : gLite 3.2
WMS (No WMS) : N/A
WN : gLite 3.2
Comments
- This is probably available somewhere and I just don't know about it, but a comprehensive, one-stop guide to the software versions that should be running, along with expected loads/requirements, would be *very* useful!
Topic: Protected_Site_networking
- The local cluster is on a well-defined subnet.
- The shared cluster is also on a subnet; however, this subnet also contains other parts of the cluster.
- Connection to JANET via the main University hub.
- We use Ganglia for the majority of online network monitoring.
Topic: Resiliency_and_Disaster_Planning
- This section intentionally left blank
Topic: SL4_Survey_August_2011
We are running our two LCG-CEs (epgr02 & epgr04) on gLite 3.1 and SL4. Our CREAM CEs seem OK, though, so if the consensus is to retire the LCG-CEs, I'm happy to!
(Not scheduled, but (effectively) planned)
Topic: Site_information
Memory
1. Real physical memory per job slot:
- PP Grid cluster: 2048MB/core
- eScience cluster: 1024MB/core
- Atlas cluster: 512MB/core
2. Real memory limit beyond which a job is killed: None
3. Virtual memory limit beyond which a job is killed: None
4. Number of cores per WN:
- PP Grid cluster: 8
- Mesc cluster: 2
- Atlas cluster: 2
Comments:
Network
1. WAN problems experienced in the last year: None
2. Problems/issues seen with site networking:
- DNS problems, a faulty GBIC, and several reboots of core switches in summer 08
- Broken switch connecting Mesc workers on 26/12/08 (second-hand replacement 100MB/s switch installed on 12/01/09)
- Networking between SE and WN is poor according to Steve's networking tests - ongoing investigation
3. Forward look:
- Replace 100MB/s switches with gigabit switches for the workers
Comments:
Topic: Site_status_and_plans
SL5 WNs
Current status (10/02/10): All WNs now running SL5.3
Planned upgrade: Complete.
SRM
Current status (27/10/09): DPM 1.7.2-4 on SL 4.6
Planned upgrade: Complete.
ARGUS/glexec
Current status (22/03/11):
Planned deployment: Deployed for the local cluster, still testing. Working on deploying for the shared cluster, but this requires glexec for a tarball WN release.
CREAM CE
Current status (22/03/11): Complete. Both clusters have a working CreamCE.