UKI-LT2-RHUL

From GridPP Wiki
Revision as of 14:14, 25 July 2012 by Stephen jones (Talk | contribs)



Topic: HEPSPEC06


UKI-LT2-RHUL
CPU + cores                                                       | OS             | Kernel              | 32/64-bit | Memory | gcc   | Total HS06 | HS06/core | Node
Dual CPU, Intel Xeon 3.06 GHz                                     | RHEL3 update 9 | 2.4.21-57.ELsmp     | 32        | 1 GB   | 3.2.3 | 8.34       | 4.17      | ce1.pp.rhul.ac.uk
Dual CPU, quad core, Intel Xeon E5345 2.33 GHz (Clovertown)       | SL4.3          | 2.6.26.5 ELsmp      | 64        | 16 GB  | 3.4.6 | 58.26      | 7.3       | ce2.ppgrid1.rhul.ac.uk
Dual CPU, quad core, Intel Xeon E5345 2.33 GHz (Clovertown)       | SL5.5          | 2.6.18-164.11.1.el5 | 64        | 16 GB  | 4.1.2 | 62.89      | 7.9       | ce2.ppgrid1.rhul.ac.uk
Intel Xeon X5660 @ 2.80 GHz, 2 CPU x 6 cores, Dell C6100          | SL5.5          | 2.6.18-238.5.1.el5  | 64        | 24 GB  | 4.1.2 | 158        | 13.3      | cream2.ppgrid1.rhul.ac.uk
Intel Xeon X5660 @ 2.80 GHz, 2 CPU x 6 cores x 2 (HT), Dell C6100 | SL5.5          | 2.6.18-274.12.1.el5 | 64        | 48 GB  | 4.1.2 | 202        | 8.43      | cream2.ppgrid1.rhul.ac.uk

--Simon george 09:39, 16 Dec 2011 (GMT)

Test systems, not in production
CPU + cores                                                          | OS     | Kernel                  | 32/64-bit | Memory | gcc   | Total HS06 | HS06/core | Note
Dual CPU, quad core, Intel Xeon X5560 2.80 GHz (Nehalem), HT enabled | SLC5.3 | 2.6.18-128.2.1.el5      | 64        | 24 GB  | 4.1.2 | 135.42     | 16.92     | 8 processes (1 per core)
Dual CPU, quad core, Intel Xeon X5560 2.80 GHz (Nehalem), HT enabled | SLC5.3 | 2.6.18-128.2.1.el5      | 64        | 24 GB  | 4.1.2 | 307.94     | 19.24     | 16 processes (1 per hyperthread)
Dual CPU, quad core, Intel Xeon X5560 2.80 GHz (Nehalem), HT disabled| SLC5.3 | 2.6.18-128.2.1.el5      | 64        | 24 GB  | 4.1.2 | 138.82     | 17.35     | 8 processes (1 per core)
Dual CPU, Intel Xeon 3.06 GHz, no HT                                 | SLC4.8 | 2.6.9-89.0.3.EL.cernsmp | 32        | 1 GB   | 3.4.6 | 10.46      | 5.23      | 2 processes (1 per CPU)
AMD Phenom II X4 955 3.2 GHz                                         | SLC5.3 | 2.6.18-128.4.1.el5      | 64        | 4 GB   | 4.1.2 | 53.0       | 13.25     | 4 processes (4 cores on a single CPU)
AMD Athlon64 3000+                                                   | SLC4.8 | 2.6.9-89.0.9.EL.cern    | 64        | 2 GB   | 3.4.6 | 7.80       | 7.80      | 1 process (single-core CPU)
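The per-core figures in these tables are simply the measured HEPSPEC06 total divided by the number of parallel benchmark processes; a quick sketch (the helper name is illustrative, not part of any benchmark tooling):

```python
# Per-core HEPSPEC06 = total score / number of parallel benchmark copies.
# Figures taken from the tables above; per_core() is an illustrative helper.
def per_core(total_hs06, n_processes):
    return round(total_hs06 / n_processes, 2)

print(per_core(8.34, 2))    # ce1, dual-CPU Xeon 3.06 GHz -> 4.17
print(per_core(53.0, 4))    # Phenom II X4 955 -> 13.25
print(per_core(10.46, 2))   # dual Xeon 3.06 GHz, no HT -> 5.23
```

Note how this explains the hyperthreaded X5660 row: the node total rises (158 to 202) but the per-"core" figure falls, because the divisor is 24 hyperthreads rather than 12 physical cores.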


Topic: Middleware_transition

lcg-CE: need to plan for this by finding suitable hardware.

gLite3.2/EMI


  • ARGUS - 3.2.4-2
  • BDII_site - 3.2.11-1
  • CE (CREAM/LCG) - 3.2.10
  • glexec - 3.2.6-3
  • SE - 1.8.1
  • UI
  • WN - 3.2.11

Comments



Topic: Protected_Site_networking


  • Separate subnet - 134.219.225.0/24
  • Plan to setup dedicated monitoring like cacti etc.
  • Dedicated 1Gb/s link to Janet at IC - until April the link was shared with the college and capped at ~300 Mb/s
  • Network Hardware upgrade - 8x Dell PowerConnect 6248, double-stacked (2x48Gb/s + redundant loop) - SN: 4x1Gb/s bonded & VLAN, WN: 2x1Gb/s bonded (private network only)
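The dedicated subnet listed above can be sanity-checked with Python's standard ipaddress module; a minimal sketch (the host address used is an example, not an actual site machine):

```python
import ipaddress

# The site's dedicated subnet from the list above.
net = ipaddress.ip_network("134.219.225.0/24")

print(net.num_addresses)                              # 256 addresses in a /24
print(net.netmask)                                    # 255.255.255.0
print(ipaddress.ip_address("134.219.225.42") in net)  # True (example host)
```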


Topic: Resiliency_and_Disaster_Planning



      • This section intentionally left blank


Topic: SL4_Survey_August_2011

lcg-CE: need to plan for this by finding suitable hardware.

Topic: Site_information


Memory

1. Real physical memory per job slot: ce1: 0.5 GB; ce2: 2 GB

2. Real memory limit beyond which a job is killed: ce1: none; ce2: none

3. Virtual memory limit beyond which a job is killed: ce1: none; ce2: none

4. Number of cores per WN: ce1: 2; ce2: 8
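The per-slot figures follow directly from dividing each WN's RAM by its core count, using the values in the HEPSPEC06 tables above (ce1: 1 GB over 2 cores; ce2: 16 GB over 8 cores); a quick check:

```python
# Real memory per job slot = WN RAM / cores per WN.
# Values taken from this page's HEPSPEC06 tables; the helper is illustrative.
def mem_per_slot(total_gb, cores):
    return total_gb / cores

print(mem_per_slot(1, 2))   # ce1: 1 GB over 2 cores -> 0.5 GB
print(mem_per_slot(16, 8))  # ce2: 16 GB over 8 cores -> 2.0 GB
```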

Comments:

ce1's OS is RHEL3, so most VOs cannot use it.

ce2's OS is SL4.

Network

1. WAN problems experienced in the last year:

Firewall problems related to ce2/se2, now resolved.

2. Problems/issues seen with site networking:

The LAN switch stack came up wrongly configured after a scheduled power cut at the end of November. It was very difficult to debug remotely and was not resolved until early January.

3. Forward look:

See below.

Comments:

The ce2/se2 cluster is due to be relocated from IC to new RHUL machine room around mid-2009.
Power and cooling should be good. More work is required to minimise the impact on the network performance.

Currently at IC the cluster has a short, uncontended 1Gb/s connection direct to a PoP on LMN which has given very reliable performance for data transfers.
Current network situation at RHUL is a 1Gb/s link shared with all other campus internet traffic.
The point has been made that this move must not be a backward step in terms of networking, so RHUL CC are investigating options to address this.

Topic: Site_status_and_plans


SRM

Current status: DPM 1.8.3 in production since 15Dec11

Argus/glexec

Current status: Argus/glexec installed on all worker nodes and passing tests as of 26May11


Comments:

CREAM CE

Current status: cream2 in production