Example Build of an ARC/Condor Cluster

From GridPP Wiki

Introduction

A multi-core job is one that needs more than one processor on a node. Until recently, multi-core jobs were not used on the grid infrastructure. This has changed now that ATLAS and other large users have asked sites to enable multi-core on their clusters.

Unfortunately, enabling multi-core is not simply a matter of setting a parameter on the head node and sitting back while jobs arrive. Different grid systems have varying levels of support for multi-core, ranging from non-existent to virtually full support.

This report discusses the multi-core configuration at Liverpool. We decided to build a test cluster using one of the most capable batch systems currently available, HTCondor (or Condor for short). We also decided to front the system with an ARC/Condor CE.
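For orientation, here is a minimal sketch of how a multi-core request looks at the HTCondor level. It is illustrative only and is not taken from the Liverpool configuration: the executable name, the file names and the core count of 8 are hypothetical placeholders, and on the grid such requests normally reach the batch system through the CE rather than being submitted by hand.

```
# Hypothetical submit description for a multi-core job.
# run_multicore.sh and the output file names are placeholders.
universe     = vanilla
executable   = run_multicore.sh

# Ask the batch system for 8 cores on a single node.
request_cpus = 8

output       = job.out
error        = job.err
log          = job.log
queue
```

A file like this would be passed to condor_submit. The job only starts once a slot offering at least the requested number of cores is available, which is why the worker node slot configuration matters.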


Infrastructure/Fabric

The multi-core test cluster consists of an SL6 head node that runs the ARC CE and the Condor batch system. The head node has a dedicated set of 11 worker nodes of various types, providing a total of 96 threads of execution.

Head Node

The headnode is a virtual system running on KVM.

Head node hardware
Host Name              OS     CPUs  RAM   Disk Space
hepgrid2.ph.liv.ac.uk  SL6.4  8     2 GB  35 GB


Worker nodes

The physical worker nodes are described below.

Worker node hardware
Rack names  CPU type  CPUs per node  Slots used per CPU  Slots used per node  Total nodes  Total slots  HepSpec per slot  Total HepSpec  OS     RAM    Disk space
r21-n01-4   E5620     2              5                   10                   4            40           12.05             482            SL6.4  24 GB  1.5 TB