Difference between revisions of "PerfSonarInstall"

From GridPP Wiki

Revision as of 15:09, 31 August 2017

Warning, this is very much a WIP/draft.

Principles of PerfSonar

Unlike gridmon, the PerfSonar system doesn't use any central control; each endpoint is a free-standing server with its own configuration. There is a small element of centralisation in the shape of the 'registration service', which allows PerfSonar boxes to advertise their existence and the services they offer, and to consult the registry to discover other servers.

Each PerfSonar machine fulfils a dual role - both as a test target for other PerfSonar machines, and as an originator of tests which it runs against its administrator's choice of targets. For our deployment, we're using the common configuration of a pair of PS servers per site; one to run latency (i.e. essentially ping based) tests, and one to run bandwidth (i.e. iperf) tests. Since other people will be configuring their endpoints to run tests against yours, it may be helpful to give them meaningful names (for example, the Oxford ones are t2ps-latency.physics.ox.ac.uk and t2ps-bandwidth.physics.ox.ac.uk). Note that the name that PerfSonar boxes advertise is determined by a reverse DNS lookup on their IP address, so their names need to be 'real' ones; if you give the hardware generic names and then a 'friendly' alias, the advertising will use the underlying real name, so aliases should be avoided.
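The reverse DNS point above can be checked before installation. The following is a minimal sketch, not part of the PerfSonar toolkit: the function names are made up for illustration, and the lookup function is passed in as a parameter so the check can be tried without touching real DNS. By default it uses Python's standard socket.gethostbyaddr.

```python
import socket


def advertised_name(ip_address, reverse_lookup=socket.gethostbyaddr):
    """Return the primary hostname that a reverse DNS lookup gives for ip_address."""
    hostname, _aliases, _addresses = reverse_lookup(ip_address)
    return hostname.rstrip(".").lower()


def name_matches_reverse_dns(chosen_name, ip_address,
                             reverse_lookup=socket.gethostbyaddr):
    """Return True if chosen_name is the name reverse DNS returns for ip_address.

    A False result suggests chosen_name is only an alias, so the box would
    advertise a different (underlying) name.
    """
    return chosen_name.rstrip(".").lower() == advertised_name(ip_address, reverse_lookup)
```

Calling name_matches_reverse_dns with the name you intend to publish and the node's IP address should return True if the name is a 'real' one rather than an alias.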

PerfSonar supports the concept of 'communities', which are effectively arbitrary tags that appear in the registry of servers and make it easy to automatically select a subset of them. Any given node can be a member of as many communities as it wants, and a community is created simply by tagging a node. We will be using the 'GridPP' community tag, but may also use others (for example, the RAL Tier 1 endpoints are members of the LHCOPN community). Note that community names are case sensitive.
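To illustrate why the case sensitivity matters, here is a small sketch of selecting hosts by community tag. The data structure (a dict of host name to set of tags) and the example.org host names are hypothetical; the real registry has its own format.

```python
def hosts_in_community(hosts, community):
    """Return the names of hosts tagged with community (exact, case-sensitive match)."""
    return [name for name, tags in hosts.items() if community in tags]


# Hypothetical hosts, for illustration only.
example_hosts = {
    "ps-a.example.org": {"GridPP"},
    "ps-b.example.org": {"gridpp"},            # wrong capitalisation
    "ps-c.example.org": {"GridPP", "LHCOPN"},  # member of two communities
}

gridpp_hosts = hosts_in_community(example_hosts, "GridPP")
```

Here ps-b.example.org is not selected, because 'gridpp' and 'GridPP' are treated as different communities.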

PerfSonar Documentation Links

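The following links give further information on the software and deployment:

  • WLCG perfSonar Task Force: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment
  • Background Documentation for LHC experiments: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment#Background_Documentation_and_per
  • Global Host Dir (takes a long time to load): http://stats.es.net/ServicesDirectory/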

Installing PerfSonar

  • Get the installation image from the main PerfSonar downloads page. You're looking for the network install iso for the whole toolkit (or just use this direct link). Save the ISO to your desktop PC.
  • Connect to the node's iDRAC6 management card. You'll need a browser on a machine with Java and Java Web Start support, and the ability to connect to the iDRAC over the network.
  • Launch the virtual console, open the 'Virtual Media' tool from the menu, add the ISO image, and map it to the node's virtual CD drive.


screenshot goes here

  • On the latency box set the running services to 'all latency services'
  • On the bandwidth box set the running services to 'all bandwidth services'

Configuring PerfSonar

Registering your endpoints

At this point you should have a PerfSonar machine that is basically in full working order, but is not actually doing anything. The next thing to do is fill in the 'Toolkit Administration'->'Administrative Information' section with

  • Organization Name (e.g. UKI-SOUTHGRID-OX-HEP)
  • Host Location (e.g. Oxford, England)
  • Administrator Name
  • Administrator Email

Once this is done the node should register itself, making it visible to other sites to run tests against.

Joining Communities


On the page https://<your host>/toolkit/admin/administrative_info/ (the first link in the Toolkit Administration section of the left-hand navigation bar) you can see a list of available communities and join them.

(screenshot: perfSONAR communities)

Enabling Services


On the page https://<your host>/toolkit/admin/enabled_services/ (the fourth link in the Toolkit Administration section of the left-hand navigation bar) you can enable and disable services as a whole. For example, you can enable the services that allow others to run throughput tests against your host. The settings here will affect how your host publishes its capabilities in the communities you join.

(screenshot: perfSONAR services)

Scheduling tests against other sites

Follow the LHCONE notes to set up bandwidth, latency and traceroute tests.


Firewall handling

See the FAQs at http://psps.perfsonar.net/toolkit/FAQs.html (Q.6 in particular, but others are relevant too) and http://fasterdata.es.net/performance-testing/perfsonar/ps-howto/perfsonar-firewall-requirements/ for more detailed information on incoming and outgoing traffic, which some firewall admins might ask for.
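Once the firewall has been opened, a quick check from an outside machine can confirm that the control ports are reachable. This is a rough sketch only: the two port numbers below are assumed defaults (OWAMP control on 861, BWCTL control on 4823) and should be checked against the firewall requirements page linked above, which also lists the test port ranges that this sketch does not cover. The function names are illustrative.

```python
import socket

# Assumed default control ports; verify against the firewall requirements page.
ASSUMED_CONTROL_PORTS = {"owamp": 861, "bwctl": 4823}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports=ASSUMED_CONTROL_PORTS, timeout=3.0):
    """Return a dict mapping each service name to whether its TCP port is reachable."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}
```

Running check_ports against your latency or bandwidth host from outside the site firewall gives a per-service True/False result. A False only shows that the TCP connection failed, not why.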

Web Interface notes

  • After making configuration changes in the web interface, make sure to click the "Save" button at the bottom of the page; otherwise any changes will be lost when you navigate away, and the interface is inconsistent about warning you.

OS Configuration notes

  • By default logrotate is set up without compression. This is worth changing.
  • Similarly you may wish to install local monitoring (Nagios, Ganglia), ssh etc.
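As a quick way to see whether compression is already switched on, the sketch below scans the text of a logrotate configuration for the global 'compress' and 'nocompress' directives. It is deliberately simple: it ignores directives inside per-log { ... } stanzas and does not follow 'include' lines, so treat the result as a hint rather than a full parse. The function name is illustrative.

```python
def compression_enabled(config_text):
    """Return True if the last global compress/nocompress directive is 'compress'.

    Comments (text after '#') are ignored, as are directives inside
    per-log '{ ... }' stanzas.
    """
    enabled = False
    depth = 0
    for line in config_text.splitlines():
        text = line.split("#", 1)[0].strip()
        if text.endswith("{"):
            depth += 1
        elif text == "}":
            depth -= 1
        elif depth == 0 and text in ("compress", "nocompress"):
            enabled = (text == "compress")
    return enabled
```

It could be fed the contents of the main configuration file, typically /etc/logrotate.conf. If it returns False, uncommenting or adding a 'compress' line there is the usual fix.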

Upgrading to v3.3.1 and use of the WLCG mesh

Details of how to upgrade your perfSONAR hosts to v3.3.1 may be found here: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment#Upgrading_Existing_Instances and details of how to configure the mesh here: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment#Configuring_Sites_for_Participat

Sites progress

  • Installed: machine installed
  • Communities: at the very least join GridPP
  • Tests enabled: enable tests against all sites listed in the GridPP community
  • v3.3.1/new mesh: Site has upgraded to v3.3.1 of the perfSONAR toolkit and is using the latest mesh configuration
    • Review when new sites are added
  • UK Dashboard
Sites | Installed | Communities | Tests enabled | BNL dashboard | v3.3.1/new mesh | Notes or date for upgrade
RAL-LCG2 yes yes yes yes
UKI-LT2-Brunel yes yes yes yes
UKI-LT2-IC-HEP yes yes yes yes yes both hosts ipv6
UKI-LT2-QMUL yes yes yes yes yes
UKI-LT2-RHUL yes yes yes yes
UKI-LT2-UCL-HEP yes yes yes Yes Yes
UKI-NORTHGRID-LANCS-HEP yes yes yes yes yes
UKI-NORTHGRID-LIV-HEP yes yes yes yes yes
UKI-NORTHGRID-MAN-HEP yes yes yes yes
UKI-NORTHGRID-SHEF-HEP yes yes yes yes
UKI-SCOTGRID-DURHAM yes yes yes yes yes
UKI-SCOTGRID-ECDF yes yes yes yes
UKI-SCOTGRID-GLASGOW yes yes yes yes yes
UKI-SOUTHGRID-BHAM-HEP yes yes yes yes yes
UKI-SOUTHGRID-BRIS-HEP yes yes yes yes yes
UKI-SOUTHGRID-CAM-HEP yes yes yes yes yes
UKI-SOUTHGRID-OX-HEP yes yes yes yes yes latency IPv6
UKI-SOUTHGRID-RALPP yes yes yes yes
UKI-SOUTHGRID-SUSX yes yes yes yes Firewall issues