Difference between revisions of "PerfSonarInstall"
Revision as of 15:09, 31 August 2017
Warning, this is very much a WIP/draft.
Principles of PerfSonar
Unlike gridmon, the PerfSonar system doesn't use any central control: each endpoint is a free-standing server with its own configuration. There is a small element of centralisation in the shape of the 'registration service', which allows PerfSonar boxes to advertise their existence and the services they offer, and to consult the registry to discover other servers.
Each PerfSonar machine fulfils a dual role - both as a test target for other PerfSonar machines, and as an originator of tests which it runs against its administrator's choice of targets. For our deployment, we're using the common configuration of a pair of PS servers per site; one to run latency (i.e. essentially ping based) tests, and one to run bandwidth (i.e. iperf) tests. Since other people will be configuring their endpoints to run tests against yours, it may be helpful to give them meaningful names (for example, the Oxford ones are t2ps-latency.physics.ox.ac.uk and t2ps-bandwidth.physics.ox.ac.uk). Note that the name that PerfSonar boxes advertise is determined by a reverse DNS lookup on their IP address, so their names need to be 'real' ones; if you give the hardware generic names and then a 'friendly' alias, the advertising will use the underlying real name, so aliases should be avoided.
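Because the advertised name comes from reverse DNS, it can be worth checking what a reverse lookup actually returns before installing. The following is a minimal sketch using Python's standard `socket.gethostbyaddr`; the function names, the `reverse_lookup` parameter (there only so the check can be tried without touching real DNS) and the generic hostname in the usage note are illustrative assumptions, not part of PerfSonar.

```python
import socket


def advertised_name(ip_address, reverse_lookup=socket.gethostbyaddr):
    """Return the primary name that a reverse DNS lookup gives for ip_address."""
    primary_name, _aliases, _addresses = reverse_lookup(ip_address)
    return primary_name


def name_matches_reverse_dns(intended_name, ip_address,
                             reverse_lookup=socket.gethostbyaddr):
    """Return True if the reverse lookup for ip_address yields intended_name.

    The comparison ignores case and a trailing dot, as DNS names do.
    """
    def normalise(name):
        return name.rstrip(".").lower()

    actual = advertised_name(ip_address, reverse_lookup)
    return normalise(actual) == normalise(intended_name)
```

For example, if the reverse lookup for a box returns a generic name such as `node042.physics.ox.ac.uk` (hypothetical) while `t2ps-latency.physics.ox.ac.uk` is only an alias, `name_matches_reverse_dns("t2ps-latency.physics.ox.ac.uk", ip)` returns `False`, which is the situation the paragraph above warns about.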
PerfSonar supports the concept of 'communities', which are effectively arbitrary tags that appear in the registry of servers and make it easy to automatically select a subset of them. Any given node can be a member of as many communities as it wants, and a community is created simply by tagging a node. We will be using the 'GridPP' community tag, but may also use others (for example, the RAL Tier 1 endpoints are members of the LHCOPN community). Note that community names are case sensitive.
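To illustrate why the case sensitivity matters, here is a small sketch of selecting hosts by community tag. The dictionary layout and the `.example` hostnames are made up for illustration; this is not the registry's real data format or API.

```python
def hosts_in_community(hosts, community):
    """Return the names of hosts tagged with exactly this community.

    Matching is case-sensitive, so 'GridPP' and 'gridpp' are different tags.
    """
    return [name for name, tags in hosts.items() if community in tags]


# Hypothetical snapshot of registered hosts and their community tags.
example_hosts = {
    "t2ps-latency.physics.ox.ac.uk": {"GridPP"},
    "t2ps-bandwidth.physics.ox.ac.uk": {"GridPP"},
    "ps.tier1.example": {"GridPP", "LHCOPN"},
    "ps.typo.example": {"gridpp"},  # wrong case: not in the 'GridPP' community
}
```

With this data, selecting `"GridPP"` picks up the first three hosts but silently misses the one tagged `gridpp`, so it is worth typing the tag exactly as agreed.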
PerfSonar Documentation Links
The following links will give further information on the software and deployment.
- Platform Overview
- LHCOPN perfSONAR-PS setup
- Details of the perfSONAR-PS 3.2
- Release notes for 3.2.2
- BNL's Wiki
- BNL Perfsonar Dashboard UK view
- PS 3.1 FAQ (for firewall ports and other very interesting information)
- WLCG perfSonar Task Force
- Background Documentation for LHC experiments
- Global Host Dir (takes a long time to load)
Installing PerfSonar
- Get the installation image from the main PerfSonar downloads page. You're looking for the network install iso for the whole toolkit (or just use this direct link). Save the ISO to your desktop PC.
- Connect to the node's iDRAC6 management card. You'll need a browser on a machine with Java and Java Web Start support, and the ability to connect to the iDRAC over the network.
- Launch the virtual console, open the 'Virtual Media' tool from the menu, add the ISO image, and map it to the node's virtual CD drive.
- Boot the machine and follow the generic install instructions
- Once you've got a basically running box, log into the web interface
screenshot goes here
- On the latency box set the running services to 'all latency services'
- On the bandwidth box set the running services to 'all bandwidth services'
Configuring PerfSonar
Registering your endpoints
At this point you should have a PerfSonar machine that is basically in full working order, but is not actually doing anything. The next thing to do is fill in the 'Toolkit Administration'->'Administrative Information' section with:
- Organization Name (e.g. UKI-SOUTHGRID-OX-HEP)
- Host Location (e.g. Oxford, England)
- Administrator Name
- Administrator Email
Once this is done the node should register itself, making it visible to other sites to run tests against.
Joining Communities
At:
https://<your host>/toolkit/admin/administrative_info/
(first link in the Toolkit Administration section of the left-hand navigation bar) you can see a list of available communities and join them.
Enabling Services
At:
https://<your host>/toolkit/admin/enabled_services/
(fourth link in the Toolkit Administration section of the left-hand navigation bar)
you can enable and disable services as a whole. For example, you can enable the services that allow others to run throughput tests against your host. The settings here will affect how your host publishes its capabilities in the communities you join.
Scheduling tests against other sites
Follow the LHCONE notes to set up bandwidth, latency and traceroute tests:
https://twiki.cern.ch/twiki/bin/view/LHCONE/SiteList
Firewall handling
Look at the FAQs at http://psps.perfsonar.net/toolkit/FAQs.html, in particular Q.6 but also the others. For more detailed information on incoming and outgoing traffic, which some firewall admins might ask for, see http://fasterdata.es.net/performance-testing/perfsonar/ps-howto/perfsonar-firewall-requirements/
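Once the firewall has been opened, a quick check from an outside machine can confirm that the expected TCP ports are reachable. The sketch below uses only Python's standard `socket` module. The port list is deliberately left as a parameter: take the actual numbers from the firewall requirements linked above. It exercises TCP only, so any UDP ranges in those requirements would need a different check.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port number to True (reachable) or False (refused, filtered or timed out)."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("<your host>", ports)` with the TCP ports from the linked requirements returns a dictionary that can be pasted into a ticket for the firewall admins.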
Web Interface notes
- After making configuration changes in the web interface, make sure to click the "Save" button at the bottom of the page. Otherwise any changes will be lost when you navigate away, and the interface is inconsistent about warning you.
OS Configuration notes
- By default logrotate is set up without compression. This is worth changing.
- Similarly you may wish to install local (nagios, ganglia) monitoring, ssh etc.
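As a rough way to check the log rotation setting, the sketch below reports whether compression is switched on globally in a logrotate configuration. It assumes that 'log rotate' above means the standard logrotate utility with its `compress` and `nocompress` directives, and that the global configuration lives in `/etc/logrotate.conf` (an assumption; check your install). It is a simplified reading of the file, not a full parser.

```python
def compression_enabled(config_text):
    """Return True if the global section of a logrotate config enables compression.

    Only top-level directives are considered; lines inside { ... } stanzas are
    skipped, and the last top-level compress/nocompress directive wins.
    """
    enabled = False
    depth = 0
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.endswith("{"):
            depth += 1  # entering a per-log stanza
            continue
        if line == "}":
            depth = max(depth - 1, 0)
            continue
        if depth == 0:
            if line == "compress":
                enabled = True
            elif line == "nocompress":
                enabled = False
    return enabled
```

A typical use would be `compression_enabled(open("/etc/logrotate.conf").read())`. If it returns `False`, adding an uncommented `compress` line to the global section is the usual way to turn compression on.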
Upgrading to v3.3.1 and use of the WLCG mesh
Details of how to upgrade your perfSONAR hosts to v3.3.1 may be found here: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment#Upgrading_Existing_Instances and details of how to configure the mesh here: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment#Configuring_Sites_for_Participat
Sites progress
- Installed: machine installed
- Communities: at the very least join GridPP
- Tests enabled: tests are enabled against all sites listed in the GridPP community
- v3.3.1/new mesh: Site has upgraded to v3.3.1 of the perfSONAR toolkit and is using the latest mesh configuration
- Review when new sites are added
- UK Dashboard
Sites | Installed | Communities | Tests enabled | BNL dashboard | v3.3.1/new mesh | Notes or date for upgrade |
---|---|---|---|---|---|---|
RAL-LCG2 | yes | yes | yes | yes | | |
UKI-LT2-Brunel | yes | yes | yes | yes | | |
UKI-LT2-IC-HEP | yes | yes | yes | yes | yes | both hosts ipv6 |
UKI-LT2-QMUL | yes | yes | yes | yes | yes | |
UKI-LT2-RHUL | yes | yes | yes | yes | | |
UKI-LT2-UCL-HEP | yes | yes | yes | yes | yes | |
UKI-NORTHGRID-LANCS-HEP | yes | yes | yes | yes | yes | |
UKI-NORTHGRID-LIV-HEP | yes | yes | yes | yes | yes | |
UKI-NORTHGRID-MAN-HEP | yes | yes | yes | yes | | |
UKI-NORTHGRID-SHEF-HEP | yes | yes | yes | yes | | |
UKI-SCOTGRID-DURHAM | yes | yes | yes | yes | yes | |
UKI-SCOTGRID-ECDF | yes | yes | yes | yes | | |
UKI-SCOTGRID-GLASGOW | yes | yes | yes | yes | yes | |
UKI-SOUTHGRID-BHAM-HEP | yes | yes | yes | yes | yes | |
UKI-SOUTHGRID-BRIS-HEP | yes | yes | yes | yes | yes | |
UKI-SOUTHGRID-CAM-HEP | yes | yes | yes | yes | yes | |
UKI-SOUTHGRID-OX-HEP | yes | yes | yes | yes | yes | latency IPv6 |
UKI-SOUTHGRID-RALPP | yes | yes | yes | yes | | |
UKI-SOUTHGRID-SUSX | yes | yes | yes | yes | | Firewall issues |