Difference between revisions of "Site monitoring status"

From GridPP Wiki

Revision as of 07:43, 2 July 2014

This page gathers together the tools that sites currently use to monitor their local systems. Please fill in the following details for your site:


1) Current solution(s): What tools are currently used at the site and for what purpose?

2) Future plans: What plans (if any) does your site have for future monitoring?

3) Notes: Any other information you think might be useful.


Site | Current solution(s) | Future plans | Notes
RAL Tier-1 | | |
UKI-LT2-Brunel | | |
UKI-LT2-IC-HEP | | |
UKI-LT2-QMUL | | |
UKI-LT2-RHUL | | |
UKI-LT2-UCL-HEP | | |
UKI-NORTHGRID-LANCS-HEP | | |
UKI-NORTHGRID-LIV-HEP | | |
UKI-NORTHGRID-MAN-HEP | | |
UKI-NORTHGRID-SHEF-HEP | | |
UKI-SCOTGRID-DURHAM | | |
UKI-SCOTGRID-ECDF | | |
UKI-SCOTGRID-GLASGOW | Naemon (status and alerting), Ganglia/Graphite (metric & time series graphing), Cacti (network monitoring) | Dashboards (Dashing or similar), reconsidering network monitoring | We currently use Ganglia for systems metrics, Graphite for a higher cluster level view
UKI-SOUTHGRID-BHAM-HEP | older cluster Ganglia & Pakiti; newer uses Nagios | Going to ditch Ganglia & Pakiti for nagios on older cluster. | Wish Munin scaled well!
UKI-SOUTHGRID-BRIS | | |
UKI-SOUTHGRID-CAM-HEP | | |
UKI-SOUTHGRID-OX-HEP | | |
UKI-SOUTHGRID-RALPP | | |
UKI-SOUTHGRID-SUSX | | |