Difference between revisions of "Site monitoring status"

From GridPP Wiki
(Created page with "== Sites batch system status == This page has been setup to collect information from GridPP sites regarding their batch systems in February 2014. The information will help w...")
 
(Sites batch system status)
Line 1: Line 1:
== Sites batch system status ==  
+
== Sites monitoring status ==  
  
This page has been setup to collect information from GridPP sites regarding their batch systems in February 2014. The information will help with wider considerations and strategy. The table seeks the following:
+
This page is intended to gather together the tools that sites are currently using to monitor their local sites.
  
1) Current product (local/shared) - what is the current batch system at the site. Is it locally managed or shared with other groups?
+
1) Current solution(s): What tools are currently used at the site and what is their purpose?
  
2) Concerns - has your site experienced any problems with the batch system in operation?
+
2) Future plans: What plans (if any) does your site have for future monitoring?
  
3) Interest/Investigating/Testing - Does your site already have plans to change and if so to what. If not are you actively investigating or testing any alternatives?
+
3) Notes: Any other information you think might be useful.
 
+
4) CE type(s) - What CE type (gLite, ARC...) do you currently run and do you plan to change this, perhaps in conjunction with a batch system move?
+
 
+
5) Cloud interface(s)? - Does your site offer access to resources in ways other than via a CE?
+
 
+
6) Notes - Any other information you wish to share on this topic.
+
  
  
Line 23: Line 17:
 
|-style="background:#7C8AAF;color:white"
 
|-style="background:#7C8AAF;color:white"
 
|Site
 
|Site
|Current product (local/shared)
+
|Current solution(s)
|Concerns and observations
+
|Future plans
|Interest/Investigating/Testing
+
|CE type(s) & plans at site
+
|Cloud interface available/plans
+
 
|Notes
 
|Notes
  
 
|-
 
|-
|RAL Tier-1
+
|UKI-SCOTGRID-GLASGOW
|<span style="color:green">HTCondor (local)</span>
+
|Naemon (status and alerting), Ganglia/Graphite (metric graphing), Cacti (network)
|<span style="color:green">None</span>
+
|Dashboard (Dashing or similar), network monitoring
|<span style="color:green">No reason to change</span>
+
|We currently use Ganglia for system metrics and Graphite for a higher, cluster-level view
|<span style="color:green">ARC & CREAM CEs, but would like to decommission CREAM CEs eventually</span>
+
|<span style="color:green"></span>
+
|
+
  
 
|}
 
|}

Revision as of 12:57, 1 July 2014

Sites monitoring status

This page is intended to gather together the tools that sites are currently using to monitor their local sites.

1) Current solution(s): What tools are currently used at the site and what is their purpose?

2) Future plans: What plans (if any) does your site have for future monitoring?

3) Notes: Any other information you think might be useful.



{|
|-style="background:#7C8AAF;color:white"
|Site
|Current solution(s)
|Future plans
|Notes
|-
|UKI-SCOTGRID-GLASGOW
|Naemon (status and alerting), Ganglia/Graphite (metric graphing), Cacti (network)
|Dashboard (Dashing or similar), network monitoring
|We currently use Ganglia for system metrics and Graphite for a higher, cluster-level view
|}