2010
Revision as of 09:38, 9 August 2010 by Jeremy coles
UKI-SOUTHGRID-OX-HEP - Nagios outage
Description
Put a reasonable description of the event here. Make sure you state which services are affected.
Impact
Describe the type of impact. State which services and VOs were affected, how long they were impacted, and the dates. If data was lost, ensure this is clearly flagged.
Timeline of the Incident
When | What |
---|---|
Date and, optionally, time, e.g. 20th July 09:00 | Blah Team did something |
Incident details
Put a reasonably detailed description of the incident here.
Analysis
This section should include a breakdown of what happened. Include any related issues.
Follow Up
This is what we used to call future mitigation. Include specific actions to be taken. It is not necessary to use the table below, but it may be easier to do so.
Issue | Response |
---|---|
Issue 1 | Mitigation for issue 1. |
Issue 2 | Mitigation for issue 2. |
Related issues
List any related issues and provide links if possible. If there are none, remove this section.
Reported by: Your Name at date/time
Summary Table
Start Date | Date e.g. 20 July 2010 |
Impact | Select one of: >80%, >50%, >20%, <20% |
Duration of Outage | Hours, e.g. 3 hours |
Status | Select one of: Open, Understood, Closed |
Root Cause | Select one of: Unknown, Software Bug, Hardware, Configuration Error, Human Error, Network, User Load |
Data Loss | Yes/No |