From GridPP Wiki

UKI-SOUTHGRID-OX-HEP - Nagios outage


Put a reasonable description of the event here. Ensure you state which services are affected.


Describe the type of impact. Include which services and VOs were affected, how long they were affected, and the dates. If any data was lost, ensure this is clearly flagged.

Timeline of the Incident

When | What
Date & maybe time e.g. 20th July 09:00 | Blah Team did something

Incident details

Put a reasonably detailed description of the incident here.


This section should include a breakdown of what happened, including any related issues.

Follow Up

This is what we used to call future mitigation. Include the specific actions to be taken. It is not necessary to use the table below, but it may be easier to do so.

Issue | Response
Issue 1 | Mitigation for issue 1.
Issue 2 | Mitigation for issue 2.

Related issues

List any related issues and provide links if possible. If there are none, remove this section.

Reported by: Your Name at date/time

Summary Table

Start Date | Date e.g. 20 July 2010
Impact | Select one of: >80%, >50%, >20%, <20%
Duration of Outage | Hours e.g. 3 hours
Status | Select one of: Open, Understood, Closed
Root Cause | Select one of: Unknown, Software Bug, Hardware, Configuration Error, Human Error, Network, User Load
Data Loss | Yes/No