Machines power off/reboots for maintenance

From GridPP Wiki

PLEASE DO NOT SWITCH OFF/REBOOT MACHINES FOR MAINTENANCE AND/OR UPGRADES:

a) Without planning.

b) Without opening a ticket describing what you are going to do. (Mandatory)

c) Without agreeing it with, or giving plenty of advance warning to, users and colleagues.

d) Without checking for, and warning, users who are logged on at the time.

e) In the case of worker nodes, without checking whether jobs are still running. If there are, place the node offline in PBS and wait until the jobs finish.
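Checks (d) and (e) can be partly scripted. The following is a minimal Python sketch, not an agreed site procedure. It assumes a Torque/PBS installation where `pbsnodes -o <node>` marks a node offline and `pbsnodes <node>` prints a `jobs = ...` line only while jobs are assigned to the node; the helper names (`logged_on_users`, `running_jobs`, `drain_node`) and the node name `wn01` are hypothetical.

```python
import subprocess
import time


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def logged_on_users(who_output):
    """Return the sorted, de-duplicated user names found in `who` output."""
    return sorted({line.split()[0] for line in who_output.splitlines() if line.strip()})


def running_jobs(pbsnodes_output):
    """Return the entries of the `jobs = ...` line, or [] if there is none.

    Assumption: the node attribute listing has one `key = value` pair per line.
    """
    for line in pbsnodes_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "jobs":
            return [job.strip() for job in value.split(",") if job.strip()]
    return []


def drain_node(node, runner=run, poll_seconds=300, sleep=time.sleep):
    """Mark `node` offline in PBS, then wait until no jobs remain on it."""
    runner(["pbsnodes", "-o", node])  # stop new jobs being scheduled on the node
    while running_jobs(runner(["pbsnodes", node])):
        sleep(poll_seconds)  # jobs still running: wait and check again


if __name__ == "__main__":
    print("Logged-on users:", logged_on_users(run(["who"])))
    drain_node("wn01")  # hypothetical worker node name
    print("wn01 is offline and idle")
```

The sketch marks the node offline before polling so that no new job can start between the check and the shutdown. Warning the listed users (for example with `wall`) is left as a manual step.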

There are cases where not all of the above apply, although procedures (a) and (b) should be followed every time. These are:

a) A node that is apparently up and running but has lost connectivity with the network for some reason.

b) A node that has crashed and is either powered up but unresponsive, halted, or powered down.
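The rules and their exceptions can be summarised as a small pre-flight check. This Python sketch is illustrative only. The class and function names are hypothetical, and it assumes that steps (c) to (e) are the ones waived for a disconnected or crashed node, which the text implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass
class Preparation:
    """Which of the steps (a) to (e) have been completed."""
    planned: bool = False                 # (a)
    ticket_opened: bool = False           # (b)
    users_agreed_or_warned: bool = False  # (c)
    logged_on_users_warned: bool = False  # (d)
    jobs_finished: bool = False           # (e), worker nodes only


def outstanding_steps(prep, is_worker_node=False, node_unusable=False):
    """Return the letters of the steps still to be done before a power off/reboot.

    node_unusable covers a node that has lost network connectivity or crashed.
    """
    steps = []
    if not prep.planned:
        steps.append("a")
    if not prep.ticket_opened:
        steps.append("b")
    if node_unusable:
        return steps  # assumption: only (a) and (b) apply in this case
    if not prep.users_agreed_or_warned:
        steps.append("c")
    if not prep.logged_on_users_warned:
        steps.append("d")
    if is_worker_node and not prep.jobs_finished:
        steps.append("e")
    return steps


if __name__ == "__main__":
    prep = Preparation(planned=True, ticket_opened=True)
    print(outstanding_steps(prep, is_worker_node=True))
    print(outstanding_steps(prep, node_unusable=True))
```

An empty list means the machine can be switched off or rebooted.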