RAL Tier1 Grid Team Actions

From GridPP Wiki

This is a wiki area for open Tier-1 Grid Services Team actions.

Upcoming service disruptions/alterations/interventions

Once we are ready to perform these interventions, appropriate notification should be made as per the intervention procedure. In the meantime they are noted here so that others can see what could potentially affect them.


Type corresponds to the EGEE intervention classification (A-D), or n/a if the intervention should be transparent.


Services with per VO component

Service Type | Planned Intervention | Alice | Atlas | CMS | LHCb | Others | Type
CE | Migrate non-LHC VOs to new CE | n/a | n/a | n/a | n/a | No date yet |
Experiment Software Areas | Migrate VOs from common area to dedicated host | ? | Done | Done | Done | ? |

Generic Grid Services

Service Type | Planned Interventions | Date | Type
WN | Move gLite software to AFS | ? |
WN | Migrate to SL5/64-bit | ? |

Ongoing Actions List

This list is obsolete; actions are now tracked in the Footprints helpdesk.

See also RAL Tier1 Grid Team Closed Actions.

Action ID prefix | Status
G = From Grid Services Team Meeting | Open = Action has been created
T = Added by Team members or Team Leader | Progress = Action is being worked on
P = Created by other project members | Closed = Action is complete
R = Created by UKI ROC/Production Manager | Rejected = Action is rejected


Action ID | Priority | Owner | Action Title | Target date | Status | Date closed | Notes
G-89 | Medium | Catalin | Coordinate off-site replication of LFC/FTS (also disaster recovery testing) | | Open | | Arrange meeting and generate plan as to how we set up a second LFC/FTS out of ATLAS building. [17/10/08] Meeting held, will write up plan. [21/11/08] A draft plan exists, needs some refining. [12/12/08] Need to coordinate with Database Services team. [16/1/09] Catalin to send plan to Matt. [30/1/09] Awaiting input from Andrew and/or DB team. [2009/03/06] DB group to circulate proposals for off-site replication to elicit feedback. [2009/05/08] Meeting scheduled for Thursday to discuss options. [2009/05/15] Will use old db servers for failover. [2009/05/22] DB Service have set up a streaming service between old db servers, will switch to production once we are convinced it is working correctly. [2009/06/19] Debugging issues. [2009/06/26] Now unable to make any more progress on this for a month. [2009/09/11] Meeting on 14th with DB team to discuss. [2009-10-02] Request for two boxes for Data Guard testing (one-node RAC master to single-server slave); RT ticket submitted. [2009/10/23] Still need hardware.
G-111 | Medium | Derek | Find alternative solution to backing up helpdesk db tables | | Open | | mysqlhotcopy can't copy the table type used by helpdesk, need to find another solution. [2009/07/31] Dump takes 6 minutes but locks tables, will try with different options to see if any improvement. [2009/08/07] Different options do not lock table, will implement a backup cron job. [2009/8/14] Backup cron job implemented, need to test restore. [2009/10/23] Imported successfully, need to check that RT is happy with it.
T-115 | Medium | Derek | Add VO Shares to Information System | | Open | | [2009/10/23] Have script to publish this, but it overwrites another part of the info system so not in use at the moment.
T-118 | Low | Derek | Check if publishing dummy clusters still required by LHCb | | Open | | [2009/12/11] Mailed Roberto Santinelli who forwarded me on, still awaiting response.
T-119 | Medium | Derek | Check ALICE requirements for jobs with lcgadmin VOMS role | | Open | | [2009/10/23] Moved to Derek. Need to reconfigure CREAM CE, waiting for opportunity to implement. [2009/12/11] Plan to roll out in January.
T-120 | Medium | Matt/Derek | Atlas | | New | | [2009/11/20] Investigate fixes for Atlas install jobs blocking Atlas SAM test.
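
Illustration for G-111: the notes say that a dump with "different options" no longer locks the tables, but do not record which options were used. The sketch below shows one way such a backup could look, assuming the helpdesk tables use a transactional engine (e.g. InnoDB), for which mysqldump's --single-transaction takes a consistent snapshot without table locks. The database name, credentials file and backup directory are hypothetical placeholders, not the values used at RAL.

```python
import datetime
import subprocess
from pathlib import Path


def build_dump_command(database, defaults_file):
    """Return a mysqldump argument list intended to avoid table locks.

    Assumption: the tables are transactional (e.g. InnoDB); the options
    actually chosen for G-111 are not recorded in the notes.
    """
    return [
        "mysqldump",
        f"--defaults-file={defaults_file}",  # must come first; keeps credentials off the command line
        "--single-transaction",  # consistent snapshot inside one transaction
        "--quick",               # stream rows instead of buffering whole tables
        "--skip-lock-tables",    # do not issue LOCK TABLES before dumping
        database,
    ]


def dump_path(backup_dir, database, today):
    """Return a dated output path, e.g. <backup_dir>/<database>-YYYY-MM-DD.sql."""
    return Path(backup_dir) / f"{database}-{today.isoformat()}.sql"


def run_backup(database, defaults_file, backup_dir):
    """Run the dump and write it to a dated file; raises if mysqldump fails."""
    target = dump_path(backup_dir, database, datetime.date.today())
    with open(target, "wb") as out:
        subprocess.run(build_dump_command(database, defaults_file),
                       stdout=out, check=True)
    return target
```

A cron job could call run_backup("helpdesk", "/path/to/backup.cnf", "/path/to/backups") nightly (all three arguments are placeholders). As the notes point out, a backup is only proven once a restore has been tested, so the dump should periodically be imported into a scratch database and checked.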