Difference between revisions of "RAL Tier1 Experiments Liaison Meeting"

From GridPP Wiki
 
 
(381 intermediate revisions by 11 users not shown)
Line 2: Line 2:
  
 
Covers all aspects of the Tier1. Meeting access information is available from  
 
Covers all aspects of the Tier1. Meeting access information is available from  
[https://indico.cern.ch/event/307502/ Indico]. Previous special presentations can be found [http://www.gridpp.ac.uk/wiki/RAL_Tier1_Experiments_Liaison_Meeting_Presentations here].
+
[https://indico.cern.ch/category/4562/ Indico]. Previous special presentations can be found [http://www.gridpp.ac.uk/wiki/RAL_Tier1_Experiments_Liaison_Meeting_Presentations here].
  
 
== Agenda ==
 
== Agenda ==
  
'''Chairman:''' Andrew Sansum
+
'''Chair:''' Darren Moore/Alastair Dewhurst
  
'''Secretary:''' Andrew Lahiff
+
'''Secretary:''' Brian Davies
  
# Summary of [[RAL_Tier1_Experiments_Liaison_Meeting_Operations_Reports|Operational Status and Issues]]
+
# Major Incidents/Changes
# Highlights/summary of the Tier1 Monday operations meeting.
+
#* Summary of [[RAL_Tier1_Experiments_Liaison_Meeting_Operations_Reports|Operational Status and Issues]]
#* [[RAL_Tier1_weekly_operations_Grid|Grid Services]]
+
#* Highlights for Operations_Bulletin_Latest
#* [[RAL_Tier1_weekly_operations_Fabric|Fabric]]
+
# Experiment operational issues
#* [[RAL_Tier1_weekly_operations_castor|CASTOR]]
+
#* GGUS#RT Tickets
#* [[RAL_Tier1_weekly_operations|Other]]
+
#* VOs present
# [[RAL_Tier1_CASTOR_planning#KEY_DATES|Experiment plans]] and operational issues
+
#* Issues raised through other methods
#* CMS
+
# Experiment planning
#* <font face="Comic Sans MS">ATLAS</font>
+
#* DUNE/protoDUNE
#* LHCb
+
#* EUCLID
#* ALICE
+
#* SKA - https://indico.cern.ch/event/724632/timetable/#20180426
#* Others
+
#* Echo Quotas - https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/EchoQuotaManagement
# Special topics/presentations (agreed in advance)
+
# Continual improvement of T1 procedures
#* None
+
#*
# Actions
+
# Highlights for [[Operations_Bulletin_Latest|Operations Bulletin Latest]]
+
 
# AoB
 
# AoB
 +
 +
== Incubators ==
 +
[https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator VO List]
  
 
== Open Actions ==
 
== Open Actions ==
Line 32: Line 33:
 
{| border=1 align=center
 
{| border=1 align=center
 
|- bgcolor="#7c8aaf"
 
|- bgcolor="#7c8aaf"
! Action ID !! Priority !! Experiment(s) !! Owner !! Action !! Status
+
! Action ID !! Priority !! Experiment(s) !! Owner !! Action !! Status
|-
+
| 20140312-01 || Normal || ALICE || Lee || Check ALICE plans for tape access ||
+
|-
+
| 20140225-01 || Normal || MICE || Catalin || Investigate why it takes 5 hours for MICE to unpack a tarball ||
+
|-
+
| 20140225-02 || Normal || MICE || Catalin || Give MICE access to all WMSs at RAL ||
+
|-
+
| 20140205-01 || Normal || Non-LHC || Catalin || Ensure that non-LHC VOs are aware of alternatives to the NFS software server || Ongoing
+
|-
+
|}
+
 
+
== Completed Actions ==
+
 
+
Archives of actions completed can be found at:
+
 
+
* [[RAL Tier1 CASTOR Experiments Completed Actions 2007]]
+
* [[RAL Tier1 CASTOR Experiments Completed Actions 2008]]
+
* [[RAL Tier1 CASTOR Experiments Completed Actions 2009]]
+
* [[RAL Tier1 CASTOR Experiments Completed Actions 2010]]
+
* [[RAL Tier1 CASTOR Experiments Completed Actions 2011]]
+
 
+
{| style="background:#D6D4E3" border="1" align="center"
+
|- bgcolor="#7c8aaf"
+
! Action ID !! Priority !! Experiment(s) !! Owner !! Action !! Status !! Completed date
+
|-
+
| 20131023-03 || Normal || ATLAS || Matthew || Report back about ATLAS CASTOR deletion problem after F2F discussion with developers || Closed. || 2014-01-08
+
|-
+
|20131023-01  || Normal || N/A || Catalin || Send Henry details about UI baseline version for SHA-2 compliance || Done. || 2013-10-30
+
|-
+
| 20120111-01 || Normal || CMS || Andrew L, Chris B || Find out what's happening about disk/tape separation || Done. ||  2012-01-18
+
|-
+
| 20111214-01 || Normal || Non-LHC || Andrew L || Add Chris Walker to mailing list || Done. || 2012-01-18
+
|-
+
| 20120118-01 || Normal || ALICE || Alastair || Provide full list of LFNs on T0D1 to Lee || Done || 2012-01-25
+
|-
+
| 20111221-01 || Normal || LHC || Brian D || Discuss delegation of FTS channel management by T2 sys-admins. || Decided that we should retain control. || 2012-02-08 
+
|-
+
| 20120125-01 || Normal ||  || Andrew S || Review GGUS ticket 77026 in advance of next week's meeting. || Not done. || 2012-02-08
+
|-
+
| 20120229-03 || Medium || || Andrew S || Talk to Ian about possibility of using perfsonar for validating the new OPN subnet || Done. Ian is working on it. || 2012-03-07
+
|-
+
| 20120229-04 || Medium || MICE || Andrew L || Send list of closed MICE tickets to Henry || Done. List sent to Henry || 2012-03-14
+
|-
+
| 20120229-01 || Medium || || Andrew S || Discuss strategy for funding LSF in 2012 with CASTOR team || No longer necessary, since an LSF license has been purchased for the rest of the year. || 2012-03-22
+
|-
+
| 20120321-05 || Medium || CMS || Andrew L || Find out if RAL can start using CVMFS || Done. Can't yet move to CVMFS, but imminent. || 2012-03-28
+
|-
+
| 20120321-03 || Medium || ALICE || Shaun || Determine the age distribution of ALICE files on aliceTape || Done. Files even 2 years old are still staged. || 2012-03-28
+
|-
+
| 20120328-01 || Medium || All || Gareth || Create a deployment schedule for 2011 CPU and check MoU commitments || Done. All 2011 CPU in production. || 2012-04-04
+
|-
+
| 20120321-01 || Medium || ALICE || Lee, Shaun || Find out about the load on CASTOR from Japan || Closed. No longer relevant. || 2012-04-25
+
|-
+
| 20120321-02 || Medium || ALICE || Chris K || Look for any correlation between ALICE CPU efficiency and LSF efficiency || Closed. No longer relevant. || 2012-04-25
+
|-
+
| 20120321-04 || Medium || ZEUS || Gareth || Contact ZEUS representatives about low CPU efficiencies || Closed. Emails sent. || 2012-04-25
+
|-
+
| 20120404-01 || Medium || LHCb || Gareth || Make sure xrootd pre-staging back door is closed || Done. || 2012-04-25
+
|-
+
| 20120502-01 || Medium || MICE || Shaun || Check permissions || Done by Shaun. || 2012-05-09
+
|-
+
| 20120425-01 || Medium || || Gareth || Review batch system limits || Done. Limits have been removed or increased. || 2012-05-23
+
|-
+
| 20120229-02 || Medium || || Andrew S || Ensure that the new OPN subnet in the Tier-1 has the correct routing across the WAN || Closed. || 2012-06-27
+
|-
+
| 20120509-01 || Medium || || Gareth || Circulate information about gridTest queue || Closed. Replaced with new action. || 2012-07-12
+
|-
+
| 20120627-01 || Medium || NA62 || Alastair || Clarify NA62 storage requirements || Done. || 2012-07-12
+
|-
+
| 20120627-02 || Medium || MICE || Shaun || Check permissions and ownership of MICE directories in CASTOR || Done. || 2012-07-12
+
|-
+
| 20120712-01 || Medium || All || Orlin || After setting up some test EMI-2 worker nodes, contact VO reps about testing. || Postponed || 2012-08-01
+
|-
+
| 20120822-01 || Medium || LHCb || Andrew L || Provide details about the gridTest queue to Raja || Done || 2012-08-29
+
|-
+
| 20120815-01 || Medium || || Andrew L || Ask Martin L about coordination of EMI-2 worker nodes || Done || 2012-09-05
+
|-
+
| 20120905-01 || Medium || MICE || Shaun || Investigate migration from tape1 to tape2 || Done || 2012-10-03
+
|-
+
| 20121003-01 || Medium || N/A || Gareth || Check if CA 1.50 certificates have been distributed everywhere || Closed || 2012-10-17
+
|-
+
| 20121003-02 || Medium || biomed || Gareth || Make sure that GGUS 85077 is given an owner.|| Closed || 2012-10-17
+
|-
+
| 20120530-01 || Medium || ALICE || Shaun || Ask ALICE if they can remove files from CASTOR after unsuccessfully trying to put files in || Done || 2012-12-19
+
|-
+
| 20130109-01 || Medium || ALICE || Andrew L, Shaun || Check ALICE tape usage & allocation. || Done || 2013-01-16
+
|-
+
| 20130116-01 || Medium || T2K || Alastair, Shaun || Talk to Jonathan about T2K storage. || Done || 2013-01-23
+
|-
+
| 20130116-01 || Medium || T2K || Alastair || Write document: Idiot's guide to storage at RAL for non-LHC VOs. || Done. || 2013-01-30
+
|-
+
| 20130123-01 || Medium ||  || Andrew S || Try to ensure that Friday's electrical work is delayed || Closed || 2013-01-30
+
|-
+
| 20130206-01 || Medium || ALICE || Rob || Check consistency & accuracy of ALICE disk reporting || Closed || 2013-02-20
+
|-
+
| 20130306-01 || Medium || MICE || Henry || Henry to email production team about MICE computing contacts. || Done. || 2013-03-13
+
|-
+
| 20130313-02 || Medium ||  || Andrew L || Set up a place for previous special presentations. || Done. || 2013-04-03
+
|-
+
| 20130220-01 || Medium || T2K || Gareth || Post-mortem on gdss594 || Done. || 2013-04-10
+
|-
+
| 20130313-01 || Medium || ATLAS || Alastair || Make sure ATLAS GGUS ticket about CASTOR problems affecting FTS is up-to-date || Closed || 2013-05-01
+
|-
+
| 20130123-01 || Medium ||    || Gareth || Ensure that the problem of CRL expiry is addressed || Closed || 2013-09-18
+
|-
+
| 20140108-01 || Normal || ALICE || Gareth || Why was the ALICE VOBOX rebooted 19 days ago? What happened to it? || Closed || 2014-02-05
+
 
|-
 
|-
| 20131023-02 || Normal || LHCb || Jens || Explain why SRM and CIP/BDII usage for LHCb are different and inform Raja which to use. || Closed || 2014-02-12
+
| - || - || - || -|| - || -
 
|-
 
|-
| 20140219-01 || Normal || N/A || John || Start post-mortem of loss of FTS3 database VM || Closed || 2014-02-25
 
 
|}
 
|}

Latest revision as of 12:33, 24 July 2019


Covers all aspects of the Tier1. Meeting access information is available from Indico. Previous special presentations can be found here.

Agenda

Chair: Darren Moore/Alastair Dewhurst

Secretary: Brian Davies

  1. Major Incidents/Changes
  2. Experiment operational issues
    • GGUS#RT Tickets
    • VOs present
    • Issues raised through other methods
  3. Experiment planning
  4. Continual improvement of T1 procedures
  5. AoB

Incubators

VO List: https://www.gridpp.ac.uk/wiki/GridPP_VO_Incubator

Open Actions

Action ID | Priority | Experiment(s) | Owner | Action | Status
-         | -        | -             | -     | -      | -