Line 76: |
Line 76: |
| <!-- *********************************************************** -----> | | <!-- *********************************************************** -----> |
| <!-- ***********************Start ops coord text*********************** -----> | | <!-- ***********************Start ops coord text*********************** -----> |
− | '''Thursday 7th May''' | + | '''Tuesday 26th May''' |
− | * The [https://indico.cern.ch/event/392739/ agenda]. [https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMinutes150507 Minutes] | + | * [https://twiki.cern.ch/twiki/bin/view/LCG/WLCGOpsMinutes150521 Minutes of the 21st May WLCG ops coordination meeting are available]. A snapshot of the updates: |
− | * News: Alessandra will present the WLCG workshop conclusions at next week's GDB.
| + | * Baselines: New EMI release with StoRM 1.11.8. dCache 2.10.28/2.12.8 verified and set as baseline. Torque 2.5.13 added to baselines table. |
− | * Middleware news: UMD 3.12.0 released this week (fixes for ARGUS-PAP and dCache server)
| + | * MW issues: NTR. |
− | * Middleware baselines: dCache 2.6.x removed. New version 2.10.28/ 2.12.8 of dCache. Sites should avoid simultaneous updates. | + | * T0 services: CASTOR updated to 2.1.15. SRM validation ongoing. xroot is the main access protocol. |
− | * Middleware issues: major upgrade of torque arrived in EPEL (from torque-2.5.7 to torque-4.2.10) which is not compatible with the standard EMI torque installation. For sites that have upgraded, the patched 2.5.13 version of torque has been pushed to the EMI third-party repo to allow a downgrade. | + | * T1 services: KIT, IN2P3 and RRC-KI-T1 dCache upgrades. |
− | * T0 & T1 upgrades: FTS 3.2.33 upgraded at CERN & RAL. | + | * T0 news: LFC to be stopped 22nd June. CMS pilot role mappings changed from static to pool account mapping (for tests and scheduling to work properly). |
− | * T0 news: batch HTCondor pilot is open for grid submission. Lower-than-usual WLCG availability figures in March for Atlas and CMS - possible overload. | + | |
| * T1 feedback: NTR | | * T1 feedback: NTR |
| * T2 feedback: NTR | | * T2 feedback: NTR |
− | * OS support in UMD: Plans in EGI for CentOS7 support. 13 products are ready for EPEL7, but in general CentOS7 is not a viable option for sites. The release of UMD4 (supporting EPEL7 and Ubuntu) is foreseen for September 2015 and the decommissioning of SL5 for March 2016. It is likely that some products relevant for WLCG will not be ready for EPEL7 before 2016. The requirement for WLCG is to provide SL6 until the end of Run2, however, there are already offers for resources on CentOS7 and this is an incentive for experiments to validate their software on it.
| + | * ALICE: Normal-high activity. No ops issues. |
− | * ALICE: CASTOR at CERN - some re-reco job instabilities. | + | * ATLAS: >200k running jobs - requires single/multi-core job mix. Final data transfer tests this week. Request T1s to avoid major downtimes until the summer. |
− | * ATLAS: ~running full. Considering increasing job lengths for all MCORE jobs. Need sites to provide MCORE resources. Rucio/FTS issue was discovered - fix via update. Tier-0 data and computing workflow fully commissioned. | + | * CMS: Started Run2 DIGI-RECO. Several EOS issues. Increased use/dependence on global xrootd re-director at CERN (so criticality impact has increased). |
− | * CMS: CMS production activities continue - Several sites reported network saturation. Evaluating the use of selected "strong" Tier-2 sites to add computing capacity for DIGI-RECO. Plan to drop support of CRC32 checksum in CMS data transfer systems. | + | * LHCb: Computing workshop last week. Problems with SARA SRM. |
− | * LHCb: Various operational issues reported - CASTOR/CERN SRM access problems; other data access issues.
| + | * glexec: NTR |
− | * gLExec: ATLAS 61 out of 94 sites. RAL, RALPP and TW-FTT issue was due to a bug in the pilot code that showed up with ARC CE + Condor sites. | + | * RFC proxies: SAM-Nagios refresh_proxy probe fixed in UMD-3. Once sites check ok, central services of the experiments will be checked. |
− | * SHA-2: old VOMS server aliases (lcg-)voms.cern.ch were removed on Tue Apr 28. | + | * Machine job features: NTR |
− | * RFC proxies: RFC proxy readiness to be followed up per experiment. SAM-Nagios proxy renewal code fix to support RFC proxies. | + | * MW readiness: dCache v2.10.28 done. StoRM in progress. Progress on [http://indico.cern.ch/event/392665/contribution/3/material/slides/1.pdf MW Readiness App]. |
− | * Machine/Job features: NTR | + | * Multi-core: |
− | * MW readiness: 10th meeting on 6th [http://indico.cern.ch/event/392665/ agenda]. WG is making a check-point of goals and priorities. ARGUS testbed at CERN is set-up and ready to start. Pakiti client requested at other test sites. | + | ** ATLAS goal is 80% production resources MC enabled. Looking to increase job lengths to 10-15hrs for 8-core slots to improve efficiency (also looking at a global fairshare). Sites should give more priority to multi-core over single core jobs. Sites that dynamically cap should adjust their cap to respect the 80% share. |
− | * MC deployment: NTR | + | ** CMS consolidating deployment to T1s. Looking into multicore pilot inefficiencies. |
− | * IPv6: LHCb: DIRAC was made IPv6-compatible back in November, but testing only started in April. Issue found at CERN with a Python library (wrong IPv6 address returned).
| + | * IPv6: NTR |
− | * Network/Transfers WG: NTR | + | * Squid monitoring/HTTP proxy: No news. |
− | * HTTP deployment: perfSONAR - Security: NDT 3.7.0.1 was released. The latest perfSONAR Toolkit version that all sites should be running is 3.4.2-12.pSPS. Network performance incidents process put in place as was agreed at the last meeting. OSG/Datastore validation progressing well. Publishing results to message bus progressing, development has finalized for esmond2mq prototype. [https://indico.cern.ch/event/382623/ Recent meeting] focussed on FTS performance. [https://indico.cern.ch/event/382624/ Next meeting 3rd June]. Plan is to focus it on latency ramp up and proximity service. | + | * Network & transfer metrics WG: |
| + | |
| + | |
| | | |
| <!-- **********************End ops coord text************************** -----> | | <!-- **********************End ops coord text************************** -----> |