PoC: Elena K, Daniela B (on emails please also cc David Colling)
Elena is the production manager.
UKDC == Imperial
The LZ UK Data Centre, which is based on GridPP CPU resources, passed its Final Design Review on 10.03.2017.
Imperial will host a complete copy of the LZ data (GridPP funding permitting). An outline of this can be found in the LZ conceptual design report: CDR (page 258/Chapter 15).
Mock Data Challenge 2 has finished. UKDC successfully ran half of the production and reconstruction jobs and all preliminary steps. Several problems were found, and these are now being tested. MDC3 is planned for April-September 2019. Preparation will start soon.
Mock Data Challenge 2 is in progress. UKDC has finished production of 3 months of background (the other 3 months were done at USDC) and calibration data. It is now in the reprocessing stage.
After preparation in January-March, LZ started production for Mock Data Challenge 2 (MDC2). The goal of MDC2 is to produce 6 months of livetime and calibration data. The work is split between the UK and US Data Centres.
UKDC successfully passed its Acceptance Review on 19.12.2017. Now in preparation for Mock Data Challenge 2.
Running simulations for October 2017 Background Review
Summary of Mock Data Challenge 1: the Monte Carlo production for MDC1 was run on UKDC only and the results were transferred to USDC. Data processing and reprocessing were run successfully on GridPP resources. The main issue in MC generation, and especially in reprocessing, was memory consumption. For the MC step, jobs reaching 6 GB had to be rerun by hand at Sheffield. Processing and reprocessing were run as emergency jobs: at IC by using the memory allocation for two cores while running only one job, and at Sheffield on worker nodes with free slots.
Summary of the Monte Carlo production run for Mock Data Challenge 1:
Looking at the result, the Monte Carlo production for MDC1 at GridPP has been a success: we produced 132.77 TiB of data comprising 732288 files. About 90% of the data was produced in the allotted time frame (4 weeks), and the last file was transferred to NERSC on July 10th.
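As a rough sanity check on the figures above, the average file size can be derived from the two quoted totals. The sketch below is illustrative only: the function name is hypothetical, and it assumes the quoted volume is in binary units (1 TiB = 1024 * 1024 MiB).

```python
def average_file_size_mib(total_tib: float, n_files: int) -> float:
    """Return the mean file size in MiB for a dataset of total_tib TiB.

    Hypothetical helper for illustration; not part of any LZ or GridPP tool.
    """
    if n_files <= 0:
        raise ValueError("n_files must be positive")
    mib_per_tib = 1024 * 1024  # 1 TiB = 1024 GiB = 1024 * 1024 MiB
    return total_tib * mib_per_tib / n_files


# Totals quoted for the MDC1 Monte Carlo production.
MDC1_TOTAL_TIB = 132.77
MDC1_N_FILES = 732288

if __name__ == "__main__":
    size = average_file_size_mib(MDC1_TOTAL_TIB, MDC1_N_FILES)
    print(f"Average MDC1 file size: {size:.1f} MiB")
```

With the MDC1 totals this works out to roughly 190 MiB per file.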
The GridPP DIRAC instance was able to handle the LZ workflow, though we had to adjust the configuration to ensure that the resources at the UKDC proper were used to the full extent.
The data transfer from UKDC to NERSC, using a combination of the UKDC FTS server and the data transfer nodes (DTNs, gridftp doors) at NERSC, worked very well. We have made some minor optimisations (database cache, number of threads, etc.) on the UK side and have enquired with NERSC about the possibility of raising the number of concurrent transfers.
The UKDC (Daniela, Simon, Alex) estimates that its staff spent about 4 weeks FTE in total on LZ work during the production run.
The production run has uncovered a number of issues that should be resolved before MDC2.
Work is ongoing at Imperial on the production job submission interface (see pages 9-11 in CHEP 2016 talk).