A Statistical Analysis of Job Performance within LCG Grid
Authors: M. Aggarwal, D. Colling, B. MacEvoy, G. Moont, O. van der Aa
The LCG is an operational Grid currently running at 136 sites in 36 countries, offering its users access to nearly 14,000 CPUs and approximately 8 PB of storage [1]. Monitoring the state and performance of such a system is challenging but vital to its successful operation. The primary motivation for this research is therefore to analyze LCG performance through a statistical analysis of the lifecycles of all jobs submitted to it. In this paper we define metrics that describe typical job lifecycles. Statistical analysis of these metrics gives insight into the workload management characteristics of the LCG Grid [2]. Finally, we show how these metrics can be used to spot Grid failures by identifying statistical changes over time in the monitored values.
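As an illustration of spotting failures through statistical changes in a monitored metric, the sketch below flags days on which a metric departs sharply from its recent history. It is not the method used in the paper: the trailing-window z-score test, the function name flag_metric_shifts, the window and threshold parameters, and the sample values (a made-up daily median queue wait in seconds) are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_metric_shifts(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` values by more than `threshold` sample standard deviations.

    Hypothetical helper: the paper does not specify its change test.
    """
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any departure from it counts as a change.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up daily median queue wait times in seconds (not real LCG data).
daily_wait = [120, 118, 125, 122, 119, 121, 480, 123]
print(flag_metric_shifts(daily_wait))
```

With these sample values only the day with the 480-second wait is flagged. In practice the choice of metric, window length, and threshold would depend on the job lifecycle data actually collected.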
[1] GridPP: UK Computing for Particle Physics, http://www.gridpp.ac.uk/
[2] Crosby P, Colling D, Waters D, Efficiency of resource brokering in grids for high-energy physics computing, IEEE Transactions on Nuclear Science, 2004, Vol. 51, pp. 884-891, ISSN 0018-9499
Last modified Thu 24 November 2005