SL ATLAS tests

UK ATLAS Grid Tests

Test jobs are sent periodically to each UK site and the results are summarised. Each job attempts to use the latest production version of the ATLAS software installed at the site. Jobs that do not complete within 8 hours are cancelled.

Hello World Grid Jobs (MyHWPackage)

These jobs execute a precompiled version of Athena HelloWorld that prints a customized string once per event. The correct result is "10", the number of times this string is printed. These jobs test that the ATLAS software is installed and that the runtime works.
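The check described above amounts to counting how often the customized string appears in the returned output. The following is a minimal sketch of that idea, not the actual test script: the function name `count_greeting_lines`, the greeting text and the sample log lines are all hypothetical, since the page does not give the real string or log format.

```python
def count_greeting_lines(job_output: str, greeting: str) -> int:
    """Count the lines of job output that contain the greeting string."""
    return sum(1 for line in job_output.splitlines() if greeting in line)


# Hypothetical greeting; the real customized string is not stated on this page.
GREETING = "Hello from MyHWPackage"

# Hypothetical job output: one greeting line per event, for 10 events.
sample_output = "\n".join(
    ["Athena initialize"]
    + [f"MyHW INFO {GREETING}" for _ in range(10)]
    + ["Athena finalize"]
)

result = count_greeting_lines(sample_output, GREETING)
print(result)
```

A job would be counted as successful when `result` equals the expected value of 10.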

Building New Package Grid Jobs (MyAlgPackage)

These jobs attempt to create a new Athena package, build it and run it for 10 events. The package is a simple algorithm that just increments a counter and prints out its value at the end of the job. Once the output is returned it is scanned for this value, so for 10 events the correct result should be "10". Essentially the job tests that the ATLAS software is installed and that at least some of it is functional, that CMT creates and configures the new package, and that gmake works to build it. Finally it checks that the ATLAS runtime is OK.
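Scanning the returned output for the counter value could look something like the sketch below. The log line format (`counter value = N`), the pattern and the function name `extract_counter` are assumptions made for illustration; the page does not say how the real algorithm prints its counter.

```python
import re
from typing import Optional

# Assumed format of the final log line, e.g. "MyAlg INFO Final counter value = 10".
COUNTER_PATTERN = re.compile(r"counter value\s*=\s*(\d+)")


def extract_counter(job_output: str) -> Optional[int]:
    """Return the last counter value found in the output, or None if absent."""
    last = None
    for match in COUNTER_PATTERN.finditer(job_output):
        last = int(match.group(1))
    return last


# Hypothetical job output for a 10-event run.
sample_output = "MyAlg INFO initialize\nMyAlg INFO Final counter value = 10\n"
counter = extract_counter(sample_output)
print(counter)
```

Returning `None` when no value is found lets the caller distinguish a job that never reached the end from one that printed a wrong count.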

Analysis Grid Jobs (MyAnalPackage)

These jobs attempt to analyse a file of AOD data (100 Z->e+e- events) that has been replicated on each site's SE. Currently the data is copied from the SE to the WN where the job is running. As in the simpler job, a new Athena package is created; it reads the data, loops over all electron pairs and calculates the Z mass. The correct result should therefore be around 90 (GeV).
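The Z mass for an electron pair is the invariant mass of the two four-vectors: m^2 = (E1 + E2)^2 - |p1 + p2|^2, in units where c = 1. The sketch below illustrates that calculation in Python. It is not the Athena package used by the test; the function names and the two sample electrons are hypothetical.

```python
import math
from itertools import combinations


def invariant_mass(p1, p2):
    """Invariant mass of two particles given as (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def pair_masses(electrons):
    """Invariant mass of every distinct pair of electrons in one event."""
    return [invariant_mass(a, b) for a, b in combinations(electrons, 2)]


# Hypothetical event: two back-to-back electrons of 45.6 GeV each,
# with the electron mass neglected (|p| = E).
electrons = [(45.6, 0.0, 0.0, 45.6), (45.6, 0.0, 0.0, -45.6)]
masses = pair_masses(electrons)
print(masses)
```

For this idealised pair the momenta cancel, so the invariant mass is simply the summed energy, close to the expected value of around 90 GeV.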

In addition to the tests in the simpler jobs, this job tests that the replica still exists on the SE, that lcg-cp can copy it to the WN, that the AOD data can be read from StoreGate and that a sensible Z mass is calculated.

Results 11th January 2007 to 5th March 2007

The graphs below are based on the test results. The y-axis shows the overall percentage success of the jobs submitted on a given day. Some problems beyond site control affect the results. For example, a problem with the RB used to submit the jobs can prevent jobs from running for a long period, as happened on 10th and 15th February. (To help spot such periods, the average success across all sites is plotted on each graph as a white line.) In addition, if it has not been possible to replicate the AOD data file to the site at some point (not with every job), or the installed ATLAS release cannot be checked, then the result will be 0. Some variability in success is to be expected if a site is full of jobs and fairshare policies come into effect. Also note that ATLAS production jobs take priority over these "standard" user jobs.
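The quantities plotted could be computed along the lines of the sketch below. The record layout, site names and function names are hypothetical. It also assumes that the all-site average is the mean of the per-site percentages for a day; the page does not say whether the white line is computed that way or from the pooled job counts.

```python
from collections import defaultdict


def daily_success_percent(results):
    """Map (site, day) to the percentage of that day's jobs that succeeded.

    `results` is an iterable of (site, day, succeeded) records.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for site, day, succeeded in results:
        totals[(site, day)] += 1
        if succeeded:
            passes[(site, day)] += 1
    return {key: 100.0 * passes[key] / totals[key] for key in totals}


def all_site_average(percentages, day):
    """Mean of the per-site percentages for one day, or None if no data."""
    values = [pct for (_site, d), pct in percentages.items() if d == day]
    return sum(values) / len(values) if values else None


# Hypothetical records: (site, day, succeeded).
records = [
    ("SiteA", "2007-02-10", True),
    ("SiteA", "2007-02-10", True),
    ("SiteA", "2007-02-10", False),
    ("SiteA", "2007-02-10", False),
    ("SiteB", "2007-02-10", True),
    ("SiteB", "2007-02-10", True),
]
percentages = daily_success_percent(records)
average = all_site_average(percentages, "2007-02-10")
print(percentages, average)
```

A day on which the all-site average drops sharply points to a problem outside any single site's control, such as the RB failures mentioned above.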

File:All.JPG File:Birmingham.JPG

Due to small WN disk space, the ATLAS software could not be installed at Bristol until 23 Feb 2007, when one WN was rebuilt with a larger disk. That same day, the atlassgm job to install the software happened to land on that WN and the software was installed. As the graph shows, the ATLAS tests succeed fairly well from then on.

File:Bristol.JPG File:Brunel.JPG File:Cambridge.JPG File:Durham.JPG File:Edinburgh.JPG

There were problems at the end of January, when the latest version of the ATLAS software could not be installed in the experiment software area due to lack of disk space. Hardware and software updates to our dCache during February made the SE temporarily unavailable, causing some of Steve's tests to fail.

File:Glasgow.JPG File:Imperial HEP.JPG File:Imperial LeSC.JPG File:Lancaster.JPG File:Liverpool.JPG File:Manchester.JPG File:Oxford.JPG File:QMUL.JPG File:RAL PPD.JPG File:RAL Tier-1.JPG File:RHUL.JPG File:Sheffield.JPG File:UCL CENTRAL.JPG File:UCL HEP.JPG