n west (APC)
Last modified: Thu Nov 29 13:07:44 GMT 2007
MINOS input to UB Meeting on 04 Dec 2007
Prologue
This UB meeting represents a watershed for MINOS. In the past,
problems with the GRID infrastructure have not impacted us greatly and we have
continued to use qsub for all our production work. We have invested a lot of
effort trying to prepare, but still have serious concerns that the migration to
the GRID, although inevitable, will badly affect us in the short term.
Consequently I am reiterating a plea, that I have made before, that someone
within GridPP be given the job of helping small VOs like MINOS as a primary
responsibility. Ideally this would be a permanent arrangement, but if that is
not possible, then at least for the next 3 to 6 months while we transition. I
expand on this request in the section Request for Small VO Support.
Alex Sousa should have returned (or shortly will) a spreadsheet requesting:-
* CPU: 125 kSpecInt2k
* Disk: NFS 6.9TB
* Disk: dCache 2.0TB (migrating to Castor)
* Tape: 10 TB (dCache migrating to Castor)
We are only just starting to get our Castor allocation; our disk servers were
given to LHCb some months ago, so we have nothing to report.
On the very narrow question of whether we can submit jobs from other UIs, we
have done a series of test runs at both Oxford and RAL PPD and found no
problems. On the broader question of whether we are happy to give up the RAL
UI, we do have concerns and request retention of the UI, or an equivalent at
RAL, as part of our Special cases for non-Grid access.
We have yet to run any full length production jobs on the GRID; despite our
having a long lived MyProxy server running, all our jobs fail as soon as the
short term proxy expires! As the grid500M queue typically has a day's worth of
jobs, often our jobs don't even start to run. Derek Ross is investigating, but
this means we don't have progress to report.
Also the grid500M queue appears very heavily used, based on EstRespTime. The
grid1000M and grid2000M queues appear to have much better times but we cannot
use them. Some of our analysis jobs now require 1GB of memory and so need
grid1000M, but beyond that our main effort, MC, only needs grid500M and is
starved of CPU. Can anything be done to even this up?
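To make the queue question concrete, here is a minimal Python sketch of picking the smallest queue that satisfies a job's memory need. The queue names are the ones mentioned above; the memory limits are only inferred from those names (500, 1000 and 2000 MB) and the function name smallest_queue is hypothetical, so treat this as an illustration rather than a description of the real batch configuration.

```python
# Assumed limits in MB, inferred from the queue names only.
QUEUE_LIMITS_MB = {
    "grid500M": 500,
    "grid1000M": 1000,
    "grid2000M": 2000,
}


def smallest_queue(required_mb, limits=QUEUE_LIMITS_MB):
    """Return the queue with the smallest limit that still fits the job."""
    candidates = [(limit, name) for name, limit in limits.items()
                  if limit >= required_mb]
    if not candidates:
        raise ValueError(f"no queue offers {required_mb} MB")
    # Tuples sort by limit first, so min() picks the tightest fit.
    return min(candidates)[1]


# MC production (modest memory) versus an analysis job,
# taking the 1GB in the text as 1000 MB.
mc_queue = smallest_queue(400)
analysis_queue = smallest_queue(1000)
```

Under these assumptions MC jobs always land on grid500M, which is why congestion on that one queue hits our main effort hardest.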
On a related point, we don't know how to debug jobs that exceed some limit and
are aborted, returning no output at all. What do other experiments do? Is there
any way to access the output of jobs running on the GRID, similar to qcat, that
could help us understand what is going on? I have spoken to Stephen Burke and it
looks like we have to wait for gLite/WMS for any improvement. I have just heard
that Catalin has finished setting this up so there may be a partial solution
soon.
The Summary
This section is rather long and there will not be time at
the meeting to discuss it in detail. So, to summarise:-
- Request access for VO management, a need which I think has been recognised
and accepted.
- A UI for group wide batch submission, both because production management
roles rotate between universities, so centralising on RAL is logical, and
because most of our data is on NFS disk, making management from any other UI
very clumsy.
- A short term request that existing users (<= 20) retain access to RAL until
we no longer have most of our data on NFS disk. Continuing to use the UI there
for private batch submission would be nice but not essential.
Matt Hodges suggested that it might be possible to set up a "VO box" to be
shared by small VOs like ours. MINOS would be very interested in such a service.
Indeed, if it were at RAL and had access to our NFS disks then we could probably
withdraw all of the requests in this section.
The Detail
Before I get to the requests I think it would help if I were briefly to explain
our data access strategy. At the moment our data is distributed over a
heterogeneous mix of devices:-
1) FNAL enstore
2) Local NFS disk
3) RAL dCache
4) RAL Castor (currently not available)
To further complicate things, some of our datasets are defined in terms of
SAM queries on the FNAL data store although the data themselves are often on
local disk or SE. To isolate production from details about physical location I
have, over the years, developed DCM (Data Cache Management), which presents all
the data in the same way, with catalogues that map user requests from file name
to file locations. A data driven back-end then selects the appropriate commands
(currently cp, wget, dccp and rfio, with lcg-utils soon to be added). To obtain
DCM catalogues of the SEs at RAL I requested, and was given, permission to
perform nightly "sympathetic" scans of our entire data set.
In a sense DCM is our equivalent of LFC and lcg-utils but in the short to
medium term, with our substantial NFS disk allocation and our reliance on FNAL
for data and meta-data, LFC cannot be a complete solution for us.
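As a rough illustration of the catalogue plus data driven back-end idea described above, the following Python sketch maps a file name to its known locations and picks a copy command from a table. It is not the actual DCM code: the catalogue entries, host names, paths, preference order and the function name build_fetch_command are all hypothetical, and rfcp is assumed here as the RFIO copy command.

```python
from urllib.parse import urlparse

# Hypothetical catalogue: file name -> known physical locations (all invented).
CATALOGUE = {
    "example_run.root": [
        "dcap://dcache.example.org/pnfs/example/example_run.root",
        "file:///nfs/example/data/example_run.root",
    ],
    "remote_only.root": [
        "http://data.example.org/example/remote_only.root",
    ],
}

# Data driven back-end: URL scheme -> function building the copy command.
BACKENDS = {
    "file": lambda url, dest: ["cp", urlparse(url).path, dest],
    "dcap": lambda url, dest: ["dccp", url, dest],
    "rfio": lambda url, dest: ["rfcp", urlparse(url).path, dest],
    "http": lambda url, dest: ["wget", "-O", dest, url],
}

# Assumed preference: local disk first, then the SEs, then a remote fetch.
PREFERENCE = ["file", "dcap", "rfio", "http"]


def build_fetch_command(name, dest, catalogue=CATALOGUE):
    """Return the command (as an argument list) that would fetch `name`."""
    if name not in catalogue:
        raise KeyError(f"{name} is not in the catalogue")
    usable = [u for u in catalogue[name] if urlparse(u).scheme in BACKENDS]
    if not usable:
        raise LookupError(f"no supported back-end for {name}")
    # Pick the location whose scheme ranks highest in the preference list.
    best = min(usable, key=lambda u: PREFERENCE.index(urlparse(u).scheme))
    return BACKENDS[urlparse(best).scheme](best, dest)
```

The sketch only builds the command; it does not run it. Adding a new access method (for example one based on lcg-utils) would then mean adding one entry to the table rather than changing production scripts.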
Right, now on to the requests for resources.
Now I believe that the case has
already been accepted for interactive access to maintain VOs but I will
formalise it by listing the types of activity for which we still need
interactive access:-
- Database Management
We run a database distribution system and
nightly update our databases on the MySQL server sql.gridpp. Mostly this runs
as a cron job although I do occasionally need to log in to investigate
problems.
- Catalogue Generation
As explained in the preamble, I have a
nightly cron that forms catalogues of our SE. Again I may need interactive
access to fix problems.
- General trouble shooting
It is hard to enumerate exactly why I would need access, but given that most
of our local storage is NFS disk, interactive access is the only sensible way
to manage it.
I think it wise to allow one other person access so that we don't have a single
point of failure, and would request to permanently have:-
1) 2 accounts for VO managers
2) ~ 50GB disk space
3) < 1 cpu
4) Interactive and cron login
The
UK group perform a very valuable service to the collaboration as a whole,
providing approximately 50% of all the Monte Carlo. It makes sense to
concentrate all of this effort at a single UI. We would request that this UI be
at RAL for the following reasons:-
- Currently most of the data is on NFS disk so sorting out problems without
interactive access would be very hard. Although this does not require a UI it
makes sense to consolidate operations of job submission and data management at
a single place.
- Even when our NFS disk use has declined and our local data is in SEs we
would like to retain our UI for collaboration wide work at RAL as this
simplifies the rotation of our Production Manager. As I write this, Mike
Kordosky (UCL) is stepping down from the role and Marta Tavera (Sussex) is
taking over. At Sussex there isn't a UI and apparently little interest in
hosting one. Further, Alex Sousa (Oxford) is taking over Mike's role of
Physicist representative. Naturally the two roles are closely coupled and he
will need access to the UI used for production work. As both roles rotate it
would make our lives far simpler if we could remain at RAL rather than pick
some UI at a university.
So our requests, again building in redundancy, are:-
1) 2 accounts for Production Managers
2) ~ 50GB disk space
3) < 1 cpu
For both VO management and Production Managers we would actually prefer
single accounts each with two SSH keys, or possibly even one account with four
SSH keys, as then we won't run into file permission problems and the like. I
have already proposed this in a slightly extended form and got back the very
clear message that this was considered a security risk. However, as there are
good management reasons for wanting this, I really would like to understand why
two accounts each holding one SSH key is considered safer than one account
holding two; if anything I would have thought that the latter was marginally
better.
MINOS fully accept that
in the long term there is no case to be made for requesting the RAL UI for
private work carried out by UK members. At Oxford we are currently running tests
and though we have some problems, they are not related to the physical location
of the UI. However, in the short term, while our data remains on NFS disk, data
management for private work is far simpler if interactive access is permitted.
Here our request is:-
1) Existing MINOS UK account holders be permitted to retain their accounts
until our data has moved into Castor.
In total this is <= 20 accounts.
As a convenience it would be simpler for us if we could continue to use the
UI at RAL for these users, but that is not critical.
As a small VO
MINOS has, not too surprisingly, not been top of the service list. Here is a
list of problems/requests and response times in the past 6 months.
- At the end of June we requested a VO name change from minos ->
minos.vo.gridpp.rl.ac.uk in order to avoid conflicts with the FNAL VO. It took
approximately 15 weeks before we had a running VO.
- Sometime in October, or before, we lost our Castor allocation without
consultation or warning. After 5 weeks we still don't have it back.
- In October I requested a UI at Oxford. It took 3 weeks to get set
up.
- In early November I asked a question on support@gridpp.rl.ac.uk. It took
2 weeks to get a reply.
- At the end of October the software disk became full. It took 9 days
to fix.
- In late November we discovered that none of our jobs would continue to run
after the short term proxy expired, despite our having set up a MyProxy server.
After 7 days it is still not fixed.
- We have had problems with IS reporting our LFC server. It has been fixed
twice in the recent past and is broken again as I write this.
- We have had multiple problems with dCache, the last was at the end of
October when it ran out of memory and broke our production.
Now I am not claiming that all these are show stoppers; they aren't. Nor am I
claiming that people aren't doing their jobs; I know they are, and when they can
spare the time I have had lots of very helpful email exchanges. I am sure that
it is simply that, with the LHC coming on-stream and with so much going on, it's
just too easy to overlook a small VO. Until now that hasn't really mattered but
from now on it will.
If there were a single contact in GridPP for small VOs (i.e. those who have
never had GridPP funded posts) who could give general advice and help sort out
problems as and when they arise, I believe it would be of great benefit.
I believe that about a year ago an offer was made to help people migrate data
from the old ADS Tape store. MINOS have about 4TB (1 TB MINOS + 3TB Soudan2)
that they would like to migrate, either to dCache or, given that this service
will close soon, to Castor. What support is there for this?