Glue SE Schema

Version 1.2 of the GLUE storage schema is now known to have some problems. This page lists some of them, and the work going on to work around them within the schema (i.e. without waiting for GLUE 2.0).

The following is taken from the minutes of the Grid Storage phone conference 22 Feb 2006, summarised by Jens:

Issues with GLUE storage have been raised with the LCG GDB. See Graeme's mails to the list: [1] and [2]

It was discussed at CHEP as well, out of which came the "BOF declaration". As an interim measure, they want to change the semantics of the values, so that permanent now means "written to tape" and durable means "lives on disk forever".

Graeme points out there are three dimensions (see the sketch after this list):

  • Max access time/latency - the maximal time it will take the SE to get any file ready for transfer;
  • "Lifetime" in the SRM sense, volatile, durable, and permanent, as indicators of how the file is managed in the disk cache;
  • Quality/durability - how likely the file is to not vanish unintentionally.

Of course an SE can offer more than one of the above, and different combinations for different VOs. So the usual scalability problems apply: you need to give the same SE more than one name (e.g. dcache-tape.blah for tape and dcache.blah for disk) because the schema cannot cope (depending on whether this information is published for the SE or the SA).

The problem of course also affects SRM 1 systems, which use the same GLUE schema; but LCG want to change the semantics for 2.1 only, so even if it is the *same* schema in the GRIS, the semantics differ between SRM 1 and SRM 2.

It has been suggested that a 2.1 client can ask for specific values of the above when doing a prepareToPut, and the server should be able to offer which values the client can get. And the client can then decide whether to accept or not (cf. the way cache expiry time was supposed to be managed for durable and volatile files).
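
A sketch of what such a negotiation might look like from the client side. All names here (Offer, request_put) are invented for illustration; this is not an existing SRM client API:

  from dataclasses import dataclass
  from typing import Optional

  @dataclass
  class Offer:
      retention: str  # e.g. "CUSTODIAL"
      latency: str    # e.g. "NEARLINE"

  def negotiate_put(server, wanted: Offer, acceptable: set) -> Optional[Offer]:
      # Ask for 'wanted'; 'server.request_put' is a hypothetical call that
      # returns the values the server can actually offer (cf. the way cache
      # expiry time was to be negotiated for durable and volatile files).
      offered = server.request_put(wanted)
      if (offered.retention, offered.latency) in acceptable:
          return offered  # client accepts the server's offer
      return None         # client declines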

Unless one does something Very Clever(tm) for 2.1, it will require an extension to SRM, effectively becoming WLCG SRM version 2.2.

One outcome of the SC workshop was a mailing list, managed by Maarten.

GLUE 1.3

In the GLUE 1.3 schema, the Storage Area (SA) is now decoupled from the VO information block, with a one-to-many relation between SAs and VOs. The idea is that accounting can only be done per SA, because the "free" space is a property of the SA, not of the VO. Another reason for this choice was that different VOs may access the same space via different space token descriptions.

The SA contains the following attributes (among others):

SA Attributes

  Attribute             Type              Description
  LocalID               string            Identifies the SA; unique within the SE only
  Path                  string            Published for compatibility with 1.2, for clients not querying the path from VOInfo
  Name                  string            A human-readable (and meaningful) name for the SA
  TotalOnlineSize       int32             Total online size (static), in gigabytes
  UsedOnlineSize        int32             Used online size (dynamic), in gigabytes
  FreeOnlineSize        int32             See discussion below
  ReservedOnlineSize    int32             See discussion below
  TotalNearlineSize     int32             Total tape capacity (static, more or less), in gigabytes
  UsedNearlineSize      int32             See discussion below
  FreeNearlineSize      int32             See discussion below
  ReservedNearlineSize  int32             See discussion below
  RetentionPolicy       retentionPol_t    Retention policy
  AccessLatency         accessLat_t       Access latency
  ExpirationMode        expirationMode_t  Expiration mode
  Capability            string[*]         Array of possibly implementation-dependent capabilities of the SA
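
As an illustration of how these attributes are consumed, the following sketch queries the GLUE 1.3 LDAP rendering of the SA from a BDII using python-ldap. The host name is a placeholder; BDIIs conventionally publish on port 2170 under the o=grid suffix:

  import ldap  # python-ldap

  conn = ldap.initialize("ldap://bdii.example.org:2170")  # placeholder host
  results = conn.search_s(
      "o=grid",
      ldap.SCOPE_SUBTREE,
      "(objectClass=GlueSA)",
      ["GlueSALocalID", "GlueSATotalOnlineSize", "GlueSAUsedOnlineSize",
       "GlueSAFreeOnlineSize", "GlueSAReservedOnlineSize",
       "GlueSARetentionPolicy", "GlueSAAccessLatency"],
  )
  for dn, attrs in results:
      # python-ldap returns attribute values as lists of bytes
      values = {k: v[0].decode() for k, v in attrs.items()}
      print(dn, values)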

Some of the accounting attributes listed above have been the subject of some debate in the community. In the sections below, we attempt to summarise the discussion and describe the implementation.

Used space

If an SE creates an extra internal copy of a file, should the VO be billed for twice the space? No, because the extra copy can be cleaned up. But what then if the primary copy is on tape, and the secondary copy is on disk and the disk copy is necessary for the VO to access the file? This is why this information is published separately for nearline and online spaces.
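
A toy illustration of this accounting, using an invented data model: publishing used space per medium lets the disk copy of a tape-resident file be visible without billing the VO twice in a single number.

  def used_sizes(files):
      # Count each file once per medium; an internal extra copy on the
      # same medium is simply not counted here.
      used_online = used_nearline = 0
      for f in files:
          if f["on_disk"]:
              used_online += f["size"]
          if f["on_tape"]:
              used_nearline += f["size"]
      return used_online, used_nearline

  files = [
      {"size": 5, "on_disk": True, "on_tape": True},   # tape file staged to disk
      {"size": 3, "on_disk": True, "on_tape": False},  # disk-only file
  ]
  print(used_sizes(files))  # (8, 5): the staged copy appears once per medium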

Available or Free Space

Available space is synonymous with free space, but some people felt that the name should be chosen with care: if a client decides to select an SE based on it having sufficient "available" space, then it can lead to undesirable behaviour on the grid. For example, you may get "flocking", where suddenly all files go to the same SE, overwhelming it. Or your job may fail because of the basic latency in publishing: the client has no way of knowing how "stale" the information is, or whether another client has taken the available space in the mean time. However, Stephen Burke points out that the number can be used for negative selection: if an SE does not publish enough free space, it can safely be excluded.
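
A sketch of such negative selection (invented data model): the published number is only used to exclude SEs, and the survivors are shuffled rather than ranked, to avoid flocking.

  import random

  def candidate_ses(ses, required_gb):
      # Exclude SEs that certainly lack space; do NOT rank the rest by
      # free space, which would send every client to the same "best" SE.
      viable = [se for se in ses if se["free_gb"] >= required_gb]
      random.shuffle(viable)
      return viable

  ses = [{"name": "se1.example.org", "free_gb": 120},
         {"name": "se2.example.org", "free_gb": 2}]
  print(candidate_ses(ses, required_gb=10))  # se2 is excluded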

For available or free space, as for used space, temporary copies (or deleted copies waiting to be garbage collected) should not count against the available space, which puts some additional burden on the developer writing the query tool. You cannot just query the amount of space available on disk/tape, because some of the space that is used could be freed up. For tape it is arguably worse, because cleaning up temporary or deleted copies cannot always be done dynamically (or at all in some cases - CASTOR 2.1.2 and earlier do not have a repacking facility, and in 2.1.3 the feature is still somewhat experimental).
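
A toy sketch of that extra burden on the query tool, again with an invented data model: space held by reclaimable copies counts as free even though a naive filesystem query would report it as used.

  def free_online(total_gb, copies):
      # Space held by reclaimable copies (temporary, or deleted but not
      # yet garbage-collected) still counts as free.
      pinned = sum(c["size"] for c in copies if not c["reclaimable"])
      return total_gb - pinned

  copies = [
      {"size": 40, "reclaimable": False},  # primary disk copy
      {"size": 10, "reclaimable": True},   # cache copy awaiting collection
  ]
  print(free_online(100, copies))  # 60, not the naive 100 - 50 = 50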

Then there are permissions. Is the space actually available if you don't have permission to use it? The SA itself carries a permissions list for compatibility with 1.2, but the real permission is supposed to be carried by the VOInfo objects, because different VOs typically have different permissions. In either case, the permission may be coarse grained: it is likely to list which VOs, or roles within a VO, can access the SA; the actual access control decision for any individual user may be based on more complicated access control lists (decided at the time of the srmPut or srmGet). Finally, the published permission does not itself distinguish between reading, writing, deleting, etc.

For tape SEs, LCG have decided that they only really care about free space on the disk cache, so for tapestore systems the amount of available space on tape should not be published - tape is seen as infinite.

Reserved space

Space can be reserved dynamically by the user, with the srmReserveSpace function in SRM 2.2, or set up statically by a system administrator. The GLUE schema does not inherently distinguish between these.

So if you put a file into reserved space, does the space then shrink, or is it still reserved? It was decided at the WLCG workshop in January '07 that reserved space (as published) does not shrink when written into. For dCache, this should be made configurable per installation, since it is entirely feasible that others will reason that reserved space should shrink as it is being used (because reserved space does not count as free space).
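
A minimal sketch of the two conventions (invented model, not any implementation's actual accounting):

  def published_reserved(reserved_gb, used_in_reservation_gb, shrinking=False):
      if shrinking:
          # Alternative reading: space written into is "used", not "reserved".
          return max(reserved_gb - used_in_reservation_gb, 0)
      return reserved_gb  # WLCG convention: the published figure is constant

  print(published_reserved(50, 20))                  # 50
  print(published_reserved(50, 20, shrinking=True))  # 30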

GLUE 2.0

TODO.