Site Local Catalog Middleware
A site local catalog is a middleware service that is available to all VOs. Whether and how a VO uses this service depends on its data processing model.
The model is then:
- Central catalog holds dataset subscriptions, i.e., which datasets are held on which SEs.
- Local catalog on each site holds the SURL mappings for the files in that dataset.
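
The split between the two catalogs can be sketched as follows. This is only an illustrative in-memory model, not the interface of any real catalog service: the class names, method names, dataset name, SE host name and SURL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CentralCatalog:
    """Dataset-level view: which SEs hold each dataset (hypothetical model)."""
    subscriptions: dict = field(default_factory=dict)  # dataset -> set of SE names

    def subscribe(self, dataset, se):
        self.subscriptions.setdefault(dataset, set()).add(se)

    def sites_holding(self, dataset):
        # Return a copy so callers cannot modify the catalog by accident.
        return set(self.subscriptions.get(dataset, set()))


@dataclass
class LocalCatalog:
    """File-level view for one site: the SURL of each file, grouped by dataset."""
    se: str
    surls: dict = field(default_factory=dict)  # dataset -> {file name -> SURL}

    def register(self, dataset, lfn, surl):
        self.surls.setdefault(dataset, {})[lfn] = surl

    def lookup(self, dataset, lfn):
        return self.surls[dataset][lfn]


# Example with made-up names: one site holding one file of one dataset.
central = CentralCatalog()
local = LocalCatalog(se="se.site1.example")
central.subscribe("dataset.A", local.se)
local.register("dataset.A", "file1.root",
               "srm://se.site1.example/atlas/dataset.A/file1.root")
```

In this sketch the central catalog never sees individual files; a client first asks it which SEs hold a dataset, then resolves file names to SURLs at the chosen site.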
This helps greatly with scaling problems: 100 sites running thousands of ATLAS jobs could easily overload a central catalog with updates (a real problem with the old EDG Replica Manager). Local catalogs distribute the data management load, so the system scales much better. They also make data access by jobs more robust: a job can query the local catalog to find data and can register the data it produces there, and the local catalog will have a higher availability than a central catalog.
Once a job's output has been registered in the local catalog, the ATLAS VO Box takes care of scheduling the transfer of output data back to a Tier 1 and can deal with the central catalog registrations.
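
A minimal sketch of this flow, under stated assumptions, is given here. It is not the actual ATLAS VO Box logic: the class, its methods and the idea of one central update per dataset are assumptions made for illustration, and the transfer itself is left as a comment.

```python
from collections import deque


class SiteOutputHandler:
    """Hypothetical model of local registration with deferred central updates."""

    def __init__(self, tier1_se):
        self.tier1_se = tier1_se
        self.local_catalog = {}    # (dataset, file name) -> SURL
        self.central_catalog = {}  # stand-in for the remote catalog: dataset -> SEs
        self.pending = deque()     # outputs waiting for transfer to the Tier 1

    def register_output(self, dataset, lfn, surl):
        # Called by a job: touches only the local catalog and the local queue.
        self.local_catalog[(dataset, lfn)] = surl
        self.pending.append((dataset, lfn))

    def drain(self):
        # Called by the VO Box stand-in: handle queued outputs, then send one
        # central update per dataset. Returns the number of central updates.
        touched = set()
        while self.pending:
            dataset, _lfn = self.pending.popleft()
            # A real service would schedule the file transfer here.
            touched.add(dataset)
        for dataset in touched:
            self.central_catalog.setdefault(dataset, set()).add(self.tier1_se)
        return len(touched)
```

With this model, three output files from the same dataset cause three local registrations but only one central update when `drain()` runs, which is the load reduction the local catalog is meant to provide.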