Summary of 'Catalog Services for ATLAS' for Metadata Group
Steven Hanlon, June 2004
Introduction
The document 'Catalog Services for ATLAS' describes use cases that define requirements on data catalogues, in the context of the AJDL model.
Catalogue Services
Data is presented to the user primarily through a Virtual Data Catalogue. A minimal implementation of this service would describe an application, its configuration and a list of input datasets. The existence of a service which delivers a concrete instance of a virtual dataset is assumed. The concrete instance could be obtained either by locating an existing replica or by creating a new one.
A Data Selection Catalogue will hold the full provenance chains of the virtual datasets, as well as any further information for selection purposes. Concrete replicas are recorded in the Data Replica Catalogue. The purpose of the Job Catalogue is self-explanatory: it records jobs.
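The relationship between a virtual dataset and its concrete instance can be illustrated with a minimal Python sketch. It is not taken from the ATLAS document: the names `VirtualDataset`, `ReplicaCatalogue`, `materialise` and the `produce` callback are hypothetical, and an in-memory dictionary stands in for a real Data Replica Catalogue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualDataset:
    """Minimal Virtual Data Catalogue entry (hypothetical layout)."""
    name: str
    application: str      # application that would produce the dataset
    configuration: str    # configuration of that application
    inputs: tuple = ()    # names of the input datasets


class ReplicaCatalogue:
    """Hypothetical stand-in for the Data Replica Catalogue."""

    def __init__(self):
        self._replicas = {}  # dataset name -> list of replica locations

    def locate(self, name):
        return list(self._replicas.get(name, []))

    def register(self, name, location):
        self._replicas.setdefault(name, []).append(location)


def materialise(vds, replicas, produce):
    """Return a concrete location for vds.

    An existing replica is reused if one is recorded; otherwise a new
    replica is created by calling produce(vds) and then registered.
    """
    found = replicas.locate(vds.name)
    if found:
        return found[0]
    location = produce(vds)
    replicas.register(vds.name, location)
    return location
```

In this sketch, calling `materialise` twice for the same dataset runs `produce` only once; the second call is answered from the replica catalogue.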
Use Cases
The use cases are divided into categories.
Data Acquisition
- Begin new run - create new virtual dataset to reference written data.
- Acquire data
- End run - close open files and write Selection Catalogue attributes.
- Check RAW data file - run application on single file dataset.
- Check RAW files - merge files and run application.
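As a rough illustration of the first three data acquisition use cases, the following Python sketch tracks one run. Everything in it is an assumption made for illustration: the `Run` class, the `raw.run<N>` naming scheme, the `n_files` attribute and the use of plain dictionaries in place of the Virtual Data Catalogue and the Selection Catalogue.

```python
class Run:
    """Hypothetical bookkeeping for one data-taking run."""

    def __init__(self, run_number, virtual_catalogue):
        # Begin new run: create a virtual dataset to reference written data.
        self.dataset = f"raw.run{run_number}"
        self.files = []
        self.is_open = True
        virtual_catalogue[self.dataset] = self.files

    def acquire(self, filename):
        # Acquire data: each file written during the run joins the dataset.
        if not self.is_open:
            raise RuntimeError("run has ended")
        self.files.append(filename)

    def end(self, selection_catalogue, **attributes):
        # End run: close the run and write Selection Catalogue attributes.
        self.is_open = False
        selection_catalogue[self.dataset] = {
            "n_files": len(self.files),
            **attributes,
        }
```

A caller would create a `Run`, call `acquire` once per file written, and finish with `end`, passing whatever attributes the Selection Catalogue is expected to hold.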
Reconstruction
- Virtual reconstruction - create entry in Virtual Data Catalogue with RAW data and application/configuration information.
- Concrete reconstruction
- Reconstruct file - i.e. without waiting for the end of the run.
- Virtual production of Analysis-Oriented Data
Combining Runs
- Combining reconstructed datasets - combine multiple runs into single datasets.
Event Selection
- Virtual event selection - i.e. catalogue input datasets and selection criteria.
- Concrete event selection
- Separate into streams - re-order the events in the dataset according to event characteristics.
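The three event selection use cases can be sketched in Python as follows. This is only one possible reading: the `VirtualSelection` record, the functions `select` and `split_into_streams`, and the representation of events as dictionaries are all hypothetical, and "separate into streams" is interpreted here as grouping events by a characteristic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualSelection:
    """Catalogue record for a virtual event selection (hypothetical)."""
    inputs: tuple    # names of the input datasets
    criteria: str    # description of the selection criteria


def select(events, predicate):
    """Concrete event selection: keep the events that pass the predicate."""
    return [event for event in events if predicate(event)]


def split_into_streams(events, stream_of):
    """Group events into streams keyed by stream_of(event).

    The original order of events is preserved within each stream.
    """
    streams = defaultdict(list)
    for event in events:
        streams[stream_of(event)].append(event)
    return dict(streams)
```

For example, with events carrying hypothetical `trigger` and `pt` fields, `select(events, lambda e: e["pt"] > 10.0)` applies a cut, and `split_into_streams(events, lambda e: e["trigger"])` yields one stream per trigger type.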
Copying Event Data
- Copy selected events into new files
Analysis
- Analysis job
- Select a task - select an application configuration.
- Select an input dataset
Required Catalogues
From these use cases, it is inferred that the following catalogues are required:
- Application repository
- Task repository (a configuration for an application is called a task in AJDL jargon)
- Task selection catalogue
- Dataset repository
- Dataset replica catalogue
- Dataset selection catalogue
- Single-file dataset catalogue
- Virtual data catalogue
- Job catalogue
s.hanlon@physics.gla.ac.uk
Last modified Mon 14 June 2004