Storage Issues

This page summarises some of the known storage issues, in no particular order for now, together with a description of the least bad solutions. Some of these arose from GridPP16; others have been known for some time.


1. The meta-question of which priorities to assign to the issues below.


2. The definitions of "uptime" and "available".

2.1. What does it mean for the SE to be available? How does the SE's availability affect the site's availability?
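One possible reading of item 2, offered only as a sketch to make the question concrete: availability is the fraction of periodic probes that succeed, and a site counts as available in a time slot only if every one of its services, including the SE, passed in that slot. Both definitions are assumptions for illustration, not agreed GridPP definitions, and the function names and probe histories are hypothetical.

```python
def availability(results):
    """Fraction of probes that succeeded, or None if there were no probes."""
    results = list(results)
    if not results:
        return None
    return sum(bool(r) for r in results) / len(results)


def site_results(service_results):
    """Combine per-service probe histories slot by slot.

    Assumption: the site is up in a slot only if every service was up.
    """
    return [all(slot) for slot in zip(*service_results.values())]


if __name__ == "__main__":
    # Hypothetical probe histories, one entry per time slot.
    se = [True, True, False, True]
    ce = [True, False, True, True]
    print("SE availability:", availability(se))
    print("Site availability:", availability(site_results({"se": se, "ce": ce})))
```

Under this definition the site can never be more available than its SE, which is one way to frame how SE downtime feeds into the site figure.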

3. Enabling "dark storage" (sort of analogous to dark matter; it's there but you can't see it).

Specifically, enabling storage on pool nodes. Support, and non-support, of distributed filesystems.

4. How to improve an SE's resilience (related to availability).

Which bits are most brittle? Which are easiest to improve? How to provide failover? Can you overprovision and then afford to lose something?

How can (Nagios) monitoring help improve SE resilience?

5. The case of the cross-site SE (don't do this on your production system!)

6. Will it be useful to have new milestones, and if so, what should they be?

7. Should we go back to the more proactive ("superhero") approach to support? Are there specific sites for which it would be a useful approach?

8. Need to collect optimisation recipes. Need to collect site setup details (store in CVS?).

9. Summarising purchasing recommendations - how to meet the CPU/storage ratio; what to buy.

10. Quotas - how to implement. Storage recommendations for supporting VO allocations. Draining pools.

11. SE to catalogue synchronisation

11.1. Automating VO shutdown - scripts to remove files

11.2. srmLs support

11.3. dCache to CASTOR migration
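For the SE to catalogue synchronisation in item 11, the core check can be sketched as a set comparison between a dump of the catalogue and a dump of the SE namespace. This is a minimal illustration, assuming both dumps have already been reduced to plain lists of paths; the function name and the example paths are hypothetical, and obtaining the dumps (for example via srmLs) is not shown.

```python
def compare_listings(catalogue_entries, se_entries):
    """Compare what the catalogue expects with what the SE actually holds.

    catalogue_entries: paths the file catalogue believes are on the SE
    se_entries: paths found in a dump of the SE namespace
    """
    catalogue = set(catalogue_entries)
    on_se = set(se_entries)
    return {
        # On the SE but unknown to the catalogue: candidates for clean-up.
        "dark": sorted(on_se - catalogue),
        # In the catalogue but missing from the SE: candidates for re-replication.
        "lost": sorted(catalogue - on_se),
    }


if __name__ == "__main__":
    # Hypothetical dumps for illustration only.
    catalogue_dump = ["/vo/data/a", "/vo/data/b", "/vo/data/c"]
    se_dump = ["/vo/data/b", "/vo/data/c", "/vo/data/d"]
    result = compare_listings(catalogue_dump, se_dump)
    print("dark:", result["dark"])
    print("lost:", result["lost"])
```

The "dark" list is also the natural input for the file removal scripts mentioned in 11.1, though any removal would need checking against the VO first.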

12. Chasing dCache issues. Falling behind current release.

13. dCache MSS interface - needed? Is this the best way to use the SAN?

14. Experiments' use of storage space. Accounting.

15. Interfacing to the userboard.