StoRM
StoRM is a disk-based storage system with an SRM interface, developed by INFN; indeed, the capitalisation of StoRM is a play on SRM. It is an increasingly used solution for UK sites involved in LCG to provide an SRM interface to the Grid (the others being Disk Pool Manager and dCache). It offers particular advantages over those systems in providing POSIX access, as well as being able to operate on top of cluster filesystems such as IBM's GPFS and Lustre (or the Whamcloud version, http://www.whamcloud.com/lustre/). This page provides information for Tier-2 sites who are deploying StoRM as their SRM. The information here is intended to augment the official documentation, not replace it.
Installation
StoRM can easily be installed with YAIM. Detailed documentation, including an installation guide, is available in the official documentation. The pages linked below aim to offer a more basic HOWTO for new users; a sketch of a YAIM invocation is given after the list.
- Storm Install - This is now out of date (it was written in 2009), but contains some useful information.
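As a rough illustration only (not a substitute for the official installation guide), a YAIM configuration run for a host carrying the frontend, backend and GridFTP services looks something like the following. The node profile names and the contents of site-info.def depend on your StoRM and YAIM versions, so check them against the official documentation before running anything.
# Illustrative only - node profiles and paths depend on the StoRM/YAIM release
/opt/glite/yaim/bin/yaim -c -s site-info.def \
    -n se_storm_backend -n se_storm_frontend -n se_storm_gridftp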
GridPP sites using StoRM
- UKI-LT2-QMUL (Queen Mary, University of London).
- QMUL is an early adopter for StoRM.
- UKI-SOUTHGRID-SUSSEX (Sussex)
Configuration Tips
Checksums
StoRM supports checksums - and it is strongly recommended that they be enabled. Currently (StoRM 1.11.1), the same checksum algorithm must be used for all VOs - and the LHC VOs have chosen to use adler32.
- Checksums are stored in an extended attribute of the file: user.storm.checksum.adler32
[root@se03 dteam]# getfattr -d testfile-put-1277458233-3a96016c8354.txt
# file: testfile-put-1277458233-3a96016c8354.txt
user.storm.checksum.adler32="1a400272"
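If a file lacks this attribute (for example because it was copied onto the filesystem outside StoRM), it can be set by hand with setfattr. A sketch, assuming the adler32 value has already been calculated (the filename and checksum here are illustrative, reusing the example above):
# Illustrative: attach a pre-computed adler32 checksum to a file by hand
setfattr -n user.storm.checksum.adler32 -v "1a400272" testfile-put-1277458233-3a96016c8354.txt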
Enabling checksums
StoRM's GridFTP server supports calculating checksums on the fly as the file is transferred. To enable this, the following parameter needs to be set in your site-info.def:
GRIDFTP_WITH_DSI="yes"
To calculate the adler32 checksum of a file you can use the following script (adler32.py), run as python adler32.py filename:
#!/usr/bin/env python
# Calculate the adler32 checksum of each file named on the command line,
# reading in blocks so that large files do not have to fit in memory.
import sys
from zlib import adler32

BLOCKSIZE = 256 * 1024 * 1024

for fname in sys.argv[1:]:
    asum = 1
    with open(fname, 'rb') as f:
        while True:
            data = f.read(BLOCKSIZE)
            if not data:
                break
            asum = adler32(data, asum)
    # zlib.adler32 can return a signed value on python 2
    if asum < 0:
        asum += 2**32
    print hex(asum)[2:10].zfill(8).lower(), fname
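For example, to cross-check the stored extended attribute against a freshly calculated value (the filename is illustrative, reusing the example above):
# Illustrative cross-check of the stored attribute against a fresh calculation
python adler32.py testfile-put-1277458233-3a96016c8354.txt
getfattr -n user.storm.checksum.adler32 testfile-put-1277458233-3a96016c8354.txt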
Argus
StoRM can use Argus for authorisation and user banning. Information about configuring banning can be found here.
Enable it by setting the following in the StoRM configuration:
STORM_FE_USER_BLACKLISTING=true
GridFTP isn't banned using Argus - a simple patch in the ticket shows how to fix this.
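For reference, banning a user is done on the Argus PAP with pap-admin; a sketch, assuming a standard Argus installation and a purely illustrative DN (check the exact syntax against the Argus documentation for your version):
# Illustrative: ban (and later un-ban) a user DN on the Argus PAP
pap-admin ban dn "/DC=org/DC=example/CN=Some User"
pap-admin un-ban dn "/DC=org/DC=example/CN=Some User"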
Spacetokens for Atlas
All of this should end up on the StoRM website, but is here for those who may find it useful.
Note that at the time of writing (April 2012), these links to the mailing list archive no longer work. The new list server is at https://lists.infn.it/sympa/arc/storm-users , but needs a password to be set up.
- How to set spacetoken sizes using YAIM in Storm 1.5.4
https://iris.cnaf.infn.it/pipermail/storm-users/2010-October/001025.html
- Setting up permissions for ATLAS spacetokens
YAIM's site-info.def
Files written by an ATLAS production user will not, by default, be readable by normal ATLAS users. If the SRM layer is used to access a file this shouldn't be a problem, but ATLAS sometimes bypasses it. To ensure normal ATLAS users have access, set STORM_TOKENNAME_DEFAULT_ACL_LIST="atlas:R" so that users in the atlas group have read access to files by default.
Previously I recommended extending this to prdatl as well - but those users are also in the atlas group, so this is not necessary.
STORM_ATLASDATADISK_VONAME=atlas
STORM_ATLASDATADISK_ACCESSPOINT=/atlas/atlasdatadisk
STORM_ATLASDATADISK_ROOT=$STORM_DEFAULT_ROOT/atlas/atlasdatadisk
STORM_ATLASDATADISK_TOKEN=ATLASDATADISK
STORM_ATLASDATADISK_ONLINE_SIZE=589000
STORM_ATLASDATADISK_DEFAULT_ACL_LIST="atlas:R"
# GROUPDISK is being incorporated into datadisk - new sites may not want to deploy this token.
STORM_ATLASGROUPDISK_VONAME=atlas
STORM_ATLASGROUPDISK_ACCESSPOINT=/atlas/atlasgroupdisk
STORM_ATLASGROUPDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/atlasgroupdisk
STORM_ATLASGROUPDISK_TOKEN=ATLASGROUPDISK
STORM_ATLASGROUPDISK_ONLINE_SIZE=285000
STORM_ATLASGROUPDISK_DEFAULT_ACL_LIST="atlas:R"
STORM_ATLASLOCALGROUPDISK_VONAME=atlas
STORM_ATLASLOCALGROUPDISK_ACCESSPOINT=/atlas/atlaslocalgroupdisk
STORM_ATLASLOCALGROUPDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/atlaslocalgroupdisk
STORM_ATLASLOCALGROUPDISK_TOKEN=ATLASLOCALGROUPDISK
STORM_ATLASLOCALGROUPDISK_ONLINE_SIZE=110000
STORM_ATLASLOCALGROUPDISK_DEFAULT_ACL_LIST="atlas:R"
STORM_ATLASPRODDISK_VONAME=atlas
STORM_ATLASPRODDISK_ACCESSPOINT=/atlas/atlasproddisk
STORM_ATLASPRODDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/atlasproddisk
STORM_ATLASPRODDISK_TOKEN=ATLASPRODDISK
STORM_ATLASPRODDISK_ONLINE_SIZE=15000

STORM_ATLASSCRATCHDISK_VONAME=atlas
STORM_ATLASSCRATCHDISK_ACCESSPOINT=/atlas/atlasscratchdisk
STORM_ATLASSCRATCHDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/atlasscratchdisk
STORM_ATLASSCRATCHDISK_TOKEN=ATLASSCRATCHDISK
STORM_ATLASSCRATCHDISK_ONLINE_SIZE=60000
STORM_ATLASSCRATCHDISK_DEFAULT_ACL_LIST=atlas:R
It shouldn't be necessary to have the following space tokens: ATLASGENERATEDDISK, ATLASINSTALLDISK and ATLAS. However, the tokenless ATLAS storage area with path /atlas/atlasnotoken is necessary (and needs to be last) as a default for files that don't specify a spacetoken. ATLASINSTALLDISK and ATLASGENERATEDDISK were needed in 2011, but probably aren't any more.
STORM_ATLASGENERATEDDISK_VONAME=atlas
STORM_ATLASGENERATEDDISK_ACCESSPOINT=/atlas/generated
STORM_ATLASGENERATEDDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/generated
STORM_ATLASGENERATEDDISK_TOKEN=ATLASGENERATEDDISK
STORM_ATLASGENERATEDDISK_ONLINE_SIZE=1000
STORM_ATLASINSTALLDISK_VONAME=atlas
STORM_ATLASINSTALLDISK_ACCESSPOINT=/atlas/install
STORM_ATLASINSTALLDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/install
STORM_ATLASINSTALLDISK_TOKEN=ATLASINSTALLDISK
STORM_ATLASINSTALLDISK_ONLINE_SIZE=1000

STORM_ATLAS_VONAME=atlas
STORM_ATLAS_ACCESSPOINT=/atlas/atlasnotoken
STORM_ATLAS_ROOT=$STORM_DEFAULT_ROOT/atlas/atlasnotoken
#STORM_ATLAS_TOKEN=ATLAS
STORM_ATLAS_ONLINE_SIZE=1000
#STORM_ATLAS_DEFAULT_ACL_LIST=atlas:R
# Hotdisk has been decommissioned, so this is historical
# STORM_ATLASHOTDISK_VONAME=atlas
# STORM_ATLASHOTDISK_ACCESSPOINT=/atlas/atlashotdisk
# STORM_ATLASHOTDISK_ROOT=$STORM_DEFAULT_ROOT/atlas/atlashotdisk
# STORM_ATLASHOTDISK_TOKEN=ATLASHOTDISK
# STORM_ATLASHOTDISK_ONLINE_SIZE=3000
# STORM_ATLASHOTDISK_DEFAULT_ACL_LIST="atlas:R"
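Once YAIM has been run with these settings, it is worth checking that the default ACL has actually been applied to the storage area directories (or to newly written files within them). A quick way to do this is with getfacl; the mount point below is QMUL-style and purely illustrative:
# Illustrative: inspect the POSIX ACLs on a storage area root
getfacl /mnt/lustre_0/storm_3/atlas/atlasdatadisk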
Restricting access
To limit writing to certain storage areas to users with a production role, the following is used (note that ATLAS production users are in the unix group prdatl).
/etc/storm/backend-server/path-authz.db contains:
#################################
#  Path Authorization DataBase  #
#################################
# Evaluation algorithm
# - possible values are:
#   1) it.grid.storm.authz.path.model.PathAuthzAlgBestMatch
#      - To determine if a request succeeds, the algorithm process
#        the ACE entries in a computed order. Only ACEs which have
#        a "local group" that matches the subject requester are considered.
#        The order of ACE is defined on the base of distance from StFN
#        targetted and the Path specified within the ACE. Each ACE is
#        processed until all of the bits of the requester's access have
#        been checked. The result will be:
#        - NOT_APPLICABLE if there are no ACE matching with the requester.
#        - INDETERMINATE if there is at least one bit not checked.
#        - DENY if there is at least one bit DENIED for the requestor
#        - PERMIT if all the bits are PERMIT

algorithm=it.grid.storm.authz.path.model.PathAuthzAlgBestMatch

# ==================
#   SRM Operations
# ==================
# PTP   --> WRITE_FILE + CREATE_FILE
# RM    --> DELETE_FILE
# MKDIR --> CREATE_DIRECTORY
# RMDIR --> DELETE
# LS    --> LIST_DIRECTORY
# PTG   --> READ_FILE
# ==================
# Operations on Path
# ==================
# 'W' : WRITE_FILE        "Write data on existing files"
# 'R' : READ_FILE         "Read data"
# 'F' : MOVE/RENAME       "Move a file"
# 'D' : DELETE            "Delete a file or a directory"
# 'L' : LIST_DIRECTORY    "Listing a directory"
# 'M' : CREATE_DIRECTORY  "Create a directory"
# 'N' : CREATE_FILE       "Create a new file"
#
#--------+----------------------------+---------------+----------
# user   | Path                       | Permission    | ACE
# class  |                            | mask          | Type
#--------+----------------------------+---------------+----------
prdatl     /atlas/atlasdatadisk         WRFDLMN         permit
pilatl     /atlas/atlasdatadisk         RL              permit
atlas      /atlas/atlasdatadisk         RL              permit
@ALL@      /atlas/atlasdatadisk         WRFDLMN         deny
prdatl     /atlas/atlasscratchdisk      WRFDLMN         permit
pilatl     /atlas/atlasscratchdisk      WRFDLMN         permit
atlas      /atlas/atlasscratchdisk      WRFDLMN         permit
@ALL@      /atlas/atlasscratchdisk      WRFDLMN         deny
prdatl     /atlas/atlaslocalgroupdisk   WRFDLMN         permit
pilatl     /atlas/atlaslocalgroupdisk   RL              permit
atlas      /atlas/atlaslocalgroupdisk   RL              permit
@ALL@      /atlas/atlaslocalgroupdisk   WRFDLMN         deny
prdatl     /atlas/                      RL              permit
pilatl     /atlas/                      RL              permit
atlas      /atlas/                      RL              permit
@ALL@      /atlas/                      WRFDLMN         deny
prdlon     /vo.londongrid.ac.uk/        WRFDLMN         permit
longrid    /vo.londongrid.ac.uk/        WRFDLMN         permit
@ALL@      /vo.londongrid.ac.uk/        RL              permit
@ALL@      /vo.londongrid.ac.uk/        WFDMN           deny
@ALL@      /                            WRFDLMN         permit
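After editing path-authz.db the StoRM backend needs to pick up the change; on an SL6-era installation restarting the backend service, roughly as below, should be enough (check the service name for your release):
# Illustrative: restart the backend so the new path authorisation rules take effect
service storm-backend-server restart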
Operational issues
Generating a list of SURLs
It is often useful to generate a list of SURLs. For Lustre, lfs find is faster than find, and sed can then be used to turn a filename into a SURL. Here's an example for QMUL:
lfs find -type f /mnt/lustre_0/storm_3/atlas/ | sed s%/mnt/lustre_0/storm_3/%srm://se03.esc.qmul.ac.uk/%
In the case of a disk server that is down:
lfs df
will tell you which OSTs are down.
lfs find -obd lustre_0-OST002f_UUID /mnt/lustre_0/storm_3/atlas/ | sed s%/mnt/lustre_0/storm_3/%srm://se03.esc.qmul.ac.uk/%
will find files on a particular OST: lustre_0-OST002f_UUID in this case.
Syncat Dumps
Christopher Walker has a very alpha quality syncat dump script which is available on request.
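For a rough idea of what such a dump involves, the sketch below lists each file's size and SURL, reusing the same QMUL-style paths as the examples above. This is purely illustrative, not Christopher Walker's script: a real syncat dump also needs checksums and a well-defined output format.
# Illustrative: dump "size SURL" for every ATLAS file on the filesystem
lfs find -type f /mnt/lustre_0/storm_3/atlas/ | while read f; do
    size=$(stat -c %s "$f")
    surl=$(echo "$f" | sed s%/mnt/lustre_0/storm_3/%srm://se03.esc.qmul.ac.uk/%)
    echo "$size $surl"
done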
Hardware
The official documentation has some information on hardware requirements, but it may be useful to know what sites currently have deployed - so here are some examples. Note that this is not a statement of what is actually required.
StoRM's hardware requirements (as opposed to the underlying GPFS/Lustre filesystem) are modest - the main point is that the GridFTP servers need lots of bandwidth.
QMUL
QMUL runs the frontend, backend and database on one machine.
Hardware config as of April 2017:
Dell R510 * 72
- CPU: Dual X5650
- Memory: 24Gig RAM
- DISK: 12 * 2(3)TB in Raid 6
- Network: 10Gig connectivity
Dell R730XD * 20
- CPU: Dual E5 2609 V3
- Memory: 64Gig RAM
- DISK: 16 * 6TB in Raid 6
- Network: 10Gig connectivity
HPE APOLLO 4200 * 2
- CPU: Dual E5 2609 V3
- Memory: 128Gig RAM
- DISK: 14 * 8TB in 2 * Raid 6
- Network: 10Gig connectivity
Note: we updated the driver for this machine from the SL6.7 default to the latest from the Intel website after suffering some hangs.
QMUL also has a second GridFTP server which runs on similar hardware - initially this was deployed to make use of a second college link. It is currently used to provide a bit of extra performance and to test jumbo frames.
This page is a Key Document, and is the responsibility of Dan Traynor. It was last reviewed on 2017-07-05 when it was considered to be 70% complete. It was last judged to be accurate on 2014-10-02.