What is it?
Grid storage lets you store files in a fault-tolerant and scalable environment and share them with distributed teams. Your data can be accessed through multiple protocols and can be replicated across different providers to increase fault tolerance. Grid storage gives you complete control over what data you share and with whom you share it.
The main features of grid storage:
- Access highly-scalable storage from anywhere
- Control the data you share
- Organise your data using a flexible, hierarchical structure
Several grid storage implementations are available in the EGI Cloud, the most common being:
The grid storage endpoints that are available to a user’s Virtual Organizations are discoverable via the EGI Information System (BDII).
The lcg-infosites command can be used to obtain VO-specific information on
existing grid storage, using the following syntax:
$ lcg-infosites --vo voname -[v] -f [site name] [option(s)] [-h| --help] [--is BDII]
For example, to list the Storage Elements (SEs) available to the biomed VO, we could issue the following command:
$ lcg-infosites --vo biomed se
 Avail Space(kB)  Used Space(kB)  Type  SE
------------------------------------------
    280375465082             n.a  SRM   ccsrm.ihep.ac.cn
     10995116266              11  SRM   cirigridse01.univ-bpclermont.fr
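The listing above is easy to post-process. As a minimal sketch, assuming the output keeps the four-column layout shown here (available space, used space, type, SE host name), the helper below prints the SE that reports the most available space. The function name `largest_se` is hypothetical and not part of lcg-infosites; rows whose first column is not a number (the header, the separator, or an `n.a` value) are skipped.

```shell
#!/bin/sh
# Hypothetical helper: read `lcg-infosites --vo <vo> se` output on stdin
# and print the SE host name with the largest "Avail Space(kB)" value.
# Assumes the column order shown in the example listing.
largest_se() {
    awk '
        # Only consider rows whose first field is a plain number.
        $1 ~ /^[0-9]+$/ && $1 + 0 > max { max = $1 + 0; se = $4 }
        END { if (se != "") print se }
    '
}
```

It could be used as `lcg-infosites --vo biomed se | largest_se`. Keep in mind that the reported space is only as accurate as what each site publishes in the information system.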
Access from the command line
Access to grid storage via a command line interface (CLI) requires users to obtain a valid X.509 user VOMS proxy. Please refer to the Check-in documentation for more information.
Note: Integration with the EGI Check-in service via OpenID Connect is being piloted at some endpoints of the EGI Cloud infrastructure, but it has not yet reached the production stage.
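As a rough sketch of the proxy step, the helper below checks for an existing proxy and requests a new one only when none is found. It assumes the standard VOMS client tools (`voms-proxy-info` and `voms-proxy-init`) are installed and configured for your VO; option spellings can differ between client versions, so treat `--exists` and `--voms` as assumptions to verify against your installation. The function name `ensure_proxy` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make sure a VOMS proxy exists for the given VO.
# Assumes the VOMS clients are installed; flags may vary by version.
ensure_proxy() {
    vo=$1

    # Reuse the current proxy if the client reports that one exists.
    if voms-proxy-info --exists >/dev/null 2>&1; then
        return 0
    fi

    # Otherwise request a new proxy with the VO's attributes
    # (this normally prompts for the certificate passphrase).
    voms-proxy-init --voms "$vo"
}
```

For example, `ensure_proxy dteam` could be called at the top of a script before any storage command is run.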
The most widely used CLI for accessing grid storage is gfal2, which can be installed on both RHEL and Debian compatible systems.
gfal2 provides an abstraction layer on top of several storage
protocols (XRootD, WebDAV, SRM, gsiftp, etc.), offering a convenient API that
can be used over different protocols.
The gfal2 CLI can be installed as follows (for RHEL compatible systems):
$ yum install gfal2-util gfal2-all
gfal2-all will install all the plug-ins (to deal with all the supported protocols).
Below you can find examples of the usual commands needed to access storage via
gfal2. For a complete list of available commands, and a guide on how to use
them, please refer to the gfal2 documentation.
Note: In the examples below, the gsiftp protocol can be replaced by any other supported protocol.
List files on a given endpoint
$ gfal-ls gsiftp://dcache-door-doma01.desy.de/dteam
1G.header-1
domatest
gb
SSE-demo
test
tpctest
Create a folder
$ gfal-mkdir gsiftp://dcache-door-doma01.desy.de/dteam/test
Copy a local file
$ gfal-copy test.json gsiftp://dcache-door-doma01.desy.de/dteam/test
Copying file:///root/Documents/test.json [DONE] after 0s
Copy files between storages
$ gfal-copy gsiftp://prometheus.desy.de/VOs/dteam/public-file gsiftp://dcache-door-doma01.desy.de/dteam/test
Copying gsiftp://prometheus.desy.de/VOs/dteam/public-file [DONE] after 3s
Download a file to a local folder
$ gfal-copy gsiftp://prometheus.desy.de/VOs/dteam/public-file /tmp
Copying gsiftp://prometheus.desy.de/VOs/dteam/public-file [DONE] after 0s
Delete a file
$ gfal-rm gsiftp://dcache-door-doma01.desy.de/dteam/test/public-file
gsiftp://dcache-door-doma01.desy.de/dteam/test/public-file DELETED
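The individual commands above can be combined in a small script. The sketch below creates a remote folder, uploads a local file into it, and confirms that the file shows up in the listing. It uses only the gfal commands shown in this section, without extra options. The function name `upload_and_check` is hypothetical, and the check assumes that `gfal-ls` prints one entry per line.

```shell
#!/bin/sh
# Hypothetical helper: upload a local file to a grid storage folder
# and verify that it is listed there afterwards.
upload_and_check() {
    local_file=$1
    remote_dir=$2
    name=$(basename "$local_file")

    # Create the target folder. A failure is ignored here because the
    # folder may already exist; note this also hides genuine errors.
    gfal-mkdir "$remote_dir" 2>/dev/null || true

    # Copy the file; give up if the transfer fails.
    gfal-copy "$local_file" "$remote_dir/$name" || return 1

    # Succeed only if the file name appears as a full line in the listing.
    gfal-ls "$remote_dir" | grep -Fqx -- "$name"
}
```

For example: `upload_and_check test.json gsiftp://dcache-door-doma01.desy.de/dteam/test`. A valid VOMS proxy is still required before running it.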
Access via EGI Data Transfer
The EGI Data Transfer service provides mechanisms to optimize the transfer of files between EGI Online Storage endpoints. Both a graphical user interface (GUI) and command line interfaces (CLI) are available to perform bulk movement of data. Please check out the related documentation for more information.
Integration with Data Management frameworks
Most of the time, access to grid storage is hidden from users by its integration with the Data Management Frameworks (DMFs) used by Collaborations and Experiments.
For example, the EGI Workload Manager provides a way to efficiently access grid storage endpoints in order to read and store files, and to catalogue the existing files and related metadata.
When running computation via the EGI Workload Manager, users do not actually access the storage directly. However, users can retrieve the output of the computation once it has been stored on the grid.