3.2.2.2. Grid interface
The Grid interface is recommended when your project data and/or processing is tied to Grid authentication and authorisation. To use the supported Grid clients on Spider you need to have an X509 Grid certificate installed in your local directory and be a member of a Virtual Organisation (VO). Please refer to our Grid documentation page for instructions on how to get a certificate and join a VO.
3.2.2.2.1. Grid authentication
To be able to interact with dCache using a storage client, you need to create a proxy. A proxy is a short-lived certificate/private key combination which is used to perform actions on your behalf without using passwords.
Create a proxy with the following command:
voms-proxy-init --voms [YOUR_VO]
On Spider your proxy is generated by default in your $HOME/.proxy location, so that it is accessible from anywhere on Spider. You can check this with echo $X509_USER_PROXY.
The default lifetime of a proxy is 12 hours. If your application runs longer, you can create a proxy that is valid for up to 7 days with the following command:
voms-proxy-init --voms [YOUR_VO] --valid 168:00
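Before submitting a long job it can help to check how much lifetime the proxy has left. The sketch below assumes that the voms-proxy-info client is available on Spider and that its --timeleft option prints the remaining lifetime in seconds; the helper name proxy_covers and the one-hour threshold are illustrative choices, not part of the Spider setup.

```shell
# Hypothetical helper: succeed if a proxy with "$1" seconds left
# still covers a job that needs "$2" seconds.
proxy_covers() {
    [ "$1" -ge "$2" ]
}

# Remaining proxy lifetime in seconds (assumed output of --timeleft).
# Fall back to 0 when no proxy, or no client, is found.
timeleft=$(voms-proxy-info --timeleft 2>/dev/null) || timeleft=0

if proxy_covers "${timeleft:-0}" 3600; then
    echo "proxy is valid for at least one more hour"
else
    echo "proxy missing or expiring soon: renew it with voms-proxy-init"
fi
```

Adjust the threshold to the expected runtime of your job, including the time it may spend waiting in the queue.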
3.2.2.2.2. Grid clients
There are many Grid clients to interact with dCache. On Spider we support gfal.
In the examples below, a user who is a member of the VO named lsgrid has the certificate installed on the Spider login node and copies data between dCache and their home directory on Spider.
gfal client
Note
The gfal commands fail on our CentOS 8 worker nodes due to the security setup. The workaround is to run the commands below in a CentOS 7 container, for example: singularity run -B /etc/grid-security/certificates /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 gfal-ls -l [gsiftp://path-to-file]
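To avoid retyping the container workaround for every command, it could be wrapped in a small shell function. This is only a sketch: the function name gfal_c7 is hypothetical, and the bind path and container image are copied unchanged from the note above.

```shell
# Hypothetical wrapper: run any gfal command inside the CentOS 7 container
# named in the note above. All arguments are passed through unchanged.
gfal_c7() {
    singularity run -B /etc/grid-security/certificates \
        /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 \
        "$@"
}

# Usage on a worker node (requires a valid proxy):
#   gfal_c7 gfal-ls -l [gsiftp://path-to-file]
```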
Please note that you need a valid proxy to run the following commands.
Listing directories on dCache:
gfal-ls -l gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/
Create directory on dCache:
gfal-mkdir gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/newdir/
Copy file from dCache to Spider:
gfal-copy \
gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar \
file:///`pwd`/your-data.tar
Copy file from Spider to dCache:
gfal-copy \
file:///$HOME/your-data.tar \
gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar
Remove a file from dCache:
gfal-rm gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar
Remove a whole (non-empty) directory from dCache:
gfal-rm -r gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/
Recursive transfer of files (transferring a directory) is not supported with the gfal-copy command.
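One way to work around this is to pack the directory into a single tar archive and copy that archive as one file. The sketch below illustrates the idea; the helper names pack_dir and upload_dir and the results directory are hypothetical, and the dCache path in the usage comment is the same placeholder used in the examples above.

```shell
# Hypothetical helper: pack directory "$1" into the tar archive "$2",
# storing paths relative to the directory's parent.
pack_dir() {
    tar cf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Hypothetical wrapper: pack the directory, then copy the single archive
# to the dCache URL given as "$3". Local paths must be absolute.
upload_dir() {
    pack_dir "$1" "$2" && gfal-copy "file://$2" "$3"
}

# Usage (requires a valid proxy):
#   upload_dir "$HOME/results" "$HOME/results.tar" \
#       gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/results.tar
```

After downloading the archive again with gfal-copy, tar xf results.tar restores the directory.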
Tip
Need more examples? See gfal Grid documentation
3.2.2.2.3. Grid data processing
Below we show an example for I/O intensive applications. In this example you submit a job on Spider that performs the following steps:
Creates a runtime directory on local scratch (i.e. $TMPDIR)
Retrieves the input data from dCache
Runs the analysis
Stores the output data on dCache
Here is a job script template for local scratch usage:
#!/bin/bash
#SBATCH -N 1 #request 1 node
#SBATCH -c 1 #request 1 core and 8GB RAM
#SBATCH -t 5:00 #request a 5-minute job slot
#create a runtime directory on local scratch
mkdir "$TMPDIR"/myanalysis
cd "$TMPDIR"/myanalysis
#retrieve the input data from dCache
gfal-copy gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/path-to-your-data/your-data.tar file:///`pwd`/your-data.tar
# = Run your analysis here =
#when done, copy the output to dCache
tar cf output.tar output/
gfal-copy file:///`pwd`/output.tar gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/path-to-your-data/output.tar
echo "SUCCESS"
exit 0
Please note that in the above example, it is assumed that the data is present on the disk storage on dCache. If the data is stored on tape, it may need to be copied to disk first (called staging).
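Note also that the job script template reports SUCCESS even if one of the transfers fails. A minimal sketch of how each step could be guarded is shown below; the helper name must is hypothetical and is not part of Spider or gfal.

```shell
# Hypothetical helper: run a command and, if it fails, report it on stderr
# and stop the job with the same exit code.
must() {
    "$@" || {
        rc=$?
        echo "FAILED (exit $rc): $*" >&2
        exit "$rc"
    }
}

# Harmless demonstration; in a job script you would wrap the transfers, e.g.
#   must gfal-copy <source-url> <destination-url>
must echo "step completed"
```

With this in place, a failed gfal-copy ends the job with a non-zero exit code, so the failure is visible in the job accounting instead of being hidden behind the final exit 0.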