3.2.2.2. Grid interface

The Grid interface is recommended in cases where your project data and/or processing is tied to Grid authentication and authorisation. To use the supported Grid clients on Spider you need to have an X509 Grid certificate installed in your .globus directory and to be a member of a Virtual Organisation (VO). Please refer to our Grid documentation page for instructions on how to get a certificate and join a VO.
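
The clients below look for your certificate and key in the standard locations. The layout shown here is an assumption based on common Grid client conventions; the Grid documentation page has the authoritative setup instructions:

$HOME/.globus/usercert.pem   # public certificate (typically readable, e.g. mode 644)
$HOME/.globus/userkey.pem    # private key (must be private to you, e.g. mode 600 or 400)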

3.2.2.2.1. Grid authentication

To be able to interact with dCache using a storage client, you need to create a proxy. A proxy is a short-lived certificate/private key combination which is used to perform actions on your behalf without using passwords.

Create a proxy with the following command:

voms-proxy-init --voms [YOUR_VO]

On Spider your proxy is generated by default in $HOME/.proxy, so that it is accessible from anywhere on Spider. You can check this with echo $X509_USER_PROXY.

The default lifetime of a proxy is 12 hours. If your application runs longer, you can create a proxy that is valid for up to 7 days with the following command:

voms-proxy-init --voms [YOUR_VO] --valid 168:00
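
To check that the proxy was created and to see how much lifetime it has left, you can use voms-proxy-info, which is installed alongside voms-proxy-init:

voms-proxy-info --all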

3.2.2.2.2. Grid clients

There are many Grid clients to interact with dCache. On Spider we support globus-url-copy and gfal.

In the examples below, a user who is a member of a VO (lsgrid in this example) has their certificate installed on the Spider login node and copies data between dCache and their home directory on Spider.

Globus client

Please note that you need a valid proxy to run the following commands.

  • Listing directories on dCache:

    globus-url-copy -list gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/
    
  • Copy file from dCache to Spider:

    globus-url-copy \
        gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar \
        file:///`pwd`/your-data.tar
    
  • Copy file from Spider to dCache:

    globus-url-copy \
        file:///$HOME/your-data.tar \
        gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar
    
  • Copy directory from dCache to Spider:

First create the directory locally, e.g. testdir.

    globus-url-copy -cd -r \
        gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/testdir/ \
        file:///$HOME/testdir/

The globus-* clients do not offer options to create or delete directories, or to delete files. For these operations you can use the gfal client described below.

gfal client

Note

The gfal commands fail on our CentOS 8 worker nodes due to the security setup. The workaround is to run the commands below from a CentOS 7 container, for example:

singularity run -B /etc/grid-security/certificates \
    /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 \
    gfal-ls -l [gsiftp://path-to-file]
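
If you need to run several gfal commands this way, a small shell wrapper saves typing. The sketch below simply wraps the workaround command from the note above in a function; the name gfal_in_c7 is illustrative:

# Minimal sketch: run any gfal command inside the CentOS 7 container from the note above.
gfal_in_c7 () {
    singularity run -B /etc/grid-security/certificates \
        /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 "$@"
}

# Example usage:
# gfal_in_c7 gfal-ls -l gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/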

Please note that you need a valid proxy to run the following commands.

  • Listing directories on dCache:

    gfal-ls -l gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/

  • Create directory on dCache:

    gfal-mkdir gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/newdir/

  • Copy file from dCache to Spider:

    gfal-copy \
        gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar \
        file:///`pwd`/your-data.tar

  • Copy file from Spider to dCache:

    gfal-copy \
        file:///$HOME/your-data.tar \
        gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar

  • Remove a file from dCache:

    gfal-rm gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar

  • Remove a whole (non-empty) directory from dCache:

    gfal-rm -r gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/

Recursive transfer of files (transferring a directory) is not supported with the gfal-copy command. For this purpose you may use globus-url-copy.
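
If you prefer to stay with gfal, a workaround is to copy the files one by one. The sketch below is a minimal example that assumes the dCache directory is flat (no subdirectories) and that the file names contain no spaces:

# Minimal sketch: copy every entry of a flat dCache directory with gfal.
SRC=gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/testdir
mkdir -p testdir
# gfal-ls prints one entry name per line; copy each entry individually.
for f in $(gfal-ls "$SRC"); do
    gfal-copy "$SRC/$f" "file://$PWD/testdir/$f"
done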

Tip

Need more examples? See gfal Grid documentation and globus Grid documentation

3.2.2.2.3. Grid data processing

Below we show an example for I/O-intensive applications. In this example you submit a job on Spider that performs the following steps:

  • Creates a runtime directory on local scratch (or $TMPDIR)

  • Retrieves the input data from dCache

  • Runs the analysis

  • Stores the output data on dCache

Here is a job script template for local scratch usage:

#!/bin/bash
#SBATCH -N 1      #request 1 node
#SBATCH -c 1      #request 1 core and 8GB RAM
#SBATCH -t 5:00   #request a 5 minute job slot

mkdir "$TMPDIR"/myanalysis
cd "$TMPDIR"/myanalysis
gfal-copy gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/path-to-your-data/your-data.tar file:///`pwd`/your-data.tar

# = Run your analysis here =

#when done, copy the output to dCache
tar cf output.tar output/
gfal-copy file:///`pwd`/output.tar gsiftp://gridftp.grid.sara.nl:2811/pnfs/grid.sara.nl/data/path-to-your-data/output.tar
echo "SUCCESS"
exit 0

Please note that the above example assumes that the data is present on the disk storage of dCache. If the data is stored on tape, it may first need to be copied to disk; this is called staging.
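
As a rough illustration, a staging request can be issued with gfal-bringonline from the gfal2-util suite. The SRM endpoint and the pin lifetime below are assumptions for illustration only; check the Grid documentation for the endpoint and staging procedure that apply to your VO:

# Rough sketch: request that a file on tape is staged to disk.
# Assumptions: gfal-bringonline (gfal2-util) is available and srm.grid.sara.nl:8443
# is the SRM endpoint for your VO; the 86400 s pin lifetime is illustrative.
gfal-bringonline --pin-lifetime 86400 \
    srm://srm.grid.sara.nl:8443/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar

# Check whether the file is available on disk (ONLINE) or only on tape (NEARLINE):
gfal-xattr \
    srm://srm.grid.sara.nl:8443/pnfs/grid.sara.nl/data/lsgrid/path-to-your-data/your-data.tar \
    user.status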