Data Transfer Tools: GridFTP (certificate-based authentication)

This article describes how to transfer data using gridftp with certificate-based authentication. It covers:

  • Basics of certificate-based authentication
  • Getting a short-term credential
  • Example usage

Basics of certificate-based authentication

Gridftp servers commonly use a network of "trust" based on electronic certificates. In order to make use of a gridftp server at one end of your proposed transfer, you will need to use a certificate which identifies you as the user, and which is issued by an identity provider which is "trusted" by the servers at both ends. The trust between the servers is maintained by the administrators of the service who will ensure that the necessary certificates are in place.

The presentation of a valid credential which is trusted by the server at the other end is merely the authentication step (proving who you are). Authorisation also needs to follow: you, as a user (identified by the credential you present) need to be authorised to use the resource at the other end. You should check with the operator of the other gridftp server to see what additional steps are required before you can actually perform a transfer.

Getting a short-term credential

There are currently 2 options available on JASMIN for obtaining a short-term credential for use with the JASMIN gridftp server:

  1. via interaction with Online Certificate Authority ("OnlineCA" method, preferred)
  2. via MyProxy

Both currently involve providing the credentials of your CEDA account, and being issued with a small file which you then use in subsequent gridftp commands.

OnlineCA method (preferred)

On the machine you intend to use as the transfer client, e.g. jasmin-xfer1.ceda.ac.uk, change to your JASMIN home directory, download a shell script which will interact with the Online CA for you, and make it executable:

$ wget https://raw.githubusercontent.com/cedadev/online_ca_client/master/contrail/security/onlineca/client/sh/onlineca-get-cert-wget.sh
$ chmod u+x onlineca-get-cert-wget.sh

View help information for the shell script:

$ ./onlineca-get-cert-wget.sh -h

Obtain a credential, to be written to an output file credfile using your CEDA username USERNAME:

$ ./onlineca-get-cert-wget.sh -U https://slcs.ceda.ac.uk/onlineca/certificate/ -l USERNAME -o ./credfile

When prompted, enter the password associated with your CEDA account (NOT your SSH passphrase).

The credential obtained by this method is valid by default for 72 hours, as you can see from the Validity section when inspecting the certificate with the following command:

$ openssl x509 -noout -in credfile -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 2115 (0x843)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: DC=uk, DC=ac, DC=ceda, O=STFC RAL, CN=Centre for Environmental Data Analysis
        Validity
            Not Before: Jan 17 12:56:19 2017 GMT
            Not After : Jan 20 12:56:19 2017 GMT
        Subject: DC=uk, DC=ac, DC=ceda, O=STFC RAL, CN=https://ceda.ac.uk/openid/Your.Name
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:fa:7d:40:0f:9c:5b:84:dd:6b:39:99:a8:3b:76:
...
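
Rather than reading the Validity dates by eye, the remaining lifetime can be checked in a script. The following is a minimal sketch, assuming openssl is installed and that the certificate is the first PEM block in the credential file (as the openssl x509 example above suggests). The function name cred_valid_for is made up for illustration.

```shell
# Succeeds if the certificate in CREDFILE is valid for at least MIN_SECONDS more.
# cred_valid_for is a hypothetical helper name, not part of any CEDA tooling.
cred_valid_for() {
    credfile=$1
    min_seconds=${2:-3600}
    # -checkend exits 0 only if the certificate will NOT expire within the period;
    # a missing or unreadable file also gives a non-zero exit status.
    openssl x509 -noout -checkend "$min_seconds" -in "$credfile" >/dev/null 2>&1
}

# Example: check that ./credfile has at least one hour left before a transfer.
if cred_valid_for ./credfile 3600; then
    echo "credential is valid for at least another hour"
else
    echo "credential is missing, unreadable or about to expire: obtain a new one"
fi
```

A check like this is most useful at the start of a long-running transfer script, so that it fails early instead of part-way through.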

The first time you use this method, it may be necessary to "bootstrap trust" to ensure that your local store of certificates contains the correct entries for the servers you intend to interact with:

$ wget https://raw.githubusercontent.com/cedadev/online_ca_client/master/contrail/security/onlineca/client/sh/onlineca-get-trustroots.sh
$ chmod u+x onlineca-get-trustroots.sh
$ ./onlineca-get-trustroots.sh -U https://slcs.ceda.ac.uk/onlineca/trustroots/ -b
Bootstrapping Short-Lived Credential Service root of trust.
Trust roots have been installed in /home/users/USERNAME/.globus/certificates.
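
To decide whether the bootstrap step is still needed, you can check whether the trust root directory already has content. This is only a rough sketch: it assumes the default location reported in the output above and tests for the presence of files, not for their correctness. The function name have_trustroots is hypothetical.

```shell
# Succeeds if DIR exists and contains at least one entry.
# The default is the location reported by onlineca-get-trustroots.sh above.
have_trustroots() {
    dir=${1:-$HOME/.globus/certificates}
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

if have_trustroots; then
    echo "trust roots found"
else
    echo "no trust roots found: run the bootstrap step"
fi
```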

MyProxy Logon method (alternative)

Use the myproxy-logon utility on the machine you intend to use as the transfer client, e.g. jasmin-xfer1.ceda.ac.uk, to obtain a short-term credential. Note that the address of the server in this case is slcs1.ceda.ac.uk (and not slcs.ceda.ac.uk as used in the OnlineCA service URL, in the previous section).

View help information for the myproxy-logon command:

$ myproxy-logon -help

Issue the command to obtain a short-term credential:

$ myproxy-logon -s slcs1.ceda.ac.uk -o credfile -l USERNAME

When prompted, enter the password associated with your CEDA account (NOT your SSH passphrase).

The first time you use this method, it may be necessary to add the -b option to ensure that your local store of certificates contains the correct entries for the servers you intend to interact with:

$ myproxy-logon -s slcs1.ceda.ac.uk -o credfile -l USERNAME -b

The credential obtained by this method is valid by default for 12 hours, but can be extended to cover a longer period using the -t option (currently up to 72 hours). See the help information (above) for more details.
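
As an illustration of the -t option, the sketch below prints (but does not run) a myproxy-logon command for a requested lifetime in hours, rejecting values outside the limits quoted above. The function name myproxy_cmd is hypothetical, and the 72-hour ceiling is taken from this article, so it may change.

```shell
# Print a myproxy-logon command requesting a credential lifetime in hours.
# myproxy_cmd is a made-up helper; the 12 h default and 72 h maximum are the
# figures quoted in the text above.
myproxy_cmd() {
    username=$1
    hours=${2:-12}
    case $hours in
        *[!0-9]*) echo "lifetime must be a whole number of hours" >&2; return 1 ;;
    esac
    if [ "$hours" -lt 1 ] || [ "$hours" -gt 72 ]; then
        echo "lifetime must be between 1 and 72 hours" >&2
        return 1
    fi
    printf 'myproxy-logon -s slcs1.ceda.ac.uk -o credfile -l %s -t %s\n' "$username" "$hours"
}

# Request the maximum lifetime for the placeholder user USERNAME.
myproxy_cmd USERNAME 72
```

Copy the printed command and run it on the transfer client, replacing USERNAME with your CEDA username.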

Example Gridftp usage

Once you have obtained a valid short-term credential on the client transfer server, and assuming that the gridftp server at the remote end of the transfer recognises you and is able to authorise you via this credential (see here for an example of how to set this up with ARCHER/RDF), you should be able to transfer data between the remote end (server) and the local end (client) with commands such as those in the numbered examples below.

Please consult the documentation for the globus-url-copy command for the full range of options and arguments.

Please note that the examples below use a fictitious server gridftp.remotesite.ac.uk which needs to be replaced in your commands with the hostname of the gridftp server you are trying to connect to.

$ globus-url-copy -help

See also http://toolkit.globus.org/toolkit/docs/latest-stable/gridftp/user/#gridftp-user-basic

1. Remote directory listing issued by client on jasmin-xfer1.ceda.ac.uk to server gridftp.remotesite.ac.uk where you have a home directory /home/users/USERNAME:

$ globus-url-copy -cred credfile -vb -list gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/

2. Download a file from remote directory /home/users/USERNAME to destination on the client machine, for example a group workspace on JASMIN:

$ globus-url-copy -cred credfile -vb gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/myfile file:///group_workspaces/jasmin/myworkspace/myfile

The -p N and -fast options can additionally be used in combination to enable N parallel streams at once, as shown below. You can experiment with N in the range 4 to 32 to obtain the best performance, but please be aware that many parallel transfers can draw heavily on shared resources and degrade performance for other users:

$ globus-url-copy -cred credfile -vb -p 16 -fast gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/myfile file:///group_workspaces/jasmin/myworkspace/myfile
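
One way to experiment with N is to repeat the same transfer with a few values and compare the average rate reported by -vb. The sketch below does this for the placeholder URLs used above; the function name try_streams and the DRY_RUN switch are illustrative assumptions, not part of globus-url-copy. With DRY_RUN=1 it only prints the commands, so nothing is contacted.

```shell
# Placeholder source and destination from the example above; replace both.
SRC=gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/myfile
DST=file:///group_workspaces/jasmin/myworkspace/myfile

# Repeat the transfer once for each stream count given as an argument.
# try_streams and DRY_RUN are hypothetical names used only in this sketch.
try_streams() {
    for n in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Dry run: show the command that would be executed.
            echo "globus-url-copy -cred credfile -vb -p $n -fast $SRC $DST"
        else
            echo "== $n parallel streams =="
            # -vb prints the average transfer rate, which is the figure to compare.
            globus-url-copy -cred credfile -vb -p "$n" -fast "$SRC" "$DST"
        fi
    done
}

DRY_RUN=1
try_streams 4 8 16 32
```

Set DRY_RUN=0 to perform the transfers. Keep the test file modest in size and the list of values short, for the shared-resource reasons given above.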

3. Recursively download the contents of a directory on a remote location to a local destination:

$ globus-url-copy -cred credfile -vb -p 4 -fast -cc 4 -cd -r gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/mydir/ file:///group_workspaces/jasmin/myworkspace/mydir/

Where:

  • -cc N requests N concurrent transfers (in this case, each with -p 4, i.e. 4 parallel streams)
  • -cd requests creation of the destination directory if this does not already exist
  • -r denotes recursive transfer of directories
  • -sync and -sync-level options can be used to synchronise data between the two locations, where destination files do not exist or differ (by criteria that can be selected) from corresponding source files. See -help option for details.
  • the file:/// URI is used to specify the destination on the local file system.
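
The -sync and -sync-level options mentioned in the list above can be combined with the recursive form. The sketch below prints (but does not run) such a command. It assumes that -sync-level accepts the values 0 to 3, which is how the -help text is usually worded, so please confirm this against -help on your own installation. The function name sync_cmd is hypothetical.

```shell
# Print a recursive, synchronising globus-url-copy command.
# Assumption to verify with "globus-url-copy -help": sync levels run from 0 to 3.
sync_cmd() {
    level=$1
    src=$2
    dst=$3
    case $level in
        0|1|2|3) ;;
        *) echo "sync level must be 0, 1, 2 or 3" >&2; return 1 ;;
    esac
    printf 'globus-url-copy -cred credfile -vb -cd -r -sync -sync-level %s %s %s\n' \
        "$level" "$src" "$dst"
}

# Placeholder URLs from example 3 above.
sync_cmd 1 gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/mydir/ \
    file:///group_workspaces/jasmin/myworkspace/mydir/
```

Re-running a synchronising transfer after an interruption should then skip files that already match at the destination, according to the criterion selected by the level.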

Uploading data

The above commands can also be adapted to invoke transfers from a local source to a remote destination, i.e. uploading data, since the commands all take the following general form:

$ globus-url-copy [OPTIONS] source-uri destination-uri

However, when using gridftp with certificate-based authentication as described here, the JASMIN transfer servers jasmin-xfer[12].ceda.ac.uk and cems-xfer1.cems.rl.ac.uk can only act as clients: you need to be logged in via SSH to one of these hosts, and initiate the transfer there by invoking globus-url-copy in one of the ways shown above.
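
For an upload, the local file becomes the source URI and the remote gsiftp:// location becomes the destination. The following sketch prints (but does not run) an upload command built from an absolute local path. The function names file_uri and upload_cmd are made up for illustration, and paths containing spaces or other special characters are not handled.

```shell
# Turn an absolute local path into a file:/// URI.
# file_uri and upload_cmd are hypothetical helpers for this sketch only.
file_uri() {
    case $1 in
        /*) printf 'file://%s\n' "$1" ;;
        *)  echo "need an absolute path: $1" >&2; return 1 ;;
    esac
}

# Print an upload command: local source first, remote destination second.
upload_cmd() {
    src=$(file_uri "$1") || return 1
    printf 'globus-url-copy -cred credfile -vb %s %s\n' "$src" "$2"
}

# Placeholder path and server taken from the download examples above.
upload_cmd /group_workspaces/jasmin/myworkspace/myfile \
    gsiftp://gridftp.remotesite.ac.uk/home/users/USERNAME/myfile
```

The same options as for downloads (-p, -fast, -cc, -cd, -r) can be added to the printed command.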

JASMIN GridFTP server

So far the examples have used a server within JASMIN as the client in the GridFTP transfer. To transfer data using a JASMIN host as the destination URL, you need to interact with the JASMIN GridFTP server data-xfer1.ceda.ac.uk. You cannot log in to this server directly via SSH: you can only initiate GridFTP transfers to and from it from another client.

In the following example, the client runs on the ARCHER/RDF server dtn01.rdf.ac.uk and tests the connection by transferring from /dev/zero on the local machine (at RDF) to /dev/null on the JASMIN gridftp server. Note that you can use the SLCS server at CEDA to obtain the short-term credential required.

[username@dtn01 ~] $ myproxy-logon -s slcs1.ceda.ac.uk -l <ceda username> -o credfile
[username@dtn01 ~] $ globus-url-copy -cred credfile -vb -p 8 -fast /dev/zero gsiftp://data-xfer1.ceda.ac.uk/dev/null
Source: file:///dev/
Dest:   gsiftp://data-xfer1.ceda.ac.uk/dev/
  zero  ->  null

4153409536 bytes       792.20 MB/sec avg       792.20 MB/sec inst

This server is also used as the JASMIN GridFTP Server Globus endpoint: see GridFTP transfers using Globus Online.

Please note that the servers jasmin-xfer[12].ceda.ac.uk and cems-xfer1.cems.rl.ac.uk are not gridftp servers. They have the globus-url-copy client installed, so can be used as clients to connect to remote gridftp servers, and they also support gridftp over SSH (both incoming and outgoing), but they do not act as servers for certificate-based gridftp as shown in these examples. The JASMIN gridftp server for read-write access to home directories and group workspaces is data-xfer1.ceda.ac.uk. Access to this server requires registration for high-performance data transfer. See also Transfer Servers.

Third-party transfers

It should be possible, with the correct configuration at each site, to initiate on host A a transfer of data between two other gridftp servers B and C (a third-party transfer). Both URIs would use gsiftp: as the protocol:

$ globus-url-copy -vb -p 4 gsiftp://B/source gsiftp://C/destination

Further information on this method will be provided in due course. In the meantime, Globus Online provides a managed service to orchestrate and monitor transfers between gridftp endpoints in a more user-friendly way, and is recommended as an alternative to setting up third-party transfers manually. See GridFTP transfers using Globus Online.
