Delivery Tools

Available Delivery Tools

Data providers can upload data to CEDA using a variety of tools. The data scientist liaising with the provider will be happy to advise on the most appropriate mechanism and will help to set up the data provision route.

Delivery route | Types of data transfer it is suitable for | For archiving | For GWS delivery
HTTP (the CEDA file uploader) | Suitable for small-scale data providers and short-lived projects | yes | no
FTP | Suitable for small to medium scale data uploads for suitable projects where RSYNC is not an option | yes | yes
RSYNC | Our preferred delivery mechanism, which is suitable for all types of data provision. This route is particularly suited to regular, automated data uploading and is especially useful for very large files and dataset transfers | yes | yes

HTTP

The file uploader service (http://arrivals.ceda.ac.uk/) allows single files to be transferred to CEDA for archive ingestion. Please ask your CEDA liaison officer if you need to know which ingest stream to use.

FTP

Any FTP client can be used to connect to CEDA's FTP arrivals server:

ftp <userId>@arrivals.ceda.ac.uk

After logging in with your CEDA account credentials you will arrive at your delivery area, which will contain a sub-directory for each data "stream" to which you are permitted to deposit data. If you are unable to locate the required sub-directory, or are unsure which one to use, please contact your CEDA team liaison.

If possible, please upload each file under a temporary filename and rename it once the transfer has completed, so that we can distinguish partially transferred files from completely transferred ones.
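The temporary-name approach can be sketched as a small shell function that prints the commands for a command-line FTP client. This is only an illustration: the function name, the ".part" suffix and the example stream and file names are assumptions, and the exact commands accepted may vary between FTP clients.

```shell
# Print FTP client commands that upload a file under a temporary name and
# rename it only after the transfer has finished.
# Arguments: <stream sub-directory> <local file>
# Note: this simple sketch does not handle paths containing spaces.
ftp_upload_commands() {
    stream_dir=$1
    local_file=$2
    final_name=$(basename "$local_file")
    temp_name="$final_name.part"    # ".part" is an arbitrary, hypothetical suffix

    printf 'cd %s\n' "$stream_dir"                       # never upload to the top folder
    printf 'binary\n'                                    # transfer data files unmodified
    printf 'put %s %s\n' "$local_file" "$temp_name"      # upload under the temporary name
    printf 'rename %s %s\n' "$temp_name" "$final_name"   # mark the file as complete
    printf 'bye\n'
}

# Example (hypothetical stream and file names), printing the commands for review:
# ftp_upload_commands my_stream /data/obs_20240101.nc
```

The printed commands can be typed into an interactive FTP session or supplied to a client that reads commands from standard input. Because the rename only happens after the upload, a file that still has the temporary suffix can be treated as incomplete.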

Please note - we ask depositors not to deposit any files into their top folder, but always to use one of the available sub-folders (please see the note above about ingest streams).

Details of how to use FTP are available here (note that the general FTP guide was written using the download FTP server, as opposed to the upload FTP server).

RSYNC

Once you have your RSYNC server account login (this is different from your normal CEDA account login and is available from your CEDA team liaison officer), data can be rsynced as follows:

rsync -av --password-file=<path to password file> <path to source> <ceda account id>@arrivals.ceda.ac.uk::<ceda account id>/<TARGET_DIR>

For example:

rsync -av --password-file=mypasswordfile.txt data_dir fbloggs@arrivals.ceda.ac.uk::fbloggs/upload_dir

Notes:

  1. Note the double colon (::) between the arrivals.ceda.ac.uk host name and the user's account ID.
  2. The option to use a file to hold the rsync account password is shown above. This is recommended for routine deliveries, but may not be needed for rsync transfers run by hand.
  3. If the password file approach is used (see note 2), the password file must be readable and writable only by its owner, i.e. it should show as having permissions "rw-------". Otherwise a "password file must not be other-accessible" error will occur.
  4. Other rsync options can also be used, e.g. -r to recurse down through the source (the -a archive option shown above already implies this).
  5. To rsync the contents of a source directory rather than the directory itself, add a trailing slash ("/") to the source path. No trailing slash is needed on the target directory path.
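As an illustration of the password file permissions described in the notes, the sketch below creates a file that only its owner can read and write. The function names and the example file name are hypothetical and the approach is one of several that would work.

```shell
# Create a password file readable and writable only by its owner.
# Arguments: <path to password file>; the password is read from standard input.
make_password_file() {
    password_file=$1
    (
        umask 077                  # files created here get no group/other access
        cat > "$password_file"
    )
    chmod 600 "$password_file"     # enforce rw------- even if the file already existed
}

# Print the permission string of a file, e.g. "-rw-------".
show_permissions() {
    ls -l "$1" | cut -c1-10
}

# Example (hypothetical file name; the password is typed on standard input,
# finishing with Ctrl-D):
# make_password_file mypasswordfile.txt
# show_permissions mypasswordfile.txt
```

The leading "-" in the output of show_permissions is the file type indicator printed by ls; the nine characters after it are the permissions that rsync checks.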

Please be aware, however, that RSYNC will carry out a full comparison between the source and the destination. Thus, if you wish to send only a few files from your source to update the CEDA archive holdings, care is needed to avoid unnecessarily transferring large parts of the source to the CEDA system.
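One way to limit a transfer to a few files is sketched below, using the standard rsync options --files-from (send only the paths listed in a file, given relative to the source directory) and --dry-run (report what would be sent without sending it). The function name and all argument values are hypothetical, and this is not a CEDA-prescribed procedure; please check with your CEDA liaison before relying on it. The function prints the command for review rather than running it.

```shell
# Print (rather than run) an rsync command that sends only the files named in
# a list file. All argument values are placeholders.
# Arguments: <account id> <password file> <file list> <source dir> <target dir>
print_selective_rsync() {
    account=$1
    password_file=$2
    file_list=$3
    source_dir=$4
    target_dir=$5
    echo rsync -av --dry-run \
        "--password-file=$password_file" \
        "--files-from=$file_list" \
        "$source_dir/" \
        "$account@arrivals.ceda.ac.uk::$account/$target_dir"
}

# Example (hypothetical values):
# print_selective_rsync fbloggs mypasswordfile.txt to_send.txt data_dir upload_dir
```

Once the dry-run output lists only the intended files, the same command can be run without --dry-run to perform the transfer.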

After logging in with your CEDA account credentials you will arrive at your delivery area, which will contain sub-directories for each data "ingest stream" to which you are permitted to deposit data. If you are unable to locate the required sub-directory, or are unsure which one to use, please contact your CEDA team liaison.

Please note - we ask depositors not to deposit any files into their top folder, but always to use one of the available sub-folders.

Ingest from Group-workspaces/Project-spaces

CEDA supports projects through shared storage spaces such as JASMIN group workspaces or FTP project spaces. Users of these services should understand that:

  • this is NOT the archive - placing data into these areas will not constitute having deposited data in the CEDA archive
  • the group-workspaces/project-spaces are NOT managed by CEDA and so content should be considered at risk

However, it is possible to prepare a dataset in these areas for eventual ingestion into the archive. If you wish to do this, please contact your CEDA support officer in the first instance to discuss ingestion, as it may be possible to ingest directly from these areas.
