Available Delivery Tools
Data providers can upload data to CEDA using a variety of tools. The data scientist liaising with the provider will advise on the most appropriate mechanism and help to set up the data provision route.
|Delivery route||Suitable for||For archiving||For group workspace (GWS) delivery|
|HTTP (the CEDA file uploader)||Small-scale data providers and short-lived projects||y||n|
|FTP||Small to medium scale data uploads for projects where RSYNC is not an option||y||y|
|RSYNC||Our preferred delivery mechanism, suitable for all types of data provision. This route is particularly suited to regular, automated uploads and to very large files and datasets.||y||y|
The file uploader service allows single files to be transferred to CEDA for archive ingestion: http://arrivals.ceda.ac.uk/. Please ask your CEDA liaison officer if you need to know which ingest stream to use.
Any ftp client can be used to connect to CEDA's ftp arrivals server:
After logging in with your CEDA account credentials you will arrive at your delivery area, which contains a sub-directory for each data "stream" to which you are permitted to deposit data. If you are unable to locate the required sub-directory, or are unsure which one to use, please contact your CEDA team liaison.
If possible, please upload each file under a temporary filename and rename it once the transfer has completed, so that we can distinguish partially transferred files from complete ones.
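The upload-then-rename pattern described above could be sketched as follows. This is only an illustration, assuming curl is used as the FTP client (any client that can rename remote files would do). The function name `upload_then_rename`, the `.part` suffix and the `CURL` override are hypothetical, and the host, stream directory and user name must be replaced with the values given by your CEDA liaison.

```shell
#!/bin/sh
# Hypothetical helper: upload a file under a temporary name, then rename it.
# Arguments: local file, FTP host, stream sub-directory, user name.
# All four are placeholders supplied by the caller; none are real CEDA values.
upload_then_rename() {
    file="$1"; host="$2"; stream_dir="$3"; user="$4"
    name="$(basename "$file")"
    tmp_name="${name}.part"   # assumed temporary suffix, not a CEDA convention
    # -T uploads the file under the temporary name. A -Q command prefixed
    # with "-" is sent only after a successful transfer, so the rename
    # marks the file as complete. With no password in --user, curl prompts.
    # Set CURL=echo to preview the command instead of running it.
    ${CURL:-curl} --user "$user" \
        -T "$file" "ftp://${host}/${stream_dir}/${tmp_name}" \
        -Q "-RNFR ${tmp_name}" \
        -Q "-RNTO ${name}"
}
```

For example, `CURL=echo; upload_then_rename data.nc ftp.example.org mystream fbloggs` prints the curl command that would be run without contacting any server.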
Please note - we ask depositors not to deposit any files into their top folder, but always to use one of the available sub-folders (please see the note above about ingestion streams).
Details of how to use ftp are available here (note, the general ftp guide has been written using the download ftp server, as opposed to the upload ftp server).
Once you have your RSYNC server account login (this is different from your normal CEDA account login - available from your CEDA team liaison officer) data can be rsynced as follows:
rsync -av --password-file=<path to password file> <path to source> <ceda_account_id>@arrivals.ceda.ac.uk::<ceda_account_id>/<TARGET_DIR>
rsync -av --password-file=mypasswordfile.txt data_dir fbloggs@arrivals.ceda.ac.uk::fbloggs/upload_dir
- Note the double colon ("::") between the arrivals.ceda.ac.uk host name and the user's account ID.
- The option to use a file to hold the rsync account password is shown above. This is recommended for routine deliveries, but may not be needed for rsync transfers run by hand.
- If the password file approach is used (see the previous note), the password file must be readable and writable by its owner only, i.e. it should show permissions "rw-------". Otherwise a "password file must not be other-accessible" error will occur.
- Other rsync options can also be used, e.g. -a (archive mode), which recurses down through the source.
- To rsync the contents of a source directory rather than the directory itself, add a trailing slash ("/") to the source path. No trailing slash is needed on the target directory path.
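The notes on password file permissions and trailing slashes can be tried out locally before sending anything to CEDA. The sketch below works only in a temporary directory. The file and directory names are illustrative and the password is a dummy value.

```shell
#!/bin/sh
# Work in a throwaway directory; nothing here contacts the CEDA server.
workdir="$(mktemp -d)"

# Create a password file and restrict it to owner read/write (mode 600).
pwfile="$workdir/mypasswordfile.txt"
printf '%s\n' 'not-a-real-password' > "$pwfile"
chmod 600 "$pwfile"
ls -l "$pwfile" | cut -c1-10    # prints -rw-------

# Trailing-slash behaviour, shown with local source and target directories.
mkdir -p "$workdir/data_dir" "$workdir/with_slash" "$workdir/without_slash"
touch "$workdir/data_dir/file.nc"
if command -v rsync >/dev/null 2>&1; then
    # Trailing slash: the contents of data_dir are copied (file.nc).
    rsync -a "$workdir/data_dir/" "$workdir/with_slash"
    # No trailing slash: the directory itself is copied (data_dir/file.nc).
    rsync -a "$workdir/data_dir" "$workdir/without_slash"
    find "$workdir/with_slash" "$workdir/without_slash" -type f
fi
```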
Please be aware that RSYNC carries out a full comparison between the source and the destination. If you wish to send only a few files from your source to update the CEDA archive holdings, care is needed to avoid unnecessarily transferring large parts of the source to the CEDA system.
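One possible way to limit a transfer to a few files is rsync's `--files-from` option, combined with `--dry-run` to preview what would be sent. The sketch below demonstrates both against local temporary directories. It is an illustration rather than a CEDA-prescribed procedure, and the file names are made up. For a real delivery the local target would be replaced by the remote target shown earlier, together with `--password-file` if one is used.

```shell
#!/bin/sh
# Local demonstration using temporary directories; no data is sent to CEDA.
src="$(mktemp -d)"
dest="$(mktemp -d)"
touch "$src/a.nc" "$src/b.nc" "$src/c.nc"

# List only the files to send, one per line, relative to the source directory.
list="$(mktemp)"
printf '%s\n' 'b.nc' > "$list"

if command -v rsync >/dev/null 2>&1; then
    # --dry-run reports what would be transferred without copying anything.
    rsync -av --dry-run --files-from="$list" "$src" "$dest"
    # The same command without --dry-run performs the transfer.
    rsync -av --files-from="$list" "$src" "$dest"
    ls "$dest"
fi
```

Only `b.nc` reaches the target directory. The other files in the source are not transferred.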
After logging in with your CEDA account credentials you will arrive at your delivery area, which contains a sub-directory for each data "ingest stream" to which you are permitted to deposit data. If you are unable to locate the required sub-directory, or are unsure which one to use, please contact your CEDA team liaison.
Please note - we ask depositors not to deposit any files into their top folder, but always to use one of the available sub-folders.
Ingest from Group-workspaces/Project-spaces
CEDA supports projects through shared storage spaces such as JASMIN group workspaces or FTP project spaces. Users of these services should understand that:
- this is NOT the archive - placing data into these areas will not constitute having deposited data in the CEDA archive
- the group-workspaces/project-spaces are NOT managed by CEDA and so content should be considered at risk
However, it is possible to prepare a dataset in these areas for eventual ingestion into the archive. If you wish to do this please contact your CEDA support officer in the first instance to discuss ingestion into the archive as it may be possible to ingest directly from these areas.