Ingest script writing conventions
The working versions of ingest scripts are generally in the badc home area under software/datasets.
The scripts should be kept in a git repository on the internal gitlab (https://breezy.badc.rl.ac.uk/), which records changes and acts as a preserved copy of the scripts. Ingest scripts may contain passwords for external services, so the internal gitlab repository is the most suitable place for them.
Ingest scripts are generally small and hard to separate from the configuration used in their deployment. Because both change very frequently, configuration and scripts should be kept in the same package.
Creating an ingest script package
Use the gitlab web interface to make a package.
Use git clone to copy the package into the software/datasets directory.
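The clone step can be sketched as a small shell function. This is an illustration only: the function name clone_ingest_package is made up for this example, the repository URL in the usage comment uses placeholder group and package names, and it assumes that software/datasets sits directly under the home directory of the account running it.

```shell
#!/bin/sh
# Sketch: clone an ingest script package into the working area.
# clone_ingest_package is a hypothetical helper, not an existing tool.

clone_ingest_package() {
    repo_url="$1"        # URL (or path) of the package repository
    datasets_dir="$2"    # working area, e.g. "$HOME/software/datasets" (assumed location)

    # Name the checkout after the repository, without a trailing ".git".
    name=$(basename "$repo_url" .git)

    mkdir -p "$datasets_dir"

    # Do not clone twice: an existing checkout is left untouched.
    if [ -d "$datasets_dir/$name/.git" ]; then
        echo "already cloned: $datasets_dir/$name"
        return 0
    fi

    git clone "$repo_url" "$datasets_dir/$name"
}

# Example call (placeholder group and package names):
# clone_ingest_package "https://breezy.badc.rl.ac.uk/<group>/<package>.git" "$HOME/software/datasets"
```

Leaving an existing checkout alone means the function is safe to re-run and never overwrites uncommitted local changes.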
Updating scripts and configuration
Change files and commit to git
git status
git add <changed files>
git commit -m 'a commit message'
Push the commits to gitlab
git push
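The update steps above can be combined into one helper, sketched below. The function name commit_and_push is hypothetical, and the sketch assumes it is run from inside the package checkout on a branch that already tracks a branch on gitlab. It stages only the files named on the command line, so stray files are not committed by accident.

```shell
#!/bin/sh
# Sketch: review, stage, commit and push changes to an ingest script package.
# commit_and_push is a hypothetical helper, not an existing tool.

commit_and_push() {
    message="$1"
    shift

    # At least one file must be named explicitly.
    if [ "$#" -eq 0 ]; then
        echo "usage: commit_and_push MESSAGE FILE..." >&2
        return 2
    fi

    git status --short      # show what has changed before committing
    git add -- "$@"         # stage only the named files

    # Skip the commit if staging produced no changes.
    if git diff --cached --quiet; then
        echo "nothing staged; no commit made"
        return 0
    fi

    git commit -m "$message" && git push
}

# Example call (hypothetical file names):
# commit_and_push "Update ingest config" ingest.cfg ingest.sh
```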
Scheduling and running
If there is a chance that the script will be run again, either manually or as a scheduled job, it should be added to ingest_control (https://ceda-internal.helpscoutdocs.com/article/4272-ingest-control). Adding a script to ingest_control lets other users control the dataset workflow when the original creator of the script is not available, and also documents the setup and environment the script needs.