Ingest and Publication of ESGF Datasets

Introduction

This page provides details of the ingest and publication workflow for datasets that are bound for the Earth System Grid Federation (ESGF) services. The basic workflow is as follows:

A collection of files (a "batch") arrives.
Run the ceda-cc compliance checker on the batch.
Run the drs_tool to ingest the data into the archive.
Run the post_ingest_processor.py script to post-process/check the data once in the archive.
Run the generate_mapfiles.py script to generate mapfiles to be used for ESGF publication.
Run the ESGF publisher to scan the data.
Run the ESGF publisher to generate THREDDS catalogues.
Run the ESGF publisher to put the data in the ESGF Search Catalogue.
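
The ordering of the steps above can be sketched as a small driver that runs each stage in turn and stops at the first failure. This is only an illustrative sketch, not the production tooling: the `run_workflow` function, the `STEPS` table, and the convention of passing the batch directory as the last argument are all assumptions, and the real arguments for each tool are omitted. The publisher entries are placeholders, since the exact publisher commands are not given on this page.

```python
import subprocess

# Ordered workflow stages as (label, command prefix).
# The first four command names come from the workflow list; their real
# arguments are not documented here, so none are shown.
# The publisher entries are placeholders (hypothetical), not real commands.
STEPS = [
    ("compliance check", ["ceda-cc"]),
    ("ingest", ["drs_tool"]),
    ("post-ingest processing", ["post_ingest_processor.py"]),
    ("mapfile generation", ["generate_mapfiles.py"]),
    ("publisher scan", ["<esgf-publisher>", "<scan-options>"]),
    ("THREDDS catalogue generation", ["<esgf-publisher>", "<thredds-options>"]),
    ("search catalogue publication", ["<esgf-publisher>", "<publish-options>"]),
]


def run_workflow(batch_dir, steps=STEPS, runner=subprocess.call):
    """Run each stage in order on batch_dir, stopping at the first failure.

    Returns (completed_labels, failed_label); failed_label is None when
    every stage exits with status 0. Appending batch_dir as the final
    argument is an assumption made for this sketch.
    """
    completed = []
    for label, command in steps:
        status = runner(list(command) + [batch_dir])
        if status != 0:
            return completed, label
        completed.append(label)
    return completed, None
```

Stopping at the first non-zero exit status reflects the dependency between stages: a batch that fails the compliance check should not be ingested, and data that has not been ingested cannot be published.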

While this documentation is being developed, worked examples are being recorded on the following page:

opman/ingest/ESGFIngestAndPublication/SPECSIngest

There is also a page on setting up the software environment. The following page describes what needs to be done to automate this process: opman/ingest/ESGFIngestAndPublication/SPECSIngestAutomation