Dataset authorisation info
Introduction
(updated 20230215)
This document describes how the authorisation system works. It covers public, registered user and restricted resources.
To set up access control fully, you need to work with three systems:
- accessInstructor - this is where the actual access control within the archive is set for FTP and web download, and for JASMIN access
- userDB - this is where the application system is set up and maps to user accounts for actual user access
- MOLES - will harvest the access information from the accessInstructor for internal content or the CEDA Manual Metadata Store entry for externally hosted items that are catalogued.
In what follows, 'access group' refers to the archive access group, NOT the Linux group, which is referred to as the 'jasmin group' (though that can be set in the accessInstructor too).
Important basic principles to note
- To proceed, you need to know the archive path at which the rules should apply. Rules do NOT have to be set for every dataset within a given point in the archive; they can be set higher up the directory tree if a common access control, licensing and JASMIN access applies to all the datasets within that part of the archive.
- A 'default' entry can be set at this higher point which can be overwritten/superseded by a more specific rule further down.
- These more specific rules can be set to expire, after which access falls back to the default option. This is very useful for embargoed data where the embargo will expire on a given day.
- The default rule, if nothing is set, should be to prevent people from accessing the data, but beware of what has been set further up the directory tree!
- For 'restricted data', a 'group' can be used to give access to a set of resources that share the same access control and licensing.
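The way a specific rule overrides a default higher up the tree, and then gives way again once it expires, can be sketched as below. This is only an illustration of the principles above, not the accessInstructor's actual code: the `Rule` class, the `effective_rule` function and the example paths and group name are all hypothetical, and treating a rule as still valid on its expiry date is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath
from typing import Optional


@dataclass
class Rule:
    path: str                      # archive path the rule is attached to
    rule_type: str                 # 'public', 'reguser' or 'group'
    group: Optional[str] = None    # archive access group, for 'group' rules
    expiry: Optional[date] = None  # rule is ignored after this date (assumption)


def effective_rule(rules, target, today):
    """Return the most specific unexpired rule covering target, else None."""
    target_path = PurePosixPath(target)
    candidates = []
    for rule in rules:
        rule_path = PurePosixPath(rule.path)
        covers = rule_path == target_path or rule_path in target_path.parents
        expired = rule.expiry is not None and rule.expiry < today
        if covers and not expired:
            candidates.append(rule)
    # The deepest matching path wins; no match at all means no access.
    return max(
        candidates,
        key=lambda r: len(PurePosixPath(r.path).parts),
        default=None,
    )


# Hypothetical example: a 'reguser' default with an embargo further down.
rules = [
    Rule("/badc/example", "reguser"),
    Rule("/badc/example/data/embargoed", "group",
         group="example_embargo", expiry=date(2030, 1, 1)),
]
```

With these example rules, a file under `/badc/example/data/embargoed` resolves to the 'group' rule until the expiry date has passed and to the 'reguser' default afterwards, while a path outside `/badc/example` resolves to no rule at all.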
1. Licence file
From the DMP you should have an idea of what licence is needed for the data you are setting up access control for. Ideally, it should be one of the common licences. The 'licence selector' spreadsheet can help determine this.
NOTE: this also includes two generic licences specifically designed for 'embargoed/restricted access' data (the RUGL and RUNCGL licences), which should remove the need to create any bespoke licences in most cases!
However, should you need a new, specific licence that is not already on the CEDA Artefacts server, then:
- get a PDF of the licence file (a PDF version is needed to work with the application system)
- add it to the CEDA Artefacts server under the licences/specific_licences folder in the GitHub repository (https://github.com/cedadev/artifacts). A cron job will then copy it to the live artefacts server within a few minutes.
- Inform Graham that you've added a new licence, as it will then need to be classified for the types of use it permits. [These classifications are used to convey to users what use is permitted and aid in filtering datasets.]
2. Set up accessInstructor rules
This is where the archive "XACML policy files" are set, plus the JASMIN group. The XACML policy files set the web access control. JASMIN access is driven by this process too, though it cannot be set directly because a mapping is required (see 'JASMIN access' below).
- log in to https://accessctl.ceda.ac.uk/admin/
- select 'rules' and do a quick search for the uppermost directory covering the path you want to set a rule for, to see what has already been set. If no existing rule meets your needs, then proceed
- select 'add rule', top right
- enter the path where the rule should apply (use + to add a new one)
- set 'rule type' - select ONE only of:
  - public - for fully 'public' data (non-registered AND registered users will have access).
  - reguser - for those just needing a CEDA user account
  - group - if specific restriction is needed. This will then present a 'Group' selection list: either use a pre-existing group or add a new group using the green plus symbol. REMEMBER THE SELECTED GROUP! A new group name should be lowercase, with no spaces, using only '[a-z0-9]', '-' or '_' as needed.
- Search for and select a licence or add a new licence (see 'Adding a new licence' below)
- If required, set expiry date for the rule
- Add comment if you wish
- click Save
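The naming convention for new groups given under 'rule type' above (lowercase, no spaces, only letters, digits, '-' or '_') can be checked before you type the name into the form. A minimal sketch follows; the function name is hypothetical and the admin site may apply its own validation.

```python
import re

# Lowercase letters, digits, '-' and '_' only; at least one character.
GROUP_NAME = re.compile(r"[a-z0-9_-]+")


def is_valid_group_name(name: str) -> bool:
    """Check a proposed archive access group name against the convention."""
    return GROUP_NAME.fullmatch(name) is not None
```

For example, `is_valid_group_name("ukmo_wx")` passes, whereas a name containing spaces or capitals such as `"My Group"` does not.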
JASMIN access
Due to the limited number of linux groups that can be utilised to control local (JASMIN) access and to enable systems to work properly with the archive, it is not possible to implement the archive access groups fully as jasmin groups. Instead there are mappings between some archive access options and jasmin groups. Changes _can_ be made, but will need to be discussed with the head of curation.
The following table describes the main JASMIN access groups in use:
| linux group | archive mapping/notes |
| --- | --- |
| open | 'public' or 'reguser' | 
| bacyl | used when a 'group' rule is in use and the group is not mapped to one of the categories below |
| badcint | no download access | 
| cmip5_research | Selected CMIP datasets under open research use CMIP licence | 
| esacat1 | ESA Category 1 datasets | 
| ecmwf | all ECMWF dataset groups | 
| eurosat | ESA satellite dataset groups | 
| ukmo_wx | Met Office weather datasets under `ukmo_wx` and `ukmo_wx_gov` | 
| ukmo_clim | Met Office climate datasets - range of access groups | 
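As a rough illustration of how the table reads, the sketch below maps a rule type (and, for 'group' rules, the archive access group) to a jasmin group. It encodes only the mappings spelled out explicitly in the table; the function and dictionary names are hypothetical, and the real mapping is maintained within the accessInstructor and may differ.

```python
# Only mappings stated explicitly in the table; everything else is left out.
DEDICATED_JASMIN_GROUPS = {
    "ukmo_wx": "ukmo_wx",
    "ukmo_wx_gov": "ukmo_wx",
}

# Used for 'group' rules with no dedicated mapping (see table).
FALLBACK_GROUP = "bacyl"


def jasmin_group_for(rule_type, access_group=None):
    """Return the jasmin (Linux) group implied by an archive access rule."""
    if rule_type in ("public", "reguser"):
        return "open"
    if rule_type == "group":
        return DEDICATED_JASMIN_GROUPS.get(access_group, FALLBACK_GROUP)
    raise ValueError(f"unknown rule type: {rule_type!r}")
```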
Adding a new licence.
If you need to add a new licence then provide the following:
- code - a short, lowercase code to be used to speed selection
- title - the full, verbose name of the licence
- url link - link to the licence in a reliable location, preferably the copy on the artefacts server (see '1. Licence file' above); if the licence is not hosted externally, add it to the artefacts server first. Note that at present (Aug 2025) links can only be to PDF files; links to HTML pages will not currently be accepted by the registration system.
- comment - a free comment if you wish
- categories - these are classifications for the types of use that the licence permits - speak to Graham about this.
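Before filling in the form it can help to sanity-check the details against the points above. The sketch below is a hypothetical helper, not part of the accessInstructor: the class and function names, and the example URL in the usage note, are made up for illustration. It checks only the constraints stated in this section (lowercase code, a title, and a URL pointing at a PDF).

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class LicenceEntry:
    code: str                 # short, lowercase code
    title: str                # full, verbose name
    url: str                  # link to the licence PDF
    comment: str = ""         # optional free-text comment
    categories: list = field(default_factory=list)  # permitted-use classes


def licence_problems(entry):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not entry.code or entry.code != entry.code.lower():
        problems.append("code should be a short, lowercase string")
    if not entry.title.strip():
        problems.append("title is required")
    # Only PDF links are currently accepted by the registration system.
    if not urlparse(entry.url).path.lower().endswith(".pdf"):
        problems.append("url should point to a PDF file")
    return problems
```

For instance, an entry whose URL ends in `.html` would be reported with "url should point to a PDF file", which is the case most likely to trip up the application system.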
3. Set up the service on the CEDA Services Portal (for 'restricted data')
IF you have set up a new group in step 2 above (rule type 'group') you will also need to set up the access application route in the CEDA Services Portal as follows:
- Log into https://services-beta.ceda.ac.uk/admin/ with your CEDA account.
- Create a new service from the admin site.
- Set the "Category" to "Archive".
- Set the "Name" to the exact name of your "group".
- Add a summary of the service. This will be displayed to users when they register.
- Select "Save"
This will then enable you to test the application system. To do this, use the following URL (replace <datasetid> with the one you've set up):
https://services-beta.ceda.ac.uk/services/archive/<datasetid>/apply/USER/
The licence used for the application is determined by the service.ceda.ac.uk system calling out to the accessCtl.ceda.ac.uk system for the given access group, so there is no need to enter the licence in the Services Portal.
WARNING - this only works with PDFs of the licence and not web URLs at present.
To add an external authoriser to an access route:
- Go to: https://services-beta.ceda.ac.uk/services/archive/<datasetid>/grant/
- insert the CEDA account userID and role 'MANAGER'
- select an expiry date for the role (otherwise you'll get an error message)
- Send the authoriser a link to: CEDA Services Portal – Guide for Dataset Managers
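If you set up several services, the two URL patterns quoted in this section (the test application link and the grant page) can be generated rather than typed by hand. This is a small convenience sketch: the function names are hypothetical, and the URL patterns are simply those given above for the services-beta site.

```python
from urllib.parse import quote

PORTAL_ARCHIVE_BASE = "https://services-beta.ceda.ac.uk/services/archive"


def apply_url(dataset_id: str) -> str:
    """URL for testing the application route for a service."""
    return f"{PORTAL_ARCHIVE_BASE}/{quote(dataset_id, safe='')}/apply/USER/"


def grant_url(dataset_id: str) -> str:
    """URL for granting a role (e.g. MANAGER) on a service."""
    return f"{PORTAL_ARCHIVE_BASE}/{quote(dataset_id, safe='')}/grant/"
```

Here `dataset_id` stands for whatever you substitute for `<datasetid>` in the URLs above.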
4. Catalogue (MOLES) permission harvesting for datasets
The final step is to then make sure that the access control and licensing is conveyed to the user and provide the link to gain access if required. This is done via the CEDA data catalogue (MOLES). The required actions all happen 'under the hood' and will automatically get harvested from the accessInstructor system when a cron script runs each night OR can be manually forced by using the 'get Constraints' button on the admin view of the dataset (observation) record.
When cataloguing an external, offline or removed resource for which an entry in the accessInstructor would not work, an entry can be added to the CEDA Manual Metadata Store (CMMS) instead.
