Near-line archive (NLA)

Some data sets are held on tape only. This applies to very large data sets such as Sentinel.

Overview

Whilst much of the CEDA archive is stored on disk, we do not have the storage capacity to keep it all online. We therefore migrate many data sets to tape. If the data you require are held on tape, you can use the Near-line archive (NLA) tool to request that the files be restored to disk.

Accessing the tape archive from JASMIN

1. Requesting a tape quota

First, request a tape quota by emailing the CEDA Helpdesk.

2. Installing the tool (on a JASMIN server)

In 2020, the nla command-line tool was updated to be compatible with both Python 3 and Python 2.7.
This changed the previously published installation method, and the steps differ slightly between Python 2.7 and Python 3.

2a.  Installing on Python 2.7

  1. On the jasmin-sci?.ceda.ac.uk servers, you do not need to load the JASPY Python 2.7 module.
    On the sci?.jasmin.ac.uk servers, you do need to load it:
    module load jaspy/2.7
    	
  2. Create, or use an existing, Python 2.7 virtual environment:
    virtualenv ./nla_venv
    	
  3. Activate the Python 2.7 virtual environment:
    source ./nla_venv/bin/activate
    	
  4. Download the command-line interface from the NLA GitHub repository by typing:
    git clone https://github.com/cedadev/nla_client
    	
  5. Install the command-line interface from the downloaded repository:
    pip install ./nla_client
    	
  6. As well as the command-line tool, this code contains a Python library for scripting the NLA system. The code has inline documentation.

2b. Installing on Python 3.x

  1. On both the jasmin-sci?.ceda.ac.uk and sci?.jasmin.ac.uk servers, you need to load the JASPY Python 3 module:
    module load jaspy
    	
  2. Create, or use an existing, Python 3 virtual environment:
    python3 -m venv ./nla_venv
    	
  3. Activate the Python 3 virtual environment:
    source ./nla_venv/bin/activate
    	
  4. Download the command-line interface from the NLA GitHub repository:
    git clone https://github.com/cedadev/nla_client
    	
  5. Install the command-line interface from the downloaded repository:
    pip install ./nla_client
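
The Python 3 steps above can be collected into a single shell function, which may be convenient when setting up on more than one server. This is only a sketch: install_nla_client is a hypothetical helper name, it assumes the module command is available in your shell (as on the JASMIN sci servers), and it skips the virtual environment and clone steps when ./nla_venv or ./nla_client already exist.

```shell
# Sketch: the Python 3 installation steps (2b) gathered into one function.
# install_nla_client is a hypothetical helper name, not part of the NLA tool.
install_nla_client() {
    # Load the JASPY Python 3 module (assumes the "module" command is defined)
    module load jaspy || return 1
    # Create the virtual environment unless it already exists
    [ -d ./nla_venv ] || python3 -m venv ./nla_venv || return 1
    # Activate the virtual environment in the current shell
    . ./nla_venv/bin/activate || return 1
    # Download the client unless a previous clone is present
    [ -d ./nla_client ] || git clone https://github.com/cedadev/nla_client || return 1
    # Install the command-line interface from the downloaded repository
    pip install ./nla_client
}
```

Defining the function runs nothing by itself. Call install_nla_client from the directory where you want nla_venv and nla_client to live.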
    	

3. Using the command line tool

Run the command-line tool by first activating the virtual environment (step 3 above) and then running the nla command.

source ./nla_venv/bin/activate
nla

This opens an interactive tool. You can type “help” to get a list of help topics / commands. For example:

===========================
CEDA Near line tape utility.
NLA>>> help
Documented commands (type help <topic>):
========================================
EOF     listing_request  notify_first     quit   requested_files
expire  ls               notify_last      quota  requests
label   notify           pattern_request  req    retain

Type “help <command>” to get help using a particular command.

The first stage is to determine the names of the files in the NLA that you want to restore. Use the “ls” command to do this, which can also take a sub-string to search for, e.g. “ls sentinel1a”.

4. Requesting files to be restored to disk

Once you know which files you want to restore to disk you can issue a request. There are two ways to make a request:

  1. Listing request.  Here you supply a file containing the list of file names to restore, e.g. “listing_request request_1.txt”.  You can supply any path to the listing file.  In this example it is in the same directory as nla.py, which is probably not how you would typically invoke it.
  2. Pattern request.  Here you supply a sub-string that must appear in the filename.  E.g. “pattern_request 2015” will recover all files with “2015” in the filename.  This particular request (“2015”) is not recommended as it will restore a lot of files!  Something more specific like “S2A_OPER_PRD_MSIL1C_PDMC_20161007” would be better.
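
To make the difference between a broad and a specific pattern concrete, the sketch below filters a handful of made-up file names with grep, then writes the narrower selection to a listing file. It assumes that the listing file holds one file name per line and that matching is a plain sub-string test. The paths and most of the names are hypothetical, and grep here only imitates the matching; it is not how the NLA itself processes a request.

```shell
# Hypothetical candidate file names, standing in for names copied from "ls" output
cat > candidates.txt <<'EOF'
/example/sentinel2a/S2A_OPER_PRD_MSIL1C_PDMC_20161007T101500_a.zip
/example/sentinel2a/S2A_OPER_PRD_MSIL1C_PDMC_20161007T113000_b.zip
/example/sentinel2a/S2A_OPER_PRD_MSIL1C_PDMC_20150812T090000_c.zip
/example/sentinel1a/S1A_IW_GRDH_20150301T060000_d.zip
/example/sentinel1a/S1A_IW_GRDH_20150302T060000_e.zip
EOF

# Count how many names each sub-string would match (-F: fixed string, not a regex)
broad=$(grep -c -F "2015" candidates.txt)
specific=$(grep -c -F "S2A_OPER_PRD_MSIL1C_PDMC_20161007" candidates.txt)
echo "2015 matches $broad names; the specific pattern matches $specific"

# Write the narrower selection, one name per line (assumed listing-file layout)
grep -F "S2A_OPER_PRD_MSIL1C_PDMC_20161007" candidates.txt > request_1.txt
```

The resulting request_1.txt could then be passed to “listing_request request_1.txt” inside the nla tool.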

5. Additional commands

You can view your requests using the “requests” command. This will show you how much of your quota you have used.

You can view the details of a request by “req <request_number>”.

You can check which files are in a request using: “requested_files <request_number>”.

Requests have a retention date.  After this date the restored files will be removed.  You can extend this retention date by using “retain <request_number> yyyy-mm-dd”.
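
If you extend retention regularly, the date can be generated rather than typed by hand. The sketch below prints a retain command for 30 days from today. The request number 42 and the 30-day period are placeholders, and the -d option assumes GNU date, as found on Linux servers.

```shell
# Hypothetical request number; use the one shown by the "requests" command
request_number=42
# Date 30 days from today in yyyy-mm-dd form (relative dates with -d need GNU date)
retain_until=$(date -d "+30 days" +%Y-%m-%d)
# Build and print the line to type at the NLA>>> prompt
retain_command="retain $request_number $retain_until"
echo "$retain_command"
```

The printed line can be pasted at the NLA>>> prompt.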

You can expire a request early using “expire <request_number>”.  This will remove your restored files within 24 hours and free up some of your quota.

You can label a request by using “label label-name”.  The default label name is either the first file in a Listing Request or the pattern in a Pattern Request.

You can be informed via email when your files are ready using the “notify” command.  The system knows your email address so simply doing “notify” will inform you when the first files arrive and when the last files arrive.  To notify someone else use “notify <email_address>”.  To notify different people when the first and last files arrive use “notify_first <email_address>” and “notify_last <email_address>”.

You can check your quota via “quota”.

6. Additional documentation

A library and REST API are also available for interacting with the NLA programmatically.
The documentation for these is at:
Additional NLA documentation
