File Formats Demystified
A file format is a way to encode information in a computer file. A format specifies how to interpret the bytes in the file as information with meaning to the programs and people reading and writing them. Each format is designed to carry a particular type of data, though some formats are more specific and others more general in scope. For example, the PNG format is excellent for encoding an image, but could not easily be used to store a 3D computer-aided design model.
Text formats are those where the bytes in the file should be interpreted as text characters. This means that generic text editors can be used to view or change the data. This is very useful if your data is small and can be interpreted by human inspection. There are different ways to encode text, but most text files are encoded with ASCII or a Unicode encoding such as UTF-8.
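To illustrate why the text encoding matters, here is a minimal Python sketch. The sample string is made up for illustration and is not taken from any CEDA dataset. It shows the same bytes decoded under two different encodings, with only one of them giving the intended text.

```python
# A made-up line of text containing one non-ASCII character (the degree sign).
original = "temperature: 21.5 \u00b0C"

# Encoding turns text characters into the bytes that are stored in the file.
raw = original.encode("utf-8")

# Decoding with the matching encoding recovers the original text.
as_utf8 = raw.decode("utf-8")

# Decoding with a different encoding succeeds here but gives the wrong characters.
as_latin1 = raw.decode("latin-1")

# Pure ASCII data would contain no bytes above 127; this sample does.
is_ascii = raw.isascii()

print(as_utf8)
print(as_latin1)
print(is_ascii)
```

The degree sign occupies two bytes in UTF-8, so the byte string is one byte longer than the text it represents.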
Binary formats are those where the bytes have to be interpreted by the specific format rules to work out their meaning. This necessitates the use of specialised programs to read and write the data.
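As a rough sketch of what "format rules" mean in practice, the Python example below packs and unpacks a small binary record with the standard `struct` module. The record layout (a 4-byte `DEMO` tag, a 16-bit count and a 32-bit float) is a hypothetical one invented for this illustration and does not correspond to any of the formats listed in this article.

```python
import struct

# Hypothetical layout, assumed for illustration only:
# "<"  little-endian, no padding
# "4s" 4-byte tag, "H" unsigned 16-bit integer, "f" 32-bit float
RECORD = struct.Struct("<4sHf")


def pack_record(count, value):
    """Write one record as raw bytes following the layout above."""
    return RECORD.pack(b"DEMO", count, value)


def unpack_record(data):
    """Read one record, checking the tag before trusting the contents."""
    magic, count, value = RECORD.unpack(data)
    if magic != b"DEMO":
        raise ValueError("not a DEMO record")
    return count, value


blob = pack_record(3, 1.5)
count, value = unpack_record(blob)
print(len(blob), count, value)
```

Without knowing the layout, the ten bytes in `blob` are just numbers; the layout is what gives them meaning, which is why binary formats need format-aware tools.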
The CEDA archives cover a wide range of file formats. The table below lists some of the main formats within the CEDA archives with links to tools supporting the format. For information about which format is used for a dataset, please see the CEDA data catalogue.
Additional information about metadata formats is given in the "Introduction to metadata" article.
Format | Type | File endings | Commonly used for |
BADC-CSV | text | .csv | simple "1-D" type of data, e.g. instrument time series data |
NASA Ames | text | .na | aircraft and older instrument data (older data may have an older file-naming convention) |
HITRAN | text | various | spectroscopy data |
JCAMP-DX | text | .dx, .jdx | only suitable for spectra from spectroscopy experiments |
NetCDF | binary | .nc | model data and observational data with more than 1 dimension (e.g. time-height data) |
HDF | binary | .hdf | satellite data |
PP | binary | .pp | Met Office model output |
GRIB | binary | .grb | ECMWF model output |
tar | | .tar | Aggregating a number of files as a "tar ball" allows a set of files to be downloaded together |
gzip, bzip, zip | compression | .gz, .bz, .zip | to reduce the volume required for the file to aid transfer and storage |
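
The last two rows of the table can be illustrated with Python's standard `tarfile` module, which can combine aggregation (tar) and compression (gzip) in one step. This is a minimal sketch that works in memory; the file names and contents are invented for the example, and `make_tarball` and `read_tarball` are hypothetical helper names.

```python
import io
import tarfile


def make_tarball(files):
    """Bundle a {name: bytes} mapping into a gzip-compressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)  # the size must be set before adding
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_tarball(blob):
    """Return the regular files in a gzip-compressed tar archive as {name: bytes}."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


# Invented sample files, for illustration only.
files = {
    "a.csv": b"time,value\n0,1.0\n",
    "b.csv": b"time,value\n0,2.0\n",
}
blob = make_tarball(files)
restored = read_tarball(blob)
print(sorted(restored))
```

For archives on disk, passing a path instead of `fileobj` to `tarfile.open` should work in the same way.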