Welcome to `datatool`’s documentation!

datatool is an HPC-indepenent workflow enabling end-users extracting subsets from community meteorological datasets through a simple command-line interface (CLI). The tool works at large with NetCDF files, but is not limited to any file format, structure, or dataset.

Through crowsourcing, datatool aims to enable end-users extract subsets from any dataset available to the community members.

User Interface

This repository contains scripts to process meteorological datasets in NetCDF file format. The general usage of the script (i.e., ./extract-dataset.sh) is as follows:

Usage:
  extract-dataset [options...]

Script options:
  -d, --dataset                     Meteorological forcing dataset of interest
  -i, --dataset-dir=DIR             The source path of the dataset file(s)
  -v, --variable=var1[,var2[...]]   Variables to process
  -o, --output-dir=DIR              Writes processed files to DIR
  -s, --start-date=DATE             The start date of the data
  -e, --end-date=DATE               The end date of the data
  -l, --lat-lims=REAL,REAL          Latitude's upper and lower bounds
                                    optional; within the [-90, +90] limits
  -n, --lon-lims=REAL,REAL          Longitude's upper and lower bounds
                                    optional; within the [-180, +180] limits
  -a, --shape-file=PATH             Path to the ESRI shapefile; optional
  -m, --ensemble=ens1,[ens2,[...]]  Ensemble members to process; optional
                                    Leave empty to extract all ensemble members
  -M, --model=model1,[model2,[...]] Models that are part of a dataset,
                                    only applicable to climate datasets, optional
  -S, --scenario=scn1,[scn2,[...]]  Climate scenarios to process, only applicable
                                    to climate datasets, optional
  -j, --submit-job                  Submit the data extraction process as a job
                                    on the SLURM system; optional
  -k, --no-chunk                    No parallelization, recommended for small domains
  -p, --prefix=STR                  Prefix prepended to the output files
  -b, --parsable                    Parsable SLURM message mainly used
                                    for chained job submissions
  -D, --dependency                  Executation dependency submission ID, optional
  -c, --cache=DIR                   Path of the cache directory; optional
                                    defaults to $HOME/scratch
  -E, --email=user@example.com      E-mail user when job starts, ends, or
                                    fails; optional
  -C, --cluster=JSON                JSON file detailing cluster-specific details
  -L, --list-datasets               List all the available datasets and the
                                    corresponding keywords for '--dataset' option
  -V, --version                     Show version
  -h, --help                        Show this screen and exit

Use the navigation menu on the left to explore the datatool’s documentation!

User Manual

Datasets

Datasets
- Summary
- Detailed Descriptions

License

License

Welcome to datatool’s documentation!

User Interface

Welcome to `datatool`’s documentation!