SD8 Workflow Management Systems

Workflow Management Systems are tools that enable reproducible data analysis, especially when many data analysis processing steps are involved.

Workflow Management Systems cover a data analysis from start to finish, e.g. data preprocessing, quality filtering, analysis, and statistical evaluation of the processed data. On HPC clusters this may include starting jobs, staging data to compute nodes, running the computations, deleting temporary data, generating publication-ready reports, archiving results, and cleaning up work environments. They aim to make workflows portable across systems and to keep the processing transparent.

Some Workflow Management Systems, such as Nextflow and Snakemake, are designed to automate this process on HPC clusters.
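
As a rough illustration of the core idea, the following minimal Python sketch describes a workflow as a set of steps with dependencies and runs them in a valid order. It is not how Nextflow or Snakemake are implemented or used. The step names and the `run_step` and `run_workflow` helpers are hypothetical, and `graphlib` requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "preprocess": set(),
    "quality_filter": {"preprocess"},
    "analysis": {"quality_filter"},
    "statistics": {"analysis"},
    "report": {"statistics"},
    "archive": {"statistics"},
    "cleanup": {"report", "archive"},
}


def run_step(name, log):
    """Stand-in for real work, e.g. submitting a job to a batch scheduler."""
    log.append(name)


def run_workflow(steps):
    """Run every step after all of its dependencies; return the execution order."""
    log = []
    # static_order() yields the steps so that dependencies always come first
    # and raises graphlib.CycleError if the dependencies are circular.
    for name in TopologicalSorter(steps).static_order():
        run_step(name, log)
    return log


order = run_workflow(workflow)
print(" -> ".join(order))
```

A real workflow management system builds on the same dependency graph. It may additionally skip steps whose outputs are already up to date, run independent steps such as `report` and `archive` in parallel, and hand each step to the cluster's batch system.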

## Learning objectives

## Subskills