This skill provides an overview of the scheduling of jobs on a supercomputer.
It covers generic, widely used concepts that aim to maximize the efficiency of a supercomputer.
In a batch system, the workload consists of batch jobs that users submit to a job queue.
A workload manager of a cluster system typically deals with:
Job Control: provides a user interface for submitting jobs to job queues, monitoring their state during processing (e.g. checking their estimated start time), and intervening in their execution (e.g. aborting them manually).
Scheduling and Resource Management: selects a waiting job for execution and allocates nodes to it that meet all of the job's other resource demands (memory, special processing elements such as GPUs, etc.).
Accounting: records historical data about the computing resources (e.g. computing time) that a job has consumed.
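The three responsibilities above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not how any real workload manager is implemented: the names `Job` and `ToyWorkloadManager` are hypothetical, the only resource tracked is the node count, the scheduling policy is plain first come, first served, and each job's runtime is assumed to be known in advance.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description used only for this illustration."""
    job_id: int
    nodes: int            # number of nodes the job requests
    runtime_s: int        # runtime in seconds (assumed known in this toy model)
    state: str = "PENDING"


class ToyWorkloadManager:
    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self.queue = deque()    # waiting jobs, in submission order
        self.running = {}       # job_id -> Job
        self.accounting = []    # (job_id, consumed node-seconds)

    # --- Job control ---
    def submit(self, job):
        self.queue.append(job)
        return job.job_id

    def cancel(self, job_id):
        """Abort a job that is still waiting in the queue."""
        for job in list(self.queue):
            if job.job_id == job_id:
                self.queue.remove(job)
                job.state = "CANCELLED"
                return True
        return False

    # --- Scheduling and resource management (first come, first served) ---
    def schedule(self):
        started = []
        # Start jobs from the head of the queue while enough nodes are free.
        while self.queue and self.queue[0].nodes <= self.free_nodes:
            job = self.queue.popleft()
            self.free_nodes -= job.nodes
            job.state = "RUNNING"
            self.running[job.job_id] = job
            started.append(job.job_id)
        return started

    # --- Accounting ---
    def finish(self, job_id):
        job = self.running.pop(job_id)
        self.free_nodes += job.nodes
        job.state = "COMPLETED"
        self.accounting.append((job.job_id, job.nodes * job.runtime_s))


wm = ToyWorkloadManager(total_nodes=4)
wm.submit(Job(1, nodes=3, runtime_s=60))
wm.submit(Job(2, nodes=2, runtime_s=30))
first = wm.schedule()     # job 1 starts; job 2 must wait (only 1 node is free)
wm.finish(1)              # releases 3 nodes and records 3 * 60 node-seconds
second = wm.schedule()    # job 2 can start now
```

In this sketch a job at the head of the queue blocks all jobs behind it until enough nodes are free, which is the simplest possible policy and leaves nodes idle; it is meant only to show how job control, scheduling with resource allocation, and accounting relate to each other.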