skill-tree:k:4:1:b
Differences
This shows you the differences between two versions of the page.
skill-tree:k:4:1:b [2020/06/05 17:06] – external edit 127.0.0.1 | skill-tree:k:4:1:b [2025/04/16 18:30] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | # K4.1-B Introduction to job scheduling | + | # K4.1 Basic principles of Job Scheduling |
- | # Background | + | |
- | This skill provides an overview | + | This skill provides an overview |
It covers generic and widely used concepts that aim to maximize the efficiency of a supercomputer. | It covers generic and widely used concepts that aim to maximize the efficiency of a supercomputer. | ||
- | # Aim | + | Batch jobs submitted to a job queue define the workloads in batch systems. |
- | To enable practitioners | + | A workload manager of a cluster system typically deals with: |
+ | * Job Control to provide a user interface for submitting jobs to job queues, monitoring their state during processing (e.g. to check their estimated starting time), | ||
+ | * Scheduling and Resource Management to select a waiting job for execution and to allocate nodes that meet all of the job's other demands for computing resources (memory, special processing elements like GPUs, etc.) | ||
+ | * Accounting to record historical data about how many computing resources (e.g. computing time) have been consumed by a job | ||
+ | |||
+ | ## Learning Outcomes | ||
+ | |||
+ | * Comprehend the exclusive | ||
+ | * Differentiate batch and interactive job submission. | ||
+ | * Comprehend the generic | ||
+ | * Explain environment variables as a means to communicate. | ||
+ | * Comprehend accounting principles. | ||
+ | * Explain the generic steps to run and monitor a single job. | ||
+ | * Comprehend scheduling principles (first come first served, shortest job first, backfilling) to achieve objectives like minimizing the average elapsed program runtime and maximizing the utilization of the available HPC resources. | ||
+ | * Comprehend the differences between **Batch Systems** and **Time-Sharing Systems**. | ||
+ | * Explain the concepts and procedures | ||
+ | * Run interactive jobs and batch jobs. | ||
+ | * Comprehend and describe the expected behavior of job scripts. | ||
+ | * Change provided job scripts and embed them into shell scripts to run a variety of parallel applications. | ||
+ | * Analyze the output generated from a job scheduler and describe the cause of typically generated errors. | ||
+ | * Comprehend accounting principles (billing for the jobs). | ||
+ | * Comprehend the set of terms for performance criteria like: | ||
+ | * Resource Utilization. | ||
+ | * Throughput. | ||
+ | * Waiting Time. | ||
+ | * Execution Time. | ||
+ | * Turnaround Time. | ||
+ | * Comprehend scheduling strategies that increase productivity. | ||
+ | * Comprehend that typical goals of job scheduling are: | ||
+ | * Maximization of resource utilization. | ||
+ | * Maximization of throughput. | ||
+ | * Minimization of waiting time. | ||
+ | * Minimization of turnaround time. | ||
+ | * Comprehend that there is a variety of scheduling algorithms, ranging from rather simple to more complex, such as: | ||
+ | * First-Come-First-Served (FCFS). | ||
+ | * Shortest-Job-First (SJF). | ||
+ | * Priority-based. | ||
+ | * Fair-Share. | ||
+ | * Backfilling. | ||
+ | * Apply advanced scheduling principles (e.g. backfilling) to achieve objectives like minimizing the average elapsed program runtime and maximizing the utilization of the available HPC resources. | ||
+ | * Discuss sophisticated scheduling principles (e.g. fair share) to achieve objectives like treating users fairly and maximizing the utilization of the available HPC resources. | ||
- | # Outcomes | ||
- | * comprehend the exclusive and shared usage model in HPC | ||
- | * differentiate batch and interactive job submission | ||
- | * comprehend the generic concepts and architecture of resource manager, scheduler, job and job script | ||
- | * explain environment variables as a means to communicate | ||
- | * comprehend accounting principles | ||
- | * explain the generic steps to run and monitor a single job | ||
- | * comprehend scheduling principles (first come first served, shortest job first, backfilling) to achieve objectives like minimizing the averaged elapsed program runtimes, and maximizing the utilization of the available HPC resources | ||
- | # Subskills | ||
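
The learning outcomes above name First-Come-First-Served (FCFS) and Shortest-Job-First (SJF) together with criteria such as waiting time and turnaround time. The following is a minimal sketch of how the two policies differ, not part of the skill definition itself. It assumes a single resource that runs one job at a time without preemption, and the job names, submit times, and runtimes are made up for illustration. Real workload managers schedule across many nodes and combine further factors such as priorities, fair share, and backfilling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    submit: float   # time at which the job enters the queue
    runtime: float  # execution time once the job has started


def simulate(jobs, policy):
    """Run jobs one at a time on a single resource (non-preemptive).

    Returns {job name: (waiting time, turnaround time)}.
    """
    pending = sorted(jobs, key=lambda j: j.submit)
    clock = 0.0
    results = {}
    while pending:
        # Jobs that have already been submitted and are waiting.
        ready = [j for j in pending if j.submit <= clock]
        if not ready:
            # Resource idles until the next job is submitted.
            clock = pending[0].submit
            continue
        if policy == "fcfs":
            job = min(ready, key=lambda j: j.submit)   # earliest submission first
        elif policy == "sjf":
            job = min(ready, key=lambda j: j.runtime)  # shortest runtime first
        else:
            raise ValueError(f"unknown policy: {policy}")
        pending.remove(job)
        waiting = clock - job.submit
        clock += job.runtime
        results[job.name] = (waiting, clock - job.submit)
    return results


def mean_waiting(results):
    """Average waiting time over all jobs."""
    return sum(w for w, _ in results.values()) / len(results)


# Hypothetical workload: one long job followed by two shorter ones.
JOBS = [Job("A", 0, 8), Job("B", 1, 4), Job("C", 2, 1)]

if __name__ == "__main__":
    for policy in ("fcfs", "sjf"):
        res = simulate(JOBS, policy)
        print(policy, res, "mean waiting:", round(mean_waiting(res), 2))
```

In this toy workload, SJF lets the short job C overtake B once A has finished, which lowers the average waiting time but makes B wait longer. This is the kind of trade-off between throughput-oriented goals and fairness that the outcomes on scheduling strategies refer to.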
skill-tree/k/4/1/b.1591369587.txt.gz · Last modified: 2020/06/05 17:06 by 127.0.0.1