Differences

This shows you the differences between two versions of the page.

--- skill-tree:pe:2:3:b [2020/07/19 11:30] – external edit 127.0.0.1
+++ skill-tree:pe:2:3:b [2025/04/16 18:30] (current) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
-# PE2.3-B I/O Performance
+# PE2.3 Profiling tools
-# Background
-Running the same application with different I/O configurations gives the possibility to tune the I/O system according to the application access pattern.
+Profiling is explained at the CPU level, where it can be supported by hardware performance counters and by sampling techniques.
-One way to predict application performance in HPC systems with different I/O configurations is using modelling and simulation techniques.
+Sampling periodically examines the program counter to determine which routines and source code lines of a program account for which portions of the total runtime.
-Modeling the system allows assessing obtained performance and therewith estimate the performance potentially gained by optimizations.
-There are several aspects involved in delivering high I/O performance to parallel applications, from hardware characteristics to methods that manipulate workloads to improve achievable performance.
+Instrumentation, i.e. automatically adding trace code to a parallel program so that its execution is recorded in strict chronological order, is explained, and the difference from profiling is emphasized.
-File systems are implemented in the operating system which deploys strategies to improve performance such as scheduling, caching and aggregation.
+Similar techniques are explained for profiling at the network level (e.g. based on InfiniBand counters and I/O server states).
-Therefore, the observable I/O performance depends on more than the capabilities of the raw block device.
-# Aim
+## Learning Outcomes
-  * To develop general considerations about what influences the I/O performance.
+* Demonstrate the use of Score-P for collecting program traces.
-  * To analyze access pattern and define how it defines the performance of the I/O subsystems.
+* Demonstrate the use of Scalasca for analyzing traces.
-  * To apply I/O strategies to improve the access pattern.
+* Demonstrate the analysis of program traces using Vampir.
-  * To identify options for the deployed optimization strategies in a specific parallel file system.
+* Understand Darshan.
+* Demonstrate the use of PIKA to check the performance of any program without instrumenting it.
+* Demonstrate collecting traces of a program using lo2s.
+* Demonstrate the use of an NVIDIA analysis program for CUDA code.
-# Outcomes
+## Subskills
-  * Select performance models to assess and optimize the application I/O performance.
+* [[skill-tree:pe:2:3:1:b]]
-  * Identify tools capable of predicting the behavior of applications in HPC.
+* [[skill-tree:pe:2:3:2:b]]
-  * Apply methods to manipulate workloads to improve achievable performance.
+* [[skill-tree:pe:2:3:3:b]]
+* [[skill-tree:pe:2:3:4:b]]
-# Subskills
+* [[skill-tree:pe:2:3:5:b]]
+* [[skill-tree:pe:2:3:6:b]]
+* [[skill-tree:pe:2:3:7:b]]
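
The sampling idea described in the new revision can be illustrated with a minimal sketch. It is not part of the page or of any of the listed tools: it is a toy analogue in Python in which the "program counter" is approximated by the current Python frame of the profiled thread. The names `sample_profile`, `busy`, and `per_function` are hypothetical, and the 1 ms interval is an arbitrary assumption.

```python
import collections
import sys
import threading
import time


def sample_profile(target, interval=0.001):
    """Run target() in the calling thread while a helper thread samples it.

    Returns a Counter mapping (function name, line number) to sample counts.
    """
    counts = collections.Counter()
    thread_id = threading.get_ident()  # the thread that will run target()
    done = threading.Event()

    def sampler():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so this loop takes one sample per interval until profiling ends.
        while not done.wait(interval):
            frame = sys._current_frames().get(thread_id)
            if frame is not None:
                # Stand-in for reading the program counter:
                # which function and line is executing right now?
                counts[(frame.f_code.co_name, frame.f_lineno)] += 1

    worker = threading.Thread(target=sampler, daemon=True)
    worker.start()
    try:
        target()
    finally:
        done.set()
        worker.join()
    return counts


def busy(duration=0.2):
    """Hypothetical workload: spin for roughly `duration` seconds."""
    end = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < end:
        total += 1
    return total


def per_function(counts):
    """Aggregate line-level samples into each function's share of all samples."""
    totals = collections.Counter()
    for (name, _line), n in counts.items():
        totals[name] += n
    all_samples = sum(totals.values())
    if all_samples == 0:
        return {}
    return {name: n / all_samples for name, n in totals.items()}


if __name__ == "__main__":
    shares = per_function(sample_profile(busy))
    for name, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{name}: {share:.1%} of samples")
```

Because only periodic snapshots are counted, the result is a statistical estimate of where time is spent and carries no information about the order of events. That is the contrast with tracing, which records every instrumented event chronologically.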