Comprehend that traditional CPUs, although CPU stands for Central Processing Unit, no longer contain a central, i.e. single, processing unit: today virtually all CPUs have multiple compute cores, which typically offer the same functionality
Comprehend that GPUs (Graphics Processing Units) or GPGPUs (General-Purpose Graphics Processing Units) were originally used for image processing and for displaying images on screens before people started to use their computing power for other purposes
Comprehend that FPGAs (Field-Programmable Gate Arrays) are devices with configurable hardware whose configurations are specified in hardware description languages (e.g. VHDL or Verilog)
Comprehend that FPGAs are attractive for implementing hardware features that are not available in CPUs or GPUs (e.g. low-precision arithmetic that needs only a few bits)
Comprehend that vector units are the successors of vector computers (i.e. the first generation of supercomputers) and that they are intended to provide higher memory bandwidth than CPUs
Comprehend that, at an abstract level, the high-speed network connects compute units and main memory, and that the way it does so leads to three main parallel computer architectures:
Shared memory, where all compute units can directly access the whole main memory
Distributed memory, where individual computers, each with its own main memory, are connected by a network
NUMA (Non-Uniform Memory Access), which combines properties of shared and distributed memory systems: at the hardware level a NUMA system resembles a distributed memory system, but all compute units can still address the whole main memory, with access times that depend on where the data resides
Comprehend that, in general, the effort for programming parallel applications is higher for distributed memory systems than for shared memory systems, because data must be partitioned and exchanged explicitly
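
The difference in programming effort can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: both variants run as threads inside one Python process, so the "distributed" variant merely mimics message passing with queues and does not use a real network or a library such as MPI. The function names `shared_memory_sum` and `message_passing_sum` are hypothetical and chosen for this example.

```python
import queue
import threading


def shared_memory_sum(data, n_workers=4):
    """Shared-memory style: every worker reads the same list directly."""
    partial = [0] * n_workers  # one slot per worker, so no lock is needed

    def work(rank):
        # Each worker selects its share of the shared data by index striding.
        partial[rank] = sum(data[rank::n_workers])

    threads = [threading.Thread(target=work, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def message_passing_sum(data, n_workers=4):
    """Distributed-memory style (simulated): workers only see received data."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()

    def work(rank):
        chunk = inboxes[rank].get()  # explicit receive of the local chunk
        results.put(sum(chunk))      # explicit send of the partial result

    threads = [threading.Thread(target=work, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    # Explicit scatter: the data is partitioned and a copy is sent to each worker.
    for rank in range(n_workers):
        inboxes[rank].put(list(data[rank::n_workers]))
    # Explicit gather: one partial result is collected per worker.
    total = sum(results.get() for _ in range(n_workers))
    for t in threads:
        t.join()
    return total
```

Both functions compute the same sum, but the message-passing variant needs additional code for partitioning, sending, and collecting data, which is the extra effort the objective refers to.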
parallelization techniques at the instruction level of a processing element (e.g. pipelining, SIMD processing)
advanced instruction sets that improve parallelization (e.g. AVX-512)
hybrid approaches, e.g. combining CPUs with GPUs or FPGAs
typical network topologies and architectures used for HPC systems, like fat trees based on switched fabrics using e.g. Gigabit Ethernet (1 or 10 Gbit/s) or InfiniBand
special or application-specific hardware (e.g. TPUs)
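
The instruction-level techniques named in this list (pipelining, SIMD processing, AVX-512) can be illustrated with a small worked calculation. The sketch below assumes an idealized model: every pipeline stage takes one cycle, there are no stalls or hazards, and a SIMD instruction processes as many elements as fit into one register. All function names are hypothetical and chosen for this example.

```python
import math


def unpipelined_cycles(n_instructions, n_stages):
    # Without pipelining, each instruction passes through all stages
    # before the next instruction starts.
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    # With an ideal pipeline, the first instruction needs n_stages cycles;
    # afterwards one instruction completes per cycle.
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def simd_lanes(register_bits, element_bits):
    # Number of elements one vector register holds, e.g. a 512-bit
    # AVX-512 register holds 8 values of 64 bits or 16 values of 32 bits.
    return register_bits // element_bits


def vector_instructions(n_elements, lanes):
    # Number of SIMD instructions needed to process n_elements once;
    # the last instruction may be only partially filled.
    return math.ceil(n_elements / lanes)


if __name__ == "__main__":
    print(unpipelined_cycles(100, 5))  # 500
    print(pipelined_cycles(100, 5))    # 104
    print(simd_lanes(512, 64))         # 8
    print(vector_instructions(1000, simd_lanes(512, 64)))  # 125
```

In this model, 100 instructions on a 5-stage pipeline take 104 cycles instead of 500, and 1000 double-precision values need 125 vector instructions with 512-bit registers instead of 1000 scalar ones. Real processors deviate from these ideal figures because of stalls, memory latency, and data dependencies.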