High Performance Computing
High Performance Computing leverages computing power, in standalone or cluster form, with highly efficient software, message passing interfaces, memory handling capabilities to allow solutions to improve scalability and minimize run times.
The acceleration of hardware and software capabilities has allowed for the solution of highly complex and large problems in a variety of applications in science, technology, medicine, and engineering.
Finite Element Analysis inherently involves handling large matrix factorizations, inversions, and other manipulations across multiple dimensions. Depending on the solvers (iterative, direct, and so on) being used, the matrix operations involved may lead to very long run times on standalone workstations.
Computer Architecture
Computer Architecture is comprised of the structure and interaction of various entities of a computational system. Understanding the basics of Computer Architecture is essential to effectively utilize the High Performance Computing features available in OptiStruct.
A typical computational system consists of the Central Processing Unit (CPU) wherein all the basic calculations are performed. This is called Cores in typical workstations. The Cores/CPUs are part of the physical processor in the Socket of the motherboard. A processor (in the Socket) can consist of multiple Cores/CPUs. A single node can contain multiple sockets (each with a processor). A single Core/CPU can handle single or multiple connections in parallel with the software/operating system. These connections are called Threads, and typically a Core can only handle a single connection (Thread) at a time. If the Core/CPU is capable of hyper-threading, then multiple threads/connections between the Core/CPU and the software can exist in parallel.
Cluster
Node
A Physical Process consists of single or multiple Cores/CPUs. Unless a CPU is equipped with hyper-threading capabilities, a single Core/CPU can only handle one connection in the same CPU time interval. If hyper-threading is available, then multiple connections between the CPU and the software can exist in parallel. A Core/CPU is a full set of required data and hardware required to execute a program.
Serial versus Parallel Computing
- Serial Computing
- Parallel Computing
- Solution is divided into discrete instructions.
- Solution is divided into sections, which are in-turn divided into discrete instructions.
- Sequential Execution of discrete instructions one logical processor.
- Parallel Execution of discrete instructions of all sections simultaneously on multiple logical processors.
- At each point in the time-domain, only a single discrete instruction is executed.
- At each point in the time-domain, multiple discrete instructions relating to multiple parts are executed simultaneously.
- Run times are typically high compared to Parallel Computing.
- Run times are typically lower than Serial Computing.
Implementation
High Performance Computing in OptiStruct allows the usage of multiple threads within a single shared memory system (SMP) to improve speedup, multiple nodes in a distributed memory system cluster (SPMD) via a message passing interface (MPI) implementation for further scalability, and/or a Graphics Processing Unit (GPU) via the NVIDIA CUDA implementation.
The parallelization options are available for a standalone systems, clusters, or GPU enhanced workstations.
The following High Performance Computing options are available: