Introduction
Modern supercomputers are a plausible way to meet high computational demand, and they are a central focus of many research areas (Yin, J. et al., 2019). In one prominent recent example, a supercomputer was used to train, scale, and accelerate the Morpheus machine learning model, a convolutional neural network that helps to explain the color images of the distant universe provided by the James Webb Space Telescope (JWST) (Morpheus ML Model, n.d.). The model was trained on Lux, an NVIDIA graphics processing unit-enabled supercomputer at the University of California, Santa Cruz, which has 80 CPU-only compute nodes and 28 GPU-only nodes. Another modern supercomputer, Fugaku, which uses the Tofu interconnect (Ajima, 2018; Ajima et al., 2012), is being used to search for coronavirus treatments. Fugaku is built on the A64FX microprocessor and achieves about 415 PFLOPS while drawing about 28,335 kW of power (Kodama et al., 2020; Kudo et al., 2020). However, raw performance alone is not enough: the key to achieving exascale performance is to ensure high performance per unit of power consumed. Table 1 lists recent supercomputers from the green list (Vikram, 2015), where a recent MPC system such as Gyoukou achieves the highest efficiency, 14.14 petaFlops/megawatt (about 42.88% better than Piz Daint). Therefore, performance per watt is a prime concern for next-generation MPC systems, alongside other constraints such as low network performance, low scalability, low throughput, and high latency (Sanchez et al., 2010).
Both the performance and the power usage of a supercomputer depend heavily on its connectivity at every level: core-to-core, chip-to-chip, node-to-node, and rack-to-rack. This connectivity across the various levels is called the "Interconnection Network" (Minkenberg, 2013). In terms of power usage, on-chip interconnection networks consume about 50% of the total power, and off-chip bandwidth is limited by the total number of outgoing physical links (John, 2007). In modern supercomputers, on-chip networks usually use electrical interconnects for their low power usage, while off-chip networks use optical interconnects, which are connected through GBIC modules for high-speed connectivity (Pavlidis & Friedman, 2007). This research considers conventional interconnection networks, extensively analyzes their power usage and network performance, and investigates the effects of virtual channels on both performance and power usage.
The remainder of this paper describes the network structure of the Mesh and Torus networks, reviews the routing algorithm, illustrates the static network performance, compares the dynamic communication performance (DCP) of various networks, estimates the on-chip power consumption, and finally analyzes the performance and power requirements with an increased number of virtual channels.
Table 1. petaFlops/megawatt analysis for supercomputers
MPC system | Performance (petaFlops) | Power (MW) | Efficiency (petaFlops/MW) | Interconnection Network |
Sunway TaihuLight | 93.0 | 15.4 | 6.04 | Sunway |
Tianhe-2 | 33.9 | 17.8 | 1.91 | Fat Tree Network |
Piz Daint | 19.6 | 2.27 | 8.63 | Aries Interconnect |
Gyoukou | 19.1 | 1.35 | 14.14 | InfiniBand EDR |
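
The efficiency column of Table 1 is the performance divided by the power draw. As a minimal sketch of that arithmetic, the Python snippet below recomputes petaFlops per megawatt from the performance and power columns and ranks the systems. The names `SYSTEMS`, `efficiency`, and `rank_by_efficiency` are illustrative assumptions and do not come from the original study.

```python
# Performance (petaFlops) and power (MW) figures copied from Table 1.
# The container and function names are hypothetical, chosen for illustration.
SYSTEMS = [
    ("Sunway TaihuLight", 93.0, 15.4),
    ("Tianhe-2", 33.9, 17.8),
    ("Piz Daint", 19.6, 2.27),
    ("Gyoukou", 19.1, 1.35),
]


def efficiency(perf_pflops, power_mw):
    """Return performance per unit of power, in petaFlops per megawatt."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return perf_pflops / power_mw


def rank_by_efficiency(systems):
    """Return (name, petaFlops/MW) pairs sorted from most to least efficient."""
    scored = [(name, efficiency(perf, power)) for name, perf, power in systems]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, eff in rank_by_efficiency(SYSTEMS):
        print(f"{name:18s} {eff:6.2f} petaFlops/MW")
```

Because the inputs in Table 1 are already rounded, the recomputed values agree with the tabulated efficiency column only to within about 0.01. The same function could be applied to the Fugaku figures quoted in the Introduction, after converting 28,335 kW to 28.335 MW.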