An Extensive Power and Performance Analysis for High Dimensional Mesh and Torus Interconnection Networks

Faiz Al Faisal, M. M. Hafizur Rahman, Yasushi Inoguchi
Copyright: © 2023 | Pages: 19
DOI: 10.4018/IJDST.321208

Abstract

Next-generation parallel computers are the key to achieving exascale performance, since the performance of sequential computers has already saturated. One of the main challenges in reaching exascale computing is reducing power consumption while still achieving suitable performance. Energy efficiency, the performance obtained relative to the power required, is a key measure of this trade-off. To address these issues, this article analyzes the performance versus power usage trade-off for conventional networks such as Mesh and Torus. High-degree networks show much better performance than low-degree networks. However, high-degree networks require more power because of their larger number of interconnecting links. This article shows that, in terms of zero-load latency, a 3D Torus can achieve about 57.07% better performance than a 2D Torus. On the other hand, a 2D Mesh network requires about 24.22% less router power than a 3D Mesh, and a 5D Torus requires about 66.8% more router power than a 3D Torus network.
Introduction

Modern supercomputers are the plausible solution for meeting high computational demand, and they are also a central focus of many research areas (Yin et al., 2019). One prominent recent example is the use of a supercomputer to train, scale, and accelerate the Morpheus Machine Learning Model, a convolutional neural network that helps to interpret the color images of the distant universe provided by the James Webb Space Telescope (JWST) (Morpheus ML Model, n.d.). Lux, an NVIDIA graphics processing unit-enabled supercomputer at the University of California, Santa Cruz, with 80 CPU-only compute nodes and 28 GPU-only nodes, was used to train the Morpheus model. Similarly, the modern supercomputer Fugaku, which uses the Tofu interconnect (Ajima, 2018; Ajima et al., 2012), is being used to search for coronavirus treatments. Fugaku uses the A64FX microprocessor and can achieve about 415 PFLOPS while requiring about 28,335 kW of power (Kodama et al., 2020; Kudo et al., 2020). However, beyond raw performance, the key to achieving exascale performance is to ensure high performance per unit of power. Table 1 shows recent green-listed supercomputers (Vikram, 2015), among which a recent MPC system, Gyoukou, achieves the maximum petaFlops/megawatt efficiency of 14.14 (about 42.88% better than Piz Daint). Therefore, performance per watt is a prime concern for the next generation of MPC systems, alongside other constraints such as low network performance, low scalability, low throughput, and latency (Sanchez et al., 2010).

The performance and power usage of a supercomputer depend heavily on its inter-connectivity at the core-to-core, chip-to-chip, node-to-node, and rack-to-rack levels. This inter-connectivity between the various levels of networks is called the “Interconnection Network” (Minkenberg, 2013). In terms of power usage, on-chip interconnection networks consume about 50% of the total power, and off-chip bandwidth is limited by the total number of outgoing physical links (John, 2007). In modern supercomputers, on-chip networks usually use electrical interconnects for their low power usage, while off-chip networks use optical interconnects, which are connected through GBIC modules for high-speed connectivity (Pavlidis & Friedman, 2007). This research considers the conventional interconnection networks, extensively analyzes their power usage and network performance, and investigates the effect of virtual channels on both performance and power usage.

The remainder of this paper describes the network structure of the Mesh and Torus networks, reviews the routing algorithm, illustrates the static network performance, compares the dynamic communication performance (DCP) of various networks, estimates the on-chip power consumption, and finally analyzes the performance and power requirements with an increased number of virtual channels.
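
As a rough illustration of what static network performance involves, the sketch below computes standard textbook metrics (node count, maximum node degree, and diameter) for a k-ary n-dimensional Mesh and Torus. The function names and the example sizes (k = 8 in 2D, k = 4 in 3D) are assumptions made here for illustration; the network sizes actually evaluated in the paper are not given in this preview.

```python
def mesh_metrics(k, n):
    """Static metrics of an n-dimensional mesh with k nodes per dimension.

    Hypothetical helper for illustration; assumes k >= 3 so that an
    interior node with 2n neighbours exists.
    """
    if k < 3 or n < 1:
        raise ValueError("this sketch assumes k >= 3 and n >= 1")
    return {
        "nodes": k ** n,
        "degree": 2 * n,          # maximum degree (interior nodes)
        "diameter": n * (k - 1),  # corner-to-opposite-corner hop count
    }


def torus_metrics(k, n):
    """Static metrics of an n-dimensional torus (k-ary n-cube).

    Hypothetical helper for illustration; assumes k >= 3 so that the
    two wraparound neighbours in each dimension are distinct.
    """
    if k < 3 or n < 1:
        raise ValueError("this sketch assumes k >= 3 and n >= 1")
    return {
        "nodes": k ** n,
        "degree": 2 * n,           # every node has 2 links per dimension
        "diameter": n * (k // 2),  # wraparound halves the worst case
    }


# Example sizes are assumptions: both configurations have 64 nodes.
for label, k, n in [("2D", 8, 2), ("3D", 4, 3)]:
    print(label, "Mesh ", mesh_metrics(k, n))
    print(label, "Torus", torus_metrics(k, n))
```

For the same node count, the higher-dimensional networks have a larger degree and a smaller diameter, which matches the general trade-off described in the abstract: more links per router tend to reduce hop counts but add to router cost.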

Table 1. PetaFlops/megawatt analysis for supercomputers

MPC system         Performance (petaFlops)  Power (MW)  pFlops/MW  Interconnection Network
Sunway TaihuLight  93.0                     15.4        6.04       Sunway
Tianhe-2           33.9                     17.8        1.91       Fat Tree Network
Piz Daint          19.6                     2.27        8.63       Aries Interconnect
Gyoukou            19.1                     1.35        14.14      Infiniband EDR
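
The pFlops/MW column is simply performance divided by power. The short sketch below recomputes it from the performance and power figures transcribed from Table 1; the variable and function names are chosen here for illustration. Because the table inputs are themselves rounded, a recomputed value can differ from the printed one in the last digit.

```python
# Rows transcribed from Table 1: (system, performance in petaFlops, power in MW).
systems = [
    ("Sunway TaihuLight", 93.0, 15.4),
    ("Tianhe-2", 33.9, 17.8),
    ("Piz Daint", 19.6, 2.27),
    ("Gyoukou", 19.1, 1.35),
]


def pflops_per_mw(performance_pflops, power_mw):
    """Energy efficiency as performance per unit of power."""
    return performance_pflops / power_mw


# Map each system name to its recomputed efficiency.
efficiency = {name: pflops_per_mw(perf, power) for name, perf, power in systems}

# Print the systems from most to least efficient.
for name, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} {value:6.2f} pFlops/MW")
```

Sorting by this ratio rather than by raw performance puts Gyoukou first and Tianhe-2 last, which is the ordering the introduction relies on when it singles out Gyoukou.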
