June 22–26, 2014
Leipzig, Germany

Presentation Details

Name: (05a) Proprietary Interconnect with Low Latency for HA-PACS/TCA
Time: Thursday, June 26, 2014
10:30 am - 11:00 am
Room:   Hall 4
CCL - Congress Center Leipzig
Breaks: 10:30 am - 11:00 am Coffee Break
07:30 am - 10:30 am Welcome Coffee
Presenter:   Toshihiro Hanawa, University of Tokyo
Abstract:   In recent years, heterogeneous clusters using accelerators have become widely used in high-performance computing systems. In such clusters, inter-node communication among accelerators requires several memory copies via CPU memory, and the resulting communication latency causes severe performance degradation. To address this problem, we propose the Tightly Coupled Accelerators (TCA) architecture, which reduces the communication latency between accelerators on different nodes. The TCA architecture communicates directly via the PCIe protocol, which eliminates protocol overhead, such as that associated with InfiniBand (IB) and MPI, as well as memory-copy overhead. We constructed the HA-PACS/TCA cluster, which is equipped with the TCA communication board (PEACH2 board) as a proprietary interconnect for GPUs, to enable direct GPU-to-GPU communication across nodes. In our performance evaluation on HA-PACS/TCA, the TCA interconnect achieves a GPU-to-GPU communication latency of 2.3 µs and a bandwidth of 2.7 GB/s.

Toshihiro Hanawa, University of Tokyo; Yuetsu Kodama, University of Tsukuba; Taisuke Boku, University of Tsukuba; Mitsuhisa Sato, University of Tsukuba