|Name:||Fast & Energy-Efficient Breadth-First Search on a Single NUMA System|
|Time:||Thursday, June 26, 2014
09:00 am - 09:30 am
CCL - Congress Center Leipzig
|Breaks:||07:30 am - 10:30 am Welcome Coffee|
|Speaker:||Yuichiro Yasui, Kyushu University & JST CREST|
|Abstract:||Breadth-first search (BFS) is an important graph analysis kernel. The Graph500 benchmark measures a computer's BFS performance in traversed edges per second (TEPS). Our previous nonuniform memory access (NUMA)-optimized BFS reduced accesses to remote RAM on a NUMA system and reached 11 GTEPS (giga TEPS) on a 4-way Intel Xeon E5-4640 system. Here, we investigate the computational complexity of the bottom-up search, a major bottleneck in NUMA-optimized BFS, and clarify the relationship between vertex out-degree and bottom-up performance. In November 2013, our new implementation achieved a Graph500 benchmark performance of 37.66 GTEPS (fastest for a single node) on an SGI Altix UV1000 (one rack) and 31.65 GTEPS (fastest for a single server) on a 4-way Intel Xeon E5-4650 system. Furthermore, we achieved the highest Green Graph500 performance of 153.17 MTEPS/W (mega TEPS per watt) on an Xperia-A SO-04E with a Qualcomm Snapdragon S4 Pro APQ8064.
Authors: Yuichiro Yasui, Kyushu University & JST CREST; Katsuki Fujisawa, Chuo University & JST CREST; Yukinori Sato, JAIST & JST CREST
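
For readers unfamiliar with the terms in the abstract, the following is a minimal Python sketch of a BFS that switches between top-down and bottom-up steps, together with a simplified TEPS calculation. It is an illustration only and not the authors' implementation: it contains no NUMA optimization, the function names and the `switch_fraction` threshold are assumptions made for this example, and the TEPS helper assumes an undirected graph stored as adjacency lists with no self-loops or duplicate edges.

```python
from typing import List


def top_down_step(adj: List[List[int]], frontier: List[int],
                  parent: List[int]) -> List[int]:
    """Expand the frontier by scanning the edges of every frontier vertex."""
    next_frontier = []
    for u in frontier:
        for v in adj[u]:
            if parent[v] == -1:          # v not visited yet
                parent[v] = u
                next_frontier.append(v)
    return next_frontier


def bottom_up_step(adj: List[List[int]], frontier: List[int],
                   parent: List[int]) -> List[int]:
    """Let every unvisited vertex look for a parent in the frontier."""
    in_frontier = set(frontier)
    next_frontier = []
    for v in range(len(adj)):
        if parent[v] != -1:
            continue
        for u in adj[v]:
            if u in in_frontier:
                parent[v] = u
                next_frontier.append(v)
                break                    # stop at the first parent found
    return next_frontier


def hybrid_bfs(adj: List[List[int]], root: int,
               switch_fraction: float = 0.05) -> List[int]:
    """Return a parent array; -1 marks vertices unreachable from root.

    switch_fraction is a hypothetical, simplified switching rule: use the
    bottom-up step whenever the frontier exceeds that fraction of vertices.
    """
    n = len(adj)
    parent = [-1] * n
    parent[root] = root
    frontier = [root]
    while frontier:
        if len(frontier) > switch_fraction * n:
            frontier = bottom_up_step(adj, frontier, parent)
        else:
            frontier = top_down_step(adj, frontier, parent)
    return parent


def teps(adj: List[List[int]], parent: List[int], seconds: float) -> float:
    """Undirected edges in the traversed component divided by elapsed time."""
    degree_sum = sum(len(adj[v]) for v in range(len(adj)) if parent[v] != -1)
    return (degree_sum // 2) / seconds
```

The early `break` in the bottom-up step is why its cost depends on how many neighbors an unvisited vertex must scan before it finds a parent, which is one way to read the abstract's link between vertex degree and bottom-up performance. Dividing the TEPS value by average power draw in watts gives a TEPS/W figure of the kind reported by the Green Graph500.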