|Name:||Cyme, a Library Maximizing SIMD Computation on User-Defined Containers|
|Time:||Wednesday, June 25, 2014, 09:00 am - 09:15 am|
|Location:||CCL - Congress Center Leipzig|
|Breaks:||07:30 am - 10:00 am Welcome Coffee|
|Presenter:||Timothée Ewart, EPFL|
|Abstract:||This paper presents Cyme, a C++ library that abstracts the use of SIMD instructions while maximizing use of the underlying hardware. Unlike similar efforts such as Boost.SIMD or Vc, Cyme supports not only vector data structures but also general containers. It accomplishes this by 1) optimizing the Abstract Syntax Tree (AST) with expression templates, which prevents temporary copies and maximizes the use of fused multiply-add (FMA) instructions, and 2) creating a data layout in memory (AoS or AoSoA) that minimizes data addressing and manipulation across all SIMD hardware registers. The library has been implemented on the IBM Blue Gene/Q architecture using the 256-bit SIMD extensions (QPX) of the PowerPC A2 processor. Its functionality is demonstrated on a computationally intensive kernel of a neuroscientific application, where it increases GFlops performance by a factor of 6.67 over the original implementation.
Timothée Ewart, Fabien Delalondre & Felix Schürmann, EPFL
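
The expression-template idea in the abstract can be illustrated with a minimal C++ sketch. This is not Cyme's API: the names `Vec`, `Mul`, `Fma`, and `kLanes` are hypothetical, a plain four-element array stands in for one 256-bit register, and `std::fma` stands in for a hardware FMA instruction. The sketch only shows how overloaded operators can build a small syntax tree, rewrite `a * b + c` into a single fused node, and evaluate it without intermediate temporaries.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical names throughout; this is an illustration, not Cyme's API.
constexpr std::size_t kLanes = 4;  // 256 bits / 64-bit double = 4 lanes

// Leaf node: four doubles standing in for one SIMD register.
struct Vec {
    std::array<double, kLanes> v;

    double eval(std::size_t i) const { return v[i]; }

    // Assigning from an expression node walks the tree once per lane,
    // so no intermediate Vec is ever materialized.
    template <class E>
    Vec& operator=(const E& e) {
        for (std::size_t i = 0; i < kLanes; ++i) v[i] = e.eval(i);
        return *this;
    }
};

// Node for l * r. It stores references only; nothing is computed yet.
template <class L, class R>
struct Mul {
    const L& l;
    const R& r;
    double eval(std::size_t i) const { return l.eval(i) * r.eval(i); }
};

// Node for a * b + c, evaluated as one fused multiply-add per lane.
template <class A, class B, class C>
struct Fma {
    const A& a;
    const B& b;
    const C& c;
    double eval(std::size_t i) const {
        return std::fma(a.eval(i), b.eval(i), c.eval(i));
    }
};

// a * b builds a Mul node instead of computing a product.
inline Mul<Vec, Vec> operator*(const Vec& l, const Vec& r) { return {l, r}; }

// (a * b) + c is rewritten at compile time into a single Fma node.
// The node refers to the operands of the Mul, not to the Mul temporary itself.
template <class L, class R>
Fma<L, R, Vec> operator+(const Mul<L, R>& m, const Vec& c) {
    return {m.l, m.r, c};
}
```

With these definitions, a statement such as `y = a * b + c;` on four `Vec` objects compiles to one loop over the lanes with one `std::fma` call per lane. Because the nodes hold references, the expression should be consumed in the same statement that builds it.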