June 22–26, 2014
Leipzig, Germany

Presentation Details

Name: (17a) Exploration of Application-level Lossy Compression for Fast Checkpoint/Restart
Time: Thursday, June 26, 2014
10:30 am - 11:00 am
Room:   Hall 4
CCL - Congress Center Leipzig
Breaks: 10:30 am - 11:00 am Coffee Break
07:30 am - 10:30 am Welcome Coffee
Presenter:   Naoto Sasaki, Tokyo Institute of Technology
Abstract:   The computational power of High Performance Computing (HPC) systems and supercomputers is growing exponentially, driven by extreme-scale scientific simulations. However, the overall system failure rate can also increase as system size grows. Although Checkpoint/Restart is one of the widely used fault tolerance techniques for scientific applications that run for a day or for weeks at a time, checkpoint and restart times are expected to become a huge overhead because of the high failure rate. To minimize checkpoint and restart time, we explore an application-level lossy compression technique based on a wavelet transformation. Our preliminary studies show that our lossy compression approach can reduce the size of simulation data from a real climate application by 86-87% with an average error of 0.09%.

Naoto Sasaki, Kento Sato, Toshio Endo & Satoshi Matsuoka, Tokyo Institute of Technology
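
The abstract does not say which wavelet, quantization scheme, or encoder the authors use. As a rough illustration of the general idea only, the sketch below assumes a one-level 1D Haar transform, uniform 8-bit quantization of the high-frequency coefficients, and zlib as the final encoder. All function names, the constant HALF_RANGE, and the synthetic input are hypothetical, and the sketch is not expected to reproduce the figures reported above.

```python
import math
import struct
import zlib

# Assumption: high-frequency coefficients are stored as signed 8-bit integers.
HALF_RANGE = 127


def haar_forward(values):
    """One level of a 1D Haar transform: pairwise averages and half-differences."""
    if len(values) == 0 or len(values) % 2 != 0:
        raise ValueError("this sketch expects a non-empty, even-length array")
    low = [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    return low, high


def haar_inverse(low, high):
    """Rebuild the original samples from averages and half-differences."""
    out = []
    for a, d in zip(low, high):
        out.extend((a + d, a - d))
    return out


def compress(values):
    """Lossy step: quantize the high band to 8 bits, then zlib the whole payload."""
    low, high = haar_forward(values)
    peak = max(abs(d) for d in high) or 1.0  # avoid dividing by zero
    quantized = [round(d / peak * HALF_RANGE) for d in high]
    n = len(low)
    payload = struct.pack(f"<Id{n}d{n}b", n, peak, *low, *quantized)
    return zlib.compress(payload)


def decompress(blob):
    """Undo zlib, dequantize the high band, and invert the Haar transform."""
    payload = zlib.decompress(blob)
    (n,) = struct.unpack_from("<I", payload, 0)
    (peak,) = struct.unpack_from("<d", payload, 4)
    low = struct.unpack_from(f"<{n}d", payload, 12)
    quantized = struct.unpack_from(f"<{n}b", payload, 12 + 8 * n)
    high = [q / HALF_RANGE * peak for q in quantized]
    return haar_inverse(low, high)


# Synthetic smooth field standing in for checkpoint data (not real climate data).
values = [280.0 + 10.0 * math.sin(i / 50.0) for i in range(4096)]
blob = compress(values)
restored = decompress(blob)

raw_bytes = 8 * len(values)  # the array stored as float64
mean_rel_err = sum(abs(a - b) / abs(a) for a, b in zip(values, restored)) / len(values)
print(f"compressed to {len(blob) / raw_bytes:.1%} of raw size, "
      f"mean relative error {mean_rel_err:.2e}")
```

The low-frequency band is kept at full precision here, so the loss comes only from quantizing the high-frequency band. Smooth data concentrates most of its energy in the low band, which is why coarse storage of the high band can cost little accuracy.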