June 22–26, 2014
Leipzig, Germany

Presentation Details

Name: (07a) Cancer Genome Analysis Using Next Generation Sequencing & High Performance Computing
Time: Thursday, June 26, 2014
10:30 am - 11:00 am
Room:   Hall 4
CCL - Congress Center Leipzig
Breaks:   10:30 am - 11:00 am Coffee Break
07:30 am - 10:30 am Welcome Coffee
Presenter:   Hyojin Kang, KISTI
Abstract:   Recent advances in DNA sequencing technology have enabled Next Generation Sequencing (NGS) instruments to generate billions of DNA reads in a few days. The rapidly dropping cost of NGS is enabling the cancer research community to perform whole-genome sequencing at a massive scale, initiating many new projects and leading to a better understanding of cancer genomics and to the design of personalized treatments. However, managing the enormous volume of NGS data requires a great deal of computing power and memory as well as very large disk storage. Here we developed a cancer genome analysis pipeline using HPC resources at the National Institute of Supercomputing and Networking (NISN) to overcome these obstacles to NGS data analysis. First, we applied pBWA (Parallel Burrows-Wheeler Aligner), which is built on the Open MPI library, to speed up the reference alignment step. This reduces computing wall-time by combining multithreading with MPI-based parallelization. Second, we developed a disk-I/O-aware job submission scheduler that maximizes disk I/O utilization without letting the heavy disk I/O of a newly submitted job slow down jobs that are already running. The cancer analysis pipeline is divided into several tasks according to their disk usage, and each task is submitted to the queue system according to the current disk I/O load.
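
The disk-I/O-aware submission idea in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the presenters' scheduler: the `Task` class, the `select_tasks` function, the per-task I/O estimates, and the single shared capacity figure are all hypothetical, and the sketch ignores dependencies between pipeline stages.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Task:
    """A pipeline task with a hypothetical estimate of its disk I/O demand."""
    name: str
    io_demand: float  # assumed unit: MB/s


def select_tasks(
    pending: List[Task], current_load: float, capacity: float
) -> Tuple[List[Task], List[Task]]:
    """Split pending tasks into (submit_now, deferred).

    A task is submitted only if adding its estimated I/O demand keeps the
    total at or below the assumed capacity, so jobs that are already
    running are not pushed past the disk's limit.
    """
    submit_now: List[Task] = []
    deferred: List[Task] = []
    load = current_load
    for task in pending:
        if load + task.io_demand <= capacity:
            submit_now.append(task)
            load += task.io_demand
        else:
            deferred.append(task)
    return submit_now, deferred


if __name__ == "__main__":
    # Illustrative numbers only; real values would come from monitoring.
    pipeline = [
        Task("alignment", 400.0),
        Task("sorting", 300.0),
        Task("variant_calling", 100.0),
    ]
    now, later = select_tasks(pipeline, current_load=500.0, capacity=1000.0)
    print("submit:", [t.name for t in now])
    print("defer:", [t.name for t in later])
```

In a real deployment the current load would be measured on the shared file system and the selected tasks handed to the cluster's queue system; deferred tasks would be reconsidered once the load drops.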

Hyojin Kang, Junehawk Lee, Yongseong Cho & Insung Ahn, Korea Institute of Science & Technology Information