Utilizing High Performance Networking for Data Intensive Applications – Lessons Learned for Small, Medium & Large Big Data Deployments

Eli Karpilovski, Mellanox Technologies

Big data applications have gained traction over the last several years, answering the need to store massive amounts of data in a scalable manner.

In our presentation we will cover several use cases of HPC-based architectures in the big data space.

We will discuss their use in high-throughput data ingress, intensive MapReduce workloads, and large-volume analytics applications. In addition, we will present a cost-effectiveness analysis of HPC architectures in big data application settings, enabling higher data value for unstructured and semi-structured data repositories.

We will present a novel algorithm for handling large-scale data analytics with Hadoop MapReduce without compromising performance or data throughput.

Looking beyond direct-attached storage (DAS), we will review the options for using other distributed, network-connected storage systems, such as Lustre and OrangeFS, for big data applications, along with the challenges and benefits of these architectures.