Introduction to Network Technologies for HPC
Speakers: Dhabaleswar K. (DK) Panda and Hari Subramoni (The Ohio State University)
Abstract: InfiniBand (IB), High-speed Ethernet (HSE), RoCE, Omni-Path, EFA, Tofu, and Slingshot technologies are generating a lot of excitement towards building next-generation High-End Computing (HEC) systems including clusters, datacenters, file systems, storage, cloud computing, and Big Data (Hadoop, Spark, HBase, and Memcached) environments. This tutorial will provide an overview of these emerging technologies, their architectural features, their current market standing, and their suitability for designing HEC systems. It will start with a brief overview of IB, HSE, RoCE, Omni-Path, EFA, Tofu, and Slingshot. An in-depth overview of the architectural features of IB, HSE (including iWARP and RoCE), and Omni-Path, their similarities and differences, and the associated protocols will be presented. An overview of the emerging NVLink, NVLink2, and NVSwitch architectures will also be given. Next, an overview of the OpenFabrics stack, which encapsulates IB, HSE, and RoCE (v1/v2) in a unified manner, will be presented. An overview of the libfabric stack will also be provided. Hardware/software solutions and the market trends behind these networking technologies will be highlighted. Sample performance numbers of these technologies and protocols for different environments will be presented. Finally, attendees will carry out hands-on exercises to gain first-hand experience of running experiments with high-performance networks.
Dhabaleswar K. (DK) Panda is a Professor of Computer Science and Engineering and University Distinguished Scholar at The Ohio State University. His research interests include parallel computer architecture, high-performance networking, Exascale computing, Big Data, Deep Learning, programming models, accelerators, high-performance file systems and storage, virtualization, and cloud computing. He has published over 500 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand, High-Speed Ethernet, RDMA over Converged Enhanced Ethernet (RoCE), Omni-Path, and EFA. Dr. Panda and his team have been actively working on high-performance MPI and PGAS libraries (http://mvapich.cse.ohio-state.edu), Deep Learning libraries (http://hidl.cse.ohio-state.edu), and Big Data libraries (http://hibd.cse.ohio-state.edu). Dr. Panda has served (or is serving) as Program Chair/Co-Chair/Vice-Chair of many international conferences. He is an IEEE Fellow and a member of ACM. More details are available at http://www.cse.ohio-state.edu/~panda.
Hari Subramoni received a Ph.D. degree in Computer Science from The Ohio State University, Columbus, OH, in 2013. He has been a research scientist in the Department of Computer Science and Engineering at The Ohio State University, USA, since September 2015. His research interests include high-performance interconnects and protocols, parallel computer architecture, network-based computing, exascale computing, network topology aware computing, QoS, power-aware LAN-WAN communication, fault tolerance, virtualization, cloud computing, and deep learning. He has published over 70 papers in international journals and conferences related to these research areas. He has been actively involved in various professional activities in academic journals and conferences. Dr. Subramoni currently conducts research on, and contributes to the design and development of, the MVAPICH2 (High-Performance MPI over InfiniBand, iWARP, and RoCE) and MVAPICH2-X (Hybrid MPI and PGAS (OpenSHMEM and UPC)) software packages. More details about Dr. Subramoni are available at http://www.cse.ohio-state.edu/~subramon.