High Performance Machine Learning, Deep Learning, and Data Science: Principles and Practice
Title: High Performance Machine Learning, Deep Learning, and Data Science: Principles and Practice
Speakers: Dhabaleswar K. (DK) Panda, Hari Subramoni, Aamir Shafi, and Nawras Alnaasan (Department of Computer Science and Engineering, The Ohio State University)
Abstract: Recent advances in Machine and Deep Learning (ML/DL) have led to many exciting challenges and opportunities. Modern ML/DL and Data Science frameworks, including TensorFlow, PyTorch, and Dask, offer high-performance training and deployment for various types of ML models and Deep Neural Networks (DNNs). This tutorial provides an overview of recent trends in ML/DL and the role of cutting-edge hardware architectures and interconnects in moving the field forward. We also present an overview of different DNN architectures and ML/DL frameworks, with a special focus on parallelization strategies for model training. We highlight new challenges and opportunities for communication runtimes to exploit high-performance CPU/GPU architectures to efficiently support large-scale distributed training. We also highlight some of our co-design efforts to utilize MPI for large-scale DNN training on cutting-edge CPU/GPU architectures available on modern HPC clusters. The tutorial covers training traditional ML models, including K-Means, linear regression, and nearest neighbours, using the cuML framework accelerated with MVAPICH2-GDR. The tutorial also presents accelerating GPU-based Data Science applications using MPI4Dask, an MPI-based backend for Dask. Throughout the tutorial, we include hands-on exercises that give attendees first-hand experience of running distributed ML/DL training and Dask on a modern GPU cluster.
Bios:
Dhabaleswar K. (DK) Panda is a Professor of Computer Science and Engineering and University Distinguished Scholar at The Ohio State University. His research interests include parallel computer architecture, high-performance networking, Exascale computing, Big Data, Deep Learning, programming models, accelerators, high-performance file systems and storage, virtualization, and cloud computing. He has published over 500 papers in major journals and international conferences related to these research areas. Dr. Panda and his research group members have been doing extensive research on modern networking technologies including InfiniBand, High-Speed Ethernet, RDMA over Converged Enhanced Ethernet (RoCE), Omni-Path, and EFA. Dr. Panda and his team have been actively working on high-performance MPI and PGAS libraries (http://mvapich.cse.ohio-state.edu), Deep Learning libraries (http://hidl.cse.ohio-state.edu), and Big Data libraries (http://hibd.cse.ohio-state.edu). Dr. Panda has served, or is serving, as Program Chair/Co-Chair/Vice-Chair of many international conferences. He is an IEEE Fellow and a member of ACM. More details are available at http://www.cse.ohio-state.edu/~panda.
Dr. Hari Subramoni is an assistant professor in the Department of Computer Science and Engineering at The Ohio State University. His current research interests include high-performance interconnects and protocols, parallel computer architecture, network-based computing, exascale computing, network topology aware computing, QoS, power-aware LAN-WAN communication, fault tolerance, virtualization, big data, deep learning, and cloud computing. He has published over 100 papers in international journals and conferences related to these research areas. He has been actively involved in various professional activities in academic journals and conferences. Dr. Subramoni is doing research on the design and development of the MVAPICH2 (High Performance MPI over InfiniBand, iWARP and RoCE) and MVAPICH2-X (Hybrid MPI and PGAS (OpenSHMEM, UPC and CAF)) software packages. He is a member of IEEE and ACM.
Dr. Aamir Shafi is currently a Research Scientist in the Department of Computer Science & Engineering at The Ohio State University, where he is involved in the High Performance Big Data project led by Dr. Dhabaleswar K. Panda. Dr. Shafi was a Fulbright Visiting Scholar at the Massachusetts Institute of Technology (MIT) in the 2010-2011 academic year, where he worked with Prof. Charles Leiserson on the award-winning Cilk technology. Dr. Shafi received his PhD in Computer Science from the University of Portsmouth, UK in 2006. He received his Bachelor's degree in Software Engineering from NUST, Pakistan in 2003. Dr. Shafi’s current research interests include architecting robust libraries and tools for Big Data computation with emphasis on Machine and Deep Learning applications. Dr. Shafi co-designed and co-developed a Java-based MPI-like library called MPJ Express. More details about Dr. Shafi are available from https://people.engineering.osu.edu/people/shafi.16.
Nawras Alnaasan is a Graduate Research Associate at the Network-Based Computing Laboratory, Columbus, OH, USA. He is currently pursuing a Ph.D. degree in computer science and engineering at The Ohio State University. His research interests lie at the intersection of deep learning and high-performance computing. He works on advanced parallelization techniques to accelerate the training of Deep Neural Networks and exploit underutilized HPC resources covering a wide range of DL applications including supervised learning, semi-supervised learning, and hyperparameter optimization. He is actively involved in several research projects including HiDL (High-performance Deep Learning) and ICICLE (Intelligent Cyberinfrastructure with Computational Learning in the Environment). Alnaasan received his B.S. degree in computer science and engineering from The Ohio State University.