Leveraging SmartNICs for HPC and Data Center Applications

Name: Leveraging SmartNICs for HPC and Data Center Applications
Start: 2024-08-23T11:00:00+00:00
End: 2024-08-23T13:45:00+00:00

August 23 @ 11:00 am - 1:45 pm UTC+0

Presenter Names: Jeffrey Young, Rich Graham, Oscar Hernandez, Antonio Peña, Richard Vuduc, Sergio Iserte

Abstract: The past few years have witnessed a surge in the number of advanced network adapters, known as “SmartNICs”, that offer additional functionalities beyond standard packet processing capabilities. These devices often feature programmable lightweight processing cores, FPGAs, and even CPU- and GPU-based platforms capable of running separate operating systems. Though primarily aimed at data center operations, such as infrastructure management, packet filtering, and I/O acceleration, SmartNICs are increasingly being explored for high-performance computing (HPC) application acceleration. This half-day (4-hour) tutorial offers an in-depth exploration of the state of the art in SmartNICs and the emerging software ecosystems supporting them. Attendees will engage in hands-on exercises to better understand how to use SmartNICs for HPC application acceleration, including MPI collective operation offloading, OpenMP remote offload, and algorithmic modifications to maximize on-board processing power. Participants will have the opportunity to execute these exercises using cutting-edge SmartNICs like NVIDIA’s BlueField-3 Data Processing Unit (DPU). The tutorial presenters will further discuss optimizing applications to harness SmartNICs as communication accelerators in HPC systems.

Bio: Jeffrey Young is a principal research scientist with Georgia Tech’s Partnership for Advanced Computing Environments (PACE). With a background in computer architecture, his main research interests have focused on the intersection of high-performance computing and novel accelerators including GPUs, FPGAs, and next-generation processors. He is the director of a novel architecture testbed, the CRNCH Rogues Gallery, that aims to simplify and democratize access to novel post-Moore accelerators in the neuromorphic, reversible, and smart networking spaces. He received his PhD in computer engineering in 2013 from Georgia Tech’s ECE department.

Dr. Richard Graham is the Senior Director of HPC Technology at NVIDIA’s Networking Business unit. His main area of expertise revolves around HPC network software and hardware capabilities for present and upcoming HPC technologies. Before joining Mellanox/NVIDIA, Dr. Graham accumulated thirteen years of experience at Los Alamos National Laboratory and Oak Ridge National Laboratory, where he held technical and administrative positions in computer science. His technical focus encompassed communication libraries and application analysis tools. Additionally, Dr. Graham played a significant role as a co-founder of the Open MPI collaboration and served as the chairman of the MPI 3.0 standardization efforts.

Oscar Hernandez holds a PhD in Computer Science and currently works at Oak Ridge National Laboratory (ORNL). Dr. Hernandez conducts research on programming models, compilers, and tools deployed on supercomputers such as Summit and Frontier at the Oak Ridge Leadership Computing Facility (OLCF). At ORNL, he has contributed to the standardization of parallel languages and APIs for accelerated nodes, including OpenACC and OpenMP, as well as communication libraries and frameworks like OpenSHMEM and UCX. Furthermore, he has been involved with the Exascale Computing Project, leading various initiatives to implement these technologies on exascale systems. Dr. Hernandez has also collaborated closely with application teams, including the CAAR, INCITE, and ALCC projects, as well as numerous projects funded by DOE, DoD, NSF, and industrial partners in the oil and gas industry. Additionally, he has extensive experience delivering tutorials at various events, including Supercomputing, ISC, and the Exascale Computing Annual Meeting, and for NSF.

Antonio J. Peña is a Leading Researcher and Group Manager at the Barcelona Supercomputing Center (BSC), Computer Sciences Department, where he leads the “Accelerators and Communications for HPC” Group. He holds a secondary appointment as Teaching and Research Staff at Universitat Politecnica de Catalunya. Dr. Peña is a Ramon y Cajal Fellow and former Marie Sklodowska-Curie Individual Fellow. He currently holds an ERC Consolidator Grant. Among other honors, he is a recipient of the 2023 Agustín de Betancourt y Molina Award from the Spanish Royal Academy of Engineering and a 2017 IEEE TCHPC Award for Excellence for Early Career Researchers in High Performance Computing. A Senior Member of ACM and IEEE, he is involved in the organization and steering committees of several conferences and workshops, such as SC, IEEE Cluster, and AsHES. His research interests in runtime systems and programming models for high-performance computing include resource heterogeneity and communications.

Dr. Sergio Iserte holds a BS in Computer Engineering (2011), an MS in Intelligent Systems (2014), and a PhD in Computer Science (2018) from Universitat Jaume I (UJI), Spain. He is a senior researcher in the Computer Science Department at the Barcelona Supercomputing Center (BSC) and an instructor for the High-Performance Computing (HPC) course in the Computer Science master's program at the Open University of Catalonia (UOC). He is currently involved in projects related to parallel distributed computing, resource management, workload modeling, deep learning for industrial applications, and in-network accelerators.

Richard (Rich) Vuduc is an Associate Professor at the Georgia Institute of Technology in the School of Computational Science and Engineering, a department devoted to the study of computer-based modeling and simulation of natural and engineered systems. His research lab, The HPC Garage, is interested in high-performance computing, with an emphasis on algorithms, performance analysis, and performance engineering. He is a recipient of a DARPA Computer Science Study Group grant; an NSF CAREER award; a collaborative Gordon Bell Prize in 2010; the Lockheed-Martin Aeronautics Company Dean’s Award for Teaching Excellence (2013); and Best Paper Awards at the SIAM Conference on Data Mining (SDM, 2012) and the IEEE Parallel and Distributed Processing Symposium (IPDPS, 2015), among others. Most recently, Dr. Vuduc has led an effort to map high-performance applications like LAMMPS and MueLu to Data Processing Units (DPUs), with early results yielding an IPDPS 2022 Best Paper nominee.

HotI 2024
