Unified Communication X for Performance Portable Network Acceleration
Jeffrey Young (Georgia Institute of Technology), Yossi Itigin (NVIDIA), Matthew Baker (Oak Ridge National Laboratory), Oscar Hernandez (NVIDIA/Mellanox)
Over the past 10 years, the Unified Communication X (UCX) framework has developed from an initial vision for a portable networking middleware layer into a fully implemented framework specification that supports multiple architectures, including x86, Arm, Power, and GPUs. UCX provides a high-performance stacked architecture for communication runtimes like MPI and OpenSHMEM, and the definition of the UCX protocol and transport layers continues to evolve with the introduction of new GPU supercomputers, edge computing devices, and smart networking cards.
This tutorial covers recent advances in the UCX ecosystem and provides examples of using UCX with MPI and OpenSHMEM, and as part of higher-level tools like NVIDIA’s RAPIDS and Apache Spark. In addition to learning about the latest support for UCX layers, attendees will learn how to run simple UCX example codes relevant to Python developers using UCX-Py as well as to traditional MPI and SHMEM programmers.
Jeffrey Young is a senior research scientist in Georgia Tech’s School of Computer Science. With a background in computer architecture, his main research interests have focused on the intersection of high-performance computing and novel accelerators including GPUs, Xeon Phi, FPGAs, and Arm SVE processors. He is a co-director for Georgia Tech’s Center for High Performance Computing and is also the director of a novel architecture testbed, the CRNCH Rogues Gallery, that aims to simplify and democratize access to novel post-Moore accelerators in the neuromorphic, reversible, and novel networking spaces. He received his PhD in computer engineering in 2013 from Georgia Tech’s ECE department. Jeff has given multiple tutorials and has coordinated workshops related to Arm high-performance computing in multiple venues including Supercomputing, ISC, ASPLOS, and NSF’s PEARC.
Yossi Itigin is the Communication Middleware Team Leader at NVIDIA and the maintainer of the UCX codebase. Yossi participated in the 2021 Hot Interconnects UCX tutorial and has over eight years of experience working with network communication protocols at Mellanox and, more recently, at NVIDIA.
Matthew Baker is an R&D staff member at Oak Ridge National Laboratory (ORNL). After graduating from ETSU, he went on to do research at ORNL, where he has worked on benchmarks, system languages, interconnect programming libraries, and CPU simulators. He is one of the developers of UCX for uGNI and UCX-Py, and was on the team that implemented BlazingSQL/UCX. His main interests are parallel programming languages and communication libraries like OpenSHMEM. Matt has also organized OpenSHMEM workshops and presented his work at various venues.
Oscar Hernandez has a PhD in Computer Science and joined NVIDIA/Mellanox in 2021 after 12 years at Oak Ridge National Laboratory (ORNL), where he was a senior staff member of the Programming Systems Group, which conducts research on programming models, compilers, and tools deployed on supercomputers like Summit and Frontier at the Oak Ridge Leadership Computing Facility (OLCF). At ORNL he helped standardize parallel languages and APIs for accelerated nodes, such as OpenACC and OpenMP, and communication libraries and frameworks like OpenSHMEM and UCX. Oscar has given tutorials at venues including Supercomputing, ISC, and the Exascale Computing Annual Meeting, as well as for NSF.