Latest Past Events

ASTRA-sim and Chakra: Enabling Software-Hardware Co-design Exploration for Distributed Machine Learning Platforms

Presenter Names: Tushar Krishna and William Won (Georgia Tech)

Abstract: As Artificial Intelligence (AI) models are scaling at an unprecedented rate, Machine Learning (ML) execution heavily relies on Distributed ML over customized neural accelerator (e.g., GPU or TPU)-based High-Performance Computing (HPC) platforms connected via high-speed interconnects (e.g., NVLinks). Deep Neural Network (DNN) execution involves a complex […]

Principles and Practice of Scalable and Distributed Deep Neural Networks Training and Inference

Presenter Names: Dhabaleswar K. (DK) Panda, Hari Subramoni, Aamir Shafi, Nawras Alnaasan

Abstract: Recent advances in Deep Learning (DL) have led to many exciting challenges and opportunities. Modern DL frameworks including TensorFlow, PyTorch, Horovod, and DeepSpeed enable high-performance training, inference, and deployment for various types of Deep Neural Networks (DNNs). This tutorial provides an overview of recent […]

Chip-let Interconnect Test and Repair

Presenter Name: Sreejit Chakravarty

Abstract: The goal of this tutorial is to introduce the attendees to the chip-let interconnect test and repair problem. In part 1 of the proposed two-part tutorial, we delve into the fundamentals of chip-let interconnect test and repair. This will include topics listed under Topic number 1 in Section 2.5. Broadly […]