Die-to-Die Interconnects

August 18 @ 1:00 pm - 1:50 pm UTC-7

Invited Talks
13:00 – 13:20: Bunch of Wires: An Open and Versatile PHY Standard for Die-to-Die Interconnects
Invited Speakers: Elad Alon, Shahab Ardalan, Boris Murmann, Bapi Vinnakota and Venkata Satya Rao

Chiplet-based designs realize an ASIC product as a collection of dice within a single package. In 2019 at Hot Interconnects, we proposed the idea of a simple, parallel, clock-forwarded die-to-die (D2D) PHY named Bunch of Wires (BoW), resulting in the first open standard to support both commodity (organic laminate) and advanced packaging technologies. The BoW specification enables cost- and energy-efficient (0.25 – 0.5 pJ/bit), high-performance (2 – 16 Gb/s/line) designs across a wide range of process nodes (65 nm – 5 nm), packages, and use cases (25 mm reach, <1e-15 BER), hence enabling significant economies of scale. This paper will review significant aspects of the specification, the flexibility the interface offers for system design, and prototypes as well as ASIC products already being developed around BoW.
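
As a rough sanity check on the figures quoted in this abstract, the sketch below converts energy per bit and line rate into per-line power, and converts a bit error rate into a mean time between bit errors on one line. Pairing the endpoints of the quoted ranges (for example 0.5 pJ/bit with 16 Gb/s) is an assumption made here purely for illustration, and the function names are hypothetical; none of this is taken from the BoW specification itself.

```python
def line_power_mw(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Per-line power in mW. Units: (pJ/bit) * (Gb/s) = 1e-12 J * 1e9 /s = mW."""
    return energy_pj_per_bit * rate_gbps


def mean_seconds_between_errors(ber: float, rate_gbps: float) -> float:
    """Mean time between bit errors on one line, in seconds."""
    bits_per_second = rate_gbps * 1e9
    return 1.0 / (ber * bits_per_second)


if __name__ == "__main__":
    # Assumed pairings of the quoted range endpoints (illustrative only).
    print("low end  :", line_power_mw(0.25, 2.0), "mW per line")
    print("high end :", line_power_mw(0.5, 16.0), "mW per line")
    # The abstract quotes BER < 1e-15, so this is a lower bound on the interval.
    hours = mean_seconds_between_errors(1e-15, 16.0) / 3600.0
    print("mean time between errors at 16 Gb/s: at least", round(hours, 1), "hours")
```

Because the quoted BER is an upper bound, the computed interval between errors should be read as a lower bound per line.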

13:20 – 13:50: Synchronous and Low-Latency Die-to-Die Interface for the IBM z16™ Telum Processor
Invited Speakers: Mike Spear, Dan Dreps, Chad Marquart and Andrew Turner

This talk describes the z16™ on-module die-to-die interface that connects two Telum processors on an 8-2-8 organic dual-chip module. The chip and interface architecture achieves a low power of 0.26 pJ/bit using standard available voltage rails and a latch-to-latch latency of 370 ps. Both chips are clocked from a single phase-locked loop on the primary chip and use a clock-alignment mechanism that makes the chip crossing fully synchronous without a forwarded I/O clock lane. The design is single-ended with a bit rate of 2.7 Gb/s per lane. The logic and circuits have a small area footprint that allows for scaling below 50-micron C4 bumps if needed. Lane sparing methods are applied in manufacturing and test for the benefit of chip and package yield. The built-in self-test logic requires no external probing of any C4 or micro-C4 bumps. Link training is simple, with margin determination performed at module and box test. The circuit approaches are compatible with any current processor CMOS technology.
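
To put the quoted numbers in context, the following sketch derives the per-lane power and expresses the latch-to-latch latency in unit intervals (bit times). The inputs are the figures from the abstract; the helper names are hypothetical, and the comparison of latency to one bit time is an observation from the arithmetic, not a claim made by the speakers.

```python
def lane_power_mw(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Per-lane power in mW: (pJ/bit) * (Gb/s) = mW."""
    return energy_pj_per_bit * rate_gbps


def unit_interval_ps(rate_gbps: float) -> float:
    """Duration of one bit on the lane, in picoseconds."""
    return 1000.0 / rate_gbps


def latency_in_ui(latency_ps: float, rate_gbps: float) -> float:
    """Latency expressed as a multiple of the bit time."""
    return latency_ps / unit_interval_ps(rate_gbps)


if __name__ == "__main__":
    rate = 2.7       # Gb/s per lane (from the abstract)
    energy = 0.26    # pJ/bit (from the abstract)
    latency = 370.0  # ps, latch to latch (from the abstract)
    print("power per lane :", round(lane_power_mw(energy, rate), 3), "mW")
    print("unit interval  :", round(unit_interval_ps(rate), 2), "ps")
    print("latency        :", round(latency_in_ui(latency, rate), 3), "UI")
```

Under these figures the quoted latency comes out very close to a single bit time at 2.7 Gb/s.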