- Bandwidth Scaling for AI Interconnect - More Wavelengths versus More Fiber? Katharine Schmidtke, Daniel M. Kuchta, Peter Winzer 0001, Rebecca Schaevitz, Amit Nagra, Alan Liu, Bardia Pezeshki.
- Message from the TPC Chairs: HOTI 2024. Rohit Zambre, Sayan Ghosh.
- Message from the General Chairs HOTI 2024. Matthew G. F. Dosanjh, Artem Y. Polyakov.
- Rail-only: A Low-Cost High-Performance Network for Training LLMs with Trillion Parameters. Weiyang Wang, Manya Ghobadi, Kayvon Shakeri, Ying Zhang, Naader Hasani. 1-10
- Characterizing Communication in Distributed Parameter-Efficient Fine-Tuning for Large Language Models. Nawras Alnaasan, Horng-Ruey Huang, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda 0001. 11-19
- Quality-of-Service Provision for BXI3-Based Interconnection Networks. Miguel Sánchez de la Rosa, Gabriel Gomez-Lopez, Francisco J. Andújar, Jesús Escudero-Sahuquillo, José L. Sánchez 0002, Francisco J. Alfaro, Pierre-Axel Lagadec. 20-23
- A New Mechanism to Identify Congesting Packets in High-Performance Interconnection Networks. Cristina Olmedilla, Jesús Escudero-Sahuquillo, Pedro Javier García, Francisco J. Quiles 0001, Wenhao Sun, Long Yan, Yunping Lvu, José Duato. 24-32
- Towards a Standardized Representation for Deep Learning Collective Algorithms. Jinsun Yoo, William Won, Meghan Cowan, Nan Jiang, Benjamin Klenk, Srinivas Sridharan 0002, Tushar Krishna. 33-36
- Unified Collective Communication (UCC): An Unified Library for CPU, GPU, and DPU Collectives. Manjunath Gorentla Venkata, Valentine Petrov, Sergey Lebedev, Devendar Bureddy, Ferrol Aderholdt, Joshua Ladd, Gil Bloch, Mike Dubman, Gilad Shainer. 37-46
- OHIO: Improving RDMA Network Scalability in MPI_Alltoall Through Optimized Hierarchical and Intra/Inter-Node Communication Overlap Design. Tu Tran, Goutham Kalikrishna Reddy Kuncham, Bharath Ramesh 0005, Shulei Xu, Hari Subramoni, Mustafa Abduljabbar, Dhabaleswar K. Panda 0001. 47-56
- Demystifying the Communication Characteristics for Distributed Transformer Models. Quentin Anthony, Benjamin Michalowicz, Jacob Hatef, Lang Xu, Mustafa Abdul Jabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda 0001. 57-65