BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000711Z
LOCATION:401-402
DTSTART;TZID=America/Denver:20231115T163000
DTEND;TZID=America/Denver:20231115T170000
UID:submissions.supercomputing.org_SC23_sess174_pap318@linklings.com
SUMMARY:High-Performance and Programmable Attentional Graph Neural Network
 s with Global Tensor Formulations
DESCRIPTION:Maciej Besta (ETH Zurich - Swiss Federal Institute of Technolo
 gy); Paweł Renc (AGH University of Science and Technology, Krakow, Poland;
  Sano Centre for Computational Medicine, Krakow, Poland); Robert Gerstenbe
 rger (ETH Zurich - Swiss Federal Institute of Technology); Paolo Sylos Lab
 ini (Free University of Bozen-Bolzano, Italy; ETH Zurich - Swiss Federal I
 nstitute of Technology); Alexandros Ziogas, Tiancheng Chen, Lukas Gianinaz
 zi, Florian Scheidl, Kalman Szenes, Armon Carigiet, and Patrick Iff (ETH Z
 urich - Swiss Federal Institute of Technology); Grzegorz Kwasniewski (Next
 Silicon Inc); Raghavendra Kanakagiri (University of Illinois); Chio Ge and
  Sammy Jaeger (ETH Zurich - Swiss Federal Institute of Technology); Jarosł
 aw Wąs (AGH University of Science and Technology, Krakow, Poland); Flavio 
 Vella (University of Trento); and Torsten Hoefler (ETH Zurich - Swiss Fede
 ral Institute of Technology)\n\nGraph attention models (A-GNNs), a class
  of Graph Neural Networks (GNNs), have been shown to be more powerful than si
 mpler convolutional GNNs (C-GNNs). However, A-GNNs are more complex to pro
 gram and difficult to scale. To address this, we develop a novel mathemati
 cal formulation, based on tensors that group all the feature vectors, targ
 eting both training and inference of A-GNNs. The formulation enables strai
 ghtforward adoption of communication-minimizing routines, it fosters optimi
 zations such as vectorization, and it enables seamless integration with es
 tablished linear algebra DSLs or libraries such as GraphBLAS. Our implemen
 tation uses a data redistribution scheme explicitly developed for sparse-d
 ense tensor operations used heavily in GNNs, and fusion optimizations that
  further minimize memory usage and communication cost. We ensure theoretic
 al asymptotic reductions in communicated data compared to the established 
 message-passing GNN paradigm. Finally, we demonstrate excellent scalabili
 ty and speedups of >5x over modern libraries such as Deep Graph Library.\n\nTags
 : Artificial Intelligence/Machine Learning, Compilers, Performance Measure
 ment, Modeling, and Tools, Performance Optimization, Programming Framework
 s and System Software, Tensors\n\nRegistration Category: Tech Program Reg 
 Pass\n\nReproducibility Badges: Artifact Available, Artifact Functional, R
 esults Reproduced\n\nSession Chair: Kazem Cheshmi (McMaster University, On
 tario, Canada)\n\n
END:VEVENT
END:VCALENDAR
