BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000713Z
LOCATION:405-406-407
DTSTART;TZID=America/Denver:20231116T160000
DTEND;TZID=America/Denver:20231116T163000
UID:submissions.supercomputing.org_SC23_sess170_pap519@linklings.com
SUMMARY:Runtime Composition of Iterations for Fusing Loop-Carried Sparse D
 ependence
DESCRIPTION:Kazem Cheshmi (McMaster University); Michelle Mills Strout (Un
 iversity of Arizona, Hewlett Packard Enterprise (HPE)); and Maryam Mehri D
 ehnavi (University of Toronto)\n\nDependence between iterations in sparse 
 computations causes inefficient use of memory and computation resources. T
 his paper proposes sparse fusion, a technique that generates efficient par
 allel code for the combination of two sparse matrix kernels, where at leas
 t one of the kernels has loop-carried dependencies. Existing implementatio
 ns optimize individual sparse kernels separately. However, this approach l
 eads to synchronization overheads and load imbalance due to the irregular 
 dependence patterns of sparse kernels, as well as inefficient cache usage 
 due to their irregular memory access patterns. Sparse fusion uses a novel 
 inspection strategy and code transformation to generate parallel fused cod
 e optimized for data locality and load balance. Sparse fusion outperforms 
 the best of unfused implementations using ParSy and MKL by an average of 4
 .2× and is faster than the best of fused implementations using existing sc
 heduling algorithms, such as LBC, DAGP, and wavefront, by an average of
  4× for various kernel combinations.\n\nTag: Compilers, Performance
  Measuremen
 t, Modeling, and Tools, Performance Optimization, Programming Frameworks a
 nd System Software\n\nRegistration Category: Tech Program Reg Pass\n\nRepr
 oducibility Badges: Artifact Available, Artifact Functional, Results Repro
 duced\n\nSession Chair: Martin Kong (Ohio State University)\n\n
END:VEVENT
END:VCALENDAR
