BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000711Z
LOCATION:401-402
DTSTART;TZID=America/Denver:20231115T113000
DTEND;TZID=America/Denver:20231115T120000
UID:submissions.supercomputing.org_SC23_sess157_pap248@linklings.com
SUMMARY:PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Di
 rect Solver on Distributed Heterogeneous Systems
DESCRIPTION:Xu Fu, Bingbin Zhang, Tengcheng Wang, Wenhao Li, Yuechen Lu, E
 nxin Yi, Jianqi Zhao, Xiaohan Geng, Fangying Li, Jingwen Zhang, Zhou Jin, 
 and Weifeng Liu (China University of Petroleum, Beijing)\n\nSparse direct 
 solvers play a vital role in large-scale high performance computing in sci
 ence and engineering. Existing distributed sparse direct methods employ mu
 ltifrontal/supernodal patterns to aggregate columns of nearly identical fo
 rms and to exploit dense basic linear algebra subprograms (BLAS) for compu
 tation. We propose a new sparse direct solver called PanguLU. Our work rel
 ies on simpler regular 2D blocking and stores blocks in their sparse forms
  to avoid any extra fill-ins. Based on sparse patterns of blocks, a variet
 y of block-wise sparse BLAS methods are developed and selected for higher 
 efficiency on local GPUs. To make PanguLU more scalable, we also adjust th
 e mapping of blocks to processes for a more balanced overall workload, and
  propose a synchronization-free communication strategy to reduce overall
  latency
  overhead. Experiments on two distributed heterogeneous platforms consisti
 ng of 128 A100 GPUs and 128 MI50 GPUs demonstrate that PanguLU achieves up
  to 11.70x and 17.97x speedups over SuperLU_DIST.\n\nTag: Accelerators, Al
 gorithms, Linear Algebra\n\nRegistration Category: Tech Program Reg Pass\n
 \nAward Finalist: Best Paper Finalist\n\nReproducibility Badges: Artifact 
 Available, Artifact Functional, Results Reproduced\n\nSession Chair: Hartw
 ig Anzt (Technical University of Munich; University of Tennessee, Knoxvill
 e)\n\n
END:VEVENT
END:VCALENDAR
