BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000712Z
LOCATION:605
DTSTART;TZID=America/Denver:20231113T094100
DTEND;TZID=America/Denver:20231113T100000
UID:submissions.supercomputing.org_SC23_sess441_ws_p3hpc102@linklings.com
SUMMARY:Performance Portability Evaluation of Blocked Stencil Computations
  on GPUs
DESCRIPTION:Oscar Antepara, Samuel Williams, and Hans Johansen (Lawrence B
 erkeley National Laboratory (LBNL)) and Tuowen Zhao, Samantha Hirsch, Priy
 a Goyal, and Mary Hall (University of Utah)\n\nIn this new era where multi
 ple GPU vendors are leading the supercomputing landscape, and multiple pro
 gramming models are available to users, the drive to achieve performance p
 ortability across platforms faces new challenges. Consider stencil algori
 thms, where architecture-specific solutions are required to optimize for t
 he parallelism hierarchy and memory hierarchy of emerging systems. In thi
 s work, we analyze performance portability of the BrickLib domain-specific
  library and vector code generator for stencils. BrickLib employs fine-gra
 in data blocking to reduce the large amount of data movement associated wi
 th stencils. We compare different GPUs (NVIDIA, AMD and Intel) and their 
 associated programming models (CUDA, HIP and SYCL). By testing a wide rang
 e of stencil configurations, we show that overall, BrickLib achieves good 
 performance independent of machine or programming model. Moreover, we int
 roduce correlation models as a new tool for comparing architectures and pr
 ogramming models from Roofline model data.\n\nTag: Performance Measurement
 , Modeling, and Tools, Performance Optimization\n\nRegistration Category: 
 Workshop Reg Pass\n\nSession Chairs: Judith C. Hill (Lawrence Livermore Na
 tional Laboratory (LLNL)), CJ Newburn (NVIDIA Corporation), Scott J. Parke
 r (Argonne National Laboratory (ANL)), and John Pennycook (Intel Corporati
 on)
END:VEVENT
END:VCALENDAR
