BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000713Z
LOCATION:DEF Concourse
DTSTART;TZID=America/Denver:20231114T100000
DTEND;TZID=America/Denver:20231114T170000
UID:submissions.supercomputing.org_SC23_sess291_rpost137@linklings.com
SUMMARY:That's Right – The Same C++ STL Asynchronous Parallel Code Runs on
  CPUs and GPUs
DESCRIPTION:Muhammad Haseeb, Weile Wei, Jack Deslippe, and Brandon Cook (L
 awrence Berkeley National Laboratory (LBNL), National Energy Research Scie
 ntific Computing Center (NERSC))\n\nHigh-performance computing application
 s running on modern-day supercomputers frequently encounter performance an
 d portability challenges especially if using multiple programming models, 
 languages and compilers. In this work, we explore the proposed C++26 langu
 age standard model for asynchronous parallelism, called std::execution or 
 stdexec, powered with stdpar, std::mdspan, among other C++23 features, to 
 port and analyze multiple scientific HPC applications on CPUs and GPUs. Th
 ese applications include sequence alignment codes from ADEPT and heat tran
 sfer from AMReX. Our experiments show near-native performance for our po
 rted implementations on NVIDIA A100 GPUs running on the Perlmutter superco
 mputer. We also study and analyze the data transfer traffic patterns and o
 verheads between the host and device for stdpar and provide helpful insigh
 ts in application performance. Finally, we discuss some challenges and lim
 itations encountered while porting these apps to C++26 with stdexec, as we
 ll as their workarounds, until stdexec is fully integrated and functional
  in the NVHPC compilers.\n\nTag: Artificial Intelligence/Machine Learning
 , Architecture and Networks, Heterogeneous Computing, I/O and File Systems
 , Performance Measurement, Modeling, and Tools, Post-Moore Computing, Prog
 ramming Frameworks and System Software, Quantum Computing\n\nRegistration 
 Category: Tech Program Reg Pass, Exhibits Reg Pass\n\n
END:VEVENT
END:VCALENDAR
