BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000711Z
LOCATION:605
DTSTART;TZID=America/Denver:20231113T105700
DTEND;TZID=America/Denver:20231113T111600
UID:submissions.supercomputing.org_SC23_sess441_ws_p3hpc112@linklings.com
SUMMARY:Evaluating the Performance of One-Sided Communication on CPUs and 
 GPUs
DESCRIPTION:Nan Ding, Muhammad Haseeb, Taylor Groves, and Samuel Williams 
 (Lawrence Berkeley National Laboratory (LBNL))\n\nAs high-performance GPU 
 computing becomes the trend, GPU-initiated one-sided communication becomes
  a viable solution for multi-GPU scaling. It also raises attention to the 
 use of one-sided communication on CPUs. However, the lack of a deep under
 standing of one-sided communication performance and its impact on an appl
 ication's performance remains a hurdle. In this paper, we overcome this h
 urdle
  by proposing a Message Roofline model, which characterizes an application
 ’s sustained messaging performance (GB/s) as a function of its message siz
 e, number of messages per synchronization, peak network bandwidth, and net
 work latency. We use three benchmarks to demonstrate the potential of one
 -sided communication on CPUs and GPUs. These benchmarks are Stencils, Spa
 rse Triangular Solve, and Distributed HashTable. Our evaluation provides
  practical insights into two-sided and one-sided communication in MPI app
 lications, and can guide hardware vendors with design principles so that
  the potential performance of one-sided communication is not underutil
 ized.\n\nTag: Performance Measurement, Modeling, and Tool
 s, Performance Optimization\n\nRegistration Category: Workshop Reg Pass\n\
 nSession Chairs: Judith C. Hill (Lawrence Livermore National Laboratory (L
 LNL)), CJ Newburn (NVIDIA Corporation), Scott J. Parker (Argonne National 
 Laboratory (ANL)), and John Pennycook (Intel Corporation)\n\n
END:VEVENT
END:VCALENDAR
