BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000711Z
LOCATION:505
DTSTART;TZID=America/Denver:20231112T165000
DTEND;TZID=America/Denver:20231112T171000
UID:submissions.supercomputing.org_SC23_sess428_ws_ipdrm105@linklings.com
SUMMARY:MPI-xCCL: A Portable MPI Library over Collective Communication Lib
 raries for Various Accelerators
DESCRIPTION:Chen-Chun Chen, Kawthar Shafie Khorassani, Pouya Kousha, Qingh
 ua Zhou, Jinghan Yao, Hari Subramoni, and Dhabaleswar K. Panda (Ohio State
  University)\n\nThe evolution of high-performance computing toward diverse
  accelerators, including NVIDIA, AMD, Intel GPUs, and Habana Gaudi Acceler
 ators, demands a user-friendly and efficient utilization of these technolo
 gies. While both GPU-aware MPI libraries and vendor-specific communication
  libraries cater to communication requirements, trade-offs emerge based on
  library selection across various message sizes. Thus, prioritizing usabil
 ity, we propose MPI-xCCL, a Message Passing Interface-based runtime with c
 ross-accelerator support for efficient, portable, scalable, and optimized 
 communication performance. MPI-xCCL incorporates vendor-specific librari
 es with GPU-aware MPI runtimes, ensuring multi-accelerator compatibility
  while adhering to MPI standards. The proposed hybrid designs leverage th
 e benefits of MPI and xCCL algorithms transparently to the end user. We e
 valuated our designs on various HPC systems using OSU Micro-Benchmarks an
 d the Deep Learning framework TensorFlow with Horovod. On NVIDIA-GPU-enab
 led ThetaGPU, our designs outperformed Open MPI by 4.6x. On emerging Haba
 na Gaudi-based systems, MPI-xCCL also delivered performance similar to th
 at of vendor-provided communication runtimes.\n\nTag: Distributed Computin
 g, Mid
 dleware and System Software, Runtime Systems\n\nRegistration Category: Wor
 kshop Reg Pass\n\nSession Chairs: Barbara Chapman (Hewlett Packard Enterpr
 ise (HPE), Stony Brook University); Joseph Manzano (Pacific Northwest Nati
 onal Laboratory (PNNL)); Shirley Moore (University of Texas at El Paso); E
 unJung (EJ) Park (Qualcomm); and Joshua Suetterlein (Pacific Northwest Nat
 ional Laboratory (PNNL))\n\n
END:VEVENT
END:VCALENDAR
