BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000713Z
LOCATION:401-402
DTSTART;TZID=America/Denver:20231114T163000
DTEND;TZID=America/Denver:20231114T170000
UID:submissions.supercomputing.org_SC23_sess179_pap290@linklings.com
SUMMARY:Interference-Aware Multiplexing for Deep Learning in GPU Clusters:
  A Middleware Approach
DESCRIPTION:Wenyan Chen (University of Macau; Shenzhen Institute of Advanc
 ed Technology, Chinese Academy of Sciences); Zizhao Mo and Huanle Xu (Univ
 ersity of Macau); Kejiang Ye (Shenzhen Institute of Advanced Technology, C
 hinese Academy of Sciences); and Chengzhong Xu (University of Macau)\n\nA
  common strategy for improving efficiency in training deep learning model
 s entails multiplexing tasks on a single GPU. To mitigate the interferenc
 e caused b
 y multiplexing, existing approaches primarily employ kernel-level solution
 s to regulate GPU kernel execution, or harness hardware-level techniques t
 o explicitly restrict GPU streaming multiprocessors and memory. Neverthele
 ss, none of them perform satisfactorily in optimizing the completion time 
 of tasks.\n\nIn this paper, we present IADeep, a middleware solution desig
 ned to significantly improve multiplexing efficiency. The core concept is 
 the co-optimization of task assignments within a cluster and interference 
 mitigation on each device. IADeep coordinates the configuration of all co-
 located tasks at a coarser granularity, effectively reducing interference
  and enhancing task training performance. Across the entire cluster, 
 IADeep intelligently selects applications suitable for multiplexing to fur
 ther amplify the advantages of optimizing task configurations. Evaluations
  on a 20 RTX 3090-GPU cluster demonstrate that IADeep can significantly ou
 tperform state-of-the-art multiplexing solutions.\n\nTags: Accelerators, D
 istributed Computing, Middleware and System Software, Performance Measurem
 ent, Modeling, and Tools, Post-Moore Computing\n\nRegistration Category: Te
 ch Program Reg Pass\n\nAward Finalist: Best Paper Finalist\n\nReproducibil
 ity Badges: Artifact Available, Artifact Functional, Results Reproduced\n\
 nSession Chair: Hari Subramoni (The Ohio State University)\n\n
END:VEVENT
END:VCALENDAR
