BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000713Z
LOCATION:301-302-303
DTSTART;TZID=America/Denver:20231116T143000
DTEND;TZID=America/Denver:20231116T150000
UID:submissions.supercomputing.org_SC23_sess156_pap448@linklings.com
SUMMARY:Adaptive Workload-Balanced Scheduling Strategy for Global Ocean Da
 ta Assimilation on Massive GPUs
DESCRIPTION:Junmin Xiao (Institute of Computing Technology, Chinese Academ
 y of Sciences); Chaoyang Shui (Institute of Computing Technology, Chin
 ese Academy of Sciences); and Di Cai, Kangy
 u Wang, Yunfei Pang, Mingyi Li, Hui Ma, and Guangming Tan (Institute of Co
 mputing Technology, Chinese Academy of Sciences)\n\nGlobal ocean data assi
 milation is a crucial technique to estimate the actual oceanic state by co
 mbining numerical model outcomes and observation data, which is widely use
 d in climate research. Due to the imbalanced distribution of observation d
 ata in the global ocean, the parallel efficiency of recent methods suffe
 rs fro
 m workload imbalance. When massive GPUs are applied for global ocean data 
 assimilation, the workload imbalance becomes more severe, resulting in poo
 r scalability. In this work, we propose a novel adaptive workload-balance 
 scheduling strategy for assimilation, which successfully estimates the tot
 al workload prior to execution and ensures a balanced workload assignment
 . Fu
 rther, we design a parallel dynamic programming approach to accelerate the
  schedule decision, and develop a factored dataflow to exploit the paralle
 l potential of GPUs. Evaluation demonstrates that our algorithm outperform
 s the state-of-the-art method by up to 9.1x speedup. This work is the firs
 t to scale global ocean data assimilation to 4,000 GPUs.\n\nTag: Accelerat
 ors, Algorithms, Graph Algorithms and Frameworks\n\nRegistration Category:
  Tech Program Reg Pass\n\nReproducibility Badges: Artifact Available, Arti
 fact Functional, Results Reproduced\n\nSession Chair: Alessio Sclocco (Net
 herlands eScience Center)\n\n
END:VEVENT
END:VCALENDAR
