BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000712Z
LOCATION:301-302-303
DTSTART;TZID=America/Denver:20231116T133000
DTEND;TZID=America/Denver:20231116T140000
UID:submissions.supercomputing.org_SC23_sess156_pap294@linklings.com
SUMMARY:Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Me
 thods
DESCRIPTION:Jingrong Zhang, Akira Naruse, Xipeng Li, and Yong Wang (NVIDIA
  Corporation)\n\nThe top-K problem is an essential part of many important 
 applications in scientific computing, information retrieval, etc. As data 
 volume grows rapidly, high-performance parallel top-K algorithms become cr
 itical. We propose two parallel top-K algorithms, AIR top-K (Adaptive and 
 Iteration-fused Radix top-K) and GridSelect, for GPU. AIR top-K employs an
  iteration-fused design to minimize CPU-GPU communication and device data 
 access. Its adaptive strategy eliminates unnecessary device memory traffic
  automatically under various data distributions. GridSelect can process da
 ta on-the-fly. It adopts a shared queue and parallel two-step insertion to
  decrease the frequency of costly operations. We comprehensively compare 8
  open-source GPU implementations and our methods for a wide range of probl
 em sizes and data distributions. For batch sizes 1 and 100, respectively, 
 AIR top-K shows 1.98-21.48X and 8.01-574.78X speedup over the previous
  radix top-K algorithm, and 1.44-7.34X and 1.38-31.91X speedup over
  state-of-the-art methods. GridSelect shows up to 882.29X speedup over
  its baseline.\n\nTag: Accelerators, Algorithms, Graph Algorithms and
  Frameworks\n\nRegistration Category: Tech Program Reg Pass\n\n
 Reproducibility Badges: Artifact Available, Artifact Functional, Results
  Reproduced\n\nSession Chair: Alessio
  Sclocco (Netherlands eScience Center)\n\n
END:VEVENT
END:VCALENDAR
