BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Linklings LLC//NONSGML Linklings//EN
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000713Z
LOCATION:505
DTSTART;TZID=America/Denver:20231112T113500
DTEND;TZID=America/Denver:20231112T114000
UID:submissions.supercomputing.org_SC23_sess421_ws_rsdha103@linklings.com
SUMMARY:Evaluating Primitives in Deep Neural Network Libraries: A Case St
 udy with the Softmax Functions
DESCRIPTION:Zheming Jin (Oak Ridge National Laboratory (ORNL)) and Jeffrey
  Vetter (IEEE Computer Society)\n\nA deep neural network library (DNNL) is
  an optimized library of low-level computational primitives for deep neura
 l networks. In this study, we choose the softmax function, a primitive com
 monly used in new computing models for DNNs, as a case study on evaluating
  the unique programming models adopted by the vendors’ DNNLs (cuDNN, MIOpe
 n, and oneDNN) and the performance and portability of DNNLs on NVIDIA and 
 AMD GPUs. We find that cuDNN selects different compute kernels to execute 
 based on the problem size for the primitive, which may have a significant 
 performance impact. oneDNN successfully enables functional portability of 
 the primitive across vendors’ platforms, but performance portability will 
 need to be improved. In addition, the performance of a primitive in the DN
 NLs may be suboptimal compared to a custom implementation.\n\nTag: Acceler
 ators, Edge Computing, Heterogeneous Computing\n\nRegistration Category: W
 orkshop Reg Pass\n\nSession Chairs: Ali Akoglu (University of Arizona), Me
 hmet E Belviranli (Colorado School of Mines), and Seyong Lee (Oak Ridge Na
 tional Laboratory (ORNL))
END:VEVENT
END:VCALENDAR
