BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000711Z
LOCATION:708
DTSTART;TZID=America/Denver:20231113T120000
DTEND;TZID=America/Denver:20231113T123000
UID:submissions.supercomputing.org_SC23_sess451_misc234@linklings.com
SUMMARY:Performance Portability in the Age of Extreme Heterogeneity
DESCRIPTION:John Shalf (Lawrence Berkeley National Laboratory (LBNL))\n\nM
 oore’s Law is a techno-economic model that has enabled the IT industry to 
 double the performance and functionality of digital electronics roughly ev
 ery 2 years within a fixed cost, power and area. This expectation has led 
 to a relatively stable ecosystem (e.g. electronic design automation tools,
  compilers, simulators and emulators) built around general-purpose process
 or technologies, such as the x86, ARM and Power instruction set architectu
 res. However, the historical improvements in performance offered by succes
 sive generations of lithography are waning while costs for new chip genera
 tions are growing rapidly. In the near term, the most practical path to co
 ntinued performance growth will be architectural specialization in the for
 m of many different kinds of accelerators. New software implementations, a
 nd in many cases new mathematical models and algorithmic approaches, are n
 ecessary to advance the science that can be done with these specialized
  architectures. This trend will not only continue but also intensify as
  the tr
 ansition from multi-core systems to hybrid systems has already caused many
  teams to re-factor and redesign their implementations. But the next step 
 to systems that exploit not just one type of accelerator but a full range 
 of heterogeneous architectures will require more fundamental and disruptiv
 e changes in algorithm and software approaches. This applies to the broad 
 range of algorithms used in simulation, data analysis and learning. New pr
 ogramming models or low-level software constructs that hide the details of
  the architecture from the implementation can make future programming less
  time-consuming, but they will not eliminate nor in many cases even mitiga
 te the need to redesign algorithms. Future software development will not b
 e tractable if a completely different code base is required for each diffe
 rent variant of a specialized system.\n\nThe aspirational desire for “mini
 mizing the number of lines of code that must be changed to migrate to diff
 erent systems with different arrangements of specialization” is encapsulat
 ed in the loaded phrase “Performance Portability.” However, performance po
 rtability is likely not an achievable goal if we attempt to do it using im
 perative languages like Fortran and C/C++. There is simply not enough flex
 ibility built in to the specification of the algorithm for a compiler to d
 o anything other than what the algorithm designer explicitly stated in the
 ir code. To make this future of diverse accelerators usable and accessible
  will require the co-design of new compiler technology and
  domain-specific languages (DSLs) designed around the requirements of
  the target computational motifs. The higher levels of abstraction and
  decl
 arative semantics offered by DSLs enable more degrees of freedom to optima
 lly map the algorithms onto diverse hardware than traditional imperative l
 anguages that over-prescribe the solution. Because this will drastically i
 ncrease the complexity of the mapping problem, new mathematics for optimiz
 ation will be developed, along with better performance introspection (both
  hardware and software mechanisms for online performance introspection) th
 rough extensions to the roofline model. Use of ML/AI technologies will be 
 essential to enable analysis and automation of dynamic optimizations.\n\nT
 ag: Large Scale Systems, Middleware and System Software, Programming Frame
 works and System Software\n\nRegistration Category: Workshop Reg Pass\n\nS
 ession Chairs: Dhabaleswar K. (DK) Panda (The Ohio State University), Karl
  Schulz (Advanced Micro Devices (AMD) Inc), Aamir Shafi (The Ohio State Un
 iversity), and Hari Subramoni (The Ohio State University)\n\n
END:VEVENT
END:VCALENDAR