BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000712Z
LOCATION:503-504
DTSTART;TZID=America/Denver:20231116T103000
DTEND;TZID=America/Denver:20231116T110000
UID:submissions.supercomputing.org_SC23_sess254_exforum135@linklings.com
SUMMARY:Exploring Converged HPC and AI on the Groq AI Inference Accelerato
 r
DESCRIPTION:Tobias Becker (Groq Inc)\n\nConverged compute infrastructure r
 efers to a trend where HPC clusters are set up for both AI and traditional
  HPC workloads, allowing these workloads to run on the same infrastructure
 , potentially reducing under-utilization. Here, we explore opportunities f
 or converged compute with GroqChip™, an AI accelerator optimized for runni
 ng large-scale inference workloads with high throughput and ultra-low late
 ncy. GroqChip features a Tensor Streaming architecture optimized for the
  matrix-oriented operations common in AI, but it can also efficiently run
  other applications such as linear algebra-based HPC workloads.
 \n\nWe consider two opportunities for using the Groq AI accelerator for co
 nverged HPC. The first example is a structured grid solver for Computation
 al Fluid Dynamics (CFD). This solver can run as a classical direct numeri
 cal simulation (DNS) using the pressure projection method. In a hybrid AI
  implementation, the same DNS solver is augmented with CNN-b
 ased downscaling and upscaling steps. This enables a reduction of grid siz
 e from 2048 to 64, thus significantly reducing the amount of compute neces
 sary while maintaining a similar quality of results after upscaling. A spe
 edup of three orders of magnitude comes from the combination of reducing
  the number of compute steps in the algorithm by introducing AI and accel
 erating both the CNN and DNS stages with GroqChip. The seco
 nd example is using HydraGNN for materials science and computational chem
 istry. These problems are typically solved with Density Functional Theory
  (DFT) algorithms, but recently, Graph Neural Networks (GNNs) have been e
 xplored as an alternative. For example, GNNs can be used to predict the t
 otal energy, charge density, and magnetic moment for various atom configu
 rations, identif
 ying molecules with desired reactivity. The computation requires many par
 allel walks of HydraGNN with low batch sizes, and runs 30-50x faster on G
 roqChip than on an NVIDIA A100 GPU.\n\nTag: Accelerators, Arti
 ficial Intelligence/Machine Learning, Architecture and Networks, Hardware 
 Technologies\n\nRegistration Category: Tech Program Reg Pass, Exhibits Reg
  Pass\n\nSession Chair: Jay Lofstead (Sandia National Laboratories, Univer
 sity of New Mexico)\n\n
END:VEVENT
END:VCALENDAR
