BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000621Z
LOCATION:505
DTSTART;TZID=America/Denver:20231115T153000
DTEND;TZID=America/Denver:20231115T170000
UID:submissions.supercomputing.org_SC23_sess308@linklings.com
SUMMARY:Best ACM SRC Poster Presentations
DESCRIPTION:Supercharging Scientific Serverless:  Slashing Cold Starts wit
 h Python UniKernels\n\nServerless computing platforms use containers to cr
 eate custom and isolated execution environments.  Thus, the time to serve 
 a function in the Function-as-a-Service (FaaS) paradigm is dependent on t
 he time to load the necessary container.  FaaS platforms try to avoid "col
 d-starts", instead pre-loa...\n\n\nJamison Kerney (Illinois Institute of T
 echnology)\n---------------------\nROI Preservation in Streaming Lossy Co
 mpression\n\nToday’s state-of-the-art scientific high-performance computin
 g (HPC) applications generate extensive data in diverse domains, placing a
  significant strain on data transfer and storage systems. Most compression
  algorithms are more computationally complex, requiring more processing po
 wer and tim...\n\n\nAvinash Kethineedi (Clemson University)\n-------------
 --------\nNear-Optimal Reduce on the Cerebras Wafer-Scale Engine\n\nEffici
 ent reduce and allreduce communication collectives are crucial building bl
 ocks in many workloads, including deep learning training, and have been op
 timized for various architectures. We provide the first systematic investi
 gation of the reduce operation on the Cerebras Wafer-Scale Engine (WSE) ..
 .\n\n\nPiotr Luczynski (ETH Zürich)\n---------------------\nGenome Assembl
 y Using an Asynchronous Distributed Actor-Based Approach\n\nWe use genome 
 assembly as a representative case to showcase the use of the ‘actor model’
 , a novel programming system for high-performance data-intensive workloads
 . The actor version of the 𝑘-mer counting kernel shows on average 1.6× spe
 edup over similar MPI implementation. We pro...\n\n\nSouvadra Hati (Georgi
 a Institute of Technology)\n---------------------\nNavigating the Molecula
 r Maze:  A Python-Powered Approach to Virtual Drug Screening\n\nThe COVID-
 19 pandemic has highlighted the power of using computational methods for v
 irtual drug screening. However, the molecular search space is enormous and
  the common protein docking methods are still computationally intractable 
 without access to the world’s largest supercomputers. Instead,...\n\n\nJoh
 n Raicu (University of Chicago)\n---------------------\nAccelerating Colle
 ctive Communications with Lossy Compression on GPU\n\nGPU-aware collective
  communication has become a major bottleneck for modern computing platform
 s as GPU computing power rapidly rises. To address this issue, traditional
  approaches integrate lossy compression directly into GPU-aware collective
 s, which still suffer from serious issues such as underuti...\n\n\nJiajun 
 Huang (University of California, Riverside)\n---------------------\nChasin
 g Clouds with Donkeycar:  Holistic Exploration of Edge and Cloud Inferenci
 ng Trade-Offs in E2E Self-Driving Cars\n\nIn autonomous driving, computati
 onal resources are strained by inference models. The viability of offloadi
 ng inference to the cloud, considering latency between the car and data ce
 nter, is questioned. We introduce a Cloud-Aided Real-time Inferencing Fram
 ework, integrating with Donkeycar and distribu...\n\n\nKyle Zheng (Nationa
 l Science Foundation (NSF))\n---------------------\nA Formal Specification
  of Tensor Cores via Satisfiability Modulo Theories\n\nIn this work, we ex
 plore how to replicate the behavior of undocumented hardware units -- in t
 his case, NVIDIA's Tensor Cores -- and reason about them.\n\nWhile prior w
 ork has employed manual testing to identify hardware behavior, we show tha
 t SMT can be used to generate inputs that can discriminate be...\n\n\nBenj
 amin Valpey (University of Rochester)\n---------------------\nFile Aggrega
 tion for Asynchronous Multi-Level Checkpointing\n\nCheckpointing serves nu
 merous functionalities in modern-day HPC systems and applications. In rece
 nt years, synchronous checkpointing, which blocks the application until ch
 eckpoints are persisted to external storage, suffers rising synchronizatio
 n overheads at scale, resulting in little forward progr...\n\n\nMikaila J.
  Gossman (Clemson University)\n---------------------\nComparative Study of
  the Cache Utilization Trends for Regional Scientific Data Caches\n\nLarge
  scientific collaborations often have many users accessing the same data f
 iles, creating repeated file transfers over long distances. Data accesses 
 to the distant data sources cause long latency to the applications and can
  be further delayed due to limited network bandwidth. XCache-based in-net.
 ..\n\n\nRonak Monga (Indiana University, Lawrence Berkeley National Labora
 tory (LBNL))\n\nRegistration Category: Tech Program Reg Pass\n\nSession Ch
 air: Ana Gainaru (Oak Ridge National Laboratory (ORNL))
END:VEVENT
END:VCALENDAR
