BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000713Z
LOCATION:505
DTSTART;TZID=America/Denver:20231116T163000
DTEND;TZID=America/Denver:20231116T164500
UID:submissions.supercomputing.org_SC23_sess310_drs120@linklings.com
SUMMARY:Enabling Reproducibility and Scalability of Scientific Workflows i
 n HPC and Cloud
DESCRIPTION:Paula Olaya (University of Tennessee)\n\nScientific communitie
 s across fields like earth science, biology, and materials science increas
 ingly run complex workflows for their scientific discovery. We work closel
 y with these communities to leverage high-performance computing (HPC), big
  data analytics, and artificial intelligence/machine learning (AI/ML) to i
 ncrease and accelerate their workflows’ productivity. Our work addresses t
 he new challenges brought about by this optimization process.\n\nWe identi
 fy three main challenges in these workflows: i) they integrate AI/ML metho
 ds with limited transparency and include many interoperable components (da
 ta and applications) that are hard to trace and reuse to reproduce results
 ; ii) they hide the complexity of large intermediate data and their overal
 l execution can be affected by the I/O bandwidth of the underlying infrast
 ructure; and iii) they run on heterogeneous and distributed infrastructure
  with data and application dependencies that require efficient data manage
 ment and resource allocation.\n\nTo address these challenges, we provide s
 olutions that leverage the convergence between high-performance and cloud 
 computing. First, we design and develop fine-grained containerized environ
 ments that enable data traceability and results explainability by automati
 cally annotating and seamlessly attaching provenance information. Second, 
 since the workflows are already containerized, we integrate them in HPC an
 d cloud-native infrastructure and tune the storage technology to enable be
 tter I/O and data scalability. Finally, we orchestrate the end-to-end exec
 ution of workflows, ensuring efficient allocation of infrastructure resour
 ces and intermediate data management, and supporting reproducibility and r
 eusability of workflows’ executions.\n\nTag: Accelerators, Applications, C
 loud Computing, Data Compression, Heterogeneous Computing, I/O and File Sy
 stems, Reproducibility, Software Engineering\n\nRegistration Category: Tec
 h Program Reg Pass\n\nSession Chairs: André Brinkmann (Johannes Gutenberg 
 University Mainz) and Xubin He (Temple University, Department of Computer 
 and Information Sciences)\n\n
END:VEVENT
END:VCALENDAR
