BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000616Z
LOCATION:403-404
DTSTART;TZID=America/Denver:20231115T103000
DTEND;TZID=America/Denver:20231115T120000
UID:submissions.supercomputing.org_SC23_sess161@linklings.com
SUMMARY:Handling Hardware Faults
DESCRIPTION:Unity ECC: Unified Memory Protection Against Bit and Chip Erro
 rs\n\nDRAM vendors utilize On-Die Error Correction Codes (OD-ECC) to corre
 ct random bit errors internally. Meanwhile, system companies utilize Rank-
 Level ECC (RL-ECC) to protect data against chip errors. Separate protectio
 n increases the redundancy ratio to 32.8% in DDR5 and incurs significant p
 erformance...\n\n\nDongwhee Kim, Jaeyoon Lee, and Wonyeong Jung (Sungkyunk
 wan University); Michael Sullivan (NVIDIA Corporation); and Jungrae Kim (S
 ungkyunkwan University)\n---------------------\nDesign Considerations and 
 Analysis of Multi-Level Erasure Coding in Large-Scale Data Centers\n\nMult
 i-level erasure coding (MLEC) has seen large deployments in the field, but
  there is no in-depth study of design considerations for MLEC at scale. In
  this paper, we provide comprehensive design considerations and analysis o
 f MLEC at scale. We introduce the design space of MLEC in multiple dimensi
 ...\n\n\nMeng Wang, Jiajun Mao, and Rajdeep Rana (University of Chicago); 
 John Bent (Los Alamos National Laboratory (LANL)); Serkay Olmez (Seagate R
 esearch); Anjus George (Oak Ridge National Laboratory (ORNL)); Garrett Wil
 son Ransom (Los Alamos National Laboratory (LANL)); Jun Li (CUNY Queens Co
 llege & Graduate Center); and Haryadi S. Gunawi (University of Chicago)\n-
 --------------------\nUnderstanding the Effects of Permanent Faults in GPU
 ’s Parallelism Management and Control Units\n\nModern Graphics Processing 
 Units (GPUs) demand life expectancy extended to many years, exposing the h
 ardware to aging (i.e., permanent faults arising after the end-of-manufact
 uring test). Hence, techniques to assess permanent fault impacts in GPUs a
 re strongly required, especially in safety-critical...\n\n\nJuan David Gue
 rrero Balaguera and Josie Esteban Rodriguez Condia (Politecnico di Torino)
 ; Fernando Fernandes dos Santos (University of Rennes, Inria Rennes - Bret
 agne Atlantique Research Centre); Matteo Sonza Reorda (Politecnico di Tori
 no); and Paolo Rech (University of Trento)\n\nTag: Accelerators, Architect
 ure and Networks, Data Analysis, Visualization, and Storage, Fault Handlin
 g and Tolerance\n\nRegistration Category: Tech Program Reg Pass\n\nAward F
 inalist: Best Student Paper Finalist\n\nReproducibility Badges: Artifact A
 vailable, Artifact Functional, Results Reproduced\n\nSession Chair: Ignaci
 o Laguna (Lawrence Livermore National Laboratory (LLNL))
END:VEVENT
END:VCALENDAR
