BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000711Z
LOCATION:405-406-407
DTSTART;TZID=America/Denver:20231117T084000
DTEND;TZID=America/Denver:20231117T093000
UID:submissions.supercomputing.org_SC23_sess462_misc226@linklings.com
SUMMARY:Keynote: Empowering Large AI Models Based on Heterogeneous Memory
DESCRIPTION:Dong Li (University of California, Merced)\n\nThe size of
 large artificial intelligence (AI) models has increased by at least
 100x in the past few years, leading to memory consumption at the scale
 of hundreds of GBs and even TBs. Recent advances in heterogeneous
 memory (HM) provide a cost-effective approach to increasing memory
 capacity. Using external memory (e.g., CXL memory expansion and the
 memory of GPU-like accelerators) as an extension to GPU memory, we can
 build an HM that enables large-scale AI model inference and training
 without using extra GPUs to accommodate large memory consumption.
 However, not only does HM impose challenges on tensor allocation and
 migration within HM itself, but it is also unclear how HM affects
 training/inference throughput. AI model workloads possess unique
 characteristics in memory access patterns and data structures, which
 place challenges on the promptness of data migration, load balancing,
 and tensor redundancy on the GPU. In this talk, I will discuss the
 work we have done to optimize the management of HM for large language
 models and graph neural networks. The key insight in our designs is to
 leverage AI domain knowledge to reconcile the tensions between
 multiple design targets (e.g., minimizing tensor migration volume and
 maintaining high system throughput). Finally, I will discuss the
 opportunities and challenges for future HM management in the era of
 large generative models.\n\nTag: Data Movement and Memory, Fault
 Handling and Tolerance, Hardware Technologies, Heterogeneous
 Computing, I/O and File Systems, Performance Measurement, Modeling,
 and Tools, Programming Frameworks and System Software,
 Security\n\nRegistration Category: Workshop Reg Pass\n\nSession
 Chairs: João Barreto (INESC-ID, IST, University of Lisbon); Hatem
 ElShazly (Sony); Antonio J. Peña (Barcelona Supercomputing Center
 (BSC); Universitat Politècnica de Catalunya (UPC), Spain); and Harald
 Servat (Intel Corporation)\n\n
END:VEVENT
END:VCALENDAR
