BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260422T000712Z
LOCATION:DEF Concourse
DTSTART;TZID=America/Denver:20231114T100000
DTEND;TZID=America/Denver:20231114T170000
UID:submissions.supercomputing.org_SC23_sess289_spostu106@linklings.com
SUMMARY:Scaling Studies for Efficient Parameter Search and Parallelism for
  Large Language Model Pretraining
DESCRIPTION:Chris Pierre Paul (Oak Ridge Institute for Science and
  Education, Florida State University) and Leo Phan (Oak Ridge
  Institute for Science and Education, George Washington
  University)\n\nAI accelerator processing and memory constraints
  largely dictate the scale at which machine learning workloads
  (training and inference) can be executed within a desirable time
  frame. Training a transformer-based model requires harnessing HPC at
  every level, from the parallelism inherent in processor design to
  deliberate modifications of neural networks that increase concurrency
  during training and inference. Our model is the culmination of
  performance tests seeking the ideal combination of frameworks and
  configurations for training a 13-billion-parameter translation model
  for foreign languages. We performed ETL over the corpus, which
  involved constructing a balanced, interleaved dataset. We
  investigated the impact of batch size, learning rate, and different
  forms of precision on model training time, accuracy, and memory
  consumption. We use DeepSpeed ZeRO Stage 3 and Hugging Face
  Accelerate to parallelize our model. Our model, based on the mT5
  architecture, is trained on mC4 and language-specific datasets,
  enabling question answering during fine-tuning.\n\nRegistration
  Category: Tech Program Reg Pass, Exhibits Reg Pass\n\n
END:VEVENT
END:VCALENDAR
