BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210916T132446Z
LOCATION:Ernesto Bertarelli
DTSTART;TZID=Europe/Stockholm:20210708T160000
DTEND;TZID=Europe/Stockholm:20210708T163000
UID:submissions.pasc-conference.org_PASC21_sess175_pap130@linklings.com
SUMMARY:Refactoring the MPS/University of Chicago Radiative MHD (MURaM) Mo
 del for GPU/CPU Performance Portability Using OpenACC Directives
DESCRIPTION:Paper\n\nRefactoring the MPS/University of Chicago Radiative M
 HD (MURaM) Model for GPU/CPU Performance Portability Using OpenACC Directi
 ves\n\nWright, Przybylski, Miller, Suresh, Rempel...\n\nThe MURaM (Max Pla
 nck/University of Chicago Radiative MHD) code is a solar atmosphere radiat
 ive MHD model that has been broadly applied to solar phenomena ranging fro
 m quiet to active sun, including eruptive events such as flares and corona
 l mass ejections. The treatment of physics is sufficiently realistic to al
 low for the synthesis of emission from visible light to extreme UV and X-r
 ays, which is critical for a detailed comparison with available and future
  multi-wavelength observations. This synthesis relies heavily on the ra
 diation transport solver (RTS) of MURaM. The RTS is the most computation
 ally intensive component of the code. Accelerating the RTS has several b
 enefits: a faster RTS allows for the regular use of the more expen
 sive multi-ban
 d radiation transport needed for comparison with observations, and this wi
 ll pave the way for the acceleration of ongoing improvements in RTS that a
 re critical for simulations of the solar chromosphere. We present challeng
 es and strategies to accelerate a multi-physics, multi-band MURaM usin
 g a directive-based programming model, OpenACC, to maintain a single sour
 ce code across CPUs and GPUs.\n\nResults for a $288^3$ test problem sho
 w that MURaM with the optimized RTS routine achieves a 1.73x speedup usin
 g a single NVIDIA V100 GPU over a fully subscribed 40-core Intel Skyla
 ke CPU node. In terms of millions of simulation points per second, a sin
 gle NVIDIA V100 GPU is equivalent to 69 Skylake cores. We
  also measure parallel performance on up to 96 GPUs and present weak and s
 trong scaling results.\n\nDomain: Physics
END:VEVENT
END:VCALENDAR
