BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210916T132447Z
LOCATION:Henry Dunant
DTSTART;TZID=Europe/Stockholm:20210709T120000
DTEND;TZID=Europe/Stockholm:20210709T123000
UID:submissions.pasc-conference.org_PASC21_sess188_pap136@linklings.com
SUMMARY:Ensuring Statistical Reproducibility of Ocean Model Simulations in
  the Age of Hybrid Computing
DESCRIPTION:Paper\n\nEnsuring Statistical Reproducibility of Ocean Model S
 imulations in the Age of Hybrid Computing\n\nMahajan\n\nNovel high-perform
 ance computing systems that feature hybrid architectures require large-sca
 le code refactoring to unravel underlying exploitable parallelism. Such re
 design can often be accompanied by machine-precision changes as the orde
 r of computation cannot always be maintained. For chaotic systems like cli
 mate models\, these round-off level differences can grow rapidly. Syste
 mati
 c errors may also manifest initially as machine-precision differences. Iso
 lating genuine round-off level differences from such errors remains a chal
 lenge. Here\, we apply two-sample equality-of-distribution tests to eval
 uat
 e statistical reproducibility of the ocean model component of the US Depar
 tment of Energy’s Energy Exascale Earth System Model (E3SM). A 2-year co
 ntrol simulation ensemble is compared to a modified ensemble as a test cas
 e – after a known non-bit-for-bit change in a model component is int
 roduced – to evaluate the null hypothesis that the two ensembles are
  statistically indistinguishable. To quantify the false negative rates of 
 these tests\, we conduct a formal power analysis using a targeted suite o
 f short simulation ensembles. The ensemble suite contains several perturb
 ed ensembles\, each with a progressively different climate than the basel
 ine e
 nsemble - obtained by perturbing the magnitude of a single model tuning pa
 rameter\, the Gent and McWilliams $\\kappa$\, in a controlled manner. Th
 e nul
 l hypothesis is evaluated for each of the perturbed ensembles using the
 se test
 s. The power analysis informs on the detection limits of the tests for a g
 iven ensemble size\, allowing model developers to evaluate the impact o
 f an in
 troduced non-bit-for-bit change to the model.\n\nDomain: CS and Math\, Cli
 mate and Weather
END:VEVENT
END:VCALENDAR
