BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210916T132528Z
LOCATION:
DTSTART;TZID=Europe/Stockholm:20210706T173000
DTEND;TZID=Europe/Stockholm:20210706T190000
UID:submissions.pasc-conference.org_PASC21_sess182@linklings.com
SUMMARY:Poster Session
DESCRIPTION:Poster\n\nP40 - MaMiCo: Fast Non-Local Means Filtering for Mol
 ecular-Continuum HPC Flow Simulation\n\nJarmatz, Neumann\n\nThe Non-Local 
 Means (NLM) algorithm is an advanced denoising method that was developed f
 or image processing and is used in the context of 2D and 3D medical imagin
 g. In the area of multiscale computational fluid dynamics, coupled molecul
 ar-continuum flow simulations are often impaired by thermal nois...\n\n---
 ------------------\nP21 - AI for Earth System Sciences: Bottlenecks, Pitfa
 lls and Recommendations\n\nWeigel, Lüttgau, Kadow, Greenberg, Bouwer...\n\
 nDKRZ and Hereon are prototyping new services to support scientists from E
 arth system sciences who seek to apply AI methods. AI tools promise a numb
 er of major benefits for Earth system science, but wider uptake and succes
 sful applications require radically new infrastructure and support service
 s. Th...\n\n---------------------\nP11 - A Local Environment Descriptor th
 at Combines Topology and Geometry, for Atom-Resolved Off-Lattice kMC\n\nGu
 nde, Salles, Hemeryck, Martin-Samos\n\nKinetic Monte Carlo (kMC) is a mult
 i-scale computational approach which simulates evolution of a system by pr
 opagating a set of small-scale events, given by their initial and final co
 nfigurations, and a probability of event occurrence. A basic process of
  kMC
  is to parse the system of simulation and a...\n\n---------------------\nP
 15 - Loki: A Source-to-Source Translation Tool for Numerical Weather Predi
 ction Codes\n\nReuter, Lange, Marsden\n\nThe upcoming era of exascale comp
 uting promises significant improvements in model resolution and thus forec
 asting skill for numerical weather codes. However, all known or presumed c
 andidates for exascale supercomputers are going to feature novel computing
  hardware or heterogeneous architectures. Usi...\n\n---------------------\
 nP09 - Distributed Linear Algebra with (HPX) Futures\n\nInvernizzi, Nikolo
 v, Querciagrossa, Solcà\n\nThe de-facto standard library for distributed l
 inear algebra is ScaLAPACK, a library that was developed in 1995, whe
 n supercomputers were based on nodes which had a single CPU core. Since th
 en, the node architecture has evolved; nowadays, supercomputers are built 
 upon multi-socket nodes, multi-...\n\n---------------------\nP13 - Introdu
 cing NB-LIB: An Extensible, User-Friendly and High Performance Library for
  Atomistic Simulation Workflows\n\nKanduri, Keller, Rusu, Jordan, Hess\n\n
 A large number of scientific applications use particle interactions. Howev
 er, while computers have become more specialized, this has made these
  codes increasingly difficult to optimize for current and emerging HPC
  architectures. The NonBonded-LIBrary (NB-LIB) project aims to addr...
 \n\n---------------------\nP22 - CPU and FPGA Performance Compariso
 n of a Conjugate Gradient Solver Extracted from a Molecular Dynamics Code\
 n\nProuveur, Haefele, Voss\n\nFPGA devices used in the HPC context promise
  an increased energy efficiency, enhancing the computing system's Flop/W
  ra
 te. This work compares an FPGA and a CPU implementation of a conjugate gra
 dient solver in terms of both time to solution and energy to solution metr
 ics. The starting point is MetalWa...\n\n---------------------\nP36 - High
  Performance Training for Support Vector Machines\n\nCormican\n\nThis proj
 ect developed a new method for parallel training of support vector ma
 chines using the conjugate gradient algorithm. The work built on the seria
 l implementation proposed by Wen, Tong, et al. (2003). The algorithm h
 as been implemented in C and parallelised using ...\n
 \n---------------------\nP33 - PSIP Toolkit: A Lightweight Process for Inc
 remental Software Process Improvement\n\nRaybourn, Gonsiorowski, Milewicz,
  Rogers, Sims...\n\nMany scientific software development teams experience 
 challenges in their development practices. They may or may not be re
 cognized as “pain points”, but nevertheless can adversely
  impact the team’s development or scientific productivity, sustainab
 ility of their software,...\n\n---------------------\nP01 - Neural Nets as
  an Aid for Constructing Tangent-Linear and Adjoint Models\n\nHatfield, Ch
 antry, Dueben\n\nLinearised numerical models have a number of uses in comp
 utational science, notably in numerical weather prediction. Tangent-linear
  models allow us to evolve perturbations forwards in time, whereas adjoint
  models allow us to propagate gradients backwards in time. Both are essent
 ial for the increment...\n\n---------------------\nP05 - Trajectory-Based 
 Machine Learning Method for Molecular Dynamics\n\nHan\n\nA trajectory-base
 d machine learning package (TrajML) for molecular dynamics (MD) was develo
 ped for instant modeling of the force field and prediction of molecular co
 nfigurations for MD trajectories. In the code, the ML features were constr
 ucted in a way that the ML processes were independent for eac...\n\n------
 ---------------\nP04 - Tasmania: Towards a Python-Based Approach to Atmosp
 heric Modeling\n\nUbbiali, Bianco, Gonzalez Paredes, Groner, Sawyer...\n\n
 We present the Tasmania framework to ease the development of atmospheric m
 odels in Python. Tasmania features a component-based architecture, where e
 ach component represents either the dynamical core or a physical parameter
 ization. The library provides different couplers which mold the components
  int...\n\n---------------------\nP30 - Improving the Flow: Introducing th
 e Lucent Dataflow Programming Language for FPGAs\n\nBrown\n\nDataflow arch
 itectures, and specifically Field Programmable Gate Arrays (FPGAs), are de
 monstrating early but significant potential benefits to HPC. Whilst the co
 mmunity has explored this technology before, recent advances in hardware c
 apability and software environments have made them a far more ser...\n\n--
 -------------------\nP10 - SpFFT: A Library for Distributed Computation of
  Sparse 3D FFT with GPU Acceleration\n\nFrasch\n\nFast Fourier Transformat
 ions are an essential part of many applications and libraries for the gene
 ral dense case are widely used. However, some applications can benefit fro
 m more specialized implementations. In particular, data with spherical cut
 off is required in some computational material science...\n\n-------------
 --------\nP34 - A Direct Method to Assess Floating-Point Accuracy\n\nDemeu
 re, Chevalier, Denis, Dossantos\n\nFloating-point numbers represent only a
  subset of real numbers. As such, floating-point arithmetic introduces app
 roximations that can compound and have a significant impact on numerical s
 imulations. We introduce a new way to estimate the numerical error of an
  ap
 plication and provide a reference imple...\n\n---------------------\nP27 -
  Single- and Two-Level Dynamic Load Balancing of Scientific Applications\n
 \nMohammed, Eleliemy, Müller Korndörfer, Cabezón, Ciorba\n\nModern high pe
 rformance computing (HPC) systems exhibit increased hardware parallelism a
 t node level (hundreds of compute cores) and at system level (hundreds of 
 nodes). Exploiting this hardware parallelism across and within comput
 e nodes, optimally and concurrently, is challenging. Computation...\n\n---
 ------------------\nP24 - Fully Resolved Lattice Boltzmann Simulations of 
 Turbulent Flow Through Porous Media\n\nSchwarzmeier, Rüde, Ambekar, Buwa\n
 \nWe present simulations of turbulent fluid flow through porous media usin
 g a parallel lattice Boltzmann method. The model porous medium is construc
 ted with randomly aligned spheres, the arrangement of which is computed wi
 th a rigid body dynamics simulation of a sedimentation process. We apply t
 he cum...\n\n---------------------\nP28 - Using AiiDA for Gas Adsorption: 
 Open Source, Reproducibility & Automation\n\nOngari, Yakutovich, Talir
 z, Smit\n\nThe challenge of performing gas adsorption and separation using
  nanoporous materials can be seen as a purely combinatorial problem. On th
 e one hand we have tens of industrially important applications, ranging fr
 om carbon capture to noble gas separation. On the other hand we have th
 ousands of new ...\n\n---------------------\nP16 - MARS: Mesh Adaptive Ref
 inement for Supercomputing\n\nGanellari, Zulian, Fink, Fadel, Cumming...\n
 \nMARS is an open-source mesh management library designed to handle N-dime
 nsional elements (N <= 4). MARS is developed in C++ and makes use 
 of template meta-programming to have compile time dimensions of elements a
 nd vectors, thus allowing for both compile time performance optimizations 
 and co...\n\n---------------------\nP20 - GHEX: Performance Portable Commu
 nication Layer for Grid Applications\n\nBettiol, Bianco, Bösch, Krotkiewsk
 i\n\nGrid-based PDE solvers are amongst the most widely used numerical met
 hods in scientific HPC, e.g., in atmospheric sciences, astrophysics, geolo
 gy. In these applications halo exchange is performed often, impacting the 
 strong scalability of the applications at large scale. Consequently, the h
 alo excha...\n\n---------------------\nP17 - Middleware for Memory and Dat
 a-Awareness in Workflows (Maestro)\n\nTessier, Haus, Pleiter, Arenaz\n\nHi
 gh Performance Computing (HPC) and High Performance Data Analytics (HPDA) 
 opens up the opportunity to solve a wide variety of questions and challeng
 es. The number and complexity of challenges that HPC and HPDA can help wit
 h are limited by the performance of computer software and hardware. Increa
 si...\n\n---------------------\nP19 - Scalable Distributed Memory Implemen
 tation of Dissipative Quantum Dynamics Subject to a Non-Markovian Environm
 ent\n\nOvcharenko, Fingerhut\n\nThe dynamics of a quantum system in contac
 t with environment is central to the understanding of numerous processes, 
 e.g., the ultrafast dynamics of photoexcited biological systems or operati
 on of qubits in quantum computers. Established numerical methods for the r
 eal time description of dissipative ...\n\n---------------------\nP26 - CO
 SMA: Accelerating Electronic Structure Calculations without Changing the C
 ode\n\nKabic, VandeVondele\n\nA building block of many Electronic Structur
 e Calculations, that is often dominating the total runtime, is a distribut
 ed multiplication of dense matrices, or tensors. As an example, when simul
 ating 128 water molecules with CP2K code, using the Random Phase Approxima
 tion Method (RPA), dense matrix mu...\n\n---------------------\nP06 - Best
  Practice for Efficient and Scalable Application Performance\n\nWylie, Gar
 cia-Gasulla\n\nNowadays, the whole HPC community is looking forward to the
  exascale era. On the one hand, computer and system architects are investi
 ng their effort to design the first exascale supercomputer. On the other h
 and, application developers and scientists from different fields relying o
 n HPC struggle to p...\n\n---------------------\nP18 - Performance Analysi
 s and Source-Code Instrumentation Toolsuite (PASCIT)\n\nGerbes, Kunkel\n\n
 The Performance Analysis and Source-Code Instrumentation Toolsuite (PASCIT
 ) will ultimately optimize source code for HPC compute dwarfs and analyze 
 the inefficiencies of compilers to increase performance. Compilers often l
 ack the opportunity of high-level optimization potential in General Purpos
 e La...\n\n---------------------\nP03 - Ginkgo: A Node-Level Sparse Linear
  Algebra Library for High Performance Computing\n\nAnzt, Cojean, Gruetmach
 er, Nayak, Ribiel...\n\nWith the rise of manycore accelerators like GPUs, 
 there exists an increasing demand for linear algebra libraries that can ef
 ficiently exploit the concurrency and performance available in a single co
 mpute node. At the same time, more and more application projects move towa
 rds an object-oriented softw...\n\n---------------------\nP38 - Extreme-Sc
 ale Tile Low-Rank Cholesky Factorization Using the PaRSEC Task-Based Runti
 me\n\nCao, Pei, Akbudak, Mikhalev, Bosilca...\n\nThis work investigates th
 e necessary capabilities of a task-based runtime for efficient low-rank ma
 trix computations. Unlike their dense counterparts, variable tile ranks in
  low-rank computations generate a significant computational, memory and co
 mmunication load imbalance, dependent on the input da...\n\n--------------
 -------\nP07 - Quadrature-Free Discontinuous Galerkin Method for Shallow-W
 ater Equations\n\nFaghih-Naini\n\nThe discontinuous Galerkin (DG) methods 
 are already well-established in nearly all areas of computational and geop
 hysical fluid dynamics. Their strengths include the ability to use high-or
 der approximation spaces, robustness for problems with shocks and disconti
 nuities, natural support of h- and p-a...\n\n---------------------\nP43 - 
 An Ensemble-Based Statistical Methodology to Detect Differences in Weather
  and Climate Model Executables\n\nZeman, Schär\n\nSince their first operat
 ional application in the 1950s, atmospheric numerical models have become e
 ssential tools in weather and climate prediction. As such, they are a cons
 tant subject to changes, thanks to advances in computer systems, numerical
  methods, more and better observations, and the increa...\n\n-------------
 --------\nP37 - A FLASH5 Orchestration System for Exposing and Capitalizin
 g on Hierarchies of Parallelism\n\nDubey, O'Neal, Wahib, Weide\n\nFLASH5 i
 s a highly-composable, multiscale, multiphysics software package faced wit
 h a difficult transition to the next generation of high-performance comput
 ing platforms with heterogeneous nodes. These platforms pose a significant
  challenge to running a wide variety of substantial problems while mak...\
 n\n---------------------\nP14 - lbmpy: Fast and Flexible Multi-Phase Latti
 ce Boltzmann Simulations for High Density Ratios with Code Generation\n\nH
 olzer, Bauer, Rüde\n\nWe present a multiphase Lattice Boltzmann method bas
 ed on the conservative Allen Cahn model. This approach is suitable for hig
 h density ratios and high Reynolds numbers. The code generation framework 
 lbmpy is used to produce optimized code for CPUs and GPUs. A roofline anal
 ysis demonstrates the exce...\n\n---------------------\nP25 - Implementati
 on of the Performance-Portable ICON Model\n\nSawyer, Lapillonne, Alexeev, 
 Dietlicher, Kornblueh...\n\nThe ICON modeling framework is a unified numer
 ical weather and climate model used for operational numerical weather pred
 iction as well as low- and high-resolution climate projection. It utilizes
  the Message-Passing Interface (MPI) for domain decomposition and has been
  extensively optimized for OpenM...\n\n---------------------\nP35 - Optimi
 sed Allgatherv, Reduce_scatter and Allreduce Communication in Message-Pass
 ing Systems\n\nJocksch, Ohana, Lanti, Karakasis, Villard\n\nCollective com
 munications, namely the patterns allgatherv, reduce_scatter, and allreduce
  in message-passing systems are optimised based on measurements done at th
 e installation time of the library. The algorithms used are set up in an i
 nitialisation phase of the communication, similar to the method ...\n\n---
 ------------------\nP29 - Don’t Compete, Let’s Cooperate: A
  Cooperative Scheduling Approach\n\nEleliemy, Ciorba\n\nScientific
  application developers are concerned with improving their
  applications' execution time and less concerned with increasing H
 PC systems' utilization. HPC system operators prioriti
 ze increased system utilization over improved individual appli...\n\n
 ---------------------\nP39 - Battling Lock Contention - One at a Time\n\nP
 feiler, Haensel, Morgenstern, Beckmann, Kabadshow\n\nThe abundance of avai
 lable compute resources on modern HPC systems is a challenge for algorithm
 s requiring strong-scaling. With dozens or even hundreds of cores per node
  compute-bound problems can easily shift to be bound by synchronization ov
 erheads. Fine-grained parallelism and concurrency control...\n\n----------
 -----------\nP02 - Porting Nek5000 on GPUs Using OpenACC and CUDA\n\nJocks
 ch, Gong, Jansson, Peplinski, Gray...\n\nNek5000 is a spectral element cod
 e for fluid dynamics applications. We revisit the existing OpenACC port [1
 ,2] and obtain speedup of 40% by rearrangement of loops. A distinctive fea
 ture of the code is small dense matrix-matrix multiplications leading to a
 n irregular memory access pattern. The most d...\n\n---------------------\
 nP23 - Robust Wave Function Optimization in SIRIUS Accelerated QuantumESPR
 ESSO\n\nPintarelli, Frasch, Taillefumier, Kozhevnikov, Huber...\n\nThe sel
 f-consistent iterative algorithms commonly used in DFT calculations,
  based for example on charge-density or potential mixing, are not
  guaranteed to c
 onverge. The success rate of the iterative approach, observed by the THEOS
  group at EPFL, for a single-point calculation is around 90%, and for l...
 \n\n---------------------\nP32 - StencilFlow: Mapping Large Stencil Progra
 ms to Distributed Dataflow Systems\n\nKuster, de Fine Licht, Hoefler\n\nAc
 curate and reliable weather forecast is of vital importance for a broad fi
 eld of industries and the general public. Highly regular and statically an
 alyzable stencil operators on structured grids are used to numerically sol
 ve the partial differential equations of such weather prediction models. S
 in...\n\n---------------------\nP08 - Executing Containers in HPC Systems 
 with Udocker\n\nDavid, Gomes, Campos\n\nudocker is a tool to execute Linux
  containers in user space. The tool is self contained and it does not have
  other dependencies except the python standard library. As such, there is 
 no need to install any additional software nor require administrative pri
 vileges. The execution of a given contain...\n\n---------------------\n
 P12 - Stochastic Simulations with Time-Dependent Parameters to Improve UQ 
 in Conceptual Hydrological Models\n\nBacci, Dal Molin, Fenicia, Sukys, Rei
 chert\n\nWater resources affect human activities in many different ways. R
 ainfall, catchment dynamics, and human interventions are crucial to contro
 l water quality and availability. Hence, to make precise forecasts and tak
 e informed decisions, it is important to accurately model the behavior of 
 river basins,...\n\n---------------------\nP41 - Adaptive Execution Planni
 ng in Biomedical Workflow Management Systems\n\nJaros, Treeby, Jaros\n\nBi
 omedical simulations require very powerful computers. Their execution is d
 escribed by a workflow consisting of a number of different cooperating tas
 ks. The manual execution of individual tasks may be tedious for expert use
 rs, but prohibitive for most inexperienced clinicians. k-Dispatch offers a
  &...\n\n---------------------\nP42 - Database Optimization Techniques for
  Scientific Applications\n\nYellapragada, Yellapragada\n\nDatabases have p
 roven to be a very important component in both commercial and scientific a
 reas. Traditional techniques to optimize database systems have only been h
 elpful in managing structured or semi-structured data. However, many scien
 tific applications like climate observation or biochemical sim...\n
END:VEVENT
END:VCALENDAR
