BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210916T132456Z
LOCATION:
DTSTART;TZID=Europe/Stockholm:20210706T173000
DTEND;TZID=Europe/Stockholm:20210706T190000
UID:submissions.pasc-conference.org_PASC21_sess182_post172@linklings.com
SUMMARY:P35 - Optimised Allgatherv\, Reduce_scatter and Allreduce Communic
 ation in Message-Passing Systems
DESCRIPTION:Poster\n\nP35 - Optimised Allgatherv\, Reduce_scatter and A
 llreduce Communication in Message-Passing Systems\n\nJocksch\, Ohana\, L
 anti\, Karakasis\, Villard\n\nCollective communications in message-pas
 sing systems\, namely the allgatherv\, reduce_scatter\, and allreduce p
 atterns\, are optimised based on measurements taken when the library i
 s installed. The algorithms are set up in an initialisation phase of t
 he communication\, similar to the persistent collective communication i
 ntroduced in the literature. For allgatherv and reduce_scatter\, the e
 xisting algorithms\, recursive multiply/divide and cyclic shift (Bruck
 's algorithm)\, are applied with a flexible number of communication po
 rts per node. The algorithms for equal message sizes are also used wit
 h non-equal message sizes\, together with a heuristic for rank reorder
 ing. These two communication patterns are applied in a plasma physics a
 pplication that uses a specialised matrix-vector multiplication. For t
 he allreduce pattern\, the cyclic shift algorithm is applied with a pr
 efix operation. The data is gathered and scattered by the cores withi
 n each node\, and the communication algorithms are applied across the n
 odes. In general\, our routines either outperform their non-persisten
 t counterparts in established MPI libraries by up to one order of mag
 nitude or show equal performance\, with a few exceptions for particul
 ar node counts and message sizes.
END:VEVENT
END:VCALENDAR
