Question: YAXT communication overlapping calculation
I have a YAXT-related question and would appreciate some best-practice advice. In short, I would like to avoid an additional copy of a data array, which currently seems to be required.
My program overlaps communication with computation.
- It operates on a given multi-dimensional floating-point array `arr(:,:,:)`.
- `arr` is MPI-decomposed (in some irregular fashion).
- A halo synchronization pattern for `arr` has been implemented with YAXT.
Now, the overall program is as follows:
1. do some calculation on halo indices in "arr"
2. YAXT: launch an asynchronous "xt_redist_a_exchange"
3. do the remaining calculations (for the non-halo points)
4. perform stencil evaluations on non-halo points
5. YAXT: complete the asynchronous data exchange launched in step 2
6. do the remaining stencil evaluations using the received halo values
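To make the intended ordering concrete, here is a minimal, language-neutral sketch of the six steps on a single array. It does not use YAXT: `start_exchange` and `wait_exchange` are hypothetical stand-ins for launching and completing the asynchronous exchange, and the 1-D periodic layout, the `compute` placeholder, and the three-point stencil are assumptions made only for illustration. Step 3 is the write into the single array that this question is about.

```python
def start_exchange(arr, pairs):
    # Hypothetical stand-in for launching a non-blocking exchange:
    # nothing is copied yet, the transfer is only recorded.
    return {"arr": arr, "pairs": pairs}


def wait_exchange(request):
    # Hypothetical stand-in for completing the exchange: the halo
    # entries are only guaranteed to be filled after this returns.
    arr = request["arr"]
    for src, dst in request["pairs"]:
        arr[dst] = arr[src]


def compute(i):
    # Placeholder for the real per-point calculation (assumption).
    return float(i * i)


def run(n):
    # Layout (assumption): index 0 and n + 1 are halo cells,
    # indices 1..n are owned points; requires n >= 2.
    arr = [0.0] * (n + 2)
    out = [0.0] * (n + 2)
    interface = [1, n]            # owned points that are sent
    interior = range(2, n)        # owned points that are not sent
    pairs = [(n, 0), (1, n + 1)]  # (source, destination) of the halo update

    for i in interface:                                   # step 1
        arr[i] = compute(i)
    request = start_exchange(arr, pairs)                  # step 2
    for i in interior:                                    # step 3
        arr[i] = compute(i)
    for i in interior:                                    # step 4
        out[i] = arr[i - 1] - 2.0 * arr[i] + arr[i + 1]
    wait_exchange(request)                                # step 5
    for i in interface:                                   # step 6
        out[i] = arr[i - 1] - 2.0 * arr[i] + arr[i + 1]
    return arr, out
```

In this sketch, step 3 writes only to indices that are neither sources nor destinations of the exchange, which is exactly the guarantee mentioned in the question.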
The problem is that I cannot fill individual entries of `arr` in step 3 while the asynchronous communication is under way, even if it can be guaranteed that these entries are not destination points of the YAXT communication.
The only remedy I can think of is splitting `arr` into a halo-indices part `arr_ifc` and an interior part `arr_interior`. Furthermore, in order to avoid an index translation, I decided to allocate both `arr_ifc` and `arr_interior` for all points, which is really awkward.
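One possible reading of this workaround is sketched below; the actual code is not shown here, so the details are assumptions. Both arrays are allocated at full size, so the same index `i` addresses either of them, but only `arr_ifc` is handed to the exchange. As before, `start_exchange` and `wait_exchange` are hypothetical stand-ins rather than YAXT calls, and the 1-D periodic layout is chosen only for illustration.

```python
def start_exchange(buf, pairs):
    # Hypothetical stand-in for launching a non-blocking exchange.
    return {"buf": buf, "pairs": pairs}


def wait_exchange(request):
    # Hypothetical stand-in for completing the exchange.
    buf = request["buf"]
    for src, dst in request["pairs"]:
        buf[dst] = buf[src]


def run_split(n):
    # Layout (assumption): index 0 and n + 1 are halo cells,
    # indices 1..n are owned points; requires n >= 2.
    size = n + 2
    arr_ifc = [0.0] * size       # exchanged array, full size
    arr_interior = [0.0] * size  # written during the exchange, full size
    interface = [1, n]
    interior = range(2, n)
    pairs = [(n, 0), (1, n + 1)]

    for i in interface:
        arr_ifc[i] = float(i * i)
    request = start_exchange(arr_ifc, pairs)
    for i in interior:
        # arr_ifc is not touched while the exchange is in flight.
        arr_interior[i] = float(i * i)
    wait_exchange(request)

    allocated = 2 * size
    used = len(interface) + len(pairs) + len(interior)
    return arr_ifc, arr_interior, allocated, used
```

The returned counts make the cost visible: twice the storage is allocated, while each index is meaningful in only one of the two arrays.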
Can you think of any implementation which requires only a single data array for this task?