chapel

How do I efficiently scatter distributed array elements in Chapel?


Consider the following scatter operation:

var A : [DomA] EltType;
var Indices : [DomA] IndexType;
var B : [DomB] EltType;

[(iSrc, iDst) in zip(DomA, Indices)] B[iDst] = A[iSrc];

where DomA and DomB are distributed domains. What is the best way to do this? In particular, is there an easy way to aggregate messages so that many small messages are not sent individually (assuming that sizeOf(EltType) is small)?


Solution

  • The Chapel team is actively working on aggregation, and it is used extensively in Arkouda, but there is currently no built-in support for it. See https://github.com/chapel-lang/chapel/issues/16963 for more information about the current efforts.

    If you want to try the current aggregators, you can copy AggregationPrimitives.chpl and CopyAggregation.chpl from https://github.com/chapel-lang/chapel/tree/993f9bd/test/studies/bale/aggregation.

    Your main loop would then look something like:

    forall (iSrc, iDst) in zip(DomA, Indices) with (var agg = new DstAggregator(EltType)) {
      agg.copy(B[iDst], A[iSrc]);
    }
    

    Or maybe a little cleaner as:

    forall (iDst, a) in zip(Indices, A) with (var agg = new DstAggregator(EltType)) {
      agg.copy(B[iDst], a);
    }
    

    These aggregators should provide significant performance speedups over your unaggregated loop: 2-3x on Cray Aries networks, ~1000x on InfiniBand networks, and ~5000x or more on commodity Ethernet networks.

    Longer term, aggregators will become part of the standard library, and in many cases, including your example, the compiler should be able to tell that aggregation is safe/legal and apply it automatically.
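
For anyone who wants to try the aggregated loop end to end, here is a minimal sketch of a complete program. It rests on several assumptions that are not part of the question: the CopyAggregation module is on the module search path (for example, copied as described above), the Chapel release is recent enough to provide `blockDist.createDomain`, `EltType` is `int`, and `Indices` is a simple reversal permutation chosen only so that no two elements target the same destination. The `ref B` task intent is added defensively in case your compiler version treats arrays as `const` inside `forall` loops.

```chapel
use BlockDist;
use CopyAggregation;  // assumption: this module is on the module search path

config const n = 1000000;

type EltType = int;   // illustrative choice; any small element type works

// Block-distributed domains (assumption: a release with blockDist.createDomain;
// older releases spell this `{0..#n} dmapped Block(boundingBox={0..#n})`).
const DomA = blockDist.createDomain({0..#n});
const DomB = blockDist.createDomain({0..#n});

// Source values, and a hypothetical destination permutation (a reversal).
var A: [DomA] EltType = [i in DomA] 2 * i;
var Indices: [DomA] int = [i in DomA] n - 1 - i;
var B: [DomB] EltType;

// Aggregated scatter: each task buffers its remote writes per destination
// locale and flushes them in bulk instead of sending one message per element.
forall (iDst, a) in zip(Indices, A)
    with (ref B, var agg = new DstAggregator(EltType)) {
  agg.copy(B[iDst], a);
}

// Each task's aggregator is flushed when the task finishes, so B is complete
// here. Check that every element landed at its destination index.
const ok = && reduce [i in DomA] (B[Indices[i]] == A[i]);
writeln(if ok then "scatter verified" else "scatter mismatch");
```

Running with `-nl 2` or more locales is what makes the aggregation matter; on a single locale the loop still works but nothing is remote. Keep in mind that the buffered writes are only guaranteed to be visible once the loop has finished, so the body should not read `B` while the scatter is in progress.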