This question concerns the data movement in Xilinx SDSoC and HLS.
I have a large 1D array in my main function, which is being allocated using sds_alloc. It is basically a 2D array (of N rows and M columns) transformed into a 1D array of N*M elements.
I also have a hardware function, implemented in the PL, that accepts two arrays of size N as inputs.
I want this function to process two columns of the original 2D array - so, two parts of N elements stored sequentially in the 1D array, which has been allocated using sds_alloc in the main function.
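To make the layout concrete, here is a minimal sketch of the indexing I have in mind, assuming column c occupies the N consecutive elements starting at index c*N of the flat buffer. The helper names column_ptr and flat_index are made up for illustration and are not part of SDSoC.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: pointer to the first element of column c, assuming
// each column is stored as a run of n_rows consecutive elements.
template <typename T>
const T* column_ptr(const T* x, std::size_t n_rows, std::size_t c) {
    assert(x != 0);
    return x + c * n_rows;
}

// Hypothetical helper: flat index of element (r, c) in the same layout.
inline std::size_t flat_index(std::size_t n_rows, std::size_t r, std::size_t c) {
    return c * n_rows + r;
}
```

With this layout, the two inputs of the hardware function would be column_ptr(x, N, c) and column_ptr(x, N, c + 1), each read from its first to its N-th element.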
Is there an efficient way to access these two parts of the array sequentially as a stream in the accelerated function?
As far as I know, sds_alloc allocates physically contiguous memory buffers, and SDSoC infers DMA transfers over those buffers (which is what I assume you're aiming for).
I'm not entirely sure whether SDSoC is able to infer parallel accesses to a "shared" array, but my gut feeling is that it can.
I'm sure you can call your hardware function with pointers pointing at different locations of the same array (e.g. an argument like &(x[i * N])).
I would try something like this approach:
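void kernel(const data_t* col_1st,
            const data_t* col_2nd,
            // [...]
            ) {
    // [...]
}
// [...]
// The cast is needed if this is compiled as C++ (sds_alloc returns void*).
data_t* x = (data_t*) sds_alloc(sizeof(data_t) * N * M);
// Column i occupies x[i * N] .. x[i * N + N - 1]; process columns in pairs.
// The i + 1 < M bound avoids reading past the buffer when M is odd.
for (int i = 0; i + 1 < M; i += 2) {
    kernel(&(x[i * N]), &(x[(i + 1) * N]), ...);
}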
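To get the two columns streamed rather than accessed randomly, my understanding is that you would annotate the hardware function with the SDS data pragmas. Below is a rough, self-contained sketch of that idea, not a tested SDSoC design. The element type, the values of N and M, the out argument and the body of kernel are all assumptions made for illustration, and malloc stands in for sds_alloc so the code also builds with an ordinary compiler (which should simply ignore the SDS pragmas, possibly with a warning).

```cpp
#include <cassert>
#include <cstdlib>

typedef int data_t;   // assumption: element type
const int N = 4;      // assumption: rows, i.e. elements per column
const int M = 6;      // assumption: number of columns

// Hints for the SDSoC data movers: each argument is read or written
// strictly in order, and exactly N elements are transferred per call.
#pragma SDS data access_pattern(col_1st:SEQUENTIAL, col_2nd:SEQUENTIAL, out:SEQUENTIAL)
#pragma SDS data copy(col_1st[0:N], col_2nd[0:N], out[0:N])
void kernel(const data_t* col_1st, const data_t* col_2nd, data_t* out) {
    // Placeholder computation: every element is touched once, in order,
    // which is what a SEQUENTIAL access pattern requires.
    for (int r = 0; r < N; ++r) {
        out[r] = col_1st[r] + col_2nd[r];
    }
}

// Calls the kernel on consecutive column pairs of one flat buffer and
// returns the sum of all outputs.
long process_all_columns() {
    // On the target, use sds_alloc / sds_free from sds_lib.h instead.
    data_t* x = static_cast<data_t*>(std::malloc(sizeof(data_t) * N * M));
    data_t* out = static_cast<data_t*>(std::malloc(sizeof(data_t) * N));
    assert(x && out);
    for (int i = 0; i < N * M; ++i) {
        x[i] = i;  // test data
    }
    long total = 0;
    for (int i = 0; i + 1 < M; i += 2) {
        kernel(&x[i * N], &x[(i + 1) * N], out);
        for (int r = 0; r < N; ++r) {
            total += out[r];
        }
    }
    std::free(out);
    std::free(x);
    return total;
}
```

Whether SDSoC then picks a streaming DMA for these arguments depends on the platform and the tool version, so it is worth checking the data motion report after the build.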