I want to use the collect(column1) function in an Aggregate transformation, grouped by column2, to collect all row values for each group. But since column1 has duplicate values, I get duplicates in the returned array. I want a function that collects only the distinct values.
There is no collectDistinct() function, so you can't achieve this with a single function call in a data flow.
You can try this: create two Aggregate transformations.
First, group by base model number and modelDocId, then add a column (DModelDocId) with the expression first(modelDocId). This leaves one row per distinct pair.
Second, group by base model number, then add a column (modelDocIds) with the expression collect(DModelDocId).
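To show why the two steps remove duplicates, here is a minimal Python sketch of the same logic. It is not data flow code, only an illustration: the function name collect_distinct and the sample rows are made up, and the order of values in the real data flow output may differ.

```python
from collections import defaultdict


def collect_distinct(rows):
    """rows: iterable of (base_model_number, model_doc_id) pairs (hypothetical sample shape)."""
    # Step 1: group by (base model number, modelDocId) and keep first(modelDocId).
    # Every distinct pair survives exactly once, which is what removes the duplicates.
    first_per_pair = {}
    for base, doc_id in rows:
        first_per_pair.setdefault((base, doc_id), doc_id)

    # Step 2: group by base model number only and collect the deduplicated values.
    collected = defaultdict(list)
    for (base, _), d_model_doc_id in first_per_pair.items():
        collected[base].append(d_model_doc_id)
    return dict(collected)


# Made-up sample data with duplicates in the second column.
rows = [("A", 1), ("A", 1), ("A", 2), ("B", 3), ("B", 3)]
result = collect_distinct(rows)
print(result)
```

The first pass plays the role of the first Aggregate transformation, and the second pass plays the role of the second one.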
Hope this helps.