Tags: parameter-passing, azure-data-factory

How to pass an array parameter into an ADF dataflow


I have a DataFlow in my ADF, which accepts an int array param:

and uses it in one of the flow activities:

But when I try to invoke that dataflow from a pipeline, it doesn't seem to be at all happy :(

I've tried 3 different versions so far:

Whichever way I try, I get the same error from the dataflow, saying the parameter is missing:

{
"StatusCode": "DFExecutorUserError",
"Message": "Job failed due to reason: at Filter 'IdentifyTradesToDelete'Parameter 'TradeIdentityIds'(Line 35/Col 22): Parameter value for TradeIdentityIds missing",
"Details": "at Filter 'IdentifyTradesToDelete'Parameter 'TradeIdentityIds'(Line 35/Col 22): Parameter value for TradeIdentityIds missing"
}

The dataflow invocation log says the parameter was passed, though.

What am I doing wrong?


EDIT: I tried another thing: I set a default parameter on the DF, with hard-coded values, and then recreated the DF invocation. It auto-populated the parameters with those defaults, and ran fine. But when I modified the invocation to pass a different pair of numbers (I just changed the digits, so it's syntactically identical), it turned out that the input was being ignored and the dataflow just kept using the default values.


Solution

  • I am at this point fairly convinced that there is (currently) a bug in this feature in ADF.

    Passing a pipeline-resolved array into the Dataflow does not work. The dataflow behaves as though no value was passed in from the pipeline, and will either error or use the parameter default, if one is configured.

    You can pass data in using the "expression" mode instead. Configure the parameter in the pipeline to pass a "string" that is a dataflow-expression-language expression defining an array.

    For example, passing array(123, 234) will work. (Note the absence of an @ at the start of that - we don't want the pipeline to try to evaluate the expression itself!)
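    In the underlying pipeline JSON, that parameter value ends up as a plain literal string on the Execute Data Flow activity. A rough sketch of what that looks like (activity and dataflow names here are hypothetical, and the exact property layout may differ slightly between ADF versions):

    ```json
    {
      "name": "Execute MyDataFlow",
      "type": "ExecuteDataFlow",
      "typeProperties": {
        "dataFlow": {
          "referenceName": "MyDataFlow",
          "type": "DataFlowReference",
          "parameters": {
            "TradeIdentityIds": "array(123, 234)"
          }
        }
      }
    }
    ```

    The key point is that the value is a literal string, not a pipeline expression - the dataflow itself evaluates it as a dataflow-expression-language array.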

    It follows that if you have an array variable in the pipeline, you can pass it to the dataflow by building a string that defines the whole array in dataflow-expression terms, and passing that in.

    That looks a bit like this:

    array(@{join(variables('myVariable'), ',')})

    Note the @{} in the middle, which causes that part to be resolved by the pipeline, resulting in a string that looks like this:

    array(val1, val2, val3) where val1 etc. are the values originally in the myVariable array.
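    The string-building that the pipeline performs can be sketched in Python (a hypothetical helper, purely to illustrate what the @{join(...)} substitution produces):

    ```python
    def build_dataflow_array_expression(values):
        """Simulate the pipeline expression
            array(@{join(variables('myVariable'), ',')})
        The @{...} part is evaluated by the pipeline, joining the
        array elements with commas; the surrounding array(...) text
        is passed through literally for the dataflow to evaluate."""
        joined = ",".join(str(v) for v in values)  # pipeline-side join()
        return f"array({joined})"                  # literal wrapper text

    # With myVariable = [101, 202, 303], the dataflow receives the
    # string "array(101,202,303)" and evaluates it as an int array.
    print(build_dataflow_array_expression([101, 202, 303]))
    ```

    One caveat: for an array of strings you would also need to quote each element in dataflow-expression syntax (e.g. 'a','b'), since the joined text is evaluated verbatim by the dataflow.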