azure-data-factorydataverse

Why are dataverse lookup primary name columns null when brought into a data flow in data factory?


In Dataverse, there are the columns suffixed with "name" that take the value of the primary name column of the associated lookup. I'm trying to filter on one of these columns, but all of the values being brought into data factory are null. When I check the database, these columns definitely have values. This lack of values doesn't appear to just be in the data preview as, even though my data flow runs, it doesn't output anything.

enter image description here

The content of the filter is as follows:

in(['Stephanie ...', 'Karen ...', 'Rachel ...', 'Cameron ...', 'Patrick ...', 'Hannah ...', 'Rebecca ...', 'Simon ...', 'Mihaela ...', 'Kate ...', 'Toby ...'], ptl_feeearnername)

The data preview from the select looks like this enter image description here

Here's a subset of the data in the database

ptl_matterid ptl_partner ptl_partnername ptl_feeearner ptl_feeearnername
ADD60453-CA81-4BBC-8E2A-A4B368CB8812 D5806CED-A4FA-4171-9968-A171027E9FE1 Becki ... C91C3F8B-38C0-4A21-B7CB-FD164D2C0880 Patrick ...
308C4899-6BCA-4253-B1D3-96B651D9FE03 2DB19836-6A87-4267-855A-8F0BEBC9F355 Dan ... A3A50C3E-B405-4662-A8CB-357A04EC6CAB Rachel ...

The query I use to get this data is

select ptl_matterid, ptl_partner, ptl_partnername, ptl_feeearner, ptl_feeearnername
from ptl_matterid

Is there any reason why these values will be null when brought into a data flow in data factory?

The associated lookup column (containing the GUID foreign key to the other table) does contain a value when brought in from dataverse, so I could filter on that. However, for readability of my data flow, I'd prefer to filter on the name column.

When there is no filter on the data flow, the flow also runs fine and outputs values into the sink.


Solution

  • Lookup columns do not store the actual values in the table, but instead refer to the values in the associated table. The "name" columns in Dataverse are virtual columns that are generated by the system based on the primary name column of the associated lookup. This is the reason you can query the data without using join.

    But if you want to use the values of the associated table in your Data Factory data flow, you need to bring in the associated table and use a join or lookup transformation to combine the data. Only then, you can get the values of the columns whose value depends on the associated table.