google-bigqueryazure-data-factoryazure-synapselookup-tables

How to dynamically copy multiple datasets from a Google BigQuery project using Azure Synapse Analytics


Is it possible to dynamically copy all datasets from a BigQuery Project to Azure Synapse Analytics, then dynamically copy all tables within each dataset? I know we can dynamically copy all tables within a BigQuery dataset reference to this answered question Loop over of table names ADFv2, but is there a way to do it at the project level with the lookup activity to loop through all datasets? Is there a way to do a SELECT * to the datasets?

SELECT
*
FROM
gcp_project_name.dataset_name.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE'

According to Microsoft's Lookup activity in Azure Data Factory and Azure Synapse Analytics, this only reaches the dataset level. I also tried just putting in the GCP's project name into the Lookup activity's query, but it did not work, ref Understanding the "Not found: Dataset ### was not found in location US" error


Solution

  • SELECT schema_name
    FROM  `project_name`.INFORMATION_SCHEMA.SCHEMATA
    

    This query will list the datasets in the project.

    enter image description here

    enter image description here

    SELECT
    *
    FROM
    gcp_project_name.dataset_name.@{item().schema_name}.TABLES
    WHERE table_type = 'BASE TABLE'
    

    This will give list of all tables within each dataset.

    Since you cannot nest a for-each inside for-each in ADF, you can design a two-level pipeline where the outer pipeline with the outer ForEach loop iterates over an inner pipeline with the nested loop.

    Refer the NiharikaMoola-MT's answer on this SO thread for Nested foreach in ADF.