Is it possible to dynamically copy all datasets from a BigQuery project to Azure Synapse Analytics, and then dynamically copy all tables within each dataset? I know we can dynamically copy all tables within a single BigQuery dataset (see this answered question: Loop over of table names ADFv2), but is there a way to do it at the project level, using the Lookup activity to loop through all datasets? Is there a way to do a SELECT * across the datasets?
SELECT *
FROM gcp_project_name.dataset_name.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE'
According to Microsoft's documentation for the Lookup activity in Azure Data Factory and Azure Synapse Analytics, this only reaches the dataset level. I also tried putting just the GCP project name into the Lookup activity's query, but it did not work (see Understanding the "Not found: Dataset ### was not found in location US" error).
This can be done using a two-level pipeline. I tried to reproduce this, and below is the approach.
Take a Lookup activity with Google BigQuery as the source dataset. In the Query text box, enter the query below.
SELECT schema_name
FROM `project_name`.INFORMATION_SCHEMA.SCHEMATA
This query will list the datasets in the project.
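For reference, a minimal sketch of what this Lookup activity could look like in pipeline JSON. The dataset reference name GoogleBigQueryDataset and the project name are placeholders for your own; firstRowOnly must be false so that every dataset row is returned:
{
    "name": "Lookup1",
    "description": "Sketch: lists all datasets (schemas) in the BigQuery project",
    "type": "Lookup",
    "typeProperties": {
        "source": {
            "type": "GoogleBigQuerySource",
            "query": "SELECT schema_name FROM `project_name`.INFORMATION_SCHEMA.SCHEMATA"
        },
        "dataset": {
            "referenceName": "GoogleBigQueryDataset",
            "type": "DatasetReference"
        },
        "firstRowOnly": false
    }
}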
Next, take a ForEach activity and pass
@activity('Lookup1').output.value
as dynamic content in its Items field. Inside the ForEach, take another Lookup activity and enter the following query:
SELECT *
FROM gcp_project_name.@{item().schema_name}.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE'
This will give the list of all tables within each dataset.
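As a rough sketch, the outer ForEach with that inner Lookup could look like the JSON below; the activity names ForEachDataset and LookupTables and the dataset reference GoogleBigQueryDataset are illustrative, not required names:
{
    "name": "ForEachDataset",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@activity('Lookup1').output.value",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "LookupTables",
                "description": "Sketch: lists the tables of the current dataset",
                "type": "Lookup",
                "typeProperties": {
                    "source": {
                        "type": "GoogleBigQuerySource",
                        "query": {
                            "value": "SELECT table_name FROM gcp_project_name.@{item().schema_name}.INFORMATION_SCHEMA.TABLES WHERE table_type = 'BASE TABLE'",
                            "type": "Expression"
                        }
                    },
                    "dataset": {
                        "referenceName": "GoogleBigQueryDataset",
                        "type": "DatasetReference"
                    },
                    "firstRowOnly": false
                }
            }
        ]
    }
}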
Since you cannot nest a ForEach inside another ForEach in ADF, design a two-level pipeline: the outer pipeline with the outer ForEach loop calls an inner pipeline (via an Execute Pipeline activity) that contains the nested loop.
Refer to NiharikaMoola-MT's answer on this SO thread about nested ForEach in ADF.
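To wire the two levels together, the outer ForEach can call the inner pipeline through an Execute Pipeline activity, passing the current dataset name as a parameter. In this sketch, the pipeline name InnerPipeline and the parameter name schemaName are illustrative:
{
    "name": "ExecuteInnerPipeline",
    "description": "Sketch: inside the outer ForEach, runs the inner pipeline once per dataset",
    "type": "ExecutePipeline",
    "typeProperties": {
        "pipeline": {
            "referenceName": "InnerPipeline",
            "type": "PipelineReference"
        },
        "parameters": {
            "schemaName": {
                "value": "@item().schema_name",
                "type": "Expression"
            }
        },
        "waitOnCompletion": true
    }
}
Inside the inner pipeline, the table-listing Lookup then refers to the passed parameter instead of item(), since the outer loop's item() is not available there, and its output drives the inner ForEach that wraps the Copy activity:
SELECT table_name
FROM gcp_project_name.@{pipeline().parameters.schemaName}.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE'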