google-cloud-platform google-bigquery terraform

How to automatically create external BigQuery tables in Terraform for all CSV files in a GCS bucket (without listing them manually)?


I am working on a Terraform problem:

Each GCS bucket represents a dataset.

Inside each bucket, I upload multiple .csv files (each one should be an external BigQuery table).

I know how to create the buckets and the datasets with Terraform.

But what I want to achieve is:

Automatically create an external BigQuery table per .csv file in the bucket without manually maintaining a variable like:

variable "sap_files" {
  default = [
    "ACDOCA.csv",
    "AUFK.csv",
    "BKPF.csv",
    # ...
  ]
}

Every bucket's name looks like something_datasetname and contains all of the CSV files; each file should become an external table under the corresponding dataset.

In other words, I’d like Terraform to generate external tables for all CSVs found there.

Is there a way in Terraform to get all .csv files from a GCS bucket and create the corresponding external BigQuery tables, without maintaining a manual list of file names? Thank you.


Solution

  • Terraform can’t discover the CSVs and turn them into tables on its own. Think of it as the blueprint of a building: you can design it to have 3 floors or 30, but the blueprint can’t look inside the finished building (the bucket) and decide that 10 CSVs mean 10 tables.

    Terraform is for defining infrastructure, not reading live data. It can create buckets, datasets, and tables you tell it about, but it won’t look inside a bucket to find CSVs. If you want one table per CSV, you have to list the files yourself; Terraform won’t discover them automatically.
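If you do keep such a list (hand-maintained or generated by a script outside Terraform), a `for_each` over the file names is enough to stamp out one external table per CSV. A minimal sketch, assuming a `var.sap_files` list as in the question and hypothetical `google_bigquery_dataset.sap` / `google_storage_bucket.sap` resources defined elsewhere:

```hcl
# One external BigQuery table per listed CSV file.
# Resource names (sap, external_csv) and var.sap_files are illustrative.
resource "google_bigquery_table" "external_csv" {
  for_each   = toset(var.sap_files)
  dataset_id = google_bigquery_dataset.sap.dataset_id
  table_id   = trimsuffix(each.value, ".csv")   # ACDOCA.csv -> ACDOCA

  external_data_configuration {
    source_format = "CSV"
    autodetect    = true   # or provide an explicit schema block instead
    source_uris   = ["gs://${google_storage_bucket.sap.name}/${each.value}"]
  }
}
```

Adding or removing a file name in the list then adds or removes the matching table on the next `terraform apply`.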

    A better approach is to let Terraform handle creating the buckets and datasets, and then use a Cloud Function that triggers whenever a CSV file is uploaded. The Cloud Function can then tell BigQuery to create a new external table for that file, using the query or schema you define.
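A sketch of such a Cloud Function (Python runtime, GCS "finalize" trigger). It assumes the bucket-naming convention from the question (`<something>_<datasetname>`); the function and helper names are illustrative, and schema autodetection stands in for whatever schema you define:

```python
# Cloud Function sketch: on each CSV upload, create an external BigQuery
# table pointing at that object. Assumes buckets are named
# "<prefix>_<datasetname>" and the dataset already exists.

def dataset_from_bucket(bucket_name: str) -> str:
    """Extract the dataset name from a bucket named '<something>_<dataset>'."""
    return bucket_name.rsplit("_", 1)[-1]

def table_from_object(object_name: str) -> str:
    """Derive the table name from a CSV object, e.g. 'ACDOCA.csv' -> 'ACDOCA'."""
    return object_name.rsplit("/", 1)[-1].removesuffix(".csv")

def create_external_table(event, context):
    """Entry point for a GCS 'google.storage.object.finalize' trigger."""
    bucket = event["bucket"]
    name = event["name"]
    if not name.endswith(".csv"):
        return  # ignore non-CSV uploads

    # Imported here so the pure helpers above are usable without GCP installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    table_id = f"{client.project}.{dataset_from_bucket(bucket)}.{table_from_object(name)}"

    external_config = bigquery.ExternalConfig("CSV")
    external_config.source_uris = [f"gs://{bucket}/{name}"]
    external_config.autodetect = True  # or set an explicit schema here
    external_config.options.skip_leading_rows = 1  # assumes a header row

    table = bigquery.Table(table_id)
    table.external_data_configuration = external_config
    client.create_table(table, exists_ok=True)  # idempotent on re-upload
```

Deploy it with `--trigger-resource <bucket>` and `--trigger-event google.storage.object.finalize` (one trigger per bucket), and each new CSV shows up as an external table in the matching dataset without any Terraform change.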