csv, text-files, google-cloud-dataprep

How to use Google Cloud Dataprep for several files located in Google Cloud Storage?


I imported a text file from GCS, did some preparation using Dataprep, and wrote the result back to GCS as a CSV file. Is there a way to do this for all the text files in that bucket (in GCS) at once?

Below is my procedure. I selected a text file from GCS (I can't select more than one text file) and did some preparation (renamed columns, created new columns, etc.). Then I wrote it back to GCS as CSV.



Solution

  • You can use the Dataset with parameters feature to load several files at once.

    You can then use a wildcard to select all the files that you want to load. Note that all the files need to have the same schema (same columns) for this to work.


    See https://cloud.google.com/dataprep/docs/html/Create-Dataset-with-Parameters_118228628 for more information on how to use this feature.

    Another solution is to add all the files to a folder* and use the large + button to load all the files in that folder.

    [*] technically under the same prefix on GCS
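
To make the wildcard and same-schema points concrete, here is a minimal Python sketch. It is not Dataprep's API, and it does not talk to GCS. It only illustrates, on in-memory data, which object names a wildcard under a common prefix would pick up and how you might check beforehand that the files share the same header. The object names, the `input/` prefix and the helper functions are all hypothetical, and Python's `fnmatch` rules are an assumption standing in for Dataprep's own pattern matching, which may differ in detail.

```python
import csv
import fnmatch
import io


def select_by_wildcard(object_names, pattern):
    """Return the object names that match a shell-style wildcard pattern."""
    return [name for name in object_names if fnmatch.fnmatchcase(name, pattern)]


def header_of(text, delimiter=","):
    """Return the first row of a delimited text file as a list of column names."""
    return next(csv.reader(io.StringIO(text), delimiter=delimiter), [])


def files_with_different_schema(files, delimiter=","):
    """Given {name: content}, return the names whose header differs
    from the header of the first file (in sorted name order)."""
    names = sorted(files)
    if not names:
        return []
    reference = header_of(files[names[0]], delimiter)
    return [n for n in names[1:] if header_of(files[n], delimiter) != reference]


# Hypothetical bucket listing: object name -> file content.
objects = {
    "input/2019-01.txt": "id,name\n1,a\n",
    "input/2019-02.txt": "id,name\n2,b\n",
    "input/notes.md": "scratch notes\n",
    "other/2019-03.txt": "id,city\n3,c\n",
}

# Everything under the "input/" prefix that ends in .txt.
selected = select_by_wildcard(objects, "input/*.txt")

# Files in the selection whose columns do not match the first file.
mismatched = files_with_different_schema({n: objects[n] for n in selected})

print(selected)
print(mismatched)
```

If `mismatched` is not empty, those files would need to be fixed or excluded before loading them together as one parameterized dataset.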