We have a BigQuery dataset that has some long list of tables (with data) in it. Since I am taking over a data pipeline, which I want to familiarize myself with by doing tests, I want to duplicate those dataset/tables without copying-truncating the tables. Essentially, I want to re-create those tables in a test dataset using their schema. How can this be done in bq
client?
You have a couple of options considering you don't want to copy the data but the schema:
1.- extract the schema for each table and then create new ones just empty.
$ bq show --schema --format=prettyjson [PROJECT_ID]:[DATASET].[TABLE] > [SCHEMA_FILE]
$ bq mk --table [PROJECT_ID]:[NEW_DATASET].[TABLE] [SCHEMA_FILE]
2.- run a query with LIMIT 0 and setting a destination table.
bq query "SELECT * FROM [DATASET].[TABLE] LIMIT 0" --destination_table [NEW_DATASET].[TABLE]
NOTE: As per comment, it is important to highlight that LIMIT 0 will still scan the whole table.