I replicate data from a MySQL database to BigQuery using DataFusion.
My original table in MySQL is not partitioned, but I want it to be partitioned by a column when it is replicated to BQ.
Additionally, when I run the replication job in DataFusion, BQ by default clusters the table on the PK column on its own.
Questions:
Response from Google Cloud Community:
BigQuery and DataFusion do not directly support partitioning of existing tables that are replicated.
However, there is a recommended approach for partitioning existing replicated tables, and it works: pause the DataFusion replication, create a new partitioned table, drop the original table, rename the new table to the original name, and then resume the DataFusion replication.
This approach involves some downtime as the original table is dropped and the new partitioned table is created. To minimize downtime, you can create the new partitioned table with the same schema as the original table and then copy the data from the original table to the new table. Once the data is copied, you can drop the original table and rename the new table to the original name.
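The copy-and-swap steps described above can be sketched as a short Python helper that only builds the BigQuery SQL statements in order. This is a minimal sketch, not a tested procedure from the response: the function name, the `_partitioned` suffix for the temporary table, and the dataset, table, and column names in the usage lines are all hypothetical. It also assumes the partition expression is valid for the column type (for example a DATE column, or `DATE(ts_col)` for a TIMESTAMP column). Pausing and resuming the replication job still has to be done in DataFusion itself.

```python
def build_partition_swap_sql(dataset, table, partition_expr, cluster_columns=()):
    """Return BigQuery SQL statements for the copy-and-swap, in execution order.

    All names are illustrative assumptions; run the statements only while
    the replication job is paused.
    """
    original = f"`{dataset}.{table}`"
    # Hypothetical name for the temporary partitioned copy.
    staging = f"`{dataset}.{table}_partitioned`"

    # Step 1: create the partitioned (and optionally clustered) copy
    # with the same columns and data as the original table.
    create = f"CREATE TABLE {staging}\nPARTITION BY {partition_expr}\n"
    if cluster_columns:
        create += "CLUSTER BY " + ", ".join(cluster_columns) + "\n"
    create += f"AS SELECT * FROM {original}"

    return [
        create,
        # Step 2: drop the original, unpartitioned table.
        f"DROP TABLE {original}",
        # Step 3: give the partitioned copy the original table name.
        f"ALTER TABLE {staging} RENAME TO `{table}`",
    ]


if __name__ == "__main__":
    # Hypothetical dataset, table, and column names.
    for statement in build_partition_swap_sql(
        "my_dataset", "orders", "DATE(created_at)", ["id"]
    ):
        print(statement + ";\n")
```

Printing the statements first, rather than executing them directly, makes it possible to review them before running them in the BigQuery console, since the `DROP TABLE` step is not reversible from this script.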