mysqlgoogle-cloud-platformgoogle-bigquery

How can I connect an externally hosted MySQL database to BigQuery?


I have a MySQL database that is externally hosted (not on Google Cloud ), and I want to connect it to BigQuery for querying and analysis. Unfortunately, I cannot create an SSH tunnel to the MySQL server.

What are the available methods to connect my MySQL database to BigQuery in this scenario?


Solution

  • When the database is hosted externally, you can't perform federated query or other convenient stuff you can when hosted on Google Cloud.

    Here, no solution, you have to replicate the data.

    The easiest way is to use Datastream to create a replication stream (CDC, Change Data Capture) to replicate in real time your changes in BigQuery.

    You can also imagine weird pattern and create a Cloud SQL MySQL database as a replica of your main database. Let the MySQL instances doing the replication (very similar to CDC in the end) and use a federated query to query the data. You can, but it's strange, more expensive and not the most efficient when you query the data with BigQuery.

    But it's a relevant solution if a day you want to move your database to Cloud SQL (promote the replica as primary and it's done!)

    It depends on your roadmap and requiements