
How to select data from Google BigQuery in Clojure via Java interop?


I couldn't find any examples online. Can anyone point me to an example of how to select data from Google BigQuery in Clojure via Java interop?

The dependency I'm using:

[com.google.cloud/google-cloud-bigquery "2.16.0"]

Here's the Java example Google provides:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

// Sample to query in a table
public class Query {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String query =
        "SELECT name, SUM(number) as total_people\n"
            + " FROM `"
            + projectId
            + "."
            + datasetName
            + "."
            + tableName
            + "`"
            + " WHERE state = 'TX'"
            + " GROUP BY name, state"
            + " ORDER BY total_people DESC"
            + " LIMIT 20";
    query(query);
  }

  public static void query(String query) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query).build();

      TableResult results = bigquery.query(queryConfig);

      results
          .iterateAll()
          .forEach(row -> row.forEach(val -> System.out.printf("%s,", val.toString())));

      System.out.println("Query performed successfully.");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Query not performed \n" + e.toString());
    }
  }
}

Solution

  • I wasn't able to test this code, so you will probably need to make some adjustments, but it should give you the general idea:

    Dependency: [com.google.cloud/google-cloud-bigquery "2.16.0"]

    Import in ns: (:import (com.google.cloud.bigquery BigQueryOptions QueryJobConfiguration BigQuery BigQueryException BigQuery$JobOption))

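    (defn use-query [query]
      (try
        ;; The client only needs to be created once and can be reused for multiple requests.
        (let [^BigQuery big-query (.getService (BigQueryOptions/getDefaultInstance))
              ^QueryJobConfiguration query-config (.build (QueryJobConfiguration/newBuilder query))
              ;; query takes JobOption varargs in Java, so pass an empty typed array.
              results (.query big-query
                              query-config
                              (into-array BigQuery$JobOption []))]
          ;; Each row is itself iterable over its field values.
          (doseq [row (.iterateAll results)
                  field-value row]
            (printf "%s," field-value))
          (println "Query performed successfully."))
        ;; Clojure has no multi-catch, so each exception type gets its own clause.
        (catch BigQueryException e (println "Query not performed \n" (str e)))
        (catch InterruptedException e (println "Query not performed \n" (str e)))))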
    
    (let [project-id "MY_PROJECT_ID"
          dataset-name "MY_DATASET_NAME"
          table-name "MY_TABLE_NAME"
          query (str "SELECT name, SUM(number) as total_people\n"
                     " FROM `"
                     project-id
                     "."
                     dataset-name
                     "."
                     table-name
                     "`"
                     " WHERE state = 'TX'"
                     " GROUP BY name, state"
                     " ORDER BY total_people DESC"
                     " LIMIT 20")]
      (use-query query))
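
If you would rather get the rows back as Clojure data than print them, something along these lines might work. It is an untested sketch: `row->map` and `table-result->maps` are names I made up, and it assumes the result carries a schema (`getSchema`) whose field order matches the order of values in each row. I left out type hints to keep it short, so the interop calls go through reflection.

```clojure
(defn row->map
  "Pairs column names with one row's values, using keywords as keys."
  [field-names values]
  (zipmap (map keyword field-names) values))

(defn table-result->maps
  "Turns a TableResult into a vector of maps, one per row.
  Assumes the schema's field order matches the value order in each row."
  [result]
  (let [field-names (map #(.getName %) (.getFields (.getSchema result)))]
    (mapv (fn [row]
            ;; getValue returns the raw value of each FieldValue
            (row->map field-names (map #(.getValue %) row)))
          (.iterateAll result))))
```

You would call `(table-result->maps results)` in place of the `doseq` in `use-query`. As far as I know, `getValue` hands scalar values back as strings, so a numeric column such as `total_people` may need converting (for example with `getLongValue`) before you do arithmetic on it.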