sqlgoogle-bigquery

Filtering data from large table using where clause returns 'response too large ...' error


I need to filter some data using 'where' clause from a table that has more than 1.2 TB of size and about 2.8 billion of rows.

I started testing data filtering with small data table (~21 millions of rows and 9.6 of table size) by setting a destination table and enabling 'Allow Large Results' in job configuration.

My query is very simple :

SELECT * FROM [mydataset.mytable] Where DATETIME >= '2014-03-17 00:00:00' and  DATETIME <= '2014-03-17 23:59:59';

I used both BigQuery Browser Tool and bq command line tool, however, I get the 'Response too large to return' error every time I run this query.

In the documentation here https://cloud.google.com/bigquery/docs/running-queries#largequeryresults, Google says:

  • If you plan to run a query that might return larger results, you can set allowLargeResults to true in your job configuration.
  • You must specify a destination table.
  • You can't specify a top-level ORDER BY clause.
  • Window functions can't return large query results.

My query is very simple and does not include any window functions. So, to bypass this problem, is there a specific query syntax that can I do to run my query successfully?

Please note that here at https://cloud.google.com/bigquery/quotas#queries, it is specified that the Maximum response size is 128 MB compressed and unlimited when returning large query results as explained at the above link (https://cloud.google.com/bigquery/docs/running-queries#largequeryresults).


Solution

  • As N.N says, to avoid the 'Response too large to return' error in this case, it should state all fields in the select clause instead using select *.

    I tried running the same query by stating all fields in the select statement, setting a destination table and enabling 'Allow Large Results', and it works fine.

    Query complete (115.6s elapsed, 298 GB processed).

    Number of Rows of destination table: 77,796,259.