Let's say I'm looking to build the following pivot table:
// count by age
age male female
1-25 18 23
26-100 19 10
To do this, I can do a basic aggregation like this:
SELECT age, gender, count(*) GROUP BY age, gender
And get results such as:
gender age count
male 1-25 18
male 26-100 19
male 1-25 23
male 26-100 10
And then I can 'build the results' outside of BigQuery/SQL (such as in javascript or even pandas) to get the pivoted result.
However, I run into trouble when the values can't be simply "added up" or calculated in some straightforward fashion. For example:
// distinct users by country
country male female
us 192,293 64,000
jp 1,203,203 1,000,000
Is there a way to grab a value in BigQuery so that we can do this calculation outside of BigQuery? I've posted a preceding question to this here where it seems like you cannot use the HLL_COUNT
, as that's not available outside. [Or is there some other type of algorithm that can be applied to build some sort of unique sketch that can be merged outside SQL?].
Or is there another approach to building pivot results (outside of BigQuery) that I'm missing?
Update: The above examples are more to give an overview of the issue. The actual pivot queries would be like this: Get the top patent countries, codes in a BQ public dataset, on both the X and Y axis.
I don't understand. Why not do this inside BigQuery?
SELECT age,
COUNTIF( gender = 'male' ) as males,
COUNTIF( gender = 'female' ) as females
GROUP BY age;