sql google-bigquery statistics missing-data calculation

Is there a simplified SQL query to return the number and percentage of missing values of a table ? (BigQuery)

The BigQuery earthquake public dataset has 47 columns most of each have missing values. I need an output showing a summary with column_name, total_entries, non_missing_entries & percentage_missing as the columns of that table.

Currently I am using the query shown here, repeating the blocks for all the 47 columns:

SELECT
    'id' AS column_name,
    COUNT(id) AS non_missing_entries,
    (COUNT(*) - COUNT(id)) * 100.0 / COUNT(*) AS percentage_missing
FROM
    `youtube-factcheck.earthquake_analysis.earthquakes_copy`

UNION ALL

SELECT
    'flag_tsunami' AS column_name,
    COUNT(flag_tsunami) AS non_missing_entries,
    (COUNT(*) - COUNT(flag_tsunami)) * 100.0 / COUNT(*) AS percentage_missing
FROM
    `youtube-factcheck.earthquake_analysis.earthquakes_copy`

UNION ALL

-- Repeat the above block for other columns
-- ...

Output:

| column_name|non_missing_entries | percentage_missing|
| -----------| -------------------| ------------------|
|flag_tsunami|                1869|  70.20564323290291|
|          id|                6273|                  0|
|         ...|                 ...|                ...|

Is there a SQL to avoid the long tedious work of writing 47 blocks of the same query?

Solution

UNPIVOT is your friend. (note I had to change the source as I don't have access to bigquery-public-data.noaa_significant_earthquakes.earthquakes)

with cte as (
  select column_name,
        count(*) as non_missing_entries
  from (
      select cast(id as string) as id,
             flag_tsunami,
             cast(year as string) as year,
             cast(month as string) as month,
             cast(day as string) as day,
             cast(hour as string) as hour,
             cast(minute as string) as minute,
             cast(second as string) as second
      from `bigquery-public-data.noaa_significant_earthquakes.earthquakes`)
      unpivot ( value for column_name in (id, flag_tsunami,year,month,day,hour,minute,second))
      group by column_name
),
id_only as (
  select column_name,non_missing_entries
  from cte
  where column_name = 'id'
)
select cte.column_name,
      cte.non_missing_entries,
      (id_only.non_missing_entries - cte.non_missing_entries) * 100.0 / id_only.non_missing_entries as percentage_missing
from cte
cross join id_only;

it returns this:

You'll have to:

fill-in all 47 columns in UNPIVOT operator
include all 47 columns in the inner SELECT which casts the integers as strings

but I think that's better than UNIONing 47 times.