arrayspostgresqlsql-execution-planexplainpg-catalog

Unnest (pivot) arrays in Postgres pg_stats view


I would like to pivot the arrays in the cells inside the Postgres view pg_stats. The reason for this is to better understand what is going on with the table EXPLAIN ANALYZE commands.

I have this SQL query that I am modifying without success in order to achieve the pivot:

SELECT 
    -- ps.schemaname,
    -- ps.tablename,
    ps.attname as "column_name",
    ps.null_frac,
    ps.avg_width,
    ps.n_distinct,
    ps.most_common_vals,
    ps.most_common_freqs,
    ps.histogram_bounds,
    ps.correlation -- , -- I think this means how close the stats are to the "real rows in the entire table"
    -- unnest(ps.most_common_vals, ps.histogram_bounds)
FROM pg_stats AS ps
CROSS JOIN LATERAL unnest(ps.most_common_vals, ps.histogram_bounds)
WHERE 
        ps.tablename = 'my_table_name'
    AND ps.attname IN ('my_important_column_1', 'my_important_column_2');

However I hit the wall on all sorts of error messages due to the function unnest(...) (see https://www.postgresql.org/docs/9.4/functions-array.html):

SQL Error [42883]: ERROR: function unnest(anyarray, anyarray) does not exist
  Hint: No function matches the given name and argument types. You might need to add explicit type casts.
  Position: 325
SQL Error [42804]: ERROR: cannot determine element type of "anyarray" argument

On the types issues, I am interested in these columns of various types "array" (see https://www.postgresql.org/docs/current/view-pg-stats.html):

If I use all of these columns inside unnest(...) then because types are different (basically because of float4[] I think), I see this error:

SQL Error [42883]: ERROR: function unnest(anyarray, real[], anyarray) does not exist
  Hint: No function matches the given name and argument types. You might need to add explicit type casts.
  Position: 387

Or

SQL Error [42804]: ERROR: cannot determine element type of "anyarray" argument

depending on where I put unnest (as column, or as filter).

I am a bit lost with the Postgres syntax here.

EDIT:

I could do stuff like this in the SELECT: array_to_string(ps.most_common_vals, E'\n') as most_common_vals_lf,

But this leads to "fat cells" that don't render very well in a visual IDE e.g. DBeaver because the line feeds don't expand very well and all the output rows hide most of the cell value just because they don't expand. I really would like to pivot the view and multiply the common columns by the number N of elements in the arrays and get back all these as N rows.


Solution

  • Turns out the issue was around casting the types of those arrays. This works when making sure the arrays types are ::text::text[]:

    SELECT 
        -- ps.schemaname,
        -- ps.tablename,
        ps.attname as "column_name",
        mt.mcv AS "most_common_vals",
        mt.mcf AS "most_common_freqs",
        ps.avg_width,
        ps.n_distinct,
        ps.correlation,
        ps.null_frac -- fraction of column entries that are null
    FROM pg_stats AS ps
    -- for scalar type columns:
    CROSS JOIN LATERAL unnest(
        ps.most_common_vals::text::text[], 
        ps.most_common_freqs::text::text[]
    ) AS mt(mcv, mcf)
    WHERE
            ps.tablename = 'my_table_name'
        AND ps.attname IN ('my_important_column_1', 'my_important_column_2')
        AND mt.mcv IS NOT NULL
        AND mt.mcf IS NOT NULL
    ORDER BY ps.attname, mt.mcf DESC;
    

    The output is the expected pivot of the original view pg_stats with one row for each element of those arrays.

    One gotcha on ps_stats: the pairs of most_common_vals and most_common_freqs can be treated as a map. So in this case it makes sense to pivot the table with both pivoted columns.

    On the other hand the histogram column histogram_bounds is independent from the above. It just shows the 10 bands to go from 0% to 100% in steps of 10%. It doesn't make sense to mix this column with the 2 columns above. This histogram column very much depend on the data we are dealing with, sometimes it doesn't even make sense to take this information into consideration at all.