regexpostgresqlfunctionnullplpgsql

Count not-null columns in row


I'm trying to determine the number of specific columns that are populated, here's what I'm trying to do:

  foo := COALESCE($1.col1, '') || ' ' ||
    COALESCE($1.col2, '') || ' ' ||
    COALESCE($1.col3, '') || ' ' ||
    COALESCE($1.col4, '');
  foo := REPLACE(foo, var, ' ');
  words := string_to_array(foo, ' ');
  RETURN array_length(words, 1);

where var is white space, \s doesn't seem to work. I'm open to any other methods of finding the number of non-nil columns.


Solution

  • You can do this without spelling out all columns - or even knowing about them - with JSON functions in Postgres 9.3 or later:

    SELECT t.*, count(value)::int AS notnull_ct   -- cast to int is optional
    FROM   tbl t, json_each_text(row_to_json(t))  -- implicit LATERAL join
    -- WHERE key LIKE 'col%' -- optionally consider only selected columns
    GROUP  BY tbl_id;        -- PK column
    

    json_each_text() returns (key, value) by default. Use different aliases and / or table-qualify names in case of naming conflicts. If you are only interested in selected columns, you can filter column names in a WHERE clause.

    Or use the additional module hstore for the same purpose, available at least since Postgres 8.3:

    SELECT t.*, count(v)::int AS notnull_ct
    FROM   tbl t, svals(hstore(t)) v
    GROUP  BY tbl_id;
    

    The main feature is that count() does not count NULL values (and never returns NULL either). Exactly what you need.

    You can encapsulate it in a function. A simple SQL function with a polymorphic input type does the job:

    CREATE OR REPLACE FUNCTION f_count_notnull_in_row(ANYELEMENT)
      RETURNS int LANGUAGE sql IMMUTABLE AS
    'SELECT count(value)::int
     FROM   json_each_text(row_to_json($1))';
    

    Call:

    SELECT *, f_count_notnull_in_row(t)
    FROM  tbl t;
    

    SQL Fiddle (reusing Bill's setup).