sqlpostgresql

Query to set a value to null if it contains non-numeric characters


I currently have this query, where response a column defined as varchar(255):

SELECT 
    CAST((NULLIF(REGEXP_REPLACE(response, '[^0-9]+', '', 'g'), ''), '0') AS INTEGER) 
FROM
    my_table;

The idea is that we only want to pull out those rows where the value of response is numeric. So this strips out non-numeric values, sets to null if the result is 0, and casts to an integer.

This is working fine for most columns, except when the column is a date - this removes all the ':' and '+' etc. and ends up with a number which it tries and fails to parse (because it's too long for an integer).

I'm not sure that this is the best way of handling this case anyway - is there a way to return a number if the column contains a string representing a number, and null otherwise? E.g. try to cast to an integer and return null if it fails or something like that?


Solution

  • In Postgres 16 and higher there's pg_input_is_valid().
    demo at db<>fiddle

    SELECT x::integer, response
    FROM my_table
    LEFT JOIN LATERAL(VALUES(NULLIF(REGEXP_REPLACE(response, '[^0-9]+', '', 'g'), ''))) AS v(x)
    ON pg_input_is_valid(x,'integer');
    
    x response
    1 1
    null 2025-12-12T12:34:56.123456+02
    null abc

    This gets you a null whenever your stripped response doesn't form a valid integer input for whatever reason, as per database's understanding of said validity, without you having to know and express those rules.

    If they change in the future, this still works without you having to go back to this code and update it to reflect new rules.

    In Postgres 15 and earlier, you can easily add that function yourself.