I currently have this query, where response
a column defined as varchar(255)
:
SELECT
CAST((NULLIF(REGEXP_REPLACE(response, '[^0-9]+', '', 'g'), ''), '0') AS INTEGER)
FROM
my_table;
The idea is that we only want to pull out those rows where the value of response is numeric. So this strips out non-numeric values, sets to null if the result is 0, and casts to an integer.
This is working fine for most columns, except when the column is a date - this removes all the ':' and '+' etc. and ends up with a number which it tries and fails to parse (because it's too long for an integer).
I'm not sure that this is the best way of handling this case anyway - is there a way to return a number if the column contains a string representing a number, and null otherwise? E.g. try to cast to an integer and return null if it fails or something like that?
In Postgres 16 and higher there's pg_input_is_valid()
.
demo at db<>fiddle
SELECT x::integer, response
FROM my_table
LEFT JOIN LATERAL(VALUES(NULLIF(REGEXP_REPLACE(response, '[^0-9]+', '', 'g'), ''))) AS v(x)
ON pg_input_is_valid(x,'integer');
x | response |
---|---|
1 | 1 |
null | 2025-12-12T12:34:56.123456+02 |
null | abc |
This gets you a null
whenever your stripped response doesn't form a valid integer
input for whatever reason, as per database's understanding of said validity, without you having to know and express those rules.
If they change in the future, this still works without you having to go back to this code and update it to reflect new rules.
In Postgres 15 and earlier, you can easily add that function yourself.