In my application, I have multiple databases in PostGreSql which can contain more than 500 table and each table can have 5 to 20 columns with different data types. The data in the database is imported using external inputs which will not contain all columns as per the schema. Due to this PostGreSql by default inserts null values in the table.
Ex: When I import a json content into PostGreSql which does not have a particular column and its value present inside json. PostGreSql by default inserts a null value.
Sample json content [{"ID":1,"revision":7}]
Now if my table contains one more column "Name", when the above json content is import the "Name" column contains null value for the particular row or any number of rows.
How can I identify these null value and replace them with default values instead dynamically may be using a function or any other better approach?
Ex: The rows of Name column which contain null should be replaced with an empty string OR a boolean type column should have False values instead of null.
The following function checks information_schema.columns
for nullable columns (is_nullable = 'YES'
) in a table. It finds NULL
values and sets default values based on the column types. Then, it dynamically updates the table to replace NULLs
with those defaults.
CREATE OR REPLACE FUNCTION replace_nulls_with_defaults(table_name TEXT) RETURNS VOID AS $$
DECLARE
col RECORD; -- To store column metadata
query TEXT; -- To store dynamic SQL queries
default_val TEXT; -- To determine default value based on column types
BEGIN
FOR col IN
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = table_name
AND is_nullable = 'YES' -- Only nullable columns
LOOP
-- Determine the default value
IF col.data_type IN ('character varying', 'text') THEN
default_val := '''''';
ELSIF col.data_type = 'boolean' THEN
default_val := 'FALSE';
ELSIF col.data_type IN ('integer', 'bigint', 'smallint', 'numeric', 'real', 'double precision') THEN
default_val := '0';
ELSIF col.data_type = 'date' THEN
default_val := 'CURRENT_DATE';
ELSIF col.data_type IN ('timestamp without time zone', 'timestamp with time zone') THEN
default_val := 'CURRENT_TIMESTAMP';
ELSE
RAISE NOTICE 'Skipping column %, unsupported type: %', col.column_name, col.data_type;
CONTINUE;
END IF;
-- Build the dynamic SQL query
query := FORMAT(
'UPDATE %I SET %I = %s WHERE %I IS NULL;',
table_name, col.column_name, default_val, col.column_name
);
EXECUTE query;
END LOOP;
END;
If you run the function SELECT replace_nulls_with_defaults('sample_table');
all suitable NULL values will be replaced with defaults.
EDIT
This script will replace nulls in all the tables in the schema public
DO $$
DECLARE
tbl RECORD;
BEGIN
FOR tbl IN
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
LOOP
PERFORM replace_nulls_with_defaults(tbl.table_name);
END LOOP;
END;
$$ LANGUAGE plpgsql;