postgresqldefault-valuesqldatatypes

Replacing null values with corresponding default data type values in PostGreSql database


In my application, I have multiple databases in PostGreSql which can contain more than 500 table and each table can have 5 to 20 columns with different data types. The data in the database is imported using external inputs which will not contain all columns as per the schema. Due to this PostGreSql by default inserts null values in the table.

Ex: When I import a json content into PostGreSql which does not have a particular column and its value present inside json. PostGreSql by default inserts a null value.

Sample json content [{"ID":1,"revision":7}]

Now if my table contains one more column "Name", when the above json content is import the "Name" column contains null value for the particular row or any number of rows.

How can I identify these null value and replace them with default values instead dynamically may be using a function or any other better approach?

Ex: The rows of Name column which contain null should be replaced with an empty string OR a boolean type column should have False values instead of null.


Solution

  • The following function checks information_schema.columns for nullable columns (is_nullable = 'YES') in a table. It finds NULL values and sets default values based on the column types. Then, it dynamically updates the table to replace NULLs with those defaults.

    CREATE OR REPLACE FUNCTION replace_nulls_with_defaults(table_name TEXT) RETURNS VOID AS $$
    DECLARE
        col RECORD;  -- To store column metadata
        query TEXT;  -- To store dynamic SQL queries
        default_val TEXT;  -- To determine default value based on column types
    BEGIN
        FOR col IN
            SELECT column_name, data_type
            FROM information_schema.columns
            WHERE table_name = table_name
                AND is_nullable = 'YES' -- Only nullable columns
        LOOP
            -- Determine the default value
            IF col.data_type IN ('character varying', 'text') THEN
                default_val := '''''';  
            ELSIF col.data_type = 'boolean' THEN
                default_val := 'FALSE';  
            ELSIF col.data_type IN ('integer', 'bigint', 'smallint', 'numeric', 'real', 'double precision') THEN
                default_val := '0';  
            ELSIF col.data_type = 'date' THEN
                default_val := 'CURRENT_DATE';  
            ELSIF col.data_type IN ('timestamp without time zone', 'timestamp with time zone') THEN
                default_val := 'CURRENT_TIMESTAMP'; 
            ELSE
                RAISE NOTICE 'Skipping column %, unsupported type: %', col.column_name, col.data_type;
                CONTINUE;
            END IF;
    
            -- Build the dynamic SQL query
            query := FORMAT(
                'UPDATE %I SET %I = %s WHERE %I IS NULL;',
                table_name, col.column_name, default_val, col.column_name
            );
    
            EXECUTE query;
        END LOOP;
    END;
    

    If you run the function SELECT replace_nulls_with_defaults('sample_table'); all suitable NULL values will be replaced with defaults.

    EDIT

    This script will replace nulls in all the tables in the schema public

    DO $$
    DECLARE
        tbl RECORD;
    BEGIN
        FOR tbl IN
            SELECT table_name
            FROM information_schema.tables
            WHERE table_schema = 'public' 
        LOOP
            PERFORM replace_nulls_with_defaults(tbl.table_name);
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;