sqlpostgresqlforeign-keysplpgsqlinformation-schema

How do I SELECT all (table, column) referencing the same column as foreign key


(TL;DR at the end)

I am working on merging 2 well sized postgres databases. As there are ID conflicts and many foreign keys I would have enjoyed that UPDATE foo SET bar_id = bar_id + 100000 CASCADE was a thing in SQL so it magically update everything accordingly. Unfortunately, it's not.

So I want to use a LOOP structure (see below) that will simply edit the references everywhere. I want a select query the return all table_name, column_name that references the column I want.

DO
$$
DECLARE
    rec record;
BEGIN
    FOR rec IN 
    (SELECT table_name, column_name FROM /*??*/ WHERE /*??*/)   -- <<< This line    
    LOOP  
    EXECUTE format('UPDATE %I SET %s = %s + 100000 ;',  
        rec.table_name,rec.column_name,rec.column_name); 
    END LOOP;
END;
$$
LANGUAGE plpgsql;

I know already how to get all tables (+column_name) having a specific column_name that I use when the foreign key column share the name with the column it references. Or even if it's a list of column_name I know:

SELECT col.table_name, col.column_name
        FROM information_schema.columns col 
        right join
        information_schema.tables tab 
        ON col.table_name = tab.table_name 
        WHERE column_name = 'foo_id' 
        --  IN ('FOO_ID','BAR_FOO_ID') | or : like '%foo_id' | both works well most of the time
        and tab.table_type = 'BASE TABLE' 

But... I am now with a table PLACES with the place_id column being referenced on at least 60 different constraints (matching LIKE '%place_id'). Then there is columns referencing the place id named otherwise like 'foo_currentplace','foo_storageroom', 'foo_lastrecomposition_place', 'operating_theatre' and so on. In the other hand, there is columns referring 'placetype_id' from placetype table which are LIKE '%place%' and I do NOT want to change the placetype_id, so we can not guess which column is to include or not only from their name.

I know there is the information_schema.table_constraints table, but it does NOT tell the referenced column. If we can have the definition from the constraint name, it could be possible to match :
ILIKE format('%%REFERENCES %s(%s)%%',table_name,column_name)
but the definition isn't part of the table_constraints table either.

(For those wondering, I'm working on Hospital databases related to sterilization services.)

WHAT I WANT / TL;DR

I need a SELECT query (or a function definition ) returning all column of the whole database (schema_name,table_name,column_name) or (table_name,column_name) having a foreign key constraint referencing a specified column (parameter).


Solution

  • OK so I have done it :-)
    The following query returns every column in database referencing FOO_TABLE.foo_column as foreign key:

    SELECT 
    fk.table_schema as schema, --optional in my case, I only use public
    fk.table_name as table,
    substring(fk.constraint_def, 14, position(')' in constraint_def)-14) as column
    FROM
        (SELECT tc.*,
         pg_get_constraintdef(c.oid) as constraint_def
         --,c.oid 
         from pg_constraint c
         left join information_schema.table_constraints tc
         on c.conname = tc.constraint_name
         where tc.constraint_type = 'FOREIGN KEY')
         as fk    
    WHERE constraint_def ILIKE format('%%REFERENCES %s(%s)%%',
                                     'FOO_TABLE','foo_column')  
    ORDER BY table_schema,table_name,3;
    

    Test it there

    I've found information_schema.table_constraints gets most of the information from pg_constraint which includes internal reference OID, and there is a built-in function pg_get_constraintdef() returning the definition of constraint object from it's OID.
    Then, some substring of the definition is enough to extract the column_name AND filter over the referenced column with the (I)LIKE filter I've prepared in my question.

    ----------------- OTHER ANSWER ------------------

    Another acceptable query I've built from improving @Abelisto suggestion :

    SELECT table_name, column_name
    FROM (SELECT table_name, SUBSTR(column_name, 2, LENGTH(column_name)-2) as column_name,
                 referenced_table,SUBSTR(referenced_column, 2, LENGTH(referenced_column)-2) as referenced_column
          FROM(select
                 conrelid::regclass::text as table_name,
                (select array_agg(attname) from pg_attribute where conrelid = attrelid and attnum = any(conkey))::text as column_name,
                 confrelid::regclass::text as referenced_table,
                (select array_agg(attname) from pg_attribute where confrelid = attrelid and attnum = any(confkey))::text as referenced_column
                 from pg_constraint where contype = 'f' 
                 ) b ) a
    WHERE (referenced_table, referenced_column) = ('FOO_TABLE','foo_column');
    

    Test it there

    I don't think performances really matters there, so one should pick regarding what are the side needs. I think my first solution have the advantage to get the constraint definition if you want to alter it (say, for exemple, add a ON UPDATE CASCADE clause), yet the second seems more "compact" for this very purpose of just returning the table.column.