postgresqlpostgresql-9.1postgresql-9.3postgresql-9.5

Fetch table names using regexp


Need to fetch tablename from input using sql query.I know this can be done through regex but not sure how hence any help appreciated Input:

Field1 field2
a INSERT INTO test.table2{ name, age, city, id}SELECT name, age, city, id FROM table 1
b select from test1.table3
c select *from test2.table4
d select *from test2.table4(10)

Output:

Field1 field2
a test.table2
b test1.table3
c test2.table4

Solution

  • It will be extremely difficult to achieve with just a regular expression. If you really need to do that reliably, you can check out how pgpool-II re-used PostgreSQL's own parser for their needs - that's more of a C++ intensive task, rather than a SQL one.

    Here are some more examples at the wiki, some of which you can use directly. You could even get Python sqlparse to process the query text, then .get_identifiers(), all inside the db, in a PL/Python function.

    If you only wish to cover some very basic cases, you can try to target tokens following certain keywords:

    select field1
      ,unnest(
          regexp_matches(
             field2
            ,'(?:UPDATE|INTO|FROM|JOIN|USING|TABLE)(?:\s+ONLY)*\s+([[:alpha:]_]+[[:word:]]*|"[^"]+")'
            ,'gi'
          )
       ) AS spotted_table
    from table1;
    
    field1 spotted_table
    a table2
    a table1
    b table3
    c table4

    That list of keywords is by no means exhaustive, and while it covers your examples, this regular expression is extremely brittle. See some obvious blind spots in this demo.

    It gets more and more tricky if you consider

    If the queries you're analysing operate on the same database you keep them in, you can correlate that with information_schema.tables. Note that there are 9 different things that can be in that same spot in the body of a query, and that's not counting the fact that table is just one form of one of those, a relation. You can also select from a view, materialized view, foreign table, partitioned table, all of which act like a table only to some extent. Some views are updatable, which means you can also insert/merge/update/delete from them as if you interacted with their underlying table directly.

    If you're trying to track effective interactions, you'd have to also track the rule and trigger systems, view and matview definitions as well as routine bodies and dependencies.