Need to fetch tablename from input using sql query.I know this can be done through regex but not sure how hence any help appreciated Input:
Field1 | field2 |
---|---|
a | INSERT INTO test.table2{ name, age, city, id}SELECT name, age, city, id FROM table 1 |
b | select from test1.table3 |
c | select *from test2.table4 |
d | select *from test2.table4(10) |
Output:
Field1 | field2 |
---|---|
a | test.table2 |
b | test1.table3 |
c | test2.table4 |
It will be extremely difficult to achieve with just a regular expression. If you really need to do that reliably, you can check out how pgpool-II
re-used PostgreSQL's own parser for their needs - that's more of a C++ intensive task, rather than a SQL one.
Here are some more examples at the wiki, some of which you can use directly. You could even get Python sqlparse
to process the query text, then .get_identifiers()
, all inside the db, in a PL/Python function.
If you only wish to cover some very basic cases, you can try to target tokens following certain keywords:
select field1
,unnest(
regexp_matches(
field2
,'(?:UPDATE|INTO|FROM|JOIN|USING|TABLE)(?:\s+ONLY)*\s+([[:alpha:]_]+[[:word:]]*|"[^"]+")'
,'gi'
)
) AS spotted_table
from table1;
field1 | spotted_table |
---|---|
a | table2 |
a | table1 |
b | table3 |
c | table4 |
?:
in (?:something)
makes the parentheses non-capturing, which is why it only reports matches from the last group that doesn't have the ?:
g
in 'gi'
makes this report all matches, and i
disables case-sensitivity$
oddity:
SQL identifiers and key words must begin with a letter (
a-z
, but also letters with diacritical marks and non-Latin letters) or an underscore (_
). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9
), or dollar signs ($
)
That list of keywords is by no means exhaustive, and while it covers your examples, this regular expression is extremely brittle. See some obvious blind spots in this demo.
It gets more and more tricky if you consider
"select from table7 join TABLE8 on waitaminute;"
(this whole thing is the name of the table, not just the table7
part): demo//
and in-line/multi-line /*...*/
commentsIf the queries you're analysing operate on the same database you keep them in, you can correlate that with information_schema.tables
. Note that there are 9 different things that can be in that same spot in the body of a query, and that's not counting the fact that table is just one form of one of those, a relation. You can also select
from a view
, materialized view
, foreign table
, partitioned table
, all of which act like a table
only to some extent. Some views are updatable, which means you can also insert
/merge
/update
/delete
from them as if you interacted with their underlying table directly.
If you're trying to track effective interactions, you'd have to also track the rule
and trigger
systems, view and matview definitions as well as routine
bodies and dependencies.