sqlpostgresqlparallel-processingexecutiondblink

PostgreSQL Multiple Queries in one statement can be done parallel


I have created a application which is sending queries using DBLink from many tenant-databases to a single group-database.

An example of a query is the following:

INSERT INTO public.<tableA> (<columns>) SELECT <columns> FROM dblink(<connection>, <statement>) AS rt(<output definition>)
INSERT INTO public.<tableB> (<columns>) SELECT <columns> FROM dblink(<connection>, <statement>) AS rt(<output definition>)
INSERT INTO public.<tableC> (<columns>) SELECT <columns> FROM dblink(<connection>, <statement>) AS rt(<output definition>)
INSERT INTO public.<tableD> (<columns>) SELECT <columns> FROM dblink(<connection>, <statement>) AS rt(<output definition>)

Questions:

  1. Is this statement run in parallel these four queries?
  2. How can I check how PostgreSQL is handling the plan? (PGAdmin is not showing complete plan these four queries together)
  3. If not executing in parallel is there any way to force parallel execution in individual queries in the same statement?

Solution

  • No, what you show is not run in parallel between the queries. You could submit the queries with dblink_send_query over different dblink connections and then juggle the results, but that is tedious and error prone and wouldn't do the insert part in parallel anyway.

    I don't know of a way to get the plan passed back from the remote side and displayed on the local side, other than just adding "EXPLAIN..." into the query text (in which case you will not get the data, only the plan). The best method might be to configure auto_explain on the remote side, then troll through the remote logs. That is what I do.

    No, you can't force parallel execution in any meaningful way. You can tweak parallel_tuple_cost and things like that on the remote side to encourage it, same as if you were not using dblink and instead were just connected directly.