postgresql citus

Querying worker nodes directly, bypassing the coordinator, in Citus


Existing documentation for Citus 11 explicitly states that external clients should interact with the Citus cluster through the coordinator node, which is supposed to route requests among the workers.

However, if I create a cluster using docker-compose and then create the distributed tables as described in the article, I am able to query any data from any node. For example, executing select * from public.github_events limit 100 from a worker node works perfectly fine.

Does anyone know what the practical implications of working only through the coordinator are? I doubt that such "distributed" execution works "just because". Most likely someone put some effort into making it work the way it does.

There is no place in the documentation claiming "you must not use workers for sending SQL requests", so I wonder what the real limitations of using them as client-facing nodes are.

Thank you in advance!


Solution

  • Running queries from any node has been supported since Citus 11.0, which was released ~6 months ago. Some of the documentation has probably just not been updated yet to reflect that big change. The docs repo is on GitHub, and you can suggest changes or file issues there.

    However, there is still one important difference between workers and the coordinator: DDL statements, such as CREATE/ALTER TABLE, can only be run from the coordinator. This restriction might be lifted in the future, though.
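    As a rough sketch of that difference, assuming a Citus 11 cluster in which public.github_events is already distributed (the column name "note" below is purely hypothetical and used only for illustration), a session connected to a worker would be expected to behave roughly like this:

    ```sql
    -- Connected to a WORKER node (Citus 11.0 or later).

    -- Reads on a distributed table are routed across the cluster,
    -- just as they would be from the coordinator.
    SELECT * FROM public.github_events LIMIT 100;

    -- DDL is the exception: per the answer above, this statement is
    -- expected to be rejected on a worker and has to be run on the
    -- coordinator instead. "note" is a hypothetical column name.
    ALTER TABLE public.github_events ADD COLUMN note text;
    ```

    Running the same ALTER TABLE from the coordinator should succeed there. The exact error text shown on the worker may vary between Citus versions, so none is quoted here.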