apache-superset

No data about SELECT queries in Superset in ClickHouse table system.processes


In ClickHouse there is a table system.processes which contains info about all queries running right now. And we can kill any too long query.

This table works great. I can run any long query any type. Then I can see that query in system.processes and kill that query. I checked this functionality in DBeaver. Also I tested that in Superset / SQLLab. All queries running in SQLLab I can see in system.processes except of SELECT queries. I want to see all queries from Superset, not only ins/upd/del but also SELECT queries.

system.processes kill log

How to reproduce the bug

  1. Connect ClickHouse DB to Superset.
  2. Open DBeaver and on current ClickHouse DB create table with a lot of data. Execute SQL code below: create table eso.t2(v String) ENGINE = MergeTree() order by v insert into eso.t2(v) values(generateUUIDv4()) insert into eso.t2(v) SELECT v from eso.t2 -- repeat that row 20+ times to have 1m+ rows in table eso.t2
  3. Open Superset/SQLLab, select UI LIMIT: 1m+ and run long query: SELECT v, generateUUIDv4() as uid from eso.t2 limit 500000
  4. In DBeaver on current ClickHouse DB run query: SELECT query, * FROM system.processes where query not like '%processes%'
  5. There are no rows returned by query. This is the BUG. Exprected result: returned one row from system.processes with query from point 3.

Success case: I run in Superset query: "insert into eso.t2(v) SELECT v from eso.t2" and can see that query in system.processes.


Solution

  • I received workaround for this issue from @den-crane - https://github.com/ClickHouse/ClickHouse/issues/62604#issuecomment-2051992160

    TLDR: add sleep(1) to SELECT query. So, instead this "SELECT * FROM t" I should write this: "SELECT sleep(1),* FROM t"