javasqljsonparsingapache-calcite

Is there a way to get more information on the Identifier Nodes in Apache Calcite before Validation phase?


I was working on parsing sql queries and then turning them into their parse trees, but in calcite, the smallest node is broken down to SqlIdentifier. Which doesn't specify the context of the node.

I want to know if that Identifier node is a table name or a column name etc. But I haven't found any way to do that in the parsing stage, all require me to go ahead with the Validation stage. I just want the parse tree to be more detailed.

I was looking at approaches like backtracking, or maintaining a data structure to figure out the context but I am stuck there as well.

Any help is appreciated, please, thank you!

I have tried backtracking to get more context, even maintaining a special data structure to store nodes and figure out, except for the validator class I have found no inbuilt method. If there is no inbuilt method, is there a data structure approach I can use to figure this out?


Solution

  • In short, no. Suppose your query is

    SELECT empno
    FROM employees AS e,
      departments AS d
    

    In order to resolve that empno column in the SELECT clause, Calcite needs to look at all the tables in the FROM clause, make sure they exist, and gather the names and types of the columns in those tables. That is the validation process.

    After validation, it can resolve empno to e.empno and report that its type is INTEGER NOT NULL, but not before.