apache-calcite

Calcite does not return correct SqlKind


I'm trying to use Apache Calcite to parse a SQL statement, and I find that it does not give the correct SqlKind for functions like 'avg', 'sum', etc.

Here is the code snippet:

public void test() throws SqlParseException {
    String sql = "select avg(age) from foobar";
    SqlParser parser = SqlParser.create(sql);
    SqlNode root = parser.parseQuery(); 
    SqlSelect ss = (SqlSelect) root;
    SqlNodeList snl = ss.getSelectList();
    SqlBasicCall sbc = (SqlBasicCall) snl.get(0);
    System.out.println(sbc.getOperator().kind); // OTHER_FUNCTION
}

I expected it to return 'SqlKind.AVG', but it returns 'SqlKind.OTHER_FUNCTION'.

Am I doing something wrong here?


Solution

  • If you evaluate sbc.getOperator(), the result is a SqlUnresolvedFunction, which is why its kind is OTHER_FUNCTION. The parser gives you an unresolved AST. To resolve it, you need to run it through the validator. The validator derives types and looks up operators in an operator table.

    Splitting parsing and validation into separate steps is a feature, not a bug. When we built Calcite, we made an intentional design choice that the parser would just parse, not attempt any semantic analysis. This makes the parser simpler, quicker and more predictable.

    I have extended your example to create a validator and validate the AST:

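      import org.apache.calcite.prepare.CalciteCatalogReader;
      import org.apache.calcite.rel.type.RelDataType;
      import org.apache.calcite.rel.type.RelDataTypeSystem;
      import org.apache.calcite.sql.SqlBasicCall;
      import org.apache.calcite.sql.SqlNode;
      import org.apache.calcite.sql.SqlNodeList;
      import org.apache.calcite.sql.SqlOperatorTable;
      import org.apache.calcite.sql.SqlSelect;
      import org.apache.calcite.sql.fun.SqlStdOperatorTable;
      import org.apache.calcite.sql.parser.SqlParseException;
      import org.apache.calcite.sql.parser.SqlParser;
      import org.apache.calcite.sql.type.SqlTypeFactoryImpl;
      import org.apache.calcite.sql.type.SqlTypeName;
      import org.apache.calcite.sql.validate.SqlValidator;
      import org.apache.calcite.sql.validate.SqlValidatorUtil;

      public static void test() throws SqlParseException {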
        String sql = "select avg(age) from foobar";
        SqlParser parser = SqlParser.create(sql);
        SqlNode root = parser.parseQuery();
    
        SqlSelect ss = (SqlSelect) root;
        SqlNodeList snl = ss.getSelectList();
        SqlBasicCall sbc = (SqlBasicCall) snl.get(0);
        System.out.println(sbc.getOperator().kind); // prints "OTHER_FUNCTION"
    
        SqlTypeFactoryImpl typeFactory =
            new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
        RelDataType rowType = typeFactory.builder()
            .add("age", SqlTypeName.INTEGER)
            .build();
        CalciteCatalogReader catalogReader =
            SqlValidatorUtil.createSingleTableCatalogReader(false,
                "foobar", typeFactory, rowType);
        SqlOperatorTable operatorTable =
            SqlStdOperatorTable.instance();
        SqlValidator.Config config = SqlValidator.Config.DEFAULT;
        SqlValidator validator =
            SqlValidatorUtil.newValidator(operatorTable,
                catalogReader, typeFactory, config);
        SqlNode validatedRoot = validator.validate(root);
    
        ss = (SqlSelect) validatedRoot;
        snl = ss.getSelectList();
        sbc = (SqlBasicCall) snl.get(0);
        System.out.println(sbc.getOperator().kind); // prints "AVG"
        System.out.println(validator.getValidatedNodeType(ss)); // prints "RecordType(INTEGER EXPR$0)"
      }
    
    

    As you can see, the output from the validator (validatedRoot) is also a tree of SqlNode objects, but the operator inside sbc has been reassigned and is now an instance of SqlAvgAggFunction.

    The last line shows how you can also get the type of a SqlNode by calling validator.getValidatedNodeType. The validator only keeps the types of expressions it will need later, so this call works for some AST nodes but not all.
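
The split between parsing and validation described above can be pictured with a small toy model. This is only a sketch and is not Calcite code: every type in it (ToyKind, ToyOperator, ToyCall, ToyParser, ToyValidator, ToyDemo) is hypothetical and exists purely for illustration. The toy parser records the function name without knowing what it means, and the toy validator later looks that name up in an operator table and swaps in the resolved operator.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Toy model only: none of these types are Calcite classes.
class ToyDemo {
    public static void main(String[] args) {
        ToyCall call = ToyParser.parse("avg(age)");
        System.out.println(call.operator.kind); // prints "OTHER_FUNCTION"

        new ToyValidator().validate(call);
        System.out.println(call.operator.kind); // prints "AVG"
    }
}

enum ToyKind { OTHER_FUNCTION, AVG, SUM }

final class ToyOperator {
    final String name;
    final ToyKind kind;

    ToyOperator(String name, ToyKind kind) {
        this.name = name;
        this.kind = kind;
    }
}

final class ToyCall {
    ToyOperator operator; // mutable so the validator can re-assign it
    final String argument;

    ToyCall(ToyOperator operator, String argument) {
        this.operator = operator;
        this.argument = argument;
    }
}

final class ToyParser {
    // Purely syntactic: accepts "name(arg)" and never consults an operator table.
    static ToyCall parse(String text) {
        int open = text.indexOf('(');
        int close = text.lastIndexOf(')');
        if (open <= 0 || close < open) {
            throw new IllegalArgumentException("Expected name(arg): " + text);
        }
        String name = text.substring(0, open).trim();
        String argument = text.substring(open + 1, close).trim();
        // Every function starts out unresolved, whatever its name is.
        return new ToyCall(new ToyOperator(name, ToyKind.OTHER_FUNCTION), argument);
    }
}

final class ToyValidator {
    private final Map<String, ToyOperator> operatorTable = new HashMap<>();

    ToyValidator() {
        operatorTable.put("AVG", new ToyOperator("AVG", ToyKind.AVG));
        operatorTable.put("SUM", new ToyOperator("SUM", ToyKind.SUM));
    }

    // Semantic step: look the name up and replace the unresolved operator.
    ToyCall validate(ToyCall call) {
        String key = call.operator.name.toUpperCase(Locale.ROOT);
        ToyOperator resolved = operatorTable.get(key);
        if (resolved == null) {
            throw new IllegalArgumentException("Unknown function: " + call.operator.name);
        }
        call.operator = resolved;
        return call;
    }
}
```

In this toy, a name that is missing from the table fails only in validate, never in parse, which mirrors the idea that the parser stays simple and all semantic checks live in the validator.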