I'm using the foundationDB SQL parser (https://github.com/FoundationDB/sql-parser) to parse a query inside Java, but I'm not very familiar with the visitor design pattern, that is used by the parser to consume a query.
I would like to send a query to the parser like this: "SELECT a, b FROM c WHERE d > 5" and get as a result:
Thats the code I'm implementing:
@Override
public QueryDescription parse() throws StandardException {
SQLParser parser = new SQLParser();
StatementNode stmt = parser.parseStatement(sql);
Visitor v = new Visitor() {
@Override
public boolean visitChildrenFirst(Visitable arg0) {
// TODO Auto-generated method stub
return false;
}
@Override
public Visitable visit(Visitable arg0) throws StandardException {
// Temporary stores the QueryDescription parameters
StatementEnum se = null;
String fromTable = null;
String[] fields = null;
if(arg0 instanceof CursorNode) {
CursorNode cn = (CursorNode) arg0;
// print out what statement is been declared in sql query
System.out.println("Statement: " + cn.statementToString());
// temporarly stores the statement
String statement = cn.statementToString();
// creates the right StatementEnum
if(statement == "CREATE TABLE") {
se = StatementEnum.CREATE_TABLE;
} else if(statement == "INSERT") {
se = StatementEnum.INSERT;
} else if(statement == "SELECT") {
se = StatementEnum.SELECT;
} else if(statement == "DROP TABLE") {
se = StatementEnum.DROP_TABLE;
}
}
description = new QueryDescription(se, fromTable, fields);
return arg0;
}
@Override
public boolean stopTraversal() { return false; }
@Override
public boolean skipChildren(Visitable arg0) throws StandardException { return false; }
};
stmt.accept(v);
// TODO remove, only for debug purpose
stmt.treePrint();
return description;
}
And that's the QueryDescription class code:
public class QueryDescription {
/* Member variables: */
private QueryTypeEnum queryType;
private StatementEnum statement;
private String fromTable;
private String[] fields;
/* Constructors: */
/**
*
* @param statement
* @param fromTable
* @param fields
*/
public QueryDescription(StatementEnum statement, String fromTable, String[] fields) {
this.statement = statement;
this.fromTable = fromTable;
this.fields = fields;
}
/* Methods: */
/**
* Analyze which type of query is the one passed by parameter and assigns the right queryTypeEnum
*/
public void assignType() {
switch(statement) {
case CREATE_TABLE:
break;
case SELECT:
if(fields[0] == "allFields")
queryType = QueryTypeEnum.DUMP;
else {
// TODO risolvere questione del WHERE
queryType = QueryTypeEnum.SELECT_FROM;
}
break;
case UPDATE:
break;
case INSERT:
break;
case DROP_TABLE:
break;
}
}
/* Getters and Setter: */
/**
*
* @return the queryType
*/
public QueryTypeEnum getQueryType() {
return queryType;
}
/**
*
* @return the statement
*/
public StatementEnum getStatement() {
return statement;
}
/**
*
* @return the from table
*/
public String getFromTable() {
return fromTable;
}
/**
*
* @return the fields
*/
public String[] getFields() {
return fields;
}
}
Your code does not show what the QueryDescription
class does, but I can guess.
In dealing with the where clause, you are looking for three types of nodes:
BinaryLogicalOperatorNode
- This has the AND, OR, IS operators that separate the individual clauses in the WHERE clause.BinaryOperatorNode
- This has the individual >, <, and other operations. In your example, you would visit a BinaryOperatorNode
, with a type of LT
, and the two children would be ConstantNode
with a value of 5, and a ColumnReference
node with the value of "d".
Note: the parser does not connect the Column references to the tables in the tables list. This is a separate step in the query processing. So you will have a ColumnReference
node, but no link to which table the column is referencing. The reason is the parser does not have enough information to correctly link the "d" column to the correct table.
This is enough to process the simple query you gave in the example. Obviously queries can become much more complex.
The one node I would add to the list of checks in the InListOperatorNode
which handles WHERE d IN (1,2,3,4)
.
EDIT to add:
Keep in mind the Vistor#visit()
method gets called for each and every node in the tree created by the query parser. The visit method will need to check, and set your variables properly, for FromBaseTable
, ConstantNode
, ColumnReference
.
Visitor v = new Visitor() {
List<String> fromTable = new ArrayList<String>();
List<String> fields = new ArrayList<String>();
// other Visitor methods go here, not copied for conciseness.
@Override
public Visitable visit(Visitable arg0) throws StandardException {
// other code from your visit() method goes here
//
if (arg0 instanceof FromBaseTable) {
FromBaseTable table = (FromBaseTable)arg0;
fromTable.append(table.getTableName());
} else if (arg0 instanceof ColumnReference) {
ColumnReference column = (ColumnReference) arg0;
fields.append(column.getColumnName())
}
// Remove the call to create QueryDescription
}
public QueryDescription getQueryDescription() {
return new QueryDescription(se, fromTable, fields)
}
}
Then in your main line of code you call:
stmt.accept(v);
QueryDescription description = v.getQueryDescription();
Now if you have other parts of the query you are interested in, you need to add those nodes to the visit method, and capture the part of the node (names, values, etc) you are interested in.