I want to extract the local data flow of a Java method. So far I have this query to extract wherever a variable is accessed, declared, or assigned within the function:
/**
* @name Empty block
* @kind problem
* @problem.severity warning
* @id java/example/empty-block
*/
import java
import semmle.code.java.dataflow.DataFlow
from File fl, LocalVariableDeclExpr ld, VarAccess va, Assignment asn
where
fl.getBaseName() = "Calculator.java"
and
ld.getEnclosingCallable().getName()= "calc"
and va.getEnclosingCallable().getName() = "calc"
and asn.getEnclosingCallable().getName() = "calc"
and ld.getLocation().getFile() = fl
and va.getLocation().getFile() = fl
and asn.getLocation().getFile() = fl
and va.getLocation().getStartLine() = ld.getLocation().getStartLine()
select ld, "\"" + va.getVariable().getName()+"\"" + "->" + "\"" +ld.getVariable().getName()+"\"" + "\n" + "\"" +asn.getDest()+"\"" + "->" + "\"" +asn.getSource()+"\"" + "\n"
The problem is that the query takes a very long time in the select phase.
I am using this repository as a database. The file name is Calculator.java
and this is the method:
public double calc(double x, String input, char opt) {
inText.setFont(inText.getFont().deriveFont(Font.PLAIN));
double y = Double.parseDouble(input);
switch (opt) {
case '+':
return x + y;
case '-':
return x - y;
case '*':
return x * y;
case '/':
return x / y;
case '%':
return x % y;
case '^':
return Math.pow(x, y);
default:
inText.setFont(inText.getFont().deriveFont(Font.PLAIN));
return y;
}
}
Thanks.
Do you want the declaration, assignment, and read of the same variable? The way your query is currently written, it just selects combinations of any variables inside the calc method.
Also, your query might not be very performant because you keep checking the callable name and file. It would probably be more performant (and also easier to read) to introduce a separate variable for that, see the query code further below.
Keep in mind that CodeQL is a database query language, so the intermediate result in your case is a set of tuples (LocalVariableDeclExpr, VarAccess, Assignment). This has two consequences.
If a variable has no Assignment, then the query has no result for that variable (note that initialization of local variables is not considered an Assignment, see GitHub issue).
If there are several accesses and assignments, you get one row for every combination: (Decl, Access1, Assign1), (Decl, Access1, Assign2), (Decl, Access2, Assign1), ...
So it may be more useful to get, for every variable, just its VarAccess (which covers both reading from and writing to the variable), for example:
import java
from Method method, LocalVariableDeclExpr ld, VarAccess va
where
method.getDeclaringType().hasName("Calculator") and
method.hasName("calc") and
ld.getEnclosingCallable() = method and
va.getEnclosingCallable() = method and
// And both belong to the same variable
ld.getVariable() = va.getVariable()
select ld, va
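Since the initialization of a local variable is not an Assignment, the initial value has to be read from the declaration itself. A minimal sketch, assuming the same Calculator class and calc method as in the query above, and using getInit() on LocalVariableDeclExpr:

```ql
import java

from Method method, LocalVariableDeclExpr ld
where
  method.getDeclaringType().hasName("Calculator") and
  method.hasName("calc") and
  ld.getEnclosingCallable() = method
// getInit() has no result for a declaration without an initializer,
// so such declarations simply do not appear in the output.
select ld, ld.getInit()
```

For the calc method from the question, this should list the declaration of y together with the expression Double.parseDouble(input).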
Also note that your query is not related to dataflow. It just finds declarations, assignments, and usages of variables, which do not necessarily have to be in the right (or even any) order. See the documentation for more information about tracking dataflow. Maybe you are also interested in visualizing the dataflow with path queries.
It is also important to consider the difference between dataflow and taint tracking. Dataflow only covers cases where the exact same value flows between variables and calls, whereas taint tracking also covers cases where the value is converted or transformed, for example obtaining a substring from a string (see also the documentation).
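To make the dataflow suggestion concrete, here is a rough sketch of a local dataflow query. It assumes the same Calculator class and calc method as above, and only asks which expressions inside the method a parameter value can reach without being changed. It uses DataFlow::localFlow, DataFlow::parameterNode and DataFlow::exprNode from the standard Java dataflow library:

```ql
import java
import semmle.code.java.dataflow.DataFlow

from Method method, Parameter p, Expr e
where
  method.getDeclaringType().hasName("Calculator") and
  method.hasName("calc") and
  // Restrict both ends of the flow to the method of interest
  p.getCallable() = method and
  e.getEnclosingCallable() = method and
  // Value-preserving flow in zero or more local steps
  DataFlow::localFlow(DataFlow::parameterNode(p), DataFlow::exprNode(e))
select p, e
```

Here the source and the sink are joined by the localFlow predicate, so the result is not a cartesian product of unrelated elements. To also follow converted or transformed values, you could additionally import semmle.code.java.dataflow.TaintTracking and replace DataFlow::localFlow with TaintTracking::localTaint. Which library calls count as taint steps depends on the models shipped with your CodeQL version.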