I am new to CodeQL and have started learning about dataflow queries for C/C++ programs. The following is an excerpt of a C program that I want to analyse:
```c
int main(int argc, char * argv[])
{
    unsigned short size, x, y;
    int r1, r2;
    x = atoi(argv[1]); // one dim of the data
    y = atoi(argv[2]); // other dim of the data
    size = x*y; // total size of the data
    r1 = MyVuln(size*sizeof(char));
    r2 = MyVuln(x*sizeof(char));
    ...
    // some code
    ...
    return 0;
}
```
In the above example, I want to detect whether the `MyVuln` function is called with `size` as its argument. `size` is defined by an `AssignExpr` whose RValue is the result of a multiplication. The following is the CodeQL query that I wrote:
```ql
/*
@kind path-problem
*/
import cpp
import semmle.code.cpp.dataflow.new.DataFlow
//import DataFlow::PathGraph
from Function myvuln, FunctionCall fc, AssignExpr ab
where
  myvuln.hasGlobalName("MyVuln")
  and fc.getTarget() = myvuln
  and ab.getLValue().getType().getUnspecifiedType() instanceof IntegralType
  and ab.getRValue() instanceof MulExpr
  and exists (DataFlow::Node src, DataFlow::Node sink |
    src.asExpr() = ab.getLValue()
    and sink.asExpr() = fc.getArgument(0)
    and DataFlow::localFlow(src, sink)
  )
select fc, "MyVuln with Arithmetic arg at " + fc.getLocation().toString()
```
The query returns no results (I am using CodeQL with VS Code). I checked that a smaller partial query detects the expression corresponding to the definition of `size`, and it works. I also checked that the query finds the calls to `MyVuln`, and that works too. Only when I add the dataflow condition do I get no results. This type of query seems pretty straightforward, but I have no clue where I have gone wrong or what I am missing. Any help is highly appreciated.
Thanks
Based on the suggestions from @Marcono1234, the following query worked for the problem described in the question above.
```ql
/*
@kind path-problem
*/
import cpp
import semmle.code.cpp.dataflow.new.DataFlow
import semmle.code.cpp.dataflow.new.TaintTracking
//import DataFlow::PathGraph
from Function myvuln, FunctionCall fc, AssignExpr ab, DataFlow::Node src, DataFlow::Node sink
where
  // the call that I am interested in (its argument is the sink)
  myvuln.hasGlobalName("MyVuln")
  and fc.getTarget() = myvuln
  // the "interesting" expression whose value will flow into the argument of MyVuln
  and ab.getLValue().getType().getUnspecifiedType() instanceof IntegralType
  and ab.getRValue() instanceof MulExpr
  // This was the problem in my earlier query, where I used the LValue as the source.
  // The source must be the expression that computes the value flowing into the
  // argument of MyVuln, i.e. the RValue expression.
  and src.asExpr() = ab.getRValue()
  and sink.asExpr() = fc.getArgument(0)
  and TaintTracking::localTaint(src, sink)
select fc, sink.toString(), "MyVuln with Arithmetic operation at " + fc.getLocation().toString()
```
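When a local-flow condition yields nothing, it can help to list which nodes the source actually reaches. Below is a minimal debugging sketch, assuming the same `MulExpr` source as in the query above; the variable name `reached` is only illustrative. As far as I understand, plain data flow follows only value-preserving steps, whereas taint tracking also follows steps such as arithmetic, so I would expect this to reach the read of `size` but not the whole argument `size*sizeof(char)`.

```ql
import cpp
import semmle.code.cpp.dataflow.new.DataFlow

from AssignExpr ab, DataFlow::Node src, DataFlow::Node reached
where
  ab.getRValue() instanceof MulExpr
  and src.asExpr() = ab.getRValue()
  // localFlow holds for zero or more value-preserving local steps
  and DataFlow::localFlow(src, reached)
select src, reached, reached.getLocation().toString()
```

Running this with quick evaluation in VS Code shows where the flow stops, which makes it easier to decide between `DataFlow::localFlow` and `TaintTracking::localTaint`.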
As I am learning CodeQL, I also wanted to understand different ways of solving the same problem. So I tried it as a plain dataflow problem (rather than taint tracking) by extracting the `size` variable from the expression `size*sizeof(char)` using `getAChild*()`. This requires changes to the above query in two places, as follows:
```ql
  and sink.asExpr() = fc.getArgument(0).getAChild*()
  //and TaintTracking::localTaint(src, sink)
  and DataFlow::localFlow(src, sink)
```
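For completeness, here is a sketch of the whole query with both changes applied. It is simply the taint-tracking query above with the two lines swapped in and the `TaintTracking` import dropped; the wording of the message is my own choice.

```ql
import cpp
import semmle.code.cpp.dataflow.new.DataFlow

from Function myvuln, FunctionCall fc, AssignExpr ab, DataFlow::Node src, DataFlow::Node sink
where
  myvuln.hasGlobalName("MyVuln")
  and fc.getTarget() = myvuln
  and ab.getLValue().getType().getUnspecifiedType() instanceof IntegralType
  and ab.getRValue() instanceof MulExpr
  and src.asExpr() = ab.getRValue()
  // the argument itself or any expression nested inside it,
  // e.g. `size` inside `size*sizeof(char)`
  and sink.asExpr() = fc.getArgument(0).getAChild*()
  and DataFlow::localFlow(src, sink)
select fc, sink.toString(), "MyVuln with arithmetic operand at " + fc.getLocation().toString()
```

The trade-off, as I see it, is that `getAChild*()` matches any sub-expression of the argument, so this version reports a call whenever the multiplication result appears anywhere inside the argument expression, while the taint-tracking version treats the whole argument as the sink.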