We are allowed to use Semgrep as a static code analysis tool. I'm trying to write a Java rule that checks whether a local variable has exactly the same name as a field of the same class.
I've tried different approaches, but each one failed. Now I wonder whether it's possible at all.
Example Java code that should generate findings (I know it does not cover all cases but I would like to start with something simple and then increase the complexity):
public class ShadowingExample {
    // This is a class field.
    private String name = "Class Field";

    public void example() {
        String name = "Local Variable"; // The local variable 'name' shadows the field.
        System.out.println("The class field is: " + this.name);
        System.out.println("The local variable is: " + name);
    }

    public void anotherExample() {
        System.out.println("Do something else first");
        String name = "Local Variable"; // The local variable 'name' also shadows the field.
        System.out.println("The class field is: " + this.name);
        System.out.println("The local variable is: " + name);
    }
}
The first approach that I thought might work (I include only the patterns block because the rest does not matter here):
patterns:
  - pattern: |
      class $CLASS_NAME {
        ...
        $FIELD_TYPE $SHADOW_NAME;
        ...
        $RETURN_TYPE $METHOD_NAME(...) {
          ...
          $VARL_TYPE $SHADOW_NAME = ...;
          ...
        }
      }
I thought it might work because it uses the same metavariable, $SHADOW_NAME, for both the field name and the variable name. However, it shows 0 matches for the above Java code.
The second approach I tried was similar:
patterns:
  - pattern-inside: |
      class $CLASS_NAME {
        $FIELD_TYPE $SHADOW_NAME;
        ...
      }
  - pattern: |
      $RET $M(...) {
        ...
        $VAR_TYPE $SHADOW_NAME = ...;
        ...
      }
Each of these two patterns produces findings when used on its own, but the patterns block, which joins them with a logical AND by default, does not generate any findings.
I've also tried using metavariable-comparison, but it failed as well:
patterns:
  - pattern: |
      class $CLASS_NAME {
        ...
        $FIELD_TYPE $FIELD_NAME = ... ;
        ...
      }
  - pattern: |
      $RETURN_TYPE $METHOD_NAME(...) {
        ...
        $VAR_TYPE $VAR_NAME = ...;
        ...
      }
  - metavariable-comparison:
      comparison: $VAR_NAME == $FIELD_NAME
I expected this comparison to check the names bound to these two metavariables. The first two patterns generate matches on their own, but once the comparison is added, the rule generates 0 findings.
To check if two metavariables are bound to symbols with the same name,
use the str function within a metavariable-comparison and check for
equality:
- metavariable-comparison:
    comparison: str($FIELDNAME) == str($VARNAME)
Here is a complete rule that matches the example in your question:
rules:
  - id: var_shadows_field_rule
    languages:
      - java
    severity: ERROR
    message: Field "$FIELDNAME" shadowed by variable "$VARNAME".
    patterns:
      - pattern: |
          class $NAME {
            ...
            $FIELDTYPE $FIELDNAME = ...;
            ...
            $RETTYPE $METHOD(...)
            {
              ...
              $VARTYPE $VARNAME = ...;
              ...
            }
          }
      - metavariable-comparison:
          comparison: str($FIELDNAME) == str($VARNAME)
This rule can be tested directly on the Semgrep playground (click on "advanced" to be able to enter the rule as YAML).
Caveat: The rule above requires that both the field and the variable have an initializer, which is not ideal. As far as I can tell, you will have to write four different patterns to cover all four cases of field and variable being initialized or not.
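To make the four cases concrete, here is a small Java test fixture you could scan while extending the rule. It is only a sketch: the class name ShadowingCases and all field and method names are made up for illustration, and I have not run Semgrep against it. If the caveat holds, I would expect the rule as written to report only the first method, where both the field and the variable have an initializer.

```java
// Hypothetical test fixture (names invented for illustration): one
// field/variable pair for each combination of "has an initializer" or not.
class ShadowingCases {
    private String both = "field";      // field with initializer
    private String fieldOnly = "field"; // field with initializer
    private String varOnly;             // field without initializer
    private String neither;             // field without initializer

    // Case 1: field initialized, variable initialized.
    String bothInitialized() {
        String both = "local";
        return both + "/" + this.both;
    }

    // Case 2: field initialized, variable declared without initializer.
    String fieldInitializedOnly() {
        String fieldOnly;
        fieldOnly = "local";
        return fieldOnly + "/" + this.fieldOnly;
    }

    // Case 3: field not initialized, variable initialized.
    String variableInitializedOnly() {
        String varOnly = "local";
        return varOnly + "/" + this.varOnly;
    }

    // Case 4: neither declaration has an initializer.
    String neitherInitialized() {
        String neither;
        neither = "local";
        return neither + "/" + this.neither;
    }

    public static void main(String[] args) {
        ShadowingCases cases = new ShadowingCases();
        System.out.println(cases.bothInitialized());
        System.out.println(cases.fieldInitializedOnly());
        System.out.println(cases.variableInitializedOnly());
        System.out.println(cases.neitherInitialized());
    }
}
```

All four methods shadow a field, so a complete rule should report each of them; the uninitialized fields simply print as null at run time.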
The reason you need the str function in metavariable-comparison is
that, without that, Semgrep tries to get the value of the field or
variable, as if it was a constant, and then compares the values, not
the variable names.
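A hypothetical fixture that separates the two notions of equality might look like the following. The class NameVersusValue and its members are invented for illustration, and I have not verified how Semgrep's constant handling treats them. The point is only that a comparison of values and a comparison of names would disagree on these two methods: the first has different names with identical constant values, the second has the same name with different values.

```java
// Hypothetical fixture (names invented for illustration) contrasting
// "same value" with "same name".
class NameVersusValue {
    private String greeting = "hello";
    private String name = "field value";

    // Different names, identical constant values: no shadowing here,
    // but the two declarations would be equal if compared by value.
    String sameValueDifferentName() {
        String message = "hello";
        return message + "/" + this.greeting;
    }

    // Same name, different values: this is the actual shadowing case,
    // and the one a name-based comparison such as str(...) targets.
    String sameNameDifferentValue() {
        String name = "local value";
        return name + "/" + this.name;
    }

    public static void main(String[] args) {
        NameVersusValue fixture = new NameVersusValue();
        System.out.println(fixture.sameValueDifferentName());
        System.out.println(fixture.sameNameDifferentValue());
    }
}
```

A rule that reports sameNameDifferentValue but not sameValueDifferentName is comparing names, which is the behavior wanted for a shadowing check.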
Uncertainty: Based on the examples in the
metavariable-comparison documentation,
I would have thought that a metavariable key is needed inside
metavariable-comparison, but it evidently is not; and if it were, I
don't know how one would specify multiple metavariables--here we are
comparing two.
Conjecture: I think the reason repeating the same metavariable does not work here is that a single metavariable can only be bound to syntax of the same "kind", and Semgrep therefore rejects binding one to both a field and a variable since it sees them as different kinds.