Recently I have encountered the following hashcode "equality" scenario in a Java codebase using Apache Commons Lang 3, and was surprised that I could not find much information on how to handle what seems like it could be a common problem:
MyObject one = new MyObject();
one.setFoo("foo");
one.setBar(null);
MyObject two = new MyObject();
two.setFoo("foo");
two.setBar((short) 0);
int oneHash = HashCodeBuilder.reflectionHashCode(one);
int twoHash = HashCodeBuilder.reflectionHashCode(two);
System.out.println("oneHash: " + oneHash);
System.out.println("twoHash: " + twoHash);
System.out.println("Bar equality: " + Objects.equals(one.getBar(), two.getBar()));
The preceding code produces the following output, which shows that both objects have the same hash code despite being unequal:
oneHash: 3781511
twoHash: 3781511
Bar equality: false
MyObject definition:
public class MyObject {
    private String foo;
    private Short bar;

    public String getFoo() {
        return foo;
    }

    public void setFoo(String foo) {
        this.foo = foo;
    }

    public Short getBar() {
        return bar;
    }

    public void setBar(Short bar) {
        this.bar = bar;
    }
}
While I could maybe understand a null Numeric and 0 Numeric having the same hash in a purely mathematical sense, in any practical setting this causes non-equal objects to have the same hashcode, which can lead to fairly major collision problems.
Clarification/Complication: While I would love to be able to just call equals() or hashCode() on the object, the codebase I am working with is unfortunately comparing two Object instances, which means I have no insight into whether equals() or hashCode() is actually defined for any given input, and I am not able to edit the class definitions to add these methods where they are missing. This is likely why the original author of this code chose to use reflectionHashCode(). With this in mind, is there a programmatic / code-based solution or workaround to this problem, such as an alternative library that wouldn't require equals() or hashCode() to be defined on the objects being compared?
The code-based solution is to implement a hash function that distinguishes between null and 0. There are many ways to do this; here is one:
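// this could be called hashCode, but you don't want to override hashCode
// Objects here is java.util.Objects; hash(Object...) is its varargs method
// (Objects.hashCode accepts only a single argument)
public int yourCustomHashFunction() {
    if (bar == null) {
        return Objects.hash(foo, 1234567);
    } else {
        return Objects.hash(foo, bar);
    }
}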
Since bar is a Short, a value outside the valid range for short, like 1234567, is unlikely to cause collisions with valid short values.
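If the class itself cannot be edited, as the question describes, the same idea can be applied from the outside with reflection. The following is only a minimal sketch under stated assumptions, not a drop-in replacement for reflectionHashCode(): the names NullAwareHash, NULL_SENTINEL, and Sample are hypothetical, Sample merely stands in for MyObject, superclass fields are ignored, and array-typed fields would fall back to identity-based hash codes.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class NullAwareHash {

    // Hypothetical sentinel mixed in for null fields (an assumption of this sketch).
    private static final int NULL_SENTINEL = 1234567;

    // Hashes the instance fields declared directly on the object's class.
    // Superclass fields are not visited in this sketch.
    public static int hash(Object target) {
        int result = 17;
        for (Field field : target.getClass().getDeclaredFields()) {
            // Skip static and compiler-generated fields.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            try {
                field.setAccessible(true);
                Object value = field.get(target);
                // A null field contributes the sentinel instead of 0;
                // a non-null field contributes its own hashCode().
                int fieldHash = (value == null) ? NULL_SENTINEL : value.hashCode();
                result = 31 * result + fieldHash;
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return result;
    }

    // Hypothetical stand-in for the question's MyObject.
    static class Sample {
        private final String foo;
        private final Short bar;

        Sample(String foo, Short bar) {
            this.foo = foo;
            this.bar = bar;
        }
    }

    public static void main(String[] args) {
        Sample one = new Sample("foo", null);
        Sample two = new Sample("foo", (short) 0);
        System.out.println(hash(one) == hash(two)); // prints false
    }
}
```

A sentinel only moves the collision elsewhere: a null field and a non-null field whose hashCode() happens to equal the sentinel still hash the same. As with any hash, equal values mean "possibly equal", so a field-by-field comparison is still needed to confirm equality.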