I am using Hibernate Validator 6.x. One field of the object I'm validating contains a list, for example List<@NotNull Double> doubles. When the list is very large, validation slows down substantially. To investigate, I moved the element check into a custom validator on the list itself, @ValidDoubles List<Double> doubles, using a stream to iterate over the elements, and achieved a ~65% performance improvement for that validator.
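For illustration, the stream-based element check inside such a list-level validator might look roughly like the sketch below. The class and method names are hypothetical and the real @ValidDoubles implementation is not shown in this question; in an actual validator, this logic would sit in the body of ConstraintValidator.isValid.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical helper: the name and shape are assumptions, not part of Hibernate Validator.
final class DoublesCheck {

    private DoublesCheck() {
    }

    // Returns true when the list is null (nullness of the list itself is left to a
    // separate @NotNull) or when it contains no null element.
    static boolean allNonNull(List<Double> doubles) {
        if (doubles == null) {
            return true;
        }
        // A single pass over the list, with no per-element bookkeeping.
        return doubles.stream().noneMatch(Objects::isNull);
    }

    public static void main(String[] args) {
        System.out.println(allNonNull(Arrays.asList(1.0, 2.5, 3.0)));  // prints "true"
        System.out.println(allNonNull(Arrays.asList(1.0, null, 3.0))); // prints "false"
    }
}
```

The trade-off is that a list-level check like this reports one violation for the whole list rather than one per offending element.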
After profiling the application I can see that the majority of the time is spent in ListValueExtractor.extractValues. I am hoping that someone could explain why this method seems to be so expensive and whether there are any known workarounds.
An example Object:
import java.util.List;
import javax.validation.constraints.NotNull;

public class MyDataObject {
    private List<@NotNull Double> doubles; // List which can contain thousands of values
    // Getters and Setters
}
Update
Upon further profiling and investigation, I believe the issue is related to Hibernate Validator keeping track of which beans have already been validated when performing cascading validation, in particular its use of System.identityHashCode when doing so.
Looking at my profiler, I can see that 11.6% of CPU time is spent validating the input beans. Of that, 11.3% is spent calling System.identityHashCode. Interestingly, the time is being spent on the second-level child objects, even though they contain relatively simple validations. I wonder whether I have misconfigured either the validator or the beans, as this seems very odd.
My Validator configuration looks like so:
<bean id="validator" class="org.springframework.validation.beanvalidation.LocalValidatorFactoryBean">
    <property name="validationPropertyMap">
        <util:map>
            <entry key="hibernate.validator.fail_fast" value="true"/>
        </util:map>
    </property>
</bean>
Validator invocation:
Set<ConstraintViolation<InputObject>> violations = validator.validate(input);
Example object structure
public class InputObject {
    @NotNull
    String name;
    @Valid
    List<FirstChild> firstChildren; // On average 10 objects but can be very large
    // Getters and Setters
}

public class FirstChild {
    @SomeCustomValidator // Not important
    Integer someValue;
    // 3 to 4 further fields with simple validators
    @Valid
    List<SecondChild> secondChildren; // On average around 40 objects but can be very large
    // Getters and Setters
}

public class SecondChild {
    @NotBlank
    String foo;
    @NotBlank
    String bar;
    // Getters and Setters
}
In conclusion:
I am validating a nested object graph, cascading through @Valid annotations on the lists.
Profiling shows System.identityHashCode as the method taking up the majority of the time spent validating.
Is this an optimization issue with Hibernate Validator, or could I configure my validator or restructure my input objects in some way that would produce better performance?
Tough one.
The issue you're seeing is that we create a BeanGroupProcessedUnit per list value, so when the list has many values, it doesn't scale well.
You don't have the issue when you move the constraint from the elements to the list itself, as we then only keep a processed unit for the whole list.
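To make the scaling difference concrete, here is a deliberately simplified model of the two bookkeeping strategies. It is not Hibernate Validator's code: the class and method names are hypothetical, and an identity-based set from the JDK stands in for the processed-unit tracking. It only shows that one approach records an entry per list value while the other records one for the whole list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Simplified model only: names are hypothetical and this is not Hibernate Validator's code.
final class ProcessedTrackingSketch {

    private ProcessedTrackingSketch() {
    }

    // Element-level constraints: one identity-based bookkeeping entry per list value.
    static int entriesWhenTrackingEachValue(List<?> values) {
        Set<Object> processed = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Object value : values) {
            processed.add(value); // IdentityHashMap hashes with System.identityHashCode
        }
        return processed.size();
    }

    // List-level constraint: a single bookkeeping entry for the list itself.
    static int entriesWhenTrackingWholeList(List<?> values) {
        Set<Object> processed = Collections.newSetFromMap(new IdentityHashMap<>());
        processed.add(values);
        return processed.size();
    }

    public static void main(String[] args) {
        List<Object> values = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            values.add(new Object()); // distinct instances, so distinct identities
        }
        System.out.println(entriesWhenTrackingEachValue(values)); // prints "10000"
        System.out.println(entriesWhenTrackingWholeList(values)); // prints "1"
    }
}
```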
I'm not entirely sure there's an easy fix for this that doesn't break other use cases, but we should at least check whether we can improve the situation in your case.
That being said, I would appreciate it if you could take the time to open an issue on our tracker, https://hibernate.atlassian.net/projects/HV/issues, with a reproducer based on https://github.com/hibernate/hibernate-test-case-templates/tree/master/validator. That would be helpful to start the process.