hibernate spring-data-jpa ehcache-3

How to tell Hibernate's 2nd Level Cache to use the proper Id?


In a Spring Boot 3 application using Hibernate 6 and Ehcache 3 I ran into a weird problem. My entities have an id property whose name is prefixed with the entity name; for example, a Display entity has an id named displayId.

The entity with the cache annotation looks like this:

@Entity
@Access(AccessType.FIELD)
@org.hibernate.annotations.Cache(region = "display-cache",
                                 usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
@Table(name = "display")
public class Display {

    @Id
    @Column(name = "display_id")
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long displayId;

    @Column(name = "description")
    private String description;

    //...
}

Now, as long as I query for a display using the built-in findById() method of the JpaRepository, everything is fine and the display gets cached as expected. But when I query by the id field displayId itself, there is no caching, although the parameter is the id of the entity.

The JpaRepository looks like this:

public interface DisplayRepository extends JpaRepository<Display, Long> {

    Optional<Display> findByDisplayId(Long displayId);

}

And the calls look like this:

displayRepository.findByDisplayId(id);  // cache is not working

// just for comparison:
displayRepository.findById(id);         // cache is working

So my question is:

How can I tell Hibernate that displayId is the id of the entity, so that Hibernate does the caching as expected?


Solution

  • Implementing the 2nd-level cache in Hibernate is not as straightforward as some blog posts suggest, and the rule of thumb is this: if you are able to implement caching at the service/business level, do that without looking back and stay clear of the 2nd-level cache.

    Below are some thoughts based on my previous research:

    I. The implementation is buggy and no one is going to fix it: the 2nd-level cache does not work as expected. It is worth noting that I discovered those issues using integration tests alone, so I have no idea why this functionality is not covered by tests in the Hibernate project.

    II. You need to think about how you are going to deal with stale data in the cache: you may either overwrite current data with stale data or make wrong decisions based on stale data, and neither looks good. The most straightforward way is to enable version checks via @Version fields; however, version checking is a whole other universe and you may face challenges you have never faced before (on the other hand, I can't understand how anyone uses JPA without version checking).
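    To make the version check concrete, here is a toy in-memory model of the comparison that a @Version column enables. It is only a sketch and not Hibernate code: the class ToyVersionedStore and its methods are hypothetical names made up for illustration, and a real JPA provider reports a failed check as an optimistic-locking exception rather than returning false.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory table illustrating an optimistic version check; NOT Hibernate code. */
public class ToyVersionedStore {

    /** One stored row: a value plus the version it was written with. */
    public static final class Row {
        public final String value;
        public final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    /** Inserts a new row with version 0. */
    public void insert(long id, String value) {
        rows.put(id, new Row(value, 0L));
    }

    /** Returns the current row, or null if there is none. */
    public Row read(long id) {
        return rows.get(id);
    }

    /**
     * Mimics "update ... set value = ?, version = version + 1
     * where id = ? and version = ?".
     * Returns false when the caller's copy is stale (or the row is missing).
     */
    public boolean update(long id, String newValue, long expectedVersion) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        rows.put(id, new Row(newValue, current.version + 1));
        return true;
    }
}
```

    With this model, a writer that still holds version 0 after someone else has already updated the row gets false back, which is the situation a stale cache entry would put you in.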

    III. Do not use Spring's caching capabilities together with JPA repositories: Spring's caching is not designed for caching mutable data, and JPA entities are mutable by design, so instead of performance improvements you will get wrong data in the DB.

    IV. If the application modifies entities via update/delete statements (@Modifying and @Query(update/delete) in JPA repositories), those operations invalidate caches, so avoid such patterns.

    V. Query caching does not work at all, for the following reasons:

    1. If you are retrieving entities, HBN caches only the corresponding identifiers; on a subsequent cache hit it retrieves the entities by those cached ids one by one, which is slow.
    2. Modifying entities from the same query space (i.e. entities backed by the same tables as those involved in the cached query) invalidates the query cache: HBN is unable to figure out whether the entity being updated/deleted affects the query result, so it invalidates everything.
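    The two points above can be sketched with a toy model. This is not how Hibernate implements its query cache; ToyQueryCache and all of its methods are hypothetical names, and the model only mirrors the behaviour described here: cached results hold ids only, hits load entities one at a time, and any write to a table in the query space evicts the cached result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Toy model of a query cache; NOT Hibernate's implementation. */
public class ToyQueryCache {

    /** A cached result: only identifiers, plus the tables the query read from. */
    private static final class CachedResult {
        final Set<String> querySpaces;
        final List<Long> ids;

        CachedResult(Set<String> querySpaces, List<Long> ids) {
            this.querySpaces = Set.copyOf(querySpaces);
            this.ids = List.copyOf(ids);
        }
    }

    private final Map<String, CachedResult> results = new HashMap<>();
    private int entityLoads = 0;

    /** Stores the ids returned by a query together with its query spaces. */
    public void put(String query, Set<String> querySpaces, List<Long> ids) {
        results.put(query, new CachedResult(querySpaces, ids));
    }

    /**
     * On a hit, every entity is resolved individually by id (point 1).
     * Returns null on a miss.
     */
    public <T> List<T> get(String query, Function<Long, T> loadById) {
        CachedResult cached = results.get(query);
        if (cached == null) {
            return null;
        }
        List<T> entities = new ArrayList<>();
        for (Long id : cached.ids) {
            entityLoads++; // one lookup per cached id
            entities.add(loadById.apply(id));
        }
        return entities;
    }

    /** Any write to a table evicts every cached query that reads it (point 2). */
    public void onTableModified(String table) {
        results.values().removeIf(cached -> cached.querySpaces.contains(table));
    }

    public int entityLoads() {
        return entityLoads;
    }

    public int size() {
        return results.size();
    }
}
```

    In this model a cached query over the display table costs three separate loads for three ids, and a single update to any display row drops the cached result even if that row was never part of it.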

    VI. Global/distributed caches seem to be useless, while local caches accepting remote invalidation messages seem to be OK: if an entity does not have a lot of associations, retrieving it from the DB via a single query should not be slower than retrieving it from a remote cache, so from a user-experience perspective a global/distributed cache improves nothing.

    VII. I believe the idea of controlling cache behaviour via annotations on entity classes is completely wrong. The point is this: entities just define data, whereas assumptions about possible optimisations and data consistency are the responsibility of the particular application. So, in my opinion, the best option for setting up caching is to take advantage of org.hibernate.integrator.spi.Integrator, for example:

    @Override
    public void integrate(Metadata metadata, SessionFactoryImplementor sessionFactory, SessionFactoryServiceRegistry serviceRegistry) {
        for (PersistentClass persistentClass : metadata.getEntityBindings()) {
            if (persistentClass instanceof RootClass) {
                RootClass rootClass = (RootClass) persistentClass;
                if ("myentity".equals(rootClass.getEntityName())) {
                    rootClass.setCached(true);
                    rootClass.setCacheRegionName("myregion");
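                    // AccessType here is org.hibernate.cache.spi.access.AccessType, not jakarta.persistence.AccessType
                    rootClass.setCacheConcurrencyStrategy(AccessType.NONSTRICT_READ_WRITE.getExternalName());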
                }
            }
        }
    }
    

    VIII. The safest way of implementing the 2nd-level cache in Hibernate is the following:

    1. First, let HBN populate the 2nd-level cache without reading from it:
    @Bean
    public HibernatePropertiesCustomizer hibernateSecondLevelCacheCustomizer() {
        return map -> {
            map.put(AvailableSettings.JPA_SHARED_CACHE_RETRIEVE_MODE, CacheRetrieveMode.BYPASS);
            map.put(AvailableSettings.JPA_SHARED_CACHE_STORE_MODE, CacheStoreMode.USE);
        };
    }
    
    
    2. After that, you may call the EntityManager#find(Class<T>, Object, Map<String, Object>) method with the AvailableSettings#JAKARTA_JPA_SHARED_CACHE_RETRIEVE_MODE property set to CacheRetrieveMode#USE wherever you think it is appropriate.
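    A minimal sketch of such a call might look like the code below. The helper class CachedLookup and its method findUsingCache are hypothetical names, and the property key is written out as the standard Jakarta Persistence string "jakarta.persistence.cache.retrieveMode" on the assumption that this is the key the AvailableSettings constant resolves to in your Hibernate version. It needs the Jakarta Persistence API on the classpath.

```java
import jakarta.persistence.CacheRetrieveMode;
import jakarta.persistence.EntityManager;

import java.util.Map;

/** Hypothetical helper: opt in to reading from the 2nd-level cache for a single lookup. */
public final class CachedLookup {

    // Standard Jakarta Persistence property name (assumed to match the AvailableSettings constant).
    private static final String RETRIEVE_MODE = "jakarta.persistence.cache.retrieveMode";

    private CachedLookup() {
    }

    /** Loads an entity by id, allowing this one call to be served from the 2nd-level cache. */
    public static <T> T findUsingCache(EntityManager entityManager, Class<T> type, Object id) {
        Map<String, Object> properties = Map.of(RETRIEVE_MODE, CacheRetrieveMode.USE);
        return entityManager.find(type, id, properties);
    }
}
```

    A caller would then use CachedLookup.findUsingCache(entityManager, Display.class, id) only in the places where slightly stale data is acceptable, while every other read keeps bypassing the cache.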

    As regards your problem...

    There are several options for retrieving an entity by id in Hibernate:

    1. EntityManager#find - the most common one; it respects both the 2nd- and 1st-level caches
    2. EntityManager#getReference - instead of retrieving an entity, it creates a proxy object. There are some scenarios where that is useful; however, the HBN implementation seems to be broken: a subsequent call of EntityManager#find returns the proxy object instead of a fully functional entity
    3. Session#byMultipleIds - allows retrieving entities of the same type in batches and respects both the 2nd- and 1st-level caches; unfortunately, it is not supported by JPA repositories
    4. via a JPQL query like select e from entity e where e.id=:id - the most bizarre way to do a simple thing:
      • when auto flush is enabled (which is actually a reasonable default for most Hibernate applications), Hibernate tries to keep the DB state in sync with the persistence context, which in turn means that before executing any JPQL query Hibernate checks whether the persistence context contains dirty entities (that takes some time) and flushes those dirty entities to the DB.
      • if the entity to be retrieved is already present in the persistence context, Hibernate won't refresh its state from the DB data, which seems weird

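    From the JPA repository perspective, every declared method that is not a default method and is not implemented by the base repository (SimpleJpaRepository in most cases) or a fragment is backed by a JPQL query and thus may not work as intended/desired in some corner cases. This is why findByDisplayId, a derived query, does not hit the entity cache, while findById, which SimpleJpaRepository implements via EntityManager#find, does.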

    So, the best option in this particular case is to give up the naming convention that causes the performance issue; if that is not possible, you may take advantage of default methods:

    default Optional<Display> findByDisplayId(Long displayId) {
       return findById(displayId);
    }