How to customize Hibernate Envers

We're designing our entity audit library with two specific requirements: one audit table per entity, and incremental logs (a JSON diff) rather than full snapshots.

The only entity auditing tools I know of are Hibernate Envers and JaVers (if there's anything else, please let me know), and neither of them satisfies our requirements.

We already know what our audit records should look like, and we'd like to reuse as much as possible while implementing the library. I'm thinking of adding Hibernate Envers as a dependency and customizing/extending it to support our needs.

First question: considering our needs and our platform (we use Spring + JPA + Hibernate), is there any strong argument for going with JaVers rather than Hibernate Envers?

Assuming the answer to the first question is no and we go with Hibernate Envers, to what extent is Hibernate Envers extensible? Apart from some very simple examples, I couldn't find any resources on how exactly to extend it. Are there any? Where can I start?

Our design would be more or less as follows:

Entity Table
Employees (EmployeeId, FirstName, LastName, DepartmentId, ...)

Audit Table
Employees_AUD (EmployeeId, Diff, REV, REVTYPE)
-- EmployeeId: employee ID
-- Diff: diff of the two states of entity
-- REV: revision id (Hibernate Envers)
-- REVTYPE: revision type (Hibernate Envers)

REVINFO Table:
-- Original table created by Hibernate Envers
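
To make the intended content of the Diff column concrete, here is a minimal sketch in plain Java. It does not use any Envers or JaVers API; the class name EntityDiff, the methods diff and toJson, and the {"old": ..., "new": ...} layout are all assumptions made for illustration, and the JSON rendering only handles strings, numbers, booleans and null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of Envers or JaVers.
public class EntityDiff {

    /**
     * Returns property -> {oldValue, newValue} for every property in 'after'
     * whose value differs from 'before'. Assumes both maps share the same keys.
     */
    public static Map<String, Object[]> diff(Map<String, Object> before,
                                             Map<String, Object> after) {
        Map<String, Object[]> changes = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : after.entrySet()) {
            Object oldValue = before.get(e.getKey());
            if (!Objects.equals(oldValue, e.getValue())) {
                changes.put(e.getKey(), new Object[] { oldValue, e.getValue() });
            }
        }
        return changes;
    }

    /** Renders the changes as a compact JSON object (assumed layout). */
    public static String toJson(Map<String, Object[]> changes) {
        StringBuilder sb = new StringBuilder("{");
        String sep = "";
        for (Map.Entry<String, Object[]> e : changes.entrySet()) {
            sb.append(sep).append(quote(e.getKey()))
              .append(":{\"old\":").append(literal(e.getValue()[0]))
              .append(",\"new\":").append(literal(e.getValue()[1]))
              .append('}');
            sep = ",";
        }
        return sb.append('}').toString();
    }

    // Numbers, booleans and null are written bare; everything else as a string.
    private static String literal(Object v) {
        if (v == null) return "null";
        if (v instanceof Number || v instanceof Boolean) return v.toString();
        return quote(v.toString());
    }

    // Escapes only backslashes and double quotes; control characters are not handled.
    private static String quote(String s) {
        return '"' + s.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }
}
```

With this sketch, changing only DepartmentId on an employee would store a Diff containing just that one property, while FirstName and LastName would not appear in the row at all.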

Thanks.

Solution

Envers cannot be extended in the way you describe.

What you describe would require a complete replacement of three things: the metadata processing for the entity models, the way the XML schema gets generated for ORM, and the way the mappers are used for reading and writing entities, both during the ORM transaction and in AuditReader queries. In short, it amounts to a shift in design paradigm that affects a considerable amount of internal functionality.

Storing a JSON string is not ideal, particularly on relational databases, and even more so in environments that do not yet truly support that data type. You'll incur a performance hit at both write and read time for converting between JSON and a character large object (CLOB), and then an additional seek and read/write at the database, because a value of unknown, varying size is stored in a separate area away from the other column data.

While there are database platforms out there that can help with the above concerns, they are still worth keeping in mind, particularly in an audit tool, where you can potentially be dealing with considerable volumes of rows as data changes.

I believe the closest Envers could come here is a configuration option that still maintains a column-based audit table but stores only the modified columns in each revision row, rather than all columns.
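
As a rough illustration of that idea (not an existing Envers feature or API), the sketch below models each audit row as a map holding only the columns modified in that revision. The class SparseAuditLog and its methods are hypothetical names. It also shows the read-side consequence: the full state at a revision can no longer be read from a single row and has to be rebuilt by replaying earlier rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a sparse, column-based audit table for one entity.
public class SparseAuditLog {

    // One element per revision; each map holds only the columns modified in it.
    // Key presence marks "modified", so a column set to null is still recorded.
    private final List<Map<String, Object>> revisions = new ArrayList<>();

    /** Appends a revision row containing only the modified columns. */
    public void record(Map<String, Object> modifiedColumns) {
        revisions.add(new LinkedHashMap<>(modifiedColumns));
    }

    /** Rebuilds the full state at a 1-based revision by replaying rows 1..revision. */
    public Map<String, Object> stateAt(int revision) {
        Map<String, Object> state = new LinkedHashMap<>();
        for (int i = 0; i < revision && i < revisions.size(); i++) {
            state.putAll(revisions.get(i));
        }
        return state;
    }
}
```

In a real column-based table, an unchanged column and a column changed to NULL would look the same, so some per-column marker would presumably be needed; the map keys play that role in this sketch.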

What we'd have to see is how much of an impact this has on the AuditReader API. If that's something you'd like to see, feel free to open a JIRA and submit a PR for my review.