I am genuinely puzzled... but first, let me give you a rough overview.
I've done some restructuring in a database, by merging 4 tables into two. All tables have simple numeric sequences as primary keys. The tables were in fact in pairs very (very) similar. The only reason why they were split in two was based on historic data that had to be imported. Without this split, there would have been a lot of redundancy, and conceptually it made sense.
Now, after a good deal of work went into data cleansing, it's now finally possible to merge them and simply use one of the fields as discriminator. To talk less abstract, the tables contain companies. And they are either local residents or not (the two classes). They can easily be distinguished by their post code (the discriminator field). These tables are slowly changing dimensions (the sequence being the surrogate key). The other two tables contain the normal data attached to these SCDs. Hence, 4 tables. 2 for local companies, and 2 for non-locals.
These tables have now been simplified and merged, so I now only have Company
and CompanyData
.
To be on the safe side, and not to lose any historic information I created the two new tables with new sequence fields. The old sequences are kept in case 10 years from now I realise something went wrong ;)
So far so good.
The restructuring was fairly easy, reconnecting the correct entries was a no-brainer as well. Next, I needed to update the application which interfaces with this DB, which was a bit more work, but still easy. The application uses JPA using EclipseLink 2.0 on top of - as said already - a PostgreSQL 9.0 database.
And here comes the weird part:
When I try to insert a new company, I get a duplicate key error, stating that the given ID exists already. But that should be handled by the sequence object... should it not?
So I did some digging around. I could verify that subsequent inerts indeed returned duplicate key errors with incrementing ids. This means, that the sequence logic is OK. The only problem is that the current value is too low. So a call to nextval (or whatever JPA uses) will return an ID which already exists.
I have the following in the JPA-Entity:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "enterprise_id_seq")
@Column(name = "id", nullable = false)
private Integer id;
And my sequence looks like this:
test_db=# \d enterprise_id_seq
Sequence "public.enterprise_id_seq"
Column | Type | Value
---------------+---------+---------------------
sequence_name | name | enterprise_id_seq
last_value | bigint | 19659
start_value | bigint | 1
increment_by | bigint | 1
max_value | bigint | 9223372036854775807
min_value | bigint | 1
cache_value | bigint | 1
log_cnt | bigint | 32
is_cycled | boolean | f
is_called | boolean | t
The errors I get are these:
[...]
Caused by: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.0.1.v20100213-
r6600): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "enterprise_pkey"
Detail: Key (id)=(19611) already exists.
Error Code: 0
Call: INSERT INTO en...
[...]
As you can see, it tries to insert an entity with id 19611
, but the last value on the sequence is 19659
. Which is clearly wrong.
I also tried to restart the application server behind all this so to close all open connections ans sessions. No luck... Another thing I noticed: The field is defined as Integer
. Should it rather be Long
? That would necessitate quite some changes in the code, and I did not yet have time to fix this.
As I'm only 50 entries behind I could simply try to run the insert 50 times, but I'd rather know exactly what went wrong...
What am I missing here?
Update: After some more digging, I came across allocationSize which has a default of 50. Interestingly, this comes quite close to the difference of IDs I'm seeing. It may not be 100% the same due to some testing and frobnicating. Could it be related? I honestly have not understood the idea behind this setting...
Certainly for Hibernate, the default if using GenerationType.SEQUENCE is to use a hi/lo strategy, up to allocationSize
ids ahead of the value returned from the database. Set allocationSize to 1 and it should do the right thing.
A previous answer to a very similar problem: Hibernate generating two different sequence Ids for PostgreSQL insert