java, hibernate, batch-updates, batch-insert

Does Session.save() send a request to the database?


I have to improve the performance of some very slow code, and I am pretty new to Hibernate. After studying the code carefully, I concluded that the problem is the large number of entities it has to load and then update or insert. To translate the algorithm into a more digestible example, let's say we have something like this:

for each competitionToSave in competitionsToSave
    competition <- load a Competition by competitionToSave from database

    winner <- load Person by competitionToSave.personID

    do some preprocessing

    if (newCompetition) then
        insert competition
    else
        update competition
    end if
end for

This algorithm is of course problematic when there are lots of competitions in competitionsToSave. So my plan is to select all the competitions and winners involved with at most two database requests and preprocess the data in memory, which will speed up the reads. More importantly, I want to save via insert/update batches of 100 competitions instead of saving them one at a time. Since I am pretty new to Hibernate, I consulted the documentation and found the following example:
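A minimal sketch of that plan in plain Java, with the database left out. The `Competition` class, the `existingIds` set (standing in for the result of one bulk query) and the helper names are all hypothetical; the point is only the shape: classify everything in memory first, then hand the work over in chunks of 100.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the real entity; only the fields the sketch needs.
final class Competition {
    final long id;
    final long personId;

    Competition(long id, long personId) {
        this.id = id;
        this.personId = personId;
    }
}

class CompetitionBatchPlanner {

    // Splits a list into consecutive chunks of at most `size` elements.
    static <T> List<List<T>> chunk(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += size) {
            int to = Math.min(from + size, items.size());
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    // Keeps the competitions whose id is (existing == true) or is not
    // (existing == false) already present in the database.
    // `existingIds` is assumed to come from a single bulk query.
    static List<Competition> select(List<Competition> all, Set<Long> existingIds, boolean existing) {
        List<Competition> result = new ArrayList<>();
        for (Competition c : all) {
            if (existingIds.contains(c.id) == existing) {
                result.add(c);
            }
        }
        return result;
    }
}
```

Each chunk returned by `chunk(inserts, 100)` would then be saved and followed by one flush and clear, so the number of objects held by the Session stays bounded.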

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();

However, I am not sure I understand it correctly. About the method .save() I read:

Persist the given transient instance, first assigning a generated identifier. (Or using the current value of the identifier property if the assigned generator is used.) This operation cascades to associated instances if the association is mapped with cascade="save-update".

But it is unclear to me whether a request is sent to the database on every save. Am I right to assume that, in the example taken from the documentation, session.save(customer) only records the new object in the Session without sending a request to the database, and that on every 20th item session.flush() sends the pending requests to the database and session.clear() empties the Session cache?
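To make the question concrete, here is a toy model of the behaviour being asked about. It is not Hibernate code: `ToySession`, `ToySessionDemo` and the counters are invented for illustration, and the model only covers the simple case where `save()` queues work and `flush()` sends it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Hibernate's implementation.
class ToySession {
    private final List<Object> trackedObjects = new ArrayList<>(); // what the session holds in memory
    private final List<Object> pendingInserts = new ArrayList<>(); // queued, not yet sent
    private int statementsSent = 0;                                // simulated inserts sent to the database

    // Queues the insert; nothing is sent to the "database" here.
    void save(Object entity) {
        trackedObjects.add(entity);
        pendingInserts.add(entity);
    }

    // Sends every queued insert to the "database" and empties the queue.
    void flush() {
        statementsSent += pendingInserts.size();
        pendingInserts.clear();
    }

    // Forgets all tracked objects and cancels anything still queued.
    void clear() {
        trackedObjects.clear();
        pendingInserts.clear();
    }

    int statementsSent() { return statementsSent; }
    int pendingCount() { return pendingInserts.size(); }
    int trackedCount() { return trackedObjects.size(); }
}

class ToySessionDemo {
    // Saves `total` objects, flushing and clearing after every `batchSize` saves.
    static ToySession run(int total, int batchSize) {
        ToySession session = new ToySession();
        for (int i = 1; i <= total; i++) {
            session.save("customer-" + i);
            if (i % batchSize == 0) {
                session.flush();
                session.clear();
            }
        }
        session.flush(); // send any remainder, as a commit would
        return session;
    }
}
```

Note that the loop here counts from 1. In the documentation example `i` starts at 0, so `i % 20 == 0` is already true on the first iteration and the first flush happens after a single save.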


Solution

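  • Your assumptions are correct: save() only queues the insert in the Session, and flush() sends it to the database. (The exception is an identity identifier generator, where Hibernate has to execute the insert immediately to obtain the id.) Without batching, though, the flushed inserts are still sent one by one: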

    insert into Customer(id , name) values (1, 'na1');
    insert into Customer(id , name) values (2, 'na2');
    insert into Customer(id , name) values (3, 'na3'); 
    

    You can take advantage of JDBC batching to increase performance even further.

    There is a Hibernate property, hibernate.jdbc.batch_size, which you can set among the properties of Hibernate's SessionFactory:

    <property name="hibernate.jdbc.batch_size">20</property>
    

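    With this setting, Hibernate groups the inserts into JDBC batches and sends each batch to the database in a single round trip at flush time. Some JDBC drivers can additionally rewrite a batch into one multi-row statement (for example MySQL Connector/J with rewriteBatchedStatements=true), in which case the database receives something like this:

    insert into Customer(id , name) values (1, 'na1') , (2, 'na2') ,(3, 'na3')..
    

    One round trip instead of twenty.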
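For reference, this is roughly what a batch looks like at the plain JDBC level, which is the mechanism the batch size setting relies on. The sketch is an illustration, not what Hibernate literally executes: the `Customer(id, name)` table comes from the example above, while `JdbcBatchSketch`, `insertCustomers` and the `Connection` passed in are assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class JdbcBatchSketch {

    // Inserts the given names with ids 1..n, sending them in batches of `batchSize`.
    // Returns how many times a batch was sent to the database.
    static int insertCustomers(Connection conn, List<String> names, int batchSize) throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batchesSent = 0;
        String sql = "insert into Customer(id, name) values (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (int i = 0; i < names.size(); i++) {
                ps.setInt(1, i + 1);
                ps.setString(2, names.get(i));
                ps.addBatch();          // queue this row on the client side
                pending++;
                if (pending == batchSize) {
                    ps.executeBatch();  // send the queued rows together
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();      // send the remainder
                batchesSent++;
            }
        }
        return batchesSent;
    }
}
```

The same prepared statement is reused for every row; only the parameter values change between `addBatch()` calls, which is why an SQL log can still show one insert statement per row even when batching is active.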