java spring database hibernate

How to bulk create / insert data in Java Spring?


I'm currently trying to read a CSV file and want to create, for example, a user for each row.

Currently I have this import method:

@Override
@Transactional
public void importUserData(final UserDataDTO toBeSavedDTO) {
    final String file = decodeFile(toBeSavedDTO); // Base64 decoding

    final List<UserDTO> imports = parser.parseImport(file);

    if (imports == null || imports.isEmpty()) {
        throw new RuntimeException();
    }

    // delete existing entries
    dao.deleteUserEntries();

    imports.forEach(dao::createUser);
}

Here I decode the file and parse it into a list of UserDTO. Then I delete the existing entries and create a new entry for each user, which basically does a persist per row. This works, but it takes a lot of time and is not performant.

Is there a way to do a batch insert or something like that? Maybe with a flush after every batch?


Solution

  • Bulk inserts are significantly more performant than inserting each record individually, but Spring Data JPA / Hibernate does not batch inserts unless you explicitly configure it. With a few settings and a small change to the loop, you can improve performance dramatically.

    In application.properties or application.yml, configure Hibernate to use batch inserts:

    spring.jpa.properties.hibernate.jdbc.batch_size=50
    spring.jpa.properties.hibernate.order_inserts=true
    spring.jpa.properties.hibernate.order_updates=true
    # optional, but handy: logs batch statistics so you can verify inserts are actually batched
    spring.jpa.properties.hibernate.generate_statistics=true
    

    This tells Hibernate to group insert statements in batches of 50 (or your preferred size).
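
    The same settings in application.yml form, if that is what the project uses:

    spring:
      jpa:
        properties:
          hibernate:
            jdbc:
              batch_size: 50
            order_inserts: true
            order_updates: true
            generate_statistics: true

    If the target database happens to be MySQL, adding rewriteBatchedStatements=true to the JDBC URL is also worth trying; it lets the driver rewrite each batch into a single multi-row INSERT.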

    Modify your loop to flush and clear the EntityManager every N records to avoid memory bloat and allow batching to trigger:

    @PersistenceContext // standard JPA way to inject the transaction-bound EntityManager
    private EntityManager entityManager;
    
    @Override
    @Transactional
    public void importUserData(final UserDataDTO toBeSavedDTO) {
        final String file = decodeFile(toBeSavedDTO); // Base64 decoding
    
        final List<UserDTO> imports = parser.parseImport(file);
    
        if (imports == null || imports.isEmpty()) {
            throw new RuntimeException("No user data to import");
        }
    
        dao.deleteUserEntries(); // Clear the table first
    
        int batchSize = 50;
        int count = 0;
    
        for (UserDTO dto : imports) {
            dao.createUser(dto);  // Persists a User entity
            count++;
    
            if (count % batchSize == 0) {
                entityManager.flush();
                entityManager.clear(); // Detach entities from persistence context
            }
        }
    
        entityManager.flush(); // flush remaining
        entityManager.clear();
    }
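
    Since deleteUserEntries() runs on every import, it is also worth checking that it issues a single bulk delete instead of loading and removing entities one by one. A minimal sketch, assuming the entity is mapped as User:

    public void deleteUserEntries() {
        // one bulk JPQL statement; bypasses the persistence context instead of
        // fetching and removing each row individually
        entityManager.createQuery("delete from User").executeUpdate();
    }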
    

    Make sure dao.createUser() persists each new entity with EntityManager.persist(entity) rather than merging a detached one (for example calling save() on an entity whose id is already set), which costs an extra select per row. Also note that Hibernate silently disables JDBC batching for ids generated with GenerationType.IDENTITY, so prefer a SEQUENCE-based generator where the database supports it (see the entity sketch after the snippet below).

    public void createUser(UserDTO dto) {
        User entity = mapDtoToEntity(dto);
        entityManager.persist(entity); // This supports batching
    }
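
    For completeness, a minimal sketch of an entity whose id generation is batching-friendly; the field names and the sequence name are illustrative assumptions, not taken from the question:

    @Entity
    @Table(name = "users")
    public class User {

        // A SEQUENCE generator (unlike IDENTITY) lets Hibernate obtain ids up front
        // and delay the INSERTs, so they can be grouped into JDBC batches.
        @Id
        @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "user_seq")
        @SequenceGenerator(name = "user_seq", sequenceName = "user_seq", allocationSize = 50)
        private Long id;

        private String name;
        private String email;

        // getters, setters and the mapping from UserDTO are omitted here
    }

    An allocationSize that matches the batch size keeps sequence round trips down to roughly one per batch.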