I’m working on a Spring Batch job with parallel chunk processing. The problem I’m facing is a LazyInitializationException due to nested lazy-loaded collections in my JPA entities.
I’m using JpaPagingItemReader for reading, JpaTransactionManager for managing transactions, and SimpleAsyncTaskExecutor for parallel processing.
Setup:
JpaTransactionManager
SimpleAsyncTaskExecutor
Simple example:
Customer entity:
@Entity
public class Customer {

    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "order_id")
    private Order order;
}
Order entity:
The @OneToMany relation is FetchType.LAZY by default.
@Entity
public class Order {

    @Id
    private Long id;

    @OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Type> types;

    // Other @OneToMany collection fields...
}
Spring Batch Step Configuration:
I’m using a JpaPagingItemReader and an ItemProcessor. The step is parallelized using a SimpleAsyncTaskExecutor.
@Bean
public JpaPagingItemReader<Customer> customerItemReader(EntityManagerFactory entityManagerFactory) {
    return new JpaPagingItemReaderBuilder<Customer>()
            .name("customerItemReader")
            .entityManagerFactory(entityManagerFactory)
            .queryString("SELECT c FROM Customer c")
            .pageSize(100)
            .build();
}

@Bean
public Step processCustomersStep(JobRepository jobRepository,
                                 PlatformTransactionManager transactionManager,
                                 JpaPagingItemReader<Customer> customerItemReader,
                                 ItemProcessor<Customer, Customer> customerItemProcessor,
                                 ItemWriter<Customer> customerItemWriter,
                                 TaskExecutor taskExecutor) {
    return new StepBuilder("processCustomersStep", jobRepository)
            .<Customer, Customer>chunk(100, transactionManager)
            .reader(customerItemReader)
            .processor(customerItemProcessor)
            .writer(customerItemWriter)
            .taskExecutor(taskExecutor)
            .build();
}
When running the batch job, I get the following error when accessing the lazy-loaded order or its nested types collection in the ItemProcessor:
org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.myapp.Order.types: could not initialize proxy - no Session
What I’ve tried:
JOIN FETCH: I can’t use JOIN FETCH because the Order entity contains multiple nested lazy-loaded collections, leading to large, inefficient queries.
JpaTransactionManager: I assumed it would manage the transaction scope properly, but the LazyInitializationException still occurs when accessing the nested lazy-loaded collections in the ItemProcessor.
Question:
How can I prevent the LazyInitializationException in a Spring Batch step with parallel chunks when using JpaPagingItemReader and JpaTransactionManager?
Should I handle the EntityManager differently, or is there another pattern to process entities with nested lazy-loaded collections in parallel?
UPDATE (processor, writer and JPA transaction manager)
Item processor:
In the ItemProcessor, I’m creating a new Customer instance and copying the nested Order and its associated Types. This requires accessing the lazy-loaded collections in the ItemProcessor, which is where the LazyInitializationException occurs.
@Component
public class CustomerItemProcessor implements ItemProcessor<Customer, Customer> {

    @Override
    public Customer process(Customer customer) throws Exception {
        Customer newCustomer = new Customer();
        Order order = new Order();
        // Accessing getTypes() here is what triggers the lazy initialization
        order.setTypes(customer.getOrder().getTypes());
        newCustomer.setOrder(order);
        return newCustomer;
    }
}
Item writer:
The ItemWriter uses the injected EntityManager to persist the processed entities.
@Bean
public JpaItemWriter<Customer> customerItemWriter(EntityManagerFactory entityManagerFactory) {
    return new JpaItemWriterBuilder<Customer>()
            .entityManagerFactory(entityManagerFactory)
            .build();
}
JPA transaction manager:
The JpaTransactionManager is used to manage the transaction scope within the batch job.
@Bean
public PlatformTransactionManager transactionManager(EntityManagerFactory entityManagerFactory) {
    return new JpaTransactionManager(entityManagerFactory);
}
The LazyInitializationException occurs because the ItemProcessor runs in a separate thread after the JPA transaction is complete, meaning the persistence context is closed and lazy collections can’t be initialized.
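The failure mode described above can be illustrated with a toy model in plain Java. This is only an analogy, not Hibernate's actual implementation: ToySession and ToyLazyList are hypothetical names standing in for the persistence context and a lazy collection wrapper. The sketch shows that a collection touched while the session is open stays usable afterwards, while one first touched after close fails.

```java
import java.util.List;
import java.util.function.Supplier;

/** Toy model of lazy loading; all names here are hypothetical, not Hibernate API. */
final class LazyLoadingSketch {

    /** Stands in for a persistence context that is either open or closed. */
    static final class ToySession {
        private boolean open = true;

        boolean isOpen() { return open; }

        void close() { open = false; }
    }

    /** Loads its elements on first access, but only while the session is open. */
    static final class ToyLazyList<T> {
        private final ToySession session;
        private final Supplier<List<T>> loader;
        private List<T> loaded; // stays null until the first successful access

        ToyLazyList(ToySession session, Supplier<List<T>> loader) {
            this.session = session;
            this.loader = loader;
        }

        List<T> get() {
            if (loaded == null) {
                if (!session.isOpen()) {
                    throw new IllegalStateException("could not initialize - no session");
                }
                loaded = loader.get();
            }
            return loaded;
        }
    }

    /** Returns true if reading the collection after the session closed fails. */
    static boolean failsAfterClose(boolean touchBeforeClose) {
        ToySession session = new ToySession();
        ToyLazyList<String> types = new ToyLazyList<>(session, () -> List.of("A", "B"));
        if (touchBeforeClose) {
            types.get(); // initialize while the session is still open
        }
        session.close();
        try {
            types.get();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

In this model, initializing the collection before the session closes plays the role of eager loading during the read phase.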
To solve this, I implemented a custom paging ItemReader that eagerly loads the required collections during the read phase, inside the transaction.
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;

// jakarta.* assumes Spring Batch 5 / Jakarta Persistence
import jakarta.persistence.EntityManager;
import jakarta.persistence.EntityManagerFactory;
import jakarta.persistence.EntityTransaction;

import org.springframework.batch.item.database.AbstractPagingItemReader;
import org.springframework.dao.DataAccessResourceFailureException;
import org.springframework.util.Assert;

public class CustomerItemReader extends AbstractPagingItemReader<Customer> {

    private final EntityManagerFactory entityManagerFactory;
    private EntityManager entityManager;

    public CustomerItemReader(EntityManagerFactory entityManagerFactory, int pageSize) {
        super();
        this.entityManagerFactory = entityManagerFactory;
        setPageSize(pageSize);
        setName("customerItemReader");
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        super.afterPropertiesSet();
        Assert.state(entityManagerFactory != null, "EntityManagerFactory cannot be null");
    }

    @Override
    protected void doOpen() throws Exception {
        super.doOpen();
        entityManager = entityManagerFactory.createEntityManager();
        if (entityManager == null) {
            throw new DataAccessResourceFailureException("Unable to obtain an EntityManager");
        }
    }

    @Override
    protected void doReadPage() {
        EntityTransaction tx = entityManager.getTransaction();
        tx.begin();
        entityManager.flush();
        entityManager.clear();

        if (results == null) {
            results = new CopyOnWriteArrayList<>();
        } else {
            results.clear();
        }

        // Step 1: Fetch Customers and their Orders (to-one fetch, safe to combine with paging)
        List<Customer> customers = entityManager.createQuery("""
                SELECT c FROM Customer c
                LEFT JOIN FETCH c.order
                ORDER BY c.id
                """, Customer.class)
                .setFirstResult(getPage() * getPageSize())
                .setMaxResults(getPageSize())
                .getResultList();

        if (customers.isEmpty()) {
            tx.commit();
            return;
        }

        // Step 2: Collect the Order IDs, skipping customers without an order
        List<Long> orderIds = customers.stream()
                .map(Customer::getOrder)
                .filter(Objects::nonNull)
                .map(Order::getId)
                .distinct()
                .toList();

        // Step 3: Load Order.types for all orders of this page in one query
        if (!orderIds.isEmpty()) {
            entityManager.createQuery("""
                    SELECT DISTINCT o FROM Order o
                    LEFT JOIN FETCH o.types
                    WHERE o.id IN :orderIds
                    """, Order.class)
                    .setParameter("orderIds", orderIds)
                    .getResultList();
            // Repeat step 3 to load other @OneToMany collections one by one to avoid MultipleBagFetchException...
        }

        tx.commit();
        results.addAll(customers);
    }

    @Override
    protected void doClose() throws Exception {
        entityManager.close();
        super.doClose();
    }
}
This approach:
Avoids LazyInitializationException: collections are loaded within the active transaction
Avoids MultipleBagFetchException: by fetching each collection in its own query
Related: If you’re using an aggregate ItemReader that returns a List<T> per read (such as AggregatePagingItemReader), you may run into similar LazyInitializationException issues when accessing nested lazy-loaded collections in parallel chunk processing.
I’ve written a separate answer that shows how to eagerly load nested @ElementCollection fields using separate batch queries to avoid LazyInitializationException, MultipleBagFetchException, and N+1 problems in that scenario.
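The two-step loading used in doReadPage (collect the distinct parent IDs, then load one child collection for all of them with a single IN query) can be sketched without JPA. This is a minimal in-memory illustration under stated assumptions: ToyTypeTable, TypeRow, and findByOrderIds are hypothetical stand-ins for the database table and the IN query, not part of Spring Batch or Hibernate.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

/** In-memory sketch of batch-loading one child collection per query. */
final class BatchFetchSketch {

    /** One row of a hypothetical "type" table, keyed by its parent order ID. */
    record TypeRow(long orderId, String name) {}

    /** Stand-in for the database; counts how many "queries" were issued. */
    static final class ToyTypeTable {
        private final List<TypeRow> rows;
        int queryCount = 0;

        ToyTypeTable(List<TypeRow> rows) {
            this.rows = rows;
        }

        /** Plays the role of "... WHERE o.id IN :orderIds". */
        List<TypeRow> findByOrderIds(Collection<Long> orderIds) {
            queryCount++;
            return rows.stream()
                    .filter(row -> orderIds.contains(row.orderId()))
                    .toList();
        }
    }

    /** Loads the type names for a whole page of customers with a single lookup. */
    static Map<Long, List<String>> loadTypes(List<Long> orderIdsOfPage, ToyTypeTable table) {
        // Customers may share an order or have none, so drop nulls and duplicates
        List<Long> orderIds = orderIdsOfPage.stream()
                .filter(Objects::nonNull)
                .distinct()
                .toList();
        if (orderIds.isEmpty()) {
            return Map.of(); // nothing to load, skip the query entirely
        }
        return table.findByOrderIds(orderIds).stream()
                .collect(Collectors.groupingBy(
                        TypeRow::orderId,
                        Collectors.mapping(TypeRow::name, Collectors.toList())));
    }
}
```

Each additional collection would get its own lookup of the same shape, so the number of queries grows with the number of collections rather than with the number of rows in the page.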