I'm working on a Java application that requires me to process a large dataset of rows retrieved from the database. Here is the sample situation:
class Example {
@Autowired
private Service service;
public static void main(String[] args) {
long totalRows = service.getTotalRows(); // E.g., 976451 rows
int totalPages = (int) Math.ceil((double) totalRows / 20 ); //48823
ExecutorService executorService = Executors.newFixedThreadPool(20); // let's say 20 threads
// Fetch rows for each page and process
for (int pageNumber = 0; pageNumber < totalPages; pageNumber++) {
// What I want: fetch page (0, 48823) and assign it to one thread,
// then page (1, 48823) and assign it to another thread,
// ... and so on for all 20 threads,
// while running analyzeMethod(row) and processMethod(row) concurrently.
Page<Row> all = service.getRowsByPagination(pageNumber, pageSize);
all.get().toList().forEach(row -> {
analyzeMethod(row);
processMethod(row);
});
}
}
public void analyzeMethod(Row row) {
// Perform database-related analysis
}
public void processMethod(Row row) {
// Additional processing
}
}
In simple terms, I will fetch thousands or millions of rows from a database. I have a method that retrieves them as Page objects, and I need to run the analysis and processing tasks asynchronously.
I tried the approach above, but I am new to multithreading: I am not getting the expected output, and I am getting exceptions. How can I implement this scenario?
You just need to use the submit(Runnable task) method to run analyzeMethod and processMethod asynchronously. Something like this:
executorService.submit(() -> analyzeMethod(row));
executorService.submit(() -> processMethod(row));
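To apply this to the paginated loop from the question, one option is to submit one task per page rather than one per method call, so each worker fetches its own page and handles its rows. The following is only a sketch under assumptions: fetchPage is a hypothetical stand-in for service.getRowsByPagination, rows are represented as plain Long ids, and a counter takes the place of the real analyzeMethod/processMethod calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class PagedProcessingSketch {

    // Hypothetical stand-in for service.getRowsByPagination(pageNumber, pageSize):
    // returns the row ids that fall on the requested page.
    static List<Long> fetchPage(int pageNumber, int pageSize, long totalRows) {
        List<Long> rows = new ArrayList<>();
        long start = (long) pageNumber * pageSize;
        long end = Math.min(start + pageSize, totalRows);
        for (long id = start; id < end; id++) {
            rows.add(id);
        }
        return rows;
    }

    // Submits one task per page and returns how many rows were handled.
    static long processAll(long totalRows, int pageSize, int threads) {
        int totalPages = (int) Math.ceil((double) totalRows / pageSize);
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicLong processed = new AtomicLong(); // thread-safe counter shared by all tasks
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int page = 0; page < totalPages; page++) {
                final int pageNumber = page; // lambdas can only capture effectively final variables
                futures.add(executor.submit(() -> {
                    for (Long row : fetchPage(pageNumber, pageSize, totalRows)) {
                        // The real code would call analyzeMethod(row) and then processMethod(row) here.
                        processed.incrementAndGet();
                    }
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // blocks until the task is done and rethrows its failure, if any
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("A page task failed", e);
        } finally {
            executor.shutdown(); // stop accepting tasks so the pool threads can exit
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(processAll(95, 20, 4)); // prints 95
    }
}
```

Calling get() on each Future is what surfaces an exception thrown inside a worker; with submit alone, the exception is stored in the Future and never reported.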
Be careful with the Row objects: if these methods modify them, one thread can overwrite data written by another. Because the two tasks are submitted separately, they may also run at the same time, or in either order, for the same row.
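One way to avoid that kind of interference, sketched here under assumptions, is to keep both steps for a given row inside a single task, so each Row is only touched by one thread at a time and the steps run in a fixed order. The Row class and the analyze/process bodies below are hypothetical placeholders, since the real ones are not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RowSafetySketch {

    // Hypothetical mutable row; the real Row class is not shown in the question.
    static class Row {
        final long id;
        int score;      // written by analyze, read by process
        String result;  // written by process
        Row(long id) { this.id = id; }
    }

    // Placeholder bodies standing in for analyzeMethod and processMethod.
    static void analyze(Row row) { row.score = (int) (row.id * 2); }
    static void process(Row row) { row.result = "row-" + row.id + ":" + row.score; }

    static List<Row> run(int rowCount, int threads) {
        List<Row> rows = new ArrayList<>();
        for (long id = 0; id < rowCount; id++) {
            rows.add(new Row(id));
        }
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (Row row : rows) {
                // One task per row: analyze always finishes before process starts,
                // and no other task modifies this Row.
                futures.add(executor.submit(() -> {
                    analyze(row);
                    process(row);
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for completion; also makes the task's writes visible here
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("A row task failed", e);
        } finally {
            executor.shutdown();
        }
        return rows;
    }
}
```

If some state really has to be shared across tasks (a running total, for example), a thread-safe type such as AtomicLong or ConcurrentHashMap is a safer choice than a plain field.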