Tags: java, jdbc, griddb

How to efficiently manage batch insertions with GridDB using JDBC in a Java application?


I'm working on a Java application that needs to handle a large volume of data insertions into a GridDB database. I am using the JDBC API for database connectivity, and I'm facing challenges with managing batch insertions efficiently. I would like to know the best practices for handling batch inserts in GridDB via JDBC, ensuring minimal memory usage and optimal performance.

Specifically, I'm looking for guidance on choosing a batch size, when to commit, and how to keep memory usage low during large insert runs. It would be great to see some example code that demonstrates the connection setup and an optimized batch insertion process.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class GridDBBatchInsertExample {

    private static final String JDBC_URL = "jdbc:gs://<host>:<port>/<cluster>";
    private static final String USER = "admin";
    private static final String PASSWORD = "admin";

    public static void main(String[] args) {
        Connection connection = null;
        PreparedStatement preparedStatement = null;

        try {
            // Step 1: Load the GridDB JDBC driver
            Class.forName("com.toshiba.mwcloud.gs.sql.Driver");

            // Step 2: Establish the connection
            connection = DriverManager.getConnection(JDBC_URL, USER, PASSWORD);
            connection.setAutoCommit(false);

            // Step 3: Create a SQL query for batch insertion
            String sql = "INSERT INTO sensor_data (sensor_id, temperature, humidity) VALUES (?, ?, ?)";
            preparedStatement = connection.prepareStatement(sql);

            // Step 4: Add data to the batch
            for (int i = 1; i <= 1000; i++) {
                preparedStatement.setInt(1, i);
                preparedStatement.setFloat(2, 20.5f + i);
                preparedStatement.setFloat(3, 60.5f + i);
                preparedStatement.addBatch();

                // Execute the batch every 100 inserts
                if (i % 100 == 0) {
                    preparedStatement.executeBatch();
                    connection.commit(); // Commit after each batch
                }
            }

            // Execute any remaining batch operations
            preparedStatement.executeBatch();
            connection.commit();

            System.out.println("Batch insertion completed successfully!");

        } catch (ClassNotFoundException | SQLException e) {
            try {
                if (connection != null) {
                    connection.rollback();
                }
            } catch (SQLException rollbackEx) {
                rollbackEx.printStackTrace();
            }
            e.printStackTrace();
        } finally {
            try {
                if (preparedStatement != null) {
                    preparedStatement.close();
                }
                if (connection != null) {
                    connection.close();
                }
            } catch (SQLException closeEx) {
                closeEx.printStackTrace();
            }
        }
    }
}

Any insights, code snippets, or performance tips would be greatly appreciated!


Solution

  • It is not entirely clear from your description whether you are seeing an actual performance problem or optimizing prematurely.

    As for the code itself:

    1. Don't commit() after each executeBatch(); commit once, at the end.
    2. Either drop the intermediate executeBatch() calls entirely, or increase the batch size to at least 1000.
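    Applying both points, the insert loop could be reshaped roughly as below. This is a minimal sketch, not verified against a live GridDB cluster: the insertReadings and flushCount helper names are hypothetical, the sensor_data table is taken from your snippet, and the batch size of 1000 is just a starting value to tune for your workload.

    ```java
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class BatchInsertSketch {

        /** Flushes the JDBC batch every batchSize rows; commits once at the end. */
        static void insertReadings(Connection conn, int rows, int batchSize) throws SQLException {
            String sql = "INSERT INTO sensor_data (sensor_id, temperature, humidity) VALUES (?, ?, ?)";
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (int i = 1; i <= rows; i++) {
                    ps.setInt(1, i);
                    ps.setFloat(2, 20.5f + i);
                    ps.setFloat(3, 60.5f + i);
                    ps.addBatch();
                    if (i % batchSize == 0) {
                        ps.executeBatch(); // flush the batch, but do NOT commit yet
                    }
                }
                ps.executeBatch(); // flush any remaining rows
                conn.commit();     // single commit at the very end
            } catch (SQLException e) {
                conn.rollback();   // undo the whole run on failure
                throw e;
            }
        }

        /** Number of executeBatch() calls the loop above performs (periodic flushes plus the final one). */
        static int flushCount(int rows, int batchSize) {
            return rows / batchSize + 1;
        }
    }
    ```

    Flushing with executeBatch() periodically still bounds the driver-side memory held by the pending batch, while the single commit() avoids paying transaction overhead on every chunk. If a single transaction covering all rows is too large for your setup, committing every few flushes is a middle ground.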