javatomcatjdbcbulkinsertjdbc-pool

Apache Tomcat JDBC Connection Pool bad performance on batch \ bulk inserts


I have recently incorporated the Apache Tomcat JDBC Connection Pool to my application (using MySQL DB). I tried using Apache DBCP before, but didn't like its results, and the tomcat implementation seemed to fit my needs even though I run a standalone java application and don't use tomcat at all.

Recently, I encountered a huge performance problem when executing batch (aka bulk ) insert queries.

I have a flow in which I insert ~2500 records to a table in a batched fashion. It takes forever when using the jdbc connection pool, compared to a few seconds when reverting back to opening a connection for each query (no pooling).

I wrote a small application that inserts 30 rows to the same table. It takes 12 seconds when pooling, and ~ 800 millis when not pooling.

Prior to using the connection pool, I used com.mysql.jdbc.jdbc2.optional.MysqlDataSource as my DataSource. The connection was configured with the following line:

dataSource.setRewriteBatchedStatements(true);

I'm quite sure that this is the core difference between the two approaches, but couldn't find an equivalent parameter in jdbc-pool.


Solution

  • MySql JDBC driver does not support batch operations. RewriteBatchedStatement is the best that you can get. Here the code from mysql PreparedStatement.java:

     try {
                statementBegins();
    
                clearWarnings();
    
                if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
    
                    if (canRewriteAsMultiValueInsertAtSqlLevel()) {
                        return executeBatchedInserts(batchTimeout);
                    }
    
                    if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null
                            && this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) {
                        return executePreparedBatchAsMultiStatement(batchTimeout);
                    }
                }
    
                return executeBatchSerially(batchTimeout);
            } finally {
                this.statementExecuting.set(false);
    
                clearBatch();
            }
    

    It is one of the reason why I do not like MySql and prefer Postgres

    EDIT:

    You should combine connection pool, batch operation, and RewriteBatchedStatement option. You can set RewriteBatchedStatement option through jdbc url parameter: jdbc:mysql://localhost:3307/mydb?rewriteBatchedStatements=true