javapostgresqlspring-bootprintwriterstringwriter

Fetch Millions of Records using CopyManager throwing OutOfMemoryError


When I tried running my code in tables with rows in thousands, it worked fine but as a performance testing, I tried it with tables having millions of records and then I encountered this issue.

I have tried the above approach. I can try an alternate approach without the copymanager but it would be great if I can use this piece of code. Any suggestions are welcome.

OFFSET is 0;
LIMIT is 3 million

Here is the code:

Connection connection = null;
            
                    try(StringWriter out = new StringWriter(); Writer printWriter = new PrintWriter(out, true)) {
                        connection = dataSource.getConnection();
                        OFFSET = 0;
    // totalRecords is the total rows in the table
                        while (OFFSET <= totalRecords && totalRecords > 0) {
            
                            if (connection != null && connection.isWrapperFor(PGConnection.class)) {
            
                                PGConnection pgConnection = connection.unwrap(PGConnection.class);
                                CopyManager copyManager = null;
                                copyManager = pgConnection.getCopyAPI();
            
                                String sql = null;
                                    sql = "SELECT " + table.getAttributeList() + " FROM " + table.getSchemaName().trim() + "."
                                            + table.getTableName().trim() + " WHERE " + "modified_ts" + " > " + "'" + dateTime + "'"
                                            + " OFFSET " + OFFSET + " LIMIT " + LIMIT;
            
            
                                LOGGER.info(sql);
            
                                long i;
    // Here I am trying to copy the Result into the printwriter
                                i = copyManager.copyOut("COPY (" + sql + " ) TO STDOUT WITH (FORMAT CSV)", printWriter);
                                LOGGER.info("Total no of records in {} : {}", table.getTableName(), i);
                                printWriter.flush();
                                OFFSET = OFFSET + LIMIT;
                                
            
                            }
                        }
                        String now = LocalDate.now().format(DateTimeFormatter.ofPattern("dd/MM/yyyy")).replace("/", "");
                        String localFileName = table.getTableName() + "_" + now + ".csv";
                        InputStream inputStream = new ByteArrayInputStream(out.toString().getBytes(UTF8));
                        postObjectToS3.uploadFile(saveFilePath + localFileName, inputStream);
                        
                        inputStream.close();
                    }

and here is the stacktrace:

Exception in thread "main" java.lang.reflect.InvocationTargetException at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) at org.springframework.boot.loader.Launcher.launch(Launcher.java:108) at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88) Caused by: java.lang.OutOfMemoryError: Java heap space at java.base/java.util.Arrays.copyOf(Arrays.java:3745) at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172) at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:633) at java.base/java.lang.StringBuffer.append(StringBuffer.java:397) at java.base/java.io.StringWriter.write(StringWriter.java:122) at java.base/java.io.PrintWriter.write(PrintWriter.java:542) at java.base/java.io.PrintWriter.write(PrintWriter.java:559) at org.postgresql.copy.CopyManager.copyOut(CopyManager.java:92)


Solution

  • The issue was with the following line:

    while (OFFSET <= totalRecords && totalRecords > 0) {
    
    // some code
    i = copyManager.copyOut("COPY (" + sql + " ) TO STDOUT WITH (FORMAT CSV)", printWriter);
    }
    

    I was iterating the loop and was copying the result set of the query to the printwriter. It store as much records as possible but in the end, ran out of memory.

    To resolve this, I simply used the data inside the printwriter in every iteration. The basic idea which solved this was:

    while (OFFSET <= totalRecords && totalRecords > 0) {
    printwriter = new PrintWriter(out);
    // some code
    i = copyManager.copyOut("COPY (" + sql + " ) TO STDOUT WITH (FORMAT CSV)", printWriter);
    /* Used printwriter data to create a file or append the data if file exists. Closed the printwriter.*/
    }
    

    Hope this clarifies the problem and the solution. Let me know if more clarification is needed.