java supercsv

Out of memory with the SuperCSV Java library


Here is code that counts the number of lines in a file. It uses BufferedReader and works without problems. In total, the file has over 25,000,000 rows.

  BufferedReader br = new BufferedReader(new FileReader("C:\\...test.csv"));
  int lineNbr = 0;
  // Count lines, printing progress every 1,000,000 lines.
  while (br.readLine() != null) {
      lineNbr++;
      if (lineNbr % 1000000 == 0) {
          System.out.println(lineNbr);
      }
  }
  br.close();
  System.exit(0);

Here is similar code using SuperCSV. It throws an out-of-memory error after line 11,000,000:

  CsvListReader reader = new CsvListReader(new FileReader("C:\\... test.csv"), CsvPreference.EXCEL_PREFERENCE);

  // The first two rows are read before counting starts.
  List<String> row = reader.read();
  row = reader.read();
  lineNbr = 0;
  // Count the remaining rows, printing progress every 1,000,000 rows.
  while (reader.read() != null) {
      lineNbr++;
      if (lineNbr % 1000000 == 0) {
          System.out.println(lineNbr);
      }
  }

  reader.close();
  System.exit(0);

What am I doing wrong? How do I correctly read a file with SuperCSV?


Solution

  • Based on your sample code and a quick review of the SuperCSV code, I don't see any reason for an OutOfMemoryError to be thrown. I suspect the sample does not include all the relevant information, or something else is at play.

    You can review the source code for SuperCSV here:

    I do not see any state being stored that would cause referenced heap memory to grow in a way that could not be garbage collected.

    Another possibility is that your CSV file is corrupt, perhaps missing line breaks at some point. The library calls readLine in at least one place, so a very long stretch of text without a line break would have to be held in memory as a single line.
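
    One way to test the missing-line-break theory is to scan the file character by character and report the longest line, so that no single line ever has to be buffered in full. The following is only a rough sketch: the class name LineLengthProbe, its Result type, and the command-line argument are made up for illustration and are not part of SuperCSV.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical diagnostic helper, not part of SuperCSV.
final class LineLengthProbe {

    /** 1-based line number and length (in characters) of the longest line. */
    static final class Result {
        final long lineNumber;
        final long length;

        Result(long lineNumber, long length) {
            this.lineNumber = lineNumber;
            this.length = length;
        }
    }

    /**
     * Scans the input one character at a time, so memory use stays constant
     * even if a line is gigabytes long. '\n' ends a line; '\r' is ignored.
     * Returns line number 0 and length 0 if no non-empty line is found.
     */
    static Result longestLine(Reader in) throws IOException {
        long line = 1;
        long current = 0;
        long worstLine = 0;
        long worstLength = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                if (current > worstLength) {
                    worstLength = current;
                    worstLine = line;
                }
                line++;
                current = 0;
            } else if (c != '\r') {
                current++;
            }
        }
        // Account for a final line that has no trailing line break.
        if (current > worstLength) {
            worstLength = current;
            worstLine = line;
        }
        return new Result(worstLine, worstLength);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of the CSV file to inspect.
        try (Reader in = new BufferedReader(new FileReader(args[0]))) {
            Result r = longestLine(in);
            System.out.println("Longest line: #" + r.lineNumber
                    + " (" + r.length + " characters)");
        }
    }
}
```

    If the longest line turns out to be millions of characters long, the file itself is the likely culprit; if every line is short, the cause is probably elsewhere.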