I have an .xlsx file with lots of formulas. I want to transform it into a new .xlsx file with every formula replaced by its absolute value. Why? That is probably not relevant to this question.
My goal was to perform this task using as little heap memory as possible, so I used a combination of the Apache POI XSSF event API to read the source file and the SXSSF API to write the output file. It worked well.
All measurements were taken using JProfiler 10.
When I ran my code to convert a file of around 25K+ rows (with around 25K * 23 formulas), it used around 250 MB of heap space at the peak. I then ran the same command with -Xmx24M, and the code managed to run within this memory limit, which is significantly lower than in the first run. Why?
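A quick way to confirm what heap ceiling each run actually had is to ask the JVM itself. This is a minimal plain-JDK sketch (no POI involved); the class name is illustrative, and the reported maximum will roughly match whatever -Xmx you passed:

```java
public class HeapReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the -Xmx ceiling; totalMemory() - freeMemory() is the
        // heap currently occupied by live and not-yet-collected objects.
        long maxMb  = rt.maxMemory() / (1024 * 1024);
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        System.out.println("max heap (MB):  " + maxMb);
        System.out.println("used heap (MB): " + usedMb);
    }
}
```

Run it once with no flags and once with `java -Xmx24M HeapReport` to see the two ceilings side by side.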
The garbage collector did not get around to freeing memory immediately in your first run. Processing XLSX files generates a lot of transient objects, so the first run let them accumulate to around 250 MB before cleaning up, while the second, memory-constrained run was forced to collect them sooner.
The garbage collector has many, many options and strategies that can be configured. Off the top of my head, the only way I know to limit heap consumption for just that piece of code is to run it in its own JVM process, with appropriate GC parameters.
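A sketch of that "own JVM process" idea: launch a child JVM with its own heap cap and GC flags via ProcessBuilder. The main class name, jar path, GC flag, and the 24 MB cap below are all illustrative assumptions, not anything from the question:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class IsolatedRun {
    // Build the command line for a child JVM with its own heap ceiling and
    // GC choice. "ConverterMain" and "converter.jar" are hypothetical names.
    static List<String> command(String mainClass, String classpath, int heapMb) {
        return Arrays.asList(
                "java",
                "-Xmx" + heapMb + "m",  // heap cap applies only to the child JVM
                "-XX:+UseSerialGC",     // example GC selection; tune as needed
                "-cp", classpath,
                mainClass);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> cmd = command("ConverterMain", "converter.jar", 24);
        Process child = new ProcessBuilder(cmd).inheritIO().start();
        System.out.println("child exited with " + child.waitFor());
    }
}
```

This keeps the tight -Xmx from throttling the rest of your application, since only the conversion runs under the constrained heap.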