talend

How to fix Java heap space error in Talend?


I have an ETL flow in Talend that does the following:

  1. Read the zipped files from a remote server with a job.
  2. Take these files, unzip them, and parse them into HDFS with a job. The job includes a schema check, so if something is not

My problem is that the TAC server stops the execution because of this error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.talend.fileprocess.TOSDelimitedReader$ColumnBuffer4Joiner.saveCharInJoiner(TOSDelimitedReader.java:503)
    at org.talend.fileprocess.TOSDelimitedReader.joinAndRead(TOSDelimitedReader.java:261)
    at org.talend.fileprocess.TOSDelimitedReader.readRecord_SplitField(TOSDelimitedReader.java:148)
    at org.talend.fileprocess.TOSDelimitedReader.readRecord(TOSDelimitedReader.java:125)
    ....

Is there any option to avoid or handle this error automatically? Only a few files cause this error, but I want a solution for similar situations in the future.


Solution

  • In the TAC Job Conductor, for a selected job, you can add JVM parameters.

    Add the -Xmx parameter to specify the maximum heap size. The default value depends on several factors, such as the JVM release and vendor and the physical memory of the machine. In your situation, the java.lang.OutOfMemoryError: Java heap space error shows that the default value is not enough for this job, so you need to override it.

    For example, specify -Xmx2048m for a maximum heap of 2048 MB (2 GB).
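
To confirm that the -Xmx setting actually reaches the JVM, you can print the maximum heap from Java. The following is only a minimal sketch: the class name HeapCheck and its helper methods are made up for illustration and are not part of Talend. It relies on the standard Runtime.maxMemory() call, whose result can be slightly lower than the -Xmx value depending on the garbage collector in use.

```java
public class HeapCheck {

    // Converts a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum amount of heap memory the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        // Hypothetical check: run with and without -Xmx and compare the output.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

On Java 11 or later you can run the file directly with `java -Xmx2048m HeapCheck.java` and compare the printed value with the one you get when -Xmx is omitted.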