sql-serverpentahopdi

Pentaho PDI: String truncated to 9.999.999 characters with table_output to sql server


Hi I'm having some issue with long strings in combination with PDI and SQL Server...

I load a long string (+- 19M charactares) from postgresql server. (text field, no issues). When i try to immediatly insert the same field in a sql server field i get the message the field is truncated in the logs, and i see the field in sql server (varchar (max)) has only 9.999.999 characters.

enter image description here

When i read the string after table_input it is still 19M characters long. The truncating seems to happen on the insert.

Does anyone know why this happens? It's my understanding the varchar(max) should be able to hold 2^31-1 bytes (2 GB).

Total transformation: enter image description here


Solution

  • It seems PDI has a limitation in the table_output step of 9.999.999 characters for any kind of text which as far as i can see has not been mentioned in their documentation. Inserting via other means, for example a python script in pentaho does seem to do the trick.