
Databricks throwing error: truncating data


Whenever I try to save a specific DataFrame to the DW, I get this error:

ERROR: An error occurred while calling o692.save. : com.databricks.spark.sqldw.SqlDWSideException: SQL DW failed to execute the JDBC query produced by the connector. Underlying SQLException(s): - com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: String or binary data would be truncated. [ErrorCode = 107090] [SQLState = S0001]

I've checked the length of the strings in my CSV file. The longest one has 38 characters.

This is my save/write method (worked for other DataFrames):

df.write \
    .format('com.databricks.spark.sqldw') \
    .option('url', conn_string_dw) \
    .option('maxStrLength', '4000') \
    .option('forwardSparkAzureStorageCredentials', 'true') \
    .option('dbTable', db_table_name) \
    .option('tempDir', dw_temporary_path_url) \
    .option('truncate', 'False') \
    .mode('append') \
    .save()

What could be happening here?


Solution

  • The problem was in the final file: one specific cell contained multiple lines (embedded line breaks), which caused the truncation error.
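
A cell like this is easy to miss when only string lengths are checked. The following is a minimal sketch, not part of the original answer, of one way to locate such cells in the CSV before writing to the DW. It uses only the Python standard library; the names find_multiline_cells and flatten_line_breaks are hypothetical, and the default max_len of 4000 is an assumption that mirrors the maxStrLength option from the question.

```python
import csv
import io


def find_multiline_cells(csv_text, max_len=4000):
    """Return (row_number, column, length, has_line_break) for suspicious cells.

    Hypothetical helper: a cell is flagged if it contains a line break
    or is longer than max_len characters.
    """
    findings = []
    # newline="" leaves line breaks untouched so the csv module can
    # recognise the ones that sit inside quoted fields.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            # Skip missing fields (None) and overflow fields (lists).
            if not isinstance(value, str):
                continue
            has_line_break = "\n" in value or "\r" in value
            if has_line_break or len(value) > max_len:
                findings.append((row_number, column, len(value), has_line_break))
    return findings


def flatten_line_breaks(value):
    """Replace every line break in a cell with a single space."""
    return " ".join(value.splitlines())
```

For a file on disk, the text could be read with open(path, newline="") and passed to find_multiline_cells; any flagged cell can then be cleaned with flatten_line_breaks (or the equivalent replacement on the DataFrame column) before the save is retried.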