sql-serverssisetlsql-server-2017ssis-2017

Import fixed width file with virtual decimals


I have a fixed width master data file. The file contains many columns with costs but no decimal point.This data needs to be imported to SQL server.

With the help of specification document, I know the precision and scale of those columns.

I tried below and a combination of one but not other in SSIS 2017 as below to no success:

  1. Set the precision and scale in the 'flat file connection manager'.
  2. Use data conversion transformation DT_Numeric specifying precision and scale

I set up a redirect on error path and all the rows are taking that path.


Solution

  • When dealing with implied decimal places, there are really two approaches: string manipulation, as Hadi demonstrates, to put the decimal character into the string for an explicit value.

    The second approach is to just math it out.

    Either approach will result in you having to define the column in the flat file connection manager differently than the target type. Once the data is injected into the pipeline, then you will need to modify it to meet the destination type.

    If you use Hadi's solution, save yourself the trouble of multiple casts (data type changes) and just import the value as a string. You perform string manipulation using a Derived Column Transformation to split the string apart at the location of the implicit decimal place. Logic approximately

    SUBSTRING(MyColumn, 1, IntegerDigits) + "." + RIGHT(MyColumn, ScaleDigits)
    

    where IntegerDigits and ScaleDigits are the number of digits before and after the decimal place.

    I would call that column something like MyColumnWithDecimalPlace in the Derived Column Transformation.

    I would then add a second Derived Column Transformation (or a Data Conversion Transformation) after the above Derived Column Transformation. In this step, we'll convert this string value to a numeric value. Why two steps? If something is weird with the data, I can put a data view (or data tap) between those two points and eye ball what is wrong. If I try to do it all in one fell swoop, then I can't see whether it's the adding a decimal place into a string that is failing or the string value I just built is broken and can't be turned into a number.

    The other approach would be to define the source column as something like an Integer and then use a Derived Column Transformation to math it out. That would look something like

    MyColumn / POWER(10, ScaleDigits)
    

    again where ScaleDigits is the number of decimal places we expect to have. I tend to favor this approach as it's simpler - one operation and done.