sql-serverssisdata-conversionssis-2016

Data Conversion text to numeric in SSIS is removing characters


I am facing a strange issue while using SSIS "Data Conversion component" to convert string to decimal datatype. I use SSIS 2016.

The source data input has values of mixed data types- string, integer, decimal and is defined as varchar in the flat file source. The target data type expected is numeric. When explicit type conversion happens from string to decimal, we expect the alphanumeric values to get rejected to error table and only the numeric values to pass through. Instead, we are seeing some alphanumeric values shedding the characters in the value and passing through successfully with no error.

Examples: Value "3,5" converted to 35 Value "11+" converted to 11

We do not have control over source data and will not be able to replace char data before passing data into Data conversion component. We have tried the below steps as a workaround and it has worked. i.e,

  1. First Data Conversion from DT_STR to DT_NUMERIC
  2. Capture error rows that fail the above conversion
  3. Second Data Conversion from DT_NUMERIC to DT_DECIMAL

But as the source data is not reliable, we may have to apply this workaround wherever there are numeric fields (int types & deicmals) which is not a friendly solution. So checking with you all to understand if there is an easier and better solution tried out by anyone.


Solution

  • I did not expect this result, but I tried an expression task and it worked for DT_DECIMAL:

    (DT_DECIMAL,1)"11+" -- evaluates to 11.0
    

    But it does not work for DT_NUMERIC. SSIS won't allow a direct numeric result, but it can be nested inside a cast to DT_DECIMAL. Just to demonstrate that, in an expression task even this "numerically valid" cast would not be permitted, because the output simply can't be of type DT_NUMERIC:

    (DT_NUMERIC, 3, 0)123
    

    But this is permitted:

    (DT_DECIMAL,0)((DT_NUMERIC, 3, 0)123)
    

    So as long as you are happy to specify a precision and scale big enough to hold your data during the "validity" check done by DT_NUMERIC, and then cast it from there to DT_DECIMAL, all in a derived column transform, then DT_NUMERIC seems to enforce the strict semantics you want.

    SSIS allows this:

    (DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"11") 
    

    But not either of these:

    (DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"11+") 
    
    (DT_DECIMAL,0)((DT_NUMERIC, 2, 0)"3,5")