sqljoinsnowflake-cloud-data-platformdate-conversioninner-query

How do I handle exception while converting date


I have a Sales table and a Period table.

Sales table

+--------------+-------------+
| Country_Code | Period_Code |
+--------------+-------------+
| CH           | MAI_18      |
| CH           | JUN_18      |
| NO           | 2020-01-21  |
| NO           | 2020-01-21  |
+--------------+-------------+

Period table

+--------------+-------------+
| Country_Code | Period_Code |
+--------------+-------------+
| NO           | 200121      |
| NO           | 200122      |
+--------------+-------------+

I am getting the below error while I join these 2 tables.

enter image description here

Below is the sql I used. As per my understanding, since I have given the country filter as NO in the 2nd line, it should first execute the 2nd line and then do the join only on NO country. But internally it is considering the CH country also which is causing to have the below sql fail.

SELECT DISTINCT SAL.COUNTRY_CODE,PER.PERIOD_CODE
FROM (SELECT * FROM MYSCHEMA.SALES WHERE COUNTRY_CODE in('NO')) SAL
JOIN MYSCHEMA.PERIOD PER 
    ON SAL.COUNTRY_CODE=PER.COUNTRY_CODE    
    AND TO_CHAR(TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') = PER.PERIOD_CODE
ORDER BY 1

I used the CTE also which is behaving the same way. There could be some solution for this. I would really appreciate if some one can help on this to have the sql run without any errors.


Solution

  • Snowflake explicitly does transforms earlier than the SQL specification states, thus the transform of you text/variant data is happen before you expect. I have raised this issue with them in 2016, while at times they fix/alter some of the behavior, it can also change and thus start breaking.

    In cases where the contents of the column is not all valid, which is your situation you should use the TRY_ version of commands, thus here use TRY_TO_DATE to avoid the error.

    thus you should find this works:

    SELECT DISTINCT SAL.COUNTRY_CODE,PER.PERIOD_CODE
    FROM (SELECT * FROM MYSCHEMA.SALES WHERE COUNTRY_CODE in('NO')) SAL
    JOIN MYSCHEMA.PERIOD PER 
        ON SAL.COUNTRY_CODE=PER.COUNTRY_CODE    
        AND TO_CHAR(TRY_TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') = PER.PERIOD_CODE
    ORDER BY 1
    

    I would tend to move the date transform into the sub-select (or CTE if you where using one of those) to make the join simpler, albeit, Snowflake do this for you, thus I would write it like:

    SELECT DISTINCT sal.country_code,
        per.period_code
    FROM (
        SELECT country_code,
            TO_CHAR(TRY_TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') AS join_code
        FROM myschema.sales 
        WHERE country_code in ('NO')
    ) sal
    JOIN myschema.period per 
        ON sal.country_code = per.country_code    
        AND sal.join_code = per.period_code
    ORDER BY 1
    

    But given your example table, which I assume do not hold all the columns, just the needed to reproduce the problem, which should be able to be triggered via:

    SELECT country_code,
        TO_CHAR(TRY_TO_DATE(SAL.PERIOD_CODE,'YYYY-MM-DD'),'YYMMDD') AS join_code
    FROM myschema.sales 
    WHERE country_code in ('NO')