snowflake-cloud-data-platformsnowflake-schema

External staging data of parquet returns NULL in Snowflake


I am having an external stage on S3 where the parquet files are stored. Now, I have created a table as T1_D1:

CREATE TABLE T1_D1(ID NUMBER(38,0),
TRANSACTION_DATE TIMESTAMP_NTZ(9),
PRODUCT VARCHAR(16777216))

Now, I am running the below script, to test whether the data I am loading is correct or not.

select 
$1:ID::number,
$1:PRODUCT::VARCHAR(16777216)
from @my_s3/T1/day_2_sales.parquet
(file_format => my_parq)    

It successfully returns the ID numbers, whereas the product is returned as NULL, in all the rows. I have even tried to run with the transaction date column but the same problem persists. (i.e) ID matches and returns the data correctly but timestamp values throw me NULL.

$1:ID::NUMBER, $1:TRANSACTION_DATE::TIMESTAMP_NTZ(9)
1,
2,
3,
6,
7,
8,
9,
6,
7,
8,
9,
10,
11,

As an additional step i have used AWS crawler, on the source parquet files which gives me a schema of the following table as follows.

1 id : bigint
2 transaction_date : string 
3 product : string

Solution

  • Things i have Experimented out are as follows.

    select $1:ID::number, $1:PRODUCT::VARCHAR(16777216) from @my_s3/T1/day_2_sales.parquet (file_format => my_parq)

     It returned ID's correctly but the product values are null.
    

    select $1:ID::number, $1:Transaction_date::timestamp, $1:Product::varchar from @my_s3/T1/day_1_sales.parquet (file_format => my_parq)