
How to format Timestamp for ingesting JSON data into AWS IoT Analytics datastore using Parquet file format?


I'd like to ingest data into an AWS IoT Analytics datastore in Parquet format. The records in the channel look like this:

{
  "Total_in": 1825.5841,
  "Time": "2023-02-17T14:08:19"
}

My question: how do I need to format the time (in a transformation that is part of a pipeline activity) so that it can be used as a "timestamp" in the Parquet file?

The schema of the Parquet files looks like this:

Column name   Data type
time          TIMESTAMP
total_in      FLOAT

I tried a timestamp in seconds, a timestamp in milliseconds, and the format %Y-%m-%dT%H:%M:%S (Python). In all of these cases no record ever reaches the datastore ("Last message arrival time" is always none). If I change the format to %Y-%m-%dT%H:%M:%S.%fZ, records do arrive in the datastore ("Last message arrival time" is not null), but when I run a query (Select * from datastore), the result set is empty.

I have already enabled logging, but neither the pipeline logs nor the datastore logs contain any information.

The datastore has no partitions (partitioning is disabled).


Solution

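  • The timestamp needs to be provided in the format yyyy-MM-dd HH:mm:ss (e.g. 2020-10-22 11:23:48). In other words, the date and time are separated by a space rather than a "T", with no fractional seconds and no trailing "Z".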
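
As a minimal sketch of that conversion, assuming the transformation runs in a Lambda activity of the pipeline and that the activity passes the handler a list of messages and expects a list back: the helper name to_parquet_timestamp is hypothetical, and the "Time" key is taken from the sample record in the question.

```python
from datetime import datetime


def to_parquet_timestamp(iso_time: str) -> str:
    """Convert '2023-02-17T14:08:19' to '2023-02-17 14:08:19'."""
    # Parse the ISO-like input used in the channel records.
    parsed = datetime.strptime(iso_time, "%Y-%m-%dT%H:%M:%S")
    # Python equivalent of the yyyy-MM-dd HH:mm:ss pattern.
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


def lambda_handler(event, context):
    # Assumption: 'event' is the batch (list) of messages from the pipeline.
    for message in event:
        message["Time"] = to_parquet_timestamp(message["Time"])
    # Return the modified batch so the pipeline can pass it on.
    return event
```

With the sample record from the question, "Time" becomes "2023-02-17 14:08:19" and "Total_in" is left unchanged.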