amazon-s3 aws-glue amazon-athena aws-glue-data-catalog

How can we update existing partition data in an AWS Glue table without running a crawler?


When we update data in an existing partition by manually uploading to the S3 bucket, the data shows up in that existing partition in the Athena/Glue table.

But when the data is updated through the API, the object is uploaded to the existing partition in the S3 bucket, yet in the Glue table it is stored in a different partition, the one for the current date [last modified] (August 2, 2022, 17:52:15 (UTC+05:30)).

But the partition date in my S3 bucket is different: it is 2022/07/19 (s3://aiq-grey-s3-sink-created-at-partition/topics/core.Test.s3/2022/07/19/).

So when I look at the same object in the Glue table, I want it to appear under the 2022/07/19 partition.

Instead, unless I run the crawler, it appears under the partition for the current date.

When I run the crawler, it writes the data to the correct partition.

But I don't want to run the crawler every single time.

How can I update data in an existing partition of the Glue table through the API? Am I missing some configuration that is needed to get this result? Any suggestions are welcome.


Solution

  • Here are two solutions I proposed:

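        import boto3

        # `database` (Glue database name) and `output` (S3 URI for query
        # results) are expected to be defined by the caller.
        athena = boto3.client('athena')
        response = athena.start_query_execution(
            # Compose the DDL as needed for your table and partition.
            QueryString='ALTER TABLE table ADD PARTITION ... LOCATION ... ',
            QueryExecutionContext={
                'Database': database
            },
            ResultConfiguration={
                'OutputLocation': output,
            }
        )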
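
The `QueryString` above is left elided, so here is a minimal sketch of how it might be composed for the date-style prefix from the question. The helper name `build_add_partition_query`, the table name `my_table`, and the partition column names `partition_0`, `partition_1`, `partition_2` are assumptions for illustration (those column names are what a crawler typically generates for non-Hive-style paths). Check your table's actual partition keys in the Glue console before using it.

```python
import re


def build_add_partition_query(table, s3_prefix,
                              columns=("partition_0", "partition_1", "partition_2")):
    """Compose an ALTER TABLE ... ADD PARTITION statement for a yyyy/mm/dd prefix.

    `columns` are assumed partition key names; replace them with your table's keys.
    """
    # Extract the trailing yyyy/mm/dd path segments from the S3 prefix.
    match = re.search(r"/(\d{4})/(\d{2})/(\d{2})/?$", s3_prefix)
    if match is None:
        raise ValueError(f"no yyyy/mm/dd suffix in {s3_prefix!r}")

    # Pair each partition key with its value, e.g. partition_0 = '2022'.
    spec = ", ".join(
        f"{col} = '{val}'" for col, val in zip(columns, match.groups())
    )

    # Partition locations are directories, so make sure the URI ends with "/".
    location = s3_prefix if s3_prefix.endswith("/") else s3_prefix + "/"

    # IF NOT EXISTS keeps the statement safe to re-run for a known partition.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    # `my_table` is a hypothetical table name.
    print(build_add_partition_query(
        "my_table",
        "s3://aiq-grey-s3-sink-created-at-partition/topics/core.Test.s3/2022/07/19/",
    ))
```

The returned string can be passed as `QueryString` to `start_query_execution`. Running it right after each upload registers the partition in the Glue Data Catalog without a crawler run.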