I'm building a database on Starburst using unmanaged (or external) tables that read CSV files stored on S3.
The file path structure will be as follows:
bucket_name/file_name/year/month/day
The CSV files are loaded weekly, and my tables need to read only the most recent file, i.e., the one with the latest date in the path.
Is there a way to define a parameterized external location in the table definition so that only the latest file is read?
For example, something like:
bucket_name/file_name/<year>/<month>/<day>
No, that's not possible, but since you are also using year/month/day as what looks to be partitioning you can put the partition_by definition in the create table statement then you could add the dates you are searching for in your where clause and thereby limiting what is being read.