amazon-redshift

Is there any way to find table creation date in redshift?


I am having trouble finding the table creation date in Amazon Redshift. I know svv_table_info gives all the info about a table except the creation date. Can anyone help, please?


Solution

  • In Redshift you can get the create time of a table by searching for the start and stop time of any CREATE TABLE SQL run in svl_qlog. There are other system tables you can look at for similar data, but the problem with this approach is that the log is only kept for a few days (roughly 3 to 5). Everyone would prefer this metadata to be stored along with the table itself so it could simply be queried; since it is not, Amazon recommends exporting the log data you want to retain to S3. In my opinion you can then import those S3 files back into a permanent table, called something like aws_table_history, so that you keep this data forever.

    -- svl_qlog keeps a truncated prefix of each statement in its "substring" column
    select * from svl_qlog where substring ilike 'create table%' order by starttime desc limit 100;
    
    -- stl_querytext holds the full statement text, joined to stl_query on the query id
    select * from stl_query a, stl_querytext b where a.query = b.query and b.text ilike 'create table%' order by a.starttime desc limit 100; 
    

    Or get just the table name and creation date like this:

    select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, 
    starttime as createdate 
    from stl_query a, stl_querytext b 
    where a.query = b.query and b.text ilike 'create table%' order by a.starttime desc;
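
    To see what the nested split_part calls are doing, here is a standalone example against a made-up statement string (myschema.orders is hypothetical):

    -- inner split_part keeps everything after 'table ', outer keeps the first word of that
    select split_part(split_part('create table myschema.orders (id int)', 'table ', 2), ' ', 1);
    -- returns: myschema.orders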
    

    Export the CREATE TABLE history you want to keep to an S3 bucket you have created, using your access keys. The select statement below outputs the name of each table created and the datetime it was created.

    Create a staging table (a regular table, despite the temp_history name) with the data you want to export to S3.

    create table temp_history as 
    (select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, starttime as createdate 
    from stl_query a, stl_querytext b 
    where a.query = b.query 
    and b.text ilike 'create table%' order by a.starttime desc);
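
    As a quick sanity check before unloading, you can count and inspect what landed in the staging table (the limit of 10 is arbitrary):

    select count(*) from temp_history;
    select * from temp_history order by createdate desc limit 10;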
    

    Then unload this table to S3.

    unload ('select * from temp_history') 
    to 's3://tablehistory' credentials 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretkey' 
    DELIMITER '|' NULL AS '' ESCAPE ALLOWOVERWRITE;
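
    If your cluster has an IAM role attached, UNLOAD also accepts that in place of embedded access keys; a sketch, where the role ARN is a placeholder:

    unload ('select * from temp_history') 
    to 's3://tablehistory' iam_role 'arn:aws:iam::123456789012:role/myRedshiftRole' -- placeholder ARN
    DELIMITER '|' NULL AS '' ESCAPE ALLOWOVERWRITE;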
    

    Create a new table in AWS Redshift.

    CREATE TABLE aws_table_history
    (
    tablename VARCHAR(150),
    createdate DATETIME
    );
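
    Note that in Redshift DATETIME is just an alias for TIMESTAMP, so the createdate column is stored as a timestamp either way.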
    

    Then import it back into your custom table.

    copy aws_table_history from 's3://tablehistory' credentials 'aws_access_key_id=MYKEY;aws_secret_access_key=MYID'
    delimiter '|'
    emptyasnull
    blanksasnull
    removequotes
    escape
    dateformat 'YYYY-MM-DD'
    timeformat 'YYYY-MM-DD HH:MI:SS'
    maxerror 20;
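
    Once the copy completes, answering the original question becomes a direct lookup against the permanent table (the filter value is just an example):

    select tablename, createdate 
    from aws_table_history 
    where tablename ilike '%orders%' 
    order by createdate desc;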
    

    I tested all of this and it works for us; I hope it helps some people. Lastly, a simpler method would be to use Talend Open Studio for Big Data: create a new job, grab the tRedshiftRow component, and paste the following SQL into it. Then build the job, and you can schedule the resulting .bat (Windows) or .sh (Unix) to run in any environment you want.

    INSERT INTO temp_history 
    (select split_part(split_part(b.text,'table ', 2), ' ', 1) as tablename, starttime as createdate 
    from stl_query a, stl_querytext b 
    where a.query = b.query 
    and b.text ilike 'create table%' order by a.starttime desc);
    COMMIT;
    insert into aws_table_history
    select distinct s.* 
    from temp_history s;
    COMMIT;
    -- remove duplicates, keeping only the newest create date per table name
    DELETE FROM aws_table_history USING aws_table_history a2 
    WHERE aws_table_history.tablename = a2.tablename AND
    aws_table_history.createdate < a2.createdate;
    COMMIT;
    -- clear everything from the staging table
    TRUNCATE temp_history;
    COMMIT;
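
    Since the source log tables only keep a few days of history, the job has to run at least that often; scheduling the built .bat/.sh daily is one safe choice.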