sqloracle-databasebulk-operations

Group By on Consecutive Records for audit tables which has huge data in Oracle


I have an audit table in oracle whose data is growing very fast and has already became very large Looking that i have decided to cut short it by not auditing 'CALLED_TIME' column.

So I need to delete records from audit table where audit got recorded for CALLED_TIME and later I want to delete CALLED_TIME column from my table(which is easy) to get it not recorded further.

It will be better if I can list out REVs to be deleted

Data in table before

REV     CALLED_TIME             REVTYPE DATA1   DATA2   DATA3
239402  2014-08-20 20:48:20     0       12122   4       22
239403  2014-08-20 20:52:17     1       12122   4       22
239404  2014-08-20 20:52:58     1       12122   4       22
239405  2014-08-20 20:53:13     1       12122   4       22
239406  2014-08-20 20:53:13     1       12122   4       223
239407  2014-08-20 21:02:05     1       12122   4       223
239408  2014-08-20 21:02:39     1       12122   4       223
239409  2014-08-20 21:04:22     1       12122   4       223
239410  2014-08-20 21:27:53     1       12122   4       223
239411  2014-08-20 21:28:51     1       12122   4       223
239412  2014-08-20 21:29:50     1       12122   4       223
239413  2014-08-20 21:29:50     1       12122   43      223
239414  2014-08-20 21:46:19     1       12122   43      223
239415  2014-08-20 21:46:51     1       12122   43      223
239416  2014-08-20 21:53:08     1       12122   43      223
239417  2014-08-20 22:00:45     1       12122   43      223
239418  2014-08-20 22:01:26     1       12122   43      223
239419  2014-08-20 22:23:01     1       111141  43      223
239420  2014-08-20 22:23:48     1       111141  43      223
239421  2014-08-20 22:32:11     1       111141  43      223
239422  2014-08-20 22:44:42     1       111141  43      223
239423  2014-08-20 22:46:38     1       111141  43      223
239414  2014-08-20 22:55:33     2       111141  43      223

Output should look like

REV     CALLED_TIME             REVTYPE DATA1   DATA2   DATA3
239402  2014-08-20 20:48:20     0       12122   4       22
239406  2014-08-20 20:53:13     1       12122   4       223
239413  2014-08-20 21:29:50     1       12122   43      223
239419  2014-08-20 22:23:01     1       111141  43      223
239414  2014-08-20 22:55:33     2       111141  43      223

I already have looked many related solutions but the one i chose is joining table twice and hence getting very lazy.


Solution

  • From your sample data it looks like you want to delete most of your records. So this is a case where the best approach is to create a table with the data you want to retain and then just zap your original table.

    create table temp_audit as 
         select * from (
             select t.*
                    , row_number() over (partition by REVTYPE, DATA1, DATA2, DATA3
                                         order by CALLED_TIME asc) rn
              from orig_audit
                  )
         when rn = 1;
    

    Then you need to raze the original table and substitute the reserved table. The quickest approach might be:

    drop table orig_audit;
    rename temp_audit to orig_audit;
    

    But if you have a lot of indexes, grants, foreign keys to reinstate that might not be so quick. Alternatives? If you have no foreign keys referencing the table then this will do the trick ...

    truncate table  orig_audit;
    insert into orig_audit
         select * from temp_audit;
    

    ... otherwise ...

    delete from orig_audit;
    insert into orig_audit
         select * from temp_audit;