guptateamdevelopersqlbasecentura

Deleting duplicates on SQLBase


Im currently running a SQLBase database and tried several was to select and delete duplicate items.

Here a scenario:

Tablename: test1

||=========||=========||==================||
|| column1 || column2 || rowid            ||
||=========||=========||==================||
||    1    ||    1    || AAAAAAAAAAAAAAAA ||
||    1    ||    1    || BBBBBBBBBBBBBBBB ||
||    1    ||    1    || CCCCCCCCCCCCCCCC ||
||    1    ||    2    || DDDDDDDDDDDDDDDD ||
||    1    ||    2    || EEEEEEEEEEEEEEEE ||
||    1    ||    3    || FFFFFFFFFFFFFFFF ||
||    1    ||    4    || GGGGGGGGGGGGGGGG ||
||=========||=========||==================||

RowID is a virtual column. This gets created by the system.

What I want to do is delete all duplicates so I end up with:

||=========||=========||==================||
|| column1 || column2 || rowid            ||
||=========||=========||==================||
||    1    ||    1    || AAAAAAAAAAAAAAAA ||
||    1    ||    2    || DDDDDDDDDDDDDDDD ||
||    1    ||    3    || FFFFFFFFFFFFFFFF ||
||    1    ||    4    || GGGGGGGGGGGGGGGG ||
||=========||=========||==================||

The problem is, its in SQLBase and there is no function that iterates my duplicates.

I find duplicates like:

SELECT column1, column2 COUNT(*) 
FROM test1 
GROUP BY column1, column2 
HAVING COUNT(*) > 1;

And there is my problem. Can't find a way to delete them from that point on. Also I obviously can't add the rowid to the duplicate select because of the group by statement.

Are there any possibilitys delete the duplicates so having only one entry each combination of column1 und column2?


Solution

  • For find rows to delete, you can use better select:

    select a.rowid 
    from test1 a 
    where a.rowid not in 
    (
       select max (x.rowid)
       from test1 x
       group by x.column1, x.column2
    );
    

    Unfortunately, does not apply:

    delete
    from test1 a 
    where a.rowid not in 
    (
       select max (x.rowid)
       from test1 x
       group by x.column1, x.column2
    );
    

    But you can delete this with two sql handle:

    Call SqlPrepareAndExecute(hSql1, 'select a.rowid from test1 ...(like above)...into :sRowid')
    While SqlFetchNext(hSql1, nFetch)
       Call SqlPrepareAndExecute(hSql2, 'delete from test1 where rowid = :sRowid')
    

    Example as Sql Base stored procedure:

    Procedure: delDuplicityTest1
    Parameters
    Local Variables
       Sql Handle: hSql1
       Sql Handle: hSql2
       String: sSelect
       String: sRowid
       Number: nFetch
    Actions
          Call SqlConnect(hSql1)
          Call SqlConnect(hSql2)
          Set sSelect = 'select a.rowid 
                         from test1 a 
                         where a.rowid not in 
                         (
                            select min (x.rowid)
                            from test1 x
                            group by x.column1, x.column2
                         )
                         into :sRowid '
          Call SqlPrepareAndExecute(hSql1, sSelect)
          While SqlFetchNext(hSql1, nFetch)
             Call SqlPrepareAndExecute(hSql2, 'delete from test1 where rowid = :sRowid')
    
          Call SqlCommit(hSql2)
          Call SqlDisconnect(hSql1)
          Call SqlDisconnect(hSql2)
    /