sqloracle-database

DELETE Multiple Duplicate Rows Without Column ID


I have a table that contains 15 duplicate rows (for a total of 30 including originals + duplicates). I have been tasked with writing a script to grab the duplicate records and delete them. My problem is, the table does not contain an ID column like "shipment_item_id". So I found a solution to use rowid to add my own id column and then I can distinguish the dupes from the originals.

I wrote a SELECT statement that gets the duplicates, but when I try to include it in a DELETE statement, I get an error :

ORA-00913 - too many values.

I find it hard to believe that 15 records is too many.

Here is the script. I'm sure it's an easy fix but I have not been able to figure it out:

DELETE shipment_item
 WHERE rowid NOT IN (
                     SELECT MAX(rowid),
                            po_number,
                            item_number
                       FROM shipment_item
                      GROUP BY po_number, item_number
                     HAVING COUNT(*) > 1
                    )

Solution

  • The "two many values" does not talk about the rows, but about the columns:
    you do a:

    where <1 column> not in (select <3 columns>)
    

    Do keep only the max(rowid) in your select and you're good to go.

    That said...

    You shall remove your having count(*) > 1 to avoid deleting non duplicated entries.

    Let's say your table has:

    rowid po_number item_number
    1 A B
    2 C D
    3 C D

    Your subselect will return 3 for C,D (as intended),
    but the having count(*) > 1 will filter out 1 for A,B because as you have only one A,B ,its count(*) = 1.
    Thus the subselect returns only 3 as an entry to preserve, and thus... pooof with A,B!

    And so...

    You'll end up with:

    DELETE 
    FROM SHIPMENT_ITEM
    WHERE rowid not in (
    SELECT max(rowid)
    FROM shipment_item
    GROUP BY
        po_number,
        item_number
    )