sqloracle-databasecountduplicatesinstance

How can I number the instance of a duplicate record?


I have contacts who have the same phone number listed multiple times. I'd like to delete all but one of those records. I only have read only access to the database, so I'll have to create a csv file to delete the records through the program interface.

in the csv file, I would like to number the instance of the duplicate

Something like this:

id PhoneNumber Instance
1243 5555555555 1
1511 5555555555 2
1631 5555555555 3
1131 5515551111 1

I know how to show a count on the records, but I'm really hoping to get the instance instead.

This is my code :

SELECT DISTINCT
  PS.PHONENUMBER.PHONENUMBER,
  PA1.PERSONPHONENUMBERASSOCID,
  PA1.PERSONID,
  PA1.ISPREFERRED,
  PA1.PHONENUMBERPRIORITYORDER
FROM
  PS.PHONENUMBER
  INNER JOIN PS.PERSONPHONENUMBERASSOC PA1 ON PA1.PHONENUMBERID = PS.PHONENUMBER.PHONENUMBERID
  INNER JOIN PS.PERSONPHONENUMBERASSOC PA2 ON PA1.PERSONID = PA2.PERSONID AND PA1.PHONENUMBERID = PA2.PHONENUMBERID AND PA1.PERSONPHONENUMBERASSOCID <> PA2.PERSONPHONENUMBERASSOCID
ORDER BY
  PA1.PERSONID,
  PA1.PHONENUMBERPRIORITYORDER

Solution

  • --    S a m p l e    D a t a :
    Create Table PHONES AS
    ( Select 1131 "ID",     4515551111 "PHONE_NUMBER" From Dual Union All
      Select 1154,  4400051133  From Dual Union All
      Select 1214,  4400051133  From Dual Union All
      Select 1243,  5555555555  From Dual Union All
      Select 1511,  5555555555  From Dual Union All
      Select 1631,  5555555555  From Dual Union All
      Select 1965,  3333051133  From Dual 
    );
    

    Use one of analytic functions like below (there are 3 of them with the same result) to get the instances. The main thing is to Partition By PHONE_NUMBER and order by whatever suits your desired order of instances - I did it by ID - could be anything else:

    --    S Q L :
    Select    ID, PHONE_NUMBER,
              ROW_NUMBER() OVER(Partition By PHONE_NUMBER Order By ID) "INSTANCE_ROW_NUMBER",
              COUNT(ID) OVER(Partition By PHONE_NUMBER Order By ID) "INSTANCE_COUNT",
              SUM(1) OVER(Partition By PHONE_NUMBER Order By ID Rows Between Unbounded Preceding And Current Row) "INSTANCE_SUM_1"
    From      PHONES
    Order By  ID
    
    /*      R e s u l t :
      ID  PHONE_NUMBER  INSTANCE_ROW_NUMBER INSTANCE_COUNT  INSTANCE_SUM_1
    ----  ------------  ------------------- --------------  --------------
    1131    4515551111                    1              1               1
    1154    4400051133                    1              1               1
    1214    4400051133                    2              2               2
    1243    5555555555                    1              1               1
    1511    5555555555                    2              2               2
    1631    5555555555                    3              3               3
    1965    3333051133                    1              1               1    */
    

    See the fiddle here.