abap internal-tables

Find, delete and extract duplicates from an internal table


I have an internal table with 2 million rows that's been uploaded from a file. I want to delete any lines that are duplicates and extract the row numbers of the duplicates and add them to another table. What's the best/most efficient way to do this with ABAP 7.40? Classic ABAP is also fine.

Here's an example of my original table. I want to find duplicates by comparing columns A and B:

A  | B  | C
-----------
a1 | b1 | c1
a1 | b2 | c1
a2 | b1 | c2
a1 | b1 | c2
a2 | b2 | c2

Rows 1 and 4 are duplicates, so I want to remove both of them and end up with:

A  | B  | C
-----------
a1 | b2 | c1
a2 | b1 | c2
a2 | b2 | c2

and also have another table that stores duplicates:

Row number  | Error 
-------------------
1           | Duplicate
4           | Duplicate      

I've seen similar requests on this site but they work a bit differently to what I need. Thanks.


Solution

  • This code finds which lines are duplicates (valid for ABAP 7.40 and later):

    TYPES : BEGIN OF ty_line,
              a TYPE c LENGTH 2,
              b TYPE c LENGTH 2,
              c TYPE c LENGTH 2,
            END OF ty_line,
            ty_lines TYPE STANDARD TABLE OF ty_line WITH EMPTY KEY.
    
    DATA(itab) = VALUE ty_lines(
    ( a = 'a1' b = 'b1' c = 'c1' )
    ( a = 'a1' b = 'b2' c = 'c1' )
    ( a = 'a2' b = 'b1' c = 'c2' )
    ( a = 'a1' b = 'b1' c = 'c2' )
    ( a = 'a2' b = 'b2' c = 'c2' ) ).
    
    DATA(duplicates) = VALUE string_table(
        FOR GROUPS <group> OF <line> IN itab
        GROUP BY ( a = <line>-a b = <line>-b size = GROUP SIZE )
        ( LINES OF COND #( WHEN <group>-size > 1 THEN VALUE string_table( (
            concat_lines_of(
                table = VALUE string_table( 
                        FOR <line2> IN GROUP <group> INDEX INTO tabix ( |{ tabix }| ) )
                sep   = ',' ) ) ) ) ) ).
    
    ASSERT duplicates = VALUE string_table( ( `1,4` ) ).
    

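    I use LINES OF so that no line is generated when a group has a size of 1: in that case COND, having no ELSE branch, returns an initial (empty) table, and LINES OF an empty table adds nothing.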
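
The question also asks for the duplicates to be removed and logged by row number. A minimal sketch of one way to do that follows; it reuses the same grouping idea with a classic LOOP AT ... GROUP BY. It assumes itab and ty_lines are declared as above. The names ty_error, ty_errors, row_number, error_text, unique_lines, <row>, <grp> and <member> are made up for illustration, and the sketch relies on sy-tabix in the member loop holding the row's position in itab, as INDEX INTO does in the expression above. It has not been measured against 2 million rows.

```abap
TYPES: BEGIN OF ty_error,
         row_number TYPE i,
         error_text TYPE string,
       END OF ty_error,
       ty_errors TYPE STANDARD TABLE OF ty_error WITH EMPTY KEY.

DATA errors       TYPE ty_errors.
DATA unique_lines TYPE ty_lines.

" Group the rows by columns A and B; each group knows how many members it has.
LOOP AT itab ASSIGNING FIELD-SYMBOL(<row>)
     GROUP BY ( a = <row>-a b = <row>-b size = GROUP SIZE )
     ASSIGNING FIELD-SYMBOL(<grp>).

  LOOP AT GROUP <grp> ASSIGNING FIELD-SYMBOL(<member>).
    IF <grp>-size > 1.
      " Every member of a multi-row group is logged with its original row number.
      APPEND VALUE #( row_number = sy-tabix
                      error_text = `Duplicate` ) TO errors.
    ELSE.
      " Rows that are alone in their group are kept.
      APPEND <member> TO unique_lines.
    ENDIF.
  ENDLOOP.

ENDLOOP.

" Groups are visited in order of first occurrence, so sort the log by row number.
SORT errors BY row_number.

" Replace the original table with the rows that had no duplicate.
itab = unique_lines.
```

Collecting the surviving rows into a second table avoids deleting from itab by index while the row numbers are still needed, at the cost of holding both tables in memory for a short time.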