I've got 4 columns of data where some of the rows were accidentally duplicated, but the only duplicated value is in column A. I basically need to find and identify all the duplicates, which Excel can do by default, but instead of removing the duplicates I want to remove everything except the duplicates. I'll then take that list and remove those rows from my SQL database.
Here is a small example of what the data looks like:
This is what I would expect to see after doing whatever is needed to get just a duplicate row showing.
Here is one more way to accomplish the desired output, using the GROUPBY() function.
Source Data:
• Method One:
=GROUPBY(A1:A6, B1:D6, SINGLE, 3, 0, , COUNTIF(A2:A6, A2:A6)>1)
• Method Two: Same as above, but uses the TRIMRANGE() reference operator (note the dot after each colon), so one doesn't have to worry about the exact ranges. If you don't have access to it, it is better to use Structured References, aka Tables.
=GROUPBY(A:.A, B:.D, SINGLE, 3, 0, , COUNTIF(A:.A, A:.A)>1)
• Method Three: Uses Tables:
=VSTACK(Table1[#Headers],
    GROUPBY(Table1[Item],
        Table1[[Ord]:[Qty]], SINGLE, , 0, ,
        COUNTIF(Table1[Item], Table1[Item])>1))
Or,
=LET(
    _a, Table1[#All],
    _b, TAKE(_a, 1),
    _c, DROP(TAKE(_a, , 1), 1),
    _d, MAP(_c, LAMBDA(x, COUNTIF(INDEX(_c, 1):x, x))),
    _e, FILTER(DROP(_a, 1), (ISNA(XMATCH(_c, UNIQUE(_c, , 1)))*_d)=1),
    VSTACK(_b, _e))
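
For comparison, the COUNTIF(...)>1 test that serves as the filter argument in the GROUPBY() formulas can also be passed to FILTER() directly. This is only a minimal sketch, assuming the same layout as Method One (headers in row 1, data in A2:D6) and a version of Excel with dynamic arrays. The "No duplicates" fallback text is just an illustrative choice. Unlike GROUPBY(), the first formula returns every copy of each duplicated row rather than one row per item. The second returns only the distinct column A values that occur more than once, which may be the handier list to take to the SQL side.

```
=FILTER(A2:D6, COUNTIF(A2:A6, A2:A6)>1, "No duplicates")

=UNIQUE(FILTER(A2:A6, COUNTIF(A2:A6, A2:A6)>1, "No duplicates"))
```

Each formula goes in its own cell with enough empty cells below and to the right for the result to spill.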