I have time in SAS Enterprise Guide like below:
COL1 | COL2 | COL3 |
---|---|---|
10 | 12 | 44 |
15 | 8 | 2 |
10 | 12 | 44 |
And I need to select duplicated rows --> select duplicate rows, but only those where duplicates occur in each column
COL1 | COL2 | COL3 |
---|---|---|
10 | 12 | 44 |
10 | 12 | 44 |
How can I do that in SAS Enterprise Guide / PROC SQL ?
If you want to eliminate the observations that occur only once then you could do something like:
proc sort data=have out=want ;
by col1 col2 col3 ;
run;
data want;
set want;
by col1 col2 col3 ;
if not (first.col3 and last.col3) ;
run;
The unique rows are the only rows in their by grouping, so they are both the first row and the last row.
Here is a trick to use if there are a lot of variables and you are not sure which one will end up last if you use the _all_ variable list. Just add any one to the end and use that.
proc sort data=have out=want ;
by _all_;
run;
data want;
set want;
by _all_ col2;
if not (first.col2 and last.col2) ;
run;