Is it possible to remove consecutive duplicate records within a specific group and output only the last of them (based on date) in SAS (4GL)? I have data like:
data example;
input obs id dt value WANT_TO_SELECT;
cards;
1 10 1 500 0
2 10 2 750 1
3 10 3 750 1
4 10 4 750 0
5 10 5 500 0
6 20 1 150 1
7 20 2 150 0
8 20 3 370 0
9 20 4 150 0
;
run;
As you can see, for id=10 I would like to keep only one (the last) record with value 750, because those records directly follow one another, while value 500 can appear twice because its records are separated. I tried using first./last. but I am not sure how to sort the data.
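Looks like a use case for the NOTSORTED keyword of the BY statement. This will let you use VALUE as a BY variable even though the data is not actually sorted by VALUE. With NOTSORTED, a BY group is simply a run of consecutive records with the same ID and VALUE, so the LAST.VALUE flag marks the last record of each run and can be used to keep it.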
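data want;
  set example;
  /* NOTSORTED: BY groups are runs of consecutive equal values */
  by id value notsorted;
  /* keep only the last record of each run */
  if last.value;
run;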
Results:
Obs    obs    id    dt    value    WANT_TO_SELECT
 1      1     10     1     500           0
 2      4     10     4     750           0
 3      5     10     5     500           0
 4      7     20     2     150           0
 5      8     20     3     370           0
 6      9     20     4     150           0
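If the rows are not already in date order within each id, one option is to sort by id and dt first and then apply the same step. This is only a sketch, assuming dt is the variable that defines the order; the dataset name example_sorted is made up for illustration.

```sas
/* Sort by id and date so each id is contiguous and in date order. */
/* example_sorted is a hypothetical name, not from the question.   */
proc sort data=example out=example_sorted;
  by id dt;
run;

data want;
  set example_sorted;
  /* VALUE is not in sorted order, so NOTSORTED is still needed */
  by id value notsorted;
  if last.value;
run;
```

Do not sort by value itself: that would bring the separated 500 records together and they would collapse into a single row.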