I tried to filter data where they are on a list by using Data step in SAS
proc sql;
create table id_list as
select distinct id from customer;
quit;
data test;
set fulldata;
where id in id_list;
run;
It doesnt work. However, if I use "where id in (1,2,3)" it works. Could anyone please help me with where in a list of data ? Thanks
If your fulldata is sorted or indexed by id then you can use a MERGE.
This approach is useful when the list is very large and could exceed 64K characters when placed in a macro variable.
proc sort data=customer(keep=id) out=list nodupkey;
by id;
data want;
merge fulldata(in=full) list(in=list);
by id;
if full and list;
run;
For the case of wanting to stack multiple data sets the use of a hash is recommended.
Example:
Several big
tables with some overlapping id
values are to be stacked and filtered by matching id
s to those in a smaller table that might have repeated ids.
data big1 big2 big3 big4 big5 big6 big7;
do id = 1 to 6666;
if 0.00 <= id / 3333 <= 0.50 then output big1;
if 0.25 <= id / 3333 <= 0.75 then output big2;
if 0.50 <= id / 3333 <= 1.00 then output big3;
if 0.75 <= id / 3333 <= 1.25 then output big4;
if 1.00 <= id / 3333 <= 1.50 then output big5;
if 1.25 <= id / 3333 <= 1.75 then output big6;
if 1.50 <= id / 3333 <= 2.00 then output big7;
end;
run;
data small;
do _n_ = 1 to 666;
id = rand('integer', 6666);
output;
do while (rand('uniform') < 0.10);
output;
end;
end;
run;
data want;
attrib id length=8;
if _n_ = 1 then do;
declare hash lookup (dataset:'small');
lookup.defineKey('id');
lookup.defineDone();
end;
set big1-big7 indsname=from;
source=from;
if lookup.check() = 0;
run;