sqlsaswhere-clausedatastep

SAS DATA STEP WHERE CONDITION IN A LIST


I tried to filter data where they are on a list by using Data step in SAS

proc sql;
create table id_list as
select distinct id from customer;
quit;

data test;
set fulldata;
where id in id_list;
run;

It doesnt work. However, if I use "where id in (1,2,3)" it works. Could anyone please help me with where in a list of data ? Thanks


Solution

  • If your fulldata is sorted or indexed by id then you can use a MERGE.

    This approach is useful when the list is very large and could exceed 64K characters when placed in a macro variable.

    proc sort data=customer(keep=id) out=list nodupkey;
      by id;
    
    data want;
      merge fulldata(in=full) list(in=list);
      by id;
      if full and list;
    run;
    

    For the case of wanting to stack multiple data sets the use of a hash is recommended.

    Example:

    Several big tables with some overlapping id values are to be stacked and filtered by matching ids to those in a smaller table that might have repeated ids.

    data big1 big2 big3 big4 big5 big6 big7;
      do id = 1 to 6666;
        if 0.00 <= id / 3333 <= 0.50 then output big1;
        if 0.25 <= id / 3333 <= 0.75 then output big2;
        if 0.50 <= id / 3333 <= 1.00 then output big3;
        if 0.75 <= id / 3333 <= 1.25 then output big4;
        if 1.00 <= id / 3333 <= 1.50 then output big5;
        if 1.25 <= id / 3333 <= 1.75 then output big6;
        if 1.50 <= id / 3333 <= 2.00 then output big7;
      end;
    run;
    
    data small;
      do _n_ = 1 to 666;
        id = rand('integer', 6666);
        output;
        do while (rand('uniform') < 0.10);
          output;
        end;
      end;
    run;
    
    data want;
      attrib id length=8;
    
      if _n_ = 1 then do;
        declare hash lookup (dataset:'small');
        lookup.defineKey('id');
        lookup.defineDone();
      end;
    
      set big1-big7 indsname=from;
      source=from;
    
      if lookup.check() = 0;
    run;