sasdatasteprecode

Setting *most* variables to missing, while preserving the contents of a select few


I have a dataset like this (but with several hundred vars):

id  q1  g7  q3  b2  zz  gl  az  tre
1   1   2   1   1   1   2   1   1
2   2   3   3   2   2   2   1   1
3   1   2   3   3   2   1   3   3
4   3   1   2   2   3   2   1   1
5   2   1   2   2   1   2   3   3
6   3   1   1   2   2   1   3   3

I'd like to keep id, b2, and tre, but set everything else to missing. In a dataset this small, I can easily use call missing (q1, g7, q3, zz, gl, az) - but in a set with many more variables, I would effectively like to say call missing (of _ALL_ *except ID, b2, tre*).

Obviously, SAS can't read my mind. I've considered workarounds that involve another data step or proc sql where I copy the original variables to a new ds and merge them back on post, but I'm trying to find a more elegant solution.


Solution

  • This technique uses an un-executed set statement (compile time function only) to define all variables in the original data set. Keeps the order and all variable attributes type, labels, format etc. Basically setting all the variables to missing. The next SET statement which will execute brings in only the variables the are NOT to be set to missing. It doesn't explicitly set variables to missing but achieves the same result.

       data nomiss;
           input id  q1  g7  q3  b2  zz  gl  az  tre;
           cards;
        1   1   2   1   1   1   2   1   1
        2   2   3   3   2   2   2   1   1
        3   1   2   3   3   2   1   3   3
        4   3   1   2   2   3   2   1   1
        5   2   1   2   2   1   2   3   3
        6   3   1   1   2   2   1   3   3
        ;;;;
           run;
        proc print;
           run;
        data manymiss;
           if 0 then set nomiss;
           set nomiss(keep=id b2 tre:);
           run;
        proc print;
           run;
    

    enter image description here