sqlsasproc-sqlsas-studio

SAS Studio - Count and filter over multiple columns from a macro variable list


I am currently checking a large number of tables for linking keys(all with different names) between datasets. My current code checks for how many records exist and for how many records have a 0 key value. My code looks like this:

%let table = table1;
proc sql;
create table linking_keys as
select * 
from work.my_datasets
where Table_Name = "&table" 
and linking_key NE 'N';
quit;

proc sql;
select Column_Name
into :lnkkeycol separated by ','
from linking_keys;
quit;

%put &lnkkeycol;
proc sql;
create table lnk_key_only as
select count(*)
from library1.&table;
quit;

proc sql;
create table lnk_key_only_2 as
select count(&lnkkeycol)
from library1.&table
where &lnkkeycol = 0;
quit;

This works fine when there is only 1 linking key in the table, but when there is multiple, it falls over. I have tried grouping by macro variables but not had much success with the below:

proc sql;
create table lnk_key_only_2 as
select count(&lnkkeycol)
from library1.&table
where &lnkkeycol = 0
group by &lnkkeycol;
quit;

I am not entirely sure where I am going wrong.


Solution

  • You need to specify each link key column separately. If you have multiple link key columns, you are passing this into the count() function:

    count(col1, col2, col3)

    Which is invalid for the count() function. Instead, create a separate macro variable that counts each link key col only when its value is 0.

    proc sql;
        select cats('sum( (', column_name, '=0) ) as count_', column_name)
        into :lnkkeycol separated by ','
        from linking_keys;
    quit;
    

    This will create individual sum() functions that will only count the condition where column_name = 0. (boolean conditon) is a shortcut in SAS that creates a 1/0 value if the condition is true or false. If we encapsulate it in a sum function, sum( (boolean condition) ), we can sum only the 1's.

    proc sql;
        create table lnk_key_only_2 as
        select &lnkkeycol
        from library1.&table;
    quit;
    

    This translates into:

    proc sql;
        create table lnk_key_only_2 as
           select sum( (col1=0) ) as count_col1
                , sum( (col2=0) ) as count_col2
                , sum( (col3=0) ) as count_col3
        from library1.&table;
    quit;