I have tables in SAS Enterprise Guide like below:
TABLE 1
COL1 | COL2 | ... | COLn
-----|------|------|-------
123 | | ... | xxx
| AAA | ... | xxx
122 | BCC | ... | xxx
... | ... | ... | xxx
And also 2 other tables like below:
TABLE 2
COL1 | ...| COLn
-----|----|------
998 | ...| xxx
999 | ...| xxx
001 | ...| xxx
... | ...| ...
TABLE 3
COL8 | ...| COLn
-----|----|------
117 | ...| xxx
906 | ...| xxx
201 | ...| xxx
... | ...| ...
As you can see TABLE 1 has missing values and I need to:
So, as a result I need something like below, because COL1 (where missing is fill by 0) is in TABLE 2 and COL2 is neither in TABLE 2 nor in TABLE 3, moreover COL2 is not numeric
COL1 | COL2 | ... | COLn
-----|------|------|-------
123 | | ... | xxx
0 | AAA | ... | xxx
122 | BCC | ... | xxx
... | ... | ... | xxx
Only TABLE 1 is relevant, if some column from TABLE 1 is numeric, has missing and exists in TABLE 2 or TABLE 3 fill this column by 0
How can I do that in SAS Enterprise Guide ?
Create sample data sets
data tab1;
infile datalines delimiter='|' dsd;
input col1 col2 $ col3 col4 $ col5 col6;
datalines;
1|ABC|.|ABC|5|9
.|DEF|8||10|.
;
run;
data tab2;
format col1 8. col2 $8. col5 8.;
stop;
run;
data tab3;
format col3 8. col4 $8.;
stop;
run;
col1 col2 col3 col4 col5 col6
1 ABC . ABC 5 9
. DEF 8 10 .
If you are only interested in the number of missing values for numeric variables, then a single call to the MEANS procedure computes the answer.
We use the ods output
statement to output the results to a SAS table tofill
.
We use the stackods
option, it allows the data set to resemble the default printed output from PROC MEANS
.
proc means data=tab1 nmiss n stackods;
ods output summary=tofill(where=(nmiss>0));
run;
Then we use the same kind of approach as your last questions here and here to match retrieve the columns that exists in either TAB2
or TAB3
. We fill the column names in a macro variable tofill
proc sql noprint;
create table names as
select distinct upcase(name) as name
from sashelp.vcolumn
where memname in ('TAB2', 'TAB3');
select upcase(variable) into :tofill separated by ' '
from tofill
where upcase(variable) in (select name from names)
;
quit;
Finally, we replace the missing values of the columns we are interested in by 0
data want;
set tab1;
array cols &tofill.;
do over cols;
if missing(cols) then cols = 0;
end;
run;
col1 col2 col3 col4 col5 col6
1 ABC 0 ABC 5 9
0 DEF 8 10 .
As desired
col1
and col3
are replaced by 0 because they are both numeric columns and exists in either tab2
or tab3
col4
remain missing as it is a character variablecol6
remain missing as this column does not exist in either tab2
or tab3
.