I have a SAS data step that uses a %do
loop to build a set statement that combines multiple files. The files have names of the form datafile2020q1
or datafile2021q3
, and they're each stored in libraries with names of the form DataLibrary2020
or DataLibrary2021
.
%macro processclaimsdata;
data rawclaimsdata (keep=&claimvariables.);
set
%do yr=2019 %to 2021;
%do qrtr=1 %to 4;
DataLibrary&yr..datafile&yr.q&qrtr. (keep=&claimvariables.)
%end;
%end;
;
run;
%mend;
My goal is to add a variable to the dataset that has YYYYqQ
in it, e.g., 2020q1 or 2021q3, for each dataset, so I can keep track of which file it came from. Is this possible by modifying my code above, or do I need to rework it to use proc append
and/or proc sql
?
Use the indsname=
option to keep track of the dataset that a row came from. Although it won't be in the format you specify, this will tell you the exact dataset name that a row came from.
The code below will create a new temporary variable in the dataset called dsn
that stores the name of the dataset a row came from. You can add it to the dataset by assigning it to a permanent variable.
data want;
set have1
have2
have3
indsname=dsn;
dsname = dsn;
run;
You can then use substr()
or other string functions to snag the name of the quarter from each dataset name. For example:
qtr = substr(dsn, -6);