I have a SAS data step that uses a %do loop to build a set statement that combines multiple files. The files have names of the form datafile2020q1 or datafile2021q3, and they're each stored in libraries with names of the form DataLibrary2020 or DataLibrary2021.
%macro processclaimsdata;
data rawclaimsdata (keep=&claimvariables.);
set
%do yr=2019 %to 2021;
%do qrtr=1 %to 4;
DataLibrary&yr..datafile&yr.q&qrtr. (keep=&claimvariables.)
%end;
%end;
;
run;
%mend;
My goal is to add a variable to the dataset that has YYYYqQ in it, e.g., 2020q1 or 2021q3, for each dataset, so I can keep track of which file it came from. Is this possible by modifying my code above, or do I need to rework it to use proc append and/or proc sql?
Use the
indsname=option to keep track of the dataset that a row came from. Although it won't be in the format you specify, this will tell you the exact dataset name that a row came from.The code below will create a new temporary variable in the dataset called
dsnthat stores the name of the dataset a row came from. You can add it to the dataset by assigning it to a permanent variable.You can then use
substr()or other string functions to snag the name of the quarter from each dataset name. For example:qtr = substr(dsn, -6);