I would like to sum up and create & rates, each month of data is a column in my dataset. I want to have a dataset with the rate and sum for each month at the end. I currently run this code to create some calculations:
PROC SQL;
CREATE TABLE WANT AS
SELECT
(SUM(ROLL_&mmmyy2) /SUM(ACCOUNTS_&mmmyy2))*100 AS &mmmyy2
,(SUM(ROLL_&mmmyy3) /SUM(ACCOUNTS_&mmmyy3))*100 AS &mmmyy3
,(SUM(ROLL_&mmmyy4) /SUM(ACCOUNTS_&mmmyy4))*100 AS &mmmyy4
,(SUM(ROLL_&mmmyy5) /SUM(ACCOUNTS_&mmmyy5))*100 AS &mmmyy5
,(SUM(ROLL_&mmmyy6) /SUM(ACCOUNTS_&mmmyy6))*100 AS &mmmyy6
,(SUM(ROLL_&mmmyy7) /SUM(ACCOUNTS_&mmmyy7))*100 AS &mmmyy7
,(SUM(ROLL_&mmmyy8) /SUM(ACCOUNTS_&mmmyy8))*100 AS &mmmyy8
,(SUM(ROLL_&mmmyy9) /SUM(ACCOUNTS_&mmmyy9))*100 AS &mmmyy9
,(SUM(ROLL_&mmmyy10) /SUM(ACCOUNTS_&mmmyy10))*100 AS &mmmyy10
,(SUM(ROLL_&mmmyy11) /SUM(ACCOUNTS_&mmmyy11))*100 AS &mmmyy11
,(SUM(ROLL_&mmmyy12) /SUM(ACCOUNTS_&mmmyy12))*100 AS &mmmyy12
,(SUM(ROLL_&mmmyy13) /SUM(ACCOUNTS_&mmmyy13))*100 AS &mmmyy13
,(SUM(ROLL_&mmmyy14) /SUM(ACCOUNTS_&mmmyy14))*100 AS &mmmyy14
,SUM(ROLL_&mmmyy2) AS SUM_&mmmyy2
,SUM(ROLL_&mmmyy3) AS SUM_&mmmyy3
,SUM(ROLL_&mmmyy4) AS SUM_&mmmyy4
,SUM(ROLL_&mmmyy5) AS SUM_&mmmyy5
,SUM(ROLL_&mmmyy6) AS SUM_&mmmyy6
,SUM(ROLL_&mmmyy7) AS SUM_&mmmyy7
,SUM(ROLL_&mmmyy8) AS SUM_&mmmyy8
,SUM(ROLL_&mmmyy9) AS SUM_&mmmyy9
,SUM(ROLL_&mmmyy10) AS SUM_&mmmyy10
,SUM(ROLL_&mmmyy11) AS SUM_&mmmyy11
,SUM(ROLL_&mmmyy12) AS SUM_&mmmyy12
,SUM(ROLL_&mmmyy13) AS SUM_&mmmyy13
,SUM(ROLL_&mmmyy14) AS SUM_&mmmyy14
FROM HAVE;
QUIT;
The &mmmyy(number) macros are all dates.
Is there a way to simplify this, I'm currently trying:
DATA WANT;
SET HAVE;
ARRAY ROLL_&&MMMYY[12] ROLL_&&MMMYY2-ROLL_&&MMMYY14;
ARRAY ACCOUNTS_&&MMMYY[12] ACCOUNT&&MMMYY2-ACCOUNT&&MMMYY14;
DO I = 2 TO 14;
&&MMMYY[I]= (SUM(ROLL_&&MMMYY[I])/SUM(ACCOUNTS_&&MMMYY[I]))*100;
END;
RUN;
This is an example of how the data looks:
Account Number ROLL_Jun23 Roll_May23 Roll_Apr23 Roll_Mar23 Accounts_Jun23 Accounts_May23 Accounts_Apr23 Accounts_Mar23
1 1 0 0 0 1 1 1 1
2 0 1 1 0 1 1 0 0
3 0 0 0 0 1 0 0 0
4 1 1 1 1 1 1 1 0
5 1 1 1 1 1 1 1 0
6 1 1 0 0 1 1 0 0
The idea of my array is that I would end up with data like:
Jun23 May23 Apr23 Mar23
66.67 80.00 100.00 200.00
You should treat data as data, and thus I emphasize using a pivot strategy to take the dates out of the variable names.
Use a reporting procedure such as TABULATE or REPORT to present summaries of your now categorical data.
Example:
A DATA step view is used to pivot the data for use in TABULATE.
data have;
input Account_Number ROLL_Jun23 Roll_May23 Roll_Apr23 Roll_Mar23 Accounts_Jun23 Accounts_May23 Accounts_Apr23 Accounts_Mar23;
datalines;
1 1 0 0 0 1 1 1 1
2 0 1 1 0 1 1 0 0
3 0 0 0 0 1 0 0 0
4 1 1 1 1 1 1 1 0
5 1 1 1 1 1 1 1 0
6 1 1 0 0 1 1 0 0
;
data flags / view=flags;
set have;
array rolls roll_:;
array accounts accounts_:;
do over rolls;
date = input('01'||scan(vname(rolls),2,'_'),date7.);
roll_flag = rolls;
acct_flag = accounts;
output;
end;
format date monyy5.;
keep account_number date roll_flag acct_flag;
run;
proc tabulate data=flags;
class account_number;
class date / descending;
var roll_flag acct_flag;
table
date='Month' * (roll_flag='Roll' acct_flag='Acct') * f=5. * sum='';
;
run;
Results
-------------------------------------------------
| Month |
|-----------------------------------------------|
| JUN23 | MAY23 | APR23 | MAR23 |
|-----------+-----------+-----------+-----------|
|Roll |Acct |Roll |Acct |Roll |Acct |Roll |Acct |
|-----+-----+-----+-----+-----+-----+-----+-----|
| 4| 6| 4| 5| 3| 3| 2| 1|
-------------------------------------------------