sasaverageproc-sqlenterprise-guide4gl

How to calculate mean values per values in other column in one PROC SQL query in SAS Enterprise Guide?


I have query in PROC SQL with result as table in SAS Enterprise Guide like below:

My query:

proc sql;
create table work.my_table as 
select 
ID
, COUNTRY
, VALUE
from library1.table1
;quit;

Result of query in PROC SQL:

ID  | COUNTRY   | VALUE
----|-----------|---------
111 | FRANCE    | 1
222 | GERMANY   | 3
333 | FRAMNCE   | 2
444 | FRAMCE    | 3
555 | GERMANY   | 5

My task:

I need to modify my query, so as to have one more column: "MEAN_1" where will be mean of value (columns: "VALUE") per country (column: "COUNTRY")

Desire output:

ID  | COUNTRY   | VALUE   | MEAN_1
----|-----------|---------|-------
111 | FRANCE    | 1       | 2
222 | GERMANY   | 4       | 4.5
333 | FRAMNCE   | 2       | 2
444 | FRAMCE    | 3       | 2
555 | GERMANY   | 5       | 4.5

Because:

How can I modify my query in PROC SQL in SAS Enterprise Guide from the begining of my question, so as to have in result table like above in "Desire output" ?


Solution

  • The result of your query (input) does not match the expected output.

    In other flavors of SQL, you would achieve this with a subquery and a join.
    However using SAS SQL you don't have to as SAS it allows you to include other variables and remerges the data together automatically. More in-depth explanation can be found in this answer.

    proc sql;
       create table want as
       select *,
              mean(value) as mean_value
       from have 
       group by country
       ;
    quit;
    

    Using a double DoW Loop

    proc sort data=have; by country; run;
    
    data want;
        do _n_ = 1 by 1 until (last.country);
            set have;
            by country;
            sum=sum(sum, value);
        end;
    
        mean_value=divide(sum, _n_);
    
        do until (last.country);
            set have;
            by country;
            output;
        end;
    
        drop sum;
    run;
    

    Both hold the same result

    id  country value mean_value
    111 FRANCE    1       2
    222 GERMANY   4      4.5
    333 FRANCE    2       2
    444 FRANCE    3       2
    555 GERMANY   5      4.5