sasdata-manipulationinner-product

Inner product between a table and a column in sas


I would like to calculate an inner product between a table (matrix) and a column (vector). Below are the sample datasets.

DATA tempdf ;
    input v1 v2 v3 ;
    datalines ;
1  2   3
2  4   6
3  6   9
4  8  12
5 10  15
    ;
RUN ;

DATA testcoef ;
    length Variable $3. ;
    input Variable $ coef ;
    datalines ;
v1 0.1
v2 0.2
v3 0.3
    ;
RUN ;

I want to calculate it by v1*0.1 + v2*0.2 + v3*0.3 by each row. And the final result will be look like :

1.4
2.8
4.2
5.6
7

as a column.

Which respectively calculated by

1*0.1 +  2*0.2 +  3*0.3 = 1.4
2*0.1 +  4*0.2 +  6*0.3 = 2.8
3*0.1 +  6*0.2 +  9*0.3 = 4.2
4*0.1 +  8*0.2 + 12*0.3 = 5.6
5*0.1 + 10*0.2 + 15*0.3 = 7

THANKS.

I have try to proc transpose the tempdf datasets and merge the coef columns in testcoef dataset then do array over all columns by multiplying with coef column, and eventually sum all the columns up.

But the process will be very slow if the dataset is large in rows, I wonder if there is any smarter or faster way to do it.

PROC TRANSPOSE data = tempdf out = temptrans name = Variable;
    var _all_ ;
RUN ;

PROC SQL ;
    create table trans_coef as
    select a.*, b.Coef
    from temptrans as a
    left join testcoef as b
        on a.Variable = b.Variable ;
QUIT ;

DATA out1 ;
    set trans_coef ;
    array colarr  COL: ;
    do over colarr ;
    colarr = colarr * coef ;
    end ;
RUN ;

PROC MEANS data = out1 sum;
    var col: ; 
    output out = out1_score(drop = _TYPE_ _FREQ_) sum = ;
RUN;

PROC TRANSPOSE data = out1_score out = final_out name = Cust;
    var COL: ;
RUN ;

final_out table will be like :


  | Cust |  COL1
-----------------
1 | COL1 |  1.4 
2 | COL2 |  2.8 
3 | COL3 |  4.2 
4 | COL4 |  5.6 
5 | COL5 |  7   

Solution

  • PROC SCORE.

    DATA tempdf ;
        input v1 v2 v3 ;
        datalines ;
    1  2   3
    2  4   6
    3  6   9
    4  8  12
    5 10  15
        ;
    RUN ;
    
    DATA testcoef ;
       retain _TYPE_ 'SCORE';
       length Variable $32;
       input Variable v1-v3;
       rename variable=_NAME_;
       datalines ;
    New 0.1 0.2 0.3
    ;
    RUN ;
    proc print;
       run;
    proc score data=tempdf score=testcoef out=t3;
       var V:;
       run;
    proc print;
       run;
    

    enter image description here