The following is the C code:
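#include <cblas.h>

// A (weight_data) is M x K, B (input_data) is K x N, C (output_data) is M x N.
const int M = 4;
const int N = 1;
const int K = 2;
// Column-major leading dimensions: the number of rows of each matrix.
const int LDA = M;
const int LDB = K;
const int LDC = M;
float input_data[2] = {1, 1};
// Column-major: the first four values are column 0, the last four are column 1.
float weight_data[8] = {1.1, 2.01, 3.001, 4.0001, 5.1, 6.01, 7.001, 8.0001};
float output_data[4];
// output_data = 1 * weight_data * input_data + 0 * output_data
cblas_sgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, N, K, 1, weight_data, LDA, input_data, LDB, 0, output_data, LDC);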
The expected result is {6.2, 8.02, 10.002, 12.0002}. Instead, I got {4.101, 6.0101, 12.101, 14.0101}.
The code is very simple. I have checked the documentation many times, but I can't see what I did wrong.
Could you please help point out the problem? Thanks in advance!
Update:
I tried 2*2 and 3*2 weight_data, and both results are correct. However, 4*2 weight_data produces the wrong result.
It turned out to be a bug in OpenBLAS. I had never thought OpenBLAS might have a bug. It cost me two days.
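
For anyone hitting something similar: a naive reference loop is a quick way to tell a wrong call from a wrong library. Below is a minimal sketch of that check, assuming column-major storage with no transposes, exactly as in the call above. The names ref_sgemm_colmajor and run_question_case are hypothetical helpers written for this post, not part of CBLAS or OpenBLAS.

```c
#include <stdio.h>

/* Naive reference for C = alpha * A * B + beta * C.
 * All matrices are column-major with no transpose, so element (i, j)
 * of a matrix X with leading dimension ldx is stored at X[i + j * ldx].
 * Hypothetical helper, not a CBLAS function. */
void ref_sgemm_colmajor(int m, int n, int k, float alpha,
                        const float *a, int lda,
                        const float *b, int ldb,
                        float beta, float *c, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i + p * lda] * b[p + j * ldb];
            /* When beta is 0, skip reading C so that uninitialized
             * output memory cannot leak into the result. */
            if (beta == 0.0f)
                c[i + j * ldc] = alpha * acc;
            else
                c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}

/* Runs the 4x2 times 2x1 case from the question and prints the result. */
void run_question_case(float out[4])
{
    const float input[2] = {1.0f, 1.0f};
    const float weight[8] = {1.1f, 2.01f, 3.001f, 4.0001f,
                             5.1f, 6.01f, 7.001f, 8.0001f};

    ref_sgemm_colmajor(4, 1, 2, 1.0f, weight, 4, input, 2, 0.0f, out, 4);

    for (int i = 0; i < 4; ++i)
        printf("%g\n", out[i]);
}
```

Calling run_question_case from main and comparing its output with what cblas_sgemm writes into output_data for the same arguments shows which side is off. If the reference loop matches the hand-computed values and the library does not, the call is fine and the library build is the suspect.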