
Is there a way in ojAlgo to normalize a matrix?


I would like to know whether there is a way to normalize a MatrixStore using the ojAlgo matrix library.

Ideally this would be a routine that, when applied to a MatrixStore, leaves each row with a mean of 0 and a standard deviation of 1.

For anyone familiar with sklearn, what I'm looking for is an ojAlgo function that works similarly to sklearn's preprocessing module.


Solution

  • Not directly. You have to write some loops and calculations yourself. Here's one possible way to do it:

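    PrimitiveDenseStore matrix = ...;
    
    // One reusable statistics holder; swap(...) points it at a new set of values
    SampleSet sampleSet = SampleSet.make();
    for (int j = 0; j < matrix.countColumns(); j++) {
        // Take the statistics over column j
        sampleSet.swap(matrix.sliceColumn(j));
        for (int i = 0; i < matrix.countRows(); i++) {
            // Standard score: (value - mean) / standard deviation
            matrix.set(i, j, sampleSet.getStandardScore(i));
        }
    }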
    

    Note that this standardises each column rather than each row. With ojAlgo I strongly recommend organising data in columns.

    I haven't actually tested that code. Updating the matrix in place like this could possibly cause a problem.
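
    To make the calculation itself concrete, here is a minimal sketch in plain Java on a double[][], with no ojAlgo involved. The class name ColumnStandardizer and the method standardizeColumns are hypothetical, made up for this illustration. It computes each column's mean and standard deviation before overwriting anything, which sidesteps the in-place concern. It assumes the sample standard deviation (dividing by n - 1); whether that matches what a given library does is something to check, and sklearn's StandardScaler divides by n instead.

    ```java
    import java.util.Arrays;

    // Hypothetical helper for illustration only; not part of ojAlgo.
    final class ColumnStandardizer {

        /** Replaces every entry with its standard score relative to its own column. */
        static void standardizeColumns(double[][] data) {
            int rows = data.length;
            int cols = rows == 0 ? 0 : data[0].length;
            for (int j = 0; j < cols; j++) {
                // First pass: column mean.
                double sum = 0.0;
                for (int i = 0; i < rows; i++) {
                    sum += data[i][j];
                }
                double mean = sum / rows;

                // Second pass: sample standard deviation (divides by n - 1, an assumption).
                double squares = 0.0;
                for (int i = 0; i < rows; i++) {
                    double diff = data[i][j] - mean;
                    squares += diff * diff;
                }
                double std = Math.sqrt(squares / (rows - 1));

                // Only now overwrite the column, so the statistics are not disturbed.
                // A constant column or a single row gives NaN; handle that as needed.
                for (int i = 0; i < rows; i++) {
                    data[i][j] = (data[i][j] - mean) / std;
                }
            }
        }

        public static void main(String[] args) {
            double[][] data = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
            standardizeColumns(data);
            // Each column becomes -1.0, 0.0, 1.0
            System.out.println(Arrays.deepToString(data));
        }
    }
    ```

    If the data really is organised by rows, the same idea applies with the two loop indices exchanged, or by transposing first.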

    ...

    With v47.1.1 (just released) it is now possible to do it this way:

    matrix.modifyAny(DataPreprocessors.STANDARD_SCORE);