Tags: python, scikit-learn, nmf

Using scikit-learn NMF with a precomputed set of basis vectors (Python)


I want to use scikit-learn's NMF (from here), or really any other NMF implementation if it does the job.

Specifically, I have an input matrix (which is an audio magnitude spectrogram), and I want to decompose it.

I already have the W matrix pre-computed. How do I use a fixed W in sklearn.decomposition.NMF? I haven't found any other question asking this.

I also see that this method mentions something similar in its fit parameter: "If False, components are assumed to be pre-computed and stored in transformer, and are not changed.". However, I am not sure how to construct that transformer object.


Solution

  • This part of the code explains the internal processing a bit.

    It sounds like you want to fix W. According to the code, you can only fix H while optimizing W. That's not a problem: you can simply switch those matrices, i.e. invert their roles.

    Doing this, the code says: use init='custom' and set update_H=False.
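
    To see why this role switch is legitimate: X ≈ W·H is the same statement as X.T ≈ H.T·W.T, so fixing W in the original problem is exactly fixing H in the transposed problem. A quick numpy check of that identity:

    import numpy as np

    rng = np.random.RandomState(0)
    W = np.abs(rng.randn(6, 2))  # some non-negative factors
    H = np.abs(rng.randn(2, 4))
    X = W.dot(H)

    # X = W H  is equivalent to  X.T = H.T W.T
    print(np.allclose(X.T, H.T.dot(W.T)))  # True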

    So in general, I would expect usage to look like this (based on the example here):

    Untested!

    import numpy as np
    X = np.array([[1,1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])
    
    fixed_W = np.array([[1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1]])  # size=3 just an example
                                                                           # might break
    fixed_H = fixed_W.T  # interpret W as H (transpose)
    
    from sklearn.decomposition import NMF
    model = NMF(n_components=2, init='custom', H=fixed_H, update_H=False, random_state=0)  # fails: NMF() accepts no H/update_H arguments
    model.fit(X)
    

    You will probably want to switch your variables back again after solving.

    Edit: As mentioned in the comments, the untested code above won't work (NMF's constructor does not accept H or update_H). We need to use the lower-level function non_negative_factorization to do this.

    Here is a quick hack (where I don't care much about the right preprocessing, transposing and co.) which should enable you to tackle your task:

    import numpy as np
    from sklearn.decomposition import non_negative_factorization

    X = np.array([[1,1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])

    fixed_W = np.array([[0.4,0.4],[0.2,0.1]])  # size=2 just an example
    fixed_H = fixed_W.T  # interpret W as H (transpose)

    # reference run: both W and H are optimized
    W, H, n_iter = non_negative_factorization(X, n_components=2, init='random', random_state=0)
    print(W)
    print(H)
    print('error: ')
    print(W.dot(H) - X)  # just a demo; this is not the loss that is minimized!

    # constrained run: H is fixed to fixed_H, only W is optimized
    W, H, n_iter = non_negative_factorization(X, n_components=2, init='custom', random_state=0, update_H=False, H=fixed_H)
    print(W)
    print(H)
    print('error: ')
    print(W.dot(H) - X)
    

    Output:

    [[ 0.          0.46880684]
     [ 0.55699523  0.3894146 ]
     [ 1.00331638  0.41925352]
     [ 1.6733999   0.22926926]
     [ 2.34349311  0.03927954]
     [ 2.78981512  0.06911798]]
    [[ 2.09783018  0.30560234]
     [ 2.13443044  2.13171694]]
    error: 
    [[  6.35579822e-04  -6.36528773e-04]
     [ -3.40231372e-04   3.40739354e-04]
     [ -3.45147253e-04   3.45662574e-04]
     [ -1.31898319e-04   1.32095249e-04]
     [  9.00218123e-05  -9.01562192e-05]
     [  8.58722020e-05  -8.60004133e-05]]
    [[  3.           0.        ]
     [  5.           0.        ]
     [  4.51221142   2.98707026]
     [  0.04070474   9.95690087]
     [  0.          12.23529412]
     [  0.          14.70588235]]
    [[ 0.4  0.2]
     [ 0.4  0.1]]
    error: 
    [[  2.00000000e-01  -4.00000000e-01]
     [ -2.22044605e-16  -1.11022302e-16]
     [ -2.87327549e-04   1.14931020e-03]
     [ -9.57758497e-04   3.83103399e-03]
     [ -1.05882353e-01   4.23529412e-01]
     [ -1.17647059e-01   4.70588235e-01]]
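
    Note that in the second run the returned H is exactly fixed_H ([[0.4, 0.2], [0.4, 0.1]]), confirming it was never updated. Putting the pieces together for your spectrogram case, including the transposing skipped above, usage could look like the following sketch (untested in the same spirit as above; V, W_fixed and all shapes are just placeholders for your data):

    import numpy as np
    from sklearn.decomposition import non_negative_factorization

    rng = np.random.RandomState(0)
    V = np.abs(rng.randn(5, 8))        # placeholder magnitude spectrogram (n_freq x n_frames)
    W_fixed = np.abs(rng.randn(5, 3))  # placeholder pre-computed basis (n_freq x n_components)

    # V ~ W_fixed H  is equivalent to  V.T ~ H.T W_fixed.T, so on the
    # transposed problem we fix H to W_fixed.T and solve only for W (= H.T):
    Ht, _, n_iter = non_negative_factorization(
        V.T, n_components=3, init='custom', update_H=False,
        H=np.ascontiguousarray(W_fixed.T), random_state=0)

    H = Ht.T  # activations (n_components x n_frames)
    print(np.abs(V - W_fixed.dot(H)).max())  # reconstruction error with W fixed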