Shortest Syntax To Use numpy 1d-array As sklearn X


I often have two numpy 1d arrays, x and y, and would like to perform some quick sklearn fitting + prediction using them.

 import numpy as np
 from sklearn import linear_model

 # Example data only, to show the 1d shape; in practice x and y come from elsewhere.
 x = np.array([1, 3, 2, ...]) 
 y = np.array([12, 32, 4, ...])

Now I'd like to do something like

 linear_model.LinearRegression().fit(x, y)...

The problem is that fit expects X to be a 2d array of shape (n_samples, n_features), so a single feature has to be passed as a column. For this reason, I usually feed it

 x.reshape((len(x), 1))

which I find cumbersome and hard to read.

Is there some shorter way to transform a 1d array to a 2d column array (or, alternatively, get sklearn to accept 1d arrays)?


Solution

  • You can index the array with None (an alias for np.newaxis) to add a new axis:

    x[:, None]
    

    This:

    >>> x = np.arange(5)
    >>> x[:, None]
    array([[0],
           [1],
           [2],
           [3],
           [4]])
    

    is equivalent to:

    >>> x.reshape(len(x), 1)
    array([[0],
           [1],
           [2],
           [3],
           [4]])
    

    If you find it more readable, you can transpose a matrix instead (note that NumPy no longer recommends np.matrix for new code):

    np.matrix(x).T
    

    If you want a plain array back rather than a matrix, use the .A attribute:

    np.matrix(x).T.A
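
To tie the pieces together, here is a minimal end-to-end sketch. The data values are made up for illustration (they are not from the question), and `x.reshape(-1, 1)` is an additional spelling not mentioned above, included only for comparison. It checks that the column-forming expressions agree and then fits and predicts with the result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data (an assumption for this sketch): y = 2*x + 1 exactly.
x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
y = 2.0 * x + 1.0

# Equivalent ways to turn a 1d array into a column of shape (n_samples, 1).
X_none = x[:, None]
X_newaxis = x[:, np.newaxis]   # np.newaxis is an alias for None
X_reshape = x.reshape(-1, 1)   # -1 lets NumPy infer the number of rows

assert X_none.shape == (5, 1)
assert np.array_equal(X_none, X_newaxis)
assert np.array_equal(X_none, X_reshape)

# Fit on the column array, then predict for a new 1d input the same way.
model = LinearRegression().fit(X_none, y)
pred = model.predict(np.array([6.0])[:, None])
```

The same indexing has to be applied to any new inputs passed to `predict`, since it expects the same 2d layout as `fit`.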