valueerror sklearn-pandas train-test-split

Why is sklearn giving me a ValueError after train_test_split?


ValueError: Expected 2D array, got 1D array instead: array=[712. 3.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

```
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler
import seaborn as sb
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

df=sb.load_dataset('titanic')

df2=df[['survived','pclass','age','parch']]

df3=df2.fillna(df2.mean())
x=df3.drop('survived',axis=1)
y=df3['survived'] 
x_train,y_train,x_test,y_test=train_test_split(x,y,test_size=0.2, random_state=51)
print('x_train',x_train.shape)
sc=StandardScaler()
sc.fit(x_train.shape)
x_train=x_train.reshape(-1,1)


x_train_sc=sc.transform(x_train)
x_test_sc=sc.transform(x_test)
print(x_train_sc)
```

I would really appreciate it if someone could find me a solution.

I applied train_test_split to the x and y variables and then tried to scale x_train. When I tried to print the scaled x_train, it showed me this error:

```
 raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[712.   3.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
```

Solution

  • You need to pass x_train itself to StandardScaler.fit, not the shape of x_train :) The shape is just the tuple (712, 3), which is why the error message shows array=[712. 3.].

    sc=StandardScaler()
    sc.fit(x_train)
    
    x_train_sc=sc.transform(x_train)
    x_test_sc=sc.transform(x_test)
    

    StandardScaler gives each feature zero mean and unit variance but does not bound the values. If you want your data scaled to the range [-1, 1], use MinMaxScaler instead:

    from sklearn.preprocessing import MinMaxScaler
    
    ...
    
    sc = MinMaxScaler(feature_range=(-1, 1)).fit(x_train)
    x_train_sc=sc.transform(x_train)
    x_test_sc=sc.transform(x_test)
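
A fuller sketch of the corrected pipeline, in case it helps. Two further things in the question's code look like they would fail once the fit call is fixed: train_test_split returns x_train, x_test, y_train, y_test in that order (the question unpacks them as x_train, y_train, x_test, y_test), and a DataFrame has no reshape method, so that line can simply go, since x_train is already 2D. The small random DataFrame below is a made-up stand-in for the Titanic columns so the snippet runs without downloading anything; with the real data you would keep df3 as built in the question.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the Titanic columns used in the question,
# so the sketch runs without downloading the dataset.
rng = np.random.default_rng(51)
df3 = pd.DataFrame({
    "survived": rng.integers(0, 2, size=100),
    "pclass": rng.integers(1, 4, size=100),
    "age": rng.uniform(1, 80, size=100),
    "parch": rng.integers(0, 4, size=100),
})

x = df3.drop("survived", axis=1)
y = df3["survived"]

# train_test_split returns both splits of x first, then both splits of y.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=51
)

sc = StandardScaler()
sc.fit(x_train)  # fit on the data itself, not on x_train.shape

# x_train is already 2D (rows, features), so no reshape is needed.
x_train_sc = sc.transform(x_train)
x_test_sc = sc.transform(x_test)

print("x_train", x_train.shape)
print("x_test", x_test.shape)
```

Fitting the scaler on x_train only and reusing it for x_test, as in the answer above, keeps test-set statistics out of the scaling.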