Based on the documentation at the following link (pipeline and imbalanced),
I tried to implement the code on a dataset. Here is the code:
import numpy as np
import pandas as pd
from collections import Counter
from sklearn.preprocessing import LabelEncoder,OneHotEncoder
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.pipeline import Pipeline
from sklearn.naive_bayes import GaussianNB
data =pd.read_csv('aug_train.csv')
data.drop('id',axis=1,inplace=True)
print(data.info())
print(data.select_dtypes(include='object').columns.tolist())
obj_cols = data.select_dtypes(include='object').columns.tolist()
data[obj_cols] = data[obj_cols].apply(LabelEncoder().fit_transform)
print(data.head())
#print(data['Response'].value_counts())
mymodel =GaussianNB()
y =data['Response'].values
print(Counter(y))
X =data.drop('Response',axis=1).values
#X,y =SMOTE().fit_resample(X,y)
#mymodel.fit(X,y)
#print(mymodel.score(X,y))
#print(Counter(y))
over = SMOTE(sampling_strategy=0.1)
under = RandomUnderSampler(sampling_strategy=0.5)
steps = [('o', over), ('u', under)]
pipeline = Pipeline(steps=steps)
# transform the dataset
X, y = pipeline.fit_sample(X, y)
The main problem in this code is with this line:
X, y = pipeline.fit_sample(X, y)
The error says: AttributeError: 'Pipeline' object has no attribute 'fit_resample'. How can I fix this issue? Thanks in advance.
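The tutorial uses imblearn.pipeline.Pipeline, while your code uses sklearn.pipeline.Pipeline (check the import statements). These are two different Pipeline classes: the scikit-learn one does not provide fit_resample and does not accept samplers such as SMOTE or RandomUnderSampler as steps, whereas the imbalanced-learn one is designed to chain them.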