My code fails at the astype call. I suspect the data needs some cleaning, but I am not sure what to do in this situation. I tried a few functions such as value_counts, but they did not help. What is the problem here?
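import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

file_name='https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DA0101EN-SkillsNetwork/labs/FinalModule_Coursera/data/kc_house_data_NaN.csv'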
df = pd.read_csv(file_name)
features = ["floors", "waterfront", "lat", "bedrooms", "sqft_basement", "view", "bathrooms", "sqft_living15", "sqft_above", "grade", "sqft_living"]
Input = [('scale', StandardScaler()), ('polynomial', PolynomialFeatures(include_bias=False)), ('model', LinearRegression())]
y = df['price']
pipe = Pipeline(Input)
print(pipe)
features=features.astype(float)
pipe.fit(features,y)
ypipe=pipe.predict(features)
ypipe[0:10]
You need to use the features list to select columns from the dataframe. features is a plain Python list of strings, and a list has no attribute called astype; that method belongs to pandas objects such as a DataFrame or Series. So the code becomes:
df[features]=df[features].astype(float)
pipe.fit(df[features],y)
ypipe=pipe.predict(df[features])
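
To see the difference end to end without downloading the CSV, here is a minimal sketch on a small made-up DataFrame. The data, the three-column feature list, and the names X and list_has_astype are assumptions for illustration only; the pipeline steps are the same as in the question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Hypothetical stand-in data; this is not the real house-price dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "floors": rng.integers(1, 4, size=50),
    "bedrooms": rng.integers(1, 6, size=50),
    "sqft_living": rng.integers(500, 4000, size=50),
})
df["price"] = (100 * df["sqft_living"] + 5000 * df["bedrooms"]
               + rng.normal(0, 1000, size=50))

features = ["floors", "bedrooms", "sqft_living"]

# A plain list has no astype method, which is why features.astype(float) fails.
list_has_astype = hasattr(features, "astype")

# Select the columns first; astype is then called on a DataFrame.
X = df[features].astype(float)
y = df["price"]

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("polynomial", PolynomialFeatures(include_bias=False)),
    ("model", LinearRegression()),
])
pipe.fit(X, y)
ypipe = pipe.predict(X)
print(ypipe[:10])
```

If the selected columns contain missing values, which the file name kc_house_data_NaN.csv suggests may be the case, the fit step is likely to raise an error about NaN input, so you may need to fill or drop those rows first (for example with fillna or dropna on df[features]).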