I am working with the Iris dataset from sklearn. Here's my code so far:
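import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()
# Build one DataFrame from the feature matrix and the integer target codes
data = pd.DataFrame(iris['data'])
target = pd.DataFrame(iris['target'])
frames = [data, target]
iris = pd.concat(frames, axis=1)
iris.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']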
def convert_target(data):
    if data == 0:
        return 'setosa'
    elif data == 1:
        return 'versicolor'
    else:
        return 'virginica'
iris['species'] = iris['species'].apply(convert_target)
Notice how I use the convert_target function to convert the species from a numeric value to a categorical value. My question is: is there a better, more efficient way to do this?
You can use map with a dictionary:
d = {0: 'setosa', 1: 'versicolor', 2: 'virginica'}
iris['species'] = iris['species'].map(d)
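One difference worth keeping in mind: convert_target sends every value other than 0 or 1 to 'virginica', whereas map leaves any value missing from the dictionary as NaN. A minimal sketch on a small hand-made Series (the codes below are made up for illustration and are not taken from the Iris data):

```python
import pandas as pd

# Hypothetical codes; 3 is deliberately not a valid species code.
codes = pd.Series([0, 1, 2, 3])
d = {0: 'setosa', 1: 'versicolor', 2: 'virginica'}

mapped = codes.map(d)

# Codes missing from the dictionary become NaN instead of raising an error,
# so counting the NaN values is a cheap sanity check after mapping.
n_unmapped = mapped.isna().sum()
```

For the Iris targets, which only contain 0, 1 and 2, both approaches should give the same result.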
You can also use NumPy indexing on the original integer codes, as an alternative to map:
import numpy as np

cat_names = np.array(['setosa', 'versicolor', 'virginica'])
iris['species'] = cat_names[iris['species']]
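As a rough sketch of how the indexing approach behaves, here it is on a small hand-made Series of codes (made up for illustration). The second part uses pd.Categorical.from_codes, which is an addition and not part of the answer above; it stores the labels as a pandas categorical column instead of plain strings:

```python
import numpy as np
import pandas as pd

cat_names = np.array(['setosa', 'versicolor', 'virginica'])
codes = pd.Series([0, 2, 1, 0])  # hypothetical integer species codes

# Fancy indexing: each code selects the name stored at that position.
labels = cat_names[codes.to_numpy()]

# Alternative (an addition, not from the answer above): build a categorical
# column directly from the integer codes and the list of category names.
species = pd.Series(
    pd.Categorical.from_codes(codes.to_numpy(), categories=cat_names)
)
```

With the real data, load_iris() also returns these names as target_names, so they need not be typed by hand, provided they are read before iris is reassigned to the DataFrame.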