How to convert the numeric 'species' codes into categories in the Iris dataset


I am working with the Iris dataset from sklearn. Here's my code so far:

import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()

data = pd.DataFrame(iris['data'])
target = pd.DataFrame(iris['target'])

frames = [data,target]
iris = pd.concat(frames,axis=1)

iris.columns = ['sepal_length','sepal_width','petal_length','petal_width','species']

def convert_target(data):
    if data == 0:
        return 'setosa'
    elif data == 1:
        return 'versicolor'
    else:
        return 'virginica'
iris['species'] = iris['species'].apply(convert_target)

As you can see, I use the convert_target function to convert each species from its numeric code to its name. Is there a better, more efficient way to do this?


Solution

  • You can use map with a dictionary that pairs each code with its name:

    d = {0: 'setosa', 1: 'versicolor', 2: 'virginica'}
    iris['species'] = iris['species'].map(d)
    

    You can also use numpy indexing, since the codes 0, 1 and 2 are valid positions in an array of names:

    import numpy as np

    cat_names = np.array(['setosa', 'versicolor', 'virginica'])
    iris['species'] = cat_names[iris['species']]
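
For comparison, here is a minimal self-contained sketch of both approaches, plus a third option, pd.Categorical.from_codes, which is not part of the answer above but may suit you if you want an actual pandas category dtype. The small codes Series is a hypothetical stand-in for the 'species' column, and names is written out by hand; with the sklearn loader you could likely pass the loaded dataset's target_names instead.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the numeric 'species' column (codes 0, 1, 2).
codes = pd.Series([0, 1, 2, 1, 0], name="species")
names = ["setosa", "versicolor", "virginica"]

# 1. Dictionary lookup with Series.map; codes missing from the dict become NaN.
via_map = codes.map(dict(enumerate(names)))

# 2. NumPy integer indexing: each code selects the name at that position.
via_numpy = pd.Series(np.array(names)[codes.to_numpy()], name="species")

# 3. Categorical.from_codes keeps the integer codes and attaches the labels,
#    giving a column with the 'category' dtype.
via_categorical = pd.Series(
    pd.Categorical.from_codes(codes.to_numpy(), categories=names),
    name="species",
)

print(via_map.tolist())
print(via_categorical.dtype)
```

All three produce the same labels. The difference is mainly in the result type: map and numpy indexing give plain strings, while from_codes gives a categorical column that still stores the original integer codes internally.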