I want to create (for example) a small multi-index dataframe with 3 level index and one column of values. There will only be a handful of rows.
Value | |||
---|---|---|---|
Category | Sub Category | Sub Sub Category | |
Machine | Car | Sedan | 10 |
Machine | Bike | 2 wheel | 5 |
Machine | Bike | 3 wheel | 4 |
Animal | Donkey | Big | 2 |
A requirement is that the data has to be easy to enter, a row at a time, from left to right (not top to bottom), so my first step is to make a list of lists.
Then I can use DataFrame method and then the set-index method like this:
data=[["Machine","Car","Sedan",10], ["Machine","Bike","2 Wheel",5], ["Machine","Bike","3 Wheel",4], ["Animal","Donkey","Big",2]]
column_names=["Category","Sub Category","Sub Sub Category","Value"]
df=pd.DataFrame(data,columns=column_names)
df.set_index(["Category","Sub Category","Sub Sub Category"])
But is there a way to make the multi-index DataFrame directly from a list, or something like it?
You can create a MultiIndex from a list of tuples (or list of lists) with pandas.MultiIndex.from_tuples
:
pd.MultiIndex.from_tuples([lst[:3] for lst in data],
names=['Category', 'Sub Category', 'Sub Sub Category']
)
Output:
MultiIndex([('Machine', 'Car', 'Sedan'),
('Machine', 'Bike', '2 Wheel'),
('Machine', 'Bike', '3 Wheel'),
( 'Animal', 'Donkey', 'Big')],
names=['Category', 'Sub Category', 'Sub Sub Category'])
To create a DataFrame, you could split the sublists into index/values and pass them to the DataFrame
constructor:
*index, values = map(list, zip(*data))
df = (pd.DataFrame({'Value': values}, index=index)
.rename_axis(['Category', 'Sub Category', 'Sub Sub Category'])
)
Output:
Value
Category Sub Category Sub Sub Category
Machine Car Sedan 10
Bike 2 Wheel 5
3 Wheel 4
Animal Donkey Big 2
Your approach is also perfectly valid, you could just reuse the variable with the list of columns to use as index and chain the operations:
df = (pd.DataFrame(data, columns=column_names)
.set_index(column_names[:3])
)