For my given multi-indexed DataFrame:
df = pd.DataFrame(
np.random.randn(12),
index=[
[1,1,2,3,4,4,5,5,6,6,7,8],
[1,2,1,1,1,2,1,2,1,2,2,2],
]
)
0
1 1 1.667692
2 0.274428
2 1 0.216911
3 1 -0.513463
4 1 -0.642277
2 -2.563876
5 1 2.301943
2 1.455494
6 1 -1.539390
2 -1.344079
7 2 0.300735
8 2 0.089269
I would like to slice it such that I keep only rows where second index level contains BOTH 1 and 2
0
1 1 1.667692
2 0.274428
4 1 -0.642277
2 -2.563876
5 1 2.301943
2 1.455494
6 1 -1.539390
2 -1.344079
How can I do this?
Another possible solution, which is based on the following:
df.groupby(level=0)
groups the dataframe by the first level of the index.
filter(lambda x: set(x.index.get_level_values(1)) == {1, 2})
checks if the second level of the index for each group contains both 1 and 2, and retains only the groups that meet this condition.
df.groupby(level=0).filter(lambda x: set(x.index.get_level_values(1)) == {1, 2})
Output:
0
1 1 -1.085631
2 0.997345
4 1 -0.578600
2 1.651437
5 1 -2.426679
2 -0.428913
6 1 1.265936
2 -0.866740