If I build a dataframe like this
arrays = [
np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
np.array(["one", "two", "one", "two", "one", "two", "one", "two"])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
0 1 2 3
bar one -0.424972 0.567020 0.276232 -1.087401
two -0.673690 0.113648 -1.478427 0.524988
baz one 0.404705 0.577046 1.715002 -1.039268
two -0.370647 -1.157892 1.344312 0.844885
foo one 1.075770 -0.109050 1.643563 -1.469388
two 0.357021 -0.674600 -1.776904 -0.968914
qux one -1.294524 0.413738 0.276662 -0.472035
two -0.013960 -0.362543 -0.006154 -0.923061
I want to select the entire multi-index when one part of the multi-index (a subindex) obeys a condition. In the example above, I want to select the full multi-index when the values of two
in column 2
are less than 0, so I want
0 1 2 3
bar one -0.424972 0.567020 0.276232 -1.087401
two -0.673690 0.113648 -1.478427 0.524988
foo one 1.075770 -0.109050 1.643563 -1.469388
two 0.357021 -0.674600 -1.776904 -0.968914
qux one -1.294524 0.413738 0.276662 -0.472035
two -0.013960 -0.362543 -0.006154 -0.923061
You can easily do this like this
import numpy as np
import pandas as pd
arrays = [
np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
np.array(["one", "two", "one", "two", "one", "two", "one", "two"])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays, columns=[0, 1, 2, 3])
print("Original DataFrame:")
print(df)
condition = df.loc[(slice(None), 'two'), 2] < 0
main_index = condition[condition].index.get_level_values(0)
result = df.loc[main_index]
print("\nFiltered DataFrame:")
print(result)
which gives
Original DataFrame:
0 1 2 3
bar one 0.009890 1.877523 0.493127 -0.624485
two 1.184245 0.123974 0.232514 0.715477
baz one 1.302266 -0.966551 1.806716 1.651682
two -0.597303 -1.597426 1.528042 -1.297541
foo one -0.925038 0.574236 0.385195 1.318643
two -0.256540 0.391510 0.906078 0.082847
qux one 0.464847 0.930058 -1.026141 -0.963413
two -0.314374 0.262289 -0.182630 1.228360
Filtered DataFrame:
0 1 2 3
qux one 0.464847 0.930058 -1.026141 -0.963413
two -0.314374 0.262289 -0.182630 1.228360