pythonpandasmulti-index

Select multi-index when one subindex obeys condition


If I build a dataframe like this

arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)

                0         1         2         3
bar one -0.424972  0.567020  0.276232 -1.087401
    two -0.673690  0.113648 -1.478427  0.524988
baz one  0.404705  0.577046  1.715002 -1.039268
    two -0.370647 -1.157892  1.344312  0.844885
foo one  1.075770 -0.109050  1.643563 -1.469388
    two  0.357021 -0.674600 -1.776904 -0.968914
qux one -1.294524  0.413738  0.276662 -0.472035
    two -0.013960 -0.362543 -0.006154 -0.923061

I want to select the entire multi-index when one part of the multi-index (a subindex) obeys a condition. In the example above, I want to select the full multi-index when the values of two in column 2 are less than 0, so I want

                0         1         2         3
bar one -0.424972  0.567020  0.276232 -1.087401
    two -0.673690  0.113648 -1.478427  0.524988
foo one  1.075770 -0.109050  1.643563 -1.469388
    two  0.357021 -0.674600 -1.776904 -0.968914
qux one -1.294524  0.413738  0.276662 -0.472035
    two -0.013960 -0.362543 -0.006154 -0.923061

Solution

  • You can easily do this like this

    import numpy as np
    import pandas as pd
    
    arrays = [
        np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
        np.array(["one", "two", "one", "two", "one", "two", "one", "two"])]
    df = pd.DataFrame(np.random.randn(8, 4), index=arrays, columns=[0, 1, 2, 3])
    
    print("Original DataFrame:")
    print(df)
    
    condition = df.loc[(slice(None), 'two'), 2] < 0
    main_index = condition[condition].index.get_level_values(0)
    result = df.loc[main_index]
    
    print("\nFiltered DataFrame:")
    print(result)
    

    which gives

    Original DataFrame:
                    0         1         2         3
    bar one  0.009890  1.877523  0.493127 -0.624485
        two  1.184245  0.123974  0.232514  0.715477
    baz one  1.302266 -0.966551  1.806716  1.651682
        two -0.597303 -1.597426  1.528042 -1.297541
    foo one -0.925038  0.574236  0.385195  1.318643
        two -0.256540  0.391510  0.906078  0.082847
    qux one  0.464847  0.930058 -1.026141 -0.963413
        two -0.314374  0.262289 -0.182630  1.228360
    
    Filtered DataFrame:
                    0         1         2         3
    qux one  0.464847  0.930058 -1.026141 -0.963413
        two -0.314374  0.262289 -0.182630  1.228360