When running the snippet of example code below with pandas 2.2.3, I get an error saying KeyError: 'D'
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 2)],
    names=['letter', 'number']
)
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]}, index=index)
idx = pd.IndexSlice
# 'D' is not present in the 'letter' level, which triggers the KeyError
result = df.loc[idx[['A', 'D'], [1, 2]], :]
Does pandas offer any alternatives for searching a multi-index with values that don't exist?
If I run the same code using pandas 1.5.3, I get the expected value:
               value
letter number
A      1          10
       2          20
When you run this code with pandas 1.5.3, you should in fact receive a FutureWarning:
FutureWarning: The behavior of indexing on a MultiIndex with a nested sequence of labels is deprecated and will change in a future version. series.loc[label, sequence] will raise if any members of 'sequence' or not present in the index's second level. To retain the old behavior, use series.index.isin(sequence, level=1)
(Note that it should read: "are not present".)
So, let's indeed use Index.isin to allow boolean indexing:
m = (df.index.isin(['A', 'D'], level='letter')
& df.index.isin([1, 2], level='number'))
out = df.loc[m, :]
Output:
               value
letter number
A      1          10
       2          20
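To see why this avoids the KeyError, it can help to look at the individual masks. The following is a minimal sketch that rebuilds the example frame from the question; the variable names letter_mask, number_mask and combined are only illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 2)],
    names=['letter', 'number'],
)
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]}, index=index)

# Each isin call returns one boolean per row. A label that is absent
# from the level ('D' here) simply matches nothing instead of raising.
letter_mask = df.index.isin(['A', 'D'], level='letter')
number_mask = df.index.isin([1, 2], level='number')
combined = letter_mask & number_mask

print(letter_mask.tolist())  # [True, True, True, False, False, False]
print(number_mask.tolist())  # [True, True, False, True, True, True]
print(combined.tolist())     # [True, True, False, False, False, False]
```

Because the lookup is a membership test rather than a label lookup, missing labels are silently ignored, which matches the pre-2.0 behavior of the original .loc call.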
If you have many different conditions, you could consider creating a dictionary and using np.logical_and.reduce:
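import numpy as np

# Map each index level name to the labels wanted on that level
dict_isin = {
    'letter': ['A', 'D'],
    'number': [1, 2]
}
# AND together one boolean mask per level
m = np.logical_and.reduce(
    [df.index.isin(v, level=k) for k, v in dict_isin.items()]
)
out2 = df.loc[m, :]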
out2.equals(out)
# True
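If this pattern comes up often, the dictionary approach could be wrapped in a small function. The sketch below is one possible way to do it; filter_levels is a hypothetical helper name, not part of pandas, and it assumes the keys of the dictionary are valid index level names.

```python
import numpy as np
import pandas as pd


def filter_levels(df, conditions):
    """Hypothetical helper: keep rows whose index levels match `conditions`.

    `conditions` maps a level name to the labels wanted on that level.
    Labels that do not exist in the index are ignored rather than raising.
    """
    # Start from an all-True mask so an empty dict returns every row.
    mask = np.ones(len(df), dtype=bool)
    for level, labels in conditions.items():
        mask &= df.index.isin(labels, level=level)
    return df.loc[mask, :]


index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 2)],
    names=['letter', 'number'],
)
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]}, index=index)

subset = filter_levels(df, {'letter': ['A', 'D'], 'number': [1, 2]})
nothing = filter_levels(df, {'letter': ['D']})

print(subset['value'].tolist())  # [10, 20]
print(len(nothing))              # 0
```

Starting from an all-True mask, rather than reducing a list, keeps the function well-defined when no conditions are passed at all.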