python pandas

Behavior of df.map() inside another df.apply()


I came across this code and modified it slightly for the question. Essentially, it uses one DataFrame (t1) to decide the styling of another DataFrame (t2) through DataFrame.style.

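import pandas as pd  # DataFrame.map requires pandas >= 2.1 (formerly applymap)

t1 = pd.DataFrame({'x': [300, 200, 700], 'y': [100, 300, 200]})
t2 = pd.DataFrame({'x': ['A', 'B', 'C'], 'y': ['C', 'B', 'D']})

def highlight_cell(val, props=''):
    # Return the CSS string for values above 200, otherwise no styling
    return props if val > 200 else ''

t2.style.apply(lambda x: t1.map(highlight_cell, props='background-color:yellow'), axis=None)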


But can anyone explain how the last line works? I couldn't find Pandas documentation that clarifies the behavior of df.map() inside another df.apply().

To me, the code reads as: for each item in t1, apply highlight_cell() to the entire t2 at once, as in this pseudocode:

for x in all_items_in_t1:
    yield [highlight_cell(y) for y in all_items_in_t2]

However, the output suggests that for each item in t1, highlight_cell() is applied only to the item in t2 at the same (row, column) location, like this:

for x, y in zip(all_items_in_t1, all_items_in_t2):
    yield highlight_cell(y)

I'm still having trouble understanding this pattern because it seems a bit confusing. Can anyone explain it more clearly?


Solution

  • DataFrame.style.apply is used here, not DataFrame.apply.

    With axis=None, the callable is applied once to the whole DataFrame (not once per row, column, or cell). Since the lambda never uses its argument x, this essentially means we run:

    t1.map(highlight_cell, props='background-color:yellow')
    

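    and use the resulting DataFrame of CSS strings as the styles for t2. Each string is applied to the cell of t2 with the same index and column labels, which is why the result looks like a cell-by-cell match: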

                             x                        y
    0  background-color:yellow                         
    1                           background-color:yellow
    2  background-color:yellow                         
    

    Note that DataFrame.map is not needed here and is inefficient, since it calls highlight_cell once per cell. A vectorized approach is better (assuming numpy is imported as np):

    t2.style.apply(lambda x: np.where(t1>200, 'background-color:yellow', ''), axis=None)
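
To check the explanation above without rendering anything, here is a minimal sketch that calls the styling function directly. The names style_func and elementwise are only illustrative and not part of the original code. The fallback to applymap is there because DataFrame.map exists only from pandas 2.1 onward. The sketch checks three things: the function ignores its argument, the non-empty styles sit exactly where t1 > 200, and np.where produces the same strings.

```python
import numpy as np
import pandas as pd

t1 = pd.DataFrame({'x': [300, 200, 700], 'y': [100, 300, 200]})
t2 = pd.DataFrame({'x': ['A', 'B', 'C'], 'y': ['C', 'B', 'D']})

def highlight_cell(val, props=''):
    return props if val > 200 else ''

# DataFrame.map exists from pandas 2.1; older versions name it applymap.
elementwise = t1.map if hasattr(t1, 'map') else t1.applymap

def style_func(x):
    # Same body as the lambda in the question: x is never used.
    return elementwise(highlight_cell, props='background-color:yellow')

# 1) The result does not depend on the argument at all.
styles_for_t2 = style_func(t2)
styles_for_none = style_func(None)
print(styles_for_t2.equals(styles_for_none))

# 2) The styled positions are exactly the positions where t1 > 200.
highlighted = styles_for_t2.ne('')
print((highlighted == (t1 > 200)).all().all())

# 3) The vectorized version yields the same strings, as a plain ndarray.
vectorized = np.where(t1 > 200, 'background-color:yellow', '')
print(vectorized.tolist() == styles_for_t2.to_numpy().tolist())
```

Passing style_func to t2.style.apply(style_func, axis=None) should then behave like the lambda in the question, since the Styler only needs a result whose shape (for an ndarray) or labels (for a DataFrame) match t2.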