python pandas dataframe

pandas: How to find the group with the maximum value and delete that group


I have a DataFrame like this:

import numpy as np
import pandas as pd
dataA = [["2005-1-20", "9:35", 5], ["2005-1-20", "9:40", 8], ["2005-1-20", "9:45", 7],
   ["2005-1-20","9:50", 4], ["2005-1-20", "10:00", 2],
   ["2005-1-21", "9:35", 2], ["2005-1-21", "9:40", 3], ["2005-1-21", "9:45", 4],
   ["2005-1-21","9:50", 4], ["2005-1-21", "10:00", 775],
   ["2005-1-22", "9:35", 12], ["2005-1-22", "9:40", 13], ["2005-1-22", "9:45", 14],
   ["2005-1-22","9:50", 14], ["2005-1-22", "10:00", 15]]
df = pd.DataFrame(data = dataA, columns=["date", "min", "val"])
print(df)
         date    min  val
0   2005-1-20   9:35    5
1   2005-1-20   9:40    8
2   2005-1-20   9:45    7
3   2005-1-20   9:50    4
4   2005-1-20  10:00    2
5   2005-1-21   9:35    2
6   2005-1-21   9:40    3
7   2005-1-21   9:45    4
8   2005-1-21   9:50    4
9   2005-1-21  10:00  775
10  2005-1-22   9:35   12
11  2005-1-22   9:40   13
12  2005-1-22   9:45   14
13  2005-1-22   9:50   14
14  2005-1-22  10:00   15

I want to group the rows by date, find the date whose val at min = 10:00 is the maximum, and delete every row belonging to that date. How can I do it?


Solution

  • This is my solution. It removes only one date, even if two dates tie for the highest value.

    # Keep only the rows recorded at 10:00. The column is named "min", so use
    # bracket access: df.min is the DataFrame.min method, not the column.
    only_data_at_ten = df[df['min'] == '10:00']
    
    # Find the date with the highest value: sort ascending and take the last row
    date_to_remove = only_data_at_ten.sort_values('val').date.iloc[-1]
    
    # Keep only the rows whose date is NOT the one you found
    cleaned_data = df[df.date != date_to_remove]
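
If ties should be handled as well, one possible variation is to collect every date whose 10:00 value equals the maximum and drop them all with isin. This is only a sketch: the helper name drop_dates_with_max_at and the toy data (two dates tied at 775) are made up for illustration and are not part of the question.

```python
import pandas as pd


def drop_dates_with_max_at(df, time="10:00"):
    """Drop every date whose `val` at `time` equals the maximum at that time.

    Hypothetical helper; assumes the columns are named "date", "min" and
    "val", as in the question.
    """
    at_time = df[df["min"] == time]
    if at_time.empty:
        # No row at the requested time, so there is nothing to remove
        return df.copy()
    max_val = at_time["val"].max()
    # All dates that reach the maximum, not just the last one after sorting
    dates_to_remove = at_time.loc[at_time["val"] == max_val, "date"]
    return df[~df["date"].isin(dates_to_remove)]


# Made-up toy data: 2005-1-20 and 2005-1-21 tie for the highest 10:00 value
toy = pd.DataFrame(
    [["2005-1-20", "9:35", 5], ["2005-1-20", "10:00", 775],
     ["2005-1-21", "9:35", 2], ["2005-1-21", "10:00", 775],
     ["2005-1-22", "9:35", 12], ["2005-1-22", "10:00", 15]],
    columns=["date", "min", "val"],
)
cleaned = drop_dates_with_max_at(toy)
print(cleaned)
```

With this toy input both tied dates are removed and only the 2005-1-22 rows remain. On the data from the question, where the maximum is unique, it should give the same result as the code above.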