I have a DataFrame like this:
import numpy as np
import pandas as pd
dataA = [["2005-1-20", "9:35", 5], ["2005-1-20", "9:40", 8], ["2005-1-20", "9:45", 7],
["2005-1-20","9:50", 4], ["2005-1-20", "10:00", 2],
["2005-1-21", "9:35", 2], ["2005-1-21", "9:40", 3], ["2005-1-21", "9:45", 4],
["2005-1-21","9:50", 4], ["2005-1-21", "10:00", 775],
["2005-1-22", "9:35", 12], ["2005-1-22", "9:40", 13], ["2005-1-22", "9:45", 14],
["2005-1-22","9:50", 14], ["2005-1-22", "10:00", 15]]
df = pd.DataFrame(data = dataA, columns=["date", "min", "val"])
print(df)
date min val
0 2005-1-20 9:35 5
1 2005-1-20 9:40 8
2 2005-1-20 9:45 7
3 2005-1-20 9:50 4
4 2005-1-20 10:00 2
5 2005-1-21 9:35 2
6 2005-1-21 9:40 3
7 2005-1-21 9:45 4
8 2005-1-21 9:50 4
9 2005-1-21 10:00 775
10 2005-1-22 9:35 12
11 2005-1-22 9:40 13
12 2005-1-22 9:45 14
13 2005-1-22 9:50 14
14 2005-1-22 10:00 15
I want to find the date whose value at min == "10:00" is the highest, and then delete every row for that date (that is, drop the whole date group). How can I do it?
This is my solution. It removes only one date, even if two dates share the highest value.
# Filter so that only the rows you want to compare remain.
# The column is named "min", which collides with the DataFrame.min method,
# so it has to be accessed with brackets rather than as an attribute.
only_data_at_ten = df[df["min"] == "10:00"]
# Find the date with the highest value by sorting ascending and taking the last row
date_to_remove = only_data_at_ten.sort_values("val")["date"].iloc[-1]
# Keep only the rows whose date is NOT the one you found
cleaned_data = df[df["date"] != date_to_remove]
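To remove every date that ties for the highest value at 10:00, one option is to compare against the maximum instead of sorting and taking the last row. This is only a minimal sketch, assuming the same column names as above (date, min, val). The helper name drop_dates_with_max and the small frame with a tie are made up for illustration.

```python
import pandas as pd


def drop_dates_with_max(df, minute="10:00"):
    """Hypothetical helper: drop all rows of every date whose value at
    `minute` equals the highest value observed at that minute."""
    # Rows recorded at the minute of interest
    at_minute = df[df["min"] == minute]
    # Highest value at that minute
    max_val = at_minute["val"].max()
    # All dates that reach that value (more than one if there is a tie)
    dates_to_remove = at_minute.loc[at_minute["val"] == max_val, "date"]
    # Keep only the rows whose date is not among them
    return df[~df["date"].isin(dates_to_remove)]


# Made-up frame in which two dates tie for the highest value at 10:00
tied = pd.DataFrame(
    [["2005-1-20", "9:35", 5], ["2005-1-20", "10:00", 775],
     ["2005-1-21", "9:35", 2], ["2005-1-21", "10:00", 775],
     ["2005-1-22", "9:35", 12], ["2005-1-22", "10:00", 15]],
    columns=["date", "min", "val"],
)
cleaned_tied = drop_dates_with_max(tied)
print(cleaned_tied)
```

Calling drop_dates_with_max(df) on the frame from the question should behave like the solution above when there is no tie, since isin with a single date is equivalent to the != filter.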