My goal is to get the complete list up to and including the first occurrence of the last unique value. Ideally, I would like one way to do this in plain Python and one using Pandas, but a single approach would be fine. I also need to preserve the ordering of the list.
Note that in my example below, the last unique value happens to be the largest value in the list. This is not necessarily true for my application: the last unique value can be the smallest, the largest, or anything in between.
Below I show the progress I have made so far.
import pandas as pd
data_dict = {"RAW": [4000076160, 5354368, 4641792, 4641792, 4289860736, 982783232, 2122384768,
4136386944, 5440384, 4772864, 4772864, 4289881216,
4270354816, 4293477248, 4286243840, 4286243840, 3400832, 982783232, 2122384768],
"ADC_TYPE": [3, 7, 8, 8, 9, 10, 11,
3, 7, 8, 8, 9,
3, 7, 8, 8, 9, 10, 11]}
df = pd.DataFrame(data_dict)
print(df)
The returned DataFrame (i.e. df):
RAW ADC_TYPE
0 4000076160 3
1 5354368 7
2 4641792 8
3 4641792 8
4 4289860736 9
5 982783232 10
6 2122384768 11
7 4136386944 3
8 5440384 7
9 4772864 8
10 4772864 8
11 4289881216 9
12 4270354816 3
13 4293477248 7
14 4286243840 8
15 4286243840 8
16 3400832 9
17 982783232 10
18 2122384768 11
I can use the following piece of code, but it will not return the complete list up to the last unique value.
unique_types = df["ADC_TYPE"].unique().tolist() # return type is python list
print(unique_types)
Which returns:
[3, 7, 8, 9, 10, 11]
My goal is to return:
[3, 7, 8, 8, 9, 10, 11]
I have searched through this forum and Google, but I have not found a solution to my problem thus far. I have found several examples that return a list of unique values, but not an example that returns the complete list up until the last unique value. Thanks!
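You can use idxmax() to find the index of the first occurrence of the maximum value (adding one because the end of a slice is exclusive), then use iloc to slice the DataFrame up to and including that row. This relies on the last unique value also being the largest value in the column: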
df.iloc[:df['ADC_TYPE'].idxmax()+1,1].tolist()
[3, 7, 8, 8, 9, 10, 11]
Or operate on just the column in question to get the same result:
df['ADC_TYPE'][:df['ADC_TYPE'].idxmax()+1].tolist()
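To see why the idxmax() approach depends on the last unique value being the maximum, here is a small check. It uses the same ADC_TYPE values with 10 and 11 swapped in the first group, and the Series name adc is just for illustration. The slice stops at 11 and never reaches 10:

```python
import pandas as pd

# Same ADC_TYPE values as in the question, but with 10 and 11 swapped
# in the first group, so the last *new* value (10) is not the maximum.
adc = pd.Series([3, 7, 8, 8, 9, 11, 10,
                 3, 7, 8, 8, 9,
                 3, 7, 8, 8, 9, 10, 11], name="ADC_TYPE")

# idxmax() returns the label of the first occurrence of the maximum (11).
# With the default RangeIndex that label is also the position.
cut = adc.idxmax()

result = adc.iloc[:cut + 1].tolist()
print(result)  # the slice ends at 11, so the 10 that follows is lost
```

So for data that is not ordered this way, the cut-off has to be found from the first occurrences of the values rather than from the maximum.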
Updated version that also works when the last unique value is not the maximum (10 and 11 are swapped in the first group):
data_dict = {"RAW": [4000076160, 5354368, 4641792, 4641792, 4289860736, 982783232, 2122384768,
4136386944, 5440384, 4772864, 4772864, 4289881216,
4270354816, 4293477248, 4286243840, 4286243840, 3400832, 982783232, 2122384768],
"ADC_TYPE": [3, 7, 8, 8, 9, 11, 10,
3, 7, 8, 8, 9,
3, 7, 8, 8, 9, 10, 11]}
df = pd.DataFrame(data_dict)
RAW ADC_TYPE
0 4000076160 3
1 5354368 7
2 4641792 8
3 4641792 8
4 4289860736 9
5 982783232 11
6 2122384768 10
7 4136386944 3
8 5440384 7
9 4772864 8
10 4772864 8
11 4289881216 9
12 4270354816 3
13 4293477248 7
14 4286243840 8
15 4286243840 8
16 3400832 9
17 982783232 10
18 2122384768 11
# We get the list of unique values in the order they appear
vals = []
for i in df['ADC_TYPE']:
    if i not in vals:
        vals.append(i)
print(vals)
# We take the _last_ value from the list
last_unique = vals.pop()
print(last_unique)
# We find the index of the first occurrence of that value
idx = (df['ADC_TYPE'] == last_unique).idxmax()
print(idx)
# We use the previous method to get the values up to that index
up_to_last = df.iloc[:idx+1, 1].tolist()
print(up_to_last)
[3, 7, 8, 9, 11, 10]
10
6
[3, 7, 8, 8, 9, 11, 10]
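The same idea can be written more compactly. Below is a minimal sketch of two alternatives, one in plain Python and one using pandas' duplicated(). The function names up_to_last_unique and up_to_last_unique_pd are made up for this example, and the plain-Python version assumes the values are hashable:

```python
import pandas as pd


def up_to_last_unique(values):
    """Plain Python: keep everything up to the last position where a
    value is seen for the first time. Order is preserved."""
    values = list(values)
    seen = set()
    last_new = -1  # position of the most recent first-time value
    for pos, value in enumerate(values):
        if value not in seen:
            seen.add(value)
            last_new = pos
    return values[:last_new + 1]


def up_to_last_unique_pd(series):
    """pandas: duplicated() marks repeats, so its negation is True at
    each first occurrence. The last True gives the cut-off position."""
    first_seen = ~series.duplicated()
    positions = first_seen.to_numpy().nonzero()[0]
    if len(positions) == 0:  # empty input
        return []
    # iloc is positional, so this does not depend on the index labels
    return series.iloc[:positions[-1] + 1].tolist()


adc_type = [3, 7, 8, 8, 9, 11, 10,
            3, 7, 8, 8, 9,
            3, 7, 8, 8, 9, 10, 11]

print(up_to_last_unique(adc_type))
print(up_to_last_unique_pd(pd.Series(adc_type)))
```

Both versions make a single pass over the data and avoid the repeated `in` lookups on a list, which may matter for long columns. With the DataFrame from above, the pandas version would be called as up_to_last_unique_pd(df['ADC_TYPE']).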