[SOLVED] How can I find the first row after a number of duplicated rows?

How can I find the first row after a number of duplicated rows?

My DataFrame is:

import pandas as pd
df = pd.DataFrame(
    {
        'x': ['a', 'a', 'a','b', 'b','c', 'c', 'c',],
        'y': list(range(8))
    }
)

And this is the expected output. I want to create column z:

   x  y    z
0  a  0    NaN
1  a  1    NaN 
2  a  2    NaN
3  b  3    3
4  b  4    NaN
5  c  5    NaN
6  c  6    NaN
7  c  7    NaN

The logic is:

I want to find the first row after the first group of duplicated rows. For example in column x, the value a is the first duplicated value. I want to find one row after the a values end. And then put the y of that row for z column.

This is my attempt that did not give me the output:

m = (df.x.duplicated())
out = df[m]

Solution

One option, using a custom mask:

# flag rows after the first group
m = df['x'].ne(df['x'].iat[0]).cummax()

# pick the first one
out = df[m & ~m.shift(fill_value=False)]

If your first value is always a and you want to find the first non-a you could also use:

m2 = df['x'].eq('a')

out = df[m2.shift(fill_value=False) & ~m2]

Or, if you're sure there is at least one row after the leading as:

out = df.loc[[df['x'].ne('a').idxmax()]]

Output:

   x  y
3  b  3

Some intermediates (all approaches):

   x  y      m  ~m.shift(fill_value=False)     m2  m2.shift(fill_value=False)  df['x'].ne('a')
0  a  0  False                        True   True                       False            False
1  a  1  False                        True   True                        True            False
2  a  2  False                        True   True                        True            False
3  b  3   True                        True  False                        True             True
4  b  4   True                       False  False                       False             True
5  c  5   True                       False  False                       False             True
6  c  6   True                       False  False                       False             True
7  c  7   True                       False  False                       False             True