I am not sure how challenging this is, but here is my problem:
I have a data frame like the one below:
df-in
ColA       ColB  ColC  ColD
PA         1     2     c
PB         3     3     d
PA         5     11    x
Ind1 Suma  20    14    z
QA         3     7     a
QB         3     7     b
QC         5     8     c
QD         5     12    c
Ind2 Dat   202   3     y
RA         12    1     a
RB         13    1     v
RC         14    1     q
Ind3 CapT  120   3     t
Other      10    4     x
I need to create a column (say ColN) that contains only the indicator values (Ind1, Ind2, or Ind3, as they appear in ColA), following this rule: ColN is Ind1 for every row up to and including the row where ColA contains Ind1, Ind2 for the rows after that up to and including the row containing Ind2, and Ind3 for the rows after that up to and including the row containing Ind3. Rows after Ind3 also get Ind3.
(To be clear, Ind1, Ind2, and Ind3 could be other strings such as 'star', 'planet', and 'moon', but whatever they are, they are fixed and will not change.)
So my output df will look like the following:
df-out
ColA       ColB  ColC  ColD  ColN
PA         1     2     c     Ind1
PB         3     3     d     Ind1
PA         5     11    x     Ind1
Ind1 Suma  20    14    z     Ind1
QA         3     7     a     Ind2
QB         3     7     b     Ind2
QC         5     8     c     Ind2
QD         5     12    c     Ind2
Ind2 Dat   202   3     y     Ind2
RA         12    1     a     Ind3
RB         13    1     v     Ind3
RC         14    1     q     Ind3
Ind3 CapT  120   3     t     Ind3
Other      10    4     x     Ind3
How can I achieve that? I have tried iterating over ColA and putting its values in a list, but nothing seems to work. Any help will be appreciated.
Example
The answers from @RomanPerekhrest and @mozway are excellent.
If you cannot solve your problem with their answers, I think you should provide a more accurate example so that others can help you.
I will create a new example.
import pandas as pd
data1 = {'ColA': ['PA', 'PB', 'star Suma', 'QA', 'moon Dat', 'RA', 'planet CapT']}
df = pd.DataFrame(data1)
df
ColA
0 PA
1 PB
2 star Suma
3 QA
4 moon Dat
5 RA
6 planet CapT
I didn't create ColB, ColC, and ColD because they are not needed to build ColN.
Code
We can extract whichever of star, moon, or planet appears in ColA and back-fill the result, so that each row takes the next indicator found at or below it.
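kw = ['star', 'moon', 'planet']
# Extract the first keyword found in each ColA value (NaN where none matches),
# then back-fill so each row takes the next indicator at or below it.
df['ColN'] = df['ColA'].str.extract('({})'.format('|'.join(kw)), expand=False).bfill()
df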
          ColA    ColN
0           PA    star
1           PB    star
2    star Suma    star
3           QA    moon
4     moon Dat    moon
5           RA  planet
6  planet CapT  planet
The problem has already been solved by other answers.
However, I have modified the example and solution to help you apply it to your situation.
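For the data in the question, a trailing row such as 'Other' has no indicator below it, so a back-fill alone would leave it as NaN. Here is a minimal sketch of one way to handle that, assuming the indicators are the literal strings Ind1, Ind2, and Ind3 and that a forward-fill after the back-fill is acceptable for the trailing rows. ColB to ColD are left out because they do not affect ColN.

```python
import pandas as pd

# ColA values copied from the question's df-in; the other columns are omitted.
df = pd.DataFrame({'ColA': ['PA', 'PB', 'PA', 'Ind1 Suma', 'QA', 'QB', 'QC', 'QD',
                            'Ind2 Dat', 'RA', 'RB', 'RC', 'Ind3 CapT', 'Other']})

kw = ['Ind1', 'Ind2', 'Ind3']
pattern = '({})'.format('|'.join(kw))

# bfill labels each row with the next indicator at or below it; ffill then
# covers trailing rows (such as 'Other') that have no indicator after them.
df['ColN'] = df['ColA'].str.extract(pattern, expand=False).bfill().ffill()
print(df)
```

If the real keywords contain regex special characters, escaping each one with re.escape before joining them is probably a good idea.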