python pandas

Parse a column value and save the first section in a new column


I need to parse the values of a column in a data frame and save the first parsed section in a new column. If a value contains the delimiter "-", the new column should hold the part before the first "-"; if not, it should be left empty.

import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'code': ['01-02-11-55-00115', '11-02-11-55-00445', 'test', '31-0t-11-55-00115'],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)
df.head()

I want to add a new column that holds the first parsed section. The expected values for the new column are: 01, 11, null, 31


Solution

  • df['parsed'] = df['code'].apply(lambda x: x.split('-')[0] if '-' in x else 'null')
    

    For rows without a "-" this stores the literal string 'null' (not a missing value), so it will output:

                   name               code favorite_color  grade parsed
    0    Willard Morris  01-02-11-55-00115           blue     88     01
    1       Al Jennings  11-02-11-55-00445           blue     92     11
    2      Omar Mullins               test         yellow     95   null
    3  Spencer McDaniel  31-0t-11-55-00115          green     70     31
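
If a real missing value is preferred over the string 'null', one possible variation (a sketch, not part of the original answer) is to use the vectorized `.str` methods together with `Series.where`. It assumes the same `code` column as in the question; rows without a "-" end up as NaN/NA instead of text.

```python
import pandas as pd

# Same 'code' values as in the question; other columns omitted for brevity.
df = pd.DataFrame({'code': ['01-02-11-55-00115', '11-02-11-55-00445',
                            'test', '31-0t-11-55-00115']})

# Text before the first '-' for every row (rows without '-' keep the whole value).
first = df['code'].str.split('-', n=1).str[0]

# Keep that text only where the original value actually contains a '-';
# everywhere else Series.where fills in a missing value.
has_delim = df['code'].str.contains('-', regex=False)
df['parsed'] = first.where(has_delim)

print(df)
```

With a missing value in place, `df['parsed'].isna()` can be used to find the rows that had no delimiter, which is not possible when the placeholder is the string 'null'.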