python pandas dataframe split

Pandas - Data transformation of column using no delimiters


I have a pandas DataFrame which consists of players' names and statistics from a sporting match. The only source of data lists them in the following format:

#                 PLAYER   M  FG  3PT FT REB AST STL PTS
34  BLAKE Brad        38  17  5  6  3   0    3   0   24
12  JONES Ben         42  10  2  6  1   0    4   1   12
8   SMITH Todd J.     16   9  1  4  1   0    3   2   18
5   MAY-DOUGLAS James  9   9  0  3  1   0    2   1   6
44  EDLIN Taylor      12   6  0  5  1   0    0   1   8

The players' names are in reverse order: surname first, then first name. I need to transform the names to the correct order of first name, then last name. So, specifically:

BLAKE Brad -> Brad BLAKE
SMITH Todd J. -> Todd J. SMITH
MAY-DOUGLAS James -> James MAY-DOUGLAS

The case of the letters does not matter, however I thought it could potentially be used to differentiate the first and last names. I know all last names will always be in uppercase, even if they include a hyphen. The first name will always be in sentence case (first letter uppercase and the rest lowercase). However, some names include a middle initial to differentiate players with the same name. I see how a space character can be used as a delimiter, potentially with a "split" transformation, but it gets difficult with the middle initial.

Are there any suggestions for a pandas function I can use to achieve this?

The desired output is:

#                 PLAYER   M  FG  3PT FT REB AST STL PTS
34  Brad BLAKE        38  17  5  6  3   0    3   0   24
12  Ben JONES         42  10  2  6  1   0    4   1   12
8   Todd J. SMITH     16   9  1  4  1   0    3   2   18
5   James MAY-DOUGLAS  9   9  0  3  1   0    2   1   6
44  Taylor EDLIN      12   6  0  5  1   0    0   1   8

Solution

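  • Split on the first whitespace only, then reverse the resulting list and join its values with a space. The surname never contains a space (hyphenated surnames stay in one piece), so everything after the first space is the first name plus any middle initial.

    # n=1 limits the split to the first space; n is keyword-only in pandas 2.0+
    df['PLAYER'] = df['PLAYER'].str.split(' ', n=1).str[::-1].str.join(' ')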
    

    To reverse only certain names, build a mask with isin and then use boolean indexing:

    names = ['BLAKE Brad', 'SMITH Todd J.', 'MAY-DOUGLAS James']
    
    mask = df['PLAYER'].isin(names)
    
    # Split on the first space (not on '-'), so hyphenated surnames stay intact
    df.loc[mask, 'PLAYER'] = df.loc[mask, 'PLAYER'].str.split(' ', n=1).str[::-1].str.join(' ')
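
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the split-and-reverse approach. The DataFrame is rebuilt from the names in the question only (the stat columns are left out), and the helper names `surname_last` and `surname_last_by_case` are made up for illustration. The second helper is a different technique, not part of the answer above: it uses a regular expression based on the uppercase-surname rule described in the question, which might help if a surname ever contained a space. The name `VAN DER BERG Jan` is a hypothetical example, not from the data.

```python
import pandas as pd

# Sample data: player names copied from the question, stat columns omitted.
df = pd.DataFrame(
    {"PLAYER": ["BLAKE Brad", "JONES Ben", "SMITH Todd J.",
                "MAY-DOUGLAS James", "EDLIN Taylor"]}
)


def surname_last(names: pd.Series) -> pd.Series:
    """Move the first space-separated token (the surname) to the end."""
    parts = names.str.split(" ", n=1)        # e.g. ['SMITH', 'Todd J.']
    return parts.str[::-1].str.join(" ")     # e.g. 'Todd J. SMITH'


def surname_last_by_case(names: pd.Series) -> pd.Series:
    """Alternative (assumption: surnames are fully uppercase and first names
    start with one uppercase letter followed by a lowercase letter)."""
    pattern = r"^([A-Z\-' ]+?)\s+([A-Z][a-z].*)$"
    # Names that do not match the pattern are returned unchanged.
    return names.str.replace(pattern, r"\2 \1", regex=True)


original = df["PLAYER"].copy()
df["PLAYER"] = surname_last(original)
print(df["PLAYER"].tolist())

# Hypothetical multi-word surname, handled only by the case-based variant.
print(surname_last_by_case(pd.Series(["VAN DER BERG Jan"])).tolist())
```

For the names in the question both helpers give the same result. The plain split is simpler and is probably enough as long as no surname contains a space.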