I have two address columns and I want to extract the last word from the first column and the first word from the second column. In the provided example there aren't two words in column 'Address2', but I want to build the code in such a way that it will work regardless of how the dataset will look like. Sometimes the address2 can be one word, something it will have 2, etc..
data = {
'Address1': ['3 Steel Street', '1 Arnprior Crescent', '40 Bargeddie Street Blackhill'],
'Address2': ['Saltmarket', 'Castlemilk', 'Blackhill']
}
df = pd.DataFrame(data)
I have no problem with column 'Address1':
df[['StringStart', 'LastWord']] = df['Address1'].str.rsplit(' ', n=1, expand=True)
The problem comes with column 'Address2' where if I apply the above code I an error: Columns must be same length as key
I understand where the problem is coming from - I am trying to split one column which has one element into two columns. I am sure there is a way in which this can be handled to allow the split anyway and return Null if there isn't a word and a value if there is.
Using str.extract()
might be better for several reasons: it handles all cases, offers precision with regular expressions, and eliminates the risk of value errors.
import pandas as pd
data = {
'Address1': ['3 Steel Street', '1 Arnprior Crescent', '40 Bargeddie Street Blackhill'],
'Address2': ['Saltmarket', 'Castlemilk East', 'Blackhill']
}
df = pd.DataFrame(data)
df[['StringStart', 'LastWord']] = df['Address1'].str.rsplit(' ', n=1, expand=True)
df[['FirstWord_Address2', 'Remaining_Address2']] = (
df['Address2'].str.extract(r'^(\S+)\s*(.*)$')
)
print(df)
Or:
df[['Address1_Prefix', 'Address1_LastWord']] = df['Address1'].str.extract(r'^(.*\b)\s+(\S+)$')
df[['Address2_FirstWord', 'Address2_Remaining']] = df['Address2'].str.extract(r'^(\S+)\s*(.*)$')
Output:
Address1 Address2 StringStart LastWord FirstWord_Address2 Remaining_Address2
0 3 Steel Street Saltmarket 3 Steel Street Saltmarket
1 1 Arnprior Crescent Castlemilk East 1 Arnprior Crescent Castlemilk East
2 40 Bargeddie Street Blackhill Blackhill 40 Bargeddie Street Blackhill Blackhill