python pandas string dataframe

How to update Pandas DataFrame column using string concatenation in function


I have a dataframe where I would like to add a full address column, which would be the combination of 4 other columns (street, city, county, postalcode) from that dataframe. Example output of the address column would be:

5 Test Street, Worthing, West Sussex, RH5 3BX

Or, for example, if the city were empty:

5 Test Street, West Sussex, RH5 3BX

This is my code. After testing it, I see I might need to use something like apply, but I can't work out how to do it.

def create_address(street: str, city: str, county: str, postalcode: str) -> str:
    
    list_address = []
    
    if street:
        list_address.append(street)
    if city:
        list_address.append(city)
    if county:
        list_address.append(county)
    if postalcode:
        list_address.append(postalcode)

    address = ", ".join(list_address).rstrip(", ")

    return address

df["address"] = create_address(df["Street"], df["City"], df["County"], df["PostalCode"])

Solution

  • You can use apply with a lambda to build the concatenated full address row by row.

    Example input

    EDIT : postalcode with None

    import pandas as pd

    # Empty strings stand in for missing address parts
    data = {
        'street': ['street1', 'street2', 'street3'],
        'city': ['city1', '', 'city2'],
        'county': ['county1', 'county2', 'county3'],
        'postalcode': ['postalcode1', 'postalcode2', '']
    }
    df = pd.DataFrame(data)
    

    Sample code

    df['full_address'] = df.apply(
        lambda row: ', '.join(filter(None, [row['street'], row['city'], row['county'], row['postalcode']])),
        axis=1
    )
    

    Passing None as the function argument to filter removes every falsy element (such as an empty string or None), so missing parts are left out of the joined address.

    Output

        street   city   county   postalcode                          full_address
    0  street1  city1  county1  postalcode1  street1, city1, county1, postalcode1
    1  street2         county2  postalcode2         street2, county2, postalcode2
    2  street3  city2  county3                            street3, city2, county3
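
The call in the question fails because create_address receives whole Series, and `if street:` on a Series raises a ValueError about an ambiguous truth value. One caveat with the lambda above is that NaN is truthy, so filter(None, ...) keeps it and join then fails on a float. The following is a minimal sketch, not the only way to do it, that keeps the question's create_address function, calls it once per row, and also skips NaN. It assumes the question's column names (Street, City, County, PostalCode); the third row and the ADDRESS_COLUMNS name are made up for illustration.

```python
import pandas as pd

# Column names taken from the question; the constant itself is a hypothetical helper.
ADDRESS_COLUMNS = ["Street", "City", "County", "PostalCode"]


def create_address(street, city, county, postalcode) -> str:
    """Join the address parts that are present with ', '."""
    parts = []
    for value in (street, city, county, postalcode):
        # pd.isna catches None and NaN; the strip check catches '' and whitespace.
        if pd.isna(value) or not str(value).strip():
            continue
        parts.append(str(value).strip())
    return ", ".join(parts)


# Illustrative data: the first two rows mirror the question, the third is invented.
df = pd.DataFrame(
    {
        "Street": ["5 Test Street", "5 Test Street", "7 Sample Road"],
        "City": ["Worthing", None, "Brighton"],
        "County": ["West Sussex", "West Sussex", float("nan")],
        "PostalCode": ["RH5 3BX", "RH5 3BX", ""],
    }
)

# axis=1 hands the lambda one row at a time, so create_address gets scalars.
df["address"] = df.apply(
    lambda row: create_address(*(row[col] for col in ADDRESS_COLUMNS)),
    axis=1,
)

print(df["address"].tolist())
```

With a named function the missing-value rules live in one place, which may be easier to extend than the one-line lambda if more cases turn up later.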