python pandas dataframe

Call and Rename not working for Column in Pandas Dataframe


I've read two .csv files into two separate pandas dataframes for preprocessing. The dataframes are 'properties_data' and 'customers_data'. Selecting and renaming does not work for the first imported column ('id' and 'customerid', respectively) of either dataframe. What could be wrong? Thank you in advance for assisting!

Here is the code with output, for the 'properties_data' dataframe (run in a Jupyter environment):

import pandas as pd
properties = pd.read_csv("properties.csv", usecols = range(1,10))
properties_data = properties.copy()
properties_data.index.name = 'propertiesIndex'
properties_data.rename(columns={'id': 'propertyId', 'date_sale': 'dateSale', 'property#': 'propertyNo', 'customerid' : 'customerId'}, inplace=True)
properties_data.dtypes

The output shows that the name of the 'id' column did not change to 'propertyId':

id             int64
building        int64
dateSale       object
type           object
propertyNo      int64
area          float64
price          object
status         object
customerId     object
dtype: object

Other columns can be selected:

properties_data.loc[: ,'dateSale']
# or properties_data.dateSale

# Giving output:

propertiesIndex
0    11/1/2005
1    10/1/2005
2     7/1/2007
3    12/1/2007
4    11/1/2004
Name: dateSale, dtype: object

But the 'id' column (or 'propertyId', just in case) cannot:

properties_data.loc[: ,'id']
# or properties_data.id

# Giving a long error message ending with:

KeyError: 'id'

The same goes for the 'customers_data' dataframe:

customers = pd.read_csv("customers.csv", usecols = [i for i in range(1,13)])
customers_data = customers.copy()
customers_data.index.name = 'customersIndex'
customers_data.rename(columns={'customerid' : 'customerId', 'birth_date': 'birthDate', 'deal_satisfaction': 'dealSatisfaction'}, inplace=True)
customers_data.dtypes

Here, too, the output shows that the name of the 'customerid' column did not change to 'customerId':

customerid         object
entity              object
name                object
surname             object
birthDate           object
sex                 object
country             object
state               object
purpose             object
dealSatisfaction     int64
mortgage            object
source              object
dtype: object

Again, other columns can be selected:

customers_data.loc[: ,'entity']
# or customers_data.entity

# Giving output:

customersIndex
0    Individual
1    Individual
2    Individual
3    Individual
4       Company
Name: entity, dtype: object

But the 'customerid' column (or 'customerId', just in case) cannot:

customers_data.loc[: ,'customerid']
# or customers_data.customerid

# Again, a long error message ending with:

KeyError: 'customerid'

Here is a sample of the "properties.csv" file for the 'properties_data' dataframe:

,id,building,date_sale,type,property#,area,price,status,customerid
0,1030,1,11/1/2005,Apartment,30,743.09,"$246,172.68 ", Sold , C0028 
1,1029,1,10/1/2005,Apartment,29,756.21,"$246,331.90 ", Sold , C0027 
2,2002,2,7/1/2007,Apartment,2,587.28,"$209,280.91 ", Sold , C0112 
3,2031,2,12/1/2007,Apartment,31,1604.75,"$452,667.01 ", Sold , C0160 
4,1049,1,11/1/2004,Apartment,49,1375.45,"$467,083.31 ", Sold , C0014 

Here is a sample of the "customers.csv" file for the 'customers_data' dataframe:

,customerid,entity,name,surname,birth_date,sex,country,state,purpose,deal_satisfaction,mortgage,source
0,C0110,Individual,Kareem,Liu,5/11/1968,F,USA,California,Home,4,Yes,Website
1,C0010,Individual,Trystan,Oconnor,11/26/1962,M,USA,California,Home,1,No,Website
2,C0132,Individual,Kale,Gay,4/7/1959,M,USA,California,Home,4,Yes,Agency
3,C0137,Individual,Russell,Gross,11/25/1959,M,USA,California,Home,5,No,Website
4,C0174,Company,Marleez,Co,,,USA ,California,Investment,5,No,Website

Solution

  • The cause was simply an invisible character in front of 'id' and 'customerid' in the header rows of the two csv files, which had to be removed. The two columns were therefore created as '[invisible character]id' and '[invisible character]customerid', so the plain names 'id' and 'customerid' did not match them when selecting or renaming. Curiously, when I copy the csv samples I shared here back into files, the invisible characters are gone and the code works as expected.
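
    For anyone hitting the same symptom, here is a minimal sketch of how such a character could be spotted and removed in pandas without editing the csv files by hand. The inline data and the zero-width space (U+200B) are assumptions for illustration only, since the actual character in the original files is not known, and clean_header is a hypothetical helper name. Note that rename silently skips keys that match no column by default, which is why the original call raised no error.

```python
import io
import pandas as pd

# Hypothetical stand-in for properties.csv: a zero-width space (U+200B)
# sits in front of "id". The real character in the original files is unknown.
csv_text = (
    ",\u200bid,building,date_sale\n"
    "0,1030,1,11/1/2005\n"
    "1,1029,1,10/1/2005\n"
)

df = pd.read_csv(io.StringIO(csv_text), usecols=range(1, 4))

# ascii() escapes every non-ASCII character, so hidden ones become visible
# in the printed column names.
print([ascii(c) for c in df.columns])


def clean_header(name):
    """Drop non-printable characters from a header, then trim whitespace."""
    return "".join(ch for ch in name if ch.isprintable()).strip()


# Clean all headers first, then rename as in the question.
df = df.rename(columns=clean_header)
df = df.rename(columns={"id": "propertyId", "date_sale": "dateSale"})
print(df.columns.tolist())
```

    Cleaning the headers right after read_csv keeps the source files untouched. If the stray character turns out to be a byte order mark at the very start of the file, passing encoding="utf-8-sig" to read_csv may be enough on its own.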