python pandas numpy genfromtxt

np.genfromtxt returns string with 'b'


I am learning about the different functions of NumPy, and I have a dummy dataset named 100-Sales-Records.

Now I want to read it using np.genfromtxt. My code to read it is

df3 = np.genfromtxt('100 Sales Records.csv', delimiter=',',names=True, dtype=None)

I read it this way because it is a CSV file containing both strings and floats. The output of

pd.DataFrame(df3).head() is (screenshot of the output)

As you can see, all the strings have a b in front of them. What is this b, and how do I remove it?


Solution

  • The b prefix means the value is a bytes object, not a str: genfromtxt has returned the text columns as raw bytes instead of decoding them.

    To get rid of it, pass the encoding parameter to genfromtxt and set it to 'utf-8', so the text is decoded into ordinary strings:

    df3 = np.genfromtxt('100 Sales Records.csv', delimiter=',',names=True, dtype=None, encoding='utf-8')
    

    This will give you the desired results.
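
To see the effect without the original file, here is a minimal sketch. The two-column CSV text and the names csv_text and data are made up for illustration (they are not the asker's dataset), and io.StringIO stands in for the file on disk. The last lines show what the b prefix means in plain Python.

```python
import io

import numpy as np

# Hypothetical stand-in for '100 Sales Records.csv': one text column, one numeric column.
csv_text = "Region,Units\nAsia,10\nEurope,20\n"

# Same call as in the answer, reading from an in-memory buffer instead of a file.
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,
    dtype=None,
    encoding="utf-8",
)

# The text column now holds str values; dtype kind 'U' means unicode strings.
print(data["Region"].tolist())
print(data["Region"].dtype.kind)

# What the b prefix means: a bytes object becomes a str once it is decoded.
raw = b"Asia"
text = raw.decode("utf-8")
print(type(raw).__name__, type(text).__name__)
```

If the array has already been loaded as bytes, decoding the values afterwards (as in the last three lines) is an alternative, but passing encoding at read time is simpler.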