python csv pandas

Importing a CSV file into a pandas DataFrame


I have a CSV file taken from a SQL dump that looks like this (first few lines, shown with head file.csv in a terminal):

??AANAT,AANAT1576,4
AANAT,AANAT1704,1
AAP,AAP-D-12-00691,8
AAP,AAP-D-12-00834,3

When I use the pd.read_csv('file.csv') command I get an error "ValueError: No columns to parse from file".

Any ideas on how to import the CSV file into a table and avoid the error?

ELABORATION OF QUESTION (following Ed's comment)

I have tried header=None and skiprows=1 to avoid the ?? (which appears when using the head command from the terminal).


Solution

  • The ?? characters you see are non-printable bytes. Looking at your raw CSV file in a hex editor shows that they are FF FE, which is the byte-order mark (BOM) for UTF-16 little-endian. In other words, the file is encoded as UTF-16, not ASCII or UTF-8.

    So all you need to do is pass encoding='utf-16' to read_csv (plus header=None, since the file has no header row) and it reads in fine:

    In [46]:
    
    df = pd.read_csv('otherfile.csv', encoding='utf-16', header=None)
    df
    Out[46]:
           0               1   2
    0  AANAT       AANAT1576   4
    1  AANAT       AANAT1704   1
    2    AAP  AAP-D-12-00691   8
    3    AAP  AAP-D-12-00834   3
    4    AAP  AAP-D-13-00215  10
    5    AAP  AAP-D-13-00270   7
    6    AAP  AAP-D-13-00435   5
    7    AAP  AAP-D-13-00498   4
    8    AAP  AAP-D-13-00530   0
    9    AAP  AAP-D-13-00747   3
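
If you don't have a hex editor handy, the same check can be done from Python by looking at the first few bytes of the file. The following is only a minimal sketch of that idea, not part of pandas: the helper names encoding_from_bom and sniff_encoding are made up for illustration, and the fallback to utf-8 when no BOM is found is an assumption. It relies on the BOM constants in the standard codecs module.

```python
import codecs

# BOM signatures mapped to a Python codec that consumes the BOM on read.
# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM (FF FE 00 00)
# starts with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def encoding_from_bom(head, default="utf-8"):
    """Guess an encoding from the leading bytes of a file.

    `default` is returned when no BOM is present (an assumption; a file
    without a BOM can still be in some other encoding).
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def sniff_encoding(path, default="utf-8"):
    """Read the first four bytes of `path` and guess its encoding."""
    with open(path, "rb") as f:
        return encoding_from_bom(f.read(4), default)
```

For the file in the question, sniff_encoding('file.csv') should return 'utf-16', which can then be passed along as pd.read_csv('file.csv', encoding=sniff_encoding('file.csv'), header=None). One caveat of this sketch: a UTF-16 little-endian file whose first character is NUL would be misreported as UTF-32, which is unlikely for a CSV.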