python-3.xexport-to-csvsharepoint-rest-apijson-api-response-converter

How do I convert my response with byte characters to readable CSV - PYTHON


I am building an API to save CSVs from Sharepoint Rest API using python 3. I am using a public dataset as an example. The original csv has 3 columns Group,Team,FIFA Ranking with corresponding data in the rows.For reference. the original csv on sharepoint ui looks like this:

after using data=response.content the output of data is:

b'Group,Team,FIFA Ranking\r\nA,Qatar,50\r\nA,Ecuador,44\r\nA,Senegal,18\r\nA,Netherlands,8\r\nB,England,5\r\nB,Iran,20\r\nB,United States,16\r\nB,Wales,19\r\nC,Argentina,3\r\nC,Saudi Arabia,51\r\nC,Mexico,13\r\nC,Poland,26\r\nD,France,4\r\nD,Australia,38\r\nD,Denmark,10\r\nD,Tunisia,30\r\nE,Spain,7\r\nE,Costa Rica,31\r\nE,Germany,11\r\nE,Japan,24\r\nF,Belgium,2\r\nF,Canada,41\r\nF,Morocco,22\r\nF,Croatia,12\r\nG,Brazil,1\r\nG,Serbia,21\r\nG,Switzerland,15\r\nG,Cameroon,43\r\nH,Portugal,9\r\nH,Ghana,61\r\nH,Uruguay,14\r\nH,South Korea,28\r\n'

how do I convert the above to csv that pandas can manipulate with the columns being Group,Team,FIFA and then the corresponding data dynamically so this method works for any csv.

I tried:

data=response.content.decode('utf-8', 'ignore').split(',')

however, when I convert the data variable to a dataframe then export the csv the csv just returns all the values in one column.

I tried:

data=response.content.decode('utf-8') or data=response.content.decode('utf-8', 'ignore') without the split

however, pandas does not take this in as a valid df and returns invalid use of dataframe constructor

I tried:

data=json.loads(response.content)

however, the format itself is invalid json format as you will get the error json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)


Solution

  • Given:

    data = b'Group,Team,FIFA Ranking\r\nA,Qatar,50\r\nA,Ecuador,44\r\nA,Senegal,18\r\n' #...
    

    If you just want a CSV version of your data you can simply do:

    with open("foo.csv", "wt", encoding="utf-8", newline="") as file_out:
        file_out.writelines(data.decode())
    

    If your objective is to load this data into a pandas dataframe and the CSV is not actually important, you can:

    import io
    import pandas
    foo = pandas.read_csv(io.StringIO(data.decode()))
    print(foo)