I have this code using which I am able to download data from API. But I don't know how to convert this to pandas data frame.
import requests
import json
url = "https://statdb.luke.fi:443/PxWeb/api/v1/en/LUKE/02 Maatalous/04 Tuotanto/06 Lihantuotanto/02 Kuukausitilastot/02_Lihantuotanto_teurastamoissa_kk.px"
payload = json.dumps({
"query": [
{
"code": "Muuttuja",
"selection": {
"filter": "item",
"values": [
"Lihantuotanto"
]
}
},
{
"code": "Laji",
"selection": {
"filter": "item",
"values": [
"Lehmät"
]
}
}
],
"response": {
"format": "csv"
}
})
headers = {
'Content-Type': 'application/json',
'Cookie': 'rxid=710a361a-7044-494f-95b7-15261822712c'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
The code returns a text format data. I need guidance on how to make a pandas data frame with this output.
For this particular data frame, the Column headers would be
"Month",
"Variable",
"Cows 7) 8)"
So on and So forth.
You can use StringIO
to mimic a file-like object and then create a pd.DataFrame
from it. Additionally, there are some weird symbols in the first column (
), which I expect you don't need. Therefore, make sure to rename the first column as well.
import requests
import json
import pandas as pd
from io import StringIO
url = "https://statdb.luke.fi:443/PxWeb/api/v1/en/LUKE/02 Maatalous/04 Tuotanto/06 Lihantuotanto/02 Kuukausitilastot/02_Lihantuotanto_teurastamoissa_kk.px"
payload = json.dumps({ "query": [
{
"code": "Muuttuja",
"selection": {
"filter": "item",
"values": [
"Lihantuotanto"
]
}
},
{
"code": "Laji",
"selection": {
"filter": "item",
"values": [
"Lehmät"
]
}
} ], "response": {
"format": "csv" } })
headers = { 'Content-Type': 'application/json', 'Cookie': 'rxid=710a361a-7044-494f-95b7-15261822712c' }
response = requests.request("POST", url, headers=headers, data = payload)
# Create StringIO object to mimic file-like object
content_file = StringIO(response.text)
# Create pd.DataFrame
df = pd.read_csv(content_file)
# Clean column name
df.rename(columns = {'"Month"': 'Month'}, inplace = True)
print(df.head())