I am trying to pull all pages from a specific API endpoint and then convert the JSON responses into a pandas DataFrame. My approach checks a value in the JSON response nested under the "meta" key, called `hasMore`: if there is another page of results its value is True, and if there are no more pages it is False.
I can print the result of has_More and it comes back True, so I know it has found the correct value. However, when I run the function, I always get back an empty list, whether I collect the JSON or the DataFrames. I've been at this for a while trying new code, but I always get back empty lists.
import requests
import json
import pandas as pd

url = "https://xxxxxxxx.xxxxxxxxx.com/projects/api/v3/tasklists"

def get_all_tasklists(url, pageSize=100, page=1):
    response = requests.get(f'{url}?page={page}&pageSize={pageSize}',
                            headers={"authorization": xxxxxxxxxxxx}
                            )
    meta_data = json.loads(response.text)['meta']
    has_More = meta_data['page']['hasMore']
    timelogs = []
    if has_More == True:
        x = pd.json_normalize(json.loads(response.text))
        timelogs.extend(x)
        page += 1
    else:
        print(timelogs)

get_all_tasklists(url, pageSize=100, page=1)
A couple of problems:
One, your code will only ever fetch one page per call. Given that you've named it get_all_xxx, this is probably not intended behavior. Two, you're using list.extend where list.append is probably what you want: extending a list with a DataFrame iterates the DataFrame, which yields its column labels rather than its rows. Three, the function has no return statement, so it always returns None.
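To illustrate the extend vs. append difference (with a made-up DataFrame, just for demonstration):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

results = []
results.extend(df)   # iterating a DataFrame yields its column labels
print(results)       # ['a', 'b'] -- the actual data is lost

results = []
results.append(df)   # stores the whole DataFrame as a single element
print(len(results))  # 1
```

This is why your `timelogs` list never contains any usable data even though the request itself succeeds.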
I've tweaked and cleaned up your code; this should serve as a minimal working example of what you'll need. It's still missing error handling around the request itself, but it should be a good base. Note that the pagination is now inside a loop.
import requests
import json
import pandas as pd

url = "https://xxxxxxxx.xxxxxxxxx.com/projects/api/v3/tasklists"

def get_all_tasklists(url, pageSize=100, page=1):
    dataframes = []
    while True:
        response = requests.get(
            f'{url}?page={page}&pageSize={pageSize}',
            headers={"authorization": xxxxxxxxxxxx}
        )
        meta_data = json.loads(response.text)['meta']
        has_more = meta_data['page']['hasMore']
        x = pd.json_normalize(json.loads(response.text))
        dataframes.append(x)
        if has_more:
            page += 1
        else:
            # concatenate the per-page frames into one DataFrame
            return pd.concat(dataframes)

get_all_tasklists(url, pageSize=100, page=1)
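For the missing error handling, here is a minimal sketch of one way to wrap the request itself. The retry count, timeout, and helper name are my own choices for illustration, not anything the API requires:

```python
import requests

def fetch_page(url, page, page_size, headers, retries=3):
    """Fetch one page of results, retrying on network errors.

    Raises requests.RequestException if all attempts fail or the
    server returns an HTTP error status.
    """
    for attempt in range(retries):
        try:
            response = requests.get(
                url,
                params={"page": page, "pageSize": page_size},
                headers=headers,
                timeout=10,
            )
            response.raise_for_status()  # raise on 4xx/5xx responses
            return response.json()
        except requests.RequestException:
            if attempt == retries - 1:
                raise  # out of retries, let the caller handle it
```

You could then call `fetch_page` inside the `while True` loop and decide at the call site whether a failure should abort the whole run or just skip a page.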