I'm trying to search for subreddits associated with a keyword using the Reddit API, but I can't reliably get 50 or more results. I've tried the API with the requests library and now with PRAW, and both show the same issue.
If I run the code below, for most of the keywords I get the requested 50 subreddits, but for keywords like Entertainment I get 48 or 49. Furthermore, if I set the limit to 100, I get 53 or 54 results for the other keywords. I tried to solve this using the after parameter; however, even when the after key is assigned the fullname of the last subreddit from the previous GET request, the r.get call with that parameter returns 0 posts.
Is there any way to work around this or am I just missing something?
import praw

keywords = [
    "Comedy",
    "Education",
    "Entertainment",
    "Movies",
    ...
]

r = praw.Reddit(
    client_id='',
    client_secret='',
    password='',
    user_agent='',
    username='',
)

prev_completed = True

def get_data_reddit(search, after=None):
    global prev_completed
    params = {"q": search, "limit": 50}
    if after and not prev_completed:
        params["after"] = "t5_" + str(after.id)
        prev_completed = True
    posts = r.get('https://oauth.reddit.com/subreddits/search', params=params)
    if len(posts) == 50:
        prev_completed = True
    else:
        prev_completed = False
    return posts

post = None
count = 0
j = 0
for i in range(0, len(keywords)):
    posts = get_data_reddit(keywords[j], post)
    if len(posts) < 50:
        j -= 1
    for post in posts:
        count += 1
    j += 1
print(count)
I realized that the parameter I was actually looking for was, confusingly, before rather than after. With this in mind, here is the code that worked for me.
prev_completed = True
postCount = 0  # results collected so far for the current keyword

def get_data_reddit(search, limit, after=None):
    global prev_completed
    global postCount
    params = {"q": search, "limit": limit}
    # If the previous request came up short, continue from the last
    # subreddit seen by passing its fullname ("t5_" + id) as "before".
    if after and not prev_completed:
        params["before"] = "t5_" + str(after.id)
        prev_completed = True
    posts = r.get('https://oauth.reddit.com/subreddits/search', params=params)
    postCount += len(posts)
    if postCount == 50:
        prev_completed = True
    else:
        prev_completed = False
    return posts

post = None
count = 0
i = 0
while i < len(keywords):
    # Request only as many results as are still missing for this keyword.
    posts = get_data_reddit(keywords[i], 50 - postCount, post)
    if postCount < 50:
        i -= 1  # stay on the same keyword until 50 results are collected
    else:
        postCount = 0
    for post in posts:
        count += 1
    i += 1
print(count)
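The same idea can be sketched without global state. The version below is only an illustration under assumptions: `collect_subreddits`, `fetch_page`, `cursor_key` and `max_requests` are hypothetical names, and `fetch_page` stands in for whatever performs the request (for example a thin wrapper around the `r.get(...)` call above) and is assumed to return a list of dicts with a `name` key holding the fullname such as `t5_...`. The cursor defaults to `before` only because that is what worked here; the request cap is there so the loop stops when a keyword simply has fewer matches than the target.

```python
from typing import Callable, Dict, List


def collect_subreddits(
    fetch_page: Callable[[Dict[str, object]], List[dict]],
    query: str,
    target: int = 50,
    cursor_key: str = "before",  # assumption: the cursor that worked above
    max_requests: int = 10,      # safety cap against endless retries
) -> List[dict]:
    """Keep requesting pages until `target` unique results are collected."""
    seen = set()
    results: List[dict] = []
    cursor = None

    for _ in range(max_requests):
        if len(results) >= target:
            break

        # Ask only for what is still missing.
        params: Dict[str, object] = {"q": query, "limit": target - len(results)}
        if cursor is not None:
            params[cursor_key] = cursor

        page = fetch_page(params)

        # Skip anything already collected (pages may overlap).
        added = 0
        for item in page:
            if item["name"] not in seen:
                seen.add(item["name"])
                results.append(item)
                added += 1

        if added == 0:
            break  # empty or fully duplicated page: nothing more to fetch

        # Continue from the last item of the page just received.
        cursor = page[-1]["name"]

    return results[:target]
```

PRAW also provides `r.subreddits.search(query, limit=...)`, which returns a listing generator and handles paging internally; whether it avoids the short pages described in the question has not been checked here.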