I am practicing extracting data from Reddit. I tried to get the 20 most relevant communities whose names contain the word "sport". There are hundreds of them, but my API request returned fewer than 20. Do you know why?
Here is the code:
# headers holds the OAuth authorization and User-Agent (not shown here)
parameters = {"query": "sport", "limit": 20, "sort": "relevance"}
res = requests.get("https://oauth.reddit.com/api/search_reddit_names", headers=headers, params=parameters)
res.json()
output:
{'names': ['sports',
'sportsbook',
'sportsarefun',
'sportsbetting',
'SportsFR',
'sportscards',
'SportsPorn',
'SportingCP',
'Sports_Women']}
As seen in the Reddit API Docs,
GET /api/search_reddit_names
- List subreddit names that begin with a query string.
This difference between "begins with" and "contains" is probably the cause of your problem.
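To make the distinction concrete, here is a small sketch that uses plain string methods on a few names copied from the outputs in this thread. It does not call the Reddit API. The helper names `begins_with` and `contains` are made up for illustration, and the case-insensitive comparison is an assumption based on names like 'SportsFR' appearing in the output above.

```python
# Names copied from the outputs in this thread (not fetched live).
names = ["sports", "sportsbook", "BroncoSport", "Dualsport", "nba"]

def begins_with(name, query):
    # Prefix match, ignoring case (hypothetical helper).
    return name.lower().startswith(query.lower())

def contains(name, query):
    # Substring match anywhere in the name, ignoring case (hypothetical helper).
    return query.lower() in name.lower()

prefix_matches = [n for n in names if begins_with(n, "sport")]
substring_matches = [n for n in names if contains(n, "sport")]

print(prefix_matches)     # ['sports', 'sportsbook']
print(substring_matches)  # ['sports', 'sportsbook', 'BroncoSport', 'Dualsport']
```

A prefix-only endpoint can never return names such as BroncoSport or Dualsport, which is consistent with the short list you got back.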
From a quick glance at the API, I haven't found a suitable function for your needs.
EDIT:
You could use a different endpoint, one meant specifically for search:
import requests
# Set the URL endpoint for the API request
url = "https://www.reddit.com/subreddits/search.json"
# Set the parameters for the API request
search_param = "sport"
params = {
"q": search_param, # Search query
"limit": 100, # Maximum number of results to retrieve
"type": "sr" # Limit search to subreddits
}
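# Send the API request. Reddit's API rules ask for a descriptive User-Agent;
# the value below is only a placeholder, replace it with your own.
response = requests.get(url, params=params, headers={"User-Agent": "subreddit-search-example/0.1"})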
# Check if the request was successful (status code 200)
if response.status_code == 200:
    # Extract the JSON data from the response
    data = response.json()
    # Extract the list of subreddits from the JSON data
    subreddits = [item["data"]["display_name"] for item in data["data"]["children"]]
    # Print the list of subreddits
    print(f"Subreddits containing '{search_param}':")
    for subreddit in subreddits:
        print(subreddit)
else:
    print("An error occurred while retrieving the data. Status code:", response.status_code)
That outputs:
Subreddits containing 'sport':
sport
soccer
sports
BroncoSport
AskReddit
nba
sportsbetting
formula1
baseball
leagueoflegends
sportsbook
teenagers
Dualsport
sportvids
SportWagon
nfl
CFB
OriginSport
sportsarefun
hockey
granturismo
weightlifting
football
unpopularopinion
MMA
GranTurismoSport
...
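As you can see, many of the results don't contain 'sport' in their name, so you'll have to filter some more.
E.g., inside the loop:
for subreddit in subreddits:
    if search_param in subreddit.lower():
        print(subreddit)
or in the list comprehension beforehand. The .lower() matters because a plain `search_param in subreddit` is case-sensitive and would skip names like BroncoSport.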
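Here is a minimal sketch of that filtering step as a reusable function, with an optional cap so you can keep just the first 20 matches. The function name `filter_by_substring` and its `limit` parameter are hypothetical, and the sample list is copied from the output above rather than fetched live. It assumes the search results arrive in relevance order, so truncating keeps the most relevant ones.

```python
def filter_by_substring(names, query, limit=None):
    """Keep the names that contain `query` (ignoring case), in their original order."""
    needle = query.casefold()
    matches = [name for name in names if needle in name.casefold()]
    # Optionally keep only the first `limit` matches.
    return matches if limit is None else matches[:limit]

# Sample copied from the output above (not fetched live).
sample = ["sport", "soccer", "sports", "BroncoSport", "AskReddit", "nba", "Dualsport"]

matches = filter_by_substring(sample, "sport", limit=20)
print(matches)  # ['sport', 'sports', 'BroncoSport', 'Dualsport']
```

In the script above you would pass `subreddits` in place of `sample`.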