I'm using newspaper3k to extract URLs from Google News, but I keep getting every URL on the page (I've disabled memoize_articles because I need the full list). I would like to print only the top 5 links, or 5 random links; it doesn't really matter which. I've tried setting a max, but that didn't work. Any ideas?
import newspaper
news = newspaper.build('https://news.google.com/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGx6TVdZU0FtVnVHZ0pWVXlnQVAB?oc=3&ceid=US:en', memoize_articles=False)
for article in news.articles:
    print(article.url)
This snippet should do what you want. It doesn't use a newspaper function; it uses the standard random module to select a fixed number of URLs. The articles attribute holds Article objects rather than URL strings, so the loop appends each article's url to a plain list before sampling. If you want the top 5 instead of a random 5, slice the list with myList[:5].
import random
import newspaper

business_news = newspaper.build('https://news.google.com/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGx6TVdZU0FtVnVHZ0pWVXlnQVAB?hl=en-US&gl=US&ceid=US%3Aen', language='en', memoize_articles=False)

# Collect the URL of every article found on the page.
myList = []
for article in business_news.articles:
    myList.append(str(article.url))
print(myList)  # not necessary, just for display purposes

# Pick 5 random URLs; min() avoids a ValueError if fewer than 5 were found.
randarticles = random.sample(myList, min(5, len(myList)))
print(randarticles)
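If you need this in more than one place, the selection step can be pulled into a small helper. This is only a sketch: pick_urls and the example.com URLs are made-up names for illustration, not part of newspaper. In practice you would pass in the list of URLs collected from the articles.

```python
import random


def pick_urls(urls, n=5, shuffle=False):
    """Return up to n URLs: the first n, or n random ones if shuffle is True."""
    urls = list(urls)
    if shuffle:
        # min() keeps random.sample from raising ValueError on short lists.
        return random.sample(urls, min(n, len(urls)))
    return urls[:n]


# Stand-in data (hypothetical); replace with the URLs gathered from newspaper.
sample_urls = [f"https://example.com/story-{i}" for i in range(12)]

top_five = pick_urls(sample_urls)
random_five = pick_urls(sample_urls, shuffle=True)
print(top_five)
print(random_five)
```

Slicing keeps the order in which the URLs were collected, so it is the closer match for "top 5", while random.sample returns distinct entries in random order.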