I'm trying to parse this feed: https://feeds.podcastmirror.com/dudesanddadspodcast
The channel section has two entries for podcast:person:
<podcast:person role="host" img="https://dudesanddadspodcast.com/files/2019/03/andy.jpg" href="https://www.podchaser.com/creators/andy-lehman-107aRuVQLA">Andy Lehman</podcast:person>
<podcast:person role="host" img="https://dudesanddadspodcast.com/files/2019/03/joel.jpg" href="https://www.podchaser.com/creators/joel-demott-107aRuVQLH" >Joel DeMott</podcast:person>
When parsed, feedparser only brings in one of them (the last one):
>>> import feedparser
>>> d = feedparser.parse('https://feeds.podcastmirror.com/dudesanddadspodcast')
>>> d.feed['podcast_person']
{'role': 'host', 'img': 'https://dudesanddadspodcast.com/files/2019/03/joel.jpg', 'href': 'https://www.podchaser.com/creators/joel-demott-107aRuVQLH'}
What would I change so that podcast_person is a list instead, so I can loop through each person?
Idea #1:
from bs4 import BeautifulSoup
import requests

# Download the raw feed XML
r = requests.get("https://feeds.podcastmirror.com/dudesanddadspodcast").content
soup = BeautifulSoup(r, 'html.parser')
# Returns every <podcast:person> tag found anywhere in the feed
soup.find_all("podcast:person")
Output:
[<podcast:person href="https://www.podchaser.com/creators/andy-lehman-107aRuVQLA" img="https://dudesanddadspodcast.com/files/2019/03/andy.jpg" role="host">Andy Lehman</podcast:person>,
<podcast:person href="https://www.podchaser.com/creators/joel-demott-107aRuVQLH" img="https://dudesanddadspodcast.com/files/2019/03/joel.jpg" role="host">Joel DeMott</podcast:person>,
<podcast:person href="https://www.podchaser.com/creators/cory-martin-107aRwmCuu" img="" role="guest">Cory Martin</podcast:person>,
<podcast:person href="https://www.podchaser.com/creators/julie-lehman-107aRuVQPL" img="" role="guest">Julie Lehman</podcast:person>]
Idea #2:
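import feedparser
d = feedparser.parse('https://feeds.podcastmirror.com/dudesanddadspodcast')
# Relies on this feed's layout: the second author of the second entry
# holds both host names as one comma-separated string
hosts = d.entries[1]['authors'][1]['name'].split(", ")
print("The hosts of this Podcast are {} and {}.".format(hosts[0], hosts[1]))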
Output:
The hosts of this Podcast are Joel DeMott and Andy Lehman.
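If you want a list of dicts you can loop over, and only the people listed directly under the channel (Idea #1 also picks up the guests), here is a minimal sketch using the standard library's ElementTree instead of feedparser or BeautifulSoup. The helper name channel_persons and the SAMPLE string are made up for illustration, and the namespace URI is an assumption: check the xmlns:podcast attribute on the feed's <rss> element and adjust it if it differs.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed declares the podcast namespace with this URI.
# Verify against the xmlns:podcast attribute in the real feed.
PODCAST_NS = {"podcast": "https://podcastindex.org/namespace/1.0"}


def channel_persons(xml_data):
    """Return one dict per <podcast:person> that is a direct child of <channel>."""
    root = ET.fromstring(xml_data)
    persons = []
    # "channel/podcast:person" matches direct children of <channel> only,
    # so <podcast:person> tags nested inside <item> elements are skipped.
    for tag in root.findall("channel/podcast:person", PODCAST_NS):
        persons.append({
            "name": (tag.text or "").strip(),
            "role": tag.get("role"),
            "img": tag.get("img"),
            "href": tag.get("href"),
        })
    return persons


# Trimmed stand-in for the real feed, built from the tags shown above.
# Placing the guest inside an <item> is an assumption about the feed's layout.
SAMPLE = """<rss version="2.0" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <podcast:person role="host" img="https://dudesanddadspodcast.com/files/2019/03/andy.jpg" href="https://www.podchaser.com/creators/andy-lehman-107aRuVQLA">Andy Lehman</podcast:person>
    <podcast:person role="host" img="https://dudesanddadspodcast.com/files/2019/03/joel.jpg" href="https://www.podchaser.com/creators/joel-demott-107aRuVQLH">Joel DeMott</podcast:person>
    <item>
      <podcast:person role="guest" img="" href="https://www.podchaser.com/creators/cory-martin-107aRwmCuu">Cory Martin</podcast:person>
    </item>
  </channel>
</rss>"""

for person in channel_persons(SAMPLE):
    print(person["name"], person["role"], person["href"])
```

For the live feed you should be able to pass the downloaded bytes straight in, e.g. channel_persons(requests.get(url).content), since ET.fromstring accepts bytes as well as str.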