In this example RSS feed, the optional item element pubDate is included in every entry, but it is not available as an entry attribute in the Python module feedparser. This code:
    import feedparser

    rss_object = feedparser.parse("http://cyber.law.harvard.edu/rss/examples/rss2sample.xml")
    for entry in rss_object.entries:
        print(entry.pubDate)
causes the error AttributeError: object has no attribute 'pubDate', but I can successfully run print(entry.description) and see the contents of all the description tags.
feedparser is an opinionated parser; it does not simply return the XML as a dictionary. The text of pubDate is available as entries[i].published.
The date this entry was first published, as a string in the same format as it was published in the original feed.
Working code:
    for entry in rss_object.entries:
        print(entry.published)
Note: published is extracted from one of several possible XML tags, depending on the format of the feed. See the reference manual for a list.
This manual also claims the pubDate element is parsed "as a date" in entries[i].published_parsed. What's in published_parsed is a time.struct_time object; you may want to re-parse the date yourself to maintain time zone information, if the original feed included time zones.
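One way to re-parse the string yourself is sketched below. It assumes Python 3 and that the feed uses the RFC 822 style dates that RSS 2.0 specifies for pubDate; feeds in other formats may need a different parser. The helper name parse_pub_date and the sample string are made up for illustration. In real use you would pass entry.published instead.

```python
from email.utils import parsedate_to_datetime


def parse_pub_date(published):
    """Parse an RFC 822 style date string into a datetime.

    The result keeps the UTC offset given in the string, unlike
    time.struct_time, which has no reliable place to store it.
    """
    return parsedate_to_datetime(published)


# Hypothetical sample value; in practice, pass entry.published here.
sample = "Tue, 03 Jun 2003 09:39:21 +0200"
dt = parse_pub_date(sample)

print(dt.isoformat())   # 2003-06-03T09:39:21+02:00
print(dt.utcoffset())   # 2:00:00
```

Because parsedate_to_datetime comes from the standard library, this adds no dependency beyond feedparser itself. If a feed omits the offset or uses a non-standard date format, the call may raise an exception or return a naive datetime, so wrap it accordingly.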