I am building an RSS aggregator and trying to handle images from a broad range of sources.
Most of my sources put their images either in the post content or in Media RSS enclosures.
I have managed to get SimplePie to pick up Media RSS enclosures
and also to strip images from the content of an RSS post.
But feeds from a CMS called SilverStripe embed an image in each post using an Atom-style
notation that I cannot get SimplePie to read and extract images from:
<link rel="enclosure" type="image/JPG" href="http://example.com/image.jpg" />
Do I need to modify the enclosure class to get this to work, or am I missing something? Is it
something to do with the namespaces I am using?
Here is one of the feeds I am trying to get
I am accessing about 7 other SilverStripe sites, all of which include the same image links...
Here is my current image script:
if ($enclosures = $prPost->get_enclosures())
{
    foreach ($enclosures as $enclosure)
    {
        $this->Fields['image'] = $enclosure->get_link();
    }
}
if (preg_match('/<img.+?src="(.+?)"/', $this->Fields['desc'], $matches) && strlen($this->Fields['image']) < 5) {
    $this->Fields['image'] = $matches[1];
    $this->Fields['desc'] = preg_replace('/<img(.*)>/i' , "" , $this->Fields['desc'], 1);
}
These link elements are children of the item elements:
<rss>
  <channel>
    ...
    <item>
      ...
      <description>...</description>
      <link rel="enclosure" ... />
You can use get_item_tags to get specific elements from an item.
They should be in the Atom namespace (<atom:link>), but are instead left in the default namespace (<link>), which is a bug in the feed, but one you'll need to work around:
$links = $item->get_item_tags(SIMPLEPIE_NAMESPACE_RSS_20, 'link');
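Once you have $links, you still need to pick out the enclosure URL. Here is a rough sketch of how that might look. It assumes each returned element is an array whose attributes sit under ['attribs'][''] (the empty key being the "no namespace" slot), which is how SimplePie lays out parsed tags as far as I know. The helper name find_enclosure_href is made up for this example, and the $links array is stand-in data so the snippet runs without a feed:

```php
<?php
// Hypothetical helper (not part of SimplePie): returns the href of the first
// <link rel="enclosure"> in the array returned by get_item_tags(), or null.
function find_enclosure_href($links)
{
    if (!is_array($links)) {
        return null; // nothing matched, so there is nothing to scan
    }
    foreach ($links as $link) {
        // Assumption: un-namespaced attributes live under the '' key.
        $attribs = isset($link['attribs']['']) ? $link['attribs'][''] : array();
        if (isset($attribs['rel'], $attribs['href']) && $attribs['rel'] === 'enclosure') {
            return $attribs['href'];
        }
    }
    return null;
}

// Stand-in data shaped like the parsed tags. With a real feed this would be:
// $links = $item->get_item_tags(SIMPLEPIE_NAMESPACE_RSS_20, 'link');
$links = array(
    // The ordinary RSS <link>: text content only, no attributes.
    array('data' => 'http://example.com/post', 'attribs' => array()),
    // The SilverStripe-style enclosure link.
    array('data' => '', 'attribs' => array('' => array(
        'rel'  => 'enclosure',
        'type' => 'image/JPG',
        'href' => 'http://example.com/image.jpg',
    ))),
);

$image = find_enclosure_href($links);
```

In your script you could assign the result to $this->Fields['image'] when get_enclosures() gives you nothing, before falling back to the <img> regex. Note that the ordinary RSS <link> of the item comes back in the same array, which is why the rel check is needed.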