I need to store new items from multiple RSS feeds in a database. I would like to use the GUID tag of each item to determine whether it already exists in the database.
See the W3C specification:
guid stands for globally unique identifier. It's a string that uniquely identifies the item. When present, an aggregator may choose to use this string to determine if an item is new.
...
There are no rules for the syntax of a guid. Aggregators must view them as a string. It's up to the source of the feed to establish the uniqueness of the string.
So my question is: is it safe to consider a GUID unique across different feeds? Or will I need to combine the GUID with the feed it comes from to make sure there are no duplicate GUIDs?
The GUID is not even mandatory, so in my opinion it is not safe to consider it unique across feeds. I'd suggest you read this blog post about RSS feed duplicate detection. (archived copy)
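As a rough illustration of combining the GUID with the feed it comes from, here is a minimal sketch using Python's built-in `sqlite3`. The table layout, the function names (`item_key`, `open_db`, `store_if_new`), and the fallback of hashing link plus title when an item has no GUID are all my own assumptions for the example. They are not taken from the specification or from the linked post.

```python
import hashlib
import sqlite3


def item_key(feed_url, guid=None, link=None, title=None):
    """Return a (feed_url, item_id) pair that identifies an item within its feed."""
    if guid and guid.strip():
        item_id = guid.strip()
    else:
        # Assumption: the guid is optional, so fall back to a hash of
        # link + title. This heuristic is illustrative, not a standard.
        raw = f"{link or ''}\n{title or ''}"
        item_id = "sha256:" + hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return feed_url, item_id


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) items table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS items (
               feed_url TEXT NOT NULL,
               item_id  TEXT NOT NULL,
               title    TEXT,
               PRIMARY KEY (feed_url, item_id)
           )"""
    )
    return conn


def store_if_new(conn, feed_url, guid=None, link=None, title=None):
    """Insert the item; return True if it was new, False if already stored."""
    key = item_key(feed_url, guid, link, title)
    cur = conn.execute(
        "INSERT OR IGNORE INTO items (feed_url, item_id, title) VALUES (?, ?, ?)",
        (key[0], key[1], title),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the key already existed.
    return cur.rowcount == 1
```

With the composite primary key, two feeds that both use a GUID such as `1` do not collide, while the same item fetched twice from one feed is stored only once.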