django rss-reader

Get content and images from an RSS URL in django-yarr


I'm using django-yarr for my RSS reader application. Is there any way to fetch content from an RSS URL and save it in the database? Or is there a library that could do that?


Solution

  • Are you looking to read data from an RSS feed, process it and save it?

    Use Requests to fetch the data.

    import requests
    
    req = requests.get('http://feeds.bbci.co.uk/news/technology/rss.xml')
    req.text  # XML as a string
    

    Use BeautifulSoup, lxml or ElementTree (or a similar library that can process XML) to process the data.

    from bs4 import BeautifulSoup
    # The 'xml' parser needs lxml installed; it parses the feed as XML rather than HTML
    soup = BeautifulSoup(req.text, 'xml')
    
    images = soup.find_all('media:thumbnail')
    

    Finally, do whatever you want with the data, for example save each image URL through a Django model.

    for image in images:
        # DjangoModelThing is a placeholder for your own model
        thing = DjangoModelThing()
        thing.image = image.attrs.get('url')
        thing.save()
    

    UPDATE

    Alternatively, you could grab each article from the RSS feed:

    articles = soup.find_all('item')
    
    for article in articles:
        # find() returns the first matching tag, or None if the item lacks it
        title = article.find('title')
        description = article.find('description')
        link = article.find('link')
        image = article.find('media:thumbnail')
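
The per-article extraction can also be sketched with ElementTree from the standard library, which is one of the parsers mentioned above. This is only a minimal illustration, not django-yarr's own method: `SAMPLE_RSS` is a small hand-written feed standing in for `req.text`, and `parse_items` is a hypothetical helper name. The Media RSS namespace URI is what lets ElementTree resolve `media:thumbnail`.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by Media RSS for <media:thumbnail>.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Hand-written stand-in for req.text (assumed sample, not real feed data).
SAMPLE_RSS = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <description>Summary of the first article.</description>
      <link>http://example.com/first</link>
      <media:thumbnail url="http://example.com/first.jpg" width="66" height="49"/>
    </item>
    <item>
      <title>Second article</title>
      <description>No thumbnail here.</description>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
"""


def parse_items(xml_text):
    """Return one dict per <item> with title, description, link and image URL."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        thumb = item.find("media:thumbnail", MEDIA_NS)
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "link": item.findtext("link", default=""),
            # Not every item carries a thumbnail, so fall back to None.
            "image": thumb.get("url") if thumb is not None else None,
        })
    return items


articles = parse_items(SAMPLE_RSS)
for article in articles:
    print(article["title"], article["link"], article["image"])
```

Each dict could then be handed to your own Django model and saved, in the same way as the image loop earlier in this answer.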