python beautifulsoup

Where can I find some "hello world"-simple Beautiful Soup examples?


I'd like to do a very simple replacement using Beautiful Soup. Let's say I want to visit all A tags in a page and append "?foo" to their href. Can someone post or link to an example of how to do something simple like that?


Solution

  • from BeautifulSoup import BeautifulSoup
    
    soup = BeautifulSoup('''
    <html>
       <head><title>Testing</title></head>
       <body>
         <a href="http://foo.com/">foo</a>
         <a href="http://bar.com/bar">Bar</a>
       </body>
    </html>''')
    
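    # Visit every <a> tag and append "?foo" to its href attribute.
    for link in soup.findAll('a'):
        link['href'] = link['href'] + '?foo'
    
    # Python 2 print statement; this example targets BeautifulSoup 3.
    print soup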
    

    That prints:

    <html>
    <head><title>Testing</title></head>
    <body>
    <a href="http://foo.com/?foo">foo</a>
    <a href="http://bar.com/bar?foo">Bar</a>
    </body>
    </html>
    

    The documentation also has examples of changing attributes. It is an extensive tutorial that covers the common aspects of BeautifulSoup. If something you need is missing from it, please clarify what you are looking for.
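
The answer above uses the BeautifulSoup 3 import and the Python 2 print statement. For anyone on Python 3, here is a minimal sketch of the same replacement, assuming the `beautifulsoup4` package is installed (imported as `bs4`). The third `<a>` tag, which has no `href`, is not in the original answer. It is added here as an assumption, to show why filtering on `href=True` can be useful.

```python
from bs4 import BeautifulSoup  # BeautifulSoup 4 (package name: beautifulsoup4)

html = '''
<html>
   <head><title>Testing</title></head>
   <body>
     <a href="http://foo.com/">foo</a>
     <a href="http://bar.com/bar">Bar</a>
     <a name="anchor-only">no href here</a>
   </body>
</html>'''

# Name the parser explicitly; "html.parser" ships with Python itself.
soup = BeautifulSoup(html, "html.parser")

# href=True matches only <a> tags that actually have an href attribute,
# so link["href"] cannot raise KeyError on the anchor-only tag.
for link in soup.find_all("a", href=True):
    link["href"] += "?foo"

# Collect the rewritten hrefs for inspection, then print the whole document.
hrefs = [a["href"] for a in soup.find_all("a", href=True)]
print(hrefs)
print(soup)
```

Note that this appends the literal string "?foo" even when a URL already has a query string. If that matters for your pages, the standard library's `urllib.parse` may be a better tool for building the new URL.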