python-3.x beautifulsoup siblings

How to get all direct children of a BeautifulSoup Tag?


How can I retrieve all direct children of a tag (not recursively) using BeautifulSoup (bs4)? Given this HTML:

<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>

I want to get blocks like this:

block1 : <span>A</span>
block2 : <span><span>B</span></span>
block3 : <span>C</span>

This is how I'm doing it now:

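from bs4 import NavigableString, Tag

tags = []
# `soup` is the BeautifulSoup object parsed from the HTML above
for j in soup.find_all(True)[:1]:
    if isinstance(j, NavigableString):
        continue
    if isinstance(j, Tag):
        tags.append(j.name)
        # Get siblings
        for k in j.find_next_siblings():
            # k is a sibling of the first element
            pass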

Is there a cleaner way to do that?


Solution

  • Pass recursive=False to find_all() to select only the direct children of a tag rather than all of its descendants.
    An example with the HTML you provided:

    from bs4 import BeautifulSoup
    
    html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
    soup = BeautifulSoup(html, "lxml") 
    for j in soup.div.find_all(recursive=False):
        print(j)

    Output:

    <span>A</span>
    <span><span>B</span></span>
    <span>C</span>
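
For comparison, here is a minimal sketch of three ways to collect the same direct children. It uses the built-in "html.parser" instead of "lxml" only to avoid the extra dependency. The variable names (div, direct, via_children, via_select, blocks) are illustrative and not from the original answer.

```python
from bs4 import BeautifulSoup, Tag

html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="body")

# Option 1: find_all(recursive=False) returns only the direct child tags.
direct = div.find_all(recursive=False)

# Option 2: .children yields every direct child node, including text nodes,
# so keep only Tag objects.
via_children = [c for c in div.children if isinstance(c, Tag)]

# Option 3: a CSS child combinator selects the direct <span> children.
via_select = soup.select("div.body > span")

blocks = [str(t) for t in direct]
print(blocks)
# ['<span>A</span>', '<span><span>B</span></span>', '<span>C</span>']
```

All three give the same result for this HTML. Note that .children (and .contents) would also return whitespace or text nodes if the div contained any, which is why Option 2 filters on Tag.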