Assign Extracted text from HTML table to Variable for later use -- Beautiful Soup / Python 3.7


The code below works: it searches the source of an HTML table for a specific text and pulls the nextSibling of the cell where that text was found.

Current Code

r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
           
# Find xxxxxxx (row-by-row) and strip leading zeros
row = soup.find_all('td', string="xxxxxxx")
for r in row:
    LE = r.nextSibling
    while LE.name != 'td' and LE is not None:
        LE = LE.nextSibling

The main issue I am having (it is probably super easy and I have just been staring at this for too long) is that I need to assign the text of the nextSibling to the LE variable.

The value is formatted as "001234", and I need to strip the leading zeros so the variable holds "1234".

Printing it with print(LE.text[2:6]) gives the correct result. Putting the same slice into the code as LE = LE.nextSibling.text[2:6] does not produce anything.

I have tried the following statements, but neither works, and I am hoping for guidance.

LE = LE.nextSibling.text[2:6]
and
LE = LE.text[2:6]

I need this to be assigned to a variable after extracting to utilize the variable later on within my script. I appreciate the help in advance!

EDIT --> included source code:

<tr>
     <td class='label' nowrap title="xxxxxxx">TEXT TO FIND</td>
     <td class='attribute'>001234</td>
</tr>

Solution

  • Change != to == and assign the sliced text back to LE:

     row = soup.find_all('td', string="xxxxxx")
     for r in row:
         LE = r.nextSibling
         LE = LE.text[2:6]
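
A more defensive variant, as a minimal sketch rather than a definitive fix. It uses find_next_sibling('td') so the whitespace text node between the two cells is skipped, and str.lstrip('0') so the leading zeros are removed however many there are, instead of relying on the fixed slice [2:6]. Assumptions not taken from the post: the helper name extract_values and the constant SAMPLE_HTML are made up for illustration, the search uses the cell text "TEXT TO FIND" from the posted snippet, and 'html.parser' stands in for 'lxml' so the example runs without an extra install.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question's EDIT.
SAMPLE_HTML = """
<tr>
     <td class='label' nowrap title="xxxxxxx">TEXT TO FIND</td>
     <td class='attribute'>001234</td>
</tr>
"""


def extract_values(html, label):
    """Return the text of the <td> that follows each <td> whose text is
    `label`, with leading zeros removed.

    Hypothetical helper for illustration; not part of the original post.
    """
    # Assumption: 'html.parser' is used here in place of the question's 'lxml'.
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for cell in soup.find_all("td", string=label):
        # find_next_sibling("td") skips the whitespace text node between cells.
        value_cell = cell.find_next_sibling("td")
        if value_cell is None:
            continue  # the label cell has no following <td>
        values.append(value_cell.get_text(strip=True).lstrip("0"))
    return values


# The result is an ordinary list, so it can be reused later in the script.
LE = extract_values(SAMPLE_HTML, "TEXT TO FIND")
print(LE)
```

With the posted snippet this prints ['1234']. Note that lstrip('0') turns a value made only of zeros into an empty string, so that case may need its own handling.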