What is the difference between strings and stripped_strings in BeautifulSoup?
import requests
from bs4 import BeautifulSoup
url = "https://codewithharry.com"
r = requests.get(url)
htmlcontent = r.content
soup = BeautifulSoup(htmlcontent, 'html.parser')
tags = soup.find(id="imgpreview2")
# Using the .strings generator (an attribute, not a method call)
for item in tags.strings:
    print(item)
# Using the .stripped_strings generator
for item in tags.stripped_strings:
    print(item)
stripped_strings: skips strings that consist only of whitespace, and strips leading and trailing whitespace from every string it does yield.
strings: yields every string exactly as it appears in the document, including whitespace-only strings (such as '\n') and any leading or trailing whitespace.
Example:
from bs4 import BeautifulSoup
html = """
<html>
<body>
<div>
<p> This is some text. </p>
<ul>
<li>Item 1</li>
<li>Item 2</li>
</ul>
</div>
</body>
</html>
"""
# Parse the HTML
soup = BeautifulSoup(html, 'html.parser')
div_tag = soup.find('div')
print("strings method:")
for item in div_tag.strings:
    print(repr(item))
print("\nUsing stripped_strings method:")
for item in div_tag.stripped_strings:
    print(repr(item))
Output:
strings method:
'\n'
' This is some text. '
'\n'
'\n'
'Item 1'
'\n'
'Item 2'
'\n'
'\n'
Using stripped_strings method:
'This is some text.'
'Item 1'
'Item 2'
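
The relationship between the two outputs above can be sketched in plain Python, without BeautifulSoup. This is only an approximation of the behaviour, not the library's actual code: strip_and_skip_empty is a hypothetical helper name, and raw_strings is simply the .strings output from the example copied into a list.

```python
def strip_and_skip_empty(strings):
    """Hypothetical helper: approximates what .stripped_strings does to .strings."""
    for text in strings:
        text = text.strip()  # remove leading and trailing whitespace
        if text:             # skip strings that were whitespace only
            yield text


# The .strings output from the example above, copied by hand (assumption)
raw_strings = ['\n', ' This is some text. ', '\n', '\n',
               'Item 1', '\n', 'Item 2', '\n', '\n']

result = list(strip_and_skip_empty(raw_strings))
print(result)
```

Running the helper over the nine raw strings leaves only the three strings that stripped_strings printed, which is why stripped_strings is usually the more convenient choice when you only care about the visible text.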