I want to replace all div tags with the class name "figure"
<div class="figure">
<p>Some content.</p>
</div>
with a non-HTML tag (in my case, a Hugo shortcode):
{{% row %}}
<p>Some content.</p>
{{% /row %}}
It's easy to replace HTML tags with other HTML tags, but I have no idea how to do it when non-HTML tags are involved.
I cannot see an "easy" solution, because shortcodes can contain /, <, and > characters as well, so you cannot have them as part of the document tree.
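To illustrate why, here is a minimal sketch, assuming a shortcode written with angle brackets (`{{< row >}}` is only an example). Text nodes inserted into the tree are escaped when the document is serialized with the default formatter:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<div class="figure"><p>Some content.</p></div>', 'html.parser')
div = soup.find('div', class_='figure')

# Insert the shortcode delimiters as plain text nodes around the div.
div.insert_before('{{< row >}}')
div.insert_after('{{< /row >}}')
div.unwrap()  # remove the <div> itself, keeping its children in place

escaped = str(soup)                # default formatter escapes < and > in text
raw = soup.decode(formatter=None)  # formatter=None skips entity substitution

print(escaped)
print(raw)
```

Passing `formatter=None` avoids the escaping, but it also leaves every other text node in the document unescaped, which may not be what you want.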
One solution is to replace each <div class="figure"> with a custom tag and, as a final step, replace these custom tags with your shortcodes:
from bs4 import BeautifulSoup
txt = '''
<div>
<div class="figure">
<p>Some content.</p>
</div>
</div>
<div class="figure">
<p>Some other content.</p>
</div>
'''
soup = BeautifulSoup(txt, 'html.parser')
for div in soup.select('div.figure'):
    t = soup.new_tag('xxx-row')
    t.contents = div.contents
    div.replace_with(t)
s = str(soup).replace('<xxx-row>', '{{% row %}}')
s = s.replace('</xxx-row>', '{{% /row %}}')
print(s)
Prints:
<div>
{{% row %}}
<p>Some content.</p>
{{% /row %}}
</div>
{{% row %}}
<p>Some other content.</p>
{{% /row %}}
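The same idea can be wrapped in a reusable helper. This is only a sketch: the function name `divs_to_shortcode` and its parameters are hypothetical, and instead of creating a new tag and assigning `.contents` (which bypasses Beautiful Soup's own tree bookkeeping), this variant renames each matching div in place and clears its attributes:

```python
from bs4 import BeautifulSoup

def divs_to_shortcode(html, css_class='figure', shortcode='row'):
    """Replace every <div class=css_class> with a {{% shortcode %}} pair."""
    placeholder = 'xxx-' + shortcode  # hypothetical placeholder tag name
    soup = BeautifulSoup(html, 'html.parser')
    for div in soup.select('div.' + css_class):
        div.name = placeholder  # rename in place; children stay attached
        div.attrs = {}          # drop the class and any other attributes
    out = str(soup)
    # Swap the placeholder tags for the shortcode delimiters.
    out = out.replace('<' + placeholder + '>', '{{% ' + shortcode + ' %}}')
    return out.replace('</' + placeholder + '>', '{{% /' + shortcode + ' %}}')

result = divs_to_shortcode('<div class="figure"><p>Some content.</p></div>')
print(result)
```

The placeholder name should be something that cannot occur in the real document; otherwise the final string replacement would also rewrite unrelated tags.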